Fred Wu writes:

I work at National Prescribing Services in Australia. I have a database representing, say, antidiabetic drug utilisation for all of Australia over the past few years. I planned to do a longitudinal analysis across the GP Division Network (112 divisions in Australia) using mixed-effects models (or, as you call them in your book, varying-intercept and varying-slope models) on these data.

The problem is this: since the data actually represent the full population of antidiabetic drug users in Australia, should I use 112 fixed dummy variables to capture the variation across divisions, or should I use varying intercepts and varying slopes? Someone might argue that divisions in Australia, like states in the USA, can hardly be considered draws from a "superpopulation," so fixed dummies should be used. My view is that the population is those who use the drugs, and what will happen when the rest need to use them? In terms of exchangeability, varying intercepts and varying slopes can be justified.

Also, you give various definitions of fixed effects and random effects in your book. My own preference in the longitudinal setting is to treat time-invariant covariates as fixed effects and time-varying covariates as random effects, where the fixed-effect parameter is the population average and the random effects are subject-specific effects.

My reply:

Even if you only want to estimate these 112 parameters, I’d think you’d want to estimate them as accurately as possible. And I think that would be done using multilevel modeling. There’s no particular reason to expect that least squares is the best way to go, and lots of reasons to expect otherwise. Similarly, in Red State Blue State etc. we’re only interested in our 50 states, not any others—we’re not sitting around waiting for D.C. and Puerto Rico to be added in as #51 and #52—and multilevel modeling is a good way of doing that.
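To see why multilevel modeling can beat least squares even when the 112 divisions are the whole population of interest, here's a minimal simulation sketch of partial pooling. The setup is hypothetical (the group count 112 matches the example, but the variance components and sample sizes are made up, and for simplicity the shrinkage factor is computed from the known simulation parameters rather than estimated):

```python
import numpy as np

rng = np.random.default_rng(0)

J, n = 112, 5          # 112 groups (divisions), 5 observations each (made up)
tau, sigma = 1.0, 2.0  # between-group and within-group sd (made up)

theta = rng.normal(0.0, tau, size=J)                 # true group effects
y = theta[:, None] + rng.normal(0.0, sigma, size=(J, n))

ybar = y.mean(axis=1)  # no pooling: separate least-squares estimate per group
grand = ybar.mean()    # complete pooling: one grand mean for everyone

# Partial pooling: shrink each group mean toward the grand mean.
# With few observations per group, the group means are noisy, so
# borrowing strength across groups reduces estimation error.
shrink = tau**2 / (tau**2 + sigma**2 / n)
partial = grand + shrink * (ybar - grand)

mse_no_pool = np.mean((ybar - theta) ** 2)
mse_partial = np.mean((partial - theta) ** 2)
print(mse_no_pool, mse_partial)  # partial pooling gives the smaller error
```

The point is not the particular numbers but the mechanism: the separate least-squares estimates are unbiased but noisy, and shrinking them toward the overall mean trades a little bias for a large variance reduction, so the 112 division effects are estimated more accurately as a set.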

For some justification of multilevel modeling in such situations, see this paper by the Twins.