Kari Lock Morgan writes:
I’m writing with a multilevel modeling question that has been nagging me for quite some time. In your book with Jennifer Hill, you include a group-level predictor (for example, 12.15 on page 266), but then end up fitting this as an individual-level predictor with lmer. How can this be okay? It seems as if lmer can’t really still be fitting the model specified in 12.15? In particular, I’m worried about analyzing a cluster randomized experiment where the treatment is applied at the cluster level and outcomes are at the individual level – intuitively, of course it should matter that the treatment was applied at the cluster level, not the individual level, and therefore somehow this should enter into how the model is fit? However, I can’t seem to grasp how lmer would know this, unless it is implicitly looking at the covariates to see if they vary within groups or not, which I’m guessing it’s not? In your book you act as if fitting the model with county-level uranium as an individual predictor is the same as fitting it as a group-level predictor, which makes me think perhaps I am missing something obvious?
My reply: It indeed is annoying that lmer (and, for that matter, stan_lmer in its current incarnation) only allows individual-level predictors, so that any group-level predictors need to be expanded to the individual level (for example, u_full <- u[group]). But from the standpoint of fitting the statistical model, it doesn't matter. Regarding the question, how does the model "know" that, in this case, u_full is actually an expanded group-level predictor: The answer is that it "figures it out" based on the dependence between u_full and the error terms. It all works out.
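To see why the expansion is harmless, it can help to write the two-level model as a single equation. The following is a sketch in assumed notation (individual-level predictor x, group-level predictor u, group index j[i]); it is a varying-intercept model of the kind the question refers to, not a transcription of the book's equation.

```latex
\begin{align*}
y_i &= \alpha_{j[i]} + \beta x_i + \epsilon_i,
  & \epsilon_i &\sim \mathrm{N}(0, \sigma_y^2), \\
\alpha_j &= \gamma_0 + \gamma_1 u_j + \eta_j,
  & \eta_j &\sim \mathrm{N}(0, \sigma_\alpha^2), \\
\intertext{and substituting the second line into the first gives}
y_i &= \gamma_0 + \beta x_i + \gamma_1 u_{j[i]} + \eta_{j[i]} + \epsilon_i .
\end{align*}
```

Here u_{j[i]} is exactly the expanded predictor u_full, and η_{j[i]} is the group-level error that a term such as (1 | group) supplies in an lmer formula. The two forms are the same model. Because η is shared by everyone in a group and u_full is constant within a group, the coefficient γ_1 is estimated from variation between groups, which is where the group-level structure enters the fit.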
This is great! I always wondered about this. I know this is a side issue, but I was trying to think of where expanding to the individual level can lead you astray. I am thinking about precision of estimates in terms of sample size. Would your precision be closer to the sample size of the group level than the individual? Can you then be “fooled” in the sense that you would expect a much larger sample size when estimating group-level predictors than you actually had?
Pophealth:
I discussed some of this in my 2006 paper, "Multilevel (hierarchical) modeling: What it can and cannot do" (originally written with the idea that it would be coauthored with Jan de Leeuw, but then he said he thought the paper was fine as is, so I just published it on its own).
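The precision question can also be checked with a small simulation. The following is a minimal sketch in plain Python (not lmer), using simulated data with made-up parameter values; every name in it (simulate, ols, ybar_group, and so on) is hypothetical. It compares a naive pooled regression of y on the expanded predictor, which ignores the grouping, with a regression of the group means on u, a rough stand-in for how a group-level coefficient is informed in a balanced design.

```python
import random
import statistics


def simulate(J=30, n=20, a=1.0, b=0.7, sigma_alpha=1.0, sigma_y=1.0, seed=1):
    """Simulate J groups of n individuals with a group-level predictor u.

    All parameter values are arbitrary choices for illustration.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(J)]
    # Group intercepts depend on u plus a group-level error.
    alpha = [a + b * u[j] + rng.gauss(0, sigma_alpha) for j in range(J)]
    group = [j for j in range(J) for _ in range(n)]
    y = [alpha[j] + rng.gauss(0, sigma_y) for j in group]
    return u, group, y


def ols(x, y):
    """Simple regression of y on x; returns (slope, classical SE of slope)."""
    m = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (m - 2) / sxx) ** 0.5
    return slope, se


u, group, y = simulate()
J = len(u)

# Python analogue of the R idiom u_full <- u[group].
u_full = [u[j] for j in group]

# Average outcome within each group.
ybar_group = [
    statistics.fmean([yi for yi, g in zip(y, group) if g == j]) for j in range(J)
]

# Naive fit: treats all J * n observations as independent.
slope_naive, se_naive = ols(u_full, y)
# Group-level fit: J data points, one per group.
slope_group, se_group = ols(u, ybar_group)

print("naive pooled fit:", slope_naive, se_naive)
print("group-means fit: ", slope_group, se_group)
```

With equal group sizes the two slopes coincide, but the naive standard error is much smaller because it counts every individual as an independent observation. The group-means standard error reflects the number of groups, which is the relevant sample size for a group-level predictor. So one is "fooled" only by ignoring the grouping, not by expanding the predictor.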
Yes, I agree; it is really helpful to have this reassurance, as I’ve worried about this too. This was one of the top results when I googled this question. Is there some “formal” derivation of how exactly the model “figures it out” (that the repeated individual-level predictors are actually group-level predictors) that you can point me to?
Quentin:
I don’t know exactly what a formal derivation would look like, but I discuss the issue in my 2005 paper (actually written in 2002, based on ideas from 1997), "Analysis of variance: Why it is more important than ever."