
17 groups, 6 group-level predictors: What to do?

Yi-Chun Ou writes:

I am using a multilevel model with three levels. I read that you wrote a book about multilevel models, and I wonder if you can help with the following question.

The data structure is like this:

Level one: customer (8444 customers)
Level two: company (90 companies)
Level three: industry (17 industries)

I use 6 level-three variables (i.e. industry characteristics) to explain the variance of the level-one effect across industries. The question here is whether there is an over-fitting problem since there are only 17 industries. I understand that this must be a problem for non-multilevel models, but is it also a problem for multilevel models?

My reply: Yes, this could be a problem. I’d suggest combining some of your variables into a common score, or using only some of the variables, or using strong priors to control the inferences. Doing this sort of thing systematically is an interesting and important area of statistics research. There’s lots of work on what to do with hundreds of data points and thousands of predictors, but not so much on what to do with 17 data points and 6 predictors.
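
To make two of these options concrete, here is a minimal sketch in Python on simulated data (not the data from the question). It assumes 17 group-level observations and 6 standardized predictors that are in truth unrelated to the outcome, and compares plain least squares with (a) a single common score and (b) a ridge penalty, which is used here as a simple stand-in for a strong normal prior centered at zero. All names (`ridge`, `r_squared`, `lam`) and the value `lam=10.0` are illustrative assumptions, not anything specified in the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated group-level data: 17 industries, 6 industry-level predictors.
# The outcome is pure noise, so any apparent fit is overfitting.
n_groups, n_pred = 17, 6
X = rng.normal(size=(n_groups, n_pred))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each predictor
y = rng.normal(size=n_groups)
y = y - y.mean()                          # center, so no intercept is needed


def ols(X, y):
    """Least-squares coefficients (no shrinkage)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def ridge(X, y, lam):
    """Ridge coefficients; lam plays the role of prior strength (assumed)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def r_squared(X, y, beta):
    """In-sample share of variance explained for centered y."""
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / (y @ y)


# Option 0: all 6 predictors, no regularization.
beta_ols = ols(X, y)

# Option (a): collapse the 6 predictors into one common score.
# Averaging assumes the predictors are scaled to point the same way.
score = X.mean(axis=1, keepdims=True)
beta_score = ols(score, y)

# Option (b): keep all 6 predictors but shrink them toward zero.
beta_ridge = ridge(X, y, lam=10.0)

r2_ols = r_squared(X, y, beta_ols)
r2_score = r_squared(score, y, beta_score)
r2_ridge = r_squared(X, y, beta_ridge)

print("in-sample R^2, 6 predictors, OLS:  ", round(r2_ols, 3))
print("in-sample R^2, 1 common score:     ", round(r2_score, 3))
print("in-sample R^2, 6 predictors, ridge:", round(r2_ridge, 3))
print("coefficient norm, OLS vs ridge:    ",
      round(float(np.linalg.norm(beta_ols)), 3),
      round(float(np.linalg.norm(beta_ridge)), 3))
```

Because the simulated outcome is noise, whatever in-sample fit the unregularized six-predictor model reports is spurious; the common score and the shrunken coefficients can never fit the sample better than it does, which is the intended trade. In a real multilevel analysis the same idea would be applied to the group-level regression inside the model, with the prior scale chosen on substantive grounds rather than set arbitrarily as it is here.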

One Comment

  1. Jeremy Miles says:

    This sounds to me like a situation where you just don’t have enough data (information) to make good inferences. You can try to bring in some more information by using strong priors, but it’s going to be hard to get good answers (I’d have thought).