Question about p-values in multilevel models

Eric Schwartz writes a long question (complete with R code). My reply is at the end. Schwartz writes:

How do you recommend testing whether fixed effects are different from zero in a generalized linear mixed model?

Douglas Bates argues that it is inappropriate to use p-values from t-statistics reported in lmer(). I’ve followed all of his and your subsequent postings on the topic that I have found, but still have some questions:

[Q1] Why, then, does a Poisson model (Model 1 below) include p-values (from the z-statistic), while the overdispersed Poisson model (Model 2 below) does not include p-values (from the t-statistic)?

[Q2] When using the overdispersed Poisson (Model 2) with a large dataset (10,000 units across 2,000 groups in one batch and 300 groups in another), how do you recommend obtaining p-values for the covariates?

[Q3] How does one test whether each random effect (i.e., a group’s intercept) is different from zero?

It seems you prefer to simply construct an interval of 2 standard errors on each side of the point estimate and then see whether it includes zero [i.e., (estimate - 2*standard error, estimate + 2*standard error)]. In your book with Jennifer Hill and on your blog, you mention that using 2 s.e. is probably enough in practice for applied problems. Under what conditions is this a reasonable approximation of significance?

I understand I can use mcsamp() to call mcmcsamp(). Is this the Gelman-Bates compromise you would advocate?

Here are the models referred to above:

Consider this model. Let the J “groups” in the batch be indexed by j=1,…,J, and the observations by i=1,…,n, where n >> J. It is a varying-intercept, constant-slope model: each group has its own intercept, but all observations share common coefficients on the predictors x1 and x2.

Consider two specifications:

Model 1. Poisson. Conditional on knowing each group’s random effect and the fixed effects’ coefficients, the counts are simply Poisson.

Y_i ~ Poisson( exp( \mu + a_{j[i]} + \beta1*x1_i + \beta2*x2_i ) )

a_j ~ N(0, sigma_a^2), j=1,…,J.

R code: lmer( y ~ (1|id) + x1 + x2 , family=poisson(link="log") )

Model 2. Poisson with overdispersion. Conditional on knowing each group’s random effect and the fixed effects’ coefficients, the counts are Poisson-log-normal.

Y_i ~ Poisson( exp( \mu + a_{j[i]} + \beta1*x1_i + \beta2*x2_i + \epsilon_i ) )

a_j ~ N(0, sigma_a^2), j=1,…,J

\epsilon_i ~ N(0, sigma_epsilon^2), i=1,…,n.

R code: lmer( y ~ (1|id) + x1 + x2 , family=quasipoisson(link="log") )
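[A note on the code: in current versions of lme4, GLMMs are fit with glmer() rather than lmer(), and the quasipoisson family is no longer supported. A common workaround for Model 2 is an observation-level random effect, which plays the role of \epsilon_i above. The following sketch, with simulated data and made-up parameter values purely for illustration, fits both specifications under those assumptions:]

```r
library(lme4)

## Simulated data for illustration only (arbitrary parameter values)
set.seed(1)
J <- 30; n <- 300
id <- sample(J, n, replace = TRUE)
x1 <- rnorm(n); x2 <- rnorm(n)
a   <- rnorm(J, 0, 0.5)   # group-level intercepts a_j
eps <- rnorm(n, 0, 0.3)   # observation-level noise (overdispersion)
y <- rpois(n, exp(0.5 + a[id] + 0.8 * x1 - 0.3 * x2 + eps))
d <- data.frame(y, x1, x2, id = factor(id), obs = factor(seq_len(n)))

## Model 1: plain Poisson with varying intercepts by group
m1 <- glmer(y ~ x1 + x2 + (1 | id), family = poisson, data = d)

## Model 2: Poisson-log-normal via an observation-level random effect,
## standing in for epsilon_i in the specification above
m2 <- glmer(y ~ x1 + x2 + (1 | id) + (1 | obs), family = poisson, data = d)

summary(m1)
summary(m2)
```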

My reply:

1. I’m not really interested in testing whether things are different from zero. I’d say just go with the confidence intervals, posterior inferences, and so forth.

2. I’m not sure on the Poisson thing, but almost always you’ll want to do overdispersed Poisson (as discussed in ARM).

3. You can use mcsamp() to call mcmcsamp(), except that mcmcsamp() is not currently working (except for some very simple models). But don’t get mad at Doug Bates: he’s doing it all for us for free!