I received an email with the above title from Daniel Adkins, who elaborates:

I [Adkins] am having a tough time with a dataset including 40K obs and 8K subjects. Trying to estimate a 2 level logit with random intercept and age slope and about 13 fixed covariates. I have tried several R packages (lme4, lme4a, glmmPQL, MCMCglmm) and stata xtmelogit and gllamm to no avail. xtmelogit crashes from insufficient memory. The R packages yield false convergences. A simpler model w/ random intercept only gives stable estimates in lme4 with a very large number of quadrature point (nAGQ>220). When i try this (nAGQ=221) with the random age term, it doesn’t make it through a single iteration in 72 hours (have tried both w/ and w/out RE correlation). I am using a power desktop that is top of the line compared to anything other than a cluster. Have tried start values for fixed effects in lme4 and that doesn’t help (couldn’t figure out how to specify RE starts). Do you have any advice. Should I move to BUGS (no experience, so that would be a little painful)? Try as SEM in Mplus? Is there a trick in lme4 or gllamm, that might help? Cluster/more RAM? I am thinking of splitting the sample, modeling, then combining estimates via meta-analysis, but reviewers may balk at this.

Could you give any advice? I have done a lot of diagnostics and simpler models, so I know ~ what the parameter estimates are, but need the formal full models for a paper.

What with that 93% marginal tax rate [regular readers will know what I'm talking about], I might as well just answer this one for free . . .

My first question is what happened in gllamm? My general impression is that gllamm is more reliable than lme4 etc for nonlinear models. Another option is to try bglmer–the regularization in the priors should make the computation more stable. I don’t know Mplus so can’t say about that, but, no, I wouldn’t recommend Bugs–my guess is that would be too slow.

Another option is some sort of intermediate approach, such as: Fit one of those simpler models that you like. In that simpler model, compute and save the “linear predictor” (X*beta). Then use this as a single predictor (maybe it can have a varying slope also) in a multilevel model. Or something like that. If you have a simpler model that you understand, then you should be able to blaze a trail from there to the model you finally want to fit.

Adkins sent an update:

I was finally able to get stable, sensible estimates in lme4 after 1) rescaling age (to avoid estimating a very small random effect variance) and 2) moving to the 64-bit version of R. To answer your question, my sense was that gllamm would get stuck in a flat, slightly bumpy portion of the likelihood function and couldn’t get out to move toward the global optimum. Could have probably circumvented this with good starts but I figured things out in lme4 first.

Aaahhh–rescaling. That’s advice I gave someone else the other day, oddly enough. I should’ve thought of that.

Is bglmer available somewhere?

Very soon!

With a big dataset and these problems, I think we should move to compute the MCMC in C, calling it from R. Do you know of any site where there is a sample code to do this with hierarchical models? It would help us to adapt the code to fit our needs. The other day I was having trouble with the task of generate random number from a Wwishart Distribution in C.

Manoel

@Manoel: check out the MCMCglmm package (particularly the rIW function, which samples from the "conditional inverse Wishart distribution" based on C code), and which is GPL …

What with that 93% marginal tax rate [regular readers will know what I'm talking about], I might as well just answer this one for free . . .

As long as you're referencing this "fact", you should reference my previous post where I show this is fallacious.

Or it must have been debunked in detail in the blogosphere.

Yes, I was kidding about the 93% tax rate. I just threw that one out for the amusement of the regular readers.

Revolution analytics, RevoScaleR, maybe … try it out