Archive of posts filed under the Multilevel Modeling category.

Demystifying Blup

In our recent thread on computing hierarchical models with big datasets, someone brought up Blup. I thought it might be worth explaining what Blup is and how it relates to hierarchical models. Blup stands for Best Linear Unbiased Prediction, but in my terminology it’s just hierarchical modeling. Let me break it down: – “Best” doesn’t [...]

Hierarchical/multilevel modeling with “big data”

Dean Eckles writes: I make extensive use of random effects models in my academic and industry research, as they are very often appropriate. However, with very large data sets, I am not sure what to do. Say I have thousands of levels of a grouping factor, and the number of observations totals in the billions. [...]

17 groups, 6 group-level predictors: What to do?

Yi-Chun Ou writes:

Meta-analyses of impact evaluations of aid programs

Eva Vivalt points me to this. I don’t know anything about it, but I am intrigued by the idea of a meta-analysis being done outside of the usual channels.

Fixed effects and identification

Tom Clark writes: Drew Linzer and I [Tom] have been working on a paper about the use of modeled (“random”) and unmodeled (“fixed”) effects. Not directly in response to the paper, but in conversations about the topic over the past few months, several people have said to us things to the effect of “I prefer [...]

Check your missing-data imputations using cross-validation

Elena Grewal writes: I am currently using the iterative regression imputation model as implemented in the Stata ICE package. I am using data from a survey of about 90,000 students in 142 schools and my variable of interest is parent level of education. I want only this variable to be imputed with as little bias [...]

NSF program “to support analytic and methodological research in support of its surveys”

David Hogg points me to this announcement of a program from the National Center for Science and Engineering Statistics of the National Science Foundation: The Center would like to enhance its efforts to support analytic and methodological research in support of its surveys, and to engage in the education and training of researchers in the [...]

Modeling group-level predictors in a multilevel regression

Trey Causey writes: Do you have suggestions as to model selection strategies akin to Bayesian model averaging for multilevel models when level-2 inputs are of substantive interest? I [Causey] have seen plenty of R packages and procedures for non-multilevel models, and tried the glmulti package but found that it did not perform well with more [...]

As a Bayesian I want scientists to report their data non-Bayesianly

Philipp Doebler writes:

Multilevel modeling even when you’re not interested in predictions for new groups

Fred Wu writes:

Reference on longitudinal models?

Antonio Ramos writes: The book with Hill has very little on longitudinal models. So do you recommend any reference to complement your book on covariance structures typical of these models, such as AR(1), Antedependence, Factor Analytic, etc? I am very much interested in BUGS code for these basic models as well as how to extend [...]

How many data points do you really have?

Chris Harrison writes:

Meta-analysis, game theory, and incentives to do replicable research

One of the key insights of game theory is to solve problems in reverse time order. You first figure out what you would do in the endgame, then decide a middle-game strategy to get you where you want to be at the end, then you choose an opening that will take you on your desired [...]

How many parameters are in a multilevel model?

Stephen Collins writes: I’m reading your Multilevel modeling book and am trying to apply it to my work. I’m concerned with how to estimate a random intercept model if there are hundreds/thousands of levels. In the Gibbs sampling, am I sampling a parameter for each level? Or, just the hyper-parameters? In other words, say I [...]

R-squared for multilevel models

Fred Schiff writes:

Bayesian Anova found useful in ecology

David LeBauer points me to this article in PLoS One by Andy Hector, Thomas Bell, Yann Hautier, Forest Isbell, Marc Kéry, Peter Reich, Jasper van Ruijven, and Bernhard Schmid. Here’s the abstract: The idea that species diversity can influence ecosystem functioning has been controversial and its importance relative to compositional effects hotly debated. Unfortunately, assessing [...]

Statistical ethics violation

A colleague writes: When I was in NYC I went to this party held by a group of Japanese bio-scientists. There, one guy told me about how the biggest pharmaceutical company in Japan did their statistics. They ran 100 different tests and reported the most significant one. (This was in 2006 and he said they stopped doing [...]

The scope for snooping

Macartan Humphreys sent the following question to David Madigan and me:

Looking at many comparisons may increase the risk of finding something statistically significant by epidemiologists, a population with relatively low multilevel modeling consumption

To understand the above title, see here. Masanao writes: This report claims that eating meat increases the risk of cancer. I’m sure you can’t read the page but you probably can understand the graphs. Different bars represent subdivision in the amount of the particular type of meat one consumes. And each chunk is different types [...]

I got 99 comparisons but multiplicity ain’t one

After I gave my talk at an econ seminar on Why We (Usually) Don’t Care About Multiple Comparisons, I got the following comment: One question that came up later was whether your argument is really with testing in general, rather than only with testing in multiple comparison settings. My reply: Yes, my argument is with [...]

An interweaving-transformation strategy for boosting MCMC efficiency

Yaming Yu and Xiao-Li Meng write in with a cool new idea for improving the efficiency of Gibbs and Metropolis in multilevel models: For a broad class of multilevel models, there exist two well-known competing parameterizations, the centered parameterization (CP) and the non-centered parameterization (NCP), for effective MCMC implementation. Much literature has been devoted to [...]

Death!

This graph shows the estimate that Kenny Shirley and I have of support for the death penalty by sex and race in the U.S. since 1955: We also found that capital punishment used to be more popular in the Northeast than in the South, but now it’s the other way around. Here’s the abstract to [...]

More reason to like Sims besides just his name

John Horton points to Sims’s comment on Angrist and Pischke: Top of page 8: he criticizes economists for using clustered standard errors and suggests using multilevel models instead. Awesome! So now there are at least two Nobel prize winners in economics who’ve expressed skepticism about controlled experiments. (I wonder if Sims is such a danger in a parking [...]

Combining data from many sources

Mark Grote writes:

Analysis of Power Law of Participation

Rick Wash writes: A colleague at USC (Lian Jian) and I were recently discussing a statistical analysis issue that both of us have run into recently. We both mostly do research about how people use online interactive websites. One property that most of these systems have is known as the “power law of participation”: the [...]