Archive of posts filed under the Multilevel Modeling category.

Forking paths vs. six quick regression tips

Bill Harris writes: I know you’re on a blog delay, but I’d like to vote to raise the odds that my question in a comment to http://andrewgelman.com/2015/09/15/even-though-its-published-in-a-top-psychology-journal-she-still-doesnt-believe-it/ gets discussed, in case it’s not in your queue. It’s likely just my simple misunderstanding, but I’ve sensed two bits of contradictory advice in your writing: fit one complete model all at […]

Where the fat people at?

Pearly Dhingra points me to this article, “The Geographic Distribution of Obesity in the US and the Potential Regional Differences in Misreporting of Obesity,” by Anh Le, Suzanne Judd, David Allison, Reena Oza-Frank, Olivia Affuso, Monika Safford, Virginia Howard, and George Howard, who write: Data from BRFSS [the behavioral risk factor surveillance system] suggest that […]

Stunning breakthrough: Using Stan to map cancer screening!

Paul Alper points me to this article, Breast Cancer Screening, Incidence, and Mortality Across US Counties, by Charles Harding, Francesco Pompei, Dmitriy Burmistrov, Gilbert Welch, Rediet Abebe, and Richard Wilson. Their substantive conclusion is there’s too much screening going on, but here I want to focus on their statistical methods: Spline methods were used to […]

One thing I like about hierarchical modeling is that it is not just about criticism. It’s a way to improve inferences, not just a way to adjust p-values.

In an email exchange regarding the difficulty many researchers have in engaging with statistical criticism (see here for a recent example), a colleague of mine opined: Nowadays, promotion requires more publications, and in an academic environment, researchers are asked to do more than they can. So many researchers just work like workers in a product […]

rstanarm and more!

Ben Goodrich writes: The rstanarm R package, which has been mentioned several times on stan-users, is now available in binary form on CRAN mirrors (unless you are using an old version of R and / or an old version of OSX). It is an R package that comes with a few precompiled Stan models — […]

Plausibility vs. probability, prior distributions, and the garden of forking paths

I’ll start off this blog on the first work day of the new year with an important post connecting some ideas we’ve lately been talking a lot about. Someone rolls a die four times, and he tells you he got the numbers 1, 4, 3, 6. Is this a plausible outcome? Sure. Is it probable? […]

“Once I was told to try every possible specification of a dependent variable (count, proportion, binary indicator, you name it) in a regression until I find a significant relationship. That is it, no justification for choosing one specification over another besides finding significance. . . . In another occasion I was asked to re-write a theory section of a paper to reflect an incidental finding from our analysis, so that it shows up as if we were asking a question about the incidental finding and had come up with the supported hypothesis a priori. . . .”

Ethan Bolker points me to this discussion. My reply: As discussed in my paper with Hill and Yajima, I think the best approach is to analyze all comparisons rather than picking just some. If there is prior understanding that some comparisons are more important than others, that understanding can be included as predictors in the […]

Hierarchical modeling when you have only 2 groups: I still think it’s a good idea, you just need an informative prior on the group-level variation

Dan Chamberlain writes: I am working on a Bayesian analysis of some data from a randomized controlled trial comparing two different drugs for treating seizures in children. I have been using your book as a resource and I have a question about hierarchical modeling. If you have the time, I would greatly appreciate any advice […]

Judea Pearl and I briefly discuss extrapolation, causal inference, and hierarchical modeling

OK, I guess it looks like the Buzzfeed-style headlines are officially over. Anyway, Judea Pearl writes: I missed the discussion you had here about Econometrics: Instrument locally, extrapolate globally, which also touched on my work with Elias Bareinboim. So, please allow me to start a new discussion about extrapolation and external validity. First, two recent […]

Syllabus for my course on design and analysis of sample surveys

Here’s last year’s course plan. Maybe I’ll change it a bit, haven’t decided yet. The course number is Political Science 4365, and it’s also cross-listed in Statistics.

How does Brad Cooper analyze hierarchical survey data with post-stratification?

Laura Holder writes: I am working on a project involving a large survey data set and am interested in applying a model-based approach in the context of post-stratification (as you frequently discuss). I’m attempting to determine the most suitable approach for my circumstance. The data set I am working with is collected via a two […]

Inference from an intervention with many outcomes, not using “statistical significance”

Kate Casey writes: I have been reading your papers “Type S error rates for classical…” and “Why We (Usually) Don’t Have to Worry…” with great interest and would be grateful for your views on the appropriateness of a potentially related application. I have a non-hierarchical dataset of 28 individuals who participated in a randomized control […]

You won’t believe these stunning transformations: How to parameterize hyperpriors in hierarchical models?

Isaac Armstrong writes: I was working through your textbook “Data Analysis Using Regression and Multilevel/Hierarchical Models” but wanted to learn more and started working through your “Bayesian Data Analysis” text. I’ve got a few questions about your rat tumor example that I’d like to ask. I’ve been trying to understand one of the hierarchical models […]

3 new priors you can’t do without, for coefficients and variance parameters in multilevel regression

Partha Lahiri writes, in reference to my 2006 paper: I am interested in finding out a good prior for the regression coefficients and variance components in a multi-level setting. For concreteness, let’s say we have a model like the following: Level 1: Y_ijk | theta_ij ~(ind) N(theta_ij, sigma^2) Level 2: theta_ij | mu_i ~(ind) N( […]

Stop screaming already: Exaggeration of effects of fan distraction in NCAA basketball

John Ezekowitz writes: I have been reading your work on published effect sizes, and I thought you might be interested in this example, which is of small consequence but grates me as a basketball and data fan. Kevin Quealy and Justin Wolfers published an analysis in The NYT on fans’ effectiveness in causing road teams […]

3 postdoc opportunities you can’t miss—here in our group at Columbia! Apply NOW, don’t miss out!

Hey, just once, the Buzzfeed-style hype is appropriate. We have 3 amazing postdoc opportunities here, and you need to apply NOW. Here’s the deal: we’re working on some amazing projects. You know about Stan and associated exciting projects in computational statistics. There’s the virtual database query, which is the way I like to describe our […]

Hierarchical logistic regression in Stan: The untold story

Corey Yanofsky pointed me to a paper by Neal Beck, Estimating grouped data models with a binary dependent variable and fixed effects: What are the issues?, which begins: This article deals with a very simple issue: if we have grouped data with a binary dependent variable and want to include fixed effects (group specific intercepts) […]

Latest gay gene tabloid hype

The tabloid in question is the journal Nature, which along with Science and PPNAS (the Proceedings of the National Academy of Sciences, publisher of gems such as the himmicanes and hurricanes study) has in recent years become notorious for publishing flashy but unsubstantiated scientific claims. As Lord Acton never said, publicity corrupts, and absolute publicity […]

Mindset interventions are a scalable treatment for academic underachievement — or not?

Someone points me to this post by Scott Alexander, criticizing the work of psychology researcher Carol Dweck. Alexander looks carefully at an article, “Mindset Interventions Are A Scalable Treatment For Academic Underachievement,” by David Paunesku, Gregory Walton, Carissa Romero, Eric Smith, David Yeager, and Carol Dweck, and he finds the following: Among ordinary students, the […]

PMXStan: an R package to facilitate Bayesian PKPD modeling with Stan

From Yuan Xiong, David A James, Fei He, and Wenping Wang at Novartis. Full version of the poster here.