Archive of posts filed under the Multilevel Modeling category.

The next Lancet retraction? [“Subcortical brain volume differences in participants with attention deficit hyperactivity disorder in children and adults”]

Someone who prefers to remain anonymous asks for my thoughts on this post by Michael Corrigan and Robert Whitaker, “Lancet Psychiatry Needs to Retract the ADHD-Enigma Study: Authors’ conclusion that individuals with ADHD have smaller brains is belied by their own data,” which begins: Lancet Psychiatry, a UK-based medical journal, recently published a study titled […]

Prediction model for fleet management

Chang writes: I am working on a fleet management system these days: basically, I am trying to predict the usage ‘y’ of our fleet in a zip code in the future. We have some factors ‘X’, such as number of active users, number of active merchants etc. If I can fix the time horizon, the […]

Let’s accept the idea that treatment effects vary—not as something special but just as a matter of course

Tyler Cowen writes: Does knowing the price lower your enjoyment of goods and services? I [Cowen] don’t quite agree with this as stated, as the experience of enjoying a bargain can make it more pleasurable, or at least I have seen this for many people. Some in fact enjoy the bargain only, not the actual […]

Mortality rate trends by age, ethnicity, sex, and state (link fixed)

There continues to be a lot of discussion on the purported increase in mortality rates among middle-aged white people in America. Actually an increase among women and not much change among men but you don’t hear so much about this as it contradicts the “struggling white men” story that we hear so much about in […]

Whassup, Pace investigators? You’re still hiding your data. C’mon dudes, loosen up. We’re getting chronic fatigue waiting for you already!

James Coyne writes: For those of you who have not heard of the struggle for release of the data from the publicly funded PACE trial of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome, you can access my [Coyne’s] initial call for release of the portion […]

“Bias” and “variance” are two ways of looking at the same thing. (“Bias” is conditional, “variance” is unconditional.)

Someone asked me about the distinction between bias and noise and I sent him some links. Then I thought this might interest some of you too, so here it is: Here’s a recent paper on election polling where we try to be explicit about what is bias and what is variance: And here are some […]

“A blog post that can help an industry”

Tim Bock writes: I understood how to address weights in statistical tests by reading Lu and Gelman (2003). Thanks. You may be disappointed to know that this knowledge allowed me to write software, which has been used to compute many billions of p-values. When I read your posts and papers on forking paths, I always […]

Cage match: Null-hypothesis-significance-testing meets incrementalism. Nobody comes out alive.

It goes like this. Null-hypothesis-significance-testing (NHST) only works when you have enough accuracy that you can confidently reject the null hypothesis. You get this accuracy from a large sample of measurements with low bias and low variance. But you also need a large effect size. Or, at least, a large effect size, compared to the […]

Facebook’s Prophet uses Stan

Sean Taylor, a research scientist at Facebook and Stan user, writes: I wanted to tell you about an open source forecasting package we just released called Prophet. I thought the readers of your blog might be interested in both the package and the fact that we built it on top of Stan. Under the hood, […]

Thanks for attending StanCon 2017!

Thank you all for coming and making the first Stan Conference a success! The organizers were blown away by how many people came to the first conference. We had over 150 registrants this year! StanCon 2017 video: The organizers managed to get a video stream on YouTube. We have over 1900 views since StanCon! (We lost […]

Looking for rigor in all the wrong places

My talk in the upcoming conference on Inference from Non Probability Samples, 16-17 Mar in Paris: “Looking for rigor in all the wrong places.” What do the following ideas and practices have in common: unbiased estimation, statistical significance, insistence on random sampling, and avoidance of prior information? All have been embraced as ways of enforcing […]

Two unrelated topics in one post: (1) Teaching useful algebra classes, and (2) doing more careful psychological measurements

Kevin Lewis and Paul Alper send me so much material, I think they need their own blogs. In the meantime, I keep posting the stuff they send me, as part of my desperate effort to empty my inbox. 1. From Lewis: “Should Students Assessed as Needing Remedial Mathematics Take College-Level Quantitative Courses Instead? A Randomized […]

Avoiding selection bias by analyzing all possible forking paths

Ivan Zupic points me to this online discussion of the article, Dwork et al. 2015, The reusable holdout: Preserving validity in adaptive data analysis. The discussants are all talking about the connection between adaptive data analysis and the garden of forking paths; for example, this from one commenter: The idea of adaptive data analysis is […]

fMRI clusterf******

Several people pointed me to this paper by Anders Eklund, Thomas Nichols, and Hans Knutsson, which begins: Functional MRI (fMRI) is 25 years old, yet surprisingly its most common statistical methods have not been validated using real data. Here, we used resting-state fMRI data from 499 healthy controls to conduct 3 million task group analyses. […]

What is the chance that your vote will decide the election? Ask Stan!

I was impressed by Pierre-Antoine Kremp’s open-source poll aggregator and election forecaster (all in R and Stan with an automatic data feed!) so I wrote to Kremp: I was thinking it could be fun to compute probability of decisive vote by state, as in this paper. This can be done with some not difficult but […]

Modeling statewide presidential election votes through 2028

David Leonhardt of the NYT asked a bunch of different people, including me, which of various Romney-won states in 2012 would be likely to be won by a Democrat in 2020, 2024, or 2028, and which of various Obama-won states would go for a Republican in any of those future years. If I’m going to […]

Some modeling and computational ideas to look into

Can we implement these in Stan? Marginally specified priors for non-parametric Bayesian estimation (by David Kessler, Peter Hoff, and David Dunson): Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of […]

Mister P can solve problems with survey weighting

It’s tough being a blogger who’s expected to respond immediately to topics in his area of expertise. For example, here’s Scott “fraac” Adams posting on 8 Oct 2016, post titled “Why Does This Happen on My Vacation? (The Trump Tapes).” After some careful reflection, Adams wrote, “My prediction of a 98% chance of Trump winning […]

Trump +1 in Florida; or, a quick comment on that “5 groups analyze the same poll” exercise

Nate Cohn at the New York Times arranged a comparative study on a recent Florida pre-election poll. He sent the raw data to four groups (Charles Franklin; Patrick Ruffini; Margie Omero, Robert Green, Adam Rosenblatt; and Sam Corbett-Davies, David Rothschild, and me) and asked each of us to analyze the data how we’d like to […]

Q: “Is A 50-State Poll As Good As 50 State Polls?” A: Use Mister P.

Jeff Lax points to this post from Nate Silver and asks for my thoughts. In his post, Nate talks about data quality issues of national and state polls. It’s a good discussion, but the one thing he unfortunately doesn’t talk about is multilevel regression and poststratification (or see here for more). What you want to […]