Mister P wins again

Chad Kiewiet De Jonge, Gary Langer, and Sofi Sinozich write: This paper presents state-level estimates of the 2016 presidential election using data from the ABC News/Washington Post tracking poll and multilevel regression with poststratification (MRP). While previous implementations of MRP for election forecasting have relied on data from prior elections to establish poststratification targets for […]

“Bayesian Meta-Analysis with Weakly Informative Prior Distributions”

Donny Williams sends along this paper, with Philippe Rast and Paul-Christian Bürkner, and writes: This paper is similar to the Chung et al. avoiding boundary estimates papers (here and here), but we use fully Bayesian methods, and specifically the half-Cauchy prior. We show it performs as well as a fully informed prior based […]

Multilevel modeling in Stan improves goodness of fit — literally.

John McDonnell sends along this post he wrote with Patrick Foley on how they used item-response models in Stan to get better clothing fit for their customers: There’s so much about traditional retail that has been difficult to replicate online. In some senses, perfect fit may be the final frontier for eCommerce. Since at Stitch […]

Stan goes to the World Cup

Leo Egidi shares his 2018 World Cup model, which he’s fitting in Stan. But I don’t like this: First, something’s missing. Where’s the U.S.?? More seriously, what’s with that “16.74%” thing? So bogus. You might as well say you’re 66.31 inches tall. Anyway, as is often the case with Bayesian models, the point here is […]

Global shifts in the phenological synchrony of species interactions over recent decades

Heather Kharouba et al. write: Phenological responses to climate change (e.g., earlier leaf-out or egg hatch date) are now well documented and clearly linked to rising temperatures in recent decades. Such shifts in the phenologies of interacting species may lead to shifts in their synchrony, with cascading community and ecosystem consequences . . . We […]

The necessity—and the difficulty—of admitting failure in research and clinical practice

Bill Jefferys sends along this excellent newspaper article by Siddhartha Mukherjee, “A failure to heal,” about the necessity—and the difficulty—of admitting failure in research and clinical practice. Mukherjee writes: What happens when a clinical trial fails? This year, the Food and Drug Administration approved some 40 new medicines to treat human illnesses, including 13 for […]

Forking paths come from choices in data processing and also from choices in analysis

Michael Wiebe writes: I’m a PhD student in economics at UBC. I’m trying to get a good understanding of the garden of forking paths, and I have some questions about your paper with Eric Loken. You describe the garden of forking paths as “researcher degrees of freedom without fishing” (#3), where the researcher only performs […]

Against Screening

Matthew Simonson writes: I have a question that may be of interest to your readers (and even if not, I’d love to hear your response). I’ve been analyzing a dataset of over 100 Middle Eastern political groups (MAROB) to see how these groups react to government repression. Observations are at the group-year level and include […]

“This is a weakness of our Bayesian Data Analysis book: We don’t have a lot of examples with informative priors.”

Roy Tamura writes: I am trying to implement a recommendation you made a few years ago. In my clinical trial of drug versus placebo, patients were stratified into two cohorts and randomized within strata. Time to event is the endpoint with the proportional hazards regression with strata and treatment as independent factors. There is evidence […]

“I admire the authors for simply admitting they made an error and stating clearly and without equivocation that their original conclusions were not substantiated.”

David Allison writes: I hope you will consider covering this in your blog. I admire the authors for simply admitting they made an error and stating clearly and without equivocation that their original conclusions were not substantiated. More attention to the confusing effects of regression to the mean is warranted as is more praise for […]

Regularized Prediction and Poststratification (the generalization of Mister P)

This came up in comments recently so I thought I’d clarify the point. Mister P is MRP, multilevel regression and poststratification. The idea goes like this: 1. You want to adjust for differences between sample and population. Let y be your outcome of interest and X be your demographic and geographic variables you’d like to […]
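As a rough illustration of the poststratification step only (not the multilevel regression, and not necessarily how the full post lays it out), here is a minimal Python sketch. It assumes you already have an estimate of y for each demographic-by-geographic cell and a known population count for each cell. The function name `poststratify`, the cell labels, and all the numbers are hypothetical.

```python
def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of cell-level estimates.

    cell_estimates: dict mapping a cell label to the estimated outcome in that cell
    cell_counts:    dict mapping the same labels to population counts
    """
    if cell_estimates.keys() != cell_counts.keys():
        raise ValueError("estimates and counts must cover the same cells")
    total = sum(cell_counts.values())
    if total <= 0:
        raise ValueError("population counts must sum to a positive number")
    # Weight each cell's estimate by its share of the population.
    weighted_sum = sum(cell_estimates[cell] * cell_counts[cell] for cell in cell_estimates)
    return weighted_sum / total


# Hypothetical cells defined by (age group, region); values are made up.
cell_estimates = {
    ("young", "urban"): 0.60,
    ("young", "rural"): 0.45,
    ("old", "urban"): 0.50,
    ("old", "rural"): 0.35,
}
cell_counts = {
    ("young", "urban"): 300,
    ("young", "rural"): 100,
    ("old", "urban"): 400,
    ("old", "rural"): 200,
}

population_estimate = poststratify(cell_estimates, cell_counts)
print(round(population_estimate, 3))
```

In full MRP the cell estimates would come from a fitted multilevel model rather than being typed in by hand; the weighting step itself stays the same.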

How to reduce Type M errors in exploratory research?

Miao Yu writes: Recently, I found this piece [a news article by Janet Pelley, Sulfur dioxide pollution tied to degraded sperm quality, published in Chemical & Engineering News] and the original paper [Inverse Association between Ambient Sulfur Dioxide Exposure and Semen Quality in Wuhan, China, by Yuewei Liu, published in Environmental Science & Technology]. Air […]

Zero-excluding priors are probably a bad idea for hierarchical variance parameters

(This is Dan, but in quick mode) I was on the subway when I saw Andrew’s last post and it doesn’t strike me as a particularly great idea. So let’s take a look at the suggestion for 8 schools using a centred parameterization. This is not as comprehensive as doing a proper simulation study, but […]

Individual and aggregate causal effects: Social media and depression among teenagers

This one starts out as a simple story of correction of a statistical analysis and turns into an interesting discussion of causal inference for multilevel models. Michael Daly writes: I saw your piece on ‘Have Smartphones Destroyed a Generation’ and wanted to flag some of the associations underlying key claims in this debate (which is […]

Psychometrics corner: They want to fit a multilevel model instead of running 37 separate correlation analyses

Anouschka Foltz writes: One of my students has some data, and there is an issue with multiple comparisons. While trying to find out how to best deal with the issue, I came across your article with Martin Lindquist, “Correlations and Multiple Comparisons in Functional Imaging: A Statistical Perspective.” And while my student’s work does not […]

Using partial pooling when preparing data for machine learning applications

Geoffrey Simmons writes: I reached out to John Mount/Nina Zumel over at Win Vector with a suggestion for their vtreat package, which automates many common challenges in preparing data for machine learning applications. The default behavior for impact coding high-cardinality variables had been a naive Bayes approach, which I found to be problematic due to its multi-modal output (assigning […]

The Millennium Villages Project: a retrospective, observational, endline evaluation

Shira Mitchell et al. write (preprint version here if that link doesn’t work): The Millennium Villages Project (MVP) was a 10 year, multisector, rural development project, initiated in 2005, operating across ten sites in ten sub-Saharan African countries to achieve the Millennium Development Goals (MDGs). . . . In this endline evaluation of the MVP, […]

Fitting a hierarchical model without losing control

Tim Disher writes: I have been asked to run some regularized regressions on a small N high p situation, which for the primary outcome has led to more realistic coefficient estimates and better performance on cv (yay!). Rstanarm made this process very easy for me so I am grateful for it. I have now been […]

“The Internal and External Validity of the Regression Discontinuity Design: A Meta-Analysis of 15 Within-Study-Comparisons”

Jag Bhalla points to this post by Alex Tabarrok pointing to this paper, “The Internal and External Validity of the Regression Discontinuity Design: A Meta-Analysis of 15 Within-Study-Comparisons,” by Duncan Chaplin, Thomas Cook, Jelena Zurovac, Jared Coopersmith, Mariel Finucane, Lauren Vollmer, and Rebecca Morris, which reports that regression discontinuity (RD) estimation performed well in these […]

Justify my love

