Here it is: Bayesian inference for A/B testing Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University Lauren Kennedy, Columbia Population Research Center, Columbia University Suppose we want to use empirical data to compare two or more decisions or treatment options. Classical statistical methods based on statistical significance and p-values break down […]

**Multilevel Modeling**category.

## Spatial patterns in crime: Where’s he gonna strike next?

Wouter Steenbeek writes: I am a criminologist and mostly do spatial analyses of crime patterns: where does crime occur and why in these neighborhoods / at these locations, and so on. Currently, I am thinking about offender decision-making behavior, specifically his ‘location choice’ of where to offend. Hey, how about criminologists instead of looking to […]

## An economist wrote in, asking why it would make sense to fit Bayesian hierarchical models instead of frequentist random effects.

An economist wrote in, asking why it would make sense to fit Bayesian hierarchical models instead of frequentist random effects. My reply: Short answer is that anything Bayesian can be done non-Bayesianly: just take some summary of the posterior distribution, call it an “estimator,” and there you go. Non-Bayesian can be regularized, it can use […]

## Forking paths said to be a concern in evaluating stock-market trading strategies

Kevin Lewis points us to this paper by Tarun Chordia, Amit Goyal, and Alessio Saretto. I have no disagreement with the substance, but I don’t like their statistical framework with that “false discoveries” thing, as I don’t think there are any true zeros. I believe that most possible trading strategies have very little effect but […]

## Bob’s talk at Berkeley, Thursday 22 March, 3 pm

It’s at the Institute for Data Science at Berkeley. Hierarchical Modeling in Stan for Pooling, Prediction, and Multiple Comparisons 22 March 2018, 3pm 190 Doe Library. UC Berkeley. And here’s the abstract: I’ll provide an end-to-end example of using R and Stan to carry out full Bayesian inference for a simple set of repeated binary […]

## Important statistical theory research project! Perfect for the stat grad students (or ambitious undergrads) out there.

Hey kids! Time to think about writing that statistics Ph.D. thesis. It would be great to write something on a cool applied project, but: (a) you might not be connected to a cool applied project, and you typically can’t do these on your own, you need collaborators who know what they’re doing and who care […]

## What prior to use for item-response parameters?

Joshua Pritkin writes: There is a Stan case study by Daniel Furr on a hierarchical two-parameter logistic item response model. My question is whether to model the covariance between log alpha and beta parameters. I asked Daniel Furr about this and he said, “The argument I would make for modelling the covariance is that it […]

## Bayes for estimating a small effect in the context of large variation

Shira Mitchell and Mariel Finucane, two statisticians at Mathematica Policy Research (that’s the policy-analysis organization, not the Wolfram software company) write: We here at Mathematica have questions about priors for a health policy evaluation. Here’s the setting: In our dataset, healthcare (per person per month) expenditures are highly variable (sd = $2500), but from prior […]

## Research project in London and Chicago to develop and fit hierarchical models for development economics in Stan!

Rachael Meager at the London School of Economics and Dean Karlan at Northwestern University write: We are seeking a Research Assistant skilled in R programming and the production of R packages. The successful applicant will have experience creating R packages accessible on github or CRAN, and ideally will have experience working with Rstan. The main […]

## Use multilevel modeling to correct for the “winner’s curse” arising from selection of successful experimental results

John Snow writes: I came across this blog by Milan Shen recently and thought you might find it interesting. A couple of things jumped out at me. It seemed like the so-called ‘Winner’s Curse’ is just another way of describing the statistical significance filter. It also doesn’t look like their correction method is very effective. […]

## What’s Wrong with “Evidence-Based Medicine” and How Can We Do Better? (My talk at the University of Michigan Friday 2pm)

Tomorrow (Fri 9 Feb) 2pm at the NCRC Research Auditorium (Building 10) at the University of Michigan: What’s Wrong with “Evidence-Based Medicine” and How Can We Do Better? Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University “Evidence-based medicine” sounds like a good idea, but it can run into problems when the […]

## 354 possible control groups; what to do?

Jonas Cederlöf writes: I’m a PhD student in economics at Stockholm University and a frequent reader of your blog. I have for a long time followed your quest in trying to bring attention to p-hacking and multiple comparison problems in research. I’m now myself faced with the aforementioned problem and want to at the very […]

## N=1 experiments and multilevel models

N=1 experiments are the hot new thing. Here are some things to read: Design and Implementation of N-of-1 Trials: A User’s Guide, edited by Richard Kravitz and Naihua Duan for the Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services (2014). Single-patient (n-of-1) trials: a pragmatic clinical decision methodology for patient-centered […]

## Looking at all possible comparisons at once: It’s not “overfitting” if you put it in a multilevel model

Rémi Gau writes: The human brain mapping conference is on these days and heard via tweeter about this Overfitting toolbox for fMRI studies that helps explore the multiplicity of analytical pipelines in a more systematic fashion. Reminded me a bit of your multiverse analysis: thought you might like the idea. The link is to a […]

## Stacking and multiverse

It’s a coincidence that there is another multiverse posting today. Recently Tim Disher asked in Stan discussion forum a question “Multiverse analysis – concatenating posteriors?” Tim refers to a paper “Increasing Transparency Through a Multiverse Analysis” by Sara Steegen, Francis Tuerlinckx, Andrew Gelman, and Wolf Vanpaemel. The abstract says Empirical research inevitably includes constructing a […]

## The multiverse in action!

In a recent paper, “Degrees of Freedom in Planning, Running, Analyzing, and Reporting Psychological Studies: A Checklist to Avoid p-Hacking,” Jelte Wicherts, Coosje Veldkamp, Hilde Augusteijn, Marjan Bakker, Robbie van Aert, and Marcel van Assen write: The designing, collecting, analyzing, and reporting of psychological studies entail many choices that are often arbitrary. The opportunistic use […]

## How to get a sense of Type M and type S errors in neonatology, where trials are often very small? Try fake-data simulation!

Tim Disher read my paper with John Carlin, “Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors,” and followed up with a question: I am a doctoral student conducting research within the field of neonatology, where trials are often very small, and I have long suspected that many intervention effects are potentially […]

## A Python program for multivariate missing-data imputation that works on large datasets!?

Alex Stenlake and Ranjit Lall write about a program they wrote for imputing missing data: Strategies for analyzing missing data have become increasingly sophisticated in recent years, most notably with the growing popularity of the best-practice technique of multiple imputation. However, existing algorithms for implementing multiple imputation suffer from limited computational efficiency, scalability, and capacity […]

## “Handling Multiplicity in Neuroimaging through Bayesian Lenses with Hierarchical Modeling”

Donald Williams points us to this new paper by Gang Chen, Yaqiong Xiao, Paul Taylor, Tracy Riggins, Fengji Geng, Elizabeth Redcay, and Robert Cox: In neuroimaging, the multiplicity issue may sneak into data analysis through several channels . . . One widely recognized aspect of multiplicity, multiple testing, occurs when the investigator fits a separate […]

## A debate about robust standard errors: Perspective from an outsider

A colleague pointed me to a debate among some political science methodologists about robust standard errors, and I told him that the topic didn’t really interest me because I haven’t found a use for robust standard errors in my own work. My colleague urged me to look at the debate more carefully, though, so I […]