Archive of posts filed under the Bayesian Statistics category.

Here’s the title of my talk at the New York R conference, 20 Apr 2018:

The intersection of Graphics and Bayes, a slice of the Venn diagram that’s a lot more crowded than you might realize

And here are some relevant papers:

[2003] A Bayesian formulation of exploratory data analysis and goodness-of-fit testing. International Statistical Review 71, 369–382. (Andrew Gelman)

[2004] Exploratory data analysis for complex models (with […]

Incorporating Bayes factor into my understanding of scientific information and the replication crisis

I was having this discussion with Dan Kahan, who was arguing that my ideas about type M and type S error, while mathematically correct, represent a bit of a dead end in that, if you want to evaluate statistically-based scientific claims, you’re better off simply using likelihood ratios or Bayes factors. Kahan would like to […]
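For two point hypotheses, the likelihood ratio Kahan has in mind is just the ratio of the two sampling densities at the observed estimate. A minimal sketch with made-up numbers (the effect sizes 0.2 and 0, and the standard error 0.1, are hypothetical, not from the discussion):

```python
import math

# Likelihood ratio between two point hypotheses for a normally
# distributed effect estimate (all numbers are made up).
def normal_logpdf(x, mu, sd):
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

y, se = 0.2, 0.1                      # observed estimate and its standard error
logp_h1 = normal_logpdf(y, 0.2, se)   # H1: true effect = 0.2
logp_h0 = normal_logpdf(y, 0.0, se)   # H0: true effect = 0
lr = math.exp(logp_h1 - logp_h0)
print(lr)  # ≈ 7.39, i.e. exp(2): data favor H1 over H0 by about 7:1
```

With composite hypotheses one would average the likelihood over a prior on the effect, which is what turns a likelihood ratio into a Bayes factor.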

Important statistical theory research project! Perfect for the stat grad students (or ambitious undergrads) out there.

Hey kids! Time to think about writing that statistics Ph.D. thesis. It would be great to write something on a cool applied project, but: (a) you might not be connected to a cool applied project, and you typically can’t do these on your own, you need collaborators who know what they’re doing and who care […]

Information flows both ways (Martian conspiracy theory edition)

A topic that arises from time to time in Bayesian statistics is the desire of analysts to propagate information in one direction, with no backwash, as it were. But the logic of Bayesian inference doesn’t work that way. If A and B are two uncertain statements, and A tells you something about B, then learning […]
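The symmetry is easy to see numerically. A minimal sketch with a hypothetical joint distribution over two binary statements A and B (the probabilities are invented for illustration): conditioning on B shifts the probability of A, so information cannot flow in one direction only.

```python
# Hypothetical joint probabilities over two binary statements A and B.
joint = {
    (True, True): 0.30,
    (True, False): 0.10,
    (False, True): 0.20,
    (False, False): 0.40,
}

def marginal_A(j):
    # P(A) = sum of joint probabilities where A is true
    return sum(p for (a, _), p in j.items() if a)

def conditional_A_given_B(j, b):
    # P(A | B = b) by Bayes' rule: joint over marginal
    num = sum(p for (a, bb), p in j.items() if a and bb == b)
    den = sum(p for (_, bb), p in j.items() if bb == b)
    return num / den

print(marginal_A(joint))                   # P(A) = 0.40
print(conditional_A_given_B(joint, True))  # P(A | B) = 0.30/0.50 = 0.60
```

Learning B raised the probability of A from 0.40 to 0.60; by the same arithmetic, learning A would shift the probability of B. There is no way to wire the joint distribution so that updating runs only one way.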

My talk this Wednesday at Stanford business school

It’s in the Organizational Behavior Seminar, Wed 7 Mar at noon in room E247: Toward replicable research in the human sciences: How can we get from where we are, to where we want to be? We’ve heard a lot about the replication crisis in science. Now it’s time to consider solutions from several directions including […]

What prior to use for item-response parameters?

Joshua Pritkin writes: There is a Stan case study by Daniel Furr on a hierarchical two-parameter logistic item response model. My question is whether to model the covariance between log alpha and beta parameters. I asked Daniel Furr about this and he said, “The argument I would make for modelling the covariance is that it […]

Bayes for estimating a small effect in the context of large variation

Shira Mitchell and Mariel Finucane, two statisticians at Mathematica Policy Research (that’s the policy-analysis organization, not the Wolfram software company) write: We here at Mathematica have questions about priors for a health policy evaluation. Here’s the setting: In our dataset, healthcare (per person per month) expenditures are highly variable (sd = $2500), but from prior […]

Rasmussen and Williams never said that Gaussian processes resolve the problem of overfitting

Apparently there’s an idea out there that Bayesian inference with Gaussian processes automatically avoids overfitting. But no, you can still overfit. To be precise, Bayesian inference by design avoids overfitting—if the evaluation is performed by averaging over the prior distribution. But to the extent this is not the case—to the extent that the frequency-evaluation distribution […]

The curse of dimensionality and finite vs. asymptotic convergence results

Related to our (Aki, Andrew, Jonah) Pareto smoothed importance sampling paper, I (Aki) have received a few times the comment: why bother with Pareto smoothing, when you can always choose the proposal distribution so that the importance ratios are bounded and the central limit theorem holds? The curse of dimensionality here is that the papers they […]
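The gap between the asymptotic guarantee and finite-sample behavior is easy to demonstrate. In this sketch (hypothetical target and proposal, not from the paper) the per-dimension importance ratio is bounded because the proposal has heavier tails than the target, yet the effective sample size still collapses as the dimension grows:

```python
import math
import random

random.seed(1)

def is_ess(d, n=2000):
    # Target: d-dim standard normal. Proposal: d-dim normal with sd 1.5.
    # Each one-dimensional ratio is bounded, but the product over d
    # dimensions becomes extremely variable as d grows.
    log_weights = []
    for _ in range(n):
        logw = 0.0
        for _ in range(d):
            x = random.gauss(0.0, 1.5)
            # log target density minus log proposal density at x
            logw += (-0.5 * x * x) - (-0.5 * (x / 1.5) ** 2 - math.log(1.5))
        log_weights.append(logw)
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    s = sum(w)
    # Kish effective sample size: (sum w)^2 / sum w^2
    return s * s / sum(wi * wi for wi in w)

for d in (1, 10, 50):
    print(d, round(is_ess(d)))
```

The CLT does hold here for every d, but in high dimensions the finite-sample estimate is dominated by a handful of draws, which is exactly the regime where diagnostics such as the Pareto k become useful.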

What’s Wrong with “Evidence-Based Medicine” and How Can We Do Better? (My talk at the University of Michigan Friday 2pm)

Tomorrow (Fri 9 Feb) 2pm at the NCRC Research Auditorium (Building 10) at the University of Michigan: What’s Wrong with “Evidence-Based Medicine” and How Can We Do Better? Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University “Evidence-based medicine” sounds like a good idea, but it can run into problems when the […]

354 possible control groups; what to do?

Jonas Cederlöf writes: I’m a PhD student in economics at Stockholm University and a frequent reader of your blog. I have for a long time followed your quest in trying to bring attention to p-hacking and multiple comparison problems in research. I’m now myself faced with the aforementioned problem and want to at the very […]

Eid ma clack shaw zupoven del ba.

When I say “I love you”, you look accordingly skeptical – Frida Hyvönen A few years back, Bill Callahan wrote a song about the night he dreamt the perfect song. In a fever, he woke and wrote it down before going back to sleep. The next morning, as he struggled to read his handwriting, he saw […]

Methodological terrorism. For reals. (How to deal with “what we don’t know” in missing-data imputation.)

Kevin Lewis points us to this paper, by Aaron Safer-Lichtenstein, Gary LaFree, Thomas Loughran, on the methodology of terrorism studies. This is about as close to actual “methodological terrorism” as we’re ever gonna see here. The linked article begins: Although the empirical and analytical study of terrorism has grown dramatically in the past decade and […]

N=1 experiments and multilevel models

N=1 experiments are the hot new thing. Here are some things to read: Design and Implementation of N-of-1 Trials: A User’s Guide, edited by Richard Kravitz and Naihua Duan for the Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services (2014). Single-patient (n-of-1) trials: a pragmatic clinical decision methodology for patient-centered […]

The Anti-Bayesian Moment and Its Passing

This bit of reconstructed intellectual history is from a few years ago but I thought it’s worth repeating. It comes from the rejoinder that X and I wrote to our article, “‘Not only defended but also applied’: The perceived absurdity of Bayesian inference.” The rejoinder is called “The anti-Bayesian moment and its passing,” and it […]

Stacking and multiverse

It’s a coincidence that there is another multiverse posting today. Recently Tim Disher asked a question in the Stan discussion forum: “Multiverse analysis – concatenating posteriors?” Tim refers to the paper “Increasing Transparency Through a Multiverse Analysis” by Sara Steegen, Francis Tuerlinckx, Andrew Gelman, and Wolf Vanpaemel. The abstract says: Empirical research inevitably includes constructing a […]

State-space modeling for poll aggregation . . . in Stan!

Peter Ellis writes: As part of familiarising myself with the Stan probabilistic programming language, I replicate Simon Jackman’s state space modelling with house effects of the 2007 Australian federal election. . . . It’s not quite the model that I’d use—indeed, Ellis writes, “I’m fairly new to Stan and I’m pretty sure my Stan programs […]

How we productized Bayesian revenue estimation with Stan

Markus Ojala writes: Bayesian modeling is becoming mainstream in many application areas. Applying it still requires a lot of knowledge about distributions and modeling techniques, but recent developments in probabilistic programming languages have made it much more tractable. Stan is a promising language that suits single-analysis cases well. With the improvements in approximation […]

We were measuring the speed of Stan incorrectly—it’s faster than we thought in some cases due to antithetical sampling

Aki points out that in cases of antithetical sampling, our effective sample size calculations were unduly truncated above at the number of iterations. It turns out the effective sample size can be greater than the number of iterations if the draws are anticorrelated. And all we really care about for speed is effective sample size […]
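The mechanism can be shown with a toy chain. In this sketch an AR(1) process with a negative coefficient stands in for antithetic MCMC output (a hypothetical substitute, not Stan's actual sampler or its ESS estimator): successive draws are anticorrelated, the sum of autocorrelations is negative, and the effective sample size computed from n / (1 + 2·Σρ) comes out larger than the number of draws.

```python
import random

random.seed(0)

# AR(1) chain with negative coefficient: successive draws anticorrelated.
phi, n = -0.7, 20000
x, chain = 0.0, []
for _ in range(n):
    x = phi * x + random.gauss(0.0, 1.0)
    chain.append(x)

mean = sum(chain) / n
var = sum((v - mean) ** 2 for v in chain) / n

def acf(lag):
    # Empirical autocorrelation at the given lag
    c = sum((chain[i] - mean) * (chain[i + lag] - mean)
            for i in range(n - lag)) / n
    return c / var

# Sum autocorrelations until they become negligible
s = 0.0
for lag in range(1, 100):
    r = acf(lag)
    if abs(r) < 0.01:
        break
    s += r

ess = n / (1 + 2 * s)   # ESS formula: n / (1 + 2 * sum of autocorrelations)
print(ess > n)          # True: anticorrelation pushes ESS above n
```

Truncating the ESS estimate at n, as the old calculation effectively did, throws away exactly this superefficiency.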

Static sensitivity analysis: Computing robustness of Bayesian inferences to the choice of hyperparameters

Ryan Giordano wrote: Last year at StanCon we talked about how you can differentiate under the integral to automatically calculate quantitative hyperparameter robustness for Bayesian posteriors. Since then, I’ve packaged the idea up into an R library that plays nice with Stan. You can install it from this github repo. I’m sure you’ll be pretty […]
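The differentiate-under-the-integral idea can be summarized by a standard local-sensitivity identity (stated here as background, not necessarily the exact form used in the package): when only the prior depends on the hyperparameter \(\alpha\), the derivative of a posterior expectation is a posterior covariance,

\[
\frac{\partial}{\partial \alpha} \, \mathbb{E}_{p(\theta \mid y, \alpha)}\!\left[g(\theta)\right]
= \operatorname{Cov}_{p(\theta \mid y, \alpha)}\!\left( g(\theta), \; \frac{\partial \log p(\theta \mid \alpha)}{\partial \alpha} \right),
\]

so the robustness of any posterior summary \(g\) can be estimated from ordinary MCMC draws by computing this covariance, with no refitting at perturbed hyperparameter values.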