Archive of posts filed under the Bayesian Statistics category.

The fallacy of the excluded middle — statistical philosophy edition

I happened to come across this post from 2012 and noticed a point I’d like to share again. I was discussing an article by David Cox and Deborah Mayo, in which Cox wrote: [Bayesians’] conceptual theories are trying to do two entirely different things. One is trying to extract information from the data, while the […]

Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs

Andrew suggested I cross-post these from the Stan forums to his blog, so here goes. Maximum marginal likelihood and posterior approximations with Monte Carlo expectation maximization: I unpack the goal of max marginal likelihood and approximate Bayes with MMAP and Laplace approximations. I then go through the basic EM algorithm (with a traditional analytic example […]

“The most important aspect of a statistical analysis is not what you do with the data, it’s what data you use” (survey adjustment edition)

Dean Eckles pointed me to this recent report by Andrew Mercer, Arnold Lau, and Courtney Kennedy of the Pew Research Center, titled, “For Weighting Online Opt-In Samples, What Matters Most? The right variables make a big difference for accuracy. Complex statistical methods, not so much.” I like most of what they write, but I think […]

When LOO and other cross-validation approaches are valid

Introduction Zacco asked on Stan discourse whether leave-one-out (LOO) cross-validation is valid for phylogenetic models. He also referred to Dan’s excellent blog post, which mentioned the iid assumption. Instead of iid it would be better to talk about the exchangeability assumption, but I (Aki) got a bit lost in my discourse answer (so don’t bother to go […]

Continuous tempering through path sampling

Yuling prepared this poster summarizing our recent work on path sampling using a continuous joint distribution. The method is really cool and represents a real advance over what Xiao-Li and I were doing in our 1998 paper. It’s still gonna have problems in high or even moderate dimensions, and ultimately I think we’re gonna need […]

Awesome MCMC animation site by Chi Feng! On Github!

Sean Talts and Bob Carpenter pointed us to this awesome MCMC animation site by Chi Feng. For instance, here’s NUTS on a banana-shaped density. This is indeed super-cool, and maybe there’s a way to connect these with Stan/ShinyStan/Bayesplot so as to automatically make movies of Stan model fits. This would be great, both to help […]

Parsimonious principle vs integration over all uncertainties

tl;dr If you have bad models, bad priors or bad inference choose the simplest possible model. If you have good models, good priors, good inference, use the most elaborate model for predictions. To make interpretation easier you may use a smaller model with similar predictive performance as the most elaborate model. Merijn Mestdagh emailed me […]

“The idea of replication is central not just to scientific practice but also to formal statistics . . . Frequentist statistics relies on the reference set of repeated experiments, and Bayesian statistics relies on the prior distribution which represents the population of effects.”

Rolf Zwaan (who we last encountered here in “From zero to Ted talk in 18 simple steps”), Alexander Etz, Richard Lucas, and M. Brent Donnellan wrote an article, “Making replication mainstream,” which begins: Many philosophers of science and methodologists have argued that the ability to repeat studies and obtain similar results is an essential component […]

Mister P wins again

Chad Kiewiet De Jonge, Gary Langer, and Sofi Sinozich write: This paper presents state-level estimates of the 2016 presidential election using data from the ABC News/Washington Post tracking poll and multilevel regression with poststratification (MRP). While previous implementations of MRP for election forecasting have relied on data from prior elections to establish poststratification targets for […]

“Bayesian Meta-Analysis with Weakly Informative Prior Distributions”

Donny Williams sends along this paper, with Philippe Rast and Paul-Christian Bürkner, and writes: This paper is similar to the Chung et al. avoiding-boundary-estimates papers (here and here), but we use fully Bayesian methods, and specifically the half-Cauchy prior. We show it performs as well as a fully informed prior based […]

Joint inference or modular inference? Pierre Jacob, Lawrence Murray, Chris Holmes, and Christian Robert discuss the strengths and weaknesses of these choices

Pierre Jacob, Lawrence Murray, Chris Holmes, Christian Robert write: In modern applications, statisticians are faced with integrating heterogeneous data modalities relevant for an inference, prediction, or decision problem. In such circumstances, it is convenient to use a graphical model to represent the statistical dependencies, via a set of connected “modules”, each relating to a specific […]

Divisibility in statistics: Where is it needed?

The basic formula of Bayesian inference is p(parameters|data) proportional to p(parameters)*p(data|parameters). And, for predictions, p(predictions|data) = integral_parameters p(predictions|parameters,data)*p(parameters|data). In these expressions (and the corresponding simpler versions for maximum likelihood), “parameters” and “data” are unitary objects. Yes, it can be helpful to think of the parameter objects as being a list or vector of individual parameters; and […]
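For readability, the two plain-text relations in that excerpt can be written in standard notation (the symbols here are my shorthand, not the post's: θ for the parameters, y for the data, ỹ for the predictions):

```latex
p(\theta \mid y) \propto p(\theta)\, p(y \mid \theta)

p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta, y)\, p(\theta \mid y)\, d\theta
```

The first line is Bayes' rule up to the normalizing constant p(y); the second averages the predictive density over the posterior rather than plugging in a single point estimate.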

All of Life is 6 to 5 Against

Donny Williams writes: I have a question I have been considering asking you for a while. The more I have learned about Bayesian methods, including regularly reading the journal Bayesian Analysis (preparing a submission there, actually!), etc., I have come to see not only that frequency properties of Bayesian models are studied, but it is […]

Anyone want to run this Bayesian computing conference in 2022?

OK, people think I’m obsessive with a blog with a 6-month lag, but that’s nothing compared to some statistics conferences. Mylène Bédard sends this along for anyone who might be interested: The Bayesian Computation Section of ISBA is soliciting proposals to host its flagship conference: Bayes Comp 2022 The expectation is that the meeting will […]

Yes, but did it work? Evaluating variational inference

That’s the title of a recent article by Yuling Yao, Aki Vehtari, Daniel Simpson, and myself, which presents some diagnostics for variational approximations to posterior inference: We were motivated to write this paper by the success/failure of ADVI, the automatic variational inference algorithm devised by Alp Kucukelbir et al. The success was that ADVI solved […]

I am the supercargo

In a form of sympathetic magic, many built life-size replicas of airplanes out of straw and cut new military-style landing strips out of the jungle, hoping to attract more airplanes. – Wikipedia Twenty years ago, Geri Halliwell left the Spice Girls, so I’ve been thinking about Cargo Cults a lot. As an analogy for what […]

Opportunity for Comment!

(This is Dan) Last September, Jonah, Aki, Michael, Andrew and I wrote a paper on the role of visualization in the Bayesian workflow.  This paper is going to be published as a discussion paper in the Journal of the Royal Statistical Society Series A and the associated read paper meeting (where we present the paper and […]

Power analysis and NIH-style statistical practice: What’s the implicit model?

So. Following up on our discussion of “the 80% power lie,” I was thinking about the implicit model underlying NIH’s 80% power rule. Several commenters pointed out that, to have your study design approved by NIH, it’s not required that you demonstrate that you have 80% power for real; what’s needed is to show 80% […]

Bayesians are frequentists

Bayesians are frequentists. What I mean is, the Bayesian prior distribution corresponds to the frequentist sample space: it’s the set of problems for which a particular statistical model or procedure will be applied. I was thinking about this in the context of this question from Vlad Malik: I noticed this comment on Twitter in reference […]

Stan goes to the World Cup

Leo Egidi shares his 2018 World Cup model, which he’s fitting in Stan. But I don’t like this: First, something’s missing. Where’s the U.S.?? More seriously, what’s with that “16.74%” thing? So bogus. You might as well say you’re 66.31 inches tall. Anyway, as is often the case with Bayesian models, the point here is […]