Archive of posts filed under the Bayesian Statistics category.

No tradeoff between regularization and discovery

We had a couple recent discussions regarding questionable claims based on p-values extracted from forking paths, and in both cases (a study “trying large numbers of combinations of otherwise-unused drugs against a large number of untreatable illnesses,” and a salami-slicing exercise looking for public opinion changes in subgroups of the population), I recommended fitting a […]
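The excerpt is cut off, but the title suggests the recommendation is a hierarchical model that partially pools the many comparisons. Here is a minimal base-R sketch of that idea with invented data (the numbers and the marginal-likelihood fit for tau are illustrative, not the analysis recommended in the post):

```r
# Invented setup: many drug-by-illness comparisons, each with a noisy
# estimate and a known standard error.
set.seed(123)
J <- 200
true_effect <- rnorm(J, 0, 0.1)    # most true effects are small
se <- rep(0.2, J)
estimate <- rnorm(J, true_effect, se)

# Unregularized analysis: how many comparisons clear p < 0.05?
mean(abs(estimate / se) > 1.96)

# Partial pooling: fit the group-level sd tau by marginal maximum likelihood,
# then shrink each estimate toward zero in proportion to its noise.
nll <- function(log_tau) {
  -sum(dnorm(estimate, 0, sqrt(exp(2 * log_tau) + se^2), log = TRUE))
}
tau <- exp(optimize(nll, c(-10, 2))$minimum)
shrinkage <- tau^2 / (tau^2 + se^2)
pooled <- shrinkage * estimate
cbind(raw = estimate[1:5], pooled = pooled[1:5])
```

The shrinkage pulls in the noisy estimates while still letting genuinely large effects stand out, which is the "no tradeoff" point of the title.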

Baseball, apple pie, and Stan

Ben sends along these two baseball job ads that mention experience with Stan as a preferred qualification: St. Louis Cardinals Baseball Development Analyst, and Tampa Bay Rays Baseball Research and Development Analyst.

“Bayesian evidence synthesis”

Donny Williams writes: My colleagues and I have a paper recently accepted in the journal Psychological Science in which we “bang” on Bayes factors. We explicitly show how the Bayes factor varies according to tau (I thought you might find this interesting for yourself and your blog’s readers). There is also a very nice figure. […]
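The paper's model isn't given in the excerpt, but a toy normal-normal meta-analysis shows the general phenomenon being described: the Bayes factor for a shared effect depends on the assumed between-study sd tau. The data and prior scale below are invented.

```r
# Toy meta-analysis: y_i ~ N(theta_i, s_i^2), theta_i ~ N(mu, tau^2).
# Compare H0: mu = 0 against H1: mu ~ N(0, 0.5^2) for several values of tau.
y <- c(0.30, 0.12, 0.45, 0.21, 0.18)   # invented study estimates
s <- c(0.15, 0.20, 0.25, 0.10, 0.18)   # invented standard errors

log_mvn0 <- function(y, Sigma) {       # log density of N(0, Sigma) at y
  k <- length(y)
  as.numeric(-0.5 * (k * log(2 * pi) + determinant(Sigma)$modulus +
                     t(y) %*% solve(Sigma, y)))
}

bf01 <- function(tau, prior_sd = 0.5) {
  V  <- diag(s^2 + tau^2)                                 # cov of y under H0
  V1 <- V + prior_sd^2 * matrix(1, length(y), length(y))  # cov under H1
  exp(log_mvn0(y, V) - log_mvn0(y, V1))
}

sapply(c(0, 0.1, 0.2, 0.5, 1), bf01)   # BF_01 moves as tau changes
```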

Mick Cooney: case study on modeling loss curves in insurance with RStan

This is great. Thanks, Mick! All the Stan case studies are here.

Partial pooling with informative priors on the hierarchical variance parameters: The next frontier in multilevel modeling

Ed Vul writes: In the course of tinkering with someone else’s hairy dataset with a great many candidate explanatory variables (some of which are largely orthogonal factors, but the ones of most interest are competing “binning” schemes of the same latent elements), I wondered about the following “model selection” strategy, which you may have alluded […]
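Relatedly, here is a minimal RStan sketch of what the title points to, assuming that "informative priors on the hierarchical variance parameters" means something like a half-normal prior with a modest scale on the group-level sd; the data are simulated, not Ed Vul's.

```r
library(rstan)

model_code <- "
data {
  int<lower=1> J;
  vector[J] y;
  vector<lower=0>[J] sigma_y;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] eta;
}
transformed parameters {
  vector[J] theta = mu + tau * eta;
}
model {
  mu ~ normal(0, 1);
  tau ~ normal(0, 0.5);      // informative half-normal prior on the group sd
  eta ~ normal(0, 1);        // non-centered parameterization
  y ~ normal(theta, sigma_y);
}
"

# Simulated data on a unit-ish scale so the prior scale above is sensible
set.seed(42)
J <- 10
theta_true <- rnorm(J, 0.2, 0.1)
fake <- list(J = J, y = rnorm(J, theta_true, 0.3), sigma_y = rep(0.3, J))

fit <- stan(model_code = model_code, data = fake)
print(fit, pars = c("mu", "tau"))
```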

Splines in Stan; Spatial Models in Stan!

Two case studies: Splines in Stan, by Milad Kharratzadeh. Spatial Models in Stan: Intrinsic Auto-Regressive Models for Areal Data, by Mitzi Morris. This is great. Thanks, Mitzi! Thanks, Milad!

Tenure-Track or Tenured Prof. in Machine Learning in Aalto, Finland

This job advertisement for a position at Aalto University, Finland, is by Aki: We are looking for a professor to either further strengthen our strong research fields, with keywords including statistical machine learning, probabilistic modelling, Bayesian inference, kernel methods, computational statistics, or to complement them with deep learning. Collaboration with other fields is welcome, with local opportunities […]

The house is stronger than the foundations

Oliver Maclaren writes: Regarding the whole ‘double use of data’ issue with posterior predictive checks [see here and, for a longer discussion, here], I just wanted to note that David Cox describes the ‘Fisherian reduction’ as (I’ve summarised slightly; see p. 24 of ‘Principles of Statistical Inference’): – Find the likelihood function – Reduce to […]
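For readers who haven't seen the "double use of data" point in concrete form, here is a tiny invented example of a posterior predictive check: the same data determine both the posterior and the test statistic being checked.

```r
# Normal model with known sd = 1 and flat prior, checked with T(y) = min(y).
set.seed(1)
y <- rnorm(50, 2, 1)
n <- length(y)

# Posterior for mu: N(mean(y), 1/n)
n_rep <- 4000
mu_draws <- rnorm(n_rep, mean(y), 1 / sqrt(n))

# Replicated datasets and the posterior predictive p-value
T_rep <- sapply(mu_draws, function(mu) min(rnorm(n, mu, 1)))
mean(T_rep <= min(y))   # ppp-value; values near 0 or 1 signal misfit
```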

I disagree with Tyler Cowen regarding a so-called lack of Bayesianism in religious belief

Tyler Cowen writes: I am frustrated by the lack of Bayesianism in most of the religious belief I observe. I’ve never met a believer who asserted: “I’m really not sure here. But I think Lutheranism is true with p = .018, and the next strongest contender comes in only at .014, so call me Lutheran.” […]

What am I missing and what will this paper likely lead researchers to think and do?

This post is by Keith. In a previous post, Ken Rice brought our attention to a recent paper he had published with Julian Higgins and Thomas Lumley (RHL). After I obtained access and read the paper, I made some critical comments regarding RHL, which ended with “Or maybe I missed something.” This post will try to discern […]

Should we worry about rigged priors? A long discussion.

Today’s discussion starts with Stuart Buck, who came across a post by John Cook linking to my post, “Bayesian statistics: What’s it all about?”. Cook wrote about the benefit of prior distributions in making assumptions explicit. Buck shared Cook’s post with Jon Baron, who wrote: My concern is that if researchers are systematically too optimistic […]

“Do statistical methods have an expiration date?” My talk at the University of Texas this Friday 2pm

Fri 6 Oct at the Seay Auditorium (room SEA 4.244): Do statistical methods have an expiration date? Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University There is a statistical crisis in science, particularly in psychology where many celebrated findings have failed to replicate, and where careful analysis has revealed that many […]

What you value should set out how you act and that how you represent what to possibly act upon: Aesthetics -> Ethics -> Logic.

I often include references to CS Peirce in my comments. Some might think way too often. However, this whole post will be trying to extract some morsels of insight from some of his later work, with the hope that it will enable applying statistics more thoughtfully. Now, making sense of Peirce, that is getting him […]

Getting the right uncertainties when fitting multilevel models

Cesare Aloisi writes: I am writing you regarding something I recently stumbled upon in your book Data Analysis Using Regression and Multilevel/Hierarchical Models which confused me, in hopes you could help me understand it. This book has been my reference guide for many years now, and I am extremely grateful for everything I learnt from […]

Will Stanton hit 61 home runs this season?

[edit: Juho Kokkala corrected my homework. Thanks! I updated the post. Also see some further elaboration in my reply to Andrew’s comment. As Andrew likes to say …] So far, Giancarlo Stanton has hit 56 home runs in 555 at bats over 149 games. Miami has 10 games left to play. What’s the chance he’ll […]
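One simple way to set up the calculation (not necessarily what the post does): a binomial model for home runs per at bat, a flat Beta prior on the rate, and a posterior predictive simulation of the remaining at bats. The at-bats-per-game figure is a rough guess.

```r
set.seed(2017)
hr <- 56; ab <- 555                 # Stanton so far, as quoted above
ab_left <- 10 * round(ab / 149)     # rough guess: at bats per game * 10 games

n_sims <- 100000
rate <- rbeta(n_sims, 1 + hr, 1 + ab - hr)   # posterior under a Beta(1,1) prior
hr_new <- rbinom(n_sims, ab_left, rate)      # posterior predictive for the rest
mean(hr + hr_new >= 61)                      # P(he reaches 61 or more)
```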

Using black-box machine learning predictions as inputs to a Bayesian analysis

Following up on this discussion [Designing an animal-like brain: black-box “deep learning algorithms” to solve problems, with an (approximately) Bayesian “consciousness” or “executive functioning organ” that attempts to make sense of all these inferences], Mike Betancourt writes: I’m not sure AI (or machine learning) + Bayesian wrapper would address the points raised in the paper. […]
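One much-simplified reading of that idea, with invented numbers: treat the black-box prediction as a noisy measurement of the quantity of interest, with an assumed error sd, and combine it with prior information in a normal-normal update.

```r
prior_mean <- 0;    prior_sd <- 1      # prior for the latent quantity
ml_pred    <- 0.8;  ml_sd    <- 0.5    # black-box prediction and its assumed error sd

post_var  <- 1 / (1 / prior_sd^2 + 1 / ml_sd^2)
post_mean <- post_var * (prior_mean / prior_sd^2 + ml_pred / ml_sd^2)
c(mean = post_mean, sd = sqrt(post_var))
```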

Causal inference using data from a non-representative sample

Dan Gibbons writes: I have been looking at using synthetic control estimates for estimating the effects of healthcare policies, particularly because, for county-level data, say, the nontreated comparison units one would use in a difference-in-differences estimator or a quantile DID estimator (if one didn’t want to use the mean) are not especially clear. However, given […]
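For readers unfamiliar with synthetic control, here is a bare-bones sketch of the weighting step with invented data and no covariates (just pre-treatment outcomes); it is not Gibbons's application.

```r
# Choose nonnegative donor weights summing to one so the weighted donors
# track the treated unit's pre-treatment path.
library(quadprog)

set.seed(7)
T_pre <- 12; J <- 20
X0 <- matrix(rnorm(T_pre * J, 10, 2), T_pre, J)   # donor counties, pre-period
x1 <- rowMeans(X0[, 1:3]) + rnorm(T_pre, 0, 0.2)  # treated county, pre-period

D <- 2 * crossprod(X0) + diag(1e-8, J)            # small ridge for stability
d <- as.vector(2 * crossprod(X0, x1))
A <- cbind(rep(1, J), diag(J))                    # sum(w) = 1, then w >= 0
b <- c(1, rep(0, J))

w <- solve.QP(D, d, A, b, meq = 1)$solution
round(w, 3)                                        # weights on donor counties
```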

Looking for the bottom line

I recommend this discussion of how to summarize posterior distributions. I don’t recommend summarizing by the posterior probability that the new treatment is better than the old treatment, as that is not a bottom-line statement!
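A quick invented example of why Pr(new better than old) alone is not a bottom line: two very different posteriors for the treatment difference can give essentially the same probability.

```r
set.seed(3)
diff_a <- rnorm(1e5, 0.01, 0.02)   # tiny effect, precisely estimated
diff_b <- rnorm(1e5, 2.00, 4.00)   # possibly large effect, very uncertain

c(p_a = mean(diff_a > 0), p_b = mean(diff_b > 0))   # both about 0.69
rbind(a = quantile(diff_a, c(.1, .5, .9)),
      b = quantile(diff_b, c(.1, .5, .9)))          # very different stakes
```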

Self-study resources for Bayes and Stan?

Someone writes: I’m interested in learning more about data analysis techniques; I’ve bought books on Bayesian Statistics (including yours), on R programming, and on several other ‘related stuff’. Since I generally study this whenever I have some free time, I’m looking for sources that are meant for self study. Are there any sources that you […]

Touch me, I want to feel your data.

(This is not a paper we wrote by mistake.) (This is also not Andrew.) (This is also really a post about an aspect of the paper, which mostly focusses on issues around visualisation and how visualisation can improve workflow. So you should read it.) Recently Australians have been living through a predictably ugly debate around […]