[edit: Juho Kokkala corrected my homework. Thanks! I updated the post. Also see some further elaboration in my reply to Andrew’s comment. As Andrew likes to say …] So far, Giancarlo Stanton has hit 56 home runs in 555 at bats over 149 games. Miami has 10 games left to play. What’s the chance he’ll […]

**Bayesian Statistics** category.

## Using black-box machine learning predictions as inputs to a Bayesian analysis

Following up on this discussion [Designing an animal-like brain: black-box “deep learning algorithms” to solve problems, with an (approximately) Bayesian “consciousness” or “executive functioning organ” that attempts to make sense of all these inferences], Mike Betancourt writes: I’m not sure AI (or machine learning) + Bayesian wrapper would address the points raised in the paper. […]

## Causal inference using data from a non-representative sample

Dan Gibbons writes: I have been looking at using synthetic control estimates for estimating the effects of healthcare policies, particularly because for say county-level data the nontreated comparison units one would use in say a difference-in-differences estimator or quantile DID estimator (if one didn’t want to use the mean) are not especially clear. However, given […]

## Self-study resources for Bayes and Stan?

Someone writes: I’m interested in learning more about data analysis techniques; I’ve bought books on Bayesian Statistics (including yours), on R programming, and on several other ‘related stuff’. Since I generally study this whenever I have some free time, I’m looking for sources that are meant for self study. Are there any sources that you […]

## Touch me, I want to feel your data.

(This is not a paper we wrote by mistake.) (This is also not Andrew) (This is also really a blog about an aspect of the paper, which mostly focusses on issues around visualisation and how visualisation can improve workflow. So you should read it.) Recently Australians have been living through a predictably ugly debate around […]

## How to design and conduct a subgroup analysis?

Brian MacGillivray writes: I’ve just published a paper that draws on your work on the garden of forking paths, as well as your concept of statistics as being the science of defaults. The article is called, “Characterising bias in regulatory risk and decision analysis: An analysis of heuristics applied in health technology appraisal, chemicals regulation, […]

## (It’s never a) Total Eclipse of the Prior

(This is not by Andrew) This is a paper we (Gelman, Simpson, Betancourt) wrote by mistake. The paper in question, recently arXiv’d, is called “The prior can generally only be understood in the context of the likelihood”. **How the sausage was made.** Now, to be very clear (and because I’ve been told since I moved […]

## Rosenbaum (1999): Choice as an Alternative to Control in Observational Studies

Winston Lin wrote in a blog comment earlier this year: Paul Rosenbaum’s 1999 paper “Choice as an Alternative to Control in Observational Studies” is really thoughtful and well-written. The comments and rejoinder include an interesting exchange between Manski and Rosenbaum on external validity and the role of theories. And here it is. Rosenbaum begins: In […]

## Iterative importance sampling

Aki points us to some papers:

- Langevin Incremental Mixture Importance Sampling
- Parallel Adaptive Importance Sampling
- Iterative importance sampling algorithms for parameter estimation problems

Next one is not iterative, but interesting in another way:

- Black-box Importance Sampling

Importance sampling is what you call it when you’d like to have draws of theta from some target distribution […]
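For readers who have not seen the basic (non-iterative) version, here is a minimal sketch of self-normalized importance sampling. It is not taken from any of the papers listed in this excerpt. The target (a standard normal), the proposal (a normal with standard deviation 2), and all function names are assumptions chosen only to keep the example small.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def importance_sample(n_draws, seed=0):
    """Draw theta from the proposal and weight each draw by target/proposal.

    Assumed setup (illustrative only): target is N(0, 1), proposal is N(0, 2).
    """
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, 2.0) for _ in range(n_draws)]
    weights = [normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in draws]
    return draws, weights


def weighted_mean(values, weights):
    """Self-normalized estimate: sum(w * h) / sum(w)."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def effective_sample_size(weights):
    """Common diagnostic: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


n = 20000
draws, weights = importance_sample(n)

# Estimate E[theta^2] under the target; the true value for N(0, 1) is 1.
second_moment = weighted_mean([x * x for x in draws], weights)
ess = effective_sample_size(weights)

print(f"estimated E[theta^2]: {second_moment:.3f}")
print(f"effective sample size: {ess:.0f} of {n}")
```

The proposal here is deliberately wider than the target, so the weights stay bounded. With a proposal narrower than the target, a few draws can dominate the weighted sum and the effective sample size collapses.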

## Chris Moore, Guy Molyneux, Etan Green, and David Daniels on Bayesian umpires

Kevin Lewis points us to a paper by Etan Green and David Daniels, who conclude that “decisions of [baseball] umpires reflect an accurate, probabilistic, and state-specific understanding of their rational expectations—as well as an ability to integrate those prior beliefs in a manner that approximates Bayes rule.” This is similar to what was found in […]

## It is somewhat paradoxical that good stories tend to be anomalous, given that when it comes to statistical data, we generally want what is typical, not what is surprising. Our resolution of this paradox is . . .

From a blog comment a few years ago regarding an article by Robert Kosara: As Thomas and I discuss in our paper [When Do Stories Work? Evidence and Illustration in the Social Sciences], it is somewhat paradoxical that good stories tend to be anomalous, given that when it comes to statistical data, we generally want […]

## Bigshot statistician keeps publishing papers with errors; is there anything we can do to get him to stop???

OK, here’s a paper with a true theorem but then some false corollaries. First the theorem: The above is actually ok. It’s all true. But then a few pages later comes the false statement: This is just wrong, for two reasons. First, the relevant reference distribution is discrete uniform, not continuous uniform, so the normal […]

## “This finding did not reach statistical significance, but it indicates a 94.6% probability that statins were responsible for the symptoms.”

Charles Jackson writes: The attached item from JAMA, which I came across in my doctor’s waiting room, contains the statements: Nineteen of 203 patients treated with statins and 10 of 217 patients treated with placebo met the study definition of myalgia (9.4% vs 4.6%; P = .054). This finding did not reach statistical significance, but […]
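The arithmetic in the quoted passage can be checked with a few lines of Python. The excerpt does not say which test the JAMA item used, so the pooled two-proportion z-test below is an assumption, and the function name is made up for this sketch. Note that the 94.6% in the title is numerically 1 minus the reported p-value of .054. A p-value is the probability of seeing data at least this extreme if there were no difference between groups, which is not the same as the probability that statins caused the symptoms.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for a difference in proportions.

    Assumption: this is one standard test, not necessarily the one
    used in the quoted study.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p1, p2, z, p_value


# Counts taken from the quoted passage: 19 of 203 (statins), 10 of 217 (placebo).
p_statin, p_placebo, z, p_value = two_proportion_z_test(19, 203, 10, 217)

print(f"statin rate:  {100 * p_statin:.1f}%")
print(f"placebo rate: {100 * p_placebo:.1f}%")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under this assumed test the p-value comes out a little above 0.05, in the same neighborhood as the reported .054. The small discrepancy is expected if the original analysis used a different procedure.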

## Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by recent nicely done videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for. Nor have I asked him. However, given similar material I and others have tried out in the past that […]

## Died in the Wool

Garrett M. writes: I’m an analyst at an investment management firm. I read your blog daily to improve my understanding of statistics, as it’s central to the work I do. I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the […]

## Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version […]

## Hey—here are some tools in R and Stan for designing more effective clinical trials! How cool is that?

In statistical work, design and data analysis are often considered separately. Sometimes we do all sorts of modeling and planning in the design stage, only to analyze data using simple comparisons. Other times, we design our studies casually, even thoughtlessly, and then try to salvage what we can using elaborate data analyses. It would be […]