Someone writes: I’m interested in learning more about data analysis techniques; I’ve bought books on Bayesian Statistics (including yours), on R programming, and on several other ‘related stuff’. Since I generally study this whenever I have some free time, I’m looking for sources that are meant for self study. Are there any sources that you […]

**Bayesian Statistics** category.

## Touch me, I want to feel your data.

(This is not a paper we wrote by mistake.) (This is also not Andrew.) (This is also really a blog post about an aspect of the paper, which mostly focuses on issues around visualisation and how visualisation can improve workflow. So you should read it.) Recently Australians have been living through a predictably ugly debate around […]

## How to design and conduct a subgroup analysis?

Brian MacGillivray writes: I’ve just published a paper that draws on your work on the garden of forking paths, as well as your concept of statistics as being the science of defaults. The article is called, “Characterising bias in regulatory risk and decision analysis: An analysis of heuristics applied in health technology appraisal, chemicals regulation, […]

## (It’s never a) Total Eclipse of the Prior

(This is not by Andrew) This is a paper we (Gelman, Simpson, Betancourt) wrote by mistake. The paper in question, recently arXiv’d, is called “The prior can generally only be understood in the context of the likelihood”. How the sausage was made Now, to be very clear (and because I’ve been told since I moved […]

## Rosenbaum (1999): Choice as an Alternative to Control in Observational Studies

Winston Lin wrote in a blog comment earlier this year: Paul Rosenbaum’s 1999 paper “Choice as an Alternative to Control in Observational Studies” is really thoughtful and well-written. The comments and rejoinder include an interesting exchange between Manski and Rosenbaum on external validity and the role of theories. And here it is. Rosenbaum begins: In […]

## Iterative importance sampling

Aki points us to some papers:

- Langevin Incremental Mixture Importance Sampling
- Parallel Adaptive Importance Sampling
- Iterative importance sampling algorithms for parameter estimation problems
- Black-box Importance Sampling (this one is not iterative, but is interesting in another way)

Importance sampling is what you call it when you’d like to have draws of theta from some target distribution […]
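The idea in that last sentence can be sketched in a few lines (this is a generic self-normalized importance sampler, not code from any of the papers above; the target and proposal densities are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Estimate E[theta] under a target p using draws from a proposal q.
# Here target p = N(2, 1) and proposal q = N(0, 2), chosen arbitrarily.
def log_p(x):
    return -0.5 * (x - 2.0) ** 2  # unnormalized log target density

draws = rng.normal(0.0, 2.0, size=200_000)        # theta ~ q
log_q = -0.5 * (draws / 2.0) ** 2 - np.log(2.0)   # log q, up to a constant
log_w = log_p(draws) - log_q                      # log importance ratios
w = np.exp(log_w - log_w.max())                   # stabilize before normalizing
w /= w.sum()                                      # self-normalized weights

estimate = float(np.sum(w * draws))
print(round(estimate, 1))  # ≈ 2.0, the target mean
```

Because the weights are normalized to sum to one, the target density only needs to be known up to a constant, which is the usual situation in Bayesian computation.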

## Chris Moore, Guy Molyneux, Etan Green, and David Daniels on Bayesian umpires

Kevin Lewis points us to a paper by Etan Green and David Daniels, who conclude that “decisions of [baseball] umpires reflect an accurate, probabilistic, and state-specific understanding of their rational expectations—as well as an ability to integrate those prior beliefs in a manner that approximates Bayes rule.” This is similar to what was found in […]
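For readers unfamiliar with what “integrate those prior beliefs in a manner that approximates Bayes rule” would mean for an umpire, here is a toy calculation (all numbers invented, not taken from the paper): a count-specific prior probability that the pitch is a strike is combined with the likelihood of the umpire’s noisy perception of its location.

```python
# Toy Bayes-rule update for a borderline-looking pitch (invented numbers).
prior_strike = 0.7                       # prior P(strike) in this count
p_borderline_given_strike = 0.3          # P(looks borderline | strike)
p_borderline_given_ball = 0.6            # P(looks borderline | ball)

posterior = (prior_strike * p_borderline_given_strike) / (
    prior_strike * p_borderline_given_strike
    + (1 - prior_strike) * p_borderline_given_ball
)
print(round(posterior, 3))  # 0.538
```

A “state-specific understanding” in the paper’s sense corresponds to the prior changing with the count, which shifts the posterior even when the perceived location is identical.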

## It is somewhat paradoxical that good stories tend to be anomalous, given that when it comes to statistical data, we generally want what is typical, not what is surprising. Our resolution of this paradox is . . .

From a blog comment a few years ago regarding an article by Robert Kosara: As Thomas and I discuss in our paper [When Do Stories Work? Evidence and Illustration in the Social Sciences], it is somewhat paradoxical that good stories tend to be anomalous, given that when it comes to statistical data, we generally want […]

## Bigshot statistician keeps publishing papers with errors; is there anything we can do to get him to stop???

OK, here’s a paper with a true theorem but then some false corollaries. First the theorem: [theorem image] The above is actually ok. It’s all true. But then a few pages later comes the false statement: [excerpt image] This is just wrong, for two reasons. First, the relevant reference distribution is discrete uniform, not continuous uniform, so the normal […]
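The excerpt cuts off before the details, but the discrete-versus-continuous-uniform distinction it raises can be illustrated with a generic example (not taken from the paper in question): the rank of one draw among n i.i.d. continuous draws is discrete uniform on {1, …, n}, and treating it as continuous uniform on (0, 1) would be a mistake.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank of the first of n = 5 i.i.d. normal draws among all five:
# it can only take the values 1..5, each with probability 1/5.
n = 5
draws = rng.normal(size=(100_000, n))
rank_of_first = (draws <= draws[:, :1]).sum(axis=1)  # rank of column 0 per row

values, counts = np.unique(rank_of_first, return_counts=True)
print(values)               # [1 2 3 4 5] -- a discrete support
print(counts / len(draws))  # each ≈ 0.2: discrete uniform, not continuous
```

Any procedure that assumes a continuous uniform reference distribution for such a statistic will miscalibrate, which is the flavor of error the post describes.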

## “This finding did not reach statistical significance, but it indicates a 94.6% probability that statins were responsible for the symptoms.”

Charles Jackson writes: The attached item from JAMA, which I came across in my doctor’s waiting room, contains the statements: Nineteen of 203 patients treated with statins and 10 of 217 patients treated with placebo met the study definition of myalgia (9.4% vs 4.6%. P = .054). This finding did not reach statistical significance, but […]
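A quick arithmetic check on the quoted numbers, and on where the headline figure presumably comes from: the “94.6% probability” appears to be nothing more than one minus the reported p-value, which is not a posterior probability that statins caused anything.

```python
# Recomputing the JAMA numbers quoted above.
statin_events, statin_n = 19, 203
placebo_events, placebo_n = 10, 217

p_statin = statin_events / statin_n      # proportion with myalgia, statin arm
p_placebo = placebo_events / placebo_n   # proportion with myalgia, placebo arm
print(round(p_statin, 3), round(p_placebo, 3))  # 0.094 0.046

p_value = 0.054                          # as reported in the article
print(round(1 - p_value, 3))             # 0.946 -- the "94.6%" in the headline
```

Interpreting 1 − p as the probability that the treatment caused the effect is a classic misreading of what a p-value is.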

## Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by recent nicely done videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for. Nor have I asked him. However, given similar material I and others have tried out in the past that […]

## Died in the Wool

Garrett M. writes: I’m an analyst at an investment management firm. I read your blog daily to improve my understanding of statistics, as it’s central to the work I do. I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the […]

## Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version […]

## Hey—here are some tools in R and Stan to design more effective clinical trials! How cool is that?

In statistical work, design and data analysis are often considered separately. Sometimes we do all sorts of modeling and planning in the design stage, only to analyze data using simple comparisons. Other times, we design our studies casually, even thoughtlessly, and then try to salvage what we can using elaborate data analyses. It would be […]

## What is “overfitting,” exactly?

This came from Bob Carpenter on the Stan mailing list: It’s not overfitting so much as model misspecification. I really like this line. If your model is correct, “overfitting” is impossible. In its usual form, “overfitting” comes from using too weak of a prior distribution. One might say that “weakness” of a prior distribution is […]
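Bob’s point can be made concrete with a small sketch (toy data and penalty values invented for illustration): ridge regression corresponds to a normal prior on the coefficients, and weakening that prior, i.e. shrinking the penalty toward zero, drives training error down while the fit to new data gets worse.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy regression with many coefficients relative to the sample size.
n, p = 20, 15
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[0] = 1.0                       # only one coefficient is nonzero
y = X @ beta_true + rng.normal(size=n)

X_test = rng.normal(size=(1000, p))
y_test = X_test @ beta_true + rng.normal(size=1000)

results = {}
for lam in (100.0, 1.0, 1e-8):           # strong prior -> essentially no prior
    # Ridge solution = posterior mode under a normal(0, 1/lam) prior.
    beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    train_mse = np.mean((y - X @ beta_hat) ** 2)
    test_mse = np.mean((y_test - X_test @ beta_hat) ** 2)
    results[lam] = (train_mse, test_mse)
    print(f"lambda={lam:g}  train MSE={train_mse:.2f}  test MSE={test_mse:.2f}")
```

As the penalty shrinks, the training error falls (it is monotone in the penalty) while the gap between training and test error opens up, which is “overfitting” restated as too weak a prior.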

## Classical statisticians as Unitarians

[cat picture] Christian Robert, Judith Rousseau, and I wrote: Several of the examples in [the book under review] represent solutions to problems that seem to us to be artificial or conventional tasks with no clear analogy to applied work. “They are artificial and are expressed in terms of a survey of 100 individuals expressing support […]

## 3 things that will surprise you about model validation and calibration for state space models

Gurjinder Mohan writes: I was wondering if you had any advice specific to state space models when attempting model validation and calibration. I was planning on conducting a graphical posterior predictive check. I’d also recommend fake-data simulation. Beyond that, I’d need to know more about the example. I’m posting here because this seems like a […]
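In the spirit of the fake-data-simulation suggestion above, here is a minimal sketch for the simplest state space model, a local-level model with invented parameter values (not Gurjinder’s actual model): simulate states and observations from known parameters, run the standard Kalman filter, and check that the filtered states track the truth better than the raw observations do.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate fake data from a local-level model with known parameters.
T = 200
sigma_state, sigma_obs = 0.5, 1.0
states = np.cumsum(rng.normal(scale=sigma_state, size=T))  # random-walk state
y = states + rng.normal(scale=sigma_obs, size=T)           # noisy observations

# Kalman filter for the local-level model.
m, P = 0.0, 10.0                              # diffuse-ish initial mean, variance
filtered = np.empty(T)
for t in range(T):
    P_pred = P + sigma_state**2               # predict: state variance grows
    K = P_pred / (P_pred + sigma_obs**2)      # Kalman gain
    m = m + K * (y[t] - m)                    # update mean toward observation
    P = (1 - K) * P_pred                      # update variance
    filtered[t] = m

# Basic recovery check: filtering should beat the raw observations.
mse_filter = np.mean((filtered - states) ** 2)
mse_obs = np.mean((y - states) ** 2)
print(mse_filter < mse_obs)  # True
```

If the filter (or a fitted model) fails even this check on data simulated from known parameters, there is no reason to trust it on real data; graphical posterior predictive checks are the natural next step once parameters are estimated rather than known.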

## Statisticians and economists agree: We should learn from data by “generating and revising models, hypotheses, and data analyzed in response to surprising findings.” (That’s what Bayesian data analysis is all about.)

Kevin Lewis points us to this article by economist James Heckman and statistician Burton Singer, who write: All analysts approach data with preconceptions. The data never speak for themselves. Sometimes preconceptions are encoded in precise models. Sometimes they are just intuitions that analysts seek to confirm and solidify. A central question is how to revise […]