OK, here’s a paper with a true theorem but then some false corollaries. First the theorem: The above is actually ok. It’s all true. But then a few pages later comes the false statement: This is just wrong, for two reasons. First, the relevant reference distribution is discrete uniform, not continuous uniform, so the normal […]

**Bayesian Statistics** category.

## “This finding did not reach statistical significance, but it indicates a 94.6% probability that statins were responsible for the symptoms.”

Charles Jackson writes: The attached item from JAMA, which I came across in my doctor’s waiting room, contains the statements: Nineteen of 203 patients treated with statins and 10 of 217 patients treated with placebo met the study definition of myalgia (9.4% vs 4.6%. P = .054). This finding did not reach statistical significance, but […]
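The arithmetic behind the quoted numbers can be checked with a short sketch. It assumes a pooled two-proportion z-test, which is an assumption here: the excerpt does not say which test the JAMA authors used, so the p-value may differ slightly from the reported .054. The function name `two_proportion_z_test` is hypothetical. Note that 94.6% is just 1 minus .054; a p-value is computed assuming no difference between groups, so its complement is not the probability that statins caused the symptoms.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test.

    Assumption: this is one common test for comparing two proportions,
    not necessarily the one used in the quoted paper.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common rate under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the quoted passage: 19 of 203 (statins), 10 of 217 (placebo).
z, p_value = two_proportion_z_test(19, 203, 10, 217)
print(f"statin rate:  {19 / 203:.1%}")
print(f"placebo rate: {10 / 217:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
# The complement of the p-value, which the quoted text labels a "probability
# that statins were responsible"; it is not a posterior probability.
print(f"1 - p = {1 - p_value:.3f}")
```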

## Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by recent nicely done videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for. Nor have I asked him. However, given similar material I and others have tried out in the past that […]

## Died in the Wool

Garrett M. writes: I’m an analyst at an investment management firm. I read your blog daily to improve my understanding of statistics, as it’s central to the work I do. I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the […]

## Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version […]

## Hey—here are some tools in R and Stan to designing more effective clinical trials! How cool is that?

In statistical work, design and data analysis are often considered separately. Sometimes we do all sorts of modeling and planning in the design stage, only to analyze data using simple comparisons. Other times, we design our studies casually, even thoughtlessly, and then try to salvage what we can using elaborate data analyses. It would be […]

## What is “overfitting,” exactly?

This came from Bob Carpenter on the Stan mailing list: It’s not overfitting so much as model misspecification. I really like this line. If your model is correct, “overfitting” is impossible. In its usual form, “overfitting” comes from using too weak of a prior distribution. One might say that “weakness” of a prior distribution is […]

## Classical statisticians as Unitarians

[cat picture] Christian Robert, Judith Rousseau, and I wrote: Several of the examples in [the book under review] represent solutions to problems that seem to us to be artificial or conventional tasks with no clear analogy to applied work. “They are artificial and are expressed in terms of a survey of 100 individuals expressing support […]

## 3 things that will surprise you about model validation and calibration for state space models

Gurjinder Mohan writes: I was wondering if you had any advice specific to state space models when attempting model validation and calibration. I was planning on conducting a graphical posterior predictive check. I’d also recommend fake-data simulation. Beyond that, I’d need to know more about the example. I’m posting here because this seems like a […]

## Statisticians and economists agree: We should learn from data by “generating and revising models, hypotheses, and data analyzed in response to surprising findings.” (That’s what Bayesian data analysis is all about.)

Kevin Lewis points us to this article by economist James Heckman and statistician Burton Singer, who write: All analysts approach data with preconceptions. The data never speak for themselves. Sometimes preconceptions are encoded in precise models. Sometimes they are just intuitions that analysts seek to confirm and solidify. A central question is how to revise […]

## Estimating Public Market Exposure of Private Capital Funds Using Bayesian Inference

I don’t know anything about this work by Luis O’Shea and Vishv Jeet—that is, I know nothing of public market exposure or private capital firms, and I don’t know anything about the model they fit, the data they used, or what information they had available for constructing and checking their model. But what I do […]

## Analyze all your comparisons. That’s better than looking at the max difference and trying to do a multiple comparisons correction.

[cat picture] The following email came in: I’m in a PhD program (poli sci) with a heavy emphasis on methods. One thing that my statistics courses emphasize, but that doesn’t get much attention in my poli sci courses, is the problem of simultaneous inferences. This strikes me as a problem. I am a bit unclear […]

## Not everyone’s aware of falsificationist Bayes

Stephen Martin writes: Daniel Lakens recently blogged about philosophies of science and how they relate to statistical philosophies. I thought it may be of interest to you. In particular, this statement: From a scientific realism perspective, Bayes Factors or Bayesian posteriors do not provide an answer to the main question of interest, which is the […]

## Breaking the dataset into little pieces and putting it back together again

Alex Konkel writes: I was a little surprised that your blog post with the three smaller studies versus one larger study question received so many comments, and also that so many people seemed to come down on the side of three smaller studies. I understand that Stephen’s framing led to some confusion as well as […]

## Don’t say “improper prior.” Say “non-generative model.”

[cat picture] In Bayesian Data Analysis, we write, “In general, we call a prior density p(θ) proper if it does not depend on data and integrates to 1.” This was a step forward from the usual understanding, which is that a prior density is improper if it has an infinite integral. But I’m not so thrilled with […]
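As a worked illustration of the quoted definition (the flat-prior example is added here for illustration and is not from the post), a prior density that does not depend on data is proper when it integrates to 1:

$$
\int p(\theta)\, d\theta = 1 .
$$

By contrast, a flat density $p(\theta) \propto 1$ on the whole real line cannot be normalized, because

$$
\int_{-\infty}^{\infty} 1 \, d\theta = \infty ,
$$

which is the usual sense in which such a prior is called improper.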

## Statistical Challenges of Survey Sampling and Big Data (my remote talk in Bologna this Thurs, 15 June, 4:15pm)

Statistical Challenges of Survey Sampling and Big Data. Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York. Big Data need Big Model. Big Data are typically convenience samples, not random samples; observational comparisons, not controlled experiments; available data, not measurements designed for a particular study. As a result, it is […]

## PhD student fellowship opportunity! in Belgium! to work with us! on the multiverse and other projects on improving the reproducibility of psychological research!!!

[image of Jip and Janneke dancing with a cat] Wolf Vanpaemel and Francis Tuerlinckx write: We at the Quantitative Psychology and Individual Differences, KU Leuven, Belgium are looking for a PhD candidate. The goal of the PhD research is to develop and apply novel methodologies to increase the reproducibility of psychological science. More information can […]