Archive of posts filed under the Bayesian Statistics category.

Bigshot statistician keeps publishing papers with errors; is there anything we can do to get him to stop???

OK, here’s a paper with a true theorem but then some false corollaries. First the theorem: The above is actually ok. It’s all true. But then a few pages later comes the false statement: This is just wrong, for two reasons. First, the relevant reference distribution is discrete uniform, not continuous uniform, so the normal […]

“This finding did not reach statistical significance, but it indicates a 94.6% probability that statins were responsible for the symptoms.”

Charles Jackson writes: The attached item from JAMA, which I came across in my doctor’s waiting room, contains the statements: Nineteen of 203 patients treated with statins and 10 of 217 patients treated with placebo met the study definition of myalgia (9.4% vs 4.6%; P = .054). This finding did not reach statistical significance, but […]
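The quoted numbers are easy to reproduce, and doing so shows where the “94.6%” comes from: it is just 1 minus the p-value, which is not the probability that statins caused the symptoms. A quick sketch (mine, not from the article) using a standard two-sample z-test for proportions:

```python
from math import sqrt
from scipy.stats import norm

# Counts from the JAMA excerpt: 19/203 myalgia on statins, 10/217 on placebo.
x1, n1 = 19, 203
x2, n2 = 10, 217
p1, p2 = x1 / n1, x2 / n2           # 9.4% vs 4.6%

# Two-sample z-test for proportions with a pooled standard error.
p_pool = (x1 + x2) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))       # two-sided; lands near the reported .054

# The article's "94.6%" is just 1 - p_value, which is not the posterior
# probability that statins were responsible for the symptoms.
print(round(p_value, 3), round(1 - p_value, 3))
```

The test statistic reproduces the reported P = .054; reading 1 − P as a probability of causation is exactly the mistake the post’s title is quoting.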

Seemingly intuitive and low math intros to Bayes never seem to deliver as hoped: Why?

This post was prompted by recent nicely done videos by Rasmus Baath that provide an intuitive and low math introduction to Bayesian material. Now, I do not know that these have delivered less than he hoped for. Nor have I asked him. However, given similar material I and others have tried out in the past that […]

Died in the Wool

Garrett M. writes: I’m an analyst at an investment management firm. I read your blog daily to improve my understanding of statistics, as it’s central to the work I do. I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the […]

“Bayes factor”: where the term came from, and some references to why I generally hate it

Someone asked: Do you know when this term was coined or by whom? Kass and Raftery’s use of the term as the title of their 1995 paper suggests that it was still novel then, but I have not noticed in the paper any information about where it started. I replied: According to Etz and Wagenmakers […]

Short course on Bayesian data analysis and Stan 23-25 Aug in NYC!

Jonah “ShinyStan” Gabry, Mike “Riemannian NUTS” Betancourt, and I will be giving a three-day short course next month in New York, following the model of our successful courses in 2015 and 2016. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version […]

Hey—here are some tools in R and Stan for designing more effective clinical trials! How cool is that?

In statistical work, design and data analysis are often considered separately. Sometimes we do all sorts of modeling and planning in the design stage, only to analyze data using simple comparisons. Other times, we design our studies casually, even thoughtlessly, and then try to salvage what we can using elaborate data analyses. It would be […]

What is “overfitting,” exactly?

This came from Bob Carpenter on the Stan mailing list: It’s not overfitting so much as model misspecification. I really like this line. If your model is correct, “overfitting” is impossible. In its usual form, “overfitting” comes from using too weak of a prior distribution. One might say that “weakness” of a prior distribution is […]
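Bob’s point can be illustrated with a toy example (mine, not from the post). Ridge regression is the posterior mode under a normal prior on the coefficients, so the prior scale directly controls how much an overly flexible model can chase noise:

```python
import numpy as np

rng = np.random.default_rng(0)

# True model: y = x + noise. Fit a degree-9 polynomial to 15 points.
n, degree = 15, 9
x = np.linspace(-1, 1, n)
y = x + rng.normal(0, 0.3, n)
X = np.vander(x, degree + 1)

def ridge_fit(X, y, lam):
    """Posterior mode under a normal(0, 1/sqrt(lam)) prior on each coefficient."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

# Held-out evaluation against the noiseless truth.
x_test = np.linspace(-1, 1, 200)
X_test = np.vander(x_test, degree + 1)

errs = {}
for lam in (1e-8, 1.0):   # near-flat prior vs. informative prior
    beta = ridge_fit(X, y, lam)
    errs[lam] = np.sqrt(np.mean((X_test @ beta - x_test) ** 2))
    print(f"lambda={lam:g}  test RMSE={errs[lam]:.3f}")
```

With the near-flat prior the degree-9 polynomial is free to track the noise in 15 data points; the informative prior shrinks the coefficients toward zero and typically gives lower held-out error. That is “overfitting” as a consequence of too weak a prior.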

Classical statisticians as Unitarians

[cat picture] Christian Robert, Judith Rousseau, and I wrote: Several of the examples in [the book under review] represent solutions to problems that seem to us to be artificial or conventional tasks with no clear analogy to applied work. “They are artificial and are expressed in terms of a survey of 100 individuals expressing support […]

3 things that will surprise you about model validation and calibration for state space models

Gurjinder Mohan writes: I was wondering if you had any advice specific to state space models when attempting model validation and calibration. I was planning on conducting a graphical posterior predictive check. I’d also recommend fake-data simulation. Beyond that, I’d need to know more about the example. I’m posting here because this seems like a […]
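Fake-data simulation, in its simplest form, means drawing data from known parameter values, re-fitting, and checking that the truth is recovered. A minimal sketch for a local-level state space model (my example, using moment estimates rather than a full Kalman-filter fit):

```python
import numpy as np

rng = np.random.default_rng(1)

# "True" parameter values we hope to recover.
T = 2000
sigma_state, sigma_obs = 1.0, 1.0

# Local-level model: latent random walk observed with noise.
state = np.cumsum(rng.normal(0, sigma_state, T))
y = state + rng.normal(0, sigma_obs, T)

# First differences d_t = y_t - y_{t-1} follow an MA(1) process with
# Var(d) = sigma_state^2 + 2*sigma_obs^2 and lag-1 Cov(d) = -sigma_obs^2,
# which gives simple moment estimates of both parameters.
d = np.diff(y)
lag1_cov = np.cov(d[:-1], d[1:])[0, 1]
sigma_obs_hat = np.sqrt(max(-lag1_cov, 0.0))
sigma_state_hat = np.sqrt(max(np.var(d) - 2 * sigma_obs_hat**2, 0.0))
print(sigma_state_hat, sigma_obs_hat)  # both should land near 1.0
```

If the recovered values sit far from the ones you simulated with, the fitting procedure is broken before any real data enters the picture; that is the point of the exercise, and it complements the graphical posterior predictive check.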

Statisticians and economists agree: We should learn from data by “generating and revising models, hypotheses, and data analyzed in response to surprising findings.” (That’s what Bayesian data analysis is all about.)

Kevin Lewis points us to this article by economist James Heckman and statistician Burton Singer, who write: All analysts approach data with preconceptions. The data never speak for themselves. Sometimes preconceptions are encoded in precise models. Sometimes they are just intuitions that analysts seek to confirm and solidify. A central question is how to revise […]

Bayesian, but not Bayesian enough

Will Moir writes: This short New York Times article on a study published in BMJ might be of interest to you and your blog community, both in terms of how the media reports science and also the use of bayesian vs frequentist statistics in the study itself. Here is the short summary from the news […]

Estimating Public Market Exposure of Private Capital Funds Using Bayesian Inference

I don’t know anything about this work by Luis O’Shea and Vishv Jeet—that is, I know nothing of public market exposure or private capital firms, and I don’t know anything about the model they fit, the data they used, or what information they had available for constructing and checking their model. But what I do […]

Analyze all your comparisons. That’s better than looking at the max difference and trying to do a multiple comparisons correction.

[cat picture] The following email came in: I’m in a PhD program (poli sci) with a heavy emphasis on methods. One thing that my statistics courses emphasize, but that doesn’t get much attention in my poli sci courses, is the problem of simultaneous inferences. This strikes me as a problem. I am a bit unclear […]

Not everyone’s aware of falsificationist Bayes

Stephen Martin writes: Daniel Lakens recently blogged about philosophies of science and how they relate to statistical philosophies. I thought it may be of interest to you. In particular, this statement: From a scientific realism perspective, Bayes Factors or Bayesian posteriors do not provide an answer to the main question of interest, which is the […]

Breaking the dataset into little pieces and putting it back together again

Alex Konkel writes: I was a little surprised that your blog post with the three smaller studies versus one larger study question received so many comments, and also that so many people seemed to come down on the side of three smaller studies. I understand that Stephen’s framing led to some confusion as well as […]

Don’t say “improper prior.” Say “non-generative model.”

[cat picture] In Bayesian Data Analysis, we write, “In general, we call a prior density p(θ) proper if it does not depend on data and integrates to 1.” This was a step forward from the usual understanding, which is that a prior density is improper if it has an infinite integral. But I’m not so thrilled with […]
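The definition is easy to check numerically (my illustration, not from the book): a proper density integrates to 1, while a flat prior p(θ) ∝ 1 accumulates unbounded mass as the range grows, so it cannot be normalized:

```python
import numpy as np
from scipy.integrate import quad

# A proper prior: the normal(0, 1) density integrates to 1.
normal_mass, _ = quad(lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi),
                      -np.inf, np.inf)

# An improper flat prior p(theta) ∝ 1: the mass over (-b, b) grows
# without bound as b increases, so no normalizing constant exists.
flat_masses = [quad(lambda t: 1.0, -b, b)[0] for b in (10, 100, 1000)]
print(normal_mass, flat_masses)
```

The flat prior can still yield a proper posterior once it is multiplied by a likelihood, which is part of why the terminology is slippery and why reframing it in terms of whether the model is generative is appealing.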

Ride a Crooked Mile

Joachim Krueger writes: As many of us rely (in part) on p values when trying to make sense of the data, I am sending a link to a paper Patrick Heck and I published in Frontiers in Psychology. The goal of this work is not to fan the flames of the already overheated debate, but […]

Statistical Challenges of Survey Sampling and Big Data (my remote talk in Bologna this Thurs, 15 June, 4:15pm)

Statistical Challenges of Survey Sampling and Big Data Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York Big Data need Big Model. Big Data are typically convenience samples, not random samples; observational comparisons, not controlled experiments; available data, not measurements designed for a particular study. As a result, it is […]

PhD student fellowship opportunity! in Belgium! to work with us! on the multiverse and other projects on improving the reproducibility of psychological research!!!

[image of Jip and Janneke dancing with a cat] Wolf Vanpaemel and Francis Tuerlinckx write: We at the Quantitative Psychology and Individual Differences, KU Leuven, Belgium are looking for a PhD candidate. The goal of the PhD research is to develop and apply novel methodologies to increase the reproducibility of psychological science. More information can […]