Archive of posts filed under the Bayesian Statistics category.

When do statistical rules affect drug approval?

Someone writes in: I have MS and take a disease-modifying drug called Copaxone. Sandoz developed a generic version of Copaxone and filed for FDA approval. Teva, the manufacturer of Copaxone, filed a petition opposing that approval (surprise!). FDA rejected Teva’s petitions and approved the generic. My insurance company encouraged me to switch to the generic. […]

Going beyond confidence intervals

Anders Lamberg writes: In an article by Tom Siegfried, Science News, July 3, 2014, “Scientists’ grasp of confidence intervals doesn’t inspire confidence” you are cited: “Gelman himself makes the point most clearly, though, that a 95 percent probability that a confidence interval contains the mean refers to repeated sampling, not any one individual interval.” I […]

Bayesian Linear Mixed Models using Stan: A tutorial for psychologists, linguists, and cognitive scientists

This article by Tanner Sorensen, Sven Hohenstein, and Shravan Vasishth might be of interest to some of you.

Moving statistical theory from a “discovery” framework to a “measurement” framework

Avi Adler points to this post by Felix Schönbrodt on “What’s the probability that a significant p-value indicates a true effect?” I’m sympathetic to the goal of better understanding what’s in a p-value (see for example my paper with John Carlin on type M and type S errors) but I really don’t like the framing […]

“Pointwise mutual information as test statistics”

Christian Bartels writes: Most of us will probably agree that making good decisions under uncertainty based on limited data is highly important but remains challenging. We have decision theory that provides a framework to reduce risks of decisions under uncertainty with typical frequentist test statistics being examples for controlling errors in absence of prior knowledge. […]

One-day workshop on causal inference (NYC, Sat. 16 July)

James Savage is teaching a one-day workshop on causal inference this coming Saturday (16 July) in New York using RStanArm. Here’s a link to the details: One-day workshop on causal inference Here’s the course outline: How do prices affect sales? What is the uplift from a marketing decision? By how much will studying for an […]

Causal and predictive inference in policy research

Todd Rogers pointed me to a paper by Jon Kleinberg, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer that begins: Empirical policy research often focuses on causal inference. Since policy choices seem to depend on understanding the counterfactual—what happens with and without a policy—this tight link of causality and policy seems natural. While this link holds […]

Reproducible Research with Stan, R, knitr, Docker, and Git (with free GitLab hosting)

Jon Zelner recently developed a neat Docker packaging of Stan, R, and knitr for fully reproducible research. The first in his series of posts (with links to the next parts) is here: * Reproducibility, part 1 The post on making changes online and auto-updating results using GitLab’s continuous integration service is here: * GitLab continuous […]

Causal mediation

Judea Pearl points me to this discussion with Kosuke Imai at a conference on causal mediation. I continue to think that the most useful way to think about mediation is in terms of a joint or multivariate outcome, and I continue to think that if we want to understand mediation, we need to think about […]

Too good to be true: when overwhelming mathematics fails to convince

Gordon Danning points me to this news article by Lisa Zyga, “Why too much evidence can be a bad thing,” reporting on a paper by Lachlan Gunn and others. Their conclusions mostly seem reasonable, if a bit exaggerated. For example, I can’t believe this: The researchers demonstrated the paradox in the case of a modern-day […]

“Simple, Scalable and Accurate Posterior Interval Estimation”

Cheng Li, Sanvesh Srivastava, and David Dunson write: We propose a new scalable algorithm for posterior interval estimation. Our algorithm first runs Markov chain Monte Carlo or any alternative posterior sampling algorithm in parallel for each subset posterior, with the subset posteriors proportional to the prior multiplied by the subset likelihood raised to the full […]

Informative priors for treatment effects

Biostatistician Garnett McMillan writes: A PI recently completed a randomized trial where the experimental treatment showed a large, but not quite statistically significant (p=0.08) improvement over placebo. The investigators wanted to know how many additional subjects would be needed to achieve significance. This is a common question, which is very hard to answer for non-statistical […]

Short course on Bayesian data analysis and Stan 18-20 July in NYC!

Jonah Gabry, Vince Dorie, and I are giving a 3-day short course in two weeks. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version of R and the latest version of Stan, which is 2.10.) If problems occur please join the […]

Euro 2016 update

Big news out of Europe, everyone’s talking about soccer. Leo Egidi updated his model and now has predictions for the Round of 16: Here’s Leo’s report, and here’s his zipfile with data and Stan code. The report contains some ugly histograms showing the predictive distributions of goals to be scored in each game. The R […]

Brexit polling: What went wrong?

Commenter numeric writes: Since you were shilling for yougov the other day you might want to talk about their big miss on Brexit (off by 6% from their eve-of-election poll—remain up 2 on their last poll and leave up by 4 as of this posting). Fair enough: Had Yougov done well, I could use them […]

My talk tomorrow (Thurs) 10:30am at ICML in NYC

I’ll be speaking at the workshop on Data-Efficient Machine Learning. And here’s the schedule. I’ll be speaking on the following topic: Toward Routine Use of Informative Priors Bayesian statistics is typically performed using noninformative priors but the resulting inferences commonly make no sense and also can lead to computational problems as algorithms have to waste […]

YouGov uses Mister P for Brexit poll

Ben Lauderdale and Doug Rivers give the story: There has been a lot of noise in polling on the upcoming EU referendum. Unlike the polls before the 2015 General Election, which were in almost perfect agreement (though, of course, not particularly close to the actual outcome), this time the polls are in serious disagreement. Telephone […]

Reduced-dimensionality parameterizations for linear models with interactions

After seeing this post by Matthew Wilson on a class of regression models called “factorization machines,” Aki writes: In a typical machine learning way, this is called “machine”, but it would also be a useful model structure in Stan to make linear models with interactions, but with a reduced number of parameters. With a fixed […]

The answer is the Edlin factor

Garnett McMillan writes: You have argued about the pervasive role of the Garden of Forking Paths in published research. Given this influence, do you think that it is sensible to use published research to inform priors in new studies? My reply: Yes, I think you can use published research but in doing so you should […]

Stan makes Euro predictions! (now with data and code so you can fit your own, better model)

Leonardo Egidi writes: Inspired by your world cup model I fitted in Stan a model for the Euro Cup which starts today, with two Poisson distributions for the goals scored at every match by the two teams (perfect prediction for the first match!). Data and code are here. Here’s the model, and here are the […]