Archive of posts filed under the Bayesian Statistics category.

One-day workshop on causal inference (NYC, Sat. 16 July)

James Savage is teaching a one-day workshop on causal inference this coming Saturday (16 July) in New York using RStanArm. Here’s a link to the details: One-day workshop on causal inference. Here’s the course outline: How do prices affect sales? What is the uplift from a marketing decision? By how much will studying for an […]

Causal and predictive inference in policy research

Todd Rogers pointed me to a paper by Jon Kleinberg, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer that begins: Empirical policy research often focuses on causal inference. Since policy choices seem to depend on understanding the counterfactual—what happens with and without a policy—this tight link of causality and policy seems natural. While this link holds […]

Reproducible Research with Stan, R, knitr, Docker, and Git (with free GitLab hosting)

Jon Zelner recently developed a neat Docker packaging of Stan, R, and knitr for fully reproducible research. The first in his series of posts (with links to the next parts) is here: * Reproducibility, part 1 The post on making changes online and auto-updating results using GitLab’s continuous integration service is here: * GitLab continuous […]

Causal mediation

Judea Pearl points me to this discussion with Kosuke Imai at a conference on causal mediation. I continue to think that the most useful way to think about mediation is in terms of a joint or multivariate outcome, and I continue to think that if we want to understand mediation, we need to think about […]

Too good to be true: when overwhelming mathematics fails to convince

Gordon Danning points me to this news article by Lisa Zyga, “Why too much evidence can be a bad thing,” reporting on a paper by Lachlan Gunn and others. Their conclusions mostly seem reasonable, if a bit exaggerated. For example, I can’t believe this: The researchers demonstrated the paradox in the case of a modern-day […]

“Simple, Scalable and Accurate Posterior Interval Estimation”

Cheng Li, Sanvesh Srivastava, and David Dunson write: We propose a new scalable algorithm for posterior interval estimation. Our algorithm first runs Markov chain Monte Carlo or any alternative posterior sampling algorithm in parallel for each subset posterior, with the subset posteriors proportional to the prior multiplied by the subset likelihood raised to the full […]

Informative priors for treatment effects

Biostatistician Garnett McMillan writes: A PI recently completed a randomized trial where the experimental treatment showed a large, but not quite statistically significant (p=0.08) improvement over placebo. The investigators wanted to know how many additional subjects would be needed to achieve significance. This is a common question, which is very hard to answer for non-statistical […]

Short course on Bayesian data analysis and Stan 18-20 July in NYC!

Jonah Gabry, Vince Dorie, and I are giving a 3-day short course in two weeks. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version of R and the latest version of Stan, which is 2.10.) If problems occur please join the […]

Euro 2016 update

Big news out of Europe, everyone’s talking about soccer. Leo Egidi updated his model and now has predictions for the Round of 16: Here’s Leo’s report, and here’s his zipfile with data and Stan code. The report contains some ugly histograms showing the predictive distributions of goals to be scored in each game. The R […]

Brexit polling: What went wrong?

Commenter numeric writes: Since you were shilling for yougov the other day you might want to talk about their big miss on Brexit (off by 6% from their eve-of-election poll—remain up 2 on their last poll and leave up by 4 as of this posting). Fair enough: Had Yougov done well, I could use them […]

My talk tomorrow (Thurs) 10:30am at ICML in NYC

I’ll be speaking at the workshop on Data-Efficient Machine Learning. And here’s the schedule. I’ll be speaking on the following topic: Toward Routine Use of Informative Priors Bayesian statistics is typically performed using noninformative priors but the resulting inferences commonly make no sense and also can lead to computational problems as algorithms have to waste […]

YouGov uses Mister P for Brexit poll

Ben Lauderdale and Doug Rivers give the story: There has been a lot of noise in polling on the upcoming EU referendum. Unlike the polls before the 2015 General Election, which were in almost perfect agreement (though, of course, not particularly close to the actual outcome), this time the polls are in serious disagreement. Telephone […]

Reduced-dimensionality parameterizations for linear models with interactions

After seeing this post by Matthew Wilson on a class of regression models called “factorization machines,” Aki writes: In a typical machine learning way, this is called “machine”, but it would be also a useful model structure in Stan to make linear models with interactions, but with a reduced number of parameters. With a fixed […]

The answer is the Edlin factor

Garnett McMillan writes: You have argued about the pervasive role of the Garden of Forking Paths in published research. Given this influence, do you think that it is sensible to use published research to inform priors in new studies? My reply: Yes, I think you can use published research but in doing so you should […]

Stan makes Euro predictions! (now with data and code so you can fit your own, better model)

Leonardo Egidi writes: Inspired by your world cup model I fitted in Stan a model for the Euro Cup, which starts today, with two Poisson distributions for the goals scored at every match by the two teams (perfect prediction for the first match!). Data and code are here. Here’s the model, and here are the […]
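
For readers unfamiliar with this kind of model, here is a minimal sketch of the "two Poisson distributions" idea. It is not Leonardo's Stan model. It assumes the two teams' goal counts are independent, and the scoring rates (1.6 and 1.1) and the function names are made up for illustration. A fitted model would estimate the rates from data.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson distribution with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def match_outcome_probs(rate_a, rate_b, max_goals=15):
    """Return (P(A wins), P(draw), P(B wins)) for two independent Poisson goal counts.

    Independence between the two teams' goals is an assumption of this sketch.
    Scores above max_goals are ignored; for realistic rates that tail is negligible.
    """
    win_a = draw = win_b = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, rate_a) * poisson_pmf(goals_b, rate_b)
            if goals_a > goals_b:
                win_a += p
            elif goals_a == goals_b:
                draw += p
            else:
                win_b += p
    return win_a, draw, win_b


# Hypothetical scoring rates, chosen only to show how the function is called.
win_a, draw, win_b = match_outcome_probs(1.6, 1.1)
print(f"A wins: {win_a:.3f}, draw: {draw:.3f}, B wins: {win_b:.3f}")
```

Summing over a grid of scorelines is the same calculation behind a predictive distribution of goals per game, just reduced here to three outcome probabilities.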

Betancourt Binge (Video Lectures on HMC and Stan)

Even better than binging on Netflix, catch up on Michael Betancourt’s updated video lectures, just days after their live theatrical debut in Tokyo: Scalable Bayesian Inference with Hamiltonian Monte Carlo (YouTube, 1 hour) and Some Bayesian Modeling Techniques in Stan (YouTube, 1 hour 40 minutes). His previous videos have received very good reviews and they’re only […]

A Primer on Bayesian Multilevel Modeling using PyStan

Chris Fonnesbeck contributed our first PyStan case study (I wrote the abstract), in the form of a very nice Jupyter notebook. Daniel Lee and I had the pleasure of seeing him present it live as part of a course we were doing at Vanderbilt last week. A Primer on Bayesian Multilevel Modeling using PyStan This […]

Stan workshop this Thurs NYC

Jonah is speaking at the Bayesian Data Analysis meetup on Thursday night, “Stan Workshop. Life is precious: fix your sampling problems.” He’ll focus on common problems using MCMC and how to address them. For registration: http://www.meetup.com/bda-group/events/231650672/

Freak Punts on Leicester Bet

I went over to the Freakonomics website and found this story about Leicester City’s unexpected championship. Here’s Stephen Dubner: At the start of this season, British betting houses put Leicester’s chances of winning the league at 5,000-to-1, which seemed, if anything, perhaps too generous. My [Dubner’s] son Solomon again: SOLOMON DUBNER: What would you say […]

Stan on the beach

This came in the email one day: We have used the great software Stan to estimate bycatch levels of common dolphins (Delphinus delphis) in the Bay of Biscay from stranding data. We found that official estimates are underestimated by a full order of magnitude. We conducted both prior and likelihood sensitivity analyses: the […]