Archive of posts filed under the Statistical computing category.

“Crimes Against Data”: My talk at Ohio State University this Thurs; “Solving Statistics Problems Using Stan”: My talk at the University of Michigan this Fri

Crimes Against Data: Statistics has been described as the science of uncertainty. But, paradoxically, statistical methods are often used to create a sense of certainty where none should exist. The social sciences have been rocked in recent years by highly publicized claims, published in top journals, that were reported as “statistically significant” but are implausible […]

Let’s play Twister, let’s play Risk

Alex Terenin, Dan Simpson, and David Draper write: Some months ago we shared with you an arXiv draft of our paper, Asynchronous Distributed Gibbs Sampling. Through comments we’ve received, for which we’re highly grateful, we came to understand that (a) our convergence proof was wrong, and (b) we actually have two algorithms, one exact and […]

Stan users group hits 2000 registrations

Of course, there are bound to be duplicate emails, dead emails, and people who picked up Stan, joined the list, and never came back. But still, that’s a lot of people who’ve expressed interest! It’s been an amazing ride that’s only going to get better as we learn more and continue to improve Stan’s speed […]

How paracompact is that?

Dominic on stan-users writes: I was reading through http://arxiv.org/pdf/1410.5110v1.pdf and came across a term with which I was not familiar: “paracompact.” I wrote a short blog post about it: https://idontgetoutmuch.wordpress.com/2016/04/17/every-manifold-is-paracompact. It may be of interest to other folks reading the aforementioned paper. I would have used a partition of unity to justify the corollary myself […]

Fast CAR: Two weird tricks for fast conditional autoregressive models in Stan

Max Joseph writes: Conditional autoregressive (CAR) models are popular as prior distributions for spatial random effects with areal spatial data. Historically, MCMC algorithms for CAR models have benefitted from efficient Gibbs sampling via full conditional distributions for the spatial random effects. But, these conditional specifications do not work in Stan, where the joint density needs […]
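For concreteness, here is a hedged R sketch (my own illustration, not code from the post) of the kind of joint density Stan works with: the CAR prior written directly as a zero-mean multivariate normal with precision matrix tau * (D - alpha * W).

```r
# Hedged sketch (not the post's code): the joint CAR prior for spatial effects
# phi, written as the joint density rather than as full conditionals.
# W is the binary adjacency matrix; D is diagonal with each region's number of
# neighbors; tau is a precision scale and alpha controls spatial dependence.
car_log_density <- function(phi, W, tau, alpha) {
  D <- diag(rowSums(W))
  Q <- tau * (D - alpha * W)                 # joint precision matrix
  n <- length(phi)
  as.numeric(0.5 * determinant(Q, logarithm = TRUE)$modulus -
             0.5 * t(phi) %*% Q %*% phi -
             0.5 * n * log(2 * pi))
}
```

The post’s “two weird tricks” are about making this kind of joint specification fast; the sketch above just shows the basic density being targeted.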

Free workshop on Stan for pharmacometrics (Paris, 22 September 2016); preceded by (non-free) three day course on Stan for pharmacometrics

So much for one post a day… Workshop: Stan for Pharmacometrics Day. If you are interested in a free day of Stan for pharmacometrics in Paris on 22 September 2016, see the registration page: Stan for Pharmacometrics Day (free workshop). Julie Bertrand (statistical pharmacologist from Paris-Diderot and UCL) has finalized the program: When | Who | What […]

A little story of the Folk Theorem of Statistical Computing

I know I promised I wouldn’t blog, but this one is so clean and simple. And I already wrote it for the stan-users list anyway so it’s almost no effort to post it here too: A colleague and I were working on a data analysis problem, had a very simple overdispersed Poisson regression with a […]

Some insider stuff on the Stan refactor

From the stan-dev list, Bob wrote [and has since added brms based on comments; the * packages are ones that aren’t developed or maintained by the stan-dev team, so we only know what we hear from their authors]: The bigger picture is this, and you see the stan-dev/stan repo really spans three logical layers: stan […]

Reproducible Research with Stan, R, knitr, Docker, and Git (with free GitLab hosting)

Jon Zelner recently developed a neat Docker packaging of Stan, R, and knitr for fully reproducible research. The first in his series of posts (with links to the next parts) is here:
* Reproducibility, part 1
The post on making changes online and auto-updating results using GitLab’s continuous integration service is here:
* GitLab continuous […]

“Simple, Scalable and Accurate Posterior Interval Estimation”

Cheng Li, Sanvesh Srivastava, and David Dunson write: We propose a new scalable algorithm for posterior interval estimation. Our algorithm first runs Markov chain Monte Carlo or any alternative posterior sampling algorithm in parallel for each subset posterior, with the subset posteriors proportional to the prior multiplied by the subset likelihood raised to the full […]

Short course on Bayesian data analysis and Stan 18-20 July in NYC!

Jonah Gabry, Vince Dorie, and I are giving a 3-day short course in two weeks. Before class everyone should install R, RStudio and RStan on their computers. (If you already have these, please update to the latest version of R and the latest version of Stan, which is 2.10.) If problems occur please join the […]
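A minimal sketch of the setup step (assuming installation from CRAN; R and RStudio themselves are installed separately from their own sites):

```r
# Minimal setup sketch (assumes CRAN; R and RStudio are installed separately).
install.packages("rstan", dependencies = TRUE)
library(rstan)
stan_version()   # check the Stan version; the course expects 2.10 or later
```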

Reduced-dimensionality parameterizations for linear models with interactions

After seeing this post by Matthew Wilson on a class of regression models called “factorization machines,” Aki writes: In a typical machine learning way, this is called a “machine,” but it would also be a useful model structure in Stan for building linear models with interactions but with a reduced number of parameters. With a fixed […]
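For reference, here is a hedged sketch of the usual factorization-machine form (my notation; the post and Aki’s comment may differ in details): each pairwise interaction coefficient is replaced by an inner product of low-dimensional factors,

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j, \qquad v_i \in \mathbb{R}^k,$$

so the $n(n-1)/2$ free interaction coefficients become $nk$ factor parameters, with $k$ chosen much smaller than $n$.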

Log Sum of Exponentials for Robust Sums on the Log Scale

This is a public service announcement in the interest of more robust numerical calculations. Like matrix inverse, exponentiation is bad news. It’s prone to overflow or underflow. Just try this in R:

> exp(-800)
> exp(800)

That’s not rounding error you see. The first one evaluates to zero (underflows) and the second to infinity (overflows). […]
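The usual fix (a minimal sketch here; the post spells out its own version) is the log-sum-exp trick: factor out the largest term before exponentiating.

```r
# Log-sum-exp trick (minimal sketch): compute log(sum(exp(x))) without
# overflow or underflow by factoring out the largest term.
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

log_sum_exp(c(-800, -801))  # about -799.69; naive log(exp(-800) + exp(-801)) gives -Inf
log_sum_exp(c(800, 801))    # about 801.31; the naive version overflows to Inf
```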

Betancourt Binge (Video Lectures on HMC and Stan)

Even better than bingeing on Netflix, catch up on Michael Betancourt’s updated video lectures, just days after their live theatrical debut in Tokyo.
* Scalable Bayesian Inference with Hamiltonian Monte Carlo (YouTube, 1 hour)
* Some Bayesian Modeling Techniques in Stan (YouTube, 1 hour 40 minutes)
His previous videos have received very good reviews and they’re only […]

A Primer on Bayesian Multilevel Modeling using PyStan

Chris Fonnesbeck contributed our first PyStan case study (I wrote the abstract), in the form of a very nice Jupyter notebook. Daniel Lee and I had the pleasure of seeing him present it live as part of a course we were doing at Vanderbilt last week: A Primer on Bayesian Multilevel Modeling using PyStan. This […]

Stan workshop this Thurs NYC

Jonah is speaking at the Bayesian Data Analysis meetup on Thursday night, “Stan Workshop. Life is precious: fix your sampling problems.” He’ll focus on common problems that come up when using MCMC and how to address them. For registration: http://www.meetup.com/bda-group/events/231650672/

Birthday analysis—Friday the 13th update, and some model checking

Carl Bialik and Andrew Flowers at fivethirtyeight.com (Nate Silver’s site) ran a story following up on our birthdays example—that time series decomposition of births by day, which appears on the cover of the third edition of Bayesian Data Analysis using data from 1968-1988, and which Aki then redid using a new dataset from 2000-2014. Friday […]

Point summary of posterior simulations?

Luke Miratrix writes: In the applied stats class I’m teaching on hierarchical models I’m giving the students (a mix of graduate students, many from the education school, and undergrads) a taste of Stan. I have to give them some “standard” way to turn Stan output into a point estimate (though of course I’ll also explain […]
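As one illustration (my sketch, not anything from Miratrix’s class materials): a common default is the posterior median, with a posterior interval alongside it, computed directly from the simulations.

```r
# Hedged sketch: posterior median and 95% interval from Stan draws.
# Assumes `fit` is an existing rstan fit object; "theta" is a hypothetical
# parameter name used only for illustration.
library(rstan)
theta_draws <- rstan::extract(fit, pars = "theta")$theta
point_estimate <- median(theta_draws)               # posterior median
interval <- quantile(theta_draws, c(0.025, 0.975))  # central 95% posterior interval
```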

Stochastic natural-gradient EP

Yee Whye Teh sends along this paper with Leonard Hasenclever, Thibaut Lienart, Sebastian Vollmer, Stefan Webb, Balaji Lakshminarayanan, and Charles Blundell. I haven’t read it in detail, but they note similarities to our “expectation propagation as a way of life” paper. Their work is much more advanced than ours.

A new idea for a science core course based entirely on computer simulation

I happened to come across this post from 2011 that I like so much, I thought I’d say it again: Columbia College has for many years had a Core Curriculum, in which students read classics such as Plato (in translation) etc. A few years ago they created a Science core course. There was always some […]