Archive of posts filed under the Statistical computing category.

JuliaCon 2015 (24–27 June, Boston-ish)

JuliaCon is coming to Cambridge, MA, the geek capital of the East Coast, on 24–27 June. Here’s the conference site with program. I (Bob) will be giving a 10-minute “lightning talk” on Stan.jl, the Julia interface to Stan (built by Rob J. Goedman; I’m just pinch-hitting because Rob couldn’t make it). The uptake […]

Cross-validation != magic

In a post entitled “A subtle way to over-fit,” John Cook writes: If you train a model on a set of data, it should fit that data well. The hope, however, is that it will fit a new set of data well. So in machine learning and statistics, people split their data into two parts. […]

New Alan Turing preprint on Arxiv!

Dan Kahan writes: I know you are on 30-day delay, but since the blog version of you will be talking about Bayesian inference in a couple of hours, you might like to look at a paper by Turing, who is on 70-yr delay thanks to the British declassification system, and who addresses the utility of using likelihood ratios for […]

Bob Carpenter’s favorite books on GUI design and programming

Bob writes: I would highly recommend two books that changed the way I thought about GUI design (though I’ve read a lot of them): * Jeff Johnson. GUI Bloopers. I read the first edition in book form and the second in draft form (the editor contacted me based on my enthusiastic Amazon feedback, which was […]

A silly little error, of the sort that I make every day

Ummmm, running Stan, testing out a new method we have that applies EP-like ideas to perform inference with aggregate data—it’s really cool, I’ll post more on it once we’ve tried everything out and have a paper that’s in better shape—anyway, I’m starting with a normal example, a varying-intercept, varying-slope model where the intercepts have population […]

Causal Impact from Google

Bill Harris writes: Did you see Would that be something worth a joint post and discussion from you and Judea? I then wrote: Interesting. It seems to all depend on the choice of “control time series.” That said, it could still be a useful method. Bill replied: The good: Bayesian approaches made very approachable […]

Interactive demonstrations for linear and Gaussian process regressions

Here’s a cool interactive demo of linear regression where you can grab the data points, move them around, and see the fitted regression line changing. There are various such apps around, but this one is particularly clean: (I’d like to credit the creator but I can’t find any attribution at the link, except that it’s […]

Defaults, once set, are hard to change.

So. Farewell then Rainbow color scheme. You reigned in Matlab Far too long. But now that You are no longer The default, Will we miss you? We can only Visualize. E. T. Thribb (17 1/2) Here’s the background.  Brad Stiritz writes: I know you’re a creator and big proponent of open-source tools. Given your strong interest […]

My talk tomorrow (Thurs) at MIT political science: Recent challenges and developments in Bayesian modeling and computation (from a political and social science perspective)

It’s 1pm in room E53-482. I’ll talk about the usual stuff (and some of this too, I guess).

One simple trick to make Stan run faster

Did you know that Stan automatically runs in parallel (and caches compiled models) from R if you do this: source(“”) P.S. This capability is included in the current version of rstan, which you can install from CRAN.
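
For readers on a recent rstan, here is a minimal sketch of the two settings that rstan itself suggests when it loads. It is an assumption that these match what the sourced script did; the number of cores will depend on your machine.

```r
library(rstan)

# Cache the compiled model on disk so an unchanged Stan program
# is not recompiled on the next run.
rstan_options(auto_write = TRUE)

# Run the chains in parallel, using as many cores as are detected.
options(mc.cores = parallel::detectCores())
```

With these set at the top of a session, later calls to stan() pick them up without any extra arguments.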

Introducing shinyStan

As a project for Andrew’s Statistical Communication and Graphics graduate course at Columbia, a few of us (Michael Andreae, Yuanjun Gao, Dongying Song, and I) had the goal of giving RStan’s print and plot functions a makeover. We ended up getting a bit carried away and instead we designed a graphical user interface for interactively exploring virtually […]

VB-Stan: Black-box black-box variational Bayes

Alp Kucukelbir, Rajesh Ranganath, Dave Blei, and I write: We describe an automatic variational inference method for approximating the posterior of differentiable probability models. Automatic means that the statistician only needs to define a model; the method forms a variational approximation, computes gradients using automatic differentiation and approximates expectations via Monte Carlo integration. Stochastic gradient […]

Stan Down Under

I (Bob, not Andrew) am in Australia until April 30. I’ll be giving some Stan-related and some data annotation talks, several of which have yet to be concretely scheduled. I’ll keep this page updated with what I’ll be up to. All of the talks other than summer school will be open to the public (the […]

This has nothing to do with the Super Bowl

Joshua Vogelstein writes: The Open Connectome Project at Johns Hopkins University invites outstanding candidates to apply for a postdoctoral or assistant research scientist position in the area of statistical machine learning for big brain imaging data. Our workflow is tightly vertically integrated, ranging from raw data to theory to answering neuroscience questions and back again. […]

Six quick tips to improve your regression modeling

It’s Appendix A of ARM: A.1. Fit many models Think of a series of models, starting with the too-simple and continuing through to the hopelessly messy. Generally it’s a good idea to start simple. Or start complex if you’d like, but prepare to quickly drop things out and move to the simpler model to help […]

Github cheat sheet

Mike Betancourt pointed us to this page. Maybe it will be useful to you too.

Lewis Richardson, father of numerical weather prediction and of fractals

Lee Sechrest writes: If you get a chance, Wiki this guy: I [Sechrest] did and was gratifyingly reminded that I read some bits of his work in graduate school 60 years ago. Specifically, about his math models for predicting wars and his work on fractals to arrive at better estimates of the lengths of common […]

Stan comes through . . . again!

Erikson Kaszubowski writes in: I missed your call for Stan research stories, but the recent post about stranded dolphins mentioned it again. When I read about the Crowdstorming project in your blog, I thought it would be a good project to apply my recent studies in Bayesian modeling. The project coordinators shared a big dataset […]

Expectation propagation as a way of life

Aki Vehtari, Pasi Jylänki, Christian Robert, Nicolas Chopin, John Cunningham, and I write: We revisit expectation propagation (EP) as a prototype for scalable algorithms that partition big datasets into many parts and analyze each part in parallel to perform inference of shared parameters. The algorithm should be particularly efficient for hierarchical models, for which the […]

Next Generation Political Campaign Platform?

[This post is by David K. Park] I’ve been imagining the next generation political campaign platform. If I were to build it, the platform would have five components: Data Collection, Sanitization, Storage, Streaming and Ingestion: This area will focus on the identification and development of the tools necessary to acquire the correct data sets for […]