Archive of posts filed under the Stan category.

This company wants to hire people who can program in R or Python and do statistical modeling in Stan

Doug Puett writes: I am a 2012 QMSS [Columbia University Quantitative Methods in Social Sciences] grad who is currently trying to build a Data Science/Quantitative UX team, and was hoping for some advice. I am finding myself having a hard time finding people who are really interested in understanding people and who especially are excited […]

Design top down, Code bottom up

Top-down design means designing from the client application programmer interface (API) down to the code. The API lays out a precise functional specification, which says what the code will do, not how it will do it. Coding bottom up means coding the lowest-level foundations first, testing them, then continuing to build. Sometimes this requires dropping […]

A continuous hinge function for statistical modeling

This comes up sometimes in my applied work: I want a continuous “hinge function,” something like the red curve above, connecting two straight lines in a smooth way. Why not include the sharp corner (in this case, the function y=-0.5*x if x<0 or y=0.2*x if x>0)? Two reasons. First, computation: Hamiltonian Monte Carlo can trip […]
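For readers who want something concrete, here is a minimal sketch of one way to build such a curve. The excerpt does not say which construction the post uses, so this softplus-based form is an assumption, not necessarily the author's. The slopes -0.5 and 0.2 come from the example above; the names `smooth_hinge`, `x0`, `y0`, and `delta` are made up for illustration, with `delta` controlling how gradually the two lines blend.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def smooth_hinge(x, slope_left=-0.5, slope_right=0.2, x0=0.0, y0=0.0, delta=0.1):
    """Smooth curve that approaches two straight lines meeting at (x0, y0).

    Assumed form (illustrative, not taken from the post):
        y = y0 + slope_left * (x - x0)
               + (slope_right - slope_left) * delta * softplus((x - x0) / delta)

    Far to the left the softplus term vanishes, leaving the left line;
    far to the right it approaches (x - x0) / delta, giving the right line.
    Smaller delta means a sharper corner.
    """
    z = (x - x0) / delta
    return y0 + slope_left * (x - x0) + (slope_right - slope_left) * delta * softplus(z)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x = {x:5.1f}   y = {smooth_hinge(x):.4f}")
```

Away from the corner the curve matches the two lines almost exactly. Near the corner it is differentiable everywhere, which is the property that matters for gradient-based samplers such as Hamiltonian Monte Carlo.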

Using Stan for week-by-week updating of estimated soccer team abilities

Milad Kharratzadeh shares this analysis of the English Premier League during last year’s famous season. He fit a Bayesian model using Stan, and the R markdown file is here. The analysis has three interesting features: 1. Team ability is allowed to continuously vary throughout the season; thus, once the season is over, you can see […]

Splines in Stan! (including priors that enforce smoothness)

Milad Kharratzadeh shares a new case study. This could be useful to a lot of people. And here’s the markdown file with every last bit of R and Stan code. Just for example, here’s the last section of the document, which shows how to simulate the data and fit the model graphed above: Location of […]

Update rstanarm to version 2.15.3

Ben Goodrich writes: We just released rstanarm 2.15.3, which fixed a major bug that was introduced back in January with the 2.14.1 release where models of the form stan_glmer(y ~ … + (1 | group1) + (1 | group2), family = binomial()) would produce WRONG RESULTS. This only applies to Bernoulli models with multiple group-specific […]

Prior choice recommendations wiki!

Here’s the wiki, and here’s the background: Our statistical models are imperfect compared to the true data generating process and our complete state of knowledge (from an informational-Bayesian perspective) or the set of problems over which we wish to average our inferences (from a population-Bayesian or frequentist perspective). The practical question here is what model […]

Stan in St. Louis this Friday

This Friday afternoon I (Jonah) will be speaking about Stan at Washington University in St. Louis. The talk is open to the public, so anyone in the St. Louis area who is interested in Stan is welcome to attend. Here are the details: Title: Stan: A Software Ecosystem for Modern Bayesian Inference Jonah Sol Gabry, […]

Fitting hierarchical GLMs in package X is like driving car Y

Given that Andrew started the Gremlin theme, I thought it would only be fitting to link to the following amusing blog post: Chris Brown: Choosing R packages for mixed effects modelling based on the car you drive (on the seascape models blog) It’s exactly what it says on the tin. I won’t spoil the punchline, […]

Stacking, pseudo-BMA, and AIC type weights for combining Bayesian predictive distributions

This post is by Aki. We have often been asked in the Stan user forum how to do model combination for Stan models. Bayesian model averaging (BMA) by computing marginal likelihoods is challenging in theory and even more challenging in practice using only the MCMC samples obtained from the full model posteriors. Some users have […]

Tech company wants to hire Stan programmers!

Ittai Kan writes: I started life as an academic mathematician (chaos theory) but have long since moved into industry. I am currently Chief Scientist at Afiniti, a contact center routing technology company that connects agent and callers on the basis of various factors in order to globally optimize the contact center performance. We have 17 […]

“Scalable Bayesian Inference with Hamiltonian Monte Carlo” (Michael Betancourt’s talk this Thurs at Columbia)

Scalable Bayesian Inference with Hamiltonian Monte Carlo Despite the promise of big data, inferences are often limited not by sample size but rather by systematic effects. Only by carefully modeling these effects can we take full advantage of the data—big data must be complemented with big models and the algorithms that can fit them. One […]

Running Stan with external C++ code

Ben writes: Starting with the 2.13 release, it is much easier to use external C++ code in a Stan program. This vignette briefly illustrates how to do so.

Prediction model for fleet management

Chang writes: I am working on a fleet management system these days: basically, I am trying to predict the usage ‘y’ of our fleet in a zip code in the future. We have some factors ‘X’, such as number of active users, number of active merchants etc. If I can fix the time horizon, the […]

2 Stan job postings at Columbia (links fixed)

1. Stan programmer. This is the “Stan programmers” position described here. 2. Stan project development. This is the Stan business developer/grants manager described here. To apply, click on the first link for each position above (the site) and follow the instructions. P.S. In the first version of this post I messed up the links. They’re […]

Mortality rate trends by age, ethnicity, sex, and state (link fixed)

There continues to be a lot of discussion on the purported increase in mortality rates among middle-aged white people in America. Actually an increase among women and not much change among men but you don’t hear so much about this as it contradicts the “struggling white men” story that we hear so much about in […]

Ensemble Methods are Doomed to Fail in High Dimensions

Ensemble methods [cat picture] By ensemble methods, I (Bob, not Andrew) mean approaches that scatter points in parameter space and then make moves by interpolating or extrapolating among subsets of them. Two prominent examples are Ter Braak’s differential evolution and Goodman and Weare’s walkers. There are extensions and computer implementations of these algorithms. For example, […]

Hey, we’re hiring a postdoc! To work on survey weighting! And imputation!

Here’s the ad: The Center on Poverty and Social Policy at the Columbia University School of Social Work and the Columbia Population Research Center are seeking a postdoctoral scholar with a PhD in economics, statistics, public policy, demography, social work, sociology, or a related discipline, to lead the development of survey weights and missing data imputations for the New York City […]

Expectation propagation as a way of life: A framework for Bayesian inference on partitioned data

After three years, we finally have an updated version of our “EP as a way of life” paper. Authors are Andrew Gelman, Aki Vehtari, Pasi Jylänki, Tuomas Sivula, Dustin Tran, Swupnil Sahai, Paul Blomstedt, John Cunningham, David Schiminovich, and Christian Robert. Aki deserves credit for putting this all together into a coherent whole. Here’s the […]

A fistful of Stan case studies: divergences and bias, identifying mixtures, and weakly informative priors

Following on from his talk at StanCon, Michael Betancourt just wrote three Stan case studies, all of which are must reads: Diagnosing Biased Inference with Divergences: This case study discusses the subtleties of accurate Markov chain Monte Carlo estimation and how divergences can be used to identify biased estimation in practice.   Identifying Bayesian Mixture […]