Skip to content
Archive of posts filed under the Stan category.

The current state of the Stan ecosystem in R

This post is by Jonah. Last week I posted here about the release of version 2.0.0 of the loo R package, but there have been a few other recent releases and updates worth mentioning. At the end of the post I also include some general thoughts on R package development with Stan and the growing number of […]

Postdoc opportunity at AstraZeneca in Cambridge, England, in Bayesian Machine Learning using Stan!

Here it is: Predicting drug toxicity with Bayesian machine learning models We’re currently looking for talented scientists to join our innovative academic-style Postdoc. From our centre in Cambridge, UK you’ll be in a global pharmaceutical environment, contributing to live projects right from the start. You’ll take part in a comprehensive training programme, including a focus […]

You better check yo self before you wreck yo self

We (Sean Talts, Michael Betancourt, Me, Aki, and Andrew) just uploaded a paper (code available here) that outlines a framework for verifying that an algorithm for computing a posterior distribution has been implemented correctly. It is easy to use, straightforward to implement, and ready to be implemented as part of a Bayesian workflow. This type of […]

loo 2.0 is loose

This post is by Jonah and Aki. We’re happy to announce the release of v2.0.0 of the loo R package for efficient approximate leave-one-out cross-validation (and more). For anyone unfamiliar with the package, the original motivation for its development is in our paper: Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation […]

Fitting a hierarchical model without losing control

Tim Disher writes: I have been asked to run some regularized regressions on a small N high p situation, which for the primary outcome has lead to more realistic coefficient estimates and better performance on cv (yay!). Rstanarm made this process very easy for me so I am grateful for it. I have now been […]

Mitzi’s talk on spatial models in Ann Arbor, Thursday 5 April 2018

Mitzi returns to her alma mater to give a talk at joint meeting of the Ann Arbor useR and ASA Meetups: Spatial models in Stan Abstract This case study shows how to efficiently encode and compute an intrinsic conditional autoregressive (ICAR) model in Stan. When data has a neighborhood structure, ICAR models provide spatial smoothing […]

Combining Bayesian inferences from many fitted models

Renato Frey writes: I’m curious about your opinion on combining multi-model inference techniques with rstanarm: On the one hand, screening all (theoretically meaningful) model specifications and fully reporting them seems to make a lot of sense to me — in line with the idea of transparent reporting, your idea of the multiverse analysis, or akin […]

Bayesian inference for A/B testing: Lauren Kennedy and I speak at the NYC Women in Machine Learning and Data Science meetup tomorrow (Tues 27 Mar) 7pm

Here it is: Bayesian inference for A/B testing Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University Lauren Kennedy, Columbia Population Research Center, Columbia University Suppose we want to use empirical data to compare two or more decisions or treatment options. Classical statistical methods based on statistical significance and p-values break down […]

“The problem of infra-marginality in outcome tests for discrimination”

Camelia Simoiu, Sam Corbett-Davies, and Sharad Goel write: Outcome tests are a popular method for detecting bias in lending, hiring, and policing decisions. These tests operate by comparing the success rate of decisions across groups. For example, if loans made to minority applicants are observed to be repaid more often than loans made to whites, […]

Bob’s talk at Berkeley, Thursday 22 March, 3 pm

It’s at the Institute for Data Science at Berkeley. Hierarchical Modeling in Stan for Pooling, Prediction, and Multiple Comparisons 22 March 2018, 3pm 190 Doe Library. UC Berkeley. And here’s the abstract: I’ll provide an end-to-end example of using R and Stan to carry out full Bayesian inference for a simple set of repeated binary […]

What prior to use for item-response parameters?

Joshua Pritkin writes: There is a Stan case study by Daniel Furr on a hierarchical two-parameter logistic item response model. My question is whether to model the covariance between log alpha and beta parameters. I asked Daniel Furr about this and he said, “The argument I would make for modelling the covariance is that it […]

Research project in London and Chicago to develop and fit hierarchical models for development economics in Stan!

Rachael Meager at the London School of Economics and Dean Karlan at Northwestern University write: We are seeking a Research Assistant skilled in R programming and the production of R packages. The successful applicant will have experience creating R packages accessible on github or CRAN, and ideally will have experience working with Rstan. The main […]

Eid ma clack shaw zupoven del ba.

When I say “I love you”, you look accordingly skeptical – Frida Hyvönen A few years back, Bill Callahan wrote a song about the night he dreamt the perfect song. In a fever, he woke and wrote it down before going back to sleep. The next morning, as he struggled to read his handwriting, he saw […]

Andrew vs. the Multi-Armed Bandit

Andrew and I were talking about coding up some sequential designs for A/B testing in Stan the other week. I volunteered to do the legwork and implement some examples. The literature is very accessible these days—it can be found under the subject heading “multi-armed bandits.” There’s even a Wikipedia page on multi-armed bandits that lays […]

When to add a feature to Stan? The recurring issue of the compound declare-distribute statement

At today’s Stan meeting (this is Bob, so I really do mean today), we revisited the topic of whether to add a feature to Stan that would let you put distributions on parameters with their declarations. Compound declare-define statements Mitzi added declare-define statements a while back, so you can now write: transformed parameter { real […]

New Stan case studies: NNGP and Lotka-Volterra

It’s only January and we already have two new case studies up on the Stan site. Two new case studies Lu Zhang of UCLA contributed a case study on nearest neighbor Gaussian processes. Bob Carpenter (that’s me!) of Columbia Uni contributed one on Lotka-Volterra population dynamics. Mitzi Morris of Columbia Uni has been updating her […]

State-space modeling for poll aggregation . . . in Stan!

Peter Ellis writes: As part of familiarising myself with the Stan probabilistic programming language, I replicate Simon Jackman’s state space modelling with house effects of the 2007 Australian federal election. . . . It’s not quite the model that I’d use—indeed, Ellis writes, “I’m fairly new to Stan and I’m pretty sure my Stan programs […]

Big Data Needs Big Model

Big Data are messy data, available data not random samples, observational data not experiments, available data not measurements of underlying constructs of interest. To make relevant inferences from big data, we need to extrapolate from sample to population, from control to treatment group, and from measurements to latent variables. All these steps require modeling. At […]

How smartly.io productized Bayesian revenue estimation with Stan

Markus Ojala writes: Bayesian modeling is becoming mainstream in many application areas. Applying it needs still a lot of knowledge about distributions and modeling techniques but the recent development in probabilistic programming languages have made it much more tractable. Stan is a promising language that suits single analysis cases well. With the improvements in approximation […]

We were measuring the speed of Stan incorrectly—it’s faster than we thought in some cases due to antithetical sampling

Aki points out that in cases of antithetical sampling, our effective sample size calculations were unduly truncated above at the number of iterations. It turns out the effective sample size can be greater than the number of iterations if the draws are anticorrelated. And all we really care about for speed is effective sample size […]