As a surprise welcome to 2017, our paper on how the Stan language works along with an overview of how the MCMC and optimization algorithms work hit the stands this week.

- Bob Carpenter, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A Probabilistic Programming Language.
*Journal of Statistical Software* 76(1).

The authors are the developers at the time the first revision was submitted. We now have quite a few more developers. Because of that, we’d still prefer that people cite the manual, which is authored collectively by the development team, rather than this paper, which credits only some of our current developers.

The original motivation for writing a paper was that Wikipedia rejected our attempts at posting a Stan Wikipedia page without a proper citation.

I’d like to thank Achim Zeileis at *JSS* for his patience and help during the final wrap-up.

**Abstract**

Stan is a probabilistic programming language for specifying statistical models. A Stan program imperatively defines a log probability function over parameters conditioned on specified data and constants. As of version 2.14.0, Stan provides full Bayesian inference for continuous-variable models through Markov chain Monte Carlo methods such as the No-U-Turn sampler, an adaptive form of Hamiltonian Monte Carlo sampling. Penalized maximum likelihood estimates are calculated using optimization methods such as the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm. Stan is also a platform for computing log densities and their gradients and Hessians, which can be used in alternative algorithms such as variational Bayes, expectation propagation, and marginal inference using approximate integration. To this end, Stan is set up so that the densities, gradients, and Hessians, along with intermediate quantities of the algorithm such as acceptance probabilities, are easily accessible. Stan can be called from the command line using the cmdstan package, through R using the rstan package, and through Python using the pystan package. All three interfaces support sampling and optimization-based inference with diagnostics and posterior analysis. rstan and pystan also provide access to log probabilities, gradients, Hessians, parameter transforms, and specialized plotting.
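To make the abstract's first claims concrete, here is a minimal Python sketch of what "a log probability function over parameters conditioned on specified data" and its gradient look like for a toy model. This is not Stan code and not how Stan is implemented: the model (normal observations with known scale and a normal prior on the mean), the function names, and the default constants are all assumptions made for illustration. Stan computes gradients by automatic differentiation rather than from hand-written formulas like the one below.

```python
def log_density(mu, y, sigma=1.0, prior_scale=10.0):
    """Unnormalized log posterior of mu given data y.

    Hypothetical model (an assumption, not from the paper):
        mu   ~ normal(0, prior_scale)
        y[n] ~ normal(mu, sigma)
    Additive constants that do not depend on mu are dropped.
    """
    lp = -0.5 * (mu / prior_scale) ** 2          # prior term
    for y_n in y:
        lp += -0.5 * ((y_n - mu) / sigma) ** 2   # one likelihood term per datum
    return lp


def grad_log_density(mu, y, sigma=1.0, prior_scale=10.0):
    """Derivative of log_density with respect to mu (hand-derived)."""
    g = -mu / prior_scale ** 2
    for y_n in y:
        g += (y_n - mu) / sigma ** 2
    return g


def posterior_mode(y, sigma=1.0, prior_scale=10.0):
    """Closed-form maximizer of log_density: the point where the gradient is zero."""
    precision = len(y) / sigma ** 2 + 1.0 / prior_scale ** 2
    return (sum(y) / sigma ** 2) / precision


if __name__ == "__main__":
    y = [1.2, 0.7, 1.9]  # made-up data
    mu_hat = posterior_mode(y)
    print("mode:", mu_hat)
    print("gradient at mode:", grad_log_density(mu_hat, y))
```

A sampler such as HMC or an optimizer such as L-BFGS only needs the two functions `log_density` and `grad_log_density`. The maximizer of this log density is the penalized maximum likelihood estimate in the abstract's sense, with the prior term acting as the penalty.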

**BibTeX**

    @article{stan:2017,
      author  = {Bob Carpenter and Andrew Gelman and Matthew Hoffman and Daniel Lee and Ben Goodrich and Michael Betancourt and Marcus Brubaker and Jiqiang Guo and Peter Li and Allen Riddell},
      title   = {Stan: {A} Probabilistic Programming Language},
      journal = {Journal of Statistical Software},
      volume  = {76},
      number  = {1},
      year    = {2017}
    }

**Further reading**

Check out the Papers about Stan section of the Stan Citations web page. There’s more info on our autodiff and on how variational inference works and a link to the original NUTS paper. And of course, don’t miss Michael’s latest if you want to understand HMC and NUTS, A conceptual introduction to HMC.

Awesome! Will read this today, and I’ll be citing this very soon in a few papers I have lined up.

In fact, this upcoming Friday, I’ll be at the SPSP conference presenting an “anova-like” and path model that uses stan for partial pooling across four experiments.

Also for further reading I recommend this paper introducing Stan that we wrote at about the same time as the JSS paper. But it was for a regular journal so it appeared in 2015 rather than 2017. Kinda funny that JSS is all-electronic yet it was two years slower than a conventional journal.

It’s great that this paper is finally out. In a funding proposal, I got into a bit of trouble because I could not prove that this paper did in fact exist.

A small point that ends up becoming a major irritant cumulatively: whenever I write a BibTeX entry I try to make sure that any non-initial word that I want capitalized is wrapped in curly braces. Otherwise you get a badly formatted bibliography:

Stan: {A} Probabilistic Programming Language

I really hate this part about bibtex. I bet someone has already found a general solution. Please post it if yes.

Andrew:

IMHO the abstract is *terrible*. Not sure what your audience is but an abstract is an opportunity to entice target readers to actually read the paper. It is a sale.

Most salespeople agree: the way to sell something is not to talk about your product, its features, etc. (the “What is it?”) but rather to generate interest by answering the reader’s question: “What is in it for me?”

So IMHO the abstract should talk about the benefits of using STAN for your target reader. Once people are sold on the benefits they’ll want to know more about it, what is it, its unique features, how to get started etc. That is, they will want to read the paper.

Fernando:

We didn’t write the abstract, or the paper, to sell Stan. Not many people read articles in the Journal of Statistical Software. These articles are archival more than anything else. The people who will read the paper are those who are already interested in Stan and want a quick technical overview.

I do think that promoting Stan is a good idea, and I’m sure the Stan website can be improved. The JSS article just isn’t really the place for that.

PS: By sale I did not mean an actual sale for dollars, but generically. Like selling an idea to a department head. Getting them interested.

And fine. But most readers of JSS are busy. Maybe 1000 skim the TOC, only 10 read an article, and so on.

I am working on the assumption that you’d like more of those target readers to actually read the work you’ve done. But maybe not.

Fernando:

We do lots of promotion and probably should do more. This particular journal article was not about promotion. It serves more of an archival function. I think people will read the paper who really feel the need for it. My guess is that posting the link on the blog will get more readers than any abstract would’ve done.

OK. I get the archival purpose. And yes, blog will drive a lot of traffic to paper abstract, and many to read it.

But again, a better abstract – IMHO – would get even more of that additional traffic to actually read the paper (i.e., have a treatment effect). And it’s not like writing a different abstract would have taken a lot more work. Almost a free lunch. It may also raise questions as to how “customer”-oriented the STAN team is.

But really this is not about this specific abstract. Just venting my general feeling that often research abstracts fail to put the reader, and her needs, first.

Fernando:

I agree with your general point, I just don’t think it happens to be so relevant in this case.

Here’s a concern, though: Sometimes people put things in the title or the abstract that aren’t in the actual paper!

Here’s an example: Carney, Cuddy, and Yap wrote a paper whose abstract concluded, “That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.” The paper in question had no data at all on people “becoming more powerful”; indeed, they had no data on power at all.

In all the reviewing process, nobody seemed to notice that the paper’s abstract referred to something that was not in the paper. That’s a case of too much salesmanship!

PS and I don’t think a good abstract is about promotion so much as about providing good customer service. It is the latter that begets the promotion. Else it’s just hot air.

But maybe your point is that the way it is written is exactly the way archive readers like it. May well be. I don’t have a clue. I can only say that for me it is not working at all.

Finally, as an example of an *excellent* abstract I would cite: A Conceptual Introduction to Hamiltonian Monte Carlo, Michael Betancourt.

Just to be fair to the STAN team ;-). Michael ought to be the official abstract writer for all things STAN…