Stan Weekly Roundup, 16 June 2017

We’re going to be providing weekly updates on what’s going on behind the scenes with Stan. Of course, it’s not really behind the scenes, because the relevant discussions are at:

  • stan-dev GitHub organization: this is the home of all of our source repos; design discussions are on the Stan Wiki

  • Stan Discourse Groups: this is the home of our user and developer lists (they’re all open). Feel free to join the discussion; we try to be friendly and helpful in our responses, and there is a lot of statistical and computational expertise in the wings from our users, who are increasingly joining the discussion. By the way, thanks for that; it takes a huge load off us to get great answers from users to other users’ questions. We’re up to about 15 active discussion threads a day (active topics in the last 24 hours include AR(K) models, web site reorganization, ragged arrays, order statistic priors, new R packages built on top of Stan, Docker images for Stan on AWS, and many more!).

OK, let’s get started with the weekly review, though this is a special summer double issue, just like the New Yorker.

Your news here: If you have any Stan news you’d like to share, please let me know at [email protected] (we’ll probably get a more standardized way to do this in the future).

New web site: Michael Betancourt redesigned the Stan web site; hopefully this will be easier to use. We’re no longer trying to track the literature. If you want to see the Stan literature in progress, do a search for “Stan Development Team” or “mc-stan.org” on Google Scholar; we can’t keep up! Do let us know either in an issue on GitHub for the web site or in the user group on Discourse if you have comments or suggestions.

New user and developer lists: We’ve shuttered our Google group and moved to Discourse for both our user and developer lists (they’re now consolidated as categories on one list). It’s easy to sign up with a GitHub or Google ID, and it’s much easier to search and use online.
See Stan Discourse Groups and, for the old discussions, Stan’s shuttered Google group for users and Stan’s shuttered Google group for developers. We’re not removing any of the old content, but we are prohibiting new posts.

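GPU support: Rok Cesnovar and Steve Bronder have been getting GPU support working for linear algebra operations. They’re starting with Cholesky decomposition because it’s a bottleneck for Gaussian process (GP) models and because it has the pleasant property of being quadratic in data and cubic in computation: an N × N matrix has N² entries to move to the GPU, but factoring it takes on the order of N³ operations, so the arithmetic done per byte transferred grows with N.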
See math pull request 529
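
To make the "quadratic in data, cubic in computation" point concrete, here is a minimal pure-Python sketch of a textbook Cholesky factorization that counts floating-point operations as it goes. It is only an illustration, not the code in the pull request; the helper names (cholesky, spd_matrix) and the test matrix are made up for this example.

```python
import math

def cholesky(a):
    """Return (L, flops) where a = L * L^T and flops is a rough operation count."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    flops = 0
    for j in range(n):
        # Diagonal entry: subtract squares of the entries already computed in row j.
        s = a[j][j]
        for k in range(j):
            s -= L[j][k] * L[j][k]
            flops += 2
        L[j][j] = math.sqrt(s)
        flops += 1
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j]
            for k in range(j):
                s -= L[i][k] * L[j][k]
                flops += 2
            L[i][j] = s / L[j][j]
            flops += 1
    return L, flops

def spd_matrix(n):
    """Hypothetical test input: symmetric and strictly diagonally dominant, hence positive definite."""
    return [[float(n) if i == j else 1.0 / (1 + abs(i - j)) for j in range(n)]
            for i in range(n)]

# Print size, number of matrix entries, operation count, and operations per entry.
for n in (16, 32, 64):
    _, flops = cholesky(spd_matrix(n))
    print(n, n * n, flops, flops / (n * n))
```

Doubling N multiplies the number of entries by four but the operation count by close to eight, which is why the work per entry keeps growing and the transfer cost to the GPU is easier to amortize for larger matrices.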

Distributed computing support: Sebastian Weber is leading the charge into distributed computing using the MPI framework (multi-core or multi-machine) by essentially coding up map-reduce for derivatives inside of Stan. Together with GPU support, distributed computing of derivatives will give us a TensorFlow-like flexibility to accelerate computations. Sebastian’s also looking into parallelizing the internals of the Boost and CVODES ordinary differential equation (ODE) solvers using OpenCL.
See math issue 101 and math issue 551
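
The map-reduce idea for derivatives rests on the fact that a log density that is a sum over independent data shards has a gradient that is the sum of the per-shard gradients. The sketch below shows that structure in Python under some loud assumptions: a toy model y ~ normal(mu, 1), hand-coded derivatives rather than autodiff, and a thread pool standing in for MPI workers. None of the function names come from Stan.

```python
from concurrent.futures import ThreadPoolExecutor
import math

def shard_value_and_grad(mu, shard):
    """Map step: log density and d/dmu for one shard under y ~ normal(mu, 1)."""
    value = sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in shard)
    grad = sum(y - mu for y in shard)
    return value, grad

def map_reduce_value_and_grad(mu, shards, executor):
    """Reduce step: add up the per-shard values and per-shard gradients."""
    results = executor.map(lambda shard: shard_value_and_grad(mu, shard), shards)
    total_value, total_grad = 0.0, 0.0
    for value, grad in results:
        total_value += value
        total_grad += grad
    return total_value, total_grad

# Hypothetical data, split into four shards (one per worker).
data = [0.3, -1.2, 2.5, 0.8, 1.1, -0.4, 1.9, 0.0]
shards = [data[i::4] for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    value, grad = map_reduce_value_and_grad(0.5, shards, pool)

print(value, grad)
```

Threads in CPython will not actually speed this up; the point is only that each worker needs its own shard plus the shared parameters, and sends back a value and a gradient, which keeps communication small relative to the computation.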

Logging framework: Daniel Lee added a logging framework to Stan to allow finer-grained control of

Operands and partials: Sean Talts finished the refactor of our underlying operands and partials data structure, which makes it much simpler to write custom derivative functions.
See pull request 547

Autodiff testing framework: Bob Carpenter finished the first use case for a generalized autodiff tester to test all of our higher-order autodiff thoroughly.
See math pull request 562

C++11: We’re all working toward the 2.16 release, which will be our last release before we open the gates of C++11 (and some of C++14). This is going to make our code a whole lot easier to write and maintain, and will open up awesome possibilities like using closures to define lambdas within the Stan language, as well as replacing many of our uses of Boost with the C++ standard library.

Append arrays: Ben Bales added signatures for append_array, to work like our appends for vectors and matrices.
See pull request 554 and pull request 550

ODE system size checks: Sebastian Weber pushed a bug fix that cleans up ODE system size checks to avoid seg faults at run time.
See pull request 559

RNG consistency in transformed data: A while ago we relaxed the generated-quantities-only restriction on _rng functions by allowing them in transformed data (so you can fit fake data generated wholly within Stan or represent posterior uncertainty of some other process, allowing “cut”-like models to be formulated as a two-stage process). Mitzi Morris just cleaned these up so that we use the same RNG seed for all chains, which means every chain sees the same generated data and we can perform convergence monitoring; multiple replications would then be done by running the whole multi-chain process multiple times.
See Stan pull request 2313
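
Why the shared seed matters is easy to see in a small simulation. The following Python sketch is not how Stan seeds its RNGs; the function names and the seeding arithmetic are invented purely to show the idea that the data-generating stream is shared across chains while each chain's sampler stream is its own.

```python
import random

def simulate_transformed_data(seed, n=5):
    """Stand-in for _rng calls in transformed data: draw fake observations."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def run_chain(chain_id, base_seed, shared_data_seed=True):
    """Return the simulated data and a first 'draw' for one chain."""
    # Shared mode: every chain uses the same seed for the fake data.
    data_seed = base_seed if shared_data_seed else base_seed + chain_id
    y = simulate_transformed_data(data_seed)
    # The sampler itself still gets a chain-specific stream (hypothetical scheme).
    sampler_rng = random.Random(1000 * base_seed + chain_id)
    first_draw = sampler_rng.gauss(sum(y) / len(y), 1.0)
    return y, first_draw

chains = [run_chain(c, base_seed=1234) for c in range(1, 5)]
same_data = all(y == chains[0][0] for y, _ in chains)
print(same_data)
```

If each chain generated its own fake data, the chains would be targeting different posteriors, and comparing them for convergence would be meaningless; with a shared seed they differ only in their sampling paths.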

NSF Grant: CI-SUSTAIN: Stan for the Long Run: We (Bob Carpenter, Andrew Gelman, Michael Betancourt) were just awarded an NSF grant for Stan sustainability. This was a follow-on from the first Compute Resource Initiative (CRI) grant we got after building the system. Yea! This adds roughly a year of funding for the team at Columbia University. Our goal is to put governance processes in place for sustaining the project, as well as to shore up all of our unit tests and documentation.

Hiring: We hired two full-time Stan staff at Columbia: Sean Talts joins as a developer and Breck Baldwin as a business manager for the project. Sean had already been working as a contractor for us, hence all the pull requests. (Pro tip: The best way to get a foot in the door for an open-source project is to submit a useful pull request.)