Archive of posts filed under the Statistical computing category.

Stanny Stanny Stannitude

On the stan-users list, Richard McElreath reports: With 2.4 out, I ran a quick test of how much speedup I could get by changing my old non-vectorized multi_normal sampling to the new vectorized form. I get a 40% time savings, without even trying hard. This is much better than I expected. Timings with vectorized multi_normal: […]
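
For readers who haven't made the switch, here is a minimal sketch of the two styles, written as Stan model strings for use from rstan; the names (y, mu, Sigma, N, K) are illustrative placeholders, not McElreath's actual model.

```r
library(rstan)

# Looped form: one multi_normal evaluation per observation.
loop_code <- "
data {
  int<lower=1> N;
  int<lower=1> K;
  vector[K] y[N];
}
parameters {
  vector[K] mu;
  cov_matrix[K] Sigma;
}
model {
  for (n in 1:N)
    y[n] ~ multi_normal(mu, Sigma);
}
"

# Vectorized form: one call over the whole array, letting Stan share
# work such as the factorization of Sigma across all N observations.
vec_code <- "
data {
  int<lower=1> N;
  int<lower=1> K;
  vector[K] y[N];
}
parameters {
  vector[K] mu;
  cov_matrix[K] Sigma;
}
model {
  y ~ multi_normal(mu, Sigma);
}
"
```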

SciLua 2 includes NUTS

The most recent release of SciLua includes an implementation of Matt’s sampler, NUTS (link is to the final JMLR paper, which is a revision of the earlier arXiv version). According to the author of SciLua, Stefano Peluchetti: Should be quite similar to your [Stan’s] implementation with some differences in the adaptation strategy. If you have […]

Stan 2.4, New and Improved

We’re happy to announce that all three interfaces (CmdStan, PyStan, and RStan) are up and ready to go for Stan 2.4. As usual, you can find full instructions for installation on the Stan Home Page. Here are the release notes with a list of what’s new and improved: New Features ———— * L-BFGS optimization (now […]
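
As a quick illustration of the new optimizer from R, a hedged sketch on a toy model (assuming rstan 2.4 exposes L-BFGS through optimizing()'s algorithm argument; the model itself is not from the release notes):

```r
library(rstan)

# A toy model just to exercise the optimizer.
code <- "
data { int<lower=1> N; vector[N] y; }
parameters { real mu; real<lower=0> sigma; }
model { y ~ normal(mu, sigma); }
"
m <- stan_model(model_code = code)

# Penalized maximum likelihood estimate via the new L-BFGS optimizer.
fit <- optimizing(m, data = list(N = 50, y = rnorm(50, 3, 2)),
                  algorithm = "LBFGS")
fit$par  # point estimates of mu and sigma
```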

NYC workshop 22 Aug on open source machine learning systems

The workshop is organized by John Langford (Microsoft Research NYC), along with Alekh Agarwal and Alina Beygelzimer, and it features Liblinear, Vowpal Wabbit, Torch, Theano, and . . . you guessed it . . . Stan! Here’s the current program: 8:55am: Introduction 9:00am: Liblinear by CJ Lin. 9:30am: Vowpal Wabbit and Learning to Search (John […]

Stan World Cup update

The other day I fit a simple model to estimate team abilities from World Cup outcomes. I fit the model to the signed square roots of the score differentials, using the square root on the theory that the score differential becomes more variable when the game is less close. 0. Background As you might recall, the estimated […]
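
For concreteness, here is a minimal sketch of the kind of model described, with the signed-square-root transformation made explicit; the actual priors and predictors in the post may differ.

```r
library(rstan)

# The transformation from the post: signed square root of the
# score differential.
signed_sqrt <- function(x) sign(x) * sqrt(abs(x))

# A sketch of a team-abilities model; names are illustrative.
code <- "
data {
  int<lower=1> N;                    // games
  int<lower=2> J;                    // teams
  int<lower=1,upper=J> team1[N];
  int<lower=1,upper=J> team2[N];
  vector[N] sqrt_dif;                // signed sqrt of score differential
}
parameters {
  vector[J] a;                       // team abilities
  real<lower=0> sigma_a;
  real<lower=0> sigma_y;
}
model {
  a ~ normal(0, sigma_a);
  for (n in 1:N)
    sqrt_dif[n] ~ normal(a[team1[n]] - a[team2[n]], sigma_y);
}
"
```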

Stan goes to the World Cup

I thought it would be fun to fit a simple model in Stan to estimate the abilities of the teams in the World Cup, then I could post everything here on the blog, the whole story of the analysis from beginning to end, showing the results of spending a couple hours on a data analysis. […]

Useless Algebra, Inefficient Computation, and Opaque Model Specifications

I (Bob, not Andrew) doubt anyone sets out to do algebra for the fun of it, implement an inefficient algorithm, or write a paper where it’s not clear what the model is. But… Why not write it in BUGS or Stan? Over on the Stan users group, Robert Grant wrote: Hello everybody, I’ve just been […]

Comment of the week

This one, from DominikM: Really great, the simple random intercept – random slope mixed model I did yesterday now runs at least an order of magnitude faster after installing RStan 2.3 this morning. You are doing an awesome job, thanks a lot!
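
For context, a minimal sketch of a random-intercept, random-slope model of the kind DominikM describes (variable names are illustrative, not his):

```r
library(rstan)

code <- "
data {
  int<lower=1> N;
  int<lower=1> J;                    // groups
  int<lower=1,upper=J> g[N];
  vector[N] x;
  vector[N] y;
}
parameters {
  vector[J] a;                       // random intercepts
  vector[J] b;                       // random slopes
  real mu_a;
  real mu_b;
  real<lower=0> sigma_a;
  real<lower=0> sigma_b;
  real<lower=0> sigma_y;
}
model {
  a ~ normal(mu_a, sigma_a);
  b ~ normal(mu_b, sigma_b);
  for (n in 1:N)
    y[n] ~ normal(a[g[n]] + b[g[n]] * x[n], sigma_y);
}
"
```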

(Py, R, Cmd) Stan 2.3 Released

We’re happy to announce RStan, PyStan and CmdStan 2.3. Instructions on how to install at: http://mc-stan.org/ As always, let us know if you’re having problems or have comments or suggestions. We’re hoping to roll out the next release a bit quicker this time, because we have lots of good new features that are almost ready […]

Judicious Bayesian Analysis to Get Frequentist Confidence Intervals

Christian Bartels has a new paper, “Efficient generic integration algorithm to determine confidence intervals and p-values for hypothesis testing,” of which he writes: The paper proposes an analysis of observed data that may be characterized as a judicious Bayesian analysis resulting in the determination of exact frequentist p-values and […]
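
The paper's method is its own thing, but the basic compatibility of Bayesian intervals and frequentist coverage is easy to see in a toy simulation (this is not Bartels's algorithm): with a flat prior and known variance, the central 95% posterior interval for a normal mean has exact frequentist coverage.

```r
set.seed(1)
theta_true <- 2
covered <- replicate(2000, {
  y <- rnorm(20, theta_true, 1)
  # Posterior under a flat prior: normal(mean(y), 1 / sqrt(20)),
  # so the central 95% posterior interval is available in closed form.
  ci <- mean(y) + c(-1, 1) * qnorm(0.975) / sqrt(20)
  ci[1] <= theta_true && theta_true <= ci[2]
})
mean(covered)  # close to 0.95
```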

Average predictive comparisons in R: David Chudzicki writes a package!

Here it is: An R Package for Understanding Arbitrary Complex Models As complex models become widely used, it’s more important than ever to have ways of understanding them. Even when a model is built primarily for prediction (rather than primarily as an aid to understanding), we still need to know what it’s telling us. For […]
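
The basic quantity is easy to compute by hand for a simple model. This hand-rolled version (not the package's interface) shows the average change in predicted probability when one input moves by a unit, with the other inputs held at their observed values:

```r
set.seed(2)
d <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
d$y <- rbinom(500, 1, plogis(1.5 * d$x1 - 0.5 * d$x2))
fit <- glm(y ~ x1 + x2, family = binomial, data = d)

# Average predictive comparison for a one-unit increase in x1:
d_hi <- transform(d, x1 = x1 + 1)
mean(predict(fit, d_hi, type = "response") -
     predict(fit, d,    type = "response"))
```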

My answer: Write a little program to simulate it

Brendon Greeff writes: I was searching for an online math blog and found your email address. I have a question relating to the draw for a sports tournament. If there are 20 teams in a tournament divided into 4 groups, and those teams are selected based on four “bands” (Band: 1-5 ranked teams, 6-10, 11-15, […]
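
In that spirit, here is a little R simulation of the draw as described. The excerpt truncates the exact rule for filling out five-team groups from four bands, so the sketch assumes each group first gets one team per band and the four leftover teams (one per band) are then assigned to the groups at random, one each:

```r
# 20 teams, four bands of five (1-5, 6-10, 11-15, 16-20), four groups.
simulate_draw <- function() {
  bands <- split(1:20, rep(1:4, each = 5))       # band 1 = teams 1-5, etc.
  shuffled <- lapply(bands, sample)              # random order within band
  groups <- lapply(1:4, function(g) sapply(shuffled, `[`, g))
  leftovers <- sample(sapply(shuffled, `[`, 5))  # one leftover per band
  Map(c, groups, leftovers)                      # five teams per group
}

# Estimate, say, the chance that teams 1 and 6 share a group:
set.seed(3)
mean(replicate(10000, {
  gs <- simulate_draw()
  any(sapply(gs, function(g) all(c(1, 6) %in% g)))
}))
```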

Stan is Turing Complete. So what?

This post is by Bob Carpenter. Stan is Turing complete! There seems to be a persistent misconception that Stan isn’t Turing complete [1, 2]. My guess is that it stems from Stan’s (not coincidental) superficial similarity to BUGS and JAGS, which provide directed graphical model specification languages. Stan’s Turing completeness follows from its support of array data […]
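
A tiny illustration of the point: the following (hypothetical) transformed data block runs the Collatz iteration with an unbounded while loop plus conditionals, something a purely directed-graphical-model language cannot express.

```r
collatz_block <- "
transformed data {
  int n;
  n <- 27;
  while (n != 1) {
    if ((n / 2) * 2 == n)   // integer division: test for evenness
      n <- n / 2;
    else
      n <- 3 * n + 1;
  }
}
"
```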

Superfast Metrop using data partitioning, from Marco Banterle, Clara Grazian, and Christian Robert

Superfast not because of faster convergence but because they use a clever acceptance/rejection trick so that most of the time they don’t have to evaluate the entire target density. It’s written in terms of single-step Metropolis, but I think it should be possible to do it in HMC or NUTS, in which case we could […]
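
My reading of the trick, as a general sketch of delayed acceptance rather than the authors' code: factor the target, test the Metropolis ratio factor by factor, and bail out early on any rejection, so the expensive factors are rarely touched. The overall acceptance probability is the product of the per-factor min(1, ratio) terms, which preserves detailed balance for a symmetric proposal.

```r
delayed_accept_step <- function(x, log_factors, prop_sd = 1) {
  # log_factors: list of functions whose sum is the log target density.
  y <- x + rnorm(1, 0, prop_sd)   # symmetric random-walk proposal
  for (lf in log_factors) {
    if (log(runif(1)) >= lf(y) - lf(x))
      return(x)                    # early rejection: stop evaluating
  }
  y
}

# Example: standard normal target split into two half-weight factors.
lfs <- list(function(z) -z^2 / 4, function(z) -z^2 / 4)
x <- numeric(5000)
for (t in 2:5000) x[t] <- delayed_accept_step(x[t - 1], lfs)
```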

Bayesian nonparametric weighted sampling inference

Yajuan Si, Natesh Pillai, and I write: It has historically been a challenge to perform Bayesian inference in a design-based survey context. The present paper develops a Bayesian model for sampling inference using inverse-probability weights. We use a hierarchical approach in which we model the distribution of the weights of the nonsampled units in the […]

WAIC and cross-validation in Stan!

Aki and I write: The Watanabe-Akaike information criterion (WAIC) and cross-validation are methods for estimating pointwise out-of-sample prediction accuracy from a fitted Bayesian model. WAIC is based on the series expansion of leave-one-out cross-validation (LOO), and asymptotically they are equal. With finite data, WAIC and cross-validation address different predictive questions and thus it is useful […]
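
For anyone who wants to compute it by hand, a minimal R sketch following the paper's definitions, assuming you have saved pointwise log-likelihoods (an S-draws-by-n-points matrix, e.g. from a Stan generated quantities block):

```r
waic <- function(log_lik) {
  lppd <- sum(log(colMeans(exp(log_lik))))   # log pointwise pred. density
  p_waic <- sum(apply(log_lik, 2, var))      # effective no. of parameters
  c(waic = -2 * (lppd - p_waic), lppd = lppd, p_waic = p_waic)
}
```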

An interesting mosaic of a data programming course

Rajit Dasgupta writes: I have been working on a website, SlideRule, that in its present state is a catalog of online courses aggregated from over 35 providers. One of the products we are building on top of this is something called Learning Paths, which are essentially sequences of online courses designed to help learners […]

Thermodynamic Monte Carlo: Michael Betancourt’s new method for simulating from difficult distributions and evaluating normalizing constants

I hate to keep bumping our scheduled posts, but this is just too important and too exciting to wait. So it’s time to jump the queue. The news is a paper from Michael Betancourt that presents a super-cool new way to compute normalizing constants: A common strategy for inference in complex models is the relaxation […]
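
For background, here is a toy R sketch of classic thermodynamic integration (path sampling), the family of ideas the paper builds on; to be clear, this is not Michael's new algorithm. The identity is log Z = ∫₀¹ E_β[log lik] dβ, with the expectation taken under the tempered density prior × lik^β, and the toy problem is conjugate so the tempered posterior is available in closed form.

```r
# N(0, 1) prior, one observation y ~ N(theta, 1); the tempered
# posterior at inverse temperature b is N(b * y / (1 + b), 1 / (1 + b)).
y <- 1.5
log_lik <- function(theta) dnorm(y, theta, 1, log = TRUE)
betas <- seq(0, 1, length.out = 21)
E_loglik <- sapply(betas, function(b) {
  theta <- rnorm(1e5, b * y / (1 + b), sqrt(1 / (1 + b)))
  mean(log_lik(theta))
})
# Trapezoidal rule over beta gives the log marginal likelihood:
logZ_ti <- sum(diff(betas) * (head(E_loglik, -1) + tail(E_loglik, -1)) / 2)
logZ_exact <- dnorm(y, 0, sqrt(2), log = TRUE)
c(logZ_ti, logZ_exact)  # the two should agree closely
```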

“The results (not shown) . . .”

Pro tip: Don’t believe any claims about results not shown in a paper. Even if the paper has been published. Even if it’s been cited hundreds of times. If the results aren’t shown, they haven’t been checked. I learned this the hard way after receiving this note from Bin Liu, who wrote: Today I saw […]

Once more on nonparametric measures of mutual information

Ben Murrell writes: Our reply to Kinney and Atwal has come out (http://www.pnas.org/content/early/2014/04/29/1403623111.full.pdf) along with their response (http://www.pnas.org/content/early/2014/04/29/1404661111.full.pdf). I feel like they somewhat missed the point. If you’re still interested in this line of discussion, feel free to post, and maybe the Murrells and Kinney can bash it out in your comments! Background: Too many […]
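
For readers new to the thread, here is one of the simplest nonparametric estimators discussed in this literature, a plug-in estimate of mutual information by binning both variables; this is neither Kinney and Atwal's estimator nor the Murrells', just a way to fix ideas.

```r
binned_mi <- function(x, y, bins = 10) {
  p <- table(cut(x, bins), cut(y, bins)) / length(x)  # joint frequencies
  px <- rowSums(p); py <- colSums(p)                   # marginals
  nz <- p > 0
  sum(p[nz] * log(p[nz] / outer(px, py)[nz]))          # MI in nats
}

set.seed(4)
x <- rnorm(5000); y <- x + rnorm(5000)
binned_mi(x, y)   # true value here is 0.5 * log(2), about 0.35 nats
```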