Archive of posts filed under the Statistical computing category.

What does CNN have in common with Carmen Reinhart, Kenneth Rogoff, and Richard Tol: They all made foolish, embarrassing errors that would never have happened had they been using R Markdown

Rachel Cunliffe shares this delight: Had the CNN team used an integrated statistical analysis and display system such as R Markdown, nobody would’ve needed to type in the numbers by hand, and the above embarrassment never would’ve occurred. And CNN should be embarrassed about this: it’s much worse than a simple typo, as it indicates […]

Bayesian Cognitive Modeling Examples Ported to Stan

There’s a new intro to Bayes in town. Michael Lee and Eric-Jan Wagenmakers. 2014. Bayesian Cognitive Modeling: A Practical Course. Cambridge University Press. This book’s a wonderful introduction to applied Bayesian modeling. But don’t take my word for it: you can download and read the first two parts of the book (hundreds of pages […]

My talk with David Schiminovich this Wed noon: “The Birth of the Universe and the Fate of the Earth: One Trillion UV Photons Meet Stan”

This talk will have two parts. (1) Astronomy professor David Schiminovich will discuss the ways in which recent large-scale sky surveys that include billions of data points can address questions such as, What will happen to the Earth and other planets when the Sun becomes a white dwarf? (2) Statistics professor Andrew Gelman will discuss […]

Dave Blei course on Foundations of Graphical Models

Dave Blei writes: This course is cross listed in Computer Science and Statistics at Columbia University. It is a PhD level course about applied probabilistic modeling. Loosely, it will be similar to this course. Students should have some background in probability, college-level mathematics (calculus, linear algebra), and be comfortable with computer programming. The course is […]

How Many Mic’s Do We Rip

Yakir Reshef writes: Our technical comment on Kinney and Atwal’s paper on MIC and equitability has come out in PNAS along with their response. Similarly to Ben Murrell, who also wrote you a note when he published a technical comment on the same work, we feel that they “somewhat missed the point.” Specifically: one statistic […]

“A hard case for Mister P”

Kevin Van Horn sent me an email with the above title (ok, he wrote MRP, but it’s the same idea) and the following content: I’m working on a problem that at first seemed like a clear case where multilevel modeling would be useful. As I’ve dug into it I’ve found that it doesn’t quite fit […]

Cool new position available: Director of the Pew Research Center Labs

Peter Henne writes: I wanted to let you know about a new opportunity at Pew Research Center for a data scientist that might be relevant to some of your colleagues. I [Henne] am a researcher with the Pew Research Center, where I manage an international index on religious issues. I am also working with others […]

Stanny Stanny Stannitude

On the stan-users list, Richard McElreath reports: With 2.4 out, I ran a quick test of how much speedup I could get by changing my old non-vectorized multi_normal sampling to the new vectorized form. I get a 40% time savings, without even trying hard. This is much better than I expected. Timings with vectorized multi_normal: […]

SciLua 2 includes NUTS

The most recent release of SciLua includes an implementation of Matt’s sampler, NUTS (link is to the final JMLR paper, which is a revision of the earlier arXiv version). According to the author of SciLua, Stefano Peluchetti: Should be quite similar to your [Stan’s] implementation with some differences in the adaptation strategy. If you have […]

Stan 2.4, New and Improved

We’re happy to announce that all three interfaces (CmdStan, PyStan, and RStan) are up and ready to go for Stan 2.4. As usual, you can find full instructions for installation on the Stan Home Page. Here are the release notes with a list of what’s new and improved: New Features ———— * L-BFGS optimization (now […]

NYC workshop 22 Aug on open source machine learning systems

The workshop is organized by John Langford (Microsoft Research NYC), along with Alekh Agarwal and Alina Beygelzimer, and it features Liblinear, Vowpal Wabbit, Torch, Theano, and . . . you guessed it . . . Stan! Here’s the current program: 8:55am: Introduction 9:00am: Liblinear by CJ Lin. 9:30am: Vowpal Wabbit and Learning to Search (John […]

Stan World Cup update

The other day I fit a simple model to estimate team abilities from World Cup outcomes. I fit the model to the signed square roots of the score differentials, using the square root on the theory that when the game is less close, it becomes more variable. 0. Background As you might recall, the estimated […]
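The signed square root mentioned above can be sketched in a few lines of Python. This is only an illustration of the transform itself, not the model from the post; the function name `signed_sqrt` and the example score differentials are made up for the sketch.

```python
import math


def signed_sqrt(score_diff):
    """Return sign(d) * sqrt(|d|) for a score differential d.

    Hypothetical helper for illustration; the name is not from the post.
    """
    # copysign attaches the sign of score_diff to the square root of its magnitude.
    return math.copysign(math.sqrt(abs(score_diff)), score_diff)


# Made-up score differentials (goals for minus goals against), not real data.
example_diffs = [-4, -1, 0, 1, 4, 9]
transformed = [signed_sqrt(d) for d in example_diffs]
print(transformed)
```

The transform compresses lopsided results, so a 4-goal margin counts as 2 rather than 4, while keeping track of which team won.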

Stan goes to the World Cup

I thought it would be fun to fit a simple model in Stan to estimate the abilities of the teams in the World Cup, then I could post everything here on the blog, the whole story of the analysis from beginning to end, showing the results of spending a couple hours on a data analysis. […]

Useless Algebra, Inefficient Computation, and Opaque Model Specifications

I (Bob, not Andrew) doubt anyone sets out to do algebra for the fun of it, implement an inefficient algorithm, or write a paper where it’s not clear what the model is. But… Why not write it in BUGS or Stan? Over on the Stan users group, Robert Grant wrote: Hello everybody, I’ve just been […]

Comment of the week

This one, from DominikM: Really great, the simple random intercept – random slope mixed model I did yesterday now runs at least an order of magnitude faster after installing RStan 2.3 this morning. You are doing an awesome job, thanks a lot!

(Py, R, Cmd) Stan 2.3 Released

We’re happy to announce RStan, PyStan and CmdStan 2.3. Instructions on how to install at: As always, let us know if you’re having problems or have comments or suggestions. We’re hoping to roll out the next release a bit quicker this time, because we have lots of good new features that are almost ready […]

Judicious Bayesian Analysis to Get Frequentist Confidence Intervals

Christian Bartels has a new paper, “Efficient generic integration algorithm to determine confidence intervals and p-values for hypothesis testing,” of which he writes: The paper proposes to do an analysis of observed data which may be characterized as doing a judicious Bayesian analysis of the data resulting in the determination of exact frequentist p-values and […]

Average predictive comparisons in R: David Chudzicki writes a package!

Here it is: An R Package for Understanding Arbitrary Complex Models As complex models become widely used, it’s more important than ever to have ways of understanding them. Even when a model is built primarily for prediction (rather than primarily as an aid to understanding), we still need to know what it’s telling us. For […]

My answer: Write a little program to simulate it

Brendon Greeff writes: I was searching for an online math blog and found your email address. I have a question relating to the draw for a sports tournament. If there are 20 teams in a tournament divided into 4 groups, and those teams are selected based on four “bands” (Band: 1-5 ranked teams, 6-10, 11-15, […]

Stan is Turing Complete. So what?

This post is by Bob Carpenter. Stan is Turing complete! There seems to be a persistent misconception that Stan isn’t Turing complete.1, 2 My guess is that it stems from Stan’s (not coincidental) superficial similarity to BUGS and JAGS, which provide directed graphical model specification languages. Stan’s Turing completeness follows from its support of array data […]