Archive of posts filed under the Statistical computing category.

I love it when I can respond to a question with a single link

Shira writes: This came up from trying to help a colleague of mine at Human Rights Watch. He has several completely observed variables X, and a variable with 29% missing, Y. He wants a histogram (and other descriptive statistics) of a “filled in” Y. He can regress Y on X, and impute missing Y’s from […]

No, Michael Jordan didn’t say that!

The names are changed, but the song remains the same. First verse. There’s an article by a journalist, The odds, continually updated, by F.D. Flam in the NY Times to which Andrew responded in blog form, No, I didn’t say that, by Andrew Gelman, on this blog. Second verse. There’s an article by a journalist, […]

Stan 2.5, now with MATLAB, Julia, and ODEs

As usual, you can find everything on the Stan Home Page. Drop us a line on the stan-users group if you have problems with installs or questions about Stan or coding particular models. New Interfaces We’d like to welcome two new interfaces: MatlabStan by Brian Lau, and Stan.jl (for Julia) by Rob Goedman. The new […]

Statistical Communication and Graphics Manifesto

Statistical communication includes graphing data and fitted models, programming, writing for specialized and general audiences, lecturing, working with students, and combining words and pictures in different ways. The common theme of all these interactions is that we need to consider our statistical tools in the context of our goals. Communication is not just about conveying […]

My course on Statistical Communication and Graphics

We will study and practice many different aspects of statistical communication, including graphing data and fitted models, programming in Rrrrrrrr, writing for specialized and general audiences, lecturing, working with students and colleagues, and combining words and pictures in different ways. You learn by writing an entry in your statistics diary every day. You learn by […]

Some general principles of Bayesian data analysis, arising from a Stan analysis of Jon Lee Anderson’s height

God is in every leaf of every tree. The leaf in question today is the height of journalist and Twitter aficionado Jon Lee Anderson, a man who got some attention a couple years ago after disparaging some dude for having too high a tweets-to-followers ratio. Anderson called the other guy a “little twerp” which made […]

What does CNN have in common with Carmen Reinhart, Kenneth Rogoff, and Richard Tol: They all made foolish, embarrassing errors that would never have happened had they been using R Markdown

Rachel Cunliffe shares this delight: Had the CNN team used an integrated statistical analysis and display system such as R Markdown, nobody would’ve needed to type in the numbers by hand, and the above embarrassment never would’ve occurred. And CNN should be embarrassed about this: it’s much worse than a simple typo, as it indicates […]

Bayesian Cognitive Modeling Examples Ported to Stan

There’s a new intro to Bayes in town. Michael Lee and Eric-Jan Wagenmakers. 2014. Bayesian Cognitive Modeling: A Practical Course. Cambridge Uni. Press. This book’s a wonderful introduction to applied Bayesian modeling. But don’t take my word for it — you can download and read the first two parts of the book (hundreds of pages […]

My talk with David Schiminovich this Wed noon: “The Birth of the Universe and the Fate of the Earth: One Trillion UV Photons Meet Stan”

This talk will have two parts. (1) Astronomy professor David Schiminovich will discuss the ways in which recent large-scale sky surveys that include billions of data points can address questions such as, What will happen to the Earth and other planets when the Sun becomes a white dwarf? (2) Statistics professor Andrew Gelman will discuss […]

Dave Blei course on Foundations of Graphical Models

Dave Blei writes: This course is cross listed in Computer Science and Statistics at Columbia University. It is a PhD level course about applied probabilistic modeling. Loosely, it will be similar to this course. Students should have some background in probability, college-level mathematics (calculus, linear algebra), and be comfortable with computer programming. The course is […]

How Many Mic’s Do We Rip

Yakir Reshef writes: Our technical comment on Kinney and Atwal’s paper on MIC and equitability has come out in PNAS along with their response. Similarly to Ben Murrell, who also wrote you a note when he published a technical comment on the same work, we feel that they “somewhat missed the point.” Specifically: one statistic […]

“A hard case for Mister P”

Kevin Van Horn sent me an email with the above title (ok, he wrote MRP, but it’s the same idea) and the following content: I’m working on a problem that at first seemed like a clear case where multilevel modeling would be useful. As I’ve dug into it I’ve found that it doesn’t quite fit […]

Cool new position available: Director of the Pew Research Center Labs

Peter Henne writes: I wanted to let you know about a new opportunity at Pew Research Center for a data scientist that might be relevant to some of your colleagues. I [Henne] am a researcher with the Pew Research Center, where I manage an international index on religious issues. I am also working with others […]

Stanny Stanny Stannitude

On the stan-users list, Richard McElreath reports: With 2.4 out, I ran a quick test of how much speedup I could get by changing my old non-vectorized multi_normal sampling to the new vectorized form. I get a 40% time savings, without even trying hard. This is much better than I expected. Timings with vectorized multi_normal: […]

SciLua 2 includes NUTS

The most recent release of SciLua includes an implementation of Matt’s sampler, NUTS (link is to the final JMLR paper, which is a revision of the earlier arXiv version). According to the author of SciLua, Stefano Peluchetti: Should be quite similar to your [Stan’s] implementation with some differences in the adaptation strategy. If you have […]

Stan 2.4, New and Improved

We’re happy to announce that all three interfaces (CmdStan, PyStan, and RStan) are up and ready to go for Stan 2.4. As usual, you can find full instructions for installation on the Stan Home Page. Here are the release notes with a list of what’s new and improved: New Features ———— * L-BFGS optimization (now […]

NYC workshop 22 Aug on open source machine learning systems

The workshop is organized by John Langford (Microsoft Research NYC), along with Alekh Agarwal and Alina Beygelzimer, and it features Liblinear, Vowpal Wabbit, Torch, Theano, and . . . you guessed it . . . Stan! Here’s the current program: 8:55am: Introduction 9:00am: Liblinear by CJ Lin. 9:30am: Vowpal Wabbit and Learning to Search (John […]

Stan World Cup update

The other day I fit a simple model to estimate team abilities from World Cup outcomes. I fit the model to the signed square roots of the score differentials, using the square root on the theory that when the game is less close, it becomes more variable. 0. Background As you might recall, the estimated […]
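
For readers who haven't seen the transformation mentioned in this excerpt, here is a minimal sketch of a signed square root, sign(d) * sqrt(|d|), applied to score differentials. It is in Python rather than Stan, the function name `signed_sqrt` and the example values are made up for illustration, and it is not the code from the post.

```python
import math


def signed_sqrt(score_diff: float) -> float:
    """Return sign(d) * sqrt(|d|) for a score differential d."""
    # copysign attaches the sign of the original differential to the root.
    return math.copysign(math.sqrt(abs(score_diff)), score_diff)


# Hypothetical score differentials (goals for minus goals against),
# chosen for illustration only; these are not real World Cup results.
example_diffs = [-4, -1, 0, 1, 4]
transformed = [signed_sqrt(d) for d in example_diffs]
```

The transformation compresses lopsided results: under this sketch a four-goal margin counts as only twice a one-goal margin, which fits the idea in the excerpt that less-close games are more variable.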

Stan goes to the World Cup

I thought it would be fun to fit a simple model in Stan to estimate the abilities of the teams in the World Cup, then I could post everything here on the blog, the whole story of the analysis from beginning to end, showing the results of spending a couple hours on a data analysis. […]

Useless Algebra, Inefficient Computation, and Opaque Model Specifications

I (Bob, not Andrew) doubt anyone sets out to do algebra for the fun of it, implement an inefficient algorithm, or write a paper where it’s not clear what the model is. But… Why not write it in BUGS or Stan? Over on the Stan users group, Robert Grant wrote Hello everybody, I’ve just been […]