Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. You do not care about aesthetics, only availability of packages and getting results quickly. To me, R is easy and SAS is difficult. I once worked with some students [...]
Stan and RStan 1.1.0
We’re happy to announce the availability of Stan and RStan versions 1.1.0, which are general tools for performing model-based Bayesian inference using the no-U-turn sampler, an adaptive form of Hamiltonian Monte Carlo. Information on downloading, installing, and using them is available as always from the Stan Home Page: http://mc-stan.org/ Let us know if you have [...]
Math Talks :: Action Movies
Jonathan Goodman gave the departmental seminar yesterday (10 Dec 2012) and I was amused by an extended analogy he made. After his (very clear) intro, he said that math talks were like action movies. The overall theorem and its applications provide the plot, and the proofs provide the action scenes.
Stantastic!
Richard McElreath writes: I’ve been translating a few ongoing data analysis projects into Stan code, mostly with success. The most important for me right now has been a hierarchical zero-inflated gamma problem. This is a “hurdle” model, in which a Bernoulli GLM produces zeros/nonzeros, and then a gamma GLM produces the nonzero values, using varying effects [...]
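As a rough illustration of the hurdle structure described in this excerpt, here is a minimal Python sketch of the per-observation log-likelihood. It is not McElreath's Stan code: the function name, the shape/rate parameterization of the gamma, and the use of fixed parameters in place of GLM linear predictors with varying effects are all assumptions made for illustration.

```python
import math


def hurdle_gamma_loglik(y, p_nonzero, shape, rate):
    """Log-likelihood of one observation under a Bernoulli-gamma hurdle model.

    Hypothetical helper: p_nonzero is the Bernoulli probability of a nonzero
    outcome; shape and rate parameterize the gamma for the nonzero values.
    In a full model each would come from its own GLM linear predictor.
    """
    if y == 0:
        # The Bernoulli part alone accounts for the zeros.
        return math.log1p(-p_nonzero)
    # Nonzero value: clear the hurdle, then evaluate the gamma log-density.
    log_gamma_pdf = (
        shape * math.log(rate)
        - math.lgamma(shape)
        + (shape - 1.0) * math.log(y)
        - rate * y
    )
    return math.log(p_nonzero) + log_gamma_pdf


def total_loglik(ys, p_nonzero, shape, rate):
    """Sum the per-observation contributions over a data set."""
    return sum(hurdle_gamma_loglik(y, p_nonzero, shape, rate) for y in ys)
```

The two branches show why the model separates cleanly: the zeros inform only the Bernoulli part, and the gamma part sees only the nonzero values.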
Stan at NIPS 2012 Workshop on Probabilistic Programming
If you need an excuse to go skiing in Tahoe next month, our paper on Stan as a probabilistic programming language was accepted for: Workshop on Probabilistic Programming, NIPS 2012, 7–8 December 2012, Lake Tahoe, Nevada. The workshop is organized by the folks behind the probabilistic programming language Church and has a great lineup of [...]
Rust
I happened to be referring to the path sampling paper today and took a look at Appendix A.2:
Computational problems with glm etc.
John Mount provides some useful background and follow-up on our discussion from last year on computational instability of the usual logistic regression solver. Just to refresh your memory, here’s a simple logistic regression with only a constant term and no separation, nothing pathological at all: > y display (glm (y ~ 1, family=binomial(link="logit"))) glm(formula = [...]
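For readers who want to experiment with the kind of solver under discussion, here is a minimal Python sketch of unguarded Newton iterations for a logistic regression with only a constant term. It is not the glm implementation in R. The function name, the `start` argument, and the example data (ten ones and five zeros) are assumptions chosen for illustration.

```python
import math


def fit_intercept_logit(y, start=0.0, max_iter=25, tol=1e-10):
    """Newton's method for an intercept-only logistic regression.

    Hypothetical helper: y is a list of 0/1 outcomes and start is the
    initial value of the intercept on the logit scale.
    """
    n = len(y)
    total = sum(y)
    b = start
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-b))
        grad = total - n * p        # score: derivative of the log-likelihood
        hess = n * p * (1.0 - p)    # observed information (negative Hessian)
        if hess == 0.0:
            # Fitted probability is numerically 0 or 1: no Newton step exists.
            raise ArithmeticError("Newton step undefined at b = %r" % b)
        step = grad / hess
        b += step
        if abs(step) < tol:
            break
    return b


# Hypothetical data: ten ones and five zeros, so the MLE is log(10/5).
y = [1] * 10 + [0] * 5
b_hat = fit_intercept_logit(y)
```

From the default start the iterations settle on the logit of the sample mean. Because the step is not damped, varying `start` is a simple way to explore how sensitive this kind of iteration is to its initial value.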
AdviseStat 47% Campaign Ad
Lee Wilkinson sends me this amusing ad for his new software, AdviseStat: The ad is a parody, but the software is real!
Choices in graphing parallel time series
I saw this graph posted by Tyler Cowen: and my first thought was that the bar plot should be replaced by a line plot: Six lines, one for each income category, with each line being a time series of these changes. With a line plot, you can more easily see each time series (these are [...]
Commercial Bayesian inference software is popping up all over
Steve Cohen writes: As someone who has been working with Bayesian statistical models for the past several years, I [Cohen] have been challenged recently to describe the difference between Bayesian Networks (as implemented in BayesiaLab software) and modeling and inference using MCMC methods. I hope you have the time to give me (or to write [...]
Cool one-day miniconference at Columbia Fri 12 Oct on computational and online social science
One thing we do here at the Applied Statistics Center is hold mini-conferences. The next one looks really cool. It’s organized by Sharad Goel and Jake Hofman (Microsoft Research, formerly at Yahoo Research), David Park (Columbia University), and Sergei Vassilvitskii (Google). As with our other conferences, one of our goals is to mix the academic [...]
Stan is fast
10,000 iterations for 4 chains on the (precompiled) efficiently-parameterized 8-schools model:
A Stan is Born
Stan 1.0.0 and RStan 1.0.0. It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language [...]
Visualizing Distributions of Covariance Matrices
Since we’ve been discussing prior distributions on covariance matrices, I will recommend this recent article (coauthored with Tomoki Tokuda, Ben Goodrich, Iven Van Mechelen, and Francis Tuerlinckx) on their visualization: We present some methods for graphing distributions of covariance matrices and demonstrate them on several models, including the Wishart, inverse-Wishart, and scaled inverse-Wishart families in [...]
More on scaled-inverse Wishart and prior independence
I’ve had a couple of email conversations in the past couple days on dependence in multivariate prior distributions. Modeling the degrees of freedom and scale parameters in the t distribution First, in our Stan group we’ve been discussing the choice of priors for the degrees-of-freedom parameter in the t distribution. I wrote that also there’s [...]
Migrating from dot to underscore
My C-oriented Stan collaborators have convinced me to use underscore (_) rather than dot (.) as much as possible in expressions in R. For example, I can name a variable n_years rather than n.years. This is fine. But I’m getting annoyed because I need to press the shift key every time I type the underscore. [...]
D. Buggin
Joe Zhao writes: I am trying to fit my data using the scaled inverse Wishart model you mentioned in your book, Data analysis using regression and hierarchical models. Instead of using a uniform prior on the scale parameters, I try to use a log-normal distribution prior. However, I found that the individual coefficients don’t shrink [...]
Slow progress
I received the following message: I am a Psychology postgraduate at the University of Glasgow and am writing for an article request. I’ve just read your 2008 published article titled “A weakly informative default prior distribution for logistic and other regression models” and found from it that your group also wrote a report on applying [...]
Bayesian Learning via Stochastic Gradient Langevin Dynamics
Burak Bayramli writes: In this paper by Sungjin Ahn, Anoop Korattikara, and Max Welling and this paper by Welling and Yee Whye Teh, there are some arguments on big data and the use of MCMC. Both papers have suggested improvements to speed up MCMC computations. I was wondering what your thoughts were, especially on this [...]
Optimizing software in C++
Matt3 pointed us to this helpful document by Agner Fog, “Optimizing software in C++ An optimization guide for Windows, Linux and Mac platforms.” More here. Enjoy!
Moving beyond hopeless graphics
I was at a talk a while ago where the speaker presented tables with 4, 5, 6, even 8 significant digits even though, as is usual, only the first or second digit of each number conveyed any useful information. A graph would be better, but even if you’re too lazy to make a plot, a bit [...]
Learning Differential Geometry for Hamiltonian Monte Carlo
You can get a taste of Hamiltonian Monte Carlo (HMC) by reading the very gentle introduction in David MacKay’s general text on information theory: MacKay, D. 2003. Information Theory, Inference, and Learning Algorithms. Cambridge University Press. [see Chapter 31, which is relatively standalone and can be downloaded separately.] Follow this up with Radford Neal’s much [...]
The first version of my “inference from iterative simulation using parallel sequences” paper!
From August 1990. It was in the form of a note sent to all the people in the statistics group of Bell Labs, where I’d worked that summer. To all: Here’s the abstract of the work I’ve done this summer. It’s stored in the file, /fs5/gelman/abstract.bell, and copies of the Figures 1-3 are on Trevor’s [...]
Google Translate for code, and an R help-list bot
What we did in our Stan meeting yesterday: Some discussion of revision of the NUTS paper, some conversations about parameterizations of categorical-data models, plans for the R interface, blah blah blah. But also, I had two exciting new ideas! Google Translate for code: Wouldn’t it be great if Google Translate could work on computer languages? [...]