Max Joseph writes:
Conditional autoregressive (CAR) models are popular as prior distributions for spatial random effects with areal spatial data. Historically, MCMC algorithms for CAR models have benefitted from efficient Gibbs sampling via full conditional distributions for the spatial random effects. But, these conditional specifications do not work in Stan, where the joint density needs to be specified (up to a multiplicative constant).
CAR models can still be implemented in Stan by specifying a multivariate normal prior on the spatial random effects, parameterized by a mean vector and a precision matrix. This works, but is slow and hard to scale to large datasets.
Order(s) of magnitude speedups can be achieved by combining 1) sparse matrix multiplications from Kyle Foreman (outlined on the stan-users mailing list), and 2) a fancy determinant trick from Jin, Xiaoping, Bradley P. Carlin, and Sudipto Banerjee. “Generalized hierarchical multivariate CAR models for areal data.” Biometrics 61.4 (2005): 950-961.
With the oft-used Scotland lip cancer dataset, the sparse CAR implementation with the NUTS (No-U-Turn Sampler) algorithm in Stan gives 120 effective samples/sec compared to 7 effective samples/sec for the precision matrix implementation.
Details for these sparse exact methods can be found here.
Max Joseph is part of the Earth Lab Analytics Hub, University of Colorado – Boulder.
Bob Carpenter adds:
I put Max Joseph’s case study up on our web page. The top-level case studies page is here (with license and links to package dependencies and the case study).
The direct link to the case study is here.
Cool indeed. With this set-up, the implementation of such models has moved from “statistics research” to “statistics practice.”
P.S. I took a look at the document and I have a few questions/comments:
1. The map of Scotland is distorted. No big deal but might as well get that right.
2. I can’t believe that the “tau ~ gamma(0.5, .0005);” prior is a good idea. It just looks weird to me. At the very least I’d suggest a reparameterization to make it scale-free.
3. Is there a way to package up all that code into a function? I don’t have enough experience with Stan functions to be sure, but the idea would be to have a CAR model with all that code inside the function so that, as a user, I could just say theta ~ CAR(…) with all the parameters, and then I wouldn’t have to worry about all those matrix manipulations.