An interweaving-transformation strategy for boosting MCMC efficiency

Posted on October 19, 2011 9:24 AM by Andrew

Yaming Yu and Xiao-Li Meng write in with a cool new idea for improving the efficiency of Gibbs and Metropolis in multilevel models:

For a broad class of multilevel models, there exist two well-known competing parameterizations, the centered parameterization (CP) and the non-centered parameterization (NCP), for effective MCMC implementation. Much literature has been devoted to the questions of when to use which and how to compromise between them via partial CP/NCP. This article introduces an alternative strategy for boosting MCMC efficiency via simply interweaving—but not alternating—the two parameterizations. This strategy has the surprising property that failure of both the CP and NCP chains to converge geometrically does not prevent the interweaving algorithm from doing so. It achieves this seemingly magical property by taking advantage of the discordance of the two parameterizations, namely, the sufficiency of CP and the ancillarity of NCP, to substantially reduce the Markovian dependence, especially when the original CP and NCP form a “beauty and beast” pair (i.e., when one chain mixes far more rapidly than the other). The ancillarity–sufficiency reformulation of the CP–NCP dichotomy allows us to borrow insight from the well-known Basu’s theorem on the independence of (complete) sufficient and ancillary statistics, albeit a Bayesian version of Basu’s theorem is currently lacking. To demonstrate the competitiveness and versatility of this ancillarity–sufficiency interweaving strategy (ASIS) for real-world problems, we apply it to fit (1) a Cox process model for detecting changes in source intensity of photon counts observed by the Chandra X-ray telescope from a (candidate) neutron/quark star, which was the problem that motivated the ASIS strategy as it defeated other methods we initially tried; (2) a probit model for predicting latent membranous lupus nephritis; and (3) an interval-censored normal model for studying the lifetime of fluorescent lights. 
A bevy of open questions are presented, from the mysterious but exceedingly suggestive connections between ASIS and fiducial/structural inferences to nested ASIS for further boosting MCMC efficiency.
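
To make the "interweaving, not alternating" idea concrete, here is a minimal sketch on a toy model. The model is an assumption chosen for illustration, not something stated in this post: y | Y_mis ~ N(Y_mis, 1), Y_mis | theta ~ N(theta, V), with a flat prior on theta and V known. Under CP the augmented variable is Y_mis; under NCP it is eta = Y_mis - theta. The function names are hypothetical. For this toy model the lag-1 autocorrelation of theta should be about 1/(1+V) under the CP Gibbs sampler and V/(1+V) under the NCP one, while the interwoven chain should be close to uncorrelated.

```python
import random


def gibbs_cp(y, V, n, rng, theta=0.0):
    """Gibbs sampler under the centered parameterization (Y_mis, theta)."""
    draws = []
    for _ in range(n):
        # Y_mis | theta, y
        y_mis = rng.gauss((theta + V * y) / (1 + V), (V / (1 + V)) ** 0.5)
        # theta | Y_mis (flat prior on theta)
        theta = rng.gauss(y_mis, V ** 0.5)
        draws.append(theta)
    return draws


def gibbs_ncp(y, V, n, rng, theta=0.0):
    """Gibbs sampler under the non-centered parameterization (eta, theta)."""
    draws = []
    for _ in range(n):
        # eta | theta, y, where eta = Y_mis - theta ~ N(0, V) a priori
        eta = rng.gauss(V * (y - theta) / (1 + V), (V / (1 + V)) ** 0.5)
        # theta | eta, y (flat prior on theta)
        theta = rng.gauss(y - eta, 1.0)
        draws.append(theta)
    return draws


def asis(y, V, n, rng, theta=0.0):
    """Interweave CP and NCP inside each iteration."""
    draws = []
    for _ in range(n):
        # Step 1: draw the CP augmentation given the current theta.
        y_mis = rng.gauss((theta + V * y) / (1 + V), (V / (1 + V)) ** 0.5)
        # Step 2a: intermediate theta draw under CP.
        theta_half = rng.gauss(y_mis, V ** 0.5)
        # Re-express the SAME augmented data in NCP form (no new draw here).
        eta = y_mis - theta_half
        # Step 2b: redraw theta under NCP, conditioning on eta.
        theta = rng.gauss(y - eta, 1.0)
        draws.append(theta)
    return draws


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    y_obs, V, n = 1.0, 1.0, 20000  # illustrative values only
    for name, sampler in [("CP", gibbs_cp), ("NCP", gibbs_ncp), ("ASIS", asis)]:
        rho = lag1_autocorr(sampler(y_obs, V, n, rng))
        print(f"{name:4s} lag-1 autocorrelation: {rho:.3f}")
```

The point of the sketch is the middle of `asis`: the second theta draw reuses the augmented data from the first half of the iteration, just rewritten as eta, rather than drawing a fresh augmentation as an alternating scheme would. Try a very small or very large V to see the "beauty and beast" case, where one of the two plain samplers mixes badly and the other mixes well.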

As is usual with XL, the computational idea comes with some deeper statistical principles. I’m reminded of the folk theorem and the Pinocchio principle.

The current issue of JCGS features a series of discussions of Yu and Meng’s article.

4 thoughts on “An interweaving-transformation strategy for boosting MCMC efficiency”

Jerzy on October 19, 2011 at 11:40 am said:

Nice. I did not even know that reparameterization (CP vs NCP) is a common “trick”, much less that you can get improvements by combining the two. Is Papaspiliopoulos, Roberts, and Skold 2007 a good intro reference for this approach?

I just ran into a situation where, essentially, a model involving
e ~ N(0,s)
logit(y) = X*B+e
was converging sloooowly. After trying many other things, out of sheer desperation I tested
xbe ~ N(X*B,s)
logit(y) = xbe
and it converged much more quickly. The fact that this made a difference still seems ridiculous to me. I assume that reading the article and references will help!

Iain on October 19, 2011 at 2:29 pm said:

Dealing with this issue can be very important in some hierarchical models. I’ll be interested to see how it compares to existing work:
http://pubs.amstat.org/doi/abs/10.1198/106186006X100470 and (my own take) http://homepages.inf.ed.ac.uk/imurray2/pub/10hypers/

Thanks for the reference.

C Ryan King on October 20, 2011 at 9:23 am said:

I was surprised that they gave up on the probit model. Maybe it can’t beat existing PX-DA, but that’s a pretty important application.

Pingback: parallel Metropolis Hastings [published] « Xi'an's Og
