Yesterday I had a conversation with Andrew that went like this:
- Hey, someone’s giving a talk today on HMMs (that someone was Yang Chen, who was giving a talk based on her JASA paper Analyzing single-molecule protein transportation experiments via hierarchical hidden Markov models).
Maybe we should add some specialized discrete modules to Stan so we can fit HMMs. Word on the street is that Stan can’t fit models with discrete parameters.
- Uh, we can already fit HMMs in Stan. There’s a section in the manual that explains how (section 9.6, Hidden Markov Models).
We’ve had some great traffic on the mailing list lately related to movement HMMs, which were all the rage at ISEC 2016 for modeling animal movement (e.g., whale diving behavior in the presence or absence of sonar).
- But how do you marginalize out the discrete states?
- The forward algorithm (don’t even need the backward part for calculating HMM likelihoods). It’s a dynamic programming algorithm that calculates HMM likelihoods in O(T K^2) time, where T is the number of time points and K is the number of HMM states.
- So you built that in C++ somehow?
- Nope, just straight Stan code. It’s hard to generalize to C++ given that the state emission models vary by application and the transitions can involve predictors. Like many of these classes of models, Stan’s flexible enough to, say, implement all of the models described in Roland Langrock et al.’s moveHMM package for R. The doc for that package is awesome—best intro to the models I’ve seen. I like the concreteness of resolving details in code. Langrock’s papers are nice, too, so I suspect his co-authored book on HMMs for time series is worth reading as well. That’s another one that’d be nice to translate to Stan.
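To make the marginalization concrete: here is a minimal sketch of the forward algorithm in Python (not the Stan code from the manual; the Stan version works the same way but in Stan’s log-scale notation). The function names and the discrete emission setup are illustrative assumptions—the recursion itself is the standard O(T K^2) dynamic program.

```python
import numpy as np

def forward_log_likelihood(log_pi, log_A, log_B, obs):
    """Forward algorithm: log p(obs) for a discrete-emission HMM,
    marginalizing the hidden states by dynamic programming in O(T K^2).

    log_pi : (K,)   log initial state probabilities
    log_A  : (K, K) log transitions, log_A[i, j] = log p(z_t = j | z_{t-1} = i)
    log_B  : (K, V) log emissions,   log_B[k, v] = log p(y = v | z = k)
    obs    : length-T sequence of observed symbols in 0..V-1
    """
    # alpha[k] = log p(y_1, ..., y_t, z_t = k), initialized at t = 1
    alpha = log_pi + log_B[:, obs[0]]
    for y in obs[1:]:
        # log-sum-exp over the previous state for numerical stability
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, y]
    # marginalize the final state to get log p(y_1, ..., y_T)
    return np.logaddexp.reduce(alpha)
```

Working on the log scale with log-sum-exp is exactly why this is easy to write directly in Stan: the whole marginalization is a few lines in the model block, with no discrete parameters left over.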
More HMMs in Stan coming soon (I hope)
Case study on movement HMMs coming soon, I hope. And I’ll pull out the HMM stuff into a state-space chapter. Right now, it’s spread between the CJS ecology models (states are which animals are alive) in the latent discrete parameters chapter and HMMs in the time-series chapter. I’m also going to try to get some of the more advanced sufficient statistics, N-best, and chunking algorithms I wrote for LingPipe going in Stan, along with some semi-supervised learning methods I used to work on in NLP. I’m very curious to compare the results in Stan to those derived by plugging in (penalized) maximum likelihood parameter estimates.
Getting working movement HMM code
Ben Lambert and I managed to debug the model through a mailing-list back-and-forth, which also contains code for the movement HMMs (which also use a cool wrapped Cauchy distribution that Ben implemented as a function). The last word on that thread has the final working models with covariates.
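For readers who haven’t met it, the wrapped Cauchy is a distribution on the circle, commonly used for turning angles in movement HMMs. Here’s a sketch of its log density in Python (this is the standard density formula, not Ben’s Stan function, and the function name is my own):

```python
import math

def wrapped_cauchy_lpdf(theta, mu, rho):
    """Log density of the wrapped Cauchy distribution on the circle:
    f(theta; mu, rho) = (1 - rho^2) / (2 pi (1 + rho^2 - 2 rho cos(theta - mu)))

    theta : angle in radians
    mu    : mean direction in radians
    rho   : concentration, 0 <= rho < 1 (rho -> 0 gives the circular uniform)
    """
    return (math.log(1.0 - rho ** 2)
            - math.log(2.0 * math.pi)
            - math.log(1.0 + rho ** 2 - 2.0 * rho * math.cos(theta - mu)))
```

In a movement HMM each hidden state gets its own (mu, rho), so, e.g., a “traveling” state has turning angles concentrated near zero while an “encamped” state has nearly uniform turns.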
We should be able to fit Yang Chen’s hierarchical HMMs in Stan. She assumed a univariate normal emission density that was modeled hierarchically across experiments, with a shared transition matrix involving only a few states. We wouldn’t normally perform her model selection step; see Andrew’s previous post on overfitting cluster analysis, which arose out of a discussion I had with Ben Bolker at ISEC.
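The likelihood side of that model is the same forward recursion with the discrete emission table swapped for a normal log density, summed across experiments that share the transition matrix but have their own emission parameters. A Python sketch of just the likelihood (the hierarchical prior on the per-experiment (mu, sigma) would live in the Stan model block and isn’t shown; all names here are illustrative):

```python
import numpy as np

def normal_lpdf(y, mu, sigma):
    """Log density of a univariate normal."""
    return (-0.5 * np.log(2.0 * np.pi) - np.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def hmm_normal_log_lik(y, log_pi, log_A, mu, sigma):
    """Forward algorithm with state-specific normal emissions.
    mu, sigma : (K,) emission location and scale per hidden state."""
    alpha = log_pi + normal_lpdf(y[0], mu, sigma)
    for yt in y[1:]:
        alpha = (np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
                 + normal_lpdf(yt, mu, sigma))
    return np.logaddexp.reduce(alpha)

def shared_transition_log_lik(ys, log_pi, log_A, mus, sigmas):
    """Total log likelihood: shared transition matrix across experiments,
    experiment-specific emission parameters (the hierarchical piece)."""
    return sum(hmm_normal_log_lik(y, log_pi, log_A, m, s)
               for y, m, s in zip(ys, mus, sigmas))
```

With only a few states and the continuous states marginalized this way, the whole thing is well within what Stan handles comfortably.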