I (Aki) recently made a case study that demonstrates how to implement user-defined probability functions in the Stan language (case study, git repo). As an example I use the generalized Pareto distribution (GPD) to model extreme values of geomagnetic storm data from the World Data Center for Geomagnetism. Stan has supported user-defined functions for a long time, but there wasn’t a full practical example of how to implement all the functions that built-in distributions have (_lpdf (or _lpmf), _cdf, _lcdf, _lccdf, and _rng). Having the full set of functions makes it easy to implement models, censoring, posterior predictive checking, and leave-one-out cross-validation (loo). The most interesting things I learned while making the case study were:
- How to replicate the behavior of Stan’s internal distribution functions as closely as possible (because user-defined functions cannot be overloaded, we have to make some compromises).
- How to write tests for the user-defined distribution functions.
By using this case study as a template, it should be easier and faster to implement and test new custom distributions for your Stan models.
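To make the idea of "a full set of functions plus tests" concrete, here is a minimal sketch in Python rather than Stan. It is not the code from the case study: the function names (`gpd_lpdf`, `gpd_lccdf`, `gpd_cdf`, `gpd_lcdf`, `gpd_rng`, `integrate_pdf`) and the parameter names (`ymin`, `k`, `sigma`) are chosen here for illustration, and it assumes the usual GPD parameterization with lower bound `ymin`, shape `k`, and scale `sigma`. The last helper shows one simple kind of test: checking that the density integrates to the CDF.

```python
import math
import random


def gpd_lpdf(y, ymin, k, sigma):
    """Log density of the generalized Pareto distribution (illustrative)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - ymin) / sigma
    # Below the lower bound, or (for k < 0) at or above the upper bound:
    # treated as outside the support in this sketch.
    if z < 0 or (k < 0 and z >= -1.0 / k):
        return -math.inf
    if abs(k) > 1e-15:
        return -(1.0 + 1.0 / k) * math.log1p(k * z) - math.log(sigma)
    return -z - math.log(sigma)  # k -> 0 limit: exponential distribution


def gpd_lccdf(y, ymin, k, sigma):
    """Log of the survival function, log P(Y > y)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - ymin) / sigma
    if z < 0:
        return 0.0
    if k < 0 and z >= -1.0 / k:
        return -math.inf
    if abs(k) > 1e-15:
        return -math.log1p(k * z) / k
    return -z


def gpd_cdf(y, ymin, k, sigma):
    """P(Y <= y), computed from the log survival function."""
    return -math.expm1(gpd_lccdf(y, ymin, k, sigma))


def gpd_lcdf(y, ymin, k, sigma):
    """Log of the CDF."""
    p = gpd_cdf(y, ymin, k, sigma)
    return math.log(p) if p > 0 else -math.inf


def gpd_rng(ymin, k, sigma, rng=random):
    """Draw one value by inverting the survival function."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    if abs(k) > 1e-15:
        return ymin + sigma * (u ** (-k) - 1.0) / k
    return ymin - sigma * math.log(u)


def integrate_pdf(upper, ymin, k, sigma, n=20000):
    """Midpoint-rule integral of exp(lpdf) from ymin to upper (test helper)."""
    h = (upper - ymin) / n
    return h * sum(
        math.exp(gpd_lpdf(ymin + (i + 0.5) * h, ymin, k, sigma))
        for i in range(n)
    )
```

A test can then compare `integrate_pdf(2.0, 1.0, 0.5, 1.0)` with `gpd_cdf(2.0, 1.0, 0.5, 1.0)`, and compare the fraction of `gpd_rng` draws below a threshold with the CDF at that threshold. The same consistency checks carry over to functions written in Stan, where they would typically be run by exposing the Stan functions to R or Python.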
Where’s the cat? In the truck?
Should we ask Schroedinger?
Bill: I liked the picture but I guess this would bring in cats http://rspb.royalsocietypublishing.org/content/royprsb/281/1774/20132686/F4.large.jpg