https://github.com/stan-dev/stan/issues/2336#issuecomment-326360838

In general, there’s always a next-manual issue where bugs in the manual may be reported.

I’ll work on an example for the forum.

You can technically predict individual weights directly, with some prespecified allowable error. You can predict the posterior weights. You can predict the prior weights as well. All of these tend to work well, to be honest. I actually think most people predict log(p(theta_i)), and then compute the posterior p(theta_i|data) afterward. Seems like most mixture packages work that way.

If you’re trying to create a parameter that samples from the posterior of the mixture, create a parameter “FakeData” and give it the same distribution as your real data:

p(FakeData | Component = 1,other parameters…) p(Component=1) + … + p(FakeData | Component = n, other parameters….) p(Component=n)

Since each component is computed on the log scale, remember that multiplying on the regular scale means adding logarithms, and adding on the regular scale means taking the log of the sum of the exponentials, i.e. log_sum_exp.

If the p(Component = k) values are themselves a simplex parameter, you can use them directly. If they don’t already sum to 1, you need the normalization described above.
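To make the log-scale bookkeeping concrete, here is a minimal numerical sketch in plain Python rather than Stan. It assumes normal components, and the helper names (`normal_lpdf`, `mixture_lpdf`) and the divide-by-the-sum normalization are illustrative choices of mine, not something taken from the thread or from Stan.

```python
import math

def log_sum_exp(xs):
    """Stable log(sum(exp(x) for x in xs)) for finite inputs."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def mixture_lpdf(y, weights, mus, sigmas):
    """log(sum_k w_k * Normal(y | mu_k, sigma_k)), computed on the log scale."""
    # Assumption: weights that do not sum to 1 are normalized by their sum.
    total = sum(weights)
    terms = [
        math.log(w / total) + normal_lpdf(y, m, s)  # multiply = add logs
        for w, m, s in zip(weights, mus, sigmas)
    ]
    return log_sum_exp(terms)  # add = log of the sum of exponentials
```

Calling `mixture_lpdf(1.0, [0.3, 0.7], [0.0, 3.0], [1.0, 2.0])` should agree with the density computed directly on the regular scale, and unnormalized weights such as `[3.0, 7.0]` should give the same answer.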

I hope that helped someone, at least maybe me. ;-)

int N;
int K;
simplex[K] theta;
real y[N];
ordered[K] mu;
vector[K] sigma;
...
vector[K] Pr[N];
for (n in 1:N) {
  vector[K] lp = log(theta) + normal_vec_lpdf(y[n] | mu, sigma);
  real log_Z = log_sum_exp(lp);
  target += log_Z;
  Pr[n] = lp - log_Z;
}

and then Pr[n, k] gives the log probability that the n-th item is assigned to mixture component k (it’s log Pr[z[n] = k] if you think in terms of the latent responsibility parameter z[n] in 1:K). Exponentiate it to get the probability itself.

The first change-point example in the latent discrete parameters chapter explains how to do what Sebastian is recommending. Just be careful: because it’s on the log scale, it’s a subtraction, not a division as Sebastian wrote:

log Pr[y = i] = lmix[i] - log_sum_exp(lmix)
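As a quick check of the subtraction rule, here is a small Python sketch (not Stan). The `lmix` values are made-up numbers chosen so the answer is easy to verify by hand, and `log_responsibilities` is a hypothetical helper name.

```python
import math

def log_sum_exp(xs):
    """Stable log(sum(exp(x) for x in xs)) for finite inputs."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_responsibilities(lmix):
    """Normalize log-scale terms by subtracting log_sum_exp (never dividing)."""
    log_z = log_sum_exp(lmix)
    return [l - log_z for l in lmix]

# Made-up unnormalized terms 0.2, 0.6 and 1.2 (sum 2.0), stored as logs.
lmix = [math.log(0.2), math.log(0.6), math.log(1.2)]
probs = [math.exp(l) for l in log_responsibilities(lmix)]
```

Here `probs` should come out close to 0.1, 0.3 and 0.6, which is what dividing 0.2, 0.6 and 1.2 by their sum gives on the regular scale.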

Diallo, T. M., Morin, A. J., & Lu, H. (2016). Impact of misspecifications of the latent variance–covariance and residual matrices on the class enumeration accuracy of growth mixture models. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 507-531.

I have some models where something like log_mix_lpdf would solve some issues.

real lmix[5];

lmix[1] = …;

lmix[2] = …;

…

And then

target += log_sum_exp(lmix);

Will do it. Not as nice as a log_mix with multiple entries, but better than nesting.

That was in the original feature request, but nobody’s ever gotten to it. The more-than-two-component version would use a simplex to parallel what we already have.

I’d also like to see a log-odds parameterization for both of these, as we do with bernoulli_logit and categorical_logit. These are relatively easy functions to add to the Stan math library.
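For illustration only, a two-component mixture with its weight on the log-odds scale could be computed as in the Python sketch below. The name `log_mix_logit` is hypothetical (no such Stan function is implied here), and the code is just one way to keep the arithmetic stable.

```python
import math

def log_inv_logit(a):
    """log(1 / (1 + exp(-a))), arranged so exp() never sees a large positive argument."""
    if a >= 0:
        return -math.log1p(math.exp(-a))
    return a - math.log1p(math.exp(a))

def log_mix_logit(alpha, lp1, lp2):
    """log(theta * exp(lp1) + (1 - theta) * exp(lp2)) with theta = inv_logit(alpha)."""
    a = log_inv_logit(alpha) + lp1   # log(theta) + lp1
    b = log_inv_logit(-alpha) + lp2  # log(1 - theta) + lp2, since 1 - inv_logit(x) = inv_logit(-x)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))
```

With `alpha = 0` the weight is 0.5, so `log_mix_logit(0.0, math.log(0.2), math.log(0.4))` should be close to `math.log(0.3)`. Working from `alpha` directly avoids forming `theta` and `1 - theta` first, which lose precision when the weight is near 0 or 1.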

Longer term, we’ll be looking at even better ways to do this with higher-order types. I want to add simple types and lambdas with closures to Stan. Mitzi’s already got the basic infrastructure plumbed through to let us start generalizing the type system. She’ll be adding tuples first, but we’ll also be looking at adding higher-order types in the future.

I tend to forgo the log_mix method and use, e.g., target += log_sum_exp(log(theta[1]) + pdf(whatever | params), log(theta[2]) + pdf(whatever | params)) etc. (where pdf stands for the log density of each component), because it’s easier to see what is going on, it’s more programmable (you can construct each component in a loop and sum across the vector with log_sum_exp), and it extends to mixtures with k > 2 components.

If you want two trajectory states to be defined, just define two separate sets of parameters, order them in some way for identifiability (intercept tends to work well enough, but it depends on the data), and split the likelihood into k likelihoods weighted by log(theta[k]).

I’m still a Stan novice, so I’m trying to think it out.
