This puzzle comes in three parts. There are some hints at the end.
Part I: Constrained Parameter Definition
Define a Stan program with a transformed matrix parameter d that is constrained to be a K by K distance matrix. Recall that a distance matrix must satisfy the definition of a metric for all i, j:
* positivity : d[i,j] >= 0
* self distance : d[i,j] = 0 iff i = j
* symmetry : d[i,j] = d[j,i]
* triangle inequality : d[i,j] <= d[i,k] + d[k,j] for all k
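A minimal skeleton of the program shape, with the transform itself left as the exercise (the name d_raw and the choice of an unconstrained vector are illustrative, not part of the puzzle statement):

```stan
parameters {
  vector[K * (K - 1) / 2] d_raw;  // one raw value per off-diagonal pair
}
transformed parameters {
  matrix[K, K] d;  // the distance matrix
  // fill d from d_raw so the four metric conditions above hold:
  // zero diagonal, symmetry, positivity, triangle inequality
}
```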
Part II: Modeling Assumptions
Now suppose there are noisy measurements y[n] of the distance between points ii[n] and jj[n]. This corresponds to a Stan program with the following data block.
data {
  int K;          // num points
  int N;          // num observations
  int ii[N];      // first point of observation n
  int jj[N];      // second point of observation n
  real y[N];      // distance measurements (could bound below by 0)
}
A likelihood that assumes independent normal noise on the measurements is defined as follows.
model {
  for (n in 1:N)
    y[n] ~ normal(d[ii[n], jj[n]], sigma);
}
This assumes a positive-constrained parameter sigma for the noise scale.
parameters {
  real<lower=0> sigma;  // measurement noise scale
  ...
}
Feel free to give it a prior taking into account the distance scales involved in the problem and the measurement process.
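For instance, if the distances involved are expected to be on the order of 1, something like the following might be reasonable (the lognormal here is purely an illustrative choice):

```stan
model {
  sigma ~ lognormal(0, 1);  // example prior; rescale to match the measurement process
  ...
}
```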
There are then (K choose 2) free parameters, which may be (a) left unconstrained, (b) constrained to be positive, or (c) transformed to a proper distance matrix satisfying the distance metric conditions. What is the effect on the posterior of these three modeling assumptions?
Part III: Complexity
What’s the complexity of evaluating the likelihood with and without the distance-matrix constraint on the parameters?
Hints
Hint 1 (I) : There should be (K choose 2) [i.e., K * (K - 1) / 2] “raw” parameters (either unconstrained or constrained to be positive) which then need to be scaled.
Hint 2 (I) : See the manual chapter “Transformation of Constrained Variables” and in particular the section on lower and upper bounded variables if you want hints on the Jacobians necessary.
Hint 3 (I) : Order the transform so the Jacobian’s easy to compute.
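As a reminder of the mechanics (not the puzzle's solution), a hand-coded lower-bound transform and its Jacobian adjustment look like this in Stan, with illustrative variable names:

```stan
parameters {
  real x;                    // unconstrained raw parameter
}
transformed parameters {
  real<lower=0> z = exp(x);  // positive-constrained value
}
model {
  target += x;               // log Jacobian: log |dz/dx| = log exp(x) = x
}
```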
Hint 4 (II) : The only way I know how to do this is by simulation, but a bunch of the other Stan developers probably just know the answer.
Hint 5 (III) : The answer should be an expression involving N and K, such as $latex {\mathcal O}(f(N,K))$ for some function $latex f$.
Hint 6 (III) : Count the number of expressions that get evaluated in the transform and in the model block.
Strikes Eli that this is not too far from Doug Keenan’s challenge to separate series with deterministic trends and noise from those without. Given Keenan’s poorly defined statement of the problem, it probably would not be possible to solve completely, but Bayesian methods may at least narrow the field enough that one of multiple submissions would win the $100,000 prize (there is a $10 entry fee).
You have a typo in the triangle inequality which should be something like:
Triangle inequality : d[i,j] <= d[i,k] + d[k,j] for all i, j, k
Thanks — I fixed it in the body of the post.
Is there a solution of this puzzle? How does one efficiently implement the constraints on d?