After seeing this post by Matthew Wilson on a class of regression models called “factorization machines,” Aki writes:
In a typical machine learning way, this is called a “machine,” but it would also be a useful model structure in Stan for making linear models with interactions, but with a reduced number of parameters. With a fixed k, the rest of the parameters are continuous, and it would be a first step toward the models Dunson talked about in his presentation. To make this fast it may require a built-in function (to get a small expression tree), but for moderately big datasets it’s not going to need all the algorithms mentioned in that post.
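For readers who haven't seen factorization machines: the degree-2 model replaces each pairwise interaction weight with an inner product of k-dimensional latent vectors, and the pairwise sum can be computed in O(nk) rather than O(n^2). Here is a minimal NumPy sketch (function and variable names are mine, not from the post):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Degree-2 factorization machine prediction for one input x.

    y_hat = w0 + sum_i w[i] x[i] + sum_{i<j} dot(V[i], V[j]) x[i] x[j]

    The pairwise term uses the standard O(n*k) identity:
    sum_{i<j} <v_i, v_j> x_i x_j
      = 0.5 * sum_f [ (sum_i V[i,f] x[i])^2 - sum_i V[i,f]^2 x[i]^2 ]
    """
    linear = w0 + x @ w
    s = V.T @ x                    # shape (k,): per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)     # shape (k,): per-factor squared sums
    pairwise = 0.5 * np.sum(s ** 2 - s2)
    return linear + pairwise
```

With a modest fixed k, this gives n*k interaction parameters instead of n*(n-1)/2, which is the "reduced number of parameters" point above.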
Interesting, perhaps something for someone to work on. It requires some programming but not within core Stan code.
Would this work for the interaction of groups in a hierarchical model, rather than just interactions of continuous variables?
For instance, suppose you have non-overlapping groups 1 and 2. There are a few ways you can include them. The most obvious is to include them separately, like
y[i] = u1[j[i]] + u2[k[i]]
but you could also use the Cartesian product of the groups, something like
y[i] = u1,2[jk[i]]
The set-up that is most similar to theirs would be like
y[i] = u1[j[i]] + u2[k[i]] + u1,2[jk[i]]
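The three set-ups above can be sketched in NumPy as follows (all names are illustrative; j and k index the levels of the two grouping factors, with J and K levels):

```python
import numpy as np

rng = np.random.default_rng(1)
J, K, N = 3, 4, 10
j = rng.integers(J, size=N)    # group-1 membership for each observation
k = rng.integers(K, size=N)    # group-2 membership

u1 = rng.normal(size=J)        # main effects for group 1
u2 = rng.normal(size=K)        # main effects for group 2
u12 = rng.normal(size=(J, K))  # one parameter per (group1, group2) cell

# (a) additive main effects only: y[i] = u1[j[i]] + u2[k[i]]
y_add = u1[j] + u2[k]

# (b) Cartesian-product cells only: y[i] = u12[j[i], k[i]]
y_cell = u12[j, k]

# (c) main effects plus interaction, the set-up closest to a
# factorization machine: y[i] = u1[j[i]] + u2[k[i]] + u12[j[i], k[i]]
y_full = u1[j] + u2[k] + u12[j, k]
```

In set-up (c), the J-by-K interaction table u12 is the object a low-rank factorization (u12 ≈ a @ b.T with a of shape (J, r) and b of shape (K, r), r small) would shrink, by analogy with the continuous case.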
This suffers from the same issue as the continuous case: the number of interaction parameters grows quickly as you increase the number of groups.
It’s reduced rank regression but for interactions rather than multiple outputs.
I guess one could also try including the weights w_i and w_j in the inner product of v_i and v_j, to require large marginal effects for any pair of variables with a large interaction term.
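One way to read that suggestion is to scale each pairwise term by the marginal weights, so a large interaction can only arise alongside large main effects. A hypothetical sketch of that variant (this is the commenter's speculation, not part of the factorization machine model; names are mine):

```python
import numpy as np

def weighted_pairwise(x, w, V):
    """Pairwise term where each interaction <v_i, v_j> x_i x_j is
    additionally scaled by the marginal weights w[i] * w[j]."""
    n = len(x)
    return sum((w[i] * w[j]) * (V[i] @ V[j]) * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))
```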
I would have to look at the details more carefully, but it reminds me of the Furnival and Wilson all-subsets regression algorithm (“Regression by leaps and bounds”), which can also efficiently find the best subset of size n for n going from 1 up to the number of predictors.
If you don’t know the algorithm, see https://www.jstor.org/stable/1267601?seq=1#page_scan_tab_contents.