I received the following email:
I have an interesting thought on a prior for a logistic regression, and would love your input on how to make it “work.”
Some of my research, two published papers, are on mathematical models of **. Along those lines, I’m interested in developing more models for **. . . . Empirical studies show that the public is rather smart and that the wisdom-of-the-crowd is fairly accurate.
So, my thought would be to treat the public’s probability of the event as a prior, and then see how adding data, through a model, would change or perturb our inferred probability of **. (Similarly, I could envision using previously published epidemiological research as a prior probability of a disease, and then seeing how the addition of new testing protocols would update that belief.)
However, everything I learned about hierarchical Bayesian models has a prior as a distribution on the coefficients. I don’t know how to start with a prior point estimate for the probability in a logistic regression.
Do you have any ideas or suggestions on how to proceed?
I wrote back:
Hi, I assume I can blog your question and my reply?
To which he replied:
If possible, it might be nice to keep out the part about ** models. I might want to keep that part quiet until I have a paper ready to publish. Perhaps just blog about the idea of a prior probability leading into a logistic regression. (Maybe the epidemiological example?) Is that possible/OK?
OK, OK, . . . in that case, here’s my advice: you can put a prior distribution on a predicted probability in two ways. The first way is to put the prior on the parameters of the model and solve for the hyperparameters that induce the predictive prior that you want. The solution process is iterative and stochastic; see this 1995 paper (section 6.1 of that paper has an example of specifying a prior distribution this way). The second approach is to treat your prior as data, directly on the observation of interest. That is, the predictive probability you’re working with is some function of the parameters of the model, and you just say that you have a prior mean and sd for that probability (or you can do it on the logit scale, whatever), and you throw that normal density into your posterior distribution. Easy enough to do as one line in Stan.
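To make the second approach concrete, here is a minimal sketch in Python rather than Stan. All the data and prior numbers here are made up for illustration: we fit a logistic regression by maximizing the log posterior, where the posterior is the Bernoulli log likelihood plus a normal prior density on the logit of the predicted probability at one covariate value of interest. That extra normal term is the analogue of the "one line in Stan."

```python
# Sketch of approach 2: treat a prior mean and sd for a predicted
# probability (on the logit scale) as one extra term in the log posterior.
# Data are simulated; the prior numbers are made up for illustration.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # inverse logit

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = rng.binomial(1, expit(-0.5 + 0.8 * x))

# Prior belief about Pr(y = 1) at a covariate value of interest,
# expressed on the logit scale (hypothetical values):
x_star = 1.0
prior_mean_logit = 0.7   # corresponds to a probability of about 0.67
prior_sd_logit = 0.5

def neg_log_post(beta):
    a, b = beta
    eta = a + b * x
    # Bernoulli log likelihood, written in a numerically stable form
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
    # The prior "thrown into the posterior": a normal density on the
    # logit of the predicted probability at x_star
    eta_star = a + b * x_star
    logprior = -0.5 * ((eta_star - prior_mean_logit) / prior_sd_logit) ** 2
    return -(loglik + logprior)

fit = minimize(neg_log_post, x0=np.zeros(2))
a_hat, b_hat = fit.x
p_star = expit(a_hat + b_hat * x_star)
print("posterior-mode Pr(y = 1 | x_star):", p_star)
```

This gives the posterior mode only; in Stan the same extra term in the target would give you full posterior simulations, and the prior pulls the fitted probability at x_star toward the prior mean, more strongly as prior_sd_logit shrinks.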