Some recent blog discussion revealed some confusion that I’ll try to resolve here.
It seems to me that your prior has to reflect your subjective information before you look at the data. How can it not?
But this does not mean that the (subjective) prior that you choose is irrefutable. Surely a prior that reflects prior information just has to avoid being inconsistent with that information. But that still leaves a range of priors that are consistent with it, the sort of priors that one would use in a sensitivity analysis, for example.
I think I see what Bill is getting at. A prior represents your subjective belief, or some approximation to your subjective belief, even if it’s not perfect. That sounds reasonable but I don’t think it works. Or, at least, it often doesn’t work.
Let’s start with a simple example. You hop on a scale that gives unbiased measurements with errors that have a standard deviation of 0.1 kg. To do Bayesian analysis, you assign a N(0,10000^2) prior on your true weight. That doesn’t represent your subjective belief! It’s not even an approximation. No problem—it works fine for most purposes—but it’s not subjective.
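To see why the flat-ish prior "works fine" here, it may help to run the standard conjugate normal update (normal prior, normal measurement with known standard deviation). The sketch below is only an illustration: the scale reading of 70 kg and the "informed" N(75, 15^2) prior are numbers I made up, and the function name is hypothetical.

```python
import math

def normal_posterior(y, sigma, prior_mean, prior_sd):
    """Posterior mean and sd for a normal mean, given one measurement y
    with known measurement sd sigma and a N(prior_mean, prior_sd^2) prior."""
    prior_prec = 1.0 / prior_sd**2   # precision = 1 / variance
    data_prec = 1.0 / sigma**2
    post_prec = prior_prec + data_prec
    # Posterior mean is a precision-weighted average of prior mean and data.
    post_mean = (prior_prec * prior_mean + data_prec * y) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

y = 70.0      # hypothetical scale reading, in kg (assumed for illustration)
sigma = 0.1   # measurement sd from the example in the text

# The N(0, 10000^2) prior from the text: clearly not anyone's real belief.
vague = normal_posterior(y, sigma, prior_mean=0.0, prior_sd=10000.0)

# A made-up prior closer to what a person might believe about their weight.
informed = normal_posterior(y, sigma, prior_mean=75.0, prior_sd=15.0)

print("vague prior:    mean %.4f, sd %.4f" % vague)
print("informed prior: mean %.4f, sd %.4f" % informed)
```

Under these assumptions the two posteriors agree to within a fraction of a gram, because the measurement is so much more precise than either prior. That is the sense in which the N(0,10000^2) prior is a harmless modeling choice even though it is not a statement of belief.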
More generally, think of all the linear and logistic regressions we use. Instead of thinking of these as subjective beliefs, I prefer to think of the joint probability distribution as a model, reflecting a set of assumptions. In some settings these assumptions represent subjective beliefs, in other settings they don’t.
This article from 2002 might help. If I could go back and alter it, I’d add something on weakly informative priors, but I still agree with the general approach discussed there.
P.S. Just to give an example of what I mean by prior information: The analyses in Red State Blue State all use noninformative prior distributions. But a lot of prior information comes in, in the selection of what questions to study, what models to consider, and what variables to include in the model. For example, as state-level predictors we include region of the country, Republican vote in the previous presidential election, and average state income. Prior information goes into the choice and construction of all these predictors. But the prior distribution is a particular probability distribution that in this case is flat and does not reflect prior knowledge.
One way to think about informative prior distributions is as a form of smoothing: when setting the parameters of a probability distribution based on prior knowledge, we are imposing some smoothness over time on the parameters. I think that’s probably a good idea and that the Red State Blue State analyses (among others) would be better for it. I didn’t set up this prior structure because I wasn’t easily equipped to do so and it seemed like too much effort, but perhaps at some future time this sort of structuring will be as commonplace as hierarchical modeling is today.