Nic Lewis writes:
I have made some progress with my work on combining independent evidence using a Bayesian approach but eschewing standard Bayesian updating. I found a neat analytical way of doing this, to a very good approximation, in cases where each estimate of a parameter corresponds to the ratio of two variables each determined with normal error, the fractional uncertainty in the numerator and denominator variables differing between the types of evidence. This seems a not uncommon situation in science, and it is a good approximation to that which exists when estimating climate sensitivity. I have had a manuscript in which I develop and test this method accepted by the Journal of Statistical Planning and Inference (for a special issue on Confidence Distributions edited by Tore Schweder and Nils Hjort). Frequentist coverage is almost exact using my analytical solution, based on combining Jeffreys’ priors in quadrature, whereas Bayesian updating produces far poorer probability matching. I also show that a simple likelihood ratio method gives almost identical inference to my Bayesian combination method. A copy of the manuscript is available here: https://niclewis.files.wordpress.com/2016/12/lewis_combining-independent-bayesian-posteriors-for-climate-sensitivity_jspiaccepted2016_cclicense.pdf .
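For readers curious what "combining Jeffreys' priors in quadrature" might look like in practice, here is a minimal numerical sketch. It is not taken from the paper; the two-evidence setup, the observed values, and the grid are illustrative assumptions of mine. The underlying idea is standard: Fisher information is additive across independent datasets, and the Jeffreys prior is the square root of the Fisher information, so the combined prior is the quadrature (root-sum-of-squares) combination of the individual Jeffreys priors. The sketch contrasts that with ordinary Bayesian updating, where the first posterior is reused as the prior for the second dataset.

```python
import numpy as np

# Grid over the parameter S (think: climate sensitivity). Illustrative only.
S = np.linspace(0.5, 10.0, 2000)
dS = S[1] - S[0]

# Evidence 1 (assumed): S observed directly with normal error, z ~ N(S, tau).
tau, z_obs = 1.0, 3.0
lik1 = np.exp(-0.5 * ((z_obs - S) / tau) ** 2)
jeff1 = np.full_like(S, 1.0 / tau)  # sqrt Fisher info: constant in S

# Evidence 2 (assumed): the reciprocal of S (a feedback-like quantity)
# observed with normal error, y ~ N(1/S, sigma).
sigma, y_obs = 0.1, 1.0 / 3.0
lik2 = np.exp(-0.5 * ((y_obs - 1.0 / S) / sigma) ** 2)
jeff2 = 1.0 / (sigma * S**2)  # sqrt Fisher info: |d(1/S)/dS| / sigma

def normalize(p):
    """Normalize a density evaluated on the grid to integrate to 1."""
    return p / (p.sum() * dS)

# Quadrature combination: sqrt(I1 + I2) = sqrt(jeff1^2 + jeff2^2),
# applied as the prior to the product of the two likelihoods.
jeff_comb = np.sqrt(jeff1**2 + jeff2**2)
post_comb = normalize(jeff_comb * lik1 * lik2)

# Standard Bayesian updating: Jeffreys posterior from evidence 1,
# reused as the prior for evidence 2.
post1 = normalize(jeff1 * lik1)
post_update = normalize(post1 * lik2)

def post_median(p):
    """Median of a density on the grid, via the cumulative sum."""
    return S[np.searchsorted(np.cumsum(p) * dS, 0.5)]

# The two routes give different answers for this asymmetric-evidence setup.
print("quadrature-combined median:", post_median(post_comb))
print("updating median:          ", post_median(post_update))
```

The difference between the two posteriors is the point at issue: with the reciprocal-observation evidence the two Jeffreys priors have different shapes in S, so sequential updating and the quadrature-combined prior no longer coincide.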
I’ve since teamed up with Peter Grunwald, a statistics professor in Amsterdam whom you may know – you cite two of his works in your 2013 paper ‘Philosophy and the practice of Bayesian statistics’. It turns out that my proposed method for combining evidence agrees with that implied by the Minimum Description Length principle, which he has been closely involved in developing. We have a joint paper under review by a leading climate science journal, which applies the method developed in my JSPI paper.
I think the reason Bayesian updating can give poor probability matching, even when the original posterior used as the prior gives exact probability matching in relation to the original data, is that conditionality is not always applicable to posterior distributions. Conditional probability was developed rigorously by Kolmogorov in the context of random variables. Don Fraser has stated that the conditional probability lemma requires two probabilistic inputs and is not satisfied where there is no prior knowledge of a parameter’s value. I extend this argument and suggest that, unless the parameter is generated by a random process rather than being fixed, conditional probability does not apply to updating a posterior corresponding to existing knowledge, used as a prior, since such a prior distribution does not provide the required type of probability. As Tore Schweder has written (in his and Nils Hjort’s 2016 book Confidence, Likelihood, Probability), it is necessary to keep epistemic and aleatory probability apart. Bayes himself, of course, developed his theory in the context of a random parameter generated with a known probability distribution.
I don’t really have time to look at this but I thought it might interest you, so feel free to share your impressions. I assume Nic Lewis will be reading the comments.
This also seemed worth posting, given that Yuling, Aki, Dan, and I will soon be releasing our own paper on combining posterior inferences from different models fit to a single dataset. Not quite the same problem but it’s in the same general class of questions.