My article with Cosma Shalizi has appeared in the British Journal of Mathematical and Statistical Psychology. I’m so glad this paper has come out. I’d been thinking about writing such a paper for almost 20 years. What got me to actually do it was an invitation a few years ago to write a chapter on Bayesian statistics for a volume on the philosophy of social sciences. Once I started doing that, I realized I had enough for a journal article. I contacted Cosma because he, unlike me, was familiar with the post-1970 philosophy literature (my knowledge went only up to Popper, Kuhn, and Lakatos). We submitted it to a couple of statistics journals that didn’t want it (for reasons that weren’t always clear), but ultimately I think it ended up in the right place, as psychologists have been as serious as anyone in thinking about statistical foundations in recent years.

Here’s the issue of the journal, which also includes an introduction, several discussions, and a rejoinder:

Prior approval: The growth of Bayesian methods in psychology (pages 1–7)

Mark Andrews and Thom Baguley

Philosophy and the practice of Bayesian statistics (pages 8–38)

Andrew Gelman and Cosma Rohilla Shalizi

How to practise Bayesian statistics outside the Bayesian church: What philosophy for Bayesian statistical modelling? (pages 39–44)

Denny Borsboom and Brian D. Haig

Posterior predictive checks can and should be Bayesian: Comment on Gelman and Shalizi, ‘Philosophy and the practice of Bayesian statistics’ (pages 45–56)

John K. Kruschke

Comment on Gelman and Shalizi (pages 65–67)

Stephen Senn

The humble Bayesian: Model checking from a fully Bayesian perspective (pages 68–75)

Richard D. Morey, Jan-Willem Romeijn and Jeffrey N. Rouder

Rejoinder to discussion of ‘Philosophy and the practice of Bayesian statistics’ (pages 76–80)

Andrew Gelman and Cosma Shalizi

The basic themes are laid out by Mark Andrews and Thom Baguley in their introduction:

Bayesian methods are not just another set of topics in advanced statistics such as, for example, structural equation modeling or nonlinear regression. For some, they represent a new paradigm (in the Kuhnian sense of the term) for the field. As such, their increasing adoption has potentially profound implications for the nature and practice of data analysis in psychology, possibly affecting everything from the editorial policies of journals to how statistics is taught to students.

Despite their growing appeal, however, there remains a troubling lack of clarity about what exactly Bayesian methods do and do not entail and about how they differ from their so-called classical counterparts. Bayesian methods are often portrayed as being based on a subjective rather than frequentist interpretation of probability, with inference being an updating of personal beliefs in light of evidence. In practice, however, most modern applications of Bayesian methods to real-world data analysis problems are characterized by pragmatism and expediency: Bayesian methods are adopted because they promise (and arguably often deliver) solutions to important or difficult problems.

They continue:

As we see it, the choice of priors is like the choice of the probabilistic model of the data. For example, given a set of observations . . . we might model this data as . . . The choice of this probabilistic model need not be a reflection of our true beliefs about how this data was generated. Rather it can be seen as literally just a model that can potentially provide insight into the nature and structure of the data. By the same reasoning, the priors . . . need not be a reflection of our true beliefs about the parameters, but are just part of our general modelling assumptions. Just as the generative model provides a probabilistic model of the data, the priors provide a probabilistic model of parameters. Just as we assume that our data is drawn from some probability distribution with fixed but unobserved parameters, so too we assume that the values of the parameters are drawn from another probability distribution (also with fixed but unobserved parameters). . . . One reason for a degree of humility in our analysis is that no probability model and hence no statistical model in psychology is complete.

The punchline:

Priors, therefore, are just assumptions of our model. Like any other assumptions, they can be good or bad and may need to be extended, revised or possibly abandoned on the basis of their suitability to the data being studied.

P.S. Two other, less formal versions of our argument (in the Frey-Arrow style) are here and here.

The article was very clear and interesting. Thank you very much! I’m sure philosophers could argue endlessly about this or that small detail. However as a non-statistician (mathematician), I find it surprising that actual practitioners of data analysis would object to your reasonable and straightforward approach.

This is a very interesting list of articles!

My main comment is this:

The “usual story” that you describe is not the operational subjective view of Bayesian statistics. Part, but not all, of your position is closer to the “purist” operational subjective version of Bayesian statistics than to “the usual story”.

The operational subjective view applies only to observables and in general involves imprecise specification of subjective probabilities and expectations. As a consequence, it is not inductive. Posterior distributions and Bayes factors are at best auxiliary devices rather than the central objects of study. The operational subjectivist would agree with you that decision theory should be applied to real things (lives, money), not to finding the best estimate.

Implementing the operational subjective approach in real problems becomes nearly impossible because:

a) foundationally, only incomplete probability specifications are justified;

b) modern computational methods are approximate conditioning methods that require full probability specifications.

I think your program of research on model checking is a reasonable thing to do in this context. I don’t think it opens up any reason to doubt what should be the “usual story”: the operational subjective approach.

A very interesting read, and I am less confident about certain things than I formerly was, but let me propose this critique.

You wrote:

“It turned out that this varying-intercept model did not fit our data, … We found this not through any process of Bayesian induction but rather through model checking.”

I agree on the value of model checking, but I wonder if this is really distinct from inductive inference. In order to say that the model was inappropriate, don’t you think that you must, at least informally, have assigned it a low probability? In which case, your model checking procedure seems to be a heuristic designed to mimic efficiently the logic of Bayesian induction.

Even if you didn’t formally specify a hypothesis space, what you seem to have done is to say ‘look, this model mis-matches the data so much that it must be easy to find an alternate model that would achieve a much higher posterior.’ As such, the process of model checking attains absolutely strict validity only when that extended hypothesis space is explicitly examined, and your intuition numerically confirmed. Granted, many cases will be so obvious that the full analysis isn’t needed, but hopefully you get the point.

There certainly is a strong asymmetry between verification and falsification, but I can’t accept your thesis that falsification is deductive. Sure, it’s typically harder for a model with an assigned probability near zero to be brought back into contention than it is for a model with currently very high probability to be crushed by new evidence, but it’s not in principle impossible. Newtonian mechanics might be the real way of the world, and all that evidence against it might just have been a dream. The problem is that this requires not just Newtonian mechanics, but Newtonian mechanics + some other implausible stuff, which as intuition warns (and mathematics can confirm) deserves very small prior weight. (A currently favorable model can always be superseded by another model with not significantly greater complexity, which accounts for the asymmetry between falsification and verification.) The mathematics that verifies this is Bayesian and, it seems to me, inductive.

That we can apparently falsify a theory without considering alternatives seems to be simply this strong asymmetry allowing Bayesian logic to be reliably but informally approximated without specifying the entire (super)model.
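To make the model-checking procedure under discussion concrete, here is a minimal sketch of a posterior predictive check. Everything in it is illustrative: the lognormal “observed” data, the plugging in of the sample mean and standard deviation in place of full posterior draws (reasonable with n = 200), and the choice of the sample minimum as the discrepancy statistic are all my assumptions, not anything from the article itself.

```python
import random
import statistics

random.seed(1)

# "Observed" data: drawn from a lognormal, so strictly positive and skewed.
y = [random.lognormvariate(0.0, 1.0) for _ in range(200)]
ybar, s = statistics.fmean(y), statistics.stdev(y)

# Assumed working model: y_i ~ Normal(mu, tau). For brevity this sketch plugs
# in (ybar, s) for posterior draws of (mu, tau), ignoring posterior
# uncertainty, which is small here with n = 200.
def replicate():
    return [random.gauss(ybar, s) for _ in range(len(y))]

# Discrepancy statistic: the sample minimum. The lognormal data are all
# positive, while a fitted normal routinely generates negative values.
t_obs = min(y)
t_rep = [min(replicate()) for _ in range(1000)]

# Posterior predictive p-value: how often replicated datasets look at least
# as extreme as the data. A value near 0 or 1 signals model misfit.
p_value = sum(t >= t_obs for t in t_rep) / len(t_rep)
print(f"observed min = {t_obs:.3f}, posterior predictive p-value = {p_value:.3f}")
```

Note that nothing here assigns a posterior probability to the normal model or names an alternative: the check simply shows that data like those observed are not plausible under the fitted model, which is the sense in which Gelman and Shalizi describe model checking as distinct from Bayesian induction over a hypothesis space.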

Tom:

Fair enough. But when I do Bayesian inference conditional on a model, I’m really doing Bayesian inference: I explicitly set up the model and I explicitly compute the posterior distribution. When I do model checking, I agree that this might fruitfully be viewed as being an approximation to Bayesian inference, but in the meantime that’s not what it looks like to me. I’m not explicitly assigning probabilities to hypotheses or computing their posterior probabilities. One reason for this is that I can change an aspect of a model and have essentially no effect on inference conditional on that model (for example, if I change the prior distribution on a well-estimated parameter from N(0,100^2) to N(0,1000^2), the inference is essentially unchanged, yet this can still induce a huge change in that model’s posterior probability in a Bayesian model-averaging setting). So, even though an overarching Bayesian approach might be the best way to go, it’s not what I’m doing. And recall that a key goal of the article was to describe Bayesian data analysis as I actually practice it.
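The point about the prior width is easy to verify in the simplest conjugate case. The sketch below is my own illustration, not from the article: 100 observations from a normal model with known unit variance, and a N(0, s^2) prior on the mean with s = 100 versus s = 1000. The posterior for the parameter is essentially identical under the two priors, but the log marginal likelihood (which drives posterior model probabilities in model averaging) shifts by about log 10.

```python
import math

def posterior_and_marginal(ybar, n, prior_sd, sigma=1.0):
    """Conjugate normal model: y_i ~ N(theta, sigma^2), theta ~ N(0, prior_sd^2).

    Returns the posterior mean and sd of theta, and the log marginal
    density of the sample mean ybar, which is N(0, prior_sd^2 + sigma^2/n).
    """
    prior_var = prior_sd ** 2
    post_var = 1.0 / (n / sigma**2 + 1.0 / prior_var)
    post_mean = post_var * (n * ybar / sigma**2)
    m_var = prior_var + sigma**2 / n
    log_marg = -0.5 * math.log(2 * math.pi * m_var) - ybar**2 / (2 * m_var)
    return post_mean, math.sqrt(post_var), log_marg

# A well-estimated parameter: 100 observations with sample mean 3.
for sd in (100.0, 1000.0):
    mean, psd, lm = posterior_and_marginal(ybar=3.0, n=100, prior_sd=sd)
    print(f"prior sd {sd:6.0f}: posterior N({mean:.4f}, {psd:.4f}^2), "
          f"log marginal = {lm:.3f}")
```

Widening the prior by a factor of 10 leaves the posterior at roughly N(3, 0.1^2) either way, yet divides the marginal likelihood by roughly 10, since the prior mass near the likelihood’s peak is spread ten times thinner. This is the sense in which a within-model irrelevance can be a between-model Bayes-factor disaster.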

As I note in my comment above, the supposition that the choice is between (literal) deduction and inductive assignments of Bayesian posterior probabilities is a false one. Gelman and Shalizi’s falsifications, like Popper’s, involve ampliative (i.e., inductive though not probabilistic) inferences to an adequate model, real effect, indicated magnitude or the like.

Of related interest perhaps: no-pain philosophy (on Popper): http://errorstatistics.com/2012/02/01/no-pain-philosophy-skepticism-rationality-popper-and-all-that-part-2-duhems-problem-methodological-falsification/

Have a look at Kruschke’s discussion; this is basically the argument that he makes.
