## Bayes, statistics, and reproducibility: My talk at Rutgers 5pm on Mon 29 Jan 2018

In the weekly seminar on the Foundations of Probability in the Philosophy Department at Rutgers University, New Brunswick Campus, Miller Hall, 2nd floor seminar room:

Bayes, statistics, and reproducibility

The two central ideas in the foundations of statistics—Bayesian inference and frequentist evaluation—both are defined in terms of replications. For a Bayesian, the replication comes in the prior distribution, which represents possible parameter values under the set of problems to which a given model might be applied; for a frequentist, the replication comes in the reference set or sampling distribution of possible data that could be seen if the data collection process were repeated. Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data. We consider the implications for the replication crisis in science and discuss how scientists can do better, both in data collection and in learning from the data they have.

Here are some relevant papers:

Philosophy and the practice of Bayesian statistics (with Cosma Shalizi) and rejoinder to discussion

Beyond subjective and objective in statistics (with Christian Hennig)

The failure of null hypothesis significance testing when studying incremental changes, and what to do about it

P.S. And here’s the video of the talk and discussion.

1. From “Philosophy and the Practice of Bayesian Statistics”:

“We examine… the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory.”

Do they? The ideal would be to put a prior over every possible model for the phenomenon under study and do Bayesian model averaging, but this is computationally infeasible (not to mention cognitively infeasible for the researcher). I see model checking and model revision as heuristic shortcuts / approximations to this ideal. I may be a Bayesian purist, but I’m also an engineer, and one always has to make compromises because of limited resources.

2. Dave C. says:

Open to the public?

3. I’m sure Harry Crane will post a video. Should be interesting.

4. Corey says:

For a Bayesian, the replication comes in the prior distribution, which represents possible parameter values under the set of problems to which a given model might be applied…

I think this must be one of those “a science based on defaults” things that I never really got; my Bayesian prior distributions are for encoding prior information about plausible parameter values for the model actually being applied right at that moment.

• Chris Wilson says:

+1. Perhaps the most relevant mixed case is the use of ‘weakly regularizing’ priors? On one hand, we are constraining by plausible order of magnitude; on the other, this is often not really a careful probability judgement. For example, we might be running several linear models with a ton of predictors and just want a sensible default, like standardizing to O(1) and using N(0,1) priors…

• Andrew says:

Corey:

There are different ways to look at this one; perhaps the simplest is to recognize that, mathematically, Bayesian inferences are calibrated if you average over the prior distribution; thus, Bayesian probabilities can be interpreted as frequency probabilities, averaging over the joint distribution, p(y,theta).
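The calibration claim can be checked by simulation. Below is a minimal sketch, not anything from the talk: it assumes a toy conjugate model chosen purely for illustration (theta ~ N(0, 1), y | theta ~ N(theta, 1)), and the function name `coverage` and its parameters are hypothetical. It draws (theta, y) pairs from the joint distribution, forms a 50% posterior interval for each, and counts how often theta falls inside.

```python
import random
from statistics import NormalDist


def coverage(true_prior_sd, assumed_prior_sd=1.0, level=0.5,
             n_sims=100_000, seed=0):
    """Fraction of central posterior intervals that contain theta,
    averaging over draws of (theta, y).

    Toy model (an assumption for illustration):
        theta ~ N(0, true_prior_sd^2), y | theta ~ N(theta, 1).
    Inference uses the prior N(0, assumed_prior_sd^2).
    """
    rng = random.Random(seed)
    # Half-width multiplier for a central interval at the given level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Conjugate normal-normal posterior with unit data variance, prior mean 0.
    post_var = 1.0 / (1.0 / assumed_prior_sd ** 2 + 1.0)
    post_sd = post_var ** 0.5
    hits = 0
    for _ in range(n_sims):
        theta = rng.gauss(0.0, true_prior_sd)  # draw from the "true" prior
        y = rng.gauss(theta, 1.0)              # draw data given theta
        post_mean = post_var * y               # posterior mean under assumed prior
        if abs(theta - post_mean) <= z * post_sd:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Prior used for inference matches the distribution theta is drawn from.
    print("matched prior:   ", coverage(true_prior_sd=1.0))
    # Inference still assumes N(0, 1), but theta is actually drawn from N(0, 9).
    print("mismatched prior:", coverage(true_prior_sd=3.0))
```

When the prior used for inference matches the distribution that theta is actually drawn from, the 50% intervals should cover close to 50% of the time. When the two differ, as in the second call, coverage falls well short, which is one way to read "replication distributions that do not make scientific sense."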

• Keith O'Rourke says:

Are you not also concerned with the joint model being sensible for all parameter values that have non-negligible probability?

In the same way that frequency inference needs to be sensible for all data in the reference set or sampling distribution of possible data?

• Ben Goodrich says:

Yeah, but if I were a frequentist I would say that p(y, theta) is ill-defined because theta is not a random variable.

• Except of course when it has some systematic variation, like from county to county or person to person or test to test or whatever you’re modeling.

• Chris Wilson says:

…and then De Finetti comes along and shows that the math is the same, whether we use probability to describe uncertainty about a fixed parameter or if that parameter “really is” a random variable.
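For reference, the result being alluded to is presumably de Finetti's representation theorem. In its simplest form (stated here for binary outcomes, as an illustration rather than a quote from the thread): if $y_1, y_2, \ldots$ is an infinite exchangeable sequence of 0/1 variables, then there is a distribution $F$ on $[0,1]$ such that, for every $n$,

$$
p(y_1, \ldots, y_n) = \int_0^1 \prod_{i=1}^{n} \theta^{y_i} (1-\theta)^{1-y_i} \, dF(\theta).
$$

The right-hand side has the same form as a model with i.i.d. Bernoulli data given $\theta$ and a prior $F$ on $\theta$, regardless of whether $\theta$ is read as a fixed unknown or as a quantity that varies.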