The virtues of incoherence?

Kent Osband writes:

I just read your article The holes in my philosophy of Bayesian data analysis. I agree on the importance of what you flagged as "comparable incoherence in all other statistical philosophies". The problem arises when a string of unexpected observations persuades us that our original structural hypothesis (which might be viewed as a parameter describing the type of statistical relationship) was false.

However, I would phrase this more positively. Your Bayesian prior actually cedes alternative structural hypotheses, albeit with tiny epsilon weights. Otherwise you would never change your mind. However, these epsilons are so difficult to measure, and small differences can have such a significant impact on speed of adjustment (as in the example in Chapter 7 of Pandora’s Risk), that effectively we all look incoherent. This is a prime example of rational turbulence.
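To make the point about epsilon weights concrete, here is a toy sketch (the models, probabilities, and epsilon values are my own illustration, not from Osband's book): a main model says a coin comes up heads with probability 0.5, an alternative "structural break" model says 0.9, and the prior puts a tiny weight epsilon on the alternative. The speed at which a run of heads flips the posterior toward the alternative depends heavily on epsilon, even though epsilon itself is nearly impossible to pin down.

```python
def posterior_alt_weight(eps, n_heads, p_main=0.5, p_alt=0.9):
    """Posterior weight on the alternative model after n_heads
    consecutive heads, starting from prior weight eps on it."""
    like_main = p_main ** n_heads   # likelihood under the main model
    like_alt = p_alt ** n_heads     # likelihood under the alternative
    return eps * like_alt / (eps * like_alt + (1 - eps) * like_main)

# identical data, very different adjustment speeds
for eps in (1e-3, 1e-6):
    trajectory = [round(posterior_alt_weight(eps, n), 3)
                  for n in (0, 10, 20, 30)]
    print(f"eps={eps}: {trajectory}")
```

With eps = 1e-3 the posterior weight on the alternative is already dominant after about 20 heads; with eps = 1e-6 it takes roughly 10 more observations. Two observers whose priors differ only in these unmeasurable epsilons will look incoherent to each other during the transition, which is the "rational turbulence" being described.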

Rational turbulence can arise even without a structural break. Any time new evidence arrives that we're not certain is i.i.d. with older evidence, we have to make judgment calls on relevance. Reasonable people are bound to disagree. Differences that are tiny in log-odds space (where Bayesian updating tends to look relatively smooth) can wax or wane in terms of ordinary probabilities.
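A quick numerical sketch of that last point (the fixed log-odds gap of 0.5 is an arbitrary choice for illustration): hold the disagreement between two observers constant in log-odds and watch what it looks like in probability terms as evidence moves both beliefs along.

```python
import math

def sigmoid(x):
    """Map log-odds to probability."""
    return 1 / (1 + math.exp(-x))

DELTA = 0.5  # two observers always differ by 0.5 in log-odds

for logodds in (-6, -3, 0, 3, 6):
    p1, p2 = sigmoid(logodds), sigmoid(logodds + DELTA)
    print(f"log-odds {logodds:+d}: p1={p1:.4f} p2={p2:.4f} "
          f"gap={p2 - p1:.4f}")
```

The probability gap is negligible when both observers are near certainty but swells to over 0.12 when beliefs pass through the middle of the scale, so a constant disagreement in log-odds waxes and wanes in ordinary probabilities exactly as described.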

Imho we need to regard this less as a bug in our theory than as an inherent feature of learning. This feature does at least as much good as harm, in helping us adjust more quickly than we would otherwise. In continuous-time updating, the mean belief shifts in proportion to the variance of beliefs. The equation is analogous to the famous fundamental-law-of-evolution equation of statistician/biologist Fisher, who argued that the most diverse population was the fittest in terms of adjusting to stress. In the same sense we can say that rational turbulence breeds diversity of opinion, which in turn speeds our learning under stress.
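The mean-shifts-with-variance claim has a clean discrete-time analogue that can be checked directly (the particular belief distribution below is my own made-up example): for a distribution of beliefs about a success probability p, a single observed success moves the mean belief by exactly Var(p)/E(p), so more diverse beliefs adjust faster.

```python
# a made-up discrete belief distribution over the success probability p
ps = [0.2, 0.4, 0.6, 0.8]   # candidate values of p
ws = [0.4, 0.3, 0.2, 0.1]   # prior weights on each

mean = sum(w * p for w, p in zip(ws, ps))
var = sum(w * (p - mean) ** 2 for w, p in zip(ws, ps))

# Bayes update after observing one success: reweight by likelihood p
post = [w * p for w, p in zip(ws, ps)]
norm = sum(post)
post = [w / norm for w in post]
new_mean = sum(w * p for w, p in zip(post, ps))

# the shift in the mean belief equals variance / mean exactly
print(f"shift = {new_mean - mean:.4f}, var/mean = {var / mean:.4f}")
```

This is the same structure as Fisher's fundamental theorem, where the rate of adaptation is proportional to the variance of fitness in the population: zero diversity of opinion means zero movement, however surprising the data.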

The most important application involves financial markets. In the spirit of de Finetti and Savage, we can identify subjective probability with willingness to pay, and market price as a kind of consensus probability. Finance theory tends to identify price turbulence with failures of rationality, either momentary (from the orthodox perspective) or persistent (from the behaviorist perspective). Statistics theory also struggles with this, since from either a classical or textbook Bayesian perspective our estimators should converge on the probabilistic truth. But if we acknowledge that God not only plays dice with the universe, but also occasionally changes His dice without telling us, we can appreciate how differences between rational market players can wax and wane. Excess volatility, and the volatility of volatility, provide evidence that people learn.

My reply:

Osband writes that my Bayesian prior "actually cedes alternative structural hypotheses, albeit with tiny epsilon weights. Otherwise you would never change your mind." I may act as if I have these alternatives, but actually my Bayesian models don't have these pre-specified alternatives. Rather, I recognize that my model might be wrong and I'm willing to check it.

Otherwise, I like the “rational turbulence” idea. It reminds me of the idea, discussed in my paper with Shalizi, of the fractal nature of scientific revolutions.

2 Comments

  1. Kent Osband says:

    Andrew, thanks for posting my comment. I misspoke in saying "actually" when what I meant was "de facto". Your de facto prior appears to be nested, with a top layer that cedes a small probability that your model proper is wrong.

  2. Andrew Gelman says:

    Kent:

    I never calculate, or try to calculate, that small probability of which you speak. For reasons discussed in my articles (follow link above), I don't have any trust in these posterior probabilities that a model is true. For continuous-parameter models, these probabilities depend on parameters in the prior distributions that have no impact on the posteriors for each of the individual models.

    Rather than framing the problem as Pr(model is true|data), instead I accept ahead of time that the model is false and I look at its fit to data.

    That said, I do recognize the incoherence in my approach, as discussed in that earlier blog and article.