Continuing with my discussion here and here of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:

David Hendry, “Empirical Economic Model Discovery and Theory Evaluation”:
Hendry presents a wide-ranging overview of scientific learning, with an interesting comparison of physical with social sciences. (For some reason, he discusses many physical sciences but restricts his social-science examples to economics and psychology.)
The only part of Hendry’s long and interesting article that I will discuss, however, is the part where he decides to take a gratuitous swing at Bayes. I don’t know why he did this, but maybe it’s part of some fraternity initiation thing, like TP-ing the dean’s house on Halloween.
Here’s the story. Hendry writes:
‘Prior distributions’ widely used in Bayesian analyses, whether subjective or ‘objective’, cannot be formed in such a setting either, absent a falsely assumed crystal ball. Rather, imposing a prior distribution that is consistent with an assumed model when breaks are not included is a recipe for a bad analysis in macroeconomics. Fortunately, priors are neither necessary nor sufficient in the context of discovery.
I could just laugh this off—but as someone who has published two books and hundreds of articles on applied Bayesian statistics, I think I’ll take Hendry seriously.
Let me start with the tone. I generally don’t like it when people take words or phrases they disagree with and put them in quotes. If you’re going to put “prior distributions” and “objective” in quotes, then please show the same disrespect to your other terms: “falsely” . . . “crystal ball” . . . “breaks” . . . “recipe” . . . “macroeconomics” . . . “discovery.”
But let me get to the substance. First, Hendry’s right. No statistical method is necessary. With sufficient effort, I think you can solve all statistical problems with Bayesian methods, or with robust methods, or with bootstrapping, or with any number of alternative approaches. Fuzzy sets would probably work too. Different approaches have different advantages, but I’m sure that if Hendry adopts a self-denying ordinance and decides to never use priors, he can solve all sorts of data analysis problems. He’ll just have to work really hard sometimes. But, to be fair, there are some problems that I have to work really hard on too. In short: econometric methods tend to require more effort in complicated settings, but they often have appealing robustness properties. It’s fair enough that Hendry and I place different values on robustness vs. modeling flexibility.
My most serious criticism of Hendry’s above paragraph is the old, old story: he’s singling out Bayesian methods and priors as being particularly bad. Meanwhile all those likelihood functions and assumptions of additivity, symmetry, etc. just sneak in. Hendry’s standing at the back window with a shotgun, scanning for priors coming over the hill, while a million assumptions just walk right into his house through the front door.
Here’s Hendry’s summary:
The pre-existing framework of ideas is bound to structure any analysis for better or worse, but being neither necessary nor sufficient, often blocking, and unhelpful in a changing world, prior distributions should play a minimal role in data analyses that seek to discover useful knowledge.
I’m going to have to disagree. I could give a million examples of useful knowledge that can be discovered with the aid of prior distributions. For example, where are the houses in the U.S. that have high radon levels? What are the effects of redistricting? How much perchloroethylene does the body metabolize? What is public opinion on gay rights by state? Or, for a classic from Mosteller and Wallace in 1960, classify the authorship of the Federalist Papers using 1960s technology.
I’m not saying that Hendry and his colleagues need to be using Bayesian methods in their applied research. I’m not even saying that Bayesian methods are needed to solve the problems listed in the above paragraph. In practice these problems were indeed solved using Bayesian inference, but I think other approaches could get there too. What I am saying is, why is Hendry so sure that “prior distributions should play a minimal role” etc.? I’m really bothered when people go beyond the simple and direct, “I have no personal experience with Bayesian inference solving a useful problem” to prescriptive (and wrong) statements such as “prior distributions should play a minimal role.” And it’s just silly to say that priors are “unhelpful in a changing world.” I’d think an econometrician would know about time series models!
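To make the time-series point concrete, here is a minimal sketch of sequential Bayesian updating in a local-level (random walk plus noise) model, where each step’s posterior becomes the next step’s prior. This is an illustration only, not anything from Hendry’s article: the function name, the variance settings, and the toy series with a level shift are all assumptions chosen for the example.

```python
def local_level_filter(ys, prior_mean=0.0, prior_var=1.0,
                       drift_var=0.5, noise_var=1.0):
    """Posterior means of a latent level under a normal local-level model.

    Assumed (hypothetical) model for illustration:
        level_t = level_{t-1} + N(0, drift_var)
        y_t     = level_t     + N(0, noise_var)
    Setting drift_var=0 gives a model that assumes the level never changes.
    """
    mean, var = prior_mean, prior_var
    means = []
    for y in ys:
        # Prior for this step: last posterior, widened to allow for drift.
        pred_var = var + drift_var
        # Conjugate normal update: weight on the new observation.
        gain = pred_var / (pred_var + noise_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * pred_var
        means.append(mean)
    return means


# Made-up series with a break: the level jumps from 0 to 5 halfway through.
ys = [0.0] * 20 + [5.0] * 20

drifting = local_level_filter(ys, drift_var=0.5)  # prior allows change
static = local_level_filter(ys, drift_var=0.0)    # prior assumes no change

print(round(drifting[-1], 3), round(static[-1], 3))
```

On this toy series the version whose prior allows drift ends up very close to the new level of 5, while the no-change version is still well short of it. The problem in the second case is the assumed model, not the use of a prior.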
Hendry also pulls the no-true-Scotsman trick:
Fortunately, priors are neither necessary nor sufficient in the context of discovery. For example, children learn whatever native tongue is prevalent around them, be it Chinese, Arabic or English, for none of which could they have a ‘prior’. Rather, trial-and-error learning seems a child’s main approach to language acquisition: see Clark and Clark (1977). Certainly, a general language system seems to be hard wired in the human brain (see Pinker 1994; 2002) but that hardly constitutes a prior. Thus, in one of the most complicated tasks imaginable, which computers still struggle to emulate, priors are not needed.
This is a no-true-Scotsman argument because, when confronted with an example in which our brains figure things out using a pre-existing structure (not for Chinese, Arabic, or English, but for human language in general), Hendry simply says that this system that is “hard wired in the human brain . . . hardly constitutes a prior.” Huh? It’s definitely a prior. That’s the whole point: our brains are tuned to decode human language.
Why do a few throwaway paragraphs in an otherwise pretty good article bug me so much? Hendry’s anti-Bayesian sentiments are no more clueless than those earlier expressed by, say, John DiNardo. The difference is that DiNardo was just venting his opinions and was pretty open about this, whereas Hendry’s presenting his prejudices with an air of expertise. If Hendry wants to work on “replacing unrestricted non-linear functions by an encompassing theory-derived form, such as an ogive,” then fine. His theoretical models of model selection seem interesting and could perhaps be useful. I just wish he’d cut out the part where he implicitly disparages the work of Mosteller and Wallace, Lax and Phillips, and a few zillion other researchers who’ve used Bayesian methods to solve problems.
It’s not too late for Hendry to reform (I hope). All he needs to do is retreat to presenting the positive virtues of his preferred inferential approach, along with his explanations as to why Bayesian methods have not seemed useful to him. He’s an econometrician, he doesn’t work in toxicology, and that’s fine. I think both his positive and his negative statements would be stronger if he were more aware of the limits of his own experience. Just as, in mathematics, a theorem is clearer if you understand the range of its applicability and the areas where there are counterexamples.