For the past several months I’ve been circling around and around some questions related to the issue of how we build trust in statistical methods and statistical results. There are lots of examples but let me start with my own career. My most cited publications are my books and my methods papers, but I think that much of my credibility as a statistical researcher comes from my applied work. It somehow matters, I think, when judging my statistical work, that I’ve done (and continue to do) real research in social and environmental science.
Why is this? It’s not just that my applied work gives me good examples for my textbooks. It’s also that the applied work motivated the new methods. Most of the successful theory and methods that my collaborators and I have developed, we developed in the context of trying to solve active applied problems. We weren’t trying to shave a half a point off the predictive error in the Boston housing data; rather, we were attacking new problems that we couldn’t solve in any reasonable way using existing methods.
That’s fine, but in that case who cares if the applied work is any good? To put it another way, suppose my new and useful methods had been developed in the context of crappy research projects where nobody gave a damn about the results? The methods wouldn’t be any worse, right? Statistical methods don’t care whether the numbers are real or fake. I have an answer to this one: If nobody cared about our results we would have little motivation to improve. Here’s an example. A few years ago I posted some maps based on multilevel regression and poststratification of pre-election polls to show how different groups of white people voted in 2008. The political activist Kos noticed that some of the numbers in my maps didn’t make sense. Kos wasn’t very polite in pointing out my mistakes, but he was right. So Yair and I went back and improved the model. It took a few months, but at the end I had better maps, and also a better method (which will be published in the American Journal of Political Science). This all only happened because I and others cared about the results. If all we were doing was trying to minimize mean squared predictive error, I doubt the improvements would’ve done anything at all.
This is not to say that excellent and innovative statistical theory can’t be developed in the absence of applications [That's a triple negative sentence but I couldn't easily think of any cleaner way to get my point across --- ed.] or, for that matter, in the context of shallow applications. For example, my popular paper on prior distributions for group-level variance parameters came through my repeated study of the 8-schools problem, a dead example if there ever was one. In many cases, though, seriousness of the application, the focus required to get details right, was what made it all work.
Now on to literature. Watership Down is one of my favorite books ever, and one striking thing about it is all the physical details of rabbit life. The characters don’t feel like people in rabbit suits but rather like rabbits that happen to have human-level intelligence (although not, interestingly enough, full human-level feelings; the characters seem very “animal-like” in their generally relentless focus on the present). Does this local realism matter for the book? I think it does. Being (approximately) constrained by reality forced Richard Adams to tell a story that held together, in a way that I don’t think would’ve happened under a pure fantasy scenario.
And this in turn relates to the concern that Thomas Basbøll and I have about the anomalousness and immutability of stories, the idea that our explanations of the world are forced to be interesting and nontrivial because of the requirement that they comport with the odd and unexpected features of reality. Neuroscience needs to explain the stories related by Oliver Sacks, but Sacks’s stories would be close to irrelevant to science if he were to fudge the details or hide his sources. God is in every leaf of every tree. But, to get full use of this information, you have to get it all down accurately; you have to plot the data, not just the expectations from your model. Darwin understood that, and so should we all. This also arose in our recent discussions about college admissions: I had various nitty-gritty data questions which some found annoying but I found necessary. If you can’t get down to the data, or if the data dissolve when you get too close to them, that’s a problem.
Again, this is not to disparage purely theoretical work. In some sense, the most theoretical research is itself empirical in that it must respect the constraints imposed by mathematics: a system that is created by humans but has depths we still do not understand. Trying to prove a theorem is not unlike taking an empirical conjecture and juxtaposing it with real data.