Watership Down, thick description, applied statistics, immutability of stories, and playing tennis with a net

For the past several months I’ve been circling around and around some questions related to the issue of how we build trust in statistical methods and statistical results. There are lots of examples but let me start with my own career. My most cited publications are my books and my methods papers, but I think that much of my credibility as a statistical researcher comes from my applied work. It somehow matters, I think, when judging my statistical work, that I’ve done (and continue to do) real research in social and environmental science.

Why is this? It’s not just that my applied work gives me good examples for my textbooks. It’s also that the applied work motivated the new methods. Most of the successful theory and methods that my collaborators and I have developed, we developed in the context of trying to solve active applied problems. We weren’t trying to shave half a point off the predictive error in the Boston housing data; rather, we were attacking new problems that we couldn’t solve in any reasonable way using existing methods.

That’s fine, but in that case who cares if the applied work is any good? To put it another way, suppose my new and useful methods had been developed in the context of crappy research projects where nobody gave a damn about the results? The methods wouldn’t be any worse, right? Statistical methods don’t care whether the numbers are real or fake. I have an answer to this one: If nobody cared about our results we would have little motivation to improve. Here’s an example. A few years ago I posted some maps based on multilevel regression and poststratification of pre-election polls to show how different groups of white people voted in 2008. The political activist Kos noticed that some of the numbers in my maps didn’t make sense. Kos wasn’t very polite in pointing out my mistakes, but he was right. So Yair and I went back and improved the model. It took a few months, but at the end I had better maps, and also a better method (which will be published in the American Journal of Political Science). This all only happened because I and others cared about the results. If all we were doing was trying to minimize mean squared predictive error, I doubt the improvements would’ve done anything at all.

This is not to say that excellent and innovative statistical theory can’t be developed in the absence of applications [That’s a triple negative sentence but I couldn’t easily think of any cleaner way to get my point across — ed.] or, for that matter, in the context of shallow applications. For example, my popular paper on prior distributions for group-level variance parameters came through my repeated study of the 8-schools problem, a dead example if there ever was one. In many cases, though, seriousness of the application, the focus required to get details right, was what made it all work.

Now on to literature. Watership Down is one of my favorite books ever, and one striking thing about it is all the physical details of rabbit life. The characters don’t feel like people in rabbit suits but rather like rabbits that happen to have human-level intelligence (although not, interestingly enough, full human-level feelings. The characters seem very “animal-like” in their generally relentless focus on the present). Does this local realism matter for the book? I think it does. Being (approximately) constrained by reality forced Richard Adams to tell a story that held together, in a way that I don’t think would’ve happened under a pure fantasy scenario.

And this in turn relates to the concern that Thomas Basbøll and I have about the anomalousness and immutability of stories, the idea that our explanations of the world are forced to be interesting and nontrivial because of the requirement that they comport with the odd and unexpected features of reality. Neuroscience needs to explain the stories related by Oliver Sacks, but Sacks’s stories would be close to irrelevant to science if he were to fudge the details or hide his sources. God is in every leaf of every tree. But, to get full use of this information, you have to get it all down accurately; you have to plot the data, not just the expectations from your model. Darwin understood that, and so should we all. This also arose in our recent discussions about college admissions: I had various nitty-gritty data questions which some found annoying but I found necessary. If you can’t get down to the data, or if the data dissolve when you get too close to them, that’s a problem.

Again, this is not to disparage purely theoretical work. In some sense, the most theoretical research is itself empirical in that it must respect the constraints imposed by mathematics: a system that is created by humans but has depths we still do not understand. Trying to prove a theorem is not unlike taking an empirical conjecture and juxtaposing it with real data.


  1. Rahul says:

    “Excellent and innovative statistical theory can sometimes be developed even in the absence of applications, but not often.” One negative. :)

    • That’s two negatives, “absence” and “not”. And I’d argue that “sometimes” is implicitly negative because of scalar implicature (which also makes the “but not often” redundant — you could replace “sometimes” with “rarely” for the full effect).

      So even though “A always X-es” implies “A sometimes X-es”, if you use “sometimes”, the implicature (not strict implication) is “not always”. So “sometimes” itself in this case is negative.

      Syntactic negation supports what are known as “negative polarity items”.

      The sentence “John would not budge an inch” sounds fine, but “John would budge an inch” is infelicitous. “Budge an inch” is a negative polarity item that only shows up under the scope of negation. This includes negative quantifiers, like “nobody would budge an inch”, but not lexicalized negatives, like “John failed to budge an inch”.

      • Fernando says:

        On scalar implicature: Is that an empirical fact?

        Personally, when I hear the statement “Some students can afford a new car”, my inference is not that “Not all students can afford a new car”, as stated by Wikipedia, but rather that the statement “Not one student can afford a new car” is false.

        For me the word “some” plays the role of an exception that negates an implicit universal statement, as in mathematical logic. That is, we might not know whether all or some students can afford a new car, but we are sure the claim that not one student can afford a car is false.

        • Mayo says:

          To interpret it as you say Wikipedia does would construe:
          “there are some experiments that falsify the theory” as “there are some experiments that do not falsify the theory”.

    • Rahul says:

      “Statistical theory developed from purely theoretical considerations often turns out to be relatively mediocre and derivative”?

      That sounds stronger and harsher than the original though. I think we use the negatives so often to soften the message.

  2. Fernando says:

    According to Wikipedia: “Watership Down was rejected 13 times before it was accepted by Rex Collings”.

    I wonder how it would have fared nowadays with self-publishing and e-books. It certainly would have been published sooner but would it have done better in terms of readership?

  3. John Mashey says:

    1) “In theory, theory is the same as practice, but in practice, it isn’t.”

    2) Good interplay between theory and practice (or methods experts and application-domain experts) often yields the most useful results.
    For example:

    a) In high school, I heard Bell Labs’ Henry O. Pollak give a terrific talk on this, describing a classical theoretical problem (minimal spanning trees) and Prim’s Algorithm or Kruskal’s Algorithm, for example … and then observing how much money it saved the Bell System in network planning.
    When I later worked at BTL, Prim was Executive Director (with John Tukey as Associate Exec Dir), Pollak was a Director in the division, one of whose departments had Kruskal, John Chambers, etc., another had William Cleveland, and another had Ron Graham, Michael Garey, David Johnson, etc.

    A lot of good math and statistics theory came out of that lab, and of course, they were pretty free to study problems they found interesting … but importantly, they were well-connected with many other BTL groups who had real problems, which sometimes gave ideas for new theoretical research … some of which would help solve real problems.

    b) In computing, Stanford has long had close ties with Silicon Valley industry. Years ago, at MIPS (cofounded by current Stanford President John Hennessy), we regularly attended dissertation presentations and had grad students as summer interns. They might bring new techniques (as in compiler optimization) but then get a better chance to see how well they really worked, and then run into new problems. Both Stanford and Berkeley professors often canvassed industry friends for “interesting problems for which we wish we had better theory.” Students got ideas about problems that people cared about.

  4. clement thery says:

    “Neuroscience needs to explain the stories related by Oliver Sacks, but Sacks’s stories would be close to irrelevant to science if he were to fudge the details or hide his sources. God is in every leaf of every tree.” Even ethnographers need to be reminded of that (myself primarily included): one needs to give a little too many details and a little too much information in order for the reader to be able, where needed, to interpret the material over the ethnographer/writer herself. There is a delicate balance to find between leaving apparently uninformative details in a description and having a single encompassing internal structure that organizes the description. I didn’t know statisticians were also interested in such issues.

    I think I will read these rabbit stories; sounds interesting.

  5. Steve Sailer says:

    Watership Down is a great book.

    Among much else, it’s an allegory of the British Army at war. Richard Adams modeled the exodus of the rabbits, having to travel cross country under terrifying conditions, on his colleagues in the Airborne unit who parachuted down “One Bridge Too Far” behind German lines in 1944 to seize the Rhine crossing and win the war by Christmas. When they were cut off, they had to swim the Rhine and make their way through 100 miles of enemy country to safety.

    Adams, I believe, had some sort of supply-type job in the unit and didn’t make the jump himself, but had long conversations with the survivors.

    • Anonymous says:


      He made a large donation to my Oxford college student mail room (Worcester) in remembrance of a friend he lost in the war.

      I recall him being very shaken at the ceremony and dinner afterwards (I did not even try to converse with him).

      If you google or contact the college, you might be able to get the name if you wanted to do further research.

    • Andrew says:

      Interesting. So Adams was, in effect, operating under two constraints.

      • Jacob H. says:

        When I read the book as a kid, the section where they are in the farmer’s fake warren, being fed and fattened for slaughter, struck me very strongly as allegory, but I wasn’t sure exactly of what. I mean, it felt more “real” than the rest of the book in the sense of seeming to pose a moral dilemma that seemed more relevant to human life, even though I couldn’t exactly explain what that dilemma was. This is different from Animal Farm, which if it works (I’m not 100% sure it does), works by very clearly allegorizing a specific set of historical events, and making them absurd and comical in doing so. Watership Down seems instead to take a set of institutions that we are familiar with, and hint at another, darker interpretation without specifying it precisely. At the same time, I think the book’s resonance is because we not only relate to the bunnies and the dangers they experience, we also realize that we as humans are the cause of their danger, distress, and death: it is our bulldozers and planned communities and farms that spell their doom.

        I also thought of it a lot much later when I was taking a course on Greek Tragedy as an undergraduate, although apart from the prophecy-of-doom component, I’d need someone else to spell out the connections for me.

  6. […] Gelman has been writing on his excellent blog about how it is the constraint and the unexpected inspiration of real-life, tricky, dirty data […]

  7. Fernando says:

    Sure, applications constrain implementation and development of new methods, making them more reliable and useful.

    But similarly theory does constrain applications. Theory is one of the few limitations to specification searches. It provides the basis of the “laugh test”, ruling out specifications that achieve significance through ridiculous means. But here theory stands for what we think we know about the world, a prior perhaps. And in some way reality is everywhere. Even in unicorns.