## The Pandora Principle in statistics — and its malign converse, the ostrich

The Pandora Principle is that once you’ve considered a possible interaction or bias or confounder, you can’t un-think it. The malign converse is when people realize this and then design their studies to avoid putting themselves in a position where they have to consider some potentially important factor.

For example, suppose you’re considering some policy intervention that can be done in several different ways, or conducted in several different contexts. The recommended approach is, if possible, to try out different realistic versions of the treatments in various realistic scenarios; you can then estimate an average treatment effect and also do your best to estimate variation in the effect (recognizing the difficulties inherent in that famous 1/16 efficiency ratio). An alternative, which one might call the reverse-Pandora approach, is to do a large study with just a single precise version of the treatment. This can give a cleaner estimate of the effect in that particular scenario, but to extend it to the real world will require some modeling or assumption about how the effect might vary.

Going full ostrich here, one could simply carry over the estimated treatment effect from the simple experiment and not consider any variation at all. The idea would be that if you’d considered two or more flavors of treatment, you’d really have to consider the possibility of variation in effect, and propagate that into your decision making. But if you only consider one possibility, you could ostrich it and keep Pandora at bay. The ostrich approach might get you a publication and even some policy inference but it’s bad science and, I think, bad policy.
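For readers who haven’t seen where the 1/16 comes from, here is a rough sketch of the arithmetic. It assumes a balanced 2x2 design (two contexts crossed with treatment and control), equal residual standard deviation in every cell, and an interaction half the size of the main effect. Those assumptions, and all the function names, are mine for illustration; this is not a general result about every design.

```python
import math
import random
import statistics


def se_main_effect(sigma, n):
    """SE of a treated-minus-control difference with n/2 units per arm."""
    return sigma * math.sqrt(1 / (n / 2) + 1 / (n / 2))


def se_interaction(sigma, n):
    """SE of a difference of differences with n/4 units in each of four cells."""
    return sigma * math.sqrt(4 / (n / 4))


def sample_size_multiplier(interaction_to_main_ratio):
    """How much larger n must be for the interaction to reach the same
    estimate-to-SE ratio that the main effect has at the original n."""
    # The SE ratio does not depend on sigma or n, so any values work here.
    se_ratio = se_interaction(1.0, 400) / se_main_effect(1.0, 400)
    return (se_ratio / interaction_to_main_ratio) ** 2


def simulate_se_ratio(n=400, sigma=1.0, reps=2000, seed=0):
    """Monte Carlo check of the SE ratio. True effects are set to zero,
    which is harmless because only the sampling spread matters here."""
    rng = random.Random(seed)
    cell = n // 4
    mains, inters = [], []
    for _ in range(reps):
        # Sample means for context A/B crossed with control (0) / treated (1).
        m = {
            key: statistics.fmean(rng.gauss(0.0, sigma) for _ in range(cell))
            for key in ("A0", "A1", "B0", "B1")
        }
        diff_a = m["A1"] - m["A0"]
        diff_b = m["B1"] - m["B0"]
        mains.append((diff_a + diff_b) / 2)  # average treatment effect
        inters.append(diff_a - diff_b)       # variation in effect across contexts
    return statistics.stdev(inters) / statistics.stdev(mains)
```

Under these assumptions the interaction’s standard error is twice the main effect’s, so `sample_size_multiplier(0.5)` comes out at about 16, and `simulate_se_ratio()` should land near 2. If you believe the interaction is as large as the main effect, the multiplier drops to about 4, which is still no bargain.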

That said, there’s no easy answer, as there will always be additional possible confounding factors that you will not have been able to explore. That is, among all the scary contents of Pandora’s box, one thing that flies out is another box, and really you should open that one too . . . that’s the Cantor principle, which we encounter in so many places in statistics.

tl;dr: You can’t put Pandora back in the box. But really she shouldn’t’ve been trapped in there in the first place.

1. Shravan says:

Even if one does a two-condition study, taking all plausible sources of variability into account quickly becomes a nightmare. Shall I include trial effects, and trial:condition interactions? Surely subjects must be getting fatigued over time, or learning to strategize. Should I include the amount of alcohol they drank the night before, or the amount of sleep they had? Gender? Age? I generally ignore all these variables in my experiments because it would become a mess.

2. Kaiser says:

Another advantage of the ostrich approach is the built-in defense against the replication brigade. If it doesn’t replicate, it must be because the single precise treatment wasn’t replicated precisely.

In some sense, the “ostrich approach” isn’t always a terrible idea.

For example, any type of engineering begins in the lab. This is a world where things can be isolated and altered in an extremely precise, controlled manner. Then you slowly begin testing a wider and wider variety of inputs until you think it’s ready for the real world. In fact, I think you can argue almost all science advances in this manner.

Of course, it’s really important to realize the *speed* at which your knowledge advances. In the lab, you can often collect data relatively cheaply, change some parameters, collect more data, and repeat. If you’re collecting data on live subjects, the time between tweaks to the situational parameters is often measured in years (i.e. the next study) rather than in days. And if your measurements under those isolated parameter settings are very noisy… well, I don’t need to preach to the choir here.