
Methodological terrorism. For reals. (How to deal with “what we don’t know” in missing-data imputation.)

Kevin Lewis points us to this paper, by Aaron Safer-Lichtenstein, Gary LaFree, and Thomas Loughran, on the methodology of terrorism studies. This is about as close to actual “methodological terrorism” as we’re ever gonna see here.

The linked article begins:

Although the empirical and analytical study of terrorism has grown dramatically in the past decade and a half to incorporate more sophisticated statistical and econometric methods, data validity is still an open, first-order question. Specifically, methods for treating missing data often rely on strong, untestable, and often implicit assumptions about the nature of the missing values.

Later, they write:

If researchers choose to impute data, then they must be clear about the benefits and drawbacks of using an imputation technique.

Yes, definitely. One funny thing about missing-data imputation is that the methods are so mysterious, and so obviously subject to uncheckable assumptions, that there’s a tendency for researchers to just throw up their hands and give up: either going for crude data-simplification strategies such as throwing away all cases where anything is missing, or imputing without any attempt to check the resulting inferences.

My preference is to impute and then check assumptions, as here. That said, in practice this can be a bit of work, so in a lot of my own applied work I kinda close my eyes to the problem too. I should do better.
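To make the stakes concrete, here is a minimal numerical sketch (not the linked paper’s method, and all names and numbers are made up for illustration): when missingness in an outcome depends on an observed covariate, listwise deletion biases the estimate, while even a simple regression-based imputation, followed by a sanity check against the truth, can recover it. A full analysis would use proper multiple imputation rather than this single draw.

```python
# Sketch: bias from listwise deletion under MAR missingness,
# versus a simple regression imputation. Hypothetical simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                   # fully observed covariate
y = 2.0 + 1.5 * x + rng.normal(size=n)   # outcome; true mean is 2.0

# MAR mechanism: y is more likely to be missing when x is large
miss = rng.random(n) < 1 / (1 + np.exp(-2 * x))
y_obs = np.where(miss, np.nan, y)

# Listwise deletion: biased low, since the retained cases have low x
cc_mean = np.nanmean(y_obs)

# Regression imputation from the observed cases (x is complete)
keep = ~miss
slope_intercept = np.polyfit(x[keep], y[keep], 1)
y_imp = y_obs.copy()
y_imp[miss] = np.polyval(slope_intercept, x[miss]) + rng.normal(size=miss.sum())
imp_mean = y_imp.mean()

# Check: compare both estimates against the known true mean
print(f"true mean 2.0 | complete-case {cc_mean:.2f} | imputed {imp_mean:.2f}")
```

In real data there is no known true mean to check against, which is why the checking step has to compare imputed values to observed ones at comparable covariate values instead.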


  1. Pam Davis-Kean says:

Just as we should be clear about the biases and assumptions involved in using listwise and pairwise deletion. Which biases and assumptions do you want to highlight in our studies? No technique is without issues.

  2. I’m surprised that you didn’t get more traction on this topic. Thanks for flagging it.

  3. Zad Chow says:

I always find it unusual that researchers will avoid data imputation because of the complexity of the methods and think they are fine excluding cases that have missing data (some even consider it to increase rigor). Do they not understand that if there’s a systematic reason for the data being missing, excluding those cases will simply bias the results?

    • Curious says:

While I agree with this as a general idea, it does depend on the problem being studied. You might argue that it is better to model every case in a set of data rather than remove it, but modeling cases without enough data to be useful is indistinguishable from deletion. That said, I certainly acknowledge the benefit of imputation as a normative practice, given that it forces researchers to think more deeply about their data.

  4. Keith O'Rourke says:

I think the confusion of vocabulary in the literature might have amplified the mystery – see “What Is Meant by ‘Missing at Random’?” by Shaun Seaman, John Galati, Dan Jackson, and John Carlin.
