Data exploration and multiple comparisons

Bill Harris writes:

I’ve read your paper and presentation showing why you don’t usually worry about multiple comparisons. I see how that applies when you are comparing results across multiple settings (states, etc.).

Does the same principle hold when you are exploring data to find interesting relationships? For example, you have some data, and you’re trying a series of models to see which gives you the most useful insight. Do you try your models on a subset of the data so you have another subset for confirmatory analysis later, or do you simply throw all the data against your models?

My reply: I’d like to estimate all the relationships at once and use a multilevel model to do partial pooling to handle the multiplicity issues. That said, in practice, in my applied work I’m always bouncing back and forth between different hypotheses and different datasets, and often I learn a lot when next year’s data come in and I can modify my hypotheses. The trouble with the classical hypothesis-testing framework, at least for me, is that so-called statistical hypotheses are very precise things, whereas the sorts of hypotheses that arise in science and social science are vaguer and are not so amenable to “testing” in the classical sense.
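
To give a sense of what I mean by partial pooling, here is a minimal sketch in Python. The numbers are made up, and the between-group variance is estimated with a crude method-of-moments step rather than by fitting a full multilevel model, but the shrinkage pattern is the point: noisier comparisons get pulled more strongly toward the overall mean, which is what tames the multiplicity problem.

```python
import numpy as np

# Hypothetical per-comparison estimates and standard errors (made-up numbers,
# e.g. one estimated effect per state).
y  = np.array([0.12, 0.05, -0.03, 0.20, 0.07, 0.01])   # raw estimates
se = np.array([0.06, 0.08,  0.07, 0.10, 0.05, 0.09])   # their standard errors

# Rough method-of-moments estimate of the between-group variance tau^2.
tau2 = max(0.0, np.var(y, ddof=1) - np.mean(se**2))

# Precision-weighted grand mean under a normal-normal model.
w = 1.0 / (se**2 + tau2)
mu_hat = np.sum(w * y) / np.sum(w)

# Partial pooling: each estimate is shrunk toward the grand mean,
# with more shrinkage for the noisier comparisons.
shrink = se**2 / (se**2 + tau2)
pooled = shrink * mu_hat + (1.0 - shrink) * y
print(np.round(pooled, 3))
```

In a real analysis I’d rather fit the multilevel model directly (for example in Stan), so that the group effects and the variance components are estimated jointly instead of in these two separate steps.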

3 thoughts on “Data exploration and multiple comparisons”

  1. Bill: It’s worthwhile being wise here

    So my advice to you is not to be misled by the data in a serious way ;-)

    Things do seem to be percolating and perhaps coalescing here, in addition to Andrew’s recent stuff on it

    For some CV stuff see
    http://users.soe.ucsc.edu/~draper/San-Francisco-2

    And for variable selection stuff see
    http://users.soe.ucsc.edu/~draper/San-Francisco-2

    My current challenge involves lots of comparisons, most of which were statistically significant. Should the almost-significant ones be upgraded to significant, and if so, how would that be justified (i.e., by partial pooling toward an average effect)?

    K?

  2. Hi Andrew,

    You mention that:

    The trouble with the classical hypothesis-testing framework, at least for me, is that so-called statistical hypotheses are very precise things, whereas the sorts of hypotheses that arise in science and social science are vaguer and are not so amenable to “testing” in the classical sense.

    This makes a lot of sense to me. Do you have any citations of methods papers that discuss this issue further?
