One thing I like about hierarchical modeling is that it is not just about criticism. It’s a way to improve inferences, not just a way to adjust p-values.

In an email exchange regarding the difficulty many researchers have in engaging with statistical criticism (see here for a recent example), a colleague of mine opined:

Nowadays, promotion requires more publications, and in an academic environment, researchers are asked to do more than they can. So many researchers just work like workers in a product line without critical thinking. Quality becomes a tradeoff of quantity.

I replied:

I think that many (maybe not all) researchers are interested in critical thinking, but they don’t always have a good framework for integrating critical thinking into their research. Criticism is, if anything, too easy: once you’ve criticized, what do you do about it (short of “50 shades of gray” self-replication, which really is a lot of work)? One thing I like about hierarchical modeling is that it is not just about criticism. It’s a way to improve inferences, not just a way to adjust p-values.

The point is that in this way criticism can be a step forward.

When we go through the literature (or even all the papers by a particular author) and list all the different data-coding, data-exclusion, and data-analysis rules that were used (see comment thread from above link for a long list of examples of data excluded or included, outcomes treated separately or averaged, variables controlled for or not, different p-value thresholds, etc.), it’s not just about listing multiple comparisons and criticizing p-values (which ultimately only gets you so far, because even correct p-values bear only a very indirect relation to any inferences of interest). It’s also about learning more from data, constructing a fuller model that includes all the possibilities corresponding to the different theories. Or even just recognizing that a particular dataset, with a particular small sample and noisy, variable measurements, is too weak to learn what you want to learn. That can be good to know too: if it’s a topic you really care about, you can devote some effort to more careful measurement, or at least know the limitations of your data. All good. The point is to make the link to reality rather than to try to compute some correct p-value, which has little to do with anything.
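To make the contrast with p-value adjustment concrete, here is a minimal sketch of partial pooling, assuming a simple normal-normal model and a crude method-of-moments estimate of the between-analysis variance. The function name `partial_pool` and the numbers are made up for illustration; a real analysis would fit the full hierarchical model rather than use this shortcut.

```python
from statistics import mean, variance


def partial_pool(estimates, std_errors):
    """Shrink noisy estimates toward a shared mean.

    Assumed model (for illustration only):
        y_j ~ Normal(theta_j, se_j^2),  theta_j ~ Normal(mu, tau^2)
    Returns (mu, tau2, pooled estimates of theta_j).
    """
    sigma2 = [s ** 2 for s in std_errors]
    # Crude method-of-moments guess at the between-analysis variance:
    # observed spread minus the average sampling variance, floored at zero.
    tau2 = max(0.0, variance(estimates) - mean(sigma2))

    # Precision-weighted estimate of the shared mean mu.
    weights = [1.0 / (s2 + tau2) for s2 in sigma2]
    mu = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    if tau2 == 0.0:
        # The data cannot tell the analyses apart: complete pooling.
        return mu, tau2, [mu] * len(estimates)

    # Each pooled estimate is a precision-weighted average of y_j and mu.
    pooled = [
        (y / s2 + mu / tau2) / (1.0 / s2 + 1.0 / tau2)
        for y, s2 in zip(estimates, sigma2)
    ]
    return mu, tau2, pooled


# Hypothetical effect estimates from five analyses of related outcomes.
estimates = [0.60, 0.10, -0.30, 0.25, 0.05]
std_errors = [0.20, 0.10, 0.20, 0.10, 0.10]

mu, tau2, pooled = partial_pool(estimates, std_errors)
for y, se, p in zip(estimates, std_errors, pooled):
    print(f"raw {y:+.2f} (se {se:.2f}) -> partially pooled {p:+.3f}")
```

Instead of asking which of the five estimates survives a multiple-comparisons correction, the sketch pulls each one toward the common mean, and pulls the noisier ones harder. When the estimated `tau2` is zero, everything collapses to the common mean, which is one way of seeing that the data are too weak to distinguish the analyses.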

12 thoughts on “One thing I like about hierarchical modeling is that it is not just about criticism. It’s a way to improve inferences, not just a way to adjust p-values.”

  1. I really liked:

    “it’s not just about listing multiple comparisons and criticizing p-values (which ultimately only gets you far, because even correct p-values bear only a very indirect relation to any inferences of interest)”

    I’ll email this to all my p-value addicted colleagues – but presumably it should say “…only gets you SO far”?

  2. Something else, perhaps, is also going on; too many critics are disengaged, drive-by. Scientists who review are mumbling ‘well, I’d like it better if they… ‘ [did 6 years more work before publishing this paper]. This is part of the pushback against open post-pub ‘peer’ review. People aren’t behaving like proper peers; the peanut gallery is asking for ‘microcredit’ when their contribution is minimal.

    Perhaps critics should only be given a voice when they come with new data, integration of data from another source, or a new analytical method. Then criticism is a material step forward, indeed.

    • That sounds like projects that cannot realistically achieve their stated goals are getting funded. Just because a funding agency gave money to someone with a bad idea doesn’t mean everyone has to read about them trying to finagle their way out of the situation. Imagine this:

      1) A naive student doesn’t realize the experimental design was crap.
      2) A committee of professors approves it (as they and many other groups had approved similar crap projects for many years).
      3) Student realizes it is crap after performing the work, and the experience alerts them to all the pitfalls.
      4) Student is forced to BS about it to get publications, because in reality this was a waste of time.
      5) Student is now a professor with career and life built upon this BS. If everyone is doing it and the textbooks say to do it, it must not be so wrong. They become cynical or come to believe their own BS.
      6) Professor approves naive crap project of new student, starting the cycle again.

      This process will obviously select for fraudsters and dullards. The first don’t care if they publish BS; the second let the arguments from authority and consensus trump logic and experience.

    • Ben:

      No, I completely disagree. If all a critic does is point out a flaw in a paper, that is useful. When I do work, I want people to tell me where I went wrong and how I could do better!

      • In Fagan (software) inspections, the job of the reviewers is to ask questions, not propose solutions. The intent, IIRC, is to direct the developer’s attention to a potential weak spot in their work without presuming to offer a solution.

        I think there are at least three reasons. Pointing out solutions is inefficient. It’s lots quicker to state that, “There’s a potential issue around line 34, I’m not sure whether the function at line 120 will handle negative arguments, and ….” than to say that /and/ have to come up with and then possibly debate a particular improvement.

        It also leads to fewer dead ends. If A, B, and C are all reviewers of D’s code, they could each find problems and propose solutions that don’t work together. By simply highlighting potential issues, they give D the opportunity to review whether each is a real issue and to synthesize a solution that treats all the issues appropriately.

        Finally, I think Fagan inspections were designed to reduce defensiveness. The reviewers don’t say “That’s bad” or “Here’s the right way to do what you wanted to do” but “I think there’s a problem here; check it out.”

        So I agree with Andrew on this.

        I do see a place for considering how we express our concerns to increase the probability that they will be heard appropriately by the recipient. I also see a place for us as recipients for figuring out how to hear increasingly tougher and more direct messages without becoming defensive and unproductive.

        • Bill:

          Also it helps to separate the job of criticism from the job of deciding whether a paper should be accepted for the journal or whether a product should be approved. As an author, I find it much easier to handle criticism if I know the paper is already ultimately destined to appear in the journal!

    • “Perhaps critics should only be given a voice when they come with new data, integration of data from another source, or a new analytical method. Then criticism is a material step forward, indeed.”

      This would be a bad idea, for several reasons, among them:

      A poor analysis of good data needs criticism to try to forestall repeating the poor analysis with new data.

      Poor methods of data collection need criticism to try to forestall repeating the poor collection process when gathering new data.
