The cargo cult continues

Juan Carlos Lopez writes:

Here’s a news article: . . .

Here’s the paper: . . .

[Details removed to avoid embarrassing the authors of the article in question.]

I [Lopez] am especially bothered by the abstract of this paper, which makes bold claims in the context of a small and noisy study whose measurements are not closely tied to the underlying constructs of interest—at best, they quantify a very context-dependent special case.

Anyhow, I think you can get the gist of the article (and its problems) by only reading the abstract, Table 1, and Figure 1.

My reply:

Yes, there’s no need to take the paper seriously: it’s an exercise in noise mining, and if anyone would ever go to the trouble of replicating it—which I doubt will ever happen—I expect they’d see some other set of interactions pop up as statistically significant. In the news article, one of the authors describes the results in the paper as “surprising”—without realizing that it’s no surprise at all that if you shuffle around a bunch of random numbers, out will pop some random statistically significant comparisons.

The whole thing is a disaster, from data collection to analysis to writeup to publication to publicity—for the general reasons discussed here, and I think I’d be doing the authors a favor, at some level, to tell them that—but for the usual reasons of avoiding conflict I won’t bother doing this. It really makes me sad, not angry. This particular paper that you sent me is not on a particularly important or exciting topic (it’s just quirky enough to get into the news), it’s just routine cargo-cult science that we see every day. For lots of people, it’s their career and they just don’t know better.
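As a minimal sketch of the noise-mining point (generic simulated noise, not the actual data or design of the paper in question): run a batch of comparisons on numbers with no true effects at all, and on average about one in twenty will still clear p < 0.05.

```
# Generic illustration of noise mining, not the paper's data or design:
# run many two-sample t-tests on pure noise and count how many come out
# "statistically significant" at p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_comparisons = 20   # e.g., a batch of subgroup/interaction tests
n_per_group = 25     # small, noisy study

p_values = []
for _ in range(n_comparisons):
    a = rng.normal(size=n_per_group)  # group 1: pure noise, true effect is zero
    b = rng.normal(size=n_per_group)  # group 2: pure noise, true effect is zero
    p_values.append(stats.ttest_ind(a, b).pvalue)

n_significant = sum(p < 0.05 for p in p_values)
print(f"{n_significant} of {n_comparisons} comparisons reach p < 0.05")
# On average about 1 in 20 will, even though every true effect is zero,
# and which comparisons "pop" will change from one replication to the next.
```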

Lopez followed up with another question:

In the setting of, say, a research seminar presentation, how do you answer the question “Why are you not including p-values in your Results section”?

Some context for my question: I’m a Ph.D. candidate at a university where most people are still using p-values in the usual ways which you criticize in McShane et al. (2017). I have trouble answering the question above in a way that doesn’t derail the entire discussion. Recently, I’ve discovered that the most effective way to avoid a long—and sometimes counterproductive—discussion on the topic is to appeal to authority by saying I’m following the ASA guidelines. This has become my go-to, 30-second answer.

My response: I don’t object to people including p-values—they do tell you something! My objection is when p-values are used to select a subset of results. I say: give all the results, not just a subset.
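As a quick sketch of why selecting on p-values misleads (simulated numbers, not from any particular study): when the true effect is small and the estimates are noisy, the subset of estimates that happen to reach p < 0.05 systematically overstates the effect, which is one reason to report all the results.

```
# Generic illustration of selection on significance: with a small true effect
# and noisy estimates, the "significant" subset of estimates is biased upward.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect = 0.1    # small true difference between groups
n_per_group = 50
n_studies = 1000

all_estimates, significant_estimates = [], []
for _ in range(n_studies):
    a = rng.normal(true_effect, 1.0, n_per_group)
    b = rng.normal(0.0, 1.0, n_per_group)
    estimate = a.mean() - b.mean()
    all_estimates.append(estimate)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        significant_estimates.append(estimate)

print(f"mean over all {n_studies} estimates: {np.mean(all_estimates):.2f}")
print(f"mean over the {len(significant_estimates)} 'significant' estimates: "
      f"{np.mean(significant_estimates):.2f}")
# Reporting only the significant subset exaggerates the apparent effect size.
```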

15 thoughts on “The cargo cult continues”

  1. Why do you withhold the article in an effort not to embarrass the authors in question? If anyone is willing to submit a paper to an academic journal and reap the benefits of publishing said paper, then why shouldn’t they also have to stand by it when criticized?

  2. Oh man, I haven’t decided whether I’m P-etered out.

    I’ve actually returned to writing scripts. This subject deserves its own special one. We’ve finished lyrics to a HipHop Cypher. Now I can cajole my rap friends to perform it. The P-value Cypher.

    I’m glad you have solved my dilemma as to which forum to discuss this in, b/c I had hoped that the ASA communities blog would engender more discussion. It’s kinda staid, from my observation, in that nearly all the commentaries have been hashed over many times.

    I also acknowledge that I consider myself a student rather than an expert in statistics, etc. I prefer that designation b/c some topics are just hobbies for me. It’s just that a few NAS academics, back in the ’90s, introduced me to several related subjects. Little clue why.

    So if I err in my analysis, please feel free to correct me.

      • Sorry about that. I was reacting to this:

        Lopez followed up with another question:

        In the setting of, say, a research seminar presentation, how do you answer the question “Why are you not including p-values in your Results section”?

        Some context for my question: I’m a Ph.D. candidate at a university where most people are still using p-values in the usual ways which you criticize in McShane et al. (2017). I have trouble answering the question above in a way that doesn’t derail the entire discussion. Recently, I’ve discovered that the most effective way to avoid a long—and sometimes counterproductive—discussion on the topic is to appeal to authority by saying I’m following the ASA guidelines. This has become my go-to, 30-second answer.

        —————

        I found references to McShane et al. on Facebook and on two statistics blogs. Responding on Facebook is a bit unwieldy since I’m new to it. The others aren’t yielding much discussion. So I was expressing relief that Andrew posted the link to McShane here.

        Re: script & cypher. Just some creative projects about p-values I’ve worked on.

  3. I am a first-year grad student in the social sciences, and have been reading your blog since last semester. One tendency I’ve noticed when you identify issues in research: in most cases you publish the individuals’ articles and cite names, and in a select few cases (as here) the work seems anonymized. This is coupled with a tendency either to deride those authors for bad methods (yes, it is bad science!) or to forgive them as possibly missing the whole point (again, as done here).

    You clearly have elite training that affords these critiques, so I’m inclined to believe many of the cases you cite fall into the latter basket: individuals who maybe did not receive sufficient elite training (or renown) to even consider, never mind levy, these assessments. In this light, whom do you help with derisive language, as in the majority of methods-critiquing posts? Would it not be more constructive (and better science, by extension) to discuss their shortcomings more mechanically than to lambast authors for poor work? I ask only to update my own priors and decide whether to continue reading the entirety of these kinds of posts on your blog.

    • T:

      There are something like a million scientific papers published each year. Maybe half of them have serious, fatal errors. The goal in posting on these is not to help each individual researcher but rather to develop a more general understanding of statistical problems that commonly arise, along with the processes by which bad work can get recognized as problematic (or, in some cases, processes by which bad work gets publicized without recognition of its failures).

      Whether we discuss details, such as the titles of articles and names of authors, depends on our immediate goal. Sometimes there is a project that has public policy implications and where a particular author has been active in promoting a point of view; this was the case with Richard Tol and his paper on the economic consequences of global warming. Other times a particular researcher is personally active in promoting a line of work; this was the case with Satoshi Kanazawa and his papers on sex ratio. Other times a paper or set of papers is already in the public eye; this was the case with Daryl Bem’s experiments on ESP. Other times there is a particular dispute within a field of science, as in the birdsong research questions that we discussed in this space a few years ago. Other times the problem involves a statistical issue that is new or interesting. The paper discussed in the above post was none of these: it was a somewhat obscure paper with the sort of error that we had discussed many times before. So in this case I thought the details would be a distraction, and I was making a more general point that we often see this problem; as I said, it’s routine cargo-cult science. You might consider this derisive language; I think we currently have the opposite problem, that just about anything published in a scientific journal is considered, by default, to be high-quality science.

      I recognize that different people have different priorities. Given that you have more of an interest in more mechanically discussing shortcomings of work, you should feel free to write a blog with posts in this direction, or comment on this blog with discussions of particular mechanical shortcomings. I think it would be kind of boring and pointless to do too much of this (except in various cases which are of policy importance or are interesting for some other reason), but that’s just my perspective; you and others should feel free to write with more of your own focus.

    • T:

      Also I think it’s a mistake for you to write of me “forgiving” a researcher. It’s not my role to forgive or not forgive. I have no special role here. All these papers are in the public record, and I think it’s completely appropriate for you or me or anyone else to make public comments about the public record. The problem arises when people publish with no recourse to replies. In this case, people can feel free to reply in comments.

  4. Dr. Gelman publicizes the names of researchers to witch hunt only when their results disagree with his beliefs or when they are just downright silly. This is obvious to observant readers. Just look at the names he dropped above: scientists whose results support conservative agendas and scientists whose work is just downright silly, implying that research on biological racial and sexual differences is just as fallacious as ESP.

    Ask yourself: In Dr. Gelman’s list of critiques, who among them is a researcher whose work bolsters leftist agendas?

    P.S. If this comment is successfully posted, it supports my idea that Dr. Gelman is doing this unintentionally, as a sort of ideological blindness rather than deliberately pushing his politics upon his readership.

    • Daon:

      1. There is no “witch hunting.” I’m baffled as to the attitude that there is something wrong with commenting negatively on published papers. The whole point of publication is to get the ideas out there for comment, positive and negative.

      2. You may think that the work we’ve discussed on himmicanes, air rage, beauty and sex ratio, embodied cognition, etc., is “just downright silly,” but a lot of people—including serious scientific journals and major media organizations—don’t. So I think there is some value in figuring out what went wrong in these sorts of examples, in the science, in the statistics, and in the subsequent publicity and discussion.

      3. Just for example, here.

      • Glad to see your response. I’ll set aside the impression I’ve acquired from your blog to this point and see if I can view things more objectively, which is the standpoint you claim to hold for your own content. Just know that I didn’t come into reading your blog expecting any slant; it developed naturally over the course of daily reading.

    • “P.S. If this comment is successfully posted, it supports my idea that Dr. Gelman is doing this unintentionally, as a sort of ideological blindness rather than deliberately pushing his politics upon his readership.”

      You could also view your comment being posted as a sign that someone is willing to engage in discussion and/or is not filtering certain comments.

      I think neither is a given on scientific blogs (which may perhaps explain your idea), and I think both can be seen as very important for scientific discussions.

    • “If this comment is successfully posted …”

      If this comment is successfully posted, it means that the comment met the requirements for posting.

      Also, had it not, were you going to make an inference from a denied antecedent?

      • I’ve had comments denied in the past. I don’t know Dr. Gelman’s criteria for censoring comments, but I’m glad to know it’s not related to an ideological agenda.

        • Verial:

          The spam filter sometimes automatically removes comments with spammy links. Also, sometimes when I’m going through comments, I notice comments with spammy links and I put them in the spam folder. In the future it would be better to make sure you have no spam-like links in your URL.
