“Menstrual Cycle Phase Does Not Predict Political Conservatism”

Someone pointed me to this article by Isabel Scott and Nicholas Pound:

Recent authors have reported a relationship between women’s fertility status, as indexed by menstrual cycle phase, and conservatism in moral, social and political values. We conducted a survey to test for the existence of a relationship between menstrual cycle day and conservatism.

2213 women reporting regular menstrual cycles provided data about their political views. Of these women, 2208 provided information about their cycle date . . . We also recorded relationship status, which has been reported to interact with menstrual cycle phase in determining political preferences.

We found no evidence of a relationship between estimated cyclical fertility changes and conservatism, and no evidence of an interaction between relationship status and cyclical fertility in determining political attitudes. . . .

I have no problem with the authors’ substantive findings. And they get an extra bonus for not labeling day 6 as high conception risk:

[Figure: journal.pone.0112042.g001]

Seeing this clearly-sourced graph makes me annoyed one more time at those psychology researchers who refused to acknowledge that, in a paper all about peak fertility, they’d used the wrong dates for peak fertility. So, good on Scott and Pound for getting this one right.

There’s one thing that does bother me about their paper, though, and that’s how they characterize the relation of their study to earlier work such as the notorious paper by Durante et al.

Scott and Pound write:

Our results are therefore difficult to reconcile with those of Durante et al, particularly since we attempted the analyses using a range of approaches and exclusion criteria, including tests similar to those used by Durante et al, and our results were similar under all of them.

Huh? Why “difficult to reconcile”? The reconciliation seems obvious to me: There’s no evidence of anything going on here. Durante et al. had a small noisy dataset and went all garden-of-forking-paths on it. And they found a statistically significant comparison in one of their interactions. No news here.
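To see why one statistically significant interaction in a small dataset is no news, here is a rough simulation sketch. It is a crude stand-in for the garden of forking paths, not a reconstruction of anything Durante et al. did: it assumes pure noise, treats each candidate analysis as an independent two-group comparison with known variance, and uses made-up values (30 people per group, 8 candidate comparisons). The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean

STD = NormalDist()  # standard normal, used for two-sided p-values


def min_p_over_paths(rng, n_per_group=30, n_paths=8):
    """Smallest p-value among n_paths comparisons when there is no true effect.

    Assumption: each "path" is a fresh two-group comparison of unit-variance
    noise, tested with a z-test. Real forking paths reuse the same data.
    """
    p_values = []
    for _ in range(n_paths):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(a) - mean(b)) / (2 / n_per_group) ** 0.5
        p_values.append(2 * (1 - STD.cdf(abs(z))))
    return min(p_values)


def share_with_a_finding(n_studies=2000, n_paths=8, seed=1):
    """Fraction of pure-noise studies with at least one p < 0.05."""
    rng = random.Random(seed)
    hits = sum(
        min_p_over_paths(rng, n_paths=n_paths) < 0.05 for _ in range(n_studies)
    )
    return hits / n_studies


print("one pre-specified comparison:", share_with_a_finding(n_paths=1))
print("best of eight comparisons:   ", share_with_a_finding(n_paths=8))
```

Under these assumptions, about 1 - 0.95^8, or roughly a third, of pure-noise studies turn up at least one p < 0.05. Real forking paths reuse the same data, so the comparisons are correlated and the exact rate would differ.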

Scott and Pound continue:

Lack of statistical power does not seem a likely explanation for the discrepancy between our results and those reported in Durante et al, since even after the most restrictive exclusion criteria were applied, we retained a sample large enough to detect a moderate effect . . .

Again, I feel like I’m missing something. “Lack of statistical power” is exactly what was going on with Durante et al.; indeed, their example was the “Jerry West” of our “power = .06” graph:

[Figure: the “power = .06” graph]
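For a sense of what power near .06 implies, here is a minimal sketch. The inputs are assumptions chosen only to land at that power level (a true effect of 2 units measured with a standard error of 8.1); they are not taken from Scott and Pound's paper, and the function names are hypothetical.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal


def power_and_type_s(true_effect, se, alpha=0.05):
    """Return (power, probability a significant estimate has the wrong sign).

    Assumes a normally distributed estimate and true_effect > 0.
    """
    z_crit = STD.inv_cdf(1 - alpha / 2)
    shift = true_effect / se
    p_right_sign = 1 - STD.cdf(z_crit - shift)  # significant and positive
    p_wrong_sign = STD.cdf(-z_crit - shift)     # significant and negative
    power = p_right_sign + p_wrong_sign
    return power, p_wrong_sign / power


def exaggeration_ratio(true_effect, se, alpha=0.05, n_sims=200_000, seed=0):
    """Average |estimate| / true_effect among significant estimates (simulated)."""
    rng = random.Random(seed)
    cutoff = STD.inv_cdf(1 - alpha / 2) * se
    draws = (rng.gauss(true_effect, se) for _ in range(n_sims))
    significant = [abs(est) for est in draws if abs(est) > cutoff]
    return sum(significant) / len(significant) / true_effect


# Illustrative, assumed inputs: true effect 2, standard error 8.1.
power, type_s = power_and_type_s(2.0, 8.1)
print("power:", round(power, 3))
print("wrong-sign share among significant results:", round(type_s, 3))
print("exaggeration ratio:", round(exaggeration_ratio(2.0, 8.1), 1))
```

With these assumed inputs, power comes out near 0.06, a significant estimate has the wrong sign roughly a quarter of the time, and significant estimates overstate the true effect by well over a factor of eight on average. That is why a significant result from such a study tells you so little.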

Scott and Pound continue:

One factor that may partially explain the discrepancy is our different approaches to measuring conservatism and how the relevant questions were framed. . . . However, these methodological differences seem unlikely to fully explain the discrepancy between our results . . . One further possibility is that differences in responses to our survey and the other surveys discussed here are attributable to variation in the samples surveyed. . . .

Sure, but aren’t you ignoring the elephant in the room? Why is there any discrepancy to explain? Why not at least raise the possibility that those earlier publications were just examples of the much-documented human ability to read patterns in noise?

I suspect that Scott and Pound have considered this explanation but felt it would be politic not to explicitly suggest it in their paper.

P.S. The above graph is a rare example of a double-y-axis plot that isn’t so bad. But the left axis should have a lower bound at 0: it’s not possible for conception risk to be negative!

24 thoughts on “Menstrual Cycle Phase Does Not Predict Political Conservatism”

  1. By not being able to explain the discrepancy with problems of the replication study or “theoretically interesting” measurement differences, they are showing that the non-replication is likely due to low power, etc., of the original study. It is a rhetorical device to convince those skeptical of the replication.

    • Mb:

      Yes, that makes sense. I just wonder whether such rhetorical devices are a good idea. The question is whether it makes sense to say what you really think, or whether it’s better to understate to make a more bulletproof argument. This has come up occasionally in blog comments: I’ll say XYZ and a commenter will say I should’ve just said XY or even just X because that would make my case stronger. My usual reply is that I’m not trying to make a case, I’m just trying to share my understanding of the problem. But I know that’s not the only game.

  2. Let me guess. The graph shows a pretty linear, straight-line, no slope relationship between the two variables. Papers (not this one) that claimed something was going on were at the edge of p=0.05.

    Amiright?

    • The whole genre is so crazy because all the papers take individual data on one day, adjust everyone to an artificial 28-day cycle, ignore individual variation in mean time from start of cycle to ovulation, and make assumptions about the accuracy of recall of the first day of the prior cycle (“74% accurate to within 1 day, and 81% to within 2 days”). It seems to me there is a lot of room between doing blood testing (which they all say is too expensive) and collecting repeated measures data on the question of interest. Andrew, you might be interested in this abstract’s reference to the day 6 question: http://www.bmj.com/content/321/7271/1259 (actually the whole thing is not that long).

      • That was a useful paper. The truth is cycles vary in length from person to person and time to time, ovulation isn’t that predictable, and memory isn’t that good… so the noise in measurement of cycle phase is pretty dang high. Because they discuss it as “cycle phase” it isn’t even actually getting at what is really desired in most of these studies, which is probably hormone concentration.

        • note, slightly confusing post, “it isn’t even actually getting at” was supposed to mean “studies that use cycle phase to relate to …. (color of shirt, political attitude whatever) aren’t really getting at the right concept even if they do fanciness to try to fix the “phase” measurement”

    • Yes, it is tact and snark, but not “just” snark. I think this is the tip of the iceberg that Andrew is really objecting to. Having to play the game of not exposing the bad analysis of established scholars is part of why things change so slowly (or don’t change at all). By all means, we must keep everybody’s reputation intact (otherwise, our errors may come to haunt us, or perhaps somebody will be vengeful). The effect is that errors and misrepresentations do not seriously jeopardize anybody’s career. And, then, the incentive to correct, retract, not overstate, is muted. I really can’t fault these authors for doing what everybody else does (and, in fact, what I might do in their position). But the system that makes it necessary for them to be “tactful” should be deservedly trashed.

      • I think this is too strong.

        First, everyone makes errors, and researchers shouldn’t necessarily be punished for being wrong. We are all wrong all the time. We are just trying to get less wrong.

        Second, are you really arguing that we should be less polite to each other? Open and honest debate, absolutely, but that doesn’t mean we should drop “tact”.

        I totally agree that we lack important incentives to correct, retract and to write with the kind of humility appropriate to empirical investigations in the social sciences. But I don’t think that means we need to be jerks by default to people we disagree with. This paper is clearly superior to the previous attempts to answer this question. I don’t see how explicitly shaming other researchers would improve it.

        I also think that, in general, having several papers out that come to different conclusions isn’t necessarily bad for the field. One of my favorites is that two researchers I like have recently done papers on the effects of child wages on child labor (you know, fun stuff). They come to almost exactly opposite conclusions, for the same country, at the same time period, using two different sources of variation in wages and two different datasets. I have no idea which one is more right, but I think both papers (and the field) would benefit from constructively engaging the analyses and arguments of the other one. I think both papers would be worse if they tried to bash the other one.

        All that said, I do understand the frustration that comes from watching other researchers essentially game the system by not adhering to the standards of fairness and honesty in debate. But when you do an analysis that is clearly superior, and you prove your point empirically, I’m not sure you have to also go out of your way to say “And look at how wrong these other idiots were.” You can just let your work speak for itself.

        • I think saying “previous studies which showed an effect were probably in error” is still tactful, but taking tact to the level of failing to actually state what your data supports… I think that harms the science and exceeds the level of tact that society really needs. Wasn’t there some kind of “tact” involved in a recent airline crash in the Bay Area (a few years back maybe)? Where a copilot was too tactful to actually say to the pilot that the approach was wrong? I remember reading something about that.

    • Agreed. I’m not American (Canadian instead) and in many cases something like “difficult to reconcile” if used here would be utter condemnation.

  3. This is a paper with a “negative result” — that is, it reports finding no evidence for the effect it claims to be seeking. Such papers are notoriously hard to publish, so its acceptance probably depends heavily on the existence of the earlier paper. In fact, it’s hard to see anyone taking this thesis seriously as something that needs to be disproven without the earlier paper. I think this is something of a confounding factor in the question of how direct the authors should be in trashing the earlier work.
