[Note to busy readers: If you’re sick of power pose, there’s still something of general interest in this post; scroll down to the section on the time-reversal heuristic. I really like that idea.]
Someone pointed me to this discussion on Facebook in which Amy Cuddy expresses displeasure with my recent criticism (with Kaiser Fung) of her claims regarding the “power pose” research of Cuddy, Carney, and Yap (see also this post from yesterday). Here’s Cuddy:
This is sickening and, ironically, such an extreme overreach. First, we *published* a response, in Psych Science, to the Ranehill et al conceptual (not direct) replication, which varied methodologically in about a dozen ways — some of which were enormous, such as having people hold the poses for 6 instead of 2 minutes, which is very uncomfortable (and note that even so, somehow people missed that they STILL replicated the effects on feelings of power). So yes, I did respond to the peer-reviewed paper. The fact that Gelman is referring to a non-peer-reviewed blog, which uses a new statistical approach that we now know has all kinds of problems, as the basis of his article is the WORST form of scientific overreach. And I am certainly not obligated to respond to a personal blog. That does not mean I have not closely inspected their analyses. In fact, I have, and they are flat-out wrong. Their analyses are riddled with mistakes, not fully inclusive of all the relevant literature and p-values, and the “correct” analysis shows clear evidential value for the feedback effects of posture. I’ve been quiet and polite long enough.
There’s a difference between having your ideas challenged in constructive way, which is how it used in to be in academia, and attacked in a destructive way. My “popularity” is not relevant. I’m tired of being bullied, and yes, that’s what it is. If you could see what goes on behind the scenes, you’d be sickened.
I will respond here but first let me get a couple things out of the way:
1. Just about nobody likes to be criticized. As Kaiser and I noted in our article, Cuddy’s been getting lots of positive press but she’s had some serious criticisms too, and not just from us. Most notably, Eva Ranehill, Anna Dreber, Magnus Johannesson, Susanne Leiberg, Sunhae Sul, and Roberto Weber published a paper last year in which they tried and failed to replicate the results of Cuddy, Carney, and Yap, concluding “we found no significant effect of power posing on hormonal levels or in any of the three behavioral tasks.” Shortly after, the respected psychology researchers Joe Simmons and Uri Simonsohn published on their blog an evaluation and literature review, writing that “either power-posing overall has no effect, or the effect is too small for the existing samples to have meaningfully studied it” and concluding:
While the simplest explanation is that all studied effects are zero, it may be that one or two of them are real (any more and we would see a right-skewed p-curve). However, at this point the evidence for the basic effect seems too fragile to search for moderators or to advocate for people to engage in power posing to better their lives.
OK, so I get this. You work hard on your research, you find something statistically significant, you get it published in a top journal, you want to draw a line under it and move on. For outsiders to go and question your claim . . . that would be like someone arguing a call in last year’s Super Bowl. The game’s over, man! Time to move on.
So I see how Cuddy can find this criticism frustrating, especially given her success with the TED talk, the CBS story, the book publication, and so forth.
2. Cuddy writes, “If you could see what goes on behind the scenes, you’d be sickened.” That might be so. I have no idea what goes on behind the scenes.
OK, now on to the discussion
The short story to me is that Cuddy, Carney, and Yap found statistical significance in a small sample, non-preregistered study with a flexible hypothesis (that is, a scientific hypothesis that posture could affect performance, which can map on to many many different data patterns). We already know to watch out for such claims, and in this case a large follow-up study by an outside team did not find a positive effect. Meanwhile, Simmons and Simonsohn analyzed some of the published literature on power pose and found it to be consistent with no effect.
At this point, a natural conclusion is that the existing study by Cuddy et al. was too noisy to reveal much of anything about whatever effects there might be of posture on performance.
This is not the only conclusion one might draw, though. Cuddy draws a different conclusion, which is that her study did find a real effect and that the replication by Ranehill et al. was done under different, less favorable conditions, for which the effect disappeared.
This could be. As Kaiser and I wrote, “This is not to say that the power pose effect can’t be real. It could be real and it could go in either direction.” We question on statistical grounds the strength of the evidence offered by Cuddy et al. And there is also the question of whether a lab result in this area, if it were real, would generalize to the real world.
What frustrates me is that Cuddy in all her responses doesn’t seem to even consider the possibility that the statistically significant pattern they found might mean nothing at all, that it might be an artifact of a noisy sample. It’s happened before: remember Daryl Bem? Remember Satoshi Kanazawa? Remember the ovulation-and-voting researchers? The embodied cognition experiment? The 50 shades of gray? It happens all the time! How can Cuddy be so sure it hasn’t happened to her? I’d say this even before the unsuccessful replication from Ranehill et al.
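To make the "artifact of a noisy sample" point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not a description of the Cuddy, Carney, and Yap study: 20 people per group, a true effect of exactly zero, and five independent outcome measures standing in (crudely) for the many data patterns a flexible hypothesis can map onto. Real forking paths are subtler than running five separate tests, so take this as a rough picture only.

```python
import random
from statistics import NormalDist, fmean


def study_finds_something(n_per_group, n_outcomes, rng, alpha=0.05):
    """Simulate one study with NO true effect.

    Returns True if at least one outcome comes out 'statistically significant'.
    """
    # Both groups are drawn from N(0, 1), so the standard error of the
    # difference in means is known exactly (an idealized z-test).
    se = (2 / n_per_group) ** 0.5
    for _ in range(n_outcomes):
        treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:
            return True
    return False


def false_positive_rate(n_outcomes, n_sims=3000, n_per_group=20, seed=1):
    """Share of simulated null studies reporting at least one significant result."""
    rng = random.Random(seed)  # hypothetical seed, fixed for reproducibility
    hits = sum(
        study_finds_something(n_per_group, n_outcomes, rng)
        for _ in range(n_sims)
    )
    return hits / n_sims


rate_one = false_positive_rate(n_outcomes=1)
rate_five = false_positive_rate(n_outcomes=5)
print(f"one outcome:   {rate_one:.3f}")
print(f"five outcomes: {rate_five:.3f}")
```

With a single prespecified outcome the rate should sit near the nominal 5%. With five independent outcomes and the freedom to report whichever one works, theory gives 1 - 0.95^5, or about 23%, and that is with nothing going on at all.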
Response to some specific points
“Sickening,” huh? So, according to Cuddy, her publication is so strong it’s worth a book and promotion in NYT, NPR, CBS, TED, etc. But Ranehill et al.’s paper, that somehow has a lower status, I guess because it was published later? So it’s “sickening” for us to express doubt about Cuddy’s claim, but not “sickening” for her to question the relevance of the work by Ranehill et al.? And Simmons and Simonsohn’s blog, that’s no good because it’s a blog, not a peer reviewed publication. Where does this put Daryl Bem’s work on ESP or that “bible code” paper from a couple decades ago? Maybe we shouldn’t be criticizing them, either?
It’s not clear to me how Simmons, Simonsohn, and I are “bullying” Cuddy. Is it bullying to say that we aren’t convinced by her paper? Are Ranehill, Dreber, etc. “bullying” her too, by reporting a non-replication? Or is that not bullying because it’s in a peer-reviewed journal?
When a published researcher such as Cuddy equates “I don’t believe your claims” with “bullying,” that to me is a problem. And, yes, the popularity of Cuddy’s work is indeed relevant. There’s lots of shaky research that gets published every year and we don’t have time to look into all of it. But when something is so popular and is promoted so heavily, then, yes, it’s worth a look.
Also, Cuddy writes that “somehow people missed that they STILL replicated the effects on feelings of power.” But people did not miss this at all! Here’s Simmons and Simonsohn:
In the replication, power posing affected self-reported power (the manipulation check), but did not impact behavior or hormonal levels. The key point of the TED Talk, that power poses “can significantly change the outcomes of your life”, was not supported.
In any case, it’s amusing that someone who’s based an entire book on an experiment that was not successfully replicated is writing about “extreme overreach.” As I’ve written several times now, I’m open to the possibility that power pose works, but skepticism seems to me to be eminently reasonable, given the evidence currently available.
In the meantime, no, I don’t think that referring to a non-peer-reviewed blog is “the worst form of scientific overreach.” I plan to continue to read and refer to the blog of Simonsohn and his colleagues. I think they do careful work. I don’t agree with everything they write—but, then again, I don’t agree with everything that is published in Psychological Science, either. Simonsohn et al. explain their reasoning carefully and they give their sources.
I have no interest in getting into a fight with Amy Cuddy. She’s making a scientific claim and I don’t think the evidence is as strong as she’s claiming. I’m also interested in how certain media outlets take her claims on faith. That’s all. Nothing sickening, no extreme overreach, just a claim on my part that, once again, a researcher is being misled by the process in which statistical significance, followed by publication in a major journal, is taken as an assurance of truth.
The time-reversal heuristic
One helpful (I think) way to think about this episode is to turn things around. Suppose the Ranehill et al. experiment, with its null finding, had come first. A large study finding no effect. And then Cuddy et al. had run a replication under slightly different conditions with a much smaller sample size and found statistical significance under non-preregistered conditions. Would we be inclined to believe it? I don’t think so. At the very least, we’d have to conclude that any power-pose effect is fragile.
From this point of view, what Cuddy et al.’s research has going for it is that (a) they found statistical significance, (b) their paper was published in a peer-reviewed journal, and (c) their paper came before, rather than after, the Ranehill et al. paper. I don’t find these pieces of evidence very persuasive. (a) Statistical significance doesn’t mean much in the absence of preregistration or something like it, (b) lots of mistakes get published in peer-reviewed journals, to the extent that the phrase “Psychological Science” has become a bit of a punch line, and (c) I don’t see why we should take Cuddy et al. as the starting point in our discussion, just because it was published first.
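One way to see why publication order shouldn’t matter is a small sketch of precision-weighted pooling. The numbers are made up for illustration (they are not the estimates from either paper), and the calculation assumes both studies are estimating the same underlying effect, which is exactly what Cuddy disputes. Under that assumption, the combined estimate is the same whichever study you treat as the starting point.

```python
import math


def combine(first, second):
    """Precision-weighted combination of two (estimate, standard_error) pairs."""
    (m1, s1), (m2, s2) = first, second
    w1, w2 = 1 / s1**2, 1 / s2**2  # weights are inverse variances
    mean = (w1 * m1 + w2 * m2) / (w1 + w2)
    se = math.sqrt(1 / (w1 + w2))
    return mean, se


# Hypothetical numbers: a small, noisy study that reaches significance
# (estimate is 2 standard errors from zero) and a larger null study.
small_study = (0.6, 0.3)
large_study = (0.0, 0.1)

actual_order = combine(small_study, large_study)    # small study came first
reversed_order = combine(large_study, small_study)  # time-reversed

print("actual order:  ", actual_order)
print("reversed order:", reversed_order)
```

In both orders the pooled estimate is about 0.06 with a standard error of about 0.095, well under one standard error from zero. The arithmetic has no notion of which study was published first; only our intuitions do.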
I don’t see any of this changing Cuddy’s mind. And I have no idea what Carney and Yap think of all this; they’re coauthors of the original paper but don’t seem to have come up much in the subsequent discussion. I certainly don’t think of Cuddy as any more of an authority on this topic than are Eva Ranehill, Anna Dreber, etc.
And I’m guessing it would take a lot to shake the certainty expressed on the matter by team TED. But maybe people will think twice when the next such study makes its way through the publicity mill?
And, for those of you who can’t get enough of power pose, I just learned that the journal Comprehensive Results in Social Psychology, “the preregistration-only journal for social psychology,” will be having a special issue devoted to replications of power pose! Publication is expected in fall 2016. So you can expect some more blogging on this topic in a few months.
The potential power of self-help
What about the customers of power pose, the people who might buy Amy Cuddy’s book, follow its advice, and change their life? Maybe Cuddy’s advice is just fine, in which case I hope it helps lots of people. It’s perfectly reasonable to give solid, useful advice without any direct empirical backing. I give advice all the time without there being any scientific study behind it. I recommend writing this way, and teaching that way, and making this and that sort of graph, typically basing my advice on nothing but a bunch of stories. I’m not the best one to judge whether Cuddy’s advice will be useful for its intended audience. But if it is, that’s great and I wish her book every success. The advice could be useful in any case. Even if power pose has null or even negative effects, the net effect of all the advice in the book, informed by Cuddy’s experiences teaching business students and so forth, could be positive.
As I wrote in a comment in yesterday’s thread, consider a slightly different claim: Before an interview you should act confident; you should fold in upon yourself and be coiled and powerful; you should be secure about yourself and be ready to spring into action. It would be easy to imagine an alternative world in which Cuddy et al. found an opposite effect and wrote all about the Power Pose, except that the Power Pose would be described not as an expansive posture but as coiled strength. We’d be hearing about how our best role model is not cartoon Wonder Woman but rather the Lean In of the modern corporate world. Etc. And, the funny thing is, that might be good advice too! As they say in chess, it’s important to have a plan. It’s not good to have no plan. It’s better to have some plan, any plan, especially if you’re willing to adapt that plan in light of events. So it could well be that either of these power pose books—Cuddy’s actual book, or the alternative book, giving the exact opposite posture advice, which might have been written had the data in the Cuddy, Carney, and Yap paper come out different—could be useful to readers.
So I want to separate three issues: (1) the general scientific claim that some manipulation of posture will have some effects, (2) the specific claim that the particular poses recommended by Cuddy et al. will have the specific effects claimed in their paper, and (3) possible social benefits from Cuddy’s TED talk and book. Claim (1) is uncontroversial, claim (2) is suspect (both from the failed replication and from consideration of statistical noise in the original study), and item (3) is a completely different issue, which is why I wouldn’t want to argue with claims that the talk and the book have helped people.
P.P.S. You might also want to take a look at this post by Uri Simonsohn, who goes into detail on a different example of a published and much-cited result from psychology that did not replicate. Long story short: forking paths mean that it’s possible to get statistical significance from noise, and they also mean that you can keep finding confirmation by doing new studies and postulating new interactions to explain whatever you find. When an independent replication fails, it doesn’t necessarily mean that the original study found something and the replication didn’t; it can mean that the original study was capitalizing on noise. Again, consider the time-reversal heuristic: Pretend that the unsuccessful replication came first, then ask what you would think if a new study happened to find a statistically significant interaction somewhere.
P.P.P.S. More here from Ranehill and Dreber. I don’t know if Cuddy would consider this as bullying. On one hand, it’s a blog comment, so it’s not like it has been subject to the stringent peer review of Psych Science, PPNAS, etc, ha ha; on the other hand, Ranehill and Dreber do point to some published work:
Finally, we would also like to raise another important point that is often overlooked in discussions of the reliability of Carney et al.’s results, and also absent in the current debate. This issue is raised in Stanton’s earlier commentary to Carney et al. , published in the peer-reviewed journal Frontiers in Behavioral Neuroscience (available here http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3057631/). Apart from pointing out a few statistical issues with the original article, such as collapsing the hormonal analysis over gender, or not providing information on the use of contraceptives, Stanton (footnote 2) points out an inconsistency between the mean change in cortisol reported by Carney et al. in the text, and those displayed in Figure 3, depicting the study’s main hormonal results. Put succinctly, the reported hormone numbers in Carney, et al., “don’t add up.” Thus, it seems that not even the original article presents consistent evidence of the hormonal changes associated with power poses. To our knowledge, Carney, et al., have never provided an explanation for these inconsistencies in the published results.
From the standpoint of studying hormones and behavior, this is all interesting and potentially important. Or we can just think of this generically as some more forks in the path.