Beyond “power pose”: Using replication failures and a better understanding of data collection and analysis to do better science

So. A bunch of people pointed me to a New York Times article by Susan Dominus about Amy Cuddy, the psychology researcher and Ted-talk star famous for the following claim (made in a paper written with Dana Carney and Andy Yap and published in 2010):

That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.

Awkwardly enough, no support for that particular high-stakes claim was ever presented in the journal article where it appeared. And, even more awkwardly, key specific claims for which the paper did offer some empirical evidence failed to show up in a series of external replication studies, first by Ranehill et al. in 2015 and then more recently by various other research teams (see, for example, here). Following up on the Ranehill et al. paper was an analysis by Joe Simmons and Uri Simonsohn explaining how Carney, Cuddy, and Yap could’ve gotten it wrong in the first place. Also awkward was a full retraction by first author Dana Carney, who detailed many ways in which the data were handled in order to pull out apparently statistically significant findings.

Anyway, that’s all background. I think Dominus’s article is fair [No, upon reflection, I don’t think the article was fair, as it places, without rebuttal, misrepresentations of my work and that of Dana Carney — AG], given the inevitable space limitations. I wouldn’t’ve chosen to write an article about Amy Cuddy—I think Eva Ranehill or Uri Simonsohn would be much more interesting subjects. But, conditional on the article being written largely from Cuddy’s perspective, I think it portrays the rest of us in a reasonable way [actually no, I don’t think so. — AG]. As I said to Dominus when she interviewed me, I don’t have any personal animosity toward Cuddy. I just think it’s too bad that the Carney/Cuddy/Yap paper got all that publicity and that Cuddy got herself tangled up in defending it. It’s admirable that Carney just walked away from it all. And it’s probably a good call by Yap to have pretty much avoided any further involvement in the matter.

The only thing that really bugged me about the NYT article is when Cuddy is quoted as saying, “Why not help social psychologists instead of attacking them on your blog?” and there is no quoted response from me. I remember this came up when Dominus interviewed me for the story, and I responded right away that I have helped social psychologists! A lot. I’ve given many talks during the past few years to psychology departments and at professional meetings, and I’ve published several papers in psychology and related fields on how to do better applied research, for example here, here, here, here, here, here, here, and here. I even wrote an article, with Hilda Geurts, for The Clinical Neuropsychologist! So, yeah, I do spend some time helping social psychologists.

Dominus also writes, “Gelman considers himself someone who is doing others the favor of pointing out their errors, a service for which he would be grateful, he says.” This too is accurate, and let me also emphasize that this is a service for which I not only would be grateful: I actually am grateful when people point out my errors. It’s happened several times; see for example here. When we do science, we can make mistakes. That’s fine. What’s important is to learn from our mistakes.

In summary, I think Dominus’s article was fair, but I do wish she hadn’t let that particular false implication by Cuddy, the claim that I didn’t help social psychologists, go unchallenged. Then again, I also don’t like it that Cuddy baselessly attacked the work of Simmons and Simonsohn and to my knowledge never has apologized for that. (I’m thinking of Cuddy’s statement, quoted here, that Simmons and Simonsohn “are flat-out wrong. Their analyses are riddled with mistakes . . .” I never saw Cuddy present any evidence for these claims.)

Good people can do bad science. Indeed, if you have bad data you’ll do bad science (or, at best, report null findings), no matter how good a person you are.

Let me continue by saying something I’ve said before, which is that being a scientist, and being a good person, does not necessarily mean that you’re doing good science. I don’t know Cuddy personally, but given everything I’ve read, I imagine that she’s a kind, thoughtful, and charming person. I’ve heard that Daryl Bem is a nice guy too, and I expect Satoshi Kanazawa has many fine qualities as well. In any case, it’s not my job to judge these people, nor is it their job to judge me. A few hundred years ago, I expect there were some wonderful, thoughtful, intelligent, good people doing astrology. That doesn’t mean that they were doing good science!

If your measurements are too noisy (again, see here for details), it doesn’t matter how good a person you are, you won’t be able to use your data to make replicable predictions of the world or evaluate your theories: You won’t be able to do empirical science.
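
To make that concrete, here is a minimal simulation sketch (my addition; the effect size, noise level, and sample size are invented for illustration, not taken from any power-pose study): when the true effect is small relative to the measurement noise and the sample is small, most studies find nothing, and the few that do cross the significance threshold report estimates that greatly exaggerate the true effect and sometimes even get its sign wrong.

import numpy as np
# A minimal sketch with invented numbers: a small true effect, a lot of measurement
# noise, and a small sample, the combination this post is warning about.
rng = np.random.default_rng(0)
true_effect = 0.1       # hypothetical small true effect
noise_sd = 1.0          # hypothetical measurement noise, much larger than the effect
n_per_group = 20        # hypothetical small study
n_sims = 10_000
signif = []
for _ in range(n_sims):
    treated = rng.normal(true_effect, noise_sd, n_per_group)
    control = rng.normal(0.0, noise_sd, n_per_group)
    diff = treated.mean() - control.mean()
    se = np.sqrt(treated.var(ddof=1) / n_per_group + control.var(ddof=1) / n_per_group)
    if abs(diff / se) > 1.96:   # crude two-sided test at the 5% level
        signif.append(diff)
signif = np.array(signif)
print("power:", len(signif) / n_sims)                              # only a few percent
print("mean |estimate| when significant:", np.abs(signif).mean())  # several times 0.1
print("wrong-sign share among significant results:",
      (np.sign(signif) != np.sign(true_effect)).mean())

This is the sense in which noisy measurements defeat even the most careful and honest analysis: the statistical-significance filter mostly selects the flukes.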

Conversely, if Eva Ranehill, or Uri Simonsohn, or I, or anyone else, performs a replication (and don’t forget the time-reversal heuristic) or analyzes your experimental protocol or looks carefully at your data and finds that your data are too noisy for you to learn anything useful, then they may be saying you’re doing bad science, but they’re not saying you’re a bad person.

As the subtitle of Dominus’s excellent article says, “suddenly, the rules changed.” It happened over several years, but it really did feel like something sudden. And, yes, Carney, Cuddy, and Yap ideally should’ve known back in 2010 that they were chasing patterns in noise. But they, like many others, didn’t. They, and we, were fortunate to have Ranehill et al. reveal some problems in their study with the failed replication. And they, and we, were fortunate to have Simmons, Simonsohn, and others explain in more detail how they could’ve gotten things wrong. Through this and other examples of failed studies (most notably Bem’s ESP paper, but also the hopelessly flawed work of Kanazawa and many others), and through lots of work by psychologists such as Nosek, we are developing a better understanding of how to do research on unstable, context-dependent human phenomena. There’s no reason to think of the authors of those fatally flawed papers as being bad people. We learn, individually and collectively, from our mistakes. We’re all part of the process, and Dominus is doing the readers of the New York Times a favor by revealing one part of that process from the inside. Instead of the usual journalistic trope of scientist as hero, it’s science as community, including confusion, miscommunication, error, and an understanding that a certain research method that used to be popular and associated with worldly success—the method of trying out some vaguely motivated idea, gathering a bunch of noisy data, and looking for patterns—doesn’t work so well at producing sensible or replicable results. That’s a good thing to know, and it could well be interesting for outsiders to see the missteps it took for us all to get there.

Selection bias in what gets reported

When people make statistical errors, I don’t say “gotcha,” I feel sad. Even when I joke about it, I’m not happy to see the mistakes; indeed, I often blame the statistics profession—including me, as a textbook writer!—for portraying statistical methods as tools for routine discovery: Do the randomization, gather the data, pass statistical significance and collect $200.

Regarding what gets mentioned in the newspapers and in the blogs, there’s some selection bias. A lot of selection bias, actually. Suppose, for example, that Daryl Bem had not made the serious, fatal mistakes that he did in his ESP research. Suppose he’d fit a hierarchical model or done a preregistered replication or used some other procedure to avoid jumping at patterns in noise. That would’ve been great. And then he most likely would’ve found nothing distinguishable from a null effect, no publication in JPSP (no, I don’t think they’d publish the results of a large multi-year study finding no effect for a phenomenon that most psychologists don’t believe exists in the first place), no article on Bem in the NYT . . . indeed, I never would’ve heard of Bem!
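
To see concretely what “jumping at patterns in noise” can buy you, here is a toy simulation (my own sketch, with invented numbers; it is not a reanalysis of anyone’s actual data): even when there is no true effect anywhere, a researcher who checks several outcome measures and reports whichever one “works” will come away with an apparently significant finding a substantial fraction of the time.

import numpy as np
# A toy sketch with invented numbers: no true effect anywhere, but the analyst is
# free to check several candidate outcome measures and report whichever one "works."
rng = np.random.default_rng(1)
n_per_group = 20        # hypothetical sample size per group
n_outcomes = 5          # hypothetical number of candidate outcome measures
n_sims = 10_000
found_something = 0
for _ in range(n_sims):
    for _ in range(n_outcomes):
        a = rng.normal(0.0, 1.0, n_per_group)   # pure noise: the true effect is zero
        b = rng.normal(0.0, 1.0, n_per_group)
        diff = a.mean() - b.mean()
        se = np.sqrt(a.var(ddof=1) / n_per_group + b.var(ddof=1) / n_per_group)
        if abs(diff / se) > 1.96:               # crude two-sided test at the 5% level
            found_something += 1
            break                               # report the first "significant" outcome
print("chance of at least one 'significant' result from pure noise:",
      found_something / n_sims)                 # roughly 1 - 0.95**5, about 0.2

None of this requires bad intent; it is just what happens when the analysis is free to adapt to the data, which is why preregistration, or modeling all the comparisons together, changes the picture.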

Think of the thousands of careful scientists who, for whatever combination of curiosity or personal interests or heterodoxy, decide to study offbeat topics such as ESP or the effect of posture on life success—but who conduct their studies carefully, gathering high-quality data, and using designs and analyses that minimize the chances of being fooled by noise. These researchers will, by and large, quietly find null results, which for very reasonable dog-bite-man reasons will typically be unpublishable, or only publishable in minor journals and will not be likely to inspire lots of news coverage. So we won’t hear about them.

Conversely, I’ll accept the statement that Cuddy in her Ted talks could be inspiring millions of people in a good way, even if power pose does nothing, or even does more harm than good. (I assume it depends on context, that power pose will do more good than harm in some settings, and more harm than good in others). The challenge for Cuddy—and in all seriousness I hope she follows up on this—is to be this inspirational figure, to communicate to those millions, in a way that respects the science. I hope Cuddy can stop insulting Simmons and Simonsohn, forget about the claims of the absolute effects of power pose, and move forward, sending the message that people can help themselves by taking charge of their environment, by embodying who they want to be. The funny thing is, I think that pretty much is the message of that famous Ted talk, and that the message would be stronger without silly, unsupported claims such as “That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.”

A way forward

People criticize Cuddy for hyping her science and making it into a Ted talk. But, paradoxically, I’m now thinking we should be saying the opposite. The Ted talk has a lot going for it: it’s much stronger than the journal articles that purportedly justify and back it up. I have the impression that Cuddy and others think the science of power pose needs to be defended in part because of its role in this larger edifice, but I recommend that Cuddy and her colleagues go the other way: follow the lead of Dana Carney, Eva Ranehill, et al., and abandon the scientific claims, which ultimately were based on an overinterpretation of noise (again, recall the time-reversal heuristic)—and then let the inspirational Ted talk advice fly free of that scientific dead end. There are lots of interesting ways to study how people can help themselves through tools such as posture and visualization, but I think these have to be studied for real, not through crude button-pushing ideas such as power pose but through careful studies on individuals, recognizing that different postures, breathing exercises, yoga moves, etc., will work for different people. Lots of interesting ideas here, and it does these ideas no favor to tie them to some silly paper published in 2010 that happened to get a bunch of publicity. The idea is to take the positive aspects of the work of Cuddy and others—the inspirational message that rings true for millions of people—and to investigate that message using a more modern, data-rich, within-person model of scientific investigation. That’s the sort of thing that should one day be appearing in the pages of Psychological Science.

I think Cuddy has the opportunity to take her fame and her energy and her charm and her good will and her communication skills and her desire to change the world, and use them to move her field in a useful direction. Or not. It’s her call, and she has no obligation to do what I think would be a good idea. I just wanted to emphasize that there’s no reason her career, or even her famous Ted talk, has to rely on a particular intriguing idea (on there being a large and predictable effect of a certain pose) that happened not to work out. And I thank Dominus for getting us all to think about these issues.

P.S. There are a bunch of comments, including from some people who strongly disagree with me—which I appreciate! That is, I appreciate that people who disagree are taking the trouble to share their perspective and engage in discussion here.

There are a lot of details above so maybe it would be a good idea to emphasize some key points:

1. I thought Dominus’s article was excellent and fair to all sides. [Actually, no, I don’t think so. — AG] There were a couple of points that I wish had not been left out, but of course it’s the reporter’s job to write the story the way she thinks is best. By raising these points, I’m not saying Dominus should’ve written her article in a different way; I’m just adding my perspective. In particular, I didn’t want people to have the impression that I don’t help social psychologists, given that I’ve put in a lot of work during the past few years doing just that! In any case, if Simmons or Cuddy or any of the other people mentioned in the article want to share their perspective here too, I’d be happy to post their remarks as separate entries on this blog.

2. I have no desire to act as gatekeeper for scientific research. I think it’s fine that the original Carney, Cuddy, and Yap paper was published, and that Ranehill et al. and other replication studies were published, and that the analysis by Simmons and Simonsohn was published, and so on.

3. I’m not trying to tell Cuddy (or, for that matter, Simmons, Simonsohn, Bem, or anyone else) what to do. I’ll offer suggestions based on my current statistical understanding, and I’ll even publish research papers offering suggestions, as I think that’s part of my job. These suggestions are just that, offered with full awareness that all these researchers are independent agents who are free to follow all, some, or none of my advice.

4. I do disagree with some things that Cuddy and her collaborators have written, for example the quote from the journal article at the top of this post and the quote from Cuddy characterizing the work of Simmons and Simonsohn. When I disagree, I explain why. At the same time, I recognize that what I’m doing is offering my perspective and, again, expressing my disagreement does not represent any attempt to stop others from expressing their views and their reasons for these views.

481 thoughts on “Beyond ‘power pose’: Using replication failures and a better understanding of data collection and analysis to do better science”

  1. Same tired response for being called out about some pseudo-gatekeeper role you’ve assumed/occupied/been seen in inside your field (dangerously so). What a shame! A stifling influence no doubt. I also cannot escape the tone and approach given to Cuddy as it relates to her gender, and the continuous push for her to ‘be useful in her field’ and ‘correct her mistakes’ so that her ‘inspiration’ can drive a clearer path forward. Wow. Who are you to dictate from andrewgelman.com? A sad state for your field and research areas and practices, thanks in large part to you and your approach.

    • Anon:

      I’m not dictating anything! As I wrote above, it’s her call, and Cuddy has no obligation to do what I think would be a good idea.

      Also, I am no gatekeeper. Yes, I’ve criticized journals such as Psychological Science for publishing some bad work, but my real problem is not with that—every journal will publish some bad work, it’s inevitable—but with the occasional unwillingness of some researchers to accept criticism. Carney et al. published their paper, Ranehill et al. and others were motivated to replicate and the replications failed, Simmons and Simonsohn published some useful analysis helping make sense of it all. Everything got published (in journals or blogs). No gatekeeping! My response might well be “tired”—it is after midnight already—but I don’t see the gatekeeping or the stifling. I’d like everyone to publish their results, and their theories, and their raw data, and allow their work to be openly criticized. Open all the gates.

      I also don’t really see the connection to Cuddy being a woman. The comments I’ve given to Cuddy are similar to those I’ve given to Satoshi Kanazawa, Richard Tol, and various other male researchers who, like Cuddy, have unfortunately found themselves in a position defending unsound research.

      I don’t want smart, driven, well-meaning people such as Kanazawa, Tol, Cuddy, Bem, etc., to waste their time and large chunks of their careers trying to come up with theories to explain patterns in noise. All these people, male and female alike, have evident talents and it makes me sad to see those talents wasted in this way. So of course if I have some ideas of how they can do better, I will share these ideas. I can’t fully follow your comment but it seems that you think it’s a bad thing that I’m hoping these researchers can use their talents to advance science, to “drive a clearer path forward”? I think that’s what science is all about, for men and women both. I give the same advice to myself.

    • Could not agree more. Nice to have some light shone on the crowd of internet trolls masquerading as saviors of science conveniently from the comfort of their computers.

      • Hmm, I feel pretty comfortable, but I’m willing to bet that it feels even more comfortable to sit behind one’s computer with a tenure/full professorship/editor position in a prestigious journal, and to tell people to stop pointing out methodological errors.

    • Anonymous: your response is a picture perfect, textbook example of hypocrisy. You are criticizing Gelman for being a “gatekeeper”? What, exactly, do you think your criticism amounts to? GATEKEEPING. You seem to be implying this is gender-related. So, let me ask you this: is it ok for you to call a man out in what amounts to a snarky, bitter, personal tone, but a man commenting on a woman’s research is somehow off limits? You need to take a look at that.

    • This comment can’t be real, right? This seems like a textbook example of Trevor’s Axiom.

      Andrew, thank you very much for this balanced and constructive contribution to the dialog that Dominus’s article brought back into focus.

  2. There is contempt and zeal in the ridicule of social psychologists here that you may not intend, but many perceive. Then there’s this coterie of commenters who take it even further. Critiquing failings in scientific work is important and valuable. But the extended mockery of papers, journals, authors, etc. is not helpful.

    • Anon:

      The mockery comes out of frustration. The idea that the National Academy of Sciences would publish junk like himmicanes and air rage, it’s just maddening. Or, again, not that they publish it—we all make mistakes—but that they seem to stand behind such work, it’s just horrible. Similarly, when Perspectives on Psychological Science published a paper that flat-out lied about me (and about Ulrich Schimmack), that made me furious.

      That said, I’m sure you’re right that some of my writings are not helpful. My legitimate anger and frustration may explain some behavior of mine that might well be counterproductive, but it’s no excuse, and I appreciate that commenters such as you will call me out on it. Even if I disagree on some of the specifics, I appreciate the larger point that my communications here can misfire, and one of the benefits of blogging (and of anonymous commenting) is that I can get this sort of feedback. So thank you.

      • Specifically, Andrew, I think some of us (certainly myself) find your unending references to people you’ve previously criticized (e.g. why bring up Bem or Kanazawa here, really?) to be a little tasteless and excessive, even if we agree with your general mission of improving scientific practice.

        • Zc:

          A while ago I mostly switched to referring to projects rather than researchers’ names. Thus I’d write about the ovulation-and-voting study or air rage or beauty-and-sex-ratio, pizzagate, etc.

          The reason why I referred to Kanazawa, Bem, Tol, Wansink, etc., by name in some of the comments here was that: (1) I had to refer to Cuddy by name since I was discussing a news article all about her and I think it would’ve been awkward to not use her name, and (2) people were saying I was singling out Cuddy or being sexist to criticize a woman so I thought it made sense to use the names of others too.

          In general, though, I do now prefer to refer to topics rather than names.

        • I think that eschewing names is a reasonable step to take. But I hope you will understand that you’re outsmarting yourself here a bit. That is, you’re deploying your full and considerable intelligence in explaining why your current policy is OK and why you have adequately resolved the issues raised by your past oversteps, while using perhaps a tiny bit less than that full intelligence on self-criticism. That’s totally understandable and we all do it, but it’s also good to learn to try to sometimes just listen to criticism rather than defending yourself against it.

          I think you’re good willed, that your thinking and writing on these issues has moved the field forward. I also think that we live in a sexist world and it’s inevitable that some of the backlash Cuddy has faced has been worse than it would have been if she were a man; and that by, in the past, occasionally using her name as a synecdoche for the entire problem you have made that problem worse, whether or not you were doing so for sexist reasons yourself. You’re trying to address that and that’s good. Saying “I’m trying to address that and that’s good” doesn’t help though.

        • If someone has used Amy Cuddy’s name as a synecdoche for the entire problem, then that’s not good and may indeed strengthen sexist attitudes (although I think the effect would be quite small – those with sexist attitudes tend to find evidence for them anyway). I don’t think Dr. Gelman has done that though.
          Personally, I think that the backlash Dr. Cuddy has faced has been this bad largely because she has appeared so much in popular media. This has led to high public interest, which has led to click-bait stories from popular media outlets. I’m inclined to think that had this whole thing stayed in these kinds of blogs and within the scientific community, the damage to her reputation would have been minor. It was the popular media that made her the “scapegoat”, not this blog or those who conducted the replication attempts.

        • “It was the popular media that made her the “scapegoat””

          I found this definition of “scapegoat”

          “a person who is blamed for the wrongdoings, mistakes, or faults of others, especially for reasons of expediency.”

          How have the popular media made her the “scapegoat”? Has there been some sort of orchestrated attempt by the popular media to blame Cuddy for mistakes other people made?

          Now just regarding the fact that Cuddy and her work is being mentioned in the popular media: Do you think she perhaps has had a role in this herself?

          What I mean is: I assume she chose to do her TED talk herself, I presume she may have contacted the popular media herself in the past to share her story, and I presume she can simply say “no” when someone from the popular media asks her for her story (e.g., because she thinks her research has not yet been developed enough to communicate to a more general audience)?

        • Anonymous 9:08 am (for some reason I cannot reply directly to your comment, so I’ll put it here): Yes, of course I think she has greatly contributed to her appearance in the popular media and thus also to the subsequent popular media interest. I used scapegoat as a shorthand and put it in quotes because I don’t think she’s a scapegoat in the literal sense.
          My main point was to reply to those who more or less say that bloggers like Dr. Gelman destroyed her reputation (or something like that): I don’t think that blogs like this, as such, actually did that much damage to her reputation and career; it was the popular media coverage that did it (and yes, she chose to appear in the popular media, but I was talking about how the “blame” should be distributed between information outlets, if we want to distribute blame).

    • Anon:

      The only contempt I can find in this post is voiced in your comments. I am a social psychologist and I have benefited immensely from Andrew’s advice and critiques. Our field would be worse off without his attention, even if sometimes it comes at a cost.

      If you took the time to read the NYT article in question, you’d see that just as much contempt has been voiced by critics of the ‘revolution.’ You seem to deviate little from that norm. Rather than policing the tone at andrewgelman.com, why don’t you look to your own?

    • Anonymous said: “Critiquing failings in scientific work is important and valuable. But the extended mockery of papers, journals, authors, etc. is not helpful.”

      I agree with the first sentence quoted, and would like to agree with your second. But, regrettably, my personal experience has been that authors often do not respond to criticism that is presented politely. For example, when I wrote http://www.ma.utexas.edu/blogs/mks/2014/06/22/beyond-the-buzz-on-replications-part-i-overview-of-additional-issues-choice-of-measure-the-game-of-telephone-and-twwadi/ (and the subsequent posts), I sent a “heads-up” email to the lead author of each paper mentioned. None of them replied. I had had previous experience of sending polite critiques to authors and at best receiving a reply that said something like, “Thank you for your interest in my work,” with no mention of the criticism.

      When I started following Andrew’s blog, I was very uncomfortable with the tone that was often used, and criticized Andrew for it. But I gradually considered the possibility that less polite methods than my preference were needed to get attention to the problems. That may indeed have been the case then; possibly it is no longer the case, but I can’t say for sure.

    • Sax:

      I know, don’t feed the trolls and all that, but, in all seriousness, what bothered you in my post above? If you’re gonna bother commenting at all, you might as well add some content, no?

  3. How is your anger “legitimate”? Subject your own stuff to peer review before trying to take down people more well-known than you for the sake of clicks. You are a bully. I am glad someone is finally calling you and your misogyny out.

    • Angela:

      My anger is legitimate (to me) in the sense that media coverage of science is a somewhat limited resource and when it is occupied by bad science such as himmicanes, ages ending in 9, beauty and sex ratios, etc., I think this takes space away from the more useful stuff.

      Beyond this, no I don’t think it’s bullying to point out statistical errors in published work. People sometimes point out errors in my own work, and they’re not always polite about it. When that happens, I don’t call them bullies, I thank them for pointing out my errors. We can learn from our mistakes, but only if we’re willing to do so. You say I should subject my own stuff to peer review, but of course I do this all the time!

      Regarding “misogyny,” see my comment above. I don’t think it’s misogyny to point out errors that happen to be made by female researchers. Indeed, I don’t think it would be doing female researchers any favors to avoid pointing out their errors. As I wrote above, in pointing out mistakes in the published work of Kanazawa, Bem, Cuddy, Tol, etc etc., I’m not saying I think they’re bad people. I’m showing them respect, and I show this respect to researchers regardless of gender.

      • I disagree – you are gaining from your comments on your blog in a way that tilts the argument structurally in your favor. You are not having a discussion, you are throwing darts while your horde cheers you on. Instead of *first* having a face to face conversation (which in the article you admit you dislike doing), you take on researchers online in a way that suits you, because it becomes a one-sided attack. And here you are protesting your decency when the very setup here is markedly indecent. The journalists and media you compare yourself to first speak to the source: that is good practice. What you are doing is trolling.

        • Angela:

          I’m not sure what you mean by “throwing darts.” When Susan Fiske falsely writes that Ulrich Schimmack and I “imply that the entire field is inept and misguided,” without providing any evidence, or when she calls people “terrorists,” then, sure I’d say she’s throwing darts. When Amy Cuddy writes that Simmons and Simonsohn “are flat-out wrong. Their analyses are riddled with mistakes . . .” but without giving evidence for that statement, we could call that throwing darts. Or maybe just defensiveness on her part.

          But what are the “darts” that I’m throwing? Saying that key claims of Carney et al.’s work failed to replicate, when they really did fail to replicate? Saying that Simmons and Simonsohn offered a useful perspective on what went wrong, when that’s what they really did? Saying that I see a lot of value in Cuddy’s message, and suggesting ways of decoupling the valuable part from the part that didn’t work out? I don’t see this as “throwing darts,” or “a one-sided attack,” or an attack at all. Nor for that matter do I see what is “indecent” about trying to use replication failures and a better understanding of data collection and analysis to do better science.

          The dictionary defines “trolling” as making “a deliberately offensive or provocative online post with the aim of upsetting someone or eliciting an angry response from them.” I am not trolling. I do not want to upset anyone. If I upset you, this was not my goal. I certainly have no goal of upsetting Daryl Bem or Amy Cuddy or Satoshi Kanazawa or the other researchers who’ve unfortunately been trapped in a non-working research paradigm; as I wrote above, I have no ill will toward them and I’m doing my best to offer helpful suggestions.

          Given your evident disagreement with me, you might not be satisfied with my above response but I appreciate that you went to the effort to comment here. At the very least, I hope such discussions can help us communicate better to each other, even if there are frustrations along the way.

        • I appreciate your responding to my comments, Dr. Gelman – and yes, Amy Cuddy should have defended her own critique of the criticisms better, and probably acknowledged issues in her work more quickly. But my problem with this continues to be the very public way scientists are being tried for their research in these blogs and posts. I doubt your assertion that it is, overall, good for science.

          It’s a person’s name, after all, that you, in the case of Ms. Cuddy, casually and consistently linked in your posts to what bad research means, and by doing this (and thanks to the size of your audience), you turn into the main authority making that call. That is unfair, and while you may have good intentions to start with, the associated networks turn this into a pile-on, via the comments, shares, shaming on Facebook, etc.

          Amy Cuddy alone is not the point of my comments. This pattern of critique, calling out specific researchers via blogposts without conversation and in-conference discussions, strikes me as bad form.

        • Angela:

          You may be right; it’s hard to know. On the occasions that I have contacted researchers directly with questions and criticism, it hasn’t always gone well, and I do think that once work is published (and certainly when it’s published in one of the top journals in the field), it should be open to criticism from all, and I think that any implicit expectation or norm of contacting the original researchers could be a bad idea to the extent that it raises a barrier to criticism, if even a small one. But that’s just my view, and to be sure there are costs either way. I think science has advanced a lot in the past decade, in many cases through criticism that was done independently of the original researchers, for example criticism of beauty-and-sex-ratio research without consultation with Kanazawa, criticism of ESP research without consultation with Bem, etc.

          I will disagree with your claim that I am “the main authority making that call.” When writing about power pose, I have consistently deferred to Ranehill et al. and Simmons and Simonsohn. I don’t think I’m making a “call” at all. I’ve thought for awhile that the power pose paper is an example of a study where the data are simply too noisy to learn anything useful, which is too bad, but that hasn’t stopped others from working in the area, and I would not want the authority to shut down research in an area, even if I could!

          I do recognize that somehow all this has not been clear, and I’ve tried to rectify this by adding the P.S. at the end of the above post. Thanks again for your comments.

          In any case, moving forward, I don’t think it’s a good idea to deny the problems with existing published work, but I think it will be most productive to suggest ways to improve research in the future, which is what I’m trying to do at the end of the above post (before the P.S.). Along the same lines, this is why Blake McShane, David Gal, Jennifer Tackett, Christian Robert, and I are recommending to abandon statistical significance, in part because we want to move statistics away from a critical framework to something more constructive.

          So, believe it or not, I think we’re aiming in the same direction, even if we’re coming from different perspectives, and I do think of comments such as yours and others in this thread when trying to figure out how to be most helpful going forward.

        • >But my problem with this continues to be the very public way scientists are being tried for their research in these blogs and posts

          I don’t think people that quietly publish their papers in field journals would be “tried” on blogs and in the NYT. But if you have the 2nd most viewed TED talk, then yes, you are subject to very public criticism. People who cannot stomach this should not seek so much publicity in the first place, or make very sure their claims stand up to scrutiny.

        • To explain the comment preceding this one: I had replied to Anon, but misspelled my name, so the comment was listed on my screen with note “awaiting moderation”. When I discovered my error, I wrote the “oops” message, hoping the moderator would see it and accept the original comment. But it now appears that the comment was considered spam. It essentially said that I agree with Anon’s last two sentences.

          (And when I tried to submit this, I got the “awaiting moderation” again, and realized that I had misspelled my email address. I guess it’s time to see the eye doctor.)

        • Good point — if Angela had said, “Mr. Gelman”, then saying “Ms. Cuddy” would have been equal treatment — but she said “Dr. Gelman”.

      • Anon:

        Trying to trace blog comments is not cool, in my opinion. I think it’s a good thing that people can comment anonymously and participate in discussions. Regarding the negativity of many of the comments to this post: Given the wide readership of the New York Times, and the fact that the article linked directly to this blog, it would not surprise me if a broad cross-section of readers could find themselves here and be moved to comment.

        • “Trying to trace blog comments is not cool”

          Indeed.

          Personally, I like getting negative comments on my own blog. It serves as a potential check on unjustified thinking by me. It shows my reach is expanding. It gives evidence of what the general public thinks. It gives my readers an opportunity to respond to criticism.

          On the other hand, a lot of my commenters don’t like negative comments. It seems like a violation of their turf.

          Turf battles seem like a characteristic feature of contemporary life, probably because the Internet has brought people into more intellectual conflict with each other.

        • Steve:

          Also, different comment sections have different styles. The commenters here can sometimes be rude but mostly the arguments stay on point. Maybe that could be considered a nerd-style. Even when we had a long thread a few years ago about racism, the comments were mostly pretty focused: really this must’ve been the best-behaved blog comment thread on racism ever. (Arguably this is a bad thing, in that perhaps racism is so evil that it’s inappropriate to debate the topic politely, but that’s another story.)

          The commenters on Marginal Revolution, though, they’re just at each other every day! Sometimes they make interesting points and amusing reading, but the level of anger there is just amazing to me, I guess because they’re talking about politics rather than science, sports, statistics, etc. The trolling level on our blog here is low, and I feel very lucky about that.

          And then there are newspaper comment threads which are universally recognized as essentially useless. When the Monkey Cage was just a blog, we didn’t get a lot of comments, but when we did they were often interesting. Once we moved to the Washington Post . . . forget about it.

          As a long-time blogger, the whole topic of comment threads absolutely fascinates me.

        • Agree, the comments here are usually good.

          The comments at Marginal Revolution also used to be very good. They only got much worse about 3 years ago or so. I wonder why.

        • It is sad how few blogs have readable comments. The quality of this blog’s comments make this site a rare gem.

          It probably has something to do with a feeling of being a small community. Also, the technical nature of the blog probably acts as a filter so commenters feel they have to add something concrete to warrant a posting.

          Angela’s comments stand out in this regard because they do not attempt to refute anything said by the host. They just question motivations and hurl accusations and moral condemnations.

    • Angela: re: misogyny: That the setting here is one where a man criticizes the work of a woman, especially in relation to competence in a traditionally masculine area (scientific methods, statistics/mathematics), may well reflect societal and cultural misogyny, i.e. patriarchal structures. However, this should not be confused with individual-level misogyny. Calling out a methodological error is not misogyny.

      Regarding the NYT article, I’m a researcher in social psychology, and a woman, and I definitely do not feel pushed out of the field by this blog or by other methodological blogs. Haven’t heard any of my female colleagues complaining about that either (and we talk about gender structures in science all the time).

      As someone who is strongly sympathetic to radical feminist thinking, I find it irritating when such thinking is applied uncritically to everything.

      • I am not sympathetic to radical feminist thinking or a social psychologist. And I agree with you totally. Calling out a methodological error, or any other type of error, when a man makes the call about a woman’s work, is not misogyny. Ms. Cuddy’s role as a ‘self-help guru’ has nothing to do with her academic work. She is not empowering anyone if she carries on while her collaborators have agreed to put aside the questionable papers and move on.

        Re the New York Times. I cancelled my on-line subscription. The hard news is still good and I like some of the op-eds. I liked the long feature piece they did on high school seniors in Topeka. But the feature articles have some sort of identity politics agenda mixed with intense snobbery.

        • Thanks. I meant to add into my original that individual-level misogyny would mean sexist remarks, criticism using sexist or gender-stereotypical notions, or perhaps over-representation of women as targets. I have never seen anything like that in this blog or in Data Colada.

        • Couldn’t agree more – just want to add that NYT is the premier outlet of identity _culture_, not just identity politics. Politics is downstream from culture.

          The first principle of identity culture is that we can determine who is the hero, the victim, and the villain in any story just by knowing the gender and race of the persons involved.

    • Angela,
      You said to Andrew:
      “I am glad someone is finally calling you and your misogyny out.”

      I have not seen evidence that Andrew is misogynous. If you believe he is, please provide evidence that he is.

  4. Thanks for this response, Andrew, and for the decency you’ve shown here. Frankly, I found the article profoundly frustrating.

    There is no debate about the following: A professor at Harvard, whose advisor at Princeton edits one of the premier journals in all of science (not exactly some minor outsiders!), published an article in 2010 which is entirely non-replicable, and for good reason: the sample size and p-hacking problems are obvious. These types of papers were common in social psych at the time. The paper was parlayed into what was likely millions of dollars of speaking fees, international fame, and a bestselling book (which, despite the implication of the article that power posing played only a minor role in the book, literally has a power pose on its cover!). The replication problems and statistical problems are now very well known. The coauthor of the original paper completely accepts these corrections. Yet Cuddy continues to deny the problems, makes absurd claims about meta-analyses that ignore failed replications, and continues to give public speeches for money acting as if the problem doesn’t exist. All the while, her very powerful advisor gives speeches calling the much-less-powerful critics “methodological terrorists”.

    The original mistake is fine: social psych was rife with such nonsense. But the refusal to correct it, and to profit off not correcting it, is bad science and borderline fraud, as far as I’m concerned. Consider an event that began similarly: the Ken Rogoff Excel spreadsheet error. Ken was giving bad advice to governments on the basis of an analysis which was based on an error. The error was pointed out by a grad student. At this point, it’s all just good science. If Ken were to continue to show the same faulty graphs, to claim the error was because the student used a different version of Excel, to have other Harvard faculty denigrate the people correcting the initial claim, then at that point it would be totally justified to question Ken’s motives. And the same is true for Cuddy, for Bem, for Wansink, and for the rest of the folks who choose to maintain popular fame and fortune at the expense of serious science.

    One last thing: The “woe is me” tone of the NYT article strikes me as borderline sexist: indeed, an article today on a flaw in Wansink’s work (https://www.buzzfeed.com/stephaniemlee/who-really-ate-the-apples-though?utm_term=.ix0BW3jl7#.qnxg0WYy3) is very clear on why the original work was unreliable, and why Wansink shouldn’t be taken as seriously as he is in public policy, with no discussion of lost meals, or emails from friends asking if he “is all right”, or descriptions of tennis skirts. It is deeply damaging to imply that science done by women, or by full professors, or by grad students, or whoever, is somehow outside the bounds of contradiction; that implication only holds if you believe, as per the quote in the NYT article, that science is to do “big, fun things” rather than to advance knowledge. The latter is the goal, period.

    • Re: “a bestselling book (which, despite the implication of the article that power posing played only a minor role in the book, literally has a power pose on its cover!)”

      A very strange little grievance to bring up, which I think shows your own biases pretty clearly.

      I have not read the book, have you? From your quote, it does not appear that you have either, as the article claimed that just a few pages focused on power posing. I’m going to have to trust a NY Times writer on the accuracy of that, given how easily someone could verify it.

      Book covers are chosen by publishers based on what they think will sell best, and may have very little to absolutely nothing to do with the contents. Writers often have zero control over what cover is chosen. I am SURE that if you are an adult who regularly reads any sort of published literature – fiction or otherwise – you are aware of that.

      • I’m sorry, but that’s a very misleading characterization. Cuddy’s book advance almost certainly came as a result of her power pose TED talk and the millions of hits it received. Without that, she never would have had the publicity and fame to sell the book. The two are inextricably linked, and if anything, the lack of power pose info in the book could be a subtle acknowledgement by Cuddy that the research doesn’t really stand up without actually saying any of this.

        • There is nothing at all wrong with her profiting from her fame. She should acknowledge the problems with her work but that’s a separate issue. Insofar as the book makes inaccurate claims it’s a bad thing; but just being by the same author as earlier claims that were later discovered to be inaccurate, is not a problem.

    • A general theme that pops up a lot in the comments on the NYT article is that by making Dr. Cuddy sad, Dr. Gelman is being sexist and/or unladylike.

      The young English theologian Alastair Roberts wrote a very interesting post on the two modes of argument in today’s world, which I summarized as:

      Two Modes of Intellectual Discourse: Taking Everything Personally v. Debate as Sport

      http://www.unz.com/isteve/intellectual-discourse-taking/

      • Wow, thanks for this link!

        I never heard of this. After reading it, I wonder two things:

        1) Does this (partly) explain the “tone” discussion, and related frustration on both sides?

        2) Which mode is more in line with scientific principles?

        • Anonymous: “Which mode is more in line with scientific principles?”

          I think both modes are equally contrary to scientific principles.

        • When I entered university and academia, I expected it to be more like the (old, sporting) mode, just from my intuitive understanding of what science should be about.

          After trying to understand what happened in psychology concerning how people like Stapel were never challenged for so long, and now seeing the debates about “tone”, I can totally see how you could use the differences between the (old, sporting) mode and the (new, sensitive) mode of discussion (and education?) as a partial explanation for these events.

          I tried to summarize the text in some bullet points according to the two modes. I hope I did this correctly. Here is what I get:

          The first [new, sensitive] form of discourse seems lacking in rationality and ideological challenge to the second:

          - If you take offence, you can close down the discourse in your favour.
          - It cannot tolerate uncompromising difference.
          - Supporters of this ‘sensitive’ mode of discourse will typically try, not to answer opponents with better arguments, but to silence them completely as ‘hateful’, ‘intolerant’, ‘bigoted’, ‘misogynistic’, ‘homophobic’, etc.
          - Sensitivity-driven discourses will typically manifest a herding effect.
          - Dissenting voices can be scapegoated or excluded and opponents will be sharply attacked.
          - Unable to sustain true conversation, stale monologues will take its place.
          - Constantly pressed towards conformity, indoctrination can take the place of open intellectual inquiry.
          - Fracturing into hostile dogmatic cliques takes the place of vigorous and illuminating dialogue between contrasting perspectives.
          - Lacking the capacity for open dialogue, such groups will exert their influence on wider society primarily by means of political agitation.
          - They seldom produce strong thought, but rather tend to become echo chambers.

          The second [old, sporting] can appear cruel and devoid of sensitivity to the first:

          - It demands personal detachment from the issues under discussion.
          - Offence is not meaningful currency within such discourse.
          - Firm differences can be comfortably negotiated.
          - The truth is not located in the single voice, but emerges from the conversation as a whole.
          - The point of the discourse is to expose the strengths and weaknesses of various positions through rigorous challenge.
          - Ideological conflict forces our arguments to undergo a rigorous and ruthless process through which bad arguments are broken down, good arguments are honed and developed, and the relative strengths and weaknesses of different positions emerge.
          - The best thinking emerges from contexts where interlocutors mercilessly probe and attack our arguments’ weaknesses and our own weaknesses as their defenders.
          - They expose the blindspots in our vision, the cracks in our theories, the inconsistencies in our logic, the inaptness of our framing, the problems in our rhetoric.
          - We are constantly forced to return to the drawing board, to produce better arguments.

          So, if I look at these two lists, I think you can make many connections to current discussions about publication practices, education, “tone”, etc. And I think you can make many connections to optimal scientific principles, values, and practices.

          In my quick understanding of the differences, and of their implications for science and scientific discourse, mode 1 (new, sensitive) is detrimental to science, while mode 2 (old, sporting) embodies and results in what I view as scientific characteristics and consequences.

        • Let me try something; please forgive me if it doesn’t make any sense whatsoever. Concerning the connections I think you can make, the following is an example. It may sound very weird, so be prepared!?

          In doing so, I will view things like arguments, reasoning, logic, and evidence (data) as a way to try and come closer to the truth/finding out why and how things work/etc. (“doing science”).

          I think these things may have a similar function in both scientific discussions and scientific practices, which makes it possible for me to make the connection. For example:

          Gatekeepers (editors, journals, etc.):

          Mode 1: “if you take offence, you can close down the discourse in your favour” (new, sensitive)

          -Editors, reviewers can close down discourse, e.g. by not publishing something by a certain author because they don’t like their work for some reason. In a way they have “taken offence and closed down the discourse” as these results are not acknowledged and are not “part of the discussion” (scientific literature) anymore. You are telling “author X” that he/she is stupid and ignore them.

          Mode 2: “demands personal detachment from issues under discussion” (old, sporting)

          -Editors, reviewers are not given the opportunity to close down the discourse based on the results of a study because they use Registered Reports where the decision to publish something does not depend on the results but on the question/quality/etc. of the study. In a way they have “demanded personal detachment from the issues under discussion”. You are telling “author X” that he/she is part of the discussion and will be evaluated based on arguments, logic, evidence, etc.

          Selective reporting in a paper:

          Mode 1: “if you take offence, you can close down the discourse in your favour” (new, sensitive)

          -Selective reporting of variables in a paper can be seen as having “taken offence and closing down the discourse”, as these results are not acknowledged and are not “part of the discussion” (scientific literature). You are telling “variable X” that it is stupid and then ignore it.

          Mode 2: “demands personal detachment from issues under discussion” (old, sporting)

          -By pre-registering studies, authors are “demanding personal detachment from the issues under discussion” by acknowledging all results and making them “part of the discussion” (scientific literature). You are telling “variable X” that it is being evaluated based on arguments, logic, evidence, etc.

          Publication bias:

          Mode 1: “if you take offence, you can close down the discourse in your favour” (new, sensitive)

          -Publication bias concerning null-results can be seen as “taken offence and closing down the discourse”, as these results are not acknowledged and are not “part of the discussion” (scientific literature). You are telling “paper X” that it is stupid and ignore it.

          Mode 2: “demands personal detachment from issues under discussion” (old, sporting)

          -By using Registered Reports (where the results are published no matter the outcome), authors and journals are “demanding personal detachment from the issues under discussion” by acknowledging all results and making them “part of the discussion” (scientific literature). You are telling “paper X” that it is being evaluated based on arguments, logic, evidence, etc.

      • In a couple of places above I indicated that I think both modes of discourse mentioned by Steve Sailer are inappropriate for science. I’d like to try to elaborate a little on this. One big factor is my background as a pure mathematician.

        Pure mathematicians try to figure out what assertions follow mathematically from collections of assumptions. There are two aspects to this: Figuring out what is true (i.e., what follows logically from the axioms and assumptions involved) and proving it. When we believe that something is true but haven’t yet proven it, we call it a conjecture. There might be different, conflicting conjectures about the same thing.

        Often the processes of conjecturing and proving are intertwined. Also, we might challenge a conjecture by trying to find a counterexample. Sometimes we succeed, but sometimes the attempt gives insight into how to work on the original conjecture.

        Often this process of proving (which may involve conjecturing and/or finding counterexamples) is done by one mathematician alone. Then it is communicated (either orally or in written form), and others (colleagues, or journal “referees” as we call them, or people who receive preprints) critique it.

        But sometimes the process of figuring out the conjectures, counterexamples, and proofs is collaborative. Collaboration may occur in various ways and degrees. Some mathematicians prefer to work collaboratively from the beginning; some prefer to work mainly alone, but perhaps ask for input from others when they get stuck or need another perspective. Some collaborations work in some ways like a debate, some don’t. Sometimes collaboration might consist of emailing a colleague a specific query when needed; the reply might vary from “I don’t know,” to “Oh, yes, here’s a reference for what you need,” to “Here’s a better person to ask than me,” or “You might try looking up these papers,” or “I haven’t had time to think about this yet, but will get back to you when I’ve had the time to think about it,” or might be a complete, newly figured out answer to the question.

        Most pure mathematicians seem to be tolerant of different individual working styles — respecting that some people like debating styles, others prefer working mostly alone, etc.

        This model seems to have many strengths. In particular, it is flexible in many ways. I’ve seen this model (or aspects of it) used sometimes in science, and I think that it (suitably adapted to circumstances — e.g., if data gathering is needed, then collaborating on planning that as well) is a better model than “one-size-fits-all” models such as the ones Steve Sailer described.

    • My understanding is that the Herndon/Ash/Pollin critique of Rogoff had three points: one was the excel error, another was a decision about which data to include (particularly some from New Zealand) and a third was on how the data was binned. The Excel error was actually the least important, and after correcting for it Rogoff could still make the same claim. It just got the most press because it’s obviously an error and easy to understand. The biggest problem with the paper is that by its nature it could only hope to show correlations rather than causation, and attempts to examine the latter (which we actually care about) indicate that Rogoff’s presentation to the public got the causality backward.

    • There actually was an article that did a lot of what you describe for Wansink, and did engender some sympathy for him (at the time): http://www.chronicle.com/article/Spoiled-Science/239529

      For the record, I don’t mind these articles showing the human element of science, I just wish they would show all perspectives equally. For example, I (and we) did consider how our criticisms of Wansink’s work would affect his life, and the lives of the people in his lab, etc., and we worried that we would be attacked for posting a criticism of a famous researcher (we are just 3 people without PhDs after all), but ultimately we found the problems with his work concerning enough to make them public and decided we’ll just have to deal with any repercussions. Luckily for us Cornell so far hasn’t really retaliated, but we had no way of knowing what they would do at the time.

• Thanks for sharing. I know that academia can be political, but I didn’t really consider that you and your fellow researchers had something to worry about, given how strong (and important) the case against Wansink was. Hopefully, Cornell and the rest of the research world will do the right thing and appreciate your efforts. It exposed an embarrassing side of scientific research, but I imagine the long-term lessons will be positive for the community. It’s unfortunate that you had to be the ones to take up the cause, rather than Wansink’s peers.

        I can’t help but feel sympathetic to Wansink. He seems like a nice, enthusiastic ambassador for science, but I can’t imagine that this will end happily for him or his department. Thinking about Gelman and the NYT piece on Cuddy — it seems people agree that scientific criticism is vital, and that there must be a way to do it civilly. But it’s hard to think of a critique for the kind of errors in Wansink’s work that would not be totally devastating to his reputation.

• I want to thank you for the work you and your colleagues do!

I, for one, am not brave enough to openly and directly express criticism to established researchers, even though I think I may have valid points, and thus a useful scientific contribution to offer.

This is because I do not trust most scientists/the system in general to take this the right way, which is probably due to personal characteristics (e.g., I don’t like dealing with people, emotions, politics, etc.).

I try to work around this by commenting on this blog anonymously. I do this in the hope that if I make any sense/say something useful, other people can read it and perhaps use it to make some argument somewhere, improve something, etc.

        I guess my point is that:

– I am thankful for people like you and your colleagues, who are brave enough to do what I think is difficult work.

– I am thankful for people like Mr. Gelman, who provide the opportunity for everyone to comment (anonymously) on this blog and try to contribute something useful to science.

– I am sad that I do not trust most scientists/the system enough to more directly, and non-anonymously, point out possible errors/raise possibly useful questions.

• Yes, we need more people exposing bad science. You need to see the discussion on Psych Facebook groups. Some people still have no clue what junk science is; they think this thing with Cuddy is a matter of opinion. And somehow these people have PhDs.

• “Self-love is a good thing but self-awareness is more important. You need to once in a while go, ‘Ugh, I’m kind of an asshole.’ You need to have that thought once in a while or you’re a psychopath. You know like when you say to a friend of yours, ‘You’re being an asshole!’ And they’re like, ‘No I’m not!’ Well, it’s not up to you, if you’re an asshole or not! That’s up to everybody else! You don’t get to say ‘No!’ to that. ‘You’re an asshole!’ ‘No I’m not’. Oh, sorry! I thought, okay, I’m glad I checked! I guess you’re not.’ If someone tells you you’re an asshole you should go, ‘Ah, shit! Alright, what happened? How did I get here?’” – Louis CK

        • Bottom line: We need people like Anonymous and people like Jordan and people like Andrew (and maybe even people like me?)

5. One thing that surprised me about the article from yesterday, and other discussions, was that nobody mentions that Cuddy released her original Carney et al. data and had an independent analyst analyze it (and he didn’t see any evidence for power posing). She should get credit for that; in similar situations, many others conceal their data or refuse to release it, or make some lame excuse about how hard it is to put together.

    Another thing that surprised me was Cuddy’s comment in the article that she is not able to find a collaborator for carrying out a straight replication. I don’t understand why one needs a collaborator, and in any case, what about Ranehill, or Simmons et al? It’s always a good idea to work with a scientific adversary; both keep each other honest.

Finally, I thought that Cuddy could have quickly defused the situation in the beginning by just saying, sure, it’s possible Gelman et al. are right. When I think about all my published “results”, one striking fact is that I’m not really sure of a single one of them. I have tried to replicate quite a few of my own and others’ results, and I failed in most cases. The best one can do is run high-precision studies and try to find a fact that holds up under scrutiny.

My field is also as messy as Cuddy’s; what one finds may or may not generalize, and even the same subject may not reproduce the same behavior a second time. It’s very difficult to say anything for sure in such a world.

• They don’t mention her submission of the data to independent analysis because she has subsequently publicly twisted the outcome of that in favour of power pose. It would just be more picking on her, really. If she had accepted the full finding of no effect, that would have been an admirable* thing, but she didn’t.

      *Strangely I thought “admirable” was an appropriate word here but it’s kind of like calling a man admirable for taking care of his children or not going to jail. Accepting the outcome was what she was supposed to do. It’s sad when we think it’s admirable that people retract their erroneous findings when it should be expected.

      • Some really interesting points here. I offer a couple of quick reactions.

1. Shravan points to an issue that worries me a lot and that I’ve written about on my blog. Some people just refuse to share their data, and this actually seems to be an optimal strategy for covering up flawed work that would be exposed by reanalysis of the data. This strategy can work because, to a large extent, the scientific community, and even more so the general public, tolerates hiding data. So while sharing data should be routine and unremarkable, in practice it sometimes looks like a heroic act of self-sacrifice.

I think the only viable response to this dilemma is to presume that claims based on withheld data are even more dubious than claims that have been refuted using shared data.

2. This is the first I’ve heard of the episode with the independent statistician. I find this move rather strange. Personally, I wouldn’t just commit in advance to defer to an independent statistician. I would love to get such input into any of my projects, but I wouldn’t consider it to be above criticism. On the other hand, I wouldn’t just reject the opinion of the statistician without giving reasons. Is that what Amy Cuddy did? I don’t know. I’m just saying that such a report should be an important touch point in a debate, but we shouldn’t assume in advance that it will settle everything.

  6. “There is contempt and zeal in the ridicule of social psychologists here that you may not intend, but many perceive.”

This comment best summarizes why, over time, I’ve been unable to take the critique of social psychologists seriously, despite agreeing with the underlying need for improvements in scientific research. Is the goal of such mockery really a path to “better science,” or is it, as many perceive, a self-congratulatory “gotcha” movement that is also guilty of discarding whatever data and contradictions don’t fit its own narrative? From the perspective of someone outside the field looking in, it doesn’t appear that the end result is actually better science.

Do you think that because the mockery, which you do not deny, “comes out of frustration,” that in any way justifies it? You contradict yourself – you claim your “legitimate” anger and frustration explain behavior that is counterproductive, and then in the same sentence claim that it’s no excuse. If it’s no excuse, why do you repeatedly try to portray your anger and frustration as “legitimate” explanations for your behavior? On one hand, you say it’s no excuse; on the other hand, in that very same comment you repeatedly offer them up as excuses.

• Please quote a place where Gelman mocks a researcher for an erroneous finding. I don’t mean for defending an erroneous finding after it has been extensively demonstrated to be false. Any mockery there is brought down on their own head. But please point out where the mocking tone is directed at the erroneous finding itself.

    • Anon:

      I think you’re responding to this comment of mine.

      In response:

      1. There is a difference between an explanation and an excuse. Just to take another example that does not involve power pose at all: Satoshi Kanazawa has published papers on sex ratios whose conclusions are derived essentially entirely from noise. When this has been pointed out to him, he’s avoided addressing the problems. This is understandable in that Kanazawa worked hard on these papers, they’ve been published, and he no doubt believes strongly in his theories. All of that is an explanation for Kanazawa’s unwillingness to consider the problems in his work, but it’s not an excuse. I think it can be useful to consider explanations without requiring them to be excuses.

      In any case, considering I write something like 400 blog posts a year, I’m sure that some of the things I write are counterproductive. I think that would be the case even if I never mocked at all! A positive post can be counterproductive too, for example if it inadvertently encourages people to continue along a wrong path. The trouble is that in any given case I don’t know if a post will be counterproductive. On balance, I think the vast majority of my posts are helpful. Of course I think that, otherwise I wouldn’t be doing this in the first place. But, as I noted above, I welcome comments from people who disagree—especially if they can be specific about what I could do differently. I can’t make everybody happy but I welcome feedback.

      2. I have no idea what you’re talking about regarding “a self congratulatory gotcha movement.” As I’ve written about a thousand times now, I don’t feel “gotcha” nor do I congratulate myself when I see a statistical error. It makes me sad. There’s no “gotcha” going on here. Indeed, I wrote that in my post above:

      When people make statistical errors, I don’t say “gotcha,” I feel sad. Even when I joke about it, I’m not happy to see the mistakes; indeed, I often blame the statistics profession—including me, as a textbook writer!—for portraying statistical methods as tools for routine discovery: Do the randomization, gather the data, pass statistical significance and collect $200.

Regarding “the end result is actually better science”: As I’ve also written many times, I think better science will require getting better data. I’m not gathering data, but I’m helping in other ways, for example by statistical analyses that demonstrate how, when data are too noisy, nothing useful can be learned. This point may seem obvious but it was not understood by the beauty-and-sex-ratio researcher, the ovulation-and-voting researcher, and many others. It wasn’t obvious to me either, several years ago. We’re all learning here.

• Anonymous: I’m a social/personality psychologist, and I perceive no such contempt in this blog or in Dr. Gelman’s other writings, and I have no problem taking them seriously. I don’t see his criticism as directed at the field of social psychology at all (in my view that would be saying that social psychology, as a scientific endeavour, is doomed from the start), but at certain practices that have been widespread in many fields, but especially in SP. If you have conducted well-justified studies and your statistics are correct, or if you have at least tried your best, I see no reason to feel that you are the target of his criticism.

I personally realize now that I have done bad science in the past. I wish I had not, but I have, and it does not make me angry at someone who points out bad science. Quite the opposite, because I don’t want to do bad science! It does not even make me angry at myself, because I was young and impressionable and lacked experience. I’ll do my best to do better now and in the future.

      Then again, what if Dr. Gelman was ridiculing social psychologists? That wouldn’t be nice, but that would hardly be a reason to not try to do better science. That reasoning actually reminds me of the white male complaining that feminists are so angry/confrontational that he cannot listen to them, so he has no choice but to continue acting in sexist ways.

• Must add an even stronger sentiment: I’m SO GRATEFUL for everyone who has pointed out these methodological and statistical problems. This may sound all high and mighty, but I’m in this in order to actually find out something about human behavior. It would suck beyond description if I were to retire after doing 40 years of non-science. I can’t understand why the “tone” of criticism is such a huge issue, if the criticism helps us to avoid that destiny. It’s good to demand to be treated respectfully, but if we are doing something wrong, shouldn’t correcting that be a priority?

A side note which will sound like hindsight, but anyway: when I was a young PhD student reading social psychology papers, I noticed that the research was largely about testing very complex theories, and combinations of theories (2- and 3-way moderation effects), and a lot of assumptions were made. I remember being somewhat puzzled by this. Then Paul Rozin’s excellent article “What Kind of Empirical Research Should We Publish, Fund, and Reward?” came out, where he makes the point that, unlike the natural sciences, (social/behavioral) psychology lacks basic observational data, yet focuses on testing complex theories rather than trying to collect such data. I remember agreeing so vigorously. This was 2009, and I think things have improved a bit, especially in personality psychology, but we still need much more data about basic behavioral stuff, frequencies, psychologically relevant situations, simple situation-behavior patterns, and the like, to understand behavior. I think this lack of basic data may have been one issue behind the crisis.

        • +1

As Peirce (1879) wrote, “The theory here [economy of research] given rests on the supposition that the object of the investigation is the ascertainment of truth. When the investigation is made for the purpose of attaining personal distinction, the economics of the problem are entirely different. But that seems to be well enough understood by those engaged in that sort of investigation.”

  7. So. A bunch of people are critiquing your critique style, particularly a journalist who wrote about someone you find less than worthy of her focus. My goodness, such protestations you offer. I read the NYT article which did you no kindnesses and thought I’d see if you had been portrayed too harshly. I don’t think you were. My first “huh” moment began with your, “Anyway, that’s all background. I think Dominus’s article is fair, given the inevitable space limitations. I wouldn’t’ve chosen to have written an article about Amy Cuddy—I think Eva Ranehill or Uri Simonsohn would be much more interesting subjects.” You didn’t chose, Mansplainer. She did, so get over yourself. And that right there is at the core of your long, self-serving, defensive post. Your also lengthy and defensive replies to critical comments. You are not simply an open-minded scientist, open to giving and receiving objective critique. Your writing choices are filled with subjectivity, with occasional passive-aggressive jabs and protestations, lacking in personal accountability. You have an ego and you’ve been petty with it. Own that.

    • Anon:

      Of course I have an ego—we all do! Where did I ever say I’m “simply an open-minded scientist”? I try to be an open-minded scientist—but there’s nothing simple about that! Being open-minded takes work. It’s not always easy.

      Regarding Susan Dominus’s article, you may think that “did me no kindnesses” but I thought it was fair. I don’t expect a reporter to do me kindnesses—that’s not her job. The reporter’s job is to tell an accurate, compelling, readable, and interesting story, and she did just that! And a bunch of people emailed me about the story so I thought it would be useful to add my perspectives. And I guess you added yours too. Fair enough.

I don’t think any of us need to “get over ourselves.” I think we can all be ourselves, and try to be the best version of ourselves that we can.

      As a statistician, I think a lot about how researchers can do better, and one thing I’ve been thinking about a lot lately is the importance of high-quality data. This point may seem obvious to you, or maybe not, but in any case it’s something that most statisticians, psychologists, medical researchers, etc., have not thought so much about until recently. My suggestions to Cuddy and others, and my many papers and talks addressed to statisticians, psychologists, and other applied researchers, are in many cases intended to connect that general idea to specific areas under study.

• Sir, a good part of your post, especially your 4-point P.S., proves my assertion that you present yourself as simply open-minded. And of course it isn’t simple to be so, but you’re part of a problem that the NYT article explores, and you missed the point. You repeat to me that you thought the article was fair, but omit that in your blog post you followed this with “but,” and overall your post took a defensive tone. Your thoughts on research methods are valid; however, again and again, you miss my point while also making it. “Get over yourself” — i.e., your ego and the human damage your method of making “suggestions” causes. What I wrote was glib, hurtful and absurd, yet you reply with pseudo-sincerity that “we can all be ourselves.” I just want to scream at the passive-aggressiveness in your writing. If Cuddy has an opportunity, as you say, then so do you. Maybe make that phone call next time.

        • Anon:

          1. I do think I’m open-minded! I just don’t think open-mindedness is simple; it takes work.

          2. I thought Dominus’s article was fair. I didn’t think it was perfect! But there’s no requirement or expectation that a news article be perfect. I appreciated that Dominus was careful, thoughtful, and fair, even if there were some other things I’d wish she included.

          3. You can put “suggestions” in scare quotes all you want; nonetheless, that’s what they are. I’m making suggestions to Kanazawa, Cuddy, etc. Not telling them what to do, not trying to stop them from publishing, etc. No reason there should be any damage involved.

          4. When I say “we can all be ourselves,” I mean it, I’m being completely sincere. Of course for simple logical reasons I can’t argue with your claim that my sincerity is “pseudo.” All I can say is, yes, I really mean it. I really do think that science benefits from different styles of inquiry.

          At this point, you, or other readers in this thread, might be wondering why I’m bothering to respond. It’s because these are difficult and contentious issues, and I do think it’s worth exploring where people are coming from, and giving myself a chance to learn from you and others. I think this sort of exchange also can help me express myself better in the future to give myself more of a chance to learn in new settings (and, again, I’m being sincere here, not pseudo-sincere).

        • So taking a philosophical view on science, science is a human endeavour. Therefore the psychology and personality of individual scientists matters. And yet at the same time it doesn’t. I mean by this that the personality traits that lead people to higher personal risk by publishing and promoting marginal or unsubstantiated work, and the traits that lead people to systematically mine the data to eliminate the chance of error both have a part to play. The “meta” view of the science is what matters, not the individuals. However, when science interacts with the media and non-scientists who view individual scientists as deliverers of truth, huge tension ensues. The motivated, bombastic individuals get coverage.
Fundamentally, no one should expect others to share their personality traits, and we should accept that conflict and unresolved questions are useful to scientific progress.

8. As an epidemiologist, I completely support the efforts to make scientific research more robust. We constantly try to improve our methods to make our inferences less biased and move science forward. Perhaps it is difficult to imagine the harms of bad science in an innocuous field like social psychology (after all, no damage is done by holding one’s arms up for a few minutes – except wasting valuable resources, and not only one’s own but also those of others who build on that previous research). However, imagine the kinds of adverse outcomes resulting from unsound scientific methodology in a field, like nutrition, where the study results potentially have policy impact. I am glad to see the current debate is happening and researchers are paying attention.
Btw, have you seen the recent discussion around Timothy Lash’s opinion piece in AJE? Sander Greenland’s response was profound as always, and it relates to the conversation on your post above. In case you missed it: https://academic.oup.com/aje/article-abstract/186/6/639/3886035/Invited-Commentary-The-Need-for-Cognitive-Science?redirectedFrom=fulltext
    Peace

  9. “Why not help social psychologists instead of attacking them on your blog?”

    I fail to understand how pointing out errors isn’t helping social psychology. If it hurts someone’s feelings, so what?

It isn’t like Social Psychology took a time out for a year or two in order to address its replication crisis and problems with statistical methods. It is still being taught in hundreds of colleges and universities — right now, every day, year in and year out. It’s about a lot more than Cuddy’s personal career. If she feels bad, that is totally appropriate.

No one in the field right now will be able to claim, with any credibility, that “the rules changed.” Assuming that Social Psychology still aspires to be a science, being right is important. Including correcting errors. If it is passive-aggressive, mansplaining, mean, narcissistic, and indicates a latent authoritarian personality — too bad.

    • Why has social psychology been the central front in the Replication Crisis?

      I think this is partly because social psychology, as social psychologist Jonathan Haidt has documented, is extremely politicized. On the other hand, it is also because social psychologists are scientific enough to care. Other fields are at least as distorted, but they don’t feel as bad about it as the psychologists do. (At the extreme, cultural anthropologists have turned against science in general: at Stanford, for example, the Anthropology Department broke up for a number of years into Cultural Anthropology and Anthropological Sciences.)

      Is the social psychology glass therefore half empty or half full? I’d say it’s to the credit of social psychologists that they feel guilty enough to host these debates rather than to just ignore them.

      • Steve:

        What you say is similar to what I said here, where I argued that psychology has several features that contribute to the crisis:

        – Psychology is a relatively open and uncompetitive field (compared for example to biology). Many researchers will share their data.

        – Psychology is low budget (compared to biomedicine). So, again, not so much incentive to hoard data or lab procedures. There’s no “Robert Gallo” in psychology who would steal someone’s virus sample in order to get a Nobel Prize.

        – The financial rewards are lower within psychology, hence the incentive is not to set up your own company using secret technology but rather to get your idea known far and wide so you can get speaking tours, book contracts, etc. Sure, most research psychologists don’t attempt this, but to the extent there are financial rewards, that’s where they are.

        – In psychology, data are generally not proprietary (as in business) or protected (as in medicine). So there’s a norm of sharing. In bio, if you want someone’s data, you have to beg. In psychology, they have to give you a reason not to share.

        – In psychology, experiments are easy to replicate (unlike econ or poli sci, where you can’t just run a bunch more recessions or elections) and cheap to replicate (unlike medicine which involves doctors and patients). So replication is a live option, indeed it gets people suggesting that preregistered replication be a requirement in some cases.

        – Finally, hypotheses in psychology, especially social psychology, are often vague, and data are noisy. Indeed, there often seems to be a tradition of casual measurement, the idea perhaps being that it doesn’t matter exactly what you measure because if you get statistical significance, you’ve discovered something. This is different from econ where it seems there’s more of a tradition of large datasets, careful measurements, and theory-based hypotheses. Anyway, psychology studies often (not always, but often) feature weak theory + weak measurement, which is a recipe for unreplicable findings.

        To put it another way, p-hacking is not the cause of the problem; p-hacking is a symptom. Researchers don’t want to p-hack; they’d prefer to confirm their original hypotheses. They p-hack only because they have to.

        Hey—that’s a blog post right there. I guess I’ll post it; there’s room in May.

        • > blog post right there
Agree – in medicine, microarray gene-expression studies were easy/fast to replicate, very noisy (systematic errors), less protected, etc., and that’s about when John Ioannidis entered the field.

10. Your response to the NYT article was more thorough and patient than I would have expected from the bully that I read about in an NYT article. I could agree that the weight and hostility of criticism Cuddy faces is indeed intolerable, and that your repeated mentions of her work can add to the pile-on. But I did think it felt unfair how Cuddy posed the rhetorical question of “Why not help social psychologists instead of attacking them on your blog?”, which implies that your insistence on shaming “scientific overreach” and the well-being of social psychology are mutually exclusive. Or, that which hurts Cuddy is also hurting all of social psychology.

    The NYT piece seems to imply that you lack the required empathy and social skills to work out issues with Cuddy in person:

> *“Why not come to a conference or hold a seminar?” When I asked Gelman if he would ever consider meeting with Cuddy to hash out their differences, he seemed put off by the idea of trying to persuade her, in person, that there were flaws in her work. … “I don’t like interpersonal conflict,” he said.*

What is it that you said or signaled — other than literally, “I am put off” — that gave the NYT reporter the impression that you “seemed put off by the idea of trying to persuade her”? I interpreted the description of your reluctance as admitting that you were braver on the Internet than in person (who isn’t?), but furthermore that your supposed devotion to scientific truth and accuracy doesn’t transfer to real-world confrontation, i.e., Cuddy’s allegedly blatant flaws are only blatant until you are asked to talk about them in person. Besides making you sound weirdly shy, it seems to imply that you’re more interested in being an internet troll than actually persuading folks. Which seems like harsh self-criticism.

    • Dan:

      Dominus reported our conversation accurately (conditional on space limitations) and I assume she did so for Cuddy, Simonsohn, and the others as well. She (Dominus) put in a lot of effort to get things right. Regarding the issue of why I never contacted Cuddy directly: On the occasions that I have contacted people directly when there have been big problems with their work, I typically have not found such interactions to be useful. Sometimes people don’t respond, other times they seem to miss the point. I do agree that there’s the potential to learn from such a conversation—but there’s also the potential to learn by posting on the blog and getting comments from anyone in the world who might have interest or expertise in the problem.

      What it comes down to, I think, is that there are different styles of interaction. Given that I’ve been blogging daily for over ten years, it’s no surprise that I find blogging to be a useful way of learning from and interacting with people. One reason I started blogging is that it seemed more useful to converse with thousands of people at once, rather than exchanging with people one on one. For some purposes, though, email can be better, and in retrospect maybe this would’ve been one such case. I’m not so sure that Cuddy thinks so, though, given that she never emailed me either.

      Regarding “the idea of trying to persuade her, in person”: Given that she hadn’t been persuaded by the direct evidence of Ranehill et al., and she hadn’t been persuaded by the very clear arguments of Simmons and Simonsohn, I didn’t (and don’t) see any reason she’d be persuaded by me! After all, I wasn’t really offering any new arguments; my contribution, such as it was, in my blog posts and Slate article (coauthored with Kaiser Fung) was to report the Ranehill et al. and Simmons and Simonsohn articles and add some perspective. So I’m not really sure how the conversation would’ve gone, given that Cuddy had already seen those things and was unpersuaded.

      Just in general I find it easier, and maybe more productive, to present my perspective, address arguments that come in, and consider how I can learn. Direct persuasion rarely works and is stressful, which I guess is what I mean when I said I don’t like interpersonal conflict. Anyway, each of us has our own style of interaction. Here I am responding to blog comments at 5 in the morning, something I don’t usually do!

• I’m also Dan, but a different one than above. Forgive me if I’m asking something that’s been otherwise addressed on your blog or elsewhere – I only came across your site via the NYT article, but I have a statistics-based profession – how much weight do you give Cuddy’s rebuttal in the story to Ranehill regarding the timing of the power pose? As a casual observer of the issue, it does seem to be a significantly longer time period, triple her original duration. Again, the specific duration may have been tested in the subsequent studies that I haven’t read… But is that difference in methodology significant enough (in your mind) to explain why Cuddy does not address Ranehill in the capacity you would like her to? I know you can’t speak for her, but was hoping to get your thoughts on it all the same. Thanks.

        • From the NYT article:

          “Cuddy thought it was likely that the difference in time — six minutes of standing versus two — was a crucial one and probably accounted for the disparity in the results. “It’s not a crazy thing to test,” Cuddy says. “I guess under the theory that more is better? But it could go the other way — three minutes is a really, really long time to be holding a pose like that. It seems likely to me that it would be really uncomfortable, but sure, study it, and let’s see.””

          Dan V’s question:

          “how much weight do you give Cuddy’s rebuttal in the story to Ranehill regarding the timing of the power pose?”

Cuddy seems to have used 6-minute power poses in experiments described in *her own* papers, and things seemed to have “worked out” there. This is why I find it strange that she mentions it as a possible explanation.

          http://faculty.haas.berkeley.edu/dana_carney/pp_performance.pdf

          “Each participant adopted one of two standing poses (as used in Yap et al., 2013): they stood with hands on their hips, elbows pointing out and feet approximately 1’apart (high-power); or they stood with hands and arms wrapping around the torso and feet together (low-power). Figure 1 presents illustrations of the specific poses. Participants maintained the poses for a total of five to six minutes while preparing for the job interview speech”

• I read the article as saying 3 minutes in a V-for-Victory pose was too long. 6 minutes with hands on hips is hardly onerous though, especially while preparing for an interview – i.e., distracted by other thoughts/activities.

• I think this is the original “power-pose” article, where you can take a look at the “power poses” used, which are the same in the Ranehill replication:

          Cuddy: http://faculty.haas.berkeley.edu/dana_carney/power.poses.PS.2010.pdf

          Ranehill: https://osf.io/49a32/

It looks to me like there were 2 different “high-power poses” used, but no “V for Victory” pose nor a “hands on hips” pose. Please correct me if I am missing something.

          Regardless, it doesn’t seem to me that “power-pose” researchers have ever made a distinction between different types of “power-poses” in their papers, promotion, and/or discussion about “power-poses”.

If I am not missing anything, this is evidenced by Cuddy’s recent review of published “power-pose” studies, in which the poses used in all the different studies are simply called “configured posture” and “expansive nonverbal displays.” No differentiation between them seems to be made. Please correct me if I am missing something.

          http://datacolada.org/wp-content/uploads/2015/05/5111-Carney-Cuddy-Yap-PS-2015-Review-and-summary-of-research-on-the-emobided-effects-of-expansive-vs-contractive-nonverbal-displays.pdf

        • “how much weight do you give Cuddy’s rebuttal in the story to Ranehill regarding the timing of the power pose?”

          Cuddy also mentions this in her recent review article, which includes a direct comparison of her original study and the Ranehill replication:

          http://datacolada.org/wp-content/uploads/2015/05/5111-Carney-Cuddy-Yap-PS-2015-Review-and-summary-of-research-on-the-emobided-effects-of-expansive-vs-contractive-nonverbal-displays.pdf

          On the topic of “time in poses” Table 2 reads: “Participants in Ranehill et al.’s study held the poses 300% as long as participants in Carney et al.’s study. Duration and comfort of poses are very likely to be moderators.”

          I just noticed that the Ranehill article actually already mentions this possibility, and even tests for it:

          http://datacolada.org/wp-content/uploads/2015/05/5110-Ranehill-Dreber-Johannesson-Leiberg-Sul-Weber-PS-2015-Assessing-the-robustness-of-power-posing-no-effect-on-hormones-and-risk-rolerance-in-a-large-sample-of-men-and-women.pdf

          “First, prolonged posing time in our study may have caused participants to become uncomfortable, which could have counteracted the effect of power posing. To test this interpretation, we reanalyzed our data using first only those participants who reported the postures to be at least “somewhat comfortable” and then only those who reported the postures to be at least “quite comfortable.” This did not substantively change our results (see the Supplemental Material).”

I assume Cuddy has also read this, which is why I find it strange that she 1) proposes the time/comfort issue as a possible explanation of the different results, and, more importantly, 2) does not mention the Ranehill comfort test in her review paper in the section where she discusses time and comfort as a possible issue…

• Thanks for raising this – I was also bothered by an implicit claim in Dominus’s article that one has the right to demand their favorite channel of communication.

  11. Hi Prof Gelman —

I enjoy reading your blog, and frankly, I think that the NYT article made several insinuations that were unjustified, namely that the brutal criticism Cuddy faces is driving women out of the field, etc. The NYT writer seems to somehow downplay how much Cuddy has benefited from a poorly written piece of science, and she does not point out that Cuddy’s protestations are very much in her own self-interest.

    But what’s worse about the article is how it ignores how Cuddy’s actions affect other people in the social sciences. Someone like Cuddy beat out many people–including many women–to receive her tenured job at a top school. That position comes with responsibility, and Cuddy would apparently like to have all the benefits without all of the ensuing challenges. She eagerly took on a public role–no one forced her to do a TED talk–and then became dismayed when she also faced public criticism over it. But it’s all the more appalling considering how people like Cuddy have used shoddy research practices to advance their own careers at the expense of science: that makes it more difficult for those without inside connections to be able to do research and have others pay attention.

At the same time, Prof. Gelman, I think that it would be good to be magnanimous in victory, so to speak. The journalist believes you have been emotionally distant and uncaring in criticism. Whether she is right or wrong, I think it would be good to make amends for any unprofessional criticism. Criticizing people in a way that is unprofessional and snarky also undermines the field by making people fear criticism more (and hence hide their mistakes). So we do need to pay attention to our tone, and regardless of whether people are right or not, it might be best if you could set the tone by admitting that you may have been too harsh/flippant in tone and starting a new page with a statement about what professional criticism looks like.

    • Bob:

      I don’t think my criticism of Cuddy (or, for that matter, of Kanazawa, Bem, and others) has been too harsh. At least I can’t think of any examples. I’m willing to be convinced otherwise, though, so if you have particular statements in mind that you think have been too harsh, let me know here in comments and we can discuss.

Regarding flippant: Yes, I’m sure you’re right about that. Flippant can be fun, but it can piss people off unnecessarily. Simmons et al. wrote a paper called “False-positive psychology,” I wrote a post called “Low-power pose,” and so on. Clever, but maybe a net minus. It’s hard to say in any particular case—cleverness can aid readability and make writing more memorable, but it can send distracting messages. To put it another way, I wrote a long post entitled, “What has happened down here is the winds have changed.” That was clever but not mean. I guess I should work at that. Or, at the very least, I should recognize when flippancy can be mean, and use such an attitude sparingly.

I have a couple hundred posts in the queue, but for new posts I’ll definitely work on that. Was it a good idea to title my post “Low-power pose”? Probably not, as it sent the implicit (or maybe explicit) message that if you have published a study that, in retrospect, had major flaws, your work should be mocked. If I wanted to go for a pun, it should’ve been more neutral, for example, “Power pose study had lower power than we’d thought; how to move forward”? Such neutral framings are more difficult, but I accept the argument that the work is worth it. Or, to put it another way, if I can’t put in the effort to write a non-rude post (setting aside situations where rudeness is acceptable or even expected, as in some sportswriting), maybe it’s not worth posting at all.

      • So are you going to try changing your style of criticism? One thing that would be helpful is an analysis by you of all the data available so far. That would be a very useful discussion.

        When I reanalyzed the Cuddy data, I could see why people might think there is some evidence *in that data-set*; using Stan, the 95% credible interval for the testosterone going up in a high power pose was -0.8 to 12. It would be helpful to see all existing data in one article.
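        (A minimal sketch of the kind of calculation involved, for readers who are curious: the code below is in Python rather than Stan, uses a flat-prior normal approximation rather than the actual model, and runs on synthetic numbers rather than the real Carney et al. measurements; the group sizes, means, and spreads are made up purely for illustration, so the interval it prints is not the [-0.8, 12] interval mentioned above.)

            import numpy as np

            rng = np.random.default_rng(0)

            # Hypothetical change-in-testosterone values, one per participant;
            # these are synthetic numbers, not the actual study data.
            high_power = rng.normal(loc=4.0, scale=15.0, size=21)
            low_power = rng.normal(loc=-2.0, scale=15.0, size=21)

            diff = high_power.mean() - low_power.mean()
            se = np.sqrt(high_power.var(ddof=1) / len(high_power)
                         + low_power.var(ddof=1) / len(low_power))

            # Approximate 95% posterior (credible) interval for the mean difference,
            # under a flat prior and a normal approximation to the likelihood.
            lower, upper = diff - 1.96 * se, diff + 1.96 * se
            print(f"difference: {diff:.1f}, 95% interval: [{lower:.1f}, {upper:.1f}]")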

        • Shravan:

As I said, for new posts I’ll work on the style. Or, more precisely, on the messages I’m sending, including those messages I might not be intending to send. We’ll see how it goes.

Regarding power pose in particular, I doubt it’s worth analyzing all the data from the original study, given that (a) the data are so noisy and I suspect any effects are so variable that I doubt there’s anything there to find (it’s the “power = .06” problem that we’ve discussed many times in this space), and (b) Carney’s document on the study suggests there are many problems with the data, including information leakage regarding treatment assignment. Also, given the high variability of the measurements and all the possibilities for bias, I don’t think that an interval of [-0.8, 12] (or, for that matter, an interval of [0.8, 12]) is particularly informative.

Just to be clear: I have no problem with you or anyone else reanalyzing the data. It could be a useful exercise—and, who knows, maybe something will turn up. But I don’t really see much value in performing such a reanalysis myself. My impression on all this stuff—and I speak as an outsider, not making any claim to expertise in social psychology—is that there’s no evidence for anything consistent from power pose specifically, and I suspect that any advances in this area will come from much more granular within-person studies.

        • I feel someone should do the analysis, a statistician. This gives the social psychologists who are puzzled or upset by all the criticism something concrete to think about and maybe even respond to. E.g., a retrospective power analysis based on all available evidence would help.
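          (To make concrete what such a retrospective, or better yet prospective, design analysis might look like, here is a minimal simulation sketch. Every number in it is a hypothetical placeholder rather than an estimate from the power-pose literature; the point is only that when the assumed true effect is small relative to the measurement noise, power is low and a "significant" estimate can easily have the wrong sign, which is the "power = .06" problem mentioned above.)

            import numpy as np
            from scipy import stats

            rng = np.random.default_rng(1)

            true_effect = 2.0    # assumed true mean difference (illustrative only)
            noise_sd = 20.0      # assumed within-group standard deviation (illustrative only)
            n_per_group = 21     # assumed group size (illustrative only)
            n_sims = 10_000

            significant = 0
            wrong_sign_given_sig = 0
            for _ in range(n_sims):
                a = rng.normal(true_effect, noise_sd, n_per_group)
                b = rng.normal(0.0, noise_sd, n_per_group)
                t_stat, p_value = stats.ttest_ind(a, b)
                if p_value < 0.05:
                    significant += 1
                    if a.mean() - b.mean() < 0:  # statistically significant but wrong sign
                        wrong_sign_given_sig += 1

            power = significant / n_sims
            type_s = wrong_sign_given_sig / max(significant, 1)
            print(f"power approx. {power:.2f}; Pr(wrong sign | significant) approx. {type_s:.2f}")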

• I agree with Andrew Gelman; re-analyzing the existing data is not the point in many instances. As Gelman mentioned in one of his previous responses, p-hacking or the size/direction of a CI is not the cause but a symptom of the problems of many studies in psychology. The rigor of the statistical analysis is secondary to the design of the study – no fancy statistical approach will fix the problems resulting from the lack of good-quality data.

Unlike some highly controlled experimental fields like physics and chemistry, there are many opportunities in psychology for things to get muddled. For example, researchers often use a convenience sample of students; then the results get generalized even when the authors add the limitations in a generic statement. They neglect to collect all the data on possible confounders, or, when they do, they don’t account for the sample size necessary to include them in their models. I also noticed that statistical models are sometimes built without an in-depth understanding of the necessity of including a covariate or the consequences of including an unnecessary covariate. Then parametric tests are used whether their assumptions are met or not. These are just a few examples, but they collectively get in the way of being taken seriously by the scientific community.

These issues are not unique to psychology; however, psychology seems to be more vulnerable than some other fields for the reasons explained before. In fairness to the researchers in psychology, the problems do not always arise from a lack of good intentions; whenever a human component is involved in a study, things get complicated. Fortunately, there seems to be a willingness to improve their methods and approaches, and they are paying attention. I did not know about Stanford’s anthropology department splitting its areas of study so that some questions can be investigated without claiming scientific validity.

In short, this intellectual discourse is all great for the betterment of humanity; when we do good science, we all win.

      • Bob:

I think you and I are largely in agreement. I’m consciously trying to think about Dr. Cuddy’s perspective because I know I tend to have a reflexive negative reaction to mentions of TED :).

I’m inclined to think Cuddy is not acting maliciously. On the other hand, there’s only so long you can wait something out without being guilty of the sin of omission. I didn’t read Dr. Gelman’s blog when Cuddy came into the (negative) spotlight. Was there time/room-to-breathe between when Cuddy’s work was first scrutinized and when it became (from her perspective) a nasty personal fight that was lose-lose? I think that’s a worthy question that the NYT piece tries to raise, and even if Cuddy is in the wrong, is there a better, more hospitable way to pressure otherwise reluctant scientists to re-examine and correct their work?

        The incentives to not admit fault are so strong that it’s hard to imagine what that culture of open scientific review would look/sound like. I find Gelman’s writing to be well-reasoned, and maybe that desensitizes me to the idea that his attacks on other scientists are nastier than they need to be. But, as someone coming from the media side, I’ve wondered if in science — just like in media — being a bit edgy/rude is the only possible option to draw attention to errors, when the natural tendency is to ignore them or stall.

        I bet many scientists have a dim view of the mass media’s commitment to the truth. Peer review is completely impractical with the speed and process of news. Only top-tier organizations like the New Yorker have a dedicated fact-check team. But even when the internal editing process is lacking, journalism has its own strong incentive for producing truthful work: the people, businesses, officials, etc. named in the stories will complain passionately about errors, to you and to your manager. And they have the option of suing you. Even journalists who are lax with the truth will get their shit together when legal liability is on the table.

But when bad studies survive peer review, what party has real power and incentive to call out errors when they are later discovered, or to even scrutinize peer-reviewed studies in the first place? Most of the study’s subjects are anonymized and have no idea whether erroneous information about them has been published. Competing scholars might scrutinize, but it sounds like meta-analyses/negative results aren’t popular with journals.

Even when the errors are as egregious and numerous as they are with Cornell’s Dr. Wansink, somehow they pass the inspection of trained and honest researchers. The science community probably has higher standards for decorum and patience than the media does when it comes to industry criticism. But what if Gelman’s aggressiveness and rudeness are ultimately needed to drive past the understandable reluctance scientists have to damage each other’s work?

• Hate to sound flippant, but you seem to be the kind of person that this story was written for. The premise of the article is that Cuddy faces particularly harsh judgment for following research standards that were until very recently the status quo. The piece implies a reasonable question — if what Cuddy faces is what she deserves, are we also seeing lesser folks face statistical justice? Or is she being seen as the easy punching bag, putting her in an ugly corner where she’s lost all enthusiasm to revisit and advance her work? Now I think it’s fair to argue that Cuddy *should* face a lot of attention because she happened to have given one of the most popular TED Talks of all time, and more attention is naturally going to bring in more nasty trolls and hostility regardless of gender.

      Perhaps the NYT article was too gentle in describing the academic criticism Cuddy received. But you’ve gone on to be especially aggrieved at how Cuddy is portrayed. You say the NYT writer “seems to somehow downplay how much Cuddy has benefited” when the subhead of the entire piece says that she was young and “won big”, including a “prestigious job at Harvard”. The piece is rife with references to her viral fame and book success. At the end, the author even writes, “If Amy Cuddy is a victim, she may not seem like an obvious one” because it is definitely not obvious how someone with her fame and success, as described throughout the article, is a “victim”.

In arguing that Cuddy is living a sham to get that sweet, easy job in academia, you must have missed the part where she’s mentioned as quitting the job. Seems weird to think of her as an apologetic grifter who suddenly decides to quit a cushy lifetime job. Seems just as valid to give her the benefit of the doubt that she may be facing real distress about her role in science.

I happen to like Andrew’s blog, and so I’m likely blind to whatever is deeply flippant and offensive in his content. But if I had to guess for Cuddy, it’s not just that Andrew criticizes her, or that he criticizes her in a severely rude way, but that she’s the subject of a *lot* of mentions on his blog. I think the NYT reporter should have taken the time to count them. But as someone who doesn’t work in science, at a glance, I should be forgiven for thinking that Cuddy is a huge threat, if not the worst threat, to science’s integrity, given how much she’s featured here. But from Andrew’s perspective, I can see why he might not see it that way: many of the references are incidental and not overtly critical. And in any case, since Andrew’s mission is to stop the spread of sloppy pop science, he could argue that focusing on Cuddy (and social psychology in general) makes sense, as she’s one of the most popular of pop scientists.

      Andrew says he’s genuinely confused about which of his posts have been too harsh on Cuddy. Speaking for Cuddy again, I might guess that this is something she has in mind (and it was mentioned in the article I think):

      http://statmodeling.stat.columbia.edu/2017/03/28/association-psychological-pseudoscience/

Cuddy is mentioned as part of a lineup in which she fits alongside an outright, malicious fraudster like Diederik Stapel. I have to give Andrew the benefit of the doubt that, while he did mention Cuddy, the flippant remark that Stapel would belong at this conference is not so much an insult to Cuddy as to Fiske, whom Andrew seems to have more substantial reasons for disliking. But I wouldn’t expect Cuddy to see that context — when you’re put on the same page as a fraudster, why should you assume that the implication is anything other than that you are also a fraud?

I don’t agree with everything in the NYT piece, and in fact find it a bit frustrating overall. But I also believe that even if Andrew is sincere in his criticism being a tide that lifts all boats, with no desire to be a hurtful troll, he may still not realize where his actions/words are unintentionally dickish, and he may see ways to make his important message more widely accepted.

      • Hi Dan —

        The piece is fantastically well-written, and that makes it hard as well to pick it apart. But my overall sense is that the writer is rather sympathetic to Cuddy, and she betrays that in a few ways. The biggest is the writer’s insinuation that Cuddy’s fame isn’t really about power posing because her best-selling book did not have much actual content about power posing. That seems like something the writer picked up from Cuddy herself, but anyone who works in publishing should know that that is really a red herring. Cuddy never would have been offered a lucrative advance to write a book like that (nor would it have taken off) if she hadn’t amassed a large online following as a result of her Ted talk. You can’t separate these things and pretend like Cuddy’s fame is really due to her personality. Certainly that helped, but she has directly and personally benefited from discredited research, and we’re not just talking about a trophy or an award–this is almost certainly in the hundreds of thousands of dollars.

Second, the writer mentions in passing that Cuddy hasn’t replicated her own experiment, but seems to give her a pass by not digging into why that is the case, while she zeroes in on Gelman’s refusal to meet with her in person. We are left with the impression that the former is idiosyncratic but the latter is intentional obfuscation. Having a tenured position at HBS (which is still listed on Cuddy’s LinkedIn) should give her the resources to rather easily replicate her own experiment, especially as she should have all the protocols on hand.

        Again, no one forced Cuddy to take these high-profile positions or to give Ted talks. If you are going to go so high, you are going to face criticism. Maybe she has faced sexism as well, but it seems to me that this kind of criticism is almost certain if someone is benefiting so disproportionately from poorly designed research.

And frankly, I disagree with the generous assessment that Cuddy and others were following “standard research practices.” People have known for years about these issues and that people cook their data. There have also been people who have taken extraordinary care to avoid false positives and test every nook and cranny of their dataset, which I don’t think Cuddy et al. did. That’s precisely why people don’t trust published research findings until they see a lot of confirmation over time. Virtually everyone I know in medicine, for example, discounts published results when they first come out, and my colleagues in the social sciences are just as willing to doubt published research. If we in the field doubt statistical findings, we are disingenuous to present the findings as established fact to outsiders.

        • Bob –

          I agree with your points but I also think that the Stapel comparison is unwarranted since Cuddy did not seem to sit at her kitchen table and generate data from scratch. Her issue appears to be a lack of rigorous research methodology knowledge, which seems to be a systemic problem in her field.

        • Hi Ayse–

I don’t think social psychologists ever really defended these research practices; they just turned a blind eye to them. Although Gelman has given a more statistically rigorous account of how this works out (the garden of forking paths), it’s no different from what many people do in qualitative research by cherry-picking citations and other difficult-to-detect shading of the truth.

Still, creating data from nothing is blatant fraud, which, to my knowledge, is not something Cuddy is accused of doing.

        • I agree that this wasn’t just a matter of following “standard research practice” that subsequently changed. Cuddy went much too far in subordinating the research to an agenda–in this case, the agenda of empowerment of women. The article shows how attached she was to the results; when the Ranehill replication results came in, she reports thinking, “Oh, bummer”–which could be anyone’s response, but which she turned into a battle to salvage the study. She began to emphasize the effect of power poses on feelings of power–and to this day points to the successful replications of this effect–without considering the complexities of such an effect. (As others have asked, how long do such feelings last? Are they always good? Is there a possible downside to them or a subsequent swing in the other direction? Might these feelings be a result of suggestion? Etc.)

          It is fine to strive to empower women, or to give them tools for their own empowerment. It is fine to conduct research into a topic that might contribute to such empowerment. But link the two prematurely, and you’re in for trouble–not from others but from your own intellectual subordination. This goes for any research that serves an external agenda.

      • Regarding the status quo/rules changing – I have often felt almost survivor’s guilt for being born at the right time. If I were 10 years older, I definitely would have been all for exciting, media-worthy results, and I’m not very good at statistics, so I probably would have ended up publishing (and, possibly, hyping in the media) something not-so-justified. This is why I have a lot of sympathy for researchers who did something like that. But then again, it would not be right to let everyone believe that the unfounded results are backed by science.

  12. I agree with many of the points that you’ve raised about trends and institutional pressures in science, and appreciate your openness and responsiveness to the commenters here. Like others, I find myself taking issue with your tone and self-assigned role. Many of your points are presented as objective, including what other people and fields *should* do (e.g. ‘the challenge for Cuddy’, ‘good/bad science’). That, combined with what comes across as a bit of a savior complex (e.g. defending how you’ve ‘helped’ social psychologists, ‘think a lot about how researchers can do better’) can be… odious. I agree with your comment above about how publications should be open to criticism from, well, the public, but I think there are ‘standards of care’ to which one is obligated. Engaging with and attempting to understand the implicit bounds and aims of a field/researcher/paper, instead of couching one’s implicit assumptions/aims as objective, would be one such standard that I endorse. To your credit, it seems you feel you’re reaching out to the community, but there may be a difference between your and the community’s perceptions of that outreach. Unfortunately this sensitivity can be seen as getting ‘feelings’ involved in science, but the reality is that science is a consensus between humans, and human communication and understanding is underpinned by implicit rules and standards which, although arbitrary, aren’t meaningless.

    Unfortunately, the script of a man telling a woman that she’s wrong (*especially* online, and *especially* couched in ‘rationality’) runs deep. Maybe you see yourself as ‘protecting science’; others might not.

    • Kevin:

      Regarding what you call my savior complex, or the odious statement that I’ve helped social psychologists: I’m responding directly to Cuddy’s quoted statement toward me: “Why not help social psychologists.” So I feel like I’m damned if I do, damned if I don’t. Cuddy asks why I don’t help social psychologists; in response I list many examples where I have helped social psychologists; and now I get diagnosed and called odious.

      I actually do think it’s statisticians’ job to help applied researchers, so I wasn’t bothered by Cuddy asking me to help social psychologists; I just thought it was unfortunate that she didn’t seem aware of all the work my colleagues and I have done in this area.

      So if you really think it’s a “savior complex” for me as a statistician to want to help, you have an argument with Cuddy, and for that matter pretty much the entire field of statistics, not specifically with me, as in this particular case I’m doing exactly what’s expected of me as a statistician!

      Regarding your statement, “there may be a difference between your and the community’s perceptions of that outreach”: Remember, Amy Cuddy is one person. She’s not the community! My colleagues and I have received lots of positive feedback about our outreach efforts to psychology. I agree with your general point that science is a consensus between humans, but I think that in your focus on this particular newspaper article you’re missing that the work my colleagues and I do, and have published in psychology journals, is part of this consensus.

      Regarding your last point, I refer you to this comment and others on this thread.

      Let me also point out that you seem to be quoting me as saying things I’ve never said, nor would I ever say! You write about statements “couched in ‘rationality’” but I have not referred to “rationality” in this discussion, and you write “Maybe you see yourself as ‘protecting science’” but I’ve never said that nor do I see myself as having this role.

      In any case, I appreciate your comments. Although I disagree with you on the specifics, we can surely agree that I am having some difficulty in communication if my perception of what I wrote, and your perception of what you read, are so different. So at the very least I need to work on communication in some way.

      • For a social scientist, you lack any semblance of reflexivity whatsoever. The key line for you to contemplate from the NYT piece, Dr Gelman, and one which you have studiously avoided throughout this blogpost, and your many replies, is your quote, which you have not rescinded, “I don’t like interpersonal conflict”.

        On the contrary, it looks very much from all of this that in fact you do like interpersonal conflict – but only on your terms, firing missives from your blog, but not in person.

        • Anon:

          We each have to do what we do. I actually don’t see these discussions as interpersonal conflict; I see them as discussions of ideas. I think it’s helpful that blog discussions are out in the open. In email people can be unspeakably rude, and they can also dodge questions. On the blog there’s space for open discussion. I guess this is a form of interpersonal conflict, so I was not being precise in that particular quote. Another thing I said to Dominus during the interview was that I see the discussion to be about science (and, I guess this meta-discussion is also about communication), not about personality. As I wrote above, I have no animus toward Cuddy and, for that matter, I have no reason to believe that she has animus toward me personally either.

          Anyway, Cuddy and power pose aside, let me just say that I do like vigorous discussion—even when it’s painful to me, as this is here, I find it useful–but I don’t like when things get personal. That’s just who I am. Different people have different forms of interaction that they like. Some people love twitter, I hate it. I love blogs but other people find blogs to be logorrheic. And so forth. I say all this not to dismiss your comment but to put it into some context that seems valuable to me.

        • FWIW, I asked Dr. Gelman about that quote:

          http://statmodeling.stat.columbia.edu/2017/10/18/beyond-power-pose-using-replication-failures-better-understanding-data-collection-analysis-better-science/#comment-590661

          > Regarding “the idea of trying to persuade her, in person”: Given that she hadn’t been persuaded by the direct evidence of Ranehill et al., and she hadn’t been persuaded by the very clear arguments of Simmons and Simonsohn, I didn’t (and don’t) see any reason she’d be persuaded by me! After all, I wasn’t really offering any new arguments; my contribution, such as it was, in my blog posts and Slate article (coauthored with Kaiser Fung) was to report the Ranehill et al. and Simmons and Simonsohn articles and add some perspective. So I’m not really sure how the conversation would’ve gone, given that Cuddy had already seen those things and was unpersuaded.

          > Just in general I find it easier, and maybe more productive, to present my perspective, address arguments that come in, and consider how I can learn. Direct persuasion rarely works and is stressful, which I guess is what I mean when I said I don’t like interpersonal conflict. Anyway, each of us has our own style of interaction. Here I am responding to blog comments at 5 in the morning, something I don’t usually do!

          I think it’s reasonable to argue that from a logistical viewpoint, it is unnecessary for him to deliver his criticisms in person and unnecessary for Dr. Cuddy to respond to them in person. Maybe there’s a case to be made that Dr. Gelman would be less of an asshole/more willing to empathize (from Dr. Cuddy’s perspective) if he knew Dr. Cuddy as a real-life person. But the implication is insulting to both Dr. Gelman and Dr. Cuddy — that Gelman’s critique is so fundamentally flimsy that a handshake, smile, and chat over coffee is all that’s needed to make him realize his error. And that Dr. Cuddy’s research does not stand on its own as science, but requires her critics to be personally charmed by Dr. Cuddy to fully appreciate it.

        • I also think that Dr. Gelman may just be a person who is, as he seems to say in the quote, not super comfortable with getting into a one-on-one discussion with somebody when it’s quite likely the discussion will be contentious. Some of us are just awkward. I don’t know Andrew personally, though I have seen him give a talk, and it wouldn’t shock me if he just doesn’t love the idea of getting into confrontations with strangers.

    • Kevin:

      Let me follow up because you raised some interesting points. I was thinking more about the “savior complex” thing, and I’d like to elaborate.

      I think my job as an applied statistician is to help applied researchers, and my job as a researcher of statistical methods is to help applied researchers including people I may never even meet, people who will read my articles, my books, and, yes, even my blog posts. I think these are my roles, not my “self-assigned roles” but my generally accepted roles as a statistician. Yes, I gave Cuddy (and, by implication, a world of researchers, male and female alike) advice on quantitative research, just as other times I’ve given such advice to Satoshi Kanazawa, Brian Wansink, and many others.

      I think that my pointing toward research directions on this blog, in the context of other people’s existing research programmes, is no worse than Amy Cuddy giving inspirational advice to strangers in her Ted talk.

      Actually, let me state this more clearly. I think it’s a good thing for me to give such advice on blogs, and in books, and in research articles, and I also think it’s good for Cuddy to give such advice in her Ted talk, and in her articles, and her books, and in other public forums.

      I and others have made fun of Cuddy’s Ted talks, and I regret that. I think we were wrong to mock Ted talks. Yes, I have specific disagreements with the claims that Cuddy has made, and I have disagreements with what I and others have seen as attempts to frame speculations as solid science, but those are just details. In principle, I think Ted talks promoting useful scientific ideas are good. Conditional on Cuddy believing her scientific claims (again, let’s set aside the science dispute for now), it’s admirable for her to go out there, put herself on the line, and promote these ideas. Again, conditional on power pose being a generally helpful intervention, it’s brave and admirable of Cuddy to promote it. So, again, I was kinda missing the point when mocking the Ted talks.

      I don’t think that Cuddy giving Ted talks is evidence of a “savior complex” on her part: I think it means she wants to do good, to spread the word, to help people, and she’s willing to have the exposure. Maybe she likes the speaking fees too, but that’s fine. I like my consulting fees. The fact that an activity is lucrative or gives one fame should not be taken as evidence that there’s something wrong with it.

      Now, again, you can disagree with me on the details. You might well feel that it’s a bad idea for me to be giving particular statistical advice on the blog if you think it’s bad advice, just as I feel that it might be a bad idea to promote power pose in a Ted talk if power pose doesn’t really work. But then the problem is with the specifics of the message, not the delivery mechanism, nor the fact that Cuddy and I each, in our own way, (a) believe we know what we’re doing, and (b) would like to share our understanding and improve the lives and work of others.

      Funny that this discussion should end up with Cuddy and me on the same side!

      As always, I really really appreciate everyone’s blog comments: the ones that agree with me, the ones that disagree with me, the ones that are completely offbeat, even the ones in which a commenter says the awkward thing that he finds some of my behavior to be odious. It’s all a learning experience, and you never know what will come up. Even if in this case I had to stay up all night until I finally realized that I think Ted talks are great.

      • Thanks again for your response Andrew, I respect all the work that you’re putting in to respond to me even though my due diligence on your other works and efforts is probably lacking. And you’re in the tough role of being the person seeking technical corrections, which unfortunately fits with other narratives (as I was trying to point out at the end of my original comment).

        To your comment above, even if there are no explicit appeals to ‘rationality’ and ‘science’, I’m referring to the way your arguments come across as pursuing ‘best practices’ or ‘good science’ full stop. The idea of an unassailable truth machine called science is something I think we would both take issue with. I’d agree that there are certainly benefits to the developments that you’re pushing, and to your credit you do occasionally spell the benefits out as (roughly) ‘improving repeatability’ and ‘generalizability’. But I think it’s counterproductive to spread a belief in ‘good science’ instead of ‘appropriate science’. Although probably everyone would agree that improving repeatability and generalizability are key to any reasonable version of ‘good’ science, I take issue with the deification of science – divinity tends to suffocate critical analysis. Again, this is probably an unfortunate side-effect of the limited space and need to use shorthand, but the way your arguments were presented in this post and the several others I read gave me the impression that there’s a goal of ‘good statistics’ rather than a set of tools that provides certain outcomes.

        And I agree with you – advice and sharing perspectives (here, or otherwise) is good! And I’m glad you’ve gotten good feedback. I’d nitpick that there are substantially different reader expectations on your blog vs. a TED talk. If someone is reading something in a paper or a technical blog written by a tenured professor, they take it less as opinion and more as fact. (Hopefully) TED talks are taken more for entertainment or inspirational value.

        • Kevin, in regards to the tone of the terms “good science” and “bad science” this may be an unfortunate result of the fact that the readership for this one post is so much broader than the typical readership here. “Bad science” is being used as a shorthand for practices that get discussed frequently on this blog, in extensive detail. And there is certainly nothing near consensus on what constitutes “good science”, only that it isn’t going to involve certain popular statistical practices that produce outcomes most of us would regard as clearly bad (e.g. systematic bias in effect size estimates, non-replicability, formally invalid interpretations of statistical results). But for a reader who isn’t familiar with these discussions, I can see how words like “good science” and “bad science” are going to sound awfully high-and-mighty.

        • Ben, glad to hear there’s more explicit discussion of ‘bad’ and that’s what draws a lot of people and discussion to this blog, and yeah, good and bad are necessary shorthands that sound high-and-mighty to those who don’t know the working definitions for that community. But I think that’s exactly where my issue comes from. Ideally these consensuses should be built by the broader community, not (respectfully) a blog community. But, of course, many readers here are also engaging through venues where there’s opportunity for response and discussion from the broader community (and ‘the accused’). Probably one issue here was that the community is *not* happy to engage in these discussions, and consensus is *not* moving, and then I can understand the motivation for finding a corner of like-minded people.

          And that’s not to say productive conversations can’t happen in less-formal contexts, or that ‘peer-review’ and conference presentations are flawless and that science must be completely dead-pan. But, I wonder about the impact of a local consensus being reinforced in a blog where opinion, jokes and funny pictures are mixed with more objective research. To Andrew’s credit, it seems he links out to ‘serious’ venues for the science-ier bits, but it seems there is a wide range of worldview consensus building that goes on here, some of which is at a less-explicit level.

        • Hi Kevin, I do see what you are saying. My response is that there is lots of both – discussions about “bad” methods happen here but they are also happening in journals and at conferences. Dr. Gelman has published dozens of papers on the subjects he brings up here, and they are all calmly and carefully worded and they don’t use shorthand like “bad science” (at least as far as my memory serves). Many of them are written for practicing researchers who don’t have much statistical background:

          http://www.stat.columbia.edu/~gelman/research/published/
          http://www.stat.columbia.edu/~gelman/research/unpublished/

          As far as points of consensus go, there are some. I don’t know that there’s a statistician alive who thinks that the practice of analyzing data flexibly and then only reporting significant results is defensible. But it is a common practice. I don’t think there’s a statistician alive who would deny that reported effect sizes in a body of published literature will be biased upward if statistical significance is being used as a prerequisite for publication (a small simulation sketch at the end of this comment illustrates the point). But is there widespread recognition of this? I don’t think there’s a statistician alive who thinks that the p-value is the probability some null hypothesis is false, or the probability some substantive research hypothesis is true, but these beliefs are commonly held by people who report and consume p-values.

          So to some extent, the phrase “bad science” is being used to refer to misunderstandings of objective facts. There is of course plenty of room also for disagreement regarding what is “bad”, and people here will disagree a lot.

          You say that these consensuses should be built up in the broader community, and I would love for that to happen. That is the big goal of those of us who write about and discuss “bad” methodology. As it stands, what exists as consensus among statisticians and what exists as consensus among people who use statistical methods often disagree. But we’re trying to get the message out. I imagine that if every blog post here had the same audience as this particular one, some of the language we use would change.
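
          Here is the effect-size sketch mentioned above. It is illustrative only: the true effect, sample size, and number of studies are made-up numbers, not drawn from any real literature. Simulate many small studies of a modest true effect, keep only the ones that reach p < .05, and the surviving estimates are systematically too large.

            import numpy as np
            from scipy import stats

            rng = np.random.default_rng(1)
            true_effect = 0.2        # true mean difference, in SD units (a small effect)
            n_per_group = 30         # a typically underpowered study
            n_studies = 10000

            estimates = np.empty(n_studies)
            significant = np.empty(n_studies, dtype=bool)
            for i in range(n_studies):
                treated = rng.normal(true_effect, 1.0, n_per_group)
                control = rng.normal(0.0, 1.0, n_per_group)
                estimates[i] = treated.mean() - control.mean()
                significant[i] = stats.ttest_ind(treated, control).pvalue < 0.05

            print("true effect:                 ", true_effect)
            print("mean estimate, all studies:  ", round(estimates.mean(), 3))
            print("mean estimate, p < .05 only: ", round(estimates[significant].mean(), 3))
            # The average over all studies is close to the truth; the average over
            # the "significant" studies is roughly three times too big in this setup.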

        • My reply is awaiting moderation, but in the meantime I just saw that I got an example of “bad” p-value interpretations backwards. How embarrassing. I should have referred to the p-value being misinterpreted as the probability the null is true, or the probability some substantive research hypothesis is false.

        • Kevin said, ” But I think it’s counterproductive to spread a belief of ‘good science’ instead of ‘appropriate science’”

          Could you please give your definitions of “good science” and “appropriate science”?

        • Martha – the concept of ‘good science’ that I am challenging is the worldview that under a universal, decontextualized set of practices, ‘good results’ are found (and perhaps interpreted as ‘truth’). I’d say ‘appropriate science’ is that which makes explicit the context and aims of the investigation, motivates the choice of tools, and acknowledges the limits on how the results can be interpreted.

        • Kevin,

          Thanks for the clarification.

          I consider “the worldview that under a universal, decontextualized set of practices, ‘good results’ are found (and perhaps interpreted as ‘truth’)” to be a recipe for “bad science” (but there are other types of “bad science” as well).

          To me, “good science” is somewhat similar to what you call “appropriate science”. As a rough stab at a definition:

          Good science depends on the context and aims of the study. It explains why the methods used were chosen (including discussion of why they are more appropriate than other methods), and explains the limitations of the study, the analysis, and the results. In particular, good science explicitly points out areas of uncertainty and the implications of these for interpretation.

  13. Andrew, how about an experiment: the next time you find something to criticize, couch it in language that cannot possibly be taken to be offensive or condescending. A lot of people are criticizing you for the way you critique these papers. Suppose the critique had the tone of a comment submitted to a journal?

  14. A question, because I’m a bit confused: are all the comments headed “Anon” (or Anonymous or similar) from the same person? I’m struggling to make sense of some of the discussion.

  15. Counterpoint: who represents the researchers who have been crowded out by the evidentiary standards that select for Cuddy/Bem/Kanazawa types? What about the research discouraged by gatekeepers like Fiske? It seems like the benefits of deference and delay only run upstream.

    • Exactly. This is always the problem with “tone policing.” It guards the privilege of the in-group by prioritizing that group’s feelings and sensibilities over the legitimate grievances of people outside the group. It is smarmy and superficial. In short, rock on, Andrew.

      • Thanks.

        I posted about the need to address guarding the privilege of the in-group in the previous post (it was held up for approval because of the links, which I’ve removed here).

        Here it is without the links:

        “Up-and-coming social psychologists, armed with new statistical sophistication, picked up the cause of replications, openly questioning the work their colleagues conducted under a now-outdated set of assumptions.”

        Steve Goodman raised this as a (necessary?) step in changing the methodology a field currently embraces, at the recent ASA symposium.

        That was my experience in clinical research in the 1980s and 1990s: newer, ambitious clinicians took senior roles away from those who held them, using randomized trial methods, statistics, and health economics. There will be very personal losses and gains, so politics necessarily enters – those “whether I embrace your principles or your mistress” exchanges. But the economy of research (Peirce, C. S., 1879, “Note on the Theory of the Economy of Research”) takes no prisoners, nor can it afford to.

        Note also that the soon-to-be-published meta-analysis linked in the paper mentions three levels of evidence: evidential value, clear evidential value, and remarkably strong evidential value. The full paper is not yet available, but in my 20+ years working on meta-analyses of clinical research I don’t remember ever getting clear evidential value, except perhaps for side effects.

    • Exactly. Relatedly, some years ago, I was about to put a lot of effort into a research plan that would have been heavily based on goal priming. I would have used this plan to apply for the number-one grant available to early-career post-docs in my country. It was, obviously, very, very important for my career (and livelihood) that this plan was as good as possible. When the priming non-replications came out, I changed my plans. I am, again, so, so grateful to those who conducted the replication attempts and to those who spread the word about the replication failures. Otherwise I would have wasted a year of my career and a lot of energy and resources.
      I’d like to ask those who criticize “naming names” – do you have any sympathy for the likes of me, who were (and are) in a much less advantaged position than, say, John Bargh, and who have been helped tremendously by the replication movement and blogs like this?

      • +1. I am another careful, thoughtful researcher whose ability to do sound science has been immeasurably helped by the clear, important work done by Andrew on this blog as well as by Simonsohn, Simmons, Nosek, Ioannidis, Meehl, Dawes, and many other people helping us apply statistical concepts well.

  16. I want to comment at greater length but am on the bus right now. Andrew, I thought your response was considerate and thorough. I agree, also, that the NYT article was fair (except for what you noted). Still, I think it could have done more to distinguish legitimate criticism from personal attack.

    I am not a fan of the TED talk. I find it logically disjointed; what’s more, I think it starts out with a big error: a reference to a Nalini Ambady study that, from all I can see, does not exist. It seems that Cuddy conflated two existing studies. This would not matter much except that it leads directly into her argument. I wrote to her months ago to ask about this and did not get a response. I could be wrong–the study might exist–but I see absolutely nothing in Ambady’s bibliographies suggesting a study that relates doctors’ *body language* (rather than tone of voice) to their litigation history.

    And then there’s that business of “share the science” at the end, which to me encourages a superficial view of science.

    Cuddy is probably kind and charming. But as a scientist she has a responsibility to respond to the actual content of critiques. Instead, I sense a “this has been so terrible” emphasis, which, while understandable from an emotional perspective, does not advance the discussion.

  17. I hope this is not an attempt to silence criticism of bad science. Whether this is the case or not, I hope it will not make people like you reduce methodological criticism. Science progresses via refutations and criticism. The more intense the criticism, the sooner false (and possibly money-, life-, and time-costing) claims are refuted. Getting bad science (especially when it is highly influential with the public) out of the way asap is far more important than the impact this might have on the feelings of the individuals involved.

    I am also saddened to hear that some researchers who are well respected in their field got the impression, from reading the NYT article, that Cuddy was attacked by a “social media mob” of which you were a part. If this is the impression the article has left on many intelligent people, it is probably not a fair article to you. People do not necessarily follow your blog, and their opinion (unlike yours) is not conditional on knowing what has been said.

  18. Andrew,

    I think the main grievance is that critiquing poor research practices does not require invoking the same short list of researchers and studies over and over as examples of bad practice. It is personally harmful to these researchers, and (even if they respond defensively to criticism) I don’t believe they “deserve” the personal suffering, so if it does not greatly benefit your scientific aims it should be avoided.

    The thing is, while perhaps not strictly necessary, I think that having concrete examples to represent general ideas *does* greatly enhance the clarity of your critiques. Repeating examples gives the community of your readers a useful shorthand. And you do seem to put mainly researchers who rejected valid criticism on the list. (Not that I think personal suffering is the right penalty for rejecting criticism, but if it has to happen to someone it should be them.) So I personally think that your tone/style is morally fine, though I do cringe for the authors in your roster sometimes.

    One point, though. Sometimes you only refer to studies themselves (e.g. ovulation, himmicanes) and sometimes you actually mention names of researchers (e.g. Wansink and Cuddy). Wansink is a repeat offender, so he should be mentioned to refer to his entire body of work. But why Cuddy? Because she criticized Simmons? Why not just always refer to the study names to keep things less personal and maybe also personally harm researchers less? Or maybe I’m just a scientific discourse pacifist when military intervention is needed…

    • To play devil’s advocate: Perhaps what sets Wansink and Cuddy apart is the commonality between their studies: they all have massive QRPs, and these people *defend* those practices, despite those practices having been shown empirically and mathematically to be problematic.

      Sometimes there are ridiculous papers (Himmicanes). Sometimes there are entire lines of problematic research whose commonality is the author, not the topic.

      Yes, the problems within the work should be targeted and addressed. But if multiple studies from the same author have the exact same problems, and the authors defend it, it isn’t just “the work as it stands by itself”, but the repeated manifestation of bad practices from the researcher behind it.

      Do I think authors should be shamed, pushed out of academia, scolded, etc? No, of course not. However, at some point criticism /does/ need to shift from “your paper has a few confounds and bad practices” to “your practices are generally problematic, as is evidenced by all of your papers”.

      Every paper has mistakes. Every paper probably contains at least one bad practice, or at least a practice that in the future may be discredited. We’re humans; we make human mistakes, we made logical mistakes, etc. But when a method has been discredited, a good chunk of the field has discredited it, and many of your papers are consequently discredited, and replication efforts have failed to replicate your work, and yet you *still* continue to engage in those practices, defend those practices, and sell books/give talks as though those works are not under a threat of credibility, don’t you think it’s time to address the person, not the papers by themselves?

      This is what happened to Wansink as well. Multiple papers have issues, and many papers have similar issues: wrong age groups, impossible values, findings resulting from deep dives. If Wansink had even humored the idea that “oh, shoot; you’re right, my bad. I’ll revise those and tell those who depend on my research what has changed”, it would have been a non-story — or if it were a story, it’d be a positive one. But instead, Wansink doubled down, said deep data dives are justified, and failed to really acknowledge the problems in each paper or even in his entire approach to analysis and practices. Since then, there seems to be some movement – he voluntarily retracted a paper, which is *good*, and so far I’ve seen positive responses to that.

      Perhaps, sometimes, scientific problems *are* due to the person, not just mistakes or problems within any particular paper. That’s all I’m saying, from a devil’s advocate perspective. Repeated problematic practices, refusals to change papers or conclusions, doubling down on bad methods/analyses, more or less ignoring replication failures, and still giving talks/writing books about how you are correct mean there is something wrong with you *as a scientist*, because you’re ignoring empirical and mathematical counter-evidence altogether.

      I repeat – no one should be berated, scolded, pushed out of the community, shamed out of the public, etc. That’s terrible. That’s also not what I think AG is doing, though, nor most academic critics. It’s more a matter of imploring her to own up to the counter-evidence and stop exploiting noise.

    • This is not exactly a response to Z’s points, so sorry about that, but it’s related.

      I think many people (esp. the NYT commenters) critical of this blog and Simmons & co. etc. are upset for Amy Cuddy because they see that her career and reputation have been destroyed by a “witch hunt” or something like that. Now, I’m sure all this has been incredibly distressing for Ms. Cuddy and I genuinely feel bad for her because of that. However, I have come to think that if the popular media had not picked up the story, and picked it up in a way that ensures a lot of clicks and media buzz, the whole thing might have gone very differently. It seems plausible that if the criticism had only appeared in statistics blogs and conferences, the damage done to her reputation would have been minor, and she might still have a fair chance at a scientific career. So, I think it’s a bit unfair to blame statistics blogs for Ms. Cuddy’s fate. I think popular media is much more to blame (if we want to distribute blame).

      I also think Ms. Cuddy was caught in a very unfortunate position historically. She did not yet have an established career/position, like Fiske, Bargh and others, but she started doing independent research well before the crisis, at a time when e.g. Psych. Science explicitly asked for exciting, novel results (or something like that, I don’t remember the exact wording of their previous motto). So yes, to some extent she has been placed into the role of a scapegoat, but I think it’s too simplified to say that statistical bloggers chased her and destroyed her. It’s also too simplified to say that she did bad science and got what she deserved. Historical and other non-individual forces were at work here.

      • > I genuinely feel bad for her
        That is understandable and I was too – though less so after I saw the abstract of her new meta-analysis with the remarkably strong evidential value claim …

        An analogy that came to mind was that of an accomplished surgeon in the late 1800s with a skin condition on his hands, so that he could not wash them properly, just as Semmelweis’ work was becoming accepted. Yes, we feel sorry for the surgeon, but we don’t let him continue doing surgery if he can’t adequately wash his hands. Most of us feel even sorrier for Semmelweis!

        • Yes, you’re right of course. I haven’t seen the meta-analysis but it doesn’t sound cool. But I feel sympathy because she’s probably been through hell, psychologically. Even if it was largely her own fault, it’s not nice when people suffer.
          I think though that she may still genuinely believe that the power pose affects subjectively experienced power (I haven’t followed the replications closely enough to know whether this is in any way justified).

        • She might be right that the power pose affects subjectively experienced power and Andrew has repeatedly said that.

          She just seems not to have anything of even weak evidential value, let alone remarkably strong evidential value, nor has she yet convinced anyone considered to be a credible judge of that.

          As JG Gardin succinctly put it, you cannot rule out a hypothesis based on how it was generated (e.g. noisy, biased data), but you can and should choose whether to spend your time on it or not. (At this point it’s an economy-of-research question rather than a do-we-have-enough-evidence-to-be-convinced-it-is-or-is-not-true question.)

  19. The critical element missing from the NYTimes’s article was the cause of the revolution. People aren’t angry because Cuddy is selling pop-psy self help books. They’re angry because the issues they care deeply about (public health, education, cancer, law, policy) require evidence and the scientific literature, the stream upon which we depend for that evidence, has been heavily polluted by the product of poor methodologies and worse interpretations. They’re angry because courts and legislators and CEOs keep making bad decisions that cost time, money and lives despite a great many Mayos and Greenlands and McCloskeys and Gelmans and Ioannidis politely saying “You keep using those words ‘statistically significant’. We don’t think they mean what you think they mean.” Given the consequences of bad methods and their misinterpretations I’d say Andrew’s commentary has been remarkably restrained.

    P.S. Get some sleep.
    P.P.S. The Texas Supreme Court recently published a very dumb opinion on the interpretation of “statistically significant”. The entire opinion turned on a single, never-replicated study from 14 years ago. And here’s something to brighten your day: one of the experts, I kid you not, was named Dr. Null.

  20. I don’t think being flippant is bad, and I don’t think it’s reasonable for people to get mad about it and act like they’re being personally victimized by blogposts titled “low power pose” or “what’s happened down here is the winds have changed.” Dominus’s article is very well written and includes the sentence “The power pose became the sun salutation for the professional woman on the cusp of leaning in.” What’s that if not flippant?

    I think a little bit of acrimony can be good for academic debates, it makes people pay attention and it gets rid of what Andrew calls happy talk. In my own field, you generally know that some people will disagree with your research because of previously existing disagreements, and maybe that means there’s less nonsense (in fairness there still is a lot of nonsense).

  21. Andrew,

    I appreciate your repeated statements that a scientist using bad data doesn’t make them a bad person.

    As a long-time reader, I have been bothered that you often use the personal names of the researchers as a byword for bad science. Yes, they make mistakes. I would _strongly_ prefer that you stick to the papers when critiquing the work, not using the individuals’ names. Using the names is simply too close to a personal attack rather than the scientific critique that I know you mean it to be.

    I’m sure it would be easier for these researchers to renounce some of their claims if their names weren’t so strongly associated with the defense of the claims. It could feel like one’s own personal worth as a scientist depends on defending the scientific claim — a much higher-stakes issue for a person than whether a particular paper’s thesis was correct. Perhaps criticizing the position of “Power Poses” (2015, PS) rather than the position of [personal name] could help scientists more willingly distance themselves from discredited papers.

    I think it would also make for a more amiable and pleasant scientific community, a very valuable thing.

    I would really appreciate your consideration of this point.

    • As a rule for criticizing papers, this is good.

      Setting this case aside, what should be done about authors who continue to display a pattern of bad scientific practices across papers, or who double down and knowingly mislead (bad science communication = bad science)? When it reaches that point a blanket rule against naming names seems problematic. Trust plays a role in science, and if someone reveals that they may not be trustworthy, shouldn’t that be pointed out?

        • ZC-

          I am not sure what the motivation was, but I doubt appearances were a consideration. The dead horse seemed really, really dead from my perspective, but people kept riding it. My guess is that Andrew found it annoying that people wouldn’t admit their errors, and the annoyance kept him writing about it with blog posts like this. My read is that one goal of this blog is to impact how social science is done, and Andrew repeatedly calls out individuals and institutions in the belief that the prestige of his platform may get them to change their ways. I am not sure if this is the right way to do it, but I see it as advocacy—we are most effective at changing what’s wrong in this world for the things that (1) instinctively bother us, and (2) we are positioned to do something about. That said, I can see how someone who has read just a few blog posts may see it as smug.

          For my taste there are too many posts about scientists behaving badly, but I may not be representative. I am not sure how he gauges audience interest; he doesn’t have advertisers.

    • From what I see though, AG does talk about papers when papers are the problem, and people when people may have a problem.

      Having one power pose article that is questionable is one thing.
      Ignoring replication failures, continuing to engage in practices well described as being problematic, and persistently overselling an effect despite multiple scholars failing to replicate it, pointing to problematic practices, and demonstrating that the supporting literature is weak – that’s not an article problem, no? That’s a person problem.

      It seems consistent, to me, that AG and others here tend to focus on singular papers when they’re problematic, and on persons when they engage in questionable practices persistently (e.g., Wansink). But that can actually make sense – there’s maybe a point when any given paper isn’t problematic but a scientist’s actions are, and therefore it’s worthwhile talking about the practices more so than any particular paper, and that requires using their name.

      I don’t see these as personal attacks, really. But I’m just one opinion, who knows. Obviously others do.

  22. Dominus’ New York Times article is excellent. As a story it is written in a way that allows people to easily grasp the core issues behind the crisis in social science. The only weakness is the peripheral issue of whether Cuddy is right to attribute unfair/bullying treatment to Simmons, Simonsohn, or Gelman specifically. Cuddy makes some statements about *feeling* set-up by Simmons and Simonsohn, and about *feeling* bullied by Gelman. Whether or not these feelings reflect reality (I see no evidence that they do), Cuddy did not deserve the ugliness she received on some corners of the internet. My read is that some of that ugliness had sexist undertones, but I can’t point to any specifics.

    As a relatively frequent reader of this blog, my impression is that Andrew Gelman was not the source of the ugliness. He tries to keep it fun and light, and usually pushes back against commenters who impute malign intent to the objects of his criticism.

    While I am not sure if Andrew’s frequent harping on the usual suspects (Cuddy, Wansink, Bargh, etc.) is the right way to go, the pattern I see in his naming-names refrain is that these are researchers who have stubbornly stuck to their guns, and have not admitted error even when their errors are clear. For example, here is Bargh at it again. My impression is that researchers who err and admit their errors never make this list. Am I wrong?

    I am not sure Andrew should change what he is doing. Having been on the other side, the one thing I would suggest is that he give researchers a heads-up by email when he plans on posting about their work. This allows the researcher to prepare and have time set aside to discuss the issues that come up in the comments. It doesn’t feel good to have to suddenly put aside your plans in order to correct misinterpretations/misrepresentations of your work.

    For the rest, my impression is that this blog is written stream-of-consciousness, often with a P.S. at the end for clarification or corrections (typos are usually left in). If Andrew starts editing himself for politeness there will be fewer posts, the errors he highlights will be easier to disregard, and, most importantly, the posts will lose their flavor and the content won’t enter the mind as easily. In short, the blog will be less interesting and less likely to do its job.

    I have learned a lot from this blog. If there are adjustments Andrew can make and still deliver the good stuff, then great, but my sense is that when you are on the receiving end you will always feel that you have been singled out, you will feel a loss of control, you will feel bullied, you will feel irritated. Especially so if you have a lot to lose. Even some passive observers are going to be turned off, either via empathy, or because of the clear social norm against repeatedly engaging in actions that (foreseeably) make people feel bad. I don’t really see a solution.

    It’s sad when people lose their reputation and their career. Amy Cuddy made some serious mistakes, but she is an impressive person. I hope she makes a comeback. I believe she will. The power pose is potentially a real thing, just not the version that is amenable to study in a brief social psychology experiment.

    • “Cuddy makes some statements about *feeling* set-up by Simmons and Simonsohn, and about *feeling* bullied by Gelman. Whether or not these feelings reflect reality (I see no evidence that they do), Cuddy did not deserve the ugliness she received on some corners of the internet.”

      If we are to take the article at face value, the fact that Simmons and Simonsohn said one thing to her in a pre-publication email and something rather different in the post-publication blog post about the article seems to me like something one could easily take as being set up. That Simmons appears to have done so unintentionally is perhaps a mitigating factor, but the fact that he didn’t realize he’d done so until confronted by a NYT reporter with his own email seems surprising. I don’t think there’s any evidence that anyone set out to make Cuddy’s name a byword for sloppy science. That it nonetheless happened and that those who made it happen have retreated into the same defensiveness about their actions that they criticize in Cuddy is hypocritical in the extreme.

      • My read on the Simmons interaction was different. The NYT article says that Simmons attributes the miscommunication to being too polite in the email, perhaps coupled with the fact that he knew Cuddy. The thing he says he regrets is not calling her to explain that his critiques were more strongly felt than he let on in the email, i.e. that dropping the graph wasn’t sufficient. I don’t feel like I have enough info for the NYT writer to justify framing the story so that the reader walks away with the impression that Simmons & Simonsohn set her up so that they could write a brutal take-down. Maybe the NYT writer has more info, or a better read on the situation, and we should trust her framing… but I don’t think she supports it sufficiently with facts. Is the email posted publicly somewhere?

        For the “hypocritical in the extreme” issue, I don’t see the parallel, other than the fact that both parties mounted a form of defense. I see nothing wrong in defending your research and how you communicate it if you believe it is correctly done. I see nothing wrong with defending your commentary on research and how you communicate it if you believe it is correctly done. Things change when it becomes common knowledge that things have been incorrectly done (i.e. when you know too). That is when the defensiveness is open to criticism. For many of the researchers who are targeted on this blog, it has been common knowledge for a while that their research, and their defense of that research, hasn’t been correctly done. As I understand it, we are here having a debate about the commentators, and to what extent their commentary hasn’t been done correctly. Andrew Gelman seems to be open to the discussion. I haven’t seen anything parallel among those he targets for criticism.

        The pattern I see on this blog that most people are calling into question in the comments is the parade of bad scientists that Andrew brings up again and again. Is it unfair that he keeps bringing up the names of Bargh, Cuddy, Wansink, etc.? I know I cringe when I see it, because the social costs are impossible to measure, which means we have no idea if the consequences for them are proportional to their actions. It is not obvious to me that there should be a blanket rule against naming names. What do we do when researchers continue to display a pattern of bad scientific practices across papers, or double down and knowingly mislead (bad science communication = bad science)? Trust plays an important role in science, and if someone reveals that they may not be trustworthy, shouldn’t that be pointed out? Shaming and internet mobs seem like a crude answer, but is there another mechanism in place? I see Andrew tries to shame universities that promote this stuff.

        It’s tough. Even if the writing style (and tone) on this blog were different, Andrew Gelman is still highly respected, so his criticism would still carry a lot of weight and would still have real reputational consequences (and other costs) for those on the receiving end of his criticism.

        • Here’s my basis for how I interpreted Simmons’ actions. I don’t have any inside information here, so I can just go from what I read in the Times. And it’s a long article, so if I overlooked something that should have informed my take, my apologies on that, and please correct me.

          First, while we don’t know exactly what was in the Simmons email, we did get two separate mentions of the contents from the article. Here’s the first:

          “The letter Simmons wrote back to Carney was polite, but he argued that her P-curve had not been executed correctly. He and Simonsohn had each executed P-curves of the 33 studies, and each found that it was flat, suggesting that the body of literature it reflected did not count as strong evidence. He did write that ‘conceptual points raised before that section are useful and contribute to the debate’ but that they should take the P-curve out. ‘Everybody wins in that case.’ According to Cuddy, she and Carney thought the P-curve science was not as settled as Simmons believed it to be. But afraid of public recrimination, they did exactly as he said — they took out the P-curve.”

          The email comes up again later, in the context of Simmons being asked to reread it:

          “When Simmons and I met, I asked him why he eventually wrote such a damning blog post, when his initial correspondence with Carney did not seem particularly discouraging. He and Simonsohn, he told me, had clearly explained to Cuddy and Carney that the supporting studies they cited were problematic as a body of work — and yet all the researchers did was drop the visual graph, as if deliberately sidestepping the issue. They left in the body of literature that Simmons and Simonsohn’s P-curve discredited. That apparent disregard for contrary evidence was, Simmons said, partly what prompted them to publish the harsh blog post in the first place.

          “But the email that Simmons and Simonsohn had sent was, in fact, ambiguous: They had explicitly told her to drop the P-curve and yet left the impression that the paper was otherwise sound.”

          Taken together, this clearly left the impression in my mind that what was said privately in the email and what was later posted on the blog were quite different. The “[e]verybody wins in that case” quote seems hard to align with their later perspective on the article. Given that context, I don’t think it’s a stretch to interpret the email as encouraging while the blog post was not. And if you encourage someone to do something just so you can tear them down later, that would be setting someone up. I acknowledge that this might not be the whole truth and I certainly don’t think Simmons realized he was doing that. But the evidence to my mind seems to point toward my interpretation.

          As to your other points, I think I agree with you generally. My personal takeaway from this story has been the idea that I need to be more careful about thinking about using people as placeholders for broader concepts. It’s easy to fall into the trap of using someone’s name as shorthand for a broader category of behavior that they are possibly an example of. That behavior seems like it can have unintended consequences, which we’re all seeing play out here.

        • David W:

          The way you frame things made me want to double-check to make sure that I wasn’t careless or irresponsible in communicating my failure to see anything obviously amiss in the Simmons/Cuddy interaction. To summarize: I don’t think I was. I think my previous interpretation holds; if anything, I think the facts that I [quickly] gathered now favor Simmons and Simonsohn more strongly. Here is why:

          What I understood from the NYT column is that Cuddy emailed Leif Nelson for a comment on their forthcoming Psych Science response to Ranehill et al.’s failed replication. Nelson forwarded the email to Simmons and Simonsohn. Simmons politely corrected Cuddy over email on the p-curve, and sent her the correct p-curve, which was devastating for the points they wanted to make. Carney, Cuddy & Yap—they were still together at this point—didn’t agree with Simmons and thought that the “P-curve science was not as settled as Simmons believed it to be.” I am sure Simmons didn’t agree with that, but he probably politely communicated that leaving out an incorrect p-curve was better than including it. So she left it out. Presumably Carney, Cuddy & Yap received feedback from others before they went ahead and published their 2015 comment in Psych Science, so responsibility is on them at that point, even if Simmons had said “everyone wins in that case,” which I can’t really understand out of context.

          After that Simmons & Simonsohn wrote up their side of the matter, which differed, and sent it to the authors for comment (presumably to Carney, Cuddy & Yap). From the Data Colada post it looks like only Cuddy replied. After receiving input Simmons & Simonsohn then blogged their response, which was really just posting their (correct) version of the p-curve, which Carney, Cuddy & Yap didn’t want to include in their review. Cuddy replied to this with: “Given that our quite thorough response to the Ranehill et al. study has already been published in Psychological Science, I will direct you to that [i.e. the 2015 comment]). I respectfully disagree with the interpretations and conclusions of Simonsohn et al., but I’m considering these issues very carefully and look forward to further progress on this important topic.” There was no specific challenge to Simmons & Simonsohn’s point.

          In sum, I don’t think we have enough to speculate as you do: “Given that context, I don’t think it’s a stretch to interpret the email as encouraging while the blog post was not. And if you encourage someone to do something just so you can tear them down later, that would be setting someone up.” I could be wrong, and the text of an email exchange could reveal otherwise, but my interpretation is that when Simmons re-read the email exchange he felt bad, and maybe regretful, that he was interacting with Amy Cuddy at a critical juncture at which he could have potentially persuaded her not to publish the Psych Science comment if he had been less polite and more forceful. This could have prevented her from painting herself into a corner, and might have saved future embarrassment. It was a little awkward that her comment was the perfect application of a p-curve, so that Simmons & Simonsohn ended up being the ones to take down the message from the 33 studies from that literature (which included many papers that were not from Cuddy).

          P.s. I just saw Anonymous’ comment below. I agree.

        • I’ve criticized AG in comments above for using Cuddy’s name repeatedly. But JM’s comment above makes me want to defend AG a bit. The fact is, AG’s volume of blog output is what makes his blog popular. It would be essentially impossible to sustain that level of output without repeating oneself to some degree, without using some cliches. So I’d like AG to take the criticism he’s receiving without getting defensive — something he’s clearly trying to do and doing better at than many would, though he could still do better. But for those of us criticizing him, it’s important to acknowledge mitigating factors, and the volume of good discussion that’s happened on this blog is a huge mitigating factor.

        • Jameson:

          I love AG’s blog, and I agree with you. I didn’t intend to signal that I have reached a judgement on the best approach for how to blog this stuff. I was just pushing around ideas others have brought up here. The article called for some reflection, and I am not settled on what the best approach should be. Volume (and flavor) is important. It’s fun to joke around too; it keeps people engaged. I certainly don’t have a lot of confidence in what I wrote above, but I do think there are ways to minimize blow-back if one is concerned about that.

          For the record, I think the NYT was unfair to quote Cuddy as feeling that AG was a bully, and then to pepper circumstantial evidence throughout the article that would foreseeably sway the reader towards agreeing with Cuddy’s interpretation of events. The NYT article ignored the context, and the timeline, which is crucial for understanding what happened.

          The NYT article spawned exchanges like this on twitter, which is clearly unfair to Simmons (I have seen others unfair to AG). I had a quiet tweet-storm in response (it was a failed experiment; I don’t know how to twitter).

          My current view is tweets 19-27, which I stitch together here:

          Public criticism has serious consequences even when it is fair. It makes people feel bad, it can hurt careers, and worse. Because these consequences are foreseeable, our knee-jerk response is to blame these heartless, insensitive critics. Then add gender. This critic behavior doesn’t feel natural, or human. In a certain sense this is right. There is a norm violation going on here. Why do something when you have good reason to believe that the other person is going to feel bad, and that it could hurt them? Someone who does that must be either emotionally obtuse, callous, or mean-spirited, no? I think this is the wrong inference. Science has different norms, and this is a necessary ingredient. If you want to be a scientist you are signing yourself up for this. (Don’t mean to sound like John Kelly there.) We have to do something about the ugliness though. [addendum: if we can]

      • “If we are to take the article at face value, the fact that Simmons and Simonsohn said one thing to her in a pre-publication email and something rather different in the post-publication blog post about the article seems to me like something one could easily take as being set up”

        To me this is a very confusing story and I spent quite some time trying to figure out what happened. It seems to me that the critics did *not* necessarily say one thing to her in a pre-publication email and something rather different in the post-publication blog. In my interpretation, the funny thing is that there is no real problem here, and the way it is written up in the NYT piece may account for all the possible confusion and/or aggravation.

        From the NYT piece:

        -“The letter Simmons wrote back to Carney was polite, but he argued that her P-curve had not been executed correctly. He and Simonsohn had each executed P-curves of the 33 studies, and each found that it was flat, suggesting that the body of literature it reflected did not count as strong evidence. He did write that “conceptual points raised before that section are useful and contribute to the debate” but that they should take the P-curve out. “Everybody wins in that case.”

        I interpret this to mean/imply that the 33 studies altogether did not count as strong evidence. I presume Cuddy was told this, and she then left out her wrong version of the p-curve and presumably also did not choose to include the critics’ version, which showed no strong effect.

        The Cuddy paper talking about the 33 studies seems to mostly be about conceptual issues like differences between studies (which the critics said could be useful for the debate).

        The blogpost by the critics simply shows their version of the p-curve for the 33 studies, and simply notes that the Cuddy paper talks about possible moderators. It is their right to post their version of the p-curve (which Cuddy presumably did not want to include in her paper even though it seems like she knew about it and knew that it showed no strong evidence).

        In my interpretation Simmons and Simonsohn did not say one thing to her in a pre-publication email and something rather different in the post-publication blog post. All the possible confusion and/or aggravation seems to me to be in large part because of the way it has been written up in the NYT piece.
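        For readers who haven’t run into p-curves, here is a toy sketch in Python of the idea under discussion. It is not the actual p-curve procedure of Simonsohn, Nelson, and Simmons, just a simulation of why a “flat” curve is read as weak evidence: if the studies in a literature are chasing a real effect, the statistically significant p-values pile up near zero; if they are chasing noise, the significant p-values spread roughly evenly below .05. The group size and effect size below are made-up numbers for illustration.

```python
# Toy illustration of the idea behind a p-curve (NOT the actual procedure
# used by Simonsohn, Nelson & Simmons): under a true effect, significant
# p-values bunch up near zero; under the null, they come out roughly flat.
# All numbers below (group size, effect size) are made up for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def significant_pvalues(true_effect, n_studies=1000, n_per_group=25):
    """Simulate simple two-group studies; keep the p-values that reach p < .05."""
    kept = []
    for _ in range(n_studies):
        a = rng.normal(0.0, 1.0, n_per_group)
        b = rng.normal(true_effect, 1.0, n_per_group)
        p = stats.ttest_ind(a, b).pvalue
        if p < 0.05:
            kept.append(p)
    return np.array(kept)

for label, effect in [("no true effect", 0.0), ("true effect (d = 0.5)", 0.5)]:
    p = significant_pvalues(effect)
    low = np.mean(p < 0.025)   # share of significant p-values in the low half
    print(f"{label}: {len(p)} significant results, "
          f"{low:.0%} with p < .025 and {1 - low:.0%} with .025 <= p < .05")
```

        Run it and the null case splits its significant p-values roughly evenly across the two bins (flat), while the true-effect case piles them up below .025 (right-skewed). That shape difference is the intuition behind reading a flat curve for the 33 studies as a lack of evidential value.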

        Regardless: Cuddy and her colleagues are responsible for their own actions and paper.

  23. As a casual reader of your blog, and an extroverted man who has been accused of occasionally lacking empathy, even I detect the tone that some folks are complaining about. The sooner you change it, the sooner you can get away from the image of a missile-launching blogger to one of helping advance science with a voice that more people will be more receptive to. That’s the goal, isn’t it? You won’t be any good to science if you’re up all night defending yourself from internet rage.

    • As a casual reader of your comment, and an introverted man who has been accused of occasionally lacking patience, even I detect the tone in your comment that some folks are complaining about. The sooner you change it, the sooner you can get away from the image of a missile-launching commenter to one of helping advance science with a voice that more people will be more receptive to. That’s the goal, isn’t it? You won’t be any good to science if you’re up all night creating internet rage.

    • If someone doesn’t think Gelman has done a good job advancing science, they should focus less on his blog and read his books on Bayesian Data Analysis and on Multilevel Models. They should check out Stan. It’s strange—I stay out of the academic circle, and when I think of Gelman, I think of Stan and the Bayesian book. I don’t think of a “missile-launching-blogger.”

    • Nah. As usual, Lord Keynes said it best: “Words ought to be a little wild, for they are the assault of thought on the unthinking.” Paul Meehl was genteel in his writing and failed to save academic psychology from itself.

  24. I follow a lot of social psychologists on Twitter who have vigorously taken up the mantle of improving their discipline via better methodology. They are usually right about things. With that said, I find them to be far more dogmatic and dismissive than Andrew is on this blog. While I think Andrew may have unintentionally made life more difficult than what may have been deserved for Amy Cuddy—just by virtue of drawing attention to the flimsy research—I find Andrew to be a better messenger than many of the social psychologists I follow on social media and some of the other places where they pop up. Andrew can be cutting but—like a good Bayesian—is good at expressing comfort with uncertainty and finding reasonable ways to interpret imperfect research.

    • Unfortunately I worry that instead of expert pragmatists like Gelman, social psych will continue to get methodologists who are dogmatic and frankly not expert statisticians.

    • There may be some selection bias here. The format and culture of Twitter are such that bold statements get more exposure and attention than moderate ones. That is to say, I agree with you that I’ve seen meaner stuff out there — and it might actually be the case that the culture of psychology is nastier than this space generally when it has decided to criticize someone — but I’m not sure Twitter can give us very strong evidence for this.

  25. One thing that is lost in the rush to exalt evidence-based and science-based statements is that a lot of true things are incredibly expensive and difficult to prove scientifically. Just saying that something doesn’t have scientific proof doesn’t mean it’s wrong. In her case it could be right for one person and wrong for another, even with scientific proof, so why is scientific proof so important here?

    • I think Andrew’s position is entirely consistent with this – he’s just saying don’t claim scientific justification for something that doesn’t have it. Doesn’t mean you can’t still do a TED talk or write a book or inspire people. Just means it’s not ‘good science’. It can still be good ‘inspiration’ or whatever.

      Remember this is a blog by a statistician concerned about the quality of scientific research.

      • I sometimes watch horror movies (my partner likes them) and they often have eg ‘ESP’-style premises and similar. I can appreciate them as ‘good’ (well, often terrible but that’s beside the point!) entertainment without having to defend Daryl Bem’s research on ESP (also frequently criticised by Andrew).

        • I see what you’re saying but I think it’s a little different from Bem. Cuddy has established herself as somebody who gives inspiring TED talks and writes books. Bem isn’t going to help write the new season of Stranger Things.

        • Yeah. I just mean the genres shouldn’t be mixed.

          There are legitimate _scientific_ criticisms of Cuddy’s work as science. As with Bem (similar issues in fact). And as far as I’m aware she and others have pretty consistently refused to properly acknowledge these. And her supervisor and colleagues have a direct line to prestigious scientific journals like PNAS, while calling those who wrote blog comments ‘terrorists’.

          Andrew is, I think, upset about the claims of scientific rigour, not the claims that the talk or book or whatever is entertaining or inspirational. The issue is scientific integrity. While it is a shame that an early career researcher is caught up in all of this, this is a broader attack on what amount to (intentional or not) corrupt power structures and incentives in academic research.

          For every Amy Cuddy there is another Amy not ‘making’ it in science because they adhere to stronger scientific standards. Think of Wansink’s bullying of a postdoc (I think they were) for refusing to chase noise.

        • “Think of Wansink’s bullying of a postdoc (I think they were) for refusing to chase noise.”

          I largely agree with everything in your post except maybe your last sentence, which I quoted above. In any case, and also given this whole discussion about “bullying”, I think there should be a rule to provide evidence for statements about bullying in these discussions.

          Here is Wansink’s blogpost in which he states the following information about the post-doc (and I have never heard anything else about this post-doc, but please correct me if I am missing something):

          “At about this same time, I had a second data set that I thought was really cool that I had offered up to one of my paid post-docs (again, the woman from Turkey was an unpaid visitor). In the same way this same post-doc had originally declined to analyze the buffet data because they weren’t sure where it would be published, they also declined this second data set. They said it would have been a “side project” for them they didn’t have the personal time to do it. Boundaries. I get it.”

          &

          “About the third time a mentor hears a person say “No” to a research opportunity, a productive mentor will almost instantly give it to a second researcher — along with the next opportunity.”

          https://web.archive.org/web/20170312041524/http:/www.brianwansink.com/phd-advice/the-grad-student-who-never-said-no

        • Possibly a fair comment, but you left out the part right in between those two quotes:

          >Six months after arriving, the Turkish woman had one paper accepted, two papers with revision requests, and two others that were submitted (and were eventually accepted — see below).

          > In comparison, the post-doc left after a year (and also left academia) with 1/4 as much published (per month) as the Turkish woman. I think the person was also resentful of the Turkish woman.

          >Balance and time management has its place, but sometimes it’s best to “Make hay while the sun shines.”

          The point of the post seems to be a negative comparison of two people. But sure, what exactly counts as bullying can be hard to say, and I don’t have enough background info to know whether, e.g., the postdoc refused on principle or for other reasons, whether Wansink played a substantial role in their leaving academia, etc.

          So I’m happy to retract this and leave it as ‘ambiguous’ what exactly happened there.

        • We put an incredible amount of pressure on young people in science to ‘publish or perish’, to get ‘exciting’ results, to ‘sell these’ etc etc. These are the people I’m most concerned about.

          I’ve been there myself and I also see them every day. I want to hear their stories and encourage the people who are willing to stand up to a prestigious supervisor putting pressure on them to get more exciting results.

  26. >The only thing that really bugged me about the NYT article is when Cuddy is quoted as saying, “Why not help social psychologists instead of attacking them on your blog?” and there is no quoted response from me. I remember this came up when Dominus interviewed me for the story, and I responded right away that I have helped social psychologists! A lot.

    Joshua Miller’s comment above helped me grasp more clearly that the NYT article’s narrative arc was following Amy Cuddy’s feelings and experiences. In this light, what if we read Prof. Cuddy’s question as: “Why not help *me* instead of attacking me on your blog?”

    At least, maybe this is how the journalist interpreted, or how she believed her readers would interpret, Prof. Cuddy’s question? It makes emotional sense to me, anyway, and would be a reasonable explanation for Andrew’s response being left out of the article.

  27. I am not surprised by all the sympathy for Cuddy, which the NYT article was perfectly propagandizing for. Let’s take some facts: this person has a huge platform that she milks for personal profit, at a school where the average salaries are 300K a year. She even has the NYT behind her. Can you imagine yourself in a NYT Sunday piece after an embarrassing misconduct? She does bad research, and doesn’t own it, and instead milks it to millionairedom. When exposed, she is then congratulated for quitting her job (seriously? She was probably asked to leave).

    Meanwhile many talented, hard-working, and better-trained people are trying to land tenure-track jobs with no privilege like Cuddy’s, with teaching loads so high that they don’t even have time to do research (what they thought they signed up for). Many are also dealing with mental health issues and addiction. Shall we also write some NYT pieces for all those PhD students who were trying to build their work on power pose and lost time? Or that scholar who decided to throw all her data out when she realized it was all noise? Do they know someone at the NYT who can write about their oh-so-innocent frail arms?

    SRSLY? How many of you are really in academia? The game is rigged and this is a case where the problems with academic reward systems are exposed. I take no schadenfreude in this, but this person had many opportunities to say “Yes, my work is problematic. I am going back to the drawing board to rethink my theory and methods.” Instead she went on paid speaking gigs still talking about power pose as if nothing happened. Eventually people lose their patience and ugliness ensues. I am not condoning bullying (though I am not even sure what people have done was bullying); instead I see an uglier face in Cuddy’s “oh I am so innocent, they are all bullies” narrative, when she knows perfectly well what she has been doing, after being repeatedly called out for it.

    • I will have to read the article, but reading through the comments on this blog, it seems likely that it is propagandizing for sympathy for Amy Cuddy. That would be symptomatic of a big problem in media reporting on scientific issues, which makes it seem all the more likely that propagandizing is the aim of the article (not that academia does much to help itself be understood clearly). Thanks for expressing this so clearly.

    • Good point.

      We are, after all, talking about a Harvard professorship here.

      Is it really too much to ask that a professor at the very pinnacle of the profession do work that is robust, and, dare I say it, actually true? If Cuddy’s work does not stand up to scrutiny is it really so unreasonable to give her position to another person whose work has stood up to serious scrutiny? Is she really the person most deserving of that professorship?

      To say that she was undeservedly bullied out of her professorship implies that there is no solid work being done in her field that deserves that professorship more.

    • +1

      At a time when tenure is under attack, tenure-track positions are increasingly scarce, and the public is distrustful of “experts” and of the value of academic researchers, the actions of people like Cuddy (Princeton PhD, Harvard job) and Fiske (Harvard PhD, Princeton job – neither are exactly “lowly outsiders” at the bottom of the totem pole, status-wise) de-legitimizes scientific research in the eyes of the wider public.

      Can any of you imagine how we would react if a sympathetic article was published, with professionally-shot photo spread, of Satoshi Kanazawa – he of the “negro women are objectively less attractive, my scientific research shows!” findings???

      But the point, as Mr. Gelman has repeatedly pointed out, is not the conclusions themselves but rather the *METHOD* and *PROCESS* by which the findings were brought about. Because by any objective examination of the evidence, Kanazawa’s methods were not all that out of line with the standards that Fiske and Cuddy endorse!

      Trash science is trash, whether it comes from a woman, an Asian male, or a Caucasian! To err is human, as they say, but to double down on the error is when one moves into the territory of charlatanism.

  28. My sense has also been that the people whose mistakes you highlight are overrepresented on your blog, compared to the gravity of their mistakes. It’s true that doubling down makes errors worse, for you and anybody who depends on you (see: this presidential administration). So I get why you are aghast at these particular scientists’ refusal to admit error. But I think focusing so much on five or ten people represents, or at least induces in readers, its own kind of cognitive bias. It frequently comes across as though you’ve fit a logistic model on the binary outcome “is terrible work” and are crowing about the ten nominally-significant predictors you found, all dummy variables of the form “was co-authored by David Brooks.” Of course I’m being a bit tongue in cheek, and based on your approach to your own analyses, I doubt that’s what you really mean — but that’s the impression it leaves. Is this fundamentally a linguistic or social-cognition problem of writers or readers needing categories? Maybe. But if it is on readers, I don’t think it’s just me; and if you wanted to, you could probably get around it just by reducing mentions of specific researchers. Criticize them once or twice by name, sure, but decrease the callbacks.

    Like a couple other commenters, when I read the NYT article, I was surprised at your comment that you don’t like interpersonal conflict. I wondered, what on earth does he think he’s been doing? And I wondered, not for the first time, why I feel like I don’t see much political science research pilloried here. Am I missing it because I’m not familiar enough with the field for it to stick? Or is the field free of p-value-mania, somehow? Or are you aware on some level that you are sometimes pretty confrontational here, and quite sensibly choosing not to (pardon the phrase) shitpost where you eat?

    • > if you wanted to, you could probably get around it just by reducing mentions of specific researchers. Criticize them once or twice by name, sure, but decrease the callbacks.

      This seems to be the consistent piece of constructive criticism targeted at Andrew so I hope (and suspect) he will take this one on board.

      It is tough though, in that some of these folk perhaps seem to think they can just ‘weather the storm’ and keep doing what they’re doing. Which has the likely side effect that those who aren’t comfortable with the norms of these fields still get pushed out.

      • > It is tough though, in that some of these folk perhaps seem to think they can just ‘weather the storm’ and keep doing what they’re doing. Which has the likely side effect that those who aren’t comfortable with the norms of these fields still get pushed out.

        Yeah, I’ve also wondered about adverse selection as a contributor to this problem. I worry less for the researchers who choose to leave their fields over these epistemological issues (I suspect that conscientious people will tend to land on their feet) than I do for the fields they leave behind.

        • Yeah definitely – I can imagine many people positively predisposed to ‘good science’ or whatever will just choose not to have anything to do with fields like social psychology. Which leaves these fields much worse off.

          I can also imagine people like Andrew just saying ‘screw it, I’ll leave them be’ and re-focusing on other areas (as he has seemed to start to lean towards lately anyway). Which, despite the ‘human factors’ mentioned, does I think leave fields like social psychology worse off.

          It’s not actually that fun being a naysayer! And despite some snark, I don’t think Andrew actually enjoys it that much either!

        • I’m sure you’re right — criticizing bad stuff can be fun in small doses, but like Halloween candy there are diminishing returns, especially when you feel like the people who really need to hear you are tuning you out.

          I also shouldn’t be so cavalier about the fate of the scientists who get shut out because they don’t want to engage in QRPs. I do think in the end they will be better off than their fields of origin will, but I shouldn’t minimize the human cost of needing to dust yourself off and start out again — it is substantial.

        • Some (or maybe most) people don’t want to hear it. They just want to be left alone to publish and sell whatever makes them happy. Doctors give small doses initially. When the patient does not respond to the treatment at all, they usually increase it.

    • In another comment up there, AG acknowledged that he was too loose in choosing “interpersonal” as the modifier there, when it seems he really meant “in-person.”

      As to why political science research is less commonly subject to this kind of criticism, AG is not alone in noticing this, and attributes it to the difficulty of doing replication studies at all (you can’t just run another identical election). Somewhere up there he has a really interesting comment on the topic.

      • I saw that comment (about why the crisis started in psychology despite existing in other fields), and agree that it’s interesting! I think I’m after a different question, though: why do I not see more political science articles with QRPs featured here? The conditions aren’t right for a replication project, sure, but I think lots can be learned without that, if you know the field of application well, especially if the datasets are open (and I think they are in polisci – correct me if I’m wrong).

        There could be several answers to my question — maybe he does talk about this a lot and I’m forgetting examples, or maybe the field really is failing at this less than psychology. But a couple of the possible answers that come to mind (fear of upsetting people he has to see every day in the hall, or fear of poisoning the well for his discipline when it comes to Congressional funding) I think might merit a step back to ask — if I wouldn’t do this in a scene where I had something significant at stake socially, should I do it here?

        I definitely don’t want my comments here to come across like I don’t want Andrew to write about these issues at all. I started in psychology and became so frustrated with its epistemological problems that I went back to school for statistics, so like some of the social psychologists who posted up thread, I mostly have found Andrew’s attention to these issues incredibly gratifying, like, hey, I wasn’t crazy to be unsatisfied and worried! But I still think tone matters when you want people to change. I appreciate how open he’s been to hearing this feedback (especially given that some of the anonymous comments here have been really pretty nasty themselves). I vaguely remember trying to engage him about the issue of one-on-one vs published critiques in a comment thread some years ago and I don’t think he got what I was saying. But even if that aspect doesn’t change (and hey, I haven’t actually tried the one-on-one approach in this context, so maybe he’s right and it sucks) I am glad he is giving some thought to his general approach.

        • Erin:

          1. I do sometimes write about problematic political science studies, for example claims about shark attacks, football games, and subliminal smiley faces. I think this sort of study is more prominent in social psychology, though: consider the influence of “embodied cognition” etc.

          2. No, I’m not holding back on my blogging (or my writing) out of fear of losing funding.

          3. I sometimes try direct contact but it typically doesn’t work, in that people who lean on questionable research practices often don’t want to hear about it. See here, for example. It’s fine with me if others want to contact researchers directly; indeed there’s nothing stopping anyone from reading a post here and sending a polite email to the authors of work being discussed. We can each contribute in our own way.

        • Thanks for the pointers to the political science stuff you’ve covered. I had indeed missed those, somehow, and will read what you have to say about them. (It’s likely to be more relevant to mistakes I want to avoid in my own work, since like political scientists I’m working on a large dataset in a context where a full replication is unlikely.)

          What about your own applied work? Looking at stuff you did fifteen or twenty years ago, are there analytical choices you’d make differently today, or conclusions you’ve drawn that you’ve since rethought?

        • Erin:

          Of course, I revisit and criticize my own work all the time! Just go to the published papers page on my website and you’ll see lots of examples. You could start here, for instance.

        • Yeah, I realize I was imprecise in what I asked for — of course I’ve enjoyed your past self-mocking posts regarding disproving your own theorems and whatnot. I was wondering more specifically whether you had written anything of the form “I used to approach problem X using technique Y, but I’ve since realized that has problem Z and so now I use Y’ instead, and you should too.” I’m afraid I’m not familiar enough with your past political science work to draw the connection immediately with the particular article you linked — was MRP an approach you stumbled on late, having used some other poststratification technique before? or did something that happened in the 2016 election make you rethink your conclusions here?

          I’m seeing tons of other great stuff in that publications list that I’d missed before, though, like this paper on multiverse analysis and this paper on the p-value problem. Both of those reflect things I came home from the recent symposium thinking about. So I’m really glad you pointed me that way.

        • Erin:

          Very specifically, if you read the paper linked to in my comment, you’ll see that it implies a complete revision of the thesis of Gelman and King (1993), which is one of my most-cited papers in political science. In the 1993 paper we considered all sorts of reasons for poll swings, and in the 2016 paper we revised that view entirely by suggesting that most of the late-campaign swings were due to differential nonresponse.

          When it comes to methods, yes, one thing I’ve written a lot about recently is that in my books with Hill and with Carlin et al., we routinely used noninformative prior distributions, but more and more I’ve been convinced that informative priors are better. I have changed my thinking on that after seeing many examples and realizing lots of problems with the default noninformative approach.
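          To make the noninformative-vs.-informative point concrete, here is a minimal sketch, not taken from any of those books, using the simplest conjugate normal model with made-up numbers: a flat prior just hands back the noisy estimate, while a weakly informative prior pulls it toward plausible values and tightens the uncertainty.

```python
# Sketch of a flat vs. weakly informative prior in the simplest conjugate
# normal model (an illustration of the general point, not a recipe from
# any of the books mentioned above). y is a noisy estimate of an effect
# theta with known standard error se; the numbers are hypothetical.
import numpy as np

y, se = 20.0, 10.0   # hypothetical noisy estimate and its standard error

def posterior(prior_mean, prior_sd):
    """Posterior mean and sd for theta ~ N(prior_mean, prior_sd^2), y | theta ~ N(theta, se^2)."""
    if np.isinf(prior_sd):                    # "flat" prior: posterior = likelihood
        return y, se
    prior_prec, data_prec = 1 / prior_sd**2, 1 / se**2
    post_mean = (prior_prec * prior_mean + data_prec * y) / (prior_prec + data_prec)
    post_sd = (prior_prec + data_prec) ** -0.5
    return post_mean, post_sd

print("flat prior:               mean %.1f, sd %.1f" % posterior(0.0, np.inf))
print("weakly informative prior: mean %.1f, sd %.1f" % posterior(0.0, 5.0))
```

          With these hypothetical numbers (estimate 20, standard error 10, prior centered at 0 with sd 5), the flat-prior answer is 20 ± 10 while the informative-prior answer is about 4 ± 4.5, which is the kind of regularization of noisy estimates being described.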

        • Ah, thank you — that implication was not obvious to me as a non-political-scientist (I was not familiar with your 1993 paper, so didn’t make the connection when you cited it in the intro).

  29. ” I wouldn’t’ve chosen to have written an article about Amy Cuddy—I think Eva Ranehill or Uri Simonsohn would be much more interesting subjects.” Nope. That is where you are wrong, Andrew, and it’s not found in a p-value. Human nature. People like to gawk at others and their misfortunes and even watch them squirm a bit. That is what would get NYT readers to read the whole article about problems in research, methodology, statistics and even comment. Otherwise it is off to the Style section.

  30. For a blog about science, there’s an awful lot of assertion and opinion going on in this thread. How can Gelman and his various defenders really be sure he is helping science in his approach? It is entirely possible that the singling-out approach Gelman epitomises makes a significant number of meek geniuses very reluctant to pursue a career in social psychology or, if they do, to pursue risky or interesting topics. Equally, it is entirely possible that Gelman and the rest of them make the profession more attractive to people who enjoy publicly roasting their colleagues. You all might be helping in some areas, but how do we know the costs don’t outweigh the benefits?

    The approach Gelman takes seems to represent a sort of Popperian view of scientific progress, where only the strongest ideas survive rigorous debate. But everyone knows it frequently goes beyond rigorous debate, and the minimising (and indeed celebrating) of the bad behavior we see from people like Gelman and others is really disturbing – and many PhD students have told me how terrifying they find it. The Popperian approach of attacking to find weakness fails to recognise that we also need to produce an environment where quiet, clever, thoughtful and kind people who care about the truth and like solving problems can thrive. I’m certainly not arguing for a ‘safe space’ approach, but some self-regulation, greater charity, and more investment in critiquing ideas rather than the person would go a long way.

    I think the changes we’re seeing around replication, rigour and open science are welcome, but do they really need to come at the cost of articles in the NYT which, frankly, reflect so badly on the field?

    • It is hard to not read this first paragraph as an illustration of status quo bias — changing anything might be bad unless you have evidence.

      Looking in from any social science besides psych, it is striking how much the field seemed to avoid obvious criticisms of others’ work. In-progress work was not receiving the criticisms it needed.

      What reflects poorly on the field is not just this profile of Cuddy, but the continued decisions of more powerful and prominent faculty like Fiske, Bargh, Mitchell, and Gilbert.

      We must be patient: science advances one funeral at a time.

      • “It is hard to not read this first paragraph as an illustration of status quo bias — changing anything might be bad unless you have evidence.”

        Asking someone to support their claims with evidence is an example of bias? It is a simple logical fact that change can have positive and negative effects, but you interpret my statement instead as evidence that I favour the status quo. I certainly don’t, but I’m not sure the Gelman approach is the best way to change the status quo. Please note that the ‘not sure’ part of the sentence should be taken literally: I really don’t know, but I am open to the possibility that it has a net negative effect. I also expect people who demand high standards of evidence from others to subject their own claims to their own standards, or, if they can’t, to demonstrate some humility and caution.

    • “His various defenders”; you make it sound as if the frequent commenters here were some kind of Gelman cult following.

      Know why I like this blog?
      For the same reason I like several stats/methods blogs. Good discussion. Outside perspectives on social psych. Criticism of articles and practices WITH substance: you don’t see Gelman or others say “Researcher X sucks; discuss”, you /do/ see them point out problematic practices and back up why those practices are problematic. I’ve learned a /lot/ from this particular blog and its community.

      Tbh, when I talk about things this blog brings up about social psych to other social psych phd students, or phd students in other psych niches, their response isn’t some overwhelming dread or fear; they don’t fear risky or interesting topics, they don’t fear careers in social psych (at least, no more than they would anyway, given the terrible job market). They do perk their ears up and rethink some of their practices. They may push back on their advisor about certain analyses. They avoid deep dives. They better consider measurement error. They start thinking about other ways of describing support for predictions, models, hypotheses, etc. They rethink pursuits on topics that would have “sexy findings” if true, and become more grounded about being a scientist first, and a public figure second.

      I think the only people particularly frightened of people like Gelman and other “data police” are those who gained their positions by engaging in the problematic practices and inferential problems that Gelman and other “data police” call out. All the phd students I know are eager to ensure that they are correct, not that their titles are sexy; they want to make the best, most honest paper they can, not merely one that JPSP would be attracted to due to counter-intuitiveness.

      So – sure, for people who want to do bad science, Gelman’s blogs may be costly. I’d be fine with that. Gelman and others give pretty well substantiated and balanced, concrete recommendations about best practices; they call out bad work so we don’t make the same mistakes; they make tools to improve science; they write books, open articles, and publish videos about best practices and methods; they write books about how to better teach statistics; they describe why some practices are problematic, how some inferential logic is fallacious, and provide evidence for their claims. Anyone who is threatened by that, I fear, is in the wrong field to be honest. That’s good, honest information there for doing the best you can, and if that threatens your sensibilities, I worry your goals in science are misaligned with /the/ goals of science. (using “your” in the general sense, not targeted toward you).

      As for NYT articles reflecting badly on the field, the field is reflecting badly on the field. Or, it did, at least. Social psych is going through a pretty sudden and major growing pain. Bad practices accumulated bad results that waste time for everyone (red herrings everywhere; structures built on sand). Now we’re wising up, in part thanks to people like Gelman, Simonsohn, Simmons, and so many others that call out work so that we don’t make the same mistakes. Now we’re demanding open practices, open materials, open data, open tools, pre-registered plans, registered reports, etc to build a field that others can follow, instead of point and laugh at. I’m quite excited about that prospect, despite having existential dread about so many of my favorite findings likely being bunk.
      All that to say – Social psych sorta did this to itself, and some researchers’ reputations are tarnished in part because they did it to themselves. When you make your living overselling your results, and the scientists are screaming from the wilderness that your results are likely bunk, and you continue overselling your results, yes, your reputation as an empirical scientist will be diminished. When a professional scholar is found to have over 45 papers with obvious errors in them, impossible results, and questionable practices scattered across them, turns out that they’re not as pristine as previously given credit for. Social psych as a field has gained a bad reputation due to ignoring decades-old warnings, and overselling findings. A few researchers in particular stand out for these reasons, and their reputations are decreasing simply because the field is wising up, and those few are refusing to acknowledge problems.

      • “Tbh, when I talk about things this blog brings up about social psych to other social psych phd students, or phd students in other psych niches, their response isn’t some overwhelming dread or fear …”

        One bit of evidence supporting “or phd students in other psych niches”: The talk Andrew gave recently at the University of Texas at Austin (http://statmodeling.stat.columbia.edu/2017/10/02/statistical-methods-expiration-date-talk-university-texas-friday/) was at the request of the *clinical* psychology grad students.

        • I get it, I really do. You’re all coming here to throw around ideas and debate. But come on, citing one talk, or citing feedback from your own PhD students does not make a clear body of evidence. Gelman says he is trying to help (and seems convinced he is doing so), so I mentioned some PhD students are discouraged by what they perceive to be a fairly vitriolic atmosphere in the field.* I did so because this is a black swan that illustrates the *possibility* that the Gelman approach might cause harm as well as good. The jury is out on the net cost/benefit, but beyond identifying black swans, anecdotes really shouldn’t have a place in a discussion between scientists.

          *For the avoidance of doubt, these students are excellent, bright and committed to transparency and open science. But they just don’t like what many perceive to be hostility and personal attacks.

        • “You’re all coming here to throw around ideas and debate. But come on, citing one talk, or citing feedback from your own PhD students does not make a clear body of evidence.”

          1) I don’t think we’re “all” coming here to throw around ideas and debate. Some (many?) of us are here (at least in part) to try to figure out the complexities of the world and real life.

          2) I agree that citing one talk doesn’t make a clear body of evidence — that’s why I said “one bit of evidence.” One needs to accumulate lots of bits of information from lots of sources.

          3) “beyond identifying black swans, anecdotes really shouldn’t have a place in a discussion between scientists.” Huh? Anecdotes are not the final word, but are one of many things that need to be considered in discussions between scientists. For example, your statement “some PhD students are discouraged by what they perceive to be a fairly vitriolic atmosphere in the field,” is an anecdote. That’s fine as one piece of evidence, but needs to be taken into consideration with other pieces of evidence, just as anything I have said needs to be taken into consideration with other pieces of evidence. But also, we need to work “locally”. For example, what have you done to help the students you are talking about learn to focus on the substantive aspects of a message rather than their emotional perceptions? I’m not in love with Gelman’s style, but I recognize that there is still a lot of substance there – because I have pushed beyond what my emotions tell me. His style is not what I would prefer, but I am not the only person in the world. Living in this world requires a lot of give and take — or else the alternative of sticking our heads in the sand.

        • You’re keen on demanding evidence from “student” about the quiet types leaving the discipline, yet here, you offer your own subjective experience with no evidence.

          If we are discussing at the level of subjective experiences, though, my experience is that those with high integrity and good methodological skills are not afraid of blogs like this.

    • As a social psychology PhD student, I am profoundly disturbed by your opinion.

      I too am quite terrified by the prospect of one day seeing my work criticized on this blog. However, this fear has had a direct and positive effect on the way I do science. I know many other social psychology students, researchers, and faculty who have had similar experiences.

      However, your doubt that andrewgelman.com has a net positive effect on social psychology is not what bothers me so much about your post.
      To the best of my ability I work hard and with integrity. I publish what I can and try in my own small way to push the field forward. However, the likelihood of me getting a (desirable) job in our field is very small.

      I attend a strong program, but not the strongest. I have a fantastic advisor, but he is not among the social psychology aristocracy.

      What I find far more terrifying and frustrating than the thought of having my work criticized on this website is the fact that when I hit the job market a decade after the onset of the replication crisis, I will still be competing against some decent portion of people who don’t give two shits about whether their work is statistically and methodologically responsible. I know this, because I know them. They are not bad people; they have simply been trained to hunt statistical significance and sex appeal.

      I don’t know where you are in your career, but I can tell you that for a great many people in my position, the deck seems absurdly stacked against exactly the kind of “quiet, clever, thoughtful, and kind people who care about the truth and solving problems” that you reference. The older graduate students that I know and whom I think fit that description have largely left the field. Really, nearly all of them.

      Doing social psychology *right* takes more time and effort and often yields less shiny results and this puts people at a significant disadvantage, compared to people who are comfortable aiming for the kinds of big effects that (we know now) often fail to replicate.

      Andrew Gelman and the people like him help level the playing field, if only a little. I am quite willing to run the risk of having my work criticized publicly, given that others who care much less about whether their work is reliable run the same risk. Public critics might crack a few shells, but that’s nothing compared to the systematic exclusion of junior scientists who aren’t willing to do bad science for the sake of good findings.

      And, just to be perfectly clear, I’m not proposing that all good scientists are shut out of the field or that all successful social psychologists are bad scientists; but rather that (1) if you’re willing (knowingly or not) to do the kind of sloppy science that makes sexy findings a nearly sure thing, you have a huge advantage over those who aren’t and (2) I think this is a fucked up state of affairs and it keeps me up at night.

      • +1000

        I’m soon to be on the job market, and my CV just doesn’t stack up to many others, but I also know those people, and they’re trained to, pardon my French, shit out publications after hunting for significance and spin a story as best they can.

        Meanwhile, those who try to be extremely deliberate, careful, thoughtful about analyses and design, about inferences/conclusions, etc simply can’t crank out a million pubs a year in a phd program. Careful science takes time, thoughtless science takes much much less time. The deck is so stacked against people whose work is fully honest, transparent, messy, and careful, and yet it is those people the field needs badly right now.

        • I agree with all except the deck’s being stacked against those doing the right things. They do get recognized and feel fulfilled in the end, unless one counts public attention or the number of drinks consumed during grad school as successes. Good luck with the job hunting.

        • Ayse:

          Given my experience, I do agree with you. And, I generally feel respected and recognized by people whom I would want to feel respected and recognized by.

          However, doing flashy work (that maybe won’t replicate) doesn’t just get you attention and drinks during grad school…it also gets you tenure at Harvard.

        • student:

          On the other hand, as we see from many examples, getting tenure at Harvard is not the only way to make an impact in your chosen profession and help science and society move forward.

          We need more of your kind :-)

      • “deck seems absurdly stacked against exactly the kind of “quiet, clever, thoughtful, and kind people who care about the truth and solving problems” that you reference. The older graduate students that I know and whom I think fit that description have largely left the field. Really, nearly all of them.”

        Evidence please?

        • Psychologist:

          If you feel comfortable calling for evidence, I think it is only fair that you specify what sort of evidence you would consider compelling. And, ideally within that specification, you would at least nod toward pragmatics.

          I stated that the majority of thoughtful, clever, quiet, and kind PhD students who care about the truth and solving problems *that I know* have left the field. The only evidence I can imagine for this claim would be a list of all the students I know rated for their fit to these criteria and whether they have left the field. That’s about as absurd as your call for evidence.

          In your previous post, you apparently felt that reliance on reports from “many” unnamed graduate students constituted anecdotal evidence for your position. I suppose, if I wanted to just be snarky and not engage with your comment, I could have just said something like, “Evidence, please?”

          From my view, your substanceless response to my concerns and perspective just underscores the state of the field.

          That all said, I’ll provide evidence in the form of a logical proof. Hopefully, you can overlook that which you already must know: I have no empirical evidence (but then, neither do you, and you seem perfectly comfortable making all sorts of insinuations and claims.).

          1. Publication record is a major driver of career success in social psychology

          2. In the past, and now to a lesser but still real extent, top journals tend to publish findings that are surprising, flashy, sexy, likely to get press coverage, etc. This is not an evaluative claim, just a description of the state of affairs. It makes sense that journals have these biases. They have to make money, after all.

          3. Because research that is counter-intuitive, sexy, flashy, likely to get press coverage etc. is more likely to get published, conducting such research is incentivized. Again, this is not a normative statement. I love research that tells a good story. However, nonetheless, this describes a specific sort of incentive structure.

          4. It is easier to obtain positive results with sloppy methodology and statistics.

          5. If two people, person A and person B, both hold sexy hypothesis X and person A is willing to engage in sloppy methodology and/or analyses (e.g. noisy measurement, multiple file-drawered underpowered studies, many unplanned comparisons and tests, post hoc hypothesis tinkering) and person B does not do these things, person A is more likely to obtain statistical significance/publishable results.

          -5a. Sometimes person A’s approach might not work and B’s will.
          -5b. However, given enough tinkering, A’s approach will yield statistical significance at a much higher rate (a simulation sketch after this list illustrates the point).

          6. Given multiple sexy hypotheses, A will obtain publishable results more frequently than B.

          7. Person A is therefore more likely to obtain more publications. And, given the shine of the results, they are likely to get in top journals.

          8. If person A has more publications in top journals, they have a market advantage, compared to person B.

          9. Peer review should function as a quality check. Person A’s sloppiness should be noticed and penalized. However, if the faculty conducting the peer review were trained within the same incentive structure as Person A and B, it is more likely that they will engage in, and therefore tolerate, the same kind of sloppiness.

          10. That is, this incentive structure instantiates a dynamic positive feedback loop. People who do sloppy work are more likely to obtain statistical significance in the long run. They are therefore more likely to get good pubs. They are therefore more likely to get good jobs. And, they are then in positions of power and their practices are transmitted to more junior scientists.

          11. Ipso facto, students who do not engage in sloppy research have a market disadvantage. They are less likely to get desirable jobs. Therefore, they are more likely to leave the field.
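          Here is a minimal simulation of point 5b, under toy assumptions of my own choosing (30 subjects per group, five outcome measures, no true effect anywhere): a researcher who pre-specifies one outcome reaches p < .05 at roughly the nominal 5% rate, while one who reports whichever of the five outcomes “works” reaches it far more often.

```python
# Minimal simulation of point 5b (toy assumptions: 30 subjects per group,
# five outcome measures, no true group difference on any of them).
# Person B tests one pre-specified outcome; person A reports whichever
# of the five outcomes gives the smallest p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_per_group, n_outcomes = 5000, 30, 5

strict_hits = flexible_hits = 0
for _ in range(n_sims):
    a = rng.normal(size=(n_per_group, n_outcomes))   # control group, 5 outcomes
    b = rng.normal(size=(n_per_group, n_outcomes))   # treatment group, no real effect
    pvals = [stats.ttest_ind(a[:, j], b[:, j]).pvalue for j in range(n_outcomes)]
    strict_hits += pvals[0] < 0.05       # person B: the one pre-specified outcome
    flexible_hits += min(pvals) < 0.05   # person A: best of the five outcomes

print(f"pre-specified outcome significant: {strict_hits / n_sims:.1%}")
print(f"best-of-{n_outcomes} outcomes significant:   {flexible_hits / n_sims:.1%}")
```

          With these settings the flexible strategy comes out significant roughly 20-25% of the time, and that is with only one researcher degree of freedom; add optional stopping, covariate choices, and subgroup splits and the gap widens further.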

          Based on your posts, I’m relatively certain that you won’t agree with this line of reasoning. Assuming that, if you’re going to reply, perhaps you will consider saying something of substance.

          If you don’t agree with my line of reasoning, where does it break down, exactly? The focal point of my argument is statistical: if one strategy is more likely to be successful than another, then it *will* be more successful and competitive agents should be more likely to adopt it. Maybe you’ll quibble about whether sloppy work gets published. In that case, I would very much like to know how you explain the low rate of replication in social psychology. Maybe you’ll be tempted to make an ad hominem attack and claim that I’m just bitter or something like that. Let me preempt you, I love what I do. I’m not worried about finding a job — it will be in social psychology or not. Whatever.

          However, I think it is despicable when people in positions of power defend incentive structures that disproportionately disadvantage students who do responsible, careful work. If you do not see how your apparent position does just this, I am appalled but not shocked. And, I would very much like to know what your alternative view of the field is.

      • Rock on, student!

        I’m in a similar position, but from within clinical psychology. If you’re ever inclined to reveal your name, I’ll know who to look out for if I ever want to collaborate with an expert in social psychology. Like the meek, the “quiet, clever, thoughtful, and kind” have yet to inherit the earth, but I’m confident that with all the wisdom we glean from this blog and its generous contributors, we’ll find a door or two to get our feet in.

  31. The immediate jump to accusations of misogyny is interesting. It seems to be a general tendency among women that open criticism is not acceptable.

    An interesting way to study this tendency is to read the comments on postings to the “TwoXChromosomes” subreddit at Reddit. It is a subreddit explicitly devoted to posts by women about women. (It’s at https://www.reddit.com/r/TwoXChromosomes/.)

    Anything but unqualified support in a comment tends to be considered highly offensive. Many participants express disapproval when a post reaches the “frontpage” because it brings in a lot of male commenters.

    FWIW, the original posts there also tend to be much longer and more detailed than posts in other subreddits.

    Maybe there is a publishable paper here.

    • From “The immediate jump to accusations of misogyny is interesting.” To “It seems to be a general tendency among women that open criticism is not acceptable.”

      Interesting…

      I mean, I agree the immediate jump to accusations of misogyny on Gelman’s part is a bit strange (although many comments in other communities /do/ contain misogynistic attitudes toward Cuddy, I don’t see Gelman espousing anything misogynistic; no critiques playing on gender stereotypes, and he critiques work regardless of gender), but surely you see some irony in then saying “It seems to be a general tendency among women that open criticism is not acceptable.”
      Do you really think that’s a fair statement to make? Do you not think your second sentence is itself misogynistic?

      • I wouldn’t trust this person’s assertions. If you’re familiar with Reddit and its various cliques and the feminist communities there (and the sexists), you’ll know that /r/twoxchromosomes is one of the more wishy-washy subreddits with regard to the quality of the moderation. There’s some amount of history there (it was made a standard subreddit at some point, meaning that everyone signing up for Reddit would be signed up for /r/twox also), but long story short, if someone finds the Whataboutism-infested /r/twox to be an oppressively moderated environment already, that’s some interesting context about that person and their perceptions. I wonder if one of the MRA or Redpill or whatever other bullshit sexist communities there are linked the NYT article and that’s where all these trolls in the comment section are coming from, cf. “Luke”’s “Excellent blog. Cuddy is an affirmative action hire. Obviously. Thanks for setting women to the same standards as men!”.

        Is there an effort to try to recruit Andrew for a fucked up cause? :D

    • “It seems to be a general tendency among women”

      Are you serious or trolling? Are you really trying to make a broad claim about the behavior of half the population, citing a reddit thread to make your case?

      It’s not only offensive to women, it’s offensive to logic.

    • I haven’t looked at that subreddit, but have seen the phenomenon, “Anything but unqualified support in a comment tends to be considered highly offensive” on those subreddits I’ve looked at — which turned me off from looking any further at Reddit.

  32. Terry:

    I think your willingness to make a generalization about half of the humans living on this planet is even more interesting. It seems to me that among misogynists, there is a general tendency to generalize about women.

    For instance, see your post.

  33. “Good people can do bad science.”

    Corollary: Jerks can do good science.

    “The funny thing is, I think that pretty much is the message of that famous Ted talk, and that the message would be stronger without silly, unsupported claims …”

    This is true for most people reading here, but I don’t know if it’s true generally. There’s a ton of practical material out there on how to ace a job interview or business presentation (and self-help in general). But TED isn’t going to book someone to dispense mundane, practical tips. It has to be presented as a “Big Idea.” It has to be “deep” and “thought provoking.” The imprimatur of Official Science seems to enhance the appeal.

    “The idea is to take the positive aspects of the work of Cuddy and others—the inspirational message that rings true for millions of people—and to investigate it using a more modern, data-rich, within-person model of scientific investigation.”

    My guess is that with any sort of self-help type intervention some of it “works” for some people some of the time and the reasons for this are pretty much intractable. Any “effects” you might detect probably won’t be consistent or universal.

    More generally, I think: People have their own “natural” or “subconscious” inclinations, but they often desire to self-consciously alter their behavior and personality, for whatever reasons. Usually this fails. People do change, no question, both consciously and through circumstance. But a lot of times it simply requires too much conscious effort to “fake it” indefinitely, to use Cuddy’s term. People get bursts of motivation, but they usually can’t sustain the effort, or their desires shift, or they lose faith if the anticipated benefits don’t materialize soon enough. They then tend to revert to their more natural tendencies, perhaps with intermittent excitement over some new diet, workout, sales technique, dating advice, spiritual path, etc.

  34. This narrow focus on Cuddy and the harshness or reasonableness of the criticisms of her trivializes what is at stake here.

    Gelman et al. are testing the entire structure of modern academic research with a hammer, and large parts of the structure are sounding hollow. So far, only a few chunks of the edifice have crumbled, but the cracks seem to run very deep.

    • Anon1:

      Bem’s ESP paper is similar to the power pose paper and many many others that have been discussed in which strong claims are made based on p-values from noisy data where there are large numbers of researcher degrees of freedom. ESP and power pose are different topics but the forking-paths or p-hacking issue is the same for both of them. For Kanazawa, I was not referring to work on African women—I don’t recall reading that paper—I was referring to several papers he wrote on the variation in sex ratio, where, again, strong claims were made based on p-values from noisy data where there are large numbers of researcher degrees of freedom. It’s a common mistake, and I think the careful study of particular examples such as these has helped us have a better understanding of how to do quantitative research in the human sciences.
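      To put a toy illustration on the “strong claims from p-values from noisy data” problem (loosely in the spirit of the Type M / exaggeration-ratio idea, with made-up numbers, not any particular study): when the true effect is small relative to the standard error, the estimates that happen to clear p < .05 are rare and greatly overstate the effect.

```python
# Toy version of the "significance filter" with noisy data (loosely in the
# spirit of Gelman & Carlin's Type M error, simplified; all numbers made up):
# each "study" produces one noisy estimate of a small true effect.
import numpy as np

rng = np.random.default_rng(2)
true_effect, se, n_sims = 2.0, 10.0, 100_000

estimates = rng.normal(true_effect, se, n_sims)   # one estimate per simulated study
significant = np.abs(estimates / se) > 1.96       # two-sided p < .05

print(f"share of studies reaching p < .05:         {significant.mean():.1%}")
print(f"true effect:                               {true_effect}")
print(f"mean |estimate| among significant studies: {np.abs(estimates[significant]).mean():.1f}")
```

      With these numbers only about 5-6% of simulated studies reach significance, and among those the average estimate is roughly an order of magnitude larger than the true effect, which is why a literature built by selecting on significance from noisy studies can look much stronger than it really is.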

      • I feel like p-hacking is pretty common, especially in the medical world. Taking something which is relatively plausible and mixing it with notorious and implausible stuff.

        There are so many scientists out there whose small studies have not been replicated, and yet you cite…

        • Anon:

          Yes, I agree that there’s a selection bias in what gets discussed. Psychology is more accessible than medicine, data in psychology are more available, etc. My most important role in all this, I believe, is to develop statistical methods and understanding, and I very much hope that medical researchers and statisticians working in medical fields will take this work and apply it to their research!

  35. I just want to add that I was disappointed to read the last paragraph of the NYT article because it gave me the impression that the whole aim of it was to make her more likable and promote her new book. I actually wonder if she could have paid for the article. Also, saying that she didn’t replicate her study because she couldn’t find a collaborator is such a bullshit reason.

  36. There are other ways to test structures than using a hammer. Can I suggest you don’t take the same approach to fixing your house that you do to fixing science?

    • Psychologist,

      Terry’s metaphor (“Gelman et al. are testing the entire structure of modern academic research with a hammer, and large parts of the structure are sounding hollow. So far, only a few chunks of the edifice have crumbled, but the cracks seem to run very deep.”) is too mixed to be a good one, and you seem to have confounded it even further.

      *Testing* with a hammer can indeed be a good way of testing the soundness of a house (e.g., for structural soundness that would allow it to withstand high winds). *Fixing* the house may also involve using a hammer, but in a very different way. So I say, let’s just scrap this metaphor — it doesn’t work in context.

  37. I’ve been following this blog for perhaps 2 months. As a newcomer, I have read positive characterizations of some work and negative ones of other work. Nevertheless, I did not pick up on several of the dynamics being debated here. We can all learn to communicate more productively.

  38. I really wish someone would scrape the content on Gelman’s blog and tally up all of the “I’m not saying that XYZ… I’m just saying that PQRST” (e.g., “I’m not saying that [she’s an idiot, she’s lying, she’s a cheater]… I’m just saying that [she could be more careful, she should give up her data, she could do blah blah blah])”. But it’s very, very, very clear from the context of the statement that he actually does mean exactly XYZ. He says one thing, gets slammed, then says “I’m not saying that XYZ… I’m just saying that PQRST”. Only a lawyer would read the statement and think “oh yes, he means PQRST”. I think what most people don’t appreciate is when he tries to talk out of both sides of his mouth and have it both ways all the time, as if the “I’m just saying that PQRST” effectively insulates him from the plain meaning of XYZ that is communicated in tone and substance. And that is what comes across as being so jerkface-ish. I’m sure Gelman is a nice guy in person, he just comes across as a jerkface. (See? I pulled a Gelman right there.)

    One could take this a step further and apply some kind of machine learning to his blog posts for detecting snark and pettiness, and the readings would be off the charts. Kind of like that econ paper that described all of the misogyny on econjobrumors.

    • Word on the street is that Fiske and some graduate student have been working on an NLP analysis of Gelman’s and other methods blogs. This will almost certainly be controversial and contested as a study, and I am doubtful it will change anyone’s mind about what has and hasn’t been said on this or other blogs.

    • A,

      Nope. When I write, “I’m not saying XYZ,” I really really do mean that I’m not saying XYZ. I write this sort of thing to clarify and avoid confusion. Based on your comment, it appears that I have not always been successful in this effort, so let me re-state that when I write, “I’m not saying XYZ,” I really really do mean that I’m not saying XYZ. I hope this helps.

      Of course you can disbelieve the above paragraph too, but that gets us into an unhelpful infinite regress. So I recommend that if you decide to read my writings at all, that you take these statements at face value.

      • No, it doesn’t help. Taking statements at face value is exactly what you do *not* do. You say “I’m not calling her a cheater, I’m just saying that Cheater A, Cheater B, Cheater C, and Cuddy could do things a different way”. But by lumping them all together in the same sentence, you’re tarring her with the Cheater label even though you say “I’m not calling her a cheater”.

        • A

          In this case you are not actually quoting me, so I’m not really sure exactly what you are referring to. In any case, when I specifically write, “I’m not saying XYZ,” it is specifically to clarify that I’m not saying XYZ. Writing can be ambiguous and that’s why sometimes I go to the extra effort to specifically explain what I’m not trying to say.

  39. Cuddy on twitter:

    “The effect of power posing on feelings of power has replicated 17 times, which @nytimes piece didn’t convey. Pls correct @susandominus Thnx!”

    The author of the NYT piece tweeted Cuddy back:

    “I wrote that your P-curved, comprehensive meta-analysis, soon to be published, showed evidence for effect on feelings of power…”

    https://twitter.com/amyjccuddy/status/921179408446230528

    Anyway, I find this action and phrasing by Cuddy interesting. Here is why:

    1) If I were a scientist who just got covered in a big NYT piece which included talk about the validity of my findings, I would perhaps only contact the author of that piece to ask to correct possible errors, not to ask to put something in there that was not “conveyed”.

    2) If I were a scientist who just got covered in a big NYT piece which included talk about the validity of my findings, I would perhaps use different words to ask something of the author. “Pls correct” would not be my choice of wording. I would use a sentence like: “Would it be correct, fair, useful, and possible for you to correct this?”

    3) If I were a scientist who just got covered in a big NYT piece which included talk about the validity of my findings, I would perhaps be very careful about (re-)presenting the found evidence, conclusions, and possible consequences. Perhaps I would have thought it only fair, if I am talking about these 17 studies that showed “feelings of power”, to also talk about what these 17 studies also found (e.g., that they found no evidence for any behavioural effects).

    Anyway, reading about this whole “power-pose”-thing made me wonder a few things:

    For instance, has she (or anyone else) ever investigated how long these “feelings of power” last, and would that be useful/important to investigate before promoting power-posing?

    Has she (or anyone else) ever investigated if there is a potential downside to these “feelings of power”. For instance, investigate whether after a few minutes the “feelings of power” actually decline to a level *lower* than what was present before doing a “power-pose”, and would that be useful/important to investigate before promoting power-posing?

    I guess I am just not cut out to be a scientist…

    • I can sincerely say that power posing would make me feel embarrassed and awkward, and few of the people I’ve talked about this with think the same way. I have no idea how this could work in making anyone feel more powerful. Again sincerely, I’d be curious to see what sort of person one has to be for this to work in the intended way. It is just so foreign to me. Like when you hear about people who have committed inhumane atrocities, I just can’t understand what sort of people they are or where they are coming from. The mindset is so different that it is as if they were sourced from another universe.

    • 4) If I were a scientist who just got covered in a big NYT piece which included talk about the validity of my findings and about my being bullied by others, and the tone of the article was one of attempted objectivity toward both me and those I accuse of bullying, despite my knowledge that the author was mostly on my side, I’d be really really careful not to publicly do things that make it look even more like I expect the author to be on my side, e.g. tweeting them with requests for changes to make my own position appear stronger.

      I mean FFS. Can you imagine if Gelman or Simmons tweeted a “Pls Thnx” at Dominus asking for a quote to be expanded or for an additional anecdote about Cuddy’s intransigence to be fit in?

      • I think the author has been pretty reasonable in her NYT piece concerning the events, and presenting the viewpoints of all parties involved.

        I also think the piece can easily be mistaken to portray a certain story which I think does not fit the facts and/or downplays Cuddy’s responsibility for the events that have happened. For instance, why would it be a problem for her to find a collaborator, and more importantly, why would she even need collaborators to perform a replication? And did “the rules” really change? If they did, I reason that “the rules” were about how to do science and did not cover presenting your work, which you apparently did not replicate, to the general public via a TED talk.

        I saw that the author posted a link to this blog on her Twitter, which I think is very good and possibly useful for those interested in “power-posing” and wanting more information. She also re-tweeted a tweet by someone who said that the rules haven’t changed and asked if she was going to do a story about people who followed them but didn’t get a job. For me, it shows that the author cares about giving attention to all sides, discussion, reasoning, evidence, etc.

        Ever since I saw that the author posted the link to this site, I have been hoping that she has read/is reading this blog, and the comments, which could perhaps be useful in some shape or form.

        I must note that I am reasoning purely from a scientific perspective and with the goal of improving science.

        • “I also think the piece can easily be mistaken to portray a certain story which I think does not fit the facts and/or downplays Cuddy’s responsibility for the events that have happened.”

          Hmm, not sure I phrased this correctly (sorry, English is not my 1st language). I (think I) meant to say:

          “I also think the piece can easily be seen as portraying a certain story which I think does not fit the facts and/or downplays Cuddy’s responsibility for the events that have happened.”

        • Hi Anonymous, thanks for pointing this out. It wasn’t fair for me to say that Dominus is “on [Cuddy’s] side”.

    • “Has she (or anyone else) ever investigated if there is a potential downside to these “feelings of power”. For instance, investigate whether after a few minutes the “feelings of power” actually decline to a level *lower* than what was present before doing a “power-pose”, and would that be useful/important to investigate before promoting power-posing? ”

      I am actually getting really interested in “power-posing”. Here is what I thought might also be interesting to investigate. What if you actually feel so powerful/dominant that you simply do not care about possible “power-dynamics” concerning posture? And if you propose that women expand their posture, aren’t you implicitly giving more strength to the very thing you are trying to combat?

      Why not simply bypass this possible “power-dynamic” concerning posture by not even acknowledging it? In a conversation, for instance, this could perhaps shift the “power-dynamic” away from the non-verbal (like posture) and refocus it on other things like strength of arguments, logic, etc.

      Cuddy seems to say: if you can’t beat them, join them.
      I say: if you can’t or don’t want to join them, beat them.

    • “Has she (or anyone else) ever investigated if there is a potential downside to these “feelings of power”. For instance, investigate whether after a few minutes the “feelings of power” actually decline to a level *lower* than what was present before doing a “power-pose”, and would that be useful/important to investigate before promoting power-posing? ”

      https://static1.squarespace.com/static/54c826b8e4b0b23fee6c1343/t/5888290215d5dba7f58f7137/1485318404518/Smith+and+Apicella+2016.pdf

      “While we found no main effect of pose type on T, C, risk or feelings of power, we unexpectedly found that losers who held high power poses experienced a relatively greater decrease in T compared to losers who assumed neutral or low power poses. While this may be a false-positive and thus requires further replication, we do speculate on these findings”

      (…)

      “The field of ethology has long been interested in the question of why low-ranking individuals do not display signals of high-status (e.g. armaments), given that high status confers many benefits in terms of access to resources and mates. In other words, why are there so few individuals faking high-status? In a classic study, Rohwer (1977) dyed the plumage of subordinate sparrows to match the plumage of high-ranking sparrows and found that the legitimate high-ranking birds persecuted the “fakers”. Other studies on low-ranking males with experimentally exaggerated armaments find similar detrimental effects: these males are more likely to be attacked, barred from feeding and excluded from social groups (for review, Berglund et al., 1996). Thus perhaps, Carney et al. (2015) was right to argue that the social context of power posing is important. Ironically however, the social context itself may undermine the supposed benefits of power posing.”

      Uh oh. This is what I meant about whether it would be important to first investigate whether there are any possible downsides to something before promoting it. Perhaps “fake it till you make/become it” can actually backfire on you.

      • And why should feeling powerful always be a good thing? Power can be used for good, or for ill. Some people feel too powerful, if the good of humanity is the ultimate outcome measure.

  40. Having read the NYT article, your blog, and your response to the NYT article, I do appreciate your attempt to take criticism. However, it still seems like you’re attempting to ignore the larger issue at hand. You claim you’re avoiding interpersonal conflict, but by using names, repeated references, snark, and outright rudeness, you’re creating interpersonal conflict. You’re making conscious choices. You chose an indirect approach (with the convenient response that because other people haven’t responded well to direct contact, you are henceforth absolved from ever trying direct contact in the future). You chose a mean-spirited style. Neither of those choices is mature, responsible, or academic. You want to claim the higher ground of objectivity, but the manner in which you have called out others has not been objective. Maybe you’re going for the same “newsworthy” or “catchy” style that has gotten social psychologists popular media attention in the past (which gives you less grounds to criticize those who have felt the lure of media attention). Maybe it’s frightening to approach people directly and have to face the humanity of the people you’d prefer to view abstractly. Those are personal failings, similar to the human biases that led to poor science being conducted in the first place. You are not immune. This is a chance to step back and view yourself with the same objective lens that you have used on others’ scientific findings. I suggest you take it.

    • Not this. Andrew is not rude and doesn’t owe anyone direct contact when he is reacting to work that is published for the world to see. That’s just how you want the academic club to operate, old chap.

      • I agree. Science belongs to everyone, and there are no rules about which mode of contact (and even contact itself) should be used.

        Perhaps the “anonymous” who wrote the post above can give five examples of text Prof. Gelman has written (and provide links to them) which he/she views as not being “objective”, or as being “mean-spirited”.

        This would provide possible evidence for his/her claims, reasoning, and conclusions, which is always a good thing in science. It can then be discussed whether these examples are, or are not, “objective” or “mean-spirited”, why this could be the case with regard to the scientific enterprise, and whether it should even matter.

        • Anon,

          Objectivity is hard to define, but to be fair I guess I can start to become mean-spirited after seeing the same error over and over again. I think we’re not supposed to be referring to writers by name anymore, so let me just say that there’s a certain New York Times columnist who’s written some interesting things on geography, social class, and voting, but who’s also refused to correct some of his published errors, even when those errors have been pointed out to him directly (incidentally, that’s another example where I tried to contact someone one-on-one and it didn’t work). I do think I became increasingly mean-spirited out of frustration. Whether or not this mean-spiritedness is justified, I do think it’s there. I do think everything I wrote on that matter is accurate, but I’ve certainly expressed my annoyance.

          Another example is that a few years ago I posted some items on a football columnist who also writes about politics. This guy is a good writer with lots of interesting ideas but was getting sloppy in his political pontificating and, again, after repeated exposure to his work I expressed some annoyance in what could be called a mean-spirited way.

          I’m pretty sure that these expressions of annoyance on my part have been counterproductive, even if they may have provided catharsis for me at the time.

        • This post is mean spirited and is posted in the same manner that you snark on other things (“I’m not saying XYZ… I’m just saying PQRST”). You say you’re not naming names, but (a) you say something snarky about how people are attacking you for making personal attacks without actually saying that you object to them attacking you for making personal attacks; and (b) you provide enough information in this post to make it clear to anyone with half a brain which NYT columnist you are referring to.

        • A:

          Huh? I’m not trying to keep the names of those two columnists secret, never claimed to do so. I’m just avoiding using their names as recommended by various commenters. As I wrote above (but I’ll say it again), I think both these columnists are talented, have written interesting things, and have a lot to offer. I was frustrated by some of their writings, and in frustration I wrote some mean-spirited things, which I think was a bad idea.

          I’m doing my best here!

        • Wow, I read that really differently — I am one of the people who commented above about the repeated naming of names, but I didn’t at all interpret Andrew’s response here as snark directed at me. It looked like honest self-reflection!

    • Science should have no code of silence and names should be named. Consider the recently published series of articles on the cell line misidentification crisis. In “The Ghosts of HeLa: How Cell Line Misidentification Contaminates the Scientific Literature” the authors demonstrate that tens of thousands of cancer papers, cited hundreds of thousands of times, make claims about cell lines that were not in fact what they were claimed to be. Entire research programs begetting other similar programs have been founded on the wrong cell, and sometimes the wrong cell from the wrong species, all because the researchers failed to authenticate their materials. Yet when you get to the section on “Data Availability” you find this: “However, access to data (i.e., the full list of articles found to be reporting on misidentified cell lines) is conditional upon approval by the research ethics committee of the Science Faculty of the Radboud University Nijmegen. The key concern leading to conditional access is that the data provide a rough estimate of the size of the problem of cell line misidentifications contaminating the research literature. It is NOT a way to accuse individual researchers, research teams, or research institutes, as the data are not sufficiently precise and will lead to false positives (and hence false accusations). Using the data without sufficient notice of the context might lead to false accusations targeting individual scientists or research institutes which could have severe negative consequences for individuals involved. Researchers wanting to re-use these data will have to convince the ethics committee that data will not be used for such purposes.”

      Obviously they didn’t read every word of every paper, and so they can’t be certain that, just because a search turned up, say, “thymic” and “F2-4E5” in the same paper’s Materials section, the researcher actually studied thymic cells via liver cancer cells (F2-4E5). And it’s easy enough to do our own search, but why do it when it’s already been done? If you had a loved one with thymic carcinoma who was considering a clinical trial based on cancer cell research, would you worry more about the reputation of the researcher should it turn out he’d used the wrong method, or about the effect of a useless treatment on your loved one? The first publication demonstrating that the cells weren’t actually thymic cells occurred in 1998. Nevertheless, dozens of papers, cited hundreds of times, with the most recent being late last year, continued to raise the suspicion of poor methodology. Ponder this: “Of the fifteen most recent articles referring to the [misidentified cell line], thirteen actually refer to it because they use the cell lines, all thirteen reporting research on thymic cells, without mentioning any knowledge of the misidentification of these cell lines.” Surely at some point a little public shaming is in order.

  41. Some general observations on the snark/criticism question:

    1. Criticisms, no matter how diplomatically worded, are usually very cutting to the recipient. It is easy to forget how hurtful even the blandest criticisms can be.

    2. This principle is a hundred times stronger when the target is the work of a lifetime, and it is a thousand times stronger when it threatens the recipient’s paycheck.

    3. Consequently, it is far too much to ask of human nature that the recipient of the criticism cheerfully and quickly admit significant errors. A few people can do this, but usually only when they are secure in their status and the paper at issue is a small part of their reputation.

    4. Rather, we should allow various ways to tacitly admit errors. For instance, the recipient may “move on” to other topics, while acknowledging that “my study is only one in a large literature”, that “the paper is one step in the process”, or that it “has prompted a useful discussion”.

    5. This does not mean vigorous criticism is inappropriate; vigorous criticism is essential to serious research. Researchers should expect it, especially researchers who aspire to tenured Harvard professorships.

    • This is why

      1. A university education should involve regular criticism. People should be used to being told that (someone thinks) they are wrong. By the time they get to grad school, criticism should stop being so cutting because the skin should be thicker.

      2. Being wrong should never directly threaten your monthly paycheck. (There should be a right to be wrong.) Of course it should threaten your hopes of high speaking fees for popular audiences and, ultimately, winning the Nobel.

      3. Institutions exist to demand more of human beings than nature does. Under the right institutional conditions, researchers would not feel so threatened by criticism, and they would also feel compelled to acknowledge it.

      4. Errors and bad habits require explicit correction.

      5. I completely agree with this one, actually.

    • Terry,

      I intend the following seriously and genuinely– please do not consider it as snark:

      I am very sorry that your education, upbringing, and life experience have not given you the skills of dealing with criticism. I guess it shows a weakness in our society or educational system. But the bottom line for you is that you have been shortchanged.

  42. I had my first published study ever fail to replicate. I have (tried to) do careful science for years. I follow Gelman’s blog because I often learn a lot from him and his commenters. I feel that, just as there was/is an economy of over-sellers in psychology, there has emerged an economy of over-sellers within the effort to better the field. I am not saying that Gelman is one, but my guess is that the net worth of talk about the traits of this or that psychologist/biologist or any other scientist, in terms of improving the field, approximates zero, and is much more in the realm of power, self-satisfaction, publicity and whatnot. An economy. Time to move on.

    • The grass is certainly always greener, and that is one of the key advantages to promoting a new or otherwise untested method/approach/model/etc. With no record of failure to point to, it is all too easy to oversell the benefits and ignore the potential costs. I have a strong suspicion that if psychology and the social/behavioral sciences had adopted Bayesian instead of frequentist methods all those years ago, we would just as likely be in the midst of a crisis where p-hacking would instead stand for ‘prior hacking’. There are no foolproof methods; humans excel at finding ways to abuse systems for self-gain. Once all the reforms have been fully adopted, I give it about 30 years until the next crisis emerges, when the field realizes that what they were sold is not what they bought.

      • Sentinel:

        Indeed, the problems with p-values are not just with p-values.

        Bayesian methods are hardly “new or otherwise untested”—we’ve been using them for decades to solve otherwise intractable problems in political science, toxicology, etc.—but, definitely, any method can be misused, and I think it’s important for there to be a lively culture of criticism.

        Scientific publishing of course is full of criticism—but traditionally almost all of this happens in the pre-publication stage. I think it’s important for there to be space for post-publication review as well. We have to move beyond the attitude that, once something is published, it should be taken as correct.

        I hope that, rather than using Bayesian methods for 30 more years and suddenly going off a cliff, we instead continue to criticize our own work and that of others. And I do think this is happening. For example, Mister P is an increasingly popular method in political science and elsewhere—and there are also papers on how Mister P can go wrong, how to diagnose problems with Mister P, etc. Or, for another example, regression discontinuity is a powerful method in social science research but it can also mislead, and that’s one reason my colleagues and I have published papers on how RD can go wrong and how to do it better.
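
        For readers who haven’t seen Mister P, the poststratification half of it is simple enough to sketch in a few lines. This is only a toy illustration: the cell estimates below are invented numbers standing in for predictions from a fitted multilevel regression, and the population counts stand in for a census table.

          # Toy poststratification step (the "P" in Mister P). All numbers are
          # invented; in real use the cell estimates come from a fitted
          # multilevel regression and the counts from census data.
          cell_estimate = {     # model's estimated support within each demographic cell
              ("18-29", "no college"): 0.61, ("18-29", "college"): 0.55,
              ("30-64", "no college"): 0.48, ("30-64", "college"): 0.52,
              ("65+",   "no college"): 0.41, ("65+",   "college"): 0.46,
          }
          cell_population = {   # how many people each cell contains in the target population
              ("18-29", "no college"): 900,  ("18-29", "college"): 600,
              ("30-64", "no college"): 1800, ("30-64", "college"): 1500,
              ("65+",   "no college"): 800,  ("65+",   "college"): 400,
          }

          # Population-weighted average of the cell estimates, which corrects for
          # cells that the survey sample over- or under-represents.
          total = sum(cell_population.values())
          estimate = sum(cell_estimate[c] * cell_population[c] for c in cell_estimate) / total
          print(f"poststratified estimate: {estimate:.3f}")

        Most of the subtlety (and most of what the “how Mister P can go wrong” papers worry about) is in the model that produces the cell estimates and in how well the poststratification cells capture the ways the sample differs from the population.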

        • “Bayesian methods are hardly “new or otherwise untested”—we’ve been using them for decades to solve otherwise intractable problems in political science, toxicology, etc.”

          While technically true, my point is that from the perspective of mainstream psychology and the social sciences, Bayesian methods are essentially new and untested. Formal training in, and application of, such methods is infrequent enough as to be effectively zero. Same goes for pre-registration and post-publication review. The field as a whole has very, very little experience with these techniques. The examples you cite, to which I would add certain areas of cognitive psychology and neuroscience, are notable exceptions that most researchers are unfamiliar with and that have had little to no impact on research practices and norms.

          “I hope that, rather than using Bayesian methods for 30 more years and suddenly going off a cliff, that instead we continue to criticize our own work and that of others.”

          NHST has been the topic of scathing criticism for years and years and years. Where has that gotten us? I have little doubt that if there is a cliff to go off, people will invariably find it. In the long run it will be much more useful to adopt a harm-reduction approach. Assume the methods will be corrupted, think hard about all the ways and reasons such corruption might occur, and work tirelessly to create an environment that minimizes harmful pressures as well as the impact of errors. I’m not saying that we shouldn’t pursue these new approaches or that they won’t help, but if we don’t change the culture and the incentive system surrounding their use in an intelligent way then we’ll end up right back where we started: crisis.

        • “While technically true, my point is that from the perspective of mainstream psychology and the social sciences, Bayesian methods are essentially new and untested. Formal training in, and application of, such methods is infrequent enough as to be effectively zero. Same goes for pre-registration, and post-publication. ”

          This is a good point. From a practical perspective, getting beyond this probably needs to start with pressure from individual psychologists for things like the following:

          Workshops in Bayesian methods provided by professional societies at their meetings. (These should be aimed at editors of professional journals as well as individual researchers. They might also include workshops for psychologists wanting to teach courses in Bayesian methods.)

          Requiring graduate students to take courses in Bayesian methods.

          Statements from professional societies expressing the legitimacy of Bayesian methods, and giving examples of their use in psychology.

          It won’t be easy, but it probably needs to come primarily from grass-roots initiatives by lots of individuals.

  43. Boy that was a close-run thing (Cuddy nearly getting tenure).
    Harvard’s motto is “Veritas” (truth).
    If Cuddy got tenure, Harvard would have had to change its motto to the Latin for “I played by the rules and then the rules changed :(“

  44. It’s been several years since I had to study psychology in order to help an ESL student finish her undergraduate and graduate degrees, and I have blocked a lot of it out, so please forgive me if my jargon is wrong or uninformed.

    In publishing, we have heated debates about the ethics of simultaneously writing and publishing books and reviewing other people’s works. Ultimately, it’s the readers who admire our work and who pay us, and they overwhelmingly reject the idea of one author critiquing another on a blog post or Goodreads or Amazon. I want to so badly sometimes, but knowing that “the internet is forever,” I exercise my limited self-control and let other people in other professions influence the discussion — or dissection — of other people’s stories. The stakes in fiction, and a great deal of non-fiction, are usually much lower than the stakes in peer-reviewed research. That is why the content of this blog baffles me.

    I don’t dispute the criticisms of the science or the methodology behind it. I disagree with the venue in which you continued to disparage people, at length, repeatedly, only recently converting this to attacking their studies instead. If Cuddy is ever going to move on to other projects, she has to have separation from the work itself, and you have made that difficult if not impossible in the small social circles of your peers. Whether or not you acknowledge that you have out-sized influence over the opinions of your peers is irrelevant, Dr. Gelman.

    Are you a social scientist or are you a blogger? If you’re both, why are you critiquing other people’s work instead of advertising your own or advocating for stricter protocols for advisors rather than lamenting what the big journals accept? Do you want to be known for your research or for your take-downs? The fact that you have thoughtfully replied to almost every comment on this post speaks volumes to me about your possible motivations, but more research is needed.

    As someone who opted out of post-collegiate academia (sorry, I neither know nor care what the term for it is anymore), I can tell you without hyperbole that this situation is the EXACT reason why people drop out of doctoral programs that are neither geared toward nor suited for the art of social media warfare. I am going to take a wild guess, not founded on any evidence at all, that academics who pursue research tracks may prefer the relative solitude that such a career used to promise, and that the idea of constantly defending every study they conduct in the sphere of public opinion is the last way they would like to spend their time.

    If rigorous debate, replication, and fiery exchanges are required, let them all be peer-reviewed, lest the line between science and opinion be further degraded. I see a real opening for a journal geared toward this end, like someone peer-reviewing Data Colada and publishing ongoing discussions.

    The fact that every study published before 2010 is not necessarily receiving an equal measure of scrutiny means that efforts like this blog are furthering the ‘guinea pig’ status of recent, popular studies. In my opinion, it is the work behind the scenes of changing the methodology that will be most valuable to the field in the future. Ethnography and anthropology have strict ethical base guidelines that undergraduate students are expected to follow before they interview or observe a single person.

    If a de-emphasis on “noise” is required to forestall a crisis in data accuracy, perhaps working toward that end, deliberately without public recognition or reward, is the higher path to take. It sounds like you may already be doing this, and if so, thank you. I would encourage that work over griping on the internet any day, no matter how eloquent it sounds or how many people agree.

    • Folklorist:

      Thanks for your thoughtful comments. I agree with some parts and disagree with others.

      You write that I “continued to disparage people, at length, repeatedly, only recently converting this to attacking their studies instead.” I don’t think that statement is borne out by my actual writings over the years. I don’t think that it is “disparaging people” or “attacking their studies” to point out errors that have been made in published work. People point out errors in my published work sometimes, and even if they happen to be rude, I can understand that it’s not personal.

      When people publish studies, they should expect, indeed hope, that others will read those studies and uncover whatever problems are there. Carney, Cuddy, and Yap wrote a paper and were lucky enough to have Ranehill et al. perform a careful replication, and were lucky enough to have Simmons and Simonsohn study their statistics. The result was negative for power pose (see first paragraph in above post), but that’s fine. I reported on what Ranehill et al. and Simmons and Simonsohn wrote. That was not “disparaging people” or “attacking their studies.” Later Carney reported lots of problems in that original study. That’s all fine too. I don’t think I’ve ever disparaged any of the people mentioned above. Scientists make mistakes. That happens.

      It’s not disparagement to point out a mistake, nor is it an attack. Indeed, I think that attitude—the attitude that a technical criticism is a disparagement or an attack—is counterproductive, in that it encourages researchers to deny and even compound problems with their published work rather than simply accepting they made mistakes.

      You write, “why are you critiquing other people’s work instead of advertising your own or advocating for stricter protocols for advisors rather than lamenting what the big journals accept?” I don’t actually know what you mean by “stricter protocols for advisors,” but, as to the other items: I do all these things. I advertise my own work, I also criticize my own work and I write about things I don’t know. Blogging is a conversation with all sorts of people. This particular thread started back in 2007 or so when someone pointed me to that paper on beauty and sex ratios. I’ve learned a lot from careful examination of other people’s work as well as my own.

      Indeed, I think criticism, of one’s own work and also of others, is a crucial part of scholarship, and I completely, completely disagree with any implication that I or anyone else should not be critical, on a blog or in any other forum. If you don’t want to be critical, that’s your choice—there’s room in the world for all sorts of approaches—but I think that criticism is central to science. This includes peer review before publication, and it also includes post-publication review, even by researchers in different fields. Peer review has its strengths but it can’t do it alone. Not by a long shot. Peer review is imperfect and can have systematic biases.

      Finally, you write, “it is the work behind the scenes of changing the methodology that will be most valuable to the field in the future.” I agree 100%, except that I would remove “behind the scenes”—I think this work is most effective when done in the open. New methods don’t come from nowhere. Lively discussion in the context of specific examples is a great way to move forward, at least it has been for my colleagues and me.

    • “I am going to take a wild guess, not founded on any evidence at all, that academics who pursue research tracks may prefer the relative solitude that such a career used to promise, and that the idea of constantly defending every study they conduct in the sphere of public opinion is the last way they would like to spend their time.”

      Are you aware of any cases of an early career academic (grad student, post doc, untenured assistant prof) working in relative solitude but then having to constantly defend every study they conduct in the sphere of public opinion? This assertion is made often but I’ve yet to see an example.

      The names that get named here and on other methods blogs and social media circles are big names, and the work being criticized is always work that was put forward in the popular media before the criticism arrived. It seems that those who prefer relative solitude have little to worry about.

      Regarding how to improve the use of statistical methods, writing scholarly articles is great but is also of apparently limited efficacy. The issues brought up here and in similar places are issues that have been continuously raised for decades and decades. The problem, it seems to me, is that there is a huge incentive to oversell statistical results. This is not going to go away by itself. Given the choice to ignore methodological shortcomings in analyses whose results we like, we will ignore methodological shortcomings in analyses whose results we like.

      This doesn’t mean anyone who points out these shortcomings has to be nasty about it, or to make things personal. But open criticism has one big thing going for it: people pay attention. The prospects for effective behind-the-scenes criticism don’t look great, at least if their historical effectiveness is anything to go by. The incentive to oversell statistical results should be countered by a disincentive, and as far as I can see the most effective disincentive for doing things poorly is that other people may point out that you’ve done things poorly.

      • “But open criticism has one big thing going for it: people pay attention. The prospects for effective behind-the-scenes criticism don’t look great, at least if their historical effectiveness is anything to go by.”

        I think that in advocating for private conversations about this stuff with authors who screw up, I’ve been unclear. I absolutely agree that if you approach another scientist with constructive feedback and you’re rebuffed, it makes perfect sense to take the conversation public. I’ve been thinking of something more like this policy, discussed in more detail here, although what I’d originally had in mind was less formal. But sometimes formal policies help. It sounds like those authors feel that theirs has.

        • I thought about this a great deal and I see a crucial problem. If the subtext of any private communication about an error is, “Fix this or I will go public,” then the private correspondence always has the potential to be experienced as blackmail. The scholar who receives it is basically looking for the cheapest possible way to avoid public embarrassment. One reason not to contact authors in this way is to avoid even the hint that that’s what you’re fishing for. After all, there is always the possibility that the author will make some sort of offer.

          This seems to have happened many years ago when a book reviewer discovered that a historian had plagiarized the reviewer’s work. They ended up striking a deal, whereby the reviewer received monetary compensation for the copyright infringement and wrote a glowing review that of course didn’t mention the plagiarism. This all came out when someone else spotted it along with the incongruous positive review.

          For this reason, I much prefer that people publish their criticism the moment they’re convinced it’s right. It can always be retracted.

        • Wow. Maybe I’m naive — that possibility had not occurred to me at all. Though what risk does an honest critic bear from this? It’s not like you are accidentally going to take money from someone.

          Aren’t subjects of news articles often phoned in advance by journalists to say, hey, we’re about to publish about this thing involving you, here’s the broad outline of what we plan to say, any comments before we go live? Do people commonly interpret these calls as requests for hush money??

        • I don’t know how common it is, nor what the most common sorts of “deals” are that are made, but I think Andrew named the risk pretty accurately when he talked to Dominus: “interpersonal conflict”. Sometimes it’s better just to keep everything public.

        • I’ll say that, as a lay consumer, rather than a producer, of academic research, private, “off-line” communications with authors about (major) apparent errors in their published papers are the *very last thing* I would like to see become routine. (Flagging a typo or errant citation is fine of course.) I view confidential chats between insiders as a betrayal of the notion of published scientific literature, and of the scientific method we are all taught in high school. There should be a presumption that published results will be tested in public. Similarly, if a judge makes a mistake in a legal case, no one owes her a phone call before criticizing or reversing her decision.

        • Yeah, I understand that perspective, and agree that it’s critical that serious problems with published science become publicly known. A few years ago when I started thinking about this, it was in the context of looking for a solution to the problem of scientists refusing to admit their own errors. It seemed to me at the time that scientists were more likely to respond well to people telling them about a problem in a room with a smaller audience than, say, Slate. It’s ultimately an empirical question, of course, and I don’t have any data to bring to bear, just my own general sense that admitting error makes people feel vulnerable and vulnerability is sometimes easier to bear on a smaller stage. I mean, the idea of first learning I’ve screwed up when three friends send me a link to a writeup in a national magazine…! it just makes me hope to hell all my work is so trivial nobody ever reads it.

          Meanwhile, Andrew says his own experiences make him think it’s not worth the risk of making himself vulnerable in this way because the rate of people ‘fessing up in these conversations is not good. So he has anecdata, which is not great but is admittedly more solid than what I have. I also come at this assuming that most scientists are operating in good faith, and some of the people he complains about pretty clearly aren’t (a frequent plagiarist comes to mind).

          So I don’t know. I’m going to drop this because it’s very possible I’m wrong, and also because it’s getting mixed up with this other issue, of whether it’s important that mistakes be corrected — which doesn’t actually require scientists to admit their own errors. I realize I’m coming across like I don’t think that correcting the public record matters very much, and that’s not what I mean to convey, truly.

        • Mostly +1; my only reservation is regarding “the scientific method we are all taught in high school.”
          My objection to this is that “the scientific method” taught in high school might be a watered-down version of what really goes on in science.

  45. “If Cuddy is ever going to move on to other projects, she has to have separation from the work itself, and you have made that difficult if not impossible in the small social circles of your peers.” – This is exactly backwards, since Gelman’s whole issue is Cuddy’s unwillingness to separate herself from this work. Instead of admitting error as her co-authors have done, she is still exploiting her flawed study for fame and fortune.

  46. A few remarks.

    1) It’s quite unreasonable to expect any one person to treat the production of a whole field “equally”, so criticism that people’s work is being unfairly singled out by Gelman etc. implicitly demands unrealistic perfection. In this sense our behaviour is stochastic: we can only react to what we see, and do what we can.

    2) Contacting researchers in private can be a good idea, in particular to have one’s criticism checked/tested. However, nearly always the authors do nothing if things remain private. As Cuddy still barely acknowledges the problems, it’s hard to imagine that a private contact would have changed her response at all. Also, remember that it’s a publication; there can be no obligation to use private channels to discuss published work.

    3) The criticisms that hurt are those that are TRUE. Until we learn only to publish what we are prepared to defend and explain, or until we get better at admitting errors, this problem will (and must) recur. Gelman is doing a great job highlighting fundamental problems in research today. He’s not a gatekeeper in any concrete sense – he’s not editing a top journal, deciding jobs or promotions. He’s just writing a blog. That blog is effective because what he writes is true and insightful.

    4) People will tend to continue an argument (“flog a dead horse”) if there is no resolution. The paper is still there, the author is still defending it, so why should people who disagree with its conclusions stop?

    5) I think the most potentially sexist thing about this story is the focus on Cuddy’s gender by the commenters here and by some of the articles online. There are plenty of men who have received similar, even less sympathetic, treatment: Brian Wansink, Carlo Croce and Christopher Shaw, to name some recent examples.

    6) Let’s just stick to the science instead of fussing too much about tone or selection bias of targets for criticism. This case was an example of the science of noise, of the sort where it’s easier to get results if the experiments are badly designed, the samples are small and the protocol is not blinded. And it was ridiculously over-hyped: “extraordinary claims require extraordinary evidence, not extraordinary publicity”.

      • Perfect, this whole reaction just shows how many leeches there are in academia, sucking out society’s resources with junk science and feeling bad when exposed later. For all those worried about the feelings of scientists: really, you have a problem. Scientists should not be married to hypotheses; they should be married to sound methodology. If someone shows you are wrong (correctly, of course), you should be thankful to that person for saving you time!

    • + 1. This.

      but with dissonance — it’s funny how strongly the norms of human interaction contrast with the norms of scientific criticism.

      Critic behavior doesn’t feel natural, or human. In a certain sense this is right. There is a norm violation going on here. Why say something if you have good reason to believe that someone is going to feel bad, and that it could hurt them? Even more so if we know there might be a disproportionate pile-on effect that we cannot possibly calibrate. The intuitive reaction is that a critic who proceeds anyway must be either emotionally obtuse, callous, or mean-spirited. Feeling this human side of things makes it difficult to simultaneously accept that it’s okay to go ahead and criticize because science has different norms. I have had some serious dissonance on this one. It seems like the resolution is to recognize that this kind of criticism is a necessary ingredient if we are to move forward, and that researchers agree, ex ante, that they are signing up for this. Perhaps another thing to recognize is that good criticism is colorful, playful, and sometimes flippant. Andrew Gelman is really good at this. It’s fun to read, not fun to be on the other side of.

      I don’t think the recognition that this is necessary should preclude us from understanding Cuddy’s perspective, feeling empathy, or expressing sympathy. But that is for later. Poor scientific practices should be called out until they change. BUT, there is still the ugly econ-job-market-rumors-style pile-on aspect that should be addressed somehow.

  47. Most of these people commenting here are “scientists” doing shoddy research with NHST, where they could get away with weak theories and “significant” results. Then they come here and say “hey, science is about consensus, our small group has agreed to pat each other on the back for significant results! How dare you!”, on top of that adding the ludicrous claim that “you can’t criticize a woman”. Andrew, these are not the people you should be trying to please; you should not care about how they feel or whether they understand your ideas correctly. These cargo-cult-science people are going to be exposed whether it hurts their feelings or not.

  48. My first thought was that Cuddy has extended her 15 minutes of fame by now adopting public victimhood. Will her next incarnation be as a self help guru touting “power poses of the mind”? I have always been uncomfortable with self-promoting academics and researchers seeking celebrity status. The best role model for research IMO is Howard Florey. His bio is spellbinding, his work changed the world, he shared credit with his colleagues, and he avoided publicity and personal gain from the discovery of penicillin. He even had some WWII spycraft going on to get penicillin safely out of England to the US.

    • Alternatively, Jack and KJA: I did not get any of what you say from either the article or the commenters on this blog. 1) We have no idea who initiated the article, so the idea that Dr. Cuddy herself adopted public victimhood in the article isn’t something I saw. 2) The NYT piece is a kind of journalistic case study of how this useful and good methodological revolution personally impacts researchers who were doing what they thought they were supposed to be doing at the time (obviously wrongly). 3) The NYT piece highlights how personal the attacks on Dr. Cuddy became. For example: “In one exchange in July 2016, a commenter wrote, “I’ve wondered whether some of Amy Cuddy’s mistakes are due to the fact that she suffered severe head trauma as the result of a car accident some years ago.” Gelman replied, “A head injury hardly seems necessary to explain these mistakes,” pointing out that her adviser, Fiske, whom he has also criticized, had no such injury but made similar errors.” Tell me- how is one not to take that as a pretty darn personal attack that is unjust, unwarranted, unhelpful, unnecessary, and not frigging about the science or methods? I don’t know how to read Dr. Gelman’s response as written in the article; he could have meant to downplay the comment and get back to the science, but it is not reported that he hit the commenter back for not staying on point. The NYT piece does note that Dr. Gelman tries to stick to the science, and of course he can’t be held accountable for what is posted. What he and all of us can take away from this is to be more emotionally intelligent with regard to how we critique, and what we critique, as scientists.

      I will also point out that the NYT piece further highlights the fact that Simmons and Simonsohn sent an ambiguous e-mail to Dr. Cuddy and, through their own failure of communication (and frankly, as a woman, I feel we have the upper hand over men in being emotionally intelligent about communication), put her in a position that they could then beat her over the head with: “Oh, yeah,” he said quietly. He had a pained look on his face. “We did say to drop the graph, didn’t we?” He read it over again, then sat back. “I didn’t remember that. This may be a big misunderstanding about — that email is too polite.” Cuddy and Carney had taken their advice literally. Simmons stood by his analysis but recognized that there was confusion at play in how they interpreted the events that transpired.”

      So please consider, before you blame the victim for being a victim, that there is culpability here on the part of commenters/Dr. Gelman/Simmons and Simonsohn, and in this case it *was* primarily about men attacking a woman. Take it a bit further and understand that the other 50% of the human race that men share the planet with, women, have often been ignored, shut out, and had to adapt to a communication style and tone in the sciences that makes *some* (certainly not all) of them uncomfortable and is quite frankly harmful and very often counter-productive to the aims. Certainly I did not take the attacks on Dr. Cuddy as ‘how dare they attack a woman’. I also did not take that from the NYT piece, which does describe Dr. Cuddy’s culpability in how she responded (or didn’t) and addresses the celebrity status of her work; she was a good target who happened to be a woman. What I take issue with (as a woman, with a PhD in a social science field) is the emotional cowardice and unskillful communication that Dr. Gelman/Simmons and Simonsohn exhibit in the piece (understanding it’s a newspaper article that can’t possibly capture all the nuance).

      • “a commenter wrote, “I’ve wondered whether some of Amy Cuddy’s mistakes are due to the fact that she suffered severe head trauma as the result of a car accident some years ago.” Gelman replied, “A head injury hardly seems necessary to explain these mistakes,” pointing out that her adviser, Fiske, whom he has also criticized, had no such injury but made similar errors.” Tell me- how is one not to take that as a pretty darn personal attack that is unjust, unwarranted, unhelpful, unnecessary, and not frigging about the science or methods”

        It might have been a bad joke, and it might have been a sincere question. Thankfully we are all free to say what we want to say (within the law), and in general it seems rather silly to want to try and police what others say and how they say it.

        I read that quote too, and I saw it as a question, which I then tried to think about. The following is not meant as “an attack”, or as being “disrespectful”, etc.

        I took a look at the TED-talk video (or transcript) where she talks about being in an accident, being told that she would not finish college, how her “core-identity” was being smart and that it was now taken away, how she then felt “powerless”, and how she was told to “fake it till you make it”, etc. (https://www.ted.com/talks/amy_cuddy_your_body_language_shapes_who_you_are/transcript)

        Now, I can see a direct possible connection between the story of her accident, her possible personal belief in and/or investment in her power-pose study and story, and her “core-identity” of being smart. I think she even sees this herself, because she told it in her TED-talk, and it seems to all be interwoven with the message she is trying to convey. I find it very plausible that all of this could have clouded her judgment, actions, interpretations, etc. of her work, and continues to do so.

        Regardless: you yourself already wrote “The NYT piece does note that Dr. Gelman tries to stick to the science- and of course he can’t be held accountable for what is posted.”

      • Dear SM,

        I am the person who wrote “I have wondered whether some of Amy Cuddy’s mistakes are due to the fact that she suffered head trauma as the result of a car accident some years ago” last July.

        This was not a personal attack, but was intended to suggest that there might
        be a valid reason for Cuddy’s articles to have such poor quality statistical
        work, and also to suggest, however obliquely, that perhaps all the critics
        of her work might keep this in mind.

        And yes, her statistical work is poor. I recently took a close look at one of her articles (not the 2010 article that has generated such criticism). The article uses undergraduate-level statistics — one-way ANOVAs, simple one- and two-predictor regressions. But there are multiple mistakes. The degrees of freedom are wrong. Some path-model coefficients are misreported and/or miscomputed. Etc. From a professor at *Harvard*. (A sketch of how reported degrees of freedom can be checked appears after this comment.)

        I have been willing to give Cuddy the benefit of the doubt. Statistics is hard for many people, and I am sure that it is even harder for people with, or recovering from, neurological problems. But suppose that I don’t give her the benefit of the doubt, and instead blame this poor statistical work on other things such as (a) incompetence, (b) sloppiness, (c) dishonesty, etc. Is that preferable, under the circumstances? In my opinion, no.

        Cuddy monitors this blog, and she sent me e-mail telling me that she
        found my comment “incredibly offensive.” So, I explained. She did not reply
        to my explanation. And now, I learn from Susan Dominus’s NYT article that
        Cuddy apparently reported my “hostile comment” to her (Dominus) and that
        Andrew should have reined me in. Apparently I am one of the persons
        “savaging” Cuddy’s intelligence, when in fact I was trying to protect
        her a bit.

        The situation is ironic, because I’m always the one suggesting, both
        publicly and privately, that we should be gentler with the junior
        people — the graduate students, the post docs, and the young, untenured
        assistant professors. See, for example, the 11 August 2017 thread “Consider
        Seniority of Authors When Criticizing Published Work?” (here on Andrew’s
        blog), which was initiated by me.

        Carol
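
        A quick aside on the degrees-of-freedom point above. The article and numbers in question aren’t reproduced here, so the sketch below uses made-up group sizes; it only shows the arithmetic a reader could use to check whether the df (and F) reported for a one-way ANOVA are consistent with the stated design. The scipy call is standard; everything else is illustrative.

        ```python
        # A minimal sketch (not anyone's actual data): checking whether the degrees
        # of freedom reported for a one-way ANOVA are consistent with the design.
        # Group sizes below are made up purely for illustration.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        groups = [rng.normal(size=n) for n in (20, 22, 19)]   # hypothetical cell sizes

        k = len(groups)                      # number of groups
        N = sum(len(g) for g in groups)      # total sample size
        df_between = k - 1                   # numerator df
        df_within = N - k                    # denominator df

        # F computed by hand from sums of squares ...
        grand_mean = np.concatenate(groups).mean()
        ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
        ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
        F_by_hand = (ss_between / df_between) / (ss_within / df_within)

        # ... should match scipy's one-way ANOVA.
        F_scipy, p = stats.f_oneway(*groups)
        print(f"F({df_between}, {df_within}) = {F_by_hand:.3f}  (scipy: {F_scipy:.3f}, p = {p:.3f})")
        ```

        For instance, a two-group comparison with 44 participants should report F(1, 42); a mismatch between the reported df and the reported sample sizes is exactly the kind of inconsistency described above.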

        • Carol:
          Interesting that you mentioned simple statistics. Use of parametric tests when the assumptions are not met seems to be pretty common in study analyses in the social sciences (a sketch of that kind of assumption check appears after this exchange). I actually raised this with a researcher in psychology who was trying to improve the field’s rigor in quantitative analysis. She told me that it followed from how the graduate students were trained. I also wonder whether some of this aversion toward using more complicated statistical methods stems from the fact that those who chose the social sciences as a profession already find math and statistics difficult. Nevertheless, these possibilities should be even more reason for them to ask for help from statisticians during study design. I am an epidemiologist, supposedly with good quantitative training, and I still ask statisticians when I am designing a study.

        • “I am an epidemiologist, supposedly with good quantitative training, and I still ask statisticians when I am designing a study.”

          Sounds very reasonable to me! Maybe statistics classes ought routinely to include the caution to “check it out with a statistician before collecting data.”
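
          Since the point about running parametric tests when their assumptions are not met keeps coming up, here is a minimal sketch of the kind of check being discussed. The data are simulated and the 0.05 thresholds are only illustrative; whether one should formally pre-test assumptions at all is itself debated, which is one more argument for the “talk to a statistician at the design stage” advice above.

          ```python
          # A minimal sketch: before a two-sample t-test, look at normality and
          # equality of variances, and fall back to a rank-based test if they look
          # doubtful. Data are simulated; thresholds are illustrative, not a recipe.
          import numpy as np
          from scipy import stats

          rng = np.random.default_rng(1)
          a = rng.exponential(scale=1.0, size=30)   # skewed "treatment" group (made up)
          b = rng.exponential(scale=1.3, size=30)   # skewed "control" group (made up)

          _, p_norm_a = stats.shapiro(a)            # normality within each group
          _, p_norm_b = stats.shapiro(b)
          _, p_var = stats.levene(a, b)             # homogeneity of variances

          if min(p_norm_a, p_norm_b, p_var) > 0.05:
              stat, p = stats.ttest_ind(a, b)       # parametric test when assumptions look OK
              test = "t-test"
          else:
              stat, p = stats.mannwhitneyu(a, b, alternative="two-sided")  # rank-based fallback
              test = "Mann-Whitney U"

          print(f"{test}: statistic = {stat:.3f}, p = {p:.3f}")
          ```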

      • “I will also point out that the NYT piece highlights further the fact that Simmons and Simonsohn sent an ambiguous e-mail to Dr. Cuddy- and through their own fault of communication (which frankly- as a woman I feel we have the upper hand in over men in terms of being emotionally intelligent about it) put her in a position that they could then beat her over the head with: “Oh, yeah,” he said quietly. He had a pained look on his face. “We did say to drop the graph, didn’t we?” He read it over again, then sat back. “I didn’t remember that. This may be a big misunderstanding about — that email is too polite.” Cuddy and Carney had taken their advice literally. Simmons stood by his analysis but recognized that there was confusion at play in how they interpreted the events that transpired.” So please consider before you blame the victim for being a victim that there is culpability here on the part of commentors/Dr. Gelman/ Simmons and Simonsohn- and in this case it *was* about primarily men attacking a woman.”

        Please read the following post (and the ones above it) for further information, and please keep in mind that according to the NYT story, Carney was the one who asked for help/ initiated the contact:

        http://statmodeling.stat.columbia.edu/2017/10/18/beyond-power-pose-using-replication-failures-better-understanding-data-collection-analysis-better-science/#comment-591214

    • Yes, it is indeed good — not simplistic.

      Some comments brought to mind while reading it:

      1. I just don’t have much tolerance for extreme statements, whether they come from Susan Fiske or Steve Sailer. (Yes, this one is a vent).

      2. Re the Steele et al work on stereotype threat mentioned in the article:

      a. I’ve got some commentary on it and related papers starting here: http://www.ma.utexas.edu/blogs/mks/2014/06/22/beyond-the-buzz-on-replications-part-i-overview-of-additional-issues-choice-of-measure-the-game-of-telephone-and-twwadi/

      b. Cuddy’s work had that same “miraculous” appearance that Engber noted in his Slate article. Rationally, anything appearing that miraculous ought to be met with lots of doubt and critical reading. Yet, such was the state of psychology at the time that it was not met with such skepticism. I have seen another program (Emerging Scholars) that appears (to my eyes) to have improved success rates of minorities (and women) in math. But it is much more complex than the stereotype threat approach, and therefore much more difficult to analyze. A catch-22: Short of (very unlikely) miracles, solutions to the (mostly complex) problems of society (education, inequality, etc.) need to be complex, which makes them very difficult to evaluate.

  49. This brouhaha strikes me as just a battle in the perennial war between so-called purists and popularizers. Carl Sagan was held up to scorn and ridicule by the astronomy pharisees in the same way Amy Cuddy is being scorned and ridiculed by Gelman and his sycophants. My advice to Amy Cuddy is the same advice I, an elderly man, have given my daughters: measure yourself by your own yardstick, not the one other people try to hand you.

  50. The only point upon which I have any confusion is: how is Cuddy’s work any threat, on a scientific level? Because the only thing that justifies this much attention is the goal of protecting people from bad information. It became a social trend (does anyone remember Farrah Fawcett hair? Okay, but would anyone sport it now?), but science, which is to say posterity, is well protected from being negatively influenced by the power pose findings, in which case this focus on a dead-as-a-doornail study is entirely for sport. There are older studies, upon which loads of research has been built, and those do warrant attention – not endless, beat-the-poor-old-dead-horse-until-your-hand-hurts attention, but critical and thoughtful review that exposes what is and what is not relevant, and somewhat urgently, before new research cites it any further. Cuddy’s work, all work, needs to be verified, but after that, use our finite allotment of time to move on to other things, unless the danger to the public, and to the bastion of scientific research, warrants extra steps. Power poses haven’t been scientifically proven to be impactful, and I think it’s safe to say most are convinced and it ends there, mission accomplished. It doesn’t deserve any more attention, unless it has moved from an issue of scientific evaluation over to one of recreational judgement, in which case you may as well be talking about her hair for all the relevance to science any further critique will have.

    • Good points, but Gelman wants change now, not in the distant future. But one thing bothers me about Andrew’s posts. He only goes after people he does not know. It’s a very general thing, applicable to everyone: if you are on good terms with someone then you will be much less likely to ridicule their work. I am also like that. So a nice within-subject manipulation would be to have Gelman spend time with Cuddy. Maybe Andrew should take a woman statistician with him and ask her to do the talking.

      • What sort of change would satisfy her peers in the scientific community? She left her position at Harvard. I do think that change, in the form of challenging and pretty much decimating Cuddy’s credibility, has been achieved as far as the scientific relevance of her big project. Does Gelman also want to have her public appearances boycotted? I cannot find any other reason for him to be addressing this any more other than to try to draw blood…or share her spotlight. He can take exception with the journalist, who crafted the article and therefore controlled the tone and implications. But Cuddy’s work and the resulting criticism have happened. Quite over. Anything after feels petty since it is without justification, and petty is not scientific.

        • Jennifer:

          In answer to your question, “Does Gelman also want to have her public appearances boycotted?” No. I’ve never said or implied such a thing. Indeed, if you read the above post you’ll find a long discussion of the potential value of Cuddy’s public speaking. Cuddy’s an effective public speaker and has a lot of other talents too. I have never suggested, nor would I want to suggest, that her ideas or her work or her speaking be suppressed or boycotted in any way. Indeed, I find it horrible that anyone would think that I would want this. So, despite my frustration in seeing this comment, I’m glad it appeared so I have a chance to clear this one up.

          Also you write, “I cannot find any other reason for him to be addressing this any more other than to try to draw blood…or share her spotlight.” No. I have no desire to “draw blood” nor do I have any desire “to share her spotlight.” I wrote this post in response to a news article that, in my impression, had left some misleading impressions. It was not my idea for the New York Times to write this story. The reporter talked to me, not the other way around. And when I was given the opportunity to be photographed for this article (something you might expect I’d want, if I were really trying to “share her spotlight”), I declined, because, as I told the people at the New York Times, to me the story is about the science, not the personalities.

          So, again: No I don’t want to suppress any speeches, videos, books, scientific papers, or anything else. No I don’t want any boycotts. No I don’t want to “decimate Cuddy’s credibility”: the discussion is about published work, not about individuals. No I don’t want to “draw blood.” No I don’t want to share the spotlight. Yes I was bothered by statistical misconceptions that in the view of myself and many others, can lead to scientific confusion; yes I think the topic is important enough to write about it multiple times in different places; no it’s not about boycotts, destruction of credibility, drawing blood, or sharing spotlights.

          I appreciate the effort you put into posting this comment, I’m glad to have a chance to clear things up, and I regret anything I’ve written earlier that may have given these mistaken impressions.

        • Shravan wrote, “Gelman wants change now, not in the distant future.”

          Jennifer replied, “What sort of change would satisfy her peers in the scientific community?” and then speculated regarding possible changes concerning Cuddy.

          I interpreted Shravan’s comment as talking about changes in statistical/scientific practices. I just don’t understand people who think in terms like, “I cannot find any other reason for him to be addressing this any more other than to try to draw blood…or share her spotlight.” That is so strange to me.

  51. Just finished reading the article in the NY Times Magazine. While I appreciate your “intent” I don’t really buy the idea that you are driven by a need for purity within the field. Your passion seems to be more parasitic in nature. The academic world is at times nothing more than a bunch of squealing monkeys scampering over each other for a few pieces of fruit at the top. And how dare a woman cross the bridge to the civilian side. If there are methodological errors, then cite that and replicate based on this information. Otherwise, you appear to be the bully in the playground. Insecurity drives arrogance.

    • Susan:

      1. You write that you don’t really buy the idea that I am “driven by a need for purity within the field.” That’s good, because I’m not driven by a need for purity within the field, nor have I ever claimed such a thing. I think a push toward “purity” would be horrible! As a scientist I think our work should be diverse, not “pure” in any way.

      2. I have no problem with papers being published on power pose or just about any other topics. As I’ve consistently said over the years, I think that publication is great, and post-publication criticism is great too.

      3. I don’t know what this has to do with anyone’s sex, but if this is your concern you might want to take it up with Eva Ranehill, Anna Dreber, and the others who published that first failed replication of power pose. Or you might want to take it up with Dana Carney, who wrote forcefully that “However since early 2015 the evidence has been mounting suggesting there is unlikely any embodied effect of nonverbal expansiveness (vs. contractiveness)—i.e.., “power poses” – – on internal or psychological outcomes. As evidence has come in over these past 2+ years, my views have updated to reflect the evidence. As such, I do not believe that “power pose” effects are real.”

      You could also consider the cases of famously non-replicated claims in the literatures of embodied cognition, ESP, sex ratios, first letters of names, and other areas that are associated with mistakes made by male researchers. The consistent pattern here is that researchers, men and women alike, have made a lot of mistakes, we’ve learned a lot, and we’re trying to do better here.

      4. I don’t know where your remark, “how dare a woman cross the bridge to the civilian side,” comes from. I think it’s just fine when researchers, male or female, have ideas that they’d like to share with the wider population, and I admire people such as Cuddy with the talent to do this. As I wrote in my post above (in the last section before the P.S.), I hope Cuddy can continue to do this, just detaching her potentially valuable messages from the scientific claims for which there is no clear evidence.

      Anyway, in quick summary, again I would strongly oppose anyone whose goal is “purity within the discipline,” and obviously there was some miscommunication if anything in that NYT article gave any other impression. Science is not about purity. And the goal of methodological criticism is not to insure purity; rather, it’s about allowing more voices and perspectives, so that we can listen to the Eva Ranehills and Uri Simonsohns of the world when they have valuable things to say.

    • Again, I simply don’t get what Dr. Gelman’s possible motivation/mind state/intention have to do with a) the validity of his or others’ criticism of power pose or other studies and b) the information provided by several unsuccessful power pose replication attempts?

  52. This is the kind of comment where you go from being a critic to being a paternalistic bully: “some silly paper published in 2010.” Pretty sure there’s a hint of sexism in that comment as well. It was a peer-reviewed paper that seems to have turned out to be wrong.

    • Anon:

      Even peer-reviewed papers can be silly. Even peer-reviewed papers by men, for example that ESP paper by Daryl Bem that got a lot of attention a few years ago. Peer review is fine, but sometimes silly papers get through nonetheless. It happens.

  53. “Even peer-reviewed papers by men…”

    I find that “even” extremely objectionable. “Even” invokes a scalar, with extreme end-points. Even when attacking men, one conveys disbelief that men lie on the right end of the scale.

        • Shravan:

          Anyone who’s been following science for the past several years will know that lots and lots of peer-reviewed papers are silly. It happens. And, sometimes a silly paper can be valuable. I might laugh at silly papers but that doesn’t mean I think they should never be published. I think silly papers should be published, and criticism of silly papers should also be published—ideally in the same place as the original publication. The sex of the author seems irrelevant in any case. I don’t really care if a paper is written by men or by women or, as is often the case, by a team including members of both sexes. I just brought up the Pythagoras thing because you mentioned the word “even,” which reminded me of that numerology regarding odd and even numbers. Silly all around, which I guess is what one can expect when commenting at 2 in the morning as a distraction from real work.

        • Got it; but of course I fully agree with your substantive point. I never thought you would ever make a male/female distinction. But we saw from the comments that it certainly led people who don’t know anything about you to treat you like a sexist. Just as we should not call women girls, we should be careful about how our language can seem to reflect prejudices.

        • I hope we can one day stop having to “be careful about how language can SEEM to reflect prejudices.” We should be careful not to let our thinking reflect our prejudices any longer than it takes to correct them. We should hope that our language reflects our prejudices clearly enough that others can help us correct them. Many prejudices, and the ones that probably bias our work the most, are unknown to us, or we are unaware that they are merely prejudices.

          Kurt Vonnegut’s son, Mark, said, “We’re here to help each other get through this thing, whatever it is.”

          Calling out someone for saying “even” in “even men can be silly”, in the context of defending himself against a completely baseless accusation that he thinks that only women ever make silly mistakes, is counterproductive. If we have to be this cautious in our language we will have a very hard time thinking. We will certainly have a very hard time asking other people to help us think things through.

        • I agree in general with your points; I don’t like overly PC language. But this current topic is *all* about language, about wording and phrasing.

          Serious question:
          What are your limits here? Can I call my female postdocs and students girls? If not, why not? Can I call my female students’ work “sexy”? We use sexy all the time in academia, for cool results. As a white (I assume) male, you probably have a lifetime of privilege behind you. Women have a lifetime of belittling language behind them. It’s hard to imagine what the impact is for them, because we have an advantage in society over them and are never called boys in uni. Nobody comes on to us and treats us like sex objects. See, e.g.:

          http://www.motherjones.com/politics/2017/09/she-was-a-rising-star-at-a-major-university-then-a-lecherous-professor-made-her-life-hell/

          There is a 103-page legal complaint behind this; what Mother Jones discusses is just the tip of the iceberg.

        • Thank you for this, Shravan.

          I think a key point here is: Is behavior toward the minority group condoned when it would not be condoned toward the majority group?

          In this case, if the professor had treated male students as he did female students, I think there would have been an uproar much earlier.

          I am very lucky that I have never encountered the type of behavior described in the article. I only recall one incident of inappropriate touching from a colleague. I did not act on it except to request that the department chair (with no reason given) not put me on any committee with the colleague. Sometime later, a graduate student complained to the department secretaries about the same faculty member; word got to the chair, who told her to talk to me. She did, I talked to the chair with more information, and he had a talk with the errant colleague, about whom there were never any more complaints of this type.

        • Sorry, Shravan, it’s too long a conversation to define limits in this area. I don’t think I have to draw a bright line between Andrew’s “even” and a “lecherous professor”. My point is only that if your hypothesis isn’t that the speaker is actually a misogynist (that his language reflects ingrained prejudice that causes him to discriminate in practice and abuse his students) then it does more harm than good to call out simply the language.

          You called out Andrew for a very subtle linguistic sign of what *might* be (very mild) sexism if you didn’t know better. And you did so even though you do know better. I expressed my distaste for that rhetorical move, which I think has proven destructive to our discourse. I certainly don’t think it is constructive. I think it is even less constructive to say, “Okay, if we don’t object to people saying ‘even men can be silly’ [in a context where it clearly isn’t sexist], then how can we censure a lecherous professor who implies that sleeping with him will be good for a woman’s career?”

        • Elin:

          I said: “I hope we can one day stop having to be careful…” You said: “We should keep in mind…” We disagree about a “should statement”. You think we should be careful what we say because people might misunderstand; I think people should apply the principle of charity when interpreting the speech of others. When someone is obviously being civil and polite, looking for shibboleths of sexism in words like “even” is simply not constructive. If that sort of caution is really needed in America, I’m happy to live in Europe. I’ve never been to New York but it’s not really known for the linguistic caution you are arguing for. Or maybe that’s just some parts of NYC?

          Your second-to-last paragraph is interesting. It sounds like you are suggesting that, when talking to a woman, I should “keep in mind” that she may have been molested in the past and, therefore, if I say something like “even men can be silly” she might be distressed. I’m not sure if that’s what you mean when you suggest we should be mindful of the possible differences between our experiences. I don’t, like I say, want to tell women how they should feel. But I know that I would find it extremely patronizing if someone made an assumption about me as a post-traumatic (or less clinically sensitive) simply on the basis of my gender and treated me with extra caution as a result. The narrative we are seeing unfolding, it seems to me, is suggesting that we should assume that (American?) women in general have been the subject of constant humiliation throughout their lives and so we have to watch what we say to them.

          I think we have a substantive disagreement. You really believe we should be careful what we say out of concern for the finer sensibilities of women. I really hope that we don’t have to keep that up for long because women will learn to take the things we say with a grain of salt, i.e., be charitable in their interpretations until we really do threaten to fire them if they won’t sleep with us. (Of course: that threat can be quite implicit and still be actionable.)

          Martha: I agree that this is disproportionately detailed discussion about a very minor thing. But both Shravan and Elin have suggested that Andrew should not have said “even men” because it is somewhere on a spectrum with Harvey Weinstein at the other end. I think that is quite serious. Or silly, if you will.

        • The “even” definitely comes across as expressing a kind of patronizing tone in American English. That is “even Einstein had trouble at school” tends to basically say, yeah we expect you to have trouble at school because you are no Einstein. “Even men can have this happen” and you are not a man. It’s the kind of thing that in the context of a larger discussion can relay a negative message.

          Thomas insists that he knows best about how women should feel about certain things that are done to them, physical or verbal, so it’s not even worth engaging in a discussion (see what I did there).

        • Elin:

          Just to clarify this particular point: I wrote “even . . . ” within an annoyed and sarcastic response to what I considered a ridiculous blog comment. The commenter accused me of sexism for criticizing “a peer reviewed paper [by two women and a man] that seems to have turned out to be wrong.”

          I guess it would’ve been better to have not responded at all. But just to be 100% clear, I will rewrite the comment without the offending word:

          Peer-reviewed papers can be silly. Including peer-reviewed papers by men, for example that ESP paper by Daryl Bem that got a lot of attention a few years ago. Peer review is fine, but sometimes silly papers get through nonetheless. It happens.

        • Hi Elin, I agree that this isn’t worth discussing, but I wasn’t claiming to know how women should feel. I was telling a man that he was wrong about how objectionable something another man said was. I was lamenting one man’s objection to another man’s choice of words. It’s possible that Shravan holds some view about how women should feel when Andrew says that “even” men can be silly. My view was that we shouldn’t worry too much about how women feel about that, in part because, as you point out, women don’t benefit very much from my opinion about how they should feel.

          Like everyone else, I sometimes say something that other people misunderstand. In those cases, I might indeed think they “shouldn’t feel” angry or saddened by what I said. That is, I didn’t mean to make them feel that way. Sometimes, of course, I can be made to understand why they feel that way even though that wasn’t my intention. Other times, it’s “on them”, as Americans say. And, on that note, let me conclude by registering my distinctly European objection to moderating my speech online to the “tone in American English”. The world is somewhat bigger than that.

        • Thomas:

          Re American … Hence I specifically and carefully explained why some might not get the subtle meaning in the US context. I can’t believe it was hard to understand that point since I was so specific. But responding to women without sounding patronizing is challenging as you illustrated again.

          Andrew:
          I know you didn’t mean it that way; it’s just how it reads. We all should keep in mind that what we mean to say is not necessarily what people hear.

        • Hi again Elin. I’m struck by the way you and Shravan both say you understood Andrew’s meaning, don’t think he was being sexist, and yet would nonetheless have him moderate his tone.

          “We all should keep in mind that what we mean to say is not necessarily what people hear,” you say. I agree and I think most people have experienced misunderstandings like this. So why don’t we just apply the corollary? We all should keep in mind that what we hear is not necessarily what people meant to say. This is sometimes formalized as the principle of charity. In civil discourse (such as that modeled by Andrew) it means interpreting the speech of others in ways that maximize its reasonableness. (Many people here seem intent on reading all attempts at civility from Andrew as “disingenuous”. I can only see his sincere desire to be understood.)

          You’re right that it’s hard to talk to some people without sounding patronizing. Some people, however, also have a hard time listening to others without feeling patronized–especially when they’re actually, in a particular instance, wrong. Since I do in fact feel like I’m more right about this than you are, you are probably just misinterpreting that feeling as some sort of inherent self-righteousness, which I assure you I don’t feel. Much less do I feel generally superior to women. You and Shravan are two individuals, one male and one female, who I believe are wrong about a particular kind of tone policing.

        • Andrew:

          I feel that is a solid edit; it makes the sentence stronger and more focused, besides dealing with the other issue.

          Thomas:

          First, my statement is not about “should”; it is a statement about a general and consistent finding about how human beings interact. What they hear is not always what the speaker intended to communicate. This is why marketers spend so much time and money on studying the impact of small changes in wording (and other items) on people’s responses to advertising, just to use a small example. I’m not in any way in the business of telling people what they “should” do, except that if they want to be effective communicators they should consider this basic reality. If you want to take that basic finding and use it to make “should” statements, that’s fine, but that’s opinion, and I have mine and you have yours.

          Second, Andrew is someone I have read a lot, interacted on line with a fair amount, emailed back and forth with, have mutual acquaintances with, etc.; since I am also an academic in New York that is not surprising. I don’t know him in person, but people think he is okay, and I feel confident that he does not intend to offend. Which is not to say that he didn’t offend some people. Actually, I think not telling him that would be kind of mean.

          However, other people don’t have that background; lots are brand new because of the NYT article. Further, I will always try to move the conversation from the specifics of this particular sentence from this particular person to the general question of how humans interact with each other. I don’t think discussing Andrew is interesting as a general topic (discussing his work is); discussing how women’s work may be talked about differently in a general sense, and how women may interpret comments about their work, using this setting as an example, is more interesting. Discussing the complexity of human communication and interaction is even more interesting.

          Further, as a rhetorical device, acknowledging the sincerity and good intentions of the person you are talking to (and I’m not doing that insincerely at all) makes a tough conversation easier to have. This is also a somewhat distinctive American style where we always look for a positive thing to say when starting a hard discussion.

          Whether you want to concede it or not, people come into a conversation with a history of past experiences, and those experiences for women are very different. In the US at least, women and African American people in particular will have almost universally experienced multiple incidents of people being at minimum patronizing to them, and often will have had experiences where their jobs, funding or physical safety have been at risk. And that background comes into every conversation, just as your background comes into every conversation that you have. I have no idea whether anyone has ever grabbed your genitals at work or threatened your funding if you wouldn’t have sex with them or asked you to “just” watch them take a bath, as in the recent examples in the news here. I don’t know if you have ever been at a conference and had someone ask if you were the partner of someone attending. Or any of the other small indignities that, like water on a stone, wear people down. Maybe those things have happened to you, maybe they haven’t. But your history of having those things happen to you (or not) is shaping your discussion of this issue.

          I’m of a generation of women that would (as Martha described) traditionally have said let’s just discuss it on the whisper-net. But instead I’m actually explaining to Andrew and others something that might be going wrong in how their words are coming across. I think this willingness to speak up is a meaningful change in society and actual progress, not just for women but for everyone.

        • Elin said,

          “I’m of a generation of women that would (as Martha described) traditionally have said let’s just discuss it on the whisper-net.”

          I don’t remember saying that (which doesn’t mean I didn’t say it), but would appreciate it if you could point out where I said it.

          To be honest, what I was thinking while reading your and Thomas’s posts was, “This seems so minor compared to when meetings and seminars ended with, “Thank you, gentlemen.””

        • Elin, Thomas-

          I am struck by how interesting your conversation is.

          The problem I see is that we are unevenly endowed with the skills to anticipate other people’s sensitivities, either because of innate ability or experience (e.g., different cultural backgrounds). We also have to face different time and resource constraints, which can prevent us from fully exploring all possible interpretations of our message. What to do? Communication is risky. Conditional on communicating we have *some* control over how much risk we expose ourselves to, but it’s limited. I say go ahead and accept the possibility that some people will misinterpret us; there is too much to gain.

          On the other hand, we shouldn’t be shocked when people appear to be offended on occasion, and we should be grateful when they let us know, because this provides an opportunity to clear things up. When it becomes a pattern, we should reflect on the possibility that the problem is us, and that maybe we should modulate our behavior (I acknowledge that it could be driven by an outrage-culture problem, but that’s harder!). But the pattern cuts both ways. If we find ourselves offended often, we should reflect on the possibility that the problem is us jumping to conclusions (I acknowledge that it could be a cultural problem too, but, again, that’s harder!).

          Still, I think the onus is on the (provisionally) offended party to follow the principle of charity, as it’s easier to withhold judgement than it is to read someone else’s mind.

        • Elin,

          I realized belatedly that what you were referring to was probably my response to Shravan about the inappropriate touching.

          I had not encountered the phrase “whisper-net” before. Frankly, I don’t think it fits the situation I described well. I saw my action in talking to my department chair as going to someone whom I trusted and who was also in a position of authority. Part of what I mean by “trusted” is that he would respect my privacy.

        • Joshua Miller said:

          “The problem I see is that we are unevenly endowed with the skills to anticipate other people’s sensitivities, either because of innate ability or experience (e.g., different cultural backgrounds). We also have to face different time and resource constraints, which can prevent us from fully exploring all possible interpretations of our message. What to do? Communication is risky. Conditional on communicating we have *some* control over how much risk we expose ourselves to, but it’s limited. I say go ahead and accept the possibility that some people will misinterpret us; there is too much to gain.

          On the other hand, we shouldn’t be shocked when people appear to be offended on occasion, and we should be grateful when they let us know, because this provides an opportunity to clear things up. When it becomes a pattern, we should reflect on the possibility that the problem is us, and that maybe we should modulate our behavior (I acknowledge that it could be driven by an outrage-culture problem, but that’s harder!). But the pattern cuts both ways. If we find ourselves offended often, we should reflect on the possibility that the problem is us jumping to conclusions (I acknowledge that it could be a cultural problem too, but, again, that’s harder!).”

          +1

    • I assumed the point of the ‘even’ was to _ridicule_ the idea that bad science is somehow limited to that done by women.

      But given the context and things, I suppose it is understandable that it was misinterpreted or judged inappropriate? I’m not really the right person to weigh in here but I was surprised no one pointed out the obvious intention.

      • Yes, you are of course right. But the ambiguity is precisely the problem. It’s all about your priors, ironically. If you know Gelman well and know for sure he is not sexist (I believe that), you choose the intended interpretation. If you already think he is sexist, as many of these anonymous posters and Cuddy do, you may well take the other interpretation.

        • I think, just like in the “Everyone’s a Little Bit Racist” song, you might want to consider that it is not a dichotomy, not self-aware, and not simple. And I agree with @ojm that the context matters. It’s not a particularly good rhetorical device in this context. It is also unfortunate that in the US this discussion is happening right in the middle of a series of horrible revelations about the experiences of women outside of the science/academic context. It’s bad timing that the NYT piece came out in that context; maybe it wouldn’t have changed the responses of the psychologists posting here, but it is almost certainly affecting other people.

      • Ojm:

        Yes, indeed the point of the “even” was to ridicule the whole idea. Intonation is notoriously difficult to convey in typed speech. And it can be extremely difficult to communicate with people who are already convinced that you’re acting in bad faith. The result can be a stilted, legalistic form of writing that inhibits clear expression and communication.

  54. The NYT article is incoherent nonsense. On one hand it says that authors are invariably partisan defenders of their work, and this is normal and expected behavior. On the other hand it is implied that it is too harsh to single out and publicly criticize someone’s bad methodology.

    So how is science supposed to progress and self-correct, according to this enlightened journalist? Are we to believe that privately pulling aside these famous, self-partisan scientists will invariably make them see the light?

    Most scientists are not angels and their worst tendencies are kept in check by public criticism.

    • “Most scientists are not angels and their worst tendencies are kept in check by public criticism.”

      Most of the public is unaware, I think. Regardless, public criticism is very healthy. That may be why it seems to me that some “scientists” want to keep criticism outside of the general public domain, e.g. blogs.

      First it’s about tone, then you tone down, then it’s still not good, then you can’t talk about motivation or intention, then you can’t name names, then you can’t say this, or make that comparison, then you can’t give your opinion or at least not when it’s offensive to someone, then you simply shut up. Or not.

      I think Cuddy’s story in the NYT is telling in that regard. She complains about critics not helping her. Then she (or Carney as her representative) asks for help. The critics are very nice: they try to help, and to be careful and helpful, and somehow end up getting blamed for being “ambiguous” and setting Cuddy up for failure. I sincerely hope that, when Cuddy calls on them for help the next time, they say no.

      Part I: “She played by the rules, then the rules changed” (2017)
      Part 2: “She makes her own rules, then wants everyone else to follow them or else they are bullying and offensive” (coming soon to a theater near you)

      I have frankly given up on her field as of recent events on the PsychMAP Facebook group, and Cuddy’s Twitter.

      https://twitter.com/amyjccuddy/status/922619235892948992

      In 15 years’ time, every article in her field will be an “ethnographic” paper, so as not to offend anyone and to let everyone share their opinions and experiences.

      This twitter account gives you a glimpse of the future (it’s hilarious to follow that one):

      https://twitter.com/realpeerreview?lang=en

      Here is Stephen Fry on “being offended”:

      https://www.youtube.com/watch?v=bq5dNcrHE8w

  55. It reaffirms my decision to not enter academia that even such a politely worded response as this one can attract outrage from people who interpret opinions as dictations and advice as hateful gatekeeping. People are crazy.

  56. Here is another example of why the quality of data matters, from another highly publicized theory… https://www.bloomberg.com/view/articles/2017-10-23/piketty-s-inequality-theory-gets-dinged?utm_content=view&utm_campaign=socialflow-organic&utm_source=twitter&utm_medium=social&cmpid%3D=socialflow-twitter-view

    And of course, it is another great example of why we should examine and reexamine research findings (especially those with significant policy implications) if science is to be self-correcting…

  57. Can’t resist the opportunity to share this piece of well-timed doppelgänger irony (or the chance to use doppelgänger in a sentence not also containing the words “+1 sword”)… I personally am going with bulleted reason number 4 and will suggest that Dr. Cuddy might have been absent from that particular day in second grade.

    https://twitter.com/carney_dana/status/921554150583803904

    For the life of me, I can’t understand why there is not more general discussion of second-author and co-author ethics entwined with this discussion.

    • Anonymous 10:

      I have no envy for Amy Cuddy or Uri Simonsohn or the other people involved in this story, and I doubt they have any envy for me. We’ve all been through a lot, and we’re all doing our best here. We have different perspectives and we’re expressing them.

      My perspective on power pose is informed by the papers I’ve read and the statistical analyses I’ve seen, and is expressed in the top paragraph of the above post. Dana Carney’s perspective on power pose is informed by her experience conducting that famous experiment and following it up. Cuddy’s perspective is informed by the research projects she’s conducted and been involved in, along with much personal experience and a huge amount of positive feedback on her Ted talk. Fair enough. We all have different sources of information, different perspectives, and, unsurprisingly, different views.

      Nothing’s being torn down. The Carney/Cuddy/Yap paper is up on the web for all to see at all times, along with the Ranehill et al. replication study, the Simmons and Simonsohn analysis, Carney’s statement, and all the rest. It’s all there and nobody’s trying to hide anything.

      Also no inquisitions. What’s happening is public discussion of scientific claims—which is really the exact opposite of the Spanish Inquisition, which was all about suppression of different views (or at least that’s how I remember it from Monty Python).

      I find the whole episode so sad. Dana Carney, Amy Cuddy, Eva Ranehill, Uri Simonsohn, and all the other players in this drama have been working hard for years to try to understand a complicated topic. This is admirable. Researchers have disagreements, and people have legitimate differences on how strongly to publicize speculative ideas, but we’re all on the same team here.

      It’s just horrible that communication has broken down so far that people are so full of anger here. Passion, sure: if power pose really is wonderful, it’s too bad that it’s being questioned, and if power pose is actually a waste of time or even counterproductive, then it’s too bad it’s been promoted so hard. But, again, these are legitimate points of dispute—in both directions. (I’ve never claimed that power pose doesn’t work, only that I have some skepticism and I don’t see the evidence as convincing.)

      Again, open scientific discussion is no inquisition. Quite the contrary.

      • “I find the whole episode so sad…. we’re all on the same team here.”

        Sorry, this is just so staggeringly disingenuous, Professor Gelman. It’s an exercise in PR, now that the questionable behavior of you and your colleagues has been so widely publicized.

        • Psychologist:

          I looked up “disingenuous” and the definition is “not candid or sincere, typically by pretending that one knows less about something than one really does.” But I am being candid and sincere. You don’t need to believe me on that, but in that case there’s pretty much no prospect of communication. I really do think that we’re all on the same side, and this episode makes me sad. As I wrote in my post above, lots of things about this story make me sad. I wasn’t happy about writing that Slate article with Kaiser Fung either. We felt the topic was important and the story needed to be told. My colleagues and I have consistently worked for better science, as we see it, and I have no doubt that the other people in this story are doing so also, as they see it. We have different perspectives, there are scientific disputes, and we’re using different communication channels. Everything’s out in the open here, which I think is good. As I responded to the previous commenter, there’s no suppression, no inquisition, etc., of any sort.

        • Um, clearly the exercise in PR was the NYT article in the first place… Do you wonder how it made the cut despite containing no new information? It was written solely from Dr. Cuddy’s point of view and served to cast vocal critics (with valid scientific points) as misogynists. I’ll note that the letter later posted is basically a bullet-point pitch for an upcoming book….

    • Anonymous 10:

      Ha! Got me on that one. Here’s the story of my presentations: about 10 years ago I wrote a book and, knowing that I was going around giving talks, I took someone’s advice and paid for a 4-hour private lesson on public speaking. It was pretty good, and one thing the instructor told me was that when I’m presenting with slides, I should stand right in front of the slide and point with my body, not just stand passively by the podium. The rationale for this was that if you speak at the podium and refer to slides, the people in the audience don’t know what to look at. In any case, I found that advice to be useful even though there was no statistical study backing it up. To return to power pose, the evidence I’ve seen doesn’t seem to support there being an across-the-board effect as originally claimed. But that doesn’t mean that power pose is useless; I expect it depends on context. The last section of my post above (before the P.S.) discusses some ways that perhaps this could be examined. But, yes, I’m not trying to stop people from doing power pose or other such interventions if they think it could work for them. There are lots of debates about the evidence but I think it’s great for people to try out different ideas to improve their public speaking and all sorts of things.

      • I was thinking about this kind of thing this morning: without a doubt I have a little ritual I do before speaking professionally, and also a different one I do before I get to campus to put myself in a good frame of mind for the day and for teaching in particular. And I really know that for me, if I do certain things, they work. So even though I don’t think that there is some specific pose that works for everyone, and I have no clue whether there is a physiological dimension, just anecdotally I feel that there is something to the idea that doing something like that can be a useful practice. I’m way too interested in difference to think that there is going to be one position that works for everyone across all times and cultures (and ages and genders).

        • +1

          And I think this gets at what Andrew means when he talks about “variable effects”.

          Indeed, I think that much of science involving people (e.g., psychology and medicine) needs to focus more on the variability in effects — something p-values and means don’t address, but Bayesian methods can (at least sometimes).
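
          To make the “variability in effects” point concrete, here is a toy simulation (all numbers invented): the average effect is essentially zero, so a test of the mean finds nothing, even though the effect is large, in opposite directions, for most individuals.

          ```python
          # Toy simulation: a near-zero average effect can hide large person-to-person
          # variation. All numbers are made up for illustration.
          import numpy as np
          from scipy import stats

          rng = np.random.default_rng(2)
          n = 200
          # Half the people respond strongly positively, half strongly negatively (made up).
          true_effects = np.where(rng.random(n) < 0.5, 0.8, -0.8)
          observed = true_effects + rng.normal(scale=1.0, size=n)   # add measurement noise

          t, p = stats.ttest_1samp(observed, popmean=0.0)
          print(f"mean effect = {observed.mean():+.2f}, t = {t:.2f}, p = {p:.2f}")   # looks like 'no effect'
          print(f"spread of effects: sd = {observed.std(ddof=1):.2f}")               # but effects vary a lot
          ```

          A hierarchical model, Bayesian or otherwise, would model that spread directly rather than collapsing it into a single mean and p-value.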

  58. Seems to me the best thing for Gelman to do, if he really wants to assert that he didn’t focus on Cuddy, is to have at least two non-English-speaking data science graduate students analyse his commentary output statistically for references to different researchers, to determine what the relative frequency of references is. Wouldn’t that be what he would recommend to others?
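
    For what it’s worth, the tally proposed here would be straightforward to run. The sketch below is purely hypothetical: it assumes the blog posts have been saved as plain-text files in a local posts/ folder, and the list of names is hand-picked from this thread rather than chosen in any principled way.

    ```python
    # Hypothetical sketch: count how often a hand-picked set of researcher names
    # appears across a folder of saved blog posts (posts/*.txt is an assumption).
    import re
    from collections import Counter
    from pathlib import Path

    names = ["Cuddy", "Fiske", "Carney", "Bem", "Simonsohn"]   # hand-picked, illustrative list
    counts = Counter()

    for post in Path("posts/").glob("*.txt"):                  # hypothetical folder of saved posts
        text = post.read_text(encoding="utf-8")
        for name in names:
            counts[name] += len(re.findall(rf"\b{name}\b", text))

    total = sum(counts.values()) or 1
    for name, c in counts.most_common():
        print(f"{name}: {c} mentions ({100 * c / total:.1f}% of tracked mentions)")
    ```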

  59. In this post, you write:
    “Think of the thousands of careful scientists who, for whatever combination of curiosity or personal interests or heterodoxy, decide to study offbeat topics such as ESP or the effect of posture on life success—but who conduct their studies carefully, gathering high-quality data, and using designs and analyses that minimize the chances of being fooled by noise. These researchers will, by and large, quietly find null results, which for very reasonable dog-bite-man reasons will typically be unpublishable, or only publishable in minor journals and will not be likely to inspire lots of news coverage. So we won’t hear about them.”

    A bigger problem I see, as a junior level researcher in the social sciences, is that, among careful scientists, it’s not JUST the ones studying “offbeat topics such as ESP” who get banished into oblivion. On the contrary, the field seems to be actively weeding out those who are meticulous, while those with more lax methods (and some shoddy practices) rise to tenure and then mentor the next generation. The number of A-level publications now needed to get tenure can be quite high, and those who use shoddy practices find a lot more sensational results than those who are meticulous and scientifically rigorous, regardless of the quality of the topic.

    Some practices, such as not using double-blind treatments for study manipulations (just one example), are fairly accepted norms in the field (at least in my subarea), very rarely critiqued by reviewers or other researchers. Though they are not typically challenged, this doesn’t mean they are not influencing results and obscuring truth. But if a junior faculty member were to insist upon using double-blind approaches (and other improvements over existing norms), they would very likely get fewer false positives –> fewer publications –> tenure denial.

    Some faculty never question such practices — it’s just how things have been done. Others understand the costs, but consciously make the trade-off to optimize for publication count over quality of scientific contribution. Either way, it can be very risky for a junior faculty member to try to increase their methodological rigor when the gatekeepers to their future career don’t care about such issues as much as pub count.

    There are also a lot of practices that even those only lightly trained in statistics will admit are poor science (hand-selecting outliers to remove, generating “hypotheses” post hoc after you’ve analyzed the data and then finding some theory to support them), but they are so commonplace that I don’t think people even consciously note they are doing something bad. They are just following their advisers’ and colleagues’ examples. (A toy simulation of what the first of these practices does to false-positive rates appears after this comment.)

    The expected number of publications, and their expected “impact,” just escalates higher and higher through this vicious circle — which not only weeds out some excellent scientists while providing lifelong employment and influence to less meticulous (and sometimes lower-integrity) ones, but also creates an ocean of untrustworthy findings that dilute real scientific advances.

    As an assistant professor, I admit I am envious of the security of your position that allows you to speak out about practices that you think are holding science back, and provide constructive critiques of specific studies. Though my own voice is somewhat limited for now, I really appreciate the attention you (and other senior researchers) are giving to the social sciences, and the awareness you are raising about ways to improve the scientific reliability of social science research. I do not find your critiques to be mean spirited at all — as you mention in another post, comparing science to software and flaws as bugs, I see it as making science a better place. Please carry on! I am hopeful that maybe, later in my career, it will be more accepted to view publications as tools for collaborative knowledge building, rather than all-in-one definitive solutions upon which individuals’ egos (and livelihoods!) rest.

    Thank you!
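
    To put a number on the hand-selected-outliers point referenced above, here is a toy simulation, not a model of any particular study: data are generated with no true group difference, and “outliers” are dropped one at a time until the test comes out significant or a small limit is reached. The false-positive rate ends up noticeably above the nominal 5%, which is roughly the “researcher degrees of freedom” mechanism Simmons, Nelson, and Simonsohn described.

    ```python
    # Toy illustration: with truly null data, selectively dropping the "worst"
    # point and re-testing until p < .05 inflates the false-positive rate.
    # Purely simulated; not a model of any particular study.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    def p_after_outlier_hunting(n=30, max_drops=3):
        """One simulated null study, trimming up to max_drops 'outliers' if needed."""
        a, b = rng.normal(size=n), rng.normal(size=n)   # no true group difference
        p = stats.ttest_ind(a, b).pvalue
        drops = 0
        while p >= 0.05 and drops < max_drops:
            # Drop the single point farthest from its own group mean, then re-test.
            if np.abs(a - a.mean()).max() > np.abs(b - b.mean()).max():
                a = np.delete(a, np.abs(a - a.mean()).argmax())
            else:
                b = np.delete(b, np.abs(b - b.mean()).argmax())
            drops += 1
            p = stats.ttest_ind(a, b).pvalue
        return p

    n_sims = 2000
    false_positive_rate = np.mean([p_after_outlier_hunting() < 0.05 for _ in range(n_sims)])
    print(f"false-positive rate with ad hoc outlier removal: {false_positive_rate:.3f}")  # comes out well above 0.05
    ```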

    • “On the contrary, the field seems to be actively weeding out those who are meticulous, while those with more lax methods (and some shoddy practices) rise to tenure and then mentor the next generation”

      I would like to add the possibility that the field is actively weeding out those who value rational thought, and argumentation.

      Given the recent discussions, and events, surrounding scientific discussions/bullying/tone, I sincerely feel it might be very useful, and constructive, from a scientific perspective if the right people would write the right paper about these issues.

      • Cuddy left Harvard “quietly”; Carney tenured at Haas; Yap on tenure track at INSEAD.

        The field seems to know the difference between weeds and flowers…

        • But they are outliers who received extremely high publicity & payoff for their work, leading it to be scrutinized by a broad audience, and exposed for its flaws.
          My point (and that of some articles linked in the discussion) is that they didn’t do anything that isn’t (still!) completely commonplace (and selected for!) in the field. The others, who are not getting TED-talk publicity, and thus scrutiny, are still progressing up the academic ranks.

          While yes, there is some increase in awareness of and value placed on better research practices, it has not propagated fully, leaving many young aspiring researchers in a pretty confusing, mixed-signal environment.

        • Cuddy sought publicity through overreach. Carney got publicity (and much praise) for self-examination and doing the right (and difficult) thing. Yap has been totally silent.

          Cuddy’s payoff was all money and fame. Carney is a scientist who probably only suffered by association with this research. All of it was of not much consequence for Yap, I suspect.

        • I view this as pretty simple… The lesson could not be more clear: if you stick to the findings and self-examine (to the point of throwing your own research under the bus when it becomes clear that’s what must be done) and you operate with integrity… you are on the path to becoming a respected scientist. If you grab the microphone, overreach the findings, overstep your second-author status, ignore any evidence you don’t like, and focus on the media and money and fame… well, then you will probably be publicly pilloried by your peers. You might just become increasingly reliant on public relations strategies (and not science) to try to rehabilitate your image at the expense of anyone and everyone who gets in your way. You might find yourself screaming UNFAIR and MISOGYNY and VICTIM along the way, or waking in a cold sweat in the middle of the night.

          Choose your own adventure.

        • I’m not really disagreeing with you, though my point is that there are far, FAR more people who “overreach the findings, overstep [their] second-author status, [and] ignore any evidence [they] don’t like” than there are people who have been corrected for doing so. The Power Pose work is NOT unique in its flaws, but since it went the extra step of gaining “media and money and fame” those flaws came out. The public shaming of this one example (and a handful of others) may have increased awareness and set the wheels in motion for greater rigor field-wide, but, as with set-up drug busts, it certainly does not prove that the problem is under control.

          Sadly, many reactions by researchers I’ve seen tend more towards “Better avoid getting too much media coverage” than “better put extra time and effort into my methodological rigor.”

          A very influential researcher once told me, when I pointed out fatal flaws in several publications in a top journal and asked whether he was worried about their unsubstantiated theories being propagated in practice: “Meh, not many people read these journals.” Never mind that publishing therein can give someone employment and influence over the scientific process for life. And perpetuate the problem.

        • Definitely agreed with you. This is a wonderful cautionary tale and a teaching moment. If this didn’t start things rolling, at least for social psychology, arguably nothing will.

          As to the media: scientists should avoid the media, as a baseline, until confidence in the science rises to the level where you can use language that resonates with a lay audience and, at the same time, does not compromise scientific ethics. To put it in your words, media activity should be conditioned on having put in the extra time, effort, and rigor.

          I’m not a scientist. I read journals. I use the science when I see an opportunity (e.g., the recent M. Mason paper on non-round numbers as superior anchors in negotiation… it clearly creates a significant outcome advantage… admittedly, an n of 1 in my experiment). In a recent trial, a counterparty was aware of the tactic, called it out with reference to the paper, and thereby offset my advantage. No kidding. Two weeks ago, in a price-for-services negotiation. The very influential researcher’s response is self-acquitting nonsense. It just ain’t true.

        • Your last paragraph is depressingly believable. So we need to keep pressing for more intellectual honesty — we can’t expect others to do it.

        • Yes, I agree that this is only a beginning — the struggle toward progress in improving research quality needs to continue quite a bit longer.

        • No, no, no… don’t be depressed. Rejoice because what you do matters to the world, sometimes anyway. To be clear, you’re not going to help everyone or help with every piece of research, exactly… For instance, when I want to be dominant I know why spreading my stuff out at the head of the conference table makes sense. At the same time, you aren’t going to find me in the elevator doing an impression of a starfish before I get to the meeting…. caveat emptor, you know?

      • Haha… on one hand, I think you’re onto something. But on the other hand, there are plenty of senior people in my field who don’t shy away from argumentation at all. The problems are that they argue about the wrong things (Does this build incrementally off existing (likely bunk) theory? Are there broadly generalizable, crystal-clear, immediately actionable insights that will get written up by the media? Does this utilize the current method-du-jour, regardless of whether it’s useful for the question?), and that the arguments flow one way, top-down, from those who have been selected into the system as the spoils of their QRPs and who have the power to make or break aspiring young scientists’ careers.

    • As an assistant professor, I admit I am envious of the security of your position that allows you to speak out about practices that you think are holding science back, and provide constructive critiques of specific studies.

      If I could change one thing about academia, it would be to make it so that professors like you aren’t scared to say things like this on the record.

      I know we are far afield of the originating problem here — but really. What kind of academic freedom is it that we get when we require people to spend the first 10+ years of their career upsetting nobody?

      I think we have two file drawer problems, and one of those file drawers is full of people.

  60. That’s not really necessary considering that AG was banging on about Kanazawa and Bem well before Cuddy rose to prominence and has done the same with Wansink and the authors of that fertility-and-clothing-color-choice paper (whose names I don’t recall). Repeatedly highlighting particular exemplars of bad statistical methodology is (was?) simply a feature of his blogging style. In similar fashion, when the topic is plagiarism there’s a fair chance that AG might mention Edward Wegman, Doris Kearns Goodwin, or Bruno Frey (this latter for self-plagiarism in particular).

  61. Gelman – I think the real issue is that your frequent, disproportionately withering attacks on Cuddy’s research suggest you harbor an irrational professional animus towards Cuddy that’s rooted in gender. On behalf of my female colleagues, I would like you to privately evaluate whether misogyny may have animated your critiques of Cuddy’s work – critiques that do seem bizarrely, exceptionally, and unscientifically hostile on rereading – and then, when you’re ready, publish the results of your introspection.

    By your own account, the flaws of her 2010 paper are entirely unremarkable, even by the anachronistic standards of replication to which you’ve zealously held it. So why did you lump Cuddy in with male academics who’ve falsified and plagiarized their results, including Matthew Whitaker? To a scientist, this comparison (conflation?) seems moralizing, illogical, and unfair. At many points, your prosecution of Cuddy has seemed Javert-like. Throughout it, I cannot help but see a strain of misogynist emotionalism that again and again caused you to transform Cuddy’s minor and contextually understandable academic infraction into a high crime against social science. For months, you’ve portrayed her as a fugitive from academic justice, indicted her fortune as ill-gotten, and encouraged her exile, all the while casting yourself as the self-appointed policeman charged with enforcing science law.

    I’m glad to see that already, your tone may be changing.

    Post NYT piece, you generously concede that Cuddy may indeed be a nice, kind person, after all. But your refusal to acknowledge she may also be a gifted scientist and an able statistician continues to bother me and many of my colleagues, both male and female. In your recent eagerness to denigrate Cuddy as a “bad scientist,” are you not guilty of trafficking in precisely the sort of pop science you condemn, drawing a damning conclusion from a statistically insignificant sample size? Have you personally reviewed all the work she’s ever done?

    Will your colleagues and students later mock your contributions to academic discourse as the intellectually corrupted product of implacably chauvinist times? A lot of people are watching you. I hope you get the time and mental space to look at yourself as closely.

    • Aubrey:

      I have no professional animus toward Cuddy, nor do I have an animus toward Kanazawa or Tol or Bem or any of the other researchers whose published work I’ve criticized.

      Nor do I think I’ve ever “prosecuted” Cuddy. I have pointed out things she’s written that I’ve disagreed with, I’ve discussed the work of Ranehill, Dreber, et al. and of Simmons and Simonsohn, and I’ve used power pose as an example in discussing larger problems with research.

      You write, “For months, you’ve portrayed her as a fugitive from academic justice, indicted her fortune as ill-gotten, and encouraged her exile, all the while casting yourself as the self-appointed policeman charged with enforcing science law.” All I can say is: Huh? I’ve never described Cuddy as a fugitive, never referred to academic justice, never indicted anyone, never referred to her fortune (indeed, I have no idea how much money she has), never spoken of any exile or encouraged such a thing, never cast myself as a policeman, nor have I ever referred to any “science law.” I have never claimed to assess Cuddy’s body of work and have only written about specific things she’s written.

      So your comment baffles me, as you criticize me for a long list of things that I’ve never done and never said. I feel that you, and various others, are fighting with an imaginary version of me that does not exist, a person with attitudes that I don’t have and who’s done and said things that I’ve never done or said.

      I appreciate that you’ve gone to the trouble of posting your comment here—I know it can be awkward to post on a site controlled by someone with whom you disagree—and I’m glad we have the opportunity to express these disagreements here directly. This is one of the reasons I maintain this blog, to allow an open space for discussion, including disagreement, regarding issues of science and science communication.

    • Aubrey, with all due respect, I think you may be missing the foundational point of the critique. The genesis and legitimacy of whatever “withering attacks” have been made seem to me to have everything to do with Cuddy’s overreach of the findings and now, stunningly, with a narrative that seeks to defend her obvious overreach by conflating it with issues (valid ones in many other circumstances) of gender discrimination.

      It also seems fair that, in general terms, at some point criticism can legitimately move from the science to the scientist… and while we can debate when that point occurred, it has certainly been crossed once the scientist seeks to push a one-sided personal narrative with the Life and Style team at the New York Times.

      This is even more obviously true when the scientist is leaving academia and, as she does so, throws her entire field under the bus. Wouldn’t you agree that, once we get into that kind of situation, motives and intentions are important to understand and consider?

      Unless you have the insight to offer an alternative suggestion, it is only common sense that the intent here has been to rehabilitate or reinforce career opportunities that have nothing to do with being a scientist… particularly when the next book is about bullies. Do you not think the publishing world knows damn well how powerful a redemptive personal narrative is for marketing purposes?

      Do you not see the guiding hand of public relations expertise in the year-long crafting of this story, in the studio photographs, the forlorn facial expressions, and the subtleties of the editorial slant? I mean, wow. This was thoughtful, strategic, and well executed. Without question. Don’t mistake that for accuracy or objectivity. It’s also not lost on me that the NYT editorial staff saw the backlash at their rather overt spin and tried to even out the slant after the fact… just look at their suggested comments and at some of Susan’s post-release tweets. The back-and-forth between her and Cuddy on Twitter is clear evidence to me that Cuddy felt she had editorial control over the content. That she did that publicly is evidence of a hubristic entitlement to continue to overreach. Really, just stunning.

      Andrew is perhaps brutal from time to time, but I have yet to see any evidence at all of misogyny, beyond the accusation of it. I’ve seen some awful stuff following the NYT piece, but not before, and none of it from Gelman, Simmons, or Simonsohn. Where is it? Let’s see it.

    • I’m a longtime reader of this blog, and occasionally suspect its tone might not be helping to advance science, and yet I don’t really agree with your critique in the main. My own sense is that the people who get the most unrelenting negative coverage in this space have been mostly men.

      (Does Cuddy claim to be a statistician? That would surprise me.)

    • I have not noticed much of what Aubrey writes about in AG’s blog (though I haven’t read it exhaustively), and I expect this anger is, at least somewhat, misdirected.

      It’s plausible (though not a given) that your reaction is an attempt to project onto a concrete individual something you have observed more diffusely. AG writes in his comment below: “I feel that you, and various others, are fighting with an imaginary version of me that does not exist.”

      It’s plausible that Cuddy broadly receives more derision, and that the derision takes a more personal tone, than others who have done worse. It’s plausible this takes place through snide comments, disdainful mentions, cutting commentaries, and gleeful dismissals that happen all over the place — not just online, but also around the water cooler and at cocktail parties, in conversations with strangers at the airport, etc. It’s plausible this could lead to an overwhelming sense that she is being witch-hunted, even without being able to put one’s finger on exactly why, or to point to specific measurable differences in how such mistakes are discussed in more formal, thoughtful media sources. Collectively, such commentary and tones could be extremely intimidating and punishing (it’s timely that there’s so much press these days about public shaming), and even more tempered and fair criticisms are then seen through the lens (perhaps not incorrectly) of fueling this fire. (I’d love to be able to do, say, a Twitter analysis of the word of mouth (WoM) around several of these cases, but unfortunately Twitter’s API won’t let you search that far back in time.)

      It’s also plausible that, if (if!) Cuddy has received excessively harsh WoM relative to others, it’s driven to some degree by misogyny. There is enough documented evidence about misogyny and (often nonconscious) sexist biases that I would be pretty surprised if this wasn’t playing a role in the WoM chatter at all. But it may not be the primary driver of the differences – it could, for instance, simply be that the Power Poses work was so well known by the masses, and so catchy and memorable, that it is simply the easiest example to recall when thinking about QRPs. Or that, because it had such mass appeal, the backlash involved more non-scientists than is typical. Or that scientists feel extra disdain for someone who appeared to profit substantially off their QRPs. Or that it’s precisely because what she did was so commonplace that people felt compelled to verbally throw her under the bus, as if to deflect from their own similar practices. It could be a lot of things.

      I have a similar impression to yours, to some extent: the feeling that Cuddy has been zealously prosecuted and dismissed relative to researchers who have done much worse things. But I can’t point to anything on this blog that strongly supports that — nor can I pull up a suite of solid examples from other sources (maybe because I just don’t have time, or maybe because such a case would be hard to make). If it’s not all in my head (which it might be), then it’s something more slippery, harder to fight back against. Persecution by 1000 papercuts. Or rather, a million micro-aggressions.

      Misogyny and non-conscious sexism are real, and they have a big impact on people’s lives. It’s important to raise the concern and examine, as best we can, what role they may be playing in the issues we talk about (and the lives of those we talk about!). It’s an insidious problem that is hard to call out and fight against, so open conversations about it are critical.

      But, I think directing blame on AG, in this case, is misplaced.

      • Your explicit mention of non-conscious sexism makes this seem a good point to mention the following:

        Today, I attended an informal lunch of women faculty in the mathematical and physical sciences. One of the women mentioned her experiences in being on hiring and promotion committees. She said that her experience was that often in these situations, a male member of the committee would point to a negative comment about a woman candidate’s teaching, and say that that disqualified the woman candidate from serious consideration. She also said that she responds to this situation by looking for positive comments about the candidate’s teaching ability, and/or comparable negative comments about male candidates. She usually finds them — and when she brings them to the attention of the male faculty who originally brought up the subject, he usually accepts her evidence that his evidence was not adequate.

        I see this as an example of an excellent way to deal with unconscious sexism.

    • Aubrey,
      You write that the flaws in the paper are “entirely unremarkable”. I am not sure whether you have read the paper, but I sincerely hope you don’t see its flaws as unremarkable, or as the sort of thing you would find in many other papers that were then interpreted as presenting such definitive evidence in favor of an intervention.

      1. Much has been written about the stunning lack of statistical power in this paper, and I remain surprised that someone trained by one of the supposedly preeminent social psychologists was unaware of admonitions, now close to 50 years old, about the importance of statistical power. Who decides a priori that a sample size of 39 is sufficient for what was supposedly the first examination of this phenomenon? (A rough power calculation is sketched just after this list.) Would you allow a graduate student to conduct a study with such a sample size?
      2. The study lacked a control group so it is entirely unclear whether any observed differences between the expansive pose and contractive pose is due to the benefits of power posing or the undesirability of adopting a contractive pose (or a combination of the two).
      3. The design contains a classic confound. Participants were given the chance to gamble before their time 2 hormone levels were measured, but there is a substantial literature noting that gambling (or even just being given the opportunity to gamble) raises cortisol and testosterone.
      4. All analyses involving testosterone should have been conducted separately for men and women, as is apparently standard in this literature.
      5. The authors misreported a p-value of .052 as <.05.
      6. The first author has admitted to p-hacking – something that neither of the other two authors has ever denied.
      7. The authors noted in an original draft of their paper that gender moderated the effects of power posing on feelings of power and willingness to gamble (much weaker (near zero) effects for women) but never mentioned this in their published paper, with Amy Cuddy going out of her way to market power posing to women because of its supposed effect on feelings of power.
      8. The authors claim in their abstract that power posing makes people "more powerful" when actual power was never even examined in their paper.
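      To put point 1 in perspective, here is a minimal power sketch, assuming a simple two-group between-subjects comparison with roughly 20 vs. 19 participants and conventional benchmark effect sizes (these numbers are illustrative assumptions, not estimates taken from the paper):

      ```python
      # Minimal sketch: approximate power of a two-sample t-test with ~39 total
      # participants, under assumed (illustrative) standardized effect sizes.
      from statsmodels.stats.power import TTestIndPower

      analysis = TTestIndPower()
      for d in (0.2, 0.5, 0.8):  # small, medium, large effects (Cohen's d)
          power = analysis.power(effect_size=d, nobs1=20, ratio=19 / 20, alpha=0.05)
          print(f"d = {d}: power ~ {power:.2f}")

      # Approximate per-group sample size needed for 80% power at d = 0.5:
      n_per_group = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
      print(f"n per group for 80% power at d = 0.5: ~ {n_per_group:.0f}")
      ```

      Under these assumptions, the power for a medium effect comes out well below one half, which is the kind of back-of-the-envelope check one would hope to see before running a first study of a phenomenon.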

      • Marcus, I challenge you to take a random sample of papers from Psych Science and see if you can’t find comparable observable flaws in about a third of them. (This is a fun exercise I sometimes do with select grad students, though with a different journal.) And as for the unobservable flaws (such as misreporting a p-value, or aspects of p-hacking), I would guess that similar malpractices will have occurred in at least 50%. (Based on having observed many of these papers in creation.)

        I’m not trying to excuse the Power Poses paper, so much as acknowledge that a huge chunk of the field is similarly guilty.

        • Anon,
          I agree that many other papers have similar problems. What rankles me so much is the combination of shoddy research practices with completely unjustified claims to the public about the size of a particular effect. It sometimes seems to me that the way to really “make it” in psychology in terms of landing book deals, speaking engagements and favorable press coverage is to claim that a particular effect is very large – even if that claim is completely unjustified by the extant data. Even if your claim is later debunked you typically appear to retain your spot on the speaking circuit. This sometimes seems to be particularly prevalent among leading researchers in the positive psychology movement (e.g., positivity ratio, emotional intelligence, grit).

      • I know I’m shifting the conversation a bit here, but another thing to think about is the difference in impact the scandal had on Cuddy vs. Carney (and vs. Yap).
        Yes, Carney renounced the research after the fact, and people praise her extensively for that. (We shouldn’t forget that it’s a much easier thing to denounce your past work when you have the security of tenure.)

        But, Carney still carried out the poor practices — in fact, she was the first author on the project. She was the only senior researcher on the project. That gives her a lot of influence over how the project was run. But, she can now throw it out as “that crazy stuff we did back then to save money” and come out unscathed — while the junior second author, Cuddy, lost her academic career.

        Now, a lot of people seem to get upset because Cuddy was “greedy for popular coverage” in doing her TED talk and subsequent media tour, and as such, she deserved extra punishment. Ok, it’s a fair qualm that she played the biggest role in spreading dubious science. But, in judging her as a scientist, it should be her science sins that are primarily focused on. If it’s mostly about what went on in the paper–rather than her ambition–why didn’t Yap also leave academia?

        It’s hard not to be reminded of the well-documented visceral disdain people feel for women (exclusively) who seem “ambitious”.

        • Anon:

          There’s no reason Cuddy, Carney, or Yap should leave academia, if they don’t want to! We all make mistakes. I’ve made lots of mistakes too. Writing a paper which in retrospect turned out to have serious flaws . . . we’ve all done that too. At least I have. I worked at a university department where some of my colleagues did not want to give me a promotion, so I went and got a job elsewhere. I wish success to all of Cuddy, Carney, and Yap, and I hope each of them ends up doing what interests them the most.

        • I believe you, and appreciate that, and I wasn’t referring to you when I wrote that. Your blog has opened up a lot of conversation on the topic, far beyond the specific things you wrote.

          I was referring to the field as a whole. I doubt Cuddy had a lot of options open to her in the field after this. Getting tenure requires glowing letters from 10-12 influential researchers in the field, plus approval from broader academic committees, and I suspect she believes (accurately) that she would not have gotten these things, given how closely her name became associated with “bad scientist”.

          That’s speculation for sure, as we don’t have the counterfactual. Maybe Harvard would have happily tenured her. Maybe another R1 school would have. But speculatively, I strongly doubt it. The poor research practices, led by Carney but involving Cuddy and Yap, left Carney unscathed while ending Cuddy’s academic career. Yap seems to be fairly unscathed, but time will tell there.

          I don’t have a specific point here. I don’t think we should filter critiques of junior researchers. But, relating to the broader question of whether Cuddy was disproportionately punished (not on your blog, but in general)… well, nothing bad appears to have happened to her co-authors.

          Relating to the question of underlying misogyny, it’s possible that sexism (against women who seem too ambitious) contributed to making Cuddy’s penalties more severe than Yap’s. (While tenure and seniority protected Carney.) Of course, we in no way have evidence that this is the case. But given the larger conversations in the world right now about sexism in science, backlash against and harassment of ambitious women, and concerns about the tenure system, it’s worth speculating about as part of that conversation.

          But none of this is related to what you specifically wrote. I am appreciative of your blog as a place for people to have these important conversations, even if the comments drift substantially from the contents of your posts.

        • AG: perhaps you have forgotten the extent to which a junior faculty member’s ability to continue in academia requires broad approval from senior researchers. Getting denied a promotion at one school is one thing; broadly (and unfairly) being seen as the poster child of the replicability crisis and being the subject of mass ridicule is another.

          Critiques of work are a good and necessary thing! But in this case, the fair & useful critiques led (unintentionally!) to a mass smear campaign against one particular researcher, which, given where it fell in her career, likely closed the door to a future in academia, not just at HBS but at any R1 school. (Even if some school *might* have given her tenure, I imagine that the field itself was a hostile enough environment that it would not be a rewarding career.)

          There are leaders (not just in academia, but also in politics and industry) whose words are tempered, reasonable, and not intended to bully. But when they point a lens at someone, their followers, who may have sexist or racist or other agendas of their own, treat it as bait and launch vicious attacks en masse, disproportionately on certain people. The leader is not directly responsible for the ensuing actions of their devotees. But, if one wants to make science more inclusive of currently marginalized groups, it could be worth considering how biases held by others may unfortunately transform fair & reasonable critiques into disproportionate & devastating penalties for some. Which will have the more macro impact of continued exclusion of certain populations from positions of influence.

          Cuddy’s exclusion from tenure is not necessarily a horrible thing, even if not “fair” relative to her co-authors. If your fair critiques (and the unintended ensuing mass bullying by others) had come out a couple years later, perhaps she would already have been tenured — and perhaps taking the slot of someone more deserving. But the vastly divergent outcomes to her personal career hinged on the difference of a year or two in timing, which is something I find kind of silly and unfortunate about academia.

        • Anon said:
          “Carney … was the only senior researcher on the project.”

          My understanding was that Cuddy was the senior author. I couldn’t find any information one way or the other in a quick web search just now. Can anyone clarify?

        • anonymous– you are correct, I was mistaken. Carney was not senior at the time of the study. However, she was first author, and she did wait until she had the security of tenure to denounce the work.

          Of course, the exposé of the concerns about that paper did not gain mass media attention until after she had tenure (I think; I believe she earned tenure in 2014?). So, I’m not saying she intentionally waited until she had security — but I imagine that that security made it a lot easier to come forward. If the exposé had been delayed by another year or two, perhaps Cuddy would have already gotten tenure at HBS — and she would be a lifelong Harvard professor, rather than an exiled academic. Oh, the difference a year or two can make.

          These are all just thoughts to consider… no one can make any definitive claims about what would have happened under other circumstances.

    • As a feminist, I really wish people would stop playing the misogyny card here. There is a lot of misogyny in the world, but the criticism of Dr. Cuddy on this blog or at Data Colada is not an example of it. Dr. Gelman has never used sexist or gender-stereotypical terms in his criticism, and has not singled out women as targets.

    • Aubrey,

      Your comments look to me like they contain a lot of exaggerations* and a lot of jumping to conclusions** without providing evidence. That doesn’t sound very scientific to me.

      *e.g.,
      “frequent, disproportionately withering attacks ”

      ** e.g.,
      “suggest you harbor an irrational professional animus towards Cuddy that’s rooted in gender.”

    • I can’t quite tell from the content of your post whether you are a scientist or not.

      If you are one, I highly recommend you put the type of thoughts and feelings you expressed so eloquently here into an “official scientific paper”.

      I think it would make a fitting and useful contribution to your field.

      You could even think about making it open access, so the general public can also enjoy your work!

  62. Aubrey, I am a man, and since mid-April 2015 I have been trying to get a fraudulent study on the breeding biology of the Basra Reed Warbler retracted; it was published in a Taylor & Francis journal (see https://www.academia.edu/33827046/ for background).

    I have received ‘disproportionately withering attacks’ and ‘critiques that do seem bizarrely, exceptionally, and unscientifically hostile’. I have also received several threats. Some women have started a smear campaign against me. I always respond with two simple questions:
    (1) show me the raw research data;
    (2) provide me with comments from experts / reviewers / peers (within this field of research) who rebut or refute any of the findings of the report “Final investigation on serious allegations of fabricated and/or falsified data in Al-Sheikhly et al. (2013, 2015)” (see https://www.academia.edu/33827046/ ).

    The answers are always identical: no response, no response, no response (etc.). I have therefore recently changed my way of communicating with people and entities who do not (want to) communicate with me about my efforts to retract this fraudulent study.

    I am asking these people and entities whether they have objections to starting to communicate with me ‘within the framework of tacit approval within a fixed period of time, a common practice in parts of the field of publication ethics’. This has already turned out to be a successful strategy several times. As a result, even a Nobel laureate, Randy Schekman, is now urging the publisher Taylor & Francis to retract the fraudulent study on the breeding biology of the Basra Reed Warbler. See https://www.researchgate.net/project/Retracting-fraudulent-articles-on-the-breeding-biology-of-the-Basra-Reed-Warbler-Acrocephalus-griseldis for background.
