Primed to lose

David Hogg points me to a recent paper, “A Social Priming Data Set With Troubling Oddities” by Hal Pashler, Doug Rohrer, Ian Abramson, Tanya Wolfson, and Christine Harris, which begins:

Chatterjee, Rose, and Sinha (2013) presented results from three experiments investigating social priming—specifically, priming effects induced by incidental exposure to concepts relating to cash or credit cards. They reported that exposing people to cash concepts made them less generous with their time and money, whereas exposing them to credit card concepts made them more generous.

The effects reported in the Chatterjee et al. paper were large—suspiciously large.

Last year, I wrote about a study whose reported effects were stunningly large. It was only after I learned the data had been faked—it was the notorious LaCour and Green voter canvassing paper—that I ruefully wrote that sometimes a claim that is too good to be true, isn’t.

Pashler et al. skipped my first step and went straight to the data. After some statistical detective work, they conclude:

We are not in a position to determine exactly what series of actions and events could have resulted in this pattern of seemingly corrupted data. In our view, given the results just described, possibilities that would need to be considered would include (a) human error, (b) computer error, and (c) deliberate data fabrication.

And:

In our opinion based solely on the analyses just described, the findings do seem potentially consistent with the disturbing third possibility: that the data records that contributed most to the priming effect were injected into the data set by means of copy-and-paste steps followed by some alteration of the pasted strings in order to mask the abnormal provenance of these data records that were driving the key effect.

Oof!

No coincidence that we see fraud (or extreme sloppiness) in priming studies

How did we get to this point?

Do you think Chatterjee et al. wanted to fabricate data (if that’s what they did) or do incredibly sloppy data processing (if that’s what happened)? Do you think that, when Chatterjee, Rose, and Sinha were in grad school studying psychology or organizational behavior or whatever, they thought, When I grow up I want to be running my data through the washing machine?

No, of course not.

They were driven to cheat, or to show disrespect for their data, because there was nothing there for them to find (or, to be precise, any effects that were there were too small and too variable for them to have any chance of detecting; click on the kangaroo image above for a fuller explanation of this point).

Nobody wants to starve. If there’s no fruit on the trees, people will forage through the weeds looking for vegetables. If there’s nothing there, they’ll start to eat dirt. The low quality of research in these subfields of social psychology is a direct consequence of there being nothing there to study. Or, to be precise, it’s a direct consequence of effects being small and highly variable across people and situations.

I’m sure these researchers would’ve loved to secure business-school teaching positions by studying large and real effects. But, to continue my analogy, they got stuck in a barren patch of the forest, eating dirt and tree bark in a desperate attempt to stay viable. It’s not a pretty sight. But I can see how it can happen. I blame them, sure (just as I blame myself for the sloppiness that led to my two erroneous published papers). But I also blame the system, the advisors and peers and journal editors and TED talk impresarios who misled them into thinking that they were working in a productive area of science, when they weren’t. They were blindfolded and taken into some area of the outback that had nothing to eat.

Outback, huh? I just realized what I wrote. It was unintentional, and I think I was primed by the kangaroo picture.

In all seriousness, I have no doubt that priming occurs—I see it all the time in my own life. My skepticism is with the claim of huge indirect priming effects. As Wagenmakers et al. put it, quoting Hal Pashler, “disbelief does in fact remain an option.” Especially because, as discussed in the present post, if these effects were really present, they’d be interfering with each other all over the place, and these sorts of crude experiments wouldn’t work anyway.

It’s all about the incentives

So . . . you take a research area with small and highly variable effects, but where this is not well understood, so you can get publications in top journals with statistically significant results . . . this creates very little incentive to do careful research. I mean, what’s the point? If there’s essentially nothing going on and you’re gonna have to p-hack your data anyway, why not just jump straight to the finish line? Chatterjee et al. could’ve spent 3 years collecting data on 1000 people, and they still probably would’ve had to twist the data to get what they needed for publication.
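To put a rough number on that last point, here’s a quick back-of-the-envelope power calculation (my own illustrative sketch, with made-up effect sizes; nothing here comes from Chatterjee et al.), using the standard normal approximation for a two-group comparison:

    # Approximate power to detect a standardized effect d with two groups of
    # n subjects each, two-sided alpha = 0.05 (normal approximation).
    from scipy.stats import norm

    def power_two_sample(d, n_per_group, alpha=0.05):
        se = (2.0 / n_per_group) ** 0.5        # rough standard error of the estimated d
        z_crit = norm.ppf(1 - alpha / 2)
        return 1 - norm.cdf(z_crit - d / se)   # ignores the negligible other tail

    for d in (0.05, 0.1, 0.2):
        print(d, round(power_two_sample(d, n_per_group=500), 2))

With 1000 people split into two groups of 500, this gives power of roughly 0.12 for d = 0.05, 0.35 for d = 0.1, and 0.88 for d = 0.2. So unless the true effect is already moderately large and stable, even a 3-year, 1000-person study is mostly a shot in the dark.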

And that’s the other side of the coin. Very little incentive to do careful research, but a very big incentive to cheat or to be so sloppy with your data that maybe you can happen upon a statistically significant finding.

Bad bad incentives + Researchers in a tough position with their careers = Bad situation.

27 thoughts on “Primed to lose”

  1. >Especially because, as discussed in the present post, if these effects were really present,
    >they’d be interfering with each other all over the place, and these sorts of crude experiments wouldn’t work anyway.
    I’m reminded of the amusing debunking of the “water has memory” explanation of homeopathy, whereby water apparently has no trouble remembering the nice gentle remedy it was in contact with, but has no trace of all the intestines, toilets, and sewage systems it’s been through.

      • I am not a fan of homeopathy, but the methods of the critics highlight the uselessness of NHST perfectly.

        Can you link to any homeopathy skeptics who have actually measured how much starting material is remaining in the solutions claimed to be effective? From what I have seen, it looks like they plug numbers into equations developed for idealized circumstances and do RCTs testing the null hypothesis which, no surprise, has led to conflicting evidence and wild speculation about water memory, etc.

        “Homeopathy is controversial because medicines in high potencies such as 30c and 200c involve huge dilution factors (10⁶⁰ and 10⁴⁰⁰ respectively) which are many orders of magnitude greater than Avogadro’s number, so that theoretically there should be no measurable remnants of the starting materials. No hypothesis which predicts the retention of properties of starting materials has been proposed nor has any physical entity been shown to exist in these high potency medicines. Using market samples of metal-derived medicines from reputable manufacturers, we have demonstrated for the first time by Transmission Electron Microscopy (TEM), electron diffraction and chemical analysis by Inductively Coupled Plasma-Atomic Emission Spectroscopy (ICP-AES), the presence of physical entities in these extreme dilutions, in the form of nanoparticles of the starting metals and their aggregates.”
        http://www.ncbi.nlm.nih.gov/pubmed/20970092

        Here is a whole report that never once attempts to measure what is in this stuff:
        http://www.parliament.uk/business/committees/committees-archive/science-technology/s-t-homeopathy-inquiry/

        • At last, some proper science! (No sarcasm intended, a necessary clarification in mined fields.) All that scientistic bragging was sickening; this is sobering. Thanks.

        • I am confused. What’s the point you are making? Are you saying dilution does not….well, dilute?

          I.e., you think that, had they actually tried measuring, they might have found grossly more starting material left in the final solutions than would be predicted by the 10^60 dilution factor?

          Sure, there can be non-uniformity during dilution, but with a 10^60 factor any variation is moot (a quick back-of-the-envelope calculation follows).
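          To make the arithmetic concrete, here’s a rough calculation (my own illustration; Avogadro’s number is standard chemistry, and nothing here comes from the quoted paper):

              avogadro = 6.022e23                # molecules per mole
              start = 0.1 * avogadro             # suppose 0.1 mole of starting material
              dilution_30c = 10.0 ** 60          # a 30c remedy: thirty serial 1:100 dilutions
              print(start / dilution_30c)        # ~6e-38 molecules expected to remain

          So even if the dilution steps were wildly non-uniform, you wouldn’t expect a single molecule of the starting material to survive a 30c preparation, which is what makes the TEM/ICP-AES findings quoted above surprising.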

  2. “The lights are growing dim Otto. I know a life of crime has led me to this sorry fate, and yet, I blame society. Society made me what I am.”
    “That’s bullsh–. You’re a white suburban punk just like me.”
    “Yeah, but it still hurts.”

  3. The same first author also has a paper with incredibly strong effects for pretty subtle priming-type manipulations in our favorite journal – Psychological Science.
    http://pss.sagepub.com/content/early/2012/03/05/0956797611432497.full.pdf+html

    For example, in study 2 participants had to recall an ethical or unethical prior act and were then asked to rate their preference for light-related objects. Those in the “ethical” condition indicated much stronger preferences for these objects than those in the “unethical” condition. Effect sizes (d) were well over 1 for some of the objects.
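    For reference, Cohen’s d is just the standardized mean difference (this is the textbook definition, not anything specific to that paper):

        d = \frac{\bar{x}_1 - \bar{x}_2}{s_{\text{pooled}}}

    so d > 1 means the two condition means sit more than a full within-group standard deviation apart, which would be an enormous effect for an incidental manipulation like recalling a past act.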

  4. Things are well past the stage of researchers furiously competing with each other in fields with nothing much to achieve. That’s only the first stage. If that situation continues long enough, something far more problematic happens. The field creates so many fake hoops to jump through (in order to distinguish ‘good’ from ‘mediocre’ academics) that being in that field means working 70 hours a week jumping through unproductive hoops.

    Eventually, things reach a point where the academic’s life is so eaten up with stupidity that they can no longer do anything productive even if they have something productive to do. Economics seems to have reached this point some time back. To be an economist means spending so much of your life doing stupid stuff that you couldn’t make a real breakthrough even if the ingredients for one were handed to you on a plate. A noticeable and substantial increase in predictive power and usefulness is far more likely to come from someone not in the top tier of economists. Top-tier economists spend all their time padding their CVs. That’s how they became “top tier”.

    It used to be that the selling point of being an academic was having more free time to think. That changed a long time ago. I know a ton of people working real jobs who both get paid more and have more free time to think, while most academics can’t even remember the last time they did any quality thinking.

    • What really kills me, though, is that it’s not uncommon for working class inventors to spend $50,000 on developing, prototyping, patenting, and doing initial production runs of some invention they created, all while working their blue collar job. It happens far more often than you would ever imagine.

      That’s what people do when they believe in something.

      Anyone who wanted to do similar studies of priming, or hurricane names, or whatever the latest silliness coming out of psychology is, could do every one of these studies for less than $1,000. But they never self fund their experiments. If academics really believed any of the crap they spout about “searching for knowledge” you’d think they’d be willing to put their own money where their cynical mouths are. You know, kind of like they often did a few hundred years ago when scientists still made real discoveries.

      • An inventor may be incentivized to self fund his/her work in the hopes of making financial gains from the successful invention. But what incentive would an academic have to self fund research with no monetary gain?

        I agree that there are incredible pressures to publish, and that quality may be sacrificed for quantity. But making researchers pay out of pocket would not solve the problems you describe (which I fully agree are big concerns).

        • But what incentive would an academic have to self fund research with no monetary gain?

          Seriously? Academics have every prospect of monetary gain from good research. You got it backwards. The inventors are virtually guaranteed to lose everything. They usually have a better chance of being struck by lightning. The financial rewards to academics are far more certain.

          I didn’t say “make researchers pay”, but since you brought it up, I think if instead of giving researchers six figures to study this crap, they had to go get a real job and then shell out their own money for this “research”, it would instantly eliminate 99.9% of these dumb studies.

          People forget this, but up until about World War II or so it was very common for even the big famous names we remember from science to work unpaid professorships, often doing so well into middle age before getting a paid gig. As late as the late 1800s this sort of thing was the norm. Even when they did start to earn a real salary, their pay was often the equivalent of a poor graduate student stipend today.

          So to answer the question “what incentive would an academic have to self fund research with no monetary gain?” I would say “the same friggen’ incentive that all the great scientists used to have back when they did real friggen’ science.”

        • If it’s not readily and easily publishable, most or many academics will face monetary penalties (not getting tenure, missing raises, missing awards and being labelled dead wood, getting stuck with admin work, not getting funded grad students, actually losing their non-tenured appointment, etc., etc.)

          I don’t have a representative survey, but I am aware of a number of institutions and individuals where this is top of mind.

        • Take Fermat, who is primarily remembered today for number theory and other mathematical results, but who also did science proper. In particular, he discovered the variational principle that light travels in such a way as to minimize the travel time. It’s a fun calculus exercise for freshmen to use this to derive the equation describing how light bends as it moves from one medium to another (air to water, for example); a sketch of that derivation appears below. This result had a huge impact on the development of physics. Arguably, it’s still being felt today.

          Yet Fermat never got paid for any of his math or physics research. He had a degree in law and worked as a judge.

          So how is it that a judge in the 1600s could do badass research in his spare time with no degree and no peer review, but six-figure, highly trained researchers today can’t produce a decent paper with millions in grant money?
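          For anyone curious, here’s the exercise sketched out (my summary of the standard textbook derivation, not part of the original comment). Light travels from a point at height a above the interface to a point at depth b below it, crossing the interface at horizontal position x, where L is the horizontal distance between the two endpoints and v_1, v_2 are the speeds of light in the two media. The travel time is

              T(x) = \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (L - x)^2}}{v_2}

          and setting dT/dx = 0 gives

              \frac{x}{v_1 \sqrt{a^2 + x^2}} = \frac{L - x}{v_2 \sqrt{b^2 + (L - x)^2}},

          i.e., \sin\theta_1 / v_1 = \sin\theta_2 / v_2, which is Snell’s law n_1 \sin\theta_1 = n_2 \sin\theta_2.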

        • And please don’t say “because he worked on easier problems”. Studying light in the 1600’s was pretty much infinitely harder than studying “social priming” in 2016.

        • Agree. The problems from the 1600s only look easier because they are solved now.

          And, as a taxpayer, there’s this general thought — why should we be paying for this?

          It’s one thing to say “let’s throw many millions into sending an instrument into outer space”, because the odds are pretty good on finding interesting stuff (however useless commercially, such knowledge nourishes the soul). But, given the results over the past decades, should certain types of research not be taxpayer funded?

    • Agreed. Something else that happens (and something that Andrew has touched on before) is that you end up creating entire research domains composed almost entirely of researchers following the kind of practices described by Pashler et al. So when a paper like this gets submitted to a journal, both the action editor and reviewers are likely to be academics “trained” in this way of doing research. Priming research is one good example but there are plenty of other areas within psychology like it – and I imagine that other fields in the social sciences have similar problems.

  5. Maybe a bit less certain of things than Laplace (avoiding making a Peirce/Laplace joke here) but I was wondering about what’s actually new and what’s actually old here.

    My sense back in the 1980s ( http://statmodeling.stat.columbia.edu/2012/02/12/meta-analysis-game-theory-and-incentives-to-do-replicable-research/ ) was that there was more ignorance than pressure to publish something/anything. Today I do believe there is more pressure to eat (publish) and mostly only dirt for most who want to eat. But it would be nice to have some empirical/historical study of this.

    Also it will be interesting to see how ‘technical activism’ ( http://errorstatistics.com/2016/02/03/philosophy-laden-meta-statistics-is-the-new-technical-activism-free-of-statistical-philosophy/ ) evolves and deals with the loss of low-hanging fruit (as the novelty of pointing out the problems wears off) and how they meet their continued need to publish…

    • Maybe a bit less certain of things than Laplace

      After Peter Higgs won the Nobel Prize for the prediction of the Higgs Boson, he had this to say:

      https://www.theguardian.com/science/2013/dec/06/peter-higgs-boson-academic-system

      “I wouldn’t be productive enough for today’s academic system”

      “He doubts a similar breakthrough could be achieved in today’s academic culture, because of the expectations on academics to collaborate and keep churning out papers. He said: ‘It’s difficult to imagine how I would ever have enough peace and quiet in the present sort of climate to do what I did in 1964.’”

      You don’t simply get a competition for ‘achievements’ that becomes ever more frantic when there’s nothing to achieve. Over time, this competition creates a culture where it’s impossible to achieve anything even when real discoveries become possible again. Just ask any academic walking around saying:

      “no worries, when I’m 45 I’ll be able to work on the good stuff I got into science for”

      Things have probably been drifting in this direction since the 1950s, but the turning point seems to have been around 1990 or so.

  6. I’d put it a bit differently. On one level, these are fairly well-established effects. I’ve seen textbooks from the eighties citing studies on the effects of credit card insignia on restaurant tipping, and (unlike subliminal-effects research) I believe the studies have held up. It’s not a big effect (around a 4% bump) but not trivial if you’re making a living off tips.

    I’ve noticed this pattern in other studies as well. Revisit some old research and make it sexy either by somehow pumping up the magnitude or by making ludicrously sweeping causal claims.

    • Jacob:

      I followed the link and was entertained:

      Although 8 coding errors were discovered in Study 3 data and this particular study has been retracted from that article, as I show in this article, the arguments being put forth by the critics are untenable. . . . Regarding the apparent errors in Study 3, I find that removing the target word stems SUPP and CE do not influence findings in any way.

      Hahaha, pretty funny. Results are so robust to 8 coding errors! Also amusing that they retracted Study 3 but they still can’t let it go.

        • Jacob:

          I don’t know what happened. But let me distinguish what is surprising behavior from what I would like to see people do.

          Suppose Chatterjee et al. really did cheat. I’m not saying they did, I have no idea, I’m just saying suppose they did. If so, and Pashler et al. caught them out, then, sure, it’s not surprising that they’d deny it. That’s what people do. It’s the so-called Chris Rock strategy. It’s what Ed Wegman and that Arizona guy did after they were caught copying material from others without attribution. Their denial of plagiarism did not surprise me, but I still think it would have been more appropriate for them to have admitted it and apologized.

          Suppose Chatterjee et al. did not cheat but they did really sloppy analyses. I’m not saying they did. If so, and Pashler et al. found these problems, then, sure, it’s not surprising that they’d try to salvage their work. Either out of lack of principle or out of simple lack of understanding. Recall the beauty-and-sex-ratio researcher, the ovulation-and-clothing researchers, and the himmicanes guys. They all refused to admit their conclusions might be wrong. Nobody was accusing them of cheating, but people (including me) were saying that they’d done bad work. And people don’t like it when other people say they’re incompetent. In all these cases, their refusal to admit error did not surprise me, but I still think it would have been more appropriate for them to have talked it over with some experts, thought it through, and accepted that they’d made mistakes.

          Finally, suppose Chatterjee et al. actually did everything fine (except for those 8 coding errors, I guess). If so, and Pashler et al. claimed there were errors, then, sure, it’s not surprising that they’d defend what they did.

          All three of these situations correspond to “people get defensive when their work is criticized” (I prefer that to the term “attacked”). And defensive behavior is unsurprising in all three situations. But I think it’s only appropriate in situation 3, not in situations 1 or 2.

      • Hello, Jacob and Andrew. I am one of the co-authors of the paper in question in this blog post (Randall Rose). I feel compelled to point out a couple of fundamentally misleading aspects of these posts. First, and perhaps most important, I stopped trying to defend the data in this paper, particularly Study 3, a long time ago. I am not certain what happened to generate the odd results (other than clear sloppiness in study execution, data coding, and reporting), but I am certain that the data in Study 3 should not be relied on, and I made that clear in my commentary. Second, and I believe this is important too, I did not write the commentary to which you refer in the link above, nor did I endorse it with any form of authorship. My commentary was written separately, and I hope you will take the time to read it, as well as the commentary written by Sinha, who explains her perspective on the coding errors that she reports, which I only learned about two years into the inquiry into the integrity of the data in this paper.

        Even if there were no suggestion of impropriety (and I agree that data manipulation or fabrication remains a reasonable explanation for some of the patterns observed, albeit not the only explanation), the number of errors in coding and reporting that have been detected is alone sufficient to undermine confidence in the data. This is a painful and embarrassing conclusion to draw, to say the least. My commentary addresses issues related to multi-author research teams, in particular some of the suggestions that have been made to increase attention to data integrity. Please read it if you have the time.

        Finally, I appreciate Andrew’s recognition that it is difficult to avoid being defensive in these situations, regardless of the role one played in the research reported. Thank you all for your interest in this important issue.

        • Randall:

          Thanks for the comment. I followed the link and read all three responses: Chatterjee’s, yours, and Sinha’s. I found Chatterjee’s and Sinha’s responses to be ridiculous. Figure 1 of Sinha’s reply is particularly laughable. Your response seemed to me to be measured and careful. The business of the multi-author research team reminds me of the LaCour and Green story. I’ve published lots of papers with coauthors where I haven’t analyzed the data myself, and it must be horrible to be involved in such a situation, where at worst your collaborators fabricated data and at best they did very sloppy data analysis and have followed up with defensive replies that reveal a lack of understanding of effect sizes, replications, and so forth.

        • Andrew, thanks for reading all the commentaries, including the rebuttal from Pashler et al. I have been advised to further distance myself from this paper, but I am uncertain how best to do that beyond withdrawing my name from the paper, which I have already requested Marketing Letters to do. I may post my own “take” on the matter in a way that goes beyond what I said in my commentary at some point in the near future. But, I think it is very important that we be cautious in assigning blame for aspects of the data that are still of ambiguous origin.

          It is very painful to see that knowledgeable commenters perceive the research team to lack understanding of effect sizes and replications. I believe Professor Chatterjee understands these things very well. I know, because I taught him in his very first seminar. That being said, effect sizes are ambiguous signals of data quality. As pointed out in Chatterjee’s commentary, large effects are not that uncommon in the money-priming literature. Surely we can agree that, if any conceptual prime is a powerful one, money would be near the top of the list. So, while the body of work on this topic is still not large (but growing rapidly), despite it having originated in a different theoretical frame in the ’80s, it remains to be determined what the range of plausible effect sizes really could be for supraliminal money primes. Certainly, with the exception of Study 3 in Chatterjee et al. (which ironically has the smallest effects in the paper), the effects in this paper are remarkably large. As for the value of replication, this is not an uncontentious issue, despite the prevailing view that replication can determine the veracity of published effects.
