Why is the replication crisis centered on social psychology?

We had a post on this a couple years ago, but the topic came up again, and here are my latest thoughts.

Psychology has several features that contribute to the replication crisis:

– Psychology is a relatively open and uncompetitive field (compared for example to biology). Many researchers will share their data.

– Psychology is low budget (compared to biomedicine). So, again, not so much incentive to hoard data or lab procedures. There’s no “Robert Gallo” in psychology who would take someone’s virus sample in order to get a Nobel Prize.

– The financial rewards are lower within psychology, hence the incentive is not to set up your own company using secret technology but rather to get your idea known far and wide so you can get speaking tours, book contracts, etc. Sure, most research psychologists don’t attempt this, but to the extent there are financial rewards, that’s where they are.

– In psychology, data are generally not proprietary (as in business) or protected (as in medicine). So there’s a norm of sharing. In bio, if you want someone’s data, you have to beg. In psychology, they have to give you a reason not to share.

– In psychology, experiments are easy to replicate (unlike econ or poli sci, where you can’t just run a bunch more recessions or elections) and cheap to replicate (unlike medicine which involves doctors and patients). So replication is a live option, indeed it gets people suggesting that preregistered replication be a requirement in some cases.

– Finally, hypotheses in psychology, especially social psychology, are often vague, and data are noisy. Indeed, there often seems to be a tradition of casual measurement, the idea perhaps being that it doesn’t matter exactly what you measure because if you get statistical significance, you’ve discovered something. This is different from econ, where there seems to be more of a tradition of large datasets, careful measurements, and theory-based hypotheses. Anyway, psychology studies often (not always, but often) feature weak theory + weak measurement, which is a recipe for unreplicable findings.

To put it another way, p-hacking is not the cause of the problem; p-hacking is a symptom. Researchers don’t want to p-hack; they’d prefer to confirm their original hypotheses. They p-hack only because they have to.
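
The “weak theory + weak measurement” recipe can be sketched with a quick simulation (all numbers here are illustrative assumptions, not drawn from any particular study): when the true effect is small and measurements are noisy, the estimates that survive the significance filter are rare and wildly exaggerated, so straight replications will usually fail.

```python
# Illustrative simulation: small true effect + noisy measurement + the
# significance filter => exaggerated, hard-to-replicate findings.
# All numbers are assumptions chosen for illustration.
import numpy as np

rng = np.random.default_rng(0)
true_effect = 0.1   # small true effect
sigma = 1.0         # noisy measurement
n = 25              # per-group sample size
n_sims = 100_000

# Simulate many two-group experiments and keep the "significant" ones.
se = sigma * np.sqrt(2 / n)                      # std. error of the difference
estimates = rng.normal(true_effect, se, n_sims)  # estimated effects
z = estimates / se
significant = np.abs(z) > 1.96                   # roughly p < 0.05

power = significant.mean()
exaggeration = np.abs(estimates[significant]).mean() / true_effect

print(f"power: {power:.2f}")                  # only a small fraction "work"
print(f"avg |estimate| / true effect: {exaggeration:.1f}x")
```

The significant results here are not just lucky draws; conditional on clearing p < 0.05, the estimated effect is several times the true one, which is exactly why a faithful replication, with an honest effect-size estimate, looks like a failure.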

100 Comments

  1. Dale Lehman says:

I think this list suffers from the same thing you say at the end – confusion of symptoms with problems. For example, it is only relatively “easy” to replicate in psychology because small noisy experiments are tolerated and accepted. Economists do not generally accept these small noisy experiments (though they may, arguably, suffer from other, equally serious deficiencies). So the question is: why is psychology willing to accept such studies? Indeed, it seems that students are taught, even at the undergraduate level, to conduct research that is too noisy to stand up to serious scrutiny. Similarly, measurement seems to be sloppier in psychology than in some other fields (though management, just to cite one example, seems even worse). Again, the question is why. So, I think it would be more valuable to parse this list into what the underlying causes are, rather than confusing symptoms with causes.

  2. M says:

    This seems mostly right. As to why the replication crisis is centered on social psychology of all psychological fields, I’d guess it’s because it’s a relatively large field and one that is based on small, cheap experiments and NHST rather than say large longitudinal datasets and effect size estimation.

    It’s not my impression that economics usually involves careful measurement.

3. I can tell you that there’s a white-knuckle terror in academic publishing of the replication crisis visiting biology. “Biologists have to get serious about statistics,” sayeth an EIC, “but statisticians can’t expect them to use large sample sizes.”

  4. Test says:

    Psychology (and other related behavioral science fields) is more competitive than you think. Remember, economists and biologists have a large number of high-paying options outside of academia (i.e. government or industry for economists and industry for biologists). This is not as much the case for people specializing in social/cognitive/personality psychology and behavioral sciences, and so “publish or perish” is a real concern.

  5. Pietro Ghezzi says:

I suspect the main reason why psychology studies are often weak is not one listed above and is part of the constraints of research. We live in a “society of the spectacle” (Debord), where even proper universities regard it highly if you are mentioned by newspapers or TV, as a form of second-class impact (where real impact means changing something in people’s lives). Studies reporting that blondes like espresso coffee or that eating chocolate makes you more empathic are the kind of stuff dreams are made of for understaffed magazines and newspapers. This makes scientists motivated to reward sexiness over replicability. On the other hand, if I publish a paper stating that peroxiredoxin-1 forms a dimer linked by one disulfide instead of two, I am only pressured into publishing in a decent journal rather than having the additional pressure of doing something appealing to the news.

  6. Michael R says:

Most measures in psychology are intangible constructs, e.g. psychopathy, social anxiety. Their very creation is based on multiple theories, hence they are themselves multiple constructs / beliefs. Even when it’s the same theory and the same questionnaire battery, we are still preparing evidence for a theory, not a living breathing entity. Replication is difficult for things that you cannot see beyond the questionnaire, and social psychology is lousy with questionnaires.

  7. Social psychology may also be special because it has wound up in a special historical niche. In psychology in general, the rejection of psychoanalytic theory that began perhaps in the 1930s has become complete, but bits of it have survived in the form of an interest in unconscious motivation, popping up now in measures of things like “implicit bias”, which is a natural topic for social psychology as opposed to other branches. Unconscious priming effects are not necessarily “social”, but they, too, wind up in social journals. Another historical influence, perhaps also a residue of psychoanalytic theory, seems to be the desire to show that people are not in control of their behavior, since their behavior is influenced even by extremely subtle environmental cues, of which they are unaware. The combination leads to experiments on subtle manipulations, which often produce tiny effects or non-effects, even with careful measurement and clearly stated hypotheses.

    Other parts of psychology are less subject to these problems, perhaps because they have more totally shed the Freudian legacy, or they never had it.

    I do not mean to imply that I agree with the total banishment of Freud and his followers. I don’t.

    • Jeff Rouder says:

      I think Greenwald, in his “New Look 3,” is the most explicit about the link from Freud to the current field of social cognition. (https://faculty.washington.edu/agg/pdf/Gwald_AmPsychologist_1992.OCR.pdf). But it doesn’t explain why social psychologists are so smug at faculty meetings.

      • Social psychologists may be smug at your institution, because they know not to generalize about a group they clearly don’t understand.

        Social psychologists have pushed hard against both psychoanalytic approaches and especially behavioristic approaches. A tour through Lewin and Heider would trace out the thread.

“Freud’s position is not particularly recognizable as a theory of social activity” (From R. Lana, “Assumptions of Social Psychology,” Century Psychology Series, Appleton-Century-Crofts, 1969, p. 53).

  8. John Bullock says:

    Andrew, I’m not sure that you are correct to say that “there’s a norm of sharing” in social psychology.

    Check out the 2006 article by Wicherts et al., “The Poor Availability of Psychological Research Data for Reanalysis.” The authors go to unusual lengths to request replication datasets for more than 100 articles that were published in leading journals. They succeed only 27% of the time.

And on the requirements of leading psychology journals, see Wicherts and Bakker 2009, “Sharing: guidelines go one step forwards, two steps back.” They argue that the APA is actually preventing data-sharing instead of facilitating it.

Perhaps data-sharing has become more common in social psychology since these papers were published. But I haven’t found a study which supports that claim. By contrast, it’s likely that one would have a much better success rate in economics or political science, if only because many leading journals in those disciplines require that authors post their data and code online as a condition of publication.

    • Dale Lehman says:

      While many economics journals do require data to be posted, you will find that many (I’d estimate 50% or more) have a .pdf that states that data cannot be released because it is proprietary. So, perhaps your “better success rate in economics” needs to be cut in half.

    • Jordan Anaya says:

      I’m also not sure there is any evidence psychologists share their data upon request more than other fields. Perhaps psychologists are more likely to publicly post their data somewhere at the time of publication, but I would argue it is a lot easier for psychologists to do so given how small their data sets are.

      • Andrew says:

        Jordan:

        Psychology experiments are typically low-stakes and there is typically no good reason not to share data. Hence, the embarrassment when data are not shared is greater, and I think this contributes to the perception of a replication crisis. Good scientific behavior in psychology is so easy, that when people are behaving badly, it becomes super-obvious.

        • Jordan Anaya says:

          Yes, I agree. A dirty secret about “big data” is it often doesn’t matter whether or not the data is shared. Let’s say I think there might be problems with a study that uses “big data”, and the data is posted somewhere. Even if I had access to unlimited computing resources, it still might take me several weeks to confirm their results. Am I or someone else really going to take the time to do that?

        • John Bullock says:

          Psychology experiments are typically low-stakes and there is typically no good reason not to share data. Hence, the embarrassment when data are not shared is greater, and I think this contributes to the perception of a replication crisis. Good scientific behavior in psychology is so easy, that when people are behaving badly, it becomes super-obvious.

And yet — data-sharing in social psychology seems to be uncommon both in an absolute sense and relative to related fields, including political science. See the Wicherts articles to which I’ve linked in the post above. Then check out Figure 1 in this Allan Dafoe article, which shows that replication code and datasets for almost all articles in the American Journal of Political Science are now freely available online. And then remember that, with the advent of the Data Access and Research Transparency Statement, almost all political science journals have adopted the AJPS requirement that replication data and code be posted online. There’s nothing equivalent in social psychology. A few journals in that field have adopted strong data-sharing requirements, but they’re the exceptions.

Andrew, I think that you are wrong about this. To judge by the Wicherts article, or by a review of the policies and websites of leading journals, there is relatively little data-sharing in social psychology. These low rates of data-sharing don’t suggest embarrassment about the refusal to share. They suggest the opposite.

          • Andrew says:

            John:

In political science we typically work with public data: surveys such as the National Election Study or archival data such as election returns, war casualties, roll-call votes, etc. In psychology there is a tradition of people gathering their own data, in which case sharing is more of an option and less of a necessity. So, yes, I agree that the traditions in these two fields are different, and I think that the fact that public data are the norm in political science was a factor in pushing our field toward openness, even for privately collected data. Psychology has come only more recently to the idea that it could be a good idea to require open data. But I do think it’s embarrassing for psychologists when they don’t share their data. For more on this, perhaps we would need to interview the ghost of Paul Meehl, who could give his impressions as to why psychologists seem to be particularly stubborn. Political science has its share of frauds and cheats, but I don’t think we have anyone equivalent to that psychologist who accepted that string of ridiculous papers for PNAS.

            To continue with differences between poli sci and psychology: one factor I identified as relevant to whether a field of work is scrutinized is whether it has active opposition. Ideas in political science, especially when they are widely circulated, are typically contentious. They rarely slide into the public discourse without question. In contrast, until very recently, it seems that just about the entire fields of psychology or economics would fall hook, line, and sinker, for any statistically significant result that claimed that behavior could be explained by hidden motives (whether this be ovulation and clothing choice, or names and strikeouts, or embodied cognition, or beauty and sex ratio), if it was labeled as evolutionary psychology. All this unquestioned crap, when eventually questioned, led to a replication crisis. Perhaps poli sci has had less of a replication crisis because its well-publicized claims have typically been held up to a critical light.

            • Martha (Smith) says:

              Good points

            • John Bullock says:

              Perhaps poli sci has had less of a replication crisis because its well-publicized claims have typically been held up to a critical light.

              Yes, that could be. Whether political science claims have been scrutinized more often because they are inherently more contentious, or because political scientists have a stronger norm of data-sharing (even when they have gathered their own data)…in this case, the answer may be “both.”

  9. a reader says:

    I think there’s something else too: in psychology, effects are either highly dynamic, or really boring.

Exactly, Reader. Most of it is pretty boring. Also, I understand psychologists are prone to gossip. This is already in their disfavor insofar as forging substantive & empirical insights.

      • Anonymous says:

        “Also I understand psychologists are prone to gossip. This is already in their disfavor in so far as forging substantive & empirical insights.”

        Hahahaha! +1

        I actually think *social* psychologists might be more prone to this, not psychologists in general.

In fact, one of my theories about social psychology and the “replication crisis” is that the characters/personalities of folks who choose to study social psychology (instead of, for instance, clinical psychology, like I happen to have done) might be directly related to the crisis in their field. I reason that those who choose to study social psychology might be much more inclined to focus on groups, what others think of them, reputations, not upsetting certain norms even though they are bad, etc.

In line with this, I am getting increasingly worried about all this talk about “incentives” and “group norms”. I fear that if people keep using these types of words, they will lose sight of the real issues, and can easily make up a whole new set of “incentives” and “group norms” that will have nothing to do with good psychological science or improving psychological science. I think this might already have been happening with:

        1) open practices badges for journal articles (it is still not clear to me why an individual researcher would be “incentivized” to care about a badge on their published paper)

        2) “Registered Reports” (which are apparently sometimes not even registered and/or this registration is not available to the reader, see: https://osf.io/5a63g/)

3) the use of the word “transparent” (that word is already being (mis-)used to mean something that mostly resembles “honest” and not “giving the reader access to relevant information so they can check things for themselves”)

10. In short, social psychology is a relatively undeveloped social science. I lean toward Paul Rozin’s view, which I posted on an earlier thread. Who can take it seriously?

  11. Mayo says:

    Here were my remarks from 3 yrs ago about the ironies in the replication crisis in social psychology.
    https://errorstatistics.com/2014/06/30/some-ironies-in-the-replication-crisis-in-social-psychology-1st-installment/

  12. bacalao says:

Another possibility is that the distribution of talent/intelligence/creativity over people who become social psychologists is shifted to the left compared to that for those who become, say, physicists. In colloquial language, the density of really bright people is lower.

    A lot of these so-called replication errors are just the incompetence of poorly (self)-educated undisciplined minds.

    • Martha (Smith) says:

      My impression is that psychology is considered one of the easiest undergrad college majors. That would not imply that all psych majors are poor quality, but would suggest that the standards are not as high as in many other fields. In particular, I have heard people refer to a psychology major as “the easy route to medical school,” compared to, say, a biology major.

  13. Nate Breznau says:

    The financial rewards for Mr. Diederik Stapel were massive.

  14. Jacob says:

    Of course, a contrary view to those who say social psychology is an undeveloped science is that the problem is that it is in fact a developed one. I am partly playing devil’s advocate here, but if we know a lot of what we can know about social psychology already, the structural forces that Andrew points to are likely to be particularly disastrous. They are asked to squeeze new insights out of something that is already very close to peak knowledge. If you know most things already, most of your new findings will be wrong.

    Again, I don’t know that I’d go to bat for this point of view but it’s consistent with the basic problem here.

    • Jacob

Perhaps ‘undeveloped’ is not quite the characterization I should have used. Its applications to international relations issues are open to further queries b/c ‘suspicion’ is the overarching lens thru which we analyze. Yet how do we verify the insights empirically? Small-sample opinions have dominated throughout.

  15. anon says:

    Maybe need to re-read, but how is more open sharing of data a problem here?

    Sounds like the biologists supposedly being secretive just covers up their replication crisis rather than avoiding it.

    • Andrew says:

      Anon:

      Open sharing of data is not a problem. I’m saying that openness can contribute to a replication crisis: when data are open, outsiders can see that studies are not replicating. In some areas of biomedicine, data are so secret that it’s hard to tell that a study isn’t replicating, because it’s hard to figure out exactly what was done in the original study.

  16. Thanatos Savehn says:

I dispute the claim that it’s centered on psych. When Obama’s PCAST had a look, it was inspired by pharma companies that had poured billions into synthesizing and testing drugs designed to target a mechanism “discovered” by bio-researchers, only to find again and again that the mechanism couldn’t be reproduced. Since then (2014) it has become obvious that the combination of flawed or rigged statistical analysis and a rampant failure to verify reagents, cell lines, antigens and animal lines is wasting massive resources and costing the sick precious time. Power posing doesn’t kill, but shoddy cancer research does.

    • a reader says:

      Right. As much as some commenters here decry clinical trials, their high failure rates indicate their necessity.

      • Anoneuoid says:

        What is your argument? It looks like:

        Premise: This way of doing things is totally infeasible and wastes tons of money
        Argument: ???
        Conclusion: We must continue, and even increase the rate of, doing things that way

        It sounds like one of those government “blow the budget” arguments.

        • a reader says:

          My argument is in regards to company sponsored trials, rather than government sponsored trials.

          In this situation, a company is allowed to gather information in whatever manner they want. Given this information, they can choose to push a trial forward or use resources elsewhere. Under such assumptions, the choice to sponsor a trial should represent an expression of the company’s prior that a trial will be successful. Given how expensive clinical trials are, and that the company is footing the bill, the choice to sponsor a clinical trial should indicate that the company has strong faith in their new treatment (and that the trial will be successful).

The fact that such a high percent of clinical trials are unsuccessful is the cold water in the face, presumably informing the company that its faith in the new treatment is misplaced.

This conclusion isn’t “We must continue, and even increase the rate of, doing things that way” at all, at least from the company’s perspective. Rather, it’s a clear sign that companies need to re-evaluate how they form their prior information. This has led to some important, uncomfortable conversations (or at least, they were uncomfortable when I witnessed them).

          • Martha (Smith) says:

            reader,

            I agree with what you say. But there are other problems that need to be faced. These include:

            Companies financing clinical trials have been known to “cherry pick” from huge amounts of data and studies and report only results that are favorable. (I don’t have references at hand, but could probably dig some up.)

            Attempts to foster “registration” of clinical trials haven’t had any real enforcement clout behind them.

            Legislators and lobbyists have been reported to “tweak” e.g., FDA regulations to be favorable to big pharma, rather than to the consumer.

            • a reader says:

              Martha:

              “Companies financing clinical trials have been known to “cherry pick” from huge amounts of data and studies and report only results that are favorable. (I don’t have references at hand, but could probably dig some up.)”

              I don’t understand this statement. In proposing an FDA clinical trial, all the analysis plans (i.e., response variables, predictors, analysis methods, plans for handling drop-out, etc.) must be laid out ahead of time, along with plans to control type I error rates. So attempting to cherry pick results from the clinical trial would greatly hamstring the trial (i.e., would require too many multiple comparison adjustments).

              Or perhaps you are saying that the companies cherry pick the data in order to justify running the clinical trial? Or non-company sponsored clinical trials, with the PACE trial as the prime example? I’m not 100% certain whether the PACE trial included cherry-picking per se, but I know it did include enough other problems of that magnitude.
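
A rough sketch of the arithmetic behind this multiple-comparisons concern (assuming, for illustration, 20 independent outcomes with no true effects):

```python
# Why pre-specified multiplicity control matters: with 20 independent
# outcomes and no true effects, testing each at alpha = 0.05 gives a
# high chance of at least one spurious "finding". Illustrative numbers.
alpha, m = 0.05, 20

# Chance of at least one false positive across m independent tests
fwer_unadjusted = 1 - (1 - alpha) ** m
# Bonferroni: test each outcome at alpha / m instead
fwer_bonferroni = 1 - (1 - alpha / m) ** m

print(f"unadjusted familywise error rate: {fwer_unadjusted:.2f}")  # ~0.64
print(f"Bonferroni-adjusted rate:         {fwer_bonferroni:.3f}")  # ~0.049
```

This is the sense in which “measuring 20 things and reporting the 2 that looked good” is nearly guaranteed to produce something, and why a pre-specified analysis plan that charges for every comparison makes cherry-picking so costly.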

              • Anoneuoid says:

There was a report not too long ago (i.e., 1-3 years) from someone who got access to the complete FDA data for a bunch of trials. IIRC, they said they considered the public literature to be pretty much worthless without having access to that extra info. Standard stuff like measuring 20 things and reporting the 2 that “looked good”.

                Sorry, I can’t find what I am referring to at the moment but there are plenty of papers about selective reporting/etc of clinical trials:

                https://www.nejm.org/doi/full/10.1056/NEJMsa065779
                https://www.ncbi.nlm.nih.gov/pubmed/28529187
                https://www.nature.com/articles/tp2017203
                https://blogs.bmj.com/bmj/2018/01/31/tianjing-li-whats-not-shared-building-on-the-transparency-momentum/

                In proposing an FDA clinical trial, all the analysis plans (i.e., response variables, predictors, analysis methods, plans for handling drop-out, etc.) must be laid out ahead of time, along with plans to control type I error rates. So attempting to cherry pick results from the clinical trial would greatly hamstring the trial (i.e., would require too many multiple comparison adjustments).

                Perhaps the FDA does get this info but it just gets thrown into a warehouse and not used for anything? This report used to be easy to find on the FDA website but seems to be gone since Jan 2018:

                Incredibly, critical data resides in large warehouses sequestered in piles and piles of paper documents. There are no effective mechanisms to protect these paper records, which include very valuable clinical trial data. Furthermore, processes for data and information exchange, both internally as well as among external partners, lack clear business processes, information technology standards, sufficient workforce expertise, and a robust technology platform, such that the FDA cannot credibly process, manage, protect, access, analyze and leverage the vast amounts of data that it encounters.

                wayback.archive-it.org/7993/20180126164015/https://www.fda.gov/ohrms/dockets/ac/07/briefing/2007-4329b_02_01_FDA%20Report%20on%20Science%20and%20Technology.pdf

              • a reader says:

                Anoneuoid:

                That’s a pretty hard pivot from “Clinical trials aren’t helpful” to “the published academic research that results from clinical trials isn’t helpful”.

It’s very important to note that the results of clinical trials *are* publicly available, along with all the declared protocols. And these records are used all the time by competing pharma companies. I suppose you could argue that it’s “just thrown into a warehouse”, but it’s a publicly available warehouse that actively gets used. So whether the published academic literature makes the best use of the information obtained from clinical trials is a very different question from whether clinical trials are useful.

              • Martha (Smith) says:

                Reader,

                See, e.g., the following references in http://www.ma.utexas.edu/users/mks/CommonMistakes2016/AppendixDayThree2016.pdf :

                Doshi et al (2012), Doshi et al (2013), Le Noury et al (2015), Jefferson et al (2014)

              • a reader says:

                Martha:

Thanks for that. I read through Doshi 2013 and my takeaway was that CSRs (a) are really big and (b) contain a lot more information than what makes it into academic publication. Are there any other key points to take away?

            • Keith O'Rourke says:

              > Companies financing clinical trials have been known to “cherry pick” from huge amounts of data and studies and report only results that are favorable.
Definitely common in journal publications but rather rare in the regulatory review process, where everything is potentially accessible and auditable.

Now, apparently some of this still happens in third-world countries, where firms can get away with doing drug trials without the local government approving and tracking what’s done in the trials.

Also, I believe I heard the FDA is starting to report the discrepancies they hear about in publications to the editors. So putting just the good stuff in journal publications won’t work so well if the editors actually do something.

              > Legislators and lobbyists have been reported
The first are serving their core supporters, and the second have a business model to advance their stakeholders’ interests – right?

  17. Keith O'Rourke says:

I was thinking along similar lines as Thanatos, in that many working in meta-analysis of clinical trials in the 1980s realized there were problems, and by the 1990s they were known to be severe. Then micro-array expression data started to be analyzed, and unlike clinical trials these analyses were easily and quickly re-done, so many more outsiders could see that studies were not replicating. John Ioannidis had perhaps the most impact there. Unfortunately (or fortunately, see below) most folks don’t seem to be that aware of problems with published clinical trials.

Andrew seems to be taking crisis to be “outsiders can see that studies are not replicating”. In that sense, maybe the replication crisis has yet to occur in medicine. I read somewhere last week that when all the meta-analyses in the Cochrane library are reviewed, only a small subset are identified as having strong evidence. Most just classify the evidence as being weak and fragile.

    Part of what might be happening here is a reluctance to make this clear to everyone as that would be very stressful to patients and their families. Sort of a maintenance of a placebo effect of evidence based clinical expertise.

  18. Thanatos Savehn says:

My thought is that because it’s not sexy (fame and fortune go to the captain rather than the navigator) and lowers the rate of “discovery”, methodology is often ignored and abused. The answer then is to emphasize, or better yet prioritize, methodological theories of measurement and analysis in the education of scientists.

  19. Erin Jonaitis says:

Interesting. My read of most of these bullet points (save the last) is that the problem became visible in social psych, which is really different from a claim that the problem is more pronounced in that field. Cf. debates over whether time trends in incidence/prevalence of various diseases are primarily a function of biology or of observation.

    I think this list is interesting and is reasonable as far as it goes. I do think there is still room for asking why statistics was not out in front of this. In hindsight, 2016 seems kind of late to be issuing proclamations on what p values are and what they are not.

    • Keith O'Rourke says:

      > I do think there is still room for asking why statistics was not out in front of this
There used to be a disinterest in, or even outright avoidance of, assessing multiple very similar studies. John Nelder called this the cult of the single study within the statistics discipline.

When I was at Duke in 2008, I gave a graduate course on meta-analysis and only two students enrolled. That summer, I organised a SAMSI summer program on meta-analysis which attracted 50 to 60 people, roughly 30 percent of them PhD graduates. When asked, none of them remembered having any instruction on how to deal with multiple studies.

Something changed afterwards and it’s not so bad now. But hear no multiple studies, see no multiple studies, and speak no multiple studies likely left the statistics discipline mostly unaware of the problems.

Yes, why was statistics not out in front! I think it has to do with the resort to devil’s advocacy translating into scholarship. Publish or perish enters into it, which makes the process of self-correction that much more cumbersome.

    • Jeff Valentine says:

      >My read of most of these bullet points (save the last) is that the problem became visible in social psych, which is really different from a claim that the problem is more pronounced in that field

      I think this is exactly right – the public discussion about the replication crisis is centered on social psych, but I doubt very seriously that the replication crisis itself is centered on social psych.

  20. Jim Dannemiller says:

    I think that your statements on data sharing in bio fields are a bit off the mark. There is a clear ethic of sharing, for example, in genetics. I am currently doing secondary data analysis on an NIH-funded data set with gene expression measured in the brain (post mortem) on approximately 16,000 genes. The data are publicly available at gn2.genenetwork.org. My experience is that geneticists understand the power of collaboration, and for the most part, are willing to share data. Because no single lab generally has the resources to gather data with sample sizes sufficient to achieve reasonable power, pooling and sharing data are seen as being necessary for doing good science.

  21. Rick G says:

    Missing from this list is “true effect sizes are very small compared to what researchers in the field think they will be”. I think that is the underlying force motivating small sample sizes, sloppy measurements, and an insistence on finding “something” in the data, which doom all of these experiments from the start.
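
Rick’s point can be checked with a quick simulation: when the true effect is small relative to the standard error, statistical significance is rare, and the estimates that do clear the significance bar necessarily overstate the effect. The numbers below (a hypothetical true standardized effect of 0.2, 20 subjects per group, known sd of 1, normal approximation to the t test) are illustrative assumptions, not from any real study.

```python
import math
import random
import statistics

random.seed(1)

true_effect = 0.2                    # hypothetical small true effect (sd units)
n_per_group = 20
se = math.sqrt(2.0 / n_per_group)    # se of a difference in means when sd = 1
crit = 1.96 * se                     # |estimate| needed for p < .05 (normal approx)

sig_estimates = []
n_sims = 20_000
for _ in range(n_sims):
    # Idealized replication: the estimate is the true effect plus sampling noise
    estimate = random.gauss(true_effect, se)
    if abs(estimate) > crit:
        sig_estimates.append(estimate)

power = len(sig_estimates) / n_sims
exaggeration = statistics.mean(abs(e) for e in sig_estimates) / true_effect
print(f"power is roughly {power:.2f}; significant estimates "
      f"overstate the true effect by about {exaggeration:.1f}x")
```

Under these assumptions power comes out around 10 percent, and the studies that do reach significance exaggerate the effect severalfold, which is one way to see why “finding something” in a small, noisy study is a recipe for unreplicable results.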

    • Andrew says:

      Rick:

      Good point. To some extent your point is implied by this thing I wrote: “Finally, hypotheses in psychology, especially social psychology, are often vague, and data are noisy.” But it’s good to make it explicit.

  22. Al says:

    Experiments and concepts in Social Psychology tend to be much easier to understand than those from other fields. If I am teaching students about the Garden of Forking Paths/p-hacking, it is much easier for me to use an example from Social Psychology than from a field like Immunology. I personally know of far more examples of poor research practices in the field of Immunology, but unless the students are familiar with a bunch of background knowledge I would need to spend quite a bit of time explaining concepts to them. Power-posing is probably a great teaching example for much the same reasons it made a “great” TED Talk: by the time you get to the methodological problems (or, in the case of the TED Talk, by the time you get to the feel-good message), people haven’t been bored into submission by a bunch of technical explanations.

    I suspect that the lack of a “crisis” in many other fields is in part due to their methodological problems hiding within a cloud of technical methods and complex jargon.

    • jrkrideau says:

      I have always found that psychology suffers, in the public mind anyway, in that it speaks in something close to the vernacular (English, in many cases).

      Clearly we need a “sacred language” to get rid of all those nitpickers.

    • Keith O'Rourke says:

      Al:

      Definitely a major factor.

      I should have brought that out more. In clinical research, whatever treatment/disease you might use as an example, someone in the class/audience might be affected or have a loved one who is. Not only might this be overly stressful, but the presenter almost needs to be a clinical expert not to give the slightest misdirection.

  23. Jeff Valentine says:

    Andrew and the commenters have hit on disciplinary features that I suspect are related to the replication problem. Among these are (a) experiments that are cheap to run, (b) measurement that is both cheap and imprecise (so studies have lots of noisy measures), (c) effect sizes that are smaller than people think, (d) reliance on NHST along with its common misperceptions, and (e) a tendency to think “one study at a time”. Social psych scores high on these dimensions, but it is not the only discipline for which this is true.

  24. zbicyclist says:

    I think a contributing factor is that social psychology is PR- and newscast-friendly. The latest counterintuitive finding can be a nice, simple 2 minute segment on the news, and hyped ahead of time. This is less likely to be the case for, say, immunology (see Al’s comment above).

    The press doesn’t care so much if it’s solid, only that it’s newsworthy. Whether it replicates isn’t important to the news cycle.

    So this creates more immediate pressure for cute, nonreplicable studies and also means it’s easier to understand the replication-crisis issues. The Daryl Bem ESP experiments are a good example: it’s easy to see that there’s some problem somewhere in these studies, if you accept the easily understood premise that there’s no such thing as ESP. Non-experts would not have the ability to understand any similar issues in immunology.

  25. N says:

    There are some topics that are not mentioned in this list.
    – Job security and expectations regarding employment. Many academic disciplines cater to specific societal systems, which provide a (relatively, these days) stable labour market for graduates and postgraduates. These systems also “annoy” academia with certain demands about what knowledge they deem useful, exerting influence through third-party funds and extra-academic job offerings. While these incentives have their own problems, they also enforce a certain rigidity (research must conform to minimum standards of applicability) and, I assume, increase the (perceived) security of employment.
    – With the lack of a built-in audience, the most relevant observers, aside from researchers’ own peers, seem to be the mass media. The problem here is not (just) the public interest, but the lack of a critical, consistent, professional audience.
    – Government regulation or organizational regulations often offer incentives that are directly adverse to quality research (publish or perish!). This not only pressures individuals within the system; it can also pressure people out of the system who are not willing to compromise/conform. The practices of project work and limitations of contracts impose additional burdens on (younger) researchers.
    – Higher publication count and more lax QA due to “easier” research practices. Now this is quite speculative, but as psychological research is relatively cheap, not just in money and labour costs, but also in time consumed, the frequency of publications expected from a researcher or research group goes up. This could diminish the importance of any individual study (more publish or perish) and, perhaps more importantly, leaves less time to prepare and revise a publication. The very way in which the research process is structured in this field could undermine its quality.

    Much of this is not exclusive to (or even disproportionally present in) Social Psychology, but I think bad incentives and problematic selection pressure will have exaggerated effects when there are fewer counter-incentives.

  26. Enough already says:

    There is no replication crisis in social psychology. It’s BS. Some social psychologists started a career by doing poor replication attempts, and uncreative individuals, personality psychologists, and statisticians who know very little about conducting actual experiments with random assignment jumped on this trend. Doing bad research and getting all the attention, that’s all.

      • Anonymous says:

        Actually, Diederik (Stapel) might view things differently now that he is no longer part of the field of Social Psychology and no longer earns his money via it. He wrote the following (in his book about his fraud):

        “I hate myself and project that hatred onto others. I’ve become a misanthrope. I don’t find anything or anyone interesting or worth talking to or about. Social psychology is garbage, just a collection of pseudo-effects. Everyone is just making a career running stupid little experiments that no one cares about. It has nothing to tell us.” (p. 130-131)

        Source: http://nick.brown.free.fr/stapel/

        • I skimmed through the book. How depressing an account.

          • Anonymous says:

            “How depressing an account.”

            To me, the following was the really depressing account: the Levelt committee’s report about the investigation concerning Stapel’s fraud. The following link leads to a Dutch website/text of Tilburg University, but you can find a link there to the English version of the report:

            https://www.tilburguniversity.edu/nl/over/profiel/kwaliteit-voorop/commissie-levelt/

            Some “highlights” of the report:

            “The discovery of the methodological defects, which constitutes an unintended and unexpected finding of this inquiry, did raise the crucial question for the Committees as to whether this research culture, which is described in more technical detail below, is also rife throughout the field of social psychology, nationally and internationally. Could it be that in general some aspects of this discipline’s customary methods should be deemed incorrect from the perspective of academic standards and scientific integrity?” (p. 47)

            “The statisticians found countless flaws while studying and re-analysing all the survey material. This is inevitable to some extent; social psychology researchers cannot be expected to be aware of the latest, specialized techniques in the statistics field. But it is disturbing that statistical flaws frequently revealed a lack of familiarity with elementary statistics. For instance, one article had an entire series of t-test results on the difference between pairs of means, of the type: t = 0.90, p < .05 and therefore ‘significant’. The coauthor was astounded to be told by the Levelt Committee that a co-author should have seen this absurdity.” (p. 52)

            I think (the exposed fraud by) Diederik Stapel, and the Levelt report, also may have contributed heavily concerning why the replication crisis has possibly been centered on Social Psychology.
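
For readers without the statistics background: the absurdity the Levelt Committee flagged can be checked in a few lines. Under the null hypothesis, a t statistic of 0.90 is entirely unremarkable; its two-sided p-value is around 0.37, nowhere near .05. The sketch below uses a normal approximation to the t distribution, which is accurate enough for this purpose at moderate sample sizes (and the exact t-based p-value at small sample sizes is even larger, so the conclusion only strengthens).

```python
import math

def two_sided_p_normal_approx(t_stat: float) -> float:
    """Two-sided p-value for a t statistic via the normal approximation,
    which is close to the exact t-distribution value once df is about
    30 or more."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

p = two_sided_p_normal_approx(0.90)
print(f"t = 0.90  ->  two-sided p of about {p:.3f}")  # far above .05
```

So reporting “t = 0.90, p < .05” is not a borderline judgment call; it is arithmetic that any co-author could have checked.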

            • Anonymous, thanks for the link. You piqued my interest considerably. I am impressed by the candor here on Andrew’s blog. It is so much fun too.

            • I just wonder what is meant by “elementary statistics”. A course in basic logic, including diagnostic skill, would supersede it in importance, along with conceptualizing & sequencing of reasoning.

              • Anonymous says:

                “A course in basic logic course, including diagnostic talent, would supersede in importance along with conceptualing & sequencing of reasoning.”

                Yes!!

                I followed (parts of) a 3-year “Bachelor in Clinical Psychology” and a 2-year “Research Master in Behavioural Science” at a university and never got any classes in/about logic, reasoning, etc. This to me is still incomprehensible as i reason you need it with lots of things related to science and research.

                I seriously think that this has got to be (-come) a mandatory part of education in general (just like math, geography, history, etc.), but definitely education at the university-level.

                I (partly) came to this thought reading more and more papers/reactions by senior Psychology professors who seem to make the most basic errors in reasoning that somehow passed through “peer-review”. Either i am mistaken, or they are, but in both possible cases i reason mandatory classes might be in order :) Here is an example:

                https://andrewgelman.com/2017/09/27/somewhat-agreement-fritz-strack-regarding-replications/#comment-574038

              • Martha (Smith) says:

                “I followed (parts of) a 3-year “Bachelor in Clinical Psychology” and a 2-year “Research Master in Behavioural Science” at a university and never got any classes in/about logic, reasoning, etc. This to me is still incomprehensible as i reason you need it with lots of things related to science and research.

                I seriously think that this has got to be (-come) a mandatory part of education in general (just like math, geography, history, etc.), but definitely education at the university-level. “

                Good math teaching involves developing skill in logic and reasoning. (Unfortunately, a lot of math teaching isn’t good).

              • Anonymous says:

                “Good math teaching involves developing skill in logic and reasoning. (Unfortunately, a lot of math teaching isn’t good).”

                That could be. I only followed math in high school and don’t remember being taught anything related to logic and reasoning. More importantly, i think they still should be taught separately regardless.

                I really don’t understand why this is not part of the standard curriculum at high schools, universities, etc. (at least not here in the Netherlands). It just boggles my mind why this is not being taught. I reason it is way more important, and useful compared to, let’s say, chemistry or advanced math. I reason logic and reasoning could be taught keeping different levels of the students in mind just like your standard curriculum stuff, and i reason lots of real-world/interesting/relatable examples can be found everywhere.

                For me also, math and logic/reasoning “feel” very differently. I am really, really bad with anything related to math/statistics but think i can much more easily spot errors in reasoning for some reason. I sometimes sort of “sense” something doesn’t quite add up when reading/hearing something, and then have to really dig through what it is that is being written/said to spot the possible errors (see link above for a thread where i wrote about many possible errors that were a result of that type of process).

              • Martha Smith says:

                “For me also, math and logic/reasoning “feel” very differently. I am really, really bad with anything related to math/statistics but think i can much more easily spot errors in reasoning for some reason.”

                So sad to hear. I was fortunate to have had very good math from early on. High school geometry then was almost all about logic and reasoning — first, the basics such as distinguishing a statement from its converse; then start from axioms and use them to prove simple propositions; then use the axioms and propositions to prove deeper theorems — but all in a context where you could use pictures to guide what seemed plausible, as well as strict logic to prove it.

              • Anonymous says:

                “So sad to hear.”

                I hope that’s not in light of me being really bad at math/statistics and/or having had no logic/reasoning math related classes :) I couldn’t care less about both. I am perfectly happy to be really bad at math/statistics.

                I tried very hard at university, even to the point of buying one of Andy Field’s books and working through the book multiple times during the summer break as a voluntary extracurricular attempt to try and get a grasp on it, but to no avail.

                That’s also related to my point above: i think logic/reasoning is way more important, and way more useful, to learn compared to math (additional to the very, very basic stuff you learn pre-high school which i think is useful).

              • Martha (Smith) says:

                Anonymous:

                Oops — I quoted the wrong part of your post. The “So sad to hear” was intended to refer to “I only followed math in high school and don’t remember being taught anything related to logic and reasoning. More importantly, i think they still should be taught separately regardless.”

          • Ok, I managed to read through 57 pages; the rest consisted of appendices.

            • Anonymous says:

              “Ok I managed to read through 57 pages for the rest consisted of appendices”

              Wow.

              So, what was more depressing for you: Stapel’s own account via his book, or the Levelt report about his fraud and the (possible) state of Social Psychology?

              • Definitely Stapel’s own account; it was very sad. It had the more poignant lessons for having come from someone who engaged in such a fraudulent endeavor: the reputational costs to his family and to himself. His own daughter was queried by journalists.

                The Levelt Report sounded more sanitized, even though the Stapel case was seemingly so thoroughly reviewed. The process by which Stapel accumulated so many articles was interesting: providing data to academics and in turn securing co-authorship.

                I haven’t heard of these strategies in sociology, theology, history, and political science fields. Thanks for the link, BTW.

  27. Anoneuoid says:

    a reader wrote:

    That’s a pretty hard pivot from “Clinical trials aren’t helpful” to “the published academic research that results from clinical trials isn’t helpful”.

    There is no pivot, just a change of focus since you started making the claim that the info the FDA has is publicly available. This seems bizarre to me.

    It’s very important to note that the results of clinical trials *are* publicly available, along with all the protocols declared. And these records are used all the time by competing pharma companies.

    How can you keep saying this??? What are you basing it on? Here is what it says in the refs I already quickly found during my search for something else for the other post:

    Attempts to study selective publication are complicated by the unavailability of data from unpublished trials.

    https://www.nejm.org/doi/full/10.1056/NEJMsa065779

    We identified public (e.g., journal articles, Food and Drug Administration [FDA] reviews, short reports) and nonpublic reports (clinical study reports [CSRs], CSR-synopses) available for these trials by 2015.

    https://www.jclinepi.com/article/S0895-4356(17)30121-X/fulltext

    Here is the FDA on CSRs, they are clearly just now (in 2018) running a pilot study to release portions of the info they have to the public:

    During the pilot, we will post key portions of the Clinical Study Reports (CSRs) – documents that sponsors create for FDA on each of their clinical studies.

    https://blogs.fda.gov/fdavoice/index.php/tag/clinical-study-reports-csrs/

    More on that pilot study. Btw, I think this was the reference I was originally thinking of. I did include it in that post but had only skimmed it:

    While it is tempting to say that this is “a step in the right direction,” the narrow scope of the pilot is cause for concern. To begin with, the pilot programme is voluntary: the sponsors decide whether to share their CSR information and what to share.

    […]

    When journal articles were compared to CSRs, we (and others) have found that the information in the public domain cannot be trusted at face value. Outcomes are “switched”. Efficacy results are “cherry picked”. Adverse events, including the most serious ones, are grossly under-reported in the publications. Inaccurate and selective reporting puts the health of the public at stake. Releasing CSRs enables independent scrutiny, for example to identify outcomes assessed but not presented to the public, one central step towards protecting and advancing public health.

    https://blogs.bmj.com/bmj/2018/01/31/tianjing-li-whats-not-shared-building-on-the-transparency-momentum/

    I suppose you could argue that it’s “just thrown into a warehouse”…

    That’s what a congressional investigation into the FDA back in 2007 claimed was going on, not me.

    …but it’s a publicly available warehouse that actively gets used.

    See above, you are just stating the info is publicly available out of the blue, something that goes against what many people, including the FDA itself, claim.

    I personally have never checked, so it is either your word, backed by no evidence or even a citation, or all these articles and the FDA… So what are you basing your claims on?

    • Anoneuoid says:

      Again posting down here due to nesting limits. I just noticed “a reader” wrote:

      Martha:

      Thanks for that. I read through Doshi 2013 and my take away was that CSR’s (a) are really big and (b) contain a lot more information than what makes into academic publication. Is there any other key points to take away?

      Obviously a reader is working with the Douglas Adams version* of “publicly available”. From the Doshi 2013 paper:

      We obtained CSRs from public sources, as follows:

      1. Requesting from EMA, under its Freedom of Information (FOI) policy, CSRs for manufacturer-sponsored trials of the 10 best-selling prescription-bound products in the USA in 2010.23

      2. Reusing CSRs from our own previous research (oseltamivir and zanamivir).12

      3. Downloading CSRs openly available on the Internet. Search terms were not predefined, but sites searched included Google (http://www.google.com), the Drug Industry Document Archive (http://dida.library.ucsf.edu/) and Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen (Institute for Quality and Efficiency in Healthcare) (IQWiG)’s library of reboxetine studies (https://www.iqwig.de/information-on-studies-of-reboxetine.980.en.html).

      4. Corresponding with one researcher who obtained CSRs through an FOI request to Food and Drug Administration (FDA) (epoetin alfa).

      5. Requesting manufacturers fill any gaps in the completeness of reports that we believe are legally required to be publicly available (paroxetine).

      To create as broad a database as possible, we did not apply restrictions in drug type or family or sponsor. We did not submit requests under the Freedom of Information Act to the FDA, because such requests can take years to be fulfilled and—if fulfilled—may be heavily redacted.24

      We did not draw a random sample of CSRs as there is no known sampling frame. No one knows how many reports have been written by intervention category as there is no central register of CSRs.

      http://bmjopen.bmj.com/content/3/2/e002496.long

      “But the plans were on display…”
      “On display? I eventually had to go down to the cellar to find them.”
      “That’s the display department.”
      “With a flashlight.”
      “Ah, well, the lights had probably gone.”
      “So had the stairs.”
      “But look, you found the notice, didn’t you?”
      “Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.”

      -The Hitchhiker’s Guide to the Galaxy

      • a reader says:

        Anoneuoid:

        Well, perhaps it will help to explain where I get my views on clinical trials from. To be clear, I don’t claim to be an expert; in fact, my only experience was two summer internships at two different pharmaceutical companies. So if you have experience or actual information that I’m not aware of, please correct me where I make a mistake.

        So where are the results from clinical trials? Clinicaltrials.gov, of course. This information is publicly available, but not all the trials result in academic publications. Hence my claim that results are publicly available, even though they are not always reflected in the academic literature. I understand that they are maybe not as complete as you would like, but it is my understanding that this information is, in fact, used. In particular, when I worked at the pharmaceutical companies, my boss told me that a big part of their job was monitoring updates and interim analyses that were required to be publicly available so the company could position itself better given current progress by competitors.

        It’s my guess that your note about the 2007 Congressional investigation that found clinical trial information was not readily available resulted in the following: “The ClinicalTrials.gov registration requirements were expanded after Congress passed the FDA Amendments Act of 2007 (FDAAA). Section 801 of FDAAA (FDAAA 801) requires more types of trials to be registered and additional trial registration information to be submitted.”

        Why do I find it odd that people claim pharma companies switch outcomes, fudge data, etc., during a clinical trial? Well, as an intern, I never analyzed data from an ongoing clinical trial, so I can’t honestly say I know anything first hand about that process. But at my first position, it was my job to research futility stopping rules to be executed by a Data Monitoring Board. We would write the SOP on how to analyze the data and stop the trial early if the results did not look promising, which would then be executed by an independent group. How does one fudge the data when one is only allowed to write a procedure but not execute it? And no, we’re not allowed to write a procedure that says “drop outliers that look bad” or “choose the outcome that looks best”.

        At my second position, my job was to research methods for sensitivity analysis in the case of non-ignorable (NI) missingness, again to be executed by an independent group, and we were required to give the FDA a thorough explanation of why these sensitivity analyses would be appropriate. In this case, we had previous Phase III trial data which we were using as a pilot. During the previous trial, the treatment was used for condition X, but it was observed to have a positive and statistically significant effect on Y (at least when not adjusting for multiple comparisons), which motivated running an official clinical trial in which we declared that we were using the treatment on Y given the promising results observed in the earlier trial. Given that we had appreciable dropout in the pilot study (I think around 20%?), we were required to write an SOP for how to deal with this.

        So I don’t claim a complete understanding of exactly how everything is done. But these ideas that pharma companies are switching outcomes and hiding results just doesn’t make sense given my experience. If you’ve had more intimate experience with the clinical trial process, by all means, please let me know.

  28. Nik Tyresen says:

    The impression that psychology is a “cheap and easy” science is one of the worst plagues for our discipline.

    Suppose you are doing a lab experiment and you have one tiny room and one lab computer. You will be “running participants” one at a time, which even for college students means at least 45 min to 1 hr of real clock time per data point. If you want special samples (real working adults, retirees, people with disabilities, anybody who doesn’t live within a short drive of an R1 campus), multiply that by a possibly huge coefficient and add other expenses (participant rewards, travel costs), and factor in that you can’t just run your experiments during a slow weekend evening the way my compsci buddies test their code, stopping whenever coffee runs low.

    A European take: ever since I moved from sociology to experimental psychology and from the US to Europe, I’ve been wondering why I did that to myself. Where did all the undergrads willing to work for a reference letter to med school go? Where’s my endless supply of college students with research participation requirements? Why can’t I get away with a program-send_to_Facebook-get_coffee-…-???-PROFIT online survey? (The funding bodies on this side of the Pond, however, are convinced that psychology is a humanities discipline, ergo all about chatting with a couple of pals in a café, so you don’t even need labs, much less RAs or participant remuneration.)
