Fixing the reproducibility crisis: Openness, Increasing sample size, and Preregistration ARE NOT ENUF!!!!

In a generally reasonable and thoughtful post, “Yes, Your Field Does Need to Worry About Replicability,” Rich Lucas writes:

One of the most exciting things to happen during the years-long debate about the replicability of psychological research is the shift in focus from providing evidence that there is a problem to developing concrete plans for solving those problems. . . . I’m hopeful and optimistic that future investigations into the replicability of findings in our field will show improvement over time.

Of course, many of the solutions that have been proposed come with some cost: Increasing standards of evidence requires larger sample sizes; sharing data and materials requires extra effort on the part of the researcher; requiring replications shifts resources that could otherwise be used to make new discoveries. . . .

This is all fine, but, BUT, honesty and transparency are not enough! Even honesty, transparency, replication, and large sample size are not enough. You also need good measurement, and some sort of good theory. Otherwise you’re just moving around desk chairs on the . . . OK, you know where I’m heading here.

Don’t get me wrong. Sharing data and materials is a good idea in any case; replication of some sort is central to just about all of science, and larger sample sizes are fine too. But if you’re not studying a stable phenomenon that you’re measuring well, then forget about it: all those good steps of openness, replication, and sample size will just be expensive ways of learning that your research is no good.

I’ve been saying this for a while so I know this is getting repetitive. See, for example, this post from yesterday, or this journal article from a few months back.

But I feel like I need to keep on screaming about this issue, given that well-intentioned and thoughtful researchers still seem to be missing it. I really really really don’t want people going around thinking that, if they increase their sample size, keep their data open, and preregister, they’ll solve their replication problems. Eventually, sure, enough of this and they’ll be so demoralized that maybe they’ll be motivated to improve their measurements. But why wait? I recommend following the recommendations in section 3 of this paper right away.

23 Comments

  1. Hanno says:

    “will just be expensive ways of learning that your research is no good.”

    Call me cynical, but I feel that would be a massive improvement over “expensive ways to produce false positive results”.

  2. Anonymous says:

    “But if you’re not studying a stable phenomenon that you’re measuring well, then forget about it: all those good steps of openness, replication, and sample size will just be expensive ways of learning that your research is no good.”

    For some time now I have wondered whether large-scale (replication) research is really the optimal way to perform research. In light of this I posted the following on the website of the “Psychological Science Accelerator” (which, if I understood things correctly, aims to perform large-scale research involving many labs, and announced they were writing a paper about the project):

    https://psysciacc.org/2017/11/08/the-psychological-science-accelerators-first-study/

    “I wondered if the following might be interesting, and useful, for you guys to investigate concerning 1) how to optimally accelerate psychological science, and 2) your possible paper about the project:

    1) Take a look at all the Registered Replication Reports (RRRs) performed thus far

    2) Randomly take 1, 2, 3, 4, 5, etc. individual labs from one of these RRRs and their individual associated confidence intervals, effect sizes, p-values, no. of participants, etc.

    3) Compare the pooling of the information of these 1, 2, 3, 4, 5, etc. individual labs to the overall results of that specific RRR

    4) Try to find out what the “optimal” no. of labs/participants could be so as not to waste resources unnecessarily

    5) Possibly use this information to support the no. of labs/participants per PSA-study in your paper, and/or use the information coming from this investigation to come up with a possibly improved manner to optimally accelerate psychological science.”

    I do not have the computer and/or statistical skills to do this myself, but I thought it could be a useful investigation for them concerning how to optimally perform research and accelerate psychological science.
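    The subsampling-and-pooling exercise in steps 1–5 above could be sketched on simulated data. Here is a minimal sketch, with all numbers invented for illustration (not real RRR results), and fixed-effect inverse-variance pooling assumed as the meta-analytic summary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not real RRR data): 20 labs, each running a
# two-condition study with n participants per cell and true effect d = 0.2.
K, n, d = 20, 50, 0.2

# Per-lab effect-size estimates and their sampling variances
# (standard approximation for a standardized mean difference
# with equal cell sizes: Var ≈ 2/n).
lab_d = rng.normal(d, np.sqrt(2 / n), size=K)  # simulated lab estimates
lab_var = np.full(K, 2 / n)                    # approx. variance of each estimate

def pooled(idx):
    """Fixed-effect (inverse-variance-weighted) pooled estimate over labs idx."""
    w = 1 / lab_var[idx]
    return np.sum(w * lab_d[idx]) / np.sum(w)

full = pooled(np.arange(K))  # the "overall RRR result"

# For each subset size k, how far is the pooled estimate of k randomly
# chosen labs from the full 20-lab result, on average over many draws?
for k in range(1, K + 1):
    err = np.mean([abs(pooled(rng.choice(K, k, replace=False)) - full)
                   for _ in range(500)])
    print(f"k = {k:2d} labs: mean |pooled - full| = {err:.3f}")
```

    With real RRR data, `lab_d` and `lab_var` would instead come from the published per-lab effect sizes and standard errors; the smallest k whose pooled estimate stays reliably close to the full result would hint at the “optimal” number of labs for a study.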

    A pre-print about the “Psychological Science Accelerator” has since been posted, but I could not find any concrete calculations/statistics about the use of participants except the following general statement:

    https://psyarxiv.com/785qu/

    “First, the ability to pool resources from many institutions is a strength of the PSA, but one that comes with a great deal of responsibility. The PSA will draw on resources for each of its projects that could have been spent investigating other ideas. Our study selection process is meant to mitigate the risks of wasting valuable research resources and appropriately calibrate investment of resources to the potential of research questions. To avoid the imperfect calibration of opportunity costs, each project will have to justify its required resources, a priori, to the PSA committees and the broader community.”

  3. Ema says:

    If you have the other things, why do you need theory?

    • Andrew says:

      Ema:

      I wrote: “You also need good measurement, and some sort of good theory.”

      Without good theory, you’re unlikely to come up with interesting or useful things to learn.

      • Noah Motion says:

        I would go further and argue that the whole point of a theory is to encode what we (think we) know about the world. If so, then having no (good) theory is worse than just making it less likely that we’ll learn interesting or useful things (though it does have this effect).

        • Andrew says:

          Noah:

          I don’t think good theory is a necessary starting point. Sometimes people just happen to find unexpected things in their data. It’s happened to me! But there’s a limit to how far you can go with pure exploration and no theory. One problem is that there seems to be an overestimation of how much you can get with a randomized experiment and statistical significance.

      • Conceptual agility, temporal acuity, & imagination are also precursors to good theory.

  4. Kevin Lewis says:

    Good measurement and theory are subjective judgments, at least ex ante. And social phenomena are inherently unstable. Honesty and transparency are all we can rely on to converge on good methods and results ex post.

    I await the melee! :)

    • Andrew says:

      Kevin:

      Honesty and transparency are great but they won’t do it alone, except in the indirect sense of limiting the amount of time that is wasted on hopeless studies. If researchers are studying small or highly variable and unpredictable effects with noisy measurements, they’re not going to learn much of anything useful. But, yes, indirectly an honest and transparent approach will help, in that honesty and transparency should help people realize that these studies are not working.

      We also need to distinguish honesty and transparency from morality. For example, by all accounts Daryl Bem is a wonderful human being, but his studies of ESP are not transparent. He does not present his raw data, he only presents some of his data summaries, and his mode of reporting is all about proving a point, not about open exploration. These have been the standards in his field, and I’m not calling Bem “dishonest” by following these standards, but the result is that he’s not presenting a full account of his data and experimental procedures.

      So: (a) being a good person is not enough, and (b) honesty and transparency are not enough. But I do agree that a norm of honesty and transparency should help researchers give up, or modify their approaches, faster from various dead ends.

  5. Steve Lindsay says:

    Psychology faces both difficult challenges and easy ones. It will be difficult to make psychology a genuinely useful science, because that will require much better measures and much better theories than we currently have. That ain’t going to be easy. But it should, at least in principle, be easy to shift norms in ways that encourage replicability. Of course that in itself won’t solve the big challenges. But it makes sense to me to start with the easy problems and then push forward from there.

    • Austin says:

      Eh, well if we’re just talking about USEFUL, that’s extremely easy to achieve, since some things that psychologists study are behaviors of interest in and of themselves rather than indicators of an underlying factor. And some of those types of things have been replicated enough that we know they’re real.

      Theoretical matters are probably more difficult. I get that strong theory is a good thing, I just don’t really get how you’re supposed to make it in psychology, perhaps because I’ve seen few or no examples.

    • Anonymous says:

      “It will be difficult to make psychology a genuinely useful science, because that will require much better measures and much better theories than we currently have”

      Perhaps this depends on what you consider to be “useful”.

      To name just one thing, psychology makes lots of people lots of money, if I am not mistaken.

      Editors, peer-reviewers, scientists, and universities all play a part in what can be considered a giant academic publication scam (e.g. https://forbetterscience.com/2017/08/24/the-costs-of-knowledge-scientists-want-their-cut-on-the-scam/)

    • Anonymous says:

      “But it should, at least in principle, be easy to shift norms in way that encourage replicability. Of course that in itself won’t solve the big challenges. But it makes sense to me to start with the easy problems and then push forward from there.”

      Hmm, I read this piece here:

      https://www.psychologicalscience.org/observer/preregistration-becoming-the-norm-in-psychological-science

      and if I am understanding things correctly it states that the journal Psychological Science handed out 4 pre-registration badges in 2015, 3 in 2016, and 19 in 2017.

      This seems like a very low number to me. It is even stranger, if I understood things correctly, after reading here that pre-registration might even be considered one of the “easy” things to tackle. This all made me wonder:

      1) How many pre-registered papers did Psychological Science receive in 2015/2016/2017?

      2) Why were these possible other pre-registered papers not published?

      3) Why does Psychological Science not simply require pre-registration?

      • Steve Lindsay says:

        I have been preregistering studies in my lab for several years and if memory serves not one of them has yet been published in a journal (although I hope some soon will be). My point is that typically there is a substantial amount of time between a researcher initially adopting preregistration and that researcher publishing preregistered work in a journal. Just because a study is preregistered does not, in my view, mean that its results warrant submission, let alone publication. My first two preregistrations (the second following up on the first) both yielded results exactly the opposite of prediction and I am still trying to figure out why. So I think it is pretty impressive that Psych Science published 19 articles that reported preregistered research in 2017. I would be surprised if any other journal would hold a candle to that. True, that’s a bit less than 10% of our empirical papers in 2017, but I think the number will be substantially higher in 2018, and higher still in 2019.

        Just because a study was preregistered does not mean that the work was worth doing or informative. It is quite easy to preregister an ill-conceived study. I don’t know how many submissions in 2017 that included one or more preregistered studies were declined, but I do know that at least some were.

        If we required preregistration at this point in time, we would not get very many submissions. But I would not be surprised if preregistration does eventually become a requirement.

        Also, standards for what is accepted as “preregistration” will gradually increase.

        Finally, preregistering is easy compared to fundamental advancements in theory and measurement. But it is not trivially easy.

        Steve

        • Anonymous says:

          “Also, standards for what is accepted as “preregistration” will gradually increase.”

          Oh, this is interesting to me! Also in light of the link above (https://www.psychologicalscience.org/observer/preregistration-becoming-the-norm-in-psychological-science)

          Can you tell me more about why and how standards of pre-registration could increase?

          I hope this will not involve any “special” in-house “pre-registration experts” who will review the pre-registrations, and I hope this will not involve hiding the pre-registration from the reader of the actual paper.

          I can totally see how journals would like something like that to happen, though, as it legitimizes their role and that of peer-reviewers and can keep them in business. I am more interested in improving psychological science, and would view that scenario as a really bad idea.

          Like I wrote in the link about Registered Reports, I sincerely hope I did not work my ass off trying to improve psychological science, only for 1) certain “Open Science” people to put all the power and responsibility back in the hands of those who screwed things up, and 2) basically do the exact opposite of what they have been talking about all this time.

        • a reader says:

          Steve Lindsay: “Just because a study is preregistered does not, in my view, mean that its results warrant submission, let alone publication.”

          I was lightly daydreaming about this on the way to work a few weeks ago.

          Why shouldn’t preregistering be sufficient for publication? If some hypothesis seemed interesting and plausible enough to get funded for a study, then shouldn’t the fact that it *didn’t* pan out as expected also be interesting? If 30 people tried to replicate the power pose, don’t we want a record showing that it was only “successfully” replicated twice?

          You could argue that this might lead to too many papers. I would say we’re already there…plus we have the issue that your sample of studies has been greatly biased by the significance filter.

          • Anonymous says:

            “Why shouldn’t preregistering be sufficient for publication?”

            Aha, interesting point!

            The funny thing is that a certain Stephen Lindsay wrote a piece about pre-registration and “Registered Reports” in the link posted above, and here again now: https://www.psychologicalscience.org/observer/preregistration-becoming-the-norm-in-psychological-science

            What are “Registered Reports” you may ask: well those are studies that get pre-registered and published regardless of the results. In a way, to quote you, “preregistering is sufficient for publication” with that format.

            The main difference I think Steve Lindsay might argue here is that “peer-reviewers” and/or the editor will have been super helpful during the 1st-stage submission in that format: maybe they even completely changed the design or goal of the study, and/or will have decided whether the study “warrants publication”.

            Apparently “reviewer 2” does not exist with “Registered Reports”, there is no chance of editors and reviewers blocking certain research or otherwise manipulating things, and editors and reviewers all of a sudden do not make researchers leave out conditions and analyses and do other bad stuff!

            It’s like “Registered Reports” make all that is bad about “peer-review” and journals vanish all of a sudden! On top of that, “Registered Reports” make researchers who act as “peer-reviewers” possess super research powers concerning research design, statistical analyses, and pre-registration that they apparently do not possess as mere researchers submitting the study/proposal/paper. It’s like magic.

        • Martha (Smith) says:

          Steve Lindsay said, “standards for what is accepted as “preregistration” will gradually increase.”

          I’m not so sure about that. There is a common human tendency for “standards” to become codified quickly, which works against the tendency for initial attempts at standards to have weaknesses that need to be corrected.

        • Anonymous says:

          1) “I don’t know how many submissions in 2017 that included one or more preregistered studies were declined, but I do know that at least some were.”

          Hmm, I thought you were the editor-in-chief at Psychological Science. If so, I reason you should (want to) know how many pre-registered studies were declined in 2015/2016/2017, and/or could easily find out.

          2) “If we required preregistration at this point in time, we would not get very many submissions.”

          Ah, okay, so it seems even more likely that you are the editor-in-chief, given your use of “we” here.

          Assuming this is correct, and reasoning from that point onward:

          If you just stated that you don’t know how many declined submissions in 2017 (and, I reason, also 2015 and 2016) included one or more pre-registered studies, how exactly do you know that you would not get very many submissions if you required pre-registration?

        • Anonymous says:

          “I would be surprised if any other journal would hold a candle to that. True, that’s a bit less than 10% of our empirical papers in 2017, but I think the number will be substantially higher in 2018, and higher still in 2019.”

          Hmm, I’m not really impressed with that. But more importantly, maybe you could design “pre-registration percentage” badges for journal editors?

          I hear badges for journals are a super useful “incentive” for researchers to pre-register (why exactly is still not clear to me), so perhaps they could also work for journal editors to actually publish the pre-registered work.

          If Psychological Science stays on course this year, and I understood you correctly, that could mean you, as editor-in-chief of Psychological Science, could be eligible for a “10% of our publications used pre-registration” badge.

          You could hang it on your fridge at home, or do something else with it perhaps.

        • Anonymous says:

          “My first two preregistrations (the second following up on the first) both yielded results exactly the opposite of prediction and I am still trying to figure out why”

          Why could that be indeed.

          While you think about that, perhaps you could read the following pre-print I wrote:

          https://psyarxiv.com/5vsqw/

          I don’t know if it makes much sense, but it may contain some links to possible answers to your ponderings (should you not be aware of them) that could perhaps be helpful in your quest for answers.

          If you don’t find the answers at first, you could take a break and look in the mirror for instance. You could then read it for a 2nd time, then look in the mirror again. Etc.

          Good luck with your quest for answers!

    • Andrew says:

      Steve:

      I agree that it will be a good step to encourage honesty and replicability. Right now we see researchers completely misrepresent data and references in social media and in published work, including in publications of the Association for Psychological Science; see for example here.

      So I think an excellent start would be for professional organizations to move toward a zero-tolerance position on lying and misrepresentation of data and references. Corrections, retractions, official apologies, the whole thing. No more attempts to talk the problems away. Instead, correction and contrition.

      • Anonymous says:

        “So I think an excellent start would be for professional organizations to move toward a zero-tolerance position on lying and misrepresentation of data and references. Corrections, retractions, official apologies, the whole thing. No more attempts to talk the problems away. Instead, correction and contrition.”

        I recently started thinking that, by wanting to work with someone/something that is part of the problem and/or responsible for the problem, you may actually be giving them unnecessary influence and keeping them as an unnecessary problematic part.

        (e.g. by wanting to work with publishers “gold open access” may actually make publishers even more money and become even more powerful http://bjoern.brembs.net/2016/04/how-gold-open-access-may-make-things-worse/?utm_content=buffere157b&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer)

        1) Concerning research for instance:

        Researchers who want to perform their research with some higher standards could perhaps work together in small groups, and just mostly ignore other researchers with “flair” and “magical” results.

        I reason those who want to “do the right thing” are way more dependent on the “input” of their work, and I reason it could be smart for them to start to work together in small groups that still allow all of them to try and contribute their own ideas

        (e.g. see here for my best attempt at describing such a possible research-format http://andrewgelman.com/2017/12/17/stranger-than-fiction/#comment-628652)

        2) Concerning the link you provide with regards to the misrepresentation of references:

        The person who makes the error is the one who possibly looks like a fool, not the person being accused of something they did not say/write. The journal that published it, and did not want to publish a correction if I understood things correctly, possibly looks like a fool. Both of them should be the ones who make the effort to correct, not you. If they don’t, then that tells me, the reader, something.

        3) Concerning the APS in general:

        Any organization that hands out “APS rising star” awards (apparently to members only, if I understood things correctly) possibly makes itself look like a fool. It boggles my mind that psychological science organizations feel the need to hand out individual awards, but to then also dare to call it “rising star,” like we’re on “The Voice” or “American Idol,” is incomprehensible to me.

        (e.g. see here for my thoughts on individual awards in science: https://psyarxiv.com/pju9c/)

        In my reasoning the important thing is that there are alternatives: concerning psychological research, researchers, publications, and how one views contributions to science.

        Perhaps it’s more useful to simply ignore a lot of stuff/researchers/journals/organizations/etc. and focus on the alternatives: e.g. researchers who want to do research with some higher standards can maybe work in groups to help themselves, pre-prints can be posted by anyone, blogs can be written, discussions happen on other media, etc.

        Psychological Science does not belong to an organization.
        Psychological Science does not belong to a journal.
        Psychological Science does not belong to academia.
