Why experimental economics might well be doing better than social psychology when it comes to replication

There’s a new paper, “Evaluating replicability of laboratory experiments in economics,” by Colin Camerer, Anna Dreber, Eskil Forsell, Teck-Hua Ho, Jürgen Huber, Magnus Johannesson, Michael Kirchler, Johan Almenberg, Adam Altmejd, Taizan Chan, Emma Heikensten, Felix Holzmeister, Taisuke Imai, Siri Isaksson, Gideon Nave, Thomas Pfeiffer, Michael Razen, Hang Wu, which three different people sent to me, including one of the authors of the paper, a journalist, and also Dale Lehman, who wrote:

This particular study appears to find considerable reproducibility and I think it would be valuable for you to comment on it. I have not reviewed it myself, but I suspect it has been done reasonably—my guess is that there are some good reasons why experimental economics studies might be more readily reproducible (perhaps I should say replicable) than studies in psych, business, etc. The experiments are generally better conceived so as to have fewer intervening factors—e.g., random assignment with varying levels of financial rewards for performing some types of tasks. I believe these types of experiments differ in some fundamental ways from experiments about “power poses” or other such things. It may also be that economists are more careful about experimental setup than psychologists.

One further issue that deserves some attention is the difference between reproducing economic experimental results and replicating economic observational studies. I believe the status of the latter is likely to be very poor—and nearly impossible to investigate given how hard it is to get access to the data. The ability to reproduce results from the experimental studies casts no light on the likelihood of being able to replicate other types of economics studies.

The paper also came up here on the blog, where I wrote that I would not be surprised if experimental economics has a higher rate of replication than social psychology. I don’t know enough about the field of economics to make the comparison with any confidence, but as I said in my post on psychology replication, I feel that many social psychologists are destroying their chances by purposely creating interventions that are minor and at times literally imperceptible. Economists perhaps are more willing to do real interventions.

Another thing is that economists, compared to psychologists, seem more attuned to the challenges of generalizing from survey or lab to real-world behavior. Indeed, many times economists have challenged well-publicized findings in social psychology by arguing that people won’t behave these ways in the real world with real money at stake. So, just to start with, economists unlike psychologists seem aware of the generalization problem.

That said, I think it’s hard to interpret any of these overall percentages because it all depends on what you include in your basket of studies. Stroop will replicate, ovulation and voting won’t, and there’s a big spectrum in between.

27 thoughts on “Why experimental economics might well be doing better than social psychology when it comes to replication”

  1. Just thinking that it would be super useful if some group would attempt to replicate all the studies in an undergrad social psychology textbook. Clear out the deadwood.

  2. “We find a significant effect in the same direction as the original study for 11 replications (61%)”

    With all the other factors involved (as Andrew mentioned), I’d say 11/18 is about the same as what that psych replication project saw (35/97). Anyway, statistical significance in the same direction is a really crappy metric of “reproduced,” and I can’t think of any way it wouldn’t bias this number to the high side. There is no way 40% of results not being statistically significant in the same direction is a good thing. Even for sufficiently powered studies, I would expect about 50% same-direction results from pure noise, in the absence of any bias on the part of the original researchers.
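    To make those base rates concrete, here is a minimal simulation sketch (my own invented setup and sample sizes, not from the paper): if every original finding were pure noise that squeaked past a significance filter, replications would match the original direction about half the time, and would be significant in the same direction only about 2.5% of the time.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_sims, n = 100_000, 50   # hypothetical: 100k original studies, N = 50 each
    se = 1 / np.sqrt(n)

    # Worst case: every original effect is pure noise, but only "significant"
    # originals get published (two-sided test at the 5% level).
    orig = rng.normal(0.0, se, n_sims)
    published = np.abs(orig) > 1.96 * se

    # One independent replication of each published study.
    rep = rng.normal(0.0, se, published.sum())
    same_sign = np.sign(rep) == np.sign(orig[published])
    sig_same_dir = same_sign & (np.abs(rep) > 1.96 * se)

    print(same_sign.mean())     # ~0.50: same direction is a coin flip under noise
    print(sig_same_dir.mean())  # ~0.025: significant and same direction by luck
    ```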

    • UGH significant effects in the same direction? I’m sure Keith will have something to say here, but at least make SOME effort to combine information from multiple studies using a consistent quantitative approach. And that means some kind of approximate Bayes.

      Just to illustrate how stupid the “number of studies with significant results in the same direction” metric is… Suppose we have an initial study claiming an effect of size 1 on some scale, with statistical significance for this effect.

      Now, suppose we have 6 replications of this same basic study, with N = 7, 9, 15, 11, 22, 20755.

      Now suppose the first 5 all find statistically significant results in the positive direction… and the last study has effect size -0.04 ± 0.003. The count-the-significant-studies metric calls that 5/6 successful replications, even though the one study with any real precision says the effect is essentially zero, if anything slightly negative.
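      Here is a minimal sketch of what a consistent pooling would say (the five positive effects and their standard errors are invented to be consistent with “significant in the positive direction”; inverse-variance weighting stands in for the approximate-Bayes step, since it is the posterior mean under a flat prior with normal likelihoods):

      ```python
      import numpy as np

      # Invented estimates for the five small "significant positive" studies,
      # plus the -0.04 +/- 0.003 result from the N = 20755 study above.
      effects = np.array([1.0, 0.9, 1.1, 0.8, 1.0, -0.04])
      ses     = np.array([0.45, 0.40, 0.35, 0.38, 0.30, 0.003])

      # Fixed-effect (inverse-variance) pooling.
      w = 1 / ses**2
      pooled = np.sum(w * effects) / np.sum(w)
      pooled_se = np.sqrt(1 / np.sum(w))
      print(f"{pooled:.4f} +/- {pooled_se:.4f}")  # ~ -0.0397 +/- 0.0030
      ```

      The precise sixth study dominates: the pooled effect is essentially -0.04, the opposite sign of the original claim.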

  3. Anon:

    Oooh, I hate this sort of thing (that quote from Camerer about B+ and A-). First, it’s kinda horrible because it suggests that psychology deserves a high passing grade, a statement which I’ll only believe when Psych Science and PPNAS stop regularly publishing junk. Second, I really don’t like the practice of taking quantitative information and replacing it with meaningless qualitative labels like “B+”—as if there’s any good reason why this is B+ and not A- or B- or whatever. That kind of quote just muddies the waters. Why not just say that X percent of studies replicated, rather than garbling things with a meaningless “grade”? Ugh.

  4. Maybe the argument above about studies in econ vs. psych is reasonable (though there is no data…), but a statistical comparison with econ based on 18 studies is no comparison at all. And this is in Science?!

    • Sorry, I lack the time to read the study, so this is just a comment on the claim that economics is better than psychology.

      I trained as a psychologist. I totally fail to see the difference between a laboratory study in economics and a laboratory study in psychology. One is studying behaviour in a laboratory. While I might change my mind, I currently view “behavioural economics” as an interesting branch of psychology originating, I believe, out of the cognitive psychology field, perhaps with a soupçon of social psychology added.

      To my mind, the issue is not really the replicability of studies in psychology (any more than the problem of replicating clinical drug trial results) but rather the replicability of some types of studies in social psychology where, as Andrew says, “social psychologists are destroying their chances by purposely creating interventions that are minor and at times literally imperceptible.” We will ignore the issue that some social psychologists seem to be studying “tooth fairy problems”. See http://skepdic.com/toothfairyscience.html.

      If you study a phenomenon with strong and robust effects, then you get consistent, replicable results.

  5. >That said, I think it’s hard to interpret any of these overall percentages because it all depends on what you include in your basket of studies. Stroop will replicate, ovulation and voting won’t, and there’s a big spectrum in between.

    A fair point–this replication study only picked experiments from top journals. But of course those are the most valuable ones to check, as such studies are more likely to be cited in academic work and in the popular press.

  6. Could you please stop equating “social psychology” with “psychology”? The two are not the same. One is a subfield of the other, with its own culture, values, traditions, theories, methods, criteria for what counts as an advance, etc. I am a psychologist who studies *other things*, and think that social has been floundering for decades.

    • Mark:

      Fair enough. I added “social” to the title of the post. (It was already in the main text.)

      Indeed, one of my complaints with a lot of this Ted-talk-style evolutionary psychology, and social psychology, and behavioral economics, and goofy political science (I’m not quite sure what to call the subfield of poli sci that deals with elections being determined by subliminal smiley faces and football games and shark attacks and hormones) is that it seems like a step backward in time, to the 1940s, back before the cognitive revolution in psychology.

      I mean, sure, people are often irrational, I get that—but developmental and cognitive psychologists have for decades been learning all sorts of things about how we think. And all this power-pose, himmicanes crap just tosses all that understanding aside, in favor of a crude behaviorist black-box model of human decision making. Which then gets pimped in PPNAS, Science, Nature, Gladwell, Freakonomics, Ted, NPR, etc. Makes me wanna scream.

      • “And all this power-pose, himmicanes crap just tosses all that understanding aside, in favor of a crude behaviorist black-box model of human decision making.”

        I find this statement incredibly perplexing. Are you claiming that all models which don’t include cognitive biases are unrealistic? It’s a pretty serious critique of social science since the overwhelming majority of models currently in use black-box human decision making. Do you not believe in supply and demand models either?

        • David:

          I’m not talking about models that don’t include cognitive biases, I’m talking about models that don’t include cognition. I’m talking about models that ignore insights in psychology from the 1950s.

          Regarding your last two sentences, could you give an example or two of the sorts of models you’re thinking of? I’m not quite sure which supply and demand models you’re talking about. The economic models I have in mind don’t seem to me to be “black box” in the way I’m thinking. But maybe you’re thinking of some other class of models.

          I do have problems with some social science models that just don’t describe reality in some important ways (for example, I’ve criticized, and will continue to criticize, the curving-utility-function model of risk aversion; see my 1998 paper and many posts on this blog), but in that particular case I’m not criticizing the model as being “a crude behaviorist black-box model of human decision making”; I’m criticizing it for being oversimplified and misleading, which is a bit different.

        • I mean the ordinary supply and demand theory they present in every introductory economics textbook. Any theory which uses unboundedly rational actors can be said to be behaviorist because it ignores human cognition. Many bounded rationality theories are behaviorist too if they state the theory in terms of what information is given to the individual rather than what information is perceived, processed, or retained by the individual. When you say we can’t use behaviorist black-box models, you’re arguing not just against the learning chapter of every introductory psychology textbook, but also against most of the introductory economics textbook as well. Theories should incorporate human cognition if that makes sense for the given situation. If a theory can be stated while ignoring human cognition, so much the better.

        • David:

          No, I would not characterize ordinary supply and demand models as “crude behaviorist black-box model of human decision making.” These models are based on ideas such as, when the price of something goes down, people are typically willing to buy more of it. This makes sense because people are trading off different uses for their money. It’s rational decision making and often involves cognition, for example when a consumer is deciding how to spend his or her money.

          Power pose, himmicanes, shark attacks, smiley faces . . . that stuff is completely different in that it is all based on the idea that people are not thinking things through.

          Now, don’t get me wrong, I accept the idea that we are swayed by irrational impulses. My problem with power pose, himmicanes, shark attacks, smiley faces, etc., is the supposition of these theories that we can be swayed in large and predictable ways by irrelevant stimuli. If these effects were real, then, sure, we’d want to know about it. But actually there’s no good evidence that these effects are real, and also I think the belief in these effects is a kind of pre-cognitive thinking, almost like a belief in witchcraft.

          Finally, regarding your point that supply-demand models can work while ignoring human cognition: Sure, they’re just formulas, and if they work, they work. But underlying these formulas is the idea that they are consistent with human cognition.

    • I am a psychologist who studies *other things*, and think that social has been floundering for decades.

      I think you are being too kind; I’d go for more than decades.

      I can think of some very interesting and practical social psych work in recent decades (Altemeyer’s work on authoritarianism comes to mind, and I have seen some very promising new work on conspiracy thinking), but I have had the impression that social psych has been mainly studying irrelevant issues with poor techniques.

      Or, in some cases, social psych is studying in areas of great interest to others but the blasted questions they are asking are of no use in the real world!

    • No, I disagree with Mark Seidenberg on this; I am assuming he is the same person whose work I greatly admire.

      In psycholinguistics, which can be seen as a sub-class of cognitive psychology and is a field the Mark I am thinking of has published in, a vast number of the publications, especially in top journals, are shamelessly p-value hacked. And reviewers and editors let these papers through the review process.

      There are effects that clearly replicate in cognitive psychology, but there are vast tracts of even this field where we are looking at a giant p-value hack-fest. I would therefore include psychology and linguistics and psycholinguistics in Andrew’s sweeping critique. There are probably solid results out there, but nobody knows what proportion of them are solid because (a) nobody tries to replicate them, and (b) almost nobody releases the raw data so that others can check their claims. You can already see that there’s monkey business going on by checking what proportion of scientists publish against their own previous findings. The proportion is near zero. Everybody is finding evidence consistent with their own theory, and nobody is ever wrong. That’s a red flag because it’s a logical and statistical impossibility.
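      A back-of-the-envelope version of that impossibility argument (the 90% power and the 30 studies are my hypothetical numbers):

      ```python
      # Even if a researcher's theory were true and every study ran at 90%
      # power, the chance that all of, say, 30 published results confirm
      # the theory is:
      print(0.9 ** 30)  # ~0.042
      # So a field in which nobody ever publishes against their own theory
      # is showing selection or p-hacking, not a run of universally
      # correct theories.
      ```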

  7. “and nearly impossible to investigate given how hard it is to get access to the data.”

    This stood out. Lots of the data economists use is publicly available for free on government websites. Lots of the rest is licensed to many research institutions. Many journals have started publishing data and code with articles. Some data is private or proprietary, but by and large the data is out there.

  8. “Another thing is that economists, compared to psychologists, seem more attuned to the challenges of generalizing from survey or lab to real-world behavior”

    Isn’t this because economists were not able to make generalizations in the first place, given theoretical limitations (e.g., rational choice theory)? If anything, psychological insights added to the generalizability of economic studies rather than the other way round. Further, I’m not aware of an abundance of soc psych studies using fake money or imagined donations as a substitute for real money; this is a well-known problem.

  9. Hi everyone,

    The Discussion section of the paper (I am one of many authors) includes a comparison with the RPP results. But it would be premature to draw strong conclusions about disciplinary differences given our sample size of 18 replications and methodological factors that could potentially explain why the replication rates differed (differing exclusion criteria, etc.). We are doing some more general empirical econ replication work with larger sample sizes than we have previously had – will email Andrew when we have results.

    Anna :)
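    To put that caution in numbers, here is a quick sketch (my own, using the 11/18 and 35/97 figures mentioned upthread) comparing the two replication rates with Wilson score intervals:

    ```python
    import math

    def wilson_ci(k, n, z=1.96):
        """95% Wilson score interval for a binomial proportion k/n."""
        p = k / n
        denom = 1 + z**2 / n
        center = (p + z**2 / (2 * n)) / denom
        half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
        return center - half, center + half

    print(wilson_ci(11, 18))  # econ replications: roughly (0.39, 0.80)
    print(wilson_ci(35, 97))  # RPP psychology:    roughly (0.27, 0.46)
    ```

    The intervals overlap, which is one way of seeing why 18 replications cannot settle a cross-disciplinary comparison.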

  10. I think a lot of it is because economists have their work scrutinized more often because their work can make people a ton of money. If the work fails or doesn’t replicate, then that means more people lose more money and get angry at the economists.

    • Speculation on my part: Some of it is also the culture of the discipline, I think. (Ironic, because one of the knocks on economists is that a lot of work in the field is very bad at accounting for culture.) My sense from talking to economists is that journal article referees are focused almost exclusively on looking for any reason they can find to criticize paper submissions, valid or not, and that these criticisms are often harsher than you see in other social sciences. Although I’m glad I don’t work in a culture like that, I admit that it might also weed out some low-quality papers that would get published in other disciplines (but it probably also weeds out some papers that should have been published, if I had to guess).
