
The Lab Where It Happens

“Study 1 was planned in 2007, but it was conducted in the Spring of 2008 shortly after the first author was asked to take a 15-month leave-of-absence to be the Executive Director for USDA’s Center for Nutrition Policy and Promotion in Washington DC. . . . The manuscript describing this pair of studies did not end up being drafted until about three years after the data for Study 1 had been collected. At this point, the lab manager, post-doctoral student, and research assistants involved in the data collection for this study had moved away. The portion of the data file that we used for the study had the name of each student and the location where their data was collected but not their age. Four factors led us to wrongly assume that the students in Study 1 must have been elementary school students . . .

The conclusions of both studies and the conclusions of the paper remain strong after correcting for these errors.”

— Brian Wansink, David R. Just, Collin R. Payne, Matthew Z. Klinger, Preventive Medicine (2018).

This reminds me of a song . . .

Ah, Mister Editor
Mister Prof, sir
Did’ya hear the news about good old Professor Stapel
You know Lacour Street
They renamed it after him, the Stapel legacy is secure
And all he had to do was lie
That’s a lot less work
We oughta give it a try
Now how’re you gonna get your experiment through
I guess I’m gonna fin’ly have to listen to you
Measure less, claim more
Do whatever it takes to get my manuscript on the floor
Now, Reviewers 1 and 2 are merciless
Well, hate the data, love the finding
Food and Brand
I’m sorry Prof, I’ve gotta go
Decisions are happening over dinner
Two professors and an immigrant walk into a restaurant
Diametric’ly opposed, foes
They emerge with a compromise, having opened doors that were
Previously closed
The immigrant emerges with unprecedented citation power
A system he can shape however he wants
The professors emerge in the university
And here’s the pièce de résistance
No one else was in
The lab where it happened
The lab where it happened
The lab where it happened
No one else was in
The lab where it happened
The lab where it happened
The lab where it happened
No one really knows how the game is played
The art of the trade
How the sausage gets made
We just assume that it happens
But no one else is in
The lab where it happens
Prof claims
He was in Washington offices one day
In distress ‘n disarray
The Uni claims
His students said
I’ve nowhere else to turn
And basic’ly begged me to join the fray
Student claims
I approached the P.I. and said
I know you have the data, but I’ll tell you what to say
Professor claims
Well, I arranged the meeting
I arranged the menu, the venue, the seating
No one else was in
The lab where it happened
The lab where it happened
The lab where it happened
No one else was in
The lab where it happened
The lab where it happened
The lab where it happened
No one really knows how the
Journals get to yes
The pieces that are sacrificed in
Ev’ry game of chess
We just assume that it happens
But no one else is in
The room where it happens
Scientists are grappling with the fact that not ev’ry issue can be settled by committee
Journal is fighting over where to put the retraction
It isn’t pretty
Then pizza-man approaches with a dinner and invite
And postdoc responds with well-trained insight
Maybe we can solve one problem with another and win a victory for the researchers, in other words
Oh ho
A quid pro quo
I suppose
Wouldn’t you like to work a little closer to home
Actually, I would
Well, I propose the lunchroom
And you’ll provide him his grants
Well, we’ll see how it goes
Let’s go
One else was in
The lab where it happened
The lab where it happened
The lab where it happened
No one else was in
The lab where it happened
The lab where it happened
The lab where it happened
My data
In data we trust
But we’ll never really know what got discussed
Click-boom then it happened
And no one else was in the room where it happened
Professor of nutrition
What did they say to you to get you to sell your theory down the river
Professor of nutrition
Did the editor know about the dinner
Was there citation index pressure to deliver
All the coauthors
Or did you know, even then, it doesn’t matter
Who ate the carrots
‘Cause we’ll have the journals
We’re in the same spot
You got more than you gave
And I wanted what I got
When you got skin in the game, you stay in the game
But you don’t get a win unless you play in the game
Oh, you get love for it, you get hate for it
You get nothing if you
Wait for it, wait for it, wait
God help and forgive me
I wanna build
Something that’s gonna
Outlive me
What do you want, Prof
What do you want, Prof
If you stand for nothing
Prof, then what do you fall for
Wanna be in
The lab where it happens
The lab where it happens
Wanna be in
The lab where it happens
The lab where it happens
Wanna be
In the lab where it happens
I wanna be in the lab
I wanna be in
The lab where it happens
The lab where it happens
The lab where it happens
I wanna be in the lab
Where it happens
The lab where it happens
The lab where it happens
The art of the compromise
Hold your nose and close your eyes
We want our leaders to save the day
But we don’t get a say in what they trade away
We dream of a brand new start
But we dream in the dark for the most part
Dark as a scab where it happens
I’ve got to be in
The lab (where it happens)
I’ve got to be (the lab where it happens)
I’ve got to be (the lab where it happens)
Oh, I’ve got to be in
The lab where it happens
I’ve got to be, I’ve gotta be, I’ve gotta be
In the lab
Click bab

(Apologies to Lin-Manuel Miranda. Any resemblance to persons living or dead is entirely coincidental.)

P.S. Yes, these stories are funny—the missing carrots and all the rest—but they’re also just so sad, to think that this is what our scientific establishment has come to. I take no joy from these events. We laugh because, after a while, we get tired of screaming.

I just wish Veronica Geng were still around to write about these hilarious/horrible stories. I just can’t do them justice.


  1. Dzhaughn says:

    See what Hamiltonian Monte Carlo leads to.

  2. Jordan Anaya says:

    When it comes to the social sciences, did we ever really take them seriously?

    It is annoying though when they take their flimsy work and sell it to the public.

    The thing that worries me is when I see shades of the problems psychology is facing in other sciences. Psychology is a “soft science” that was allowed to run rampant with our flawed publication and university system. You don’t really need data or theories that can be tested in psychology, so it was allowed to be overrun with politicians. It was like letting kids run loose in the candy store.

    In other sciences it’s a lot harder to imagine someone publishing nonsense for two decades while staying at the top of their field, and yet researchers who publish a lot, oversell their results, hide their data, provide incomplete descriptions of methods, and ignore criticism still have a huge advantage.

    • Anoneuoid says:

      In other sciences it’s a lot harder to imagine someone publishing nonsense for two decades while staying at the top of their field, and yet researchers who publish a lot, oversell their results, hide their data, provide incomplete descriptions of methods, and ignore criticism still have a huge advantage.

      You have this reversed. It is actually harder to get away with this in psych because outsiders have some basic idea of what they are measuring (number of carrots eaten, etc). Once you start measuring things like “relative amount of dye that got pulled toward the positive electrode in this purified/processed sample”, pretty much anything about what affects rodent behaviour, the number of events counted by 20 billion dollars’ worth of magnets and fancy computers, etc., then only insiders have any sense of what is ridiculous or not.

      This is being borne out by the “reproducibility rates” (as far as getting statistical significance twice in the same direction assesses this…) of biomed vs psych. We are finding that, at ~40%, psych is not that bad relatively speaking. I doubt 40% of what gets published in biomed could be reproduced in principle.

      • Keith O'Rourke says:

        On the other hand, I do think the senior members in that community are far less tolerant of errors that get published (which are always evidence of more errors) when they find out that what led to them should have been avoided. Like: the age was missing from the data set, no one was still around who knew much about the data entry (was it checked for errors or even fully completed?), and we did our best to guess the age group but then wrote the paper up as if that were known…

      • Jordan Anaya says:

        On the one hand, it’s harder to quickly look at a biology paper and see numbers don’t add up. On the other hand, biology papers typically include much more information than psychology papers (which often just have a few tables of summary statistics).

        For example, this person made Nature’s “Top 10” list simply by taking nucleotide sequences, BLASTing them, and seeing they weren’t what they were supposed to be.

        In biology there’s a tremendous amount of competition to be at the top of your field and clear your path for that Nobel Prize. Although outsiders may not be able to easily check the results of highly cited papers, competitors of that lab have a lot of incentive to find problems with it.

        In contrast, it seems Wansink (and until recently most psychologists) faced little to no pushback from peers, despite having dozens of papers with easy-to-spot problems. Wansink won the Ig Nobel prize for a study which contains fundamental problems.

        I can’t think of a famous biologist that has published questionable to fabricated data for decades without anyone noticing.

        However, I do agree a lot of what gets published in biology is only published for the sake of getting a paper, not because the experiments or data have any meaning.

    • jrkrideau says:

      When it comes to the social sciences, did we ever really take them seriously?

      Perhaps the researchers don’t take them seriously but others do. I would refer you to the Carmen Reinhart and Kenneth Rogoff fiasco for perhaps the most serious ill-effect from social sciences in the last 50 years if not longer.

      Researchers in the social and behavioural sciences often do not understand the effects that their research can have on policy decisions. While it may look like policies are developed by jotting down notes on a napkin after the third martini or while hitting little white balls with clubs, policy recommendations are usually formed based on the best data and information available. This often comes from the social and behavioural sciences. Theory and tools developed in the social and behavioural sciences are used in program delivery.

      People who are developing policy recommendations use these results and tools all the time. These can affect billion dollar funding decisions, determine who does and does not remain incarcerated, what kinds of treatments some sick people receive, and so on.

      We think the Wansink papers are silly but it is quite conceivable that school boards, worried about childhood obesity, might start pasting stickers on all sorts of things to encourage healthy eating, probably wasting valuable staff time and taxpayer money when the resources could be better spent elsewhere.

      So other people take them seriously.

  3. Thanatos Savehn says:

    Ah, now they’re unweighed non-standardized matchstick carrots approximated in size as an approximate fraction of another unweighed, non-standardized carrot. And we learn that “elementary students ate twice the percentage of their carrots if attractively named as ‘X-ray Vision Carrots,’ than if un-named or generically named as the ‘Food of the Day.'”

    I assume the core idea is either that carrots contain a dose of something that’s good for kids, or that they’re harmless but will fill them up and so decrease the dose of bad things they’d otherwise eat, and that we want to find ways to make kids ingest a meaningfully larger/smaller dose. (BTW, isn’t dose usually measured in mg or ml rather than percentages of percentages?) So here, even if the conclusions are still strong and robust, the average doses and the difference in doses (not to mention variation between doses) between the two treatments cannot be estimated from these data. What then is the point of the exercise? Either way, I sure hope someone is looking for the 452 children who went missing between the original publication and this correction.

  4. Maybe it’s because I just finished teaching and don’t yet have the energy for real work, but I can’t help looking further at this train wreck of a paper. If I’m reading Figure 1 of the correction properly (I’ve copied it here), naming carrots as “X-Ray vision carrots” compared to the “Unnamed control” has a “significant” effect on the number of carrots kids eat based on the (inane) metric of p-values. In fact, p = 0.000 (!).

    However, the naming does *not* have a “significant” effect on the number of carrots left uneaten (p= 0.378), nor the total number of carrots the kids put on their plate in the first place (nearly the same in both cases, p = 0.749). Note that the total number of carrots is the sum of the eaten and uneaten carrots.

    So, in the language of the paper, attractive vegetable names increase the number of carrots kids eat, but don’t decrease the number they don’t eat, though the number of carrots they take remains the same. This is profound. I’m not sure, but I think the authors have discovered a dietary manifestation of the Banach–Tarski paradox.

    • Jonathan (another one) says:

      Or maybe it’s the zen koan: How do you digest an uneaten carrot?

    • Jordan Anaya says:

      Interestingly, in the original paper they just did an ANOVA, but in the correction they did a Wald Chi-Square test. In addition, in the original the smallest p-value is also for “number eaten”.

      I assume they just use whichever test gives “number eaten” the smallest p-value, without realizing the 3 values are fundamentally related to each other and the p-values should probably also be related to each other.

      • Keith O'Rourke says:

        > p-values should also probably be related to each other.

        Possibly not in an obvious manner, at least not completely on their own.

        The relations are between the means and variances (these are all that matter in ANOVA), but anticipating how that determines a necessary pattern in p-values is not obvious to me. That is, could one tell just from the three p-values that there surely is a problem?
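That question can be poked at with a quick simulation using entirely made-up numbers (nothing below comes from the paper): make “eaten” differ sharply between two groups, give “uneaten” no effect but a much larger variance, and define “total” as their sum, mimicking the deterministic link among the three measures. The sketch suggests the paper’s qualitative pattern (tiny p for eaten, unremarkable p for uneaten and total) is at least arithmetically attainable, so the three p-values on their own need not reveal an inconsistency:

```python
# Illustrative simulation only: made-up group sizes and effect sizes,
# not the study's actual data.
import math
import random

def two_sided_p(a, b):
    """Approximate two-sided p-value for a difference in means
    (Welch-style standard error, normal tail; fine for n ~ 100)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = abs(ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(z / math.sqrt(2))

random.seed(1)
n = 100
# "Eaten" has a large between-group difference with small noise.
eaten_a = [random.gauss(5.0, 0.5) for _ in range(n)]
eaten_b = [random.gauss(7.0, 0.5) for _ in range(n)]
# "Uneaten" has no true effect but much larger variance.
uneaten_a = [random.gauss(20.0, 12.0) for _ in range(n)]
uneaten_b = [random.gauss(20.0, 12.0) for _ in range(n)]
# "Total" is deterministically eaten + uneaten, as in the carrot study.
total_a = [e + u for e, u in zip(eaten_a, uneaten_a)]
total_b = [e + u for e, u in zip(eaten_b, uneaten_b)]

p_eaten = two_sided_p(eaten_a, eaten_b)
p_uneaten = two_sided_p(uneaten_a, uneaten_b)
p_total = two_sided_p(total_a, total_b)
print("p(eaten):", p_eaten)      # essentially zero
print("p(uneaten):", p_uneaten)  # unremarkable
print("p(total):", p_total)      # unremarkable
```

Because the noisy component dominates the variance of the sum, the strong “eaten” effect is drowned out in both “uneaten” and “total”, so the three p-values alone cannot certify a problem; whether such a variance structure is plausible for carrots on a lunch tray is a separate question.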

  5. Guive says:

    Andrew should write a musical about the replication crisis

  6. Keith O'Rourke says:

    Recall a situation that might explain some of what is going on here with senior academics seeming to side so much with other academics who are being mercilessly nitpicked over some mistakes in details.

    When I was at Duke in 2008, having misunderstood Mike West’s email about how important it was that someone from the stats department show up at the meeting of the inter-university Deans’ panel studying completion rates of PhD students (he actually thought it would be a waste of time), I went to the meeting.

    They were presenting the data on completion rates submitted by the various universities involved. One of the Deans had a research assistant with some spare time analyse the data. Probably having had only a couple of statistics courses, the assistant could think of no way to analyse the data other than to set aside all the students except those who had completed or dropped out of their program.

    I mentioned there were better methods of analysis that would allow one to include all the students. I was thanked for providing that information in a matter of fact way. Then they started to discuss what they had learned about completion rates of African-American students and why that was so different.

    I raised concerns about confounding with university, program, etc., and especially the small numbers: something like only 2 or 3 universities with fewer than 10 students at each. One of the Deans got quite annoyed and loudly blurted out that statisticians complain about how you do things no matter how you do them. After that comment, they continued on with the insights, highly enlightening to them, that they had discerned in their data. As I left, I could almost hear a sigh of relief :-(

    Now these were all Deans from seemingly highly reputable universities, so I was a bit surprised. But by the time they became Deans, these academics were likely so stuffed with prestige that not only did they sneeze marble, they could not see or imagine any of their blind sides.

    • Andrew says:


      I’m reminded of the indignation of some senior researchers that their work is being “targeted” for replication. I’d think that if a group of scientists feels they’ve made an important discovery, they’d be thrilled for outsiders to replicate their work.

      The proponents of work that has failed to replicate or that has been shown to have serious statistical flaws seem to be holding two contradictory views in their head: First, they’re sure the work in question is correct, important, consistent with theory, pathbreaking, etc.; second, they have a suspicion that the work is fragile, maybe completely wrong, hence the fear of any stress testing whatsoever. I don’t take the cynical view here that “it’s all about the money” or the prestige or whatever; it’s my impression that the proponents of these flawed research efforts really hold both views at once, in some way that I don’t fully understand. Part of it, I think, is an identification of research success with various measures of process: not accurate measurements, clean statistics, and replicability, but publication in peer-reviewed journals, personal connections with trusted colleagues, and media exposure.

      • Keith O'Rourke says:

        Agree, for some reason there is a lack of perception of the harms/benefits of not getting it/getting it.

        Perhaps someone who has insight into how caste societies work could explain it: group/cartel thinking blocks perception, or whatever.

        Until more do perceive – it’s going to continue to be a slog.

  7. Dale Lehman says:

    From my experience, most academics do not suffer from small egos. At the same time, they do have fragile egos – and power (at least over their students). The combination makes them detest criticism, except for the kind that they can answer, thereby “winning” the debate. This is certainly an over-generalization and there are many exceptions. But I think the fear of nonreplication is widespread and tied to these human characteristics. Now, if they would perfect their power poses, they might be more resilient…
