Xian points me to this pitiful story.
I hate that these people never just say they’re sorry, for wasting everyone’s time if for nothing else.
Continuing my desire to while away a beautiful afternoon reading interesting stuff instead of working on what I should be working on, I followed the link and read Fujii’s lame letter.
Looking for the Carlisle paper, I first found the editorial in Anaesthesia by J.J. Pandit, which I found to be a good explanation of the issues for a scientific (but non-statistician) audience. It has a great paragraph in it:
“Those wishing to invent data have a hard task. They must ensure that all the data satisfy several layers of statistical cross-examination. Haldane referred to these as the ‘orders of faking’. In his words, ‘first-order faking’ is to ensure simply that the mean values match what is expected. For his ‘second-order faking’, things become more difficult since the variances of these means must also be within those expected, and further consistent with several possibly inter-related variables. His ‘third-order faking’ is extremely difficult because the results must also match several established laws of nature or mathematics, described by patterns like central limit theorem, the Hardy-Weinberg Law, the law of conservation of energy or mass, and so on. It is therefore always so much easier actually to do the experiment than to invent its results.”
I’d like to think Pandit is right and it’s easier to do the work, but I doubt it. It’s particularly hard to do the experiment if the phenomenon you are looking for doesn’t actually exist. Furthermore, Pandit points out a bit later that Carlisle was able to catch this guy because he had published SO MANY papers, and therefore there was “a rich source of data for us to analyse”.
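To make the ‘second-order’ idea concrete, here is a minimal sketch in Python. It is not Carlisle’s actual analysis, just a toy under assumptions I made up: each "trial" has two randomized groups of 30 with a baseline variable drawn from the same normal distribution, and the hypothetical faker nudges one group so the two baseline means nearly agree. With honest randomization the standardized differences in means should have a variance of about 1 across trials; means that agree too well give a variance far below that, and you only see it once you have a lot of trials to pool.

```python
import random
import statistics


def z_scores(trials):
    """Standardized difference in baseline means for each (group_a, group_b) pair."""
    zs = []
    for a, b in trials:
        se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
        zs.append((statistics.mean(a) - statistics.mean(b)) / se)
    return zs


def honest_trial(rng, n=30, mu=60.0, sd=10.0):
    """Two groups drawn independently from the same population (assumed normal)."""
    a = [rng.gauss(mu, sd) for _ in range(n)]
    b = [rng.gauss(mu, sd) for _ in range(n)]
    return a, b


def tidy_trial(rng, n=30, mu=60.0, sd=10.0):
    """Hypothetical fake: shift group b so its mean almost matches group a."""
    a, b = honest_trial(rng, n, mu, sd)
    shift = statistics.mean(a) - statistics.mean(b)
    b = [x + 0.9 * shift for x in b]  # removes 90% of the chance difference
    return a, b


rng = random.Random(0)  # fixed seed so the run is repeatable
honest = z_scores([honest_trial(rng) for _ in range(200)])
tidy = z_scores([tidy_trial(rng) for _ in range(200)])

var_honest = statistics.variance(honest)
var_tidy = statistics.variance(tidy)
print(f"variance of z, honest trials: {var_honest:.2f}")
print(f"variance of z, tidy trials:   {var_tidy:.3f}")
```

The means in the tidy trials look perfectly plausible one at a time, which is the point: the giveaway is only in how little they vary across many trials.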
Wasting time? That and killing people… this is medical research.
My concern is that, if you are genuinely malicious and a little smarter about things, you can just simulate your data from the structure that you hope would underlie it. Inject an appropriate amount of noise and bingo, perfect data, even perfect imperfections. If done even somewhat competently, that would be extremely hard to detect. Maybe there should be an ethics test before we teach MATLAB, R, Python, etc.
As mentioned before, there are people who know how to make a retraction and take responsibility for a mistake.
I particularly like the lame sentences: “I am not qualified to counter specific allegations concerning the ‘central limit theorem’ and its applicability in our case. As I said, our data sample is very special”. Just the plain truth…
Reading that, I couldn’t help but think of you!
You should point the Centers for Disease Control and Prevention to your zombie paper.
Isn’t it cute how he puts “central limit theorem” in quotes as if he’s never come across it before…