Marcel van Assen has a story to share:
In 2011 a rather amazing article was published in Science where the authors claim that “We found that merely sniffing negative-emotion-related odorless tears obtained from women donors induced reductions in sexual appeal attributed by men to pictures of women’s faces.”
The article is this:
Gelstein, S., Yeshurun, Y., Rozenkrantz, L., Shushan, S., Frumin, I., Roth, Y., & Sobel, N. (2011). Human tears contain a chemosignal. Science, 331(6014), 226-230.
Ad Vingerhoets, an expert on crying, and a coworker Asmir Gračanin were amazed by this result and decided to replicate the study in several ways (my role in this paper was minor, i.e. doing and reporting some statistical analyses when the paper was already largely written). This resulted in:
Gračanin, A., van Assen, M. A., Omrčen, V., Koraj, I., & Vingerhoets, A. J. (2016). Chemosignalling effects of human tears revisited: Does exposure to female tears decrease males’ perception of female sexual attractiveness? Cognition and Emotion, 1-12.
The paper failed to replicate the findings in the original study.
Original findings that fail to replicate are nothing special; unfortunately, they are core business. What IS striking, however, is Sobel’s response to the article by Gračanin et al. (2016). See …
Sobel, N. (2016). Revisiting the revisit: added evidence for a social chemosignal in human emotional tears. Cognition and Emotion, 1-7.
Sobel re-analyzes the data of Gračanin et al., and after extensive fishing (with p-values just below .05) he concludes that the original study was right and the Gračanin et al. study was bad. Irrespective of whether chemosignalling actually exists, Sobel’s response is imo a beautiful and honest defense, where p-hacking is explicitly acknowledged and its consequences not understood.
We also wrote a short response to Sobel’s comment, addressing his p-hacking.
Gračanin, A., Vingerhoets, A. J., & van Assen, M. A. (2016). Response to comment on “Chemosignalling effects of human tears revisited: Does exposure to female tears decrease males’ perception of female sexual attractiveness?” Cognition and Emotion, 1-2.
To save time, if you’re interested, I recommend reading Sobel (2016) first.
I asked van Assen why he characterized Sobel’s horrible bit of p-hacking as “a beautiful and honest defense,” and he [van Assen] responded:
I think it is beautiful (in the sense that I like it) because it is honest. I also think it is a beautiful and excellent example of how one should NOT react to a failed replication, and of NOT understanding how p-hacking works.
This is about emotions; although I was involved in this project, I ENJOYED the comment of Sobel because of its tone and content, even though I did not agree with its content at all.
Our response to Sobel’s comment supports the claim that Sobel has been p-hacking. Vingerhoets asked BEFORE the replication whether it mattered that Tilburg had no lab, and Sobel said ‘no’; AFTERWARDS, once the replication had failed, he decided that it IS a problem.
None of this is new, of course. By this time we should not be surprised that Science publishes a paper with no real scientific content. As we’ve discussed many times, newsworthiness rather than correctness is the key desideratum in publication in these so-called tabloid journals. The reviewers just assume the claims in submitted papers are correct and then move on to the more important (to them) problem of deciding whether the story is big and important enough for their major journal.
I agree with van Assen that this particular case is notable in that the author of the original study flat-out admits to p-hacking and still doesn’t care.
Gračanin et al. tell it well in their response:
Generally, a causal theory should state that “under conditions X, it holds that if A then B”. Relevant to our discussion in particular and evaluating results of replications in general are conditions X, which are called scope conditions. Suppose an original study concludes that “if A then B”, but fails to specify conditions X, while the hypothesis was tested under condition XO. The replication study subsequently tested under condition XR and concludes that “if A then B” does NOT hold. Leaving aside statistical errors, two different conclusions can be drawn. First, the theory holds in condition XO (and perhaps many other conditions) but not in condition XR. Second, the theory is not valid. We argue that the second explanation should be taken very seriously . . .
What seems remarkable and inconsistent is that Sobel regards some of our as well as Oh, Kim, Park, and Cho’s (2012; Oh) findings as strong support for his theory, despite the fact that there was no sad context present in these studies. Apparently, in case of a failure to find corroborating results, the sad context is regarded crucial, but if some of our and Oh’s findings point in the same direction as his original findings, the lack of sad context and exact procedures are no longer important issues.
Sobel concludes that we did not dig very deep in our data to probe for a possible effect. That is true. We did not try to dig at all. Our aim was to test if human emotional tears act as a social chemosignal, using a different research methodology and with more statistical power than the original study; we were not on a fishing expedition.
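For readers who want to see why a fishing expedition is a problem, here is a minimal simulation sketch. It is not a reconstruction of Sobel’s re-analysis or of either study’s data. It assumes pure noise (no tear effect at all), a simple two-group z-test with known unit variance, and a hypothetical analyst who takes several independent “looks” at the data (subgroups, outcomes, exclusions) and reports whichever one gives p < .05. The function names, the group size of 30, and the number of looks are all illustrative assumptions.

```python
import random
from statistics import NormalDist, mean


def p_value_two_groups(a, b):
    """Two-sided z-test p-value for a difference in means.

    Assumes both groups have the same size and known unit variance.
    """
    n = len(a)
    z = (mean(a) - mean(b)) / (2.0 / n) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_looks, n_per_group=30, n_sims=2000, seed=1):
    """Share of null simulations in which at least one look gives p < .05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_looks):
            # Both groups come from the same distribution: there is no effect.
            tears = [rng.gauss(0, 1) for _ in range(n_per_group)]
            saline = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(p_value_two_groups(tears, saline))
        # The analyst reports the best-looking comparison.
        if min(p_values) < 0.05:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("one pre-specified test:", false_positive_rate(1))
    print("best of ten looks:     ", false_positive_rate(10))
```

With a single pre-specified test the rate should land near the nominal 5%. With ten independent looks the theoretical rate is 1 - 0.95^10, roughly 40%. Real subgroup analyses are correlated rather than independent, so the actual inflation in any given re-analysis would differ, but the direction is the same: a p-value just below .05 found by digging is weak evidence.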
I find the defensive reaction of Sobel to be understandable but disappointing. I’m just so so so tired of researchers who use inappropriate statistical methods and then can’t let go of their mistakes.
It makes me want to cry.