Gaydar and the fallacy of objective measurement

Greggor Mattson, Dan Simpson, and I wrote this paper, which begins:

Recent media coverage of studies about “gaydar,” the supposed ability to detect another’s sexual orientation through visual cues, reveal problems in which the ideals of scientific precision strip the context from intrinsically social phenomena. This fallacy of objective measurement, as we term it, leads to nonsensical claims based on the predictive accuracy of statistical significance. We interrogate these gaydar studies’ assumption that there is some sort of pure biological measure of perception of sexual orientation. Instead, we argue that the concept of gaydar inherently exists within a social context and that this should be recognized when studying it. We use this case as an example of a more general concern about illusory precision in the measurement of social phenomena, and suggest statistical strategies to address common problems.

There’s a funny backstory to this one.

I was going through my files a few months ago and came across an unpublished paper of mine from 2012, “The fallacy of objective measurement: The case of gaydar,” which I didn’t even remember ever writing! A completed article, never submitted anywhere, just sitting in my files.

How can that happen? I must be getting old.

Anyway, I liked the paper—it addresses some issues of measurement that we’ve been talking about a lot lately. In particular, “the fallacy of objective measurement”: researchers took a rich real-world phenomenon and abstracted it so much that they removed its most interesting content. “Gaydar” existed within a social context—a world in which gays were an invisible minority, hiding in plain sight and seeking to be inconspicuous to the general population while communicating with others of their subgroup. How can it make sense to boil this down to the shapes of faces?

Stripping a phenomenon of its social context, normalizing a base rate to 50%, and seeking an on-off decision: all of these can give the feel of scientific objectivity, but the very steps taken to ensure objectivity can remove social context and relevance.
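
To make the base-rate point concrete, here is a minimal sketch in Python. The sensitivity, specificity, and 5% base rate are made-up numbers chosen only for illustration; they are not taken from any of the gaydar studies, and the function name is hypothetical. The sketch applies Bayes' rule to show how the same classifier looks very different once the base rate is no longer fixed at 50%.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """P(in group | classified as in group), computed with Bayes' rule."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier performance, for illustration only.
SENS = 0.80
SPEC = 0.80

# Same classifier, two different populations.
balanced = positive_predictive_value(SENS, SPEC, base_rate=0.50)  # lab-style 50/50 sample
skewed = positive_predictive_value(SENS, SPEC, base_rate=0.05)    # assumed 5% base rate

print(f"PPV at 50% base rate: {balanced:.2f}")
print(f"PPV at  5% base rate: {skewed:.2f}")
```

Under these assumed numbers, most of the people flagged at the lower base rate would be flagged incorrectly, even though the classifier's sensitivity and specificity have not changed.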

We had some gaydar discussion (also here) on the blog recently and this motivated me to freshen up the gaydar paper, with the collaboration of Mattson and Simpson. I also recently met Michal Kosinski, the coauthor of one of the articles under discussion, and that was helpful too.


  1. Joe says:

    I’m not sure I get the criticism? You’re just saying that the gaydar studies suffer from the base rate fallacy?

    I don’t really think the authors claimed otherwise — they claimed that the test could achieve a certain level of accuracy and not that the results of the test indicated a particular posterior probability.

    • No, I think the claim is that they missed the forest for the trees. Gaydar isn’t about objectively measurable quantities extracted from facial recognition software, it’s about a sentient being actively making choices about appearance which do or don’t signal gay orientation. So by trying to study extracted measures you miss essentially the entire mechanism.

      I suppose that’s not uncontroversial, maybe it really is the case that gay people have different bone structure… but it seems unlikely that the cultural choices are meaningless, and so the question is what role does each play: biological differences in development vs cultural differences in choice of appearance.

      • More to the point, this is just one example of “mechanism blindness” which is rampant. You could look into things like for example why black men die more frequently from X (anything) and try to objectively measure all sorts of biology that gives them a predilection for X, and you might think you’re doing a great scientific job. But what if the real underlying reason is that overt or covert racism by doctors and hospital staff leads to them getting substantially worse treatment when they have X? You miss the mechanism because you just aren’t looking at the right phenomenon at all.

        • Paul Alper says:

          Daniel Lakeland:
          “You miss the mechanism because you just aren’t looking at the right phenomenon at all.”
          Is the famous “paradox of the smoking mother” an example of missing the mechanism due to looking at the wrong phenomenon?

          “The low birth-weight paradox is an apparently paradoxical observation relating to the birth weights and mortality rate of children born to tobacco smoking mothers. Low birth-weight children born to smoking mothers have a lower infant mortality rate than the low birth weight children of non-smokers. It is an example of Simpson’s paradox.”

          “At first sight these findings seemed to suggest that, at least for some babies, having a smoking mother might be beneficial to one’s health. However the paradox can be explained statistically by uncovering a lurking variable between smoking and the two key variables: birth weight and risk of mortality.”

          • Maybe. Here the basic assumption seems to be that smoking is itself the dominant mechanism and nothing else much matters. But you can gloss over a lot of stuff there. Smoking might be something that certain groups do, and these groups might also do a lot of other stuff: for example, eating unhealthy meals that lead to uncontrolled gestational diabetes, which tends to boost birthweight. People who smoke are undoubtedly less aware of health issues on average. Perhaps the high birth weight babies are ones suffering from consequences of gestational diabetes and hence are high birthweight. Parents who smoke but actually control their other health issues might be the ones whose kids are lower birthweight but then after birth may be better taken care of…

            In general, this “mechanism blindness” is a big issue.

      • Dan Simpson says:

        It would be like trying to study the spatial distribution of badgers in England by taking 7000 of them to a lab…

        • Dan Simpson says:

          You can still learn things, but by ripping the object of study from the context that it actively interacts with, you lose a lot.

          • Joe says:

            Huh? Of course you do.

            But the claim about gaydar isn’t that facial attributes tell us everything or most things or even many things that we want to know about sexuality.

            Rather, the gaydar proposition is that, from relatively sparse information, you can extract meaningful signals about sexual orientation.

            I know, beyond any reasonable doubt, the sexual orientation of all of my closest friends. That’s not gaydar — that’s just part of what it means to know someone well.

            Gaydar is about forming an impression of sexuality immediately upon meeting someone. I’d say within the span of a few minutes. So, sure, there’s more to go on than a face, but a first step to validating the possibility of a meaningful gaydar is to start with very sparse information like that. Gaydar is unrelated to the question of whether, after a multi-hour interrogation or after following someone for a period of days, you can infer their sexuality.

            • The thing they were studying in the article most recently discussed was whether *pictures from dating websites and facebook etc* could be used to determine sexual orientation.

              So the claims are all related to whether *in that population of people who post those pics* there were detectable differences in frequency of certain facial features.

              That hardly constitutes real evidence that *in the population at large* there are detectable reliable differences in frequency of certain facial features.

              • Joe says:

                Ah gotcha – thanks. And now I understand the relevance of the base rate fallacy section.

                That said, I don’t think the linked paper is terribly clear. I would suggest revising it to emphasize the point in the way you just did.

                My $0.02 is that all of the clunky terminology is getting in the way.

  2. Keith O'Rourke says:

    > detailing the errors of statistical extrapolation that underpin them.
    Perhaps consider replacing extrapolation with exploitation ;-)

    (I actually initially misread as that).

  3. Michael Bailey says:

    I am not a fan of the original research–I’m highly skeptical that homosexual people’s faces are innately distinct from heterosexual people’s–but I’m also (uncharacteristically) not a fan of your paper, Andrew. Most of all, it intrudes into science boldly and dismissively without adequate knowledge. For example, it impugns the theory that homosexual persons are gender atypical (those lab studies neglecting “context”) without mentioning findings that little boys who are floridly feminine and want to be girls when they are five years old overwhelmingly turn into gay men. What’s your contextual theory explaining that, without gender atypicality, and likely innate gender atypicality? (Parents aren’t training these boys to like Barbies or wear dresses!)

    Also, gay men are not especially eager to express gaydar signals–indeed, many profess a preference for “straight-acting” partners, and almost none express a preference for “gay-acting” partners.

    It makes elementary mistakes, like the assertion that dichotomizing sexual orientation renders any analyses of gaydar questionable. Dichotomizing a continuum isn’t a good idea, but mainly because it diminishes power rather than raising it. (And there is reason to think that in men at least, a dichotomy may apply to most.) And this statement is flat out wrong: “Gender atypicality has not been studied at all in relation to heterosexuality, just as researchers have yet to attempt to develop a straight-dar.” Anyone who studies gender atypicality in homosexual people using heterosexual control groups–as we almost always do in my lab–is necessarily doing what you say has not been done.

    As for the “base rate problem,” if one wants to predict who’s gay, it’s good to adjust for base rates. If one wants to study differences between heterosexual and homosexual persons, that’s not so important. The original paper did emphasize the former goal, but it has never been mine. I want to understand the nature, nurture, and development of sexual orientation.

    Revise and resubmit.

    Mike Bailey
    Northwestern University
    p.s. you should at least read (and not just cite) some of my papers. Start here:

  4. Carlos Ungil says:

    > But we disagree with their claim that they have shown that “configural face processing significantly contributes to perception of sexual orientation.”
    > To understand our disagreement, consider several aspects of gaydar as we understand it, and which are consistent with the dictionary definition given at the start of this paper.

    Should any claim about “perception of sexual orientation” be interpreted as being about the concept of gaydar as you understand it?

    > Our second concern is the way in which gaydar, which was originally framed as an aspect of communication within the gay community, has been redefined as a skill that can (or should!) be deployed by the general (thus, mostly straight) population.

    Is the perception of sexual orientation by the general population a subject to be avoided?

