Mistaken identity

Someone I know sent me the following email:

The person XX [pseudonym redacted] who posts on your blog is almost certainly YY [name redacted]. So he is referencing his own work and trying to make it sound like it is a third party endorsing it. Not sure why but it bugs me. He is an ass as well so pretty much every thing he does bugs me. . . .

OK, fair enough. I was curious so I searched in the blog archives for commenter XX. It turns out he was not a very frequent commenter but he had a few, and he did refer to the work of YY. But I’m almost certain that XX is not YY. I’m no Sherlock Holmes when it comes to the internet but I checked the URLs, and XX appears to be coming from a different country than the location of YY. And, looking at the comments themselves, I can’t believe this is some elaborate attempt at deception.

No big deal. But it’s an interesting example of how it’s possible to be so sure of oneself and happen to be wrong.

39 thoughts on “Mistaken identity”

  1. It is pretty interesting how our minds try to fill in the missing information. When I grade student tests blind to name I’m curious as I read, and often surprised when I match number to name. Only in the case of this blog, it’s like trying to fill in names from a very large roster without any finite frame. And furthermore in this case, it seems like emotion was helping guide the conclusion. I’m very sympathetic to finding myself convinced of an inference that turns out to be incorrect. I fall into lots of local minima when I think, and potholes when I bike.

  2. Wait… is “Keith O’Rourke” really C.S. Peirce?!?

    Or is “Anonymous” really E.T. Jaynes?!?

    Wait, OMG, it all makes sense now… “Gelman” is such a fake name, don’t you see, “Gelman” = “Kaufman”… I KNEW you weren’t dead!

  3. I’ve often thought it would be interesting if someone made a web-based program that altered the punctuation, capitalization, and sentence structure of text to “anonymize” it, changing the idiosyncrasies that make individuals’ writing styles recognizable, while still (hopefully) being grammatically correct. Not that I care about being disguised — the name in the rectangle is mine! — but I can imagine that a lot of people might find it useful. (Step one for my text would be removing hyphens.) (Step two: parentheses.) And yes, I know that the point of Andrew’s post is that such a service would be less necessary than I think!

    • I’m sure the NSA is working on it.

      As somebody whose blog gets 400 or so comments per day, I can distinguish among some regular anonymous commenters by their prose styles. But I will never give away their identities.

      • By identities, you mean their real names etc.? What do you compare their writing style against? Distinguishing them I can get. But how do you get their identities?

  4. This is an example of the base rate fallacy, right? (Or something similar, I can never keep the names straight.) XX’s comments would be much more likely if they were YY, but the prior on XX being YY is so low that the emailer should never have regarded it as even probable that XX=YY, let alone “almost certain”.
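    The arithmetic behind this point can be sketched with the odds form of Bayes' rule. The numbers below are purely illustrative assumptions (nothing in the post gives a prior or a likelihood ratio), and `posterior_same_person` is a hypothetical helper name. Even if the comments are taken to be 50 times more likely under XX=YY than otherwise, a prior of 1 in 10,000 leaves the posterior at roughly half a percent, nowhere near "almost certain".

```python
def posterior_same_person(prior, likelihood_ratio):
    """Posterior P(XX = YY | comments), using the odds form of Bayes' rule.

    prior            -- P(XX = YY) before reading the comments
    likelihood_ratio -- P(comments | XX = YY) / P(comments | XX != YY)
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed, illustrative values (not taken from the post):
prior = 1 / 10_000        # one commenter among many possible readers
likelihood_ratio = 50     # comments fit YY much better than a random reader

posterior = posterior_same_person(prior, likelihood_ratio)
print(f"P(XX = YY | comments) = {posterior:.4f}")
```

    Under these assumptions the evidence would need a likelihood ratio in the thousands before the posterior passed even 50%.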

    • Have you considered using a prior that is non-zero on the possibility that we are talking about a chimeric XXYY who sometimes is instantiated as XX, sometimes as YY, and sometimes as XY?

    • Last year in the midst of a deeply-felt exchange on a local website I invited the counterparty to meet me so we could settle our argument face-to-face. (Our on-line exchange had ceased to be constructive.) I was posting under a pseudonym but I offered up my email address and to meet him at a time and location of his choosing. Nope. No interest. He did however fume about how I refused to reveal my actual identity in the comments section where we were debating. Bottom line: I think confidentiality is a good thing. I’m also good with moderated comments sections.

  5. In online discussions, someone might:
    A) Give facts and relevant sources that can be assessed and checked.
    Things that speak for themselves (like side-by-side highlighted plagiarism displays) can be credible from anyone.
    IF a handle has gained credibility, less checking might be OK.

    B) Offer opinions/analyses
    Here, reputational assessment matters.
    The question is the *credibility* level people attach to B) opinions/analyses, based on evolving reputation of an online handle:

    1) “Anonymous” – zero, or negative, especially if the blog explicitly asks people not to do that.
    It is really annoying when several people post that way in one thread. I never even look at such, although occasionally someone will do it by accident, in which case they may post later saying so.

    2) Simple name like Bill, Bob, etc : these can collide as well, but maybe disambiguate within a thread.
    I rarely read these. Short codes may at least disambiguate within a thread, or show up often enough at a few blogs that one can think are the same real person.
    At that point, one can start assigning conditional reputation to the handle, very hard to do for Bill or Bob.
    These are pseudonyms, but not very useful, even if there is a one-to-one map between handle and real person.

    3) Longer pseudonyms that can be Googled to see what else people say. Bill is hopeless, Bill000x is fine. Google first before using.

    This is actually useful, and while others might fake it, one can actually assign credibility ratings as data accumulates.
    I’ve done analyses of thousands of comments, and there are many pseudonyms that seem to be used consistently, and either merit credibility or the reverse.

    4) Pseudonyms attached to blogs, Twitter, etc. These are better, in that one can more easily assess credibility, whether or not one knows anything at all about real-world identity. They are also much harder to hijack, i.e. posts by the same handle are likely to be from the same person.

    5) Real name given … this can either be useful in checking whether there is plausible real-world evidence one should pay attention … or not useful.
    Someone posts as “Bill Smith”, their real name, not very useful. Traditional Chinese or Korean names: really tough, unless with 4) above.

    Someone posts as Andrew Gelman, there is more than one (try PIPL) … but most of the hits map to the owner of this blog, with a clear identity in the real world.
    Of course, a few of us have names that are essentially unique, have Twitter handles, Web pages, Wikipedia bios, and those are pretty easy, even if we use handles with or without blanks.

    6) Of course, one always has to watch out for people trying to hijack handles to make misleading comments or, more commonly and returning to the topic of this post, using sockpuppets to create virtual support for themselves. Sometimes those are easy to detect (coming from the same IP address is a hint … but not perfect) and sometimes hard.
    As in the post, sometimes a suspicion is wrong.

    7) I’m quite happy with 4) and sometimes 3): consistent, recognizable pseudonyms are fine, and sometimes people have very good reasons for them. I have on occasion had many productive interchanges with pseudonymous handles who earned strong credibility, without knowing who they were in the real world.

    Likewise, I have probably ignored good opinions/analyses from vaguer handles, just because life is short.

  6. When I was an undergraduate at Michigan State, there was another undergrad math major named Robert Lee Carpenter and we were both TA-ing the same intro to algebra and trig freshman class the same semester. Needless to say, much confusion ensued.

    Last trip abroad my wife and I were detained for a couple hours and then the friendly TSA guy told me “you just got caught by having a common name.” Gee, thanks. I wonder how Daniel Lee ever gets back into the country.

    Here in Australia, where I am now, Stan doesn’t even come up on the first page of Google hits because there’s a Netflix competitor of that name. There’s even an MC Stan out there (though unlike Eminem’s (another author of Stan if you’re googling), I’ve never heard her or his or its music). We should’ve listened to Hadley Wickham and called it mcmcstan2012 or something like that.

    • When I was a grad student at Trinity College, Cambridge (total student population (ugrad+grad) of 1000) another Robin Morris started as an undergraduate. I got invitations to a lot of parties that were more interesting than the ones I usually got invited to…

  7. re: stan
    Yes, always Google names first. UNIX was pretty good, as was VAX, although not having the WWW, that was a vacuum cleaner brand, leading to jokes.
    It is however hard to get 1-2 syllable, pronounceable names that don’t have negative connotations in some important language.

    Demand unique names at birth. Long ago and far away, I heard a story (which may have been apocryphal, but easily not) that somebody proposed that every person be given a permanent phone number at birth, like a SSN, only earlier.

    These days, it would be an email address or Twitter, FB, etc name.

  8. I’m on this blog a lot and I haven’t noticed a lot of pushing of any particular person’s research.

    The closest might be Fernando / Judea Pearl, but Fernando is not a pseudonym and often posts about his own papers.
