New Alan Turing preprint on Arxiv!

Dan Kahan writes:

I know you are on 30-day delay, but since the blog version of you will be talking about Bayesian inference in couple of hours, you might like to look at paper by Turing, who is on 70-yr delay thanks to British declassification system, who addresses the utility of using likelihood ratios for helping to form a practical measure of evidentiary weight (“bans” & “decibans”) that can guide cryptographers (who presumably will develop sense of professional judgment calibrated to the same).

Actually it’s more like a 60-day delay, but whatever.

The Turing article is called “The Applications of Probability to Cryptography.” It was written during the Second World War, and it’s awesome.

Here’s an excerpt:

The evidence concerning the possibility of an event occurring usually divides into a part about which statistics are available, or some mathematical method can be applied, and a less definite part about which one can only use one’s judgement. Suppose for example that a new kind of traffic has turned up and that only three messages are available. Each message has the letter V in the 17th place and G in the 18th place. We want to know the probability that it is a general rule that we should find V and G in these places. We first have to decide how probable it is that a cipher would have such a rule, and as regards this one can probably only guess, and my guess would be about 1/5,000,000. This judgement is not entirely a guess; some rather insecure mathematical reasoning has gone into it, something like this:-

The chance of there being a rule that two consecutive letters somewhere after the 10th should have certain fixed values seems to be about 1/500 (this is a complete guess). The chance of the letters being the 17th and 18th is about 1/15 (another guess, but not quite as much in the air). The probability of the letters being V and G is 1/676 (hardly a guess at all, but expressing a judgement that there is no special virtue in the bigramme VG). Hence the chance is 1/(500 × 15 × 676) or about 1/5,000,000. This is however all so vague, that it is more usual to make the judgment “1/5,000,000” without explanation.

The question as to what is the chance of having a rule of this kind might of course be resolved by statistics of some kind, but there is no point in having this very accurate, and of course the experience of the cryptographer itself forms a kind of statistics.

The remainder of the problem is then solved quite mathematically. . . .
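To make the arithmetic in the excerpt concrete, here is a minimal sketch of how the guessed prior might be combined with the three observed messages. The three prior factors come straight from the excerpt. The update step is my own assumption, not a transcription of the elided part of Turing's text: I assume that if the rule holds, every message shows VG in places 17 and 18, and that if it does not, VG turns up there with probability 1/676, independently for each message. All variable names are hypothetical.

```python
from fractions import Fraction

# Turing's three guessed factors, as given in the excerpt.
p_rule_exists = Fraction(1, 500)    # some fixed-bigram rule exists (a complete guess)
p_positions = Fraction(1, 15)       # the rule sits at places 17 and 18
p_bigram = Fraction(1, 26 * 26)     # the fixed letters happen to be V and G

prior = p_rule_exists * p_positions * p_bigram   # exactly 1/5,070,000
prior_odds = prior / (1 - prior)

# Assumption (not from the excerpt): with the rule, each message shows VG
# with probability 1; without it, with probability 1/676.
likelihood_ratio_per_message = Fraction(1) / p_bigram   # 676
n_messages = 3

# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
posterior_odds = prior_odds * likelihood_ratio_per_message ** n_messages
posterior = posterior_odds / (1 + posterior_odds)

print(f"prior = 1/{prior.denominator:,}")
print(f"posterior odds in favour of the rule: about {float(posterior_odds):.1f} to 1")
print(f"posterior probability: about {float(posterior):.3f}")
```

Under these assumptions, a one-in-five-million prior is overturned by just three messages, with the odds ending up at roughly 60 to 1 in favour of the rule. That is why it does not much matter whether the prior guess was 1/5,000,000 or 1/2,000,000.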

He’s so goddamn reasonable. He’s everything I aspire to.

Reasonableness is, I believe, an underrated trait in research. By “reasonable,” I don’t mean a supine acceptance of the status quo, but rather a sense of the connections of the world, a sort of generalized numeracy, an openness and honesty about one’s sources of information. “This judgement is not entirely a guess; some rather insecure mathematical reasoning has gone into it”—exactly!

Damn this guy is good. I’m glad to see he’s finally posting his stuff on Arxiv.

15 thoughts on “New Alan Turing preprint on Arxiv!”

  1. Someone once described a researcher I really like as “constitutionally incapable of being obtuse.” But I like the affirmative version even better: “preternaturally reasonable.”

    I agree that reasonableness is underrated in one sense, because being unreasonable can reap rewards in terms of exposure/prestige. But I think it is highly valued among people who do research because they want to actually understand the world a little better.

    • Corey:

      I just couldn’t resist mocking Ferguson one more time for his cute little crowd-pleasing remarks of a couple years ago. As I wrote at the time, if disparaging someone for being gay and liking ballet and poetry is “overly sophisticated,” it’s the kind of sophistication that’s indistinguishable from the kind of remarks that I remember from junior high, back when there was a sort of constant surveillance about kids being “faggots.” It’s still hard for me to get a sense of whether this is how Ferguson really thinks, or if he was just (unsuccessfully) trying to pander to what he imagined were the prejudices of the audience for his paid lecture.

      I’m still waiting for him to give a speech at the ACM mocking Alan Turing. That’ll go over well.

  2. We should coin a new term for the reasonable statistician or researcher. We can be “Turingians,” “Turingists” or perhaps “Turingites” rather than Bayesians or Frequentists.

    Turingian does have a badass ring to it and it would honor someone whose future genius was taken from him and science because of ignorance and fear. Our world would be a much better place had he lived a few more decades.

  3. ‘He’s so goddamn reasonable’ – yep, that’s pretty much how I would describe you, based on this blog/your published work. It’s very frustrating.

  4. “This is however all so vague, that it is more usual to make the judgment “1/5,000,000” without explanation.”

    This is such a wonderful statement! How often are we in a situation like this (be it regarding priors, tuning constants or other decisions in statistics), and how rarely is this clearly acknowledged!

  5. What I find quite interesting is how unimportant Turing & I.J. Good, who later propagated this work like a madman (by all accounts, Good was slightly mad), seemed to think priors were. Their focus on likelihood ratios — as a measure of the *weight of the evidence* (Good’s obsession thereafter) — was all about trying to find a tractable, productive device for making the probative value of data readily apparent. The focus on “bans” & “decibans” — the latter supposedly being the “minimal change” in the relative probability of a hypothesis that human beings could be expected to notice or make use of, etc. — was all about that. Obviously, they knew that the likelihood ratio was just the factor one used to update one’s prior odds, but that seemed like a fairly trivial, obvious, mechanical step. Appreciation of the need for assessing the *weight* of empirical evidence was what they really felt was missing from statistical practice.

    I think they not only were right but still are. Most of the difficulties that many now perceive in empirical research are connected to the prevalence of statistical measures — p-values being the most conspicuous one — that reflect inattention to the indispensability of evidentiary weight in causal inference.

    Obviously Bayesian statistical practice should be responsive to that. But my sense is that there isn’t among today’s Bayesians the confidence Turing & Good had that Bayesian likelihood ratios (in one form or another; obviously Bayes Factor is just one operationalization) should be the currency for valuing empirical results.
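The bookkeeping described in this comment can be sketched in a few lines. A ban is a factor of 10 in the odds and a deciban is a tenth of a ban, so the weight of evidence is 10 × log10 of the likelihood ratio, and weights from independent pieces of evidence simply add. The function names, the prior odds, and the likelihood ratios below are hypothetical values chosen for illustration, not figures from Turing or Good.

```python
import math
from typing import Iterable

def decibans(likelihood_ratio: float) -> float:
    """Weight of evidence in decibans: 10 * log10 of the likelihood ratio."""
    return 10.0 * math.log10(likelihood_ratio)

def odds_from_decibans(db: float) -> float:
    """Inverse of decibans(): convert a score in decibans back to odds."""
    return 10.0 ** (db / 10.0)

def total_weight(likelihood_ratios: Iterable[float]) -> float:
    """Independent pieces of evidence add on the deciban scale."""
    return sum(decibans(lr) for lr in likelihood_ratios)

# Hypothetical example: prior odds of 1 to 1000, then three pieces of evidence,
# one of which (0.5) counts against the hypothesis.
prior_odds = 1.0 / 1000.0
ratios = [20.0, 0.5, 30.0]

posterior_db = decibans(prior_odds) + total_weight(ratios)
posterior_odds = odds_from_decibans(posterior_db)

print(f"one deciban multiplies the odds by about {odds_from_decibans(1.0):.2f}")
print(f"prior: {decibans(prior_odds):.1f} db, evidence: {total_weight(ratios):+.1f} db")
print(f"posterior odds: about {posterior_odds:.2f}")
```

Working on the log scale turns the multiplication of likelihood ratios into a running sum, which is the sort of thing a clerk can tally by hand. Evidence against the hypothesis just shows up as a negative number of decibans.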

    • Dan:

      The likelihood ratios come out automatically. The issue, at least considering my own work, is that cryptography is fundamentally a discrete problem, but most of the problems I work on have continuous parameters. So when I do Bayesian inference, it looks like a posterior distribution, it doesn’t look like a bunch of likelihood ratios.

      Also, prior distributions can be important, especially in applications such as pharmacology where data are sparse, models are dense, and external information is readily available.

      Finally, I find Turing’s writing much more compelling than Good’s. Good seems like much more of a talker, while Turing is a do-er. But maybe part of that comes from secrecy restrictions: maybe Good wasn’t allowed to write about the details of the work he did. On the other hand, Good had several decades after WW2 in which he could’ve done applied work, but didn’t. Which suggests to me that this wasn’t really his thing. Maybe what he needed was another Turing to collaborate with. Don’t get me wrong—Good wrote a lot of reasonable things too, but overall Good’s writing seems to offer a lot of promises and theory and not so much action.

  6. In cryptography we deal with malicious opponents, which invites a broader perspective.

    It’s fine to deal with rough approximations, to get a grasp of the challenge, e.g. the Drake Equation for extraterrestrial life. But in doing so, we must first avoid the weakness of “framing,” that is, shaping all future expectations.

    And we must watch for red herrings, e.g. the false message regarding “AF” that cracked the Japanese code for Midway.

    After all, however complex something like Navier-Stokes gets, the water molecules are not trying to fool you.
