Leicester City was a 5000-to-1 shot to win the championship—and they did it.

Donald Trump wasn’t supposed to win the Republican nomination—last summer Nate gave him a 2% chance—and it looks like he will win.

For that matter, Nate only gave Bernie Sanders a 7% chance, and he came pretty close.

**Soccer**

There’s been a lot of discussion in the sports and political media about what happened here. Lots of sports stories just treated the Leicester win as a happy miracle, but more thoughtful reports questioned the naive interpretation of the 5000:1 odds as a probability statement. See, for example, this from Paul Campos:

Now hindsight is 20/20, but when a 5000 to 1 shot comes in that strongly suggests those odds were, ex ante, completely out of wack. . . . Leicester was the 14th-best team in the league last year in terms of points (and they were better than that in terms of goal differential, which is probably a better indicator of underlying quality). Anyway, the idea that it’s a 5000 to 1 shot for the 14th best team in one year to win the league in the next is obviously absurd on its face.

The 14th best team in the EPL is roughly equivalent to the 20th best team in MLB or the NBA or the NFL, in terms of distance from the top. Now obviously a whole bunch of things have to break right for a 75-87 team to have the best record in baseball the next year. It’s quite unlikely — but quite unlikely as in 50-1 or maybe even 100-1.

That sounds about right to me. Odds or no odds, the Leicester story is inspiring.
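
To put those odds on a common scale, here is a minimal sketch that converts "N-to-1 against" odds into implied probabilities. The function name `odds_against_to_prob` is my own, and the conversion ignores the bookmaker's margin, so treat the numbers as rough.

```python
from fractions import Fraction

def odds_against_to_prob(n):
    """Implied probability of an 'n-to-1 against' bet (bookmaker margin ignored)."""
    return Fraction(1, n + 1)

# Odds mentioned in the discussion above.
for n in (5000, 100, 50):
    p = odds_against_to_prob(n)
    print(f"{n}-to-1 against -> implied probability {float(p):.4%}")

# How many times more likely is a 50-to-1 shot than a 5000-to-1 shot?
ratio = odds_against_to_prob(50) / odds_against_to_prob(5000)
print(f"50-to-1 is about {float(ratio):.0f} times as likely as 5000-to-1")
```

The gap between the bookmakers' number and Campos's range is about two orders of magnitude, which is why the choice matters.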

**Primary elections**

There are a bunch of quotes out there from various pundits last year saying that Trump had zero chance of winning the Republican primary. To be fair to Nate, he said 2% not 0%, and there’s a big difference between those two numbers.

But even when Nate was saying 2%, the betting markets were giving him, what, 10%? Something like that? In retrospect, 10% odds last fall seem reasonable enough: things *did* break Trump’s way. If Trump was given a 10% chance and he won, that doesn’t represent a failure of prediction markets.

I’d also say that, whatever Trump’s unique characteristics as a political candidate, his road to the nomination does not seem *so* different from that of other candidates. Yes, he’s running as an anti-Washington outsider who’s willing to say what other candidates won’t say, but that’s not such an unusual strategy.

**My own forecasts**

I’ve avoided making forecasts during this primary election campaign. Why? In some ways, my incentives are the opposite of those of political pundits. Nate’s supposed to come up with a forecast—that’s his job—and he’s also expected to come up with some value added, above and beyond the betting markets. If Nate’s just following the betting markets, who needs Nate? Indeed, one might think that the bettors are listening to Nate’s recommendations when deciding how to bet. So Nate’s gotta make predictions, and he gets some credit for making distinctive predictions and being one step ahead of the crowd.

In contrast, if I make accurate predictions, ok, fine, I’m supposed to be able to do that. But if I make a prediction that’s way off, it’s bad for my reputation. The pluses of making a good forecast are outweighed by the minuses of screwing up.

Also, I haven’t been following the polls, the delegate race, the fundraising race, etc., very carefully. I don’t have extra information, and if I tried to beat the experts I’d probably just be guessing, doing little more than fooling myself (and some number of gullible blog readers).

Here’s a story for you. A few months ago I was cruising by the Dilbert blog and I came across some really over-the-top posts where Scott Adams was positively worshipping Donald Trump, calling him a “master persuader,” giving Trump the full Charlie Sheen treatment. Kinda creepy, really, and I was all set to write a post mocking Adams for being so sure that Trump knew what he was doing, it just showed how clueless Adams was, everybody knew Trump didn’t have a serious chance . . .

And then I remembered why primary elections are hard to predict. “Why Are Primaries Hard to Predict?”—that’s the title of my 2011 online NYT article. I guess I should’ve reposted it earlier this year. But now, after a season of Trump and Sanders, I guess the “primaries are hard to predict” lesson will be remembered for a while.

Anyway, yeah, primaries are hard to predict. So, sure, I didn’t *think* Trump had much of a chance—but what did I know? If primaries are hard to predict in general, they’re hard for *me* to predict, too.

Basically, I applied a bit of auto-peer-review to my own hypothetical blog post on Adams and Trump, and I rejected it! I didn’t run the post: I rightly did not criticize Adams for making what was, in retrospect, a perfectly fine prediction (even if I don’t buy Adams’s characterization of Trump as a “master persuader”).

The only thing I did say in any public capacity about the primary election was when an interviewer asked me if I thought Sanders could stand up against Donald Trump if they were to run against each other in a general election, and I replied:

I think the chance of a Sanders-Trump matchup is so low that we don’t have to think too hard about this one!

It almost happened! But it didn’t. Here I was taking advantage of the fact that the probability of two unlikely events both happening is typically much smaller than the probability of either of them alone. OK, the two parties’ nominations are not statistically independent—it could be that Trump’s success fueled that of Sanders, and vice-versa—but, still, it’s much safer to assign a low probability to “A and B” than to A or B individually.
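
Here is a rough sketch of that arithmetic, plugging in Nate's early numbers quoted at the top of this post (2% for Trump, 7% for Sanders). The helper `joint_bounds` is a name I made up; it returns the standard Fréchet bounds on P(A and B) when all you know are the two separate probabilities. The independence line is only a reference point, since, as noted, the two races are not independent.

```python
def joint_bounds(p_a, p_b):
    """Frechet bounds on P(A and B) given only the two marginal probabilities."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper

p_trump = 0.02    # Nate's early chance for Trump, as quoted in this post
p_sanders = 0.07  # Nate's chance for Sanders, as quoted in this post

# Reference point only: assumes the two nominations are independent.
independent = p_trump * p_sanders

# Whatever the dependence, the joint probability must lie in this range.
lower, upper = joint_bounds(p_trump, p_sanders)

print(f"under independence: {independent:.4f}")
print(f"any dependence:     between {lower:.4f} and {upper:.4f}")
```

Even with the strongest possible positive dependence, the joint probability cannot exceed the smaller of the two individual probabilities, which is what makes betting against "A and B" the safer call.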

But, yeah, primaries are hard to predict. And Leicester was no 5000:1 shot, even prospectively.

**P.S.** Some interesting discussion in comments, including this exchange:

Anon:

A big difference between EPL and Tyson-Douglas is that there were only two potential winners of the boxing match. The bookies aren’t giving odds on SOME team with a (equivalent to) 75-87 record or worse winning the most games next year – but for one specific team. 5000-1 may be low, but 50-1 is absurdly high given there actually are quality differences between teams, and there are quite a few of them.

My response:

Yes, I agree. Douglas’s win was surprising because he was assumed to be completely outclassed by Tyson, and then this stunning thing happened. Leicester is a pro soccer team and nobody thought they were outclassed by the other teams—on “any given Sunday” anyone can win—but it was thought they were doomed by the law of large numbers. One way to think about Leicester’s odds in this context would be to say that, if they really are the 14th-best team, then maybe there are about 10 teams with odds roughly similar to theirs, and one could use historical data to get a sense of the probability of any of the bottom 10 teams winning the championship. If the combined probability for this “field” is, say, 1/20, then that would suggest something like a 1/200 chance for Leicester. Again, just a quick calculation. Here I’m using the principles explained in our 1998 paper, “Estimating the probability of events that have never occurred.”
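
The quick calculation in that response can be written out as a few lines of Python. This is only a sketch: the 1/20 field probability is the "say" number from the text, not an estimate from data, the equal split across 10 teams is the simplifying assumption, and `per_team_prob` is a hypothetical helper name.

```python
from fractions import Fraction

def per_team_prob(field_prob, n_teams):
    """Split a combined 'field' probability equally among n_teams similar teams."""
    return field_prob / n_teams

field_prob = Fraction(1, 20)   # illustrative combined chance for the bottom 10 teams
n_teams = 10                   # teams assumed to have roughly similar chances

p_leicester = per_team_prob(field_prob, n_teams)

# Probability implied by 5000-to-1 odds, ignoring the bookmaker's margin.
p_bookmaker = Fraction(1, 5001)

print(f"per-team probability: {p_leicester}")
print(f"ratio to the 5000-to-1 figure: {float(p_leicester / p_bookmaker):.1f}")
```

With real data one would replace the 1/20 with a historical frequency, and the equal split with something that reflects differences among the bottom teams.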

Also a comment from fraac! More on that later.

**P.P.S.** There still seems to be some confusion so let me also pass along this which I posted in comments:

N=1 does give us some information but not much. Beyond that I think it makes sense to look at “precursor data”: near misses and the like. For example if there’s not much data on the frequency of longshots winning the championship, we could get data on longshots placing in the top three, or the top five. There’s a continuity to the task of winning the championship, so it should be possible to extrapolate this probability from the probabilities of precursor events. Again, this is discussed in our 1998 paper.

The key to solving this problem—as with many other statistics problems—is to step back and look at more data.

Just by analogy: what’s the probability that a pro golfer sinks a 10-foot putt? We have some data (see this presentation and scroll thru to the slides on golf putting; see also in my Teaching Statistics book with Nolan, the data come from Don Berry’s textbook from 1995 and there’s more on the model in this article from 2002) which shows a success rate of 67/200. OK, that’s a probability of 33.5%, which is a reasonable estimate. But we can do better by looking at data from 7-foot putts, 8-foot putts, 9-foot putts, and so on. The sparser the data, the more it will make sense to model. This idea is commonplace in statistics but it often seems to be forgotten when discussing rare events. It’s easy enough to break down a soccer championship into its component pieces, and so there should be no fundamental difficulty in assigning prospective probabilities. In short, you can get leverage by thinking of this championship as part of a larger set of possibilities rather than as a one-of-a-kind event.
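
To see how much room there is to "do better," here is a small sketch of the raw estimate from the 10-foot data alone, with the usual binomial standard error. The 67/200 comes from the text above; the function name `binomial_estimate` is mine, and this is just the simple proportion, not the putting model from the article.

```python
import math

def binomial_estimate(successes, attempts):
    """Raw success proportion and its binomial standard error."""
    p_hat = successes / attempts
    se = math.sqrt(p_hat * (1 - p_hat) / attempts)
    return p_hat, se

# 10-foot putts: 67 made out of 200 attempts (numbers from the text).
p_hat, se = binomial_estimate(67, 200)

print(f"estimate: {p_hat:.3f}")
print(f"standard error: {se:.3f}")
```

The standard error is a bit over 3 percentage points, and that is the slack a model fit across neighboring distances can tighten. With rarer events the raw proportion gets even noisier, which is the point about sparse data.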