OK, here’s a research project for someone who’s interested in sports statistics. It’s from this comment by Paul in a recent thread:

What I would like to see (has anyone done it?) is an analysis of the performance of EPL teams that had similar pre-season odds to Leicester over the last 15-20 years or so. Even just a plot of the season points behind the champion for that year would give a better idea of just how much of an outlier Leicester was.

As Paul says, this is related to my idea of estimating low probabilities by modeling from precursor data.

So here’s a research project for you which, if successful, is guaranteed to get some attention and could even be influential in getting people to understand the idea of retrospective evaluation of prospective odds:

Step 1: Put together a dataset of points, goal differential, and pre-season betting odds from a bunch of past English soccer league seasons.

Step 2: Make some graphs.

Step 3: Fit some models.

Step 4: Make some graphs.

Do all 4 steps. And then, even if the later steps aren’t perfect, the data are out there and other people can fit their own models.
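To make Step 1 and Paul's suggested plot a bit more concrete, here is a minimal Python sketch. It assumes the dataset ends up as one record per team per season. The field names (season, team, points, preseason_odds) and the toy numbers are invented for illustration; they are not real results or real odds. The sketch computes each team's points behind that season's champion and then keeps only the long shots.

```python
from collections import defaultdict

# Toy records, invented for illustration only (not real results or odds).
# preseason_odds is the "X-to-1" price to win the title (hypothetical field).
toy_rows = [
    {"season": "S1", "team": "A", "points": 90, "preseason_odds": 2},
    {"season": "S1", "team": "B", "points": 70, "preseason_odds": 50},
    {"season": "S1", "team": "C", "points": 40, "preseason_odds": 2000},
    {"season": "S2", "team": "A", "points": 75, "preseason_odds": 3},
    {"season": "S2", "team": "C", "points": 81, "preseason_odds": 5000},
    {"season": "S2", "team": "D", "points": 35, "preseason_odds": 3000},
]

def points_behind_champion(rows):
    """Add a points_behind field: champion's points minus this team's points."""
    top = defaultdict(int)  # best points total seen in each season
    for r in rows:
        top[r["season"]] = max(top[r["season"]], r["points"])
    return [{**r, "points_behind": top[r["season"]] - r["points"]} for r in rows]

def long_shots(rows, min_odds):
    """Keep teams whose pre-season price was at least min_odds-to-1."""
    return [r for r in rows if r["preseason_odds"] >= min_odds]

result = long_shots(points_behind_champion(toy_rows), min_odds=1000)
for r in result:
    print(r["season"], r["team"], r["points_behind"])
```

With real data, the points_behind values for the long shots are what you would plot in Step 2, one point per team-season, to see how far out Leicester sits.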

The part about amassing the dataset is a big portion of the work. Any ideas on where such a thing might come from?

The data already exists. Loads of it, in fact, and lots of people are looking at it. The owner of Brentford FC – a Championship club (one level below the Premiership) in West London – is a young guy who made his fortune from online betting. He is experimenting with mathematical modelling to identify players to buy and how to structure the team. His rationale: he is not a Saudi or Russian billionaire so he can’t buy promotion to the Premiership, and so he has to be clever, just like the people at Leicester.

An example of data that exists: my son-in-law works for a company that has data on the position (on the pitch) of every Premiership player, every tenth of a second (or so), for every match in the last three or four seasons. The company is searching for collaborators who can help it make sense of the data.

On StatsChat, Bill Bennet left some interesting comments about odds for outlandish bets.

http://www.statschat.org.nz/2016/05/04/should-you-have-bet-on-leicester-city/

That is an interesting comment, and a good reminder as to what kind of bet this was exactly. A lot of the time when people think of betting they think about a single game match-up or maybe an over/under. If the match is expected to be even you could bet either side at something like 10:11. But with a futures bet like Leicester to win the championship, there’s only one side to bet. You can bet on Leicester but you can’t directly bet against Leicester. Which means that if the casino wants to try to balance the risk on that bet they have to balance it across a number of other bets, which essentially makes it advertising as described by Bennet.
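The arithmetic behind the 10:11 example can be sketched in a few lines of Python. This reads 10:11 as "win 10 for every 11 staked" (my reading of the comment) and converts it to the break-even probability; the function name is just for illustration.

```python
from fractions import Fraction

def implied_probability(win, stake):
    """Break-even probability for a bet that pays `win` profit on `stake` risked."""
    return Fraction(stake, win + stake)

# An even match-up offered at 10:11 on both sides.
each_side = implied_probability(10, 11)
book_total = 2 * each_side   # implied probabilities summed over both sides
margin = book_total - 1      # the excess over 1 is the bookmaker's margin

print(each_side, float(each_side))
print(margin, float(margin))
```

In a two-sided market the margin is built in whichever side the money lands on. In a futures market the same sum runs over every team in the league, and since a bettor can only take the long side of each price, the book's exposure on any one long shot has to be offset by the money taken on all the others.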

If anyone is interested in soccer data, here is my R package, https://github.com/jalapic/engsoccerdata/, which contains the results of professional soccer games in England (and other European countries) since 1871. It doesn’t have betting odds, but the data may be useful.