Mark Palko writes:
I’ve got a stat problem I’d like to run past you. It’s one of those annoying problems that feels like it should be obvious, but the solution has evaded me and the colleagues I’ve discussed it with. I’m working on a project where the metric of interest is defined in relation to pairs of data points. It has nothing to do with sports or betting, but the following analogy (which I also posted on the blog) covers the basic situation:
“You want to build a model predicting the spread for games in a new football league. Because the line-up of teams is still in flux, you decide to use only stats from individual teams as inputs (for example, an indicator variable for when the Ambushers play the Ravagers would not be allowed).”
Is there a standard approach for modeling this kind of data?
My reply: I don’t quite understand your question, but are you familiar with the Bradley-Terry and Thurstone-Mosteller models for paired comparisons? These are old, dating from the 1920s through the early 1950s, I believe, but they might do what you need. Interesting work has been done on these models recently by Hal Stern, Mark Glickman, and others, to allow the underlying parameters to vary over time.
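To make the suggestion concrete, here is a minimal sketch of a Bradley-Terry model in which each team's ability is a linear function of its own stats, so that no pair-specific indicators are needed. It is one possible reading of the question, not a definitive solution: the function names (`fit_bradley_terry`, `win_probability`), the small ridge term, and the choice of Newton's method are assumptions made for illustration. The model is P(A beats B) = logistic((x_A - x_B) · beta), where x_A and x_B are the two teams' stat vectors.

```python
import numpy as np


def fit_bradley_terry(x_a, x_b, a_won, n_iter=25, ridge=1e-6):
    """Estimate beta in P(A beats B) = logistic((x_a - x_b) @ beta).

    x_a, x_b : (n_games, n_stats) arrays of team-level stats (hypothetical inputs)
    a_won    : (n_games,) array, 1 if team A won the game, else 0
    ridge    : tiny penalty that keeps the Newton step numerically stable
    """
    d = np.asarray(x_a, dtype=float) - np.asarray(x_b, dtype=float)
    y = np.asarray(a_won, dtype=float)
    beta = np.zeros(d.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(d @ beta)))        # fitted win probabilities
        grad = d.T @ (y - p) - ridge * beta           # gradient of penalized log-likelihood
        weights = p * (1.0 - p)
        hess = (d * weights[:, None]).T @ d + ridge * np.eye(d.shape[1])
        beta = beta + np.linalg.solve(hess, grad)     # Newton update
    return beta


def win_probability(beta, x_a, x_b):
    """Predicted probability that the team with stats x_a beats the team with stats x_b."""
    diff = np.asarray(x_a, dtype=float) - np.asarray(x_b, dtype=float)
    return 1.0 / (1.0 + np.exp(-(diff @ beta)))
```

Because only the difference in stats enters the model, swapping the two teams flips the sign of the linear predictor, and a team that has never played can still be scored from its stats alone. Replacing the logistic function with the normal CDF gives the Thurstone-Mosteller version, and for a continuous outcome such as a point spread the same difference-of-stats structure can be fit by ordinary least squares.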