Wayne Folta points me to “EigenBracket 2012: Using Graph Theory to Predict NCAA March Madness Basketball” and writes, “I [Folta] have got to believe that he’s simply re-invented a statistical method in a graph-ish context, but don’t know enough to judge.”
I have not looked in detail at the method being presented here (I’m not much of a college basketball fan), but I’d like to use this as an excuse to make one of my favorite general points, which is that a good way to characterize any statistical method is by what information it uses.
The basketball ranking method here uses score differentials between teams in the past season. On the plus side, that is better than simply using won-loss records (which (a) discard score differentials and (b) discard information on who played whom). On the minus side, the method appears to be discretizing the scores (thus throwing away information on the exact score differential) and doesn’t use any external information such as external ratings.
Anyway, my point is that the writeup of the method focuses on statistical operations (forming a matrix of a graph, computing eigensomethingorothers), and, sure, something like that is necessary, but to me, what’s interesting is to know what information went into the rankings.
P.S. If I wanted to use the information that this guy was using, I’d probably just fit a simple normal linear model with a latent parameter for each team.
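To make that concrete, here is a minimal sketch of one way such a model could look, not a tested ranking system. The assumption is that each game's score differential is the difference of two team strengths plus normal noise, with the strengths given a common normal prior centered at zero. The function name `fit_team_strengths`, the `prior_precision` argument (the ratio of the noise variance to the prior variance, treated as known here rather than estimated), and the toy games are all made up for illustration.

```python
import numpy as np

def fit_team_strengths(games, n_teams, prior_precision=0.0):
    """Estimate one strength parameter per team from score differentials.

    games: list of (team_i, team_j, score_i - score_j) tuples (hypothetical format).
    prior_precision: noise variance / prior variance of the strengths;
        0.0 gives plain least squares, larger values shrink strengths toward 0.
    """
    X = np.zeros((len(games), n_teams))
    y = np.zeros(len(games))
    for g, (i, j, diff) in enumerate(games):
        X[g, i] = 1.0    # team i's strength adds to the differential
        X[g, j] = -1.0   # team j's strength subtracts from it
        y[g] = diff

    # Normal equations with a ridge term standing in for the normal prior.
    A = X.T @ X + prior_precision * np.eye(n_teams)
    b = X.T @ y
    # lstsq copes with the singular case (prior_precision == 0), where
    # strengths are identified only up to an additive constant.
    strengths, *_ = np.linalg.lstsq(A, b, rcond=None)
    return strengths - strengths.mean()  # center so the average team is 0

# Toy data: three teams, exact score differentials (invented numbers).
games = [(0, 1, 5.0), (1, 2, 5.0), (0, 2, 10.0)]
strengths = fit_team_strengths(games, n_teams=3)
ranking = np.argsort(-strengths)  # best team first
```

The toy differentials are exactly consistent with strengths of 5, 0, and -5, so the fit recovers those values up to rounding. Note what information goes in: the exact differentials and who played whom, with no discretizing. A fuller version would estimate the variances from the data and perhaps add a home-court term or external ratings as predictors.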