Josh Menke writes:
I saw that you had commented on adjusted plus/minus statistics for basketball in a few of your blog entries [see also here]. I’ve been working on a Bayesian version of the model used by Dan Rosenbaum, and wondered if I could ask you a question.
I wanted to be able to update the posterior after each sequence of game play between substitutions, so I decided to use the standard exact inference update for a normal-normal Bayesian linear regression model. If you’re familiar with Chris Bishop’s recent book, Pattern Recognition and Machine Learning, the updating equations for this are 3.50 and 3.51 on page 153. I felt OK with using a normal prior based on some past research I did in multiplayer game match-making with Shane Reese at BYU. The tricky part comes with using exact inference for updating the posterior. The updating method is very sensitive to the prior covariance matrix. I start with a diagonal covariance matrix, and if the initial player variances I choose are too high, the +/- estimates can go to infinity after several updates. I thought this was related to the data sparsity causing an ill-conditioned update matrix, but I thought I’d ask in case you’d had any experience with this type of problem.
Have you dealt with an issue like this before? If I set the prior variances low enough, I get reasonable results, and the ordering of the final ranking is fairly robust to changes in the prior. It’s just the estimation process itself that doesn’t “feel” as robust as I’d prefer, so I don’t know that I trust the adjusted values (final coefficients) to be meaningful.
I don’t think I can use MCMC in this situation either, because trying to get 100,000 samples using 38,000+ data points and 400+ parameters feels intractable to me. I could be wrong there as well, since I suppose I only need to include the current players in each match-up within the log likelihood. But it would still take quite a bit of time.
It would also be nice to go with the sequential updating version if possible since I could provide adjusted +/- values instantly after each game, if not after each match-up.
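For reference, the normal-normal update mentioned above has the following standard form, written here in Bishop's notation. The symbols are the usual ones and are not spelled out in the email: m_0 and S_0 are the prior mean and covariance of the player coefficients, Phi is the design matrix for the new block of observations, t is the vector of observed outcomes, and beta is the noise precision, assumed known. In the sequential setting, the posterior from one stint serves as the prior for the next.

```latex
\begin{align}
\mathbf{m}_N &= \mathbf{S}_N \left( \mathbf{S}_0^{-1} \mathbf{m}_0 + \beta\, \boldsymbol{\Phi}^{\mathsf{T}} \mathbf{t} \right) \\
\mathbf{S}_N^{-1} &= \mathbf{S}_0^{-1} + \beta\, \boldsymbol{\Phi}^{\mathsf{T}} \boldsymbol{\Phi}
\end{align}
```

Written this way, the precision matrix is updated by simple addition, so a recursion that carries the precision forward never has to invert a nearly singular covariance at each step. Whether that is the source of the instability described in the email is not something the email settles.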
My reply:

1. I’d try the scaled inverse-Wishart prior distribution, as described in my book with Hill. This allows the correlations to be estimated from data in a way that still allows you to provide a reasonable amount of information about the scale parameters.
2. I’d go with the estimation procedure that gives reasonable estimates, then do some posterior predictive checks, as described in chapter 6 of Bayesian Data Analysis. (Sorry for always referencing myself; it’s just the most accessible reference for me!) This should give you some sense of the aspects of the data that are not captured well by the model.
3. Finally, you can simulate some fake data from your model and check that your inferential procedure gives reasonable estimates. Cook, Rubin, and I discussed a formal way of doing this, but you can probably do it informally and still build some confidence in your method.
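An informal version of the fake-data check in point 3 might look like the sketch below. Everything in it is an assumption made for illustration, not Josh's actual model or data: the number of players and stints, the +1/-1 coding of the ten players on court, the known noise standard deviation, and the function names `simulate_stints` and `sequential_posterior`. The update accumulates the posterior precision one stint at a time and solves for the mean only at the end.

```python
import numpy as np


def simulate_stints(n_players, n_stints, noise_sd, rng):
    """Fake data (illustrative assumption): each stint has 5 players coded +1
    and 5 coded -1; the outcome is the sum of their true effects plus noise."""
    true_beta = rng.normal(0.0, 2.0, size=n_players)
    X = np.zeros((n_stints, n_players))
    for i in range(n_stints):
        on_court = rng.choice(n_players, size=10, replace=False)
        X[i, on_court[:5]] = 1.0
        X[i, on_court[5:]] = -1.0
    y = X @ true_beta + rng.normal(0.0, noise_sd, size=n_stints)
    return X, y, true_beta


def sequential_posterior(X, y, prior_var, noise_sd):
    """Normal-normal update applied one observation at a time, carrying the
    precision matrix (inverse covariance) rather than the covariance."""
    n_players = X.shape[1]
    precision = np.eye(n_players) / prior_var  # prior precision, diagonal
    shift = np.zeros(n_players)                # precision times mean; prior mean is 0
    noise_precision = 1.0 / noise_sd**2
    for x_row, y_i in zip(X, y):
        precision += noise_precision * np.outer(x_row, x_row)
        shift += noise_precision * y_i * x_row
    mean = np.linalg.solve(precision, shift)   # posterior mean
    return mean, precision


rng = np.random.default_rng(0)
X, y, true_beta = simulate_stints(n_players=40, n_stints=4000, noise_sd=3.0, rng=rng)
mean, precision = sequential_posterior(X, y, prior_var=100.0, noise_sd=3.0)

# Compare estimates with the known truth from the simulation.
post_sd = np.sqrt(np.diag(np.linalg.inv(precision)))
coverage = np.mean(np.abs(mean - true_beta) < 2 * post_sd)
correlation = np.corrcoef(mean, true_beta)[0, 1]
print("share of true effects within 2 posterior sd:", coverage)
print("correlation of estimates with truth:", correlation)
```

Rerunning this with different values of `prior_var` gives a rough sense of how sensitive the estimates are to the prior when the truth is known. Note that in this coding every row sums to zero, so the data identify only differences among players; the overall level is set by the prior alone.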