One of my favorites, from 1995.
Don Rubin and I argue with Adrian Raftery. Here’s how we begin:
Raftery’s paper addresses two important problems in the statistical analysis of social science data: (1) choosing an appropriate model when so much data are available that standard P-values reject all parsimonious models; and (2) making estimates and predictions when there are not enough data available to fit the desired model using standard techniques.
For both problems, we agree with Raftery that classical frequentist methods fail and that Raftery’s suggested methods based on BIC can point in better directions. Nevertheless, we disagree with his solutions because, in principle, they are still directed off-target and only by serendipity manage to hit the target in special circumstances. Our primary criticisms of Raftery’s proposals are that (1) he promises the impossible: the selection of a model that is adequate for specific purposes without consideration of those purposes; and (2) he uses the same limited tool for model averaging as for model selection, thereby depriving himself of the benefits of the broad range of available Bayesian procedures.
Despite our criticisms, we applaud Raftery’s desire to improve practice by providing methods and computer programs for all to use and applying these methods to real problems. We believe that his paper makes a positive contribution to social science by focusing on hard problems where standard methods can fail and exposing those failures.
We follow up with sections on:
– “Too much data, model selection, and the example of the 3x3x16 contingency table with 113,556 data points”
– “How can BIC select a model that does not fit the data over one that does?”
– “Not enough data, model averaging, and the example of regression with 15 explanatory variables and 47 data points.”
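The puzzle in the second section title can be sketched numerically. With n as large as in the contingency-table example, BIC’s penalty of log(n) per parameter can outweigh a fit improvement that a classical test would call decisive. A minimal sketch (the deviances and parameter counts below are invented purely for illustration; only n comes from the example):

```python
import math

n = 113_556  # sample size from the 3x3x16 contingency-table example

# Hypothetical models: deviance is -2 log likelihood, k is the number
# of free parameters. These numbers are made up for illustration.
models = {
    "parsimonious": {"deviance": 1500.0, "k": 20},
    "complex": {"deviance": 1000.0, "k": 120},
}

for name, m in models.items():
    bic = m["deviance"] + m["k"] * math.log(n)
    print(f"{name}: BIC = {bic:.1f}")

# The complex model improves the deviance by 500 on 100 extra
# parameters -- a classical chi-squared test would reject the
# parsimonious model overwhelmingly -- yet the BIC penalty for those
# parameters, 100 * log(113556) (about 1164), exceeds the improvement,
# so BIC still prefers the model that does not fit.
```

The point is that log(n) grows with the data, so at this sample size BIC can favor a parsimonious model that every standard goodness-of-fit test rejects.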
And here’s something we found on the web with Raftery’s original article, our discussion and other discussions, and Raftery’s reply. Enjoy.