In an article catchily entitled “I got more data, my model is more refined, but my estimator is getting worse! Am I just dumb?”, Meng and Xie write:
Possibly, but more likely you are merely a victim of conventional wisdom. More data or better models by no means guarantee better estimators (e.g., with a smaller mean squared error), when you are not following probabilistically principled methods such as MLE (for large samples) or Bayesian approaches. Estimating equations are particularly vulnerable in this regard, almost a necessary price for their robustness. These points will be demonstrated via common tasks of estimating regression parameters and correlations, under simple models such as bivariate normal and ARCH(1). Some general strategies for detecting and avoiding such pitfalls are suggested, including checking for self-efficiency (Meng, 1994, Statistical Science) and adopting a guiding working model.
Using the example of estimating the autocorrelation ρ under a stationary AR(1) model, we also demonstrate the interaction between model assumptions and observation structures in seeking additional information, as the sampling interval s increases. Furthermore, for a given sample size, the optimal s for minimizing the asymptotic variance of ρ̂_MLE is s = 1 if and only if ρ^2 ≤ 1/3; beyond that region the optimal s increases at the rate of log^(−1)(ρ^(−2)) as ρ approaches a unit root, as does the gain in efficiency relative to using s = 1. A practical implication of this result is that the so-called “non-informative” Jeffreys prior can be far from non-informative even for stationary time series models, because here it converges rapidly to a point mass at a unit root as s increases. Our overall emphasis is that intuition and conventional wisdom need to be examined via critical thinking and theoretical verification before they can be trusted fully.
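Their sampling-interval claim is easy to poke at numerically. Here's a quick Python sketch, which is my own illustration rather than Meng and Xie's actual setup: instead of the full MLE, I use the lag-1 correlation of the subsampled series (which is AR(1) with coefficient ρ^s) and undo the power, holding the number of observations fixed as s grows. With ρ = 0.9 (so ρ^2 > 1/3), sampling every step should not be the most efficient choice:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(rho, n, rng):
    """Simulate n steps of a stationary Gaussian AR(1) with unit innovations."""
    x = np.empty(n)
    x[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - rho**2))  # stationary start
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal()
    return x

def estimate_rho(x, s):
    """Estimate rho from observations taken every s steps.
    The subsampled series is AR(1) with coefficient rho**s, so estimate
    its lag-1 correlation and take the s-th root."""
    y = x[::s]
    r = np.corrcoef(y[:-1], y[1:])[0, 1]
    return np.sign(r) * abs(r) ** (1.0 / s)

rho = 0.9     # rho^2 = 0.81 > 1/3, so s = 1 should no longer be optimal
n_obs = 200   # number of observations per estimate, held fixed across s
reps = 1000
results = {}
for s in (1, 2, 4, 8):
    est = np.array([estimate_rho(simulate_ar1(rho, n_obs * s, rng), s)
                    for _ in range(reps)])
    results[s] = est.std()
    print(f"s={s}: mean {est.mean():.3f}, sd {est.std():.4f}")
```

In this simulation the Monte Carlo standard deviation of the estimate shrinks as s increases past 1, consistent with the direction of their result, though the exact optimal s here needn't match theirs since this estimator is not the MLE.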
I’m very sympathetic to the argument that we have to be careful when imputing general statistical properties of a method based on past successes. I’m reminded of my (friendly) disputes with Adrian Raftery on Bayesian model selection. As Don Rubin and I wrote, “Raftery implies that the model with higher BIC will be expected to yield better out-of-sample predictions than any other model being compared. This implication is not generally true; there is no general result, either applied or theoretical, that implies this.” My guess is that Raftery was reasoning by analogy: he had a method derived from certain statistical principles and he just assumed that it would have other desirable properties. But, as Meng and Xie say, “intuition and conventional wisdom need to be examined via critical thinking and theoretical verification.”
The abstract of your paper reminds me of my Deep Thought paper. In addition, the world of time series is full of models that don’t make sense but are considered to be standard and acceptable. I think a key issue here is that econometricians are always afraid of cheating (also called “specification searches”). They distrust the idea of statistical data-based model-building (instead of what they prefer, which is a priori model building based on economic theory, or else fully nonparametric non-theoretically-based models). The statistical tradition of building a model using data with some theoretical support is not so popular in econometrics, as they worry that data-based modeling will violate statistical principles. I think this is why we often see economists running regressions on data “straight out of the box,” with minimal transformations of variables. Transforming is an opportunity to cheat. Similarly, they like AR or ARMA models with automatically-chosen lags because such models are objective and require no human input.
Here’s an example of a simple theoretically-based Bayesian model outperforming a default AR model. Cavan’s work was no surprise to the ecologists, but the time-series statisticians just couldn’t accept it. My impression was that they felt that the AR model was the game to be played and that it was cheating for a model to be built based on the structure of the problem.