Misunderstanding analysis of covariance

Jeremy Miles writes:

Are you familiar with Miller and Chapman’s (2001) article, “Misunderstanding Analysis of Covariance,” which says that ANCOVA (and therefore, I suppose, regression) should not be used when groups differ on a covariate? It has caused a moderate splash in psychology circles. I wondered if you had any thoughts on it.

I had not heard of the article so I followed the link . . . ugh! Already on the very first column of the very first page they confuse nonadditivity with nonlinearity. I could probably continue with, “and it gets worse,” but since nobody’s paying me to read this one, I’ll stop reading right there on the first page!
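For readers who want the two terms kept apart, here is one common way to write the distinction. This is only a sketch: the symbols (outcome y, group indicator z, covariate x, unspecified function f) are assumptions introduced here and appear neither in the post nor in the paper.

```latex
\begin{align*}
\text{additive and linear:}\quad
  & y_i = \alpha + \theta z_i + \beta x_i + \epsilon_i \\
\text{additive but nonlinear in } x\text{:}\quad
  & y_i = \alpha + \theta z_i + f(x_i) + \epsilon_i \\
\text{linear in } x \text{ but nonadditive:}\quad
  & y_i = \alpha + \theta z_i + \beta x_i + \gamma z_i x_i + \epsilon_i
\end{align*}
```

In the second model the group difference is still the constant \theta at every value of x; in the third it is \theta + \gamma x_i, so it changes with the covariate even though every term is a straight line in x.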

I prefer when people point me to good papers to read. . . .

10 thoughts on “Misunderstanding analysis of covariance”

  1. I once worked with a psychology PhD student in a clinical research institute, and he was sending me 2-3 articles a week from psychology journals that purported to explain how to do simple and _valid_ mediation analyses …

    The supply seemed to be endless and vampirical to anything critical of mediation analysis in the statistical literature.

    Fortunately there ended up being no discernible effect in their study and they decided to forgo a mediational analysis of the no effect.

  2. This paper has created a stir but it’s basically saying you do ANCOVA when you meet the assumptions of ANCOVA and it’s not a magical fix. Essentially, lots of psychologists were apparently violating the assumptions of independence between the covariate and the predictors.

  3. As someone who has referred errant colleagues to M&C2001, I was surprised to read you found fault with their paper. Looking back at it, I couldn’t discern the confusion you assert; care to clarify?

    • Oh, found it: “…ask whether some maturational change at that age makes a nonlinear contribution to basketball ability.”

      Hm, the mention of linearity seems out of nowhere, particularly as linearity is never mentioned again in the paper. In any event, as John says, the rest of the paper simply points out to psychologists that ANCOVA isn’t magic, that it has assumptions and if the data violate the assumptions, inferences are undermined.

      • i think you’re throwing out the baby with the bathwater here. if you continue to read they make what i took away as the more interesting and important point by explaining that trying to control for age makes no sense in this case. the broader message is a good one: THINK about what it means to control for a particular variable and be aware that statistical modeling cannot fix what are fundamentally substantive problems. confusing nonadditivity with nonlinearity has no bearing on the importance of this message.

  4. This type of paper—a non-mathematical discussion of the properties of estimators—is simply painful to read. If they attempted even a handful of equations it would have greatly clarified whatever point it is they’re trying to make.

    Echoing previous comments, it would appear that what they’re trying to say is: in a linear fixed effects regression, the coefficients on the covariates are inconsistent if the fixed effects are correlated with the covariates. That is simply wrong. The example on the bottom of page 44 about height and weight is wrong: they appear to think that the covariates in a regression model must be uncorrelated for the model to generate consistent estimates, because otherwise sweeping out the fixed effects removes some of the variation in the covariates, which they wrongly think generates some fundamental problem. They then cite an “extreme example” to illustrate the point, but the extreme example (mountain height and air pressure) is one of *perfect* collinearity. They appear to argue that the fact that we cannot estimate models in the presence of perfect collinearity implies that we cannot estimate models in the presence of any collinearity.

    It is hard to make sense of what they’re trying to say since verbal exposition of mathematical arguments is opaque, but as far as I can tell from a cursory reading, this paper is flat-out wrong.
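    The statistical claim in this comment can be checked with a small simulation. What follows is only a sketch under stated assumptions: the data are simulated, the true model is assumed to be additive and linear, and every number and variable name (group, x, true_coef, x_exact) is hypothetical rather than taken from the paper. It fits ordinary least squares when the covariate is correlated with group membership, then contrasts that with the extreme case in which the covariate is an exact linear function of group.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 20_000

# Hypothetical data: a two-group indicator and a covariate whose mean differs by group.
group = rng.integers(0, 2, size=n).astype(float)
x = 1.0 * group + rng.normal(size=n)

# Assumed true model: additive and linear, with known coefficients.
true_coef = np.array([2.0, 0.5, 1.5])  # intercept, group effect, covariate slope
X = np.column_stack([np.ones(n), group, x])
y = X @ true_coef + rng.normal(size=n)

# Ordinary least squares using both the group indicator and the covariate.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
corr = np.corrcoef(group, x)[0, 1]

# Extreme case: the covariate is an exact linear function of group.
x_exact = 3.0 - 2.0 * group
X_exact = np.column_stack([np.ones(n), group, x_exact])

print("corr(group, x):", round(corr, 2))
print("estimates:", np.round(coef, 2), "truth:", true_coef)
print("rank with correlated covariate:", np.linalg.matrix_rank(X))
print("rank with exactly collinear covariate:", np.linalg.matrix_rank(X_exact))
```

    Under these assumptions the estimates land close to the true coefficients even though group and covariate are clearly correlated, while the exactly collinear design matrix loses a rank, so its coefficients are not separately identified. The simulation speaks only to estimation when the model is correctly specified; it says nothing about whether adjusting for the covariate answers the substantive question.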

  5. I’ve always had mixed feelings about “Misunderstanding Analysis of Covariance” as it makes some interesting arguments in the “beyond statistics” bit. I think the default assumption that a covariate acts as a statistical control (which they criticize) is problematic. This is really a philosophical issue: sometimes use of ANCOVA as a statistical control makes sense and other times it does not. Amusingly (for me) I just cited Miller and Chapman in my forthcoming book and offered Gelman and Hill as a counter-argument.

    I don’t think they are “flat out wrong” because I don’t think the crucial point they are making is a statistical one. (They may be wrong on other points.) However, I think the criticism of the clarity of the arguments is valid. The IQ and brain damage example is a better one – you can’t equate two groups (one with brain damage and one without) by covarying IQ because IQ impairment is likely to be an intrinsic part of brain damage. This is really just a question of what statistical model fits the question of interest – and naive views of ANCOVA as a statistical control are mistaken. That said, it can be used to remove or reduce confounding and Gelman and Hill’s discussion (e.g., considering partial overlap on the covariate etc.) is a better way of looking at it.

    Thom
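    The IQ and brain damage point can be illustrated with a toy simulation. This is a sketch only: the causal structure (damage lowers IQ, and both affect the outcome) is assumed for illustration, and all names and numbers (damage, iq, outcome, the effect sizes) are hypothetical, not drawn from the paper or from any data.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable
n = 20_000

# Assumed toy structure: damage lowers IQ, and both damage and IQ affect the outcome.
damage = rng.integers(0, 2, size=n).astype(float)
iq = 100.0 - 15.0 * damage + rng.normal(scale=10.0, size=n)
outcome = 50.0 - 2.0 * damage + 0.5 * iq + rng.normal(scale=5.0, size=n)


def ols(columns, y):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Group difference without the covariate: the total effect of damage,
# which in this toy model is -2.0 + 0.5 * (-15.0) = -9.5.
unadjusted = ols([damage], outcome)[1]

# Group difference after covarying IQ: only the part of the effect
# that does not run through IQ, which in this toy model is -2.0.
adjusted = ols([damage, iq], outcome)[1]

print("unadjusted group difference:", round(unadjusted, 2))
print("IQ-adjusted group difference:", round(adjusted, 2))
```

    Both numbers are estimated correctly for what they are. Which one answers the question of interest depends on whether IQ loss is regarded as part of what brain damage does, and that is a substantive decision the regression cannot make.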
