Why I don’t use the term “fixed and random effects”

People are always asking me if I want to use a fixed or random effects model for this or that. I always reply that these terms have no agreed-upon definition. People with their own favorite definition of “fixed and random effects” don’t always realize that other definitions are out there. Worse, people conflate different definitions.

Five definitions

Here are the five definitions I’ve seen:

(1) Fixed effects are constant across individuals, and random effects vary. For example, in a growth study, a model with random intercepts a_i and fixed slope b corresponds to parallel lines for different individuals i, or the model y_it = a_i + b t (see the code sketch following this list of definitions). Kreft and De Leeuw (1998) thus distinguish between fixed and random coefficients.

(2) Effects are fixed if they are interesting in themselves or random if there is interest in the underlying population. Searle, Casella, and McCulloch (1992, Section 1.4) explore this distinction in depth.

(3) “When a sample exhausts the population, the corresponding variable is fixed; when the sample is a small (i.e., negligible) part of the population the corresponding variable is random.” (Green and Tukey, 1960)

(4) “If an effect is assumed to be a realized value of a random variable, it is called a random effect.” (LaMotte, 1983)

(5) Fixed effects are estimated using least squares (or, more generally, maximum likelihood) and random effects are estimated with shrinkage (“linear unbiased prediction” in the terminology of Robinson, 1991). This definition is standard in the multilevel modeling literature (see, for example, Snijders and Bosker, 1999, Section 4.2) and in econometrics.
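
To make definitions 1 and 5 concrete, here is a minimal sketch in Python (not from the paper): it simulates the growth model y_it = a_i + b t of definition 1 and then estimates the intercepts a_i two ways, by per-person least squares and by shrinkage toward the overall mean. For brevity it treats the slope and the variance components as known; in practice they would themselves be estimated, for example by fitting a multilevel model.

```python
# A rough sketch contrasting the two estimates in definition 5 for the
# growth model of definition 1: y_it = a_i + b*t.
import numpy as np

rng = np.random.default_rng(0)

# Simulate parallel growth curves: one intercept per person, a common slope.
n_people, n_times = 8, 5
b_true = 1.2                  # shared slope
mu_a, sigma_a = 10.0, 2.0     # population mean and sd of the intercepts a_i
sigma_y = 1.5                 # residual sd
t = np.arange(n_times)
a_true = rng.normal(mu_a, sigma_a, size=n_people)
y = a_true[:, None] + b_true * t + rng.normal(0.0, sigma_y, size=(n_people, n_times))

# "Fixed" in the sense of definition 5: least-squares estimate of each a_i,
# here just the within-person mean of y_it - b*t (slope taken as known).
a_ls = (y - b_true * t).mean(axis=1)

# "Random" in the sense of definition 5: shrink each least-squares estimate
# toward the overall mean, weighting the data precision n/sigma_y^2 against
# the population precision 1/sigma_a^2. The true variance components are
# used only to keep the sketch short; normally they would be estimated.
w = (n_times / sigma_y**2) / (n_times / sigma_y**2 + 1 / sigma_a**2)
a_shrunk = w * a_ls + (1 - w) * a_ls.mean()

print("least-squares intercepts:", np.round(a_ls, 2))
print("shrunken intercepts:     ", np.round(a_shrunk, 2))
```

The shrinkage weight is the usual precision-weighted compromise: with many observations per person the two sets of estimates nearly coincide, and with only a few observations they can differ quite a bit.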

The definitions are all different!

Of these definitions, the first clearly stands apart, but the other four definitions differ also. Under the second definition, an effect can change from fixed to random with a change in the goals of inference, even if the data and design are unchanged. The third definition differs from the others in defining a finite population (while leaving open the question of what to do with a large but not exhaustive sample), while the fourth definition makes no reference to an actual (rather than mathematical) population at all. The second definition allows fixed effects to come from a distribution, as long as that distribution is not of interest, whereas the fourth and fifth do not use any distribution for inference about fixed effects. The fifth definition has the virtue of mathematical precision but leaves unclear when a given set of effects should be considered fixed or random. In summary, it is easily possible for a factor to be “fixed” according to some of the definitions above and “random” for others.

Because of these conflicting definitions, it is no surprise that “clear answers to the question ‘fixed or random?’ are not necessarily the norm” (Searle, Casella, and McCulloch, 1992, p. 15).

In summary . . .

If you use a particular definition of fixed and random effects, don’t automatically assume that the other definitions apply also. For example, if an effect is interesting in itself (see definition 2), it is not necessary to estimate it using least squares (see definition 5).

(See also Kreft and De Leeuw, 1998, Section 1.3.3, for a discussion of the multiplicity of definitions of fixed and random effects and coefficients, and Robinson, 1991, for a historical overview.)

The paper

This is all taken from my paper, “Analysis of variance: why it is more important than ever” (with discussion), Annals of Statistics, 2005.

Or it may be more fun to start with my rejoinder to the discussion.

References

Kreft, I., and De Leeuw, J. (1998). Introducing Multilevel Modeling. London: Sage.

Searle, S. R., Casella, G., and McCulloch, C. E. (1992). Variance Components. New York: Wiley.

Green, B. F., and Tukey, J. W. (1960). Complex analyses of variance: general problems. Psychometrika 25, 127–152.

LaMotte, L. R. (1983). Fixed-, random-, and mixed-effects models. In Encyclopedia of Statistical Sciences, ed. S. Kotz, N. L. Johnson, and C. B. Read, vol. 3, 137–141.

Snijders, T. A. B., and Bosker, R. J. (1999). Multilevel Analysis. London: Sage.

Robinson, G. K. (1991). That BLUP is a good thing: the estimation of random effects (with discussion). Statistical Science 6, 15–51.

Comments

  1. I'm almost sure SAS uses definition (5). That's fine, as long as the user doesn't naively assume that definitions (2), (3), and (4) apply also. (For example, a bad thing, in my opinion, is to say "I'm interested in so-and-so [definition 2] and thus I'll include it as a fixed effect in SAS [definition 5].")

  2. Are any of these definitions consistent with the following definition:

    1. An effect is fixed if the effect is correlated with the error term.

    2. An effect is random if the effect is not correlated with the error term.

  3. Eric,

    No.

    Definition 1 distinguishes constant from varying coefficients. Constant coefficients are, by definition, not correlated with anything. Varying coefficients can be correlated or not correlated, depending on the model.

    Definition 2 is about whether coefficients are interesting in themselves. Coefficients can be individually interesting, or not interesting, whether or not their distribution is correlated with the error term.

    Definition 3 is about whether the sample exhausts the population. This can occur, or not occur, with correlated or uncorrelated models.

    And so forth.

  4. Perhaps Eric means:
    1. An effect is fixed if it is correlated with the predictors.

    2. It is random if not correlated with the predictors.

    I have seen such a definition in Econometric Analysis (William H. Greene).
    But this definition seems to contradict Prof. Gelman's paper, "Fitting Multilevel Models When Predictors and Group Effects Correlate," in which random effects are allowed to correlate with predictors.

  5. Predictor variables exhibit fixed effects. And the fixed effects (of the predictors) on the response variable(s) are the observations.

  6. I thought I understood this material, but now I am totally confused. I started out trying to get a consistent idea of the concept of nested ANOVA.
    I found some of the definitions so difficult to understand, and seemingly different from one another, that this then led to the notion of fixed and random factors.
    I had always viewed fixed factors as variables where the experimenter can set the levels, so that repeating the experiment would mean you could use the same levels over and over again, whereas random implied the "variable" was indeed a random variable over which you have no control. At least this article has helped me realise there are different definitions.
