In August, I announced a break from blogging, and this is my first new post since then (not counting various interpolated topical items on polling, elections, laughable surveys comparing North Carolina to North Korea, junk science on pizza prices, etc.).
I’m still trying to figure out how to do this; I have a file chock full of blogging material and I have this idea of just sitting down and writing 365 posts to get the whole year out of the way. But I guess I’ll start with one a day for a while.
This is the first.
So. Several years ago the American Journal of Sociology asked me to write a comment for a paper they were running on age-period-cohort analysis. What happened was that Yang et al. had written a paper on disentangling these additive terms, and Steve Fienberg and someone else, I think, were saying that Yang et al. were wrong, and I was brought in for an outside opinion.
I wrote my comment—here it is—pretty much agreeing with Fienberg but expressing some sympathy for Yang et al. because age-period-cohort decomposition is something that in theory seems completely impossible but can actually be done in the context of real applications where there is prior information that can be used to parameterize and structure the answer. So I didn’t want to be completely negative, even though the solution of Yang et al., which attempts to be completely automatic, can’t possibly work.
For some reason the AJS decided they weren’t running my discussion. Soooo annoying. I hate when that happens. I write a paper to do someone a favor and then they decide they don’t want it. In the meantime, I did some more age-period-cohort analysis with Yair, though, so I guess that this earlier work wasn’t entirely wasted.
Anyway, I’d pretty much forgotten that episode until the other day when Chris Winship sent me this paper by Liying Luo, James Hodges, Daniel Powers, and himself, which appeared in AJS recently. Here’s the punchline:
“The Intrinsic Estimator for Age-Period-Cohort Analysis” (Yang et al. 2008) has been cited 189 times as of October 2016 and has been used by researchers in different disciplines to address important substantive questions. Many researchers appear convinced that the assumptions implicit in the IE do not affect the IE’s ability to estimate, even if only approximately, the “true” age, period, and cohort effects . . . The empirical and mathematical results presented in this comment contradict that optimistic view. . . .
Social scientists have long looked for statistical methods that will provide assumption-free results revealing the underlying structure of empirical data. As with causal analysis of observational data . . . we believe this is an impossible goal. Heckman and Robb (1985) stated the situation correctly nearly three decades ago: “The age-period-cohort effect identification problem arises because analysts want something for nothing: a general statistical decomposition of data without specific subject matter motivation underlying the decomposition. . . .”
I agree. Again, I did not feel comfortable completely shooting down Yang et al. because I could imagine that their structure could be useful for researchers who want to include subject-matter knowledge in their inferences, but, sure, if it’s take-it-or-leave-it on Yang et al., I’d have to say, leave it.
But then there’s more to the story. A different set of authors, Kenneth Land, Qiang Fu, Xin Guo, Sun Jeon, Eric Reither, and Emma Zhang (a group that includes one of the authors of the earlier Yang et al. paper) responded with an article in AJS. Their response had the aggressive title, “Playing with the rules and making misleading statements,” and they continued to fire with both barrels in the text:
Luo et al. then claim to raise “concerns about the robustness” and thus usefulness of the IE by showing that IE estimates can be “highly sensitive” to a researcher’s choice of coding scheme or model parameterization. In this response, we find these “concerns” to be based on misinterpretations, misunderstandings, and misrepresentations of the IE and, accordingly, misleading.
What’s with all the scare quotes, huh? Kinda iffy to be going against high-powered methodologists like Chris Winship and Steve Fienberg!
But let’s get to the conclusion of Land et al.:
The APC [age-period-cohort] accounting/multiple classification model is not identified unless we impose one additional constraint whose validity cannot be tested with any data. Just as there are infinitely many generalized inverse matrices to calculate the coefficient vector of this model, there are infinitely many corresponding possible constraints. . . . Given this, researchers should not use the IE [that estimate from Yang et al., 2008], or any other constraint such as equality-of-coefficients or a cohort characteristic proxy constraint, without careful thought about whether it is reasonable in their particular context.
I think we’re all in agreement on that point. The key point of disagreement is on whether the Yang et al. method is useful in a wide range of applications.
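To make the point of agreement concrete, here is a minimal sketch of the identification problem in a toy additive model, y[a, p] = mu + alpha[a] + beta[p] + gamma[c], with cohort index c = p - a + (n_age - 1). This is my own illustration, not code from any of the papers; the function names `apc_design` and `shift_linear_trend`, the 5-by-5 table, and the random coefficients are all made up for the example. It shows that you can add a linear trend to the age effects, subtract it from the period effects, add it to the cohort effects, and end up with exactly the same fitted values, so the data alone cannot choose between the two sets of effects.

```python
import numpy as np


def apc_design(n_age, n_period):
    """Indicator design matrix for y[a, p] = mu + alpha[a] + beta[p] + gamma[c],
    where the cohort index is c = p - a + (n_age - 1)."""
    n_cohort = n_age + n_period - 1
    rows = []
    for a in range(n_age):
        for p in range(n_period):
            c = p - a + (n_age - 1)
            row = np.zeros(1 + n_age + n_period + n_cohort)
            row[0] = 1.0                          # intercept
            row[1 + a] = 1.0                      # age indicator
            row[1 + n_age + p] = 1.0              # period indicator
            row[1 + n_age + n_period + c] = 1.0   # cohort indicator
            rows.append(row)
    return np.array(rows)


def shift_linear_trend(coef, n_age, n_period, t):
    """Move a linear trend of slope t between the three sets of effects.

    Because c = p - a + (n_age - 1), the added terms cancel for every cell:
    t*a - t*p + t*c - t*(n_age - 1) = 0.
    """
    n_cohort = n_age + n_period - 1
    out = coef.copy()
    out[0] -= t * (n_age - 1)                                    # intercept
    out[1:1 + n_age] += t * np.arange(n_age)                     # age effects
    out[1 + n_age:1 + n_age + n_period] -= t * np.arange(n_period)  # period effects
    out[1 + n_age + n_period:] += t * np.arange(n_cohort)        # cohort effects
    return out


n_age, n_period = 5, 5
X = apc_design(n_age, n_period)

# Arbitrary "true" effects, purely for illustration.
rng = np.random.default_rng(0)
coef = rng.normal(size=X.shape[1])
coef_alt = shift_linear_trend(coef, n_age, n_period, t=0.7)

same_fit = np.allclose(X @ coef, X @ coef_alt)
different_coefs = not np.allclose(coef, coef_alt)
rank = np.linalg.matrix_rank(X)

print("identical fitted values:", same_fit)
print("different coefficients:", different_coefs)
print("columns:", X.shape[1], "rank:", rank)
```

With full indicator coding, some of the rank deficiency is just the usual level shifts that any sum-to-zero or reference-category coding removes. The linear-trend direction is the part that remains after that, and it is the one that any extra constraint, the IE’s or anyone else’s, has to resolve by assumption.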
If you go to the conclusion of the original Yang et al. (2008) paper, you’ll see some mixed messages. First a full-steam-ahead:
Glenn (2005, p. 20) has stated several strong criteria for judging the acceptability and utility of a general-purpose method of APC analysis. The IE appears to satisfy these criteria. As shown here, the IE has passed both empirical and simulation tests of validity and can be used to test theoretically motivated hypotheses and to incorporate and test side information from other studies. The IE therefore may provide a useful tool for the accumulation of scientific knowledge about the distinct effects of age, period, and cohort categories in social research. Indeed, since the APC underidentification problem is an instance of a larger family of such structural issues, the potential range of application for the IE may be even larger.
But then . . . whoa, maybe not. Yang et al. continue:
Does this mean that researchers should naively apply this method to tables of rates and expect to obtain meaningful results? Again, no. Every statistical model has its limits and will break down under some conditions. APC analysis is well known to be treacherous, for reasons articulated by Glenn (2005), and should, in all cases, be approached with great caution and an awareness of its many pitfalls.
For now, I’ll go with Luo/Hodges/Winship/Powers, who agree with Heckman/Robb and Fienberg/Mason before them.