Racial classification sociology controversy update

The other day I posted on a controversy in sociology where Aliya Saperstein and Andrew Penner analyzed data from the National Longitudinal Survey of Youth, coming to the conclusion “that race is not a fixed characteristic of individuals but is flexible and continually negotiated in everyday interactions,” but then Lance Hannon and Robert DeFina argued that those claims were mistaken, the product of a data analysis that did not properly account for measurement error. I did not check out the details of either side’s analysis, but Hannon and DeFina’s criticism made sense to me, in that it did seem that measurement error could produce the artifacts of which they wrote.

As I wrote in my post, the story so far seems consistent with the common pattern in which a research team finds something interesting and then, rather than trying to shoot down their own idea, they try to protect it, running robustness studies with the goal of confirming their hypothesis, responding to criticisms defensively, and so forth. Just par for the course, and it’s a good thing that the journal Sociological Science was there to publish Hannon and DeFina’s article.

Anyway, we had some good blog discussion, including this detailed comment by Aaron Gullickson:

Given that I [Gullickson] am a co-author with Saperstein on another paper on racial fluidity (in this case switching between “black” and “mulatto” in the 19th-century US South), I have given some serious thought to what exactly is going on here to produce bias, since it is not particularly well spelled out by Hannon and DeFina. Based on simulations, I believe the potential for bias from reporting error exists in the models that control for prior racial identification (models 1-3 above) but not in the fixed-effects model (model 4). Admittedly, the fixed-effects estimate is smaller here, and if you look at some of S&P’s other work, it tends to be less robust across different dependent variables and model specifications, but I think it is too early to start throwing out the results entirely. It’s also difficult to know what proportion of switchers are actually data errors (although I have seen a draft of S&P’s AJS response, and I think they have some clever ways of thinking about it, but I won’t steal their thunder).

Here is a replicable simulation in R outlining the source of the bias: http://pages.uoregon.edu/aarong/reportingerror.html
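
The linked page spells out the details; here is a stripped-down sketch of the kind of bias Gullickson is describing, not his code (the data-generating process, error rate, and variable names are all assumptions made up for illustration). A model that controls for the prior classification picks up a spurious “effect” of incarceration, because the lagged report is an error-prone proxy for a fixed characteristic; a two-wave fixed-effects (first-difference) comparison does not.

    set.seed(1)
    n <- 20000
    true_black <- rbinom(n, 1, 0.3)                 # fixed "true" classification
    err <- 0.05                                     # random reporting-error rate (assumed)
    flip <- function(x) ifelse(runif(length(x)) < err, 1 - x, x)

    race1 <- flip(true_black)                       # observed classification, wave 1
    race2 <- flip(true_black)                       # observed classification, wave 2
    inc1  <- rbinom(n, 1, plogis(-2 + 1.5 * true_black))  # incarceration, wave 1
    inc2  <- rbinom(n, 1, plogis(-2 + 1.5 * true_black))  # incarceration, wave 2

    # Lagged-classification model: incarceration looks like it shifts the
    # classification, only because race1 is a noisy proxy for true_black.
    coef(summary(lm(race2 ~ inc2 + race1)))

    # Two-wave fixed-effects (first-difference) model: with purely random
    # reporting error, changes in incarceration do not predict changes in
    # classification.
    coef(summary(lm(I(race2 - race1) ~ I(inc2 - inc1))))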

I don’t quite understand the “steal their thunder” thing, as Saperstein and Penner could post a preprint of their American Journal of Sociology response right now. No need to wait until formal publication; why not get the response out right away, I’d think.

Anyway, I sent a message to Hannon asking if he had any response to the above comment by Gullickson. Here’s how Hannon replied:

Gullickson seems to be making an argument that is consistent with our call for greater use of fixed-effects models in this literature. I don’t agree with his suggestion that accounting for measurement error in racial fluidity studies means a total focus on missing completely at random because “Anything else gets into the tricky question of what measurement error really means anyway on a variable with no real true response.” I think non-random measurement error matters too in this context. More generally, I hope that researchers in this area eventually move away from the idea that when an effect cannot be explained by random error, it must be due to their particular hypothesized causal mechanism (racial priming and the internalization of stereotypes). There are ways to more directly test their mechanism.
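
To see why non-random error matters, here is a hedged extension of the sketch above (again, the error process is entirely an assumption, chosen only for illustration): if reports are mistakenly pushed toward “black” more often when the respondent has been incarcerated, even the within-person comparison shows an “effect,” with no internalization of stereotypes anywhere in the simulation.

    # Continuing the simulation above, but now the reporting error is non-random:
    # a report is mistakenly recorded as "black" with probability 0.02 + 0.08 * incarcerated.
    err_black <- function(truth, inc) {
      ifelse(runif(length(truth)) < 0.02 + 0.08 * inc, 1, truth)
    }
    race1b <- err_black(true_black, inc1)
    race2b <- err_black(true_black, inc2)

    # The fixed-effects estimate is no longer near zero, even though true_black never changes.
    coef(summary(lm(I(race2b - race1b) ~ I(inc2 - inc1))))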

I’m happy that the discussion of this paper is centering on measurement issues. As I’ve written many times, I think the importance of careful measurement is underrated in statistics and in social science.

Also, Hannon said one thing I like so much I’ll repeat it right here:

More generally, I hope that researchers in this area eventually move away from the idea that when an effect cannot be explained by random error, it must be due to their particular hypothesized causal mechanism.

Yes. Rejection of the straw-man null should not be taken as acceptance of the favored alternative.

11 thoughts on “Racial classification sociology controversy update”

  1. I want to make a comment about the controls for interviewer characteristics. I’m no expert on this, but in my methods class the students do group projects related to this. So, e.g., one group takes interviewer race as a topic and one takes interviewer gender. As a result I’ve looked at a lot of cross tabs of this with different variables.

    Here are two versions of the same data from the 2012 GSS (via GSS Data Explorer)

     This uses the GSS self-reported and then classified race variable for the respondent.
     Column percentages, by RACE (race of respondent):

     INTETHN (race of interviewer)   White   Black   Other   Total
     White                             77%     64%     68%     74%
     Black                             10%     23%     13%     12%
     Hispanic                           4%      2%     10%      4%
     Two or more races                  9%     11%      8%      9%
     Total                            100%    100%    100%    100%

     The same data, now with column percentages by INTETHN (race of interviewer):

     RACE (race of respondent)   White   Black   Hispanic   Two or more races   Total
     White                         76%     59%        69%                 72%     74%
     Black                         13%     28%         6%                 17%     15%
     Other                         11%     13%        25%                 10%     12%
     Total                        100%    100%       100%                100%    100%

    Smith, Tom W., Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972-2014 [machine-readable data file]. Principal Investigator, Tom W. Smith; Co-Principal Investigators, Peter V. Marsden and Michael Hout; sponsored by the National Science Foundation. NORC ed. Chicago: NORC at the University of Chicago [producer and distributor]. Data accessed from the GSS Data Explorer website at gssdataexplorer.norc.org.

    Wowee, significant chi-square there! (A sketch of the cross-tab and test in R is below.)
    So sure, there’s probably some shifting, but a lot has to do with the demographics of the interviewers in an area and how they relate to the demographics of the respondents in that area, and the two are, to put it briefly, pretty darn strongly related. Interviewers in NYC? Lots of African American women. Interviewers in Nebraska? Mainly white women. Interviewed while incarcerated in NY State? Prisons are all upstate, and hence interviewers are more likely to be white than the first time you were interviewed.
    But the key thing is that same-race dyads are more likely than what you’d expect by chance. Which means it is not so simple to disentangle.
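
    For what it’s worth, here’s roughly how one could reproduce the cross-tab and the test in R. This is just a sketch: it assumes a GSS extract sitting in a data frame called gss with the RACE and INTETHN variables shown above; the variable names come from the tables, everything else is an assumption.

        tab <- table(gss$INTETHN, gss$RACE)       # interviewer race by respondent race
        round(100 * prop.table(tab, margin = 2))  # column percentages, as in the first table
        chisq.test(tab)                           # the test behind the "significant chi square"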

  2. I think “rejection of the null is not acceptance of your hypothesis” should be the sole content of perhaps the first 5-7 weeks of intro to stats.

    • Nevertheless, that’s how we write our conclusions when we use p-values (and this includes statisticians). p<0.05 for us means that there is evidence for the specific alternative.

      • People informally use the sample mean and its CI to make the leap. I think I can live with that as long as it’s clear that it’s not the p-value that’s relevant here. This may be partly why Doug Bates removed p-value output from lmer.

        Actually, for standard hierarchical psych-type data, even when I fit Stan models, I start with lmer and get a sense of what the frequentist means and CIs will be (a sketch of this workflow is below). That gives me a pretty good guideline for where the Bayesian analysis would take me. After a couple years of doing this, it’s pretty clear that if a frequentist CI is just borderline away from 0, the Bayesian credible interval will usually cross 0—Andrew’s point about credible intervals (which I think I saw in Gelman and Hill a long time back) being a bit more conservative. The big gain in the Bayesian analysis is better-quality estimates of the variance components; especially with LKJ priors, Stan is by far the better (if slower) way.

        I sometimes end up writing papers with people who barely know what a frequentist model is giving you; switching the paper to a Bayesian analysis would cause their brains to explode. So I stick with the frequentist analysis, and the sample mean (I mean the estimated coefficients in a hierarchical linear model) and the associated CI are good enough for me as a poor man’s posterior. When you think of the overhead involved in understanding Bayes, this seems good enough to me in such situations.
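
        For concreteness, a minimal sketch of that lmer-then-Stan workflow (the formula, the data frame d, and the use of brms as the Stan interface are just placeholders, not anyone’s actual analysis):

            library(lme4)
            library(brms)

            # Frequentist hierarchical model first, to see where the means and CIs land.
            m_freq <- lmer(rt ~ cond + (1 + cond | subj), data = d)
            fixef(m_freq)                      # point estimates for the fixed effects
            confint(m_freq, parm = "beta_")    # profile CIs for the fixed effects

            # Then the Bayesian version, with an LKJ prior on the random-effect correlations.
            m_bayes <- brm(rt ~ cond + (1 + cond | subj), data = d,
                           prior = set_prior("lkj(2)", class = "cor"))
            fixef(m_bayes)                     # posterior means and 95% intervals

        The fixed-effect intervals from the two fits usually land close together; the payoff, as noted above, is in the variance components.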

      • It really means, in a way, that because your theoretical or other model is strong enough to be the only reasonable alternative, rejecting the null by negation amounts to accepting the not-null, and since there is really only one not-null, or HA, there we are. However, how often do we really think that the only possible realities are the null and the one presented in the current paper? In my experience in sociology, almost never.

        • We definitely act as if there is only one alternative. I’m not talking about sociology though—that sounds like a very difficult area to do statistical inference in.

  3. I agree that rejection of the straw-man null should not be taken as acceptance of the favored alternative. Please don’t put me up as the straw man who thinks such a thing. The simulation is used to address a specific form of measurement bias hypothesized by Hannon et al. and to determine whether it exists in within-person comparisons. Of course, the results could hypothetically be explained by other forms of measurement error, omitted variable bias, etc. That is true of all research all the time.

    As to the “stealing thunder” bit, my point was that S&P took the time to write a response, and I don’t think it would be professional of me to summarize the main point of that response based on having seen a draft before it’s even published.

    • Aaron:

      Yes, I respect that you did not want to convey the details of Saperstein and Penner’s comment before they publish it. What surprises me is that they would hold it back. Once it’s written, why not post it right away so all can see?

      • AJS/University of Chicago Press is pretty aggressively adamant that authors not pre-print articles and comments. It’s an infuriating policy, but likely the reason you have not seen their comment (it’s why ours isn’t available). Plus, as we have not pre-printed the comment, it would seem a little odd to have the response to our comment, but not our comment. It is scheduled for the July issue, fwiw.
