Question about standard range for social science correlations

Andrew Eppig writes:

I’m a physicist by training who is transitioning to the social sciences. I recently came across a reference in the Economist to a paper on IQ and parasites which I read as I have more than a passing interest in IQ research (having read much that you and others (e.g., Shalizi, Wicherts) have written). In this paper I note that the authors find a very high correlation between national IQ and parasite prevalence. The strength of the correlation (-0.76 to -0.82) surprised me, as I’m used to much weaker correlations in the social sciences. To me, it’s a bit too high, suggesting that there are other factors at play or that one of the variables is merely a proxy for a large number of other variables. But I have no basis for this other than a gut feeling and a memory of a plot on Language Log about the distribution of correlation coefficients in social psychology.

So my question is this: Is a correlation in the range of (-0.82,-0.76) more likely to be a correlation between two variables with no deeper relationship or indicative of a missing set of underlying variables?

My reply:

First off, I don’t think you can ever distinguish between correlations of .76 and .82 in this sort of situation, so let’s just call it .8.
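To make that concrete, here is a rough sketch in R of the sampling uncertainty around a correlation of .8, using the Fisher z-transformation. The sample size n = 100 is a hypothetical stand-in (the number of countries in the paper is not stated here), and fisher_ci is an illustrative helper, not anything from the paper.

```r
# Approximate 95% interval for a correlation via the Fisher z-transformation.
# fisher_ci is a hypothetical helper; n = 100 is an assumed sample size.
fisher_ci <- function(r, n, level = 0.95) {
  z <- atanh(r)                       # Fisher z-transform of r
  se <- 1 / sqrt(n - 3)               # approximate standard error of z
  crit <- qnorm(1 - (1 - level) / 2)  # normal critical value
  tanh(c(lower = z - crit * se, upper = z + crit * se))
}

ci <- fisher_ci(r = 0.8, n = 100)
print(round(ci, 2))
```

Under that assumed n, the interval covers both .76 and .82, so the two values are not meaningfully different; a smaller sample would make the interval wider still.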

Second, I certainly agree that other factors could be involved. I don’t think you can treat the high correlations as evidence against their argument. I’m not knowledgeable enough in this area to assess their hypotheses; I guess I’d be interested in hearing the thoughts of someone such as Wicherts who’s more of an expert in this area.

Finally, are you related to the first author of the linked article, or is it just that you did a search on Eppig and encountered this stuff?

P.S. Eppig responds:

I am in fact related to the first author of the study — he’s my brother.

Since my first question, I’ve been wondering about how to interpret the results of a regression when some of the independent variables have been imputed via regression. So if I have a model:

    fit.1 <- lm(y ~ x1 + x2 + x3 + x4 + ... + xn)

where x1 has had its missing values imputed using:

    fit.2 <- lm(x1 ~ x2 + x3 + ... + xn)

Are there extra considerations required in interpreting the model fit.1? Can one read off the coefficient values and errors from fit.1 as one would in a "regular" (i.e., where no imputation had been performed) model? Naively, I feel that the errors in xn are now correlated with the other independent variables and a simple linear regression is no longer appropriate/valid. Are the coefficients of x1, x2,..., xn valid but the errors invalid?

In response to this later question: This is called a measurement-error model or a simultaneous-equations model. In general you want to fit both models together or, more broadly, to model all the variables jointly. That said, in practice I’ll typically just take the imputed x-values as exact and not think too hard about it.
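The shortcut of treating imputed values as exact can be sketched in R as follows. This is a minimal illustration on simulated data, not the analysis from the paper: the variable names follow the question, but the sample size, coefficients, and 30% missingness rate are all assumptions made up for the example.

```r
# Simulated data; sample size, coefficients, and missingness are assumptions.
set.seed(1)
n <- 200
x2 <- rnorm(n)
x3 <- rnorm(n)
x1 <- 0.5 * x2 - 0.3 * x3 + rnorm(n)
y  <- 1 + 2 * x1 + x2 - x3 + rnorm(n)
dat <- data.frame(y, x1, x2, x3)

# Remove 60 of the 200 x1 values at random to mimic missing data.
miss <- sample(n, size = 60)
dat$x1[miss] <- NA

# Step 1 (fit.2 in the question): regress x1 on the other predictors,
# using only rows where x1 is observed (lm drops NA rows by default).
fit.2 <- lm(x1 ~ x2 + x3, data = dat)

# Step 2: fill in the missing x1 values with their predictions.
dat$x1.imp <- dat$x1
dat$x1.imp[miss] <- predict(fit.2, newdata = dat[miss, ])

# Step 3 (fit.1 in the question): fit the outcome model, treating the
# imputed values as if they had been observed exactly.
fit.1 <- lm(y ~ x1.imp + x2 + x3, data = dat)
print(summary(fit.1)$coefficients)
```

The standard errors printed by summary(fit.1) take the filled-in values as known, which is exactly the shortcut described above; fitting the two models jointly would instead carry the imputation uncertainty through to the final estimates.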

13 thoughts on “Question about standard range for social science correlations”

  1. Speaking as a psychologist (though not having read the paper) I wouldn't regard that correlation as particularly likely.

    I would imagine that parasite prevalence co-occurs with lots of other bad things like the lack of an education system, health care, social supports, etc., which probably have more influence on IQ than parasite prevalence.

    Looking at the abstract, they appear to have controlled for education at least, though not health status. My instinct is that this is bunk, but I could be totally wrong (not for the first time).

  2. Would part of the issue be with ecological correlations? Aren't they generally higher than correlations on an individual level?
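The point about ecological correlations can be illustrated with a small simulation in R. This is only a sketch on made-up data: the number of groups, the group size, and the noise levels are assumptions chosen so that a shared group-level factor is weak relative to individual variation.

```r
# Simulated illustration; group counts and noise levels are assumptions.
set.seed(2)
n_groups <- 50
n_per_group <- 100
group <- rep(seq_len(n_groups), each = n_per_group)

# A group-level factor shifts both variables; individuals add their own noise.
u <- rnorm(n_groups)
x <- u[group] + rnorm(n_groups * n_per_group, sd = 2)
y <- u[group] + rnorm(n_groups * n_per_group, sd = 2)

# Correlation across individuals.
r_individual <- cor(x, y)

# Correlation across group means (the "ecological" correlation).
x_bar <- tapply(x, group, mean)
y_bar <- tapply(y, group, mean)
r_ecological <- cor(x_bar, y_bar)

print(round(c(individual = r_individual, ecological = r_ecological), 2))
```

Averaging within groups cancels most of the individual-level noise, so the correlation between group means comes out much higher than the correlation between individuals, even though the underlying relationship is the same.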

  3. It's worth noting that the now extensive literature on comparing "IQ" across nations and its correlates uses lousy data and, in general, uses it fairly uncritically. But journals just lap it up. So with a little bit of imagination you can find lots of things that are correlated with these numbers and flog your pet theory.
    My impression is that many of these correlations are driven by outliers – though these studies are usually careful not to check this. The average IQ (as measured in these studies) in much of sub-Saharan Africa is very low, around 70 in many of these countries, way lower than everywhere else (a person with IQ below 70 is often deemed to be mentally retarded). You may want to consider whether this is plausible.
    So find something correlated with being in sub-Saharan Africa (temperature, distance from some arbitrary point, race, parasites, preference for vuvuzelas…) and you have a fantastic theory of the evolution of global intelligence or how intelligence explains world economic development or something. Publication guaranteed. Do not, of course, question the data or you will be dismissed for being "politically motivated" as I was.
    But I would encourage Andrew Eppig to do some real science instead.
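The kind of outlier check mentioned in this comment can be sketched in R on simulated data. Everything here is hypothetical: two clusters are generated with no relationship between x and y inside either one, and the cluster sizes and offsets are arbitrary assumptions.

```r
# Simulated illustration; cluster sizes and offsets are assumptions.
set.seed(3)

# Main cluster: x and y are unrelated.
x_main <- rnorm(80)
y_main <- rnorm(80)

# Smaller cluster, shifted on both variables, also unrelated within itself.
x_out <- rnorm(20, mean = 4)
y_out <- rnorm(20, mean = -4)

x <- c(x_main, x_out)
y <- c(y_main, y_out)

# Correlation using all points, then again with the shifted cluster dropped.
r_all <- cor(x, y)
r_main <- cor(x_main, y_main)

print(round(c(all = r_all, without_cluster = r_main), 2))
```

In this setup the pooled correlation is strongly negative purely because of the gap between the clusters, and it largely disappears once the shifted cluster is dropped. Rerunning an analysis with and without the suspect observations is a simple way to see how much of a reported correlation they drive.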

  4. Correlations that high are not common in the social sciences in general. But high correlations are quite common when looking at average IQs across nations and various potential causes and effects. The reason your reader is surprised by how high the correlations are is that there has been so little discussion of national IQ differentials over the years, because the topic is so politically incorrect.

    As for the parasite hypothesis, undoubtedly causation runs in both directions. J.D. Rockefeller's war on hookworm in the American South a century ago helped American Southerners have a lot more on the ball. Hookworm saps energy, including mental energy. And hookworm is still a problem in much of the tropical world. No doubt there are other parasites as well.

    But, also, consider high-IQ, low parasite burden Singapore versus low-IQ, high parasite burden Lagos. Same altitude and latitude. But smart people in Singapore, such as Lee Kuan Yew, applied a lot of hard thinking to reducing the disease burden in Singapore. So, the smart tend to get smarter.

    The overarching point is that if you want to understand more about how the world works, you need to think hard and frankly about IQ. A vast amount has been learned about IQ, both as a cause and as an effect, over the last century, but, as the comments above suggest, most non-specialist social scientists don't know much about these findings and, truthfully, don't really want to know much because of the danger that they'll get treated like James Watson.

  5. JL: Yup! I said "many" not "all". If you look at most uses of this data in this literature they do not make the adjustment you refer to.

  6. Kevin, Garett Jones's research is some of the more interesting of those using Lynn's data. Similarly to Whetzel and McDaniel, he found that the associations between IQ and various criterion variables were not due to abnormally low scores in the developing world. Link: http://mason.gmu.edu/~gjonesb/

    Considering also that Lynn's data are highly correlated with the results of various international student assessment studies, all of them recent, with representative national samples, I would say that these data are a good measure of the average quality of human capital in different countries. On the other hand, it's unclear to what extent these differences reflect environmental and genetic influences.

  7. "On the other hand, it's unclear to what extent these differences reflect environmental and genetic influences."

    Right. It's one of the tragedies of contemporary social sciences that so many people can't read more than a few words about IQ data without immediately assuming that the author is arguing that 100% of the causation of the gaps is genetic, which then induces brain lockdown as anti-taboo defenses take over.

    Lynn and Vanhanen have argued since 2002 that dietary micronutrient deficiencies hurt Third World IQ scores. The U.S. made fortification of staple foods with iodine and iron mandatory before WWII, with outstanding results in reducing IQ-lowering medical conditions such as cretinism. Kiwanis International does good work paying to fortify salt with iodine in poor countries, but it's not a fashionable cause since we aren't supposed to be aware of lower average IQs in Third World countries.

    One thing we can say with certainty is that, contra all the mantras about the Flynn Effect, relative differences between countries in average IQ are fairly stable. There appears to have been more change in average height than in average IQ over my lifetime; when I was a kid, the Dutch were not exceptionally tall yet.

    In Lynn and Vanhanen's data over the course of the 20th Century, you can see Northeast Asian countries getting a little bit smarter relative to the rest of the world, but it's not a big change. Mostly, there's stability:

    http://vdare.com/sailer/lynn_and_flynn.htm

    What that means is that existing differences are likely to be around to some degree for at least a generation to come, making IQ a hugely important element in understanding the world, now and in the future.

  8. Although I usually agree with Sailer on most everything, I think these analyses of “IQ and the pubic lice of nations” are silly, or at least beyond the current state of psychometric science. As Wicherts's and Dolan's work points out, before you compare the IQ measurements of two groups of people you have to be sure that IQ measures the latent trait of intelligence in the same way in both groups, that is, that the test is measurement invariant. Most of these studies skip that step.
