Nurit Baytch posted a document, A Critique of Ron Unz’s Article “The Myth of American Meritocracy”, that is relevant to an ongoing discussion we had on this blog. Baytch’s article begins:
In “The Myth of American Meritocracy,” Ron Unz, the publisher of The American Conservative, claimed that Harvard discriminates against non-Jewish white and Asian students in favor of Jewish students. I [Baytch] shall demonstrate that Unz’s conclusion that Jews are over-admitted to Harvard was erroneous, as he relied on faulty assumptions and spurious data: Unz substantially overestimated the percentage of Jews at Harvard while grossly underestimating the percentage of Jews among high academic achievers, when, in fact, there is no discrepancy, as my analysis will show. In addition, Unz’s arguments have proven to be untenable in light of a recent survey of incoming Harvard freshmen conducted by The Harvard Crimson, which found that students who identified as Jewish reported a mean SAT score of 2289, 56 points higher than the average SAT score of white respondents. . . .
Unz’s analysis of Jewish academic achievement is predicated on his ability to identify Jews on the basis of their names, which proved spectacularly wrong for the one data set on which there exists confirmed, peer-reviewed data . . . This finding was not anomalous, as Unz tried to suggest, for I’ve been able to confirm that Unz also grossly undercounted the number of Jewish students in other data sets of high academic achievers . . .
Here’s the background. Several months ago we discussed a claim from Ron Unz that Ivy League colleges discriminate in favor of Jews, a claim that received wide attention after it was featured in the New York Times column of David Brooks. I originally reported Unz’s statistical claims uncritically (as did Tyler Cowen), but after hearing from Janet Mertz and Nurit Baytch, I came to the conclusion that some of Unz’s numbers were way off, enough to invalidate some of his larger points. After some exploration and discussion from all parties, it became clear that Unz had combined different data sources and used different rules of counting in ways that supported his hypothesis. This sort of thing happens: data can be slippery, and that’s one reason why open discussion and critique is so essential in much of science.
If you’re joining us right now, it might be best to start with our summary of the discussion as of 18 Mar 2013.
Baytch (whom I described anonymously as a “correspondent” in my earlier post) wrote a long article; you can take a look at her introductory summary to get her key points. She goes into lots of detail on how she performed her estimates and comparisons, and lots more detail on various particular claims that Unz made in his article and in later discussion.
My take on all this is that it can be harder than it looks to do research using statistics. Unz’s original numbers appeared authoritative (enough so to fool Cowen, Brooks, and me, along with Unz himself) but they had big errors. To put it another way, Unz put in the effort to compile the statistics for his original article, and then Mertz and Baytch put in the effort to come up with cleaner, better numbers. That’s how things go in research. Many times, initial data seem to show something, but then the pattern disappears in light of better data. As Mertz wrote:
Unz considered “five minutes of cursory surname analysis” a sufficient basis on which to claim an important unexpected discovery, i.e., a rapid collapse in Jewish very high-end achievement in the 21st century. Most unexpected discoveries are found not to be true when additional analyses are performed to test their validity.
It’s perfectly natural to get excited when one’s initial hypothesis is confirmed by an examination of some data, but the next step is to recognize that these exciting discoveries do not always hold up.
Unfortunately, our blog discussion of all this with Unz did not go so well, in my judgment because we were seeing a mix of two different modes of discourse. Unz, who spends so much of his time in the political arena, is used to politically-motivated criticisms and responds in kind, and so I think he sees the statistics provided by Mertz and Baytch as attacks to be dodged or parried rather than as useful information that can help him modify his understanding of the world. But for those of us who are not so invested in a particular position, Baytch’s article, and Mertz’s from a few months ago, should be helpful to anyone interested in further study of ethnicity and high-end college admissions.
The story that Jewish students are underperforming was plausible (to Unz, Cowen, Brooks, and myself) but is unsupported by the data. This isn’t the first time that someone has made a high-profile claim that collapses in light of a careful look at the numbers. It’s the nature of statistics (and science more generally) that a researcher can see some data and put all the pieces together to form an appealing theory that explains many disparate observations, only to find later that the pattern was explained by various combinations of errors, missing data, and wishful thinking.
P.S. As before, the question might reasonably arise: why do I continue to post on this topic? For an answer to this question, I refer you to the last part of this earlier post, the part entitled, “A couple more things (for now).”
P.P.S. Some of the commenters pick up on Baytch’s numbers from the Harvard Crimson. As Baytch discusses, these numbers are consistent with her argument; they are not the basis for her argument. In fact, when putting together this post, I was not sure whether to include that particular sentence in her summary, because I was worried that people would, on a quick read, think that this bit was central to her argument. But then I thought, numbers are good, and people can read the whole thing to get all the details. In any case, for those of you who don’t read the whole thing, Baytch’s main effort is to use a consistent approach to counting Jewish names so as to get coherent numerators and denominators when computing ratios.