Controversy about a ranking of philosophy departments, or How should we think about statistical results when we can’t see the raw data?

Jeff Helzner writes:

A friend of mine and I cited your open data article in our attempts to persuade a professor at another institution [Brian Leiter] to release the raw data from his influential rankings of philosophy departments. He is now invoking the national-security response:

. . . disclosing the reputational data would violate the terms on which the evaluators agreed to complete the surveys (did they even bother to read the description of the methodology, one wonders?).

I [Helzner] do not find this to be a compelling reply in this case. In fact, I would say that when such data cannot be disclosed it reveals a flaw in the design of the survey. Experimental designs must be open so that others can run the experiment. Mathematical proofs must be open so that they can be reviewed by others. Likewise, it seems to me that the details of statistical argument should be open to inspection. Do you have any thoughts on this? Or do you know of any other leading statisticians who might have a line on this issue, something that we could offer in response to this guy’s reluctance to share his data?

Jeff also points to this long discussion among philosophers regarding the practicality of data sharing in this context.

Here’s a relevant bit from my article (which tells the story of a team of researchers at a Federal lab who declined to share with me their data from an animal experiment, several years ago):

If you really believe your results, you should want your data out in the open. . . . Do not be so tied to your analyses that you are afraid that others might, with the same data, find something different.

To get back to Jeff’s questions to me, I have several thoughts:

1. Yes, there is a statistical literature on preserving the confidentiality of survey data. I’m not an expert on this but my coauthor Jerry Reiter is. Here’s a page all about his “research on various aspects of statistical disclosure limitation, including assessing risk and utility, synthetic data methods, remote access servers, and secure analyses of distributed data.” This would be a good start for anyone who is interested in pursuing this further. I think some computer scientists have looked into this topic also, but I don’t have any references.
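
For a flavor of what “assessing risk” means in practice, here’s a minimal sketch of a k-anonymity check, one standard disclosure-limitation heuristic. All records and field names below are invented for illustration; this is not Leiter’s data or Reiter’s code:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Smallest group size when records are grouped by their
    quasi-identifiers; a release is k-anonymous if this is >= k."""
    groups = Counter(
        tuple(rec[qi] for qi in quasi_identifiers) for rec in records
    )
    return min(groups.values())

# Invented evaluator records: current department and PhD school can
# jointly identify a respondent even with the name stripped off.
records = [
    {"dept": "Columbia", "phd": "Pittsburgh", "rating": 4},
    {"dept": "Columbia", "phd": "Pittsburgh", "rating": 5},
    {"dept": "Rutgers", "phd": "Princeton", "rating": 3},
]

print(k_anonymity(records, ["dept", "phd"]))
# 1: the Rutgers/Princeton respondent is unique, so this "anonymized"
# release still exposes them.
```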

2. I agree with Jeff that data should be disclosed, designs should be open so that others can run the experiment, and the details of statistical argument should be open to inspection. That said, if his study is a private effort, Leiter has no obligation to share any of the above, and as some people wrote in the above-linked comment thread, the most obvious ways of releasing the data could compromise the confidentiality of individual responses.

3. As an outsider in this debate, I’m free from any responsibility or interest in recommending what Leiter should or should not do. If he chooses not to release any of his raw data in any form, then I think it’s appropriate to interpret his claims with skepticism. Doing surveys is difficult, and when people don’t report what they did, I don’t tend to trust the results. Here’s an example from U.S. politics and here’s an example from Iraq.

P.S. I followed some links and came across a post where Leiter characterizes Jeff Helzner’s on-line behavior as “juvenile,” also referring to Jeff as “stupid,” “lazy,” or “malevolent,” also as an “asshole” [hmmmm . . . this is a family blog, maybe I should change that to “butthole”] and as a “boy.” Good thing Helzner is white, otherwise Leiter could get in trouble for that last one! I actually know Helzner (we both teach at Columbia), and he seems like a nice enough guy to me. . . .

Anyway, I’m glad to be in a field like statistics where the personal insults stay confined to letters of recommendation and don’t make their way onto blogs. Snark, yes. Calling somebody “juvenile,” no. That’s soooo junior high.

P.P.S. Leiter writes:

In the past, I [Leiter] have released—anonymized to the extent possible but also with a written agreement about confidentiality—the raw data to a sociologist who studies the philosophy profession (Kieran Healy at Duke), and he has written about the results, and has found them to be robust along many different dimensions. . . . I have an agreement to give Professor Healy the most recent data as well, as soon as we can retain an RA to assist in the anonymization.

If there is interest in sharing the data with others, perhaps Kieran Healy could speak with Jerry Reiter (also at Duke) about how best to do so.

40 thoughts on “Controversy about a ranking of philosophy departments, or How should we think about statistical results when we can’t see the raw data?”

  1. I have several thoughts about this. First, the methodology can certainly be made public. As Andrew has commented many times before, you should be skeptical if you cannot evaluate the methodology. As for releasing the data, it depends. I would prefer that the agreement between respondent and researcher provide for the data to be scrubbed of PII and released in that form. But if the agreement is that the data will not be released to others, then I think the data shouldn’t be released even when release would otherwise make sense. I have seen a few cases, on sensitive topics, where folks wouldn’t agree to a survey or qualitative session if there was any possibility their data might get out. That is very rare, though. So if the gentleman told respondents their data wouldn’t be released (and I don’t know that to be the case), then their data shouldn’t be released.

    • John:

      Yes, I think there should be a way to release suitably scrubbed data, maybe with some random additions and changes thrown in to make it impossible to identify individual responses. In any case, Leiter is certainly under no obligation to release anything. It’s a tradeoff: it’s less effort to keep the data secret, but then people will rightly have less trust in the results.
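
      To make “random additions and changes” concrete, here is a minimal sketch of one such perturbation scheme: add Laplace noise to each rating before release, a crude cousin of differential privacy. The scale parameter and ratings are invented for illustration; this is not anything Leiter actually does:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the example reproducible

def perturb_ratings(ratings, scale=0.5, lo=0, hi=5):
    """Add Laplace noise to each rating, then clip to the valid range.
    A larger `scale` gives more privacy but a noisier released dataset."""
    noisy = np.asarray(ratings, dtype=float) + rng.laplace(0.0, scale, len(ratings))
    return np.clip(noisy, lo, hi).round(1)

print(perturb_ratings([4, 5, 3, 4]))  # e.g. [4.1 4.8 3.3 4.4]
```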

      • Andrew: Just very quickly, a colleague did a survey on what Bayesians thought should be reported in Bayesian analyses. A very interesting and important pattern emerged.

        They were strongly advised by their REB not to report this because it was not specifically covered in the informed consent process.

        It does happen – hopefully rarely though.

  2. The discussion that you link to seems to reach a consensus: that departmental ranking histograms be released (or be made available upon request) in future iterations of the ranking. (This should not be done retrospectively for the reasons mentioned by John above.) Though there is a certain amount of heat in the discussion, and more so in other discussions of the same topic, this one seems relatively constructive. It did, after all, arrive at the suggestion just mentioned, which was not in the picture at the beginning.

  3. Pingback: Gelman on Open Data and the PGR « Choice & Inference

  4. I agree with the general principles, but folks seem not to be thinking enough about the Leiter methodology. One very good feature of the methodology is that evaluators cannot vote for their Ph.D. school or their current department; another good feature is that the evaluators’ names are listed in the report. But together these create a risk: if someone knows I am an evaluator, that I am at X University, and that I have my Ph.D. from Y, they can quickly see which response did not vote on those two schools and identify my individual response, or at least narrow it down to a very small group.
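
    To make that risk concrete, here’s a toy re-identification sketch (all names, departments, and ratings invented): match each response’s pattern of abstentions against the public evaluator list.

```python
# Toy re-identification: evaluators may not rate their current or
# PhD department, so those two cells are blank in their response.
evaluators = {
    "A. Smith": {"Columbia", "Pittsburgh"},  # current dept, PhD dept
    "B. Jones": {"Rutgers", "Princeton"},
    "C. Lee": {"NYU", "Michigan"},
}

# One "anonymized" response: department -> rating, None for abstentions.
response = {"Columbia": None, "Pittsburgh": None,
            "Rutgers": 4, "Princeton": 5, "NYU": 3, "Michigan": 4}

blanks = {dept for dept, rating in response.items() if rating is None}
matches = [name for name, excluded in evaluators.items() if excluded == blanks]
print(matches)  # ['A. Smith']: the blank pattern alone unmasks the evaluator
```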

      • Andrew,

        Not a bad idea, but couldn’t that lead to spurious associations (by chance) in small samples?

        Coming from epidemiology, I am not overly upset that data are private or that data can’t be shared. But I am very sensitive to the need to fully explain the analysis protocol, the assumptions made, and any changes to the pre-specified protocol, and to provide rich descriptive statistics.

        In my statistics work (way smaller amount of work) we have shared things like the data generation code we used to develop simulations without any hesitation.

        • Joseph:

          There’s a tradeoff between security and accuracy; that tradeoff is what much of the research in this area is about. One motivation for Reiter’s work has been the conflicting pressure on the Census, when supplying microdata (that is, individual survey responses), to preserve anonymity while giving users of the data as much information as possible.

    • People have discussed this problem at length. Two fixes have been suggested for the problem of the missing votes for an evaluator’s PhD and present school. First fix: release only department histograms. Second fix: insert the mode for the missing numbers.

    • I find the issue of how to handle the gappy reports quite an interesting one. I haven’t had a chance to think things through in any depth, but it strikes me that both proposals made above face difficulties.

      The first proposal (suggested by Andrew), of issuing a randomly generated value, strikes me as running the risk of flagging the response as having been generated by that very procedure, if the imputed value is very atypical of the remainder of the data (imagine how a 0/5 for epistemology at Rutgers might look).

      Regarding the second proposal (suggested by Mohan) of using the modal value, first of all, the distribution might well be multimodal. One could of course select from a uniform distribution over the modal values, but this will presumably tend to lead to a systematic bias in the estimate of the population mean.

      Any thoughts on simply issuing a value sampled from the sample distribution?

      • JC:

        When I said to use a randomly generated value, I didn’t say I would use the same random distribution for all the cells! Indeed you’d want a random distribution that varied by cell in accordance with the data.
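
        Here’s a rough sketch of how the two imputation ideas on the table compare, on a toy column of ratings. It is purely illustrative, not the PGR’s actual procedure:

```python
import random
from statistics import mode

random.seed(1)  # seeded only to make the example reproducible

# Toy ratings for one department; None marks an ineligible evaluator.
ratings = [5, 4, 5, 5, 3, None, 4, None]
observed = [r for r in ratings if r is not None]

def impute_mode(col, obs):
    """Mohan's fix: fill every gap with the modal observed value."""
    return [r if r is not None else mode(obs) for r in col]

def impute_sampled(col, obs):
    """JC's suggestion: fill each gap with a draw from the cell's own
    empirical distribution, so imputed values resemble real ones."""
    return [r if r is not None else random.choice(obs) for r in col]

print(impute_mode(ratings, observed))     # both gaps become 5
print(impute_sampled(ratings, observed))  # gaps vary from draw to draw
```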

    • This is the first time I’ve seen issues of statistics in philosophy rankings arise on this blog, and it’s not one I’m especially vexed about, but since it’s come up, I really don’t see why it is important for evaluators’ names to be listed. Quite aside from the fact that the data some people are asking for wouldn’t be expected to include a list of all of person X’s scores, so that your worry wouldn’t arise, I don’t see the value of reporting names. On the other hand, if this is to have standing in the profession (which I never thought it had, but I take it others do), it would be good to know that the methodology for selecting evaluators was itself adequately representative.

  5. I wonder if one could devise experimental designs that have something like zero-knowledge-proof properties, where outsiders can ask certain kinds of queries and only certain kinds of data sets would be robust to those queries. Then folks could ask the queries without needing to see the data and rest assured that no data set could stand up to the queries unless it was free of whatever issue or problem worries outsiders who see only the results of the study. I’m not that hopeful that such a thing can be devised, but it would be interesting to think about “proof protocols” like those for the graph isomorphism problem, which allow you to “prove” certain properties about your data without releasing it. Homomorphic encryption is another possibility.

    The quoted letter from Helzner makes me think about this, because he says “Mathematical proofs must be open so that they can be reviewed by others,” but zero-knowledge proofs and interactive proofs make this statement false. Of course, some suggest compellingly that we miss out on the “meaning” or beauty of a proof when we do it this way, but it’s not really any less mathematically rigorous. It feels like we should be able to do the same sort of thing with data.
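
    One low-tech building block in this direction already exists: a cryptographic commitment. Publish a salted hash of the raw data file when the rankings appear; any later audited release can then be verified as the same data, while nothing is revealed in the meantime. Here’s a minimal commit-and-reveal sketch (not a full zero-knowledge protocol, and of course not part of the PGR’s actual methodology):

```python
import hashlib
import secrets

def commit(data: bytes) -> tuple[str, bytes]:
    """Publish the digest now; keep `salt` and `data` private.
    The salt blocks guessing attacks on low-entropy data."""
    salt = secrets.token_bytes(16)
    return hashlib.sha256(salt + data).hexdigest(), salt

def verify(data: bytes, salt: bytes, digest: str) -> bool:
    """Anyone holding the published digest can later confirm that the
    revealed data is exactly what was committed to."""
    return hashlib.sha256(salt + data).hexdigest() == digest

raw = b"evaluator responses, one survey round"  # stand-in for the data file
digest, salt = commit(raw)
print(verify(raw, salt, digest))                  # True
print(verify(b"edited responses", salt, digest))  # False: tampering detected
```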

  6. Pingback: Signs of progress « Choice & Inference

  7. Leiter is not representative of philosophers, and it is odd that he wouldn’t publicize the data on which his rankings are based; I actually thought he did at least try to delineate the criteria used, but I haven’t looked at his blog recently (except, coincidentally, on the matter of the Ruth Marcus obituary controversy yesterday). Any rankings that the field is going to take seriously, as it does Leiter’s Gourmet Report, should be open for scrutiny. All the more so since the rankers are hand-picked by Leiter, I believe.

    • I very much agree with this comment. While PGR is no doubt very helpful for prospective grad students, it’s a closed shop, both in terms of who can review and in terms of review data. Trust is therefore an issue, and the disappointing vehemence of Leiter’s mudslinging suggests that this debate has touched a raw nerve.

      It’s certainly a debate worth having, though, since grad students’ prospects are on the line. I’m pleased to see it being conducted here more professionally and amicably!

      • What happened is that Leiter started this gourmet report in philosophy years ago, and people thought it was kind of cool, since the internet was brand new, and our field had no such thing. As he increasingly compiled ordinary, objective information about departments that was valuable and interesting to have all in one place—and as he even manages, as I hear it, to report who has taken a job and where, sometimes before job hunters even know the person has already accepted a job somewhere else (they report to him first, I guess)—his report has managed to become accredited as a valid source of appraisal of philosophy programs. As an information machine, he performs a professional service that apparently no other outfit has done or is willing to do for philosophy, and as a reward, he gets his ratings taken seriously. Now the vast majority of the information is uncontroversial, and it’s excellent to have in a place that strives to be up-to-date, and enjoys a pipeline to who is taking, or even thinking about taking, a new job! But it is a bit unseemly for official professional rankings in philosophy to be so closely connected to a blog that also serves as a vehicle for Leiter’s personal (and increasingly vehement) expressions of politics, venting, and advertising.

        • I meant to write “job recruiters” not “job hunters”, but job hunters also may learn from the Leiter report that someone else has just accepted the job they applied for. That’s how impressively up to date he is!

      • Oh, hey, Jon Williamson! I never got back to you about the flaw in your purported reductio of diachronic Dutch Book; will email you soon.

  8. Pingback: Jeff Helzner and Andrew Gelman « Formal Philosophy

    • @PA: This link works, but all of the links except for one on Leiter’s blog are dead.

      I have to say that reading the rest of the Burton thread, as you recommended, confirms JD’s description.

      Outing someone in front of the community he belongs to, which in this case is commonly believed to include people who strongly oppose homosexuality, some violently so, is an extreme tactic. One justification given for outing someone is that the individual is living a public life at variance with his private life and, rightly or wrongly, is viewed as harming the interests of the gay community through his hypocrisy. But I have never seen the tactic gratuitously and publicly used in response to a personal criticism that had absolutely nothing to do with sexual politics.

    • Well, perhaps you can point us to some evidence that Burton was never in the closet. Leiter looks pretty guilty based on the link that JD provided.

  9. The link above was the evidence, I thought. Leiter announced the “Right Reason” blog by referencing Burton’s gayness (he’s “the gay libertarian”). No outcry resulted. Burton still blogs with many of the same right-wingers at a successor blog, “What’s Wrong with the World.” He was not repudiated over the “outing” because he was never not out. Too bad this thread has been derailed by this silliness.

  10. The above link was mostly broken. The issue isn’t whether Leiter succeeded in outing Burton. The issue is that he brought up Burton’s ex, and his name, in the first place. That was JD’s claim.

    This discussion is sordid and it would be a diversion except that Leiter has a history of vicious personal attacks against his critics, sometimes with false claims that appear to be relevant, and sometimes, such as this case with his attempted outing of Burton, with a true but completely irrelevant claim.

  11. Leiter has a history of being defamed by anonymous right-wingers who hate him for obvious reasons. This thread now marks another chapter in that sordid history.

  12. Pingback: Manufactured Assent: The Philosophical Gourmet Report’s Sampling Problem « Choice & Inference
