On citation practices, strategic and otherwise

John Sides links to an (unintentionally, I assume) hilarious peer-reviewed article by C. K. Rowley, which begins:

This commentary demonstrates that Avner Greif, through his citation practices, has denied Janet Landa her full intellectual property rights with respect to her contributions to the economic analysis of trust and identity. He has done so by systematically failing to cite her published papers in this field, incidentally promoting his own publications as meriting priority. In consequence, he has effectively blocked out Janet Landa’s work from the mainstream economics literature, albeit not from the literature of law and economics, where his own writings have not been directed.

The commenters on John’s blog all go on about how ridiculous Rowley’s article is–and I agree. From the outside, it’s hilarious, as is this story, but I imagine it’s pretty upsetting to those who are personally involved in these disputes.

But here what I want to do is ignore the content of Rowley’s article and use it as a jumping-off point to discuss the inherent difficulties with mis-citations and strategic citations. If we forget about flat-out sleazy behavior (avoiding the citation of known work of others, so as to falsely claim credit for originality) and forgivable ignorance (not citing relevant work that you weren’t aware of), there are two big problems I’ve seen again and again:

1. Citing others’ work in a vague way that does not make it clear that you’re actually doing something that’s already been done before.

2. Avoiding the citation of work that you know about but don’t like. The worst cases here, in my opinion, are when you put down somebody else’s work but without actually citing it. That way, you’re not only not giving them a chance to defend themselves after your attack, you’re not even giving the reader a chance to see the arguments they’ve already made.

The funny thing is, I don’t think the people who do this are always bad people or even bad researchers. Item 1 occurs when you read something quickly or perhaps just give a citation out of politeness (possibly a second-hand citation culled from the references to a paper you actually have read). It’s bad practice to cite something you don’t fully understand, but it’s forgivable, especially if the alternative is to not cite it at all. Many times, I’ve been annoyed when someone does not cite what I consider to be important work of mine (for example, all the people using the Gibbs sampler on finite mixture models who don’t cite my 1990 paper with King); really, though, no harm is done to the world by not acknowledging our priority in this method that’s been rediscovered often enough by others.

Item 2 annoys me more, and I try pretty hard to avoid the temptation to use non-citation as a tool to disparage work that I dislike. My impression is that some people feel that they don’t want to dignify opposing work with a reference, but I’d rather air an academic dispute than bury it. For example, I’m glad that, in their article on model checking, Bayarri and Castellanos cited my work with Meng and Stern, even though they disagreed with it. This gave me the opportunity to comment, and then they had the chance to respond right back. The outcome was much better for all concerned, I believe, than had they merely given an anonymous dis of posterior predictive checking without leaving a trail of references. (Not to keep arguing this point, but I will say here that I disagree with the claim by Bayarri and Castellanos that posterior predictive checks “use the data twice.” As we discuss in chapter 6 of Bayesian Data Analysis, as well as in the 1996 paper with Meng and Stern, posterior predictive checks are based on the distribution p(y.rep|y). The data, y, are used only once, as is appropriate in a Bayesian analysis. Anyway, my real point here is that it’s better to have the discussion out in the open.)

That said, it’s still a judgment call. I typically won’t waste my time citing what I consider crank research, and I suppose that much of the time, people avoid citing opposing work when they just consider it too crappy to bother with. The paradox is this: it’s best to engage with your opponents’ strongest arguments–but your view of what their strongest arguments are is not necessarily their view.

To get back to the quotation above, I think that more citations are probably better, and I do think that sometimes people practice citation-avoidance for strategic reasons (to present their own work without giving the reader a clear sense of opposing views), but it can be so natural to do this, that I don’t think it’s appropriate to get so angry about it.

13 thoughts on “On citation practices, strategic and otherwise”

  1. Well, I've written on Gibbs sampling for finite mixture models without citing your paper with King. But what do you expect, when neither the title nor the abstract gives any hint that the paper has anything to do with mixture models, and none of the references in the paper are to books or papers on mixture models (at least to judge from their titles)? You may have made a mistake in putting novel methodological material in what appears to be a paper of interest only to people in a particular application area.

  2. Radford:

    JASA makes it pretty clear that even its application articles can have novel methodological content, but, yes I agree that we made a mistake by not writing a separate methodological article. As a grad student, though, I was so thrilled to be publishing in the top journal in the field that I had no thought that maybe my article wouldn't be read.

    As far as the methods are concerned, the most important contribution of that article is not the Gibbs sampler (which, as I noted, has been rediscovered many times) but rather the fact that we fit a hierarchical model with only one observation per group (using an external analysis to estimate the within-group variance parameter). And also that we used a weakly informative prior distribution on the means, variances, and proportions of the mixture components. That turned out well, and even now I don't see people always doing this. In retrospect, yes, it definitely would've been a contribution to write this all up as a methods paper.

  3. I do hesitate to give additional citations to some authors I regard as using very underhand and dubious citation practices themselves (it doesn't help that it's highly flawed work in the first place).

  4. I think the scary thing is that he has a potentially legitimate academic point (that a scholar, theory or group of theorists is neglecting or not fully appreciating an important body of work), but he has focussed on supposedly dodgy citation practice as if it were a major academic crime.

    I'm also dubious that published academic work – in the sense of ideas, rather than their expression – is intellectual property. Once you publish, your ideas are out there to play with and inform others. If you don't want that to happen from time to time, don't publish!

    Yes, somebody might 'steal' credit for your ideas and so we have an informal culture of practices that help protect us against that, but in the big scheme of things isn't it the ideas that really matter?

    As an outsider I'm also not convinced that Landa's ideas are that original. I'm pretty sure Hume made some of the same basic points … (and I'm probably wrong – because my view of the contribution is determined by my perspective as a non-economist).

  5. I'm sure we all have our tales of grievance and woe. I think it's important to distinguish between people who ignorantly fail to cite relevant papers (because they don't know about them) and people who intentionally fail to do so (because they don't want to admit that they got their idea from somewhere else). In fact, even this distinction fails to recognize shades of gray, such as people who saw something in a paper and then forgot about it.

    I have tried, and failed, to resist including the following example. More than ten years ago, I wrote a paper about the geographic distribution of indoor radon, in which I used Bayesian hierarchical modeling…which was not by any means new at the time, but the previous papers that Andrew and I had written were the only ones that applied Bayesian methods to radon mapping. This particular paper, of which I was sole author, applied the same methods as the papers Andrew and I had previously co-written, but looked at a larger portion of the country and included some additional variables. I wanted to provide a simple way for (non-statistician) readers to understand how good the predictions were — this had been a problem in our previous papers — so I defined a metric (which I called "effective R-squared"), analogous to R-squared, that compared the variance of the raw data to the posterior estimate of the unexplained variance. [The details aren't important for purposes of this blog post.] I discussed this idea with you, Andrew, and you basically thought it was harmless, although you weren't convinced that it was useful.

    Ten years later, Andrew, you published a paper (with Pardoe) in the journal "Technometrics" about defining an R-squared measure, similar but not identical to mine, to summarize Bayesian model results…and indeed, you gave a computational example involving radon data…without citing my paper or giving me any credit for the idea! Et tu, Brute? The horror!

    Of course I realize you had completely forgotten about my previous work, which I had never put on sound statistical footing and which I had published in a two-paragraph throw-away in a minor applied paper about radon. I've never even bothered to mention this issue to you. Ideas are cheap, and unless you take it upon yourself to publish your contributions and to publicize them to the extent that they get noticed, you can't be surprised if somebody else re-invents your stuff and publishes it without citing your prior invention, of which they are ignorant. (When I say "you" in the previous sentence, I mean "me." Well, grammatically, I guess I mean "I".) So not only have I not complained, I don't have a right to complain. A citation would have been nice, but could hardly have been expected.

    Eh, as long as I'm airing grievances, let me air one we can both share, Andrew. I'm sure you will recall our joint article whose title remains the favorite among my publications: "All maps of parameter estimates are misleading." I got an email from a radon researcher who read the article, saying that he really appreciated it because he had recognized the phenomena that we discussed and had been wrestling with how to present maps of parameter estimates, and our article had given him some good ideas and also led him to understand the inevitable limitations of his maps. Six months later he published an article about his mapping methods and the kinds of artifacts that inevitably occur in them…but he didn't cite our paper! D'oh! In this example (unlike the previous), the guy really did commit an ethical breach by not citing our paper: we know he knew about it, and it affected the work that he published. I also know that he didn't fail to cite us "on purpose" — he had nothing to gain by failing to cite our work — it just somehow, oddly, didn't occur to him, I guess.

    And here's another one:… oh, never mind. I've got several of these. I'm sure most people reading this blog do, too. What can we do, other than to try not to commit these omissions in our own publications? If you complain to the editors or the authors, you just look petty, and anyway it's usually too late to remedy the problem. And if you go so far as to publicize your belief that your work isn't adequately cited, you get ridiculed on some statistics professor's blog!

  6. I had this very discussion last week in the context of visual art. Copyright aside – because then the issues are whether the work is allowed to exist and who gets the money – it's now common for artists to use images created by others. These are rarely acknowledged. E.g., the museum in Boston now has a piece up that assembles a bunch of record album covers. The name on the piece is the guy who chose the records, and the individuals who took the pictures and drew the art aren't listed at all. I'm sure most people assume this guy made all those albums – and if they don't recognize they're actual record jackets, that he created the entire imagery.

  7. Yes, ultimately it is the ideas that matter. But, the ideas of (previously) influential people are more likely to be debated than the ideas of less influential people, irrespective of the quality of those ideas. This may be inevitable: there are simply too many ideas out there to consider each with equal weight, so we use status cues to help us filter out which ideas deserve the most attention. Failing to give credit to the author of an idea is not just "bad form," but it also increases the likelihood that we will not give much attention to his/her next good idea.

    Related: my impression (i.e., no data) is that women scholars are more likely to have their ideas appropriated than men. I've often heard speakers reverse the order of co-authored citations to put the male co-author first, neglect to mention the female co-author altogether, cite a male author who "discovered" something after a female author without citing the latter, or (my favorite) invent a paper by the male member of a collaborative team that reports a study the female member of the team published on her own. I've never heard the reverse… although perhaps it happened but just didn't register. Regardless, all these may be "honest" mistakes, but they are still consequential.

  8. Phil: I agree. As a wise person once said, "To have ideas is to gather flowers. To think is to weave them into garlands."

    Jonathan: Along these lines, what really pisses me off is that artist (whose name I don't want to mention because I don't want to give him the credit) who has a painting hanging in the Museum of Modern Art that's basically a very large version of comic-book art from the 1950s. The plaque on the wall makes no mention of the artists he ripped off. Doesn't seem fair to me. I think it takes a lot more inspiration and craftsmanship to make the original art than to have the cute idea of blowing it up into a large wall-size version.

  9. Fairer sets of citations? Perhaps a slightly different take: the Cochrane Collaboration provides resources to help largely non-statisticians do meta-analyses of clinical trials, including a list of methodological publications that clinical authors can cut and paste into the references for their papers. Great for those methodological authors who have papers listed, not so good for those who don't. But recently one of their group has been asking around for a more inclusive list that may replace this select list.

    Getting cited matters, as does being invited to give talks, contribute comments, etc., etc.

    It's going to be a long time before this component of academe is "fair"…

    Keith

  10. One can argue that RL changed the comic but I wholly agree that the original creator of the image has now been lost to history to all but the most informed specialist. In the art world, one of the big movements, as you may know, has been related to branding and commentary on commercial culture. Thus the blue dog. Thus Shepard Fairey's Andre the Giant star logo. In that context, they take another person's image and re-brand it. The artists, museums, etc. point to the big names and ignore how this simply writes out most creators from history. An example of the former is a photo taken of a famous photo by Walker Evans. This literal photo-copy is then signed and exhibited – in a famous museum. People know Walker Evans now, but imagine it's the year 2400 and the main surviving images are of this person's photocopy and so even a famous person's work can be erased by conceptual piracy.

    The artists, museums, etc. all point to the history of stealing in art. True, but when you were drawing the Annunciation or the Deposition, you were using common cultural images and then rendering the images through your hands. By contrast, Roman copies of Greek statues are today labeled as such because art history tries to recognize Lysippus et al. The modern trend would be to treat the copy as an original without credit to the original.

  11. I would tend to follow Radford's point that the gem was too well hidden in a paper whose application was not of general interest… Which is why I only became aware of Gelman and King very recently! Incidentally, the very first reference on Gibbs for mixtures is, I believe,

    Gilks, W.R., Oldfield, L., Rutherford, A., 1989, Statistical analysis. In: Knapp, W., et al., (Eds.), Leucocyte Typing IV. Oxford University Press, Oxford, p. 6.

    with a close second being

    Diebolt, J., Robert, C.P., 1990, Estimation des paramètres d'un mélange par échantillonnage bayésien. Notes aux Comptes Rendus de l'Académie des Sciences I, 311, 653-658.

    which includes a two-page version in English but is almost never quoted. (Fair enough, who would look at the French PNAS..?!)

    My worst personal experience with quotes and lost references is when I wrote my 1995 paper on the simulation of truncated normals, only to discover later that John Geweke had published almost exactly the same paper in the Proceedings of the 23rd Symposium in 1992..!

  12. This has been quite a stimulating exchange of comments.

    And here I thought citation practices were all just a mix of petty politics and under-the-table mutualism.
