In a blog discussion on plagiarism ten years ago, Rahul wrote:
The real question for me is, how I would react to someone’s book which has proven rather useful and insightful in all aspects but which in hindsight turns out to have plagiarized bits. Think of whatever textbook, say, you had found really damn useful (perhaps it’s the only good text on that topic; no alternatives) and now imagine a chapter of that textbook turns out to be plagiarized.
What’s your reaction? To me that’s the interesting question.
It is an interesting question, and perhaps the most interesting aspect to it is that we don’t actually see high-quality, insightful plagiarized work!
Theoretically such a thing could happen: an author with a solid understanding of the material finds an excellent writeup from another source—perhaps a published article or book, perhaps something on wikipedia, maybe something written by a student—and inserts it directly into the text, not crediting the source. Why not credit the source? Maybe because all the quotation marks would make the resulting product more difficult to read, or maybe just because the author is greedy for acclaim and does not want to share credit. Greed is not a pretty trait, but, as Rahul writes, that’s a separate issue from the quality of the resulting product.
So, yeah, how to think about such a case? My response is that it’s only a hypothetical case, that in practice it never occurs. Perhaps readers will correct me in the comments, but until that happens, here’s my explanation:
When we write, we do incorporate old material. Nothing we write is entirely new, nor should it be. The challenge is often to put that old material into a coherent framework, which requires some understanding. When authors plagiarize, they seem to do this as a substitute for understanding. Reading that old material and integrating it into the larger story, that takes work. If you insert chunks of others’ material verbatim, it becomes clear that you didn’t understand it all, and not acknowledging the source is a way of burying that meta-information. To flip it around: as a reader, that hypertext—being able to track to the original source—can be very helpful. Plagiarists don’t want you to be aware of the copying in large part because they don’t want to reveal that they have not put the material all together.
To use statistical terminology, plagiarism is a sort of informative missingness: the very fact that the use of outside material has not been acknowledged itself provides information that the copyist has not fully integrated it into the story. That’s why Basbøll and I referred to plagiarism as a statistical crime. Not just a crime against the original author—but, yeah, as someone whose work has been plagiarized, it annoys me a lot—but also against the reader. As we put it in that article:
Much has been written on the ethics of plagiarism. One aspect that has received less notice is plagiarism’s role in corrupting our ability to learn from data: We propose that plagiarism is a statistical crime. It involves the hiding of important information regarding the source and context of the copied work in its original form. Such information can dramatically alter the statistical inferences made about the work.
To return to Rahul’s question above: have I ever seen something “useful and insightful” that’s been plagiarized? In theory it could happen: just consider an extreme example such as an entirely pirated book. Take a classic such as Treasure Island, remove the name Robert Louis Stevenson and replace it with John Smith, and it would still be a rollicking good read. But I don’t think this is usually what happens. The more common story would be that something absolutely boring is taken from source A and inserted without checking into document B, and no value is added in the transmission.
To put it another way, start with the plagiarist. This is someone who’s under some pressure to produce a document on topic X but doesn’t fully understand the topic. One available approach is to plagiarize the difficult part. From the reader’s perspective, the problem is that the resulting document has undigested material: the copied part could actually be in error or could be applied incorrectly. By not disclosing the source, the author is hiding important information that could otherwise help the reader better parse the material.
If I see some great material from another source, I’ll copy it and quote it. Quotations are great!
Music as a counterexample
In his book, It’s One for the Money, music historian Clinton Heylin gives many examples of musicians who’ve used material from others without acknowledgment, producing memorable and sometimes wonderful results. A well-known example is Bob Dylan.
How does music differ from research or science writing? For one thing, “understanding” seems much more important in science than in music. Integrating a stolen riff into a song is just a different process than integrating an explanation of a statistical method into a book.
There’s also the issue of copyright laws and financial stakes. You can copy a passage from a published article, with quotes, and it’s no big deal. But if you take part of someone’s song, you have to pay them real money. So there’s a clear incentive not to share credit and, if necessary, to muddy the waters to make it more difficult for predecessors to claim credit.
Finally, in an academic book or article it’s easy enough to put in quotation marks and citations. There’s no way to do that in a song! Yes, you can include it in the liner notes, and I’d argue that songwriters and performers should acknowledge their sources in that way, but it’s still not as direct as writing, “As X wrote . . .”, in an academic publication.
What are the consequences of plagiarism?
There are several cases of plagiarism by high-profile academics who seem to have suffered no consequences (beyond the occasional embarrassment when people like me bring it up or when people check them on wikipedia): examples include some Harvard and Yale law professors and this dude at USC. The USC case I can understand—the plagiarist in question is a medical school professor who probably makes tons of money for the school. Why Harvard and Yale didn’t fire their law-school plagiarists, I’m not sure, maybe it’s a combination of “Hey, these guys are lawyers, they might sue us!” and a simple calculation along the lines of: “Harvard fires prof for plagiarism” is an embarrassing headline, whereas “Harvard decides to do nothing about a plagiarist” likely won’t make it into the news. And historian Kevin Kruse still seems to be employed at Princeton. (According to Wikipedia, “In October 2022, both Cornell, where he wrote his dissertation, and Princeton, where he is employed, ultimately determined that these were “citation errors” and did not rise to the level of intentional plagiarism.” On the plus side, “He is a fan of the Kansas City Chiefs.”)
Other times, lower-tier universities just let elderly plagiarists fade away. I’m thinking here of George Mason statistician Ed Wegman and Rutgers political scientist Frank Fischer. Those cases are particularly annoying to me because Wegman received a major award from the American Statistical Association and Fischer received an award from the American Political Science Association—for a book with plagiarized material! I contacted the ASA to suggest they retract the award and I contacted the APSA to suggest that they share the award with the scholars who Fischer had ripped off—but both organizations did nothing. I guess that’s how committees work.
We also sometimes see plagiarists get canned. Two examples are Columbia history professor Charles Armstrong and Arizona State historian Matthew Whitaker. Too bad for these guys that they weren’t teaching at Harvard, Yale, or Princeton, or maybe they’d still be gainfully employed!
Outside academia, plagiarism seems typically to have more severe consequences.
Journalism: Mike Barnicle, Stephen Glass, etc.
Pop literature: that spy novelist, etc.
Lack of understanding
The theme of this post is that, at least regarding academics, plagiarism is a sign of lack of understanding.
A common defense/excuse/explanation for plagiarism is that whatever had been copied was common knowledge, just some basic facts, so who cares if it’s expressed originally? This is kind of a lame excuse given that it takes no effort at all to write, “As source X says, ‘. . .’” There seems little doubt that the avoidance of attribution is there so that the plagiarist gets credit for the words. And why is that? It has to depend on the situation, but it doesn’t seem that people usually ask the plagiarist why they did it. I guess the point is that you can ask the person all you want, but they don’t have to reply, and, given the record of misrepresentation, there’s no reason to expect a truthful answer.
But, yeah, sometimes it must be the case that the plagiarist understands the copied material and is just being lazy/greedy.
What’s interesting to me is how often it happens that the plagiarist (or, more generally, the person who copies without attribution) evidently doesn’t understand the copied material.
Here are some examples:
Weggy: copied from Wikipedia, introducing errors in the process.
Chrissy: copied from online material, introducing errors in the process; this example of unacknowledged copying was not actually plagiarism because it was stories being repeated without attribution, not exact words.
Armstrong (not the cyclist): plagiarizing material by someone else, in the process getting the meaning of the passage entirely backward.
Fischer (not the chess player): OK, I have to admit, this one was so damn boring I didn’t read through to see if any errors were introduced in the copying process.
Say what you want about Mike Barnicle and Doris Kearns Goodwin, but I think it’s fair to assume that they did understand the material they were ripping off without attribution.
In contrast, academic plagiarists seem to copy not so much out of greed as from laziness.
Not laziness as in, too lazy to write the paragraph in their own words, but laziness as in, too lazy to figure out what’s going on—but it’s something they’re supposed to understand.
That’s it!
You’re an academic researcher who is doing some work that relies on some idea or method, and it’s considered important that you understand it. This could be a statistical method being used for data analysis, it could be a key building block in an expository piece, it could be some primary sources in historical work, something like that. Just giving a citation and a direct quote wouldn’t be enough, because that wouldn’t demonstrate the required understanding:
– If you’re using a statistical method, you have to understand it at some level or else the reader can’t be assured that you’re using it correctly.
– In a tutorial, you need to understand the basics, otherwise why are you writing the tutorial in the first place.
– In historical work, often the key contribution is bringing in new primary sources. If you’re not doing that, a lot more of a burden is placed on interpretation, which maybe isn’t your strong point.
So, you plagiarize. That’s the only choice! OK, not the only choice. Three alternatives are:
1. Don’t write and publish the article/book/thesis. Just admit you have nothing to add. But that would be a bummer, no?
2. Use direct quotes and citations. But then there may be no good reason for anyone to want to read or publish the article/book/thesis. To take an extreme example, is Wiley Interdisciplinary Reviews going to publish a paper that is a known copy of a Wikipedia entry? Probably not. Even if your buddy is an editor of the journal, he might think twice.
3. Put in the work to actually understand the method or materials that you’re using. But, hey, that’s a lot of effort! You have a life to lead, no? Working out math, reading obscure documents in a foreign language, actually reading what you need to use, that would take effort! OK, that’s effort that most of us would want to put in, indeed that’s a big reason we became academics in the first place: we enjoy coding, we enjoy working out math, understanding new things, reading dusty old library books. But some subset of us doesn’t want to do the work.
If, for whatever reason, you don’t want to do any of the above three options, then maybe you’ll plagiarize. And just hope that, if you get caught, you receive the treatment given to the Harvard and Yale law professors, the USC medical school professor, and the Princeton history professor, or, if you do it late enough in your career, the George Mason statistics professor and the Rutgers political science professor. So, stay networked and avoid pissing off powerful people within your institution.
As I wrote last year regarding scholarly misconduct more generally:
I don’t know that anyone’s getting a pass. What seems more likely to me is that anyone—left, center, or right—who gets more attention is also more likely to see his or her work scrutinized.
Or, to put it another way, it’s a sad story that perpetrators of scholarly misconduct often “seem to get a pass” from their friends and employers and academic societies, but this doesn’t seem to have much to do with ideological narratives; it seems more like people being lazy and not wanting a fuss.
The tell
The tell, as they say in poker, is that the copied-without-attribution material so often displays a lack of understanding. Not necessarily a lack of ability to understand: Ed Wegman could’ve spent an hour reading through the Wikipedia passage he’d copied and avoided introducing an error; Christian Hesse could’ve spent some time actually reading the words he typed, and maybe even doing some research, and avoided errors such as this, reported by chess historian Edward Winter:
In 1900 Wilhelm/William Steinitz died, a fact which did not prevent Christian Hesse from quoting a remark by Steinitz about a mate-in-two problem by Pulitzer which, according to Hesse, was dated 1907. (See page 166 of The Joys of Chess.) Hesse miscopied from our presentation of the Pulitzer problem on page 11 of A Chess Omnibus (also included in Steinitz Stuck and Capa Caught). We gave Steinitz’s comments on the composition as quoted on page 60 of the Chess Player’s Scrap Book, April 1907, and that sufficed for Hesse to assume that the problem was composed in 1907.
Also, I can only assume that Korea expert Charles Armstrong could’ve carefully read the passage he was ripping off and avoided getting its meaning backward. But having the ability to do the work isn’t enough. To keep the quality up in the finished product, you have to do the work. Understanding new material is hard; copying is easy. And then it makes sense to cover your tracks. Which makes it harder for the reader to spot the mistakes. Etc.
In his classic essay, “Politics and the English Language,” the political journalist George Orwell drew a connection between cloudy writing and cloudy content, which I think applies to academic writing as well. Something similar seems to be going on with copying without attribution. It happens when authors don’t understand what they’re writing about.
P.S. I just came across this post from 2011, “A (not quite) grand unified theory of plagiarism, as applied to the Wegman case,” where I wrote, “It’s not that the plagiarized work made the paper wrong; it’s that plagiarism is an indication that the authors don’t really know what they’re doing.” I’d forgotten about that!