Trey Causey asks, “Has R-help gotten meaner over time?”:
I began by using Scrapy to download all the e-mails sent to R-help between April 1997 (the earliest available archive) and December 2012. . . .
We each read 500 messages and coded them in the following categories:
-2 Negative and unhelpful
-1 Negative but helpful
0 No obvious valence or request for additional information
1 Positive or helpful
2 Not a response
An example of a response coded -2 would be responses that do not answer the question, along with simply telling the user to RTFM, that they have violated the posting guidelines, or offer “?lm” as the only text when the question is about lm(). . . .
Proportions of emails in each category in the test set were estimated on a monthly basis. Much to my surprise, R-help appears to be getting less mean over time! The proportion of “negative and unhelpful” messages has fallen steadily over time, from a high of 0.20 in October of 1997 to a low of 0.015 in January of 2011. . . .
Let’s return to the puzzle of falling meanness, somewhat stable helpfulness, and growing numbers of unanswered questions. . . . R-help is essentially a public good. Anyone can free ride by reading the archives or asking a question with minimal effort. It is up to the users to overcome the collective action problem and contribute to this public good. As the size of the group grows, it becomes harder and harder to overcome the collective action problem . . . Maintaining the quality of the public good requires individuals willing to sanction rule-breakers. This was accomplished in early days by chiding people about the content of their posts or generally being unpleasant enough to keep free-riders away. As the R user base grew, however, it became more and more costly to sanction those diluting the quality of the public good and marginal returns to sanctioning decreased. Thus, we get free-riders (question askers) leading to unanswered questions and a decline in sanctioning without a concomitant increase in quality of the good.
Interesting combination of descriptive data and social-science speculation. I like it. I just have a few thoughts:
– It would make sense to check the performance of the classifier by taking a random sample of a couple hundred messages, hand-coding them, and comparing the hand codes to the classifier’s output for the same messages.
– It might also make sense to break up the data into two or three periods and run the classifier separately on each, just in case other aspects of the messages are changing which could confuse the classification algorithm.
– It would make sense to compare to other lists. For example, maybe the relevant statistic is the number of mean posts, not the proportion. This would be the case, for example, if just one or two of the list participants are providing all the negative feedback. These posters have some finite amount of time they can spend on this, so when the list gets larger they represent a smaller proportion of the whole.
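Here is a rough sketch of the first and third checks above, in Python. It is not Causey’s code. The function names, the message ids, and the toy records at the bottom are all made up for illustration, and I am assuming each message has been reduced to a (month, code) pair using the -2 to 2 scheme quoted earlier.

```python
import random
from collections import Counter


def validation_sample(message_ids, n=200, seed=0):
    """Draw a simple random sample of message ids to hand-code."""
    ids = list(message_ids)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(ids, min(n, len(ids)))


def compare_codes(hand, predicted):
    """Compare hand codes to classifier output on the messages that have both.

    hand and predicted are dicts mapping message id -> category code.
    Returns (share of messages where the two agree, confusion table),
    where the confusion table is keyed by (hand_code, predicted_code).
    """
    shared = sorted(set(hand) & set(predicted))
    if not shared:
        raise ValueError("no messages have both a hand code and a prediction")
    confusion = Counter((hand[m], predicted[m]) for m in shared)
    agree = sum(c for (h, p), c in confusion.items() if h == p)
    return agree / len(shared), confusion


def monthly_negative(messages, negative_code=-2):
    """For each month, return (negative count, total posts, proportion).

    messages is an iterable of (month, code) pairs.
    """
    totals = Counter()
    negatives = Counter()
    for month, code in messages:
        totals[month] += 1
        if code == negative_code:
            negatives[month] += 1
    return {m: (negatives[m], totals[m], negatives[m] / totals[m])
            for m in sorted(totals)}


if __name__ == "__main__":
    # Toy records, invented purely to show the shape of the output.
    toy = [("1997-10", -2), ("1997-10", 1),
           ("2011-01", -2), ("2011-01", 1), ("2011-01", 1), ("2011-01", 0)]
    for month, (count, total, share) in monthly_negative(toy).items():
        print(month, count, total, round(share, 3))
```

In the toy records the number of mean posts is the same in both months while the proportion falls, which is the pattern you would expect if a few posters supply most of the negative feedback and the list grows around them. Looking at the confusion table rather than only the overall agreement rate would also show whether the classifier is specifically bad at the -2 category, which is the one that matters here.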
The Ripley paradox
Let me conclude by briefly exploring the ethical issues raised by Causey in his post. Being mean isn’t so nice, but it can be an effective way of helping the list function better, thus serving the public good. Contributors to lists can often seem snappy and downright nasty, but they’re really being altruistic. They may be a bit mean because they’re tired after answering the same questions over and over for a decade, but they’re ultimately helping people out.
The downside of these mean commenters is that sometimes they seem sooooo eager to give snappy answers to stupid questions that I fear they go out of their way to answer the easy questions and skip out on the toughies. You might say I do the same thing on the blog, sometimes. I’ll read a serious comment and respond only with a +1 or not even that, whereas I’ll go endlessly back and forth with trolls. There’s some way in which correcting error seems so urgent, while more serious exploration can wait.
So, I appreciate the years of unpaid service that these volunteers have put into the R help list, and as far as I’m concerned, they can be as crabby as they want for as long as they want. I hope we can work out a better system so they can be even more effective and so that they don’t feel they need to waste so much of their time on the easy questions.