Using trends in R-squared to measure progress in criminology??

Torbjørn Skardhamar writes:

I am a sociologist/criminologist working at Statistics Norway. As I am not a trained statistician, I sometimes find myself needing to check basic statistical concepts. Recently, I came across an article that I found a bit strange, and I needed to check my understanding of a very basic concept: the r-squared. In doing so, I realized that this was also an interesting case of research ethics. Given your interest in research ethics, I thought this might interest you.

The article in question, by Weisburd and Piquero, is attached. They analyze the reported results from all articles published in the highest-ranking criminology journal from 1968 through 2005 to determine whether there has been any progress in the field of criminology. Their approach is basically to calculate the average r-squared from the linear models in the published articles. For example, they state that “variance explained provides one way to assess the state of the science of criminology and its relevance for public policy, and how that science has changed over time” (page 455, final paragraph). They find that the “explained variance” is generally low, and even declining, so there has not been much progress, and they conclude: “That criminology is not developing models of crime with more explanatory power over time is troubling” (page 491, first sentence).

I had to go back to my old statistics textbooks to find out whether interpreting the r-squared statistic in this way makes much sense. I think it doesn’t, but I’m not entirely sure whether there are circumstances where it might be meaningful after all. Perhaps if the sole purpose is predicting a specific phenomenon? (But that is usually not the purpose at all.)

The research ethics issue is related to the statistical one. While trying to find out whether I had misunderstood something about r-squared, I came across Gary King’s 1986 article “How Not to Lie with Statistics,” where he also discusses the direct interpretation of r-squared. If I have understood it correctly, King argues that r-squared is only meaningful for comparing models fit to the same data with the same outcome variable. As far as I can tell, King’s argument then implies that calculating the average r-squared across studies does not make much sense.
I also found your blog posts with related arguments: http://statmodeling.stat.columbia.edu/2007/08/rsquared_useful/ and http://statmodeling.stat.columbia.edu/2012/10/r-squared-of-1/
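A quick simulation makes King’s point concrete. The sketch below is illustrative (it is not from King’s article or the blog posts): it fits the same true model, with the same slope and the same error variance, to samples that differ only in how spread out the predictor is.

    # Illustrative sketch: identical data-generating process (same slope,
    # same error variance), but samples with different predictor spread.
    # R-squared differs dramatically, so averaging it across studies with
    # different data says little about the quality of the models.
    import numpy as np

    rng = np.random.default_rng(42)
    n, slope, noise_sd = 1000, 1.0, 1.0

    for x_sd in (0.5, 1.0, 3.0):            # only the spread of x varies
        x = rng.normal(0, x_sd, n)
        y = slope * x + rng.normal(0, noise_sd, n)
        r2 = np.corrcoef(x, y)[0, 1] ** 2   # R-squared of the simple regression
        print(f"sd(x) = {x_sd:.1f}  ->  R^2 = {r2:.2f}")
    # Prints roughly 0.20, 0.50, and 0.90: same model, very different R^2.

If the average r-squared in a literature drifts over time, it may reflect changes in what kinds of data and outcomes researchers study rather than any change in explanatory power.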

Then I realized that Weisburd and Piquero actually cite King’s article (on page 464), but only in a side note about how one can easily manipulate a model to get a higher r-squared. Since they cite King’s article, we must assume they have read its main arguments too. But it appears that they simply ignore them. The ethical issue arises when authors deliberately ignore highly relevant arguments that might undermine their own publication. In this case, it looks like Weisburd and Piquero put forward an argument that they know (or should know) does not hold. At the very least, they should have discussed the counterarguments properly.

By the way, it should be noted that Weisburd and Piquero are much-cited criminologists with real influence on the criminological literature. So these are not beginners’ mistakes.

My reply: I do think that R-squared can be a useful summary of a fitted model. But I question the premise that progress in criminology research would be characterized by an increase in explained variance. When I think of my own applied research, progress is typically not measured by pure prediction. It’s often important to work on problems where individual outcomes are difficult to predict.
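For example, here is an illustrative sketch (not from any particular study): a randomized intervention with a real, policy-relevant average effect can still leave R-squared near zero, because individual outcomes are dominated by idiosyncratic variation.

    # Illustrative sketch: a real average treatment effect, swamped at the
    # individual level by noise. The effect matters; R-squared is tiny.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    treated = rng.integers(0, 2, n)              # randomized 0/1 assignment
    y = 0.2 * treated + rng.normal(0, 1, n)      # 0.2 sd effect, unit noise

    ate = y[treated == 1].mean() - y[treated == 0].mean()
    r2 = np.corrcoef(treated, y)[0, 1] ** 2
    print(f"estimated effect = {ate:.2f}, R^2 = {r2:.3f}")   # ~0.20, ~0.010

By the Weisburd and Piquero criterion, a field that moved toward studying such interventions would look like it was regressing, even while learning things of real practical value.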

6 thoughts on “Using trends in R-squared to measure progress in criminology??”

  1. One possibility that jumps to mind: Could there have been a shift from grouped to individual-level studies?

  2. Measuring progress by changes in r-squared is a particularly daft idea, but this article only slightly exaggerates the field’s sorry understanding of the connection between causality and statistics. Notions commonly found in criminological research articles (some of them more often implicit than explicit):

    1. R-squared: The bigger, the better.

    2. When a variable has been shown or can be argued to be connected to the outcome, you must control for it. Overcontrol problems don’t exist.

    3. It is totally feasible to estimate the effects of multiple independent variables on the dependent variable in the same model.

    4. Unless there is multicollinearity! If the VIF is bigger than 4 (or 5, or 10), you must run a factor analysis to reduce the variable space. You then construct indexes of the variables that load on a common factor, put these in the regression instead, and, voilà, less collinearity. Alternatively, just leave variables out of the model. (For what VIF actually computes, see the sketch at the end of this comment.)

    To end on a positive note, it is my impression that things are slowly getting better.
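    Since point 4 above leans on VIF thresholds, here is a minimal sketch of what that number actually is (the code is illustrative, not from any of the articles under discussion): regress each predictor on the others and take VIF_j = 1/(1 - R_j^2). Ironically, it is built from yet another R-squared.

        # Illustrative sketch: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes
        # from regressing predictor j on all the other predictors.
        import numpy as np

        def vif(X):
            """Variance inflation factor for each column of X (illustrative)."""
            X = np.asarray(X, dtype=float)
            factors = []
            for j in range(X.shape[1]):
                others = np.delete(X, j, axis=1)
                A = np.column_stack([np.ones(len(X)), others])  # add intercept
                beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
                resid = X[:, j] - A @ beta
                r2 = 1 - resid.var() / X[:, j].var()
                factors.append(1.0 / (1.0 - r2))
            return factors

        rng = np.random.default_rng(1)
        x1 = rng.normal(size=500)
        x2 = x1 + rng.normal(scale=0.3, size=500)   # nearly a copy of x1
        x3 = rng.normal(size=500)                   # unrelated
        print([round(v, 1) for v in vif(np.column_stack([x1, x2, x3]))])
        # x1 and x2 get VIFs around 12; x3 sits near 1.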

  3. This is especially crazy if there is no consideration of what the dependent variables are. If these were all models in which crime rates were the outcome, for example, we could have a simple conversation about why it’s a bad idea. As it is, we can’t even begin.

  4. I have a macro that lets me print out these references automatically whenever someone starts droning on about R^2s not being “big enough” or about the “goal” of social science being to “increase” them.

    Abelson, R.P. A Variance Explanation Paradox: When a Little is a Lot. Psychological Bulletin 97, 129-133 (1985).

    King, G. How Not to Lie with Statistics. Am. J. Pol. Sci. 30, 666-687 (1986).

    Rosenthal, R. & Rubin, D.B. A Note on Percent Variance Explained as A Measure of the Importance of Effects. J. Applied Social Psychol. 9, 395-396 (1979).

    Obviously it has its uses, but “R^2 envy” is a tyrannical species of thoughtlessness that can never be over-ridiculed.

    Or if you don’t like that metaphor, try this one. It is to social science what measles is to public health. As the dates of these articles suggest, a vaccine for this childish affliction was developed three decades ago, but occasionally cases crop up in fields w/ low vaccination rates, at which point aggressive measures for containment are in order.

      • No. I love that paper. Because I love anything having to do with measuring indoor radon levels.

        But putting that aside, I am not saying that R^2 has no uses. E.g., to help evaluate the relative fit of alternative models (representing competing hypotheses) as applied to the same data set, or (relatedly) to assess (via the change in R^2/F-test) hypotheses relating to interactions etc.
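        A sketch of that second use (illustrative code, not from any of the papers cited above): fit nested models to the same data and test the added terms via an F-test on the change in R^2.

            # Illustrative sketch: F-test on the change in R^2 between nested
            # models fit to the same data (here, testing an interaction term).
            import numpy as np
            from scipy import stats

            rng = np.random.default_rng(3)
            n = 200
            x = rng.normal(size=n)
            z = rng.normal(size=n)
            y = x + 0.5 * x * z + rng.normal(size=n)   # truth has an interaction

            def r_squared(design, outcome):
                beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
                resid = outcome - design @ beta
                return 1 - resid.var() / outcome.var()

            ones = np.ones(n)
            r2_reduced = r_squared(np.column_stack([ones, x, z]), y)
            r2_full = r_squared(np.column_stack([ones, x, z, x * z]), y)

            q, k = 1, 3      # terms added; number of slopes in the full model
            F = ((r2_full - r2_reduced) / q) / ((1 - r2_full) / (n - k - 1))
            print(f"delta R^2 = {r2_full - r2_reduced:.3f}, F = {F:.1f}, "
                  f"p = {stats.f.sf(F, q, n - k - 1):.2g}")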

        Necessarily, the “progress in size of R^2” thesis in the paper isn’t about that.

        Moreover, there can be *plenty* of “advances” in criminology that involve figuring out new ways to measure things in a reliable, valid way & draw inferences therefrom, advances that don’t “increase R^2” relative to previous models and that in fact result in “small” but *practically* important effects (the point that Abelson & Rosenthal, in particular, stress).
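        To put numbers on “small but practically important,” here is the sort of arithmetic behind Rosenthal & Rubin’s binomial effect size display (an illustrative sketch):

            # Illustrative arithmetic: Rosenthal & Rubin's binomial effect size
            # display. An R^2 of 0.01 corresponds to r = 0.10, i.e., success
            # rates of 45% vs. 55% across two groups, a ten-point difference.
            r2 = 0.01
            r = r2 ** 0.5
            print(f"R^2 = {r2:.2f} -> r = {r:.2f}")
            print(f"BESD: {0.5 - r / 2:.0%} vs. {0.5 + r / 2:.0%}")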

        Your paper, I gather, develops a technique for determining R^2 at every level of a hierarchical model. If that helps someone do better any of the things it makes sense to do w/ R^2, then that seems great (even if what they are doing doesn’t involve radon, although why they’d bother in that case, I have no idea). And if someone starts displaying R^2 envy or R^2 measles w/ your technique, that’s not your fault!
