About that claim that police are less likely to shoot blacks than whites

Josh Miller writes:

Did you see this splashy NYT headline, “Surprising New Evidence Shows Bias in Police Use of Force but Not in Shootings”?

It actually looks like a cool study overall, with granular data, a ton of leg work, and a rich set of results that extend beyond the attention-grabbing headline that is getting bandied about (sometimes with ill intent). While I do not work on issues of race and crime, I doubt I am alone in thinking that this counter-intuitive result is unlikely to be true. The result: whites are as likely as blacks to be shot at in encounters in which lethal force may have been justified? Further, in their taser data, blacks are actually less likely than whites to subsequently be shot by a firearm after being tasered! While it’s true that we are talking about odds ratios for small probabilities, dare I say that the ratios are implausible enough to cue us that something funny is going on? (Blacks are 28-35% less likely to be shot in the taser data; table 5, col. 2, PDF p. 54.) Further, are we to believe that suddenly, when an encounter escalates, the fears and other biases of officers suddenly melt away and they become race-neutral? This seems to be inconsistent with the findings in other disciplines when it comes to fear and other immediate emotional responses to race (think implicit association tests, fMRI imaging of the amygdala, etc.).

This is not to say we can’t cook up a plausible sounding story to support this result. For example, officers may let their guard down against white suspects, and then, whoops, too late! Now the gun is the only option.

But do we believe this? That depends on how close we are to the experimental ideal of taking equally dangerous suspects, and randomly assigning their race (and culture?), and then seeing if police end up shooting them.

Looking at the paper, it seems like we are far from that ideal. In fact, it appears likely that the white suspects in their sample were actually more dangerous than the black suspects, and therefore more likely to get shot at.

Potential For Bias:

How could this selection bias happen? Well, this headline result comes solely from the Houston data, and for that data, their definition of a “shoot or don’t shoot” situation (my words) is an arrest report that describes an encounter in which lethal force was likely justified. What are the criteria for lethal force to be likely justified? Among other things, for this data, they include “resisting arrest, evading arrest, and interfering in arrest” (PDF pp. 16-17, actual pp. 14-15—they sample 5% of 16,000 qualifying reports). They also have a separate data set in which the criterion is that a taser was deployed (~5000 incidents). Remember, just to emphasize, these are reports involving encounters that don’t necessarily lead to an officer-involved shooting (OIS). Given the presence of exaggerated fears, cultural misunderstandings, and other more nefarious forms of bias, wouldn’t we expect an arrest report to over-apply these descriptors to blacks relative to whites? Wouldn’t we also expect the taser to be over-applied to blacks relative to whites? If so, then won’t this mechanically lower the incidence of shootings of blacks relative to whites in this sample? There are more blacks in the researcher-defined “shoot, or don’t shoot” situation who just shouldn’t be there; they are not as dangerous as the whites, and lethal force was unlikely to be justified (and wasn’t applied in most cases).
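Miller’s mechanism can be sanity-checked with a toy simulation (all rates below are invented for illustration; nothing here comes from the paper). Even if the decision to shoot a genuinely dangerous suspect is perfectly race-neutral, over-applying the “lethal force justified” descriptors to non-dangerous black suspects dilutes the black sample and mechanically lowers its observed shooting rate:

```python
import random

random.seed(0)

def simulate(n=100_000, over_label_rate=0.5):
    """Toy model: officers shoot dangerous suspects at the same rate
    regardless of race, but arrest reports over-apply 'lethal force
    justified' language to non-dangerous black suspects, diluting
    that group's selected sample with low-danger encounters."""
    counts = {"white": [0, 0], "black": [0, 0]}  # [shootings, sample size]
    for _ in range(n):
        race = random.choice(["white", "black"])
        dangerous = random.random() < 0.2  # same latent danger for both races
        if dangerous:
            labeled = True  # dangerous encounters always enter the sample
        else:
            # non-dangerous encounters get the label far more often for blacks
            labeled = random.random() < (over_label_rate if race == "black" else 0.1)
        if labeled:
            shot = dangerous and random.random() < 0.5  # race-neutral shoot rule
            counts[race][0] += shot
            counts[race][1] += 1
    return {r: s / m for r, (s, m) in counts.items()}

rates = simulate()
print(rates)  # observed shooting rate is lower for blacks despite neutrality
```

With these made-up numbers the observed rate comes out roughly twice as high for whites (about 0.36 vs. 0.17), purely as an artifact of who gets labeled into the sample.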

Conclusion:

With this potential selection bias, yet no discussion of it (as far as I can tell), the headline conclusion doesn’t appear to be warranted. Maybe the authors can do a calculation and find that the degree of selection you would need to cause this result is itself implausible? Who knows. But I don’t see how it is justified to spread this result around without checking into this. (This takes nothing away, of course, from the other important results in the paper.)

Notes:

The analysis for this particular result is reported on PDF pp. 23-25, with the associated table 5 on PDF p. 54. Note that when adding controls, there appear to be power issues. There is a partial control for suspect danger, under “encounter characteristics,” which includes, e.g., whether the suspect attacked or drew a weapon—interestingly, blacks are 10% more likely to be shot with this control (not significant). The table indicates a control is also added for the taser data, but I don’t know how they could do that, because the taser data has no written narrative.

See here for more on the study from Rajiv Sethi.

And Justin Feldman pointed me to this criticism of his. Feldman summarizes:

Roland Fryer, an economics professor at Harvard University, recently published a working paper at NBER on the topic of racial bias in police use of force and police shootings. The paper gained substantial media attention – a write-up of it became the top viewed article on the New York Times website. The most notable part of the study was its finding that there was no evidence of racial bias in police shootings, which Fryer called “the most surprising result of [his] career”. In his analysis of shootings in Houston, Texas, black and Hispanic people were no more likely (and perhaps even less likely) to be shot relative to whites.

I’m not endorsing Feldman’s arguments but I do want to comment on “the most surprising result of my career” thing. We should all have the capacity for being surprised. Science would go nowhere if we did nothing but confirm our pre-existing beliefs. Buuuuut . . . I feel like I see this reasoning a lot in media presentations of social science: “I came into this study expecting X, and then I found not-X, and the fact that I was surprised is an additional reason to trust my result.” The argument isn’t quite stated that way, but I think it’s implicit, that the surprise factor represents some sort of additional evidence. In general I’m with Miller that when a finding is surprising, we should look at it carefully as this could be an indication that something is missing in the analysis.

P.S. Some people also pointed out this paper by Cody Ross from last year, “A Multi-Level Bayesian Analysis of Racial Bias in Police Shootings at the County-Level in the United States, 2011–2014,” which uses Stan! Ross’s paper begins:

A geographically-resolved, multi-level Bayesian model is used to analyze the data presented in the U.S. Police-Shooting Database (USPSD) in order to investigate the extent of racial bias in the shooting of American civilians by police officers in recent years. In contrast to previous work that relied on the FBI’s Supplemental Homicide Reports that were constructed from self-reported cases of police-involved homicide, this data set is less likely to be biased by police reporting practices. . . .

The results provide evidence of a significant bias in the killing of unarmed black Americans relative to unarmed white Americans, in that the probability of being {black, unarmed, and shot by police} is about 3.49 times the probability of being {white, unarmed, and shot by police} on average. Furthermore, the results of multi-level modeling show that there exists significant heterogeneity across counties in the extent of racial bias in police shootings, with some counties showing relative risk ratios of 20 to 1 or more. Finally, analysis of police shooting data as a function of county-level predictors suggests that racial bias in police shootings is most likely to emerge in police departments in larger metropolitan counties with low median incomes and a sizable portion of black residents, especially when there is high financial inequality in that county. . . .

I’m a bit concerned by maps of county-level estimates because of the problems that Phil and I discussed in our “All maps of parameter estimates are misleading” paper.

I don’t have the energy to look at this paper in detail, but in any case its existence is useful in that it suggests a natural research project of reconciling it with the findings of the other paper discussed at the top of this post. When two papers on the same topic come to such different conclusions, it should be possible to track down where in the data and model the differences are coming from.

P.P.S. Miller points me to this post by Uri Simonsohn that makes the same point (as Miller at the top of the above post).

In their reactions, Miller and Simonsohn do something very important, which is to operate simultaneously on the level of theory and data, not just saying why something could be a problem but also connecting this to specific numbers in the article under discussion.

103 thoughts on “About that claim that police are less likely to shoot blacks than whites”

  1. I do a fair bit of media promoting my team’s work (mostly print/web, but a fair bit of radio/TV). In a few dozen interviews in the past two years, I’ve only had one or two where the reporter hasn’t directly asked “What surprised you in this research?” Our PR staff tells all of our researchers to have a ready answer for the question because it is inevitable.

  2. This study is a perfect example of how conditioning on post-treatment variables can lead you astray when estimating a causal effect. If the police perception of a person’s race is the treatment, then almost all situational variables that they condition on in the study are “downstream”: whether an encounter takes place at all, whether the encounter escalates to “potential for lethal force”, whether the encounter escalates to “use of taser”.

    This truncation problem (that many potential encounters don’t even occur if a suspect is perceived as white) is actually a very interesting non-ignorable missing data problem, and without treating it, I’m not sure if we can say much of anything about this issue using police records.
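The truncation point can be sketched in the same toy style (all numbers invented): let perceived race affect whether an encounter escalates, and make the shoot decision race-neutral once it has. Conditioning on escalation, as police-report data forces us to do, then shows no bias even though the total effect on being shot is large:

```python
import random

random.seed(1)

def simulate(n=200_000):
    """Toy mediation model: perceived race raises the chance an encounter
    escalates; the shoot decision given escalation is race-neutral here.
    Conditioning on escalation hides the upstream effect of race."""
    total = {"white": [0, 0], "black": [0, 0]}  # shot / all encounters
    cond = {"white": [0, 0], "black": [0, 0]}   # shot / escalated encounters only
    for _ in range(n):
        race = random.choice(["white", "black"])
        # Race biases escalation, upstream of the shoot decision:
        p_escalate = 0.30 if race == "black" else 0.10
        escalated = random.random() < p_escalate
        shot = escalated and random.random() < 0.05  # race-neutral given escalation
        total[race][0] += shot
        total[race][1] += 1
        if escalated:
            cond[race][0] += shot
            cond[race][1] += 1
    rate = lambda d: {r: s / m for r, (s, m) in d.items()}
    return rate(total), rate(cond)

overall, given_escalation = simulate()
print(overall)           # blacks shot about 3x as often overall
print(given_escalation)  # nearly equal rates once we condition on escalation
```

Here the conditional rates are nearly equal while the overall rate is about three times higher for blacks, entirely through the escalation pathway.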

    • Interest lies in the *direct effect* of race on whether a shooting occurs not through other intermediate variables. It is this direct effect that is interpreted as bias. To estimate this direct effect, it is appropriate to adjust for post-‘treatment’ variables. This relates back to the recent discussion on mediation.

      • I think we may disagree on the estimand here. In my view, the larger social question is the total effect of perceived race on the probability of being shot by police before an encounter has been initiated.

        • Now that I think about it more, I agree with you. So, I don’t know how to draw graphs here, but consider the below paths that can be combined into a graph:

          (1) perceived race–>encounter–>escalation–>shoot
          (2) other_stuff–>encounter
          (3) perceived race–>escalation
          (4) other stuff–>escalation
          (5) perceived race–>shoot
          (6) other stuff–>shoot

          I think you’re saying that the combination of (1), (3), and (5) answers the question “how much does racial bias of police officers contribute to shootings?”. But Fryer with his regressions is only looking at path (5) and (assuming he adjusts sufficiently for ‘other stuff’ and ‘escalation’) answering the question “how much does racial bias of police officers contribute to the decision to shoot after an encounter has escalated assuming that it contributes the same amount in all escalated situations?”. I suppose the best we can ever hope for is to get the controlled direct effect not through encounters (i.e. paths 3 and 5) since cops don’t record data on all the people they see and don’t initiate encounters with (though maybe with bodycams…)

          And all this of course leaves out that policy level bias of stationing more cops in black neighborhoods has a huge impact on encounters.

      • The analysis assumes that all the Incident Reports are truthful and reliable.

        It seems to me that one of the really important aspects of video evidence is how often it proves that police descriptions of events are fabricated after the fact to justify what happened. Gosh, I wonder if racial bias could play any role here!?

        I’m also uncomfortable with the frequent mentions of results being “significant” or not. It looks to me like the Fryer study is a great (seriously) exploratory work, but with so many forking paths it’s ridiculous to use as confirmatory. Or have I totally misunderstood?

        • the really important aspects of video evidence is how often it proves

          Firstly, how often is that really? A few cases reported in the media do not mean it happens often. The media itself distorts the discussion on racial bias by almost exclusively reporting on blacks being shot by the police even though whites are shot more often. Secondly, if your or my life were videoed and then our recollections of what we did were compared to the video evidence, it would probably also appear that we lie a lot. Our memories are a lot less reliable than we realize, as amply shown by research.

  3. 1.) I agree: a very surprising finding is suspicious and warrants extra investigation.
    2.) The paper is unpublished (it is a working paper), but as Prof Gelman argues, peer review is no elixir: it is neither necessary nor sufficient for a paper to be considered “true”.
    3.) I don’t appreciate the tone Mr Feldman uses: Feldman, a PhD student in epidemiology, writes of Harvard full professor and John Bates Clark Medal winner Fryer’s paper that it is “highly flawed, …suffers from major theoretical and methodological errors.” That comment alone discounts, IMHO, the rest of the blog post. No, full professors are not immune from making mistakes, but my prior is that on methodology, Fryer is probably right and Feldman probably wrong.
    4.) Fryer probably should have issued more caveats to discourage headline-grabbing.

    • Jack:

      A related complication is that, although Fryer is listed as sole author of the paper, the project is surely a collaboration between him and several others in the data collection and analysis and interpretation of the findings. And I don’t just mean secretarial help or the equivalent in data processing. I know from many years of experience that social science is hard and there’s not much that one person can do on his own.

      • Yes, Fryer uses an army of research assistants who are not coauthors (standard practice in economics, though considered odd outside the field). There are 14 of them: “Brad Allan, Elijah De La Campa, Tanaya Devi, and William Murdock III provided truly phenomenal project management and research assistance. Lukas Althoff, Dhruva Bhat, Samarth Gupta, Julia Lu, Mehak Malik, Beatrice Masters, Ezinne Nwankwo, Charles Adam Pfander, Sofya Shchukina and Eric Yang provided excellent research assistance.”

        Above I meant I discount Feldman’s blog post, not Prof Gelman’s. Sorry for any ambiguity.

        I didn’t see a link to the actual paper, so here it is (NBER version costs $$):
        scholar.harvard.edu/files/fryer/files/main-july_2016.pdf

        • Jack:

          I think the term “research assistant” can be misleading in that someone who provides “truly phenomenal research assistance” is really a collaborator. Or, to put it another way, to do social science research you need to talk things over with people and try out different ideas. If the sole author of this paper did not do so—if the research assistants merely collected data, ran regressions, etc.—then this could explain lots of problems, as it’s really really really hard to get these things right on your own.

    • Jack:
      Yeah Feldman’s assessment didn’t take into account Fryer’s overall contribution, and one of the linked papers actually corroborates Fryer’s results about “shoot or don’t shoot” situations! Well, at least if you accept simulator evidence, and take the decision to shoot to the penultimate moment (PAPER HERE)

    • Fryer has never published on policing before (and Feldman has only published one paper on PLOS last December). I think both are somewhat in the territory of doing analyses without really being domain experts.

      • Feldman’s remarks are both confused and strikingly obnoxious. He doesn’t really understand the concept of “statistical discrimination,” he completely misunderstands what “rational” means, and he repeats, in an unclear manner, limitations of Fryer’s study that are clearly acknowledged as limitations by Fryer.

        Feldman does not actually have a paper, as the term is generally understood, on policing: the piece in PLOS is not peer-reviewed, it’s an opinion piece, and all it does is point out that the Guardian collects data on police shootings and call for better data from U.S. authorities.

        Fryer on the other hand has published extensively on empirical methods to detect discrimination, literally writing the book on the subject in the sense of authoring the relevant chapters in the Handbook of Labor Economics, the Handbook on Economics of Discrimination, and the Handbook of Social Economics. He has published dozens of highly-cited articles in major journals on detecting racial discrimination in a variety of contexts.

        Of course none of this means Fryer is right and Feldman is wrong, but the claim that Feldman is somehow more expert than Fryer in this context is ludicrous.

        I have no idea why Andrew is promoting Feldman’s blog post.

        Incidentally, Ross’s paper in PLOS One answers different questions than Fryer’s paper, so I am not sure why Andrew thinks there’s anything to “reconcile.” The descriptive statistics presented by Fryer are consistent with the analysis in the Ross paper.

        Andrew, I am curious: did you actually read Fryer’s paper or Ross’s paper?

        • Chris:

          I read all these papers quickly, none in detail. The main point of my post was to convey Josh Miller’s thoughts, which seem very reasonable to me. I mentioned Feldman’s post because he happened to send me an email on it just before I was posting. I would not say I was “promoting” Feldman’s blog; in retrospect maybe it would’ve been better for me to just have asked Feldman to post a comment. Finally, I mentioned the Ross paper because a bunch of people pointed it out to me. I’m not disagreeing with you that the data in Fryer’s and Ross’s papers are consistent with each other, but certainly the headline messages coming from the two papers are much different, hence the need to reconcile. By mentioning the goal of reconciliation, I didn’t mean to imply that reconciling the two papers would be difficult; I just think it would be useful, given that the headline messages of the two papers are so different.

        • I’m curious: Even if the “headlines” of both Ross and Fryer say two different things, can one explain what this “bias” really means “on the ground”? Have you given greater thought to how and why the data is “consistent”? As an individual with a sociology and journalism background, the data analysis is a little over my head at times. But I see the gist: When most data says there’s roughly 3-4 times (3.49 times according to Ross) greater likelihood of being shot when unarmed if black vs. white, or of having force used against you, we’re in the same ballpark. (In raw numbers, about 40% of Americans shot and killed by police while unarmed are black, and around 25% of those are simply shot and killed, according to the Washington Post.) The “next step” is what I think I did with Ross’s use-of-force data, and also another Summer 2016 report by the Center for Policing Equity that also says there’s about a “20% more likelihood” of force being used against you if black than white (with controls) and 360% more likely without. But “20% more” is completely misleading, I think, as to its actual importance, despite being “statistically significant.”

          https://medium.com/@agent.orange.chicago/how-roland-fryers-controversial-study-on-racial-bias-by-police-actually-shows-negligible-bias-ea3a8b1fd293#.45pzqqvkv

          AS I WROTE:

          Do the simple math on any interaction with New York City police (see graphic above) during the controversial “stop-and-frisk” decade and you get nearly the same percentage for black and white citizens:

          * 1 out of every 64 blacks stopped will have a weapon DRAWN on them
          (1.5 % of encounters)
          * 1 out of every 77 whites stopped will have a weapon DRAWN on them
          (1.2 % of encounters)

          Or:
          * 1 out of every 185 blacks stopped will have a weapon POINTED at them
          (.5% of encounters)
          * 1 out of every 232 whites stopped will have a weapon POINTED at them
          (.4% of encounters)

          You can do this 5th-grade arithmetic for all those moments of “disparities” (from 16% to 25%) among the “use-of-force continuum,” which I started doing in the Center for Policing Equity study that I debunked. Even news outlets like Fortune focus on the higher percentages, which tend to lead readers to “bias is the problem, not behavior” conclusions. But that shocking figure of “25% more likely” use of force with a baton or pepper spray really just compares 5 times out of 10,000 (blacks) versus 4 times out of 10,000 (whites). While it’s technically true that this is a “25% higher” likelihood of a baton beating — and that’s the number media and academia promote — you’re still talking about pretty much the same minuscule rate or percentage of occurrence, i.e., .05% (black) versus .04% (white). That’s a paltry figure that doesn’t fit the narrative (or the cliché anti-police t-shirts) of cops beating down citizens.
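The commenter’s arithmetic, written out (using their hypothetical 5-in-10,000 vs. 4-in-10,000 figures): a 25% relative increase and a one-in-ten-thousand absolute difference are the same comparison stated two ways.

```python
# Relative vs. absolute risk, with the hypothetical rates from the comment:
black_rate = 5 / 10_000   # 0.05% of encounters involve, e.g., a baton
white_rate = 4 / 10_000   # 0.04% of encounters

relative_increase = (black_rate - white_rate) / white_rate
absolute_difference = black_rate - white_rate

print(f"relative: {relative_increase:.0%}")    # the headline-friendly "25%"
print(f"absolute: {absolute_difference:.4%}")  # 0.0100% of encounters
```

Which framing is more informative depends on the question; the point is that the two can be reported from the same two numbers.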

    • My experience and prior is the opposite – that senior tenured professors are usually the least discerning interpreters and least capable practitioners of empirical data analysis.

    • I’m skeptical of this. If you’re a police officer, know this issue is in the air, and are suddenly put into a deadly force simulator where there is no personal threat and, oh hey, some suspects are white and some are black, I’m going to guess you’re not going to behave as you normally would on the street. Let alone the selection bias of those officers who do the drill.

    • JP: nice. Given that the participants are being observed, and police in the field typically are not, might that not play a role? This appears to be evidenced by the fact that the civilians in the simulators are also less likely to shoot blacks. Civilians don’t typically face the same pressures as police, but apparently they do in the simulation. I’d like to see some tough incentives on these participants, to kill, as much as possible, this motivation to look good to the observer. It probably wouldn’t be good enough, because the key variable that is likely to lead to a bias in the decision to shoot is FEAR, and that is hard to simulate in the lab. (Also, if you are interested in another one, Feldman provided one, which I linked above.)

        • If the simulation results had gone the other way, would they also be meaningless? This bit where you tell a story that sounds plausible for why result X can be ignored but result Y should not sounds a lot like the thought process that leads you down the garden of forking paths.

        • Sounds like you mean confirmation bias, not forking paths.

          Anyway, the problem here pertains to the external (and construct) validity of the experiment.

          The goal is to design an experiment that has the salient features of the real world that have been hypothesized to lead to a bias in policing, and there are two prominent ones:

          1. Police fear of black males.

          2. Lack of accountability/Observability/Judgement

          This experiment has neither of these, so I am not interested, without even looking at the results.

          Now is my explanation for their results correct? Who knows, but in my experience, if you give people a cheap way to signal to others (and themselves) that they are a non-racist socially acceptable person (PC), they will jump at the opportunity. If your Facebook network is big enough, you will see it everyday. Now do we believe it?

          This is not to say I wouldn’t be interested in a simulated shoot-or-don’t shoot situation, but you gotta do it better than this.

  4. Of course we also have to ask why are we surprised? The default (completely naive) position would be that police treat everyone the same. Clearly we have news and media examples and statistics that tell us that’s not the case, so we believe it’s not the case, so we do research to find out and dig deeper. Which result should be surprising? What’s Bayes have to say? :-)

    To me, we’re still early and in the middle of that last (research) part; I think there will be some research coming out of social experiments soon (such as police departments demilitarizing or changing “broken windows policing” policies) where we can garner more data.

    But I actually didn’t find the results surprising… I tend to believe that we (humans) tend to overestimate the impact of rare events, particularly when they’re horrific or even just problematic, and mostly because of how news travels. Not that there isn’t something to fix, I’m just not surprised if we mischaracterize it due to some human bias. This is my favorite article on the topic of surprise findings lately (and it’s unrelated to race and police): http://www.nytimes.com/interactive/2016/06/03/upshot/up-college-unemployment-quiz.html

  5. Wasn’t the effect insignificant (i.e. consistent with just a small effect of bias on the decision to shoot)? It’s not shocking to me that bias would have a small effect given that the disproportionate number of encounters black people have with police seems to be able to explain most of the disparity in shootings. White people look at the Minnesota video and can’t imagine it happening to them, but the victim had been stopped over 50 times. Maybe if white people imagined being stopped 50 times with a gun in their pocket instead of just once it wouldn’t seem as far fetched that it could happen to them too.

      • +1
        Do we know the proportion of African-American police officers in Houston relative to the rest of the country? Could a higher proportion explain why Houston is (?) different than the rest of the country?

        • This question assumes that black officers are immune to the conditioning that makes police officers feel that black men, all else equal, are more likely to be dangerous in an encounter. Anecdotally, as a black man, I’ve had nearly as many potentially dangerous encounters with black officers as I have with white officers. Many of my friends have similar stories.

        • According to 538, Houston has the 20th most mismatching police-to-populace demographics of the 75 largest police forces in the US, at least as of 2010. fivethirtyeight.com/features/reexamining-residency-requirements-for-police-officers/

        • Although I think that graph says that blacks are actually overrepresented on the Houston police force; they do relatively poorly because of a lack of Hispanics.

    • Interesting quote from that study:

      >It is notable that Miami-Dade (FL, contains Miami), Harris (TX, contains Houston), and Cook (IL, contains Chicago), stand out as counties where the ratio of {black, unarmed, and shot by police} to {white, armed, and shot by police} is elevated to 19.08 (PCI95: 4.46, 81.13), 6.71 (PCI95: 1.46, 26.77), and 5.60 (PCI95: 1.25, 21.97) respectively.

      And in Fryer the shooting data was all from Houston, ratio 6.71 (PCI95: 1.46, 26.77).

      The relative risks in this PLOS study are a little odd, though: what we’d really like are ratios of P(shot|black, other factors)/P(shot|white, other factors), but since it’s based only on shooting data we get P(black|shot, other factors)/P(white|shot, other factors), so local population demographics and crime rates will affect the result as well.
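The inversion is just Bayes’ rule, and a few lines with invented numbers show why P(race | shot) alone can’t answer the question: with per-person risk set identical by construction, the ratio of P(race | shot) simply reproduces the local demographic ratio.

```python
# Hypothetical county: 20% black, 80% white, identical per-person risk.
p_black, p_white = 0.2, 0.8
p_shot_given_black = p_shot_given_white = 1e-4  # equal risk by construction

# Bayes' rule: P(race | shot) = P(shot | race) * P(race) / P(shot)
p_shot = p_shot_given_black * p_black + p_shot_given_white * p_white
p_black_given_shot = p_shot_given_black * p_black / p_shot
p_white_given_shot = p_shot_given_white * p_white / p_shot

ratio = p_black_given_shot / p_white_given_shot
print(ratio)  # 0.25 == p_black / p_white: demographics, not differential risk
```

So a shooting-only data set can show an elevated or depressed ratio of P(race | shot) without telling us anything by itself about differential per-encounter risk.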

  6. The ‘Washington Post’ has recently done much independent research in this area. The WP quickly discovered that official FBI/police records were highly inaccurate; for example, that police nationwide were killing more than twice as many people as the FBI had previously reported officially. In October 2015, FBI Director, James B. Comey, said it was “unacceptable” that journalists had become the leading source of information on the subject of police shootings.

    The Selection-Bias issue here extends to automatically ‘trusting’ police data/reports. Individual policemen and police departments have very strong incentive to minimize or conceal any negative aspects or culpability in their violent encounters with the populace.

    • A couple of points that I think don’t get talked about but probably factor in to the issue significantly.
      1) Escalate/de-escalate – I suspect that many times police opt to de-escalate situations where an individual is white, as opposed to black. The quantity of blacks in prison for ultimately minor drug offenses would seem to bear this out. If a black man is going to get stopped more frequently than a white man, which I think no one disputes, the chance that he will over time get stopped and hauled off to jail for a minor drug offense probably motivates many to run or fight. Then the situation rapidly escalates with a bad outcome. The white man with that joint in many cases is going to see the policeman de-escalate the situation, and since his odds of being stopped are lower, he is at much less risk of seeing the inside of a jail.
      2) The educational requirements for many police and sheriff’s departments are probably much lower than many people realize. Often a high school diploma or GED and no felony convictions are all that is required to put on a badge.

      Probably there should be a much more well-defined use-of-force rule in place for police in general. Should a minor drug offense escalate to the use of lethal force? Probably not.

  7. Isn’t the point of the post that a study like this needs to be examined, not that the study is wrong? I find the comments interesting because a number exhibit a bias that the study must be wrong – kind of like “Brexit can’t happen, so the model must be wrong” – and come up with reasons to justify that belief.

    My take on the study was different: people expect an obvious result which shows blacks shot, violently arrested, etc. and that doesn’t appear in the data or analysis, whether the single conclusion from one city’s data is correct or not. To be clear, if you took out the headline about whites being more likely to be shot, etc., then you have a study which still goes against the existing bias by saying it doesn’t look like there’s a meaningful difference.

    My completely personal take on this is very complicated and I describe it only to show how hard it would be to analyze this topic. Example: I think the black community is generally closer or more identified with members who have criminal records, a belief based on the rates of arrests, convictions and incarcerations of black males (a whopping 37% of the state and federal prison population and, as I remember from the DoJ stats, a higher percentage of younger black males given the age of the prison population). The effects of these rates have been studied – don’t know how well – and related to effects like the increasing number of single black mothers getting college degrees. By contrast, in NH a white man gunned down some police and was shot – all on tape – and there was outrage because one of the officers had been harassing the shooter for a long time but, bluntly, few white NH people identified with that guy’s situation. But it’s extremely hard – impossible? – to separate this kind of effect that can be described in reasonable words from more inchoate anger rooted in relative poverty and other issues, such as the need within the community for more policing not less and the natural feeling that not only do you suffer more crime but you get hassled as well. Contradictory impulses are difficult to pull apart.

    It’s even difficult to think about the lesser categories in the study, such as the 12% or so increase in use of hands. Are all arrests just racial or is crime involved? (This goes back to the 37% number above.) If you’re a cop, black or white or whatever, you have expectations and you also want to protect yourself. I was somewhat surprised the differences in many categories of lesser “violence” were so low given where much of the data comes from. The police in general, not in specific situations, may actually be better at evaluating personal risk than we are and do it with less bias than we civilians would.

    • If you take out the part about whites being more likely to be shot, the headline of the article reads “New Evidence Shows Bias in Police Use of Force”, which I think goes along with most people’s beliefs. Whether you believe those particular parts of the analysis or not is probably dependent on what you think of the data/analysis in general.

      Also pointed out in the Times article is that the entire analysis is predicated on a stop having been made, and that other research has shown that blacks are more likely to be stopped in the first place. Even if police officers were completely unbiased in their interactions with people, blacks would disproportionately experience shootings and other poor treatment because they experience more interactions. Further thinking along this line is over at http://datacolada.org/50, where they square the results by noting that you might actually expect less violence/fewer shootings against blacks if they’re stopped more often, because many of those extra stops will involve people who have done nothing wrong and aren’t violent/threatening.
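      The dilution mechanism in that Data Colada argument can be sketched with made-up numbers (every rate below is hypothetical): if the extra stops of one group mostly involve people who pose no threat, that group’s per-stop shooting rate falls even when officers treat genuinely threatening suspects identically.

```python
# Hypothetical illustration of the selection/dilution point: extra stops
# of mostly non-threatening people lower a group's per-stop shooting rate
# even when police behave identically toward threatening suspects.

def shooting_rate_per_stop(n_threatening, extra_benign_stops,
                           p_shoot_threatening=0.10, p_shoot_benign=0.001):
    """Per-stop shooting rate when stops = threatening suspects + benign stops."""
    stops = n_threatening + extra_benign_stops
    shootings = (n_threatening * p_shoot_threatening
                 + extra_benign_stops * p_shoot_benign)
    return shootings / stops

# Identical numbers of genuinely threatening suspects in each group,
# but group B is stopped far more often overall.
rate_a = shooting_rate_per_stop(n_threatening=100, extra_benign_stops=900)
rate_b = shooting_rate_per_stop(n_threatening=100, extra_benign_stops=4900)
print(rate_a, rate_b)  # roughly 0.0109 vs 0.00298
```

      With these invented numbers the heavily stopped group shows a per-stop shooting rate less than a third of the other group’s, even though zero bias is built into the shooting decision itself.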

        • Alex & Jonathan

          Alex: In your link to datacolada, Uri Simonsohn is making the same point (independently), see my “Notes.” Uri is more thorough though, see his footnotes.

          FYI: Fryer discusses (and addresses) the issue of stops. The point here is that one cannot control for the level of danger across white and black suspects given the selection criteria.

        • Yes, the link buttresses the point I originally made, which is about the bias of disbelief. The effect found wasn’t large but, as I noted, if you changed the headline to say “there’s not much discernible,” that would be just as surprising, because people expect – and I think want or even need – to find that bias must be there.

          As an aside, imagine if this work came out from a white economist. My guess is this would be described as racist and people would develop reasons and aspersions why this is nonsense. I’m trying not to talk about race or racism but instead about how bias infects our ways of thinking, our models and our analyses.

        • Jonathan

          1. agreed on the delicate race issues involved for researchers.

          2. I think you are missing the point on “there’s not much discernible.” As Uri points out explicitly, the data here are also consistent with police being more likely to shoot blacks than whites. (I hint at this in the “Notes” above.)

          3. On disbelief, I think you are overlooking the possibility that there are non-biased reasons to respond with disbelief, here is one way:

          We know three things:
          a. black males are feared more than white males, and you can measure this in many different ways (implicit association tests, GSR, fMRI of the amygdala)

          b. fear can distort the perception of threat.

          c. blacks are more likely to be shot by police.

          Given a&b: Do we think c is entirely attributable to blacks being stopped more often, and blacks having statistically higher rates of violent behavior? I doubt it greatly.

          Now a study comes out that seems to be making this exact counter-intuitive point, and it comes out with near perfect timing; it is just too good of a result. So it is natural to think that something must be up, and to express disbelief.

        • First, “perfect timing”? Really? This subject has been in the news for a long time. Michael Brown was shot in August, 2014 and he wasn’t the first nor the only reason why a black economist might be interested.

          Second, I’m commenting on the headline about shootings. The study clearly finds – and across multiple analyses – that use of force is highest and most racially unbalanced at the low end of the force scale, meaning application of hands, and that racial bias or unbalance lessens (and fairly consistently) as the type of force increases. This is true under a variety of assumptions. The link to Data Colada concludes: “Finally, I’d argue we expect not just a main effect of threat, one to be controlled with a covariate, but an interaction. In high-threat situations the use of force may be unambiguously appropriate. Racial bias may play a larger role in lower-threat situations.” Well …?

          Third, the 3 part statement of a prior is exactly the kind of thing I’m talking about: you need to be willing to upend where that leads you. It’s easy to construct a prior which demands police bias as a result.

          Fourth, I don’t mean to be arguing from “authority” but rather am saying the idea that they literally missed the quantity of stops doesn’t make sense. And when you read the material, it comes through that they treated the data with that in mind. I of course may be wrong, but I think it’s more likely an issue which could be addressed by adding a footnote. I’m also not convinced people have the issue correct, but that’s a longer topic and involves my own questions about the specifications they used.

          Fifth, I find it humorous that people are complaining so much about sample size and data problems given typical studies. It’s like saying “Nothing we do is reliable at all except you don’t notice that unless and until someone gets mad enough to pay attention to the work.” And that calls into question the validity of or at least the usage of statistical methods. Yeah, data is a mess. And analyzing it involves choices. (This is not meant to be a personal comment but a generalization.)

        • Hi Jonathan

          Sorry, I was unclear. By “perfect timing” I meant the *perception* that the paper could have been rushed to fit the news cycle with the two prominent recent shootings. This can be part of the explanation for the disbelief. This perception happens to be wrong: metadata on the PDF says the paper was compiled July 5th – the day of the Baton Rouge shooting and the day before the Minnesota shooting – so it is only a coincidence. It just happened to hit the news cycle at the right moment.

          Fine on 2. And 3 is fine, too – but, hey, the point is, it doesn’t have to come from a simplistic biased prior; there is a clear psychological model for why we should expect this, and be surprised when it isn’t there.

          On 4, I don’t get this. The main point here, and in Uri’s post, isn’t about the quantity of stops.

          On 5, agreed, this is a general issue. Hopefully there are enough people with different tastes and interests to catch the different biases, mistakes, etc. It’s the same issue in charitable giving: some people donate to the Humane Society, others to Doctors Without Borders… we hope all issues that require charity are addressed, but probably not.

        • I might have missed it in the methods, but there’s certainly no caveat about it in the discussion. If you’re going to bother to point out that you can’t randomly assign race, you might also point out that there’s a potentially large bias involved before even getting into the data at hand.

          I also don’t think the appeal to authority buys much. Even really smart people miss things or leave them out. For example, in footnote 17 Fryer says that they never would have considered (or bothered) looking at the reliability of their coding if someone hadn’t suggested it. Even then there is no report of the reliability, just the assurance that the results don’t change.

        • Jonathan:

          Regarding the point about “a team led by a Clark Medal winner at Harvard”: One thing that frustrates me about this and others of Fryer’s papers is the single-authorship. I don’t care so much about whether the other authors of this work were given credit—I assume they are getting paid well enough, as well as learning a lot. Rather, my problem is that when there is only one stated author, the person who is nominally responsible for the work is not the same as the person who did all the work. I know that Fryer is a respected researcher and I have no reason to doubt that he’s done a lot of good stuff, but he has made some pretty basic mistakes. A few years ago I noticed a very basic error he made, misreading one of his own graphs! I have no idea what happened there, but my guess is that the person who made the graphs was not the person who wrote the erroneous text—and it’s possible that neither of these two people was Fryer.

          An advantage of including all the authors of the work as authors of the paper is that these authors can take responsibility for their work, in a way that doesn’t happen if people who collect data, or make graphs, or write text, are merely listed as research assistants.

  8. Is this situation, “the claim that police are less likely to shoot blacks than whites” sort of similar to the famous “paradox of the smoking mother”?

    https://en.wikipedia.org/wiki/Low_birth-weight_paradox

    “The low birth-weight paradox is an apparently paradoxical observation relating to the birth weights and mortality rate of children born to tobacco smoking mothers. Low birth-weight children born to smoking mothers have a lower infant mortality rate than the low birth weight children of non-smokers. It is an example of Simpson’s paradox.”

    “The birth weight distribution for children of smoking mothers is shifted to lower weights by their mothers’ actions. Therefore, otherwise healthy babies (who would weigh more if it were not for the fact their mother smoked) are born underweight. However, they still have a lower mortality rate than children who have other, more severe, medical reasons why they are born underweight.”

    “In short, smoking is harmful in that it contributes to low birth weight which has higher mortality than normal birth weight, but other causes of low birth weight are generally more harmful than smoking.”

    Reasoning from analogy, police are less likely to shoot a person with black skin than a person with white skin because blackness alone produces a confrontation; confrontation with a white person is a more serious indication of an underlying problem.
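    The analogy can be made concrete with invented numbers: suppose the decision to shoot depends only on how serious the underlying situation is, never on race, but confrontations with one group are mostly serious situations while confrontations with the other include many minor ones.

```python
# Invented numbers illustrating the Simpson's-paradox-style analogy:
# shooting depends only on situation severity, never on race, yet the
# per-confrontation rates differ because the severity mixtures differ.

p_shoot = {"serious": 0.20, "minor": 0.01}  # race-neutral by construction

confrontations = {
    "W": {"serious": 80, "minor": 20},    # mostly serious situations
    "B": {"serious": 80, "minor": 220},   # diluted with minor situations
}

rates = {}
for group, mix in confrontations.items():
    shootings = sum(n * p_shoot[kind] for kind, n in mix.items())
    rates[group] = shootings / sum(mix.values())

print(rates)  # W roughly 0.162, B roughly 0.061
```

    Within each severity level the treatment is identical, yet group B’s aggregate rate comes out far lower – exactly the kind of confounding the selection criticism worries about.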

    • I don’t usually like to do “me too” comments, but Paul was able to form a coherent comment before I could. This is exactly what I was thinking. (Plus/minus all the other issues mentioned above.) I wonder if there’s enough information in the raw data to get at this question?

    • Paul:

      Right, but Fryer recognizes this possibility; the conclusion is that once you get to the critical point of escalation, where lethal force may be justified, there is no bias.

      “To be clear, the empirical thought experiment here is that a police officer arrives at a scene and decides whether or not to use lethal force. Our estimates suggest that this decision is not correlated with the race of the suspect. This does not, however, rule out the possibility that there are important racial differences in whether or not these police-civilian interactions occur at all.” (PDF page 25)

      The only point I am making is that the definition used to define this category is likely to over-select non-dangerous suspects in the black category relative to the white category, so even if there is a bias, it gets swamped by obvious no-shoot cases among blacks (because tasers were over-applied, or reports over-applied “resisting arrest,” and so on).

  9. If it isn’t confirmation bias, it’s sour grapes. Whine about who got shot, and then whine about who didn’t get shot as much. While the media is throwing raw meat to wild dogs, you’re writing hamburger reviews.

  10. The core issue with this analysis is that the problem is defined away. Every officer-involved shooting incident in Houston will be described in the arrest report as one where lethal force may have been justified. I’d be a lot more interested in an analysis of the rate of routine traffic stops and other common police encounters that escalate to the use of force.

    I’m a resident of Houston, Texas who recently (within the last 3 months) had an encounter with the Houston Police Department. As it happens, I’m a wealthy white male in my mid-50’s. I was pulled over at night for speeding on I-69, on the far north side of Houston (almost to Humble, for those who know the area). I was speeding, like most of the drivers on the freeway at the time. I believe I was pulled over because I was driving a car with out of state license plates (it was a rental car).

    My traffic stop was as pleasant as a traffic stop can be. I didn’t get a ticket, despite having no proof of insurance and no copy of the rental agreement. The problem is that I and many other residents of Houston believe that, had I been a young black man (or woman, it’s been a year since Sandra Bland died), the outcome might have been very different. This study does absolutely nothing to convince me otherwise. I have too many friends who have experienced encounters with the police force differently. I’ve seen too many cell phone videos that directly contradict the police reports.

    • It seems like Fryer’s results would agree with that impression – in otherwise-similar interactions between police and civilians, blacks are more likely than whites to get roughed up short of being shot.

  11. Josh Miller claimed that implicit association tests (IAT) demonstrate bias against blacks. In fact, IAT research consistently shows that white Americans tend to be biased in FAVOR of black Americans. This was shown in Blanton et al. (2015).

    The misinterpretation that IAT is associated with anti-black behaviors stems from the fact that most IAT studies only examine whether IAT scores covary with racially biased behavior, without considering the absolute level of biased behavior in the sample. Blanton et al. show that in almost all published IAT studies the average white participant is biased in favor of blacks in their observed behavior.

    In fact, there are IAT studies claiming to show whites to be anti-black in which every single white study participant demonstrated a pro-black bias in the behavioral measures used. In those cases, the correlation between IAT scores and racially biased behaviors means that those with higher (“more anti-black”) IAT scores tend to be only slightly pro-black behaviorally, while those with low IAT scores tend to be extremely pro-black.

    • Actually, I misspoke above. Blanton et al. did not show that the average white American in IAT studies is behaviorally pro-black. What they showed was that the average white scoring a zero (“non-biased”) on an IAT about racial bias is pro-black in their observed behavior. However, several studies, such as this one, do show a robust pro-black behavioral bias even when IAT scores show an anti-black bias.

      • Interesting. I guess time will tell on this study; it’s pretty new. It doesn’t seem directly relevant though, as the associations and related behavior don’t appear to be about fear and danger, but about higher-level judgments.

        Anyway, I think it is safe to say there are enough other objective metrics to validate the basic point about a bias in the perception of danger, and in fearful responses. This is not a controversial point.

        • I disagree about it not being a controversial point. The discrimination research is probably p-hacked to the gills. Shenanigans like those described here are most likely bog standard in this literature. Lots of the claimed effects may not exist.

          More generally, how do you decide if there’s a bias when people perceive one group to be more dangerous than another when one of the groups is in fact more dangerous? Almost all people killed by cops are men, but as men are much more violent and dangerous than women, how can we know if this means that cops are biased against men? Men are eight times more likely to commit homicide than women. Blacks are also eight times more likely to commit homicide than whites. Clearly the police should, rationally, be more fearful of some groups than others. It’s not easy to say whether they overdo it or not.

        • “Clearly the police should, rationally, be more fearful of some groups than others”

          The base rate here is so low that group membership gives an officer next to no information on the threat posed. It would not be rational to be more fearful of some group than others.

        • I disagree. A third of black males will go to prison in their lifetime, and in the worst neighborhoods the percentage is much higher. Young black males make up about 1% of the population but 27% of all murderers.

        • The relevant base rate is:

          Pr[danger|black man] not Pr[black man|convicted killer]

          Still too low to be informative.
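          A quick Bayes’-rule calculation shows how far apart the two probabilities are. Taking the figures quoted above (the group is about 1% of the population and 27% of murderers) and assuming, purely for illustration, an overall annual homicide-offending rate of roughly 5 per 100,000:

```python
# Bayes'-rule sketch of the base-rate point. The 1% and 27% figures are
# quoted from the thread above; the overall offending rate is an assumption.

p_group = 0.01                 # Pr[group]
p_group_given_danger = 0.27    # Pr[group | killer]
p_danger = 5e-5                # Pr[random person commits homicide in a year]

# Pr[danger | group] = Pr[group | danger] * Pr[danger] / Pr[group]
p_danger_given_group = p_group_given_danger * p_danger / p_group
print(p_danger_given_group)  # about 0.00135, i.e. roughly 0.1%, not 27%
```

          So under these assumptions, even a group that supplies 27% of killers has a per-member annual probability of lethal danger on the order of a tenth of a percent, which is the sense in which group membership carries almost no information in a single encounter.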

  12. Re the plos 1 paper and “reconciling it with the findings of the other paper discussed at the top of this post”, I think the discussion section of the plos 1 paper does the reconciling itself:

    “It is important to reiterate that these risk ratios come only from the sample of individuals who were shot by police and census data on race/ethnicity-specific population information. The USPSD does not have information on encounter rates between police and subjects according to ethnicity. As such, the data cannot speak to the relative risk of being shot by a police officer conditional on being encountered by police, and do not give us a direct window into the psychology of the officers who are pulling the triggers.”

  13. Closely related and also in the press is this study: THE SCIENCE OF JUSTICE RACE, ARRESTS, AND POLICE USE OF FORCE.
    http://policingequity.org/wp-content/uploads/2016/07/CPE_SoJ_Race-Arrests-UoF_2016-07-08-1130.pdf
    They have a tortured, reverse way of reporting some of their key findings, which Professor Gelman might appreciate.
    [N=19,269 arrests total.]

    In this study, they claim to show that “the narrative that crime is the primary driver of racial disparities [in police use of force] is not supported within the context of these departments [the population of their study].” One of their findings is that, in a minority [!!!!!] of departments, the rate of police use of force is higher against blacks than white, when controlling by number of arrests for violent crimes. They report it thus: “Finally, some departments revealed robust disparities across levels of force even when using this most conservative test [controlling by # arrests for violent crimes]. As Figure 3 shows, Black-White gaps in canine use and in OC spray use persisted in 55% of departments after violent arrests were controlled. Likewise, Black-White disparities persisted in weapon use in 40% of departments, in less lethal and Taser and hands and body use in 36% of departments, and in lethal force in 25% of departments.”

    Looking at tables 5 and 11, the average disparity is 30-40% across all departments against whites, and this pattern is described as troublingly “robust” evidence of bias against blacks. And they don’t even *mention* the more common-sense conclusion, which is the opposite: that according to this conservative criterion, there is *negative* bias against blacks on average. It seems like they’re suffering from some kind of cognitive blockage.

    Their caveat doesn’t seem to make much sense either: “Additionally, multiple participating departments still demonstrated racial disparities when force incidents were benchmarked exclusively against Part I violent arrests, such that Black residents were still more likely than Whites to be targeted for force. This method is very likely prone to underestimate racial disparities because African Americans are overrepresented in violent crime arrests but Part I violent crimes constitute only 1/24th of all arrests nationally (BJS, 2012)….” [My emphasis]. That’s overrepresented as compared to their fraction of the population. Isn’t that kind of question-begging with respect to the proper interpretation of “disparity”, since their argument is that criminal behavior of populations does not account for racial disparity in use of force?

  14. The assumption that the result is somehow surprising is simply a media artifact. The prior on police shootings should be that whites are more likely to be killed in police encounters than blacks. One look at aggregate data would tell you that. Looking at the Guardian data, 25% of police shooting victims are black, while 50% are white. It’s hard to judge the reliability of police reports on arrests and interactions, but we have some independent indicators: homicides. 50% of homicide victims are black (around 40% for non Hispanic whites). It’s well known that the homicide rate correlates well with all violent crime. Consequently, it’s a useful proxy for overall violent criminality and hence police interactions. This gives a clear indication that blacks are killed at a disproportionately low rate compared to whites. In general, the prior is in agreement with the Fryer study.

    Another indicator might be that the perpetrators of police killings are around 40% black – again indicating that police are less likely to shoot blacks than whites.

    • Krzys

      No, it is a surprise. Everyone was surprised, including Fryer.

      You claim that your back-of-the-envelope calculation makes the result “not surprising.”

      No, it doesn’t.

      Let’s assume for a half second that the conclusion you draw from your data is true. That’s surprising too! The surprise comes from what we know about bias in other areas.

      Now, no one has made your argument. All these people have the aggregate data right there in front of them, no analysis needed, and they just aren’t seeing it! That should make you worry.

      You claim to have corroborating evidence that whites are more likely to be killed.

      You are wrong.

      1. The overall homicide rate is not a useful proxy for the potential for violent criminality among the black suspects in their sample. The whites are highly selected for being dangerous; the blacks are not.

      2. You are using US-wide population figures, and you are not controlling for the fact that whites are an overwhelming majority of the population, and encounters, so 25% and 50% conditional on being shot by cop, is not comparable to the other measures.

      3. 50% of homicide victims being black is not additionally informative for a police officer making a decision. And by the way, half the victims are the targets of something like 5-10% of the perpetrators.

      • Actually, no, I don’t worry about what academics think is surprising or not. I trade for a living and live off of people not seeing obvious things. That doesn’t obviously make me right, it just makes the argument from authority laughable to me.

        Again, the result is surprising to some due to ideological commitments. Just because there’s bias doesn’t mean it matters in aggregate. So, before anyone assumes anything, it’s incumbent on them to look at basic evidence. Clearly, nobody here does.

        As to your individual points:

        1. In general, it is well known in the literature that homicide rate correlates well with violent crime in general. Your selection argument makes no sense.

        2. Non Hispanic whites are not a majority of encounters at all. Total arrests for all violent crime are roughly the same in absolute terms for both non Hispanic whites and blacks. Blacks are a majority of arrestees for murder and robbery.

        3. Homicide rates are extremely informative for police officers (and others), considering, for example, that the murder rate for young black men is close to 100/100k, compared to 4 in the general population and around 2 for non Hispanic whites.

        • Not surprising to you != not surprising

          Good on ya, you found something obvious that no one has noticed.

          1. The point was about the above selection discussion regarding Fryer’s work. So yes, your proxy doesn’t have much to say.

          2. Arrests for violent crime are not the right category; encounters that begin with non-violent offenses escalate to shootings.

          3. In an ex-ante statistical discrimination sense, I agree, but once you have the controls in the study (lethal force justified), I don’t.

          This is all crude back-of-the-envelope stuff, not a careful, principled analysis; enjoy the forking paths that lead you back to your prior.

        • I didn’t have any prior till I looked at summary statistics. It seems like everybody here has a strong prior without looking at any data.

          1. Fryer data are micro and might not be very representative, but summary macro data are.

          2. Well, if you assume that there is bias there, you would have to include all arrests in the encounter proxy for blacks, but that would mean that the rate would decline even further. In other words, it would imply an even bigger bias against whites in deadly police shootings. Your only hope is to somehow argue that all the encounters are somehow unjustified, but then the homicide rate gets in the way.

          3. Micro data on encounters are hard to gather; overall criminality rates are instructive for overall hazard ratios, especially when disparities are that gigantic.

          Bias has to show up in aggregate data if it matters quantitatively. It’s a very simple analysis, and I don’t think you understand what the forking-paths concept means. This analysis is the opposite of it. Careful micro analysis in a small sample can easily get you a lot of ideologically motivated noise.

    • Funny. The prior to me would be the extremely low rate of white violent crime with black victims in the USA as a whole. Thus, the result that white cops are not going gun crazy on black suspects at a higher rate than other race combinations doesn’t surprise me at all.

      Even if you include police, there just isn’t much white on black violence in modern times in the USA.

  15. Yes. Surprise *is* a factor but in two conflicting directions:

    1) Surprise is evidence that the investigator was not searching for the result. This strengthens the evidence (I mean it reduces the discount for pre-planned results, etc.)

    2) It means his prior was quite different, meaning the Bayesian total evidence is weaker; i.e., the current evidence should be weakened by the strong prior held before. Assuming, of course, there is any meaning to his prior…

  16. I think every reasonable person agrees that to do their jobs properly, cops must hassle blacks, especially young black males, disproportionately compared to whites or members of other races. This is because of differences in crime rates.

    The question is whether the police overdo this extra scrutiny, harassing blacks even more than their overrepresentation among criminals warrants. If that is the case, then the criticism that Fryer’s comparison of shootings in situations where the police have already engaged the individual is biased is reasonable.

    However, I think there may also be a countervailing effect whereby whites (or other non-blacks) may be disproportionately hassled by the police compared to blacks. This could happen because blacks are, on the average, more violent and dangerous, but cops have only unreliable indicators to use to determine if someone is violent and dangerous and must therefore be confronted. When using such unreliable indicators to choose whom to confront, the police will, if they are completely racially unbiased, end up hassling too many whites (and other non-blacks) and/or too few blacks.

    This is analogous to a well-known problem in psychometrics, discussed here. It is known that if two groups differ in their average scores on some underlying dimension, the use of unreliable indicators of individuals’ positions on that dimension to select individuals will lead to more false positives in the group with the lower mean score. This is used to explain, for example, why SAT scores tend to overpredict the college performance of blacks compared to whites: the average black is lower on the underlying dimension (academic aptitude) than the average white, so when unreliable indicators (SAT scores) are used to select individuals above a certain cutoff score, blacks are more likely than whites to make the cutoff just because measurement error (rather than actual ability) put them over the cutoff.

    So, if the police stopped blacks and whites in an unbiased manner (based on some outward indicators of being suspicious), the average black among those stopped would be more dangerous than the average white, given that whites are less dangerous on the average. To remove this discrepancy, the police would have to adopt stricter criteria for dealing with blacks than whites.
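    The false-positive mechanism described above is easy to reproduce in a small simulation (all parameters invented): give each person a latent “danger” score, observe it only through a noisy indicator, select everyone whose indicator clears a cutoff, and compare how many selected people in each group were actually below the true threshold.

```python
# Simulation sketch of the psychometric selection point: selecting on a
# noisy indicator of a latent trait produces a larger false-positive
# share in the group whose latent mean is lower. All parameters invented.
import random

random.seed(0)

def selected_false_positive_rate(latent_mean, n=200_000,
                                 noise_sd=1.0, cutoff=2.0, true_cut=2.0):
    """Among those whose noisy indicator clears `cutoff`, return the share
    whose latent score is actually below `true_cut`."""
    selected = false_positives = 0
    for _ in range(n):
        latent = random.gauss(latent_mean, 1.0)
        indicator = latent + random.gauss(0.0, noise_sd)
        if indicator > cutoff:
            selected += 1
            if latent < true_cut:
                false_positives += 1
    return false_positives / selected

low_mean = selected_false_positive_rate(latent_mean=0.0)
high_mean = selected_false_positive_rate(latent_mean=0.5)
print(low_mean, high_mean)  # the lower-mean group has the larger share
```

    This is the SAT-overprediction phenomenon in miniature: a larger fraction of the lower-mean group’s selectees cleared the bar through noise rather than through the latent trait itself.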

    • D,

      There are few misconceptions in your comment that I would like to address:

      1. “I think every reasonable person agrees that to do their jobs properly, cops must hassle blacks, especially young black males, disproportionately compared to whites or members of other races. This is because of differences in crime rates.”

      – This Newt Gingrich-like rhetorical flourish is typically the beginning of an argument that contains fallacious reasoning, and your comment did not disappoint in this regard. It is not true that every reasonable person agrees with your assertion. There are very strong and logically defensible arguments contending that increased harassment will lead to more crime, not less as your argument implicitly assumes.

      2. “It is known that if two groups differ in their average scores on some underlying dimension, the use of unreliable indicators of individuals’ positions on that dimension to select individuals will lead to more false positives in the group with the lower mean score.”

      – While mathematically this is correct, the argument assumes a supernatural ability to know an unknowable. In other words, it is not possible to know what the actual true scores are for the two groups on a hypothetical dimension.

      3. “… SAT scores tend to overpredict the college performance of blacks compared to whites”

      – The SAT is a poor predictor of college performance. I have never seen a well-controlled, large-sample, appropriately analyzed study support your claim, while the largest-sample study available concludes the opposite of your claim (http://www.nacacnet.org/media-center/PressRoom/2014/Pages/BillHiss.aspx).

      4. “So, if the police stopped blacks and whites in an unbiased manner (based on some outward indicators of being suspicious), the average black among those stopped would be more dangerous than the average white, given that whites are less dangerous on the average. To remove this discrepancy, the police would have to adopt stricter criteria for dealing with blacks than whites.”

      – Given that the assumptions of your argument do not hold, the conclusion does not logically follow.

      • Wait, cops hassling African Americans leads to them killing each other at very high rates? How does the logic of that argument work? And especially, is there any evidence for it whatsoever?

        • That was not the argument I made. I am happy to address critiques of my actual argument, but not of distortions of it.

        • Ok, then, how does harassment lead to higher crime, and especially, would it explain the substantial difference in crime rates?

        • In the same way that repression leads to an increased likelihood of revolution. Emotional responses to abusive behavior will result either in feelings of guilt and assumptions that the abused is in the wrong and deserving of such abuse, or in anger and dismissal of the social mores that would inhibit criminal behavior. Both types of reactions are likely within a given community. When those charged with enforcing societal rules do not abide by those rules themselves, the notion that a stronger response will produce positive results is not tenable short of a full-blown police state.

        • Ok, the internalization of abuse would suggest an increase in self-harm, but the suicide rate in the black community is the lowest of all groups. As for the externalization of abuse into antisocial behavior, this could plausibly explain the general sweep of African-American history in the US, but it runs into problems on shorter time scales. Namely, throughout the 1960s and ’70s policing and incarceration rates were extremely mild (by Western standards), yet those years saw an explosion of crime.

      • Curious,

        When I said that the police should “hassle” blacks more, I didn’t mean that they should resort to any untoward tactics. I simply meant that because blacks commit more crimes, a competent police force will commit a disproportionate amount of its resources to fighting black crime. Most reasonable people will agree with this. If you don’t, it just means that you’re not reasonable.

        “the argument assumes a supernatural ability to know an unknowable. In other words, it is not possible to know what the actual true scores are for the two groups on a hypothetical dimension”

        It’s not unknowable. Looking at, say, data on violent crime — whether arrest or incarceration data, victim survey data, or criminals’ self-reports — there is not the slightest doubt that blacks are disproportionately involved in violent crime. I agree that the difference cannot be very precisely estimated, but there’s no question that it exists in the direction that I indicated.

        “The SAT is a poor predictor of college performance. I have never seen a well controlled, large sample, appropriately analyzed study support your claim. While the largest sample study available concludes the opposite of your claim”

        You haven’t looked very hard then. That study, which isn’t the largest SAT study by any means, doesn’t show that the SAT is a poor predictor. In fact, based on my quick browse through it, it doesn’t even appear to report any correlations between SAT scores and college outcomes. It only compares SAT submitters and non-submitters. This comparison is meaningless because people self-select into those two groups, and the groups differ from each other on many variables. For example, the non-submitters tend to have higher high-school GPAs, meaning that they are probably superior to the submitters on at least some non-cognitive traits related to school performance. In the public university sample the average non-submitter in fact had a higher SAT score than the average submitter (in the private school sample it’s the opposite, but with 59% of SAT data missing, the difference is uninterpretable). All in all, it seems like a poorly done study.

        Here’s an excellent presentation by Paul Sackett on the predictive validity of the SAT using a sample of 250 schools and 1.2 million students. The raw correlation between SAT scores and first-year GPA is 0.35. Introducing various reasonable controls (for range restriction, course difficulty, etc.) increases the correlation to as high as 0.67. These are not small effects.

        “Given that the assumptions of your argument do not hold, the conclusion does not logically follow.”

        Which assumptions do not hold? Do you think that the level of violent crime is not higher in blacks than whites?

        • D,

          1. I will accept your assertion that “hassle” does not mean the police engage in any unprofessional behavior or misconduct. Does this mean we agree that if police do engage in an unprofessional manner more often with blacks who have not committed any crime, or have committed no more than a petty crime, this would be unwarranted? Or are you arguing that it is acceptable as long as the proportion of misconduct is consistent with the relative crime rate between racial groups?

          2. A 0.35 correlation explains a whopping 12% of the variation in first-year GPA, which leaves 88% unexplained by that model. The “corrections” used in this literature conveniently only ever increase the effect sizes, never decrease them. Does that not make anyone else suspicious? Using a logically flawed measure of reliability to inflate an effect size is not convincing to those with critical thinking skills. Why do you think graduation GPA is not typically reported in these types of presentations? Because the correlation disappears, and the presenters are often consultants for ETS. Find me a large-scale study with a diverse group of colleges that does that, and you might have the beginning of an argument. But you are better off leaving SATs out of your argument entirely, as they don’t help it.

          3. The precision of the ‘true score’ value is essential for your argument to be valid. Without a good estimate of this proportion, there is no way to judge the reasonableness of the observed amount of police engagement by racial group.

        • 1. The proportion of police misconduct for all races should of course ideally be zero.

          2. Again, you don’t know this literature. There’s plenty of research on the predictive validity of the SAT for cumulative GPA. The results show that the SAT predicts cumulative GPA better than first-year GPA.

          The point of SAT validity research is to estimate the extent to which schools can increase student performance by selecting students based on SAT scores. It makes sense to correct for a purely mechanical downwardly biasing effect like range restriction as well as for the systematic bias introduced by different course choices. There’s nothing suspicious about the fact that such corrections increase the effect size if you understand the statistics involved. Sackett et al. don’t correct for the unreliability of the SAT or the GPA at all, so those “critical thinking skills” of yours may need some honing. Correcting for unreliability would of course make the correlations even higher.

          Psychometrically, the SAT is a run-of-the-mill IQ test. There’s a 100-year-old research tradition on the use of IQ to predict academic achievement. The results are very consistent and similar to the SAT results, so there’s nothing surprising about the predictive validity of the SAT.

          3. Estimating that “true score” is not more difficult than estimating whether the police stop and engage blacks too much compared to whites. Moreover, when trying to derive an estimate of whether cops pay too much attention to blacks you must factor in the influence of racial differences in the underlying dispositions to crime. I’m not saying that estimating either of these quantities is easy. The point is simply that you shouldn’t ignore either one.

        • 1. I did not watch the video, but Paul Sackett is well known for supporting statistical corrections of conceptually flawed and poorly measured constructs to improve the strength of the estimate despite the logical inconsistency of doing so. He certainly provides an argument, but not a convincing one.

          Why would a restricted range matter for a correlation between two continuous phenomena where small variations in one are argued to be strongly correlated with the other?

          Possible answers:

          a) Because the constructs are not actually continuous.
          b) Because the relationship is not linear (which would be perplexing).
          c) Because the constructs are poorly conceptualized.
          d) Because the constructs are poorly measured.
          e) Because ETS can maintain their monopoly if the relationship appears stronger.

          If any of these are true, why should we give any credence to an argument that assumes a strong level of understanding of the phenomena being assessed?

          2. The .36 weighted average is still unimpressive and includes only studies ETS chose to include in their marketing literature. I would hope most know the difference. I have personally collected quite a bit of data on SAT, ACT, and GPA that was not supported by ETS or any of their consultants. Publication bias is a real thing. SAT was never more than very weakly correlated with cumulative GPA.

          2. Justifying a conceptually flawed and poorly measured construct by comparing it to another conceptually flawed and poorly measured construct (g) is also not a convincing argument. The notion that ‘g’ is anything more than a statistical artifact of exploratory factor analysis was simply an exercise in post hoc explanation of an ‘interesting’ finding.

          3. Your argument about an estimate of a true score is exactly my point. The estimate is coming from the same data that you have described as measured by unreliable indicators, data that is restricted in range (which seems to be important), on a construct that is conceptually diffuse.

        • It’s logically inconsistent to correct for range restriction? Meaning, the fact that Caltech students have SAT scores around 1500, while kids at community colleges struggle to break 1000, is a random accident? Now I understand why you believe all those concepts are “poorly constructed”.

        • Your argumentation regarding the SAT consists entirely of misrepresentations of the available literature, conspiracy theories and bald-faced lies. When I refute your arguments, you simply ignore what I said and come up with a new batch of nonsensical claims. You seem to have an emotional need to minimize the importance of cognitive ability, so you’re throwing all kinds of things at it, hoping that something sticks. I could easily refute everything you claimed in your latest comment, given that basically nothing has been so thoroughly researched in social science as this general question, but what would be the point of that? You would just ignore my arguments and evidence and spout more nonsense that completely disregards the published literature.

          To quickly validate Sackett’s results, I got data from the NLSY97, a longitudinal study of a representative sample of Americans born between 1980 and 1984 funded by the Department of Labor, and correlated SAT scores with cumulative college GPA for those who had attended four-year colleges:

          Correlation with college GPA in the NLSY97:

          SAT math 0.32 (n=781)
          SAT verbal 0.29 (n=783)
          SAT total 0.34 (n=777)
          AFQT 0.35 (n=1278)

          These results are very quick and dirty and have problems with missing data, range restriction, etc., but the basic results nicely confirm what Sackett and others have reported. AFQT is another IQ test and, as can be seen, its association with GPA is almost identical to that of SAT total. But of course you will argue that the Department of Labor is in cahoots with the ETS and these results are fake… There’s no point for me to continue this discussion.

        • I am making the same argument about this research as is repeatedly made on this blog about the impact of publication bias… the garden of forking paths… and crude measures being treated as if they are precise. I am not arguing that there are no individual differences in speed of learning. What I am arguing is that it explains a fraction of the variation in outcomes, and that pretending it explains a larger amount than we can ever hope to measure is irresponsible.

        • D:

          Reiterating a point that Curious made earlier: The correlations with cumulative college GPA that you give for various items range from 0.29 to 0.35. These correspond to explaining only about 9% to 12% of the variation in cumulative GPA. This is pretty small. So these items might be predictors, but they are very poor predictors — not strong enough to draw strong conclusions from.
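          [Editor’s note: the r → r² arithmetic behind those percentages, applied to the correlations quoted upthread, can be checked in a couple of lines:]

```python
# Squaring each quoted correlation gives the share of GPA variance
# "accounted for" by that predictor (r^2).
for r in (0.29, 0.32, 0.34, 0.35):
    print(f"r = {r:.2f}  ->  r^2 = {r ** 2:.3f}")
# 0.29^2 ≈ 0.084 and 0.35^2 ≈ 0.123, i.e. roughly 8-12% of the variance.
```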

        • Martha,

          Firstly, I don’t think “variance accounted for” is an informative metric. Predictors that explain a relatively small amount of variance can be very useful when selection is stringent. Sackett et al. put it this way:

          As long ago as 1928, Hull criticized the small percentage of variance accounted for by commonly used tests. In response, a number of scholars developed alternate metrics designed to be more readily interpretable than “percentage of variance accounted for” (Lawshe, Bolda, & Auclair, 1958; Taylor & Russell, 1939). Lawshe et al. (1958) tabled the percentage of test takers in each test score quintile (e.g., top 20%, next 20%, etc.) who met a set standard of success (e.g., being an above-average performer on the job or in school). A test correlating .30 with performance can be expected to result in 67% of those in the top test quintile being above-average performers (i.e., 2 to 1 odds of success) and 33% of those in the bottom quintile being above-average performers (i.e., 1 to 2 odds of success). Converting correlations to differences in odds of success results both in a readily interpretable metric and in a positive picture of the value of a test that “only” accounts for 9% of the variance in performance.

          Secondly, the uncorrected correlation is a misleading indicator of the actual underlying relationship. For example, it treats a GPA of 3.5 in home economics from Podunk College and a GPA of 3.5 in theoretical physics from Caltech as indicating the same level of achievement. Check out the video presentation by Sackett I linked to in a previous comment to see how correcting for such artifacts increases the correlation.
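          [Editor’s note: the 2-to-1 / 1-to-2 quintile odds in the quoted passage are easy to check with a quick Monte Carlo sketch — my own illustration, not from Sackett et al.: draw bivariate normal (test score, performance) pairs with r = .30 and count how many top- and bottom-quintile scorers land above the median performer.]

```python
import math
import random

# Check of the Lawshe-style quintile odds for a test correlating r = .30
# with performance: expect ~67% of the top test-score quintile and ~33%
# of the bottom quintile to be above-average performers.
random.seed(1)
n, r = 200_000, 0.30
k = math.sqrt(1 - r ** 2)

pairs = []
for _ in range(n):
    x = random.gauss(0.0, 1.0)              # test score
    y = r * x + k * random.gauss(0.0, 1.0)  # performance; corr(x, y) = r
    pairs.append((x, y))

pairs.sort()                                # sort by test score
q = n // 5
bottom, top = pairs[:q], pairs[-q:]

median_y = sorted(y for _, y in pairs)[n // 2]   # "average performer" cutoff
frac_top = sum(1 for _, y in top if y > median_y) / q
frac_bottom = sum(1 for _, y in bottom if y > median_y) / q
print(f"top quintile above average:    {frac_top:.2f}")     # ~0.67
print(f"bottom quintile above average: {frac_bottom:.2f}")  # ~0.33
```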

        • D,

          1. The outcome measures from the vast majority of the studies on the topic suffer from the exact same problems as the predictors you are promoting. The criterion problem in organizational research has never been solved, instead it is ignored and the chosen path is business as usual.

          2. Simply because a small correlation can provide utility in a given instance, such as when an analytic trader bets on a small modeling effect because it’s stable and she knows it will eventually provide an opportunity. This same logic does not hold for crudely measured predictors (which are no longer considered constructs, but rather simply test results because the construct notion could not be logically defended for tests such as Intelligence, SAT, etc.) correlated with crudely measured outcomes that capitalize on correlations between unmeasured and uncontrolled variables which are only very indirectly related to the ultimate outcome of interest in an organization, survival and profit.

          3. The argument about range restriction does not answer the question of why that would be necessary for precisely measured variables. And if it is, then there is a simple solution that does not include corrections that can only increase the effect size and never decrease it. Collect the missing range of the constructs on both sides of the equation. It is a simple solution. Not an easy one to accomplish, but a simple one to understand.

        • Edit:

          Simply because a small correlation can provide utility in a given instance, such as when an analytic trader bets on a small modeling effect because it’s stable and she knows it will eventually provide an opportunity, does not mean it holds across the board.

  17. What I find disturbing is that while everyone is discussing odds ratios, no one seems to know how to interpret them. Fryer apparently thinks that the exponentiated intercept of a logistic regression is equal to the base probability, and that an increase in the odds ratio translates into a multiplicative increase in the base probability. In table 2A they report odds of police use of force of 0.153 for whites and 0.153*1.53 for blacks. Then they write: “Blacks are 53% more likely to experience any use of force relative to a white mean of 15.3 percent.” Unfortunately, that is incorrect. The base probability for whites is 0.153/(1+0.153)=0.13, and it increases to 0.19 for blacks. That’s a 6 percentage point increase, or a relative increase of 100*(0.19/0.13−1)≈46% (about 43% before rounding). With the correct interpretation, the gap between white and black becomes smaller … and less surprising, I guess.
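    [Editor’s note: the odds-to-probability conversion in this comment can be sketched in a few lines. The 0.153 and 1.53 figures are the ones the comment quotes from Fryer’s Table 2A; everything else is just the identity p = odds/(1+odds).]

```python
def odds_to_prob(odds):
    # Invert odds = p / (1 - p)  =>  p = odds / (1 + odds)
    return odds / (1 + odds)

white_odds = 0.153          # the reported "white mean" is an odds, not a probability
odds_ratio = 1.53           # black vs. white odds ratio for any use of force
black_odds = white_odds * odds_ratio

p_white = odds_to_prob(white_odds)   # ≈ 0.133
p_black = odds_to_prob(black_odds)   # ≈ 0.190
relative_increase = 100 * (p_black / p_white - 1)
print(f"{p_white:.3f} -> {p_black:.3f}  (+{relative_increase:.0f}%)")  # ≈ +43%
```

(The comment’s 46% figure comes from first rounding the probabilities to 0.13 and 0.19; without rounding, the relative increase is about 43%.)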

  18. There have been two killings by law enforcement in my neighborhood in this decade, both of white males.

    One was a national story while it was happening, but then was forgotten because it was just a typical suicide-by-cop: a 40-something white homeless guy pulled out a gun on a busy street and shot into the air and the pavement, apparently careful to not hit anybody, until the LAPD came and killed him.

    The other — a federal agent killed an 18-year-old violist — got virtually no media coverage for three years until it made the front page of the Los Angeles Times when a judge awarded the dead youth’s family $3 million. Reading the brief squib in the LA Times the day after the shooting had set off my Cop Cover-Up sensors — law enforcement was claiming that the parking lot where the shooting occurred was a known drug market, which it definitely is not. (I’ve walked through it 500 or more times.) My wife and I went down to the site to see if the official story could possibly be plausible and we ran into the dead youth’s mom. We told her the story in the paper looked dubious and she should sue. She did and she won.

  19. Ross (2015) and Fryer (2016) do not come to different conclusions as you presume.

    Ross is about race-of-shot suspects whereas Fryer is about race-of-shooting officers.

  20. This author makes some interesting points, but the problem is that this is not science. If we start with an assumption of how things should be, then try to make the narrative or calculations fit our preconceived notion, that is not science. There may indeed be bias in the reporting. However, perhaps there is bias in the other direction as well. Perhaps White officers are harsher in reporting White suspects as dangerous than Black suspects, due to higher expectations of the first and lower expectations of the second. Being harder on “your own people” is not an unheard-of or uncommon phenomenon. The challenge is to find ways to obtain more complete, accurate, relatively neutral reporting and data that is unbiased in any direction or form.

    Just because the author “feels” it should be one way over another does not make it the case. And trying to manipulate the data to fit our preconceived notions of truth is not science, but politics.
