
OK, sometimes the concept of “false positive” makes sense.


Paul Alper writes:

I know by searching your blog that you hold the position, “I’m negative on the expression ‘false positives.'”

Nevertheless, I came across this. In the medical/police/judicial world, false positive is a very serious issue:


Cost of a typical roadside drug test kit used by police departments. Namely, is that white powder you’re packing baking soda or blow? Well, it turns out that these cheap drug tests have some pretty significant problems with false positives. One study found 33 percent of cocaine field tests in Las Vegas between 2010 and 2013 were false positives. According to Florida Department of Law Enforcement data, 21 percent of substances identified by the police as methamphetamine were not methamphetamine. [ProPublica]

The ProPublica article is lengthy:

Tens of thousands of people every year are sent to jail based on the results of a $2 roadside drug test. Widespread evidence shows that these tests routinely produce false positives. Why are police departments and prosecutors still using them? . . .

The Harris County district attorney’s office is responsible for half of all exonerations by conviction-integrity units nationwide in the past three years — not because law enforcement is different there but because the Houston lab committed to testing evidence after defendants had already pleaded guilty, a position that is increasingly unpopular in forensic science. . . .

The Texas Criminal Court of Appeals overturned Albritton’s conviction in late June, but before her record can be cleared, that reversal must be finalized by the trial court in Houston. Felony records are digitally disseminated far and wide, and can haunt the wrongly convicted for years after they are exonerated. Until the court makes its final move, Amy Albritton — for the purposes of employment, for the purposes of housing, for the purposes of her own peace of mind — remains a felon, one among unknown tens of thousands of Americans whose lives have been torn apart by a very flawed test.

Yes, I agree. There are cases where “false positive” and “false negative” make sense. Just not in general for scientific hypotheses. I think the statistical framework of hypothesis testing (Bayesian or otherwise) is generally a mistake. But in settings in which individuals are in one of some number of discrete states, it can make a lot of sense to think about false positives and negatives.
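To make the arithmetic concrete, here is a quick sketch of why a test with a modest false positive rate can still produce a large share of false positives among the positive results. The numbers are invented for illustration; they are not the rates from the ProPublica piece or the Las Vegas data quoted above.

```python
# Illustrative numbers only -- not the rates from the studies quoted above.

def false_discovery_rate(base_rate, sensitivity, fp_rate):
    """Fraction of positive test results that are false positives."""
    true_pos = base_rate * sensitivity        # P(drugs) * P(test+ | drugs)
    false_pos = (1 - base_rate) * fp_rate     # P(no drugs) * P(test+ | no drugs)
    return false_pos / (true_pos + false_pos)

# If 20% of tested samples really contain drugs, a test that flags
# 10% of innocent samples yields roughly 30% false positives among
# all positives:
print(round(false_discovery_rate(0.20, 0.95, 0.10), 2))
```

The point is that the fraction of wrong positives depends on the base rate of actual drugs among tested samples, not just on the test itself.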

The funny thing is, someone once told me that he had success teaching the concepts of type 1 and 2 errors by framing the problem in terms of criminal defendants. My reaction was that he was leading the students exactly in the wrong direction!

I haven’t commented on the politics of the above story, but of course I agree that it’s horrible. Imagine being sent to prison based on some crappy low-quality lab test. There’s a real moral hazard here: the people who perform these tests and promote them based on bad data aren’t at risk of going to prison themselves, even though they’re putting others in jeopardy.


  1. Rodney Sparapani says:

    Hi Andrew:

    Interesting. But, why are people driving around with baking soda pleading guilty to a felony?
    It seems to me there is more going on here than just flawed testing.


    • Matt Z says:

      There is a big literature on this. Most often it is part of a plea bargain to avoid the threat of a longer sentence. In younger people, it might be as simple as the interrogator promising they will get to go home to their families after hours and hours of questioning. In the very young and mentally impaired, it can be easy for interrogators to plant false memories in the same kind of lengthy interrogation. There are plenty of articles about it out on the webs.

      • Kaiser says:

        Yes, Matt is right. I did some research into this subject for my first book. False confession is a real thing! There are many causes; the most egregious is that the courts allow the police to tell lies to elicit confessions. Also, psychologically, the innocent person thinks: surely, since I didn’t do it, there will be lots of other evidence to contradict the confession. Unfortunately, some of the research showed that jurors place disproportionate weight on confession evidence, at the expense of other evidence.

    • gdanning says:

      Interestingly, in the case highlighted in the ProPublica article, the defendant pleaded guilty to a misdemeanor and was sentenced to 45 days in jail. And the article does not seem to say that anyone is actually convicted at trial on the basis of the field test alone (nor would they be: a more sophisticated test would doubtless be performed were the case to go to trial). The field tests seem to be used only as the basis for an arrest, not a conviction.

      The problem seems to lie not with the test, but with people being pressured to plead guilty (From the article: “A majority of those defendants, 58 percent, pleaded guilty at the first opportunity, during their arraignment; the median time between arrest and plea was four days.”).

      So, this seems to be an instance of the reporter not understanding the problem he or she is reporting on, or being so devoted to a pre-assumed narrative that he or she is blind to the real story. Unfortunately, that seems all too common in the journalism profession.

      • DADS says:

        > The problem seems to lie not with the test, but with people being pressured to plead guilty

        You’re being disingenuous. They are being pressured to plead guilty after being falsely detained.

        As a reader of a statistics blog that often posts about utility functions, you of all people should realize that the choice is between “plead guilty and get a reduced sentence” vs. “plead not guilty and face the maximum penalty,” and many, many people would prefer the former given their lack of faith in due process after just having been falsely detained.

        • gdanning says:

          But, that is my entire point – being given the choice between “plead guilty and get a reduced sentence” vs. “plead not guilty and face the maximum penalty” is exactly the pressure I am talking about. That choice simply should not be offered to a defendant at such an early stage in the proceeding – DAs should not be permitted to offer plea deals at that stage, and judges should not accept guilty pleas based on plea bargaining at that stage.

          PS: BTW, innocent people are arrested all the time; that is unavoidable. A person can be arrested if there is probable cause to think he or she is guilty: a police officer has probable cause for an arrest when he has “knowledge or reasonably trustworthy information of facts and circumstances that are sufficient to warrant a person of reasonable caution in the belief that the person to be arrested has committed or is committing a crime.” Weyant v. Okst, 101 F.3d 845, 852 (2d Cir. 1996). A drug test that is 79% accurate almost certainly meets that standard. Hence, a police officer who arrests someone based on that test has done nothing wrong. It is the DA who pushes for a guilty plea based on that test who has acted wrongfully.

    • Kenneth Carlson says:

      I had the same question. You might classify this story as a classic example of burying the lede. (1) There’s a significant chance that your friendly neighborhood coke dealer is selling you counterfeit drugs, and (2) consumers apparently can’t tell.

  2. elin says:

    I think that there are lots of situations like this, and it’s also a good idea for undergraduates to learn to think about the variety of ways in which they may make decisions that are wrong or leap to the wrong conclusions about individuals, situations, policies. The drug test example is way more interesting as a way to introduce Bayes Theorem than some drawers with or without jewelry in them or even the Monty Hall problem etc. Many students understand the issues around drug tests well since they have had those kinds of jobs. Of course I also want them to eventually question where the specificity and sensitivity values come from too. Basically I think it is good critical thinking for people to consider a number of ways they can go wrong in drawing conclusions. That’s why I don’t have a problem with introducing the idea of types of error in a conceptual way.

    I think the important thing, though, is that the distribution of individuals on dichotomous variables is a different thing than the distribution of sample statistics from samples containing large numbers of individuals.
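    The base-rate point elin raises can be shown in a few lines of code via the classic Bayes' theorem drug-test exercise. The sensitivity and specificity values below are invented for illustration, not taken from any real field test.

```python
# Classic Bayes' theorem drug-test illustration; the sensitivity and
# specificity values below are invented, not measured.

def ppv(prevalence, sensitivity, specificity):
    """P(sample actually contains drugs | test positive), by Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# The same test gives very different answers at different base rates:
for prev in (0.01, 0.10, 0.50):
    print(f"prevalence {prev:.0%}: P(drugs | positive) = {ppv(prev, 0.95, 0.90):.2f}")
```

    At low prevalence, most positives are false even for a fairly accurate test, which is exactly the intuition the drug-test example builds before students see the formal theorem.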

  3. Jonathan (another one) says:

    That’s one cool cat, standing in that quantity of baking powder. Sort of an El Chapo of gatos.

  4. Anoneuoid says:

    Even in this case, what you really want to know is the amount/percentage of a certain substance that is present, not whether it is present at all. The only reason the false positive concept seems to make some sense is because the tests are so inaccurate and the people using them are so poorly trained. Yes, if the situation is a lost cause anyway, who cares how weak your method of analysis is… the results will always be inconclusive.

    Once you get into diagnostic tests designed and performed by properly trained people, they will start doing stuff like measuring isotope ratios to distinguish between sources and/or trace the contamination back to a source. Imagine if law enforcement didn’t fall into the hypothesis testing trap and instead did that; how much easier would it be to see who is getting what from whom, etc.?

    Thinking more about the above, I’d even go so far as to say that reliance on the hypothesis testing paradigm is a major contributor to the epic failure of the US “war on drugs” we have seen over the last few decades (not that I am particularly fond of this war).

  5. Tom says:

    This is a case where signal detection theory is a more useful framework than statistical hypothesis testing, because SDT is entirely about individual cases and how the costs and benefits of different outcomes affect — and should affect — decision making.

  6. Jonathan says:

    I’ll never like the phrase “false positive”. Tests sometimes give wrong answers, so the correct statement is that the test was wrong. To me, this has two important implications: 1) saying “false positive” carries the idea that the test was conducted, that it was conducted in a reasonable manner, and that the result is reasonable but just happens to be wrong; and 2) none of that can be assumed, because a) tests are conducted poorly, which affects the reliability of the answers, and b) test results may be affected by a number of other factors.

    Imagine this: a guy is brought in for public drunkenness and the test is presented to the court that the defendant failed to walk a straight line, that the defendant was unable to respond to simple commands requiring identification of objects and was unable to reach out and grasp an object without assistance. That’s your false positive: the test and its subparts were administered properly and the defendant did in fact fail at every one of them … except the defendant is blind. Take the false positive route and the defendant has to affirmatively prove, “But I’m blind. Really. Blind enough that I can’t see.” It isn’t a false positive when someone with multiple sclerosis is treated as drunk by the police – which happens. The point isn’t all that subtle, especially in a court: the burden of proof is a) the test must be disproved, which is the “false positive” idea, or b) the validity of the test in this circumstance must be proved. And in statistical terms, you can think of that as a sign question: you’re examining an effect and a measure of that effect, so the direction you approach that from has meaning. In a false positive, you’re approaching from the idea of validity, which I suppose is why we drowned women because witches float.

    • Wayne Folta says:

      False Positive is not the same as Wrong, since Wrong includes False Positives and False Negatives, and when it comes to decisions the distinction matters a lot.

      At the same time, we need to not use False Positive as a flippant way to write off the impact of poor tests. It’s like “statistically significant” which may or may not indicate practical significance.

  7. Eric says:

    > The funny thing is, someone once told me that he had success teaching the concepts of type 1 and 2 errors by framing the problem in terms of criminal defendants. My reaction was that he was leading the students exactly in the wrong direction!

    What do you mean? Guilty/not guilty is a discrete state, no?

    • Martha (Smith) says:

      I am guessing that Andrew’s phrase “leading the students exactly in the wrong direction” really refers to the whole idea of hypothesis testing.

      But I agree with Eric that the guilty/not-guilty analogy is a good way to explain Type I and Type II errors, if one is going to teach hypothesis testing, which is needed because it is used so much; people need to understand the problems with it, and Type I/Type II errors do that partially (which is not to deprecate Type M and Type S errors; these also need to be taught).

      And, as Elin points out, drug testing is a very good way to introduce students to Bayes’ theorem, then on to Bayesian analysis. In my opinion/experience, drug testing works much better for this purpose than a betting approach, which loses a lot of students (although it fascinates a few).

    • Elin says:

      If anything the relationship to hypothesis testing might be more about the reasonable doubt standard versus the preponderance of evidence standard.

  8. Paul Alper says:

    The link referred to in your sentences:

    “Nevertheless, I came across this. In the medical/police/judicial world, false positive is a very serious issue:”

    comes up as

    “This site can’t be reached

    cost%20of%20a%20typical%20roadside%20drug%20test%20kit%20used%20by%20police%20departmentshttp’s server DNS address could not be found.”

  9. Chris G says:

    > There are cases where “false positive” and “false negative” make sense.

    Add “fire control decisions” to the list with “roadside drug testing”.

  10. Tom Dietterich says:

    I think the concepts of false positive and false negative make perfect sense when a person is making an irrevocable binary decision. Example: “Should this self-driving car apply the brakes?” “Should the air bag be deployed?” “Is this person guilty of driving under the influence of intoxicants?” “Should this person have a lumpectomy?”

    My sense is that Andrew’s objection to talking about hypothesis testing in this way is that our goal is very rarely to make an irrevocable binary decision. Neyman and Pearson were very clever to try to formalize statistical testing as a binary choice, but statistical analysis is rarely about binary choices. We seek to understand the underlying processes, the relevant factors, the sources of noise and missingness, the strength of the evidence, the weaknesses of our hypotheses, the additional data that should be gathered, and so on. Most (All?) of the problems with p-values result from the attempt to use a statistic designed for a binary decision to serve all of these other purposes.
