Is it fair to use Bayesian reasoning to convict someone of a crime?

Ethan Bolker sends along this news article from the Boston Globe:

If it doesn’t acquit, it must fit

Judges and juries are only human, and as such, their brains tend to see patterns, even if the evidence isn’t all there. In a new study, researchers first presented people with pieces of evidence (a confession, an eyewitness identification, an alibi, a motive) in separate contexts. Then, similar pieces of evidence were presented together in the context of a single criminal case. Although judgments of the probative value of each piece of evidence were uncorrelated when considered separately, their probative value became significantly correlated when considered together. In other words, perceiving one piece of evidence as confirming guilt caused other pieces of evidence to become more confirming of guilt too. For example, among people who ended up reaching a guilty verdict, the same kind of confession was considered more voluntary when considered alongside other evidence than when it had been considered in isolation.

Greenspan, R. & Scurich, N. The Interdependence of Perceived Confession Voluntariness and Case Evidence. Law and Human Behavior (forthcoming).

Bolker writes:

The tone suggests that this observation—“perceiving one piece of evidence as confirming guilt caused other pieces of evidence to become more confirming of guilt too”—reflects an inability to weigh evidence, but to me it makes Bayesian sense: each piece influences the priors for the others.

I agree. It seems like a judicial example of a familiar tension from statistical analysis: when do we want to simply be summarizing the data at hand, and when do we want to “collapse the wave function,” as it were, and perform inference for underlying parameters.

36 thoughts on “Is it fair to use Bayesian reasoning to convict someone of a crime?”

  1. This is interesting. It seems that the natural tendency, when weighing evidence, would be to use priors; after all, our lives consist of recognizing patterns and making inferences. But the untrained mind rarely performs anything as rigorous as Bayesian inference; it (the mind) probably tends to fit new evidence into the prior, rather than measure its actual divergence from the prior. Thus, the use of priors in the courtroom needs a strong counterbalance.

    • Diana:

      With a formal model for the pieces of evidence (as observations), the individual weights are set out by the prior, the data-generating model (aka likelihood), and one’s purpose. Things that are represented in the model as being common get individual non-zero weights, and these add up, giving a higher weight together. If there are dependencies in the model between the pieces of evidence, then the individual weights themselves will go up or down.

      So this is a long-winded way to say that this can be formalized in a way that supports the subjects having hit upon an insightful (or less wrong) way to weigh evidence.

      Gory technical details might be discernible here (given a good grasp of likelihood theory) http://statmodeling.stat.columbia.edu/wp-content/uploads/2011/05/plot13.pdf
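
      As a minimal sketch of the "weights add up" point, assume (hypothetically) that the pieces of evidence are conditionally independent given guilt or innocence. Each piece then contributes its log likelihood ratio to the log odds of guilt. The function name, the prior, and the likelihood ratios below are made-up illustrations, not taken from the linked paper, and dependencies between pieces would change the individual weights.

```python
import math

def combined_posterior(prior, likelihood_ratios):
    """Posterior probability of guilt after adding each piece's log-LR weight.

    Assumes the pieces of evidence are conditionally independent.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        log_odds += math.log(lr)  # each piece adds its own weight
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical numbers: a 1% prior and three moderately strong pieces.
prior = 0.01
pieces = {"motive": 5.0, "eyewitness": 10.0, "confession": 20.0}

for name, lr in pieces.items():
    print(name, round(combined_posterior(prior, [lr]), 3))
print("all together", round(combined_posterior(prior, list(pieces.values())), 3))
```

      With these assumed inputs no single piece moves the posterior very far, but together they bring it to roughly 0.91.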

      • Thank you. I will read your paper in full (and I mean it; it’s readable and interesting). I appreciate your point that the approach can be formalized. My point was that this would take substantial training and education. Is it possible to prepare jurors in this manner, or is it more realistic to introduce some counterbalances to natural impulses and tendencies?

        • Another thought: This could become an important part of civics education (from high school onward). I don’t mean to suggest it’s impossible; it would just take time and concerted instruction.

  2. An instance of the legal system not trusting jurors to be Bayesian is the ban on propensity evidence, which is evidence that the defendant has committed similar crimes in the past.

    • Is it really an example of not trusting jurors to be Bayesian, or of not trusting them to be proficient enough at Bayesian analysis to avoid overweighing the “propensity” prior? [“The inquiry is not rejected because character is irrelevant; on the contrary, it is said to weigh too much with the jury and to so overpersuade them as to prejudge one with a bad general record and deny him a fair opportunity to defend against a particular charge.” Michelson v. United States, 335 U.S. 469, 475-476 (1948).] Plus, there is the concern about admitting propensity evidence that “the risk that a jury will convict for crimes other than those charged—or that, uncertain of guilt, it will convict anyway because a bad person deserves punishment—creates a prejudicial effect that outweighs ordinary relevance.” United States v. Moccia, 681 F.2d 61, 63 (CA1 1982).

        • Well-founded fear, one presumes. I’m hardly an expert, but isn’t weighing priors a challenging enterprise? It seems to me that asking jurors to weigh “propensity” evidence would lead to an increase in false convictions, especially given the prejudicial effect of evidence of a criminal record, and the rather limited probative value thereof. (After all, if Joe is on trial for burglary, his prior burglary arrests certainly make it more likely that he is the burglar than that I am, but that isn’t really the question. The question is, “does the evidence show beyond a reasonable doubt that Joe is the perpetrator,” not, “does the evidence show that Joe is the most likely perpetrator.” Propensity evidence is much more relevant to the latter question than to the former.)

        • I’m also not an expert, and I agree the fear is likely well founded. I was just bringing it up because it’s an explicit example of the law thinking it’s not ‘fair’ to use Bayesian reasoning to convict someone of a crime.

          I do disagree with your position that propensity evidence wouldn’t be relevant to determining guilt beyond a reasonable doubt, though. ‘Beyond a reasonable doubt’ just means posterior probability of guilt > P for some predetermined P near 1. In the absence of propensity evidence, the assumed-innocent prior on guilt is 1/N where N is the number of people that could possibly have committed the crime. If most instances of a certain type of crime are committed by past offenders, there are M past offenders in the area, and M<<N, then it would be appropriate to update the 1/N prior upward, and this can have a pretty big impact on the posterior even near 1. Certainly enough to offset doubts sowed by unlikely scenarios put forth by the defense (e.g. it was an elaborate frame up by the police).
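
          The size of that effect can be sketched with Bayes' rule in odds form. Every number here is hypothetical: `N`, `M`, the share `f` of such crimes committed by past offenders, and the combined likelihood ratio `LR` of the remaining evidence are illustrative assumptions, and the helper name is made up for this example.

```python
def posterior_from_prior(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical inputs, chosen only for illustration.
N = 100_000      # people who could possibly have committed the crime
M = 1_000        # past offenders among them (M << N)
f = 0.5          # assumed share of such crimes committed by past offenders
LR = 1_000_000   # assumed likelihood ratio of all the other evidence

flat_prior = 1.0 / N        # "assumed-innocent" prior
propensity_prior = f / M    # prior for one particular past offender

posterior_flat = posterior_from_prior(flat_prior, LR)
posterior_propensity = posterior_from_prior(propensity_prior, LR)

print(f"flat prior:       {posterior_flat:.4f}")
print(f"propensity prior: {posterior_propensity:.4f}")
```

          Under these assumptions the same evidence yields a posterior of about 0.91 with the flat prior and about 0.998 with the propensity prior, which is the kind of shift near 1 described above.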

        • Oh, I didn’t mean to say that propensity evidence is completely irrelevant, just that it is only very marginally relevant, esp. to the issue of “is the defendant guilty,” as opposed to “is the defendant more likely to be guilty than some unnamed third party.” Also, I see a risk of juries conflating those issues, akin to the well-known problem with lineups, in which witnesses sometimes pick the person who looks most like the perpetrator (and hence, proper procedure calls for the police to explicitly tell the witness that the perpetrator might not be in the lineup at all).

          Also, there is the prejudice problem, including the risk that knowledge that the defendant is a “bad guy” might infect jurors’ consideration of the weight and credibility of other evidence. I’m not sure that Bayesian analysis has a solution to that problem. It seems to me that the risk inherent in propensity evidence outweighs its value.

          Re crazy defense theories, I have done some criminal defense work for 25 yrs and never encountered that; but, then, most of the cases I am asked to work on have dealt with issues other than identity, such as intent, degree of crime, etc., so I can’t say that my experience is necessarily representative. I do know that there are limits on the ability of defendants to present defenses: a defense can’t just be based on pure guesswork. E.g., in Calif., the defense of third-party culpability is not available in the absence of fairly concrete evidence that such a third party exists.

        • The importance of distinguishing between “is the defendant guilty,” and “is the defendant more likely to be guilty than some unnamed third party” is a good point.

      • Exactly. There’s also a selection issue — those with a criminal background are more likely to be charged with a crime even when innocent (the “round up the usual suspects” phenomenon).

    • A more exact phrasing: the US legal system in general weighs the probative value of prior “trials” relative to the current trial, where prior “trials” is multidimensional (arrests, statements, convictions, character references, uncorroborated statements pro or anti, etc.). These two sets are related informally by a judge as to their tendency to exculpate or convict. The entire trial is thus a weighing that our system reserves to the judge, because the judge is responsible for determining which facts and possible facts bear on the particular legal standard. It isn’t mistrust of juries but rather a division of labor in which the judge determines in general which priors to apply. The judge is essentially the designer of the trial, and the jury then does normal Bayesian thinking.

  3. I think there are two critical mental activities that influence these kinds of judgments. First is the strong tendency people have to manufacture reasons for everything, whether well-founded or not. Second, probably not unrelated, is the desire to see events as fitting into a clear story arc.

    You could say that the second one especially is a manifestation of pattern recognition, but it is more: the desire or need to *find* a pattern.

    Good lawyers know this very well, and try hard to provide a compelling story arc. Jurors will find it hard to evaluate evidence apart from some story arc. This arc might be one that develops in the jury room, or it might be one that was provided by the legal team of one side or the other.

  4. There’s a pretty big literature on this effect & how it *subverts* Bayesian reasoning by causing laypeople & professionals (e.g., Drs) to let the “likelihood ratio” assigned to early considered pieces of information spill over to later considered ones. It’s a cognitive bias, pure & simple.

  5. I think this type of reasoning has the potential to build a more robust judicial system.

    It is an empirical fact that in the US the probability of having committed a crime varies based on race conditionals. Humans also tend to recognize patterns and have a sort of heuristic Bayesian reasoning approach. As such it’s not particularly surprising that we notice different conviction probabilities as a function of race.

    The current strategy seems to be to call these irrational biases, implicit bias, and shame people for them. The reality *could* be that using race conditionals in a trial does improve accuracy. As far as I’m aware there is no meaningful causal inference experiment to test this.

    A more meaningful strategy could be (maybe?) to acknowledge that Bayesian inference could potentially improve accuracy, but that employing it conditional on features such as race is contrary to values of individual liberty. This is an anti-racist view that reaches the conclusion without ignoring that pattern recognition/Bayesian reasoning *could* improve aggregate accuracy.

    • The loss for convicting an innocent person is different from the loss for failing to convict a guilty person, so improving aggregate accuracy probably isn’t the right target.

      • That’s right–but Simon can quickly retreat to a decision theoretic utility maximization problem.

        But then the question is utility for whom? Cf. Le Guin’s story about the mythical town of Omelas. Does a majority (or even super-majority) have a right to impose even a global utility maximizing decision on any of its citizens? Is it moral for an individual to support or participate in such a government?

      • Yes, “The loss for convicting an innocent person is different from the loss for failing to convict a guilty person,” — in fact, the former loss is usually greater than the latter, since convicting an innocent person usually entails letting a guilty person go free.
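
        One way to make the asymmetric-loss point concrete is a two-outcome decision rule. This is a sketch under stated assumptions: correct verdicts carry zero loss, the two error losses are fixed numbers, and the 10:1 ratio below is purely hypothetical. The function names are illustrative.

```python
def conviction_threshold(loss_false_conviction, loss_false_acquittal):
    """Probability of guilt above which convicting has lower expected loss.

    Convict when (1 - p) * loss_false_conviction < p * loss_false_acquittal,
    i.e. when p > loss_false_conviction / (sum of the two losses).
    """
    return loss_false_conviction / (loss_false_conviction + loss_false_acquittal)

def should_convict(p_guilt, loss_false_conviction, loss_false_acquittal):
    """Compare the expected losses of the two verdicts directly."""
    expected_loss_convict = (1.0 - p_guilt) * loss_false_conviction
    expected_loss_acquit = p_guilt * loss_false_acquittal
    return expected_loss_convict < expected_loss_acquit

# Hypothetical: a false conviction is treated as ten times worse.
print(round(conviction_threshold(10.0, 1.0), 3))
print(should_convict(0.95, 10.0, 1.0), should_convict(0.90, 10.0, 1.0))
```

        With equal losses the threshold is 0.5, which is the "aggregate accuracy" target; any larger loss for false convictions pushes the threshold toward 1.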

        • Thanks.

          Reading this over, it seems to me that the application of Bayesian reasoning was severely deficient on a number of grounds (some of which are mentioned in the article, e.g., the fact that close relatives could also be a match and could well live nearby). Also unmentioned is the possibility that the lab simply made an error. When DNA testing was introduced some time ago, labs did make errors (documented in Gigerenzer’s book “Calculated Risks”). This case is about 20 years old and falls into that category. Suffice it to say that when this information is taken into account, the match probabilities are nowhere near as small as those quoted in the article you posted.

          I note also that the case applies only in the UK as a precedent.

          IMO it is really difficult to use Bayesian reasoning in court cases, simply because it is difficult for juries to wrap their heads around the ideas involved. I taught an honors (freshman/sophomore) course a number of times on Bayesian decision theory (finite state spaces only, which made it accessible to students without calculus). After a semester in this seminar-style course, where we did indeed discuss Bayesian jurisprudence from a juror’s point of view for about two weeks near the end of the course, I think the students might have been able to do this…but for a jury selected from a random pool with essentially no background in the subject, I’m dubious that they could do a reasonable job, even if the reasoning were carefully explained by the litigants.
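
          The lab-error point above can be sketched numerically. The random match probability and false-positive rate below are hypothetical, not figures from the case or the article, and the sketch assumes a true source is always reported as a match.

```python
def match_likelihood_ratio(random_match_prob, false_positive_rate):
    """Likelihood ratio of a reported DNA match (source vs. not source).

    Assumes P(reported match | source) = 1. A non-source can be reported
    as a match by coincidence or through a lab false positive.
    """
    p_match_if_not_source = (
        random_match_prob + (1.0 - random_match_prob) * false_positive_rate
    )
    return 1.0 / p_match_if_not_source

# Hypothetical values for illustration only.
rmp = 1e-9  # quoted random match probability
print(f"{match_likelihood_ratio(rmp, 0.0):.3g}")   # no lab error assumed
print(f"{match_likelihood_ratio(rmp, 1e-3):.3g}")  # 1-in-1000 lab error assumed
```

          Under these assumptions the lab error rate, not the quoted match probability, dominates the weight of the evidence: the likelihood ratio falls from about a billion to just under a thousand.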

        • “it is difficult for juries to wrap their heads around the ideas involved”

          It’s sad that human beings seem to have been equipped with brains/minds that can and do use (proper) probability theory[1] – including even the noncommutative sort, apparently[2] – but can develop intellects that struggle to understand and use it.

          [1] https://en.wikipedia.org/wiki/Bayesian_cognitive_science
          [2] https://en.wikipedia.org/wiki/Quantum_cognition

      • hmmmm … Is that b/c now that I’ve bullied you so savagely w/ my terroristic “… in your pipe …” comment, you are thinking of leaving the field? Or b/c I puffed myself up like an angry cat when I wrote the comment & intimidated the bejeebers out of you?

        There are lots of studies on coherence effects, which definitely aren’t in the “WTF?!” vein of psychological studies. Basically it’s just a kind of rolling confirmation bias that gets triggered by early pieces of evidence considered in sequence (a “primacy effect,” as social psychologists would say). There’s not the sort of skepticism about this line of work that there is about “power posing” among serious decision science researchers.

        You should of course be a critical reader– but read the studies! Or if you have, tell me what you find unpersuasive or suspect about them.

        A good recent review piece: DeKay, M.L. Predecisional Information Distortion and the Self-Fulfilling Prophecy of Early Preferences in Choice. Current Directions in Psychological Science 24, 405-411 (2015).

        Some interesting ones with physicians:

        Kostopoulou, O., Russo, J.E., Keenan, G., Delaney, B.C. & Douiri, A. Information distortion in physicians’ diagnostic judgments. Medical Decision Making 32, 831-839 (2012).

        Nurek, M., Kostopoulou, O. & Hagmayer, Y. Predecisional information distortion in physicians’ diagnostic judgments: Strengthening a leading hypothesis or weakening its competitor? Judgment and Decision Making 9, 572-585 (2014).

        Kostopoulou, O., Sirota, M., Round, T., Samaranayaka, S. & Delaney, B.C. The Role of Physicians’ First Impressions in the Diagnosis of Possible Cancers without Alarm Symptoms. Medical Decision Making (2016).

  6. Reading your piece, I remembered a post you wrote a while ago about not using any other information when grading students in a class other than the exam result. After searching for it, I found the post (http://statmodeling.stat.columbia.edu/2008/09/10/robin_hanson_an/). It was in 2008!

    Anyway, here’s the relevant bit:
    “Suppose I give a pre-test at the beginning of the course, then at the end I give a final exam. For simplicity suppose these are the only 2 pieces of info that we have on the students, and then imagine we can use these to predict future performance (e.g., grades in a future course). Once the course is over, the pre-test probably adds information (beyond what’s in the final exam alone), but it wouldn’t really be fair to use the pre-test to assign the final grade.”

    I think something similar happens here: Maybe the pieces of information do add information, but it’s not fair and they should be assessed as if they’re independent information.

    Last, but not least, we have evidence that people give more weight to coherent stories, and end up disregarding (inefficiently) information that contradicts a coherent story (cf. http://psi.sagepub.com/content/13/3/106.abstract).

  7. It strikes me that a major problem in criminal proceedings currently and one that would apply if one were to apply Bayesian techniques is weighing the priors.

    It is fairly clear that most forensic science is as scientific as homeopathy and classic bits of hard “evidence” such as eyewitness reports and confessions are dubious at best.

    So unless one is going to use a null prior for everything, I don’t see that Bayesian theory has much application, not because it might not work but because the scaffolding underpinning a lot of ‘evidence’ is so poor that it would be a matter of GIGO.

  8. Oh my. Who knew my letter to Andrew would prompt this many replies to his post? I haven’t read them all, and haven’t followed any of the links. I just want to add this.

    My original idea was that the different strands of evidence in a single trial could reasonably be seen as reinforcing one another. Andrew seemed to agree.

    Then some of the discussion turned to the reasons for excluding information about the defendant’s “bad guy” past. I think that exclusion makes sense – you’re trying the defendant for just this one offense – even if knowing he (or she) is a bad guy makes guilt this time more probable. That would indeed introduce prejudice. (Rarely do we get to use the word “prejudice” so literally!)

  9. The problem of confirmation bias overcoming Bayesian techniques isn’t confined to lay people. In a previous career as an intelligence analyst, it was striking how often some of my (intelligent and well educated) peers would perform very intricate intellectual contortions to jam new information into consistency with their prior. This became more and more convoluted with each divergent piece of information.
