What is “weight of evidence” in bureaucratese?

Martha Smith writes:

An NPR program today (Chemical Industry Insider Rolls Back Rules At EPA) led me to a NY Times article by Eric Lipton about the same topic. I browsed the latter a bit. One quote of note:

One area of contention was Dr. Beck’s insistence that the E.P.A. adopt precise definitions of terms and phrases used in imposing rules and regulations, such as “best available science” and “weight of the evidence.” The agency had repeatedly rejected the idea, most recently in January, in part because the definitions were seen as a guise for opponents to raise legal challenges.

The two links above and several others in the article are to a document

I searched this on “best available science” but did not find any definition in the document. Searching on “Weight of evidence” got to what looks like it might possibly be a definition:

p. 185: Weight of Evidence (Congressional Record June 7, 2016): “The term ‘weight of evidence’ refers to a systematic review method that uses a pre-established protocol to comprehensively, objectively, transparently, and consistently, identify and evaluate each stream of evidence, including strengths, limitations, and relevance of each study and to integrate evidence as necessary and appropriate based upon strengths, limitations, and relevance.”

This doesn’t sound very precise to me.

Also, p. 35 had this:

Influential risk assessments should be capable of being substantially reproduced. As described in the OMB Information Quality Guidelines, this means that independent reanalysis of the original or supporting data using the same methods would generate similar analytical results, subject to an acceptable degree of precision. Public access to original data is necessary to satisfy this standard, though such access should respect confidentiality and other compelling considerations. It is not necessary that the results of the risk assessment be reproduced. Rather, someone with the appropriate expertise should be able to substantially reproduce the results of the risk assessment, given the underlying data and a transparent description of the assumptions and methodology.

Public access to original data is good, but the rest of this sounds pretty weak.

There’s lots more in the podcast, article, and supporting documents that’s worth discussing. The popular press articles face the problem of trying to discuss scientific issues in lay language; but the issues themselves involve the conflict between the scientific and legal perspectives. Not an easy topic.

Indeed, it’s a bit disturbing for them to write, “It is not necessary that the results of the risk assessment be reproduced.”

One challenge here is that even statisticians can give you bad definitions of “weight of evidence.”


  1. Anoneuoid says:

    integrate evidence as necessary

    The word “integrate” is interesting. To me that means coming up with something like the Stefan-Boltzmann law, T^4 = I*(1-alpha)/(epsilon*sigma), which integrates information about the temperature, irradiance, albedo, and emissivity of an object. I don’t think that is what they are expecting, though.

    It sounds like what they are thinking of is just feeding a bunch of info into a human, or ensemble of humans, and then using their “intuition” (black box) to get a final decision. The lack of reliable theory/law leads to a situation much more like implementing a ML algo than science.

  2. LM says:

    >>> “This doesn’t sound very precise to me.”

    Well, how precise are our court system’s evidence criteria for enforcing all American laws? What accepted standards does the government routinely use?
    Should stricter scientific criteria be used for all government legal interventions?

    The minor issue here is EPA legal imposition/enforcement of environmental rules, but our normal courts convict (and sometimes kill/execute) people on what might be considered quite vague evidence definitions.

    Consider the official standard of proof for criminal and civil court cases. Crimes must generally be proved “beyond a reasonable doubt”, whereas civil cases are “proved” by lower standards such as “the preponderance of the evidence” (which essentially means that it was more likely than not that something occurred in a certain way).

  3. Dale Lehman says:

    A great book on these topics is Judging Science by Foster and Huber. It hardly resolves any issues, but amply demonstrates the difficulties involved with determining what passes for “scientific knowledge” and the challenges of defining this in legal and practical circumstances.

  4. RJB says:

    One challenge here is that even statisticians can give you bad definitions of “weight of evidence.”

    I infer that you think some statisticians can give you good definitions. Can they? Have they? Perhaps they can come close when they stipulate that a number of assumptions are met regarding data gathering, sample selection, measurement, etc. I write as someone who works in a field (accounting) that tries to define words like reliable, relevant, verifiable, material (as in important), etc. We have definitions, but few think they are “good”, just reasonably workable.

    • Kyle C says:

      As an administrative judge, I agree. No one can define “the weight of the evidence” in a way that transfers from one set of facts to the next. (I am an amateur Wittgensteinian on this.) The best one can hope for is to specify a process that generates the evidence to be weighed. Ultimately the weight “is what it is.”

  5. Alex says:

    Part of it, I think, is a broad governmental issue where the government can’t tell businesses (including contractors hired for government projects) exactly how to do things. The EPA might have a rule about how much of a pollutant a factory can produce but they probably won’t tell the factory owners what chemical processes or scrubbers or whatever to use to achieve that pollution level. Or for work that I’ve done, I’m tasked with conducting analyses and writing reports but am only somewhat directed on how to do that. This system allows the government to get what it wants (certain levels of pollution; completed analyses) while also allowing businesses to innovate to achieve those goals. The flip side is that some things don’t get super-specific definitions.

  6. Martha (Smith) says:

    Andrew said: “One challenge here is that even statisticians can give you bad definitions of ‘weight of evidence.’”

    My concern is that I don’t think it’s possible to give a precise definition of “weight of evidence” that would cover all situations the EPA might consider. In particular, different situations may have different types of evidence to be considered. And I don’t recall seeing anything about how “weight of evidence” should take into account evidence and estimates of uncertainty.

  7. steven t johnson says:

    The weight of evidence for a proposition might be conceived as the probability of its being true. Except instead of citing an actual number (much less a margin of error), there’s simply a binary decision that the probability is high enough (or not). This may be preferred to mask the inability to actually calculate a meaningful probability. But could the real difficulty be that “weight of evidence” is more a risk assessment? As in, the risk to the decision process from assuming the truth of the proposition is too small to change the outcome? I’m not sure whether there’s a general standard for “too risky.” But if there isn’t, I suspect the “weight of the evidence” isn’t susceptible to a simple statistical definition for the same kinds of reasons.

