18 thoughts on “Three ways to present a probability forecast, and I only like one of them”

  1. Doesn’t it matter what your audience is going to do with the information? I suspect that all three of these estimates are too precise by that metric. Weather consumers don’t care about the probability at all (or let’s say they have a three-part test: p=0, p=1, and p = somewhere in the middle, call it 50 percent or so). For the election example, there are probably about 5 relevant categories (no chance, behind, close, ahead, and sure thing). For the final example, the probability is just too imprecisely measured to justify a third decimal.

    But what would you do with a finance guy who wants to calculate the probability of default from observations on a credit default swap? He has to pick some precise estimate in order to calculate a hedge ratio in some other instrument. Or, if he can find two inconsistent estimates (within the spread), he can arbitrage. In a world of penny spreads, he might need values more precise than one percentage point, and certainly more accurate than 10.
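    To make that concrete, here is a minimal sketch of how such a trader might back a default probability out of a CDS quote, using the standard "credit triangle" approximation (hazard rate = spread / (1 - recovery)). The spread and recovery values are invented for illustration.

    ```python
    import math

    def implied_default_prob(spread_bps: float, recovery: float, years: float) -> float:
        """Risk-neutral probability of default within `years`, implied by a CDS
        spread under the flat-hazard-rate ("credit triangle") approximation."""
        hazard = (spread_bps / 10_000) / (1 - recovery)  # annual default intensity
        return 1 - math.exp(-hazard * years)

    # Two quotes one basis point apart imply visibly different 5-year default
    # probabilities -- which is why this user needs far better than 10% precision.
    for spread_bps in (100.0, 101.0):
        prob = implied_default_prob(spread_bps, recovery=0.40, years=5.0)
        print(f"{spread_bps:.0f} bps -> 5y default prob {prob:.4f}")
    ```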

    How about a Blackjack counter? A tenth of a percent in the odds is the difference between eating and not eating sometimes!
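    The stakes in that example are easy to put numbers on. A back-of-the-envelope sketch, with bet size and table pace invented for illustration:

    ```python
    avg_bet = 100.0      # dollars per hand (assumed)
    hands_per_hour = 80  # rough table pace (assumed)

    # Swinging from a -0.1% to a +0.1% player edge flips the sign of the hourly take.
    for edge in (-0.001, 0.001):
        print(f"edge {edge:+.2%}: expected {edge * avg_bet * hands_per_hour:+.2f} dollars/hour")
    ```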

    • Jonathan:

      Of course, the context is crucial. For reasons I discussed here, I think it’s ludicrous to present an election forecast in 1% increments of probability. For the weather, yes, I think every 10% is fine. I’m thinking of users such as myself. The options p=0, .5, 1 do not provide enough information for me.

      I’m not saying that 10% is always the right level of precision for a probability forecast; I’m just talking about the examples above.

      • The US Census Bureau lists the population estimate of the US as 316,128,839. What’s your opinion on that?

        Is that a similar situation, or different? Would you rather they list it as 316.1 million?

        • Rahul:

          The Census Bureau’s number is a fiction but they have various responsibilities, including (I think) making the numbers add up in every jurisdiction. And some communities in the U.S. have just a few people. So I can see the logic in giving the population as a precise number even though it is obviously incorrect. It wouldn’t be a bad idea to include a standard error with that number.

          P.S. I just looked it up at the Census website and they give the population as 319,115,990. But it’s on an odometer-like counter and it keeps going up, with a net gain of one person every 15 seconds. Hey, it’s at 319,115,993 now! This is fine with me. My wall clock has a second hand, and that’s fine too, even though the time on the clock isn’t really accurate to within 1 second.
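          (For what it’s worth, the odometer’s pace passes a quick sanity check: one net person every 15 seconds comes to about two million people a year, the right order of magnitude for U.S. net population growth.)

          ```python
          seconds_per_year = 365 * 24 * 60 * 60  # 31,536,000
          print(f"{seconds_per_year / 15:,.0f} net new people per year")  # ~2,102,400
          ```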

  2. I certainly agree with the sentiment. But, out of curiosity, what’s your recommendation for extreme cases? For example, what if you’re looking at district-level elections and one of the districts is particularly safe for one party or the other for some reason (e.g., gerrymandering), so your model spits out, say, 98% for the Republican and 2% for the Democrat? Do you report it as 100% and 0%? That seems problematic because it sounds like you’re certain of the outcome (which you effectively are, but nothing is really certain). So, maybe you asterisk it? Or do you go with the nearest 1% in that case?

    • Dab:

      Yes, good point, I guess something would need to be done at these endpoints. I remember a few years ago seeing someone’s forecast where he gave the Republican presidential candidate a 4% or 5% chance of winning in Washington, D.C. It turned out that the problem was some sort of roundoff or bounding error.
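      One possible reporting rule for those endpoints, sketched purely as an illustration: round to the nearest 10% in the middle of the range, but never report a literal 0% or 100% for a merely-very-safe seat. The 5%/95% cutoffs here are an arbitrary choice, not anything from the post.

      ```python
      def report_prob(p: float) -> str:
          """Round to the nearest 10%, but keep very safe seats away from 0%/100%."""
          if p < 0.05:
              return "<5%"   # effectively safe, but not literally impossible
          if p > 0.95:
              return ">95%"  # effectively safe, but not literally certain
          return f"{round(p * 10) / 10:.0%}"

      for p in (0.02, 0.38, 0.98):
          print(p, "->", report_prob(p))  # <5%, 40%, >95%
      ```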

  3. Sam Wang has actually brought up this issue of false precision lately, and now gives the prediction to the nearest 5%, plus or minus 15% (election.princeton.edu). I think that might muddy things a little, but at least it puts uncertainty front and center.

  4. I think the extra decimals are mainly added for reasons other than forecast “accuracy”.

    Want to know how much factor A (e.g., the election in Kansas) matters? Look at a counterfactual (e.g., the candidate drops out): the overall probability changes by only X%, whereas factor B moves the expected outcome by Y%. (A toy version of this kind of comparison is sketched below.)

    Similarly, a fine scale sometimes allows you to separate gradual shifts over time from discrete events that change the forecast.

    Finally, these things are supposed to be fun! So why not let the policy wonks have their decimals? (Similar to the different purposes of data visualization: grabbing attention versus displaying information.)
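    Here is that toy comparison, assuming a made-up nine-race model; none of the numbers come from any real forecast. The point is just that the factor’s effect shows up in the decimals and disappears if you round to the nearest 10%.

    ```python
    import random

    def national_win_prob(kansas_in_play: bool, n: int = 100_000) -> float:
        """Share of simulations reaching a 5-of-9 majority; race odds are invented."""
        random.seed(42)  # identical draws in both scenarios, so only the factor differs
        races = [0.5] * 8 + [0.55 if kansas_in_play else 0.48]
        wins = sum(sum(random.random() < p for p in races) >= 5 for _ in range(n))
        return wins / n

    base = national_win_prob(True)
    counterfactual = national_win_prob(False)
    print(f"with Kansas {base:.3f}, without {counterfactual:.3f}, "
          f"effect {base - counterfactual:+.3f}")  # ~+0.02: both round to 50% at a 10% grid
    ```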

    I check the NWS forecast every morning. Not only are the 5-day extended forecasts very good (when they say an X percent chance of rain, it rains about X percent of the time over the long run), but if you read the Forecast Discussion page the forecasters get into model uncertainties and often their own skepticism of the model predictions. (They often have nuanced priors.) While I may act based on the bottom-line prediction (“60 pct chance of rain”), I always appreciate knowing its origins.

    Our local NWS forecast –
    http://forecast.weather.gov/MapClick.php?CityName=Bedford&state=MA&site=BOX&textField1=42.4833&textField2=-71.2667&e=1#.VEWwgsn5nFI

    The associated Forecast Discussion page –
    http://forecast.weather.gov/product.php?site=BOX&issuedby=BOX&product=AFD&format=CI&version=1&glossary=1&highlight=off

    PS On the theme of uncertainties in forecast models, I believe they need to recalibrate/update their priors for 5-day quantitative precipitation forecasts. I haven’t kept a written log, but I can usually keep numbers in my head. Their five-day QPF forecasts are frequently high by a factor of more than 2 and are almost never lower than the actual precipitation. I wonder if anyone has been keeping a log?
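    A simple version of such a log, for anyone inclined to keep one: record each forecast alongside what actually happened, then compare each forecast level with the observed frequency. The entries below are invented placeholders, not real NWS data.

    ```python
    from collections import defaultdict

    # (forecast probability of rain, whether it actually rained) -- invented entries
    log = [(0.6, True), (0.6, False), (0.3, False), (0.8, True),
           (0.3, True), (0.6, True), (0.8, True), (0.3, False)]

    by_forecast = defaultdict(list)
    for prob, rained in log:
        by_forecast[prob].append(rained)

    # A well-calibrated forecaster's observed frequencies track the forecast levels.
    for prob in sorted(by_forecast):
        outcomes = by_forecast[prob]
        print(f"forecast {prob:.0%}: rained {sum(outcomes)} of {len(outcomes)} "
              f"times ({sum(outcomes) / len(outcomes):.0%} observed)")
    ```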

  6. I once was part of a panel tasked with estimating probabilities for a number of different possible future events. One issue that arose was “how many decimals?”, with strong opinions being expressed that we needed to be precise and give values to 1% or perhaps 0.1%. This was neatly addressed by one of the older and wiser members, who said “Do we really think that we can categorize these into even as many as nine different levels?” Put that way, the obvious answer was no, so we settled on rounding to 10%. But truthfully, given the subject, we were trapped by the decimal system: three levels might, just, have been about right, since the year was 1988 and we were being asked to estimate probabilities of earthquakes. The report is available at http://pubs.er.usgs.gov/publication/ofr88398. By alphabetical accident I am the first author, and the older and wiser person was the second.

  7. I first thought that I was in an alternative universe where the Democrats are in the red corner and the Republicans are in the blue one. Then I looked at the link and it’s Obama 2 years ago. Whew!

  8. The “probability of precipitation” is a tricky computation to any degree of precision. The Wikipedia page at
    http://en.wikipedia.org/wiki/Probability_of_precipitation (probably reliable) starts

    In U.S. weather forecasting, POP is the probability of exceedance that more than 1/100th of an inch of precipitation will fall in a single spot, averaged over the forecast area.[1] For instance, if there is a 100% probability of rain covering one side of a city, and a 0% probability of rain on the other side of the city, the POP for the city would be 50%. A 50% chance of a rainstorm covering the entire city would also lead to a POP of 50%.

    and continues later with

    … most of the time, the forecaster is expressing a combination of degree of confidence and areal coverage.

    It’s an interesting read.
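    As the quoted examples suggest, the definition reduces to a simple product, which is why both scenarios come out the same. A minimal sketch:

    ```python
    def pop(confidence: float, coverage: float) -> float:
        """POP = confidence that precipitation occurs somewhere in the area,
        times the expected fraction of the area it covers."""
        return confidence * coverage

    print(pop(1.0, 0.5))  # certain rain over half the city -> 0.5
    print(pop(0.5, 1.0))  # 50% chance of citywide rain     -> 0.5
    ```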

