More gremlins: “Instead, he simply pretended the other two estimates did not exist. That is inexcusable.”

Brandon Shollenberger writes:

I’ve spent some time examining the work done by Richard Tol which was used in the latest IPCC report.  I was troubled enough by his work I even submitted a formal complaint with the IPCC nearly two months ago (I’ve not heard back from them thus far).  It expressed some of the same concerns you expressed in a post last year.

The reason I wanted to contact you is I recently realized most people looking at Tol’s work are unaware of a rather important point.  I wrote a post to explain it which I’d invite you to read, but I’ll give a quick summary to possibly save you some time.

As you know, Richard Tol claimed moderate global warming will be beneficial based upon a data set he created.  However, errors in his data set (some of which are still uncorrected) call his results into question.  Primarily, once several errors are corrected, it turns out the only result which shows any non-trivial benefit from global warming is Tol’s own 2002 paper.

That is obviously troubling, but there is a point which makes this even worse.  As it happens, Tol’s 2002 paper did not include just one result.  It actually included three different results.  A table for it shows those results are +2.3%, +0.2% and -2.7%.

The 2002 paper does nothing to suggest any one of those results is the “right” one, nor does any of Tol’s later work.  That means Tol used the +2.3% value from his 2002 paper while ignoring the +0.2% and -2.7% values, without any stated explanation.

It might be true the +2.3% value is the “best” estimate from the 2002 paper, but even if so, one needs to provide an explanation as to why it should be favored over the other two estimates.  Tol didn’t do so.  Instead, he simply pretended the other two estimates did not exist.  That is inexcusable.

I’m not sure how interested you are in Tol’s work, but I thought you might be interested to know things are even worse than you thought.

This is horrible and also kind of hilarious. We start with a published paper by Tol claiming strong evidence for a benefit from moderate global warming. Then it turns out he had some data errors; fixing the errors led to a weakening of his conclusions. Then more errors came out, and it turned out that there was only one point in his entire dataset supporting his claims, and that point came from his own previously published study. And then . . . even that one point isn’t representative of that paper.

You pull and pull on the thread, and the entire garment falls apart. There’s nothing left.

At no point did Tol apologize or thank the people who pointed out his errors; instead he lashed out, over and over again. Irresponsible indeed.


  1. anon says:

    The previous errors were real and should not have happened, but this one is a red herring. The two alternative numbers in Tol (2002) are non-standard aggregation approaches. None of the other studies reviewed in Tol (2009) use them and it would be completely nonsensical to put these two other estimates from Tol (2002) into the Tol (2009) review. Anyone vaguely familiar with this literature would understand why that is the case. Tol should probably have explained that somewhere, but I doubt you will find anyone who understands these two alternative aggregation methods who would not agree with the choice Tol has made on this issue in his 2009 paper.

    • Andrew says:


      This could be. I’m reporting what Shollenberger sent me but I haven’t looked at these three numbers myself. It could well be that Tol made a zillion errors in this project, not a zillion and one, and perhaps I was too quick to assume this was an error too.

    • anon, whether or not the two other estimates are non-standard is a non-issue. You can always come up with a post hoc rationalization for disregarding data you don’t like. You could disregard the method for aggregating damages for many of the papers as being non-standard. For instance, one paper Richard Tol used relied upon self-reported measures of happiness to determine economic damages of global warming. I doubt anyone would believe that is more standard than the alternate aggregation approaches of Tol 2002.

      Moreover, the issue here is not that the other numbers were somehow “right.” The issue here is that alternate approaches gave vastly different results. Whether those alternate approaches were non-standard or not, Tol 2002 did nothing to endorse one approach over another. As such, it is completely inappropriate to cherry-pick one of Tol 2002’s results while ignoring the others. This is particularly true if doing so is the only way to get the only data point which shows any meaningful benefit from warming.

      People would obviously have viewed Tol 2009 very differently if Richard Tol had not made his many data errors, so that only one data point showed any meaningful benefit from warming, and had then said up front, “That one point is from one of my papers which gave three values, from which I chose the most favorable result. But it’s cool, because the other results were from non-standard approaches.” That’s because even if the other approaches are non-standard, you have to clearly explain what results they give and why they shouldn’t be used. Anything else is dishonest.

      With all that said, I’ll point out I find it weird how people like to make comments like anyone “vaguely familiar with this literature would understand” blah, blah, blah. It’s not an argument. It doesn’t contribute anything. It’s just annoying posturing. And given Tol was publishing papers in 2008 using one of the alternate aggregation approaches (equity weighting), it’s rather strange. If Tol was comfortable enough with the alternate aggregation approaches to publish papers using them in 2008, why should he not be comfortable using estimates from them in 2009?

    • Richard Tol says:

      Anon is correct. The other two estimates are incomparable: They map the same results onto a different welfare function. Nordhaus does the same in a few papers, and the alternative estimates are similarly omitted here.

      • This argument sounds somewhat plausible until you realize other papers map their results onto entirely different welfare functions yet have their results used anyway. If Richard Tol wanted to require consistency across the papers, that would have been fine. He didn’t though. If he had, he wouldn’t have had enough data to publish his work.

        Tol obviously can’t apply one standard to one paper in order to cherry-pick favorable results from it then not apply the same standard to other papers simply because it would be inconvenient.

  2. I don’t see how Richard’s claim that “The corrections did not affect the conclusion that the initial impacts of climate change are beneficial” holds up even on its own terms.

    For example, see the bottom of this post for an updated figure using the corrected data. The best-fit regression line — and yes, we’ve been through the many problems of this approach, but let’s temporarily ignore that for consistency — now depicts negative impacts throughout.

    If we don’t try to fit a curve to the data, then Richard’s claim still doesn’t make sense. At best, the revised paper provides ambiguous evidence of initial benefits. (And, again, even that generous interpretation hinges on the results of a single study!)

  3. P.S. Andrew (and others), what’s your take on Richard’s latest paper, which uses kernel regression to fit a curve to the data? I don’t see how this nonparametric approach is suitable at all given the paucity of data, but perhaps I’m missing something.

    • Andrew says:


      I’ll have to let someone else read this new paper. Tol has destroyed his credibility with me and I see no reason to waste any more time reading his stuff.

    • I’ve read that paper, as well as another recent paper by Richard Tol along the same lines. Leaving aside the incredible fact there are still data errors in them, the main problem is what you point out – there is so little data. The non-parametric methods Tol tries to use in these papers are completely unsuited for such small sample sizes.

      In at least one case, this produces an absurd result. Tol subsetted his data to perform various tests. He reported certain results, but if you checked the figures he included, you could see reducing the amount of data he had (by subsetting it) increased his certainty levels. That is, the more data you have, the less you know.

      (There’s a fairly simple mathematical reason that happens. If people are interested, I can explain when I get home and am not typing on a phone.)

        • I intended to write a post about it at my little site this evening because the topic is interesting (and amusing) enough to merit one, but I’m afraid something came up that took priority on the posting schedule. It’s tangentially related to the Richard Tol issues in that it arose from a couple comments I wrote about the Tol issues at Retraction Watch, but I don’t know how interested anyone here would be in it. Long story short, it appears Retraction Watch secretly edits user comments. Yeah. Sorry for being off-topic, but it’s kind of a big deal. You don’t just go around changing the comments people submit to your site willy-nilly then presenting them as though that’s how they were submitted. If you need to edit things out, you leave a note indicating the change. (As an example, feel free to delete all this if you feel it’s too off-topic.)

          Anyway, I’ll try to post about the mathematical issue I referred to Saturday as Friday will be too busy. For a really condensed version, a common way to estimate uncertainty in one’s data is to take a number of subsets of the data and see how much variance there is across them. The idea is how much difference there is between the subsets should represent the uncertainty in your data. The problem with that is when you have very little data, the variance in your data isn’t representative (the sample doesn’t accurately represent the population). That means removing data, such as by taking subsets of it, gives you samples which are even less representative of the whole population.

          This can introduce all sorts of biases, but the weirdest is one which happens when taken to the extreme. Remember, this type of approach uses variance as your measure of uncertainty. As you reduce the number of data points to a sufficiently small amount, you can actually reduce the amount of variance. If you only have one data point, you have 0 variance. If greater variance means greater uncertainty, 0 variance means absolute certainty that one data point is correct. Because, after all, no data disagrees with it.

          That line of thought is nonsensical, of course. That’s why approaches like that (e.g. bootstrapping, jackknifing) all require a sufficiently large amount of data so that the subsets are all adequately representative of the population. Tol just ignored those requirements when doing his tests. (Incidentally, Tol doesn’t even have ~20 data points. He has ~20 data points spread across a number of different temperature values. Those values are treated differently from one another, thus effectively reducing the total number of data points he has for his uncertainty calculations.)

    • Dikran Marsupial says:

      Just looking at Figure 1, the kernel model clearly has more structure to it than is justified by the data, especially the big overshoot between 3 and 4.5, which is then corrected to pass exactly through the rightmost data point.

    • Heya. I hope you’ll forgive me for discussing a somewhat different paper. It turns out the paper you linked to wasn’t Tol’s latest paper. He has at least three newer ones, of which I had read two. I wound up writing my post on the newest of the ones I had read, simply because it was the newest.

      The explanation I gave in my comment is still mathematically sound, and it’s largely the same issue for the paper I discuss (variance in this case is just measured a different way).

      But to make up for the disappointment a bit, I’ll give you a special sneak peek. My post mentions I found a data error in a new Tol paper. It turns out he corrected a data error in that paper, one I pointed out when I criticized the work he slipped into the IPCC report. I don’t know if he gave me credit for finding the error. But I do know this: He inverted another paper’s conclusions.

      That’s right folks. Tol took another paper which found global warming would show damages and presented it as showing benefits. And guess what? Until his 2015 paper, he had been listing this one as showing benefits. He inverted his own position for the paper!

      I think I know how it happened too. I just need to confirm a couple things. It’s too funny.

  4. I just had a weird problem with this site. After I submitted my first comment upthread, I couldn’t see it. I thought it was just in moderation since I was a first time commenter, but then later, I couldn’t see the comment I had left from my phone. Or any other new comments. It turned out I was receiving the exact same page I had received the first time I loaded the page. It was really weird.

    I reset my wireless router and everything seems to work fine now, but I thought I’d mention it.

  5. The first errors could have been an honest mistake or plain ignorance (IIRC, Andrew’s original gremlins post was complaining about extrapolations outside the range of the design matrix). But with Tol refusing to simply concede the point and retract the article altogether once the central claim no longer holds, does this enter the territory of scientific misconduct? I’m curious about what Andrew and others think about this point.

    In any case, people could misuse this non-result with adverse real-world consequences. And why don’t the editors of the journal in question take a look at the paper and decide for themselves whether it should stand or not? Can’t the editors unilaterally retract/withdraw a paper?

  6. Eli Rabett says:

    To continue the gremlin whacking, the data/figure reckon damages as a function of global temperature anomaly, but it is not clear what time is used to set the zero point, nor is it clear that all of the studies used the same initial time to set T=0 or that Tol reconciled them. Nordhaus, responsible for six of the 16 points used, picks 1900 to set T=0. Tol (2009) states that the damages are all relative to today, which is idiotic because it means that an additional 2C rise from the current global temperature anomaly would be ~3C since 1900. According to Tol, this would be minimally damaging and no bunny believes that.

  7. Jonathan Gilligan says:

    Over at Eli Rabett’s blog, the lagomorph asks a very important question on top of all these others: In aggregating the data, did Tol correctly ensure that the “warming” numbers all referenced the same baseline? “The serious question is what is the baseline for each study and which sets the zero of the temperature scale? … For which is the baseline pre-industrial (in which case the world is now past Tol’s outlying positive point at 1.0 C), 1860 or so when instrumental records start in which case the world is pretty close to it, or some more recent time, in which case, another couple of degrees would, at least according to Tol, have little effect.”

    Brandon Shollenberger answers that “Tol didn’t actually use any real baseline. All the papers used different temperature baselines, and Tol didn’t put them on a common one!” (more detail here at Shollenberger’s blog)
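
The subset-variance point raised in the comments above (that with very little data, taking smaller subsets can shrink the measured variance, down to exactly zero for a single point) can be illustrated with a minimal sketch. This is not Tol's calculation or Shollenberger's; the 20 synthetic values, the function name `mean_subset_variance`, and the choice of the plug-in variance formula are all assumptions made purely for illustration.

```python
import random
import statistics


def mean_subset_variance(data, k, trials=2000, seed=0):
    """Average plug-in variance over random subsets of size k (hypothetical helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        subset = rng.sample(data, k)            # draw k points without replacement
        total += statistics.pvariance(subset)   # variance measured inside the subset
    return total / trials


# Synthetic stand-in for a small data set of roughly 20 estimates (assumption).
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(20)]

results = {k: mean_subset_variance(data, k) for k in (1, 2, 5, 10, 20)}
for k, v in results.items():
    print(f"subset size {k:2d}: mean within-subset variance = {v:.3f}")
```

With a single point the within-subset variance is zero by construction, so reading variance naively as uncertainty would suggest the smallest subsets are the most certain, which is the absurdity described in the thread.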