The gremlins did it? Iffy statistics drive strong policy recommendations

Recently in the sister blog.

Yet another chapter in the continuing saga, Don’t Trust Polynomials.

P.S. More here.

1. dab says:

Iffy statistics drive strong policy recommendations

Part of me wants to (cynically) ask, “So, what else is new?” [thinking of how statistics are used in political rhetoric]; part of me wants to (glibly) ask, “what other kind are there?” [recognizing that iffy-ness in the form of uncertainty and variation are part of any statistical analysis] and then duck for cover [remembering that the host and many of the commenters are professional statisticians who may not take kindly to the glibness (I already stepped on ecologists’ toes in a previous thread)]; but mostly, I just want to give you props for writing a piece for a newspaper that requires the reader to have some intuition for the effect outliers can have on the least squares fit of a quadratic function in order to follow the argument being made.
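[Ed.: the outlier intuition dab mentions can be sketched with made-up numbers. The data below are invented for illustration, not taken from any of the studies discussed; the point is only that a single discordant point shifts the fitted quadratic coefficient of a least-squares fit.]

```python
import numpy as np

# Made-up points lying exactly on a gentle downward quadratic
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = -0.5 * x**2 + 0.2 * x  # true curve, no noise

# Quadratic least-squares fit recovers the curve exactly
coef_clean = np.polyfit(x, y, deg=2)

# Add one optimistic outlier at x = 1 and refit
x_out = np.append(x, 1.0)
y_out = np.append(y, 2.3)  # hypothetical high point, echoing the +2.3 debate below
coef_outlier = np.polyfit(x_out, y_out, deg=2)

print("quadratic coefficient, clean data:   %.3f" % coef_clean[0])
print("quadratic coefficient, with outlier: %.3f" % coef_outlier[0])
```

The single added point makes the leading coefficient more negative, i.e. the whole curve bends downward faster everywhere, which is exactly the kind of sensitivity the comment is alluding to.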

2. Chris G says:

I’m all for statistically-robust estimation methods (that’s actually what motivated the title of my blog) but I’ve got a bigger question: How seriously should we take the economic impact models that the curves are being fit to? “Not at all.” is my take. Point #1: Check Tol’s Table 1. The numbers are all over the place. That suggests to me a lack of model validation, i.e., the ‘models’ are more appropriately described as guesses. Point #2: The average global surface temperature is about to exceed the maximum encountered over all of agricultural history – and blow past the maximum at a high rate of speed. (Link = http://thinkprogress.org/climate/2014/01/17/3180811/world-climate-cataclysm/) Our climate system is notoriously nonlinear. You really think that economic models which lack validation data over more than about 1 deg are going to be applicable to increases of 2.5, 3, maybe 4 deg when we don’t even have a good handle on how the climate system is going to respond? (And it’s generous to suggest that we have even 1 deg worth given that our economic systems have been non-stationary over the period where the avg surface temperature has increased that amount.) I think the economic impact models are a joke. It’s an effort to make deterministic predictions where determinism isn’t justified.

I believe a more constructive view is to acknowledge that anthropogenic CO2 emissions have caused and will continue to result in retention of massive amounts of excess heat. (If you know even just rudimentary radiative transfer that’s uncontroversial.) The excess heat retained in the lower atmosphere, the Earth’s surface and ocean surface seems likely to induce changes in our climate system, the details of which we cannot accurately predict. (Will atmospheric circulation patterns change? See, for example, the Ridiculously Resilient Ridge – link = http://thinkprogress.org/climate/2014/03/07/3370481/california-drought/ Will ocean circulation patterns change? Will it rain a lot more? Will it rain a lot less? There’s credible analysis to justify belief in any of those possibilities. Climate models just aren’t high fidelity enough to make accurate regional predictions a few years out – never mind a few decades out.)

Rather than attempting to predict economic effects let’s ask more fundamental questions. Since we’ve decided to play Russian roulette with our climate or, more accurately, as we’ve decided to force our kids and grandkids to play, let’s a) figure out how many chambers have bullets in them and b) establish whether it’s possible to remove a few – preferably all of them. MIT climate scientist Kerry Emanuel noted a few months back, “Being conservative in signal detection (insisting on high confidence that the null hypothesis is void) is the opposite of being conservative in risk assessment.” Let’s get serious about risk assessment with respect to our food and water supplies before we start arguing about differences of a few tenths of a percent in year-over-year GDP growth.

• Andrew says:

Chris:

Yes, as I discussed in my post at the sister blog, there was this funny thing going on in which each data point represents a different model but is only displayed at a single temperature point. I don’t really understand this. Once you have a model that can make predictions for a change of 1 degree, why not see what its predictions are for 0.5 degrees, 1.5 degrees, 2 degrees, etc? In that case, each point on Tol’s graph would become its own curve, and then he could average the curves in some way. Fitting a single curve to all the points doesn’t make sense to me.
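[Ed.: the alternative Andrew sketches – each study traced as its own curve over the whole temperature range, then averaged – can be illustrated with hypothetical per-model curves. The three lambdas below are invented stand-ins, not the actual models in Tol’s meta-analysis.]

```python
import numpy as np

# Hypothetical impact curves (percent of GDP as a function of warming in
# degrees) -- invented for illustration only
models = [
    lambda t: 0.5 * t - 0.5 * t**2,   # initial benefits, then losses
    lambda t: -1.0 * t,               # steadily accumulating losses
    lambda t: -0.2 * t**2,            # accelerating losses
]

t_grid = np.linspace(0.0, 5.5, 12)

# Trace each model over the whole range, then average pointwise,
# instead of fitting one curve through a single point per model
curves = np.array([[m(t) for t in t_grid] for m in models])
mean_curve = curves.mean(axis=0)

for t, v in zip(t_grid, mean_curve):
    print("%.1f deg: %+.2f" % (t, v))
```

Averaging whole curves uses each model’s behavior across the entire range, rather than anchoring the meta-analysis to one arbitrary temperature point per study.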

• question says:

Chris,
” CO2 emissions have caused and will continue to result in retention of massive amounts of excess heat.(If you know even just rudimentary radiative transfer that’s uncontroversial.)”

If that is so then why is the temperature on Venus so close to that on Earth at the same pressure? I had a previous discussion here with “dab” and he found the evidence uncompelling. I would like to hear other opinions since (judging by the last post) dab did not seem to understand my point.

The discussion can be found here but it is somewhat split up:
https://andrewgelman.com/2014/05/05/can-make-better-graphs-global-temperature-history/

• question says:

Sorry, “he” in the above post should be they/she/he.

• Chris G says:

> If that is so then why is the temperature on Venus so close to that on Earth at the same pressure?

As I said, if you know even just rudimentary radiative transfer the statement “CO2 emissions have caused and will continue to result in retention of massive amounts of excess heat” is uncontroversial. Your question indicates that you don’t understand radiative transfer. End of story.

• question says:

Chris,

No, it is plausible that the retained heat can also lead to negative feedbacks such as increased albedo. The theory should be able to explain the data, which indicate that for a given pressure the temperature will be similar regardless of the composition of the atmosphere. Please look at the data in that previous post.

• Chris G says:

> … for a given pressure the temperature will be similar regardless of the composition of the atmosphere.

• question says:

I can see you did not bother to look at the previous thread (I can’t blame you really for that), so OK.

• Chris G says:

> I can see you did not bother to look at the previous thread (I can’t blame you really for that), so OK.

• question says:

Chris,

You recommended David Archer’s book where he provides a table showing a “simple model” predicting surface temperatures for Venus and Earth as 240K and 253K, respectively. According to him the observed values are 700K and 295K; he then writes:
“Our simple model is too cold because it lacks the greenhouse effect.”
http://mathsci.ucd.ie/met/cess/FoundClim/archer_global_warming.pdf

So the “greenhouse effect” accounts for a difference of magnitude >400K while at the same pressure the temperatures are similar (+/- 10K). This is very close given the large purported greenhouse effect and vast differences in atmospheres.

Please look at the data and provide your interpretation of that. Perhaps it is a coincidence? I don’t know.

• Andrew says:

OK, ok, enough about Venus already! There’s a place for that sort of discussion but not on this blog. Thanks for understanding.

3. The second-order polynomial was a good fit for the data available at the time of writing.

The format allowed me to update the numbers and re-estimate the curve, but not to change the functional form.

Alternative functional forms are discussed in a paper forthcoming in Computational Economics.

• Shravan Vasishth says:

I understood Andrew’s point to be that extrapolating outside the available data, so to speak, using a polynomial fit, was not a good idea.

• Andrew says:

Shravan:

Yes, and indeed some of Tol’s substantive claims arise from his earlier extrapolations. When he wrote, “The assessment of the impacts of profound climate change has been revised: We are now less pessimistic than we used to be,” and “the benefits of climate policy are correspondingly revised downwards,” these claims are entirely based on (a) the feature of the quadratic that when it goes up and then down, it has to go down even faster, and (b) his extrapolation of his original model (with data points only going up to 3 degrees) to 5.5 degrees.

So that’s bad news. Within the context of the quadratic, Tol’s statements are logical, but without that strong modeling assumption, I don’t think his statements quoted above make much sense.

Also, Tol’s justification of his own data point as a non-outlier in his original analysis came from two other points which turned out to be miscoded.

• Shravan Vasishth says:

Very interesting. In the Sheffield statistics courses I did some years ago, they drove home the point again and again by making us think about whether extrapolating a model fit outside the range of the data makes any sense.

This reminded me of Feynman’s analysis of the Columbia disaster data on O-rings. If I recall correctly, the engineers implicitly made a linear extrapolation, whereas if they had plotted data for higher temperatures they would have seen the non-linear trend.

• Chris G says:

And he’s extrapolating curves created using dubious data. The entire endeavor looks like a house of cards. Reading his 2009 paper, he acknowledges the issues with attempting to model but then proceeds to use models which he acknowledges are problematic as the basis for his analysis. I don’t get that. (I’m reminded of Tukey’s maxim, “The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.”)

Some things noted by Tol in his 2009 paper which would give me pause:

1. “Although research is scarce… climate change effects would not be homogeneous within countries; certainly, particular economic sectors (like agriculture), regions (like coastal zones), and age groups (like the elderly) are more heavily affected than others.”

2. “[E]arlier studies focused on the negative effects of climate change, whereas later studies considered the balance of positives and negatives. In addition, earlier studies tended to ignore adaptation. More recent studies… include some provision for agents to alter their behavior in response to climate change. However, more recent studies also tend to assume that agents have perfect foresight about climate change, and have the flexibility and appropriate incentives to respond. Given that forecasts are imperfect, agents are constrained in many ways, and markets are often distorted — particularly in the areas that matter most for the effects of climate change such as water, food, energy, and health — recent studies of the economic effects of climate change may be too optimistic about the possibilities of adaptation and thus tend to underestimate the economic effects of climate change.”

Yes, I too suspect that predictions of the economic effects of climate change may be too optimistic about the possibilities of adaptation.

3. “In short, the level of uncertainty here is large, and probably understated — especially in terms of failing to capture downside risks.”

4. “Estimates are often based on extrapolation from a few detailed case studies, and extrapolation is to climate and levels of development that are very different from the original case study. Little effort has been put into validating the underlying models against independent data…”

Need I go on? At what point does one invoke Tukey’s maxim? At what point does one acknowledge that, while they’d like to make a meaningful prediction, they’re simply not able to do so?

• A minor nitpick: Feynman was investigating the SRB O-ring failure of the Challenger Shuttle mission, an account of which can be found in his book ‘What Do You Care What Other People Think?’ and elsewhere.

• Shravan Vasishth says:

Whoops, sorry about that. Pity I can’t go back and correct my comment…

• Shravan Vasishth says:

This is like the Moses illusion: how many pairs of animals did Moses take on the Ark? Answer: two. Except it was not Moses, but Noah.

• Richard Tol says:

Andrew:
Tol (2002) was not deemed an outlier because of a coding error in the construction of the confidence interval, not because of data errors.

• Andrew says:

Richard:

The point is that in the original paper there were several positive values so the Tol (2002) point didn’t stand out so much. In the revised paper there’s only one positive value (not counting a 0.1 for one study) and that big “2.3” stands out a lot, i.e. it is an outlier, albeit not as much of one as the “-11.5” from that other study.

• Richard Tol says:

+2.3% for 1K warming is not an outlier if the curve is a second order polynomial, which is a plausible assumption.

The fit is destroyed by the new observations: -11.2% for 3.2K and -4.6% for 5.4K. The former suggests a non-linearity that is much stronger than second degree; the latter suggests linearity.

As I wrote before, the nature of a corrigendum meant that I could not explore these issues in full. There is a paper coming out soon that does.

• Chris G says:

> +2.3% for 1K warming is not an outlier if the curve is a second order polynomial, which is a plausible assumption.

What does a model order selection criterion such as AIC or BIC suggest is the appropriate polynomial order?
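[Ed.: Chris G’s question can be made concrete. A Gaussian AIC for a least-squares polynomial fit is n·log(RSS/n) plus twice the parameter count. The sketch below uses invented data with a genuinely quadratic trend – not Tol’s actual 14 estimates – just to show the mechanics of comparing polynomial orders.]

```python
import numpy as np

def aic_for_order(x, y, k):
    """Gaussian AIC for a degree-k polynomial least-squares fit."""
    n = len(y)
    resid = y - np.polyval(np.polyfit(x, y, k), x)
    rss = float(np.sum(resid**2))
    n_params = k + 2  # k+1 polynomial coefficients plus the noise variance
    return n * np.log(rss / n) + 2 * n_params

# Made-up points: quadratic trend plus a small deterministic wiggle
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = -x**2 + 0.2 * np.cos(7 * x)

for k in (1, 2, 3):
    print("degree %d: AIC %.1f" % (k, aic_for_order(x, y, k)))
```

With data that really are quadratic, the degree-2 fit wins on AIC; with only 14 noisy points of unclear provenance, as in the thread’s dispute, the criterion would at least make the order choice explicit rather than assumed.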

• Andrew says:

Richard,

When you write that the -11.2% “suggests a non-linearity that is much stronger than second degree,” you’re showing an incredible faith in your model. But it’s a strange model, as it’s not a model of the impact of warming, it’s a model of other people’s estimates of the impact of warming. To suggest that one paper’s estimate of -11.2 provides evidence of a strong nonlinearity . . . hmmm, that’s a bit like saying that Picasso paintings provide evidence that there are a bunch of women out there who have both their eyes on the same side of their face.

• Andrew says:

Richard:

Regarding the +2.3% point, Bob Ward put it well when he wrote:

The three mistakes in Figure 1, particularly the mis-plotting of Hope (2006) as a welfare impact of +0.9 instead of -0.9, appears to have prevented Professor Tol from recognising that his 2002 paper is an outlier because it is the only one of the 14 data points that indicated significant net benefits from warming, and excluded many of the potential impacts. Hence, the section on ‘Findings and implications’ on pages 33-37 of Tol (2009) require revision, particularly the assertion on page 34 that “some estimates, by Hope (2006), Mendelsohn, Morrison, Schlesinger, and Andronov (2000), Mendelsohn, Schlesinger and Williams (2000) and myself (Tol, 2002b) point to initial benefits of a modest increase in temperature, followed by losses as temperatures increase further”.

I would not have used the word “significant” because its meaning can be ambiguous, but the point is clear, that the values that were mistakenly set to be positive provided cover for the +2.3% number. Once those signs got flipped, that +2.3% value produced by Tol (2002) stands alone. Indeed, if you had not already been expecting to see these positive values, you might have noticed the errors when you were doing the original analysis.

Thus, there was possibly a cascading effect: your estimate of +2.3% made you receptive to the idea that other researchers could estimate large positive economic effects from global warming, and then once you made the mistake and flipped some signs from positive to negative, this made your own +2.3% estimate not stand out so much. This is not a minor thing, given that one of the points of your paper was the claim of a positive economic effect of moderate global warming.

• Richard Tol says:

Andrew:
You’re welcome to redo the analysis and check which data-point drives which conclusion. Until you do, please refrain from making strong statements.

As Martin notes below, the short-term impacts are irrelevant for policy. The momentum in the energy system and the climate system is such that the short-term impacts are sunk.

The long-term impacts do matter. The difference may not be significant, but the context is decision analysis rather than hypothesis testing.

• Andrew says:

Richard:

I typed in the data from your revised paper and ran the regression. Having done this, I stand by what I wrote, that I don’t think that -11.2% number provides much evidence about the underlying curve, that the +2.3% number stands out as being the only large positive number, that it was a mistake to take data only going up to 3 and extrapolate past 5, etc. I don’t know whether I’d call these “strong statements.” They’re not really statements about climate science at all, they’re just basic statistics.

• Richard Tol says:

Andrew:
As I said, the Rehdanz-Maddison estimate is inconsistent with a second-order polynomial.

You may not like extrapolation beyond the domain of estimation, but that implies that you can’t study climate change.

• Anonymous says:

@Richard I’d say if you’re going to extrapolate, you need to model the process, not use a convenient mathematical form.

• Andrew says:

+1

• Rahul says:

@Anon @Andrew:

Can you elaborate? e.g. for this specific case what’d you recommend as a model instead of the quadratic? And what should be the metric for the goodness of a model?

• Rahul says:

Richard:

+1 to you for choosing to come on here to at least engage with critics. I wish more authors did that.

Regardless of right or wrong, open discussion seems sadly very lacking in lots of academic settings.

• Shravan Vasishth says:

Hi Rahul,

elsewhere you wrote: “Can you elaborate? e.g. for this specific case what’d you recommend as a model instead of the quadratic?”

I thought Andrew’s point is not that a polynomial is a bad thing per se to use for the data at hand. The point is that one should not extrapolate outside the available range of data. It doesn’t matter what kind of model you use, you will always have that limitation.

Richard said somewhere that if one cannot extrapolate, one cannot do climate modeling. And someone pointed out that one really needs to build a process model rather than fitting a statistical model to the data and extrapolating the line outside the data range. Apparently such models exist:

http://en.wikipedia.org/wiki/Climate_model

But Richard is an economist; probably you need to have a different specialization (or learn about a whole new body of knowledge) to know enough to model the process.

• Rahul says:

Right, I thought @Anon meant the quadratic is a specifically bad choice for extrapolation but another functional form may be better?

If I understand what you are saying correctly, @Anon’s criticizing a pure functional model used for extrapolation? A more fundamental phenomenological model is what’s needed?

• Andrew says:

Rahul:

The quadratic does some particularly weird things in this case because of its parametric nonmonotonicity. The curve is anchored at zero, and if you pull it upward at x=1 (which is what happens from Tol’s dot at that point) then it gets pulled more negative at high values of x. This makes sense given the logic of the quadratic: if the effects really are positive at x=1 and then negative at x=2.5, then the curve has to be declining fast, and if you have an increase from 0 to 1 and a rapid decline from 1 to 2.5, then the second derivative in that zone is large and negative, and the quadratic assumes a constant second derivative, hence the fast curving beyond that point. The point is, though, that there’s no real reason to believe this model, and it’s a bit odd that moving a point at x=1 upward will pull down the curve at higher values of x.
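[Ed.: Andrew’s mechanism – pulling the zero-anchored quadratic upward at x=1 drags the extrapolation downward at high x – can be checked numerically. The points below are invented impact estimates, chosen only to mimic the shape of the dataset under discussion, not the actual 14 values.]

```python
import numpy as np

def fit_anchored_quadratic(x, y):
    """Least-squares fit of y = b1*x + b2*x**2, anchored at (0, 0)."""
    X = np.column_stack([x, x**2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Made-up impact estimates (percent of GDP) at various warming levels
x = np.array([1.0, 2.5, 2.5, 3.0, 3.0])
y = np.array([0.0, -1.0, -1.5, -2.0, -2.5])
b = fit_anchored_quadratic(x, y)

# Pull the x = 1 point upward, as the +2.3 estimate does
y2 = y.copy()
y2[0] = 2.3
b2 = fit_anchored_quadratic(x, y2)

pred = lambda c, t: c[0] * t + c[1] * t**2
print("extrapolation to 5.5 deg, original:  %.1f" % pred(b, 5.5))
print("extrapolation to 5.5 deg, pulled up: %.1f" % pred(b2, 5.5))
```

Raising the single low-temperature point makes the fitted quadratic coefficient more negative and the 5.5-degree extrapolation substantially lower, which is the counterintuitive behavior Andrew describes: a more optimistic near-term point produces a more pessimistic long-run extrapolation.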

• Rahul says:

Understood. No arguments against that.

The real question is, (a) would you have been happier had he, say, selected a spline or some other functional form lacking this pathology of the quadratic?

(b) Or is this fundamentally too little data so every slight deviation will get amplified side effects causing artifacts of extrapolation no matter what functional form was chosen.

(c) Or, are you against fitting any arbitrary function in the first place. i.e. the model chosen needs a basis in the underlying phenomenon? (This is indeed a very good approach, no doubt, but when that’s too hard, are (a) or (b) acceptable at all?)

Maybe I’m wrong, but to me (a), (b) & (c) are three different critiques. I’m curious which one we are making here.

• Martin says:

Yes, but this is about modelling the total cost of climate change. What is the ‘process’ here? As I understand it, we simply do not know. The functional form is completely ad-hoc.

• Rahul says:

I like the clarity of your comment. So, by that opinion, an ad hoc function is perfectly fine, we are only arguing about the exact (and suitable) choice of ad hoc function.

If you had to select, are you fine with the quadratic or would you have selected some other function?

• Martin says:

This is being discussed for at least a decade now. Again, this is only as far as I understand it (d’oh), specific numbers are all invented: damages should somehow account for the notion that right now and in the near term nothing much will happen in terms of net damages (be it in the negative or in the positive). On the other hand, they have to show up as catastrophic damages at a warming of, say, 10°C. You only get there with something convex (and in general, the idea that damages due to warming should be convex is quite intuitive). At low temperature changes, the specific form of the function doesn’t make much difference. At about 5°C it does, but any estimate here is a wild guess anyway. The reason why I think this is not much of a problem is that valuation of damages takes over any tinkering about net damages anyway: a model with regional resolution and equity weights, for example, gives you an unbounded SCC estimate as soon as a region’s loss is catastrophic. Whether the net damages are -30 percent of GDP or -60 percent just doesn’t make much difference, because valuation takes over the calculation.

But yes, the functional form is ad-hoc, and I do see the problems. So, I’d be genuinely interested in alternatives. Andrew suggested below that Tol should have asked the help of a professional statistician. OK, so what would a professional statistician suggest, specifically? The task is this: We have a couple of very bad total damage estimates, but we are somewhat sure that no clear damage signal attributable to AGW will emerge in the near term and for low temperature changes. On the other hand, we are sure that damages will be catastrophic fairly rapidly after +5°C, and human civilization will cease to exist at +10°C. Now tell us a carbon tax. Don’t say ‘We just don’t know, and a precautionary principle should apply’, because that really doesn’t tell you anything, at all.

• Rahul says:

+1 for this:

OK, so what would a professional statistician suggest, specifically?

I’m truly curious about the alternatives too.

• Andrew says:

Rahul:

I’ll post more on this, but very briefly: I don’t like the implied model of Tol’s meta-analysis in which the published studies represent the true curve plus independent random errors with mean 0. I think it would make more sense to consider the different published studies as each defining a curve, and then to go from there. In particular, I’m guessing that the +2.3 and the -11.5 we keep talking about are not evidence for strong nonmonotonicity in the curve but rather represent entirely different models of the underlying process.

In short: I don’t think the analysis can be fixed by just playing with the functional form; I think it needs to be re-thought.

• Rahul says:

Andrew:

I re-considered Tol’s graphs & in hindsight I think it’s not a functional form issue or even an analysis issue per se. I think the devil’s embedded all in that single +2.3 data point. I’d be very wary of trusting one single positive data point in this sea of negative impacts.

Further, why are there barely two studies at delta-Temperature less than 2 Celsius & then this rash of studies all clustered around 2.5 to 3.0 Celsius? What’s the rationale behind choosing these fourteen estimates?

e.g. Say I studied health impact of smoking (as a function of number of cigarettes smoked per day). If my meta analysis used 20 estimates for people smoking 20 cigarettes a day & only one estimate of 1 smoke-a-day subjects then my conclusions are terribly sensitive to errors in that last study.

I’d preferably not choose such a skewed dataset. Or if I did, & the 1-smoke-a-day got a grossly counter intuitive impact than the rest I’d be very wary of that point (OTOH, this depends on how strong monotonicity is in your prior).

That was my naive analysis. I could be wrong.

• Rahul says:

In other words, my naive intuition says Tol is free to conclude what he will for the 2.5-3 Celsius range. But he just doesn’t have much data to reliably make *any* sort of claim for what happens in the T less than 2.5 Celsius range.

• K? O'Rourke says:

Rahul:

For any meta-analysis to be credible, it must argue that something should be common between the studies while the inevitable differences can be somehow allowed for (e.g. in an RCT perhaps a common relative risk with a varying control group event rate).

In Non_RCTs that’s extremely hard as the other things adjusted for or modelled have to be the same or somehow credibly re-adjusted to achieve similarity of effect of interest (e.g. in y ~ B1 * x + B2 * y and y ~ B1 * x + B2 * z the B1s are not usually equal).

Even with that but with just summary study data – it’s just aggregation and so subject to aggregation bias.

I’ll be interested in what Andrew found, for instance I am not sure how to deal with different randomised studies that had differing number of dose groups with different doses…

• K? O'Rourke says:

Never shows up until after sent.

(e.g. in y ~ B1 * x + B2 * z and y ~ B1 * x + B2 * w the B1s are not usually equal).

• Andrew says:

Keith:

For Tol’s paper I think the problem is that the standard meta-analysis paradigm does not apply. These are not independent studies, they are separate models with some unclear amount of overlap of information and assumptions. This is not to deny the importance of such an exercise; I just think it needs a different sort of statistical model. And I am disturbed both by Tol’s mix of defensiveness and assertiveness (for example, alternately characterizing the changes as having essentially no effect or as being policy-relevant) and by his apparent complete buy-in on the quality of the individual data points, to the extent that when he sees outliers, he sees them as evidence for strong non-monotonicity rather than evidence of a problem with his conceptual model.

4. Martin says:

Andrew,

This is an attempt to criticize you, so please destroy me gently, if you feel you have to.

You yourself talk about allegedly “strong policy recommendations” that are based on these estimates. But this is, I think, flat-out wrong: total impacts are NOT policy relevant, this is not even the right derivative. Marginal damage costs would be. Tol himself tells you so in his comment over at the WP: whatever benefits he saw, they were “sunk” – i.e. they represent the very idea of a policy-IRrelevant measure (the point is also stated in the paper itself) as understood by economists. Again, you have those “strong policy recommendations” even in your title. But then – and you quote the passage yourself – the only policy “implication” to be found in the original paper seems to be that uncertainty is so large that we should err on the ambitious side of climate policy. The relevance claimed in the Erratum seems anything but “strong”, and no “recommendation” is claimed.

Then, whatever the marginal damage costs, an economist does not maximise something like output when looking for policy recommendations, but utility. Utility is convex, so even if net benefits were accruing in terms of, say, output, nothing would follow in terms of utility. I’d add here that SCC estimates – which are based on utility – do indeed show a cost, not a benefit. I see the problem that e.g. damage functions are calibrated to total damages. But the question here, it seems to me, should be how this affects SCC estimates. If it does not, this is much ado about a policy-irrelevant measure. The only reason why this permeates Teh Internet as much as it does is some vague idea that it is policy-relevant (nobody would care if it wasn’t, and probably you yourself got it from the ongoing debates in several blogs), when it is not. However, the enormous relevance you are claiming seems to be an artifact of the internet itself, rather than a claim implied by either the paper or, indeed, basic economic welfare theory.

• Martin says:

“utility is convex”

…and you are not a dork.

• Andrew says:

Martin:

Tol wrote that the revised estimate based on the new data “is relevant because the benefits of climate policy are correspondingly revised downwards.” So it seems that some policy implications are involved here!

• Martin says:

Andrew,

you didn’t title your post “it seems that some policy implications are involved”, the corresponding part of the title is “Iffy statistics drive strong policy recommendations.” There is no policy recommendation, at all. There is one strong implication, namely that uncertainty is huge and right-skewed and that we should therefore err on the ambitious side of climate policy. You quoted that part yourself. Perhaps that is what your title really means, but I don’t think so. There is a claim of a policy-relevant implication in the erratum, but it is not quantified, so I am at a loss how you see a “strong recommendation”, if there is no recommendation, and much less a strong one.

Also, you ignore Tol himself and now me explicitly stating the econ101 no-brainer that sunk costs are not policy relevant. Why?

• Andrew says:

Martin:

I appreciate when people air their disagreements with me openly rather than just privately thinking that I’m wrong. So thank you for the comments.

So let me clarify. I wrote, “Iffy statistics drive strong policy recommendations.” I think we can all agree on the “iffy statistics” part, so the only dispute here seems to be the “strong recommendations” bit.

In his original paper, Tol wrote, “The policy implication is that reduction of greenhouse gas emissions should err on the ambitious side,” and in his correction he wrote that the revised estimate based on the new data “is relevant because the benefits of climate policy are correspondingly revised downwards.” Both of these imply policy recommendations. But, sure, these are not necessarily “strong” recommendations; he might be implying in the first case that reduction should err only a little bit on the ambitious side, and in the second case he might be saying that the results are only slightly relevant.

On the other hand, if you really want to object to my equating “relevant” to “strong recommendations” in a sentence regarding “the benefits of climate policy,” you might want to object even more strongly to Tol throwing around non-data-based recommendations in the first place. (For example, the statement “negative surprises are more likely than positive ones” might well be true, but shouldn’t such possible surprises be incorporated into the models? And if the statement, “the benefits of climate policy are correspondingly revised downwards” has any meaning at all, it can only be in context to a strong extrapolation of a quadratic model that seems to come from nowhere.)

Finally, I have nothing to say about the econ101 no-brainer. Tol is the one who brought up policy, not me.

• Martin says:

Andrew,

I am happy that you appreciate disagreements, because I really do disagree. You have the strong policy recommendations in your title, so you should be able to back them up. But all you really have is a rather vague “seems that some policy implications are involved.” If that is all you can actually argue, you should retract the earlier allegations.

The irony is that you make strong allegations without much basis in a piece in which you criticize just that. If what you provide is – by your own admission – “not necessarily “strong” recommendations” (though this has a slight aftertaste of snark, if I am not mistaken), you should not anchor your piece on them being exactly that.

But I even doubt the “recommendations” part. There are three instances in which I see policy relevance (as claimed by Tol), one in the negative: a) whatever initial benefits (of warming) there are (not), they are sunk and thus not policy relevant; b) total damage estimates are bad and uncertainty is huge and skewed to the right, hence we should err (…); c) some policy-relevant implication in the erratum that he shouldn’t have written if there was no place to justify it (if he can indeed do so).

Now I thought perhaps it’s my weak grasp of English, but after checking with Merriam-Webster I am pretty sure that only a) (in the negative) and b) can be counted as recommendations in any meaningful way: a) in that one should not subsidize the emissions of greenhouse gases, as the (non-existing) benefits are sunk, b) in a rather obvious way as a sort of precautionary principle. c) might (or might not) have implications, but there is no recommendation. If so, please point it out to me as if I were stupid, which I might well be. And while you do refer to b) as an example of a policy recommendation, it is so general in its nature that I cannot bring myself to disagree with it: be careful if you don’t know what will happen.

So I think we are down to c). And I think it’s neither a recommendation, nor strong. And it appears that you are not really sure about your topical claim yourself.

As to Tol: his discussion style is highly elliptical, with a blind spot for BS his market-liberal friends are throwing around, and freaking out every time somebody to his left makes a statement about climate policy (especially environmentalists). I have no idea how to discuss with him, and myriads of commenters had to learn the hard way that it might not be possible. Still, I could vent my objections to Tol, of course. But looking at a), b) and c), my problem is that I simply do not disagree with a) or b) (yes, he should try to incorporate uncertainty better, but the simple claim that one should be careful in the face of huge uncertainty skewed to the right is nothing I’ll object to), and c) just is no recommendation. My objection to c) is that he should not have made a claim if he was (and is) unable to back it up. And neither should you.

• Andrew says:

Martin:

When Tol wrote that the revised estimate based on the new data “is relevant because the benefits of climate policy are correspondingly revised downwards,” I consider this a “strong policy recommendation” in that he said his claims were relevant because of their implications for the benefits of policy. You consider this a weak policy recommendation or perhaps no recommendation at all because no specific policy was suggested.

Fair enough: we have different definitions of a “strong policy recommendation,” which indeed is not a precisely defined term. I suppose it depends on the comparison point. I’m implicitly comparing Tol’s papers to the sort of empirical papers that I often see in political science and economics, where the researcher presents a data analysis and leaves it to others to consider policy options, whereas you are implicitly comparing Tol’s papers to the sort of papers that often appear in the climate-change literature, where there can be very specific policy recommendations. From where I’m coming from, Tol’s paper makes strong policy recommendations, while from your perspective, he is offering only vague ones.

• Martin says:

Yes, that makes sense to me.

I am not sure about the part where a “researcher presents a data analysis and leaves it to others to consider policy options.” After all, regarding externalities (and that’s how greenhouse gas emissions are treated in economics) there is a clear decision criterion: tax (subsidize) the economic activity creating the externality so that the tax (subsidy) matches the marginal cost (benefit) of the externality (technically the tax would be a bit lower to account for its own effect, but the point is clear). Whatever implication Tol sees, its eventual translation into a policy recommendation has to go through this kind of CBA framework (if possible – you briefly discussed Weitzman’s Dismal Theorem some time ago, where he shows that it is not possible under certain conditions). Tol (2009) is not an isolated paper in a topical vacuum, after all – and in the context of the relevant literature, it is clear what he is talking about. I think this is relevant, and that Tol’s claims should be judged against the backdrop of their topical context, not against various failures of some people somewhere at some point.
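The decision criterion described here can be sketched in a few lines. This is a textbook linear-market toy with made-up numbers (nothing here is from Tol 2009): each unit of activity causes a constant marginal external damage, and a Pigouvian tax set equal to that damage decentralizes the social optimum.

```python
# Toy Pigouvian-tax sketch with illustrative numbers (not from Tol 2009):
# linear private marginal benefit MB(q) = 10 - q, constant private
# marginal cost MC = 2, constant marginal external damage d per unit.

def private_quantity(tax, mb_intercept=10.0, mb_slope=1.0, mc=2.0):
    """Quantity chosen where private marginal benefit = marginal cost + tax."""
    return (mb_intercept - mc - tax) / mb_slope

def social_optimum(d, mb_intercept=10.0, mb_slope=1.0, mc=2.0):
    """Quantity where marginal benefit = full social marginal cost."""
    return (mb_intercept - mc - d) / mb_slope

d = 3.0                              # marginal external damage per unit
q_laissez_faire = private_quantity(tax=0.0)
q_taxed = private_quantity(tax=d)    # Pigouvian tax = marginal damage
print(q_laissez_faire, q_taxed)      # prints: 8.0 5.0
```

The point of the sketch is just that the tax enters the producer’s first-order condition exactly where the marginal damage enters the planner’s, so the two optima coincide.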

• Andrew says:

Martin:

I’m not judging Tol’s claims against various failures of some people somewhere at some point. I’m saying that Tol’s analysis itself has failures: beyond the obvious errors of the wrong data points, there’s the problem of not recognizing the +2.3 and -11.5 numbers as outliers, there’s the problem of extrapolating beyond the range of the data, and there are deeper conceptual problems with the statistical model itself. This does not necessarily make Tol’s paper useless—no empirical work is perfect—but it has flaws, and these flaws could make a difference in the world, to the extent that Tol’s empirical analysis and policy-relevant statements are taken seriously.
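To make the outlier point concrete, here is a minimal sketch on synthetic data. The +2.3% point below echoes the figure mentioned above, but everything else is made up; this is not Tol’s dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: ~20 (temperature, % welfare impact) estimates.
temp = rng.uniform(0.5, 3.0, 20)
impact = -0.3 * temp**2 + rng.normal(0.0, 0.5, 20)

# Quadratic least-squares fit without and with a single outlier.
coef_clean = np.polyfit(temp, impact, deg=2)

temp_out = np.append(temp, 1.0)      # one extra study at +1 degree...
impact_out = np.append(impact, 2.3)  # ...reporting a +2.3% benefit
coef_out = np.polyfit(temp_out, impact_out, deg=2)

# Compare the two fitted curves at an extrapolated +4 degrees.
pred_clean = np.polyval(coef_clean, 4.0)
pred_out = np.polyval(coef_out, 4.0)
print(f"at +4 deg, fit without outlier: {pred_clean:+.2f}%")
print(f"at +4 deg, fit with outlier:    {pred_out:+.2f}%")
```

A single discrepant study is enough to move all three fitted coefficients, and the discrepancy is amplified once the curve is evaluated outside the data range.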

• Martin says:

Andrew,

I am not sure if these flaws would make a difference when it comes to the stated decision criterion. As I said, all this has to go through a benefit-cost analysis looking at marginal quantities; the decision is based on the “social cost of carbon”, i.e. what is optimized is not any pure cost (such as output or consumption loss), but utility (of consumption).

As far as I understood similar discussions (e.g. between William Nordhaus and Weitzman), the exact curvature of damages does not make much difference for low temperature changes as long as damages are convex (though there are problems, as quadratic-polynomial damages do not capture catastrophic outcomes very well, or at all). As I said in my first comment, the interesting question would be whether the corrections (be it the obvious ones or the mistakes you identify) make any difference when it comes to the social cost of carbon, which is the – and the only – policy-relevant quantity, economically speaking. Sure, Tol made his statement (claim c) above) without looking into this himself (or without having the place to point it out, if he did), so he should not have made it IMO.

Anyway, this is what I mean by the topical context: the decision criterion is based on the social cost of carbon. If one wants to judge whether any corrections in Tol (2009) are important w/r/t policy, one has to look at how this quantity is altered. So when you suppose that you are “implicitly comparing Tol’s papers to the sort of empirical papers that I often see in political science and economics where the researcher presents a data analysis and leaves it to others to consider policy options”, I think this just does not cut it. You are judging in a topical vacuum, and I don’t think that is a useful approach when it is very clear from the literature context how policy recommendations are to be evaluated according to the field.

• Andrew says:

Martin:

I disagree. If none of the actual numbers in Tol’s analysis matter, then sure, nothing matters. But to the extent people take seriously his claims based on extrapolation and a flaky model, then, yeah, that’s a problem.

• Martin says:

Andrew,

Maybe this is a problem, yes, though I do not see it immediately.

To illustrate: one problem would be the elasticity of the marginal utility of consumption, as it enters the discount rate. This would capture that consumption utility is concave (and increasing): a dollar more or less is virtually of no concern to Oprah Winfrey, easily a difference for e.g. a student with limited resources (he could buy a cheap meal for it), and a huge difference for the poorest of the world. Depending on how this parameter is chosen (empirically, from an ethical point of view, etc.) it could easily dominate the calculation, with changes in the monetary damages (in terms of output loss, for example) making little difference. This would just mean that the question of damages is less important than the question of to whom they happen. W/r/t those (erroneous) positive initial-warming benefits: these are, if I understand it correctly, a net quantity. Thus, with concave utility of consumption this can still give a positive social cost of carbon, if the positives accrue mostly to the wealthy while the negative impacts are mostly a burden for the poor. A look at a global map of expected climate change impacts will show that this is indeed the case (correspondingly, SCC estimates for the US alone are much lower than global SCC estimates).

This does not mean that there would be no problem if changes in damages could never make a difference. But everybody would agree with that, and it’s not what I meant: I said it is not clear whether the specific mistakes in Tol (2009) make a difference for small temperature changes (and small changes are the only ones even the most avid defenders of integrated assessment models would be confident in). Maybe they do, but this is not obvious without further ado, and I think that one should evaluate exactly that before drawing any conclusions. If it does not affect the conclusion, this would, I think, not mean that “nothing matters”, but just that in a complex model, not every change in any part translates into a relevant change in the outcome.
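As a rough numeric sketch of how the elasticity parameter can dominate: under the Ramsey rule the consumption discount rate is r = δ + ηg, where η is the elasticity of the marginal utility of consumption. The values below are illustrative only (roughly the span between Stern-style and Nordhaus-style calibrations of η), not taken from any paper discussed here; the pure time-preference rate δ is held fixed so only η varies.

```python
import math

# Ramsey-rule sketch: r = delta + eta * g.
# Parameter values are illustrative, not from Tol (2009).

def ramsey_rate(delta, eta, g):
    """Consumption discount rate under the Ramsey rule."""
    return delta + eta * g

def present_value(damage, years, delta, eta, g):
    """PV of a damage 'years' ahead, continuous discounting."""
    return damage * math.exp(-ramsey_rate(delta, eta, g) * years)

# The same 100 units of damage 100 years out, growth g = 1.3%/yr:
pv_low_eta = present_value(100.0, 100, delta=0.001, eta=1.0, g=0.013)
pv_high_eta = present_value(100.0, 100, delta=0.001, eta=2.0, g=0.013)
print(f"PV with eta = 1: {pv_low_eta:.2f}")   # ~24.66
print(f"PV with eta = 2: {pv_high_eta:.2f}")  # ~6.72
```

Doubling η shrinks the present value of a century-distant damage by a factor of nearly four here, which is the sense in which this one parameter can swamp modest revisions to the damage estimates themselves.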

• Andrew says:

Martin:

Tol wrote that the revised estimate based on the new data “is relevant because the benefits of climate policy are correspondingly revised downwards.” This revision comes only because in his original paper he made the data mistake of getting some points wrong (which led to the analysis mistake of not recognizing his +2.3% as an outlier) and a further analysis mistake of extrapolating that quadratic way beyond the range of his data points. So, again, to the extent that the policy relevance Tol claimed is real, he was drawing conclusions based on multiple errors. This is bad news. And his recent comments regarding the quadratic fit and nonlinearity suggest that he still doesn’t understand the statistical issues. That’s not so terrible: statistics is hard; that’s why we have professional statisticians. But it is a problem if Tol is trying to do things on his own. At some point it’s better to recognize one’s limitations and bring on some collaborators, rather than extrapolating all over the place.
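A quick way to see the extrapolation problem is to bootstrap the quadratic fit and compare the spread of its predictions inside and beyond the data range. The data here are synthetic stand-ins, not Tol’s estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data again (not Tol's): ~20 impact estimates covering
# roughly the 0.5-3 degree range.
temp = rng.uniform(0.5, 3.0, 20)
impact = -0.3 * temp**2 + rng.normal(0.0, 0.5, 20)

def boot_predictions(x_new, n_boot=2000):
    """Bootstrap the quadratic fit and predict at x_new."""
    preds = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(temp), len(temp))
        coef = np.polyfit(temp[idx], impact[idx], deg=2)
        preds[b] = np.polyval(coef, x_new)
    return preds

inside = boot_predictions(2.0)   # inside the data range
outside = boot_predictions(5.0)  # well beyond it
print(f"bootstrap sd at +2 deg: {inside.std():.2f}")
print(f"bootstrap sd at +5 deg: {outside.std():.2f}")
```

The uncertainty in the fitted curve balloons once you leave the range of the observations, which is exactly why conclusions hinging on the extrapolated tail are fragile.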

• Eli Rabett says:

Tol’s work is the basis of policy recommendations made by Bjorn Lomborg’s Copenhagen Consensus, the Breakthrough Institute, and many other such organizations. It is also frequently cited by those close to right-wing groups who are working on affecting policy and in some cases (see Australia) are doing so. Moreover, consider how Tol has integrated this nonsense into the IPCC AR5, where he was a coordinating lead author.

If that is not enough for you, Tol is due to testify at a House hearing on Friday. Stay tuned.

5. Noah Motion says:

I’m reminded of the recent case of Reinhart and Rogoff who introduced major errors in data processing and analysis into an influential paper on macroeconomic policy…. The other problem is that people who are caught out in their mistakes often go on and on about how the mistakes don’t really alter their conclusions.

The data underpinning Professor Piketty’s 577-page tome, which has dominated best-seller lists in recent weeks, contain a series of errors that skew his findings. The FT found mistakes and unexplained entries in his spreadsheets, similar to those which last year undermined the work on public debt and growth of Carmen Reinhart and Kenneth Rogoff….

In his spreadsheets, however, there are transcription errors from the original sources and incorrect formulas. It also appears that some of the data are cherry-picked or constructed without an original source….

“I have no doubt that my historical data series can be improved and will be improved in the future … but I would be very surprised if any of the substantive conclusion about the long-run evolution of wealth distributions was much affected by these improvements,” he said.

• Andrew says:

Damn economists and their spreadsheets! They just don’t seem to take their data very seriously.

• Rahul says:

Is that a jab at Economists or Spreadsheets or both?

• Daniel Gotthardt says:

Ecosheets are the worst! I guess the phonetic similarity between sheet and cheat is not by chance. Spreadsheets spread cheats!

(Disclaimer: I’m not claiming that Richard Tol wanted to cheat, but I can completely understand Andrew’s disdain for spreadsheets.)

• Rahul says:

If only people knew how much critical engineering gets designed on the basis of spreadsheets. :) At least bad EconoSheets (usually) cannot blow up a plant.

• K? O'Rourke says:

> just don’t seem to take their data very seriously

How long must we sing this song?

When I was hired to work on this project in 1984 http://www.amazon.com/Canada-Can-Compete-Management-Industrial/dp/0886450209 it was made very clear to me that my first priority was to make the data analysis fully reproducible. I was told that this was because an American colleague was being considered for a Nobel Prize in Economics until they were caught being unable to revise their work given some new data or reproduce previous calculations. (Their research assistant had left and no one could figure out what they had done.)

Maybe if someone very publicly ends up not getting a Nobel Prize because of sloppy data analysis, the need for this level of care will be more widely understood!

(The analysis was all done in Lotus 123, but all calculations were carried out with macro programs, with data read in from electronic files obtained from the OECD and others, and with any corrections again made by macro programs. When my successor was hired about a year after I left, I was asked not to talk to them until after they had reproduced all my analyses. They were able to do this, and then we met to discuss the programs in more technical depth.)

6. Rahul says:

Andrew says:

The other problem is that people who are caught out in their mistakes often go on and on about how the mistakes don’t really alter their conclusions.

Well, OTOH a mistake indeed may not alter conclusions? It sure is possible, isn’t it? It’s not even unlikely that a mistake doesn’t alter conclusions.

All I’m saying is we need to judge each such claim on its own merits. As such, merely stating that “my mistake doesn’t change my conclusions” isn’t a smoking gun & these denials cannot be damning by themselves.

7. Andrew, thanks for the post, which I really enjoyed.

I came to very similar conclusions in a recent post that might interest you, since it also explores how sensitive the best-fit curve in Tol (2009) was to the outlier study in question (i.e. Tol, 2002): http://stickmanscorral.blogspot.com/2014/04/on-economic-consensus-and-benefits-of.html

Richard also mentions his forthcoming paper, which relies on nonparametric methods to estimate the curve. However, I’ve yet to see a good answer as to why we should expect this to produce meaningful results when the sample size (ignoring its non-random nature for the moment) is only around 20.
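For what it’s worth, the fragility of nonparametric fits at n ≈ 20 is easy to demonstrate with a basic Nadaraya-Watson kernel smoother on synthetic points (again, not the actual impact estimates): the estimated curve depends heavily on the bandwidth, and with 20 points there is little data to pin that choice down.

```python
import numpy as np

rng = np.random.default_rng(2)

# 20 synthetic points (not the actual impact estimates).
x = rng.uniform(0.5, 3.0, 20)
y = -0.3 * x**2 + rng.normal(0.0, 0.5, 20)

def nw_smooth(x0, bandwidth):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    return np.sum(w * y) / np.sum(w)

grid = np.linspace(0.5, 3.0, 11)
narrow = np.array([nw_smooth(g, 0.2) for g in grid])
wide = np.array([nw_smooth(g, 1.0) for g in grid])
print("max disagreement between bandwidths:",
      round(float(np.max(np.abs(narrow - wide))), 2))
```

A narrow bandwidth chases the noise in individual studies while a wide one smooths nearly everything into a constant, and with so few points there is no reliable cross-validation signal to arbitrate between them.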

8. […] had a good discussion of the Tol paper on the blog last week which motivated me to think more about all this. In […]