Cumulative residual plots seem like they could be useful

Peter Vanney, a statistician at Texas Highway Patrol, writes:

I’m wondering if you could comment on CURE (CUmulative REsidual) plots that I’m seeing quite a bit in vehicle crash modeling. Ezra Hauer and Joseph Bamfo champion them as a way to determine model fit for their hierarchical Bayesian generalized linear mixed models. I had not seen them until I started reading some papers published by the Texas A&M Transportation Institute about modeling crashes. They seem dubious to me, but I can’t quite determine why I don’t like them. Would love to get your thoughts on the topic.

Here’s the book Ezra Hauer wrote.

And here’s a paper in which they reference these plots.

To me [Vanney], they seem almost like a mixture of a Q-Q plot and a marginalized residual scatterplot, but since all of their CURE plots seem to show a systematic bias in their models, I don’t understand why they are using them (except maybe for model analysis only). Also, maybe I’m old-fashioned, but I like to see trace plots and some plots of posterior distributions in papers.

Wow! The Texas Highway Patrol. How cool is that? I had no idea this blog had such global reach.

Anyway, my reply is that these cumulative residual plots don’t seem so bad. Cumulating is a kind of smoothing. I’ve used binned residual plots, and I could see how cumulative plots could work too, especially in an area such as highway safety where there’s a natural dimension for the smoothing/binning/summing.
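
To make the comparison with binned residuals concrete, here is a minimal Python sketch. It is not taken from Hauer's book or the linked paper. It sorts the residuals by a covariate (say, traffic volume), sums them cumulatively, and computes binned averages of the same residuals. The function names are hypothetical, and the ±2σ* band uses the form commonly attributed to Hauer and Bamfo, which should be treated as an assumption here rather than a verified reproduction of their method.

```python
import numpy as np


def cure_curve(x, observed, fitted):
    """Cumulative residuals ordered by covariate x, with an approximate band.

    Assumed band (form commonly attributed to Hauer and Bamfo):
        sigma*(n)^2 = S(n) * (1 - S(n) / S(N)),
    where S(n) is the running sum of squared residuals.
    """
    x = np.asarray(x, dtype=float)
    resid = np.asarray(observed, dtype=float) - np.asarray(fitted, dtype=float)

    # Order everything by the covariate: this is the "natural dimension".
    order = np.argsort(x, kind="stable")
    x_sorted = x[order]
    resid_sorted = resid[order]

    cum_resid = np.cumsum(resid_sorted)
    cum_sq = np.cumsum(resid_sorted ** 2)
    total_sq = cum_sq[-1]

    if total_sq == 0.0:
        # Perfect fit: no residual variation, so the band collapses to zero.
        band = np.zeros_like(cum_resid)
    else:
        inner = np.clip(cum_sq * (1.0 - cum_sq / total_sq), 0.0, None)
        band = 2.0 * np.sqrt(inner)  # plot as +band and -band

    return x_sorted, cum_resid, band


def binned_residuals(x, observed, fitted, n_bins=10):
    """Average residual within equal-count bins of x, for comparison."""
    x = np.asarray(x, dtype=float)
    resid = np.asarray(observed, dtype=float) - np.asarray(fitted, dtype=float)

    order = np.argsort(x, kind="stable")
    x_sorted = x[order]
    resid_sorted = resid[order]

    # Split the sorted indices into n_bins groups of nearly equal size.
    groups = np.array_split(np.arange(len(resid_sorted)), n_bins)
    groups = [g for g in groups if len(g) > 0]

    centers = np.array([x_sorted[g].mean() for g in groups])
    means = np.array([resid_sorted[g].mean() for g in groups])
    return centers, means
```

Plotting `cum_resid` against `x_sorted` along with `+band` and `-band` gives the CURE plot; plotting `means` against `centers` gives the binned residual plot. Under this construction, a cumulative curve that drifts steadily in one direction over a range of x corresponds to a run of same-signed binned averages over that range.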

11 thoughts on “Cumulative residual plots seem like they could be useful”

  1. Cumulative residual plots are very useful with time series (or panel) data, where one is checking whether the model is failing to capture some element of variation through time.

    • Thanks John. Because neither axis is labeled, I’m confused about what exactly is being shown. Are they comparing model residuals through the MCMC iterations? Or is the x-axis one of the explanatory variables?

  2. From my work in finance, I’ve come to realize that when looking at plots of cumulative sums of stock returns, or of cumulative sums of any white noise process, it is very easy to trick yourself into thinking you see something that’s not really there (or at least won’t generalize to new data). That being said, they can be a good tool for finding places to look for model inadequacy.

    • Hi Bryan,

      So if one of the model’s residuals shot off into the ether, I would be able to see it there. Is there something gained by looking at the cumulative residuals compared to a residual scatter plot?

  3. By the way, a nod to you is the top trending comment on PubMed Commons at the moment:

    “To the Editors,

    ………

    The authors have made a common but serious error of inference by neglecting that ‘Difference in Nominal Significance is not a Significant Difference’ as described elsewhere (1,2).

    The conclusion offered by the authors is not supported by the data and should be corrected.

    Sincerely,

    David B. Allison, PhD
    University of Alabama at Birmingham”

    https://www.ncbi.nlm.nih.gov/pubmed/28759403#cm28759403_73468

  4. Andrew, I hate to have to correct you, but the THP is actually located in America, although I am sure you have a global reach :)

    On a more topical note, how does this differ from looking at a simple residual plot? Wouldn’t the same non-random form take shape and illustrate a problem with the fit?

    • Not being a native Texan, but having lived in Texas more than half my life (and being a U.S. citizen all my life), it still sometimes seems like Texas is de facto a country all its own (and a lot of the state’s legislators seem to think that, too!)

      • A banal (but good because I am from Oklahoma) joke I once heard: A Texan gets away from the bustle of the city one night to go camping, looks up at the beauty of the stars and the moon and says, “ahhhh, only in Texas.”
