Easier-to-download graphs of age-adjusted mortality trends by sex, ethnicity, and age group

Jonathan Auerbach and I recently created graphs of smoothed age-adjusted mortality trends from 1999-2014 for:
– 50 states
– men and women
– non-hispanic whites, blacks, and hispanics
– age categories 0-1, 1-4, 5-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84.

We posted about this on the blog and also wrote an article for Slate expressing our frustration with scientific papers and news reports that oversimplified the trends.

Anyway, when Jonathan and I put all our graphs into a file it was really really huge, and I fear this could’ve dissuaded people from downloading it.

Fortunately, Yair found a way to compress the pdf file. It’s now still pretty readable and only takes up 9 megabytes. So go download it and enjoy all the stunning plots. (See above for an example.) Thanks, Yair!

P.S. If you want the images in full resolution, the original document remains here


  1. Dimiter says:

    This calls for a dedicated website where the user can choose the comparisons of interest.

    • Andrew says:


      CDC Wonder, which is where we got these data, is such a website. But Wonder doesn’t do the age adjustment or smoothing or graphing that we do. I guess it could make sense for us, or some group like ours, to partner with CDC so that everyone would have immediate access.

      Soon Jonathan and I will release our paper with the Stan and R code so that anyone can make their own graphs by just downloading the data and following our steps.

  2. Jonathan says:

    What is your next step here? Just curious more than anything. You have posted a refutation to the new Deaton and Case paper. But are you going to go any further than that?

    • Andrew says:


      I’m not sure. If Jonathan and I worked at the CDC, I guess we’d want to build a widget so that anyone could do this. I guess one idea would be to try to get funding to do such a thing. It’s not clear how to market this as a traditional publication: you can’t very well just publish an article with 2 pages of text and 60 pages of graphs. What I’d really hope is to get someone interested who knows more on the substantive end. In this sort of project I’m more comfortable as a helper than a leader. One way to frame the project from a statistical perspective is in terms of model evaluation: if the smoothed graphs are better than the raw graphs (and I think they are), how can we show this? One approach is leave-one-out cross-validation. In any case we’re planning a longer report that describes the methods and includes our R and Stan code. We made these graphs months ago and just threw together the above-linked report in a hurry, as we had heard the new Case and Deaton paper was coming out. (No, Case and Deaton didn’t run the paper by me ahead of time for comments!)

      • Nick Menzies says:

        For model evaluation: couldn’t you set up the prediction for the next year’s data (currently unreleased)? I am not sure how one would produce the ‘raw’ prediction, though the same applies to LOO-CV. If ‘raw’ implies ‘no data generating mechanism specified’ then I am not sure how you would proceed.
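To make the leave-one-out idea concrete, here is a minimal sketch rather than the authors' method. Since a "raw" prediction is not well defined, as noted above, the sketch assumes a stand-in: copy the value from the nearest remaining year. The "smoothed" predictor is a simple Gaussian kernel average, not the Stan model. The yearly rates are simulated and every name here is hypothetical.

```python
import math
import random


def kernel_smooth(xs, ys, x0, bandwidth=2.0):
    """Gaussian-kernel weighted average of ys, evaluated at x0."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def nearest_neighbor(xs, ys, x0):
    """Stand-in for a 'raw' prediction: reuse the closest remaining year."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def loo_mse(xs, ys, predict):
    """Hold out each year in turn, predict it from the rest, average squared error."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errors.append((predict(train_x, train_y, xs[i]) - ys[i]) ** 2)
    return sum(errors) / len(errors)


# Simulated data (an assumption): a slow linear decline plus noise.
random.seed(0)
years = list(range(1999, 2015))
rates = [400 - 3 * (y - 1999) + random.gauss(0, 15) for y in years]

mse_raw = loo_mse(years, rates, nearest_neighbor)
mse_smooth = loo_mse(years, rates, kernel_smooth)
print("LOO MSE, nearest-year stand-in:", round(mse_raw, 1))
print("LOO MSE, kernel smoother:", round(mse_smooth, 1))
```

Which predictor wins depends on the noise level, the bandwidth, and how "raw" is defined, so the comparison is only as meaningful as those choices.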

      • Lynne says:

        Andrew: For your paper, I just wanted to mention that analytically it is actually stronger to control for SES before looking at differences by race. (I believe the Case-Deaton original paper did not look at differences by SES within categories other than white.) That would also help with SES differences by state, and any widening by SES that has occurred over time at these ages. Also, if using education categories, it is better to separate < high school from the high school graduates. In the original Case-Deaton they put those groups together, which was very odd, as you would expect more selection in < high school, and it has been shrinking the most over time. You also see differences between those two groups on risk of death in existing published mortality papers. This is old and incomplete, but has some brief discussion of the literature on this.

        A flaw in the above is that it doesn't control for the changing percentiles in the education categories before running the regressions. See Sam Preston and Irma Elo. 1995. "Are Educational Differentials in Adult Mortality Increasing in the US?" Journal of Aging and Health 7(4):475-496 for the use of an education index.

      • Eli Rabett says:

        Don’t see why it could not be published as a comment with the figures as supplementary materials

        • Andrew says:


          I’m sure it can be published in some way but you need a hook. You can’t just say: We downloaded a bunch of data from the CDC website and made these graphs. Different sorts of hooks are possible, for example:
          1. Discovery. We claim we found something new in the data that nobody saw before.
          2. Explanation. We tell a causal story by mashing up the mortality data with some other trends.
          3. Graphics. We create improved graphical displays.
          4. Prediction. We use a smoothing method that gives better predictions.

          Case and Deaton went for 1 and 2. Jonathan and I would have to go for some mixture of 3 and 4. We’re planning to do this, just haven’t done it yet because it’s not one of our high-priority projects.

  3. Steve Sailer says:

    Thanks. Very helpful.

    It’s interesting to use these state-by-state figures to speculate about the possible causes of the White Death (female version).

    One possibility is that it’s related to the coal mining business and/or culture, as seen in the East South Central.

    Or maybe it is related to resorting of people, with skinnier, healthier people moving out of, say, West Virginia to, say, Colorado.

    Another possibility is that it’s related to general prosperity. For example, the Upper Plains have had a surprisingly good 21st Century, while the Carolinas got hammered by the popping of the Housing Bubble.

    Another idea is that something going on here is related to ethnicity or religion. Perhaps Protestants or Scots/Irish or something like that are in a state of moral, cultural, or economic decay, while Catholics and Jews are less so.

    Another possibility is to relate this to smoking. Perhaps people who would have self-medicated their problems with tobacco are less likely to do so now, and thus turn to Oxy and smack for relief.

  4. Steve Sailer says:

    A graphical suggestion: making the vertical axis scale constant across all graphs could be helpful in immediately showing which regions are healthier.

    Another idea would be that the state names, which could be abbreviated to the usual two letter postal codes, should appear to the right of the lines to make them a little more quickly readable.

    The graphs could be made taller than they are wide (the opposite of what they are now) to make them more affirmative about ups and downs in mortality trends.

  5. Steve Sailer says:

    Dr. Gelman’s mortality trend graphs for white women at younger ages are really scary:

    • Andrew says:


      I followed your link.

      You write, “Another methodological issue is that looking at a ten year long cohorts leads to issues involving changes in average age. Perhaps the average age of 55-64 year olds in state X in 1999 was 50, while in 2014 it was 51, which would raise the mortality rate all by itself.”

      This would indeed be a concern had we been looking at trends in raw data (indeed, this was the mistake that Case and Deaton made in their original paper, which resulted in the thousands of incorrect headlines saying that the death rate was going up among middle-aged whites). But it’s not a concern in our graphs because we age-adjusted. We got the data for each age category and then reweighted before doing the analysis of the age bins.
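A small sketch of why reweighting matters, using made-up numbers and equal standard weights (both are assumptions, not the weights or data used for the graphs). The age-specific death rates are identical in both years and only the age mix within the bin shifts older. The crude rate rises anyway, while the reweighted rate stays flat.

```python
def crude_rate(deaths, population):
    """Deaths per 100,000 for the whole bin, ignoring its age structure."""
    return 100_000 * sum(deaths) / sum(population)


def age_adjusted_rate(deaths, population, standard_weights):
    """Weighted mean of age-specific rates, using fixed standard weights."""
    total = sum(standard_weights)
    return 100_000 * sum(
        (w / total) * (d / p)
        for d, p, w in zip(deaths, population, standard_weights)
    )


# Hypothetical bin with two ages. Age-specific rates are 0.005 and 0.010
# in both years; only the population mix changes.
pop_1999 = [60_000, 40_000]
deaths_1999 = [300, 400]
pop_2014 = [40_000, 60_000]
deaths_2014 = [200, 600]

weights = [0.5, 0.5]  # assumed standard population: equal weight per age

crude_1999 = crude_rate(deaths_1999, pop_1999)
crude_2014 = crude_rate(deaths_2014, pop_2014)
adj_1999 = age_adjusted_rate(deaths_1999, pop_1999, weights)
adj_2014 = age_adjusted_rate(deaths_2014, pop_2014, weights)

print("crude:", crude_1999, "->", crude_2014)
print("age-adjusted:", adj_1999, "->", adj_2014)
```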

  6. oncodoc says:

    I am also concerned about drawing large conclusions from data derived by chopping and bundling parts of large data sets. Like many, I can see all sorts of fantastic creatures when I look at clouds. We do tend to “connect the dots” even when there is really just a shotgun blast. About 25 years ago I heard a talk about a study by Sir Richard Peto; when the inevitable question about subsets came he responded that he analyzed the study by horoscope and found lots of benefit for Aquarius but not for Aries.
    This doesn’t mean the data presented by Case and Deaton is wrong. However, we should use it to devise hypotheses not conclusions.

    • Rahul says:


      This is what I mean by the risk of finding artifacts when one systematically does data mining. You are bound to find *some* trend, some feature.

      Isn’t this the analog of the garden of forking paths?

      • Andrew says:


        There’s nothing wrong with forking paths. As I discuss in my articles on the topic, the appropriate thing is to analyze all the data and look at all the paths. The problem with forking paths arises when a researcher such as Daryl Bem selects and presents only one path. As you can see in my posts, I’ve been careful to show all the graphs for all the groups. The Case and Deaton papers are more frustrating because they will sometimes pull out just one subgroup and start telling a story. I present all the graphs as a counter to that.

    • Keith O’Rourke says:


      But then Sir Richard ended up sitting on a plane beside Ronald Reagan’s wife’s astrologer who was convinced the benefit for Aquarius but not for Aries made complete sense (or so Richard said).

      Motivated and reflective pattern searches will eventually prove profitable while unmotivated and opportunistic ones won’t (both could succeed or fail in a given case but on average the first will succeed more).

      There is no proof for this, it’s just a hope so that we don’t give up trying to learn from observations.

  7. jrg says:

    Hi Andrew,

    One thing to consider: a log-scale on rates. Demographers tend to plot age-specific mortality rates on the log-scale for at least the following reasons.

    1) The long-term secular trend over the last 100 years has been fairly steady proportional change

    2) The effect of covariates tends to be proportional (not just for the modeling convenience of proportional hazards but also empirically). Thus the log-scale helps us see more clearly the possible variation caused by differences in health-production inputs.

    3) Proportional changes in mortality hazards create — approximately — linear changes in life expectancy. Although we care about the absolute risk of dying, we care at least as much about length of future life.

    4) It allows us to see rates of different ages on the same graph, compare trends in states with different mortality levels, etc.

    To me a remarkable feature of your state plots is that the lower mortality states (e.g., Wisconsin) are making faster proportional progress than the higher mortality states (e.g., Indiana). This would be more visible in a log plot.

    Of course, there’s no one right way to do this. But you’re going to see different patterns in logs than in the absolute scale — and they’re worth considering.
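A quick illustration of the difference between the two scales, using two invented states (the starting rates and annual declines are assumptions, not values read off the graphs). The low-mortality state declines faster in proportional terms, yet the high-mortality state shows the larger absolute drop, so the two plots would tell different stories.

```python
import math


def series(start, annual_change, years):
    """Rates under a constant proportional change per year."""
    return [start * (1 + annual_change) ** t for t in range(years)]


def absolute_drop(rates):
    """Change visible on the raw scale: first value minus last."""
    return rates[0] - rates[-1]


def log_slope(rates):
    """Average change per year in log(rate), i.e. the slope on a log plot."""
    return (math.log(rates[-1]) - math.log(rates[0])) / (len(rates) - 1)


# Hypothetical states over 16 years (1999-2014).
low = series(300.0, -0.02, 16)   # low mortality, falling 2% a year
high = series(600.0, -0.01, 16)  # high mortality, falling 1% a year

print("absolute drop, low vs high:", absolute_drop(low), absolute_drop(high))
print("log slope, low vs high:", log_slope(low), log_slope(high))
```

On the raw scale the high-mortality line falls further. On the log scale each line is straight and the low-mortality line is about twice as steep.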

    A second point — I recommend to all working in this area the papers by Currie and Schwandt using county-level data:

    — Mortality inequality: the good news from a county-level approach
    J Currie, H Schwandt
    The Journal of Economic Perspectives 30 (2), 29-52
    — Inequality in mortality decreased among the young while increasing for older adults, 1990–2010
    J Currie, H Schwandt
    Science 352 (6286), 708-712
