Mortality rate trends by age, ethnicity, sex, and state (link fixed)

There continues to be a lot of discussion on the purported increase in mortality rates among middle-aged white people in America.

Actually, there has been an increase among women and not much change among men, but you don’t hear so much about this, as it contradicts the “struggling white men” story that we hear so much about in the news media.

A big fat pile of graphs

To move things along, Jonathan Auerbach and I prepared a massive document (zipped file here; still huge) with 60 pages of graphs, showing raw data and smoothed trends in age-adjusted mortality rate from 1999-2014 for:
– 50 states
– men and women
– non-Hispanic whites, blacks, and Hispanics
– age categories 0-1, 1-4, 5-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84.
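
For readers unsure what “age-adjusted” means: this post does not spell out the adjustment method, so the following is only a minimal sketch assuming direct standardization, in which age-specific rates are averaged with fixed reference weights. The function name, the two age bands, and all of the counts are made up for illustration and are not taken from the mortality data.

```python
def age_adjusted_rate(deaths, population, standard_weights):
    """Direct standardization: a weighted average of age-specific death rates.

    deaths, population: counts per age band, in the same order.
    standard_weights: share of a fixed reference population in each band.
    Returns deaths per 100,000.
    """
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs need one entry per age band")
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, n, w in zip(deaths, population, standard_weights):
        rate += (w / total_weight) * (d / n)
    return 1e5 * rate


# Hypothetical counts: two age bands inside one age category, in two years.
# The age-specific rates are identical in both years; only the age mix shifts.
weights = [0.5, 0.5]
rate_1999 = age_adjusted_rate([300, 500], [100_000, 100_000], weights)
rate_2014 = age_adjusted_rate([240, 750], [80_000, 150_000], weights)

# Crude rates ignore the age mix: total deaths over total population.
crude_1999 = 1e5 * (300 + 500) / (100_000 + 100_000)
crude_2014 = 1e5 * (240 + 750) / (80_000 + 150_000)

print(f"age-adjusted: {rate_1999:.1f} -> {rate_2014:.1f}")
print(f"crude:        {crude_1999:.1f} -> {crude_2014:.1f}")
```

In this made-up case the crude rate rises from 400 to about 430 per 100,000 purely because the group got older, while the age-adjusted rate stays at 400.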

It’s amazing how much you can learn by staring at these graphs.

For example, these trends are pretty much the same in all 50 states:

But look at these:

Flat in some states, sharp increases in others, and steady decreases in other states.

The patterns are even clearer here:

Starting point

To get back to the question that got everything started, here’s the story for non-Hispanic white men and women, aged 45-54:

Different things are happening in different regions (in particular, things have been getting worse for women in the south and midwest, whereas the death rate of men in this age group has been declining during the past few years), but overall there has been little change since 1999. In contrast, as Anne Case and Angus Deaton noticed a bit over a year ago, other countries and U.S. nonwhites have seen large declines in death rates, something like 20%.

Breaking down trends by education: it’s tricky

In a forthcoming paper, “Mortality and morbidity in the 21st century,” Case and Deaton report big differences in trends among whites with high and low levels of education: “mortality is rising for those without, and falling for those with, a college degree.”

But the comparison of death rates by education is tricky because average education levels have been increasing over time. There’s a paper from 2015 on this topic, “Measuring Recent Apparent Declines In Longevity: The Role Of Increasing Educational Attainment,” by John Bound, Arline Geronimus, Javier Rodriguez, and Timothy Waidmann, who write:

Independent researchers have reported an alarming decline in life expectancy after 1990 among US non-Hispanic whites with less than a high school education. However, US educational attainment rose dramatically during the twentieth century; thus, focusing on changes in mortality rates of those not completing high school means looking at a different, shrinking, and increasingly vulnerable segment of the population in each year.
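
The selection argument in that passage can be made concrete with a toy calculation. This is a sketch under loud assumptions, none of them from Bound et al.: every person has a fixed death risk that falls linearly with their rank in the population, and the people without a diploma are always the lowest-ranked share. The function name and the risk range are hypothetical.

```python
def mean_risk_of_bottom_share(share, n=10_000):
    """Average annual death risk among the lowest-ranked `share` of n people.

    Assumption (hypothetical): person i has a fixed risk that falls linearly
    with rank, from 2% at the bottom to 0.5% at the top, and the group
    without a diploma is the lowest-ranked `share` of the population.
    """
    risks = [0.02 - 0.015 * i / (n - 1) for i in range(n)]
    k = max(1, round(share * n))
    return sum(risks[:k]) / k


# The whole population's risk never changes; only the group boundary moves.
overall = mean_risk_of_bottom_share(1.0)
for share in (0.40, 0.20, 0.10):
    group = mean_risk_of_bottom_share(share)
    print(f"no-diploma share {share:.0%}: group risk {group:.3%}, overall {overall:.3%}")
```

Nobody's risk changes in this setup, yet the measured death rate of the no-diploma group rises as the group shrinks from 40% to 10% of the population, which is the pattern the quoted passage warns about.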

Breaking down trends by state

In my paper with Jonathan Auerbach, we found big differences in different regions of the country. We followed up and estimated mortality rates by state and by age group, and there are tons of interesting patterns. Again, our latest graph dump is here (zipped file here), and you can look through the graphs yourself to see what you see. Next step is to build some sort of open-ended tool to use Stan to do smoothing for arbitrary slices of these data. Also there are selection issues as people move between states, which is similar but not identical to selection issues regarding education.

Message to journalists

Case and Deaton found some interesting patterns. They got the ball rolling. Read their paper, talk with them, get their perspective. Then talk with other experts: demographers, actuaries, public health experts. Talk with John Bound, Arline Geronimus, Javier Rodriguez, and Timothy Waidmann, who specifically raised concerns about comparisons of time series broken down by education. Talk with Chris Schmid, author of the paper, “Increased mortality for white middle-aged Americans not fully explained by causes suggested.” You don’t need to talk with me—I’m just a number cruncher and claim no expertise on causes of death. But click on the link, wait 20 minutes for it to download and take a look at our smoothed mortality rate trends by state. There’s lots and lots there, much more than can be captured in any simple story.

47 thoughts on “Mortality rate trends by age, ethnicity, sex, and state (link fixed)”

    • I just checked the paper really quickly, but couldn’t find any mention of the data source (I am assuming CDC Wonder again…) or the coding scheme used (I am assuming ICD-10). I was wondering what ICD-10 codes they use for “overdose”, because in the earlier paper a lot of the “poisonings” were from prescription pain and blood pressure meds (although they made it seem like it was recreational by linking with alcohol). Anyway, hopefully they can include a real methods section eventually.

      *https://www.brookings.edu/wp-content/uploads/2017/03/6_casedeaton.pdf

  1. I wonder if we’ll be able to dig out much useful information from this data. By useful I mean actionable. Fifty different states, two genders, three social classes, and several levels of educational attainment gives us so many boxes that I fear a thousand hypotheses will find some statistical support. Reading this blog regularly has made me very leery of conclusions drawn from large, inherently uncontrollable aggregates. Looking just at the Middle Atlantic States map, we see that Pennsylvania looks very different than the other two states. I doubt that crossing the Delaware has that much impact. The map of Pennsylvania contains a very diverse topography. What are the trends in Philadelphia versus Pittsburgh, Philadelphia versus Bryn Mawr, or Philadelphia African-Americans versus the Pennsylvania Amish? I can ask similar questions about every state.
    Mortality trends are obviously important to study. I hope that people studying this issue are alert to the problems with large data sets that Professor Gelman warns about.

    • Govt policies can have big changes across essentially arbitrary boundaries like the Delaware, so it is possible that the changes you saw are real.

      In order to make sense out of these things I believe you need substantive causal hypotheses about how they work. For example, I looked recently at suicide data since the mid 1990’s, and found that there was a noticeable kink in white male and female suicides around the early 2000s. My sister who works at the VA suggested the issue of PTSD suicides in military veterans post Iraq + Afghanistan conflicts. Suicide in military vets is something the military has been actively trying to fight with new policies, and that suggests that they are seeing it as an increasing problem, so this seems plausible. The next step there is to figure out what pattern you expect if suicide increase is being driven largely by military veterans. So for example you expect increases among people age say 20-35 not among 55-70 or whatever. But war vets is probably not the only group with issues. For example, economic conditions might weigh heavily on a certain population (the “increasingly vulnerable” low-education-attainment group mentioned for example). So, looking at measures of economic stress among certain groups and then looking at demographics of those groups in different states, and then fitting models along those lines could be valuable.

      Doing a careful job with this kind of data is valuable, doing a p less than 0.05 drive-by + publication in PNAS, not so much.

      Any careful job really requires multiple passes: look at the big dataset with an eye towards whittling down the explanations, then look for corroboration in alternative datasets, then collect survey data specifically to address your hypotheses, etc. Declaring things done after one brief pass with an electronic stat-o-rama meter, not so much.

      • But why would you try to look at aggregate data and then try to guess which were military suicides using your blunt demographic filter?

        Isn’t the war vet data directly available?

        • I mentioned it just as an example of how you might begin to start explaining causality in things you see in aggregate data…

          One assumes war vet data would be available somewhere to somebody, but it’s not an area I have familiarity with. Lots of these kinds of datasets are locked up for privacy reasons. Even CDC won’t give you any aggregated data where the number of cases falls below 10 in any bin.

          As I say, a careful job requires multiple passes, and access to the appropriate data.

        • At least supposedly they have a standardized reporting system now:

          http://t2health.dcoe.mil/programs/dodser

          there are PDFs with reports, but I didn’t see a link to databases. As I say, it’s not an area where I’m familiar with the data sources. CDC Wonder is pretty easy to use at least and I’m sure that’s where most studies of mortality get their data.

          It’s shocking to me how much data there is available out there, but also how difficult it is to actually work with, with all the non-standard definitions of the tables and the survey panels with different questions each year, and the fixed width records or csv files or only SAS/STATA datasets or whatever. It’s a mess out there.

          Pretty sure if you put up a $1M prize for the creation of a single disk image containing an ext4 file system with MySQL tables having all the major Census, BLS, CDC, and IRS datasets at the microdata level in a common format… you’d dramatically improve the usefulness of the billions of dollars of public data available out there for essentially zero cost (as a fraction of what it took to collect the data).

  2. Yes, I find the segmentation by education impossible to interpret. First, it is highly correlated with age, the other primary variable of interest. Second, for the younger people, there is a missing data issue: we don’t know if they will eventually go to college, get a graduate degree, do postgraduate work, etc. Third, as you mentioned, there is the selection issue masking other variables. In any case, I cannot understand the economists who advise giving out more college degrees: if everyone had a college degree, I doubt that these mortality trends would reverse.

  3. This post just leaves me with more questions than answers:
    (1) What do we know versus what do we not know?
    (2) Ideally how do we get to what we WANT to know (i.e. data and methods)?
    (3) What is this back and forth really doing to help?

    I say the above as a researcher who wants to actually identify what the next great step on this topic is. At present I am completely lost!!!

    • +1

      What’s next? I see a lot of armchair debate on faceted graphs like these and people will even fit hierarchical models on top.

      But is there ever actionable intelligence? Any case studies where such an analysis led to a change in policy?

  4. C’mon, there are some pretty clear answers here. “Conservative” (and from the graphs, those are indeed scare quotes) government is killing women and in the more progressive states they live longer. There are only a couple of states that break that trend, HI and RI, but that may be connected to the economic unpleasantness in 2008/9.

  5. There is another reason that analysis of death rates by education is tricky. The data for the numerator and denominator comes from different sources. For deaths, it is from the death certificate, as reported by next of kin; for population data it comes from the census, and is more likely to be self-reported. In the past (15+ years ago), I have declined to report death rates by education because I thought the patterns didn’t look reasonable. I looked at death rates for a set of cancers that are thought not to be related to socio-economic status or lifestyle factors and found that, according to the death certificate and census data, high school grads had much higher death rates than people with either less or more education (I age-adjusted the death rates using 5-year age groups). The group with “some college” education (i.e. not college grads) had the lowest death rates.
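
    One way such a pattern could arise, sketched with made-up numbers: suppose every education group has the same true death rate and the census denominators are accurate, but next of kin record some decedents as high school graduates on the death certificate. The misreporting shares, population counts, and function name below are hypothetical, chosen only to illustrate the numerator/denominator mismatch.

```python
def reported_rates(population, true_rate, misreport):
    """Death rates per 100,000 when deaths and population come from different sources.

    population: census count per education group (assumed accurate here).
    true_rate: one true death rate applied to every group.
    misreport: {(true_group, reported_group): share of that group's deaths
                recorded under the wrong label on the death certificate}.
    """
    deaths = {group: n * true_rate for group, n in population.items()}
    reported = dict(deaths)
    for (true_group, reported_group), share in misreport.items():
        moved = deaths[true_group] * share
        reported[true_group] -= moved
        reported[reported_group] += moved
    return {group: 1e5 * reported[group] / population[group] for group in population}


# Hypothetical populations and misreporting shares.
population = {
    "less than HS": 1_000_000,
    "HS grad": 3_000_000,
    "some college": 2_000_000,
    "college grad": 2_000_000,
}
misreport = {
    ("some college", "HS grad"): 0.20,
    ("less than HS", "HS grad"): 0.10,
}

rates = reported_rates(population, true_rate=0.004, misreport=misreport)
for group, rate in rates.items():
    print(f"{group:>13}: {rate:6.1f} per 100,000")
```

    With these invented shares, high school grads show the highest rate and the “some college” group the lowest, even though the true rate is identical everywhere.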

  6. First, is Jonathan the most popular name for commenters?

    Fascinating graphs. You mention moves between states, how sensitive is this: how much numerical change in the women population of the South or various states within it would be required to shift the curve meaningfully? To shift it to match other regions or to shift it to somewhere close enough where you say it could be chance?

    My general prior, for what little that is worth, is women are getting heavier, that white women are not only getting heavier but also tend to be under-diagnosed and treated for heart disease, that there is some component related to smaller town/more rural populations (which could be seen perhaps in asthma treatment) and that improvements among non-white women reflect improvement from worse initial conditions. I would believe as well that all people outside cities tend to be less well served medically but that could be seen more clearly in women. It’s a feature of US life that really poor areas now have more medical coverage – coming up from a low base – but many less poor areas have not improved in medical care and are now worse, allowing of course for overall improvements in medical standards and procedures. Example that might be trackable: I think statins are prescribed less for women and I would expect less for women in the South than for women in the Northeast.

  7. Do you have a variable for “born in America”? I would have thought white people in the middle states would be mostly American-born for several generations but that white people in the coastal states particularly California, New York and Florida would be more likely to be non-American born. (And do non-citizens count?)

    It would also be interesting to look by partnered and un-partnered. My understanding is that there has been a rise in un-partnered women (especially never married) which means they have more ability to move to somewhere more prosperous. And it’s certainly easier for the healthy to move than the unhealthy.

  8. > Next step is to build some sort of open-ended tool to use Stan to do smoothing for arbitrary slices of these data.

    Actual next step is to emphasize that we already have had such a tool for a while: It is the stan_gamm4() function in the rstanarm package and you can estimate GAMs (for non-linear trends over time) with largely lme4-style grouping.

      • +1 for the use of GAMs in rstanarm. I’ve been making liberal use of them at work, especially with the ability to quickly specify a gaussian process basis for one-off models/visualizations.

        Jonathan: GAMs let you deal with the non-linear trends in the data or other decompositions (slow vs fast vs periodic trends) that might be annoyingly difficult to deal with otherwise. Whether one should or shouldn’t use a GAM as evidence for a given causal explanation is a separate matter.
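
        For readers without R at hand, the idea of smoothing a yearly rate series can be sketched in plain Python. This is not stan_gamm4() and not a GAM: it is a much simpler penalized least-squares smoother (a first-difference penalty, solved with the tridiagonal Thomas algorithm), swapped in only to show what trading fit against smoothness looks like. The function name, the penalty value, and the 16 yearly rates are made up.

```python
def smooth_series(y, lam=5.0):
    """Minimize sum((y[t] - z[t])**2) + lam * sum((z[t+1] - z[t])**2) over z.

    The normal equations (I + lam * D'D) z = y are tridiagonal, where D takes
    first differences, so they are solved directly with the Thomas algorithm.
    """
    n = len(y)
    if n < 2:
        return list(y)
    diag = [1.0 + 2.0 * lam] * n
    diag[0] = diag[-1] = 1.0 + lam   # end points have only one neighbour
    off = -lam                       # constant sub- and super-diagonal

    # Forward elimination.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = y[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (y[i] - off * d[i - 1]) / denom

    # Back substitution.
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


# Hypothetical deaths per 100,000 for 1999-2014 (not real data).
rates = [402, 398, 410, 405, 415, 409, 420, 418,
         425, 431, 426, 436, 440, 437, 446, 450]
smoothed = smooth_series(rates, lam=5.0)
for year, raw, fit in zip(range(1999, 2015), rates, smoothed):
    print(year, raw, round(fit, 1))
```

        A larger lam gives a flatter curve and a smaller one follows the raw points more closely. Unlike a Bayesian GAM, this sketch gives no uncertainty intervals and no pooling across states or groups.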

  9. I think it would be better to rearrange panels like this:
    west-north-central east-north-central new-england
    pacific mountain mid-atlantic
    west-south-central east-south-central south-atlantic

    This is far from exact geography, but has at least a very vague resemblance
    to the map that everyone knows by heart. Alternatively, the arrangement can be done
    by some other principle, maybe socioeconomic, urban/rural, percent of people
    engaged in outdoor activities etc. The 3×3 grid itself should have some meaning.

    • As long as we’re commenting on geographic groupings, I was quite confused by seeing a bunch of states in the central time zone in the “west” category. Of course, I’ve always lived west of those states.

  10. The intergroup comparison of graphs is made more difficult by the lack of a common scale.

    In order to visually compare rates of change between small multiple graphs, a standard lower right point and scale are needed.

  11. As someone who would have a hard time distinguishing esophagus from colon, I want to suggest looking at these graphs as a function of the cause of death. But maybe it was already suggested in previous threads on this topic…

  12. Are more white males dying in their forties as fewer are dying at a younger age? Could this be indicated by the overall life expectancy of white males rising? If more are dying in their forties recently, wouldn’t the overall life expectancy be dropping?

  13. The breakdown by state leaves me rather frustrated. Comparing heavily rural states to heavily metropolitan states strongly suggests that if we compared white middle-aged women in rural areas to those in metro areas, a wide gap would be visible.

    I really was hoping the data dump would have the trends split along rural/metro lines. Sometimes looking at data at the state level is clarifying. But as American politics (and society) have become increasingly split along a rural/metro divide, state-level analysis may obscure rather than clarify the underlying trend.
