Some clues that this study has big big problems

Paul Alper writes:

This article from the New York Daily News, reproduced in the Minneapolis Star Tribune, is so terrible in so many ways. Very sad commentary regarding all aspects of statistics education and journalism.

The news article, by Joe Dziemianowicz, is called “Study says drinking alcohol is key to living past 90,” with subheading, “When it comes to making it into your 90s, booze actually beats exercise, according to a long-term study,” and it continues:

The research, led by University of California neurologist Claudia Kawas, tracked 1,700 nonagenarians enrolled in the 90+ Study that began in 2003 to explore impacts of daily habits on longevity.

Researchers discovered that subjects who drank about two glasses of beer or wine a day were 18 percent less likely to experience a premature death, the Independent reports.

Meanwhile, participants who exercised 15 to 45 minutes a day cut the same risk by 11 percent. . . .

Other factors were found to boost longevity, including weight. Participants who were slightly overweight — but not obese — cut their odds of an early death by 3 percent. . . .

Subjects who kept busy with a daily hobby two hours a day were 21 percent less likely to die early, while those who drank two cups of coffee a day cut that risk by 10 percent.

At first, this seems like reasonable science reporting. But right away there are a couple flags that raise suspicion, such as the oddly specific “15 to 45 minutes a day”—what about people who exercise more or less than that?—and the bit about “overweight — but not obese.” It’s harder than you might think to estimate nonlinear effects. In this case the implication is not just nonlinearity but nonmonotonicity, and I’m starting to worry that the researchers are fishing through the data looking for patterns. Data exploration is great, but you should realize that you’ll be dredging up a lot of noise along with your signal. As we’ve said before, correlation (in your data) does not even imply correlation (in the underlying population, or in future data).
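To see how easily pure noise yields patterns like these, here is a small simulation sketch (everything in it is invented, not the 90+ Study's data): exercise has exactly zero effect on survival, yet scanning candidate exercise windows still turns up a seemingly notable one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Null world: exercise minutes have NO relationship to survival.
n = 1700
exercise = rng.uniform(0, 120, n)      # minutes per day
survived = rng.random(n) < 0.7         # everyone has the same 70% chance

# "Fishing": scan many candidate windows and keep the best-looking one.
best = None
for lo in range(0, 106, 15):
    for hi in range(lo + 15, 121, 15):
        inside = (exercise >= lo) & (exercise < hi)
        if inside.sum() < 50:
            continue
        diff = survived[inside].mean() - survived[~inside].mean()
        if best is None or diff > best[0]:
            best = (diff, lo, hi)

diff, lo, hi = best
print(f"best window: {lo}-{hi} min/day, survival edge {diff:+.1%}")
```

With a few dozen windows to choose from, the winner essentially always shows a positive "effect," even though none exists by construction.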

The claims produced by the 90+ Study can also be criticized on more specific grounds. Alper points to this news article by Michael Joyce, who writes:

their survey [found] that drinking the equivalent of two beers or two glasses of wine per day was associated with 18% fewer deaths, it also found that daily exercise of around 15 to 45 minutes was only associated with 11% fewer premature deaths.

TechTimes opted to blend these two findings into a single whopper of a headline:

Drinking Alcohol Helps Better Than Exercise If You Want To Live Past 90 Years Old

Not only is this language unjustified in referring to a study that can only show association, not causation, but the survey did not directly compare alcohol and exercise. So the headline is very misleading. . . .

Other reported findings of the study included:

– being slightly overweight (not obese) was associated with 3% fewer early deaths

– being involved in a daily hobby two hours a day was associated with a 21% lower rate of premature deaths

– drinking two cups of coffee a day was associated with a 10% lower rate of early death

But these are observations and nothing more. Furthermore, they are based on self-reporting by the study subjects. That’s a notoriously unreliable way to get accurate information regarding people’s daily habits or behaviors.

Just after we published this piece we heard back from Dr. Michael Bierer, MD, MPH — one of our regular contributors — whom we had reached out to for comment . . .:

Observational studies that demonstrate benefits to people engaged in a certain activity — in this case drinking — are difficult to do well. That’s because the behavior in question may co-vary with other features that predict health outcomes.

For example, those who abstain from alcohol completely may do so for a variety of reasons. In older adults, perhaps that reason is taking a medication that makes alcohol dangerous; such as anticoagulants, psychotropics, or aspirin. So not drinking might be a marker for other health conditions that themselves are associated — weakly or not-so-weakly — with negative outcomes. Or, abstaining may signal a history of problematic drinking and the advice to cut back. Likewise, there are many health conditions (like liver disease) that are reasons to abstain.

Conversely, moderate drinking might be a marker for more robust health. There is an established link between physical activity and drinking alcohol. People who take some alcohol may simply have more social contacts than those who abstain, and pro-social behaviors are linked to health.

P.S. I’d originally titled this post, “In Watergate, the saying was, ‘It’s not the crime, it’s the coverup.’ In science reporting, it’s not the results, it’s the hype.” But I changed the title to avoid the association with criminality. One thing I’ve said a lot is that, in science, honesty and transparency are not enough: You can be a scrupulous researcher but if your noise overwhelms your signal, and you’re using statistical methods (such as selection on statistical significance) that emphasize and amplify noise, then you can end up with junk science. Which, when put through the hype machine, becomes hyped junk science. Gladwell bait. Freakonomics bait. NPR bait. PNAS bait.

So, again:

(1) If someone points out problems with your data and statistical procedures, don’t assume they’re saying you’re dishonest.

(2) If you are personally honest, just trying to get at the scientific truth, accept that concerns about “questionable research practices” might apply to you too.


  1. Interesting that this hit the blog the same week a new study is out saying there is no healthy level of drinking… Also with major problems

    https://www.theguardian.com/society/2018/aug/23/no-healthy-level-of-alcohol-consumption-says-major-study

    • Also interesting that The Guardian just a few weeks ago published an article saying that all the “official” recommendations about alcohol are completely tainted by politics… and yet they reported that other study without referencing their earlier story:

      https://www.theguardian.com/commentisfree/2018/aug/03/alcohol-health-government-advice-drug-benefits-morality

    • Adede says:

      Really? I thought that one was ok.

      • The first major problem with this and any other study of alcohol consumption is that there is no way to get causal information from the available data. Without assigning a large number of people to drink controlled amounts of alcohol you simply *can not* estimate a counterfactual level of health for individuals, and without a counterfactual estimate for individuals, you can not simply compare outcomes between self-assigned groups and expect causal inference no matter how much you talk up your statistical methods for controlling for biases.

        The next problem with this particular study is the tremendous measurement error that actually exists in measuring people’s actual alcohol consumption. But whatever the error involved, huge fractions of the population of the world drink *zero* drinks ever, so there’s a massive discrete spike in the population histogram, and zero causal data that could be collected by for example randomly assigning non-drinkers to drink say 1 or 2 drinks per day and compare their outcomes to people who were randomly assigned to stay at 0.

        Also, all but something like 10% of people are in the 0-4 drinks daily range, so the conclusion rests on the behavior of their massively aggregated curve through a minefield of measurement-errored data in that 0-4 range. The discrete spike in population at zero already makes it so that in their figure 5 they are unable to show any meaningful difference between 0 and 1 drink per day. Their curve-fitting methodology is described as:

        “For each outcome, we estimated the dose–response relative risk curve using mixed-effects logistic regression with non-linear splines for doses between 0 and 12.5 standard drinks daily.”

        They seem to aggregate this data by *actual deaths due to each cause* even though survivorship issues are very strong (you can’t get lip cancer at age 60 if you died at age 50 from a heart attack, so deaths from lip cancer may be *beneficial* compared to the alternative of dying at 50 from a heart attack).

        Their description of their nonlinear dose-response fitting is hardly enough to really understand the implications of their model, which seems to be a curve-fitting method, but no information is available about what regularizations are used. For example, what keeps the model so smooth rather than oscillating around? What is the role of this regularization in forcing the shape of their relative risk curve? Do they have any open data and code? I don’t know, but none is mentioned on The Lancet’s website.

        Another concern I have is how did they estimate “attributable” health problems? They mention “We calculated PAFs using our estimates of exposure, relative risks, and TMREL, following the same approach taken within the GBD studies.”

        So now it requires a historical deep dive into methods of “attributing” health problems to alcohol consumption. This seems fraught with researcher degrees of freedom, survivorship bias issues, and so forth.

        Finally, they adjust their life years by disability, but not by quality. Asking people how much they enjoy a life involving a drink or two vs. enforced teetotaling might well massively alter the conclusion. They also don’t seem to acknowledge heterogeneity in the population. People who abstain probably include both people who want to abstain and people who wish they could drink but don’t; people who drink probably include people who like their drinking level, people who wish they could quit, and people who wish they could cut back. Quality adjustment would include differences between actual and desired levels.

        Without a fully transparent analysis code and dataset it’s entirely impossible to understand the role of analyst choices in the outcomes.

        • My statements about causality are too strong, in the presence of a very good causal model you can make causal inference from pure observational data. Like for example if your model is of golf ball flight thats been validated in a wind tunnel, and you want to make causal inferences about what would have happened if you had hit a golf ball with a different dimple pattern and the same initial conditions off the face of the club… But with mortality data you just don’t have that kind of detailed precision in your models.

          • Adede says:

            So do you think smoking is bad for you? Or is mortality simply too complicated to make that conclusion from observational data?

            • I do think smoking is bad for you, except maybe if you are a soldier on active duty who uses nicotine to stay alert and therefore keep from getting shot, or if you are a suicidal person who uses nicotine as an antidepressant… Which is to say that causality is in fact complicated. Remember causality is about the difference between what does happen and what would have happened in a counterfactual situation. The thing about smoking is it probably has relatively few contexts where it’s beneficial in the modern world. That’s not so clear with alcohol.

              • Also an interesting component of the tobacco story is the “natural experiment” where an entire industry recruited people to participate in consumption starting in around 1910 and running through the 1940’s. Disease rates in those who were recruited could be compared to disease rates in the unrecruited control group through time
                https://www.cdc.gov/mmwr/preview/mmwrhtml/figures/m4843a2f1.gif

                Unfortunately alcohol consumption already had widespread participation starting around 5000BC and there are many confounding factors in self selection so it’s hard to use a natural experiment on lifetime timescales to identify causal effects.

        • Max Griswold says:

          Hi Daniel,

          I actually wrote this study and would like your opinion on these responses to your concerns. The data and analytic code are available; we’re having trouble getting the Lancet to host them, as they’re not seeing the value in letting them be publicly available, which is a shame.

          -What could we have done to derive causal claims in this analysis, when we only had access to observational studies? We tried to control for as much as possible, given the underlying data. Our thinking was that the biological basis for harm for these causes was well established in the literature, and that the sheer amount of observational evidence could allow us to construct reasonable dose-response curves. What would be a better method, in the absence of concrete interventions?

          -We tried to deal with the zero-inflation problem mentioned by first estimating if someone consumes or not, then measuring the distribution of consumption within the drinking category. Our histograms look pretty expected, once you do this. We had access to the underlying microdata to establish these distributions. (In most locations, it’s a gamma-looking shape, like a left-skewed triangle, which has been mentioned by other alcohol researchers)
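For readers unfamiliar with the two-part structure Max describes, here is a minimal sketch, with entirely invented parameters, of simulating and fitting such a hurdle model: a Bernoulli stage for whether someone drinks at all, then a gamma distribution for how much, among drinkers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: a point mass at zero plus gamma-distributed
# consumption among drinkers (all numbers made up for illustration).
n = 10000
is_drinker = rng.random(n) < 0.6                           # stage 1: any drinking?
grams = np.where(is_drinker,
                 rng.gamma(shape=2.0, scale=8.0, size=n),  # stage 2: how much
                 0.0)

# Fit the two parts separately.
p_drink = is_drinker.mean()
drinkers = grams[is_drinker]

# Method-of-moments estimates for the gamma among drinkers.
m, v = drinkers.mean(), drinkers.var()
shape_hat, scale_hat = m**2 / v, v / m
print(p_drink, shape_hat, scale_hat)
```

Fitting the spike at zero and the continuous part separately avoids forcing one smooth density through the discrete mass of lifetime abstainers.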

          -Each of our outcomes was modeled separately, with only one cause of death assigned to an individual. Not sure exactly what your comment means. We didn’t aggregate anything.

          -The journal asked us to tone down our statistical explanations for word-count concerns. There was no regularization performed (not sure what qualifies exactly as regularization; I’m thinking of things like ridge or LASSO), and the splines ended up being pretty simple, with knots from 0–150 at roughly 10–15 gram intervals depending on the model. We used a grid search over knot points, holding the other aspects of the model fixed, then used out-of-sample cross-validation on data coverage, choosing the best performer. I can share the code, if there’s interest. An earlier version of the model is available here: https://github.com/ihmeuw/dismod_mr
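As a rough illustration of what a grid search over knot placements scored out-of-sample can look like (this is not the study's code; the data, basis, and holdout scheme below are all invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy dose-response data: grams/day vs. a made-up wiggly log-risk curve.
dose = rng.uniform(0, 150, 600)
y = 0.01 * dose + 0.3 * np.sin(dose / 25.0) + rng.normal(0, 0.2, 600)

def spline_basis(x, knots):
    """Truncated-power basis for a piecewise-linear spline."""
    cols = [np.ones_like(x), x]
    cols += [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

# Grid-search knot spacing, scoring by out-of-sample error (simple holdout).
train = rng.random(600) < 0.8
results = []
for spacing in (10, 15, 25, 50):
    knots = np.arange(spacing, 150, spacing)
    B = spline_basis(dose, knots)
    coef, *_ = np.linalg.lstsq(B[train], y[train], rcond=None)
    mse = np.mean((B[~train] @ coef - y[~train]) ** 2)
    results.append((mse, spacing))

best_mse, best_spacing = min(results)
print(best_spacing, best_mse)
```

Note that with an unpenalized least-squares fit like this, the knot grid itself is the only thing controlling smoothness, which is exactly the concern raised above about what keeps the fitted curve from oscillating.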

          -Population attributable fractions were the aspect that let us claim attribution for deaths. This method is only really useful for population-level analysis, not individual-level analysis. Basically, if there’s a strong biological relationship between the risk factor and health outcome, along with meeting an evidence-scoring metric we use (basically, a little scorecard for number of prospective studies showing an effect, magnitude of effects, controls used by studies, etc.), then we performed a meta-analysis to establish relative risks. PAF extends relative risks to attribution through a simple function, developed in earlier Global Burden of Disease studies: PAF = [integral over dose of (exposure * relative risk) − counterfactual level of harm * relative risk at that level] / [integral over dose of (exposure * relative risk)]. There’s lots of criticism of PAFs, but attributable fractions are fairly common in the epi/global health community.
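A tiny numerical sketch of a PAF computed along those lines, with the exposure density and relative-risk curve invented for illustration (they are not the study's estimates, and the counterfactual here is simply TMREL = 0 drinks):

```python
import numpy as np

dose = np.linspace(0, 12.5, 500)        # standard drinks per day
dx = dose[1] - dose[0]

def integrate(f):
    """Trapezoidal rule on the uniform dose grid."""
    return (f[0] / 2 + f[1:-1].sum() + f[-1] / 2) * dx

density = np.exp(-dose / 2.0)           # invented exposure distribution
density = density / integrate(density)  # normalize to a proper density

rr = 1.0 + 0.04 * dose ** 1.5           # invented monotone relative risk
tmrel_rr = 1.0                          # relative risk at the counterfactual

observed = integrate(density * rr)      # population-average relative risk
paf = (observed - tmrel_rr) / observed
print(round(paf, 3))
```

Everything downstream of this formula inherits whatever biases sit in the meta-analyzed relative-risk curve, which is the crux of the disagreement in this thread.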

          -We tried to deal with the issues of researcher degrees of freedom by including every peer-reviewed study published since the 1990s and being lenient in our inclusion criteria. Basically, if a study reported a relative risk with categorical or continuous exposure groups, we included it. We also tried a variety of analyses besides the one presented. We didn’t find enough variation to justify showing them all, but they’re available, as well as the underlying data. I would be happy to share and let other researchers come to their own conclusions. Again, we’re having trouble getting the journal to include this; it’s unclear to me why.

          Not sure how survivorship bias comes into play here for our method. You might want to look at our causes of death paper and risk factor paper, which explain the justification for PAFs in detail. (https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(17)32366-8/fulltext?elsca1=etoc , https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(17)32152-9/fulltext) Not trying to make people do a deep dive, but it’s hard to include the sheer amount of research that goes into each step of our model.

          -As public health professionals, we’re strictly concerned with DALYs, not QALYs. It would be simple to change the analysis to QALYs; you would just need to take the PAFs in the article and multiply them by estimated QALYs.

          Happy to provide any analysis code and data. We’re still working with the journal to make this easier to access.

          • Max, wow thanks for your response. I love that you are open to discussing your methods, and am not happy that the Lancet is discouraging you from sharing, but very glad that you are willing to share information through other channels. And thanks Andrew for the venue where we could meet. I would like to dive a little deeper now in some of your specific responses, and get back to you here in a bit.

          • Let me start with your question about what you could do to get causal claims?

            The only way to get causal claims without assigning interventions randomly is to have a strong causal model for the effect *that you and everyone else really believe*. Inherent in such a model is *model uncertainty* including what range of stuff people are willing to consider. What is the “correct” model to describe how say drinking 1 drink per day affects death rate through time for an individual with certain characteristics? This is the model you need, and to the extent that there could be different models, you really need to evaluate those different models in a model selection context, and include this model uncertainty within your analysis.

            The model I’d like to see from a causal perspective is something where time-series correlations are taken into account:

            N(dead(t) | AlcoholConsumptionHistory(t), Ybirth) = N(live(t-dt) | AlcoholConsumptionHistory(t), Ybirth) * F(Die(t) | AlcoholConsumptionHistory(t), OtherCovariates, HealthState(t, Ybirth))

            Obviously N(live(Ybirth),Ybirth) = the number of people born in year Ybirth, and F(Die(t)…) is the fraction of people born in birth year Ybirth who are still alive in year t-dt and die in year t, as a function of their alcohol consumption curve through time. You will also need some model for HealthState which is something that maybe allows you to track the cumulative effects of health problems… For example liver disease doesn’t happen because of a single drink, but a decade of heavy drinking followed by a decade of cutting back to 1/day is obviously different than 1/day for 2 decades, and in some sense that HealthState is needed to account for such things, and other things as well (such as cumulative effects of smoking or sexually transmitted disease, or malnutrition or whatever).

            So this is obviously a time-series model, and could be expressed as a differential equation for each birth year, but you’d be fine to do it year by year in a difference equation form I think.

            The “attributable to alcohol” is then the difference:

            N(dead(t) | AlcoholConsumptionHistory,Ybirth) – N(dead(t) | NoAlcoholConsumptionEver,Ybirth)

            which is a function of t. You now need to fit this model and have a sample that is large enough to estimate the functional forms for the F function and healthstate equations (which will require year-by-year knowledge of exposures to other health related issues, such as air pollution, smoking, contaminated water, occupational exposures, etc etc). It’s important to have identified at least several confounding long-term health exposures, since alcohol exposure occurs over decades and so do many other environmental exposures such as those mentioned.
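A minimal sketch of the bookkeeping in that recursion (the hazard function F below is invented purely to show the structure; a real analysis would treat F as uncertain and fit it, as discussed next):

```python
import numpy as np

# Year-by-year cohort recursion in the spirit of the difference equation
# above. The hazard depends on age and on a (constant) drinks-per-day
# history through a cumulative-dose term; all coefficients are made up.
def simulate_cohort(n_born, drinks_per_day, years=80):
    alive = float(n_born)
    deaths = []
    for age in range(years):
        cumulative_dose = drinks_per_day * age * 365.0
        hazard = 1e-4 * np.exp(age / 12.0) + 1e-9 * cumulative_dose
        hazard = min(hazard, 1.0)
        d = alive * hazard          # deaths this year
        deaths.append(d)
        alive -= d
    return np.array(deaths), alive

deaths_drink, alive_drink = simulate_cohort(100000, 2.0)
deaths_abstain, alive_abstain = simulate_cohort(100000, 0.0)

# "Attributable to alcohol" at each year t is the counterfactual difference.
attributable = deaths_drink - deaths_abstain
print(alive_abstain - alive_drink)
```

The attributable curve here is a function of time, and its total equals the difference in survivors, which is exactly the counterfactual contrast the comment above is asking for.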

            The function F is of course very controversial and uncertain. So a good method would be to allow a wide variety of functions F using a prior over the parameters that describe the F function that allows for behaviors like increasing with dose, decreasing with dose, J shaped curves, etc etc. Displaying a plot of draws from the prior distribution over F functions would be key to showing that all plausible behaviors that anyone could want to include are included.

            Next you’re going to run a massive MCMC calculation which will give you posterior distributions over F and using that, a posterior predictive distribution over the attribution of deaths through time at different simple dosing protocols. Note that I am skeptical that any software available today can really do this fit, but if any can it’s likely to be Stan, and it’s likely to need months of computing time.

            The final thing you’d probably want to do is use the posterior distribution over the parameters for F to compare expected DALY loss under some simple dosing schemes, such as 0.25, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5 drinks/day starting say at age 21 and going through to 60 years old, then dropping to half your original consumption until death. These are relatively meaningful reference dosing protocols.

          • I wrote a longish post about causality from observational data, but it didn’t show up after submitting. I’m hoping that means it’s held in some queue and will show up eventually. The blog seems to be acting strangely the last few days for me at least.

          • That’s great you are sharing Max

          • Continuing on issues you responded to (hoping some earlier comments will eventually show up, I’ve emailed Andrew about it), you say

            “Each of our outcomes were modeled separately, where only one causes of death was assigned to an individual. Not sure exactly what your comment means. We didn’t aggregate anything.”

            Attributing a death entirely to one cause is problematic. A person commits suicide by shooting themselves. Is the cause a gunshot wound, depression, failure of a doctor to diagnose depression, failure of an antidepressant to work for this person, stress caused by bankruptcy, bankruptcy caused by a “liar loan” written in the lead up to the housing crash, the existence of liar loans due to Federal monetary policy in the wake of the dot-com crash, or poor decision making on the part of investors during the dot-com bubble leading to federal monetary policy??? The answer is all of the above. The specific effect of monetary policy is the outcome that would have occurred if monetary policy were different. The effect of possession of a firearm is the difference in what would have occurred in the absence of the firearm. It might well be in the absence of the monetary policy far fewer people would have been suicidal, but in the absence of the firearm, somewhat fewer suicidal people would have succeeded. The attributable fraction to the monetary policy could well be much larger than the attributable fraction to the gun, or the other way around.

            In any case, if you do assign a single cause to particular deaths, and then try to aggregate all the ones “attributable to alcohol” you then get a dose-response curve for that attribution. But the actual causal degree to which alcohol contributed should be the difference between what would have happened without alcohol and what did happen with alcohol. In the calculation of the dose-response curve for death by type A, you are looking only at people who died due to cause A, essentially conditioning on the future outcome “death at time T by cause A” but without alcohol there is the definite possibility of “death at time t less than or greater than T by cause B for all B” which might have occurred in the absence of alcohol. Suppose for example a person with depression self medicates for a year or two using alcohol… and this extends their life compared to the suicide they would have committed earlier if they hadn’t first deep dived into a bottle… and then later they commit suicide anyway. We might attribute “suicide due to alcoholism” because we observe the future, but in fact we should say “extended life from time t to time T due to alcohol reducing suicide risk for a time until eventually alcohol fails to work anymore”

            In any case, if I understand your model you assign attribution to particular deaths, and then model dose-response, and then *aggregate dose-responses* across all causes of death, weighted by actual frequency of each cause observed to get your figure 5. This fails to account for the fact that alcohol can cause changes in frequency of cause of death. For example, people live longer because they don’t die of heart attacks, therefore they eventually get lip cancer. Again, from a causal perspective, you’re looking into the future, a crystal ball. In fact what you’re doing is converting a time-series problem into an instantaneous problem (at time of death).

            Without a markovian/time-series model with memory, it’s impossible to correctly aggregate these things to get some kind of meaningful causal figure 5, because inherently the causality has memory and the dosage acts over time, and in the presence of other time-acting effects whose contributions could be increased or decreased by alcohol.

            • To give you a simple biological example of how important this is, consider death by ischemic heart disease. It’s not the case that the victim at age 58 was totally fine and then at age 59 suddenly had a huge plaque buildup and heart attack. In fact, throughout the victim’s life he built up plaques in his arteries. The causal effect of alcohol on plaque formation is the question of interest. Eliminating alcohol from the victim’s diet at, say, age 21 would have done a bunch of stuff for the next 38 years, including changing the person’s physiology, changing the person’s choices of food (perhaps he would eat less cheese in the absence of a fondness for red wine), changing the person’s friends leading to different activities (and maybe different exercise regimens), perhaps changing his choice of career (to avoid careers involving selling wine, for example), etc.

              In the presence of a very detailed causal model for all of this, such as the kind of detail available for the flight of the golf ball I mention above, the only way to actually calculate the net effect of all these follow-on-causes is to randomly assign people to different alcohol consumption regimens.

              So, basically *there is NO way to get causal data* from observational data in the absence of this kind of detailed causal model for people’s actions in the presence and absence of the alcohol.

              The one case where this is different is perhaps something like a person getting hit by a drunk driver, or a person dying of acute alcohol poisoning etc. There it really is an *instantaneous* process and we don’t need a long term time-series exposure model to attribute the death.

            • Chris Wilson says:

              Interesting stuff Daniel. Also, huge kudos to Max Griswold for taking time to come on this blog and discuss! Here was part of my reaction (posted elsewhere on social media) to this study:
              “(things that bother me…) 3) “Although we found some protective effects for ischaemic heart disease and diabetes among women, these effects were offset when overall health risks were considered—especially because of the strong association between alcohol consumption and the risk of cancer, injuries, and communicable disease.” OK, confounder effect much? Injuries and communicable diseases, really?, 4) studying small effects with observational data at the population scale, and then deriving deterministic conclusions like “there is no safe level of alcohol consumption”. Put in the context of the larger literature, the only defensible conclusion IMO is this, “consuming 1-2 units of alcohol per day may protect from some things a little, and may increase risk for some other things a little, but we expect individual variability to swamp population-level conclusions. There is no compelling reason to take up drinking if you don’t already. Heavy drinking is another matter, and is very likely very not good for you.” “

              • Chris: I too found the communicable diseases issue strange, and it’s related to your point about absolute vs relative risk. For example they plot the tuberculosis risk vs consumption dosage response curve. But this is obviously highly context dependent. Tuberculosis risk in Russia or Mali or Bangladesh is obviously way different from tuberculosis risk in say Belgium or Italy or Canada. Here your point about absolute vs relative risk is good. If you’re in Italy you might be really concerned about Heart Attack and not concerned at all about Tuberculosis. In New Delhi it could be wildly different. This global homogenization is inappropriate.

          • Final major point here, thanks again Max for engaging at Andrew’s blog.

            Using PAF based on meta-analysis is basically just going to give you the answer to the question “what are the consequences of the assumptions and conclusions that others have published?” There is no model checking of whether others’ assumptions are valid or consistent with data.

            The alternative, learning the coefficients of some kind of instantaneous risk vs dosage function for some kind of markovian/time-series risk model starting with a sufficiently broad prior over F would give you a valid uncertainty calculation surrounding how much information you really have about causal risk.

            If you built a markovian/time-series/differential-equation population-level model using just a handful of main effects, you might have some kind of chance of fitting it from first principles, and inferring how much the data really tells you about your model for risk. Since there’s a clear history of evidence for these areas, I’d suggest: heart disease, stroke, oral cancer, automotive accidents, liver disease, all other cancer, all other causes of death.

            Then simply fit this to data on people born between say 1960 and 1980 (this helps you deal with issues involving war, and massive changes in technology between say 1920-1960) using a simple nonlinear absolute instantaneous risk function for each disease (you could start with, say, a 4th-order Fourier series on 0 to 15 doses/day plus a linear trend and a constant term; I think you can maybe get rid of some of the Fourier coefficients using symmetry and constraints, for example with the constant and linear trend you might just need a sine series, a total of 6-10 coefficients for each disease) and track populations for each disease. You’re going to wind up with around 70 alcohol-related coefficients to fit. You’ll also want some dose-response function for smoking and oral cancer, and smoking and all other cancer, because those are going to be relevant. In the end it’ll be something like a 100-dimensional set of parameters. I’d recommend sticking with one first-world country’s data to develop the model.

            Then, utilizing Thanatos’ point about the measurement difficulty of actually determining people’s *real actual* consumption vs reported, you’ll want to create a reasonable measurement error model for inferring actual consumption and use the uncertain, unobserved actual consumption in your model fit.
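One reason that measurement-error model matters so much: classical multiplicative noise in self-reported consumption attenuates any fitted dose-response slope toward zero (regression dilution). A toy demonstration, with all quantities invented:

```python
import numpy as np

rng = np.random.default_rng(3)

n = 5000
actual = rng.gamma(2.0, 1.0, n)                    # true drinks per day
reported = actual * rng.lognormal(0.0, 0.5, n)     # noisy self-report
risk = 0.10 * actual + rng.normal(0, 0.05, n)      # true linear dose effect

def slope(x, y):
    """OLS slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

true_slope = slope(actual, risk)       # recovers roughly 0.10
naive_slope = slope(reported, risk)    # attenuated toward zero
print(true_slope, naive_slope)
```

Regressing on the noisy report understates the true slope by a large factor, so a model that treats reported consumption as exact will be systematically miscalibrated.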

            That highly simplified model would from basic causal first principles lead you to the full range of data-consistent plausible conclusions. If you find the kind of consistency and narrow error region you show in your Figure 5 under this kind of model fit including the measurement errors and the full prior uncertainty in the shape of the dose-instantaneous risk function, I’d honestly be SHOCKED absolutely SHOCKED.

            • Note how besides the 100 or so coefficients to describe the dose-response instantaneous risk functions, you’ll need a few parameters *per person* to describe their actual consumption history. If you look at all deaths in the US of those born between 1960 and 1980 you’re looking at something like 2-3 million deaths per year so you’re talking about tens of millions of parameters if you try to do a good job (and no computing system available today is likely to handle it). Of course it seems unlikely that you’re really going to have any measurements at all on alcohol usage for most deaths, so you’ll wind up just using a much more limited dataset where you have noisy individual alcohol measurements, and deaths, but you could perhaps then extrapolate frequencies of drinking dosage to the whole population and use population death rates and all deaths to further constrain things: force the extrapolations to match observed aggregated death rates by illness. You’d of course wind up with correlated uncertainties between dosage and risk functions… hard to say if the problem is just that more people are drinking more than estimated, or if smaller quantities of alcohol are more damaging… Still if we’re honest that’s a true uncertainty needing to be acknowledged.

          • Max, I put up a toy model that shows why, for causal purposes, it’s important to have a causal model through time:

            http://models.street-artists.org/2018/08/31/alcohol-risk-qualitative-example/

            In this example, even though there is explicitly zero effect of alcohol on “cancer” people who drink more alcohol die more of cancer! Why? Because they’re partially protected from dying of heart disease, so they live longer and experience more cancer risk!

            Here, there is an association between drinking alcohol and dying of cancer! But causally in this toy example, there is *exactly zero* effect of alcohol on instantaneous rate of cancer production, for all time…
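            The mechanism can be reproduced with nothing fancier than two competing exponential hazards (this is not the model from the linked post, just a constant-hazard illustration with invented rates):

```python
import random

random.seed(0)

def fraction_dying_of_cancer(alcohol, n=100_000):
    """Two competing exponential hazards. Alcohol lowers the heart-disease
    hazard and has exactly zero effect on the cancer hazard; the rate
    values are invented for illustration."""
    heart_rate = 0.03 - 0.01 * alcohol  # "protective" heart effect
    cancer_rate = 0.01                  # identical for everyone
    cancer_deaths = 0
    for _ in range(n):
        t_heart = random.expovariate(heart_rate)
        t_cancer = random.expovariate(cancer_rate)
        if t_cancer < t_heart:
            cancer_deaths += 1
    return cancer_deaths / n

# drinkers die of cancer more often, with zero causal effect on cancer
print(fraction_dying_of_cancer(1), fraction_dying_of_cancer(0))
```

            With these rates, about a third of drinkers die of cancer versus about a quarter of abstainers, purely because the drinkers survive heart disease long enough to accumulate cancer risk.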

            Hope that helps illustrate one of the most important parts of my concern: causal inference requires causal models.

        • Thanatos Savehn says:

          And …

          (1) Even if you had 1,000 people drink identical amounts of ETOH they would metabolize it at different rates, sometimes in different parts of the body thereby often producing different metabolites with different effects on different local micro-environments. The point being that the effective dose varies by the individual exposed to it. Everybody wants cause given exposure but few take the time to discover dose given exposure, percent body fat, etc. The few who have tend to discover that metabolism often varies widely within individuals. It varies throughout the course of the day, it varies by what you’ve eaten and what you’re eating, by your level of activity, by whether you’re sitting or standing or lying down.

          According to the study, Danes, Norwegians and Germans are religious drinkers all, whereas the most committed teetotalers are from Egypt, Bangladesh and Pakistan. Upon examining their life expectancies and living standards it wouldn’t be obviously wrong to conclude that there’s something (even if you’re not sure what it is) to be said for drinking. Now, add to all that Russia’s drinking problem (at least 20 gallons/year vodka to be considered a heavy drinker???) and its worse-than-Egypt’s life expectancy, and you might sensibly conclude that at some dose and lower it’s all (given current knowledge) inexplicable noise; and that the best bang for the public health buck would come from figuring out about where, within perhaps a drink or two per day, the habit becomes obviously pathological.

          (2) Indeed, drinkers’ drinking habits are as variable as drinkers’ metabolisms. Worse yet, they lie about it. I’ve reviewed thousands of sets of medical records over the years and dictated as many tables of their smoking, alcohol, exercise, eating, medication compliance, etc. It became obvious to me that most people lie to themselves, lie to their doctors and lie to the epidemiologists who ask them about exposure histories. A common pattern is for plaintiffs to have told doctors of a 3-4 packs per day lifetime habit, when asked in the 60s and 70s, 1 ppd in the 80s and 90s and 1 pack per week (a single cigarette after lunch and dinner) in recent times. When under cross examination they are confronted with multiple histories in their records of heavy smoking for decades most are genuinely shocked. We are often willfully blind to the risks we voluntarily undertake and shape our memories accordingly.

          (3) Bingo on “attributable”. It’s one of those weasel words used by weasels. I once created a graphic derived from all of the attributable disease claims made by all the leading disease and anti-vice (drinking, smoking, obesity, anorexia, etc.) organizations to demonstrate that collectively they were claiming almost three times as many deaths annually as actually die of all causes combined. “Attributable” then is just a way advocates lay claim to shared corpses. The fights used to be bitter but eventually everyone realized that they could all raise more money if they shared the victims.

          • Max Griswold says:

            Hi Thanatos,

            One of the nice things about the parent study, the Global Burden of Disease, is that for attributable outcomes, any single death can be coded to only one risk factor. We mediate deaths where there’s reason to believe two underlying risk factors could be co-occurring (like alcohol use and hepatitis for liver cancer and cirrhosis). There’s no way within our analysis for there to be more attributable deaths than total deaths. It’s more in the range that 40% of deaths can be ascribed to any particular risk factor.

            • Thanatos Savehn says:

              Once again I am reminded of my Dad’s advice to speak carefully and charitably whenever speaking in public. I apologize for the tone of my comment. I’m gearing up for a trial and in sharpening blades mode (which I offer as explanation rather than excuse).

              Now, that aside, back in the day I put a nosologist on the stand followed by a biostatistician. Post-trial we discovered that none of the jurors could remember anything good or bad about either of them or indeed anything at all that they had said. The reason I put them on was to make some point about what ICD codes were pulled and not pulled for the other side’s bespoke epi study. I didn’t understand then and still don’t understand today how it is decided which of the numerous ICD codes found in anyone’s medical records or death certificates ought to be used and why (other than, as I suspected in that case, cherry picking). In any event, are there non-paywalled papers that set out best practices for harvesting potential effects of causes from medical records? Finally, I didn’t mean to imply that the authors over-counted but rather that if you added up all the victims of smoking claimed by the anti-smoking people, all the victims of drinking claimed by the anti-drinking people, all the victims of toxins claimed by the anti-toxin people, etc., it summed to ~280% of all known/reported deaths.

        • anon says:

          The paper does an okay job estimating alcohol’s disease burden. But it has some major problems, in addition to those given by Daniel:

          It doesn’t give absolute risk differences, only relative risks.
          The reasoning from “no safe level” to suggesting abstention is incomplete. Many everyday activities carry some risk, which is not a reason to abstain from them.
          The claimed results fairly flatly contradict an IPD analysis of more than half a million people, also in the Lancet (Wood et al. 2018), yet there is no mention of this. Which one of these analyses is wrong?
          No discussion of the uncertainty interval around the relative risk curve overlapping the null at around 1.5 drinks/day and lower.

          • Adede says:

            “Many everyday activities carry some risk, which is not a reason to abstain from them.”
            Yes, but alcohol consumption is completely discretionary. One could argue there’s no reason to do it, given the risk.

  2. a reader says:

    Is there a link to the actual paper?

    I’m not sure that the researchers made any terrible mistakes, but the Tribune reporter is certainly making claims that are likely to be unsupported by the data (i.e., drinking *causes* an increase in longevity). For example, here’s the quote from the linked article: “Researchers discovered that subjects who drank about two glasses of beer or wine a day were 18 percent less likely to experience a premature death, the Independent reports.” Note that The Independent makes a statement about correlation, while the Minneapolis Star Tribune makes a statement about causality from the same quote: “When it comes to making it into your 90s, booze actually beats exercise, according to a long-term study.” Dr. Brier correctly notes that it seems very possible this is due to survivorship bias.

    As for the discretization of the data (i.e., “exercised between 15-45 minutes”)…I don’t think it’s such a big deal. You’re right that non-linear trends are hard to model, but discretization makes it easier, if you have enough data. My guess is that the categories 0-15 and 15-45 had a lot of data for senior citizens, but not so much for 45+. Likewise, I’m guessing the “overweight” category is much more populated than the “obese” category. I would say the discretization is only a problem if you wanted to make inferences at a finer level, but I don’t see any need for that just yet.

  3. Adam says:

    The comparative claims like “lived longer” or “fewer deaths” – is that comparing 90 year olds in the study to other 90 year olds in the study? Is it appropriate to use “premature death” in this context?

  4. Vince says:

    This is bad on so many levels…

    The news article doesn’t link to any study. A bit of googling suggests that this “study” was presented at an AAAS conference, but I couldn’t find any publication(s), much less data. There is a credulous news release at https://www.mind.uci.edu/90-study-finds-link-moderate-alcohol-consumption-longevity/

    The lead researcher is quoted in the article:

    “I have no explanation for it, but I do firmly believe that modest drinking improves longevity,” Kawas said during the American Association for the Advancement of Science annual conference in Austin, Texas.

    I missed the class on the scientific method that included “I have no explanation for it, but I firmly believe …” as appropriate.

    Check out https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4561216/ for another example of noise mining from this group.

    • a reader says:

      ‘I missed the class on the scientific method that included “I have no explanation for it, but I firmly believe …” as appropriate.” ‘

      Scientists can, and moreover should, have opinions. This is a Bayesian blog, after all. But they should also be capable of separating beliefs and evidence, as Kawas appears to be doing.

      With that said, it’s my belief that modest drinking probably does not improve longevity. But I can’t point to any hard evidence at the moment!

  5. Max Griswold says:

    Testing 2.0, with different email. Assuming now that comments must be pre-approved.

    • Some subset of comments are held to be approved manually. It’s not clear why. Hopefully all comments we both wrote will show up eventually.

    • Thanatos Savehn says:

      On the off chance that this is helpful, here’s what I’ve been getting:

      Despite using different browsers I regularly get a page that’s many hours old. Interestingly, the reload clicky is sometimes missing and F5 doesn’t even work. Closing, re-opening, and reloading (when available) brings up (eventually) the (I think) newest version of the blog. What this implies, other than the end times, is beyond me.

  6. awm says:

    Seems like the biggest issue would be that if heavy drinking kills people before they turn 90, a study of 90+ year olds would not be very informative, and the 90+ drinkers either lucked out or had something else going for them.
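    A quick illustration of that selection effect (all numbers invented): in this toy simulation drinking costs everyone years, yet the drinkers who reach 90 are a healthier-than-average subset, because the frailer drinkers were removed from the pool earlier.

```python
import random

random.seed(2)

def cohort(drinks, n=50_000):
    """Simulated lifespans: 'frailty' is a latent health variable and
    drinking always costs years. All numbers are invented."""
    people = []
    for _ in range(n):
        frailty = random.random()
        lifespan = 95 - 10 * frailty - 4 * drinks + random.gauss(0, 3)
        people.append((frailty, lifespan))
    return people

def survivors_mean_frailty(people, age=90):
    # average frailty among those who lived to `age` or beyond
    kept = [f for f, life in people if life >= age]
    return sum(kept) / len(kept)

# drinkers who make it to 90 are a healthier-than-average subset,
# even though drinking shortens every simulated life here by 4 years
print(survivors_mean_frailty(cohort(1)), survivors_mean_frailty(cohort(0)))
```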

  7. Steve C says:

    Thanks for the interesting article Daniel (and thanks Andrew for the blog, which I appreciate, not sure I have commented here before).

    It seems that most “rock solid” data on diet and alcohol, promoted by worthy institutions and august government bodies, is subject to the same problem. Mostly correlation and few randomised controlled trials. Randomised controlled trials are usually small scale and often contradict the correlation data accepted as definitive truth.

    Also, outside of genetics, it is not widely appreciated that there are genetic differences among different populations (previously known as races, a biologically useful although imprecise category) that also affect the body’s interaction with diet. As simple examples: a) lactose persistence, very common among descendants of the early settled farmers and not so much among descendants of groups that were more recently hunter-gatherers; b) the ability to break down alcohol, East Asians for example having a high percentage of non-drinkers of alcohol due to a couple of genetic differences.

    Separating cause and effect is very problematic.
