Dale Lehman writes:
Let me be the first (or not) to ask you to blog about this just released NEJM study. Here are the study, supplementary appendix, and data sharing statement, and I’ve also included the editorial statement. The study is receiving wide media attention and is the continuation of a long-term trial that was reported on at a 10 year median follow-up. The current publication is for a 15 year median follow-up.
The overall picture is consistent with many other studies – prostate cancer is generally slow to develop and kills very few men. Intervention can have serious side effects, and there is little evidence that it improves long-term survival, except (perhaps) in particular subgroups. Treatment and diagnosis have undergone considerable change in the past decade. The issue is of considerable interest to me – for statistical reasons as well as personal ones (since I have a prostate cancer diagnosis). Here are my concerns in brief:
This study once again brings up the issue of intention-to-treat vs. actual treatment. The groups were randomized between active management (545 men), prostatectomy (533 men), and radiotherapy (545 men). The analysis was based on these groups, with deaths in the 3 groups of 17, 12, and 16 respectively. Figure 1 in the paper reveals that within the first year, 628 men were actually in the active surveillance group, and 488 in each of the other 2 groups: this is not surprising, since many people resist the invasive treatment and possible side effects. I would consider the groups people chose within the first year, rather than their random assignments, as the true effective group sizes. However, the paper does not provide data on the actual deaths among the people who switched between the random assignment and actual treatment within the first year. So, it is not possible to determine the actual death rates in the 3 groups.
The paper reports death rates of 3.1%, 2.2%, and 2.9% in the 3 groups. If we just change the denominators to the actual size of the 3 groups in the first year, the 3 death rates are 2.7%, 2.5%, and 3.3%, making intervention look even worse. If we instead assume that half of the deaths in the randomized prostatectomy and radiotherapy groups were among those who refused the initial treatment and opted for active surveillance, then the 3 death rates would be 4.9%, 1.2%, and 1.6% respectively, making active surveillance look rather risky. Of course, I think allocating half of the deaths in those groups in this manner is a fairly extreme assumption. But given the small numbers of deaths involved, the deviations from random assignment to actual treatment could matter.
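The arithmetic behind these three scenarios can be sketched in a few lines. All counts come from the paper's Figure 1 and the text above; the half-reallocation scenario is Lehman's hypothetical, not data from the study.

```python
# Counts from the paper: randomized group sizes, deaths, and actual
# first-year group sizes after switching (Figure 1).
randomized = {"surveillance": 545, "prostatectomy": 533, "radiotherapy": 545}
deaths     = {"surveillance": 17,  "prostatectomy": 12,  "radiotherapy": 16}
actual     = {"surveillance": 628, "prostatectomy": 488, "radiotherapy": 488}

# Intention-to-treat rates, roughly the reported 3.1%, 2.2%, 2.9%.
itt = {g: deaths[g] / randomized[g] for g in deaths}

# Same deaths over actual first-year denominators: 2.7%, 2.5%, 3.3%.
per_actual = {g: deaths[g] / actual[g] for g in deaths}

# Extreme hypothetical: half the deaths in each treatment arm were among
# men who switched to surveillance, giving 4.9%, 1.2%, 1.6%.
moved = deaths["prostatectomy"] // 2 + deaths["radiotherapy"] // 2
realloc = {
    "surveillance":  (deaths["surveillance"] + moved) / actual["surveillance"],
    "prostatectomy": (deaths["prostatectomy"] // 2) / actual["prostatectomy"],
    "radiotherapy":  (deaths["radiotherapy"] // 2) / actual["radiotherapy"],
}

for label, rates in [("intention-to-treat", itt),
                     ("actual denominators", per_actual),
                     ("half reallocated", realloc)]:
    print(label, {g: f"{100 * r:.1f}%" for g, r in rates.items()})
```

The point of the exercise is just that with only 45 deaths, plausible assumptions about the 100+ switchers swing the comparison in either direction.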
The authors have the data to conduct both an intention-to-treat and an actual-treatment-received comparison, but did not report the latter (and did not indicate that they did such an analysis). If they had reported details on the 45 total deaths, I could do that analysis myself, but they don’t provide that data. In fact, the data sharing statement (attached) is quite remarkable – will the data be provided? “No.” That really irks me. I don’t see that there is really any concern about privacy. Withholding the data serves to bolster the careers of the researchers and the prestige of the journal, but it doesn’t have to be that way. If the journal released the data publicly and it was carefully documented, both the authors and the journal could receive widespread recognition for their work. Instead, they (and much of the establishment) choose to rely on their analysis to bolster their reputations. But these days the analysis is the easy part; it is the data curation and quality that is hard. Once again, the incentives and rewards are at odds with what makes sense.
Another question that is not analyzed, but could be if the data were provided, is whether the time of randomization matters. The article (and the editorial) cites improved monitoring, as MRI imaging is increasingly used along with biopsies. Given this evolution, the relative performance of the 3 groups might be changing over time – but no analysis is provided based on the year in which a person entered the study.
One other thing that you’ve blogged about often. For me, the most interesting figure is Figure S1, which actually shows the 45 deaths for the 3 groups. Looking at it, I see a tendency for the deaths to occur earlier with active surveillance than with either surgery or radiation. Of course, the p-values suggest that this might just be random noise. Indeed it might be. But, as we often say, absence of evidence is not evidence of absence. The paper appears to overstate the findings, as does all the media reporting. Statements such as “Radical treatment resulted in a lower risk of disease progression than active monitoring but did not lower prostate cancer mortality” (page 10 of the article) amount to a finding of no effect rather than a failure to find a significant effect. Null hypothesis significance testing strikes again.
Yeah, they should share the goddam data, which was collected using tons of taxpayer dollars:
Regarding the intent-to-treat thing: Yeah, this has come up before, and I’m not sure what to do; I just have the impression that our current standard approaches here have serious problems.
My short answer is that some modeling should be done. Yes, the resulting inferences will depend on the model, but that’s just the way things are; it’s the actual state of our knowledge. But that’s just cheap talk from me. I don’t have a model on offer here, I just think that’s the way to go: construct a probabilistic model for the joint distribution of all the variables (which treatment the patient chooses, along with the health outcome) conditional on patient characteristics, and go from there.
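To see why a naive as-treated comparison can mislead without such a model, here is a toy simulation (all numbers invented, no connection to the actual study): treatment choice and survival both depend on a patient characteristic, so the as-treated death rates differ even though treatment has no effect at all, while conditioning on the characteristic removes the spurious difference.

```python
import random

random.seed(1)

n = 200_000
tally = {}  # (treatment, frail) -> [deaths, total]

for _ in range(n):
    frail = random.random() < 0.3  # hypothetical patient characteristic
    # Frail patients are assumed more likely to refuse surgery.
    tx = "surgery" if random.random() < (0.3 if frail else 0.7) else "surveillance"
    # Treatment has no true effect here; only frailty drives mortality.
    died = random.random() < (0.06 if frail else 0.02)
    rec = tally.setdefault((tx, frail), [0, 0])
    rec[0] += died
    rec[1] += 1

def rate(tx, frail=None):
    """Death rate for a treatment, optionally within one frailty stratum."""
    groups = [v for k, v in tally.items()
              if k[0] == tx and (frail is None or k[1] == frail)]
    return sum(g[0] for g in groups) / sum(g[1] for g in groups)

# Naive as-treated comparison is confounded by who chooses what:
print("as-treated:", rate("surveillance"), "vs", rate("surgery"))
# Within a stratum, the rates are (nearly) equal, as they should be:
print("non-frail: ", rate("surveillance", False), "vs", rate("surgery", False))
```

This is only the simplest possible version of the idea; a real model would also have to handle characteristics that are unobserved, which is where the inferences start to depend on assumptions.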
I agree with Lehman that the intent-to-treat analysis is not the main goal here. It’s fine to do that analysis but it’s not good to stop there, and it’s really not good to hide information that could be used to go further.
As Lehman puts it:
Intent-to-treat analysis makes sense from a public health point of view if it closely reflects the actual medical practice. But from a patient point of view of making a decision regarding treatment, the actual treatment is more meaningful than intent-to-treat. So, when the two estimates differ considerably, it seems to me that they should both be reported – or, at least, the data should be provided that would allow both analyses to be done.
Also, the topic is relevant to me cos all of a sudden I need to go to the bathroom all the time. My doctor says my PSA is ok so I shouldn’t worry about cancer, but it’s annoying!
I told this to Lehman, who responded:
Unfortunately, the study in question makes PSA testing even less worthwhile than previously thought. (I get mine checked regularly and that is my only current monitoring, but it is not looking like that is worth much – or should I say, there is no statistically significant (p > .05) evidence that it means anything?)
Damn.