Subtle statistical issues to be debated on TV.

Posted on October 18, 2010 4:53 PM by Keith O’Rourke

There is a live debate that will be available this week for those who might be interested. The topic: Can early stopped trials result in misleading results of systematic reviews?

It’s sponsored by the Cochrane Collaboration and although the level of discussion is often not very technical, it does in my opinion provide a nice window into clinical research and as Tukey might put it “the real uncertainties involved”.

(As a disclaimer – I once assigned some reading from this group to my graduate students and they were embarrassed and annoyed at the awkward handling of even minor technical issues – but the statistical research community is not their target audience.)

I have a favourite in this debate, and a quick search on co-authors (not me) would likely tip that off to most members of this blog.

Here are the directions kindly supplied by Jonathan Sterne, who will be in the chair.

Dear SMG Members,

By means of follow up to previous advertisements; the Discussion Meeting: “Can early stopped trials result in misleading results of systematic reviews?” will be broadcast live online (please see attached for further meeting details).

October 21 2010, 07.30 – 09.00 AM Denver Time (MDT)
(US East Coast +2 hours, UK +7, Central European Time plus 8)

To watch the meeting live, simply visit: www.cochrane.tv at your equivalent local time.

Should you have any queries or comments, before or after the meeting please do not hesitate to get in touch.

Jonathan Sterne

———————-

Jonathan Sterne
School of Social and Community Medicine
University of Bristol

Abstract

Can early stopped trials result in misleading results
of systematic reviews?
Cochrane Colloquium, Keystone Colorado
October 21 2010, 07.30-09.00

Following the publication of empirical studies demonstrating differences between the results of trials that are stopped early and those that continue to their planned end of follow up, there has been intensive recent debate about whether the results of stopped early trials can mislead the clinician and consumer public. The Cochrane Bias Methods Group and Statistical Methods Group are delighted that two leading experts have agreed to present their views and lead a discussion on how review authors should address this issue.

Stopping early for benefit: is there a problem, and if so, what is it? Gordon Guyatt, McMaster University

Stopping at nothing? Some Dilemmas of Data Monitoring in Clinical Trials
Steven Goodman, Johns Hopkins University Schools of Medicine and Public Health

9 thoughts on “Subtle statistical issues to be debated on TV.”

wei on October 19, 2010 at 4:40 am said:

Is this am or pm?
thx
perceval on October 19, 2010 at 5:07 am said:

You say that

(As a disclaimer – I once assigned some reading from this group to my graduate students and they were embarrassed and annoyed at the awkward handling of even minor technical issues – but the statistical research community is not their target audience.)

Care to share some examples, or a link to a paper / rant / blog that might elucidate these points for non-statisticians who are not afraid of equations?

Thanks!
Jason Connor on October 19, 2010 at 5:36 am said:

This may have started due to a (very very poor) paper published in JAMA

Stopping Randomized Trials Early for Benefit and Estimation of Treatment Effects: Systematic Review and Meta-regression Analysis
Dirk Bassler, Matthias Briel, Victor M. Montori, Melanie Lane, Paul Glasziou, Qi Zhou, Diane Heels-Ansdell, Stephen D. Walter, Gordon H. Guyatt, and the STOPIT-2 Study Group
JAMA. 2010;303(12):1180-1187.

in which the authors compare estimates from trials that stopped early for success to trials that didn't stop early and (surprise!) find that trials that stop early have better point estimates.

Since trials only stop early for success when they show an effect, this is equivalent to saying that trials that show large effects have larger effects than trials that show small effects — a point missed by the authors. The authors did not restrict the latter group to fixed-sample trials. So if a trial COULD have stopped early and didn't (because perhaps due to natural variability the observed effect was smaller), it gets lumped into the smaller effect group.

This is like doing a bunch of the exact same study with 80% power then saying the 80% of trials that are significant show larger effects than the 20% of trials that didn't result in statistical significance. But the authors do not seem to understand natural variability.
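A rough simulation sketch of that analogy, not taken from the paper or the letters. It assumes a normally distributed effect estimate with standard error 1, a true effect of 2.8016 standard errors (which gives roughly 80% power at a one-sided 1.96 cutoff), and counts a trial as significant only when it favours the treatment. All names and numbers here are illustrative assumptions.

```python
import random
import statistics

# Assumed setup: every trial estimates the same true effect with the same
# standard error, so differences between groups come only from chance.
TRUE_EFFECT = 2.8016   # in standard-error units; about 80% power at z > 1.96
STANDARD_ERROR = 1.0
Z_CRITICAL = 1.96


def simulate_identical_trials(n_trials=20000, seed=1):
    """Split simulated effect estimates by whether the trial was significant."""
    rng = random.Random(seed)
    significant, not_significant = [], []
    for _ in range(n_trials):
        estimate = rng.gauss(TRUE_EFFECT, STANDARD_ERROR)
        if estimate / STANDARD_ERROR > Z_CRITICAL:
            significant.append(estimate)
        else:
            not_significant.append(estimate)
    return significant, not_significant


significant, not_significant = simulate_identical_trials()
share_significant = len(significant) / (len(significant) + len(not_significant))

print("share significant:", round(share_significant, 3))
print("true effect:", TRUE_EFFECT)
print("mean estimate, significant trials:", round(statistics.mean(significant), 3))
print("mean estimate, other trials:", round(statistics.mean(not_significant), 3))
```

Under these assumptions the significant trials average above the true effect and the rest average below it, even though every trial studies exactly the same thing.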

Instead the authors criticize studies that stop early for success, even ones with tight Type I error control — not realizing that stopping a trial as soon as we believe we know the answer can save R&D dollars and get information and/or new treatments to patients faster.

To me it seems like an ethical as well as a statistical problem the authors get wrong.

You'd never think JAMA would publish this paper — but that's just what they've done here.

The July 14, 2010 issue of JAMA

http://jama.ama-assn.org/content/vol304/issue2/in…

has four letters showing the error in the authors' thinking.
wei on October 19, 2010 at 6:22 am said:

tight type I error control does not mean there is no overestimation.
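A minimal sketch of this point, assuming a two-look design with one interim analysis at half the information and stopping for benefit only. The boundary values 2.797 and 1.977 are approximate O'Brien-Fleming-type constants for two equally spaced looks and should be treated as assumptions, as should the effect size and every name in the code.

```python
import math
import random
import statistics

Z_INTERIM = 2.797   # assumed interim boundary (approximate O'Brien-Fleming type)
Z_FINAL = 1.977     # assumed final boundary
SE_FINAL = 1.0      # standard error of the full-trial estimate
N_TRIALS = 100_000


def run_sequential_trials(true_effect, n_trials=N_TRIALS, seed=2):
    """Return the rejection rate and the estimates reported by early-stopped trials."""
    rng = random.Random(seed)
    se_half = SE_FINAL * math.sqrt(2.0)  # each half carries half the information
    early_estimates = []
    rejections = 0
    for _ in range(n_trials):
        first_half = rng.gauss(true_effect, se_half)
        if first_half / se_half > Z_INTERIM:
            # Stop early for benefit and report the interim estimate.
            early_estimates.append(first_half)
            rejections += 1
            continue
        second_half = rng.gauss(true_effect, se_half)
        full_estimate = (first_half + second_half) / 2.0
        if full_estimate / SE_FINAL > Z_FINAL:
            rejections += 1
    return rejections / n_trials, early_estimates


TRUE_EFFECT = 2.8016  # assumed true effect, in final standard-error units

null_rejection_rate, _ = run_sequential_trials(0.0)
_, early_estimates = run_sequential_trials(TRUE_EFFECT)

print("one-sided type I error under the null:", round(null_rejection_rate, 4))
print("share stopped early under the alternative:", len(early_estimates) / N_TRIALS)
print("true effect:", TRUE_EFFECT)
print("mean estimate among early-stopped trials:", round(statistics.mean(early_estimates), 3))
```

In this sketch the false positive rate stays near the nominal one-sided level, yet the trials that stop early report estimates well above the true effect, which is the distinction being drawn here.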
perceval on October 19, 2010 at 9:46 am said:

Thank you Jason – have bookmarked for the next time I'm near my uni login for JAMA. Very helpful.
K? O'Rourke on October 20, 2010 at 4:25 am said:

Not to spoil the debate for anyone, but one could see this as just a group of people who just don't get statistics (and put it in Zombies)

or an opportunity to discern what it is they don't realize well enough to get it

or even an opportunity to learn what they know that we don't that is distracting them from the statistical issues (and put it in Public Health)

or put it in both.

Wei: My inference is that it is PM (refreshments being served), but I can’t find that stated anywhere and I have checked with Jonathan.

And you are right, it’s not tight type I error control but more so adequate expenditure of experimental resources – power, or better still Sex Beauty and Power (which is posted somewhere on this blog).

K?
K? O'Rourke on October 20, 2010 at 4:47 am said:

No – AM Denver time!!!!

Thanks Wei – the joys of blogging.

K?
K? O'Rourke on October 22, 2010 at 4:35 am said:

Debate “post-mortem” – given partial reception of http://www.cochrane.tv

I did get most of Steve and Gordon’s presentations but little reception afterwards.

I have requested the video from Jonathan and will post it here if I can.

I thought it was worth listening to and I especially appreciated Steve’s boxing slide – bullying does often occur in clinical research methodology development and sometimes someone needs to stand up and risk taking “cheap shots”. The only possibly cheap shot I noticed in this debate was Gordon calling Steve a statistician – but that might have been intended as a compliment.

I believe Gordon’s concerns are legitimate and real – sequential methods won’t be properly implemented, early results will cause others to not do or not report further needed research (aka publication bias) and less promising results will take longer to get into print (early study effects). But I believe the solution is not censorship or penalties (down-weighting) but learning – fix this inappropriate and harmful processing/reporting of research efforts/results.

I believe Steve could have offered a helpful bridge to those without a Bayesian background by pointing out that the concern about sample size when a trial is statistically significant is about power (as Andrew said once here, you need to interpret statistical significance with one eye on power) and this requires some non-data sense (prior) of the size of the unknown effect. Now then why not make that explicit? Anyways I do believe the best way to get researchers to credibly use Bayesian techniques is via the closest frequency technique. (As an ad executive told me once, we film what the client thinks they want in the morning, and then we film what we think they should have in the afternoon. Knowing we have done both, most agree to look at the afternoon’s work first.)

And I don’t think the value (and manageable risks) of Bayesian shrinkage to improve the processing/reporting of research efforts/results got through to the audience – but then I missed most of that discussion.

K?
K? O'Rourke on November 2, 2010 at 5:23 am said:

For those who are interested – the full debate and questions/comments are online

http://justin.tv/cochranetv/b/272278382

Near the end of the questions/comments, Steve made some clear and convincing comments arguing for more of a role for Bayesian thinking in Evidence Based Medicine.

K?
