Charles Warne writes:
A colleague of mine is running logistic regression models and wants to know if there's any sort of test that can be used to assess whether the coefficient of a key predictor in one model is significantly different from that same predictor's coefficient in another model that adjusts for two other variables (which are significantly related to the outcome). Essentially she's wanting to statistically test for confounding, and while my initial advice was that a single statistical test isn't really appropriate, since confounding is something we make an educated judgment about given a range of factors, she is still keen to see if this can be done.

I read your 2006 article with Hal Stern, "The difference between 'significant' and 'not significant' is not itself statistically significant," which included the example (p. 328) where evidence for a difference between the results of two independent studies was assessed by summing the squares of the standard errors of each and taking the square root to give the standard error of the difference (se = 14). My question is whether this approach can be applied to my colleague's situation, given that both logistic regression models are based on the same sample of individuals and are therefore not independent. Is there an adjustment that can be used to produce more accurate standard errors for non-independent samples, or should I not be applying this approach at all? Is there a better way this problem could be tackled?
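[For reference, the computation described there is se_diff = sqrt(se1^2 + se2^2) for two independent estimates; for example, two estimates each with standard error 10 give sqrt(10^2 + 10^2) ≈ 14, matching the number quoted above.]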
My reply: No, you wouldn’t want to take the two estimates and treat them as if they were independent. My real question, though, is why your colleague wants to do this in the first place. It’s not at all clear what question such an analysis would be answering.
P.S. Warne adds:
I completely agree with your question as to why my colleague would want to do this in the first place. Her rationale is that she wants to show that, after controlling for the newly added variable, there is a significant change in the coefficient of the other variable. I.e., there are two models:
model 1: y = a + bx1
model 2: y = a + bx1 + cx2
…and she wants to show there is a significant change in the coefficient 'b' between the two models.
I said that the multivariate model 2 speaks for itself, and that a practically/clinically important change in 'b' from model 1 to model 2 (what counts as important depends on the substantive area of interest), combined with evidence of a significant association between x2 and the outcome, shows that x2 is confounding the association between x1 and y.
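[For comparison, in the linear-regression version of this setup the change in b has a closed form: fitting the two models above to the same data gives b[model 1] = b[model 2] + c*d, where d is the slope from regressing x2 on x1, the usual omitted-variable algebra. For logistic regression this identity holds only approximately, since odds-ratio coefficients can shift when covariates are added even in the absence of confounding (non-collapsibility), which is one more reason a formal test of the change in b is tricky to interpret.]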
I’m not sure what to say. My first thought is that, yes, model 2 has it all, and there’s not much to be learned by comparing the coefficients in the two models. But part of me does think that something can be learned from the comparison.
As to the question of how to get a standard error: b.hat[model2] – b.hat[model1] is a linear function of the data, y, so it shouldn't be too hard to work out a formula for its standard error. Or you could do it by bootstrapping.
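To make the bootstrap option concrete, here is a minimal sketch in Python, assuming a data frame with an outcome y and predictors x1 and x2 (placeholder names matching the models above) and using statsmodels for the logistic fits; the simulated data are purely illustrative. Resampling individuals and refitting both models on each resample automatically accounts for the dependence between the two estimates, which is exactly what the sum-of-squared-standard-errors formula for independent studies ignores.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def coef_change(data):
    """Change in the coefficient on x1 when x2 is added:
    b.hat[model2] - b.hat[model1], both fit on the same rows."""
    b1 = smf.logit("y ~ x1", data=data).fit(disp=0).params["x1"]
    b2 = smf.logit("y ~ x1 + x2", data=data).fit(disp=0).params["x1"]
    return b2 - b1

rng = np.random.default_rng(2024)

# Illustrative fake data so the sketch runs end to end.
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                # x2 correlated with x1
p = 1 / (1 + np.exp(-(-0.5 + 0.8 * x1 + 0.6 * x2)))
df = pd.DataFrame({"y": rng.binomial(1, p), "x1": x1, "x2": x2})

# Bootstrap: resample individuals (rows), refit both models on each
# resample, and use the spread of the replicates as the standard error.
boot = np.array([coef_change(df.iloc[rng.integers(0, n, n)])
                 for _ in range(1000)])

print("estimated change in b:", coef_change(df))
print("bootstrap se:", boot.std(ddof=1))
print("95% interval:", np.percentile(boot, [2.5, 97.5]))
```

The resulting interval also puts the emphasis where Warne's summary had it: the question is whether the change in b is large enough to matter substantively, not just whether it is distinguishable from zero.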