Robust logistic regression

Corey Yanofsky writes:

In your work, you’ve robustificated logistic regression by having the logit function saturate at, e.g., 0.01 and 0.99, instead of 0 and 1. Do you have any thoughts on a sensible setting for the saturation values? My intuition suggests that it has something to do with proportion of outliers expected in the data (assuming a reasonable model fit).

It would be desirable to have them fit in the model, but my intuition is that integrability of the posterior distribution might become an issue.

My reply: it should be no problem to put these saturation values in the model; I bet it would work fine in Stan if you give them uniform(0, 0.1) priors or something like that. Or you could just fit the robit model.
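To make the idea concrete, here is a minimal Python sketch of the saturated-logit likelihood being discussed. The function names and the fixed default eps = 0.01 are illustrative choices, not code from the post; the point is just that squeezing fitted probabilities into [eps, 1 − eps] bounds how much any single outlying observation can penalize the likelihood:

```python
import math

def inv_logit(x):
    # Standard logistic function.
    return 1.0 / (1.0 + math.exp(-x))

def robust_prob(eta, eps=0.01):
    # Saturating link: probabilities live in [eps, 1 - eps] instead of (0, 1),
    # so one gross outlier cannot drive a fitted probability to 0 or 1.
    return eps + (1.0 - 2.0 * eps) * inv_logit(eta)

def log_lik(y, eta, eps=0.01):
    # Bernoulli log-likelihood under the saturating link.
    total = 0.0
    for yi, ei in zip(y, eta):
        p = robust_prob(ei, eps)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total
```

For example, an observation with y = 1 but a large negative linear predictor eta = −10 contributes roughly log(0.01) ≈ −4.6 under the saturated link, rather than roughly −10 under ordinary logistic regression; in a Bayesian fit one could instead treat eps as a parameter with, say, a uniform(0, 0.1) prior, as suggested above.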

And this reminds me . . . I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm or bayesglm in R. This suggests to me that we should have some precompiled regression models in Stan; then we could run all those regressions that way and feel free to use whatever priors we want.

9 Comments

  1. Fernando says:

    “robustificated”

    My heart bleeds for the English language.

  2. Thinkling says:

    Enough with the neologery!

  3. Mitzi Morris says:

    “I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm”
    really? I’ve been told that’s not happening until Stan 2.0.

    • Marcus says:

The new optimization work won’t be released until Stan 2.0, but it is already in the development branch of the git repository.

  4. Peter Meilstrup says:

    Wichmann and Hill’s two papers on fitting psychometric data (they come up if you google “wichmann hill psychometric”) explore this question in some detail.

The R package “psyphy” has some prebuilt saturating logit and probit link functions that plug into R’s glm function.
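For readers without R at hand, the kind of saturating probit link used in psychometric-function fitting can be sketched in Python. This is only an analogue of what such links do, not psyphy’s actual API; the lapse-rate parameter lam and the function names here are illustrative:

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def saturating_probit(eta, lam=0.01):
    # Probit link with a lapse rate lam: response probabilities saturate
    # at lam and 1 - lam rather than 0 and 1, mirroring the saturating
    # links discussed for robust logistic regression above.
    return lam + (1.0 - 2.0 * lam) * norm_cdf(eta)
```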