I love it when I can respond to a question with a single link

Shira writes:

This came up from trying to help a colleague of mine at Human Rights Watch.

He has several completely observed variables X, and a variable Y with 29% of its values missing. He wants a histogram (and other descriptive statistics) of a “filled in” Y.

He can regress Y on X, and impute the missing Y’s by drawing from the posterior predictive distribution given their fully observed X values. If he wants a histogram of the “filled in” Y, what would you recommend to him? Is there a good way to display this, taking the uncertainty in the imputed Y’s into account?
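
The regress-and-impute step described here could be sketched roughly as follows. This is only an illustration, not the method of any particular paper: it assumes a single predictor, a normal linear model with a flat prior, and synthetic data, and the names `draw_imputations`, `x_obs`, `y_obs`, and `x_mis` are hypothetical.

```python
import random

def draw_imputations(x_obs, y_obs, x_mis, rng):
    """One posterior predictive draw of the missing y values.

    Assumes y = a + b * (x - x_bar) + normal noise with a flat prior,
    so sigma^2, a, and b can be drawn in closed form.
    """
    n = len(x_obs)
    x_bar = sum(x_obs) / n
    y_bar = sum(y_obs) / n
    sxx = sum((x - x_bar) ** 2 for x in x_obs)
    b_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_obs, y_obs)) / sxx
    rss = sum((y - y_bar - b_hat * (x - x_bar)) ** 2
              for x, y in zip(x_obs, y_obs))
    df = n - 2
    # sigma^2 | data is rss divided by a chi-square draw with df degrees
    # of freedom (a chi-square is a gamma with shape df/2 and scale 2).
    sigma = (rss / rng.gammavariate(df / 2.0, 2.0)) ** 0.5
    # With x centered, intercept and slope are independent given sigma.
    a = rng.gauss(y_bar, sigma / n ** 0.5)
    b = rng.gauss(b_hat, sigma / sxx ** 0.5)
    # Predictive draw: regression line plus fresh noise for each missing y.
    return [rng.gauss(a + b * (x - x_bar), sigma) for x in x_mis]

# Synthetic stand-in data: 100 cases, 29 of them with y missing.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
missing = set(rng.sample(range(100), 29))
x_obs = [x[i] for i in range(100) if i not in missing]
y_obs = [y[i] for i in range(100) if i not in missing]
x_mis = [x[i] for i in sorted(missing)]

# Twenty "filled in" versions of y, each with its own imputation draw.
completed = [y_obs + draw_imputations(x_obs, y_obs, x_mis, rng)
             for _ in range(20)]
```

Each element of `completed` is one filled-in Y that can be histogrammed on its own; how much the histograms differ from one draw to the next gives a sense of the uncertainty coming from the imputed values.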

My reply:

http://www.stat.columbia.edu/~gelman/research/published/biom_031010.pdf

6 thoughts on “I love it when I can respond to a question with a single link”

  1. It’s not clear to me what the author really wants. Is it:

    1) A histogram which represents approximately the histogram of the full set of Y values?
    2) A way to model/quantify/display the uncertainty in the histogram described in (1) given that some of the data is imputed?
    3) Something else?

    Part 2 is a very Bayesian question, as it asks essentially for a (Bayesian probability) distribution over frequency distributions.
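
One rough way to picture a "distribution over frequency distributions" is to bin several completed datasets with the same bin edges and summarize how each bin's count varies across them. The sketch below assumes the completed datasets already exist (the tiny hard-coded ones are made up for illustration), and the helper names `bin_counts`, `quantile`, and `bin_intervals` are hypothetical.

```python
from bisect import bisect_right

def bin_counts(values, edges):
    """Histogram counts for bins [e0, e1), [e1, e2), ..., [e_k-1, e_k]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the plotted range
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

def quantile(sorted_vals, q):
    """Nearest-rank quantile of an already sorted list."""
    return sorted_vals[int(round(q * (len(sorted_vals) - 1)))]

def bin_intervals(completed, edges, lo=0.025, hi=0.975):
    """Per-bin (low, median, high) counts across completed datasets."""
    per_draw = [bin_counts(values, edges) for values in completed]
    summary = []
    for b in range(len(edges) - 1):
        col = sorted(draw[b] for draw in per_draw)
        summary.append((quantile(col, lo), quantile(col, 0.5),
                        quantile(col, hi)))
    return summary

# Made-up example: three completed versions of a four-value variable.
completed = [[1, 2, 2, 3], [1, 2, 3, 3], [1, 1, 2, 3]]
edges = [0.5, 1.5, 2.5, 3.5]
intervals = bin_intervals(completed, edges)
```

The median could be drawn as the bar height and the low and high values as an error bar on each bin. The spread here reflects only the variation between imputation draws, so it is only as trustworthy as the imputation model behind them.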
