Question 13 of my final exam for Design and Analysis of Sample Surveys

13. A survey of American adults is conducted that includes too many women and not enough men in the sample. In the resulting weighting, each female respondent is given a weight of 1 and each male respondent is given a weight of 1.5. The sample includes 600 women and 380 men, of whom 400 women and 100 men respond Yes to a particular question of interest. Give an estimate and standard error for the proportion of American adults who would answer Yes to this question if asked.
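For readers who want to check their own arithmetic, here is a minimal sketch of one way to approach it (my own illustration, not the posted solution): treat the weights as defining the population shares of women and men, combine the two group proportions accordingly, and compute a stratified standard error.

```python
# A sketch, not the official solution: weighted (post-stratified) estimate of
# the Yes proportion and an approximate standard error, treating women and men
# as two independent strata whose population shares are implied by the weights.
import numpy as np

# Counts from the question
n_women, yes_women = 600, 400
n_men, yes_men = 380, 100
w_women, w_men = 1.0, 1.5

# Population shares implied by the weights
total_weight = w_women * n_women + w_men * n_men      # 1170
share_women = w_women * n_women / total_weight        # ~0.513
share_men = w_men * n_men / total_weight              # ~0.487

# Within-group sample proportions
p_women = yes_women / n_women                         # 2/3
p_men = yes_men / n_men                               # ~0.263

# Weighted estimate and stratified standard error
p_hat = share_women * p_women + share_men * p_men     # ~0.47
var_hat = (share_women**2 * p_women * (1 - p_women) / n_women
           + share_men**2 * p_men * (1 - p_men) / n_men)
se_hat = np.sqrt(var_hat)                             # ~0.015

print(f"estimate = {p_hat:.3f}, standard error = {se_hat:.3f}")
```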

Solution to question 12

From yesterday:

12. A researcher fits a regression model predicting some political behavior given predictors for demographics and several measures of economic ideology. The coefficients for the ideology measures are not statistically significant, and the researcher creates a new measure, adding up the ideology questions and creating a common score, and then fits a new regression including the new score and removing the individual ideology questions from the model. Which of the following statements are basically true? (Indicate all that apply.)

(a) If the original ideology measures are close to 100% correlated with each other, there will be essentially no benefit from this approach.

(b) If the original ideology measures are not on a common scale, they should be rescaled before adding them up.

(c) If the original result was not statistically significant, the researcher should stop, so as to avoid data dredging and selection bias.

(d) Another reasonable option would be to perform a factor analysis on the ideology measures and create a common score in that way.

Solution: (b) and (d). (a) is wrong because if the measures are highly correlated, the regression coefficients in the original model will be very noisy. (c) is wrong because the average of a bunch of measures can be a good predictor even if the individual measures are noisy.
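To see the logic behind both of those explanations in action, here is a small simulation sketch (my own illustration, not part of the exam): three nearly interchangeable measures of a latent ideology give very noisy individual coefficients, while their simple sum is a precisely estimated predictor.

```python
# Simulation sketch: highly correlated measures -> noisy separate coefficients,
# but a precise coefficient on the summed (common) score.
import numpy as np

rng = np.random.default_rng(0)
n = 500
ideology = rng.normal(size=n)                           # latent ideology
x = ideology[:, None] + 0.1 * rng.normal(size=(n, 3))   # three noisy, highly correlated measures
y = 0.5 * ideology + rng.normal(size=n)                 # behavior driven by the latent ideology

def ols(X, y):
    """OLS coefficient estimates and standard errors (intercept dropped from the output)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta[1:], se[1:]

b_sep, se_sep = ols(x, y)                               # separate measures: large SEs
b_sum, se_sum = ols(x.sum(axis=1, keepdims=True), y)    # common score: small SE

print("separate coefficients:", np.round(b_sep, 2), " SEs:", np.round(se_sep, 2))
print("summed-score coefficient:", np.round(b_sum, 3), " SE:", np.round(se_sum, 3))
```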

11 thoughts on “Question 13 of my final exam for Design and Analysis of Sample Surveys”

  1. I was curious about your solution to (c), because I agree that this can be a sensible thing to do, so the researcher shouldn’t stop. On the other hand, making this step depend on the significance results will in fact introduce a bias (I guess), and it would be better if this could be decided without using the y-variable, based on the meaning and pattern of the x-variables alone.

      • I think that’s the crucial point: the average has to be meaningful (though it often will be when you have highly correlated predictors). In psychology people often want to run a MANOVA when the correlated measures are outcomes. I generally argue that the simple average (on a common scale) is usually more robust and more theoretically plausible than the ‘optimal’ combination from MANOVA (which is entirely atheoretical, unstable across samples, and capitalizes on chance patterns in the sample). I’m thinking of Dawes’ improper linear models here.

        What I’m less sure of is when it’s worth using factor analysis. My gut feeling is that for most sample sizes in psychology, factor analysis is overkill and not likely to add much value (though I think it is a reasonable approach).

  2. Another useful way of thinking about the effect of a sum of variables is that it is equivalent to constraining the effects of the individual variables to be the same (the equivalence is written out below the comments).

  3. Pingback: Question 14 of my final exam for Design and Analysis of Sample Surveys « Statistical Modeling, Causal Inference, and Social Science

  4. My sense is the hypothetical researcher in the question started testing at the wrong end.

    Using sequentially partitioned hypothesis testing, he could have adopted the following plan to control the family-wise Type I error rate.

    First, test the null of no effect using the average measure. If it is not rejected, stop.

    Second, if the null is rejected, proceed to test the individual components using some appropriate multiple comparison procedure.

    BTW, did the researcher do an F test on the separate ideology measures, or did he only look at the individual tests? What is the relation among using the F test, replacing the inputs with their simple average, and replacing them with a principal component? It would appear the difference is how the data are weighted before inference.
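The equivalence mentioned in comment 2, written out (notation mine, for three predictors): regressing on the summed score is just the full regression with the individual coefficients constrained to be equal.

```latex
\[
y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i
\quad\xrightarrow{\ \beta_1 = \beta_2 = \beta_3 = \beta\ }\quad
y_i = \alpha + \beta \, (x_{1i} + x_{2i} + x_{3i}) + \varepsilon_i .
\]
```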
