2. Which of the following are useful goals in a pilot study? (Indicate all that apply.)
(a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset.
(b) You can see if you find statistical significance in a pre-chosen comparison of interest.
(c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest.
(d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation, which will be useful in a power analysis to help you decide how large your full study needs to be.
(e) You can talk with survey respondents and get a sense of how they perceived your questions.
(f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording.
(g) You can check if your sample is approximately representative of your population.
Solution to question 1
1. Suppose that, in a survey of 1000 people in a state, 400 say they voted in a recent primary election. Actually, though, the voter turnout was only 30%. Give an estimate of the probability that a nonvoter will falsely state that he or she voted. (Assume that all voters honestly report that they voted.)
Solution: Draw the probability tree, and you find that the proportion of people who say they voted is .3+.7p, where p is the probability that a nonvoter falsely says he or she voted. Solving .3+.7p=.4 gives p=(.4-.3)/.7=.14, or 14%. I was also going to ask for the standard error (which you’d obtain by starting with the standard error for the “.4” and propagating that through) but I decided to keep it simple. As it was, only about half the students got this question right. This is not a knock on the kids (I just didn’t teach this material well); I’m just letting you know to give a sense that this isn’t such an easy problem.
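For readers who want to check the arithmetic, here is a minimal sketch of the calculation in Python, including the standard error that the exam question left out. It rests on assumptions not stated in the problem: the 1000 respondents are treated as a simple random sample with no nonresponse, the 30% turnout is taken as known exactly, and the standard error of the “.4” is the usual binomial one. The variable names are just illustrative.

```python
import math

n = 1000          # survey respondents
said_voted = 400  # respondents who reported voting
turnout = 0.30    # true turnout, assumed known without error

# Observed proportion who say they voted: y = turnout + (1 - turnout) * p
y = said_voted / n

# Solve for p, the probability that a nonvoter falsely reports voting
p = (y - turnout) / (1 - turnout)

# Binomial standard error of y (assumes simple random sampling)
se_y = math.sqrt(y * (1 - y) / n)

# p is a linear function of y, so its standard error is se_y scaled by 1 / (1 - turnout)
se_p = se_y / (1 - turnout)

print(f"p = {p:.3f}, se = {se_p:.3f}")  # p = 0.143, se = 0.022
```

Under these assumptions the estimate is about 14% with a standard error of roughly 2 percentage points. The propagation step is simple here only because p is linear in the observed proportion.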
P.S. As some commenters note, Problem 1 isn’t so realistic. Commenter awm points out that “for the most part people aren’t lying and that the sorts of people who participate in surveys about elections are disproportionately the sort of people who vote.” My problem would’ve been cleaner if I’d also said to assume there was no nonresponse, and if I’d chosen a better example!