## Question 2 of my final exam for Design and Analysis of Sample Surveys

2. Which of the following are useful goals in a pilot study? (Indicate all that apply.)

(a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset.

(b) You can see if you find statistical significance in a pre-chosen comparison of interest.

(c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest.

(d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a power analysis to help you decide how large your full study needs to be.

(e) You can talk with survey respondents and get a sense of how they perceived your questions.

(f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording.

(g) You can check if your sample is approximately representative of your population.

Solution to question 1

From yesterday:

1. Suppose that, in a survey of 1000 people in a state, 400 say they voted in a recent primary election. Actually, though, the voter turnout was only 30%. Give an estimate of the probability that a nonvoter will falsely state that he or she voted. (Assume that all voters honestly report that they voted.)

Solution: Draw the probability tree; you get that the proportion of people who say they voted is .3 + .7p. Solve .3 + .7p = .4, and you get p = (.4 − .3)/.7 ≈ .14, or 14%. I was also going to ask for the standard error (which you’d obtain by starting with the standard error for the “.4” and propagating it through), but I decided to keep it simple. As it was, only about half the students got this question right. This is not a knock on the kids, as I just didn’t teach this material well; I’m just letting you know to give a sense that this isn’t such an easy problem.
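The arithmetic above, including the standard-error propagation I alluded to, can be sketched in a few lines. (The propagation step is my own filling-in of that hint, using the fact that p is a linear function of the observed proportion.)

```python
import math

n = 1000          # survey size
q_hat = 0.40      # observed proportion claiming to have voted
turnout = 0.30    # true turnout

# Claimed-voted proportion = turnout + (1 - turnout) * p, so solve for p:
p_hat = (q_hat - turnout) / (1 - turnout)
print(f"p_hat = {p_hat:.3f}")  # p_hat = 0.143, i.e. about 14%

# Standard error: start with the SE of the observed .4 ...
se_q = math.sqrt(q_hat * (1 - q_hat) / n)
# ... and propagate it through the linear map p = (q - .3)/.7
se_p = se_q / (1 - turnout)
print(f"se_p  = {se_p:.3f}")   # se_p  = 0.022
```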

P.S. As some commenters note, Problem 1 isn’t so realistic. Commenter awm points out that “for the most part people aren’t lying and that the sorts of people who participate in surveys about elections are disproportionately the sort of people who vote.” My problem would’ve been cleaner if I’d also said to assume there was no nonresponse, and if I’d chosen a better example!

1. Anonymous says:

Re: solution to question 1. Am I the only one who considered it obvious that the voter turnout referred to the population, while there was no information about the true voter turnout in the survey sample?

• Andrew says:

Anon:

It is possible to go and check whether particular people voted. But it takes effort and is rarely done.

2. zbicyclist says:

Mostly no:
(a), (b) no; effect size maybe. (g) yes, but it’s a low-power test that will only catch very large discrepancies (e.g., all your completed interviews are among people who don’t work during the day).

Mostly yes:
(c), (d), (f) yes. (e) yes, but a pretest is the better place for this. (I used to read the questions to my mother over the phone as a pre-pretest. Just mentioning that because it’s Mother’s Day weekend. It wasn’t Mom’s favorite thing to do, but she was family.)

• K? O'Rourke says:

If Andrew had used words like should, reasonable to, or justifiable to, I would agree with you.

But he asked “can”, and these things you “can” do, no matter how silly or counterproductive.
(“can” being taken as meaning could)

Hopefully, the context of the course clarified what “can” meant.

Tests are dreadfully hard to construct; in quantitative courses, questions usually involve “must” questions (must 2 + 2 = 4?).

Shouldn’t half the students getting a question right mean it’s maximally informative about student ability?

After all, a question everyone gets wrong, or one that everyone gets right, gives you no ability to distinguish between the ability of different students.

There’s probably a ton of reasons why I’m wrong, but it seemed sensible to me :)
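The intuition in this sub-thread checks out with a tiny calculation: for a right/wrong item, the variance p(1 − p) of the responses (a crude stand-in for how well the item separates students) is largest when half the class gets it right. A minimal sketch:

```python
# Variance of a right/wrong item as a function of the proportion p who get it right.
# p(1 - p) is a crude proxy for how well the item distinguishes students.
variances = {p / 10: (p / 10) * (1 - p / 10) for p in range(11)}

best_p = max(variances, key=variances.get)
print(best_p)               # 0.5
print(variances[best_p])    # 0.25
print(variances[0.0], variances[1.0])  # 0.0 0.0 -- all-right or all-wrong tells you nothing
```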

• rdm says:

I tried to convince my students at one point that a midterm with mean=50 and std=15 was much more useful in terms of my discovering what they had learnt and understood, than one with mean=80, std=5. Of course, they wanted the latter. Even when I told them I would be applying a curve before giving letter grades.

4. Mark says:

(d), (e), and (f) are definitely useful goals of a pilot. (b) is fine, too, as you might get lucky with a larger effect size than anticipated, but I wouldn’t call it a “goal”.

(a), (c), and (g) are certainly not useful goals.

• Mark says:

I would also qualify that (d) is only really useful for standard deviation, but I’d never use an effect size from a small pilot study to power a larger study. See (c).

5. DK says:

I am not sure how “useful” and “pilot study” are defined but, broadly speaking, I would answer “all of the above”.

6. Jonathan (a different one) says:

I would say all but (g) have some potential use. (g) is out because the representativeness of this sample that you won’t be using in the full experiment is inherently uninteresting. (a) and (c) are possibilities for things you might use, and if you want to call that a “goal” I guess you could.

7. Scott says:

(a) seems a bad idea if you take it to mean excluding variables from your confirmatory analysis just because they’re not significant in a pilot study. Pilot studies are usually low-powered, so ‘not significant’ could still include large effect sizes. But something cool might pop out of your pilot study that you decide is worth focusing the rest of your research efforts on. Can’t really call it a goal, though.

(b) – you _can_ do this, but if that’s your _goal_ it’s not really a pilot study. Again, the low power means you can’t conclude much from a non-significant result.

(c) – ditto, really

(d) is often one of the main reasons for doing a pilot study: looking at the confidence intervals, you can make pessimistic estimates of effect size and standard deviation to plan the main study.

(e) & (f) are the other main reasons.

(g) is important insofar as it’s a facet of (f): if your sample’s less representative than you’d expect by chance, you should check the sampling procedure.
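Scott’s point about (d) is the standard route from a pilot to a sample-size calculation. A sketch using the normal-approximation formula for a two-group comparison of means, with made-up pilot numbers (the pessimistic shrink/inflate step at the end is just an illustration):

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sample mean comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)            # power quantile
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Hypothetical pilot estimates: effect 0.5, standard deviation 1.0.
print(n_per_group(delta=0.5, sigma=1.0))   # 63 per group

# Being pessimistic (shrink the effect, inflate the SD) more than doubles n:
print(n_per_group(delta=0.4, sigma=1.2))   # 142 per group
```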


9. Kaiser says:

Instead of a probability tree, try using a probability table. You have four cases based on the (Voted, Claimed Voted) dimensions. All four numbers must add to 1000. Fill in the numbers, and fish out that the (Not Voted, Claimed Voted) cell is 100 and the marginal (Not Voted) is 700. This uses P(a|b) = P(a and b) / P(b) instead of Bayes’ rule, but more students will grasp it.
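Kaiser’s counting argument can be written out directly; a minimal sketch:

```python
total = 1000
voted = 300          # 30% turnout
claimed = 400        # 400 say they voted

# The (Voted, Claimed Voted) table: voters all claim to have voted,
# so every false claim comes from a nonvoter.
false_claims = claimed - voted   # (Not Voted, Claimed Voted) cell = 100
nonvoters = total - voted        # marginal (Not Voted) = 700

# P(Claimed | Not Voted) = P(Not Voted and Claimed) / P(Not Voted)
p = false_claims / nonvoters
print(round(p, 3))   # 0.143
```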

10. K? O'Rourke says:

Not to undo Gerd Gigerenzer here, but you have fully defined the joint probability distribution, and Bayes’ rule does apply (i.e., you are conditioning on, or taking as fixed, the observed quantity Not Voted).

This is interesting to me, because when I tricked my teenage son into _discovering_ the logic of Bayes (actually using Galton’s two stage quincunx which seems like just the physics of a pin ball machine), he objected “shouldn’t there be some formula that does a better job?”

Also, after initially using trees to explain things in an introductory stats course, I switched to tables the next time I taught the course, thinking it might be _better_. Near the end of that course, one of the better students got up and used trees to explain the same stuff. Unfortunately most of the students had left, but the dozen or so students that remained all agreed that it was a much better way to explain things.

Would be interesting to do a randomized cross-over experiment, but my guess is whatever is done first will seem to be harder (on average).