Question 10 of my final exam for Design and Analysis of Sample Surveys

Posted on May 20, 2012 5:00 PM by Andrew

10. Out of a random sample of 100 Americans, zero report having ever held political office. From this information, give a 95% confidence interval for the proportion of Americans who have ever held political office.

Solution to question 9

From yesterday:

9. Out of a population of 100 medical records, 40 are randomly sampled and then audited. 10 out of the 40 audits reveal fraud. From this information, give an estimate, standard error, and 95% confidence interval for the proportion of audits in the population with fraud.

Solution: The estimate is p.hat = 10/40 = 0.25. The standard error is sqrt(1-f)*sqrt(p.hat*(1-p.hat)/n) = sqrt(1-0.4)*sqrt(0.25*0.75/40) = 0.053, where f = n/N = 40/100 is the sampling fraction. The 95% interval is [0.25 +/- 2*0.053] = [0.14, 0.36].
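The arithmetic in this solution can be checked with a few lines of Python. This is only a sketch: the function name srs_proportion_summary and its arguments are made up for illustration, and the multiplier of 2 standard errors follows the solution above rather than the exact normal quantile of 1.96.

```python
import math

def srs_proportion_summary(y, n, N, z=2.0):
    """Estimate, standard error, and interval for a proportion under
    simple random sampling without replacement from a population of size N.
    (Hypothetical helper, not from the original post.)"""
    p_hat = y / n
    f = n / N  # sampling fraction, used in the finite population correction
    se = math.sqrt(1 - f) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

# Question 9: 10 frauds found in 40 audited records out of 100.
p_hat, se, (lo, hi) = srs_proportion_summary(y=10, n=40, N=100)
print(f"estimate={p_hat:.2f} se={se:.3f} interval=[{lo:.2f}, {hi:.2f}]")
# prints: estimate=0.25 se=0.053 interval=[0.14, 0.36]
```

Without the sqrt(1-f) factor the standard error would be about 0.068, so ignoring the finite population correction would noticeably widen the interval here.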

4 thoughts on “Question 10 of my final exam for Design and Analysis of Sample Surveys”

zbicyclist on May 20, 2012 at 7:21 pm said:

10. This turns out to be a surprisingly helpful trick. Glad to see you spreading it.
- Andrew on May 20, 2012 at 7:31 pm said:
  
  Yes, it came up in a consulting project just last month. I was looking for a convenient reference for it in our report, and I was pleasantly surprised to see that it’s in the (current) edition of the Moore and McCabe book. I don’t like Moore and McCabe as a textbook but it’s actually a fine reference and excellent to have on the shelf as it has clear explanations of all sorts of basic statistical methods.
Bob Carpenter on May 21, 2012 at 2:23 pm said:

Stan model with an implicit uniform prior on theta.

parameters { real(0,1) theta; }

model { 0 ~ binomial(100,theta); }

Compile the model, compile the resulting C++ code, and run 1M samples:

SHELL: bin/stanc src/models/basic_estimators/binomial.stan

SHELL: g++ -O3 -Lbin -lstan -I src -I lib anon_model.cpp

SHELL: ./a.out --iter=1000000 --thin=10

Then compute the 95% interval in R:

R: x = read.csv('samples.csv', header=TRUE, comment.char='#')

And sorry in advance for all the digits, but I don’t know how to control them,

R: quantile(x[,3], c(0.025,0.975))

        2.5%        97.5%
0.0002510584 0.0354259225

That seems pretty close, because

R: dbinom(0, 100, 0.035)

[1] 0.02836164

and

R: dbinom(0, 100, 0.00025)

[1] 0.9753069
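As a cross-check on the simulation, the same interval can be worked out in closed form. With a uniform prior on theta and 0 successes in 100 trials, the posterior is Beta(1, 101), whose CDF is 1 - (1 - theta)^101, so its quantiles can be written down directly. The Python sketch below is an illustration with made-up helper names (beta_1_b_quantile, prob_zero), not part of the Stan workflow above.

```python
def beta_1_b_quantile(p, b):
    """Quantile of a Beta(1, b) distribution.
    Its CDF is 1 - (1 - theta)**b, which inverts in closed form."""
    return 1 - (1 - p) ** (1 / b)

def prob_zero(theta, n):
    """Binomial probability of 0 successes in n trials (like dbinom(0, n, theta))."""
    return (1 - theta) ** n

n = 100
b = n + 1  # uniform prior plus 0 successes in n trials gives Beta(1, n + 1)

lower = beta_1_b_quantile(0.025, b)
upper = beta_1_b_quantile(0.975, b)
print(f"central 95% posterior interval: [{lower:.6f}, {upper:.6f}]")

# The two dbinom checks from the comment above.
print(prob_zero(0.035, n), prob_zero(0.00025, n))
```

The exact endpoints come out at roughly 0.00025 and 0.0359, close to the simulated quantiles, with the small gap at the upper end being the kind of difference one would expect from Monte Carlo error.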