Here’s a fun demonstration for intro statistics courses, to get the students thinking about random sampling and its difficulties.
As the students enter the classroom, we quietly pass a sealed envelope to one of the students and ask him or her to save it for later. Then, when class has begun, we pull out a digital kitchen scale and a plastic bag full of a variety of candies of different shapes and sizes, and we announce:
This bag has 100 candies, and it is your job to estimate the total weight of the candies in the bag. Divide into pairs and introduce yourself to your neighbor. [At this point we pause and walk through the room to make sure that all the students pair up.] We’re going to pass this bag and scale around the room. For each pair, estimate the weight of the 100 candies in the bag, as follows: Pull out a sample of 5 candies, weigh them on the scale, write down the weight, put the candies back in the bag and mix them (no, you can’t eat any of them yet!), and pass the bag and scale to the next pair of students. Once you’ve done that, multiply the weight of your 5 candies by 20 to create an estimate for the weight of all 100 candies. Write that estimate down (silently, so as not to influence the next pair of students who are taking their sample). [As we speak, we write these instructions as bullet points on the blackboard: "Draw a sample of 5," "Weigh them," etc.]
Your goal is to estimate the weight of the entire bag of 100 candies. Whichever pair comes closest gets to keep the bag. So choose your sample with this in mind.
We then give the bag and scale to a pair of students in the back of the room, and the demonstration continues while the class goes on. Depending on the size of the class, it can take 20 to 40 minutes. We just cover the usual class material during this time, keeping an eye out occasionally to make sure the candies and the scale continue to move around the room.
When the candy weighing is done (or if only 15 minutes remain in the class), we continue the demonstration by asking each pair to give their estimate of the total weight, which we write, along with their first names, on the blackboard. We also draw a histogram of the guesses, and we ask whether they think their estimates are probably too high, too low, or about right.
We then pass the candies and scale to a pair of students in front of the class and ask them to weigh the entire bag and report the result. Every time we have done this demonstration, whether with graduate students or undergraduates, the true weight has been much lower than most or all of the estimates, so much lower that the students gasp or laugh in surprise. We extend the histogram on the blackboard to include the true weight, and then ask the student holding the sealed envelope to open it and read to the class the note we had placed inside, which says, “Your estimates are too high!”
We conclude the demonstration by leading a discussion of why their estimates were too high, how they could have done their sampling to produce more accurate estimates, and what analogies they can draw between this example and surveys of human populations. When students suggest doing a “random sample,” we ask how they would actually do it, which leads to the idea of a sampling frame, or list of all the items in the population. Suggestions of more efficient sampling ideas (for example, picking one large candy bar and four smaller candies at random) lead to ideas such as stratified sampling.
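For instructors who want to make this discussion concrete, here is a minimal Python sketch of two of the ideas students tend to propose: a simple random sample of 5 drawn from a sampling frame, and a stratified sample of one large candy bar plus four smaller candies. The gram weights, the function names, and the two-stratum split are our own illustrative assumptions, not measurements from a real bag.

```python
import random
import statistics

def make_frame(rng):
    """Sampling frame: one (stratum, weight_in_grams) entry per candy.

    The 20/30/50 composition is borrowed from the demonstration; the gram
    weights are invented for illustration only.
    """
    frame = []
    for stratum, count, typical_g in [("large", 20, 50.0),
                                      ("smaller", 30, 20.0),
                                      ("smaller", 50, 5.0)]:
        for _ in range(count):
            weight = rng.uniform(0.8 * typical_g, 1.2 * typical_g)
            frame.append((stratum, weight))
    return frame

def srs_estimate(frame, n, rng):
    """Simple random sample of n candies from the frame, scaled up by N/n."""
    sample = rng.sample(frame, n)
    return len(frame) / n * sum(w for _, w in sample)

def stratified_estimate(frame, allocation, rng):
    """Sample k candies within each stratum; scale each stratum mean by its size."""
    total = 0.0
    for stratum, k in allocation.items():
        weights = [w for s, w in frame if s == stratum]
        total += len(weights) * statistics.mean(rng.sample(weights, k))
    return total

def rmse(estimates, truth):
    """Root mean squared error of a list of estimates around the true total."""
    return statistics.mean([(e - truth) ** 2 for e in estimates]) ** 0.5

if __name__ == "__main__":
    rng = random.Random(1)
    frame = make_frame(rng)
    truth = sum(w for _, w in frame)
    srs = [srs_estimate(frame, 5, rng) for _ in range(2000)]
    strat = [stratified_estimate(frame, {"large": 1, "smaller": 4}, rng)
             for _ in range(2000)]
    print(f"true total:      {truth:8.1f} g")
    print(f"SRS RMSE:        {rmse(srs, truth):8.1f} g")
    print(f"stratified RMSE: {rmse(strat, truth):8.1f} g")
```

Both estimators here are centered on the true total. The stratified version should typically show the smaller error because it guarantees that the large candy bars are represented in proportion to their share of the bag, though the size of the gain depends entirely on the assumed weights.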
—
When doing this demonstration, it’s important to have the students work in pairs so that they think seriously about the task.
It’s also most effective when the candies vary greatly in size: for example, take about 20 full-sized candy bars, 30 smaller candy bars, and 50 very small items such as individually wrapped caramels and Life Savers.
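To see why a bag like this tends to produce estimates that are too high, here is a small Python simulation under one assumed mechanism: a hand reaching into the bag is more likely to come out with a big candy than a small one, which we model (hypothetically) as each draw being proportional to the candy’s weight. The gram weights and the function names are made up for illustration, and real student behavior is surely messier than this model.

```python
import random

# Hypothetical bag: 20 full-sized bars, 30 smaller bars, 50 very small items.
# The gram weights are invented; only the 20/30/50 split comes from the text.
BAG = [50.0] * 20 + [20.0] * 30 + [5.0] * 50

def size_biased_sample(weights, n, rng):
    """Draw n candies without replacement, each draw proportional to weight."""
    remaining = list(weights)
    picked = []
    for _ in range(n):
        i = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        picked.append(remaining.pop(i))
    return picked

def expanded_estimate(sample, bag_size):
    """Scale the sample weight up to the whole bag (times 20 when n is 5 of 100)."""
    return bag_size / len(sample) * sum(sample)

def average_estimate(draw, reps, rng):
    """Average the expanded estimate over many repeated samples of 5."""
    total = 0.0
    for _ in range(reps):
        total += expanded_estimate(draw(BAG, 5, rng), len(BAG))
    return total / reps

if __name__ == "__main__":
    rng = random.Random(1)
    uniform = average_estimate(lambda bag, n, r: r.sample(bag, n), 5000, rng)
    biased = average_estimate(size_biased_sample, 5000, rng)
    print(f"true total weight:          {sum(BAG):7.1f} g")
    print(f"average, equal-chance draw: {uniform:7.1f} g")
    print(f"average, size-biased draw:  {biased:7.1f} g")
```

Under these assumptions the equal-chance draws average out near the true total, while the size-biased draws overshoot it substantially. The greater the variation in candy size, the larger the overshoot, which is consistent with the advice above to mix very large and very small candies.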