## What’s the point of the margin of error?

So . . . the scheduled debate on using margin of error with non-probability panels never happened. We got it started but there was some problem with the webinar software and nobody but the participants could hear anything.

The 5 minutes of conversation we did have was pretty good, though. I was impressed. The webinar was billed as a “debate” which didn’t make me happy—I wasn’t looking forward to hearing a bunch of pious nonsense about probability sampling and statistical theory—but the actual discussion was very reasonable.

The first thing that came up was, Are everyday practitioners in market research concerned about the margins of error for non-probability samples? The consensus among the market researchers on the panel was: No, users pretty much just take samples and margins of error as they are, without worrying about where the sample came from or how it was collected.

I pointed out that if you’re concerned about non-probability samples and if you don’t trust the margin of error for non-probability samples, then you shouldn’t trust the margin of error for any real sample from a human population, given the well-known problems of nonavailability and nonresponse. When the nonresponse rate is 91%, any sample is a convenience sample.

The larger point is that just about any survey requires two steps:
1. Sampling.
2. Adjustment.

There are extreme settings where either 1 or 2 alone is enough.

If you have a true probability sample from a perfect sampling frame, with 100% availability and 100% response, and if your sampling probabilities don’t vary much, and if your data are dense relative to the questions you’re asking, then you can get everything you need—your estimate and your margin of error—from the sample, with no adjustment needed.

From the other direction, if you have a model for the underlying data that you really believe, and if you have a sample with no selection problems, or if you have a selection model that you really believe (which I assume can happen in some physical settings, maybe something like sampling fish from a lake), then you can take your data and adjust, with no concerns about random sampling. Indeed, this is standard in non-sampling areas of statistics, where people just take data and run regressions and that’s it.

In general, though, it makes sense to be serious about both sampling and adjustment, to sample as close to randomly as you can, and to adjust as well as you can.

Remember: just about no sample of humans is really a probability sample or even close to a probability sample, and just about no regression model applied to humans is correct or even close to correct. So we have to worry about sampling, and we have to worry about adjustment. Sorry, Michael Link, but that’s just the way things are. No “grounding in theory” is going to save you.

What’s the point of the margin of error?

Where, then, does the margin of error come in? (Note to outsiders: to the best of my knowledge, “margin of error” is not a precisely-defined term, but I think it is usually taken to be 2 standard errors.)
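To make the "2 standard errors" convention concrete, here is a minimal sketch for a sample proportion. It assumes simple random sampling and the usual binomial standard error; the function name `margin_of_error` and the example numbers are hypothetical, chosen only for illustration.

```python
import math


def margin_of_error(p_hat, n, multiplier=2.0):
    """Margin of error for a sample proportion, taken as `multiplier` standard errors.

    Assumes simple random sampling, so the standard error is sqrt(p(1-p)/n).
    """
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return multiplier * standard_error


# Hypothetical example: 50% support among 1000 respondents.
moe = margin_of_error(0.5, 1000)
print(f"margin of error: +/- {100 * moe:.1f} percentage points")
```

With these inputs the result is roughly 3 percentage points, the figure commonly quoted for polls of about 1000 people.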

What I said, during our abbreviated 5-minute panel discussion, is that, in practice, we often don’t need the margin of error at all. Anything worth doing is worth doing multiple times, and once you have multiple estimates from different samples, you can look at the variation between them to get an external measure of variation that is more relevant than an internal margin of error, in any case.

My point was that the margin of error is an approximate lower bound on the expected error of an estimate from a sample, and that such a lower bound can be useful, but that in most cases I’d get more out of the between-survey variation (which includes sampling error as well as variation over time, variation between sampling methods, and variation in nonsampling error).
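A small simulation can illustrate the gap between the internal margin of error and the between-survey variation. This is only a sketch under made-up assumptions: each survey is given its own random "house effect" (the `house_sd` parameter, a hypothetical stand-in for nonsampling error), and all names and numbers are invented for illustration.

```python
import math
import random
import statistics


def simulate_surveys(true_p=0.5, n=1000, n_surveys=20, house_sd=0.03, seed=1):
    """Simulate survey estimates that each carry their own nonsampling shift."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_surveys):
        # Each survey effectively samples from a slightly shifted population.
        p = min(max(true_p + rng.gauss(0.0, house_sd), 0.0), 1.0)
        yes = sum(rng.random() < p for _ in range(n))
        estimates.append(yes / n)
    return estimates


estimates = simulate_surveys()
mean_p = statistics.mean(estimates)

# Internal margin of error: 2 standard errors computed from a single sample of size 1000.
internal_moe = 2 * math.sqrt(mean_p * (1 - mean_p) / 1000)

# External measure: 2 standard deviations of the estimates across surveys.
external_moe = 2 * statistics.stdev(estimates)

print(f"internal margin of error: {internal_moe:.3f}")
print(f"between-survey variation: {external_moe:.3f}")
```

Under these assumptions the between-survey figure comes out larger, because it picks up the house effects in addition to the sampling error.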

Where the margin of error often is useful is in design, in deciding how large a sample size you want to estimate a quantity of interest to some desired precision.
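For the design use, the same formula can be inverted to get a sample size. This is a minimal sketch assuming simple random sampling, a proportion, and a margin of error of 2 standard errors; `required_sample_size` is a hypothetical helper, and p = 0.5 is used as the conservative default.

```python
import math


def required_sample_size(target_moe, p=0.5, multiplier=2.0):
    """Smallest n for which multiplier * sqrt(p(1-p)/n) is at most target_moe."""
    return math.ceil(p * (1.0 - p) * (multiplier / target_moe) ** 2)


# Hypothetical target: a margin of error of 3 percentage points.
n = required_sample_size(0.03)
print(n)  # 1112
```

Because precision scales with the square root of n, halving the margin of error requires roughly four times the sample size.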

In an email discussion afterward, John Bremer pointed out that in tracking studies you are interested particularly in measuring change, and in that case it might not be so easy to get an external measure of variance. Indeed, if you only measure something at time 1 and time 2, then the margin of error is relevant to assessing the evidence. To get an external measure of uncertainty and variation you need a longer time series. I just wanted to emphasize the point that the margin of error is a lower bound and, as such, can be useful if it is interpreted in that way. Even if sampling is perfect probability sampling and there is 100% response, the margin of error is still an underestimate because the sample is only giving a snapshot, and attitudes change over time.
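For the two-wave case, the margin of error of the change can be sketched as follows. This assumes independent simple random samples at time 1 and time 2 (a tracking study that reinterviews the same respondents would need a different calculation); the function name and the numbers are hypothetical.

```python
import math


def change_margin_of_error(p1, n1, p2, n2, multiplier=2.0):
    """Margin of error for p2 - p1, assuming two independent simple random samples."""
    var1 = p1 * (1.0 - p1) / n1
    var2 = p2 * (1.0 - p2) / n2
    # Variances of independent estimates add.
    return multiplier * math.sqrt(var1 + var2)


# Hypothetical tracking result: 40% at time 1, 44% at time 2, 1000 respondents each.
p1, p2 = 0.40, 0.44
moe_change = change_margin_of_error(p1, 1000, p2, 1000)
print(f"change: {p2 - p1:+.3f}, margin of error of the change: {moe_change:.3f}")
```

In this example a 4-point move is slightly smaller than the margin of error of the change (about 4.4 points), even though each wave on its own has a margin of error near 3 points.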

1. Keith O'Rourke says:

At least in the areas I have worked in (e.g. clinical trials, epidemiology), systematic errors are very likely to be similar across different studies, making the use of between-study variation very problematic. For instance, work on Commensurate Priors assumes the distribution of systematic errors is symmetric (or not too asymmetric).

You seem to be making that (implicit) assumption here and perhaps could add something about why that’s appropriate in your setting.

For those who like reading the classics, perhaps Mosteller and Tukey’s Hunting for the real uncertainty chapter in their regression book.

2. Steve Sailer says:

This seems like the kind of subject where it would be a good idea to do some market research ahead of time to see if a particular campaign to improve public sophistication about statistics won’t just make things worse. Get some focus groups together, such as reporters and editors, and see if they make sense of your ideas or misinterpret them badly. Then try a focus group of newspaper subscribers.

3. Bob says:

I dislike the term “margin of error” because it implies (to me anyway) that the error cannot be greater than the margin. However, when designing a study, (an event that I have been involved in like only once) knowledge of the likely “margin of error” as a function of npq and the implications for survey design and sample size seem useful to me.

Bob

4. zbicyclist says:

Yes, it’s a lower bound.

But what’s often not appreciated is the main reason WHY it is a lower bound — and in particular why a larger sample size doesn’t help all that much.

The basic equation: Total Error² = Sampling Error² + Bias²

Sample size has no effect on bias. In fact, when you get to really large data sets, the sampling error can be practically nonexistent, and nearly all the error comes from bias terms.
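A quick numerical sketch of that equation, taking the sampling error to be the standard error of a proportion under simple random sampling and using a made-up fixed bias of 2 percentage points (the function name and all numbers are hypothetical):

```python
import math


def total_error(n, p=0.5, bias=0.02):
    """Root of (sampling error squared + bias squared) for a proportion."""
    sampling_error = math.sqrt(p * (1.0 - p) / n)
    return math.sqrt(sampling_error ** 2 + bias ** 2)


for n in (100, 1_000, 10_000, 1_000_000):
    sampling_error = math.sqrt(0.25 / n)
    print(f"n={n:>9,}  sampling error={sampling_error:.4f}  total error={total_error(n):.4f}")
```

As n grows the sampling error shrinks toward zero, but the total error levels off at the bias and never drops below it.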

5. Jose M. Vidal-Sanz says:

In many academic studies response rates can be low. But in marketing research, response rates are not as bad as 10% when a relevant problem is at stake. In these cases firms often provide respondents with economic incentives that can lead to an 80% response rate or even higher. It all depends on the size of the incentive; the effort required of the respondents must be balanced, and the study should be well managed (e.g., flexibility to arrange an appointment at a convenient time, following up on rejections, the type of interview, the writing of the questionnaire, etc.). For lesser problems, however, firms do not invest much and one can find samples with self-selection associated with low response rates, but the truth is that in these cases marketers do not even use representative samples (this would make the study more expensive) but convenience samples. In these cases the analysis is considered from an exploratory perspective, and marketing analysts use the results with some caution (as they also do with qualitative research studies). In any case, there are no universal rules here, so some companies take more care than others. Market research companies charge more for studies based on representative samples, and firms need to evaluate the cost-benefit consequences of each study.

• zbicyclist says:

80% is by no means a representative response rate for marketing research studies, although I have no doubt there’s some study somewhere that has achieved this among a highly motivated group.

For example, the MRA numbers are higher for customer lists than for general population surveys, 22% versus 14%. http://www.marketingresearch.org/survey-nonresponse

The MRA page claims 14.4% response rate in 2007 for the general population. That’s consistent with the Pew numbers:

The Pew work (which is what Andrew is citing with his 91% nonresponse number) shows 9% in 2012, but 21% in 2006 and 15% in 2009, which suggests that marketing research studies in general are a bit lower in response rate. [This may be definitional; I did not verify that the definition of ‘response rate’ was comparable.]

• Jose M. Vidal-Sanz says:

Certainly not, because many studies do not support really important decisions. Minor decisions are often based on ad hoc studies, which tend to be based on non-representative samples. But even for small regular decisions, companies often get good data from representative samples. The reason is that many companies buy data from syndicated services and omnibus studies. For example, the customer panel and the retail panel from AC Nielsen collect consumption and sales data using representative samples in multiple countries. Companies just buy access to these data for a specific category at a lower price, and the cost of the study is less relevant because the same data are sold to thousands of companies all around the globe. So in the business world, things are not as dramatic as a 10% response rate.

• JD Deitch says:

Jose, there’s a difference between “representative sample” and a “probability based sample”. AFAIK, among the major MR firms, GfK is the only one to lay claim to the latter.

6. Jose M. Vidal-Sanz says:

What I mean is that a single source of representative data is widely used by many companies; cost sharing is a key issue.