Question 12 of my final exam for Design and Analysis of Sample Surveys

12. A researcher fits a regression model predicting some political behavior given predictors for demographics and several measures of economic ideology. The coefficients for the ideology measures are not statistically significant, and the researcher creates a new measure, adding up the ideology questions and creating a common score, and then fits a new regression including the new score and removing the individual ideology questions from the model. Which of the following statements are basically true? (Indicate all that apply.)

(a) If the original ideology measures are close to 100% correlated with each other, there will be essentially no benefit from this approach.

(b) If the original ideology measures are not on a common scale, they should be rescaled before adding them up.

(c) If the original result was not statistically significant, the researcher should stop, so as to avoid data dredging and selection bias.

(d) Another reasonable option would be to perform a factor analysis on the ideology measures and create a common score in that way.

Solution to question 11

From yesterday:

11. Here is the result of fitting a logistic regression to Republican vote in the 1972 NES.

Income is on a 1–5 scale. Approximately how much more likely is a person in income category 4 to vote Republican, compared to a person in income category 2? Give an approximate estimate, standard error, and 95% interval.

Solution: On the logit scale, the estimate is 0.66 with se 0.12. The 95% interval is [0.66 +/- 2*0.12] = [0.42,0.90]. To switch to the probability scale, divide by 4 and round down: the estimate is then 0.16 with se 0.03, 95% interval is [0.10,0.22].
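The arithmetic in this solution can be checked with a minimal Python sketch of the divide-by-4 rule. The function name is made up for illustration; the inputs 0.66 and 0.12 are the logit-scale estimate and standard error quoted above. The sketch prints unrounded values, which the solution then rounds down.

```python
def divide_by_4(logit_est, logit_se):
    """Convert a logit-scale difference to an approximate
    probability-scale difference using the divide-by-4 rule."""
    est = logit_est / 4
    se = logit_se / 4
    # 95% interval: estimate +/- 2 standard errors on the logit scale, then divide by 4
    lower = (logit_est - 2 * logit_se) / 4
    upper = (logit_est + 2 * logit_se) / 4
    return est, se, (lower, upper)


est, se, interval = divide_by_4(0.66, 0.12)
print(f"estimate {est:.3f}, se {se:.3f}, "
      f"95% interval [{interval[0]:.3f}, {interval[1]:.3f}]")
```

Dividing by 4 gives the largest possible change in probability for a given change on the logit scale (the logistic curve is steepest at probability 0.5), which is one reason to round down rather than up.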

Battle of the Repo Man quotes: Reid Hastie’s turn

In response to my comments on his recent opinion article on the human tendency to overvalue information presented as stories, Reid Hastie writes:

Andrew (and Commenters) … I’d like to try to clarify some of the statements and implications in my Bloomberg article on “Our Gift for Good Stories …” The essay is what it is, but some of the implications that I intended to convey do not seem to have been communicated effectively. So, let me take a shot at clarification here. (Of course I am not assuming that, once clarified, my statements are necessarily correct or that you will agree with them.)

1. What I meant in the sections of the paper that claimed the brain is naturally good at visual and causal (narrative) thinking is that the brain was probably selected, through evolutionary processes, to be adaptively successful at those capacities. I don’t have good evidence for this claim … but, we do a lot of those kinds of thinking, we’re distinctive as a species in the ways we do them, just about every typical human in any culture does them much the same way, there seems to be some evidence for systematic localization in brain structures, and we’re pretty good at them.

In contrast, I don’t believe that mathematical thinking has those properties. (Here I mean stylized mathematical thinking that goes beyond counting – such as algebra or geometry and more elaborate forms of this capacity.) I admit I know even less about evidence for cultural (not) universality, developmental trends, and brain localization (my reading of recent neuroscience, gives me the impression that mathematics beyond counting “hijacks” circuits in the brain that were not selected for these abilities … but, I admit I’m not even sure how to define some of these terms.) I also believe, from behavioural experiments, that we are not endowed (through natural selection) with the logical reasoning skills required to solve stylized logic problems like those that are presented in classroom logic courses. Though we are pretty good at some elaborate forms of “natural deduction” that are best described as rule-based. Here I think our abilities are part of our (selected) language generation and comprehension skills.

2. I agree that my assertion that Bayesian Causal Networks (really the principles behind that notation) provide a normative, rational model for causal reasoning is highly speculative. But, it seems to me that the Bayesian Causal Networks (e.g., the principles for calculation on those networks proposed by Pearl, and many others) are more popular than any previous proposal for a normative model, and that approach seems to distill the truth from many of the most plausible prior accounts. I appreciate that the Bayesian Networks approach is still under construction and it may fail to gain the status of a generally accepted normative framework. But, every so often you have to bet on a new theoretical development, and I’m placing a wager on Bayesian Causal Networks … I certainly have not won that bet yet. (I also think Andrew is right that most realistic causal systems studied in the behavioural and biological sciences are defined by many small-influence causal relations and the current versions of Bayesian Causal Networks may not be a practically useful framework for modeling those phenomena.)

3. I am not apologetic about my choice of the Tversky & Kahneman causal conjunction brainteaser: Which is more likely “A flood drowns 1,000 Californians” versus “An earthquake followed by a flood drowns 1,000 Californians”? (And, hey, nothing against Californians: some of my best friends are Californians.) I agree that the standard criticisms of the heuristics & biases brainteasers apply (to the specific example I cited): Are there linguistic or conversational biases in the wording; how can untutored respondents use the probability response scale in a consistent and reliable manner; if there is no referent, how can we talk meaningfully about probabilities or errors; does this “brainteaser” really illustrate the bias introduced by “narrative causal coherence,” or is it just simple availability. (I did check on the numbers for that problem before I cited it in the article and I now regret overstating the magnitude of the effect size.) I used the flood-earthquake problem because it was the most relevant to readership of a Bloomberg editorial. And, I used this problem because I believe that it “works” rhetorically to illustrate a form of the “narrative fallacy”; our bias or tendency to believe “good narratives” are likely to occur and to think the “good story” confers other advantages, such as the ability to predict the next similar event.

Andrew says he does not see how this example demonstrates a problem with logical thinking skills. My answer is not subtle; I just buy the Tversky-Kahneman conclusion that there is an implied violation of logical set-superset membership relationships when we rate the probability of a conjunction higher than the probability of a component category. If you don’t accept that interpretation, you can certainly argue the alternative (see my list of methodological weaknesses above, plus the fact that different samples of respondents made the ratings that I claim “violate the logical principle”). But, I still prefer my interpretation.

4. I’m not sure how to respond to the criticism that my claim that many interpretations of the recent financial crises are examples of narrative fallacies is actually motivated by my desire to excuse specific individuals and institutions from responsibility for the crises. I can say it was certainly not my intention to excuse bankers and others from responsibility when I included that example. Again, I thought I was providing an illustration that typical Bloomberg readers could understand; and that I was warning them not to commit the fallacy in their professional roles – and especially to be more modest about their own abilities to forecast future events. Although I’m (happily) located in the University of Chicago Business School, I am not knowledgeable about financial events. If I were to characterize my own views on responsibility for the recent financial events, I would say that many bankers, analysts, raters, regulators, and politicians should be punished, much more harshly than they have been (or will be). (To cite an analogy, I reject most philosophical conceptions of “free will,” but I also firmly endorse incentives and sanctions to control socially consequential behaviour. Okay, argument from analogy is always dubious.) In any case, the reaction to my citation of (some) discussions of the financial crisis as examples of the “narrative fallacy” was an unintended consequence and I’m grateful to the blog for pointing that out.

5. I want to thank Andrew for pointing out that my tone seemed smug. That too was unintended (though maybe that’s not very remarkable, who sets out to intentionally write an article that sounds smug – maybe the occasional humorist). I’m not used to writing “thought pieces,” and that smugness is a style problem I need to work on. Many of the assertions in the article were tentative in my mind, but I wanted to say something definite and engaging. That’s probably a reason that many popular science writers get in trouble.

6. If you were interested in the topic of my essay, I would recommend a couple of other commentaries on the same topic: My beloved co-author, Robyn Dawes, published a much more thoughtful essay on the same theme, “Prediction of the future versus an understanding of the past” (American Journal of Psychology, 106(1), 1-24); and I admire Duncan Watts’s recent (2011) popular book, “Everything is Obvious … Once You Know the Answer.”

Finally, I’d like to thank Andrew, and the rest of you bloggers for taking my article seriously enough to comment. As a further testimonial, I’ve enjoyed Andrew’s blog for many years and it is the most-visited non-news site on my browser toolbar. And, finally, finally … thanks for the Repo Man quotes (My favourite? p. 139, Hastie & Dawes, 2010).

When blogging I spend so much time reacting to journalists who seem to either ignore criticism or incorporate it without acknowledgment. I doubt, for example, I’ll be getting any responses from Michael Barone, Doug Schoen, Gregg Easterbrook, or Campbell Brown. Academic researchers, though, have a tradition of thoughtful dialogue (with some exceptions). I appreciate Hastie’s thoughtful reply.

Question 11 of my final exam for Design and Analysis of Sample Surveys

11. Here is the result of fitting a logistic regression to Republican vote in the 1972 NES.

Income is on a 1–5 scale. Approximately how much more likely is a person in income category 4 to vote Republican, compared to a person in income category 2? Give an approximate estimate, standard error, and 95% interval.

Solution to question 10

From yesterday:

10. Out of a random sample of 100 Americans, zero report having ever held political office. From this information, give a 95% confidence interval for the proportion of Americans who have ever held political office.

Solution: Use the Agresti-Coull interval based on (y+2)/(n+4). Estimate is p.hat=2/104=0.02, se is sqrt(p.hat*(1-p.hat)/104)=0.013, 95% interval is [0.02 +/- 2*0.013] = [0,0.05].
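Here is a minimal Python sketch of the calculation just described, assuming the simple "add 2 successes and 2 failures" form of the interval and the +/- 2 standard error approximation used in the solution. The function name is hypothetical, and clipping the interval to [0, 1] is an assumption made here because a proportion cannot be negative.

```python
import math


def agresti_coull_interval(y, n):
    """Approximate 95% interval for a proportion based on (y+2)/(n+4)."""
    p_hat = (y + 2) / (n + 4)
    se = math.sqrt(p_hat * (1 - p_hat) / (n + 4))
    # Clip to [0, 1], since a proportion cannot fall outside that range
    lower = max(0.0, p_hat - 2 * se)
    upper = min(1.0, p_hat + 2 * se)
    return p_hat, se, (lower, upper)


p_hat, se, (lower, upper) = agresti_coull_interval(y=0, n=100)
print(f"p.hat = {p_hat:.3f}, se = {se:.3f}, "
      f"95% interval = [{lower:.2f}, {upper:.2f}]")
```

The usual estimate y/n = 0 would give a standard error of zero and a degenerate interval, which is why the adjusted estimate is used here.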

Responding to a bizarre anti-social-science screed

Philosophy professor Gary Gutting writes:

Public policy debates often involve appeals to results of work in social sciences like economics and sociology. . . . How much authority should we give to such work in our policy decisions? . . . The core natural sciences (e.g., physics, chemistry, biology) are so well established that we readily accept their best-supported conclusions as definitive. . . . But how reliable is even the best work on the effects of teaching? How, for example, does it compare with the best work by biochemists on the effects of light on plant growth? Since humans are much more complex than plants and biochemists have far more refined techniques for studying plants, we may well expect the biochemical work to be far more reliable. . . . While the physical sciences produce many detailed and precise predictions, the social sciences do not.

OK, fine. But then comes the punchline:

Given the limited predictive success and the lack of consensus in social sciences, their conclusions can seldom be primary guides to setting policy. At best, they can supplement the general knowledge, practical experience, good sense and critical intelligence that we can only hope our political leaders will have.

This all makes sense but I’m a bit confused. In no area are scientific conclusions “the primary guides to setting policy.” Political and business leaders rule policy; the rest of us can just supply advice. It’s not like physicists are in charge of energy policy (if so, I expect we’d have a big fat carbon tax), nor are biologists in charge of the teaching of evolution in many states. So I’m not quite sure what Gutting is talking about.

The only place where I see social scientists controlling policy is (some) economists’ influence over economic policy. But this is a well-known issue, usually framed not as a matter of scientific expertise (or lack thereof) but in terms of massive conflicts of interests (for example, Lawrence Summers’s multimillion-dollar payoff from a hedge fund). But this can’t be Gutting’s point: if it were, he’d talk about economists, not social science in general. But in that case I just don’t get it. It’s not like there are a bunch of number-crunching sociologists running around telling the government what to do!

It seems to me that, the field of economics aside, policy is run just as Gutting would like: “our political leaders” (as he puts it) can pretty much do what they’d like, constrained by the political opposition but feeling no particular obligation for science (social or otherwise) to be “primary guides to setting policy.” I’m not complaining here—I don’t know that science should be a primary guide in most contexts (if it were, maybe we’d all be riding flying cars by now, with all the parking problems that would entail)—I just don’t see what Gutting is getting at.

Question 10 of my final exam for Design and Analysis of Sample Surveys

10. Out of a random sample of 100 Americans, zero report having ever held political office. From this information, give a 95% confidence interval for the proportion of Americans who have ever held political office.

Solution to question 9

From yesterday:

9. Out of a population of 100 medical records, 40 are randomly sampled and then audited. 10 out of the 40 audits reveal fraud. From this information, give an estimate, standard error, and 95% confidence interval for the proportion of audits in the population with fraud.

Solution: estimate is p.hat=10/40=0.25. Se is sqrt(1-f)*sqrt(p.hat*(1-p.hat)/n)=sqrt(1-0.4)*sqrt(0.25*0.75/40)=0.053. 95% interval is [0.25 +/- 2*0.053] = [0.14,0.36].
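The same calculation as a minimal Python sketch, with the finite-population correction sqrt(1-f) written out, where f = n/N is the sampling fraction. The function and variable names are made up for illustration.

```python
import math


def srs_proportion(y, n, N):
    """Estimate, standard error, and approximate 95% interval for a proportion
    from a simple random sample of n out of a finite population of N."""
    f = n / N                      # sampling fraction
    p_hat = y / n
    se = math.sqrt(1 - f) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, (p_hat - 2 * se, p_hat + 2 * se)


p_hat, se, (lower, upper) = srs_proportion(y=10, n=40, N=100)
print(f"p.hat = {p_hat:.2f}, se = {se:.3f}, "
      f"95% interval = [{lower:.2f}, {upper:.2f}]")
```

With 40 of the 100 records sampled, the correction shrinks the standard error by a factor of sqrt(0.6), about 0.77, compared with sampling from an effectively infinite population.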

Problemen met het boek

Regarding the so-called Dutch Book argument for Bayesian inference (the idea that, if your inferences do not correspond to a Bayesian posterior distribution, you can be forced to make incoherent bets and ultimately become a money pump), I wrote:

I have never found this argument appealing, because a bet is a game not a decision. A bet requires 2 players, and one player has to offer the bets. I do agree that in some bounded settings (for example, betting on win place show in a horse race), I’d want my bets to be coherent; if they are incoherent (e.g., if my bets correspond to P(A|B)*P(B) not being equal to P(A,B)), then I should be able to do better by examining the incoherence. But in an “open system” (to borrow some physics jargon), I don’t think coherence is possible. There is always new information coming in, and there is always additional prior information in reserve that hasn’t entered the model.

Question 9 of my final exam for Design and Analysis of Sample Surveys

9. Out of a population of 100 medical records, 40 are randomly sampled and then audited. 10 out of the 40 audits reveal fraud. From this information, give an estimate, standard error, and 95% confidence interval for the proportion of audits in the population with fraud.

Solution to question 8

From yesterday:

8. Which of the following statements accurately characterize the National Election Studies? (Indicate all that apply.)

(a) The NES began in 1960.

(b) Since 1980, the NES has mostly relied on telephone interviews.

(c) The NES typically has a sample size of about 1000–2000 people.

(d) The NES uses a sampling design that ensures they get respondents from all fifty states and D.C.

Solution: c. This is a purely factual question, not much to say here.

Cross-validation to check missing-data imputation

Question 8 of my final exam for Design and Analysis of Sample Surveys

8. Which of the following statements accurately characterize the National Election Studies? (Indicate all that apply.)

(a) The NES began in 1960.

(b) Since 1980, the NES has mostly relied on telephone interviews.

(c) The NES typically has a sample size of about 1000–2000 people.

(d) The NES uses a sampling design that ensures they get respondents from all fifty states and D.C.

Solution to question 7

From yesterday:

7. Which of the following statements accurately summarize claims made by Page and Shapiro in The Rational Public and their associated research articles? (Indicate all that apply.)

(a) Americans’ attitudes on policy alternatives are highly unstable over time, reflecting a rational response to unstable political conditions.

(b) When studying public opinion, question-wording is less important than scholars have traditionally thought.

(c) Attitudes about foreign policy change more abruptly than attitudes on domestic issues.

(d) The contents of the mass media account for a high proportion of opinion changes on foreign policy.

(e) Using the assumption of rationality, Page and Shapiro fit a hedonic regression to estimate the underlying utility function of survey respondents.

(f) Page and Shapiro use the term “rational” ironically; their fundamental claim is that Americans are easily distracted and that rational-public models are seriously flawed.

Solution: c and d. But you can make arguments for a and b. No way on e or f.

Those mean psychologists, making fun of dodgy research!

Two people separately sent me this amusing mock-research paper by Brian A. Nosek (I assume that’s what’s meant by “Arina K. Bones”). The article is pretty funny, but this poster (by Nosek and Samuel Gosling) is even better! Check it out:

I remarked that this was almost as good as my zombies paper, and my correspondent pointed me to this page of (I assume) Nosek’s research on aliens.

P.S. I clicked through to take the test to see if I’m dead or alive, but I got bored after a few minutes. I gotta say, if Gosling can come up with a 10-item measure of the Big Five, this crew should be able to come up with a reasonably valid alive-or-dead test that doesn’t require dozens and dozens of questions!

Comments on “A Bayesian approach to complex clinical diagnoses: a case-study in child abuse”

I was given the opportunity to briefly comment on the paper, A Bayesian approach to complex clinical diagnoses: a case-study in child abuse, by Nicky Best, Deborah Ashby, Frank Dunstan, David Foreman, and Neil McIntosh, for the Journal of the Royal Statistical Society. Here is what I wrote:

Best et al. are working on an important applied problem and I have no reason to doubt that their approach is a step forward beyond diagnostic criteria based on point estimation. An attempt at an accurate assessment of variation is important not just for statistical reasons but also because scientists have the duty to convey their uncertainty to the larger world. I am thinking, for example, of discredited claims such as that of the mathematician who claimed to predict divorces with 93% accuracy (Abraham, 2010).

Regarding the paper at hand, I thought I would try an experiment in comment-writing. My usual practice is to read the graphs and then go back and clarify any questions through the text. So, very quickly: I would prefer Figure 1 to be displayed in terms of standard deviations, not variances. I find variances difficult to interpret, and I’m always taking mental square roots (0.09 is 0.3 squared, and so forth). Figure 3 is appealing but I don’t like the visual emphasis of the endpoints of the 95% intervals. From a Bayesian standpoint, there is nothing special about the 2.5th and 97.5th percentiles of the posterior distribution, and I think it goes against the spirit of the article to emphasize these arbitrary endpoints. I also think that, with some care, the graphs in Figures 3, 4, and 5 could be compactly re-expressed to show comparisons more effectively (as in Gelman, Pasarica, and Dodhia, 2002). Tables 2 and 3 I think are useless: why should a reader care that the 10th percentile point of the distribution for a particular probability is 0.164 or whatever? Again, this seems to me to contradict the decision-analytic focus of the applied research.

These brusque comments on display may seem peripheral but to me they are important. Communication is a central task of statistics, and ideally a state-of-the-art data analysis can have state-of-the-art displays to match.

References

Abraham, Laurie (2010). Can you really predict the success of a marriage in 15 minutes? Slate, 8 March. http://www.slate.com/articles/double_x/doublex/2010/03/can_you_really_predict_the_success_of_a_marriage_in_15_minutes.html

Gelman, Andrew, Pasarica, C., and Dodhia, R. (2002). Let’s practice what we preach: turning tables into graphs. American Statistician 56, 121-130.

Question 7 of my final exam for Design and Analysis of Sample Surveys

7. Which of the following statements accurately summarize claims made by Page and Shapiro in The Rational Public and their associated research articles? (Indicate all that apply.)

(a) Americans’ attitudes on policy alternatives are highly unstable over time, reflecting a rational response to unstable political conditions.

(b) When studying public opinion, question-wording is less important than scholars have traditionally thought.

(c) Attitudes about foreign policy change more abruptly than attitudes on domestic issues.

(d) The contents of the mass media account for a high proportion of opinion changes on foreign policy.

(e) Using the assumption of rationality, Page and Shapiro fit a hedonic regression to estimate the underlying utility function of survey respondents.

(f) Page and Shapiro use the term “rational” ironically; their fundamental claim is that Americans are easily distracted and that rational-public models are seriously flawed.

Solution to question 6

From yesterday:

6. A survey of New York City residents is performed using cluster sampling. The design effect is 3.0. From the survey, the estimated proportion who prefer the Mets to the Yankees is 0.42 with a standard error of 0.05. How many people were in the sample?

Solution: The standard error is sqrt(d.eff)*0.5/sqrt(n) = 0.05. Thus sqrt(n) = sqrt(d.eff)*(0.5/0.05), so n = 3*(0.5/0.05)^2 = 300.
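The algebra above can be sketched in a few lines of Python. The function name and the `sd` parameter are assumptions for illustration; `sd=0.5` is the conservative value used in the solution, and the second call swaps in sqrt(0.42*0.58) to show how little the answer changes.

```python
import math


def sample_size_from_se(se, design_effect, sd=0.5):
    """Solve se = sqrt(design_effect) * sd / sqrt(n) for n."""
    return design_effect * (sd / se) ** 2


# Conservative version used in the solution: sd = 0.5
n_conservative = sample_size_from_se(se=0.05, design_effect=3.0)

# Same formula with sd = sqrt(p*(1-p)) at the estimated proportion p = 0.42
n_at_estimate = sample_size_from_se(
    se=0.05, design_effect=3.0, sd=math.sqrt(0.42 * 0.58)
)

print(round(n_conservative), round(n_at_estimate))
```

Using 0.5 instead of sqrt(0.42*0.58) changes the answer only slightly (about 292 rather than 300), so the round number is a reasonable approximation for an exam answer.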

More on the difficulty of “preaching what you practice”

A couple months ago, in discussing Charles Murray’s argument that America’s social leaders should “preach what they practice” (Murray argues that they—we!—tend to lead good lives of hard work and moderation but are all too tolerant of antisocial and unproductive behavior among the lower classes), I wrote:

Question 6 of my final exam for Design and Analysis of Sample Surveys

6. A survey of New York City residents is performed using cluster sampling. The design effect is 3.0. From the survey, the estimated proportion who prefer the Mets to the Yankees is 0.42 with a standard error of 0.05. How many people were in the sample?

Solution to question 5

From yesterday:

5. Which of the following better describes changes in public opinion on most issues? (Choose only one.)

(a) Dynamic stability: On any given issue, average opinion remains stable but liberals and conservatives move back and forth in opposite directions (the “accordion model”)

(b) Uniform swing: Average opinion on an issue can move but the liberals and conservatives don’t move much relative to each other (the distribution of opinions is a “solid block of wood”)

(c) Compensating tradeoffs: When considering multiple survey questions on the same general topic, average opinion can move sharply to the left or right on individual questions while the average over all the questions remains stable (the “rubber band model”)

Solution: b. You can make an argument for option a over the long term, but if you have to pick just one of the three, you have to go with uniform swing.

Wikipedia author confronts Ed Wegman

Wegman: “It’s not reprinted 100 percent like you had it.”

Wikipedia guy: “No, you added another paragraph at the end and you changed the headline. . . . You even copied the typos that I’ve corrected on my website. It was taken verbatim and reprinted in your paper.”

The original author got a check for $500 but, unfortunately, no free subscription to “Wiley Interdisciplinary Reviews: Computational Statistics” (a $1400-$2800 value).

P.S. To those who think I’m being mean to Wegman: I haven’t yet heard that he’s apologized to the people whose work he copied without attribution, or to the people who spent their time tracking all this down, or to the U.S. Congress for misrepresenting his expertise in his official report.

Everyone makes mistakes, and just about everyone has ethical lapses at times. But when you get caught you’re supposed to make apology and restitution.

Question 5 of my final exam for Design and Analysis of Sample Surveys

5. Which of the following better describes changes in public opinion on most issues? (Choose only one.)

(a) Dynamic stability: On any given issue, average opinion remains stable but liberals and conservatives move back and forth in opposite directions (the “accordion model”)

(b) Uniform swing: Average opinion on an issue can move but the liberals and conservatives don’t move much relative to each other (the distribution of opinions is a “solid block of wood”)

(c) Compensating tradeoffs: When considering multiple survey questions on the same general topic, average opinion can move sharply to the left or right on individual questions while the average over all the questions remains stable (the “rubber band model”)

Solution to question 4

From yesterday:

4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance?

Solution: Yes. See this article by Hadaway, Marler, and Chaves, who write, “We suspect that the actual attendance rate has declined since World War II, despite the fact that the survey rate remained basically stable.”

A statistical research project: Weeding out the fraudulent citations

John Mashey points me to a blog post by Phil Davis on “the emergence of a citation cartel.” Davis tells the story:

Question 4 of my final exam for Design and Analysis of Sample Surveys

4. Researchers have found that survey respondents overreport church attendance. Thus, naive estimates from surveys overstate the percentage of Americans who attend church regularly. Does this have a large impact on estimates of time trends in religious attendance?

Solution to question 3

From yesterday:

3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.)

Solution: I was talking about the work of Gary Gates, combining an estimate of the percentage of gays in the population with an estimate of the probability that someone is in the military, given that he or she is gay.

I hate to get all Gerd Gigerenzer on you here, but . . .

Jonathan Cantor points me to an opinion piece by psychologist Reid Hastie, “Our Gift for Good Stories Blinds Us to the Truth.”

I have mixed feelings about Hastie’s article. On one hand I do think his point is important. It’s not new to me, but presumably it’s new to many readers of bloomberg.com. I like Hastie’s book (with Robyn Dawes), Rational Choice in an Uncertain World, and I’m predisposed to like anything new that he writes.

On the other hand, there’s something about Hastie’s article that bothered me. It seemed a bit smug, as if he thinks he understands the world and wants to just explain it to the rest of us. That could be fine—after all, Hastie is a distinguished psychology researcher—but I wasn’t so clear that he’s so clear on what he’s saying. For example:

The human brain is designed to support two modes of thought: visual and narrative. These forms of thinking are universal across human societies throughout history, develop reliably early in individuals’ lives, and are associated with specialized regions of the brain.

Is that really true? How does math fit into this picture? Or music? Music has a sort of narrative structure but it doesn’t seem quite like a story, either.

Hastie continues:

What isn’t universal or natural is the kind of highly structured cognitive processes that underlie logical and mathematical thinking.

Not natural . . . really? Maybe math is not universal, but certainly it’s natural. I was doing it when I was 2 years old. And music, that does seem to be universal, no?

Later on:

The mathematics of causal reasoning has recently experienced a major change, with the widespread acceptance of Bayesian Causal Networks as a normative, rational model for causal induction and reasoning.

Ummm . . . maybe Hastie is a bit too accepting of this particular story! I think Bayesian inference is great—I wrote two books on the topic!—but I wouldn’t go so far as to call it “a normative, rational model for causal induction and reasoning.” But I suppose that if I feel able to opine about psychology, I can’t object to Hastie expressing his views on statistics.

Hastie continues with a famous example:

The legendary theorists of decision-making Amos Tversky and Daniel Kahneman illustrated [our desire for stories] with the following pair of judgment questions: One group of respondents was asked, “What is the probability that a massive flood will occur sometime in the next year and drown more than 1,000 Americans?” The typical estimate was low (less than 20 percent). But, when another comparable sample of respondents was asked, “What is the probability that an earthquake in California will be followed by a flood in the next year that drowns at least 1,000 Americans?” the estimates were significantly higher.

The irrationality is that the second question is about a much more specific event, an earthquake that would be only one of the several reasons for the flood referred to in the first question. It is logically impossible for the second probability to be higher than the first. But, because the second question provides a plausible scenario for the unlikely outcome in the first query, our innate preference for a good story trumps our logical thinking skills.

This story is a great example of the availability heuristic, but I don’t see how it demonstrates a problem with “our logical thinking skills.” When responding to the first question, many people have difficulty visualizing that massive flood. The second question gives a clue. But I don’t see the combination of responses (coming from different sets of people) as indicating irrationality. Most people are not flood experts. They answer the questions as best they can, and when you give more information they will use it.

I hate to get all Gerd Gigerenzer on you here, but what’s the point of saying that this “trumps our logical thinking skills”? I think Kahneman and Tversky did better, decades ago, by writing of “heuristics and biases.”

What’s the political message here?

The article under discussion concludes with:

So the next time you hear a good story about why the financial recession, or any other economically significant event, was caused by a single collection of bad actors — or how a simple linear narrative “explains” an important event — remember this: Just as we are wired to like a diet rich in fats and sugars, we have an appetite for simple, coherent narratives. Neither habit is good for our long-term health.

(Reid Hastie, a professor of behavioral science at the University of Chicago Booth School of Business, is a contributor to Business Class. The opinions expressed are his own.)

Aaahhhh, now I get the message: The financial crisis is nobody’s fault! Let’s put aside the politics of blame, let’s all work together etc etc. OK, fine. Does this apply to all catastrophes? If you know someone in a plane that crashed, are we allowed to check if the pilot was stoned before takeoff? If someone takes $100,000 from you on a fraudulent pretext, and you catch him, are you allowed to try to collect? Or is it only in the financial crisis that we should set aside all “good stories” and “simple linear narratives”?

I agree that our financial problems are complex, and I’m all for warning people about the simplicity of storytelling, but I’m also a bit suspicious of someone from the University of Chicago School of Business telling me not to think about stories of the financial crisis.

Getting quantitative

Also, I’m surprised that, when people estimate “the probability that a massive flood will occur sometime in the next year and drown more than 1,000 Americans” as less than 20%, Hastie characterizes that estimate as “low.” Even Katrina drowned only 387 people (according to this source which I found by googling Katrina drownings). If a 20% chance of this “massive flood” occurring in a one-year period is “low,” I’d be interested in what Hastie thinks is a more reasonable probability estimate.

Responding

Hastie’s article bothered me for two reasons. First, what does it mean to describe “the kind of highly structured cognitive processes that underlie logical and mathematical thinking” as “unnatural”? I don’t quite get what “natural” means here.

Second, I see an implicit political message, which seems to be that we shouldn’t blame anyone for the financial crisis:

We know there was no single cause or event that set in motion the crisis and that the truth is complex and multicausal. So why do we keep seeking the easy answers? It may be that we are hard-wired to do so.

Or, as the guy said in Repo Man, “it’s society’s fault.”

I contacted bloomberg.com, the publishers of the above-linked article, but was told:

We typically don’t publish opeds responding to articles we’ve published, though we welcome letters to the editor. We also post corrections to pieces containing factual errors and would gladly review any objections you have to Mr. Hastie’s column.

Fair enough, but in this case I don’t think the problems would be resolved by a correction note. I’m more bothered by the totality of the piece. For example, the claim that logical reasoning is “unnatural” is not quite a “factual error” but it still seems wrong to me.

P.S. Someone who knows the judgment and decision making field better than I do writes:

I don’t think that Reid has a political agenda here. (He has only been at Chicago for a few years, and Chicago’s School of Business is not monolithic.) . . . To say that blame narratives are oversimplified is not the same as saying that nobody should be blamed; you may be reading the latter subtext into his text.

So maybe I was being unfair. Although I’d feel a little better about Hastie’s column if he’d clarified that, even though stories can be oversimplified, the “life is complicated” defense shouldn’t be used to get people off the hook.

Also, I’m still unhappy about the claim that logical and mathematical reasoning is “unnatural.” But this fits with the innumeracy of thinking there’s a greater-than-20%-chance of a major flood in any given year. I feel that, to Hastie, numbers are just words. Which is consistent with the idea that mathematical reasoning is unnatural to him.

Question 3 of my final exam for Design and Analysis of Sample Surveys

3. We discussed in class the best currently available method for estimating the proportion of military servicemembers who are gay. What is that method? (Recall the problems with the direct approach: there is no simple way to survey servicemembers at random, nor is it likely that they would answer such a question honestly.)

Solution to question 2

From yesterday:

2. Which of the following are useful goals in a pilot study? (Indicate all that apply.)

(a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset.

(b) You can see if you find statistical significance in a pre-chosen comparison of interest.

(c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest.

(d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a power analysis to help you decide how large your full study needs to be.

(e) You can talk with survey respondents and get a sense of how they perceived your questions.

(f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording.

(g) You can check if your sample is approximately representative of your population.

Solution: e and f. The purpose of a pilot study is to test out the data collection. The sample size will be too small for a, b, c, d, and g. In some of their earliest work, Kahneman and Tversky documented the common misconception of researchers that data from a small pilot study should closely match the population.

The question would have been clearer if I’d inserted the word “small” before “pilot” in the preamble.

Stolen jokes

Fun stories here (from Kliph Nesteroff, link from Mark Palko).

More on Uncle Woody

Here.

See also here. He did Wacky Packs!

Question 2 of my final exam for Design and Analysis of Sample Surveys

2. Which of the following are useful goals in a pilot study? (Indicate all that apply.)

(a) You can search for statistical significance, then from that decide what to look for in a confirmatory analysis of your full dataset.

(b) You can see if you find statistical significance in a pre-chosen comparison of interest.

(c) You can examine the direction (positive or negative, even if not statistically significant) of comparisons of interest.

(d) With a small sample size, you cannot hope to learn anything conclusive, but you can get a crude estimate of effect size and standard deviation which will be useful in a power analysis to help you decide how large your full study needs to be.

(e) You can talk with survey respondents and get a sense of how they perceived your questions.

(f) You get a chance to learn about practical difficulties with sampling, nonresponse, and question wording.

(g) You can check if your sample is approximately representative of your population.

Solution to question 1

From yesterday:

1. Suppose that, in a survey of 1000 people in a state, 400 say they voted in a recent primary election. Actually, though, the voter turnout was only 30%. Give an estimate of the probability that a nonvoter will falsely state that he or she voted. (Assume that all voters honestly report that they voted.)

Solution: Draw the probability tree and you get that the proportion of people who say they voted is .3+.7p. Solving .3+.7p=.4 gives p=(.4-.3)/.7=.14, or 14%. I was also going to ask for the standard error (which you’d obtain by starting with the standard error for the “.4” and propagating that through) but I decided to keep it simple. As it was, only about half the students got this question right. This is not a knock on the kids (I just didn’t teach this material well); I mention it only to give a sense that this isn’t such an easy problem.
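Here is a rough sketch of that calculation in Python, including the standard error that the solution mentions but does not compute. It assumes a simple random sample with no nonresponse and treats the 30% turnout as known exactly. The variable names are mine, not part of the exam.

```python
import math

n = 1000            # survey respondents
said_voted = 400    # respondents who report having voted
turnout = 0.30      # true turnout, assumed known without error

y = said_voted / n                   # observed proportion who say they voted
p = (y - turnout) / (1 - turnout)    # solve turnout + (1 - turnout) * p = y

# Assumption: simple random sampling, so the usual binomial standard error applies.
se_y = math.sqrt(y * (1 - y) / n)
# p is a linear function of y, so its standard error is se_y scaled by 1 / (1 - turnout).
se_p = se_y / (1 - turnout)

print(f"p = {p:.3f}, se = {se_p:.3f}")   # p = 0.143, se = 0.022
```

So under these assumptions the estimate is about 14% with a standard error of roughly 2 percentage points.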

P.S. As some commenters note, Problem 1 isn’t so realistic. Commenter awm points out that “for the most part people aren’t lying and that the sorts of people who participate in surveys about elections are disproportionately the sort of people who vote.” My problem would’ve been cleaner if I’d also said to assume there was no nonresponse, and if I’d chosen a better example!

black and Black, white and White

I’ve always thought it looked strange to see people referred to in print as Black or White rather than black or white. For example consider this sentence: “A black guy was walking down the street and he saw a bunch of white guys standing around.” That looks fine, whereas “A Black guy was walking down the street and he saw a bunch of White guys standing around”—that looks weird to me, as if the encounter was taking place in an Ethnic Studies seminar.

But maybe I’m wrong on this. Jay Livingston argues that black and white are colors whereas Black and White are races (or, as I would prefer to say, ethnic categories) and illustrates with this picture of a white person and a White person:

In conversation, I sometimes talk about pink people, brown people, and tan people, but that won’t work in a research paper.

P.S. I suspect Carp will argue that I’m being naive: meanings of words change across contexts and over time. To which I reply: Sure, but I still have to choose how to write these words!

Question 1 of my final exam for Design and Analysis of Sample Surveys

1. Suppose that, in a survey of 1000 people in a state, 400 say they voted in a recent primary election. Actually, though, the voter turnout was only 30%. Give an estimate of the probability that a nonvoter will falsely state that he or she voted. (Assume that all voters honestly report that they voted.)

P.S. The commenters are picking up some of the unintended “Hare and pineapple” ambiguity in my question!