Familial Linkage between Neuropsychiatric Disorders and Intellectual Interests

When I spoke at Princeton last year, I talked with neuroscientist Sam Wang, who told me about a project he did surveying incoming Princeton freshmen about mental illness in their families. He and his coauthor Benjamin Campbell found some interesting results, which they just published:

A link between intellect and temperament has long been the subject of speculation. . . . Studies of the artistically inclined report linkage with familial depression, while among eminent and creative scientists, a lower incidence of affective disorders is found. In the case of developmental disorders, a heightened prevalence of autism spectrum disorders (ASDs) has been found in the families of mathematicians, physicists, and engineers. . . .

We surveyed the incoming class of 2014 at Princeton University about their intended academic major, familial incidence of neuropsychiatric disorders, and demographic variables. . . . Consistent with prior findings, we noticed a relation between intended academic majors and ASDs. Looking for relations between other neuropsychiatric disorders and academic interest we also noted a heightened prevalence of bipolar disorder, major depressive disorder and substance abuse in the families of those pursuing the humanities. A composite score based on these four heritable disorders was strongly correlated with a student’s intended academic major. Thus, familial risk toward a spectrum of psychopathologies can predict propensity toward technical versus humanist interests.

When I spoke with Sam last year we discussed various ways to analyze the data as well as various interpretations of the results, but I don’t actually remember any of our conversation except for the bit where he described to me how they conducted their study.

Charles Murray [perhaps] does a Tucker Carlson, provoking me to unleash the usual torrent of graphs

Charles Murray wrote a much-discussed new book, “Coming Apart: The State of White America, 1960-2010.”

David Frum quotes Murray as writing, in an echo of now-forgotten TV personality Tucker Carlson, that the top 5% of incomes “tends to be liberal—right? There’s no getting around it. Every way of answering this question produces a yes.”

[I’ve interjected a “perhaps” into the title of this blog post to indicate that I don’t have the exact Murray quote here so I’m relying on David Frum’s interpretation.]

Frum does me the favor of citing Red State Blue State as evidence, and I’d like to back this up with some graphs.

Frum writes:

Say “top 5%” to Murray, and his imagination conjures up everything he dislikes: coastal liberals listening to NPR in their Lexus hybrid SUVs. He sees that image so intensely that no mere number can force him to remember that the top 5% also includes the evangelical Christian assistant coach of a state university football team. . . .

To put it in graphical terms:


The more likely it is to be X, the more likely it is to be Not X?

This post is by Phil Price.

A paper by Wood, Douglas, and Sutton looks at “Beliefs in Contradictory Conspiracy Theories.” Unfortunately the subjects were 140 undergraduate psychology students, so one wonders how general the results are. I found this sort of arresting:

In Study 1 (n=137), the more participants believed that Princess Diana faked her own death, the more they believed she was murdered.  In Study 2 (n=102), the more participants believed that Osama Bin Laden was already dead when U.S. Special Forces raided his compound in Pakistan, the more they believed he is still alive.

As the article says, “conspiracy advocates’ distrust of official narratives may be so strong that many alternative theories are simultaneously endorsed in spite of any contradictions between them.”  But I think the authors overstate things when they say “One would think that there ought to be a negative correlation between beliefs in contradictory accounts of events — the more one believes in a particular theory, the less likely rival theories will seem.”  Well, one might think that, but actually a positive correlation makes sense to me.  I can see how, if you really think that a lot of what the government says is a lie, you would think “well, I don’t know exactly which part of the Bin Laden account is a lie but they are probably lying about something; maybe he was already dead, or maybe he’s still alive now, but I don’t know which.”  The authors realize this is what is going on, they just make too much of how surprising it should be.
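
To make that intuition concrete, here is a toy simulation in Python. It is only a sketch under made-up assumptions, not the analysis in the paper: each simulated person has a general level of distrust (the probability they assign to "the official story is a lie"), and splits that probability between two mutually exclusive alternative theories. The uniform distributions, the 0.3 to 0.7 split, and all variable names are hypothetical choices for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to keep the sketch self-contained."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is reproducible
belief_a, belief_b = [], []
for _ in range(5000):
    distrust = rng.random()          # P(official story is a lie), assumed uniform
    share_a = rng.uniform(0.3, 0.7)  # how that probability is split (assumed)
    belief_a.append(distrust * share_a)        # e.g. "already dead"
    belief_b.append(distrust * (1 - share_a))  # e.g. "still alive"

r = pearson(belief_a, belief_b)
print(f"correlation between the two contradictory beliefs: {r:.2f}")
```

In this setup no individual is incoherent (the two probabilities never add up to more than 1), yet the beliefs are positively correlated across people because both scale with the same underlying distrust. The sign of the correlation does depend on the assumptions: if people varied a lot in how they split their distrust between the two theories, it could shrink or turn negative.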

Philosophy of Bayesian statistics: my reactions to Hendry

Continuing with my discussion here and here of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:

David Hendry, “Empirical Economic Model Discovery and Theory Evaluation”:

Hendry presents a wide-ranging overview of scientific learning, with an interesting comparison of physical with social sciences. (For some reason, he discusses many physical sciences but restricts his social-science examples to economics and psychology.)

The only part of Hendry’s long and interesting article that I will discuss, however, is the part where he decides to take a gratuitous swing at Bayes. I don’t know why he did this, but maybe it’s part of some fraternity initiation thing, like TP-ing the dean’s house on Halloween.

Here’s the story. Hendry writes:

‘Prior distributions’ widely used in Bayesian analyses, whether subjective or ‘objective’, cannot be formed in such a setting either, absent a falsely assumed crystal ball. Rather, imposing a prior distribution that is consistent with an assumed model when breaks are not included is a recipe for a bad analysis in macroeconomics. Fortunately, priors are neither necessary nor sufficient in the context of discovery.

I could just laugh this off—but as someone who has published two books and hundreds of articles on applied Bayesian statistics, I think I’ll take Hendry seriously.

Let me start with the tone. I generally don’t like it when people take words or phrases they disagree with and put them in quotes. If you’re going to put “prior distributions” and “objective” in quotes, then please show the same disrespect to your other terms: “falsely” . . . “crystal ball” . . . “breaks” . . . “recipe” . . . “macroeconomics” . . . “discovery.”

But let me get to the substance. First, Hendry’s right. No statistical method is necessary. With sufficient effort, I think you can solve all statistical problems with Bayesian methods, or with robust methods, or with bootstrapping, or with any number of alternative approaches. Fuzzy sets would probably work too. Different approaches have different advantages, but I’m sure that if Hendry adopts a self-denying ordinance and decides to never use priors, he can solve all sorts of data analysis problems. He’ll just have to work really hard sometimes. But, to be fair, there are some problems that I have to work really hard on too. In short: econometrics methods tend to require more effort in complicated settings, but they often have appealing robustness properties. It’s fair enough that Hendry and I place different values on robustness vs. modeling flexibility.

My most serious criticism of Hendry’s above paragraph is the old, old story: he’s singling out Bayesian methods and priors as being particularly bad. Meanwhile all those likelihood functions and assumptions of additivity, symmetry, etc. just sneak in. Hendry’s standing at the back window with a shotgun, scanning for priors coming over the hill, while a million assumptions just walk right into his house through the front door.

Here’s Hendry’s summary:

The pre-existing framework of ideas is bound to structure any analysis for better or worse, but being neither necessary nor sufficient, often blocking, and unhelpful in a changing world, prior distributions should play a minimal role in data analyses that seek to discover useful knowledge.

I’m going to have to disagree. I could give a million examples of useful knowledge that can be discovered with the aid of prior distributions. For example, where are the houses in the U.S. that have high radon levels? What are the effects of redistricting? How much perchloroethylene does the body metabolize? What is public opinion on gay rights by state? Or, for a classic from Mosteller and Wallace in 1960, classify the authorship of the Federalist Papers using 1960s technology.

I’m not saying that Hendry and his colleagues need to be using Bayesian methods in their applied research. I’m not even saying that Bayesian methods are needed to solve the problems listed in the above paragraph. In practice these problems were indeed solved using Bayesian inference, but I think other approaches could get there too. What I am saying is, why is Hendry so sure that “prior distributions should play a minimal role” etc.? I’m really bothered when people go beyond the simple and direct, “I have no personal experience with Bayesian inference solving a useful problem” to prescriptive (and wrong) statements such as “prior distributions should play a minimal role.” And it’s just silly to say that priors are “unhelpful in a changing world.” I’d think an econometrician would know about time series models!

Hendry also pulls the no-true-Scotsman trick:

Fortunately, priors are neither necessary nor sufficient in the context of discovery. For example, children learn whatever native tongue is prevalent around them, be it Chinese, Arabic or English, for none of which could they have a ‘prior’. Rather, trial-and-error learning seems a child’s main approach to language acquisition: see Clark and Clark (1977). Certainly, a general language system seems to be hard wired in the human brain (see Pinker 1994; 2002) but that hardly constitutes a prior. Thus, in one of the most complicated tasks imaginable, which computers still struggle to emulate, priors are not needed.

This is a no-true-Scotsman argument because, when confronted with an example in which our brains figure things out using a pre-existing structure (not for Chinese, Arabic, or English, but for human language in general), Hendry simply says that this system that is “hard wired in the human brain . . . hardly constitutes a prior.” Huh? It’s definitely a prior. That’s the whole point: our brains are tuned to decode human language.

Why am I so bugged by a few throwaway paragraphs in an otherwise pretty good article? Hendry’s anti-Bayesian sentiments are no more clueless than those earlier expressed by, say, John DiNardo. The difference is that DiNardo was just venting his opinions and was pretty open about this, whereas Hendry’s presenting his prejudices with an air of expertise. If Hendry wants to work on “replacing unrestricted non-linear functions by an encompassing theory-derived form, such as an ogive,” then fine. His theoretical models of model selection seem interesting and could perhaps be useful. I just wish he’d cut out the part where he implicitly disparages the work of Mosteller and Wallace, Lax and Phillips, and a few zillion other researchers who’ve used Bayesian methods to solve problems.

It’s not too late for Hendry to reform (I hope). All he needs to do is retreat to presenting the positive virtues of his preferred inferential approach along with his explanations as to why Bayesian methods have not seemed useful for him. He’s an econometrician, he doesn’t work in toxicology, and that’s fine. I think both his positive and his negative statements would be stronger if he were more aware of the limits of his own experience. Just as, in mathematics, a theorem is clearer if you understand the range of its applicability and the areas where there are counterexamples.

Bayesian model-building by pure thought: Some principles and examples

This is one of my favorite papers:

In applications, statistical models are often restricted to what produces reasonable estimates based on the data at hand. In many cases, however, the principles that allow a model to be restricted can be derived theoretically, in the absence of any data and with minimal applied context. We illustrate this point with three well-known theoretical examples from spatial statistics and time series. First, we show that an autoregressive model for local averages violates a principle of invariance under scaling. Second, we show how the Bayesian estimate of a strictly-increasing time series, using a uniform prior distribution, depends on the scale of estimation. Third, we interpret local smoothing of spatial lattice data as Bayesian estimation and show why uniform local smoothing does not make sense. In various forms, the results presented here have been derived in previous work; our contribution is to draw out some principles that can be derived theoretically, even though in the past they may have been presented in detail in the context of specific examples.

I just love this paper. But it’s only been cited 17 times (and four of those were by me), so I must have done something wrong. In retrospect I think it would’ve made more sense to write it as three separate papers; then each might have had its own impact. In any case, I hope the article provides some enjoyment and insight to those of you who click through.

What is a prior distribution?

Some recent blog discussion revealed some confusion that I’ll try to resolve here.

I wrote that I’m not a big fan of subjective priors. Various commenters had difficulty with this point, and I think the issue was most clearly stated by Bill Jefferys, who wrote:

It seems to me that your prior has to reflect your subjective information before you look at the data. How can it not?

But this does not mean that the (subjective) prior that you choose is irrefutable; Surely a prior that reflects prior information just does not have to be inconsistent with that information. But that still leaves a range of priors that are consistent with it, the sort of priors that one would use in a sensitivity analysis, for example.

I think I see what Bill is getting at. A prior represents your subjective belief, or some approximation to your subjective belief, even if it’s not perfect. That sounds reasonable but I don’t think it works. Or, at least, it often doesn’t work.

Let’s start with a simple example. You hop on a scale that gives unbiased measurements with errors that have a standard deviation of 0.1 kg. To do Bayesian analysis, you assign a N(0,10000^2) prior on your true weight. That doesn’t represent your subjective belief! It’s not even an approximation. No problem—it works fine for most purposes—but it’s not subjective.
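
To see why this prior "works fine for most purposes," here is a minimal Python sketch of the standard normal-normal update with known measurement error. The scale reading of 70.0 kg is a made-up number for illustration, and the function name is just a label for this sketch; the 0.1 kg measurement error and the N(0,10000^2) prior are from the example above.

```python
import math

def normal_posterior(y, sigma, prior_mean, prior_sd):
    """Posterior mean and sd for a normal mean, given one measurement y
    with known sd sigma and a Normal(prior_mean, prior_sd^2) prior."""
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_prec = 1.0 / sigma ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_mean * prior_prec + y * data_prec) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical scale reading of 70.0 kg (not from the post).
post_mean, post_sd = normal_posterior(y=70.0, sigma=0.1,
                                      prior_mean=0.0, prior_sd=10000.0)
print(post_mean, post_sd)
```

The posterior is essentially the measurement itself with its 0.1 kg error: the prior carries so little weight that it has no practical effect, which is exactly why it is convenient and also why it says nothing about anyone's actual beliefs about their weight.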

More generally, think of all the linear and logistic regressions we use. Instead of thinking of these as subjective beliefs, I prefer to think of the joint probability distribution as a model, reflecting a set of assumptions. In some settings these assumptions represent subjective beliefs, in other settings they don’t.

This article from 2002 might help. If I could go back and alter it, I’d add something on weakly informative priors, but I still agree with the general approach discussed there.

P.S. Just to give an example of what I mean by prior information: The analyses in Red State Blue State all use noninformative prior distributions. But a lot of prior information comes in, in the selection of what questions to study, what models to consider, and what variables to include in the model. For example, as state-level predictors we include region of the country, Republican vote in the previous presidential election, and average state income. Prior information goes into the choice and construction of all these predictors. But the prior distribution is a particular probability distribution that in this case is flat and does not reflect prior knowledge.

One way to think about informative prior distributions is as a form of smoothing: when setting the parameters of a probability distribution based on prior knowledge, we are imposing some time smoothness on the parameters. I think that’s probably a good idea and that the Red State Blue State analyses (among others) would be better for it. I didn’t set up this prior structure because I wasn’t easily equipped to do so and it seemed like too much effort, but perhaps at some future time this sort of structuring will be as commonplace as hierarchical modeling is today.

“Turn a Boring Bar Graph into a 3D Masterpiece”

Jimmy sends in this.

Steps include “Make whimsical sparkles by drawing an ellipse using the Ellipse Tool,” “Rotate the sparkles . . . Give some sparkles less Opacity by using the Transparency Palette,” and “Add a haze around each sparkle by drawing a white ellipse using the Ellipse Tool.”

The punchline:

Now, the next time you need to include a boring graph in one of your designs you’ll be able to add some extra emphasis and get people to really pay attention to those numbers!

P.S. to all the commenters: Yeah, yeah, do your contrarian best and tell me why chartjunk is actually a good thing, how I’m just a snob, etc etc.

More on the economic benefits of universities

Last year my commenters and I discussed Ed Glaeser’s claim that the way to create a great city is to “create a great university and wait 200 years.”

I passed this on to urbanist Richard Florida and received the following response:

Web equation

Aleks sends along this app which, while cute, is not quite “killer” for me. I find it more difficult to write the equation using the trackpad than to simply type it in using LaTeX! But I suppose it could be useful to beginners who want their papers to look more like science.

Philosophy of Bayesian statistics: my reactions to Senn

Continuing with my discussion of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:

Stephen Senn, “You May Believe You Are a Bayesian But You Are Probably Wrong”:

I agree with Senn’s comments on the impossibility of the de Finetti subjective Bayesian approach. As I wrote in 2008, if you could really construct a subjective prior you believe in, why not just look at the data and write down your subjective posterior? The immense practical difficulties with any serious system of inference render it absurd to think that it would be possible to just write down a probability distribution to represent uncertainty. I wish, however, that Senn would recognize my Bayesian approach (which is also that of John Carlin, Hal Stern, Don Rubin, and, I believe, others). De Finetti is no longer around, but we are!

I have to admit that my own Bayesian views and practices have changed. In particular, I resonate with Senn’s point that conventional flat priors miss a lot and that Bayesian inference can work better when real prior information is used. Here I’m not talking about a subjective prior that is meant to express a personal belief but rather a distribution that represents a summary of prior scientific knowledge. Such an expression can only be approximate (as, indeed, assumptions such as logistic regressions, additive treatment effects, and all the rest, are only approximations too), and I agree with Senn that it would be rash to let philosophical foundations be a justification for using Bayesian methods. Rather, my work on the philosophy of statistics is intended to demonstrate how Bayesian inference can fit into a falsificationist philosophy that I am comfortable with on general grounds.

The inevitable problems with statistical significance and 95% intervals

I’m thinking more and more that we have to get rid of statistical significance, 95% intervals, and all the rest, and just come to a more fundamental acceptance of uncertainty.

In practice, I think we use confidence intervals and hypothesis tests as a way to avoid acknowledging uncertainty. We set up some rules and then act as if we know what is real and what is not. Even in my own applied work, I’ve often enough presented 95% intervals and gone on from there. But maybe that’s just not right.

I was thinking about this after receiving the following email from a psychology student:

Philosophy of Bayesian statistics: my reactions to Cox and Mayo

The journal Rationality, Markets and Morals has finally posted all the articles in their special issue on the philosophy of Bayesian statistics.

My contribution is called Induction and Deduction in Bayesian Data Analysis. I’ll also post my reactions to the other articles. I wrote these notes a few weeks ago and could post them all at once, but I think it will be easier if I post my reactions to each article separately.

To start with my best material, here’s my reaction to David Cox and Deborah Mayo, “A Statistical Scientist Meets a Philosopher of Science.” I recommend you read all the way through my long note below; there’s good stuff throughout:

1. Cox: “[Philosophy] forces us to say what it is that we really want to know when we analyze a situation statistically.”

This reminds me of a standard question that Don Rubin (who, unlike me, has little use for philosophy in his research) asks in virtually any situation: “What would you do if you had all the data?” For me, that “what would you do” question is one of the universal solvents of statistics.

2. Mayo defines scientific objectivity as concerning “the goal of using data to distinguish correct from incorrect claims about the world” and contrasts this with so-called objective Bayesian statistics. All I can say here is that the terms “subjective” and “objective” seem way overloaded at this point. To me, science is objective in that it aims for reproducible findings that exist independent of the observer, and it’s subjective in that the process of science involves many individual choices. And I think the statistics I do (mostly, but not always, using Bayesian methods) is both objective and subjective in that way.

3. Cox discusses Fisher’s rule that it’s ok to use prior information in design of data collection but not in data analysis. Like a lot of hundred-year-old ideas, this rule makes sense in some contexts but not in others. Consider the notorious study in which a random sample of a few thousand people was analyzed, and it was found that the most beautiful parents were 8 percentage points more likely to have girls, compared to less attractive parents. The result was statistically significant (p<.05) and published in a reputable journal. But in this case we have good prior information suggesting that the difference in sex ratios in the population, comparing beautiful to less-beautiful parents, is less than 1 percentage point. A classical design analysis reveals that, with this level of true difference, any statistically-significant observed difference in the sample is likely to be noise. (Even conditional on statistical significance, the observed difference has an over 40% chance of being in the wrong direction and will overestimate the population difference by an order of magnitude.) At this point, you might well say that the original analysis should never have been done at all, but given that it has been done, it is essential to use prior information to interpret the data and generalize from sample to population.

Where did Fisher’s principle go wrong here? The answer is simple—and I think Cox would agree with me here. We’re in a setting where the prior information is much stronger than the data. If one’s only goal is to summarize the data, then taking the difference of 8% (along with a confidence interval and even a p-value) is fine. But if you want to generalize to the population—which was indeed the goal of the researcher in this example—then it makes no sense to stop there.
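
For the curious, here is a rough Python sketch of the kind of design calculation mentioned above. The standard error of 3.3 percentage points and the true difference of 0.3 percentage points are illustrative assumptions, not numbers given in this post, and `design_analysis` is just a name for this sketch. Under a normal sampling model it computes the power, the probability that a statistically significant estimate has the wrong sign, and the expected factor by which a significant estimate overstates the true effect.

```python
from statistics import NormalDist

def design_analysis(true_effect, se, alpha=0.05):
    """Power, wrong-sign probability, and exaggeration factor for a
    two-sided z-test, assuming estimate ~ Normal(true_effect, se^2)
    and true_effect > 0."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    hi = z_crit - true_effect / se    # standardized upper cutoff
    lo = -z_crit - true_effect / se   # standardized lower cutoff
    p_pos = 1 - std.cdf(hi)           # significant, correct sign
    p_neg = std.cdf(lo)               # significant, wrong sign
    power = p_pos + p_neg
    wrong_sign = p_neg / power
    # E[|estimate|] restricted to significant results (truncated-normal means)
    mean_abs = (true_effect * p_pos + se * std.pdf(hi)
                - true_effect * p_neg + se * std.pdf(lo))
    exaggeration = mean_abs / power / true_effect
    return power, wrong_sign, exaggeration

# Assumed inputs, in percentage points: true difference 0.3, standard error 3.3.
power, wrong_sign, exaggeration = design_analysis(true_effect=0.3, se=3.3)
print(f"power {power:.3f}, wrong sign {wrong_sign:.2f}, "
      f"exaggeration {exaggeration:.0f}x")
```

With these assumed inputs the power is barely above the 5% false-positive rate, the wrong-sign probability is in the neighborhood of 40%, and a significant estimate overstates the true difference by more than an order of magnitude. Assuming a smaller true difference pushes the wrong-sign probability closer to 50%.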

Cox illustrates the difficulty in a later quote: “[Bayesians'] conceptual theories are trying to do two entirely different things. One is trying to extract information from the data, while the other, personalistic theory, is trying to indicate what you should believe, with regard to information from the data and other, prior, information treated equally seriously. These are two very different things.”

Yes, but Cox is missing something important! He defines two goals:
(a) Extracting information from the data.
(b) A “personalistic theory” of “what you should believe.”
I’m talking about something in between, which is inference for the population. I think Laplace would understand what I’m talking about here. The sample is (typically) of no interest in itself, it’s just a means to learning about the population. But my inferences about the population aren’t “personalistic”—at least, no more than the dudes at CERN are personalistic when they’re trying to learn about particle theory from cyclotron experiments, and no more than the Census and the Bureau of Labor Statistics are personalistic when they’re trying to learn about the U.S. economy from sample data.

4. Cox: “There are situations where it is very clear that whatever a scientist or statistician might do privately in looking at data, when they present their information to the public or government department or whatever, they should absolutely not use prior information, because the prior opinions on some of these prickly issues of public policy can often be highly contentious with different people with strong and very conflicting views.”

Maybe. But I don’t think Cox even believes this statement himself if it were taken literally. For example, right now I’m working on the politically controversial problem of reconstructing historical climate from tree rings. We have a lot of prior information on the processes under which tree rings grow and how they are measured. I don’t think anyone would want to just take raw numbers from core samples as a climate estimate! All the tools from Statistical Methods for Research Workers won’t take you from tree rings to temperature estimates. You need some scientific knowledge and prior information on where these measurements came from.

So let me interpret what I think Cox was saying. I take him to be dividing any scientific inference into two parts, inside and outside. Priors are allowed in the inside work of scientific modeling, which uses lots of external information, from the basic assumptions that the data correspond to your scientific goals, through the mathematical form of the transfer function, down to details such as an assumption of normally-distributed measurement errors, which might be supported based on prior experimental evidence. But Cox would prefer to avoid priors in the outside problem. In my example, I assume he’d allow prior information on the tree-ring measurement process—I don’t see how you can get anywhere otherwise—but he’d rather not combine with external estimates of the temperature series. That’s a tenable position. It doesn’t avoid all the controversy—manipulations of the data model can map in predictable ways to changes in the final inferences—but it could make sense.

I’ve followed this approach in much of my own applied work, using noninformative priors and carefully avoiding the use of prior information in the final stages of a statistical analysis. But that can’t always be the right choice. Sometimes (as in the sex ratio example above), the data are just too weak, and a classical textbook data analysis can be misleading. Imagine a Venn diagram, where one circle is “Topics that are so controversial that we want to avoid using prior information in the statistical analysis” and the other circle is “Problems where the data are weak compared to prior information.” If you’re in the intersection of these circles, you have to make some tough choices!

More generally, there is a Bayesian solution to the problem of sensitivity to prior assumptions. That solution is sensitivity analysis: perform several analyses using different reasonable priors. Make more explicit the mapping from prior and data to conclusions. Be open about sensitivity, don’t try to sweep the problem under the rug, etc etc. And, if you’re going that route, I’d also like to see some analysis of sensitivity to assumptions that are not conventionally classified as “prior.” You know, those assumptions that get thrown in because they’re what everybody does. For example, Cox regression is great, but additivity is a prior assumption too! (One might argue that assumptions such as additivity, logistic links, etc., are exempt from Fisher’s strictures by virtue of being default assumptions rather than being based on prior information—but I certainly don’t think Mayo would take that position, given her strong feelings on Bayesian default priors.)
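
As a small illustration of what such a sensitivity analysis might look like, here is a Python sketch applied to the sex-ratio example. The observed difference of 8 percentage points is from the example above; the standard error of 3.3 points, the zero-centered normal priors, and the particular prior standard deviations are assumptions chosen for illustration.

```python
def posterior(estimate, se, prior_mean, prior_sd):
    """Normal-normal update: posterior mean and sd for the population
    difference, given an estimate with standard error se."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    return post_mean, post_var ** 0.5

estimate, se = 8.0, 3.3                     # percentage points; se is assumed
prior_sds = [0.5, 1.0, 2.0, 5.0, 100.0]     # from tight to nearly flat (assumed)

results = {sd: posterior(estimate, se, 0.0, sd) for sd in prior_sds}
for sd, (mean, post_sd) in results.items():
    print(f"prior sd {sd:6.1f}: posterior mean {mean:5.2f}, "
          f"posterior sd {post_sd:4.2f}")
```

Laying the results out this way makes the mapping from prior to conclusion explicit: with a prior that says differences of more than a point or so are implausible, the posterior mean stays well under one percentage point, and only a nearly flat prior reproduces the raw estimate of 8.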

My point here is that all statistical methods require choices—assumptions, if you will. Not all your choices can be determined or even validated from the data at hand. If you don’t want your choices to be based on prior information, what other options do you have? You can rely on convention—using methods that appear in major textbooks and have stood the test of time—or maybe on theory. Both these meta-foundational approaches have their virtues but neither is perfect: Conventional methods are not necessarily good (as can be seen by noting that for many problems there are multiple conventional methods that give different results), and theory often doesn’t help (for example classical confidence intervals and hypothesis tests are insufficient in the simple sex-ratio problem noted above).

“the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature”

Q. D. Leavis wrote:

The answer does seem to be that the academic world, like other worlds, is run by the politicians, and sensitively scrupulous people tend to leave politics to other people, while people with genuine work to do certainly have no time as well as no taste for committee-rigging and the associated techniques. And then of course there are the forces of native stupidity reinforced by that blind hostility to criticism, reform, new ideas and superior ability which is human as well as academic nature.

Not that I’ve ever read anything by Mrs. Leavis (or, as the Brits used to write, Mrs Leavis). The above quote is one of the epigraphs to a book by Richard Kostelanetz. Whom I’ve never heard of, except in a footnote in John Rodden’s classic Orwell study, The Politics of Literary Reputation.

I’ll have more to say about Orwell in another post, but for now let me return to the above Leavis quote, to which I have three reactions:

1. On a personal level, I’m on Leavis’s side. I’d much rather work (or blog, which I feel is related to my work and is also a public service) than spend time on academic politics: forming coalitions, doing the pre-meeting meetings, trading favors, kissing up and kicking down, and all the rest.

To put it another way, I don’t like political games because (a) I’m not good at manipulation and deception, and (b) Much of politics is zero-sum, and I prefer to collaborate in positive-sum activities such as writing Stan.

2. But on a more practical level, somebody needs to do the dirty work. Every once in a while. I’ve encountered some administrators who are good at “committee-rigging,” etc., and others who show less political ability. I’ve seen people use political processes in a pointless destructive way (power for the sake of power), but others can use their political skills to foster smooth cooperation.

To put it another way, I require the political efforts of others to create the safe space I need to do my work. And it’s a special bonus when these political efforts are not “reinforced by that blind hostility to criticism, reform, new ideas and superior ability.”

3. As a political scientist, I recognize that politics is necessary. There’s no such thing as a non-political process. Politics is how we fight against entropy. Whatever non-politicized zones we have in life are often the result of continued political effort. As the saying goes, the price of liberty is eternal vigilance.

Ultimately I’ll have to go with #3.

Statistical Murder

Robert Zubrin writes in “How Much Is an Astronaut’s Life Worth?” (Reason, Feb 2012):

…policy analyst John D. Graham and his colleagues at the Harvard Center for Risk Analysis found in 1997 that the median cost for lifesaving expenditures and regulations by the U.S. government in the health care, residential, transportation, and occupational areas ranges from about $1 million to $3 million spent per life saved in today’s dollars. The only marked exception to this pattern occurs in the area of environmental health protection (such as the Superfund program) which costs about $200 million per life saved.

Graham and his colleagues call the latter kind of inefficiency “statistical murder,” since thousands of additional lives could be saved each year if the money were used more cost-effectively. To avoid such deadly waste, the Department of Transportation has a policy of rejecting any proposed safety expenditure that costs more than $3 million per life saved. That ceiling therefore may be taken as a high-end estimate for the value of an American’s life as defined by the U.S. government.

This reminds me of my old article on Value of Life – where the hidden cost of the Iraq war for the US comes to 720,000 lives lost (based on the huge cost).

A tax on inequality, or a tax to keep inequality at the current level?

My sometime coauthor Aaron Edlin cowrote (with Ian Ayres) an op-ed recommending a clever approach to taxing the rich.

In their article they employ a charming bit of economics jargon, using the word “earn” to mean “how much money you make.” They “propose an automatic extra tax on the income of the top 1 percent of earners.” I assume their tax would apply to unearned income as well, but they (or their editor at the Times) are just so used to describing income as “earnings” that they just threw that in. Funny.

Also, there’s a part of the article that doesn’t make sense to me.
Continue reading ‘A tax on inequality, or a tax to keep inequality at the current level?’ »

Convenient page of data sources from the Washington Post

Wayne Folta points us to this list.

G+ > Skype

I spoke at the University of Kansas the other day. Kansas is far away so I gave the talk by video. We did it using a G+ hangout, and it worked really well, much much better than when I gave a talk via Skype. With G+, I could see and hear the audience clearly, and they could hear me just fine while seeing my slides (or my face, I went back and forth). Not as good as a live presentation but pretty good, considering.

P.S. And here’s how to do it!

Conflict of interest disclaimer: I was paid by Google last year to give a short course.

How many parameters are in a multilevel model?

Stephen Collins writes:

I’m reading your Multilevel modeling book and am trying to apply it to my work. I’m concerned with how to estimate a random intercept model if there are hundreds/thousands of levels. In the Gibbs sampling, am I sampling a parameter for each level? Or, just the hyper-parameters? In other words, say I had 500 zipcode intercepts modeled as ~ N(m,s). Would my posterior be two dimensional, sampling for “m” and “s,” or would it have 502 dimensions?

My reply: Indeed you will have hundreds or thousands of parameters—or, in classical terms, hundreds or thousands of predictive quantities. But that’s ok. Even if none of those predictions is precise, you’re learning about the model.

See page 526 of the book for more discussion of the number of parameters in a multilevel model.
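To make the dimension count concrete, here is a minimal Gibbs-sampler sketch for a random-intercept model with 500 groups. It is an illustration under stated assumptions, not the book's code: the data are simulated, the data-level standard deviation `sigma_y` is treated as known, and the priors (flat on `m`, uniform on `s`) are choices made only to keep the conditional distributions simple. The function name `gibbs` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): J zip codes with n observations each.
J, n, sigma_y = 500, 5, 1.0
true_alpha = rng.normal(2.0, 0.5, size=J)
y = rng.normal(true_alpha[:, None], sigma_y, size=(J, n))
ybar = y.mean(axis=1)

def gibbs(ybar, n, sigma_y, n_iter=200, rng=rng):
    """Sample (alpha_1..alpha_J, m, s) for alpha_j ~ N(m, s^2), sigma_y known."""
    J = len(ybar)
    alpha, m, s = ybar.copy(), ybar.mean(), ybar.std()
    draws = np.empty((n_iter, J + 2))
    for t in range(n_iter):
        # alpha_j | m, s, y: precision-weighted compromise between ybar_j and m.
        prec = n / sigma_y**2 + 1 / s**2
        mean = (n * ybar / sigma_y**2 + m / s**2) / prec
        alpha = rng.normal(mean, np.sqrt(1 / prec))
        # m | alpha, s, assuming a flat prior on m.
        m = rng.normal(alpha.mean(), s / np.sqrt(J))
        # s | alpha, m, assuming a uniform prior on s:
        # s^2 is the sum of squares divided by a chi-square(J - 1) draw.
        s = np.sqrt(np.sum((alpha - m) ** 2) / rng.chisquare(J - 1))
        draws[t] = np.concatenate([alpha, [m, s]])
    return draws

draws = gibbs(ybar, n, sigma_y)
print(draws.shape)  # (200, 502): 500 intercepts plus m and s in every draw
```

Each iteration updates all 500 intercepts along with `m` and `s`, so the sampled vector has 502 components. If `sigma_y` were also estimated, it would add one more.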

Using predator-prey models on the Canadian lynx series

The “Canadian lynx data” is one of the famous examples used in time series analysis. And the usual models fit to these data in the statistics time-series literature don’t work well. Cavan Reilly and Angelique Zeringue write:

Reilly and Zeringue then present their analysis. Their simple little predator-prey model with a weakly informative prior way outperforms the standard big-ass autoregression models. Check this out:

Or, to put it into numbers, when they fit their model to the first 80 years and predict to the next 34, their root mean square out-of-sample error is 1480 (see scale of data above). In contrast, the standard model fit to these data (the SETAR model of Tong, 1990) has more than twice as many parameters but gets a worse-performing root mean square error of 1600, even when that model is fit to the entire dataset. (If you fit the SETAR or any similar autoregressive model to the first 80 years and use it to predict the next 34, the predictions are a disaster—the predicted values quickly go toward the mean and can’t even attempt to track the curve.)
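The parenthetical remark about forecasts drifting to the mean can be illustrated with a small sketch. This is not the SETAR model or anything fit to the lynx series: it is a plain linear AR(2) with made-up coefficients, chosen only to be stationary with cyclical behavior, and the starting values are hypothetical. The helper `rmse` shows the out-of-sample error measure mentioned above.

```python
import numpy as np

def ar2_forecast(last_two, mu, phi1, phi2, horizon):
    """Iterate the AR(2) point forecast, feeding each prediction back in."""
    prev2, prev1 = last_two[0] - mu, last_two[1] - mu
    out = []
    for _ in range(horizon):
        nxt = phi1 * prev1 + phi2 * prev2   # deviation from the mean
        out.append(mu + nxt)
        prev2, prev1 = prev1, nxt
    return np.array(out)

def rmse(pred, actual):
    """Root mean square error between predictions and held-out values."""
    diff = np.asarray(pred, float) - np.asarray(actual, float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical values, not estimates from the lynx data.
mu, phi1, phi2 = 1500.0, 1.4, -0.7
fc = ar2_forecast(last_two=(3000.0, 3500.0), mu=mu,
                  phi1=phi1, phi2=phi2, horizon=34)

# Distance of the forecast from the mean, early versus late in the horizon.
print(np.abs(fc[:5] - mu).round(0))
print(np.abs(fc[-5:] - mu).round(2))
```

Because the model is stationary, the forecast deviations shrink geometrically, so by the end of a 34-step horizon the predictions are essentially flat at `mu` and cannot follow further cycles.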

As Reilly and Zeringue note, the above graph shows potential room for improvement in the model, but even as is, it shows the huge benefits that can be obtained by attempting to model the underlying process rather than simply fitting the data using a conventional family of models.

(It’s funny for me to emphasize this point, given how often I use conventional models such as linear and logistic regression.)

P.S. The title and text above have been modified to reflect comments below with reference to models fit to the lynx data in the ecology literature. There appears to be not enough communication between ecologists and statisticians. The statistical point above still holds—a simple model with some reasonable structure can outperform a generic data-fitting model such as an autoregression—but you should probably check out some of the references given in the comments if you’re interested in the lynx example or ecology models more generally.

Educational monoculture

John Cook writes that he’d like to hear more people talk about “educational monoculture.” I don’t actually know John Cook but I enjoy reading his blog, so I feel like the least I can do is to honor his request.

I have to admit that I have a bit of a monocultural temperament myself. I have strong feelings about the right and wrong way to do things, and I don’t have much patience for what seems to me to be the wrong way. As a result, I’ve often disparaged or ignored important statistical developments because some small aspect of the new idea didn’t fit with my thinking. (On the plus side, I think I’ve disparaged or ignored lots more bad ideas that deserve oblivion.)

I’ve always been suspicious of the hedgehog/fox distinction because my impression is that just about everybody likes to think of him or herself as a fox. Being a hedgehog is like being “ideological”; most of us like to think of ourselves as pragmatic foxes. And in any case I think most statisticians are foxes.

One of the many positive outcomes of my mugging at Berkeley was a commitment to pluralism (for example, see here).

Beyond this, I move away from my natural monocultural instincts by teaching classes that include material I wouldn’t otherwise cover, by listening carefully to people I respect who do things in a different way than I do, and by thinking hard about why certain methods or attitudes that seem silly to me still remain popular.

Finally, my approach as a political scientist and public opinion researcher is to understand the views of others. I think I have a pretty good grip on why it can make sense for people to vote for Gingrich or Romney or Obama or Santorum or whatever, and I’m interested in understanding political ideologies as they manifest themselves in different areas (even in statistics, where political views range from Dennis Lindley to Jacob Wolfowitz).

“Moving beyond monoculture” doesn’t mean that I abandon my skepticism but it means that I should at least try to understand other approaches to looking at the world.

P.S. I thought the above discussion would be more useful than yet another argument about the extent to which modern education is such a scam etc.

Suggested resolution of the Bem paradox

There has been an increasing discussion about the proliferation of flawed research in psychology and medicine, with some landmark events being John Ioannidis’s article, “Why most published research findings are false” (according to Google Scholar, cited 973 times since its appearance in 2005), the scandals of Marc Hauser and Diederik Stapel, two leading psychology professors who resigned after disclosures of scientific misconduct, and Daryl Bem’s dubious recent paper on ESP, published to much fanfare in the Journal of Personality and Social Psychology, one of the top journals in the field.

Alongside all this are the plagiarism scandals, which are uninteresting from a scientific perspective but are relevant in that, in many cases, neither the institutions housing the plagiarists nor the editors and publishers of the plagiarized material seem to care. Perhaps these universities and publishers are more worried about bad publicity (and maybe lawsuits, given that many of the plagiarism cases involve law professors) than they are about scholarly misconduct.

Before going on, perhaps it’s worth briefly reviewing who is hurt by the publication of flawed research. It’s not a victimless crime. Here are some of the malign consequences:

- Wasted time and resources spent by researchers trying to replicate non-findings and chasing down dead ends.

- Fake science news bumping real science news off the front page.

- When the errors and scandals come to light, a decline in the prestige of higher-quality scientific work.

- Slower progress of science, delaying deeper understanding of psychology, medicine, and other topics that we deem important enough to deserve large public research efforts.

This is a hard problem!

There’s a general sense that the system is broken with no obvious remedies. I’m most interested in presumably sincere and honest scientific efforts that are misunderstood and misrepresented into more than they really are (the breakthrough-of-the-week mentality criticized by Ioannidis and exemplified by Bem). As noted above, the cases of outright fraud have little scientific interest but I brought them up to indicate that, even in extreme cases, the groups whose reputations seem at risk from the unethical behavior often seem more inclined to bury the evidence than to stop the madness.

If universities, publishers, and editors are inclined to look away when confronted with out-and-out fraud and plagiarism, we can hardly be surprised if they’re not aggressive against merely dubious research claims.

In the last section of this post, I briefly discuss several examples of dubious research that I’ve encountered, just to give a sense of the difficulties that can arise in evaluating such reports.

What to do (statistics)?

My generic solution to the statistics problems involved in estimating small effects is to replace multiple comparisons by multilevel modeling, that is, to estimate configurations rather than single effects or coefficients. This tactic won’t solve every problem but it’s my overarching conceptual framework. There’s lots of room for research on how to do better in particular problem settings.
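One simple way to see what estimating a configuration of effects together might look like is partial pooling under a normal-normal model. The sketch below is an illustration only, not a procedure taken from this post: the function name `partial_pool` is hypothetical, the data are simulated, the group-level variance is set by a crude method-of-moments estimate, and the standard errors are assumed known and positive.

```python
import numpy as np

def partial_pool(est, se):
    """Shrink noisy estimates toward their common mean (normal-normal model).

    tau^2 (the variance of the true effects) comes from a method-of-moments
    estimate, an assumption made for brevity; a full multilevel fit would
    also account for uncertainty in tau.
    """
    est, se = np.asarray(est, float), np.asarray(se, float)
    mu = est.mean()  # unweighted mean; adequate when the se's are equal
    tau2 = max(est.var(ddof=1) - np.mean(se ** 2), 0.0)
    weight = tau2 / (tau2 + se ** 2)   # 0 = complete pooling, 1 = no pooling
    return mu + weight * (est - mu)

rng = np.random.default_rng(1)
true_effects = rng.normal(0.0, 0.1, size=50)   # small true effects
se = np.full(50, 0.5)                          # each one measured noisily
est = rng.normal(true_effects, se)
pooled = partial_pool(est, se)

print("largest raw estimate:   ", np.abs(est).max().round(2))
print("largest pooled estimate:", np.abs(pooled).max().round(2))
```

When the true effects are small relative to the noise, the estimated `tau2` is small and every estimate is pulled strongly toward the mean, so the most extreme raw estimates no longer stand out the way they would if each were tested on its own.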

What to do (scientific publishing)?

I have clearer ideas of resolutions (at least in the short term) of the Bem paradox; in short, what to do with dubious but potentially interesting findings.

So far there seem to be two suggestions out there: Either publish such claims in top journals (as for example Bem’s in JPSP, or the contagion-of-obesity paper in NEJM), or the journals should reject them (perhaps from some combination of more careful review of methodology, higher standards than classical 5% significance, and Bayesian skepticism).

The problem with the publish-in-top-journals strategy is that it ensures publicity for some mistakes and it creates incentives for researchers to stretch their statistics to get a prestigious publication.

The problem with the reject-’em-all-and-let-the-Arxiv-sort-’em-out strategy is that it’s perhaps too rigorous. So many papers have potential methodological flaws. Recall that the Bem paper was published, which means in some sense that its reviewers thought the paper’s flaws were no worse than what usually gets published in JPSP. Long-term, sure, we’d like to improve methodological rigor, but in the meantime a key problem with Bem’s paper was not just its methodological flaws, it was also the implausibility of the claimed results.

So here’s my proposed solution. Instead of publishing speculative results in top journals such as JPSP, Science, Nature, etc., publish them in lower-ranked venues. For example, Bem could publish his experiments in some specialized journal of psychological measurement. If the work appears to be solid (as judged by the usual corps of referees), then publish it, get it out there. I’m not saying to send the paper to a trash journal; if it’s good stuff it can go in a good journal, the sort where peer review really means something. (I assume there’s also a journal of parapsychology but that’s probably just for true believers; it’s fair enough that Bem etc would like to publish somewhere that outsiders would respect.)

Under this system, JPSP could feel free to reject the Bem paper on the grounds that it’s too speculative to get the journal’s implicit endorsement. This is not suppression or censorship or anything like it, it’s just a recommendation that the paper be sent to a more specialized journal where there will be a chance for criticism and replication. At some point, if the findings are tested and replicated and seem to hold up, then it could be time for a publication in JPSP, Science, or Nature.

From the other side, this should be acceptable to the Bems and Fowlers who like to work on the edge. You still get your ideas out there in a respectable publication (and you still might even get a bit of publicity), and then you, the skeptics, and the rest of the scientific community can go at it in public.

There have also been proposals for more interactive publications of individual articles, with bloglike opportunities for discussion and replies. That’s fine too, but I think the only way to make real progress here is to accept that no individual article will tell the whole story, especially if the article is a report of new research. If the Bem finding is real, this can be demonstrated in a series of papers in some specialized journal.
Continue reading ‘Suggested resolution of the Bem paradox’ »

Chris Schmid on Evidence Based Medicine

Chris Schmid is a statistician at New England Medical Center who is an expert on evidence-based medicine. I invited him to present an introductory overview lecture on the topic at last year’s Joint Statistical Meetings, and here are his slides. All 123 of them. I don’t know how he expected to go through all of these in an hour. You could teach a semester-long course based on this material.

Good stuff, I recommend you all read it.

Difficulties in publishing non-replications of implausible findings

Eric Tassone points me to this news article by Christopher Shea on the challenges of debunking ESP. Shea writes:

Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a team that tried, but failed, to replicate those results. Here, he tells the Chronicle of Higher Education’s Tom Bartlett about the difficulties he’s had getting the results published.

Several journals told the team they wouldn’t publish a study that did no more than disprove a previous study. . . . An editor at another journal said he’d “only accept our paper if we ran a fourth experiment where we got a believer [in ESP] to run all the participants, to control for . . . experimenter effects.”

My reaction is, this isn’t as easy a question as it might seem. At first, one might share Ritchie’s frustration that a shoddy paper by Bem got published while Ritchie’s careful replication got dinged. But, as I wrote when the issue came up on the sister blog:

Setting aside the whole “psychic powers” thing, it makes sense to me not to run the new experiment. After all, it’s hardly news that ESP doesn’t work. If “ESP doesn’t work” were publishable, you could fill up a journal many times over with such findings. And what would be the point of that? Better to start a new journal with some catchy title such as Replications of Well-Known Findings. In the physics division, you could have articles demonstrating that objects fall down, not up. In the chemistry division, you could publish demonstrations that H2 + O2 yields H2O plus energy. The biology section could have a paper demonstrating that cats and dogs can’t produce offspring. And so on.

So I don’t know the answer here. On one hand, we can hardly require or even expect that journals fill their pages with dog-bites-man nonreplications. (And, even in a computerized era where there are no page limits, there are still constraints on the time of editors and reviewers.) On the other hand, this leads to an asymmetry where crap gets on the front page and the refutation doesn’t even get published on page B16.

Fight! (also a bit of reminiscence at the end)

Martin Lindquist and Michael Sobel published a fun little article in Neuroimage on models and assumptions for causal inference with intermediate outcomes. As their subtitle indicates (“A response to the comments on our comment”), this is a topic of some controversy. Lindquist and Sobel write:

Our original comment (Lindquist and Sobel, 2011) made explicit the types of assumptions neuroimaging researchers are making when directed graphical models (DGMs), which include certain types of structural equation models (SEMs), are used to estimate causal effects. When these assumptions, which many researchers are not aware of, are not met, parameters of these models should not be interpreted as effects. . . . [Judea] Pearl does not disagree with anything we stated. However, he takes exception to our use of potential outcomes notation, which is the standard notation used in the statistical literature on causal inference, and his comment is devoted to promoting his alternative conventions. [Clark] Glymour’s comment is based on three claims that he inappropriately attributes to us. Glymour is also more optimistic than us about the potential of using directed graphical models (DGMs) to discover causal relations in neuroimaging research . . .

Lindquist and Sobel’s arguments make sense to me, except on one point. They consider a causal setting z -> x -> y, where z is the treatment variable, x is the intermediate outcome, and y is the ultimate outcome, and much of their discussion centers on estimating the causal effect of x on y. I have two difficulties with their perspective:

1. If x is an observed variable that is not directly manipulated, I don’t know if it makes sense to talk about the effect of x on y, unconditional on the intervention that was used to change x. In their example, I’d talk about “the effect of x on y, if x is changed through z.” Different z’s can induce different effects of x on y.

2. Lindquist and Sobel talk about the effect of z on x. If z=0 or 1, they write x(z), so that the causal effect of z on x is x(1) – x(0) (or, more generally, x(1) compared to x(0), but we lose nothing by considering simple differences here). So far, so good.

But I get stuck at the next step, where they define the effect of x on y. If x can equal 0 or 1, they write y(z,x), so that the causal effect of x on y, conditional on z, is y(z,1) – y(z,0). At least, I think that’s what they’re saying.

The trouble is, I don’t see how the two parts of this model fit together. For any given item in the experiment, I think they’re following the rule that x(z) has a particular (although maybe unknown) value. But then I don’t see what it means to look at y(z,1) – y(z,0). For any particular value of z, it seems to me that only one of these two terms is possible. (For example, if x(z)=1, then y(z,1) is defined but y(z,0) seems meaningless.)
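The notation can be written out for a single unit to show where the difficulty arises. The sketch below is only a bookkeeping aid with made-up numbers; the dictionaries `x_of` and `y_of` and the helper `reachable` are hypothetical names, and nothing here is taken from Lindquist and Sobel's paper.

```python
# Hypothetical potential outcomes for one unit; the numbers are made up.
x_of = {0: 0, 1: 1}                  # x(z): intermediate outcome under z
y_of = {(0, 0): 10, (0, 1): 14,      # y(z, x): ultimate outcome
        (1, 0): 11, (1, 1): 17}

def effect_z_on_x():
    """x(1) - x(0)."""
    return x_of[1] - x_of[0]

def effect_x_on_y(z):
    """y(z, 1) - y(z, 0), the effect of x on y conditional on z."""
    return y_of[(z, 1)] - y_of[(z, 0)]

def reachable(z, x):
    """True if assigning treatment z would actually produce this value of x."""
    return x_of[z] == x

for z in (0, 1):
    print(f"z={z}: y(z,1) - y(z,0) = {effect_x_on_y(z)}, "
          f"y(z,1) reachable: {reachable(z, 1)}, "
          f"y(z,0) reachable: {reachable(z, 0)}")
```

For this unit, each contrast y(z,1) – y(z,0) combines one term that assignment z could produce with one term it could not, since x(z) takes only one value. Writing down a full table for `y_of` therefore requires an assumption about some way of setting x other than through z.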

I’m not saying that this framework is wrong, just that I don’t understand it.

That said, Lindquist and Sobel’s criticisms of Pearl and Glymour seem sound to me.
Continue reading ‘Fight! (also a bit of reminiscence at the end)’ »

Advice on do-it-yourself stats education?

Dustin Palmer writes:

I am a recent graduate looking for a bit of advice. While I took intro classes on math and statistics in my undergraduate degree as a political science major, I find myself university-less and seeking to develop my statistics toolkit.

I work for an NGO in the international development field. I think that a solid statistics foundation would offer me not only more career opportunities, but more importantly, a deeper and more nuanced understanding of the processes and problems that interest me. I’m talking about field experiments and practical quantitative and qualitative data analysis.

I have plenty of free time, ambition, and enthusiasm to improve this part of my toolbox, but I lack an attachment to an institution and much in the way of financial resources. How would you go about making a concentrated effort at acquiring an understanding of the field and its actual application in something like R or Stata, which I admit to never having used?

Perhaps I am simply asking about web resources or best texts, but any broad advice would be much appreciated too.

My gut recommendation is to start with a problem you care about and figure out what you need to get a reasonable solution, then go to the next problem, and so forth. For books, you could start with The Statistical Sleuth and my book with Jennifer. If you want to learn R, just try to make some pretty and useful graphs, that will motivate you to be able to do more.

Any other suggestions?