Archive of posts filed under the Causal Inference category.

Workshop on Interpretable Machine Learning

Andrew Gordon Wilson sends along this conference announcement: NIPS 2017 Symposium: Interpretable Machine Learning, Long Beach, California, USA, December 7, 2017. Call for Papers: We invite researchers to submit their recent work on interpretable machine learning from a wide range of approaches, including (1) methods that are designed to be more interpretable from the start, […]

What am I missing and what will this paper likely lead researchers to think and do?

This post is by Keith. In a previous post, Ken Rice brought our attention to a recent paper he had published with Julian Higgins and Thomas Lumley (RHL). After I obtained access and read the paper, I made some critical comments regarding RHL which ended with “Or maybe I missed something.” This post will try to discern […]

“From ‘What If?’ To ‘What Next?’: Causal Inference and Machine Learning for Intelligent Decision Making”

Panos Toulis writes in to announce this conference: NIPS 2017 Workshop on Causal Inference and Machine Learning (WhatIF2017), “From ‘What If?’ To ‘What Next?’: Causal Inference and Machine Learning for Intelligent Decision Making,” December 8, 2017, Long Beach, USA. Submission deadline for abstracts and papers: October 31, 2017. Acceptance decisions: November 7, 2017. […]

Air rage update

So. Marcus Crede, Carol Nickerson, and I published a letter in PPNAS criticizing the notorious “air rage” article. (Due to space limitations, our letter contained only a small subset of the many possible criticisms of that paper.) Our letter was called “Questionable association between front boarding and air rage.” The authors of the original paper, […]

Causal inference using data from a non-representative sample

Dan Gibbons writes: I have been looking at using synthetic control estimates for estimating the effects of healthcare policies, particularly because, for county-level data, say, the nontreated comparison units one would use in a difference-in-differences estimator or a quantile DID estimator (if one didn’t want to use the mean) are not especially clear. However, given […]
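For concreteness, here is a minimal sketch of the plain 2x2 difference-in-differences estimator Gibbons mentions. The data layout and column names are invented for illustration; a synthetic-control or quantile-DID analysis would take considerably more machinery than this.

    import pandas as pd

    # Hypothetical county-level panel, one row per county-year, with
    # invented columns: 'treated' (1 if the county adopts the policy),
    # 'post' (1 for years after adoption), and 'outcome'.
    def did_estimate(df: pd.DataFrame) -> float:
        """Classic 2x2 difference in differences on group means."""
        m = df.groupby(["treated", "post"])["outcome"].mean()
        return (m.loc[(1, 1)] - m.loc[(1, 0)]) - (m.loc[(0, 1)] - m.loc[(0, 0)])

    # Toy numbers: treated counties improve by 3, untreated by 1,
    # so the DID estimate is 2.
    toy = pd.DataFrame({
        "treated": [1, 1, 0, 0],
        "post":    [0, 1, 0, 1],
        "outcome": [10.0, 13.0, 8.0, 9.0],
    })
    print(did_estimate(toy))  # 2.0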

“How conditioning on post-treatment variables can ruin your experiment and what to do about it”

Brendan Nyhan writes: Thought this might be of interest – a new paper with Jacob Montgomery and Michelle Torres, “How conditioning on post-treatment variables can ruin your experiment and what to do about it.” The post-treatment bias from dropout on Turk that you just posted about is actually, in my opinion, a less severe problem than inadvertent […]
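The mechanism is easy to demonstrate by simulation. Here is a minimal sketch (this is not the paper’s analysis; the variable names and effect sizes are invented): treatment is randomized with a true effect of 1, and a post-treatment variable is affected by both the treatment and an unobserved trait, so adjusting for it reopens a confounding path.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    z = rng.binomial(1, 0.5, n)           # randomized treatment
    u = rng.normal(size=n)                # unobserved trait
    m = z + u + rng.normal(size=n)        # post-treatment variable
    y = 1.0 * z + u + rng.normal(size=n)  # outcome; true effect = 1

    # Unadjusted difference in means: unbiased under randomization.
    print(y[z == 1].mean() - y[z == 0].mean())  # close to 1.0

    # "Controlling" for the post-treatment variable m induces a
    # correlation between z and u, biasing the treatment coefficient.
    X = np.column_stack([np.ones(n), z, m])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    print(beta[1])  # roughly 0.5 in this setup, well below the true 1.0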

Rosenbaum (1999): Choice as an Alternative to Control in Observational Studies

Winston Lin wrote in a blog comment earlier this year: Paul Rosenbaum’s 1999 paper “Choice as an Alternative to Control in Observational Studies” is really thoughtful and well-written. The comments and rejoinder include an interesting exchange between Manski and Rosenbaum on external validity and the role of theories. And here it is. Rosenbaum begins: In […]

Causal identification + observational study + multilevel model

Sam Portnow writes: I am attempting to model the impact of tax benefits on children’s school readiness skills. Obviously, receipt of benefits is itself confounded, so I am trying to use the doubling of the maximum allowable additional child tax credit in 2003 to get an unbiased estimate of the effect of benefits. I was initially planning to attack this […]

What are best practices for observational studies?

Mark Samuel Tuttle writes: Just returned from the annual meeting of the American Medical Informatics Association (AMIA); in attendance were many from Columbia. One subtext of conversations I had with the powers that be in the field is the LACK of Best Practices for Observational Studies. They all agree that, however difficult such studies are, […]

The Pandora Principle in statistics — and its malign converse, the ostrich

The Pandora Principle is that once you’ve considered a possible interaction or bias or confounder, you can’t un-think it. The malign converse is when people realize this and then design their studies to avoid putting themselves in a position where they have to consider some potentially important factor. For example, suppose you’re considering some policy […]

Torture talk: An uncontrolled experiment is still an experiment.

Paul Alper points us to this horrifying op-ed by M. Gregg Bloche about scientific study of data from U.S. military torture programs. I’ll leave the torture stuff to the experts or this guy who you’ve probably heard of. Instead, I have a technical point to make. In the op-ed, Bloche writes: In a true experimental […]

Does declawing cause harm?

Alex Chernavsky writes: I discovered your blog through a mutual friend – the late Seth Roberts. I’m not a statistician. I’m a cat-loving IT guy who works for an animal shelter in Upstate New York. I have a dataset that consists of 17 years’ worth of animal admissions data. When an owner surrenders an animal to us, […]

It’s hard to know what to say about an observational comparison that doesn’t control for key differences between treatment and control groups, chili pepper edition

Jonathan Falk points to this article and writes: Thoughts? I would have liked to have seen the data matched on age, rather than simply using age in a Cox regression, since I suspect that’s what’s really going on here. The non-chili-eaters were much older, and I suspect that the failure to interact age, or […]
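Falk’s suggested fix is easy to express in code. A hedged sketch using the lifelines package; the file, data frame, and column names are all hypothetical, and this is not a reproduction of the original analysis:

    import pandas as pd
    from lifelines import CoxPHFitter

    # Hypothetical columns: 'time' (follow-up), 'death' (event
    # indicator), 'chili' (regular chili-pepper consumption), and
    # 'age' at baseline.
    df = pd.read_csv("survey.csv")  # placeholder file name

    # Let the chili effect vary with age instead of assuming one
    # proportional effect across all ages.
    df["chili_x_age"] = df["chili"] * (df["age"] - df["age"].mean())

    cph = CoxPHFitter()
    cph.fit(df[["time", "death", "chili", "age", "chili_x_age"]],
            duration_col="time", event_col="death")
    cph.print_summary()

Matching treated and untreated respondents on age before fitting, as Falk suggests, would be the more conservative alternative.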

Multilevel modeling: What it can and cannot do

Today’s post reminded me of this article from 2005: We illustrate the strengths and limitations of multilevel modeling through an example of the prediction of home radon levels in U.S. counties. . . . Compared with the two classical estimates (no pooling and complete pooling), the inferences from the multilevel models are more reasonable. . . . […]
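The no-pooling/complete-pooling contrast in that article comes down to a precision-weighted compromise. A minimal numerical sketch (the counts and variance components below are invented, not taken from the radon data): the multilevel estimate for a county pulls the county mean toward the overall mean, and small counties get pulled hardest.

    def partial_pool(ybar_j, n_j, sigma_y, sigma_alpha, ybar_all):
        """Approximate multilevel estimate for one county: a
        precision-weighted average of the county mean (no pooling)
        and the overall mean (complete pooling)."""
        w = (n_j / sigma_y**2) / (n_j / sigma_y**2 + 1 / sigma_alpha**2)
        return w * ybar_j + (1 - w) * ybar_all

    # Invented numbers: a 2-house county is pulled most of the way
    # toward the overall mean; a 100-house county barely moves.
    for n in (2, 100):
        print(n, partial_pool(ybar_j=1.5, n_j=n, sigma_y=0.8,
                              sigma_alpha=0.3, ybar_all=1.1))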

How does a Nobel-prize-winning economist become a victim of bog-standard selection bias?

Someone who wishes to remain anonymous writes in with a story: Linking to a new paper by Jorge Luis García, James J. Heckman, and Anna L. Ziff, the economist Sue Dynarski makes this “joke” on Facebook—or maybe it’s not a joke: How does one adjust standard errors to account for the fact that N of […]

His concern is that the authors don’t control for the position of games within a season.

Chris Glynn wrote last year: I read your blog post about middlebrow literature and PPNAS the other day. Today, a friend forwarded me this article in The Atlantic that (in my opinion) is another example of what you’ve recently been talking about. The research in question is focused on Major League Baseball and the […]

How to design future studies of systemic exercise intolerance disease (chronic fatigue syndrome)?

Someone named Ramsey writes on behalf of a self-managed support community of 100+ systemic exercise intolerance disease (SEID) patients. He read my recent article on the topic and had a question regarding the following excerpt: For conditions like S.E.I.D., then, the better approach may be to gather data from people suffering “in the wild,” combining […]

Statisticians and economists agree: We should learn from data by “generating and revising models, hypotheses, and data analyzed in response to surprising findings.” (That’s what Bayesian data analysis is all about.)

Kevin Lewis points us to this article by economist James Heckman and statistician Burton Singer, who write: All analysts approach data with preconceptions. The data never speak for themselves. Sometimes preconceptions are encoded in precise models. Sometimes they are just intuitions that analysts seek to confirm and solidify. A central question is how to revise […]

Analyze all your comparisons. That’s better than looking at the max difference and trying to do a multiple comparisons correction.

[cat picture] The following email came in: I’m in a PhD program (poli sci) with a heavy emphasis on methods. One thing that my statistics courses emphasize, but that doesn’t get much attention in my poli sci courses, is the problem of simultaneous inferences. This strikes me as a problem. I am a bit unclear […]
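Gelman’s usual advice here is to fit a multilevel model to all the comparisons at once, so each estimate is shrunk by an amount learned from the data, rather than testing only the largest difference with a Bonferroni-style penalty. A minimal empirical-Bayes sketch of the idea (the eight estimates and their standard error are invented):

    import numpy as np

    # Invented: eight estimated comparisons, each with standard error 1.
    est = np.array([2.8, 1.1, -0.4, 0.6, -1.9, 0.2, 1.4, -0.7])
    se = 1.0

    # Method-of-moments estimate of the between-comparison variance.
    tau2 = max(est.var(ddof=1) - se**2, 0.0)

    # Shrink every comparison toward the grand mean; the shrinkage
    # factor is shared across all of them, not applied only to the max.
    shrink = tau2 / (tau2 + se**2)
    mu = est.mean()
    print(mu + shrink * (est - mu))

A full Bayesian version would also propagate the uncertainty in tau2, but the qualitative point survives: all eight comparisons get reported, all partially pooled.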

The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and media exposure

The starting point is that we’ve seen a lot of talk about frivolous science, headline-bait such as the study that said that married women are more likely to vote for Mitt Romney when ovulating, or the study that said that girl-named hurricanes are more deadly than boy-named hurricanes, and at this point some of these […]