Where to start reading about causality

A correspondent writes:

I’ve recently started skimming your blog (perhaps steered there by Brad deLong or Mark Thoma) but despite having waded through such enduring classics as Feller Vol II, Henri Theil’s “Econometrics”, James Hamilton’s “Time Series Analysis”, and T.W. Anderson’s “Multivariate Analysis”, I’m finding some of the discussions such as Pearl/Rubin a bit impenetrable. I don’t have a stats degree so I am thinking there is some chunk of the core curriculum on modeling and causality that I am missing. Is there a book (likely one of yours – e.g. Bayesian Data Analysis) that you would recommend to help fill in my background?

1. I recommend the new book, “Mostly Harmless Econometrics,” by Angrist and Pischke (see my review here).

2. After that, I’d read the following chapters from my book with Jennifer:

Chapter 9: Causal inference using regression on the treatment variable

Chapter 10: Causal inference using more advanced models

Here are some pretty pictures, from the low-birth-weight example:

[Figure: fig10.3.png]

and from the Electric Company example:

[Figure: fig23.1_small.png]

3. Beyond this, you could read the books by Morgan and Winship and by Pearl, but both of these are a bit more technical and less applied than the two books linked to above.

The commenters may have other suggestions.

7 Comments

  1. Nis says:

    John Goldthorpe has a not very mathematical way of explaining different ways of thinking about causation/causality. Being a sociologist, he is not overly concerned with the technicalities involved in establishing causal relations through a model – at least he isn't in this article. Instead he wants to find out how to understand causality as a social scientist (or non-economist). The article gives plenty of food for thought.


    This link takes you to the abstract for the article.

  2. Anonymous Coward says:

    I don't know what technical level is appropriate. I found several papers by Heckman to be interesting reading, including this one:
    http://ideas.repec.org/p/iza/izadps/dp3425.html

    There's also this:
    http://jenni.uchicago.edu/discussion/discussion.h

  3. bill r says:

    Pearl (at least) addresses a broader question: How to infer the causal structure from observational data. This is, to me, much more interesting than just estimating the causal effect.

    Another book that addresses that is "Causation, Prediction and Search" by Spirtes et al.

  4. Andrew Gelman says:

    Thanks for the additional references.

    Bill: I don't think it makes sense to talk about "inferring the causal structure [as distinct from estimating causal effects] from observational data." At least not in the social and environmental science problems on which I work. But I respect that others have a different view on this, and I appreciate the link.

  5. bill r says:

    Andrew,
    I use it in product development to figure out which attribute knobs to turn to make the end-users happier. I may not have been clear: once the structure is in hand, estimating the causal effects of those attributes becomes important. There are multiple advocates (e.g., the widget folks want their projects funded), and the network and the resulting estimates help clarify the discussion.

    Since we can test our changes relatively easily, if there were only one cause, we could cost-optimize it (which we have done for some parts of the products).

  6. Thomas R says:

    Hi Andrew,

    You said:

    'I don't think it makes sense to talk about "inferring the causal structure [as distinct from estimating causal effects] from observational data."'

    Following the earlier discussion there now seems to be general agreement that in certain situations adjusting for a pre-treatment variable (e.g., by propensity score) may introduce additional bias, the so-called M-bias (also known as "variable inclusion bias").

    Doesn't it follow from this that to have any confidence in reported estimates of causal effects we need to know a great deal about causal structure, in order to know that this is not occurring?

    In Pearl's language we need to know there are no open backdoor paths, in Rubin's language we need to know that treatment is ignorable, conditional on whatever we are adjusting for.

    Without such detailed knowledge there is no reason a priori why adjusting for more covariates should bring us any closer to achieving ignorability.

    The methods for inferring causal structure rely on assumptions (most controversially "faithfulness"). It is arguable whether these methods are usefully applicable to the social sciences, since data is often noisy, treatments may be ill-defined, and the results cannot usually be checked by experiment (as they sometimes can in biology).

    However, these methods do not assume anything like conditional ignorability and indeed they explicitly allow for the possibility of unmeasured confounders.

    In the absence of a detailed theory implying (rather than simply assuming) conditional ignorability why is there any more reason to trust estimates of causal effects obtained via adjustment than to trust "estimates" of causal structure?
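The M-bias Thomas describes can be seen in a small simulation. This is a minimal sketch under assumptions of my own, not from the discussion: a linear Gaussian M-structure in which two independent unmeasured variables U1 and U2 both cause a pre-treatment variable M, U1 also causes the treatment T, U2 also causes the outcome Y, and T has zero true effect on Y. The unadjusted regression of Y on T recovers the true (null) effect, while "controlling for" M opens the path T ← U1 → M ← U2 → Y and biases the estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# M-structure: U1 -> T, U1 -> M, U2 -> M, U2 -> Y; T has no effect on Y.
u1 = rng.normal(size=n)           # unmeasured cause of T and M
u2 = rng.normal(size=n)           # unmeasured cause of Y and M
m = u1 + u2 + rng.normal(size=n)  # observed pre-treatment variable (a collider)
t = u1 + rng.normal(size=n)       # treatment; true causal effect on Y is zero
y = u2 + rng.normal(size=n)       # outcome; does not depend on t

# Unadjusted OLS slope of y on t: consistent for the true effect (zero).
unadjusted = np.polyfit(t, y, 1)[0]

# "Adjusted" OLS: regress y on t and m jointly.
X = np.column_stack([t, m, np.ones(n)])
adjusted = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(f"unadjusted slope: {unadjusted:+.3f}")  # close to zero
print(f"adjusted slope:   {adjusted:+.3f}")    # about -0.2 under this setup
```

Conditioning on the collider M makes U1 and U2 dependent, so the adjusted coefficient on T is pushed away from zero (analytically to -1/5 with the unit variances used here) even though an extra covariate was adjusted for; which setting is closer to ignorability depends on the assumed causal structure, which is exactly the point at issue.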

  7. Andrew Gelman says:

    Bill:

    In your example, I'd say that you're not inferring the causal structure; what you're doing is inferring some coefficients and then, for convenience or future optimization, taking some coefficient estimates near zero and setting them to zero.

    Thomas:

    Certainly, you want to know if a variable is a pretest score or an instrument (for example); this is a key aspect of your model. But I don't see this as being inferred from the data; I see this as being pre-specified. As I noted above (in the part of this comment addressed to Bill), I agree that you can estimate the strengths of associations from data, and for convenience you might want to set some of these estimates to zero. In Rubin's framework, you generally won't "know" that treatment is ignorable in an observational study; the idea is to include more information in the model to make ignorability more of a good approximation. If you don't include key important information, you can have problems. See Dehejia and Wahba's classic paper for discussion and illustration of this point.