Causality and Statistical Learning
Andrew Gelman, Statistics and Political Science, Columbia University
Wed 27 Mar, 4pm, Betty Ford Auditorium, Ford School of Public Policy
Causal inference is central to the social and biomedical sciences. There are unresolved debates about the meaning of causality and the methods that should be used to measure it. As a statistician, I am trained to say that randomized experiments are a gold standard, yet I have spent almost all my applied career analyzing observational data. In this talk we shall consider various approaches to causal reasoning from the perspective of an applied statistician who recognizes the importance of causal identification yet must learn from available information.
I wish I didn’t have a meeting at 4 today. I’d love to head down from North Campus to check this out.
In your Review Essay, you discuss ‘adding time into the system’, noting that ‘this is not always so easy to do with observational data, and in many ways the goals are different: unidirectional causation in one case and equilibrium modeling in the other.’
There’s a recent paper in _Science_, ‘Detecting causality in complex ecosystems’ (George Sugihara, Robert May, et al., 26 October 2012, vol. 338, pages 496-500), that introduces ‘a method, based on nonlinear state space reconstruction, that can distinguish causality from correlation.’ It seems like this might address those differing goals.
Paul:
I took a look. I don’t have enough experience in this area to have a sense of what this method is doing. I could imagine it working in a setting such as ecology with clearly-defined linkages between state variables. I don’t think it would do much in the sorts of social and environmental science problems that I’ve worked on. That’s one of my themes: different methods can be appropriate in different application areas.
Curious about the point you make with the intelligence -> beer -> SES slide. For once, wish I were in Michigan and not sunny California.
Is the point that, after all the DAGging, they still have to say at the bottom that there may be other variables to worry about? Or is it rather that there are many possible experiments consistent with their DAGging, and so we’ll still have to worry about the details of “intervention”?
See pages 961-962 of this review.
Are Berger and Pope’s data available to the public? It could be an evening’s entertainment to analyze it properly. If nothing else, it seems ideally suited to illustrating how to do model order estimation and calculate confidence intervals. (It would be amusing to see what happens to the predicted values and the confidence intervals when you extrapolate their 5th degree polynomials across the tie score boundary.) Interesting too their choice to partition the data at delta=0. Did they actually treat it as a piecewise regression problem and identify the changepoint from first principles? The only ref I’ve got that addresses piecewise regression is a 1999 paper by Thomas Minka – http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.4002 – but, looking at the data, delta=0 seems an unlikely changepoint. (Minka cites Lyle Broemeling, Bayesian analysis of linear models, in his changepoint discussion.)
Ref: http://statmodeling.stat.columbia.edu/2009/08/14/you_cant_win_fo/
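To give a rough sense of what extrapolating a high-order polynomial across the boundary does to the confidence intervals, here is a minimal sketch that needs no outcome data at all. It computes the factor x0' (X'X)^{-1} x0, which multiplies the residual variance to give the variance of the fitted mean at a new point x0. Everything here is an assumption for illustration: the design (100 evenly spaced points rescaled to [-1, 0)), the evaluation point 0.5, and the helper name `prediction_variance_factor` are hypothetical and are not taken from Berger and Pope's data or code.

```python
import numpy as np

def prediction_variance_factor(x_fit, x_new, degree):
    """Return x0' (X'X)^{-1} x0 for a polynomial least-squares fit.

    Multiplying this factor by the residual variance gives the variance of
    the fitted mean at x_new, so larger values mean wider intervals.
    """
    # Design matrix with columns 1, x, x^2, ..., x^degree.
    X = np.vander(x_fit, degree + 1, increasing=True)
    # The same polynomial basis evaluated at the single new point.
    x0 = np.vander(np.atleast_1d(x_new), degree + 1, increasing=True)[0]
    return float(x0 @ np.linalg.solve(X.T @ X, x0))

# Hypothetical design: 100 points on one side of the boundary, rescaled to [-1, 0).
x_fit = np.linspace(-1.0, 0.0, 100, endpoint=False)
x_new = 0.5  # a point on the other side of the boundary

factors = {d: prediction_variance_factor(x_fit, x_new, d) for d in (1, 3, 5)}
for d, f in factors.items():
    print(f"degree {d}: variance factor {f:.3g}")
```

Because the lower-order models are nested in the higher-order ones, the factor cannot decrease as the degree goes up, and outside the range of the data it grows very quickly.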
Four comments:
1. The low order fits in their revised paper (http://www.stat.columbia.edu/~gelman/stuff_for_blog/devinpope.pdf) appear much more plausible than their 5th order fits.
2. They still need to make an argument for selecting the polynomial order(s) they did, e.g., show BIC values.
3. They should show confidence bounds associated with the polynomial order they settled on. Do reasonable bounds not include the points on the opposite side of delta=0?
4. They need to make a quantitative argument for there being a discontinuity at delta=0.
(I know, those are basically the same comments I made above.)
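For anyone who wants to try comments 2 and 4 on their own data, here is a minimal sketch of one way to do it: compare BIC across polynomial orders, and compare a fit with and without a step at delta=0. It uses synthetic data only, assumes independent Gaussian errors, and reports BIC up to an additive constant. The function name `bic_polyfit`, the simulated curves, and the noise level are all hypothetical; this is not the analysis in the paper.

```python
import numpy as np

def bic_polyfit(x, y, degree, jump_at=None):
    """Gaussian-error BIC (up to an additive constant) for a polynomial fit.

    If jump_at is given, an indicator column for x >= jump_at is added so
    the fitted curve is allowed to step at that point.
    """
    X = np.vander(x, degree + 1, increasing=True)
    if jump_at is not None:
        X = np.column_stack([X, (x >= jump_at).astype(float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    n, k = X.shape  # k counts the regression coefficients
    return n * np.log(rss / n) + k * np.log(n)

# Synthetic data: a smooth quadratic, and the same curve with a step at 0.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
smooth = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.1, x.size)
stepped = smooth + 1.0 * (x >= 0)

# Comment 2: pick the polynomial order with the lowest BIC.
bic_smooth = {d: bic_polyfit(x, smooth, d) for d in range(6)}
best_degree = min(bic_smooth, key=bic_smooth.get)

# Comment 4: does allowing a discontinuity at 0 lower the BIC?
bic_no_jump = bic_polyfit(x, stepped, 2)
bic_jump = bic_polyfit(x, stepped, 2, jump_at=0.0)
print(best_degree, bic_no_jump, bic_jump)
```

A lower BIC for the model with the step is evidence for a discontinuity only relative to the particular smooth alternative it is compared against, so the comparison is worth repeating at several polynomial orders.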