Jennifer Hill writes:
Registration for the 2016 Atlantic Causal Inference Conference is now live.
Stay tuned for short course registration (free for conference participants) and an announcement regarding a causal inference data analysis competition…both coming soon!
Also please consider signing up to give a lightning talk (link on website).
The conference will be held 26-27 May in NYC, and I strongly recommend it.
Sounds fantastic. Would love to see a black-box treatment effects inference competition. (Not a believer in black-box techniques, but it would be interesting).
James:
If you really think it’s a good idea to do this, you could probably set it up yourself and run it thru the conference!
Naive question: What exactly is a “black-box treatment effects inference competition”?
As Kaggle is to prediction, I guess a black-box inference competition would be to causality? Of course you'd probably want to use simulated data or data with good experimental estimates so that you have a comparison point. But I’d love to give the various matching/counterfactual prediction models a run on these sorts of problems.
This would be a cool dataset: http://www.kellogg.northwestern.edu/faculty/gordon_b/files/kellogg_fb_whitepaper.pdf
For prediction performance, you can use metrics like RMS error, log loss, etc. to evaluate entries in a competition.
What is an analogous way to evaluate performance in an inference / causality competition?
Know the truth
Right. Simulated data or large-scale experiment as the benchmark. If all the datasets are on the same scale, then maybe you could score participants by their average bias across datasets? Or perhaps score them by the posterior density they give to the (known) true effects?
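To make the two scoring ideas above concrete, here is a minimal sketch in Python. It is only an illustration, not the competition's actual scoring rule. The function names are hypothetical, and the density score assumes each participant summarizes their posterior for a dataset's treatment effect as a normal distribution with a reported mean and standard deviation.

```python
import math
from statistics import mean


def average_bias(estimates, truths):
    """Mean signed error of point estimates across datasets.

    Signed errors can cancel, so a participant who is too high on some
    datasets and too low on others can still score near zero.
    """
    return mean(e - t for e, t in zip(estimates, truths))


def average_abs_error(estimates, truths):
    """Mean absolute error, which does not let errors cancel."""
    return mean(abs(e - t) for e, t in zip(estimates, truths))


def mean_log_density(post_means, post_sds, truths):
    """Average log density that a normal posterior N(m, s^2) gives the truth.

    ASSUMPTION: posteriors are reported as (mean, sd) of a normal.
    Higher is better; overconfident wrong answers are penalized heavily.
    """
    def log_normal_pdf(x, m, s):
        return -0.5 * math.log(2 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2

    return mean(
        log_normal_pdf(t, m, s) for m, s, t in zip(post_means, post_sds, truths)
    )
```

Both scores require the datasets to be on a comparable scale, as noted above; otherwise one dataset with large effects would dominate the average.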
Wouldn’t such a contest be similar to what was described in this old blog post on here:
http://statmodeling.stat.columbia.edu/2013/04/13/18536/
If so, I thought the consensus in the comments thread there was that such determinations of causality were impossible unless experimental intervention was allowed?
i.e., given purely two columns of data, X and Y, is it ever possible to conclude whether X causes Y or the other way around?
who said anything about 2 columns?
“the two variables were temperature and altitude of cities in Germany, and they said that altitude causes temperature”
Maybe it is better to ask the question the other way around: What would a “black-box treatment effects inference competition” look like? What are the givens?
If you get 10 columns instead of 2 does that make things any easier?
Even given many related columns it will be impossible to determine the causal effect of B -> A without finding some exogenous variation in B. And yet policy and healthcare decisions get made every day based on massively confounded co-movements between B and A. I feel comfortable thinking about “black-box techniques” for causal inferences as a way to prevent some bad decisions being made, not to estimate precise treatment effects.
There have been several such competitions, see:
https://competitions.codalab.org/competitions/1381
http://www.causality.inf.ethz.ch/cause-effect.php
Isabelle Guyon’s ChaLearn organization has organized a bunch of these things. http://www.causality.inf.ethz.ch
There will be a black box treatment effects competition!!! Details should be announced within a week’s time.
This is quite literally the best news.
I have two related questions. First, if a thousand groups submit a response (an answer) to the black box treatment effects competition, then some response has to be the best, even if it is not a particularly good response (one with small error from the truth). How does one differentiate the best response from a lucky random causal discovery algorithm? (That is, there is an infinite number of random model selection algorithms, some fraction of which will construct a model close to the truth.) Second, and relatedly, how does one determine whether the winner was just lucky, in that the winning causal discovery algorithm worked pretty well with the contest data but would work less well (relative to other competing causal discovery algorithms) with different data?
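The luck concern can be illustrated with a small simulation. This is a sketch under deliberately extreme assumptions, not a model of any real contest: every entrant has identical (zero) skill, each estimate is the true effect plus independent normal noise, and entries are scored by RMSE. All names and default values are hypothetical.

```python
import math
import random


def rmse(errors):
    """Root mean squared error of a list of estimation errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def simulate_contest(n_entrants=1000, n_datasets=20, noise_sd=1.0, seed=0):
    """Pick the winner on contest data, then rescore that entrant on fresh data.

    ASSUMPTION: all entrants are equally (un)skilled, so any gap between
    the winner and the field on the contest data is pure chance.
    """
    rng = random.Random(seed)

    def scores():
        # One RMSE per entrant, from independent noise on each dataset.
        return [
            rmse([rng.gauss(0.0, noise_sd) for _ in range(n_datasets)])
            for _ in range(n_entrants)
        ]

    contest = scores()
    fresh = scores()
    winner = min(range(n_entrants), key=contest.__getitem__)
    return {
        "winner_contest_rmse": contest[winner],
        "winner_fresh_rmse": fresh[winner],
        "median_contest_rmse": sorted(contest)[n_entrants // 2],
    }


if __name__ == "__main__":
    print(simulate_contest())
```

Under these assumptions the winner's contest score is the minimum of many equally noisy scores, so it tends to look better than the field, while its score on the fresh datasets is just another draw from the same distribution.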
Follow-up replication at the next conference.
+1
Hi,
I noticed that the conference is sold out. Is there a waitlist for students who would like to attend the conference, in case someone else is unable to attend?