Evens Salies writes:
I have a question regarding a randomization constraint in my currently funded electricity experiment.
After eliminating missing data we have 110 volunteer households from a larger population (resource constraints do not allow us to have more households!). I randomly assign them to treated and untreated groups, where the treatment is some ICT that allows the treated to track their electricity consumption in real time. The ICT consists of two devices, one plugged into the household’s modem and the other into the electric meter. A necessary condition for being treated is that the distance between the modem and the meter be below some threshold (d), whose value is approximately 20 meters.
50 ICTs can be installed.
60 households will be in the control group.
But only 6 households in the control group have d less than 20. Therefore, I have only 6 households in the control group who have a counterfactual in the treated group. To put it differently, for 54 households in the control group the overlap assumption is violated, because these 54 households could never have been treated. Am I correct to say this?
Could you please point me to a paper on Program Evaluation/Causal Inference that addresses this issue? Should I discard the 54 households who could not be treated (due to the distance constraint)? This would seem unfair in such a small trial.
I don’t know of any references on this (beyond chapters 9 and 10 of my book with Jennifer). My quick answer is that you should model your outcome conditional on the treatment indicator and also this distance variable. If distance doesn’t matter at all, maybe you’re ok, and if it does matter, maybe a reasonable model will correct for the non-overlap. Or maybe you’ll be able to keep most of the data and just discard some cases with extreme values of the predictor.
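As a rough sketch of that advice (all numbers here are made up, not from the actual experiment): if you simulate an outcome that depends on both distance and treatment, with treatment only possible below the distance threshold, a regression of the outcome on the treatment indicator plus distance can still recover the treatment effect, because the model adjusts for the variable that drives the non-overlap.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 110

# Hypothetical modem-to-meter distances in meters (values are illustrative)
distance = rng.uniform(1, 60, n)

# Treatment is only feasible below the ~20 m threshold
eligible = distance < 20
treated = eligible & (rng.random(n) < 0.9)

# Hypothetical outcome: consumption falls with distance and with treatment
# (true treatment effect set to -5.0 for the simulation)
y = 50.0 - 0.1 * distance - 5.0 * treated + rng.normal(0, 2, n)

# Regress the outcome on an intercept, the treatment indicator, and distance
X = np.column_stack([np.ones(n), treated.astype(float), distance])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimated treatment effect:", beta[1])
print("estimated distance slope:", beta[2])
```

Of course this only works to the extent that the linear model for distance is roughly right; in the real data you would also want to check whether the distance–outcome relationship looks similar in the region where treated and control households overlap, and consider trimming control households with extreme distances, as suggested above.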