## Lack of complete overlap

Evens Salies writes:

I have a question regarding a randomizing constraint in my current funded electricity experiment.

After eliminating missing data we have 110 volunteer households from a larger population (resource constraints do not allow us to have more households!). I randomly assign them to treated and non-treated, where the treatment is some ICT that allows the treated to track their electricity consumption in real time. The ICT consists of two devices, one plugged into the household’s modem and the other into the electric meter. A necessary condition for being treated is that the distance between the box and the meter be below some threshold (d), whose value is approximately 20 meters.

50 ICTs can be installed.
60 households will be in the control group.

But I can only assign 6 households to the control group for whom d is less than 20. Therefore, I have only 6 households in the control group who have a counterfactual in the group of treated. To put it differently, for 54 households in the control group the overlap assumption is violated, because these 54 households could never have been treated. Am I correct to say this?

Please, could you point me to a paper on Program Evaluation/Causal Inference that addresses such an issue? Should I discard the 54 households who could not be treated (due to the distance constraint)? That would be unfair in such a small trial.

My response:

I don’t know of any references on this (beyond chapters 9 and 10 of my book with Jennifer). My quick answer is that you should model your outcome conditional on the treatment indicator and also this distance variable. If distance doesn’t matter at all, maybe you’re ok, and if it does matter, maybe a reasonable model will correct for the non-overlap. Or maybe you’ll be able to keep most of the data and just discard some cases with extreme values of the predictor.
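A minimal sketch of that kind of adjustment, in Python with made-up data (the variable names, effect sizes, and distance distribution are mine, not from the experiment): regress the outcome on the treatment indicator and the distance variable together, so that distance is controlled for rather than ignored.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 110 households; treatment is mostly confined to d < 20,
# and the outcome depends on both treatment and distance
n = 110
d = rng.uniform(2, 40, n)                       # distance to meter, meters
z = (d < 20) & (rng.random(n) < 0.9)            # treatment indicator
y = 5.0 - 0.8 * z + 0.05 * d + rng.normal(0, 1, n)

# Model the outcome conditional on the treatment indicator AND distance:
# y ~ 1 + z + d, fit by ordinary least squares
X = np.column_stack([np.ones(n), z.astype(float), d])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [intercept, treatment effect, distance slope]
```

If distance truly doesn’t matter, the distance coefficient will be near zero and dropping it changes little; if it does matter, including it is what corrects (to the extent the model is right) for the non-overlap.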

1. Kaiser says:

Some of the numbers don’t make sense here. It sounds like 6 out of 60 (10%) of the control group have d less than 20. If the treatment assignment is indeed random, as asserted, then 10% of the treated group should have d less than 20; flipped the other way, 90% of the treated group would have d >= 20, which would imply they couldn’t be treated at all.

• Evens Salies says:

Dear Kaiser.

Thanks for your comment. I should not have written “… I randomly assign them to treated and non-treated” and later added “But, I can only assign 6 households in the control group for whom d is less than 20.”

So the former sentence is true only for 6 households, and you can’t use your symmetry argument.

2. James says:

It sounds to me like this is a design question rather than a modeling issue (unless there are further details omitted). Something about this sounds rather odd: is it the situation that only 56 of the 110 households have d < 20? Or is it that only 6 out of 60 households in the control group have d < 20? In the latter case, presumably only a small fraction of households in the treatment group have d < 20, and so intervention uptake is going to be very weak.

• Evens Salies says:

Dear James:

Yes, 56 out of 110 households (the whole sample) have d < 20, and only 6 out of 60 households in the control group have d < 20. But from this you can deduce that 56 - 6 = 50 households in the treatment group (whole minus control) have d < 20; that is to say, all (not a small fraction of) households in the treatment group have d < 20.

Furthermore, you are not correct when you say that “intervention uptake is going to be very weak”. It will be weak only if the d variable is a proxy for other variables that may affect either the potential outcome or the decision by households to enroll. Regarding the former, I have to wait until I analyze the results of the experiment. Regarding the latter, fortunately, the d variable is not likely to have affected household participation in the experiment, as neither they nor we were aware of the constraint that d would impose on assignment to groups.

• Kaiser says:

Can you describe how your “randomization” is accomplished?

• Evens Salies says:

This is my Stata routine

summarize INDIV
scalar NTOTA=r(N) // Size of our sample -> NTOTA
scalar NGRP2=50 // Group 2's size (treat)
scalar NGRP1=NTOTA-NGRP2 // Group 1's size (ctrl)
generate RANDN=1+int(NTOTA*runiform()) // Generate pseudorandom integers over [1,NTOTA]
sort RANDN // Sort RANDN ascendingly, carrying INDIV along
generate CONTROL=1 if DIST>20 // Individuals who have to be in the ctrl group anyway
sort CONTROL, stable // Pool them first, in the order they appear (stable option)
generate IGRP=_n // Create an index from 1 to NTOTA
generate GRP=1 if IGRP<=NGRP1 // First NGRP1 individuals are assigned the number 1 (ctrl)
replace GRP=2 if IGRP>NGRP1 // Remaining individuals, from NGRP1+1 on, are assigned the number 2 (treat)
drop IGRP RANDN CONTROL
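For readers without Stata, roughly the same assignment logic in Python (a sketch with my own names; the forced-control rule and group sizes follow the routine above): shuffle the households, pool the forced controls (d > 20) first while keeping the random order within each pool, then let the first 60 positions be the control group.

```python
import random

def assign_groups(distances, n_treat=50, seed=1):
    """Households with d > 20 are forced controls; all households are
    shuffled, forced controls are stably pooled first, and the first
    (N - n_treat) positions become group 1 (ctrl), the rest group 2 (treat)."""
    rng = random.Random(seed)
    idx = list(range(len(distances)))
    rng.shuffle(idx)                                  # plays the role of RANDN
    forced   = [i for i in idx if distances[i] > 20]  # must be controls
    eligible = [i for i in idx if distances[i] <= 20]
    ordered = forced + eligible                       # stable partition
    n_ctrl = len(distances) - n_treat
    return {i: (1 if pos < n_ctrl else 2) for pos, i in enumerate(ordered)}

# Example mirroring the experiment: 110 households, 56 of them eligible
dists = [10.0] * 56 + [30.0] * 54
grp = assign_groups(dists)
n_ctrl_eligible = sum(1 for i, g in grp.items() if g == 1 and dists[i] <= 20)
print(n_ctrl_eligible)  # 6
```

With 56 eligible households, 54 forced controls, and 50 treatment slots, exactly 6 eligible households land in the control group, whatever the random order.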

3. Chuck says:

Re Kaiser, James. I think that the description is for a population of 110 of which 56 have d < 20 m.

I believe that there is a good chance that there would be many significant, systematic differences between houses with d < 20 and houses with d > 20. For example, all tiny houses with inside or adjacent meters (which probably tend to be older and owned or occupied by older, poorer, smaller households) will have d < 20. Perhaps gathering sufficient info on age of dwelling, income, and age and number of members of the household will allow adjustment for such effects. But it seems to me that the stratification on d < 20 m prior to measurement is a real problem.

Question: is it really true that you must have d < 20? Or is it the case that the system works with P = 0.95 if d < 20, P = 0.90 if 20 < d < 25, P = 0.80 if 25 < d < 30, etc.? If so, you might try some other strategy for dividing between treated and control populations. That probably depends on the expense of trying to install and of verifying a successful install.
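To make that concrete: under these hypothetical success probabilities, the expected number of working installs is just a weighted sum over distance bands. The counts per band below are invented purely for illustration.

```python
# Hypothetical success probability of the link, by distance band (from the
# comment above), and made-up counts of attempted installs per band
probs  = {"d < 20": 0.95, "20 <= d < 25": 0.90, "25 <= d < 30": 0.80}
counts = {"d < 20": 30,   "20 <= d < 25": 12,   "25 <= d < 30": 8}

expected = sum(counts[band] * probs[band] for band in probs)
print(expected)  # 45.7 expected successful installs out of 50 attempts
```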

• Evens Salies says:

Dear Chuck:

Thanks for your insights. The system works if d is approximately below 20 when there are walls between the meter and the box. Without walls, a higher d would work. But we can’t verify this in every house post-assignment, so the decision to take d at 20 was made. Note that the thickness of the walls is an important factor too; this matters, for example, when one or more floors separate the two devices.

4. Wouldn’t it make sense to block your houses into different groups, say d < 20, 20 < d < 30, and d > 30, and THEN randomize within the groups? For example you could say p = 0.10 for d > 30, p = 0.25 for 20 < d < 30, and p = 0.75 for d < 20, or some similar thing where the p values are chosen in a principled manner (perhaps via some kind of cost-of-information optimization).

It seems to me a waste of resources to split the d < 20 houses evenly, when the houses with d > 20 would act as control data. However you can only do this if you can convince yourself that you’ve correctly modeled the differences, which requires attempting the treatment on at least some of the d > 20 houses so that you have data to model the differences with.

As Chuck says above, most likely d = 20 is not a hard cut-off. You can then use the other characteristics of the house (year built, construction type, insulation type, square footage, occupant number, occupant demographics, etc) to attempt to control for the fact that you’ve biased yourself towards smaller d in the design.
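The blocking idea above can be sketched as follows. This is a simple Bernoulli version using the hypothetical p values from the comment (in practice one would usually fix the number treated within each block rather than flip a coin per household); block boundaries at the cutpoints are my own choice.

```python
import random

rng = random.Random(42)

def blocked_assign(distances):
    """Assign treatment with a probability that depends on the distance block:
    p = 0.75 for d < 20, p = 0.25 for 20 <= d < 30, p = 0.10 for d >= 30."""
    cuts = ((20.0, 0.75), (30.0, 0.25), (float("inf"), 0.10))
    z = []
    for d in distances:
        p = next(prob for cut, prob in cuts if d < cut)  # first matching block
        z.append(1 if rng.random() < p else 0)
    return z

# 110 hypothetical households with distances between 2 and 40 meters
dists = [rng.uniform(2, 40) for _ in range(110)]
z = blocked_assign(dists)
print(sum(z), "treated out of", len(z))
```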

• Hopefully the intent of my above post comes through even though wordpress swallowed the greater than and less than signs sometimes.

5. Kaiser says:

Chuck: but what does it mean by “I can only assign 6 households in the control group for whom d is less than 20. Therefore, I have only 6 households in the control group.”? If 56 have d < 20 m, why can’t these be split equally? Why 6?

• It sounds to me like he ran the assignment and found out that only 6 of the control group could even have been treated. This either sounds like bad luck, or there will not be very many in the intervention group that can be treated either. The final option is that there’s a typo, so “6” is supposed to be “26” or something like that.

• Evens Salies says:

No, it is not a typo.
- 110 households decided to participate in the experiment
- under our budget constraint we can only install 50 ICTs
- given the distance and type-of-meter constraints, only 56 households can receive the ICT, so 54 can only be assigned to the control group
- therefore only 6 households AMONG the 56 can be randomly assigned to the control group

• Unless you install fewer than 50 ICTs. Perhaps you should do a value-of-information optimization to determine how many ICTs to install, or change your experimental protocol: perhaps install 25 of the ICTs initially, then later install the ICT in 25 households of the control group, and do a time-series analysis to determine the effect of installation on the controls. I dunno, it seems like you have options.

• Evens Salies says:

Hi Daniel. This is a good idea. Today, the only thing we can do is to take those 6 households from the control group who can be equipped and switch them to the treatment group. That’s not a lot. And we can’t afford now to sample new households.

6. Chuck says:

Re Kaiser at 12:15

I understood the constraints to be (a) 50 test boxes and (b) a range of the radio link (or length of the wire?) of 20 meters.

My thought was that there were 110 homes: 56 of the 110 had d less than 20, and 54 had d greater than 20 (d .gt. 20). So there were only 56 homes that could be treated. If they split the 56 evenly into control and test, then they would only have 28 to test. That would leave 22 boxes idle.

Another fix would be to check if the antennas on the test equipment could be easily and cheaply modified. In particular, if the antenna were attached to one of the devices by 5 or 10 feet of cable, the antenna could be moved around until a spot with a good signal was found. Depending on the band that is being used this approach has a good chance of improving range. In particular, I expect this approach would help a lot in the unlicensed bands at 900 MHz, 2.4 GHz, and 5 GHz.

Chuck

• Kaiser says:

I think we’ve narrowed it down to the two main issues here: 1) there is a design problem: why ruin a good experimental setup just because some antennas will lie idle? 2) what’s the origin of the 20 meters? I read it as a technical restriction but others don’t.

7. d says:

If you have measured the distance, you could condition on it instead of on the dummy variable (d < 20 or not). Then it is an RD design with partial overlap (Rubin 1977).

• Evens Salies says:

Dear d. After I asked Andrew for help, I came to the same conclusion as you. And thanks for the reference, which I am going to read right away.
To summarize: I can only assign 6 households in the control group for whom d is less than 20. Therefore, I have only 6 households in the control group who have a counterfactual in the group of treated. To put it differently, for 54 households in the control group, the overlap assumption is violated because these 54 households could never have been treated.

And yes, I have measured the distance.

Thank you very much to all of you who spent some time helping me.

• d says:

Hi Evens

In theory, if you can correctly model the relationship between distance and the outcome, it would not be a problem even if you had 0 instead of 6 households there. Those 6 households will, however, help you by allowing you to relax the necessary modeling assumptions.

Then model

Z: 1 if treated, 0 if control (either forced or at random)
X: distance – 20
Y: outcome variable

Y = b0 + b1 Z + f(X, g) + e

where e is the residual error and f(.) is some function of X with parameters g. You could try linear regression on X, quadratic, logs, etc. Rubin’s results then say the effect of interest (b1) will be unbiased if you can accurately model f().
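A sketch of that fit with a quadratic f, on simulated data (all numbers invented; only the model structure Y = b0 + b1 Z + f(X, g) + e comes from the comment above):

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated data: X = distance - 20, with treatment only possible where
# X < 0 (i.e. d < 20), mimicking the partial-overlap situation
n = 110
x = rng.uniform(2, 40, n) - 20
z = np.where(x < 0, (rng.random(n) < 0.9).astype(float), 0.0)
y = 2.0 - 1.5 * z + 0.04 * x + 0.002 * x**2 + rng.normal(0, 0.5, n)

# Y = b0 + b1*Z + f(X, g) + e, with f quadratic in X
X = np.column_stack([np.ones(n), z, x, x**2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated treatment effect b1 =", b[1])
```

One would compare the linear, quadratic, and log specifications of f; the identification of b1 leans entirely on f being right over the region with no treated observations.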

8. Evens Salies says:

Hi d,

Please, could you tell me which of the following papers published by Rubin in 1977 is the one you refer to? http://www.evens-salies.com/rubin.jpg

Cheers.
By the way, who’s d? You can answer me there if you wish -> evens.salies@ofce.sciences-po.fr

• daniel oberski says:

Sorry about the “pseudonym”; the first time I just wanted to make a quick comment, and the second time it copied the first. I meant the first one, “Assignment to Treatment Group on the Basis of a Covariate”.

9. Evens Salies says:

Thanks Daniel.