Identifying Neighborhood Effects

Dionissi Aliprantis writes:

I have just published a paper (online here) on what we can learn about neighborhood effects from the results of the Moving to Opportunity housing mobility experiment. I wanted to suggest the paper (and/or the experiment more broadly) as a topic for your blog, as I am hoping the paper can start some constructive conversations.

The article is called “Assessing the evidence on neighborhood effects from Moving to Opportunity,” and here’s the abstract:

The Moving to Opportunity (MTO) experiment randomly assigned housing vouchers that could be used in low-poverty neighborhoods. Consistent with the literature, I [Aliprantis] find that receiving an MTO voucher had no effect on outcomes like earnings, employment, and test scores. However, after studying the assumptions identifying neighborhood effects with MTO data, this paper reaches a very different interpretation of these results than found in the literature. I first specify a model in which the absence of effects from the MTO program implies an absence of neighborhood effects. I present theory and evidence against two key assumptions of this model: that poverty is the only determinant of neighborhood quality and that outcomes only change across one threshold of neighborhood quality. I then show that in a more realistic model of neighborhood effects that relaxes these assumptions, the absence of effects from the MTO program is perfectly compatible with the presence of neighborhood effects. This analysis illustrates why the implicit identification strategies used in the literature on MTO can be misleading.

I haven’t had a chance to read the paper, but I can share this horrible graph:

And Figure 4 is even worse!

But don’t judge a paper by its graphs. There could well be interesting stuff here, so feel free to discuss.

39 thoughts on “Identifying Neighborhood Effects”

      • Or simply something like:

        x=rnorm(10000)
        y=rnorm(10000)
        plot(y~x,col=rgb(.2,.2,.2,.1),pch=19)

        Play with alpha as needed.

        Export to image file using the Cairo package to accommodate the alpha levels.

        On the plus side, I give extra respect to people who plot the raw data.
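
A minimal sketch of the export step mentioned above, under some assumptions: it uses base R's png() device rather than the Cairo package (whose CairoPNG() device would be the closer match to the comment), writes to a temporary file, and the image size and alpha value are arbitrary choices.

```r
# Sketch: scatterplot with semi-transparent points, written to a PNG file.
# Assumption: base png() is used here; the comment above suggests the Cairo package.
set.seed(1)
x <- rnorm(10000)
y <- rnorm(10000)

out_file <- file.path(tempdir(), "alpha_scatter.png")  # hypothetical file name
png(out_file, width = 800, height = 800)
# The fourth argument of rgb() is alpha: lower values make dense regions stand out.
plot(y ~ x, col = rgb(0.2, 0.2, 0.2, alpha = 0.1), pch = 19)
dev.off()
```

Lowering alpha further (say to 0.05) helps when most points pile up in one region.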

        • And I see that R has the hexbin package as well:

          library(hexbin)
          hbin<-hexbin(x,y,xbins=30)
          plot(hbin)

          I have mixed feelings about this plot.

        • The hexbin package uses lattice underneath. ggplot2 also has support for it.

          library(ggplot2)  # provides the diamonds data; geom_hex() also needs hexbin installed
          d <- ggplot(diamonds, aes(carat, price))
          d + geom_hex()

          http://docs.ggplot2.org/current/geom_hex.html

          You could also do something using the stat_density_2d function in ggplot2:

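          library(ggplot2)
          # Stand-in data so the snippet runs; replace with your own x and y.
          data <- data.frame(x = rnorm(10000), y = rnorm(10000))
          ggplot(data,
          aes(x = x, y = y)) +
          # Raster of the 2d density; alpha is 0 wherever the density is below the first break.
          stat_density_2d(aes(fill = ..density..,
          alpha=cut(..density..,breaks=c(0,2e-2,Inf))),
          n = 256,
          geom = "raster",
          contour = FALSE) +
          scale_alpha_manual(values=c(0,1),guide="none") +
          geom_jitter(width = .05, height = .05)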

          For a large number of data points I tend to prefer this approach. It causes the alpha levels of the density estimate to go to zero when the density is nearly zero (you may need to play with the breaks to make it work, though). Once I'm happy with the 2d density estimate, I tend to drop the geom_jitter(width = .05, height = .05) line so that I'm just left with the raster.

  1. I’m actually feeling a bit bad about the focus on that graph. Skimming the paper (it’s outside my realm of expertise), it appears well-written and that a lot of good thought has gone into the modeling. I’m sure many would be pleased with the absence of p-values (I’m a frequentist, but won’t hold that against it). One thing that bothers me is the lack of a clear separation between methods and results, which is the norm in the world of laboratory/clinical research that I’m familiar with — tell me what you’re going to do, then tell me what resulted from doing it. To me, this comes across as something like a stream-of-consciousness approach to a paper, and maybe that’s the norm in this field, but to me it feels untidy. Anyway, it did some neat PCA.

      • IMHO the abstract should have said something like “By defining a metric of neighborhood quality using METHOD based on measured variables XXX we can see that YYY, whereas the previous research implicitly defines neighborhood quality in a way that does not agree with basic intuitions about neighborhood quality”

        Or something of the sort. Also, an explicit labeled section defining neighborhood quality etc.

  2. I think this is a decent paper. It definitely leaves a lot of questions open, but it makes a definite contribution. Here is the thread of the argument as I see it:

    1. Looking at the population of neighborhoods from the US census, there is substantial variation in a number of variables which plausibly correlate with “neighborhood quality”, but are not entirely correlated with poverty rate. Examples: high-school graduation rate, B.A. attainment rate.

    2. “MTO did not move a large share of families into neighborhoods with substantial shares of residents with high school diplomas, college degrees, where the male employment-to-population ratio was high, the female employment rate was high, and in which there were few single-headed households.”

    3. From points 1 and 2, there exist many neighborhoods which are plausibly of higher quality than those included in the MTO program.

    Conclusion: it’s possible that MTO didn’t show an effect because it didn’t get people into sufficiently “good” neighborhoods.
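
To make points 1 to 3 concrete, here is a rough sketch of a composite quality index built with PCA. Everything in it is an assumption for illustration: the data are simulated, the variable names are hypothetical, and the choice of the first principal component is not necessarily how the paper constructs its index. It shows only that tracts under a 10% poverty rate can still span a wide range of such an index.

```r
# Simulated tract-level data; all variable names and numbers are hypothetical.
set.seed(42)
n_tracts <- 2000
latent <- rnorm(n_tracts)  # unobserved driver shared by all variables
tracts <- data.frame(
  poverty_rate  = plogis(-1.5 - 0.8 * latent + rnorm(n_tracts, sd = 0.8)),
  hs_grad_rate  = plogis( 1.0 + 0.9 * latent + rnorm(n_tracts, sd = 0.8)),
  ba_rate       = plogis(-1.0 + 0.9 * latent + rnorm(n_tracts, sd = 0.8)),
  male_emp_rate = plogis( 0.5 + 0.7 * latent + rnorm(n_tracts, sd = 0.8))
)

# First principal component of the standardized variables as a quality index.
pca <- prcomp(tracts, center = TRUE, scale. = TRUE)
quality <- pca$x[, 1]
# The sign of a principal component is arbitrary; orient it so higher = better.
if (cor(quality, tracts$hs_grad_rate) < 0) quality <- -quality

# Percentile ranks put quality and poverty on a comparable scale.
quality_pct <- 100 * rank(quality) / n_tracts
poverty_pct <- 100 * rank(tracts$poverty_rate) / n_tracts

# Among tracts under a 10% poverty rate, how far down does quality reach?
low_poverty <- tracts$poverty_rate < 0.10
range(quality_pct[low_poverty])
cor(quality_pct, poverty_pct)
```

With real census data the interesting number is the lower end of that range: how poor in "quality" a nominally low-poverty tract can be.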

    • But that is all in the Sampson (2008 http://scholar.harvard.edu/sampson/files/2008_ajs_moving_to_inequality.pdf) paper so the question is: what does this add? We already know that people almost entirely moved to places that were only marginally different, and this was no surprise since it’s not like section 8 gives you a lot of money for your rent. No, this was not a test of whether moving poor people to middle class neighborhoods would change their life trajectories or those of their children. Do we really need these models to tell the story?

      Sampson of course also uses his measure of concentrated disadvantage which is similar to the “quality” measure in this paper, but from my quick read there is no justification of use of the term “quality” or for even using new terminology given that concentrated disadvantage has become the standard term.

  3. The rest of the paper seems like a complicated, lengthy attempt to challenge the tunnel vision that complicated models seem to induce in their users. You could simply say “the prevailing models assume that neighborhood quality is determined entirely by poverty rate. Well… maybe it’s not?”

  4. I wrote to Professor Gelman because I enjoy his blog, and especially his discussions about the frictions involved in publishing. I submitted this paper to many journals, but no editor or referee ever made a substantive comment to me about the main idea (in Section 3).

    In fact, this blog post is reflective of the modal referee report I received on the paper. The paper overturns the current interpretation of one of the most prominent social experiments in recent decades, and the responses have ignored this to discuss third order details like notation or aesthetics. If one reads the paper, they might be sympathetic to the figure discussed here, whose aim is to illustrate the support of quality conditional on poverty. But that is beside the point, and I actually don’t want to talk about the figure at all. I want anyone reading this to respond to the big idea in the paper, which is in Section 3. If I am mistaken, please, tell me how. I would be thrilled to hear substantive comments about the paper, and hope it starts some conversations about how we should interpret MTO.

    • Dionissi:

      It can be a challenge to communicate a big idea. Perhaps the most relevant comments in this thread are those of Clark and Paul above, in that they’re pointing to issues of clarity in exposition. I’m not sure what exactly it was about your paper that made it difficult for me to jump in and see the key substantive points you were making, but if the editors and referees had the same reactions, this suggests that there was a communication difficulty. In posting on the blog, I was hoping to catch the attention of some subset of readers who have particular interest in neighborhoods and poverty, who could more easily get to the substance of the paper. I too am disappointed that no such readers came to comment. In retrospect I guess my note on the graph was a distraction. On the other hand, comments on graphs can be helpful too. Given that, empirically speaking, this paper suffered from communication issues, it should be good for you to learn ways of communicating your points more clearly in both words and graphs.

    • Dionissi:

      I have been there e.g. http://statmodeling.stat.columbia.edu/2011/03/28/explaining_the/

      Had what I thought (and still think) is an important insight that I was in way too much of a hurry to share (and while overextended on other stuff) – communicated it in a way others did not get.

      Responses that discuss third order details like notation or aesthetics perhaps are a very good indicator of a communication failure.

      Until you can find a way to communicate the important insights so that others will get it (before giving up or getting annoyed) – you are most likely just wasting your time.

      So you have to find a way to communicate that works or move on.

      (On the topic in my link, I think I have gotten it across in presentations and one-to-one conversations but not [yet] in a stand-alone paper that is concise enough for a non-captured audience [ http://statmodeling.stat.columbia.edu/2011/05/14/missed_friday_t/ ].)

    • I just think the formalism is only interesting to those who enjoy formalism. Anyone can understand that an experiment doesn’t show much if there’s no meaningful distinction between the treatment and the control group – they don’t need an assumption called A5 and a heap of variables to get that.

      If you want to attract a broad audience, argue simply and in plain English, as you do in your conclusion. This isn’t particle physics. It’s entirely possible to provide the gist of the argument with a minimum of technicality.

      • Hi Paul,

        Thanks for your comments and for the time and energy you have spent on the paper. I hope that my responses to your comments can convince you that the paper is worth it.

        To begin, the issue of “Anyone can understand…” is the whole point of the paper! There is a debate about whether there was a big distinction between the treatment and control groups in MTO. A group of prominent researchers would disagree with my paper and say, based on raw poverty rates, that there were big differences in the neighborhoods of the treatment and control groups. That group of researchers would (I believe) point to Figure 1 here

        http://www.jstor.org/stable/23469734

        while I would point to Figure 4(b) (on page 18) here

        http://dionissialiprantis.com/pdfs/LATEs_of_nbd_quality_REStat1.pdf

        or Figure 4 of the paper under discussion. You would notice that the distributions in terms of (raw) poverty and quality are quite different, and you might also notice that the comparisons are between different groups.

        • Thanks for this and your other replies. I can see you are fighting the good fight. Personally I do not need the modeling technicality to understand your point, but perhaps in your target audience it is more necessary. I hate when people put blinders on the moment they get a model they can compute with – whatever you have to do to disabuse them of that notion is alright with me!

    • I also think it’s not very interesting to argue that “if you change the assumptions, the results might be different”. Everyone knows that. The interesting question is, if you change the assumptions, do you ACTUALLY get a different answer? This paper shows very little in that direction. It doesn’t overturn anything, at best it sheds some doubt on existing work.

    • I too think your idea is interesting and important, but that the structure and content of this paper doesn’t communicate it clearly. Let me give you an example just from your abstract:

      “The Moving to Opportunity (MTO) experiment randomly assigned housing vouchers that could be used in low-poverty [sic? high-poverty? low-income?] neighborhoods. Consistent with the literature, I [Aliprantis] find that receiving an MTO voucher had no effect on outcomes like earnings, employment, and test scores. However, after studying the assumptions identifying neighborhood effects with MTO data, this paper reaches a very different interpretation of these results than found in the literature. I first specify a model in which….”

      The first part is fine, you introduce MTO and discuss the previous null results… Then you say that you have a very different interpretation….. and I’m really hopeful at this point that I’m going to get a summary of the interpretation…. and then no, “I first specify a model in which….” you start to talk about *what you did* not *what you found*

      Throughout the paper in my skimming you talk a lot about *what you did* Here is an alternative style that might improve things:

      …a very different interpretation of these results than found in the literature. By defining a plausible metric for neighborhood quality based on causal factors that contribute to economic success we found that the neighborhoods involved in the MTO experiment simply did not include those neighborhoods where the causal factors were present. The recipients simply did not have the ability to move to REAL opportunity. As such, the MTO experiment would be expected to have null results even if an alternative experiment that included the higher quality neighborhoods would have been successful.

      I suspect my next step after the abstract would be to immediately discuss a model for causality in neighborhood effects, what happens when someone moves from a dysfunctional neighborhood to a functional one that *makes* people more successful. Then, after discussing this, talk about how to measure functionality, define your measurements, define your model for functionality/quality, and fit the model.

      Now, show summaries of your results and compare them to what you’d have gotten if you didn’t include the causality and neighborhood quality (ie. the previous types of studies). Show this *graphically* with very good graphs. The point of the graphs is to highlight the differences between moving around for no good reason and moving *to opportunity*. Motivate the graphs by discussing their purpose *right there in the figure legend*. Don’t use figure legends to tell me what data is in the graphs, tell me what the graphs MEAN. Let the axis labels tell me what data is in the figure.

      Finally only after making your case that you’ve got a better understanding of what happened, and the reader understands your analysis, now compare your analysis to what happened previously and explain why there simply isn’t any evidence in this study to say one way or another whether moving to opportunity would help people, simply because the plausible causal factors weren’t engaged in the study design.

      At least, this is what I THINK your paper is trying to say, and I hope these suggestions could help you communicate your important ideas. Perhaps even if I’m misinterpreting your results, you might at least get some ideas for how to communicate what your real results are.

      • Hi Daniel,

        Thank you for these comments, they are helpful. The one thing I will say in my defense is that our other paper might be more what you are looking for

        http://dionissialiprantis.com/pdfs/LATEs_of_nbd_quality_REStat1.pdf

        The point of this paper is to clearly characterize what the literature has been doing to this point. If you take a look at some of those papers with this in mind (linked in some of my other replies), the approach of my paper might seem more reasonable.

        • Sort of. I mean, I think your linked paper has some interesting stuff. But, I still think it suffers from communication issues that make it less clear. One thing I can suggest is this. The typical academic paper is full of stuff designed to snow-job people, tables and charts and graphs and verbosity designed to make it so that it seems very complicated and amazing that you even did the research… This is “needed” because a lot of researchers simply aren’t doing much, and they need to get tenure anyway.

          But, if you actually ARE working on an important topic, and you actually DO have significant contributions in important directions that are different from the bulk of what people do, and you have a deep understanding of an important topic with evidence to back it up, you are much better off communicating that without all the snow-job.

          “We have discovered that the MTO project simply did not succeed in inducing the bulk of people to move to higher quality neighborhoods, and this is the reason that MTO failed to have the effect it sought. Furthermore, when we restrict to the subset of people who did move to higher quality neighborhoods, we find that this had a large and robust effect on this small subset. It remains to be seen whether that effect comes from something special about the people who did move to higher quality neighborhoods, or whether the neighborhood itself was the cause of these large effects. In figure 1 we show that for a given level of poverty, there is a wide range of neighborhood quality possible (consider putting a horizontal line through the scatterplot and some text directly on the scatterplot to this effect). In figure 2 we show that MTO simply did not cause many large changes in neighborhood quality (show scatterplot of delta-quality against initial quality). Furthermore, when restricted to the groups who did make large changes in their neighborhood quality (plot those points in red on your graph) the outcome changes are large and robust (plot graph of outcome score against delta quality).”

          etc etc. Pretend your colleagues are undergraduates who need things spelled out for them, then spell them out. Avoid being condescending, but just spell out what you know.
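
The two plots suggested above could be drafted along these lines. This is only a sketch under loud assumptions: the data are simulated, all variable names and the 20-point cutoff for a "large" move are hypothetical, and the slope linking outcomes to quality changes is built into the simulation and says nothing about MTO.

```r
# Simulated household-level data; every name and number here is hypothetical.
set.seed(7)
n <- 1500
initial_quality <- runif(n, 0, 20)        # percentile of the origin tract
delta_quality   <- rexp(n, rate = 1 / 5)  # most simulated moves are small
big_move        <- delta_quality > 20     # arbitrary cutoff for a "large" change
outcome_change  <- 0.05 * delta_quality + rnorm(n)  # slope is an assumption

point_col <- ifelse(big_move, "red", rgb(0.2, 0.2, 0.2, 0.2))

out_file <- file.path(tempdir(), "mto_sketch.pdf")  # hypothetical file name
pdf(out_file, width = 10, height = 5)
par(mfrow = c(1, 2))

# Panel 1: change in quality against initial quality, large moves in red.
plot(initial_quality, delta_quality, pch = 19, col = point_col,
     xlab = "Initial neighborhood quality (percentile)",
     ylab = "Change in quality (percentile points)")
abline(h = 20, lty = 2)  # the cutoff that defines a large move

# Panel 2: outcome change against change in quality, same coloring.
plot(delta_quality, outcome_change, pch = 19, col = point_col,
     xlab = "Change in quality (percentile points)",
     ylab = "Change in outcome score")
dev.off()

mean(big_move)  # share of simulated households with a large move
```

Coloring the same households red in both panels is what lets a reader connect "few large moves" to "where the outcome changes are".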

    • I think another feature of the communication failure here, if I am understanding the core point correctly, is that the view of MTO has shifted substantially based on the recent work by Chetty et al. that *does* show substantial effects: http://scholar.harvard.edu/hendren/publications/effects-exposure-better-neighborhoods-children-new-evidence-moving-opportunity

      If people’s priors have shifted to “MTO had positive effects that are hard to measure and mostly show up farther into the future than the original work suggested,” then your set-up is wrong and will likely lead people to shrug. “Chetty et al. changed my mind and here’s some evidence that weakly buttresses that. Fine.”

      If on the other hand you are saying that priors should shift from the Chetty et al. results, then you need to be very clear about how they should change.

      • The paper you cite interprets MTO as evidence against neighborhood effects on adult labor market outcomes using the same logic as used throughout the literature on MTO, with no model of neighborhood effects ever stated or estimated.

        In the Introduction the authors write that “The MTO experiment generated large differences in neighborhood environments for comparable families, providing an opportunity to evaluate the causal effects of improving neighborhood environments for low-income families…” Previous “studies have consistently found that the MTO treatments had no significant impacts on the earnings and employment rates of adults and older youth, suggesting that neighborhood environments might be less important for economic success.”

        The logic is that because the experiment decreased poverty, then the program must have improved adult labor market outcomes if there were neighborhood effects on these outcomes. I show that this logic imposes strong assumptions, and both of my papers are focused on adult labor market outcomes because this is the clearest outcome for showing the differences in our methodology.

        My papers suggest that the neighborhood effects driving the experimental program effects in the cited paper are likely to be very large.

        • Overall the observation that MTO is not a good test of a “neighborhood effects” theory is clear, and that has been obvious for a number of years. To the extent that you are making the same point with a different model, that’s a contribution. Pointing out that even the lower poverty census tracts that people moved to were still actually highly disadvantaged is good, and if economics as a discipline hasn’t really thought about that, it’s good to point out that just because the “poverty rate” in a census tract is under 10% does not mean that it is substantially different economically, meaning that 200% of poverty level is actually considered for things like school lunch eligibility.

          But then you go on to make claims that neighborhood effects do matter … that’s the problem. MTO is not a good test of neighborhood effects no matter which way you slice it.

          Further, Wilson’s work was about the importance of stable economically diverse neighborhoods, not about throwing people into places where they know no one and have no social infrastructure, such as a church they grew up in, which gives them access to people from different economic situations.

          You also haven’t actually stated that as the problem your paper is addressing, nor have you specified clearly what you mean by “neighborhood” or by “neighborhood effects”. Sampson’s 2008 paper, which you cite, gives a summary of how the concept of neighborhood effect has developed and varied over the last 100 years but you don’t address that at all. Further, is a census tract a neighborhood? Also something you have not addressed. In dense neighborhoods in big cities, the answer is clearly no most of the time.

          Finally, given that a minuscule number of families actually left their real neighborhoods it then becomes hard to tease out whether those families were systematically different, and specifically if they were highly motivated strivers, who if they had stayed would have found Prep for Prep or fencing or some other program(s) to get their kids into better schools and to help them avoid criminally-involved peers. To me that’s the real issue, the data are just not there to say one way or another if moving to a substantially different neighborhood has an impact.

        • Hi Elin,

          Thanks for your comments. You are definitely right that I am not the first person to bring up these points. But, at least as far as I can tell, I am the first to cast them in terms of modeling assumptions. I think this adds to the discussion of MTO in at least two ways:

          1) Hopefully this framing will speak to economists. As I quoted in another response, prominent economists continue to interpret MTO as a test of neighborhood effects.

          2) I do think it is useful to cast the discussion in terms of modeling assumptions. The reason is that by carefully studying modeling assumptions, we can still leverage the experimental design of MTO to learn about neighborhood effects (as we attempt in our other paper – http://dionissialiprantis.com/pdfs/LATEs_of_nbd_quality_REStat1.pdf). With this approach we can more clearly interpret what we did learn about – moves from the worst neighborhoods to less-bad neighborhoods (at least according to our definition of quality, which I would concede leaves out measures of important factors like social networks, but that I would hope captures a large share of the key mechanisms described in Wilson’s work). This also helps to highlight what we cannot learn about from MTO – whether moving poor people to middle class neighborhoods would change their life trajectories or those of their children.

  5. Lastly, figures that are too cluttered to be interpreted don’t really make a point. I can’t tell if the relationship between the new neighborhood quality metric and poverty is actually very loose as you seem to be claiming, or if you just used dots that were too big. That may be obvious to you, or to those deeply acquainted with your subject matter, but keep in mind that we are complete strangers to your work.

    Overall, you will not have the conversations you wish to have if your opening statement is not understood.

    I say all of this with utmost respect for the substance of the paper, which is clearly significant. That was why I at least tried to look past the barriers and tease out the main argument.

    • The purpose of this figure is to show existence, and not to convey information to the reader about the strength of the relationship between poverty and quality. The figure is about the support – and no other feature of the distribution – of quality conditional on poverty. The additional relevant features of the distribution are displayed in the table below. I had thought about using a heat map of the distribution as suggested above, but believed this would distract from the point of the figure. The point is to show the existence of low-quality neighborhoods at given levels of poverty, including those that satisfy the MTO threshold. Another option would have been to show interquartile ranges, or characterize the conditional distributions in some other way, but that would omit valuable information. For example, showing such a range would omit the neighborhoods around the 50th percentile of poverty and 5th percentile of quality.

      All of this being said, the reason that I earlier stated my disinterest in discussing the figure is because it is on page 18 of a 20-page paper. The main point of the paper is delivered in Section 3, right after the introduction and the description of the program.
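
The distinction drawn above between the support of quality conditional on poverty and a summary such as the interquartile range can be illustrated with a small sketch. The data are simulated and the variable names are hypothetical; the only point is that the minimum and maximum within a poverty bin can sit far outside the interquartile range.

```r
# Simulated tract-level percentiles; names and numbers are hypothetical.
set.seed(3)
n_tracts <- 5000
poverty_pct <- runif(n_tracts, 0, 100)
# Quality falls with poverty on average, but with wide dispersion.
quality_pct <- pmin(100, pmax(0, 100 - poverty_pct + rnorm(n_tracts, sd = 25)))

# Group tracts into ten poverty bins of equal width.
poverty_bin <- cut(poverty_pct, breaks = seq(0, 100, by = 10),
                   include.lowest = TRUE)

# For each bin: the support (min, max) and the interquartile range of quality.
summarize_bin <- function(q) {
  c(min = min(q),
    q25 = unname(quantile(q, 0.25)),
    q75 = unname(quantile(q, 0.75)),
    max = max(q))
}
support_table <- t(sapply(split(quality_pct, poverty_bin), summarize_bin))
round(support_table, 1)
```

Reading across a row, the gap between min and q25 is the part of the distribution that an interquartile summary would hide.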

      • I think you’re taking an overly theoretical perspective here. The existence of assumption violations isn’t persuasive in applied data analysis. Assumptions are virtually always violated to some degree in real data, and every method worth its salt has at least some robustness to mild violation of assumptions. To be persuasive, you should at least have some measure of the degree to which the assumption was violated.

        I agree it’s a relatively minor point. But when making a big splash it is worth a little trouble to manage perceptions :)
