
No evidence that providing driver’s licenses to unauthorized immigrants in California decreases traffic safety

[cat picture]

So. A reporter asked me what I thought of this article, “Providing driver’s licenses to unauthorized immigrants in California improves traffic safety,” by Hans Lueders, Jens Hainmueller, and Duncan Lawrence. It was embargoed, so I wasn’t supposed to post anything on it until now.

From the abstract:

We examine the short-term effects of . . . California’s Assembly Bill 60 (AB60), under which more than 600,000 [driver’s] licenses were issued in the first year of implementation in 2015 . . . We find that, contrary to concerns voiced by opponents of the law, AB60 has had no discernible short-term effect on the number of accidents. The law primarily allowed existing unlicensed drivers to legalize their driving. We also find that, although AB60 had no effect on the rate of fatal accidents, it did decrease the rate of hit and run accidents, suggesting that the policy reduced fears of deportation and vehicle impoundment. . . .

The paper seems reasonable to me.

But I’d like to see what would happen if they performed the same analysis using 2014 as their break point, or 2013, or 2011, or 2010, etc. They’re looking at changes in 2015 that are correlated with %AB60 licenses, but (a) other things were happening in 2015, and (b) there are lots of things correlated with this county predictor. Also, the %AB60 licenses measure is low (always less than 6%, it seems from the graphs). It seems to me that lots of things could be driving the rate of collisions and the rate of hit and runs. So the causal story does not seem so clear to me. There are too many alternative explanations.

So, I’m with them when they say that the data show no evidence of negative consequences from the law allowing driver’s licenses for unauthorized immigrants. But I’d need more evidence to be convinced of their causal claims, in particular on the hit and runs. That seems like more of a reach.

To put it another way, the paper is called, “Providing driver’s licenses to unauthorized immigrants in California improves traffic safety,” but I’d be happier with a title such as “No evidence that providing driver’s licenses to unauthorized immigrants in California decreases traffic safety.”

P.S. The paper is in PPNAS but the editor is not Susan Fiske so I guess I should really just call it PNAS this time.


  1. D Kane says:

    Data is supposed to be available for replication, but I don’t see it.

    Am I missing something?

    • Tom says:

      I don’t think so. I clicked without result and noticed the counter showed 0 downloads.

      • D Kane says:

        Still nothing. By the way, what was your prior as to the likely outcome of such a study co-authored by people associated with the “Immigration Policy Lab?” I wager that they rarely (if ever) find the net effects of immigration to be negative . . .

        • Andrew says:


          I don’t know. I sent an email to one of the authors of the study and I’ll report back if he says anything relevant to the discussion. This comment thread illustrates, once again, the problems with the current scientific publication system of tech report, secret reviews, publication, publicity, and news coverage. I’d much prefer a system where the paper is in one place, along with feedback from peer reviewers and outsiders, and responses by the authors. Much of this blog can be seen as a hack to approximate that ideal system.

  2. J says:

    “The number of pedestrians, cyclists and drivers killed in L.A. traffic rose sharply in 2016”

    Same paper, same day!

    • Tom Passin says:

      It’s astonishing to me, but the only mention I saw in the LA Times story of deaths per unit of traffic was this:

      “Seleta Reynolds, the L.A. Transportation Department’s general manager, cited an increase in driving as one reason for the rising number of fatalities.”

      After all the studies “they” have done, don’t “they” know the importance of rate vs number? At least the paper Andrew referenced at the top of the post has a chart titled “Accidents per 1000 Capita”.

    • Steve Sailer says:

      Here are excerpts from the LA Times article:

      The number of pedestrians, cyclists and drivers killed in L.A. traffic rose sharply in 2016

      Traffic deaths in Los Angeles rose sharply despite a high-profile campaign by Mayor Eric Garcetti and other city leaders to eliminate fatal traffic crashes.

      In 2016, the first full year that Garcetti’s Vision Zero policy was in effect in L.A., 260 people were killed in traffic crashes on city streets, an increase of almost 43% over the previous year.

      Rising traffic deaths appear to be more than a one-year aberration: So far in 2017, crash fatalities are 22% higher than in the same period last year.

      Los Angeles’ increase in traffic deaths outpaces national trends. In 2016, 40,200 people died in crashes involving cars, a 6% increase over the previous year, according to the National Safety Council.

      … Seleta Reynolds, the L.A. Transportation Department’s general manager, cited an increase in driving as one reason for the rising number of fatalities. Car sales and car registrations have risen in Southern California, driven by a strong economy and low gas prices.

      Drivers are also facing more distractions in their cars, and in some neighborhoods, more people are choosing to walk or bike, Reynolds said.

      In addition, the Los Angeles Police Department is issuing dramatically fewer speeding tickets today, which could be contributing to the jump in fatal crashes.

      Reynolds said she is concerned that more crashes involving pedestrians are resulting in deaths. Through mid-March, pedestrian collisions were up 3% compared with 2015, but fatalities involving pedestrians surged 58% over the same period, according to LAPD data.

      Reynolds attributes the higher number of pedestrian deaths to vehicle speeds. When struck by a car moving at 20 mph, a pedestrian has a 10% chance of dying, but the risk of death increases to 80% if the vehicle is moving at 40 mph, according to a federal study of crash data.

      Pedestrians make up nearly half of the fatalities in traffic collisions, although they are involved in only 14% of total crashes, according to a city analysis of data from 2009 to 2013.

      The city saw about 55,350 traffic collisions in 2016, which represents a 7% increase over 2015 and a 20% uptick from 2014. Those crashes include collisions between drivers, between drivers and pedestrians or bicyclists, and hit-and-run and DUI-related crashes.

  3. Andrew says:

    It’s a sad comment on the state of the nerd-o-sphere that my umpteenth post on the hot hand got 70 comments, but this post on something that’s actually important gets only 2 comments!

    • jrc says:

      Yeah – we’re being lazy today – and now that Spring Break (woooO!!!1!) is over, I guess I should make a small contribution:

      “But I’d like to see what would happen if they would perform the same analysis but using 2014 as their break point, or 2013, or 2011, or 2010, etc.”

      One graph I really like to see in these kinds of papers is “year of placebo treatment” on the x-axis and “point estimates with CI” on the Y-axis, with the actual treatment year in a different color or marker symbol or something. So you just do the analysis as if every year was the treatment year, and show what you would get. The idea is that you can immediately see whether that year was “different” in some important way.

      A similar graph is often used for non-parametric analyses like regression discontinuities, when you have a bandwidth choice you have to make, and then you graph the point estimate for a range of potential other bandwidths as a robustness check. You could easily do that at different cut-offs of the running variable to make it more of the kind of “placebo check” I’m describing here – in fact there are some randomization-type inference methods for RD that propose doing this to get a sampling distribution of BetaHat.
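      One way to make that concrete: a minimal sketch of the placebo-year graph, using a purely simulated county panel (the panel, the exposure variable, and all numbers are invented for illustration, not taken from the paper):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Toy county-year panel: the outcome is pure noise, so every placebo
# "treatment year" should give an estimate near zero. All names and
# numbers here are invented, not taken from the paper.
years = np.arange(2006, 2016)
n_counties = 58
exposure = rng.uniform(0, 0.06, n_counties)           # AB60-type license share
outcome = rng.normal(0, 1, (n_counties, len(years)))  # change in accident rate

est, se = [], []
for y in years[1:]:
    # Pretend y was the treatment year: difference mean outcomes after
    # vs. before y, then regress that difference on county exposure.
    diff = outcome[:, years >= y].mean(axis=1) - outcome[:, years < y].mean(axis=1)
    X = np.column_stack([np.ones(n_counties), exposure])
    beta, *_ = np.linalg.lstsq(X, diff, rcond=None)
    resid = diff - X @ beta
    s2 = resid @ resid / (n_counties - 2)
    est.append(beta[1])
    se.append(np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1]))

est, se = np.array(est), np.array(se)
plt.errorbar(years[1:], est, yerr=1.96 * se, fmt="o")
plt.axhline(0, linestyle="--")
plt.xlabel("placebo treatment year")
plt.ylabel("point estimate with 95% CI")
plt.savefig("placebo_years.png")
```

      With pure noise as the outcome, all nine estimates should hover around zero; a real treatment year would visibly stand out from the placebo years.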

    • LRS says:

      You gotta post in the morning Andrew!

    • Rahul says:

      Crowding out? Posts on the trivial have an opportunity cost in terms of attention?

  4. Steve Sailer says:

    Across the country, the rate of traffic deaths per mile, both for motorists and pedestrians, rose notably in both 2015 and in the first half of 2016.

    It’s a real problem.

    It might be due to smartphones distracting people, or it might be due to something else, such as a Ferguson Effect in which the cops retreat to the donut shop.

  5. Steve Sailer says:

    Here’s the NYT article on how traffic deaths per mile driven were up in 2015 and 2016, even though driving should be getting steadily safer due to newer, safer cars and new safety equipment on new cars:

    • Tom says:

      Another fact that supports the general trend. State Farm Insurance is the largest writer of private passenger auto in the US, with an 18% market share. The market is mature with around $200 billion (+) in total premiums. State Farm had remarkably bad results in this segment, which is their largest. Their reported underwriting loss for 2016 was $7 billion. From this, I would infer that the dollar value of 2016 claims was about 10% higher than they expected. This is directionally consistent with the rest of the industry, who performed worse than expected and are raising rates. It isn’t a financial issue for State Farm. The next two largest competitors also appear to have underestimated claim costs, but to a lesser extent. The takeaway is that insured costs rose sharply in 2016, and they weren’t fully anticipated by the largest insurers. Which is consistent with the increases in deaths.

      Miles driven is also up, but not to the same extent. My personal theory is that gas at the pump is down about 1/3 or a bit more since 2014. And that drivers whose driving preferences are constrained by pump prices are exactly who you don’t want to see on a highway.

      • Steve Sailer says:


        I’d sure like to hear the inside scoop from the insurance companies on who is suddenly getting into all these incremental fatal accidents over the last two years.

        What if all this high tech safety equipment that auto companies are making standard on new cars is killing people for unexpected reasons? (I have no evidence that this is true, but I’d sure like to be reassured that it isn’t true.)

        Or maybe it’s another Ferguson effect, just like the spike in homicide deaths since 2014? Maybe the cops have retreated to the donut shop and motorists are speeding like crazy?

  6. Phil says:

    NHTSA says that in 2015 about 3500 people were killed by “distracted” drivers, where texting and talking on a cell phone are explicitly mentioned; I’m not sure what else they would count. But I also don’t know how they get the numbers. I guess they can usually tell when someone is in the middle of a phone call when they get in a crash, but how can they tell if you’re reading your email or checking a text? Under almost any estimate I can think of, there would be a lot more ‘false negatives’ (concluding the driver was not ‘distracted’ when s/he actually was) than ‘false positives.’

    I’ll just throw in a little anecdote here, recognizing its irrelevance — yeah, yeah, the plural of ‘anecdote’ is not ‘data’, I know. A few weeks ago I was riding my bike. There were three cars in a left-turn lane, and I got behind the last one. A bunch of other cars were stopped at the light to go straight ahead, so there was a row of stationary cars on my right. Eventually my left-turn lane got the green arrow, and the first two cars pulled through. The car immediately ahead of me just sat there. After about five seconds, I pulled my bike around to the right and started to pass the car. As I did so, I looked in through the passenger window and saw the driver holding his phone, pointed at his face (I couldn’t actually see his face because my eyes were above the car’s roof level). Just then, he started moving forward. That would have been OK except the car also drifted way to the right, threatening to squeeze me against the cars to my right. I slammed on the brakes and pulled in behind the car as it made the turn. I was pretty unhappy so when the guy pulled into a parking lot ahead of me, I followed. He pulled into a parking space and got out, and I said (fairly politely…really!) “Please don’t use your phone while you’re driving, it’s dangerous. I was passing you and you pulled to the right before making the turn…could easily have crushed me against another car.” He said “Gee, I’m sorry, you’re right…I’ll be sure not to do that again.” BWAAAAAHHHH hahahahaha I crack myself up. No, actually he said “I wasn’t on my phone.” I said “I saw through the window that you were looking at your phone.” He said “You shouldn’t try to pass on the right when I’m moving.” I said “You weren’t moving when I started to pass you, you were stopped at the green arrow even after the cars ahead of you went through…because you were looking at your phone.” He said “OK.” Not “OK, you’re right” or “OK I’m sorry” or “OK I won’t do it again”, just “OK.” Infuriating.

    Thanks for listening.

    • Andrew says:


      Your story reminds me . . . a few months ago, I was on the street, doing something . . . umm, I don’t remember what, but it was something antisocial and some passerby chewed me out, and I was going to respond in kind, but then I thought of you, so I said, Sorry, my bad! The guy who was chewing me out was so surprised!

      Then there was the other story, I think I might’ve blogged it, actually . . . I was heading up Amsterdam Avenue, when a guy who was walking across the street stopped right in front of me and started yelling that I was going the wrong way! The guy himself was jaywalking but that doesn’t bother me at all. What was weird is that Amsterdam Avenue is one way, and I was going the right way! This was so weird that I followed him and said, Hey guy, whassup, I’m actually going the right way! Then the guy got spooked and started screaming that he’d call the cops. That sounded like bad news, as who knows what he might make up if the cops came by, so I left. It was just particularly disturbing, somehow, kinda like how I imagine you must’ve felt.

    • jrkrideau says:

      As a long time cyclist I work on the theory that all car drivers are legally blind, oblivious, homicidal maniacs until they exit the automobile. Cell phones have magnified these attributes.

      • Dzhaughn says:

        As a long time driver, I work on the assumption that each bicyclist has a death wish, expects everyone else to look out for them especially, and has a highly misguided fashion sense. So we’ll get along fine! :)

        • Martha (Smith) says:

          As a former sometimes cyclist and a somewhat reluctant driver, I have observed that each category (cyclists, drivers) has some careful, considerate people and also some foolish, inconsiderate people. (e.g., some cyclists use their cellphone while cycling.)

    • Also in the “distracted driving” bucket: eating, drinking (non-alcoholic beverages), putting on makeup, shaving, and reading. Even screaming kids in the back seat can be categorized as a distraction.

  7. Tom says:

    I am a bit obsessed with this study. Starting with the fact that the basic unit is the county, and AB60 licenses by county aren’t available. So they ‘model’ the missing data.

    But this from page 6 of the appendix: “This procedure yields a total annual decrease in hit-and-run accidents by 3,958. For comparison, the total number of hit-and-run accidents was 73,046 (81,530) in 2014 (2015).”

    The total, Y/Y change was 73,000 to 81,000. An increase of 8,000. But the AB licenses decreased hit and runs by 4,000. Which is saying that they would have been 85,000, except for the AB licenses changes. And the 4,000 was the result of their complex model.

    I would say that there were a lot more hit and runs. And maybe it would have been worse without the AB licenses. However, those total numbers are not listed in any normal manner. Using parentheses to separate years seems bizarre.

    Also, there have been newspaper articles about an epidemic of hit and runs. And LA says it has 40,000 or around half. Using counties as the exposure unit seems weak given the weights of the largest handful of counties.
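    The arithmetic in the comment above is easy to check directly (the figures are the ones quoted from the paper’s appendix; “counterfactual” is my gloss on the comparison):

```python
# Figures quoted from the paper's appendix (page 6).
hit_runs_2014 = 73_046
hit_runs_2015 = 81_530
modeled_decrease = 3_958  # the paper's estimated effect of AB60

actual_increase = hit_runs_2015 - hit_runs_2014          # -> 8484
counterfactual_2015 = hit_runs_2015 + modeled_decrease   # -> 85488, implied 2015 total without AB60

print(actual_increase)
print(counterfactual_2015)
print(f"{actual_increase / hit_runs_2014:.1%}")  # "11.6%" year-over-year increase
```

    So hit and runs rose about 11.6% overall, and the model’s claim is that they would have risen even more without AB60.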

  8. Steve Sailer says:

    So the number of hit and runs didn’t decrease. Instead it went up about 12%.

    So let’s create a model that gives Mono County (population 14,000) the same weight as Los Angeles County (population 10 million)!

    • Steve Sailer says:

      It’s almost as if the Los Angeles Times, like all respectable media outlets, has a Narrative that’s more important than the actual numbers.

      • When it comes to illegal immigration the Times is hardly respectable. They clearly have a narrative, and they aren’t just biased; they outright lie. The paper practically runs a story a day about some poor illegal alien getting deported. They downplay the person’s criminal record and create sympathy for his/her poor family that they will be separated from. Last year they ran about 200 stories of this ilk. How many stories about the victims of these people’s crimes? Zero! Not just last year but back to when my son was killed by an illegal alien in 2010.

        Steve in particular: I saw your tweet to Ann Coulter. Would like to speak with you.

  9. Cody L Custis says:

    “To put it another way, the paper is called, “Providing driver’s licenses to unauthorized immigrants in California improves traffic safety,” but I’d be happier with a title such as “No evidence that providing driver’s licenses to unauthorized immigrants in California decreases traffic safety.””

    The authors didn’t collect enough data to disprove the null, claimed that this is evidence for the null, and then gave it to the popular press and politicians to misinterpret. Yuck.

  10. gdanning says:

    I have a question for those who know a lot more about this than I do: Given the width of the confidence intervals in the 3rd graph in the article (i.e., the one re hit-and-run accidents), can one exclude (with 95% confidence, etc.) the possibility that the true slope of the regression line is zero? (I.e., the points (1, +1) and (6, +1) both lie within the bands of the confidence interval, so how can this analysis exclude the possibility that a line connecting those two points represents the true relationship between the independent and dependent variables?)

    PS: For the purpose of this question, I am assuming that the methodology that the authors use is A-Ok – in other words, I am asking about the results on the authors’ own terms.

  11. academic gossip says:

    There’s an inconsistency in the construction of the variables, and I think it may account for the results.

    Their key finding is that accidents *per resident* in a county doesn’t get an extra boost proportional to “exposure” to illegal immigrants, but exposure is defined as licenses *per licensed driver* in that county issued to illegals. This inconsistency causes problems if legals and illegals have different rates of driving, say due to illegals being less able to afford cars.

    If illegals have fewer vehicle drivers per capita, immigrant-magnet counties with higher “exposure” will tend to also have higher growth (from before to after the law) in the denominator of the per-resident accident rate calculation. Accidents are a rare event compared to getting a drivers’ license, and differences between legals’ and illegals’ accident rates may therefore be small compared to their difference in rates of automobile driving. But that would generate a spurious “licensing illegals makes roads safer” result using the method of this study.
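    If I’m reading this mechanism right, it can be reproduced in a toy simulation in which the per-driver accident risk never changes, so any estimated “effect” is pure artifact (all numbers and variable names are invented for illustration, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 58  # counties, roughly California's count

# County-level immigrant share; "exposure" (AB60 licenses per licensed
# driver) tracks it, mimicking the paper's key regressor.
immigrant_share = rng.uniform(0.0, 0.3, n)
exposure = 0.2 * immigrant_share + rng.normal(0, 0.005, n)

# Pre-law: unauthorized immigrants drive less, so immigrant-heavy
# counties have fewer drivers per resident.
residents_pre = rng.uniform(1e4, 1e6, n)
drivers_pre = residents_pre * (0.7 - 0.5 * immigrant_share)

# Post-law: immigrant-magnet counties grow faster in residents, and the
# new arrivals drive at a lower rate (0.2 drivers per new resident).
growth = 0.01 + 0.10 * immigrant_share
residents_post = residents_pre * (1 + growth)
drivers_post = drivers_pre + (residents_post - residents_pre) * 0.2

# Accidents are strictly proportional to drivers: per-driver risk is
# constant, so the true safety effect of licensing is exactly zero.
rate = 0.02
d_per_resident = (drivers_post * rate / residents_post
                  - drivers_pre * rate / residents_pre)

# Regress the change in accidents per RESIDENT on per-DRIVER exposure.
slope = np.polyfit(exposure, d_per_resident, 1)[0]
print(slope)  # negative: a spurious "licensing makes roads safer" result
```

    The slope comes out negative purely because high-exposure counties add residents faster than they add drivers, exactly the denominator mismatch described above.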

  12. academic gossip says:

    Also, their construction of the fixed effects for time is odd, and may dilute the analysis. The data are monthly, for 2006 to 2015, and there is no obvious reason not to have one fixed effect per month. Instead, they use one fixed effect per year, and one for each of the 12 months of the calendar, so that every January is considered the same during the decade analyzed.

    • jrc says:

      I think this fixed-effect model – using year and month, not year-X-month – is both incredibly common and fundamentally misguided. So I agree with you that a more clearly interpretable model has year-X-month FE, but I don’t think their particular choice is odd, it is just what most people do.

      I actually wonder why, given the general “difference-in-difference” interpretation of these kinds of spatio-temporal FE models, you still see the year FE and month FE specification so much, instead of the year-X-month. Maybe it is just because when these models were developed everyone just used annual data, and then when they got more temporally-refined data, they just added in quarter/month/week/whatever indicator variables on top of what was already there. But clearly, if you want to get the “DnD” type of interpretation, you should cut “vertically” at the level at which the variation is assigned (year, month-X-year, week-X-year, etc…), just like you cut “horizontally” at the regional level to which you merge the data (state, country, etc.). But it is such a common specification that I very rarely hear anyone (other than myself) making the point you make above, which makes me think this isn’t usually an issue of p-hacking so much as an issue of not thinking clearly about the relationship between the model and the data.
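      For concreteness, here is what the two dummy layouts being contrasted look like on a 2006–2015 monthly index (a sketch; the index is illustrative, not the paper’s data):

```python
import pandas as pd

# Monthly time index matching the paper's 2006-2015 window: 120 months.
dates = pd.date_range("2006-01-01", "2015-12-01", freq="MS")
df = pd.DataFrame({"date": dates})

# Additive layout (year FE plus calendar-month FE): every January
# 2006-2015 shares a single "January" effect.
additive = pd.concat(
    [pd.get_dummies(df["date"].dt.year, prefix="yr"),
     pd.get_dummies(df["date"].dt.month, prefix="mo")],
    axis=1,
)

# Interacted (year-X-month) layout: one effect per sample month, which
# is what cutting "vertically" at the month level looks like.
interacted = pd.get_dummies(df["date"].dt.strftime("%Y-%m"))

print(additive.shape[1], interacted.shape[1])  # 22 vs. 120 columns
```

      The additive layout buys parsimony (22 columns instead of 120) at the price of forcing a perfectly repeating seasonal pattern on top of year-level shifts.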

      • Fixed effect per month of the year is equivalent to assumption of an annual periodicity. Fixed effect for month x year is equivalent to analyzing each time period separately with no smoothing/correlations in time. The best model is probably a medium dof continuous function.

        • The thing that gets me the most is, don’t they teach anyone anything about continuous functions in social science classes? These “dummy” variable approaches are equivalent to representing what is after all most likely a continuous function by a series of step functions. At the boundary between years you will automatically see a “signal” which is more or less “aliasing” of the kind that used to drive us all crazy when people scaled up bitmap fonts.

          • jrc says:

            Two things:

            1 – usually you do this when the explanatory variable of interest is constant over the period within a cell. If I have variation at the county-year level, then within any given county, I don’t think you really have this step-function problem if you use year effects. If you had a continuous RHS variable of interest, then I think you have your problem.

            2 – suppose again I have a county-year panel. County indicator variables are similar or equivalent to de-meaning the observations (both X and Y) by county… in many cases running the dummy variables or mean-differencing by county would produce essentially the same results.

            The thought experiment isn’t about “controlling” for something. The thought experiment is about which comparisons are being made and how.

            • Well, evidently people are using Year and Month dummies. So you’re assuming that whatever you’re observing is a function F(i) where i is the count of months, and it’s represented as G(Year(i)) + H(MonthOfYear(i))

              Now, your proposal to simply do Year x Month dummies is basically the same as simply modeling F(i) = F_i a sequence of discrete values where you can choose each of the F_i. Assuming whatever you’re looking at is aggregated over the i’th month then yes, you don’t have an aliasing problem at the year boundary because each month is completely separate, what you have is lack of regularization… F_1 = 10^6, F_2 = -17, F_3 = 37, F_4 = 9999…. you can set each of them to whatever, and have no “time structure” in the model where near-in-time values are more like each other, or pseudo-periodicity where each month of the year is nearly similar but not exactly… etc.

              On the other hand, if you do F(i) = G(Year(i)) + H(MonthOfYear(i)) you assume that whatever isn’t part of some constant value for the whole year G(Year(i)), is part of some *perfectly periodic* signal along the months that repeats each year H(MonthOfYear(i)). In this case, if there is an aperiodic component (trend, pseudo-periodic seasonality, etc) to the function, you WILL have unrepresentable stuff that will appear as “signal” when in fact it might be best to describe it as “seasonal type but not perfectly periodic” noise or “a trend that continues throughout the year and isn’t just constant across all the months after removing a periodic component”.

              In the ideal case, you have some function F*(i) which is constrained to be from a family of continuous interpolation functions that has less than N degrees of freedom (where N is the total number of months you observe) and hence has some time-covariance structure built in. Gaussian processes, splines, Fourier series, Chebyshev polynomials, piecewise Chebyshev… all of these kinds of things are capable of representing time-variation together with time-covariance/regularization/continuity/Lipschitz type conditions that are usually what you want in statistical analysis with noise.
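              One concrete version of the medium-dof idea: a smooth trend plus a few annual Fourier harmonics, giving 6 columns instead of 120 year-X-month dummies, with time structure built in (the basis size here is just an example of the modeling choice):

```python
import numpy as np

# 120 months (2006-2015), time measured in years.
t = np.arange(120) / 12.0

# Intercept, smooth trend (linear here; could be a spline), plus the
# first two annual Fourier harmonics. Adjacent months are constrained
# to look like each other, unlike free month dummies.
basis = np.column_stack([
    np.ones_like(t),                               # intercept
    t,                                             # trend
    np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),  # annual cycle
    np.sin(4 * np.pi * t), np.cos(4 * np.pi * t),  # semi-annual cycle
])
print(basis.shape)  # (120, 6)
```

              A fit on this basis is regularized by construction: the seasonal component is still periodic, but adding more trend terms (or multiplying the harmonics by slowly varying amplitudes) lets it drift, which the year-plus-month dummy layout cannot represent.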

      • academic gossip says:

        Maybe it’s a vestigial habit from the days when memory and computing time were more expensive.

  13. Steve Sailer says:

    It’s a successful study in the sense that they took a politically disastrous reality — an 11.6% increase in hit-and-runs — and did enough statistical voodoo to it to get a headline asserting that Jerry Brown’s policy was a success.

  14. Tom says:

    If the half million new driver’s licenses actually reduced hit and runs among those drivers, that means that all the other drivers had 18% more hit and runs.

    So the entire population of state drivers, except the ones with newly minted driver’s licenses, ran amok in 2015? The reporter that contacted Jonathan did a pretty good job, especially compared to the LA Times. After all, she was writing about a statistical study and actually contacted a statistician for a comment! How cool is that?

    She quoted: “Mark Krikorian, executive director of the Center for Immigration Studies, a Washington, D.C., think tank that supports tighter controls on immigration, dismissed the study as limited and premature.

    “The point of giving driver’s licenses to illegals is to document them, to give them partial amnesty,” Krikorian said. “With these documents, they can more effectively embed themselves in society.”

    So, Mark Krikorian doesn’t exactly believe it, but even more, doesn’t care. It’s obvious that the authors were going to come up with their increase = decrease conclusions regardless of the data.

    And their data is still missing. It isn’t uploaded. I emailed both the LA Times guy and the lead author of the study, with no response. I can’t see how they could have gotten anything negative out of the 2014 to 2015 data, so I am both curious and also lost as to what you refer to as the voodoo. They did include income and unemployment by county for something.

    • academic gossip says:

      Tom, I suggested above a way they could have gotten a negative result: using per-driver covariates to predict a per-resident measure would give a negative result if immigrants have fewer drivers per resident.
