Brian Kinghorn points to this news article by Christian Grothoff and J. M. Porup, “The NSA’s SKYNET program may be killing thousands of innocent people; ‘Ridiculously optimistic’ machine learning algorithm is ‘completely bullshit,’ says expert.” The article begins:
In 2014, the former director of both the CIA and NSA proclaimed that “we kill people based on metadata.” Now, a new examination of previously published Snowden documents suggests that many of those people may have been innocent.
Last year, The Intercept published documents detailing the NSA’s SKYNET programme. According to the documents, SKYNET engages in mass surveillance of Pakistan’s mobile phone network, and then uses a machine learning algorithm on the cellular network metadata of 55 million people to try and rate each person’s likelihood of being a terrorist.
The article displays some leaked slides labeled Top Secret. I don’t know if it’s legal for me to copy them here, but one of them says, “0.18% False Alarm Rate at 50% Miss Rate.” Grothoff and Porup write:
A false positive rate of 0.18 percent across 55 million people would mean 99,000 innocents mislabelled as “terrorists” . . . The leaked NSA slide decks offer strong evidence that thousands of innocent people are being labelled as terrorists; what happens after that, we don’t know.
I find this quite disturbing. I’m betting a lot of the problem can be chalked up to the base-rate fallacy: when actual terrorists are a tiny fraction of 55 million people, even a small false-positive rate will swamp the true positives.
By Bayes’ rule, Pr(terrorist | flagged) = Pr(flagged | terrorist) × Pr(terrorist) / Pr(flagged). So if Pr(terrorist) < Pr(flagged), then Pr(terrorist | flagged) < Pr(flagged | terrorist): the probability that a flagged person is a terrorist can be far lower than the algorithm’s detection rate.
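To see how the base rate drives this, here is a quick back-of-the-envelope calculation using the figures from the article (55 million people, 0.18% false alarm rate, 50% miss rate). The number of actual terrorists is not in the leaked documents; the figure of 2,000 below is a purely hypothetical assumption for illustration.

```python
# Base-rate arithmetic for the reported SKYNET error rates.
# Figures from the article: 55 million people, 0.18% false alarm
# rate, 50% miss rate. The count of actual terrorists is NOT in
# the documents; 2,000 is a hypothetical assumption.

population = 55_000_000
false_positive_rate = 0.0018   # Pr(flagged | innocent)
miss_rate = 0.50               # Pr(not flagged | terrorist)
true_terrorists = 2_000        # hypothetical assumption

innocents = population - true_terrorists
false_positives = innocents * false_positive_rate      # ~99,000 innocents flagged
true_positives = true_terrorists * (1 - miss_rate)     # 1,000 terrorists flagged

# Pr(terrorist | flagged): share of flagged people who are terrorists
precision = true_positives / (true_positives + false_positives)

print(f"Innocents flagged:  {false_positives:,.0f}")
print(f"Terrorists flagged: {true_positives:,.0f}")
print(f"Pr(terrorist | flagged) = {precision:.3f}")
```

Under these assumptions, roughly 99,000 innocent people are flagged alongside about 1,000 actual terrorists, so a flagged person has only about a 1% chance of being a terrorist, even though the algorithm catches half of all terrorists.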
P.S. A commenter points to a news article by Martin Robbins that concludes, “Nobody is being killed because of a flaky algorithm.” It’s hard to know either way, given that so much of this is secret.