Data Nation

When it comes to racial profiling, data both hurts and helps. Liberty and Scott investigate the damage policing data can do to communities and how data can also be used to solve the problem.

Show Notes

A Native American man gets pulled over for driving a nice car, a Black man is arrested in front of his family for a crime he didn’t commit – innocent people are at risk because of racial profiling. But to stop profiling, you have to first identify it, and that’s not as easy as it seems. Liberty and Scott are going deep into data in this episode, investigating how data is used against marginalized communities, and how it should be used to protect and serve them. They go to the experts to find out which methods are failing, what solutions can mitigate the dangers of facial recognition technology and smart policing, and how we know we’ve succeeded in ending profiling.

Liberty and Scott speak with Craig Watkins, Martin Luther King Jr. visiting professor at MIT; and Brandon Del Pozo, former police chief in NYC and Vermont.
 
Data Nation is a production of the MIT Institute for Data, Systems, and Society and Voxtopica

What is Data Nation?

We face many overwhelming challenges in America today: systemic racism, data privacy, and political misinformation. These are big problems, and there are a lot of opinions and ideas on how to fix them. Scholars and industry experts often disagree on how to find solutions. So, how can we find the right way to move forward? We let the data speak for itself. Join hosts Liberty Vittert and Scott Tranter as they gather data and get the facts about today’s most pressing problems to find out: are solutions even possible? They’ll investigate with MIT professors dedicated to researching these issues, and talk with the people on the ground encountering these problems every day so that we can find the best solutions that triumph over these challenges and solve America’s biggest problems.

Speaker 1 (00:04):
Welcome to Data Nation. I'm [inaudible 00:00:08] and I'm the director of MIT's Institute for Data, Systems, and Society. Today on Data Nation, Liberty and Scott are examining racial profiling and policing.

Liberty (00:23):
In 2021 in Rapid City, South Dakota, a police officer named Jeffrey Otto was on duty when he reported a car for suspicious behavior. It was a Mercedes, and he told another officer that he wanted to "keep an eye on this car because it's a young, Native male driving this really nice car." Otto eventually saw the driver step out of the car, and when he realized it wasn't actually an Indigenous man, he claimed the car didn't need to be pulled over anymore. And thankfully, after the event, the Rapid City Police Department removed Otto from the force. While Indigenous advocacy groups saw Otto's termination as a step in the right direction, they also argued that this was not an isolated incident. This wasn't the first time. And in a quote to CNN, they said, "The officer's alleged comments represent a culture of discrimination towards Native Americans in the city's police department."

Speaker 3 (01:27):
This is just one example of one city in the United States where communities are feeling burdened and unsafe from the consequences of racial profiling. So how do we solve this? Well, first you have to identify who's profiling and how much. The problem is, figuring this out isn't that easy, and you certainly can't figure it out with the status quo methods. The data is telling us the wrong thing.

Liberty (01:49):
One of the current tests used to identify patterns of racial profiling is called benchmarking. Benchmarking compares the percentage of stops involving people of a specific race with that group's percentage of the population in that geographical area. An example of this is the 1999 report on the New York City police's policy of stop and frisk. At the time, officers were patrolling private residential buildings and stopping individuals that they believed were trespassing. In 1999, 25.6% of the city's population was Black, but Black individuals comprised 50.6% of all the persons the police stopped. The New York Attorney General used benchmarking to determine if the practice was being used unconstitutionally. And later, in 2013, a federal judge ruled that stop and frisk had indeed been used in an unconstitutional manner.
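
As a rough illustration of the benchmarking comparison described above, the following Python sketch divides a group's share of stops by its share of the resident population. The function name `disparity_ratio` is hypothetical and chosen only for this example; the two percentages are the 1999 New York City figures quoted in the episode. This is a minimal sketch of the arithmetic, not the method used in the Attorney General's report.

```python
def disparity_ratio(stop_share_pct: float, population_share_pct: float) -> float:
    """Return a group's share of stops divided by its share of the population.

    A value near 1.0 means stops are roughly proportional to the benchmark;
    a value above 1.0 means the group is stopped more often than its
    population share alone would suggest.
    """
    if population_share_pct <= 0:
        raise ValueError("population share must be positive")
    return stop_share_pct / population_share_pct


# Figures quoted in the episode for New York City in 1999.
black_population_share = 25.6  # percent of city residents
black_stop_share = 50.6        # percent of all persons stopped

ratio = disparity_ratio(black_stop_share, black_population_share)
print(f"Share of stops is {ratio:.2f} times the share of the population")
```

Note that the denominator here is a resident population share, so the sketch inherits the weakness of any census-based benchmark: people who pass through an area without living there are not counted.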

Speaker 3 (02:49):
So basically, it seems like benchmarking revealed important data in this case, but does benchmarking work every time? Is it always accurate?

Liberty (02:57):
Not really. The problem with benchmarking is that it relies on census data, which can give a really misleading view because census data doesn't account for non-residents. If a major interstate runs through a city or if it's an area that brings in a lot of tourists or visitors, how are all these non-residents captured in that benchmark?

Speaker 3 (03:21):
So what we really want to figure out is what are the reliable methods to identify patterns in profiling and how can knowing these patterns help communities and police departments stop profiling from happening? So we're going to bring these questions to Chief Brandon Del Pozo. Chief Del Pozo served 19 years in the NYPD, where he commanded the 6th and 50th precincts as well as units in internal affairs. What I like to do is start out with a definition. How do we define the status quo in terms of determining if police are racially profiling? What is it and what's your definition and how do we look at it for an audience that doesn't necessarily understand what it is or is looking for a specific definition?

Speaker 4 (04:00):
Broadly, we're talking about race informing suspicion and suspicion giving police a reason to act. So, broadly construed, racial profiling is the inappropriate, the unjust in many cases, or unlawful use of race as a factor in forming criminal suspicion against someone. And then that empowering the police under statutes and law to then act, whether it's to conduct a stop, to conduct a frisk or a search or make an arrest. So in regulating racial profiling as a matter of policy, and also in understanding the data, what to include and exclude, we have to sort of settle on or at least understand or have a working definition of to what extent race as a data point can and cannot inform criminal suspicion.

Speaker 3 (04:47):
How do you quantify that or how do you identify that when you are two parts training officers and interacting with them or determining whether or not that was a factor?

Speaker 4 (04:56):
I came up with a continuum, where one end is when race is involved in the instance of the crime: "Hey, there's a group of Klansmen setting crosses on fire in Black churches." Or, "Hey, there was a pattern robbery in New Jersey where the folks were coming in and doing smash and grabs of Colombian jewelry stores, and they were Colombian suspects, and the victims knew it 'cause they're like, 'We're Colombian. The people were screaming at us in Spanish. It was a Colombian accent. This was an inside-the-community job.' And they were going and doing these jewelry store robberies." If a cop is driving by a jewelry store and there's four people peering into the jewelry store, it makes a difference for suspicion whether they're Colombian or not, whether they look Hispanic or not.

(05:41):
So at one level you have the instance where race informs suspicion in the instance as a fairly straightforward data point. And then on the other side you have cases where race informs suspicion in a very generalized, unsupportable way that makes cultural and frankly biased assumptions about a race of people. So those are the two extremes. And you want cops to understand that their work has to be in the first set of definitions and has to steer clear of the second set of definitions. And that bias of many kinds is leading you towards that second set.

Liberty (06:15):
And I think that sets the framework for what we are looking for and what we are trying to determine: how we stop this. So in the police right now, you are a chief. What do you do to stop that?

Speaker 4 (06:29):
You don't want to tell cops that they can never use the physical description of a person to know who to stop and who not to stop. 'Cause knowing who to stop is also knowing who not to stop. And the police need to be very judicious about seizing people, telling them they're not free to leave and investigating them and frisking them. And if a physical description of somebody being dark-skinned, light-skinned, Black, Hispanic, Asian helps narrow down the suspects so that innocent people aren't caught in that web of investigation, that's good. But then we have this other group of generalizations that we've seen all the way from the New Jersey Turnpike in the nineties and TSA and immigration all the way up through now, where you're just saying this class of people, "Black people," "Hispanic people," "Arabs." The suspicion is by virtue of being in the class, not by the instance at hand.

(07:19):
And it's how fruitful a police officer's searches are at the roadside. So if a police officer stops a car, does the investigation, and decides to conduct a search, the assumption is that they have the adequate level of suspicion. In theory, if race didn't inform his calculus in a biased way, all stops would be equally fruitful across races. If race wasn't artificially and unfoundedly informing suspicion as a matter of bias, let's say for argument's sake, the stops would be fruitful three quarters of the time. You'd get the drugs, the guns, the contraband, you'd get the person wanted on a warrant.

(07:55):
If you're conducting searches at the roadside and you're finding that your searches are much more likely to be fruitful for white drivers than for Black drivers, you got to step back and ask yourself at that point, what is it about Black drivers that's giving the police officer what amounts to being unfounded suspicion a greater percentage of the time, if you follow what I'm saying? And when you see that disparity in what you can call the hit rate and that disparity is by race, then you're getting towards a metric that at least in traffic enforcement, you can use to detect and attack bias.
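
The hit-rate comparison described above can be sketched in a few lines of Python. This is an illustration only: the function name `hit_rates`, the record layout, and the search records themselves are all invented for the example and are not real police data. The sketch counts, for each perceived race, the fraction of roadside searches that turned up contraband.

```python
from collections import defaultdict


def hit_rates(searches):
    """Return the fraction of fruitful searches for each perceived race.

    `searches` is an iterable of (perceived_race, found_contraband) pairs,
    where found_contraband is True when the search turned something up.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for race, found in searches:
        totals[race] += 1
        if found:
            hits[race] += 1
    return {race: hits[race] / totals[race] for race in totals}


# Invented records for illustration only; these are not real data.
example_searches = (
    [("white", True)] * 30 + [("white", False)] * 10
    + [("Black", True)] * 15 + [("Black", False)] * 25
)

rates = hit_rates(example_searches)
for race, rate in rates.items():
    print(f"{race}: {rate:.1%} of searches were fruitful")
```

In this made-up example, searches of white drivers are fruitful far more often than searches of Black drivers, which is the kind of gap the chief describes as a signal that suspicion toward Black drivers is more often unfounded. Recording perceived race, rather than race from a license, matches the point made later in the conversation that the officer's perception is what drives the action.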

Liberty (08:30):
So in practice, it makes sense that we're using the wrong metric. We're trying to use this census data to figure out if there's racial profiling, when really we should be using this hit rate. Is there any issue in the practicality in actually deploying the concept of hit rate in real time in a police department to see if somebody is racially biased?

Speaker 4 (08:50):
So one of the challenges is what counts as a hit. I was finding in my old police department in Burlington that, until marijuana was decriminalized, so many of the marijuana offenses were the basis for stops and suspicion. And so you had community activists saying, I think with some accuracy, "Listen, if the hit rates between Blacks and whites are the same, but most of the hits for Blacks are marijuana and the hits for whites are more serious crimes, are you just defining down the hit rate in a trivial way?" It's interesting, there are debates about why licenses don't have race on them; Vermont licenses, for example, don't have race. And people would say, "Well, we need to know the race because we need to collect that data." The counterargument is that what the cop perceives the race to be is what matters, because the perception is what yields the actions.

(09:41):
So a cop may perceive a Pacific Islander to be Hispanic. I think that's fine because if they're recording what they perceive the race to be, then they're going to record what they've based their actions on. But to answer your question, you need an administrative data collection system that captures all of these variables from the moment of the stop, the reason for the stop all the way through what was found, whether a warning was given, the length of detention. And there's a lot of disparate data collection in American policing.

Liberty (10:07):
In the big police departments, we even see issues with implementing smart policing. You look at facial recognition, which we know has a huge racial bias to it. Should we be using smart policing? Is this a good thing? Do the benefits outweigh the harms? And how do we implement it even in the big police departments in a way that is going to be just and lawful for our citizens?

Speaker 4 (10:30):
Let's say that there is a pattern rapist that's on his or her third sexual assault and the police got some surveillance footage and they know... Whatever the race, a white male, a Pacific Islander, an Asian, but they know that they're looking for the person. Old school policing is creating a dragnet, right, where you're going and you're in that neighborhood and you're looking for people that match the description. And if it's past a certain time at night and you're walking 50 feet behind a woman, they're going to stop you.

(10:58):
So it's a complex issue and I'm trying to work folks through the dynamics because you don't want facial recognition to supply suspicion in and of itself to justify an arrest. You get terrible outcomes and as you said, they are biased. But if you ban facial recognition, you get the reversion, in cases of acute public safety problems, to casting a wide net that will draw innocent people into it. And so another thing to think about is do you want to use facial recognition for petty crimes, for solving thefts, for solving misdemeanors? Or do you want to reserve it for serious pattern felonies where there's a person on the loose that has to be quickly identified?

Liberty (11:32):
I will tell you, I'm one of those people that writes and says, stands on my soapbox and says, "I think facial recognition should be banned." I think it should be banned until we can come up with these types of methods for deciding what type of crime we should actually be using facial recognition for, and these very specific boundaries around it. But it seems that with a lot of smart policing, people go, "I'm so excited, we can now implement facial recognition in the police." They throw it in there as soon as the technology is available.

Speaker 4 (12:02):
That's a great point. So there's two things going on at least. One is everything you said I completely agree with, and they put too much stock and have too much faith in the technology, and they use the technology to replace other more careful and deliberate and less error-prone processes. And they also use it as an excuse for not having to be accountable to the community and rely on the community to be a participant in public safety. Most shootings, most robberies are done in a community where people kind of know what's going on.

(12:33):
I've said this to a group of police chiefs and it landed like a lead balloon. But I said, "If you are using technology as a replacement for getting the trust of the community and for community partnership in public safety, you're using technology wrong." So to say, "I don't need someone to be a good witness and give a good description. I don't need somebody to agree to testify that they witnessed something. I don't need somebody to call me up and leave a tip and say, 'I think Joe did that robbery or Joe did that shooting,'" because they don't trust the police, to say, "That doesn't matter. I have facial recognition, I can do this myself," is a terrible miscarriage of American policing.

Speaker 3 (13:11):
As the policing industry moves to establish new practices to identify racial profiling and patterns, they still face many new challenges. Chief Del Pozo brought up the issue of facial recognition. Right now, millions of surveillance cameras are being used in private homes and public spaces, and local law enforcement wants to utilize these smart security devices. Google's Nest Doorbell and the Arlo Essential Wired Video Doorbell are devices that include built-in facial recognition and they can provide data to police. In fact, Amazon's Ring is partnered with almost 2,000 law enforcement agencies to allow officers to ask Ring users to share their video recordings without use of a warrant. While it's helpful to law enforcement agencies, the sharing of data raises concerns about privacy rights in the digital age, and it raises ethical concerns about the ability of law enforcement to utilize this data properly. For police to get your DNA and fingerprints, you need to be arrested. But for facial recognition to be used, you just need to be in public. Another big problem when it comes to facial recognition technology is that the algorithm just doesn't always get it right.

Liberty (14:14):
Actually, it was faulty facial recognition that led to a completely innocent African American Detroit man being arrested for a crime that he did not commit. Robert Williams was pulling up to his house after work when a Detroit police car pulled in behind him and blocked his SUV in case he tried to escape. The police then proceeded to arrest Williams on his front lawn in front of his wife and his two little girls. No one would tell him what crime he had committed. And after 18 hours in police custody, Williams was finally connected with a defense attorney who explained that someone had stolen watches from a store in Detroit. The store owner had sent the surveillance footage to the Detroit Police Department and the blurry image was then sent to the Michigan State Police. They ran facial recognition on this blurry photo.

(15:10):
It matched with Robert Williams's old driver's license photo. And so he was arrested for stealing watches at a store he had never even been inside. Following this, lawyers at the ACLU filed a lawsuit against the police department and won. But Williams argues that winning the case didn't undo the trauma inflicted on his family by a failure in facial recognition and really in policing. And no one can ever erase the image of their father being handcuffed from his daughters' minds. And this family will forever remember the day that really an algorithm took their dad away.

Speaker 3 (15:52):
This particular example raised a lot of red flags in how facial recognition and other emerging technology is used in policing. It seems like things can go wrong and when they go wrong, the consequences are severe. It leaves us with the question, how can we move forward with this technology and can it be used in a way that produces positive outcomes for the communities?

Liberty (16:11):
We decided to talk with Craig Watkins of MIT. Craig is the Martin Luther King Jr. visiting professor at MIT and the founding director of the Institute for Media Innovation. Craig is leading a team that is addressing the issue of artificial intelligence and systemic racism. So Craig, we were just discussing the case in Detroit where Robert Williams was arrested for a crime he did not commit, and it was all because of faulty facial recognition. So is this really just a one-off incident or is there potential for this same incident to happen over and over and become really problematic?

Speaker 5 (16:51):
Prior to this, technology activists, critics of these technologies, have long argued that facial recognition is seriously problematic insofar as it's less predictable in terms of accuracy when it comes to people with darker skin tones. It's less accurate when it comes to women versus men, that is, recognizing female faces versus male faces. And this, of course, has to do with the sort of training sets, the data around which these technologies have been developed. And so this has led to some serious problems, not only in theory, but in the real material world in terms of identifying people falsely, accusing them of crimes that they did not commit and putting them and their families through the horror of having to go through that experience.

Speaker 3 (17:36):
Do you think technology is a net positive in policing these days? And can it be a net positive in policing going forward?

Speaker 5 (17:42):
That's an interesting question. So I think clearly the promise of technology is real, insofar as the promise is that it can make any form of labor, policing included, more efficient, more data informed. And so those are all things that have the potential to be sort of a net positive, in terms of being able to accelerate access to information and, importantly, not only accelerate access to information, but access to insights that can be generated from that information. And that can be good for, again, any form of human endeavor, policing included. Of course, the question is, to what extent do we create procedures? Do we create policies and practices that allow us to realize that net positive, to realize the potential positive impacts of these systems? And we can't assume that just by virtue of them existing and by virtue of us adopting and deploying them, that they will generate these net benefits. Unless we are very intentional in terms of how we adopt, how we deploy these systems, it can lead to impacts that don't lead to a net positive.

Speaker 3 (18:56):
So you brought up an interesting term, net benefit. Net benefit means there's some good, there's some bad, but there's more good than bad. When we talk about justice and policing, we want to eliminate as much bad as possible. You don't want a false positive because false positives could end someone up in jail or on death row. How do you integrate this technology and these data science techniques in a way that the public will understand the net benefits?

Speaker 5 (19:20):
In my research and in conversations that I've had with community stakeholders, for example, in some communities, communities of color, communities made up primarily of working class or poor individuals, there is a history, and an understandable one, of a high degree of mistrust when it comes to policing and police-related work. And so for them, it's going to require some additional effort. It's going to require a strategic effort in terms of convincing them that these systems can lead to sort of a net benefit. So the question is, net benefit for whom? And I think some populations might say, "Well, certainly these technologies are a net benefit maybe for some segments of society. They may be a net benefit for those who are in charge of managing these systems and deploying police resources." But for the communities who bear the brunt of these systems, who are disproportionately profiled and surveilled as a result of these systems, there's just no possible way that they could see these technologies as a net benefit in any way, shape, form, or fashion.

Speaker 3 (20:26):
So we've heard the term predictive policing. Can you expound on what that really means and what it looks like in the real world?

Speaker 5 (20:32):
Basically, it's this idea that, as a result of collecting massive amounts of data, processing that data, and identifying patterns in that data, police might be able to predict where a crime might take place: a particular type of crime, in a particular location, at a particular time, by a certain type of person. These are all of the kinds of things that are now being promised with the technology. But it's this idea that you can marshal data, that you can organize data in such a way and develop algorithms that allow you to identify patterns that give you a greater degree of precision in terms of trying to identify where you think a certain type of crime might happen. And what that suggests is that if you can predict where crime is going to happen, then you allocate resources that perhaps prevent that crime from happening, a greater police presence in certain areas at certain times of the day, those kinds of things.

Speaker 3 (21:34):
So I got my start modeling and predicting what people are going to do, such as, is this person going to eat a Big Mac or are they going to eat at Taco Bell? I can be 80% sure they're going to eat a Big Mac, which means 20% of the time I'm wrong. So in this example, I think the math is good and useful, but useful in a low stakes game. So when you talk about predictive policing, is the math and science good enough for a high stakes game when it comes to somebody's liberty and freedom?

Speaker 5 (21:58):
So let's say you subscribe to the model of predictive policing. I think currently the way in which that model is being practiced, what I would argue is problematic about it is if you can coordinate data in such a way that it can give you that kind of insight, that could be useful. But then the question becomes, what actions do you take as a result of those predictive analytics that are being generated?

Speaker 3 (22:23):
And who's deciding what those actions are?

Speaker 5 (22:25):
Who's deciding? Are they purely punitive? So then are you just going in and surveilling and criminalizing and sort of punishing certain segments of the population? Or an alternative is that if we are predicting certain patterns and we see certain trends as it relates to certain kinds of crimes that we're concerned with, what can we do to perhaps create conditions and situations where this no longer becomes a prevalent problem? So there are other ways to respond to that data and those analytics rather than simply, right, these punitive measures, more police on the ground, more aggressive policing, stop and frisk, those kinds of things. And that requires a very different kind of mindset, a very different approach to policing.

Liberty (23:08):
How do we know that we've succeeded in ending these biased practices? How or when do we know that we've done a good job and hit success?

Speaker 5 (23:18):
That's a great question and there are many possible answers. One that I will give is when the communities that historically have felt the most pain from policing-related behaviors and processes, when they begin to trust the system, when they begin to say, these technologies are creating safer, more secure communities, safer, more secure lived experiences for us, then I think that could be one indicator or one marker of success, when you get buy-in from them. My vision is hopefully within the next five years or so, maybe this will happen, maybe it won't. But in the next five years that any police department, any decision that they're making as it relates to what kind of new technology they want to adopt, what kind of new technology they want to integrate into their practices, that before doing that they would engage the community in a meaningful and substantive way.

(24:19):
And I think when we get there, that'll be a marker of success. That'll be a marker of where we've turned, I think, a really important corner because that sort of speaks to a greater degree of accountability, a greater degree of partnership and engagement between police and the communities that they tend to work in.

Speaker 3 (24:41):
We've heard in this episode, there are several dilemmas that have to be addressed when it comes to eliminating racial profiling, from detecting profiling patterns to navigating new policing technology. The solutions aren't exactly simple.

Liberty (24:53):
When it comes to detecting bad actors, it's clear that we need simple and understandable metrics that serve as an early warning system and flag individual officers who are actively profiling. And as we work to remove these bad actors, we really need to keep an eye on the new technology and algorithms that are being introduced to policing.

Speaker 3 (25:19):
New methods like smart and predictive policing need to be added very carefully so that communities feel they can work with police and actually be served by them, not be victims of police when they've committed no crime.

Liberty (25:30):
I guess the good thing is that even though these solutions may be complex, there are solutions to move forward. And like Craig said, we know we've hit success when affected communities begin to trust the system again. That is the real marker of success. Thanks so much for listening to this episode of Data Nation. This podcast is brought to you by MIT's Institute for Data, Systems, and Society. And if you want to learn more about what IDSS does, please follow us @mitidss on Twitter or visit our website at idss.mit.edu.