Ravi Shroff (NYU collaborator)
The Stanford Open Policing Project began with an assignment in 2014 to a group of journalism students at Stanford University. The task: negotiate for the stop and search data for every state police or patrol department that collects it. A year and several classes later, the students, with the guidance of their professor, Cheryl Phillips, had collected more than 100 million records. Phillips and Vignesh Ramachandran from the journalism program then joined forces with Sharad Goel, from the school of engineering, and a team of PhD students to process, analyze and publish the results of their work, thanks to the support of the Knight Foundation. The Stanford Libraries contributed by adding the resources of their digital repository, ensuring that any data collected as part of the project would be preserved and made freely available going forward. Finally, Poynter joined this collaborative and growing effort by supporting the in-person training of journalists at the annual Investigative Reporters and Editors Conference and at other workshops on covering policing.
The end result: More than 150 journalists have gone through hands-on training in analyzing state patrol stops, and the data has been downloaded and used in multiple stories, from work by National Geographic to the Los Angeles Times. Now, students at the University of Maryland have been working with the Open Policing Project to collect city- and county-level data, and the team expects to clean and analyze that data to better understand racial disparities in policing. The end goal is simple: help researchers and journalists cover this vital story by taking on the laborious task of collecting and processing the data, making it easier for journalists to tell stories on this issue. The stop and search records of public police agencies are public records, but too often they are trapped in mainframes and not released in an easily obtainable format. The Open Policing Project hopes to help resolve that problem and provide support for data journalists tackling these stories.
What makes this project innovative?
The sheer scope of this project is its innovation. Collecting policing interaction data at scale has not been done before. Currently, a comprehensive, national repository detailing interactions between police and the public doesn’t exist. That’s why the Stanford Open Policing Project is collecting and standardizing data on vehicle and pedestrian stops from law enforcement departments across the country — and we’re making that information freely available. We’ve already gathered 130 million records from 31 state police agencies and have begun collecting data on stops from law enforcement agencies in major cities and counties, as well. We are marrying that work with hands-on training for journalists, class projects for student journalists and collaborative work with journalists to tell important stories from this data.
We, the Stanford Open Policing Project, are an interdisciplinary team of researchers and journalists at Stanford University. We are committed to combining the academic rigor of statistical analysis with the explanatory power of data journalism.
What was the impact of your project? How did you measure it?
We have measured the project's impact in part through the number of journalists trained (more than 150 and counting) and through the stories told. National Geographic used the data in a recent project on race and policing, and the LA Times, the San Jose Mercury News, the Center for Investigative Reporting, the Marshall Project and NBC have all published stories using the data or reporting on the patterns we identified in the data. Those are just a few of the news organizations tackling this work. Nearly every week, we hear from a journalist who is downloading the data. Our team routinely provides guidance on best practices in analyzing the data, and our site includes a tutorial as well. The impact also can be seen through the work of the news organizations and in the way this data helps move the issue to the forefront. When Trevor Noah chose to do a sketch on policing using data from the Stanford Open Policing Project, the takeaway for us was that this important issue is gaining notice in mainstream journalism coverage and in popular media. That can only be a good thing for coverage of such a vital area for the United States.
Source and methodology
The Stanford Open Policing Project has collected and standardized more than 100 million records of traffic stop and search data from 31 states.
Creating this resource has been marked by challenges. Some states don’t collect demographics of who police pull over. States that do collect the information don’t always release the data. Even when states do provide the information, the way they track and then process the data varies widely across the country, creating challenges for standardizing the information. We logged all communications with each police agency and checked for missing data, unusual data points within certain columns and more. We kept both a raw version and a processed version of the data.
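To make the standardization step concrete, here is a minimal Python sketch of mapping differently formatted state records onto one common schema while leaving the raw records untouched. It is an illustration only and not the project's actual pipeline: the state codes ("XX", "YY"), column names, value mappings and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical per-state column mappings: raw column name -> standardized name.
COLUMN_MAPS = {
    "XX": {"StopDate": "date", "DriverRace": "subject_race", "Searched": "search_conducted"},
    "YY": {"date_of_stop": "date", "race": "subject_race", "search": "search_conducted"},
}

# Hypothetical value mappings for codes that differ from state to state.
RACE_CODES = {
    "w": "white", "white": "white",
    "b": "black", "black": "black",
    "h": "hispanic", "hispanic": "hispanic",
}
TRUE_VALUES = {"y", "yes", "true", "1"}
FALSE_VALUES = {"n", "no", "false", "0"}


def standardize_record(state, raw):
    """Return a new standardized dict; the raw record is not modified."""
    mapping = COLUMN_MAPS[state]
    renamed = {std: (raw.get(src) or "") for src, std in mapping.items()}

    race = renamed["subject_race"].strip().lower()
    searched = renamed["search_conducted"].strip().lower()

    if searched in TRUE_VALUES:
        search_conducted = True
    elif searched in FALSE_VALUES:
        search_conducted = False
    else:
        search_conducted = None  # unknown or missing value

    return {
        "state": state,
        "date": renamed["date"].strip() or None,
        "subject_race": RACE_CODES.get(race),  # None for unrecognized codes
        "search_conducted": search_conducted,
    }


def missing_counts(records):
    """Count, per field, how many standardized records have no usable value."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            if value is None:
                counts[field] += 1
    return counts


# Toy raw rows, invented for illustration.
raw_rows = [
    {"StopDate": "2015-01-02", "DriverRace": "W", "Searched": "Y"},
    {"StopDate": "", "DriverRace": "Z", "Searched": "N"},
]
processed_rows = [standardize_record("XX", row) for row in raw_rows]
print(missing_counts(processed_rows))
```

Keeping `raw_rows` and `processed_rows` as separate objects mirrors the idea of retaining both a raw and a processed version, and the per-field counts of missing values give a starting point for follow-up questions to an agency.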
Data from 20 states, comprising more than 60 million state patrol stops, are sufficiently detailed to facilitate rigorous statistical analysis. The result? The project has found significant racial disparities in policing. These disparities can occur for many reasons: differences in driving behavior, to name one. But, in some cases, we find evidence that bias also plays a role.
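One simple way to start looking for disparities in stop records is to compare, for each group, how often stops lead to searches and how often those searches turn up contraband. The Python sketch below illustrates that general idea only. It is not the project's analysis code or its new statistical test, and the field names (`subject_race`, `search_conducted`, `contraband_found`) and the toy records are made up for illustration.

```python
from collections import defaultdict


def rates_by_group(stops):
    """Compute the search rate and hit rate for each group.

    search_rate = searches / stops
    hit_rate    = searches that found contraband / searches
                  (None when a group has no searches)
    """
    totals = defaultdict(lambda: {"stops": 0, "searches": 0, "hits": 0})
    for stop in stops:
        group_totals = totals[stop["subject_race"]]
        group_totals["stops"] += 1
        if stop["search_conducted"]:
            group_totals["searches"] += 1
            if stop["contraband_found"]:
                group_totals["hits"] += 1

    rates = {}
    for group, t in totals.items():
        rates[group] = {
            "search_rate": t["searches"] / t["stops"],
            "hit_rate": t["hits"] / t["searches"] if t["searches"] else None,
        }
    return rates


# Toy records, invented for illustration; these are not real stop data.
toy_stops = [
    {"subject_race": "A", "search_conducted": True, "contraband_found": True},
    {"subject_race": "A", "search_conducted": False, "contraband_found": False},
    {"subject_race": "B", "search_conducted": True, "contraband_found": False},
    {"subject_race": "B", "search_conducted": True, "contraband_found": True},
]
for group, values in sorted(rates_by_group(toy_stops).items()):
    print(group, values)
```

Differences in these raw rates do not by themselves establish bias, since, as noted above, disparities can arise for many reasons. That is why simple comparisons like this are a first look and not a conclusion.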
On our site, you can explore our results. You’ll find tutorials that walk you through the steps to understand the data yourself, and information on a new statistical test of discrimination developed as part of this project. See our technical paper for more details. (https://5harad.com/papers/traffic-stops.pdf)
To facilitate transparency and research, we released the records we collected and our analysis code. We’ll be regularly updating the repository, and we’re collecting even more information, including local police stops.
Our analysis was done in Python and R. Our findings are detailed on our site, and we provide plots that allow readers to interact with the data for their state.