This past weekend, I had the opportunity to travel to Houston and check out NeuLaw's Criminal Record Database. I wanted to take a moment to introduce this project and talk about why it's important.
The U.S. needs a usable criminal records database. Primarily, this is because the closest thing we have right now, the FBI’s Uniform Crime Reports (UCR), doesn’t cut it. First, the UCR doesn’t have individual identifiers, so tracking a specific case is impossible. Second, the numbers that the UCR releases are aggregate counts, which means that you can’t follow the life cycle of a case from original charge to final disposition. Last, the UCR relies on local law enforcement agencies (there are over 18,000 in the U.S.) to voluntarily hand over their data, leaving the dataset incomplete. All of these factors mean that the UCR is woefully deficient. (As a side note, there is discussion about improving and reforming the UCR, but I have not seen any concrete recommendations.)
Given the current data problem, it’s easier to see the need for the ambitious NeuLaw project. The database intends to collect tens of millions of criminal records from around the country and compile them in one standardized database. The project offers aggregate longitudinal data, but it also lets researchers drill down to individual cases and follow their journey through the criminal justice system. As of last fall, the database had 22.5 million records from 1977 to 2014, drawn from Harris County, Texas; New York City; Miami-Dade County, Florida; and New Mexico. To go deeper on this project, you can read this article by Pablo, David, Sasha and Gabe, the project’s core team.
This project has not been easy. The team filed numerous freedom of information requests to get this data. Once the data was in hand, they made the painstaking effort to standardize it across jurisdictions. While the whole effort is impressive, it’s the standardization process that is amazing. Standardizing charges and dispositions, and correcting human input errors, is no small feat, especially when you’re talking about tens of millions of records.
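To give a feel for what that standardization involves, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the charge and disposition labels, the lookup tables, and the `StandardRecord` fields are hypothetical and are not taken from the NeuLaw database or its actual data standard. The idea is simply that each jurisdiction's free-text entries get cleaned up and mapped onto one shared vocabulary.

```python
from dataclasses import dataclass

# Hypothetical lookup tables. The raw strings and standard labels are
# invented for illustration and do not come from the NeuLaw database.
CHARGE_MAP = {
    "poss cs pg 1 <1g": "drug possession",
    "crim poss contr subst 7th": "drug possession",
    "burglary of habitation": "burglary",
    "burg 2nd": "burglary",
}

DISPOSITION_MAP = {
    "dism": "dismissed",
    "dismissed": "dismissed",
    "conv": "convicted",
    "guilty": "convicted",
}


@dataclass
class StandardRecord:
    """One case in a shared, jurisdiction-independent format (hypothetical)."""
    jurisdiction: str
    case_id: str
    charge: str
    disposition: str


def clean(text: str) -> str:
    """Lowercase the text and collapse stray whitespace from manual entry."""
    return " ".join(text.lower().split())


def standardize(jurisdiction: str, case_id: str,
                raw_charge: str, raw_disposition: str) -> StandardRecord:
    """Map one jurisdiction-specific record onto the shared vocabulary.

    Anything not found in the lookup tables is labeled "unknown" so that
    it can be reviewed by a person instead of being silently guessed.
    """
    return StandardRecord(
        jurisdiction=jurisdiction,
        case_id=case_id,
        charge=CHARGE_MAP.get(clean(raw_charge), "unknown"),
        disposition=DISPOSITION_MAP.get(clean(raw_disposition), "unknown"),
    )


record = standardize("Harris County", "A1", "  Burglary of  Habitation ", "DISM")
print(record)
```

Even in this toy version, the hard part is visible: someone has to build and maintain those lookup tables for every jurisdiction, and every unmatched entry needs a human decision. Multiply that by tens of millions of records and you get a sense of the effort involved.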
As this project continues to grow, I hope that we have a discussion around the broader use of the project’s data standard. Yes, the database on its own is a needed tool; however, it would be revolutionary to have jurisdictions across the country collect data in its standard format. Undoubtedly, this is a big ask (even bigger than the creation of the database itself), but successfully implementing this standard would be beneficial in two main ways.
First, if the local crime data being created looked like the fields in the NeuLaw database, then updating the project and keeping it current would become immensely easier. This would both increase the usability of the tool and greatly reduce the human hours it takes to keep the project going. Second, the White House, NYU’s GovLab, the SpotCrime Open data standard, and Measures for Justice, among others, are trying to find or create a standard for justice data. I don’t see why we can’t explore using the NeuLaw data standard as a national standard. There’s no reason to reinvent the wheel if we already have a functional standard just waiting to be packaged as such.
I realize the dream of a national criminal records data standard is a hard ask in a decentralized criminal justice system like ours. However, it is a dream worth fighting for. Without such a standard, we will continue to struggle to create a deep understanding of how our criminal justice system functions. This should matter to everyone involved in criminal justice reform, because it’s nearly impossible to find a solution if you can’t first understand the problem.