Social Science Bites

Megan Stevenson on Why Interventions in the Criminal Justice System Don’t Work

July 1, 2024

Do policies built around social and behavioral science research actually work? That’s a big, and contentious, question, and almost an existential one for the disciplines involved. It’s also a question that Megan Stevenson, a professor of law and of economics at the University of Virginia School of Law, grapples with as she explores how well randomized control trials can predict the real-world efficacy of interventions in criminal justice. What she’s found so far in that particular niche has echoed across the research establishment.

As she writes in the abstract of an article published in the Boston University Law Review:

This Essay is built around a central empirical claim: that most reforms and interventions in the criminal legal space are shown to have little lasting effect when evaluated with gold standard methods. While this might be disappointing from the perspective of someone hoping to learn what levers to pull to achieve change, I argue that this teaches us something valuable about the structure of the social world. When it comes to the type of limited-scope interventions that lend themselves to high-quality evaluation, social change is hard to engineer. Stabilizing forces push people back towards the path they would have been on absent the intervention. Cascades—small interventions that lead to large and lasting changes—are rare. And causal processes are complex and context-dependent, meaning that a success achieved in one setting may not port well to another.

In this Social Science Bites podcast, Stevenson tells interviewer David Edmonds that “the paper is not saying ‘nothing works ever.’ It’s saying nothing works among this subset of interventions, and interventions, as we talked about, are the type of interventions that get studied by randomized control trials tend to be pretty limited in scope. You can randomly allocate money, but you can’t randomly allocate class or socioeconomic status.”

Despite this cautionary finding in her research, Stevenson hasn’t despaired about her career choice or that of other social and behavioral scientists. “Many of us are in this line of work because we care about the world,” she notes. “We want to make the world a better place. We want to think about the best way to do it. And this is valuable information along that path. It’s valuable information in that it shuts some doors. … So keep trying other doors, keep experimenting.”

The transcript of the conversation appears below.

David Edmonds: This podcast has now been running for many years. Over that time, we’ve heard from a number of social scientists about interventions designed to solve various problems. For example, one of the early Social Science Bites was with Lawrence Sherman, well known for his experimental work in criminology. In that interview, he told us how randomized control trials had helped make some important discoveries; hot spot policing, which concentrates police resources in a small number of places, could be highly effective in reducing crime, or so we were told.

Recently, a new paper by Megan Stevenson, who teaches in the law school at the University of Virginia, has cast doubt on the idea that any such interventions actually work. Not surprisingly, her paper has caused a huge storm in the world of social science. Megan Stevenson, welcome to Social Science Bites.

Megan Stevenson: Thank you so much. Lovely to be here.

Edmonds: Today’s topic, why interventions in the criminal justice system don’t work. So, my first question, what do you mean by intervention? What interventions are we talking about?

Stevenson: Well, in specific, we’re talking about interventions studied with randomized control trials. So randomized control trials, or RCTs, as we call them for short, are a way of evaluating how effective something is. You’re probably most familiar with them from the medical setting. You know it’s how we figured out that vaccines work and don’t have terrible side effects.

Used in the social science space, in particular in criminal justice, RCTs are a way of evaluating programs or interventions like intensive probation, like hot spots policing, that’s when you send police to a hot spot at a corner where there’s a lot of crime to see how much you can reduce crime. Programs like tough on crime, Scared Straight programs where you take a bunch of kids into prison, get them really scared about it, boot camps, things like cognitive behavioral therapy, so on and so forth. So, it’s any sort of program or policy that governments do, that NGOs do, to try and reduce crime or help people that have been involved in the justice system get on a better path.

Edmonds: So potentially multiple interventions. You recently wrote a paper which caused, I think it’s fair to say, a huge stir in the social science world. It was called “Cause, Effect, and the Structure of the Social World.” Tell us what you set out to do in that paper.

Stevenson: Well, this paper reviews 50 or 60 years of RCTs in the criminal justice space, and what it finds is that when you take a really broad bird’s eye view of the literature, you find that very, very few interventions, when evaluated with this kind of gold standard method, are found to achieve what they set out to achieve, and the ones that do tend not to replicate when tried again in another time or another place.

Edmonds: I guess, in retrospect, examining these multiple studies to see whether any intervention works, it was a pretty obvious thing to do, but nobody had done it before. Was there something that triggered the thought that this was potentially a fertile area of research?

Stevenson: Well, that’s sort of, you know, how I describe the study, but the way that it came about is kind of the opposite way. So I’m an economist. I’ve been studying the criminal justice space for 10 or 15 years, which means I do applied statistical research to see what works and what doesn’t, what’s going on right now with the criminal justice system, mostly in the United States. Honestly, this is just something that I started noticing over the years. It’s something that I think most people know in the back of their minds. Sometimes people talk about it, but it rarely gets acknowledged. The research that people hear about, that people talk about, is the rare case of success.

You know, as social scientists, we’re programmed, or, you know, we’re incentivized to try and find the successful intervention. That’s what gets published. That’s what’s going to build our career, give you tenure, so on and so forth. And so I think I’d entered this field very hopeful, like, I’m going to be able to figure out these important key areas where if you intervene, you’re going to be able to have a large and lasting effect, tipping points, I guess, to use, like, kind of the popular vernacular, where if you fix this thing at just this right spot, it’s going to have very long trickle out effects and, and I wasn’t finding it, and other people weren’t really finding it either. And then when we think we find one, then there’s a bunch of research trying to replicate it, and it generally doesn’t replicate.

This had just been kind of like a growing realization for me over the years, and at some point, I just thought, like, “God, this is so interesting. This is not a failure. I mean, sure, from one perspective, it’s a failure, but, like, it’s not really a failure. This is what science is. This is how you learn, and we’ve learned something really interesting about the structure of the social world by doing these randomized control trials for so many years.” That’s where I wrote the paper.

Edmonds: And you’ve mentioned randomized control trials a few times, RCTs, and we’ve covered them many times in this podcast, but just give me an example of how a randomized control trial might be set up in the criminal justice system in any one of the many interventions that you talked about?

Stevenson: Sure, so maybe there’s a program within a prison, you know, it’s some sort of program that is designed to help people get on their feet when they’re released, and it entails some job training, some group therapy type stuff; maybe it extends even after you’re released and gives some subsidized employment to help you get on your feet. People, they’re excited about this program, but there’s more demand than there’s space for, and so how do they decide who’s allowed in? They randomly choose people. They take a bunch of applicants, and they randomly say, “OK, well, you can come in,” and, “I’m sorry, but we don’t have space.” And so you have this group of people that were randomly allocated to either be within the program or to not receive the program. So, because they’re randomly allocated, on average across these two groups, they’re going to look very similar. You know, it was just pure luck that put them in the program or put them out of the program. And so, when you follow them up over the years and see, you know, are they committing crime at the same rates? Are they getting arrested at the same rates? Are they employed at the same rates or not? You can attribute any potential differences to the fact that one group was in the program and the other group was not.
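The lottery design Stevenson describes can be sketched as a small simulation. This is an illustration only, not anything taken from her paper: the function name `run_lottery_trial`, the range of baseline rearrest risks, and the `true_effect` parameter are all hypothetical, chosen just to show why random allocation lets a simple difference in rates stand in for the program's effect.

```python
import random
from statistics import mean


def run_lottery_trial(n_applicants=4000, n_slots=2000, true_effect=0.0, seed=0):
    """Simulate an oversubscribed program that admits applicants by lottery.

    All numbers are made up for illustration. `true_effect` is the assumed
    change in each admitted person's rearrest probability.
    """
    rng = random.Random(seed)

    # Hypothetical unobserved baseline risk of rearrest for each applicant.
    baseline_risk = [rng.uniform(0.2, 0.6) for _ in range(n_applicants)]

    # Random allocation: shuffle the applicants and admit the first n_slots.
    ids = list(range(n_applicants))
    rng.shuffle(ids)
    admitted = set(ids[:n_slots])

    treated, control = [], []
    for i, risk in enumerate(baseline_risk):
        in_program = i in admitted
        p = risk + (true_effect if in_program else 0.0)
        p = min(max(p, 0.0), 1.0)  # keep the probability in [0, 1]
        rearrested = 1 if rng.random() < p else 0
        (treated if in_program else control).append(rearrested)

    # Because luck alone decided who got in, the two groups have similar
    # baseline risk on average, so the gap in rates estimates the effect.
    return {
        "n_treated": len(treated),
        "n_control": len(control),
        "treated_rate": mean(treated),
        "control_rate": mean(control),
        "difference": mean(treated) - mean(control),
    }


if __name__ == "__main__":
    print(run_lottery_trial(true_effect=0.0))   # program that changes nothing
    print(run_lottery_trial(true_effect=-0.2))  # program that lowers rearrest risk
```

With an assumed effect of zero, the two groups' rearrest rates should come out close to each other; with an assumed reduction, the gap between the groups should land near the value that was built in. The sketch covers only the comparison at a single follow-up point, not the question of whether an effect lasts.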

Edmonds: One of the most surprising failures to me was the risk assessment that you mentioned in the paper. There’s been all sorts of hype, I think partly because of artificial intelligence, that it would be able to predict who was going to commit future crimes and so who was safer to release. It amazes me that you suggest even that hasn’t worked.

Stevenson: Yeah, that’s one of the subjects that led me to this paper, because this is something I’ve studied a lot, and risk assessment is very much cited as number one on the kind of evidence-based type practices that can be adopted in the criminal justice space. Just for people that don’t know, risk assessments are just statistical tools for predicting who’s going to commit crime in the future. And when I started studying their adoption in different settings and found, wow, they really don’t bring about a whole lot, you know? And that’s a combination of judges don’t use them intensively. They do use them. They do respond to them, but there’s a lot of also ignoring them and overruling them, and maybe the fact that the tools aren’t bringing as much gain and information as people want. Or, you know, judges are following these different agendas. They have different goals. You know, their goal is not solely to incarcerate those at highest risk of crime. Because if you did that, honestly, what you’d be doing is you’d be locking up all teenage boys and not letting them out until, you know, their hormonal mix sets up a little bit.

So yeah, so risk assessment is an example of something that seems really effective. It seems like it would be really effective. And it’s not the only one. There’s lots of them, you know, kind of depending on your predisposition. There’s lots and lots and lots that people were excited about for various reasons and strong theoretical reasons that in reality, didn’t have the impact people hoped for.

Edmonds: It sounds like you weren’t too surprised by your result. It was very surprising to me, but it sounds like you kind of expected it. But it just seems very depressing.

Stevenson: Well, like I said, it was a slow process of transformation for me. I think the moment that was my ‘aha!’ moment was when I saw a presentation about job training programs for recently released prisoners, and that seems like such an important place to intervene, to like, get somebody on their feet when they’re just released from prison, get them employed. That could be really pivotal in terms of future arrests and restabilizing their lives at one of those tipping points, like I talked about earlier. And it didn’t have lasting effects. You know, it had short term effects, but nothing that lasted and, you know, much beyond when the program was over.

It’s been a slow process of change for me. So no, I wasn’t really surprised. This is what I’ve come to expect when I evaluate things. I don’t know that I find it depressing anymore. I think when I started in this career path, I was much more interested in this idea of being a social engineer. And that’s something, you know, that many people share, for good or for evil, right, but in my case, I was hoping it was for good.

What I’ve come to realize is that I think there is naivete and some, I’ll almost say, some egotism to the thought that you could be the one to step in and solve other people’s problems like just that easily with just this one single program or intervention. The paper is not saying ‘nothing works ever.’ It’s saying nothing works among this subset of interventions, and interventions, as we talked about, are the type of interventions that get studied by randomized control trials tend to be pretty limited in scope. You can randomly allocate money, but you can’t randomly allocate class or socioeconomic status. You know, like there’s so much more than money that differentiates the haves and the have nots in this world.

I think change is still achievable. It’s something we should still aspire towards, but we need to think in different terms than you know, like we’re going to come in with this one great intervention that is going to change everything.

Edmonds: Well, let’s come on to what might work in a moment. But I hadn’t realized you were trained as an economist, but you’re writing here about legal interventions, interventions in the legal system. But obviously there are interventionist policies in numerous areas. In education, for example, there’s a question about whether reducing class size works, or does giving teachers extra pay incentivize them to do a better job? Is there any reason to think that your findings in the criminology area can’t be extrapolated to all these other domains?

Stevenson: So I personally think, at least in broad strokes, that what I learned in the criminal justice space is similar phenomena to what’s going on in education and healthcare in a variety of different domains. I say that for a couple reasons. Part of it is theoretical, you know, like, when I think about this kind of, like, failure to, to find successful interventions in criminal justice, it comes down to the fact that I think people are already trying to build the best lives they can given their circumstances. So that’s why it’s, it’s so hard to change them, because people are already out there trying to do the best they can.

I’m going to slip into econ speak. So, apologies here. But, like, they’ve already kind of, like, maximized their utility subject to constraints. And I think that’s broadly true across people. Like, obviously, when you move away from the population studied in criminal justice and look at wealthier people or more advantaged people, you know, their constraints are different. Maybe they’re less, but they’ve already kind of maximized what they can given what they’re facing. And the remaining constraints in the world tend to be, they’re sticky, you know, they’re systemic. They’re things that change slowly over time, or maybe rapidly over time, given, you know, in the rare case of a real kind of moment of social revolution or technological revolution, I guess. But they’re not something that you can just do, like an NGO, can just implement a policy and, like, fix it or something.

So part of it is theoretical. The other part of it is, you know, my understanding of the research in these areas. You know, there tends to be a fairly similar finding in terms of RCTs, that at least when it comes to these sort of like interventions that have kind of indirect, multi-step causal pathways, or supposed causal pathways, that they don’t tend to be very successful, or once again, once somebody’s found an initial success in one place, it tends not to replicate when they try to expand it or scale it up.

This is not saying nothing works in the entire sphere of anything in the world. This is saying when it comes to these type of limited-scope interventions, when you’re not looking at the direct effect of something, you’re looking at sort of like the indirect effects, the long-term effects, effects that require, like, a few kind of steps down a causal pathway that tends to be very, very hard to predict and manipulate.

Edmonds: I think that the key sentence, tell me if you disagree, the key sentence in your article is this: “the primary goal of this article is to build and support the claim that when it comes to the type of limited-scope interventions evaluated by RCTs, randomized control trials, the social world is full of stabilizing forces that resist change.” And there’s that phrase, “stabilizing forces that resist change.” Just explain what you mean by that.

Stevenson: It’s like what I was talking about earlier, that when it comes to the remaining constraints that limit people from what they want to achieve, to limit them from kind of a better life, that these tend to be deep, hard to move, sticky structural forces. And part of the goal of the paper is to just re-articulate this claim that when you evaluate things with RCTs, they tend not to work in a slightly more abstract sense. The term of stabilizing forces was an abstraction that really felt powerful. It felt like I was able to see something kind of important about the way the world works. And I don’t know if that metaphor will be helpful for other people in the same way it was for me. But the idea of stabilizing forces and the other idea there was the idea of cascades, which is just a term that I use for this idea of, like, the holy grail of you find the tipping point intervention that has, like, cumulative, huge bang for your buck kind of thing.

Edmonds: OK, let’s talk a bit about what might work. Suppose somebody came along and said, “I think crime is driven by poverty,” a perfectly plausible claim, and “what we need to do is redistribute money from the rich to the poor, and that will reduce crime rates.” Now that obviously is not something that can be analyzed, well, not very easily, I wouldn’t have thought by a randomized control trial. So what can we say about that kind of claim?

Stevenson: So, let’s kind of break that down. I mean, you can look at large shocks to wealth, like you can look at lottery winners, for instance, and see how their crime behavior may or may not change. I want to call that a limited-scope intervention. Some people argue it’s not limited scope, but it’s very different from a systemic change where, instead of just kind of impacting one person’s life, you’re impacting the entire structure of society by engaging in a kind of a large-scale wealth distribution towards the poor. That’s the sort of thing you can’t study with a randomized control trial.

And in fact, like our social science tools, the causal inference methods that we might hope to use to answer such a question, they all follow a fairly similar structure to a randomized control trial. They’re called natural experiments. We kind of like hope to stumble upon them in real life. The core of the idea is this idea of holding all else constant but changing this one thing, this intervention, so you can evaluate just the impact of that intervention, as opposed to all the other whole things you’re holding constant. You can’t hold societies constant while also completely reforming them with a massive wealth redistribution policy. It’s not something that you can kind of study and predict and use the standard tools of science to be able to say something about what the impact is.

Edmonds: So where does that leave the social scientists? Because it does sound like a plausible claim that redistributing wealth from the rich to the poor would have an effect on crime rates, but you’re saying that there’s no evidence one could adduce one way or the other to test that hypothesis.

Stevenson: I mean, I’m saying that we can’t approach it with the same level of confidence, you know, that you might hope for right? You want to know is this going to work before you try. You don’t ever really have that. I mean, it’s not that you approach it with no knowledge. Obviously, we have theory. We can look at instances in which countries have engaged in some sort of policy, and obviously, to become a country that engages in massive wealth redistribution, there’s a lot that changes, you know, like that’s not something the US or the UK is about to do tomorrow. You have to think about it within this context, within all of the systemic changes that would bring you even to the point where you could even consider such a policy.

I’m not trying to say we know nothing in this world, but there is a tendency in the reform space right now to be very focused on evidence-based change, evidence-based interventions, on piloting some sort of thing before you scale it up, of best practices, things that work, and so forth. And I think that paradigm is just way too unrealistic in terms of knowing in advance the impact of the types of change that are on that scale.

Edmonds: Can I ask you what kind of response you’ve had from social scientists to the article?

Stevenson: They’ve been largely pretty positive, I think. A lot of people have reached out to me to say how much they appreciated it and that it impacted their thinking, or expressed some stuff that they had been feeling for a while but hadn’t fully articulated. And so that was really gratifying. You know, there are some people who, I think, ignored it because it didn’t fully fit into their agenda. I think they see the literature in broad strokes as similar, but they’re more optimistic about our ability to use this toolkit to identify: “Yeah, most things don’t work, but there are a few things that work and that replicate and that scale up, and we just need to keep plugging away to try and find them.” I think that’s kind of the strongest critique that I have heard.

Edmonds: Because potentially it is quite threatening to a lot of social science, because a lot of social science is devoted to tinkering around with systems and then testing it with randomized control trials. And you’re suggesting that’s all going down a blind alley.

Stevenson: I mean, the paper is actually built on that evidence. I wouldn’t say it’s on a blind alley. I say we’ve learned a ton from it. It’s just not what people kind of set out to learn. You set out to learn like which lever to pull to make the engine run better. But what we learn is it’s not really about the levers. I still love this methodology. I love the research people do. I’m a total nerd for econometrics. I mean, I’m working on a project right now that’s using similar causal inference methods to look at the impact of incarceration on wealth. I don’t want people to stop doing it. I feel like there’s, there hasn’t been much introspection, and there has been some claims that are kind of, forward-making claims that are just not supported by the evidence. And I think when you take a look at these claims, you think about social change in a different way. So I think it’s kind of important. Many of us are in this line of work because we care about the world. We want to make the world a better place. We want to think about the best way to do it. And this is valuable information along that path. It’s valuable information in that it shuts some doors. It’s like we tried those doors. The secret is not behind that door. So keep trying other doors, keep experimenting. Like, what’s the next step?

Edmonds: I’m a bit puzzled by that, because you say that you don’t want people to stop trying with this micro interventionist stuff when you’ve presented compelling evidence that it doesn’t work, that they’re wasting their time and their resources and grant money and all the rest of it. Why shouldn’t they stop trying?

Stevenson: OK, so let me rephrase that. I don’t want people to stop trying to figure out how to make the world a better place. That endeavor extends way beyond randomized control trials and has for a long time. There’s a particular set of methodology that I’ve been trained in, the econometrics methodology. I just kind of love it, and I don’t want it to end. And there’s a separate set of questions that are sort of outside of the scope of this paper, but are really important questions which are like, we know this thing is going to have an impact because it’s sort of a direct mechanical impact, but we just don’t know how big it is. Those questions are really important and fascinating, and that is a different set of questions you can answer with the same toolkit. It’s not just like, what is the indirect, lasting, ‘if this ball hits that ball, hits that ball, hits that ball’ kind of impact of an intervention.

You know, in this example that I’m talking about, the impact of incarceration on wealth, taking somebody out of society for a couple of years is going to stop their accumulation of wealth. That’s in the category of pretty much direct impact. You cannot work for two years. You cannot pull in money to pay your mortgage for a couple years. It’s going to impact your wealth. How big is it? And so, using these tools to kind of answer these questions, they’re not like ‘I found this amazing tipping point that’s going to change the world’ type questions. It’s the practical, nitty gritty aspect of doing science, the kind of more basic measurement that I think is important, interesting.

Edmonds: Thank God, so there’s still a role for the social scientist. Megan Stevenson, thank you very much indeed.

Stevenson: Thank you. Take care.
