RCTs as an ethical evaluation choice in pay for success

March 6, 2018 - 2:45pm

In a randomized controlled trial (RCT), eligible people are randomly assigned to either the treatment or control group. While the treatment group receives program services, the control group will receive status quo services or none at all. In theory, proper randomization creates similar treatment and control groups. Assuming that the evaluation is properly designed and implemented, any significant difference in outcomes between the groups is taken to be caused by program services.
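To make the mechanics concrete, here is a minimal sketch of random assignment and a simple comparison of outcomes between the two groups. It is an illustration only, not the evaluation design of any particular PFS project: the function names, the even 50/50 split, and the permutation test used to judge whether a difference is larger than chance would produce are all assumptions made for this example.

```python
import random
from statistics import mean


def assign_groups(participant_ids, seed=0):
    """Randomly split eligible participants into treatment and control groups.

    Hypothetical helper: assumes an even split; real evaluations may use
    other allocation ratios or stratified randomization.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, treatment, control):
    """Average outcome in the treatment group minus the control group."""
    treated_avg = mean(outcomes[i] for i in treatment)
    control_avg = mean(outcomes[i] for i in control)
    return treated_avg - control_avg


def permutation_p_value(outcomes, treatment, control, n_permutations=2000, seed=1):
    """Share of random relabelings whose absolute difference in means is
    at least as large as the observed one (a two-sided permutation test)."""
    rng = random.Random(seed)
    observed = abs(difference_in_means(outcomes, treatment, control))
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(difference_in_means(outcomes, pooled[:n_treat], pooled[n_treat:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations
```

In use, `assign_groups` would be called once on the list of eligible people before services begin, and `outcomes` would be a mapping from each participant ID to the measured outcome (for example, 1 if the person was stably housed after a year and 0 otherwise). A small p-value suggests the gap between the groups is unlikely to be due to chance alone, which is the sense in which a difference is called "significant."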

In pay for success (PFS), also referred to as social impact bonds, RCTs are often hailed as the most rigorous way to determine program impact, build the evidence base of a social program, and confirm that funder repayments are based on actual improvements in participant outcomes. However, skeptics of RCTs often point to the ethical implications of denying program services to people in the control group, especially if they might have benefited. While practical constraints can legitimately preclude an RCT, the points below respond specifically to the ethical concerns around RCTs.

Point: It is wrong to deny services to people who need them.

Counterpoint: Time, funding, and resource constraints mean that not everyone eligible for the program will receive program services—even in the absence of an RCT.

Ideally, everyone should be able to access services that can help improve their lives. But whether in the context of PFS or a traditional fee-for-service program, and regardless of whether an RCT is in place, resource constraints such as funds, staff, time, and geographic boundaries limit the number of people a program can actually serve. Service staff usually respond to limited resources by implementing a first-come, first-served model, using tools like waitlists to delay services for people the program cannot accommodate at that time, or both. The idea of denying services to people who need them because of randomization can be troubling, but because of these resource constraints, not all eligible people would receive program services anyway.

But people in the control group are not necessarily denied program services forever. RCTs are temporarily implemented to determine whether or not a program works. Once the effectiveness of a specific program in a specific place is demonstrated, the need for a control group decreases. Given sufficient resources, service staff can then begin offering program services to more eligible people—including those who were once in the control group.

Point: If a program works, it should be made available to everyone who might benefit from it.

Counterpoint: Rarely is a social program so obviously effective as to justify omitting a strong evaluation.

This argument assumes that the program's intervention will lead to improved outcomes for the target population or, at the very least, will do no harm. However, very few social programs are so obviously effective (given complex policy, social, and built environments), and those that people intuitively expect to succeed may actually have negative consequences for participants. An RCT could protect against continuing to fund an ineffective or even harmful program, or against eliminating a truly effective one. As such, an RCT could save money and protect the target population. For example, the “scared straight” programs of the 1970s took at-risk youth to visit adult prisons in the hopes that the experience would deter them from crime. This seems like a logical assumption. However, RCTs conducted years afterwards found that youth who participated in these programs were nearly 40 percent more likely than peers who did not participate to be incarcerated later in life.

Furthermore, even interventions with a strong evidence base may not produce improved outcomes when scaled in new communities or populations. For example, the PFS Adolescent Behavioral Learning (ABLE) Project for Incarcerated Youth in New York City was based on Moral Reconation Therapy (MRT). The project aimed to reduce the recidivism rates of 16- to 18-year-olds entering Rikers Island Jail. But the one-year assessment found that MRT, despite its strong evidence base, did not yield improved outcomes for incarcerated adolescents at Rikers. The Rikers program demonstrates that context matters: even evidence-based interventions may fall short when scaled in new or particularly challenging environments.

Ethical questions around RCTs are not particular to PFS. But the accurate measurement of success in PFS projects has practical consequences for the numerous stakeholders involved, with investor repayments, program validation, and the well-being of participants on the line. Furthermore, a strong evidence base indicating positive outcomes increases the likelihood of scaling a particular program, attracting investors to future PFS projects, and incorporating the program into government agendas. These implications bring RCTs, and the associated ethical debates, to the forefront in the PFS field.

