Despite good intentions, public services – whether it be in education, criminal justice, children’s services, or anywhere else – won’t always achieve the outcomes we strive for. Or, to put it simply, sometimes things just don’t “work”.
With respect to many areas of public spending and practice, we simply don't know enough to be confident whether or not it is making a difference. But there's no reason that we need to stumble around in the dark, uncertain that what we're doing is having the desired impact. It's possible to identify what works and what doesn't, rather than simply hoping for the best.
In the same way that medicine is rigorously tested in scientific trials and demonstrated to be beneficial before being used in practice, we can – and should – rigorously test our public services. And in the same way that your doctor consults evidence-based guidelines before writing a prescription, decision-makers in social policy can consult the best available evidence when deciding how to act. Not only is this possible, but it's also essential if we are to meaningfully improve outcomes and spend taxpayers' money responsibly.
By looking at the track record of whether a particular sort of service has been successful in the past – as illuminated by high-quality evidence – we can make sure that we’re delivering the services that have the best chance of working. We can also stop doing things that have a track record of failure, and we can adapt existing services to make them more effective.
But how do you go about making high-quality evidence-informed decisions? A big part of the work of the UK What Works Centres involves figuring out how best to support decision-makers to do this across varying areas of public policy. Although specifics will always vary by context, in our experience at the Early Intervention Foundation (EIF), there are three cross-cutting considerations that are crucial when using evidence to help decide which services to deliver:
1. Considering the strength of evidence – is the evidence trustworthy?
When delving into the evidence, you may identify a stack of studies or reports (or “evaluations”) that have something to say about the effectiveness of a range of services.
However, not all evidence is equal, and not all evaluations are trustworthy. A flawed study may conclude that a service is effective when, in reality, it has produced no benefits, or is even harmful. So where do you start? And which evaluations can we trust?
The “gold standard” for assessing effectiveness is the randomised controlled trial (RCT); these studies, when conducted well, give the most trustworthy estimates of how effective a service has been.
The special sauce of the RCT is that it tells us if a service has been effective by allowing us to compare what happens when a service is delivered to what would have happened had the service not been delivered — using the power of randomisation to create a control group. There are also other strong designs in the evaluator’s toolkit such as quasi-experimental design evaluations (or QEDs). These designs, like RCTs, attempt to make trustworthy claims about the impact of a service by using a control group, though this is created using statistical methods (such as matching) rather than randomisation.
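To make the logic of randomisation concrete, here is a minimal sketch in Python, using entirely invented numbers: because each person is assigned to treatment or control by the equivalent of a coin flip, the two groups are comparable on average, so the difference in their mean outcomes estimates the service's impact.

```python
import random
import statistics

def run_rct(population, true_effect, seed=0):
    """Simulate a simple RCT: randomly assign people to treatment
    or control, then compare average outcomes between the groups."""
    rng = random.Random(seed)
    treatment, control = [], []
    for baseline in population:
        # Randomisation: each person has an equal chance of either group.
        if rng.random() < 0.5:
            treatment.append(baseline + true_effect)  # received the service
        else:
            control.append(baseline)  # what would have happened without it
    # The estimated impact is the difference in group means.
    return statistics.mean(treatment) - statistics.mean(control)

# Hypothetical outcome scores for 1,000 people; the service's true
# effect is set to 5, so a good estimate should land close to 5.
population = [random.Random(i).gauss(50, 10) for i in range(1000)]
estimate = run_rct(population, true_effect=5)
```

With a large enough sample, the estimate recovers the true effect closely; the same comparison without randomisation (say, letting people opt in) would confound the effect of the service with the characteristics of the people who chose it.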
Services that have been found to be effective in multiple high-quality RCTs or QEDs (often grouped together in research studies called “systematic reviews”, or “meta-analyses”) have a track record of success and represent a “best bet” about what might work in the future based on the existing evidence.
However, not all RCTs or QEDs are perfect, and some are more trustworthy than others. It’s often difficult to unpick which is which. Where available, it’s best to rely on existing work. The UK What Works Centres provide online registries of evidence such as the EIF Guidebook and the EEF Toolkit to assist decision-makers across different areas of social policy in deciphering which evaluations are trustworthy.
2. Considering the ‘fit’ of the evidence – is the evidence relevant to my situation?
Services that are backed by high-quality RCTs or QEDs represent a “best bet” about what might work in the future. But even the best RCT can't guarantee a service will work in the future and in all situations. The evidence can only tell you that the service worked in the past, delivered in a specific way, to a specific group of people, in a specific place.
To make a high-quality, evidence-informed decision, you need to use your professional judgement to assess the comparability – or fit – between your situation (where and how you want to deliver the service) and the situations in which a service has been evaluated and shown to be effective in the past. The more comparable they are, the more likely it is that you will replicate the successes of the past. If you don't consider fit when deciding which service to deliver, it's less likely that you will successfully achieve the outcomes you hope for.
In particular, you need to think about the population you intend to deliver the service to — do the people you want to help with this service resemble those who it has helped in the past? The service might work well for people with certain characteristics — in terms of age, sex, ethnicity, socio-demographic status, and so on — but not for others. You also need to think about your context, and implementation.
When selecting a service, it’s essential to know whether you can successfully deliver it under conditions comparable to those when it was demonstrated to be effective. Changes to implementation or context (like the setting of delivery, who delivers the service, and for how long) may compromise a service’s effectiveness.
3. Considering the size of effects and cost — is the evidence demonstrating meaningful improvements?
The next step is to look carefully at what the best and most relevant evidence says about how much beneficial impact the services have. High-quality evaluations will report numerical estimates of impact, known as “effects” or “effect sizes”.
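For illustration, one common effect size is the standardised mean difference (often called Cohen's d): the gap between the treatment and control group averages, scaled by how spread out the scores are. A rough sketch, using invented test scores:

```python
import statistics

def cohens_d(treatment, control):
    """Standardised mean difference (Cohen's d), a common way
    evaluations report effect sizes."""
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    # Pooled standard deviation across the two groups.
    n1, n2 = len(treatment), len(control)
    s1, s2 = statistics.stdev(treatment), statistics.stdev(control)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return mean_diff / pooled_sd

# Illustrative reading-test scores (made up for this sketch).
treated = [54, 58, 61, 55, 60, 57]
untreated = [50, 52, 55, 49, 53, 51]
d = cohens_d(treated, untreated)
```

Expressing impact on this standardised scale is what allows systematic reviews and registries to compare effects across services that measured quite different outcomes.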
How you interpret effects is rarely straightforward, and will vary depending on the sort of service you're looking to implement and why. But it is vital that they are considered. Some services may have strong, trustworthy RCT evidence suggesting that they actually haven't achieved very much (perhaps only small or negligible effects), whereas other services may have a strong record of producing practically meaningful or relatively large effects – the most effective services must be prioritised.
Bigger isn’t always better though, and it is important to consider what benefits can be achieved at what cost. For instance, if two services have a track record of producing similarly sized effects but one is very expensive and one is cheaper, then — all else being equal — it is clearly in the public interest to pursue the less expensive service.
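As a toy illustration of that trade-off (all figures invented, not real service data), the comparison can be expressed as benefit achieved per pound spent:

```python
# Two hypothetical services with similar effects but different costs.
services = {
    "Service A": {"effect_size": 0.30, "cost_per_person": 1200},
    "Service B": {"effect_size": 0.28, "cost_per_person": 400},
}

def effect_per_pound(service):
    """A crude value-for-money index: effect size per pound spent."""
    return service["effect_size"] / service["cost_per_person"]

# All else being equal, the better-value service wins.
best_value = max(services, key=lambda name: effect_per_pound(services[name]))
```

Here Service B produces a slightly smaller effect at a third of the cost, so for a fixed budget it would reach roughly three times as many people – which is why cost must sit alongside effect size in the decision.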
None of this is easy. Decision-makers aren't on their own though, and the UK What Works Centres endeavour to make those difficult decisions less difficult by providing accessible information on the trustworthiness and relevance of evidence, along with information on the benefits and costs of services. Registries of evidence can't make your decision for you, but they can shine a light on what might otherwise seem like a murky and fraught road to successful policymaking, and help you to achieve real and meaningful change.