Reviewing the evidence on evidence-based policy

FEATURE: Evidence-based policy is a simple and alluring concept — who could disagree with the idea of basing decisions on facts? So why don’t we have it yet? Figuring out ‘what works’ can be more complicated than it seems.

Calls for public policy based on evidence are so common as to border on the ritualistic. ‘Evidence-based policy’ is such a woolly — yet self-evidently good — idea that it is nearly impossible to disagree with.

It’s hardly a new idea. Kevin Rudd told public servants early on in his prime ministership that “policy innovation and evidence-based policy-making is at the heart of being a reformist government.” The Blair government in the UK advocated basing reform on evidence, popularising the term “what works”. Deng Xiaoping famously quipped that “it doesn’t matter whether the cat is black or white, as long as it catches mice.” Back in the nineteenth century Florence Nightingale wrote that health decisions “must be tested by results.”

More recently Karen Chester, commissioner and deputy chair at the Productivity Commission, has called it “a critically endangered beast — seldom seen and rarely funded”.

The recurring nature of these incantations stems from the importance of the problems at hand. The failure to use evidence effectively can have disastrous effects, as in the case of sudden infant death syndrome:

“For most of the second half of the twentieth century, new parents were advised by medical professionals to place babies to sleep on their fronts — with advocates such as the popular paediatrician Dr Benjamin Spock explaining this could reduce the risk of infants choking in their sleep if they were to vomit. This practice continued for decades while empirical studies were slowly accumulating evidence that, in fact, babies left to sleep on their fronts might be at higher risk of sudden infant death syndrome (SIDS) than back-sleepers. Finally, in 2005, a systematic review of the literature was published which identified the relative risk of SIDS to be nearly three times higher for front-sleepers. The authors of the review argued that, had a more rigorous review of evidence been done in the 1970s, this ‘might have prevented over 10,000 infant deaths in the UK and at least 50,000 in Europe, the USA, and Australasia’.”

The fact that we are still regularly reminded of the need to move towards evidence-based policy suggests governments have not yet attained this goal. So how should evidence be used in public policy? And what pitfalls do policy makers need to be wary of?

Evidence and values

Evidence does not stand on its own. Clinical medicine is commonly cited by advocates of evidence-based policy as ripe for emulation. But unlike much of medicine, many of the decisions governments face are not merely about doing the same thing better, such as helping people live longer, healthier lives; they are instead subject to tradeoffs between competing goals.

Even in situations where the facts are very important to making a decision and can give a clear indication about what the impact of a policy will be, such as some parts of economics, differing priorities can lead different decision makers to different conclusions. This is, of course, one of the reasons why we entrust political decisions to politicians and not technocrats.

A common mistake is to call on evidence to resolve a policy debate rooted in ideological differences, argues Associate Professor Justin Parkhurst of the London School of Economics, author of the ebook The Politics of Evidence. New medical evidence about abortion may well be found, for example, but it is unlikely to make much difference to policy debates about it.

Policy makers should also be careful about relying on evidence to dismiss the public’s concerns. Although it’s easy to believe ‘we’ are the rational ones and others are wrong, such thinking can obscure the real point of difference, which may be differing values or the misuse of quantitative data to dismiss personal experience. “What gets measured gets done”, as they say.

So rather than trying to shoehorn evidence into fixing an ideological difference, it’s important to keep in mind the limits between values and the means of achieving them. It would not be possible, nor desirable, to reduce the policy making process down to a bloodless technocratic approach. As former Productivity Commission chair Gary Banks says, “Values, interests, personalities, timing, circumstance and happenstance — in short, democracy — determine what actually happens”.

Types of evidence

Overgeneralised thinking about types of evidence in policy can lead to overzealous use of rules of thumb, thinks Parkhurst.

The holdover of looking to medicine to guide how evidence is used has led to “the widespread, and often uncritical, embrace of so-called ‘hierarchies of evidence’ to judge the relevance of evidence to inform policy decisions”, he explains, “typically placing methodologies such as randomised controlled trials or meta-analyses at the top of such hierarchies”.

One of the biggest temptations with randomised controlled trials (RCTs) “may be their allure of providing certainty from their ability to draw conclusions of causal effect.” However, “only those interventions with simple and direct causal pathways will bring such certainty, and this still says nothing of the policy importance of that effect.” Parkhurst urges goal clarification as one of the first steps in deciding what evidence should be used. It’s no use citing an RCT if it’s not relevant to the question at hand: evidence should be appropriate to the goals of the policy.

What is being measured?

Sociological framing of the data should be kept in mind. As Nancy Krieger, professor of social epidemiology at the Harvard TH Chan School of Public Health, argues:

“Label infant mortality a problem of ‘minorities’ and present data only on racial/ethnic differences in rates, and the White poor disappear from view; label it as a ‘poverty’ issue and proffer data stratified only by income, and the impact of racism on people of colour at each income level is hidden from sight; define the ‘race’ or socioeconomic position of the infant solely in terms of the mother’s characteristics, and the contributions of the father’s traits and household class position to patterns of infant mortality likewise will be obscured.”

So context is important. This raises another point on which medicine and many areas of policy differ: whereas the human body tends to respond in predictable ways to medical intervention, the complexity of social systems means that what works in one place may not hold true across different countries, cultures or demographics.

Meta-analyses, which typically combine findings from multiple experiments for a larger sample size and are often considered one of the “gold-standard” study designs, should be approached critically. Systematically reviewing studies to draw out a single result tends to assume the cases are directly comparable, and that participants are reacting to the same stimuli in similar ways.

If not done carefully, a meta-analysis can blur the record. One example is Robert Martinson’s 1974 paper reviewing all published English-language reports on prisoner rehabilitation from a 22-year period, which found “no clear pattern to indicate the efficacy of any particular method of treatment”. A later book by Ray Pawson and Nicholas Tilley noted that this paper, at that point the most-cited in the history of evaluation research, was widely interpreted as concluding that “nothing works” for prison reform.

Instead, they argue that assuming ‘prisoner rehabilitation’ (in this case covering a wide range of interventions, including education and training, drug treatment, counselling, decarceration, sentence variation, community psychotherapy and other approaches) would work in similar ways across all populations set “an impossibly stringent criterion for ‘success’”. In fact, many of the cases Martinson examined did show positive results for certain groups. Rather than taking the evidence to show that rehabilitation does not work, or that examples of success are “isolated”, Pawson and Tilley believe it could be argued that “most things have been found sometimes to work”.

Indeed, Martinson acknowledged that results showing no clear positive outcomes could support completely opposing conclusions: either educators, medical practitioners, counsellors and the like had not yet discovered effective tools for helping rehabilitate offenders, or it is not possible to rehabilitate offenders. The first could support an argument for greater funding, the second an argument that any funding is wasteful. Many politicians went with the latter answer.
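
The pooling problem Pawson and Tilley describe can be illustrated with a few lines of Python. This is a minimal sketch using invented numbers, not Martinson’s data: it assumes three hypothetical programs, each with an effect estimate and a standard error, and combines them with standard inverse-variance (fixed-effect) weighting. Every name and figure below is an assumption made for illustration.

```python
import math

# Hypothetical studies: (label, effect, standard error).
# "Effect" is an invented percentage-point reduction in reoffending;
# a negative value means reoffending rose.
studies = [
    ("education, younger offenders", 8.0, 2.0),
    ("counselling, mixed group", 0.0, 2.0),
    ("intensive supervision, older offenders", -8.0, 2.0),
]


def pooled_effect(studies):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for _, _, se in studies]
    total = sum(weights)
    estimate = sum(w * eff for w, (_, eff, _) in zip(weights, studies)) / total
    return estimate, math.sqrt(1.0 / total)


def cochran_q(studies):
    """Cochran's Q: weighted squared deviations from the pooled estimate.

    A Q far above the number of studies minus one suggests the studies
    are not measuring one common effect.
    """
    estimate, _ = pooled_effect(studies)
    return sum((eff - estimate) ** 2 / se ** 2 for _, eff, se in studies)


estimate, std_error = pooled_effect(studies)
q_statistic = cochran_q(studies)

print(f"pooled effect: {estimate:.2f} (standard error {std_error:.2f})")
print(f"Cochran's Q: {q_statistic:.1f} on {len(studies) - 1} degrees of freedom")
```

In this invented case the pooled effect is zero even though two of the three programs have large effects in opposite directions, and the heterogeneity statistic is far above its degrees of freedom. A headline reading of “no effect” would hide exactly the group-level variation the article describes.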

The right approach

Keeping the limits of evidence in mind, what principles should be employed in making evidence-based policy? There are a few essential ingredients, argues Gary Banks.

High quality research can’t happen overnight, so planning ahead can reduce the chances you’ll need to throw something together quickly for an impatient minister.

Methodology is all-important. Notwithstanding debates about particular approaches, all good methodologies have a number of features in common, Banks suggests:

  • they test a theory or proposition as to why policy action will be effective — ultimately promoting community wellbeing — with the theory also revealing what impacts of the policy should be observed if it is to succeed;
  • they have a serious treatment of the ‘counterfactual’ — namely, what would happen in the absence of any action?;
  • they involve, wherever possible, quantification of impacts (including estimates of how effects vary for different policy ‘doses’ and for different groups);
  • they look at both direct and indirect effects (often it’s the indirect effects that can be most important);
  • they set out the uncertainties and control for other influences that may impact on observed outcomes;
  • they are designed to avoid errors that could occur through self-selection or other sources of bias;
  • they provide for sensitivity tests; and, importantly,
  • they have the ability to be tested and, ideally, replicated by third parties.
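
Two items on this list, the counterfactual and sensitivity tests, can be sketched in a few lines of Python. The example is a hypothetical illustration, not Banks’s own method: it uses difference-in-differences, one common way of approximating a counterfactual, with invented figures for a region that received a program and a similar region that did not. The function names and all numbers are assumptions.

```python
# Invented outcome data (for example, an employment rate in per cent)
# measured before and after a hypothetical program.
treated = {"before": 60.0, "after": 66.0}     # region that received the program
comparison = {"before": 58.0, "after": 61.0}  # similar region that did not


def naive_change(group):
    """Simple before-and-after change, with no counterfactual."""
    return group["after"] - group["before"]


def diff_in_diff(treated, comparison):
    """Change in the treated region minus change in the comparison region.

    Assumes both regions would have followed the same trend without the
    program (the 'parallel trends' assumption).
    """
    return naive_change(treated) - naive_change(comparison)


def sensitivity(treated, comparison, trend_gaps):
    """Re-estimate the effect if the treated region's underlying trend
    differed from the comparison region's by `gap` points."""
    base = diff_in_diff(treated, comparison)
    return {gap: base - gap for gap in trend_gaps}


naive = naive_change(treated)
effect = diff_in_diff(treated, comparison)
scenarios = sensitivity(treated, comparison, [-1.0, 0.0, 1.0])

print(f"naive before/after change: {naive:.1f} points")
print(f"difference-in-differences estimate: {effect:.1f} points")
for gap, value in scenarios.items():
    print(f"  if underlying trends differed by {gap:+.1f}: {value:.1f} points")
```

With these made-up figures the naive comparison credits the program with six points, while netting out the comparison region halves that to three. The estimate is only as good as the assumption that the two regions would otherwise have moved together, which is why the sensitivity check is included, and none of it is possible without baseline measurements taken before the program starts.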

The right approach isn’t just about finding the right answer; it will also help you gain an understanding of the problem to be solved. “Half the battle is understanding the problem. Failure to do this properly is one of the most common causes of policy failure and poor regulation,” Banks says.

Good data is vital. A lack of baseline data for specific populations is still a common problem, making it difficult, or impossible, to know afterwards the impact of an intervention.

Transparency is another key ingredient: governments often dislike the idea of making the data and methodologies behind a particular policy idea publicly available, as this gives critics the opportunity to pick apart their analysis. But it also means serious problems can be discovered before the policy is implemented.

Skilled people are crucial. “You can’t have good evidence, you can’t have good research, without good people. People skilled in quantitative methods and other analysis are especially valuable,” says Banks.

All this depends, of course, on decent resourcing and a government open to good ideas.

So while it’s not all up to public servants themselves to put the correct pieces in place, there is plenty that can be done to incorporate good evidential practices into policymaking. We might never attain an evidence-based utopia, but with hard work and critical thinking, perhaps one day evidence-based policy will be considered a somewhat less endangered beast.
