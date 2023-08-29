Assistant minister Andrew Leigh will use an address to the National Press Club to explain how evidence-based assurances in federal policy are set to become the verified norm, setting the stage for programs to be canned if they are shown not to be delivering outcomes as planned.

Making the case for systems and culture change within the APS that can navigate difficult problems, Leigh will call for policymakers to adopt a more modest attitude about failing early and often.

It is part of the government’s plan to build better-quality decision-making capability within the public service and wean teams off expensive contractor services.

“To generate and sustain a culture of continual learning, we need to be open to being proven wrong, and to use that information to do better the next time. We need to accept honest feedback — not pretend to get by,” Leigh said.

“A report from the Australian Evaluation Society estimates that in 2021-22, the commonwealth procured 224 evaluations from external consultants, at a total cost of $52 million.

“However, because not all commissioned evaluations can be identified, this is likely to significantly underestimate the total volume of external evaluations commissioned from consultants by the commonwealth,” he said.

Being confident in evaluations of government programs that were contracted out to the private sector was difficult, Leigh said, because there was little incentive to undertake a high-quality evaluation if the findings might reflect a negative outcome.

“That’s why we’re also encouraging agencies to rebuild their own in-house evaluation capabilities … an insourcing approach that’s consistent with the way that finance minister Katy Gallagher is operating across the Australian government.

“Another reason that consultants’ evaluations may fall short is if they are commissioned to produce evaluations late in the process when there is insufficient planning and data available,” he said.

Outlining the work of the 14-person Australian Centre for Evaluation, recently allocated $2 million in the 2023 federal Budget, Leigh said the group was dedicated to working across government agencies to improve evaluation capabilities, practices and culture.

The new group would be responsible for strengthening APS evaluation planning. Particular focus would be given to the planning stage at the time of new budget proposals, Leigh said, with emphasis on ensuring evaluation was not an afterthought but considered at all stages of policy development.

“Past reports have clearly shown the need to improve the quality of evaluation across government. Work done for the Thodey Review of the public sector found that the quality of evaluation was ‘piecemeal’,” Leigh said.

“Some high-quality evaluations have been conducted, including by the behavioural economics team (BETA) in PM&C. But in many other areas, the capacity to conduct rigorous evaluation is lacking.”

Leigh pointed to successful randomised trials in NSW, which showed recidivism among offenders was curbed when people were given a tailored approach to treating their addiction.

“Drug courts don’t just help addicts — they also make the streets safer,” Leigh said.

“Evaluations seek to answer a range of questions. For example, it is important to know whether a spending program was delivered on time, on budget, and as intended.

“For new programs and small-scale pilots, it is also valuable to seek views from participants or service providers that could help address any weaknesses in the program design and implementation.

“After all, good new policies often under-perform due to poor implementation,” Leigh said.

Dr Leigh said randomised trials dated as far back as 1747, when trials of scurvy treatments were used to save the lives of thousands of sailors. Later trials in the 1940s and 1950s were also relied on to introduce a ‘what works philosophy’ to medical approaches, revising the use of antibiotics to treat the common cold and demonstrating the safety and efficacy of the polio vaccine.

Other more recent randomised trials conducted to test the efficacy of government initiatives, from one run in the UK examining the impact of placing social workers in schools to another testing the take-up and use of freely distributed malaria nets in Africa, helped decision-makers choose whether to continue funding projects based on whether they were working.

The member for Fenner and assistant minister for competition, charities, Treasury and employment also explained that one of the key goals of randomised trials was to understand the counterfactual, or what would have happened had the intervention never occurred in the first place.

What set a well-designed and well-run randomised trial apart was the way in which it approached the counterfactual. Constructing assumptions about counterfactuals was ultimately unhelpful in allowing researchers to understand whether a program worked, for whom, why, or in what circumstances, he said.

“To determine what works, every evaluation — whether we’re talking medicine or policy – is trying to do one simple thing: figure out the counterfactual,” Leigh said.

“What would have happened if you didn’t take the pill, or didn’t participate in the program? This is what we get to see in the movie Sliding Doors, when we follow Gwyneth Paltrow’s two possible lives, according to whether or not she catches the train. It’s what you got as a kid when you re-read a Choose Your Own Adventure book.”

“Any evaluation that assumes the world would otherwise have remained static is likely to produce a flawed result,” he said.

The rub is that better-quality evaluations using randomised trials ultimately cost more. If their purpose is to deliver higher-impact, value-for-money government programs, the investment is a compelling one.

Complicating things further, good evaluations were more likely to highlight failure, an issue not confined to public sector evaluations, Leigh said.

“The fact that failure is more common than success does not suggest that program designers are foolish or careless, but that they’re grappling with problems that are really, really difficult. In the face of major challenges, low-quality evaluation is a hindrance, not a help,” he said.

Leigh will deliver his speech at the NPC later today.