Increasing the use of performance targets in government is an appealing idea, but one that should be approached cautiously. Experience shows people can be very creative at gaming the system.
Thanks to their size and conflicting sources of legitimacy, it is common for bureaucracies to have unclear end goals. In an ideal world, the interests of your immediate team, branch, secretary, minister(s), government, stakeholders and the public would all align neatly, but the reality is often far different. Throw in red tape and a range of other barriers to effective cooperation, and it’s no wonder government has a reputation for inertia.
Targets are sometimes proposed as the weapon to fight this malaise. Setting clear goals can communicate that an issue is high priority, pushing everyone in the same direction and hopefully aligning the considerable powers of the public sector. If well designed and used sensibly, targets can be useful. New Zealand has had some success with prescribing 10 whole-of-government targets through its Better Public Service Results program, cutting the number of infants missing out on vaccinations by two-thirds, for example.
But there are plenty of potential pitfalls when it comes to setting targets.
The crime rate
One big problem is that much of the work of government is hard to measure, so setting clear targets can be a big waste of time — or even make the problem worse.
Take policing for example. Tying police performance measures to crime statistics sounds like an obvious idea — high crime rates will prompt police to rethink how they use their resources and push them towards more effective strategies. A high crime rate looks bad for the police commissioner, especially if reducing it is a key performance indicator of the job. Tie senior pay to the statistics and you’ll have an even stronger incentive to deal with the problem.
But while statistics are obviously an important source of information about what’s happening with crime, tying them to performance is probably a bad idea for a few reasons.
First, police don’t control most of the factors that influence crime — unemployment, education, social norms, and so on. There are all sorts of things police can do, but many of the forces that push the crime rate up or down lie outside their control. Performance incentives can’t work when responsibility for the outcome doesn’t sit with the party the targets are imposed on.
Second, tying performance to the statistics creates an incentive to under-record. When the South African government announced it would reduce violent crime by 7-10% per annum, recorded rates of robbery and attempted murder fell by almost 40% in the space of five years — but murder fell by only 8.5%. In the real world these trends tend to move together, and murder is much harder to fudge, so the likeliest explanation is that police under-recorded the other offences. Politicians may be able to claim the crime rate has gone down, but once the fudging comes to light, the public loses trust.
Third, an increased focus on crime often signals that police are taking it more seriously, which can build community trust and lift reporting, so recorded crime can rise even as actual crime falls. This may help explain why reported family violence keeps increasing — in Victoria in 2003-04 there were 28,000 family violence incidents reported to police, a figure that had jumped to 76,500 last year. Domestic Violence Victoria says that while there is still a long way to go to fix the problem, “we believe that increased reporting in Victoria is partly because women are more confident they will be taken seriously, and better responses from Victoria Police”.
Gaming in Targetworld
In 1998 the newly elected Blair government introduced more than 300 performance targets applying across all government departments. The targets were tied to budgetary allocations agreed with the Treasury, and covered everything from local bus reliability to the staffing of the armed forces and the conduct of foreign policy.
Each major target was then broken down into many performance indicators for each organisation. For example, in 2004, 10 high-level targets applying to the Health Department were translated into some 300 lower-level targets for the various public sector health organisations in England.
Starting in 2001, public hospitals and other public health-delivery organisations in England were given star ratings according to their performance against targets and other indicators. Managers whose organisations lost their stars or never gained any could expect to be fired. The Prime Minister’s Delivery Unit was established to closely monitor the top 20 most politically sensitive targets.
This system has been described by Oxford public administration scholar Christopher Hood in Gaming in Targetworld: the targets approach to managing British public services.
The target approach led to some incredible improvements in the statistics, including a reduction in the number of patients waiting 12 months or more for surgical operations in English public hospitals, from more than 40,000 to fewer than 10,000 between 2001 and 2003.
But while there was probably some genuine improvement, there was a huge amount of gaming.
Some organisations flat-out lied about how well they were doing, or manipulated data classifications, such as whether a particular case was a “life threatening emergency” requiring them to be seen within a certain period.
There is evidence that ambulances would wait outside emergency rooms until the hospital was confident the patient could be seen within a four-hour waiting target; in some cases, trolleys in hallways were turned into “beds” by removing their wheels to satisfy the target requiring patients be admitted to a hospital bed within 12 hours of emergency admission.
Two studies found discrepancies of 30% in reported waiting-list performance between the data organisations provided and surveys of patients.
In interviews, public managers also acknowledged that gaming was taking place.
Hood outlines three classic types of gaming.
First is the ratchet effect, where the organisation knows next year’s target will be higher than what they can manage this year, so they underperform this year to make it easier to reach next year’s target.
Second is the threshold effect, where setting the same target across the board gives higher performers no incentive to improve, and might even encourage them to reduce output to just what the target requires.
Third is output distortion or manipulating reported results — “hitting the target and missing the point”.
The ratchet effect was observed in budget negotiations, where some agencies were known to be particularly adept at convincing Treasury to set “virtually unmissable” targets, “as in the case of a long-term cancer-reduction target that was almost certain to be met as a result of decisions to quit smoking that had been made as part of a social trend that had begun a decade or more earlier.” Failure to meet targets also sometimes prompted a lowering of subsequent targets to avoid embarrassment, effectively rewarding failure.
The threshold effect was observed in education, where schools were set test score targets, encouraging teachers to focus disproportionately on students who were just below the target line, and less on their very high- or very low-performing classmates.
Output distortion occurred in all sorts of ways. For example, around 20% of general practitioner clinics met the target that everyone should be able to see a doctor within 48 hours simply by preventing patients from booking appointments more than 24 hours ahead.
Then there’s the potential opportunity cost of everyone striving for a few targets. Focusing significant resources on a few priorities will almost necessarily mean less attention for other problems. This may be a trade-off the government is willing to make — the targets should be high priority problems — but is particularly problematic if the targets are not producing meaningful results.
Ambiguity and motivation
The outcomes required of most public services are hard to quantify.
The problem is nailing down precise performance indicators when it’s hard to know what you’re measuring, or whether it can be measured at all.
“Performance measurement may be appropriate when ambiguity is relatively low, but it is difficult and potentially damaging in settings marked by a high degree of ambiguity,” note public administration scholars Tineke Abma and Mirko Noordegraaf.
“Performance measurement … runs into problems because this method cannot deal with contradictory preferences, contested knowledge, fuzzy means-end relations and unclear relations between outputs and outcomes that characterise some management settings.”
Student test scores are, after all, not what you’re really trying to fix — the point is to improve learning. If test score targets are being met at the expense of learning, then there’s no point. Throw in some debates about what should be learned and why, and suddenly it becomes harder to work out what the targets should be. That doesn’t mean you can’t have targets, just that you need to really think through what you’re actually measuring and how people will respond to the setting of targets.
And to make matters harder still, you have to consider what impact all this might have on staff motivation. Poorly chosen incentives don’t just encourage gaming; they can undermine another powerful tool for getting results — intrinsic motivation. Dysfunction caused by a bad performance regime is a drag on morale. We also know that material incentives in particular can demotivate employees who want to do well because they believe in the mission of the organisation, a common reason people take up jobs in the public sector. Sometimes setting targets can send staff the message that they’re underperforming, which can also damage morale.
Although targets can be a powerful tool for improving performance, they can also be a significant source of dysfunction. So choose your targets carefully and take gaming into consideration.