Digital data is becoming ever more valuable as the raw material of the information age. And just like physical resources, its purity varies.
Poor-quality data creates problems that become costly on a large scale, and while government agencies hope their information can be relied upon, being certain of its integrity is no simple matter.
ACT Health got serious about information integrity and assurance in 2012, after highly damaging revelations that emergency department waiting times were deliberately falsified over several years to make them look better. Three years later, emergency department waiting times for both of Canberra’s hospitals are displayed on the website in real time — as is a wealth of performance information about other areas such as elective surgery — providing a useful community service as well as greater transparency.
One of the most important changes in recent years was the creation of a new role, director of information integrity, in mid-2013, and the recruitment of local academic Charles Palmer to fill it. Since then he has applied his considerable expertise to assessing, reporting on and improving the reliability of all of the agency’s data for the benefit of staff and citizens. But first, there was one important job to do.
“The concern,” he told The Mandarin, “was how on earth would we detect that data had been falsified? And so my original role was to devise a mechanism, which I did using statistical measures, to run on-demand tests on a daily basis, to see if the data had been falsified again.”
Just having such measures in place is, Palmer says, probably enough to stop staff trying to fudge the figures again. ACT auditor-general Maxine Cooper concluded the falsification happened in response to “managerial pressure” but not direct instruction. Inadequate system access controls and practices created the opportunity, and also made it impossible to be certain who was to blame, beyond the one executive who admitted fault.
After that, Palmer explained to his new employers how he could apply his experience and recent academic work all over the agency. He posed a challenging question: “How do we know that we can trust any of the data, and if so, how much?”
“The conversation led to there, and I was offered the role … to offer an integrity assurance to executives on the information that we’re sharing with the Commonwealth, with other health agencies around Australia, and indeed with our consumers,” he said.
“I’m also a member of the open data fora for the ACT government, so I’d like to introduce [information integrity assurance] there, too. This is a long dream, so that when you, the ratepayer, consume data on [the government’s] open data website, I’d like to be able to give you an idea as to how much you can trust it as [the Australian Bureau of Statistics] are partially doing.”
Palmer defines information integrity as “a measure of how much you can trust the information you’re reading in a report” while data quality is more subjective as it just refers to “the absence of intolerable defects”. What is good enough for one person, a nurse for example, may not pass muster for other users of health data.
“I’ve devised a mechanism — which is also my thesis topic — so that, when a consumer receives a report I’ll tell them what the trust levels of the information that comprised that report might be,” he explained.
The thrust of his latest research is “credentialing data as it transforms into information” — after reading through about 1300 peer-reviewed papers over several years, he’s comfortable it is novel work.
“It’s applicable anywhere in government or in industry, but it’s extremely hard to get people to understand it’s a significant investment to make a data warehouse work properly in this fashion,” said Palmer.
“ACT Health are prepared to give it a shot and we’re starting to get some results.”
Based on survey data and a comprehensive literature review, he wrote in his earlier 2011 master’s thesis that large organisations around the world were clearly beset by “a pervasive and costly issue with unknown or unacceptable data quality”.
A quick turnaround
In June this year, the auditor-general returned to the Health Directorate and reported that “governance over the integrity of health data” had improved and “considerable effort” had been made to address most of the very serious issues raised in her 2012 investigation of the false figures:
“Improvements in areas such as training staff in systems usage, managing user access to the systems, procedural improvements and data integrity routines implemented in the Health Directorate data warehouse environment have contributed to improved data integrity.
“However, there is more work to be done with respect to training, documentation and allocation of responsibilities, outcome measures, evaluation, corrective actions and assurance.”
The directorate’s performance information branch was also overhauled. The agency’s 2013-14 annual report explains the aims were:
“… a more streamlined governance and reporting arrangement; enhanced workflows and communication between business units; a strategic information management governance approach; a renewed data quality assurance framework; and a timely, efficient and responsive service for our clients’ needs.”
Palmer’s key contribution is a new 10-stage information architecture. Data, which simply represents real-world measurements, is not what he calls information until it is drawn together from various sources, combined and put into context.
“And so each source system is going to be different in terms of levels of maintenance, management, understanding, data quality assurance, how well it’s extracted, how well it matches our metadata structures, and how well the data is being managed insofar as things like range checks,” he said.
“So, in a maternity system you would be concerned if there was a 90-year-old mother presenting … so we check for things like that and we also check it against the national metadata standard [and] common sources of truth.
“The upshot is a report about the report that says: ‘here’s all of the trust levels in this information so the consumer can decide how much they can rely on the information in the report.’ No one else has done this.”
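The kind of range check Palmer describes for a maternity system can be illustrated with a short Python sketch. It is not ACT Health's implementation: the age bounds, the names `check_maternal_ages` and `RangeCheckResult`, and the use of a simple pass rate as a stand-in for a trust level are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical plausible age range for a mother in a maternity system.
# The article does not state the bounds ACT Health actually uses.
MIN_MATERNAL_AGE = 10
MAX_MATERNAL_AGE = 60


@dataclass
class RangeCheckResult:
    total: int   # number of records checked
    failed: int  # number of records outside the plausible range

    @property
    def pass_rate(self) -> float:
        # Share of records that passed; 1.0 when there is nothing to check.
        if self.total == 0:
            return 1.0
        return (self.total - self.failed) / self.total


def check_maternal_ages(ages):
    """Count records whose maternal age falls outside the plausible range."""
    failed = sum(
        1 for age in ages
        if not MIN_MATERNAL_AGE <= age <= MAX_MATERNAL_AGE
    )
    return RangeCheckResult(total=len(ages), failed=failed)


# Example: one implausible record (a 90-year-old mother) among four.
result = check_maternal_ages([24, 31, 90, 28])
print(f"{result.failed} of {result.total} records failed; "
      f"pass rate {result.pass_rate:.2f}")
```

A figure like this pass rate, computed for each source system, is one simple ingredient that a "report about the report" could carry alongside the information itself. How Palmer's architecture actually derives its trust levels is not described in the article.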