Government agencies must proactively manage the constantly evolving risk of re-identification of public data in order to protect vulnerable members of the community, according to the Queensland Office of the Information Commissioner.
The OIC report, tabled on Tuesday, examined how two unnamed government agencies have managed privacy risks when releasing de-identified data.
The audit detected “critical issues” which could potentially put stakeholders, clients and staff at risk, according to information commissioner Rachael Rangihaeata.
Both of the audited agencies had detailed governance arrangements for public data, but only one had adequate guidance to assist decision-makers when releasing de-identified data. The other agency’s guidance was “not sufficient to support effective re-identification risk management”, which hindered its governance arrangements for managing privacy risks.
Both agencies lacked the governance arrangements needed to regularly monitor and review re-identification risks in de-identified datasets, the report noted.
“Without these arrangements, neither agency can be confident that risk management strategies remain effective over time,” it said.
Four public de-identified datasets from each agency were examined. The audit found neither agency could consistently show how it had developed de-identification techniques and managed re-identification risks in the datasets.
“One agency has sufficient records of re-identification risk management for two examined datasets. The other two datasets from this agency, and all four datasets from the other agency, lack sufficient records. We cannot assess how re-identification risk was managed in these datasets,” it said.
The audit assessed the re-identification risk in the published data, and found a “real risk of re-identification” in three datasets from one agency. Both agencies failed to monitor and review re-identification risk in the datasets.
While de-identification can allow agencies to maximise the information they publish, it “does not guarantee that privacy risks are managed”, as de-identified data can be re-identified, the report said.
The OIC pointed to a 2019 example, in which the Office of the Victorian Information Commissioner issued a report on Public Transport Victoria’s disclosure of myki travel information during a datathon. The report showed how “de-identified data releases can result in serious privacy breaches when appropriate controls are absent or ineffective”.
Similarly, in 2018 the Office of the Australian Information Commissioner found that the Department of Health had breached the Australian Privacy Principles when releasing de-identified Medicare Benefits Schedule and Pharmaceutical Benefits Scheme data, the report noted.
“In both instances, agencies had applied a range of de-identification techniques to the data to protect individuals. Despite de-identification, both examples experienced re-identification events,” the report said.
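One common way to see why removing names alone may not be enough is to count how many records share the same combination of indirect attributes, often called quasi-identifiers. The sketch below is illustrative only and is not drawn from the OIC report: the rows are fabricated, and the choice of `postcode`, `birth_year` and `sex` as quasi-identifiers is an assumption. It reports the smallest group size (the "k" in k-anonymity) and lists any record that is unique on those attributes.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group, i.e. the dataset's k."""
    if not records:
        raise ValueError("records must not be empty")
    return min(group_sizes(records, quasi_identifiers).values())


def unique_records(records, quasi_identifiers):
    """Return records whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    ]


# Fabricated rows for illustration; direct identifiers (names) already removed.
records = [
    {"postcode": "4000", "birth_year": 1985, "sex": "F", "service": "A"},
    {"postcode": "4000", "birth_year": 1985, "sex": "F", "service": "B"},
    {"postcode": "4000", "birth_year": 1990, "sex": "M", "service": "A"},
    {"postcode": "4870", "birth_year": 1972, "sex": "F", "service": "C"},
]

# Assumed quasi-identifiers; a real assessment would choose these per dataset.
QUASI_IDENTIFIERS = ["postcode", "birth_year", "sex"]

k = smallest_group_size(records, QUASI_IDENTIFIERS)
singled_out = unique_records(records, QUASI_IDENTIFIERS)
print(f"k = {k}; records unique on quasi-identifiers: {len(singled_out)}")
```

A value of k equal to 1 means at least one person could be singled out by anyone who knows those attributes from another source. A check like this covers only one aspect of re-identification risk and would not replace a full assessment.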
Rangihaeata argued that agencies that publish de-identified data must manage privacy risks the same way they manage risks in other activities.
“Re-identification of public data shouldn’t be as easy as a Guess Who game. We expect agencies to proactively manage this risk to protect vulnerable people, including victims of family and domestic violence,” she said.
“De-identification is technically complex and involves more than removing direct identifiers. The external environment is constantly evolving and can make assessing the re-identification risk challenging.”
To keep up with the changing environment — in which “privacy risks are not static” — agencies must regularly review privacy risks and assess the effectiveness of risk treatments, the report said.
Documenting risk assessments and the reasons for selecting risk treatments can also support regular monitoring and review.
Meanwhile, a methodical risk management approach — aided by sound governance arrangements — can assure agencies that the risk treatments applied to de-identified public data remain effective over time.
The report recommended all government agencies review all published data and identify datasets containing de-identified data.
It also made four recommendations to government agencies that publish de-identified data:
- Assign a custodian to each published de-identified dataset and capture this information in a register.
- Implement and maintain policies or procedures that govern de-identified data releases, including guidance to decision-makers.
- Monitor the external data environment and the effectiveness of risk treatments, and regularly review existing de-identified datasets for changes in re-identification risk.
- Manage privacy when publishing de-identified data by adequately capturing, assessing and treating re-identification risk.
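As a rough illustration of how the register and regular-review recommendations might fit together, the sketch below keeps one entry per published dataset and flags those whose last re-identification risk review is older than a chosen interval. Everything in it is hypothetical: the `DatasetEntry` fields, the dataset and custodian names, and the 365-day interval are assumptions, since the report calls for regular review without prescribing a format or frequency.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DatasetEntry:
    """One row in a hypothetical register of published de-identified datasets."""
    name: str
    custodian: str
    last_risk_review: date


def overdue_for_review(register, today, max_age_days=365):
    """Return entries last reviewed more than max_age_days before today.

    The 365-day default is an assumption for illustration only.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [entry for entry in register if entry.last_risk_review < cutoff]


# Fabricated register entries for illustration.
register = [
    DatasetEntry("service-usage-2022", "Data Governance Team", date(2022, 3, 1)),
    DatasetEntry("grants-2023", "Finance Unit", date(2024, 1, 15)),
]

overdue = overdue_for_review(register, today=date(2024, 6, 1))
for entry in overdue:
    print(f"{entry.name}: review overdue, custodian {entry.custodian}")
```

Recording the custodian next to the review date means an overdue entry points to a named owner who can act on it.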