The role of data science in optimising health and other government services

Date: 2023-03-08

There have been phenomenal medical advances and associated health outcomes in recent years. These have primarily been achieved through improvements in specialised interventions targeting specific conditions – surgical, medical and pharmaceutical.

While these advances have extended life expectancy in many parts of the world, reducing the variation observed across nations, ‘healthy’ or productive life expectancy is yet to catch up. In addition, a larger proportion of the population is dependent on ongoing, specialised and expensive medical care.

The next wave of improvements in health outcomes will come from humanising and contextualising healthcare. Rather than treating a condition in an ever more masterful but narrow sense, treatment will recognise that a person’s health outcomes are strongly influenced by their lived context, with its economic, cultural, geographical, educational and habitual dimensions.

A defining feature of this new perspective will be that the health state of an individual presenting for care is dynamic: it has a past, a present and a future. It therefore requires a pragmatic exploration of the person’s health trajectory up to the present time, within the lived context and all the influencing factors mentioned above. This exploration will reveal insights into the network of cause-and-effect relationships underpinning health outcomes.

The second defining feature will be the understanding that the point at which the patient presents for care is only one node of an interconnected, multi-faceted and complex system, stretching from primary to hospital to palliative care, with many nodes in between.

A systematic interlacing of these two features could lead to the most efficient health system yet created and to the healthiest, most productive society. The first is the health trajectory as a dynamic, multi-factorial phenomenon arising out of the individual’s lived societal context. The second is an integrated appreciation of the health system as a multi-nodal entity, combined with an understanding of the impact of its interactions with other government services.

An ever-increasing capability of this new health system will be to identify future or emerging adverse health outcomes and then mitigate or ameliorate them. Doing so depends on knowing which combinations of care from across the whole spectrum of the system, delivered with what timing, would be most effective, and on coordinating that care as soon as the risks are identified.

From a philosophical perspective, this new thinking can be described as moving away from the reductionism and linearism of British empiricism, which has underpinned much of modern science, towards the more sophisticated ‘the whole is greater than the sum of its parts’ appreciation of German philosophy.

For the health service delivery scientist doing the fieldwork, it does come at a cost – the loss of pristine experimental conditions and the simplicity and transparency of studying the impact of one factor while freezing all others.

However, modern data science (multivariate analyses, statistical and machine learning techniques), operating on extensive data collections and linkage and powered by modern data warehousing and computation platforms, makes it possible to recapture the transparency and interpretability of traditional empirical methods. It does so while overcoming their artificiality and their practical limitations in grasping and wielding the phenomenon of health with its full gamut of moving parts.

Data science: a predictive understanding of an individual’s health trajectory

In the simplest of terms, data science operates on, and produces transformational outcomes for, both aspects of the new approach to healthcare.

In relation to the individual’s health trajectory, data science can generate objective, quantitative, interpretable and comprehensive accounts of health risks in the form of implementable algorithms.

Algorithms such as these have been built for NSW Health. The first, and perhaps the most foundational, is risk of hospitalisation (ROH), computed for each resident of NSW every day and available to NSW Health experts through an electronic platform – the Patient Flow Portal.

ROH shows the likelihood of an individual having an unplanned hospital admission in the next 12 months. It also has 50% better prediction accuracy than an overseas algorithm against which it was benchmarked.

Its improved performance is partly due to its comprehensive capture of real-life factors that impact health: socioeconomic advantage and disadvantage, cultural background, Indigenous status, extensive health service utilisation and medical history.

ROH is designed to better identify patients who would benefit from an integrated care program focused on managing care for patients with chronic physical conditions and co-morbidities. In addition, ROH can separate, in a probabilistic sense, the patient cohorts who will benefit from the program from those who will not.

ROH has proven to be a universal indicator of health vulnerability. For example, it has been the single best predictor of complications (hospitalisation, ICU admissions) resulting from COVID-19 in NSW. The Agency for Clinical Innovation (ACI) has used it as a critical part of its health service triage strategy in relation to the pandemic.

More recently, it has been a better predictor of death than the Charlson Score – the international medical gold standard. The universality of the score as a measure of health vulnerability is directly related to its socially contextualised nature.

Other important algorithms built for NSW Health are readmission risk, risk of death and risk of hospital-acquired complications (HAC).

For example, the readmission risk score computes the likelihood of a patient returning to hospital within 90 days of discharge. As part of this study, simple and effective interventions were identified that reduced this risk significantly. Armed with this score and knowledge of the intervention, a health agency can identify and treat high-risk patients, optimising health and hospital avoidance outcomes per unit of intervention cost.
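The idea of getting the most hospital avoidance per unit of intervention cost can be illustrated with a minimal sketch. It is not the NSW Health method. It assumes, purely for illustration, that the intervention lowers each patient's readmission risk by the same fraction and costs the same for every patient; the function name, parameters and figures are all hypothetical.

```python
def plan_intervention(risks, relative_reduction, cost_per_patient, budget):
    """Pick the highest-risk patients that a fixed budget can cover.

    risks: dict of patient id -> predicted readmission probability.
    relative_reduction: ASSUMED fractional drop in risk from the intervention.
    cost_per_patient: ASSUMED flat cost of treating one patient.
    Returns (selected patient ids, expected readmissions avoided).
    """
    n_affordable = int(budget // cost_per_patient)
    # Highest predicted risk first.
    ranked = sorted(risks, key=risks.get, reverse=True)
    selected = ranked[:n_affordable]
    # Each treated patient contributes risk * relative_reduction avoided events.
    avoided = sum(risks[p] * relative_reduction for p in selected)
    return selected, avoided


# Hypothetical scores, not real data.
risks = {"a": 0.60, "b": 0.10, "c": 0.40, "d": 0.25}
selected, avoided = plan_intervention(
    risks, relative_reduction=0.5, cost_per_patient=100, budget=250
)
print(selected, avoided)
```

Under these assumptions, treating the highest-risk patients first yields the largest expected number of avoided readmissions for a given budget. If the benefit or cost varied by patient, the ranking would need to use expected benefit per dollar instead of raw risk.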

The risk-of-death algorithm supports a proactive approach to identifying patients who may require palliative care, which can significantly improve quality of life at the end of life and reduce the spiralling and avoidable hospitalisations that occur at this stage.

HAC risk helps to identify patients at high risk of acquiring complications in the hospital for targeted preventative interventions. HACs are also an important quality metric against which hospitals are evaluated.

A HAC risk score per patient per admission creates an expected number of HACs per facility (reflecting state-wide performance) against which that facility may be more objectively and accurately evaluated. This would be a significant step up from the current practice of penalising an institution when its reported number of HACs falls outside a designated range.
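One way to picture how per-admission risk scores become a facility-level benchmark is sketched below. It relies on the fact that summing predicted probabilities gives an expected count, and compares that with the observed count. The data layout, function name and figures are hypothetical, and the actual evaluation method may differ.

```python
from collections import defaultdict


def expected_vs_observed(admissions):
    """Summarise HACs per facility.

    admissions: iterable of (facility, hac_risk, had_hac) tuples, where
    hac_risk is the predicted probability of a HAC for that admission and
    had_hac is True if a HAC was actually recorded.
    Returns {facility: (expected, observed, observed / expected)}.
    """
    expected = defaultdict(float)
    observed = defaultdict(int)
    for facility, hac_risk, had_hac in admissions:
        expected[facility] += hac_risk  # sum of probabilities = expected count
        observed[facility] += int(had_hac)
    return {
        f: (
            expected[f],
            observed[f],
            observed[f] / expected[f] if expected[f] > 0 else float("nan"),
        )
        for f in expected
    }


# Hypothetical admissions, not real data.
admissions = [
    ("Hospital A", 0.10, False),
    ("Hospital A", 0.30, True),
    ("Hospital A", 0.60, True),
    ("Hospital B", 0.05, False),
    ("Hospital B", 0.20, True),
]
summary = expected_vs_observed(admissions)
for facility, (exp, obs, ratio) in summary.items():
    print(facility, round(exp, 2), obs, round(ratio, 2))
```

A ratio well above 1 would suggest more HACs than the facility's patient mix predicts, and a ratio below 1 fewer. With so few admissions the ratio is far too noisy to interpret, so a real evaluation would also need a measure of statistical uncertainty.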

Each of these algorithms, in its own way, benefits from social contextualisation. That is, each considers all factors of the lived experience that act on the phenomenon it is designed to quantify and predict. As a result, the algorithms make possible the best-informed interventions at their corresponding point along an individual’s health trajectory.

The most impactful example of knowing the right time to deliver an intervention comes from our primary-tertiary data linkage study. It showed that, all else being equal, a diabetic patient whose GP in the dataset had flagged them as diabetic had a 40% lower risk of being hospitalised in the following 12 months than a diabetic patient for whom there was no GP flag for diabetes. A similar result was observed for chronic kidney disease.

A systematic approach to integrating health care for optimised outcomes

Integrated care programs are some of the most forward-thinking examples of health service delivery and come closest to the approach described here. These programs appreciate the whole health system and deploy multiple nodes to treat a patient, usually one with multiple or chronic conditions or complex vulnerability.

A critical step in elevating integrated care interventions from local, ad hoc and eclectic designs, usually dependent on the enthusiasm of a group of treating clinicians, to a systematic mode of service delivery is establishing rigorous scientific methods capable of evaluating their efficacy. These methods would reveal the causal relationship between nodes of intervention and outcome and the type of patient who benefits. They would also provide a grounding scientific mechanism that makes the ongoing discovery and improvement of these programs possible.

All of the above is made possible by data science and linked datasets. For example, frequent presenters to the emergency department (ED), although small in number, place extensive demands on services, to say nothing of the costs and consequences for the patients themselves.

In addition, EDs are often not well equipped to address the multi-dimensional nature of patient needs and the complex circumstances surrounding repeated presentation. Employing an intensive short-term community-based case management model, the Checkpoint program run in the Nepean Blue Mountains region sought to improve care coordination for this patient group, thereby reducing their reliance on ED.

Among this cohort of patients, there was a 53% reduction in ED presentations in the two years following the intervention.

The first challenge was the synthetic construction of an equivalent comparison group because, unlike experiments, medical interventions do not come with a pre-assigned control group. This was done by capitalising on a linked dataset consisting of 10 years of ED and hospital admission activity together with socio-economic and demographic information.

An algorithm was built to distinguish the patient cohort from the rest of the NSW population, giving each person a distinction score. Think of it as a suitability algorithm for the intervention program. The algorithm was then applied to the population to find, for each actual patient, another person with the same score. This paired the patient cohort with a comparison cohort that had the same suitability for the intervention but did not receive it.
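The matching step described above can be sketched as a greedy one-to-one nearest-neighbour match on the score. This is an illustration only: the article does not say which matching procedure was used, and the caliper (the maximum score difference tolerated), the identifiers and the scores below are all hypothetical.

```python
def match_on_score(treated, pool, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, pool: dicts mapping person id -> suitability score.
    caliper: ASSUMED maximum allowed score difference for a valid match.
    Returns {treated_id: matched_pool_id}; treated people with no candidate
    inside the caliper are left unmatched.
    """
    available = dict(pool)  # copy so the caller's pool is not modified
    matches = {}
    # Process treated people in ascending score order for reproducibility.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        best = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[best] - t_score) <= caliper:
            matches[t_id] = best
            del available[best]  # each comparison person is used at most once
    return matches


# Hypothetical scores, not real data.
treated = {"p1": 0.91, "p2": 0.78}
pool = {"c1": 0.10, "c2": 0.80, "c3": 0.90, "c4": 0.50}
matches = match_on_score(treated, pool)
print(matches)
```

Here p2 is paired with c2 and p1 with c3, while c1 and c4 are too dissimilar to be used. Outcomes such as ED presentations could then be compared between the treated people and their matches. Greedy matching is simple but order-dependent, so a production analysis might prefer an optimal matching method.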

Having established that the intervention was highly effective (with approximate savings of $7100 per patient), the algorithm was then used to find all the other people in our linked dataset who were statistically well-placed to benefit from the program. This means that if the intervention were scaled up state-wide, the algorithm would show where in the state the potential patients are and the likely benefit they would derive.

This creates ideal conditions for an optimal, cost-effective, statewide intervention strategy. Given that the derived benefit may also be a function of the suitability score, there is scope for the statewide implementation to be more effective than the pilot implementation and for ongoing scientific refinement of the intervention.

Use case for other government services

The philosophy of the approach described here, and the role of data science within it, applies in principle to any government department that aims to optimise the societal benefit of its services against a finite budget. In many evaluations that we have carried out, health interacts with education, social services and aged care. For these domains, adopting this approach would amount to swapping health as the focus with another domain from the periphery.

All forms and departments of government would benefit significantly from the establishment of inter-agency linked datasets in Australia. Each government department can benefit from data science in proportion to the quality of its data assets and technological platforms.

A mutually enabling relationship exists between data science and data assets and technologies. The better the assets and technologies, the more enabling data science can be from a business function perspective.

The less obvious aspect is that even with limited data and technologies, carrying out data science will add significant certainty as to how best to grow these aspects and capabilities. It will reveal, in practice, the most fruitful path to take in terms of expanding data assets and the most effective platform technologies to adopt.

Dr Yalchin Oytam led the COVID-19 modelling that guided the Premier and Cabinet’s management of the epidemic in NSW. He holds a PhD in neuroengineering from UNSW.