Data infrastructure is about building the foundations for the way government uses data. Not just now, but in the future. This work deals with some very technical, system-level infrastructure, so why are we doing user research? Surely you only need to worry about that for citizen-facing services?
Well, no. Whatever you build, wherever you build it, there will always be users, be they citizens or other people working in government. And if the service doesn’t meet user needs, it will fail. Data infrastructure carries some particular risks. We’ve found that if you don’t take account of the diverse needs of users across government, you risk making dangerous assumptions: it’s easy to start building for an idealised data future, rather than the reality of how data is actually being used today.
In this post, I wanted to share five lessons we’ve learnt about making data infrastructure user-centred.
Most government service teams don’t really care about data, they care about their service
If we want to get people to change their data habits and use the new pieces of data infrastructure we’re building, such as open registers, we can’t just tell them to do it. We need to show people how we can help them to improve their service. Understanding this means that we’re learning to make the infrastructure relevant and adoptable because it addresses the real-life problems that services face, rather than just asserting how we’d like people to be using data. That would be our need, not theirs.
Sourcing and updating data still relies on people
We’d all love to think that there’s a world of fancy data updates and verification processes magically operating somewhere in government. The reality is, finding and updating data still relies on people talking to people. If someone wants reference data, such as a list of prisons, they’ll probably ask someone in their team. If we don’t take account of this behaviour among users, we impede their chances of finding and using the data sources we’re developing.
In government, spreadsheets are the default model for data
Microsoft Excel is everywhere in government. Most civil servants aren’t using specialised data formats. They’re looking at data in Excel. So when they’re searching for datasets and assessing whether one is useful to them, they need to see it in a way they’re used to: a spreadsheet. Even if the underlying structures we’re creating are richer and more flexible, we’ve learnt that spreadsheets need to be an option for users who prefer to consume data in this format.
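As a minimal sketch of that idea, the structured records behind a dataset can always be flattened into CSV, which opens directly in Excel. The field names and values here are illustrative, not taken from any real register:

```python
import csv
import io

# Illustrative records; a real dataset would hold richer, linked data.
prisons = [
    {"prison": "HMP Example", "town": "Exampleton", "capacity": 500},
    {"prison": "HMP Sample", "town": "Sampleford", "capacity": 320},
]

def to_spreadsheet_csv(records):
    """Flatten structured records into CSV, the format most users expect."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

csv_text = to_spreadsheet_csv(prisons)
```

Offering this alongside richer formats means spreadsheet users aren’t excluded, without constraining how the data is modelled underneath.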
Use metadata sparingly
We thought it would be a great idea to include helpful metadata, so users understand the data before they use it. If you’re unfamiliar with the term, metadata is data about data, such as who created it or why it was created. But most users prefer to see the actual data first before finding out more about the dataset. They may want to see a small amount of specific metadata, like who created the data and when it was updated. But mostly it’s about checking out the data first. Showing users too much metadata upfront is overwhelming and doesn’t reflect how they assess the value of datasets.
Using any third-party data source is about balancing risk and reward
Most services rely on data. However, when teams store their own copies of data, such as lists of prisons or schools, it can be difficult to keep them up to date. Out-of-date data can lead to a poor service experience. So data specialists advocate fetching data from reliable sources using automated services like APIs (application programming interfaces). However, during our research we found that many service teams still prefer to store their own copies of data. Understanding this has altered how we serve data in our services. For example, with open registers, users can choose to download the latest copy of a register rather than use the API. This gives them the benefit of up-to-date data while letting them manage risk in the way that suits their service.
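The download-a-copy pattern can be sketched roughly as below. This is an assumption-laden illustration, not the registers API itself: the CSV content and field names (`key`, `name`, `start-date`) are hypothetical, and in practice the text would come from a scheduled download rather than being hard-coded:

```python
import csv
import io

# A downloaded copy of a hypothetical register in CSV form. In a real
# service this text would be fetched on a schedule (say, nightly),
# rather than calling a live API on every request.
register_csv = """key,name,start-date
1,HMP Example,1990-01-01
2,HMP Sample,2005-06-15
"""

def load_register_copy(csv_text):
    """Index a downloaded register copy by its key for fast lookups."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["key"]: row for row in reader}

prisons = load_register_copy(register_csv)
```

Refreshing the copy on a schedule keeps the data reasonably current, while the service keeps working even if the upstream source is briefly unavailable.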
This article was originally published on the UK Government Digital Service blog.