NHS organisations are investing significantly in data platforms, analytics tools, AI capabilities, and electronic patient records. These investments are well-intentioned and, in the right conditions, genuinely transformative. But there is a condition that is often underestimated, sometimes ignored entirely, and consistently expensive to recover from when it goes wrong.
That condition is data quality. When the underlying data is incomplete, inconsistent, or ungoverned, every investment made on top of it is weakened. Dashboards produce figures that cannot be trusted. AI models learn the wrong patterns. Waiting list data has to be manually validated before it can be acted on. EPR migrations arrive at go-live carrying problems that take months and significant cost to unpick.
This article sets out why data quality is not a technical detail to be handled downstream of strategic decisions. It is a strategic issue in its own right, and one that NHS digital leaders need to own at the most senior level.
What Data Quality Actually Means in an NHS Context
Data quality is often treated as a synonym for accuracy. In practice it covers several distinct properties, each of which matters differently depending on the use case.
Completeness refers to whether the data that should exist actually does. In NHS settings, this is a persistent problem. A significant proportion of patient data is still captured on paper in many Trusts and never transferred to electronic systems. Clinical observations recorded in one system are not visible in another. Referral pathway events are logged inconsistently across sites.
Consistency means that the same concept is defined and recorded in the same way across different systems and departments. In a Trust running over a hundred clinical and operational systems, consistency is rarely achieved without deliberate effort. The same patient can appear under different identifiers in different systems. The same clinical concept can be coded differently by different teams.
Timeliness matters in operational analytics. Data that arrives too late to inform a decision is data that has not delivered value, regardless of its technical accuracy.
Fitness for purpose is the most pragmatic framing. Data that is adequate for one use case may be entirely inadequate for another. Operational reporting can often tolerate imperfections that would render the same data unusable for AI model training or population health analysis.
The question is not whether your data is perfect. It is whether it is fit for the specific decisions you are trying to make with it.
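To make the fitness-for-purpose framing concrete, the sketch below scores a small set of records for completeness and timeliness, then compares those scores against per-use-case thresholds. It is a minimal illustration, not a recommended standard: the field names (`clinical_code`, `recorded_at`, `loaded_at`), the sample records, and the threshold values are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def completeness(records, field):
    """Share of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def timeliness(records, max_lag):
    """Share of records loaded within `max_lag` of being recorded."""
    if not records:
        return 0.0
    on_time = sum(
        1 for r in records if r["loaded_at"] - r["recorded_at"] <= max_lag
    )
    return on_time / len(records)


# Illustrative thresholds only: each organisation would set its own.
THRESHOLDS = {
    "operational_reporting": {"completeness": 0.70, "timeliness": 0.70},
    "ai_model_training": {"completeness": 0.95, "timeliness": 0.50},
}


def fit_for(use_case, scores):
    """True when every score meets the minimum required for `use_case`."""
    required = THRESHOLDS[use_case]
    return all(scores[name] >= minimum for name, minimum in required.items())


# Hypothetical sample: one record lacks a code, one arrives three days late.
records = [
    {"clinical_code": "X1", "recorded_at": datetime(2025, 1, 6, 9), "loaded_at": datetime(2025, 1, 6, 15)},
    {"clinical_code": "X2", "recorded_at": datetime(2025, 1, 6, 10), "loaded_at": datetime(2025, 1, 7, 8)},
    {"clinical_code": "", "recorded_at": datetime(2025, 1, 6, 11), "loaded_at": datetime(2025, 1, 6, 12)},
    {"clinical_code": "X3", "recorded_at": datetime(2025, 1, 6, 12), "loaded_at": datetime(2025, 1, 9, 12)},
]

scores = {
    "completeness": completeness(records, "clinical_code"),
    "timeliness": timeliness(records, timedelta(hours=24)),
}
```

With these assumed thresholds, the same dataset passes for operational reporting and fails for AI model training, which is the point: the data has not changed, only the decision it is being asked to support.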
Where the Costs Show Up
EPR Transitions
The cost of poor data quality becomes most visible, and most financially tangible, at the point of major system transitions. Analysis published in 2026 estimated that repairing data errors following EPR go-lives will cost the NHS at least £13.5 million in that year alone, across nine major acute Trust transitions. That figure covers only the direct remediation cost. It excludes wider productivity losses, internal staff time, delayed benefits realisation, and any patient safety implications.
The most operationally damaging manifestation is disruption to patient tracking lists. When records are duplicated, incomplete, or migrated incorrectly, Trusts lose confidence in their waiting list data at exactly the moment when accurate RTT management matters most. Analysis of previous EPR transitions has found that patient tracking lists can increase by around 25% on average following go-live, reflecting the volume of records that require manual validation before they can be acted on.
NHS England's own RTT publication data has recorded instances where waiting time figures for specific months had to be revised after submission, following EPR go-lives that affected data completeness. These are not technical anomalies. They are predictable consequences of insufficient data quality investment before major system changes.
Analytics and Reporting
Poor data quality degrades the trustworthiness of analytics outputs in ways that are often subtle and sometimes invisible to the people using them. Dashboards built on inconsistent source data can appear to function normally while producing figures that are systematically misleading. Operational decisions taken on the basis of those figures compound the problem over time.
The NHS has invested in predictive risk tools and population health models over a number of years. Several of these initiatives have underperformed or been quietly shelved. In a significant proportion of cases, the cause was not the tool itself. It was the data it depended on: inconsistent quality, limited contextual information, or a narrow source base that left the model generalising poorly to the real population it was meant to serve.
Trust in analytics outputs erodes quickly and recovers slowly. Once clinical or operational staff have experienced dashboards or tools that gave them unreliable information, rebuilding confidence requires both technical improvement and sustained evidence over time.
AI and Advanced Analytics
AI models learn from data. If the data used to train a model contains systematic biases, gaps, or inconsistencies, the model's outputs will reflect those flaws. A deterioration detection model trained on patchy observation data will have blind spots in exactly the clinical situations where complete data is least likely. A demand forecasting model trained on inconsistently coded admission records will produce forecasts that systematically misrepresent demand in certain pathways.
This is not a theoretical concern. It is a documented pattern. NHS organisations that have the strongest data foundations are the ones that are extracting real value from AI and advanced analytics. Those investing in AI tools without first addressing data quality are experiencing the gap between what was promised and what is delivered.
The FDP and National Platforms
The NHS Federated Data Platform depends on quality source data from contributing organisations. Poor quality source data creates problems that the platform's own tools can surface but cannot fix. As the FDP's own governance documentation makes clear, data quality issues must be corrected at the EPR and local system level before the data is ingested. A national platform cannot compensate for problems in the data environments feeding into it.
This has direct implications for Trusts participating in the FDP. The value of FDP tools for waiting list validation, theatre optimisation, and population health depends on the cleanliness of the data each Trust contributes. Investment in local data quality is investment in national platform value.
Why Data Quality Problems Persist
Capture Problems at Source
Many data quality problems originate not in databases or analytics platforms but in the processes by which data is first recorded. Clinical staff working under pressure take shortcuts in documentation. Free-text fields are used where structured codes would be more useful for analytics. Paper-based capture in emergency and ward settings means data is either not digitised at all, or is entered retrospectively with limited accuracy.
These are not primarily technical problems. They are workflow and cultural problems, which is why they cannot be solved by investing in better analytics tools downstream.
Siloed Systems With No Common Standards
A Trust running 300 clinical and operational systems supplied by over 100 vendors will, without deliberate intervention, end up with 300 different approaches to how similar data is captured, coded, and stored. There is no automatic consistency. Achieving it requires defined data standards, enforced at the point of system procurement and configuration, and actively monitored over time.
The absence of a data catalogue or metadata management framework means that even internally, staff often do not know what data exists, where it lives, or whether it means the same thing in two different systems. Discovery becomes archaeology.
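As a rough illustration of what even a very small catalogue makes possible, the sketch below records which system holds which field, what concept it represents, how it is coded, and who owns it, then lists the concepts that are coded under more than one scheme. The `CatalogueEntry` structure, the system names, and the owner titles are hypothetical; a real catalogue would carry far more metadata.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    system: str         # where the field lives
    field_name: str     # what the field is called in that system
    concept: str        # the business or clinical concept it represents
    coding_scheme: str  # how values are coded
    owner: str          # accountable data owner


def conflicting_concepts(entries):
    """Return concepts that are recorded under more than one coding scheme."""
    schemes = defaultdict(set)
    for entry in entries:
        schemes[entry.concept].add(entry.coding_scheme)
    return {
        concept: sorted(found)
        for concept, found in schemes.items()
        if len(found) > 1
    }


# Hypothetical entries for illustration only.
entries = [
    CatalogueEntry("EPR", "diag_code", "diagnosis", "SNOMED CT", "Clinical coding lead"),
    CatalogueEntry("Theatre system", "dx", "diagnosis", "local codes", "Theatres manager"),
    CatalogueEntry("EPR", "nhs_no", "patient identifier", "NHS number", "Head of records"),
    CatalogueEntry("PAS", "nhs_number", "patient identifier", "NHS number", "Head of records"),
]

conflicts = conflicting_concepts(entries)
```

Here `conflicts` identifies `diagnosis` as a concept held under two coding schemes, which is the kind of question that is hard to answer at all when no catalogue exists.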
Governance Gaps
Data quality requires ownership. Without defined data owners and data stewards who are accountable for the quality of specific datasets, problems accumulate without anyone having the authority or responsibility to fix them. The NHS has historically underinvested in data governance roles relative to the scale of the data environment those roles need to manage.
The Data Security and Protection Toolkit requires evidence of data governance arrangements as part of annual compliance. But meeting the DSPT threshold is not the same as having a governance framework that genuinely drives quality improvement over time.
Migration and Integration Risk
Every system integration is an opportunity for data quality degradation. When data moves between systems, transformation rules can introduce errors. When legacy systems are decommissioned, historical data can be lost or corrupted in migration. When new systems are implemented, mapping between old and new data structures is rarely straightforward.
Data quality investment before a migration is consistently more effective and less costly than remediation after one. The chair of the NHS Chief Data and Analytical Officer Network has been explicit on this point: data and analytics leaders need to be involved from the outset of EPR programmes, not brought in after the key decisions have been made.
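One way to surface migration problems before go-live, rather than after it, is a reconciliation check between the legacy extract and the migrated data. The sketch below is a minimal version under stated assumptions: records are plain dictionaries, `pathway_id` is a hypothetical unique key, and the compared fields are illustrative. Values are trimmed and lower-cased before hashing, so differences in case or surrounding whitespace are deliberately not flagged.

```python
import hashlib


def fingerprint(record, fields):
    """Stable hash of the selected fields, ignoring case and outer whitespace."""
    joined = "|".join(str(record.get(f, "")).strip().lower() for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(source, target, key, fields):
    """Compare two record sets by key and report the differences.

    Assumes `key` is unique within each set; duplicate keys would be
    collapsed and should be checked for separately.
    """
    src = {r[key]: fingerprint(r, fields) for r in source}
    tgt = {r[key]: fingerprint(r, fields) for r in target}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical extracts for illustration only.
legacy = [
    {"pathway_id": "P1", "clock_start": "2025-01-06", "status": "open"},
    {"pathway_id": "P2", "clock_start": "2025-01-07", "status": "open"},
    {"pathway_id": "P3", "clock_start": "2025-01-08", "status": "closed"},
]
migrated = [
    {"pathway_id": "P1", "clock_start": "2025-01-06", "status": "Open"},
    {"pathway_id": "P2", "clock_start": "2025-02-07", "status": "open"},
    {"pathway_id": "P4", "clock_start": "2025-01-09", "status": "open"},
]

report = reconcile(legacy, migrated, "pathway_id", ["clock_start", "status"])
```

In this example the report shows one pathway lost in migration, one that appeared from nowhere, and one whose clock start date changed. Run during trial loads, a check of this kind gives data leaders evidence to act on while decisions can still be changed.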
The Architecture Connection
Data quality is inseparable from data architecture. The structural decisions made about how data is stored, how systems are integrated, and how data flows between platforms determine whether quality can be systematically managed or whether it remains a series of individual, reactive fixes.
A fragmented architecture, where data lives in dozens of siloed repositories with inconsistent integration, makes it structurally difficult to establish and maintain quality standards. There is no single place to apply data quality rules. There is no unified view of the data estate. Governance accountability is diffuse.
A cloud-first enterprise data architecture, with defined data flows, a centralised metadata and cataloguing layer, and clear ownership boundaries across clinical, operational, and analytical systems, creates the conditions in which data quality can be systematically managed rather than constantly chased.
This is one reason why investment in data architecture should precede, or run in parallel with, investment in analytics and AI tools. The platform choices made now determine whether data quality is a problem that gets better over time or one that compounds.
Architecture does not guarantee data quality. But poor architecture makes data quality structurally impossible to sustain at scale.
Practical Steps for NHS Digital Leaders
Addressing data quality in a large NHS Trust is not a single project. It is an ongoing programme that requires sustained commitment at leadership level. The most effective starting points are these:
- Establish data ownership. Every major dataset needs an accountable data owner with the authority and responsibility to drive quality improvements. Without ownership, problems are visible but unactioned.
- Invest in data discovery before investing in analytics. Know what data you have, where it lives, how it flows, and what quality problems exist before deciding what to build on top of it.
- Fix quality at source, not downstream. Data quality tools applied to analytics platforms treat symptoms. Improving capture processes, training clinical staff, and enforcing coding standards at the point of entry address causes.
- Build data quality into EPR and system procurement. Require suppliers to demonstrate how their systems support data quality monitoring, metadata management, and standards compliance. Do not treat this as a post-implementation concern.
- Involve data and analytics leadership in major system transitions from the outset. EPR migrations and major platform changes are the highest-risk moments for data quality. Expert involvement before key decisions are made is far less costly than remediation after go-live.
- Monitor continuously, not periodically. Data quality is not a project with an end date. Dashboards, automated checks, and regular governance reviews need to be part of ongoing operations, not annual audits.
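The last step, continuous monitoring, can start very simply. The sketch below compares today's value of a quality measure against its recent average and flags a drop that exceeds a tolerance. The function name `flag_drop`, the 0.05 tolerance, and the sample scores are assumptions for illustration; a production check would need sensible baselines per dataset and a route to the accountable data owner.

```python
from statistics import mean


def flag_drop(history, today, tolerance=0.05):
    """True when today's score falls more than `tolerance` below the mean of `history`."""
    if not history:
        # No baseline yet, so there is nothing to compare against.
        return False
    return today < mean(history) - tolerance


# Hypothetical daily completeness scores for one dataset.
daily_completeness = [0.96, 0.95, 0.97, 0.96]

normal_day = flag_drop(daily_completeness, 0.95)   # small variation, no alert
problem_day = flag_drop(daily_completeness, 0.82)  # sharp drop, raise an alert
```

Run daily against each major dataset, a check like this turns a quality problem into something noticed within a day rather than at the next annual audit.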
Where VE3 Can Help
VE3 works with NHS Trusts to design and implement the data governance frameworks, architecture, and operating models that make sustained data quality achievable. Our work includes data estate discovery, governance framework design, data lifecycle management, and transition roadmap planning for organisations undergoing major digital transformation.
If your Trust is preparing for an EPR go-live, building analytics capability, or seeking to understand the current state of your data estate before making further investment, we would welcome the conversation.





