Digital Transformation

The Ontology Problem in NHS Data: Why AI Will Get It Wrong Until You Solve This First

Pamela Sengupta
March 25, 2026

There is a seductive logic to the way NHS AI is currently being discussed. The argument runs something like this: the NHS has extraordinary amounts of data covering the health of millions of people across decades of clinical encounters (Activity in the NHS, 2024). Modern AI systems are extraordinarily powerful at finding patterns in huge datasets; therefore, if we can just get the data onto the right platform, the AI will do the rest. The ten-year health plan gestures toward this vision (AI rollout in NHS hospitals stalled by major implementation challenges, study finds, 2025). The Federated Data Platform (FDP) is described as the infrastructure that will make it possible. Generative AI tools are proliferating across clinical settings with remarkable speed.

The problem with this logic is not in its conclusion but in the step it skips. That step is ontology, or more precisely, the widespread absence of a coherent, formally defined semantic model of what NHS clinical data actually means. No amount of computing power can compensate for this absence, and the NHS has been accumulating the deficit for decades without fully reckoning with the consequences. Until it does, even the most sophisticated AI models deployed on NHS data will produce outputs that are technically impressive but operationally unreliable, whose errors are difficult to detect and potentially dangerous to act on.

1. What Ontology Actually Means in a Clinical Context

An ontology is not simply a vocabulary or a taxonomy. It is a formal model of the concepts that exist within a domain, the relationships between those concepts, and the rules that govern how they interact. In healthcare, an ontology tells a system not just that a patient had a knee operation, but what kind of operation it was, how that operation relates to the diagnosis that preceded it, what the expected care pathway looks like following it, which clinical codes are associated with it, how it affects the patient's referral-to-treatment clock, what theatre resources it typically requires, and how it relates to the dozens of other concepts a clinician or analyst might need to reason about in connection with that event.

Without this relational structure, data is just records. With it, data becomes something a system can reason about following chains of semantic meaning rather than matching patterns in raw numbers. This distinction is what separates AI that produces clinically actionable insight from AI that produces statistically interesting noise.
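The distinction can be made concrete with a small sketch. The concept graph below is purely illustrative (the concept names, relationship types, and codes are examples chosen for this post, not drawn from any real trust's ontology), but it shows what "following chains of semantic meaning" looks like in practice: typed relationships let a system answer semantic questions rather than match strings.

```python
from collections import defaultdict

# Illustrative concept graph: nodes are clinical concepts, edges are
# typed semantic relationships. All names and codes are examples only.
edges = [
    ("knee_arthroplasty", "treats", "knee_osteoarthritis"),
    ("knee_arthroplasty", "coded_as", "OPCS-4:W40"),
    ("knee_arthroplasty", "stops_clock", "rtt_pathway"),
    ("knee_arthroplasty", "requires", "orthopaedic_theatre_session"),
    ("knee_osteoarthritis", "coded_as", "ICD-10:M17"),
]

graph = defaultdict(list)
for subject, relation, obj in edges:
    graph[subject].append((relation, obj))

def related(concept: str, relation: str) -> list[str]:
    """Follow one typed relationship outward from a concept."""
    return [obj for rel, obj in graph[concept] if rel == relation]

# A system with this structure can answer semantic questions:
# which diagnosis does this procedure treat? which codes represent it?
print(related("knee_arthroplasty", "treats"))    # ['knee_osteoarthritis']
print(related("knee_arthroplasty", "coded_as"))  # ['OPCS-4:W40']
```

A real ontology adds cardinality rules, hierarchies, and constraints on top of this, but the core idea is the same: the relationships, not the records, are what the system reasons over.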

The Gap Between Coding Standards and Working Ontology

The NHS has, to its credit, developed extensive coding standards over the years. SNOMED CT is the mandated clinical terminology for the health service, providing a comprehensive hierarchical structure of clinical concepts. ICD-10 and ICD-11 provide disease classifications. OPCS-4 codes procedures. NHS Data Dictionary elements define the administrative and operational concepts used in national reporting. These are valuable assets.

But having a terminology standard and having a functioning ontology for a given trust's data environment are not the same thing, and the gap between them is where most NHS AI projects quietly founder. Terminology standards define what things can be called. Ontology defines what things mean in relation to each other, within the specific operational context of a particular organisation, based on how the data has actually been collected and how the business processes that generated it actually work. That second layer of meaning cannot be imported from a national code list. It has to be built, domain by domain, with clinical and operational expertise that sits inside the organisation or with partners who have acquired it through sustained engagement with NHS environments.

Terminology standards define what things can be called. Ontology defines what they mean in relation to each other. The NHS has the former. Most trusts are still building the latter.

2. How the Ontology Gap Manifests in Practice

The core of the problem is that NHS trusts have been collecting data for decades in systems designed to serve operational purposes, not to support machine reasoning. An EPR records a diagnosis in ICD-10 because national reporting requires it, but the way that code gets entered, what it gets associated with in the system, and how it relates to other data elements recorded in the same encounter varies between trusts, between departments within the same trust, between clinicians within the same department, and sometimes between different episodes for the same patient.

A SNOMED code for a particular condition might appear in one system with one set of associated attributes and in another system with a completely different set. The same clinical concept might be represented using five different coding conventions across five source systems feeding a trust's analytical environment. (Clinical Terminologies, 2025) The operational systems did not need to resolve these inconsistencies because each served its own narrow purpose. The AI model, however, needs to reason across all of them simultaneously and it has no way to do this reliably unless the data has been given a coherent semantic structure before it arrives.
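One way to picture the semantic layer the text describes is as an explicit normalisation step that sits between source systems and any model. The sketch below is a deliberately minimal illustration under assumed names: the source systems, local codes, and mapping table are hypothetical, and the SNOMED concept identifier is used only as an example of a canonical target.

```python
# Illustrative only: one clinical concept arriving from several source
# systems under different conventions, normalised to a single canonical
# identifier before any model sees it. System names, local codes, and
# mappings are hypothetical examples, not a real trust's configuration.
CANONICAL = {
    ("pas", "DM2"):             "44054006",  # local PAS abbreviation
    ("epr", "E11"):             "44054006",  # ICD-10 code in the EPR feed
    ("pathology", "T2DM"):      "44054006",  # local free-text convention
    ("gp_extract", "44054006"): "44054006",  # already SNOMED-coded
}

def normalise(source_system: str, raw_code: str) -> str:
    """Resolve a source-specific code to its canonical concept."""
    concept = CANONICAL.get((source_system, raw_code.strip()))
    if concept is None:
        # Refuse to guess: unmapped codes go to data quality review
        # rather than being passed silently to a model.
        raise KeyError(f"unmapped code {raw_code!r} from {source_system}")
    return concept

# Five representations, one concept:
assert normalise("epr", "E11") == normalise("pathology", "T2DM")
```

The important design choice is the failure mode: an unmapped code raises an error for human review instead of flowing through as a new, spurious concept, which is exactly how semantic drift accumulates.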

The Transfer Learning Trap

This inconsistency is evident in the pattern of NHS AI projects that produce impressive pilots but fail to generalise. (Tytler, 2026) A model trained on data from one trust to predict emergency department attendances fails to transfer to another trust because the way ED attendance data is recorded differs in ways that are invisible to human reviewers but large enough to fundamentally confuse a model learning patterns. A predictive model for delayed transfers of care produces different recommendations when run against data from two departments within the same trust because the definition of a delayed discharge is applied inconsistently. (D'Amour et al., 2020)

A waiting list management tool learns to optimise for a particular definition of RTT clock start that turns out to reflect a historical coding practice rather than current clinical policy. These are not AI failures in the sense of algorithmic inadequacy. They are data failures, and specifically ontological failures: the absence of a consistent, formally defined model of what the concepts in the data actually mean.

Fifteen Domains, Fifteen Opportunities for Semantic Drift

In an acute trust, the ontological challenge extends across at least fifteen clinical and operational domains: inpatient care, outpatient management, emergency attendance, theatre services, maternity, critical care, diagnostics, pharmacy, radiology, workforce, finance, community care, RTT, cancer pathways, and supply chain. Each domain has its own data characteristics, its own source systems, its own coding conventions, and its own definitional complexities that have often evolved differently even within the same organisation.

Building an ontology that correctly models all fifteen domains, captures the relationships between them, and is grounded in how the trust's source systems actually encode clinical reality is the work that makes AI reliable. It is also work that requires a combination of clinical knowledge, data architecture expertise, and Foundry-specific technical capability that very few organisations or vendors currently hold in sufficient depth.

3. What the FDP Architecture Provides and What It Cannot

The NHS Federated Data Platform's approach to this problem is architecturally correct in its intent. The Canonical Data Model, commissioned by NHS England as part of the FDP programme, is owned by the NHS — explicitly separate from Palantir's intellectual property — and is designed to provide a standardised semantic layer across the platform. Palantir Foundry's Ontology Manager provides the tooling to instantiate and manage ontological relationships within a trust or ICB's FDP instance.

The FDP is, in principle, designed to be a platform on which good ontological practice can be built. The challenge is that having the tools and the framework is not the same as having the implementation. The national CDM sets standards at a level designed for cross-organisational interoperability, but for AI to work reliably at the trust level, the ontology needs to be instantiated at a much more granular level of clinical and operational detail than any national programme can specify from the centre.

What Trust-Level Ontological Depth Looks Like

Consider what it means to model the inpatient episode in the depth that AI requires. An inpatient episode is not just a date of admission and a date of discharge. It involves a referring source, an admission method, a specialty, a consultant, a sequence of ward moves, a set of clinical observations, a set of investigations, a set of interventions, a diagnosis coded at multiple levels of specificity, a set of procedures, a discharge destination, and a connection to the RTT pathway that the episode may have started, continued, or completed.

The ontological model needs to capture all of these elements, their relationships to each other, and their relationships to the concepts in other domains they connect with — a theatre booking that precedes the episode, a diagnostic result that informed it, a community care package that follows it. Multiply this level of modelling across fifteen clinical domains, and the scale of the work becomes clear. It is not a one-time implementation exercise. It is an ongoing programme of semantic engineering that requires sustained clinical and technical expertise.
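To make the scale of this modelling concrete, here is a sketch of the inpatient episode object described above as a typed structure. This is a fragment, not a complete model: the field names are illustrative, and a real ontology would add constraints, hierarchies, and version history on top of the raw shape.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Sketch of the inpatient episode described in the text, with explicit
# links into other domains. All field names are illustrative.

@dataclass
class WardStay:
    ward: str
    start: date
    end: Optional[date]  # None while the stay is ongoing

@dataclass
class InpatientEpisode:
    admission_date: date
    discharge_date: Optional[date]
    referral_source: str              # e.g. GP referral, ED conversion
    admission_method: str             # elective vs emergency
    specialty: str
    consultant: str
    ward_stays: list[WardStay] = field(default_factory=list)
    diagnoses: list[str] = field(default_factory=list)   # coded, primary first
    procedures: list[str] = field(default_factory=list)  # coded
    discharge_destination: Optional[str] = None
    # Cross-domain links: the relationships the text argues the
    # ontology must hold explicitly, not leave implicit in the data.
    theatre_booking_id: Optional[str] = None    # theatre domain
    rtt_pathway_id: Optional[str] = None        # RTT domain
    community_package_id: Optional[str] = None  # community care domain
```

Even this toy version surfaces the modelling questions that matter: is a ward move an attribute of the episode or an entity in its own right? Is the RTT link a property or a relationship with its own attributes? Those decisions, repeated across fifteen domains, are the semantic engineering the text describes.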

Without the right ontology and depth of data model, AI will get confused. If the input is wrong, the model will not correct it by itself. That is where we are losing value right now across NHS AI implementations.

4. The Clinical and Regulatory Consequences of Getting This Wrong

When Unreliable AI Meets Clinical Decision-Making

The consequences of deploying AI against data that lacks proper ontological grounding are not abstract. They are clinical and organisational. When an AI system produces a recommendation based on data encoding inconsistent definitions, the recommendation may be systematically wrong in ways that are hard to detect. A model that recommends theatre schedule changes to improve utilisation, but was trained on theatre data where procedure codes were applied inconsistently across surgical specialities, will optimise for a definition of utilisation that does not accurately reflect operational reality.

Clinicians who interact with these recommendations and find them unreliable will lose confidence in AI-assisted decision support across the board, creating precisely the algorithmic aversion that health informatics researchers have identified as one of the most significant barriers to AI adoption in healthcare. NHS England paused the Foresight AI project in mid-2025 following concerns about GP data governance, a high-profile illustration of how trust in AI systems can collapse rapidly when the data foundations are not sufficiently transparent or well-governed. The ontology problem does not just create bad AI outputs. It creates distrust in AI as a tool, at exactly the moment when the NHS needs to be building confidence in it.

The Regulatory Pressure Is Already Arriving

There is also a regulatory dimension that is increasingly impossible to ignore. The MHRA is developing frameworks for AI as a medical device, with the expectation that clinical AI systems will be transparent, explainable, and auditable. The NHS AI safety gap has been explicitly identified by clinical informatics leaders as a critical priority, with calls for standards that address AI failure modes, including performance drift and hidden coordination that static documentation frameworks have not yet caught up with.

Explainability in the context of a clinical AI model means being able to show not just what the model recommended but why, in terms that clinical and governance stakeholders can understand and challenge. This requires that the data on which the model was trained, and the data against which it makes recommendations, are structured in a way that allows meaningful explanation. A model trained on inconsistently coded clinical data without a clear ontological structure cannot produce explanations that will satisfy the regulatory environment emerging from both the MHRA and the broader NHS AI governance framework. Building ontological foundations now is not a theoretical exercise; it is preparation for a regulatory future that is already arriving.

5. Ontology as Infrastructure: The Cumulative Investment Argument

The good news is that ontological work, done well, is cumulative. A well-built ontology for one acute domain — once validated against live data and refined through real analytical use — becomes a reusable asset. The acute CDM being developed by Pathfinder trusts working with FDP today is, if built to a transferable standard, something that dozens of trusts will be able to adopt and adapt rather than building from scratch.

The solution exchange model that Palantir is developing for the FDP, which allows analytical applications built on top of a standard ontological layer to be packaged and distributed across trusts, only works if that ontological layer is genuinely consistent and well-grounded. An AI model for RTT pathway prediction, built on a trust's FDP instance with a rigorously defined ontology for the referral-to-treatment domain, can in principle be packaged and made available to other trusts whose data is mapped to the same CDM. This is the network effect of getting ontology right: investments made by early adopters compound into shared infrastructure that the whole system benefits from.

Sequencing the Work: What Minimum Viable Ontology Looks Like

The practical question for NHS CDOs and data transformation leads is not whether to invest in ontology but how to sequence the work and what the minimum viable ontological foundation looks like before AI deployment becomes appropriate. The answer varies by use case, but a working principle is this: any AI application intended to influence clinical or operational decisions in a live environment requires an ontological model of sufficient depth and consistency to support the specific reasoning the AI needs to perform.

A waiting list management tool requires a coherent model of the RTT pathway, including the relationships between referrals, appointments, and procedures, and the definitions of clock start, clock stop, and the various pause and exclusion rules. A theatre optimisation tool requires a coherent model of theatre capacity, session types, procedure durations, speciality requirements, and the relationship between planned and actual activity. A predictive model for delayed transfers of care requires a coherent model of discharge planning concepts, bed availability, the social care interface, and the factors that contribute to delay. None of these models is trivial to build. All of them reward the investment many times over when they enable AI that clinicians trust and act on.
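What "a coherent model of the RTT pathway" means in practice is that definitions like clock start and clock stop become explicit, shared functions rather than conventions buried in each source system. The sketch below is a deliberate oversimplification (real RTT rules include pauses, exclusions, and restarts that this ignores); the point is only that the definition lives in one auditable place.

```python
from datetime import date
from typing import Optional

# Simplified illustration of an explicit RTT wait definition.
# Real RTT rules (pauses, exclusions, clock restarts) are far richer;
# this shows only the value of a single shared definition.

def rtt_wait_days(clock_start: date,
                  clock_stop: Optional[date],
                  today: date) -> int:
    """Days on an RTT pathway: a stopped clock measures start to stop,
    an open clock measures start to today."""
    end = clock_stop if clock_stop is not None else today
    return (end - clock_start).days

# An open pathway versus one stopped by first definitive treatment:
open_wait = rtt_wait_days(date(2026, 1, 5), None, date(2026, 3, 25))
closed_wait = rtt_wait_days(date(2026, 1, 5), date(2026, 3, 2),
                            date(2026, 3, 25))
```

When two departments disagree about what stops a clock, that disagreement surfaces here as a code review argument, which is vastly cheaper than surfacing later as a model trained on two incompatible definitions.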

6. Building Ontology on Palantir Foundry: The Technical Realities

Palantir Foundry provides sophisticated tooling for ontology management through its Ontology Manager application, which allows the definition of object types, link types, and action types that together constitute a working ontology for a given domain. Object types represent the core entities — the patient, the episode, the pathway, the provider, and the resource. Link types represent the relationships between them. Action types define the operations that can be performed on or in relation to these entities. (Object and link types, 2025)

Building these definitions correctly requires someone who understands both the clinical domain and the Foundry platform deeply enough to make the right modelling decisions: when to represent a relationship as a link type versus a derived property, how to handle temporal relationships between entities, how to manage the version history of ontological objects as data changes over time, and how to ensure that the ontology instantiated in Foundry's semantic layer is correctly aligned with the CDM structure that national reporting and interoperability require.
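The shape of those modelling decisions can be sketched in a platform-neutral way. The fragment below is emphatically not Foundry's actual configuration syntax, and the type names are invented for illustration; it only shows how object types, link types, and action types fit together, and why the link-versus-property choice matters.

```python
# Platform-neutral illustration of the three building blocks described
# above. This is NOT Palantir Foundry's configuration syntax; all names
# are hypothetical.
ontology_fragment = {
    "object_types": {
        "Patient": {"properties": ["nhs_number", "date_of_birth"]},
        "Episode": {"properties": ["admission_date", "specialty"]},
        "Pathway": {"properties": ["rtt_clock_start", "rtt_clock_stop"]},
    },
    "link_types": {
        # Modelled as a link rather than a derived property because the
        # relationship carries attributes of its own.
        "episode_on_pathway": {
            "from": "Episode", "to": "Pathway",
            "properties": ["pathway_role"],  # started / continued / completed
        },
        "patient_has_episode": {"from": "Patient", "to": "Episode"},
    },
    "action_types": {
        # An operation performed in relation to an entity.
        "stop_rtt_clock": {
            "target": "Pathway",
            "sets": {"rtt_clock_stop": "treatment_date"},
        },
    },
}
```

The judgment calls the text lists, such as link type versus derived property or how to version objects over time, are exactly the places where this neutral sketch stops and platform-specific, clinically informed engineering begins.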

This is not work that can be done at scale by data engineers without clinical informatics expertise, or by clinicians without data platform knowledge. The combination of deep NHS clinical knowledge and genuine Palantir Foundry engineering capability is the scarce resource that currently limits how quickly this work can progress across the system. Organisations that have it, or that partner with organisations that have it, are building ontological assets that will compound in value as AI capability on the FDP matures.

Reference Data Management: The Unsung Dependency

One technical dependency that often gets insufficient attention in ontology discussions is reference data management. An ontology in Foundry is only as reliable as the reference data that underpins its coding schemes. SNOMED CT version changes, ICD-10 to ICD-11 migration, updates to OPCS-4 codes, changes to national specialty and service line definitions — all of these feed into the ontological model and can invalidate AI model outputs if they are not tracked, versioned, and propagated correctly through the data model.

A reference data management capability that maintains the provenance and version history of every coding scheme used in the ontology, and that triggers appropriate validation and retraining workflows when reference data changes, is foundational infrastructure for any AI programme built on FDP. Building it properly at the outset is far less costly than retrofitting it after AI models are in production and reference data drift has already corrupted their outputs.
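The core of such a capability is unglamorous: every coding scheme release is recorded with its effective date, and every consumer asks which release was in force on a given date. The sketch below is a minimal illustration (class and method names are invented, and a production system would also handle deprecation, cross-version mapping, and retraining triggers).

```python
from dataclasses import dataclass
from datetime import date

# Minimal sketch of versioned reference data with provenance.
# Names are illustrative; production systems would add deprecation,
# version-to-version mapping, and revalidation workflows.

@dataclass(frozen=True)
class CodeSchemeRelease:
    scheme: str          # e.g. a terminology edition
    version: str         # release identifier
    effective_from: date

class ReferenceRegistry:
    def __init__(self) -> None:
        self._releases: dict[str, list[CodeSchemeRelease]] = {}

    def publish(self, release: CodeSchemeRelease) -> None:
        self._releases.setdefault(release.scheme, []).append(release)

    def current(self, scheme: str, on: date) -> CodeSchemeRelease:
        """The release in force on a given date: the provenance a
        model's training data should record alongside every code."""
        candidates = [r for r in self._releases.get(scheme, [])
                      if r.effective_from <= on]
        if not candidates:
            raise LookupError(f"no {scheme} release in force on {on}")
        return max(candidates, key=lambda r: r.effective_from)
```

A model whose training data records "coded under release X" rather than just "coded" is a model whose outputs can be invalidated, and retrained, deliberately when release Y arrives, instead of drifting silently.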

What This Means for NHS Trusts Planning AI on FDP

The NHS AI strategy expressed in the ten-year health plan is ambitious and, in its aspirations, entirely right. AI has demonstrated genuine value across a range of clinical applications — from radiology interpretation to deterioration prediction to administrative automation — and its potential contribution to NHS recovery and transformation is real. But realising that potential at scale, across the breadth of acute care, requires getting the data right first.

Getting the data right, in the sense that matters for AI, means building the ontological foundations that allow AI systems to reason about clinical and operational concepts reliably, consistently, and in ways that clinicians can trust and regulators can scrutinise. The trusts building those foundations now, through careful CDM development, rigorous ontological modelling, and sustained investment in data quality and lineage, are not just doing good data housekeeping. They are building the infrastructure on which the next generation of NHS AI will run.

The trusts that skip this step in favour of deploying AI against legacy data in the hope that the model will sort it out will find, as many have already found, that the model cannot sort it out — and that the cost of retrofitting ontological structure after the fact is considerably higher than building it correctly at the outset.

How VE3 Supports Ontology Development and AI Enablement on FDP

VE3 supports NHS trusts and ICBs in building the ontological foundations and canonical data models that safe, effective AI in acute care requires. Our FDP+ Enablement and Assurance practice combines Palantir Foundry engineering expertise with deep clinical informatics knowledge, helping organisations move from data ingestion to genuinely trustworthy analytical and AI capability. We have built ontological frameworks across acute and ICB data environments, developed reference data management architectures for multi-source NHS data platforms, and supported trusts through the complex journey from legacy coding practices to semantically coherent, AI-ready data environments. If your trust is preparing to deploy AI on FDP or is grappling with data quality and consistency issues that are limiting what your platform can deliver, we would be glad to share what we have learned.

© 2026 VE3. All rights reserved.