Most enterprise AI failures in supply chain have nothing to do with the algorithm. They are caused by the data that goes in: inconsistent identifiers, incompatible definitions, and event records that cannot be joined across systems. This article addresses why multi-ERP data harmonisation is not a pre-project task but the project itself, and what getting it right actually requires.
The Real Reason Supply Chain AI Projects Stall
Ask a team that has recently shelved a supply chain AI initiative what went wrong, and the answer is rarely the model. It is almost always the data. The model was perfectly capable of producing useful outputs. The problem was that the inputs it needed (consistent supplier identifiers, comparable lead-time definitions, joinable shipment records across plants) simply did not exist in a usable form.
This is the multi-ERP problem. Large organisations, particularly those that have grown through acquisition or divisional expansion, typically operate several ERP instances simultaneously. Different versions of SAP. Legacy systems running in parallel with newer platforms. Divisions that made independent technology decisions years ago that nobody has had the mandate or budget to rationalise.
Each system holds valuable operational data. The problem is that the same entity (a supplier, a part, a purchase order, a plant) is recorded differently in each one. Different ID formats. Different naming conventions. Different timestamp granularities. Different definitions for what constitutes a late delivery. The data exists, but it is not harmonised, and without harmonisation, no AI layer built on top of it can be trusted.
80% of supply chain data sits outside the ERP, distributed across disconnected systems, spreadsheets, and manual records. (Infor Research)
30% of annual revenue is what siloed or incorrect data can cost an organisation, according to IDC estimates.
What Harmonisation Actually Means in Practice
Data harmonisation is often described as a technical exercise: build a pipeline, map the fields, load to a warehouse. That framing understates the problem significantly. The technical pipeline is the easy part. The hard part is semantic harmonisation, which means agreeing on what things mean across systems, not just what they are called.
Consider a straightforward example. An organisation wants to build a supplier reliability model. To train that model, it needs a consistent definition of on-time delivery across all divisions. But in one ERP, on-time is calculated against the original promised delivery date. In another, it is calculated against the last confirmed date. In a third, the definition varies by commodity category. These are not data quality problems that a pipeline can fix. They require a governed decision about what the definition is, documented in a way that all downstream systems and models use consistently.
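To make the divergence concrete, here is a minimal Python sketch of two of the on-time rules described above applied to the same delivery. The `Delivery` fields, function names, and dates are illustrative assumptions, not taken from any particular ERP.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Delivery:
    # Hypothetical field names; real ERPs label these differently.
    original_promised: date  # date first promised to the buyer
    last_confirmed: date     # most recent date confirmed by the supplier
    received: date           # actual goods receipt date


def on_time_vs_original(d: Delivery) -> bool:
    """Rule used in one system: compare against the original promise."""
    return d.received <= d.original_promised


def on_time_vs_confirmed(d: Delivery) -> bool:
    """Rule used in another system: compare against the last confirmed date."""
    return d.received <= d.last_confirmed


# One delivery, promised for 10 March, re-confirmed for 14 March,
# and received on 12 March (made-up dates).
delivery = Delivery(
    original_promised=date(2025, 3, 10),
    last_confirmed=date(2025, 3, 14),
    received=date(2025, 3, 12),
)

print(on_time_vs_original(delivery))   # False
print(on_time_vs_confirmed(delivery))  # True
```

The same record is late under one rule and on time under the other. Neither value is a data error, so no cleansing step can reconcile them. Only an agreed definition can.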
This is the ontology problem. Before any AI model can produce reliable outputs from multi-ERP data, the organisation needs a semantic layer: a governed set of definitions for every entity and KPI that matters operationally. Supplier. Part. Purchase order. Shipment. Lead time. OTIF. These definitions need to be agreed, versioned, and enforced across every data product that feeds into the AI layer.
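One way to picture "agreed, versioned, and enforced" is a small registry that downstream code must consult for a definition rather than hard-coding its own. The sketch below is an assumption-laden illustration: `KpiDefinition`, `DefinitionRegistry`, the wording of the definitions, and the approver name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiDefinition:
    name: str
    version: int
    description: str
    approved_by: str  # hypothetical: the steward or body that signed it off


class DefinitionRegistry:
    """In-memory stand-in for a governed catalogue of definitions."""

    def __init__(self) -> None:
        self._definitions: dict[tuple[str, int], KpiDefinition] = {}

    def register(self, definition: KpiDefinition) -> None:
        key = (definition.name, definition.version)
        if key in self._definitions:
            # A published version is immutable; changes need a new version.
            raise ValueError(f"{definition.name} v{definition.version} already registered")
        self._definitions[key] = definition

    def latest(self, name: str) -> KpiDefinition:
        versions = [d for (n, _), d in self._definitions.items() if n == name]
        if not versions:
            raise KeyError(name)
        return max(versions, key=lambda d: d.version)


registry = DefinitionRegistry()
registry.register(KpiDefinition(
    "on_time", 1, "Received on or before the last confirmed date.", "data-council"))
registry.register(KpiDefinition(
    "on_time", 2, "Received on or before the original promised date.", "data-council"))

print(registry.latest("on_time").version)  # 2
```

Rejecting a second registration of the same version is the point of the design: a model trained against `on_time` v1 can always be traced back to exactly what v1 meant.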
62% of supply chain traceability and AI initiatives stall before full deployment, with data inconsistency identified as the most pervasive root cause. (Gartner, 2024)
The Four Components of a Usable Data Foundation
Organisations that successfully build supply chain AI on top of multi-ERP environments typically put four components in place before any model training begins.
- Stable key identifiers. Supplier IDs, part numbers, plant codes, and shipment references must be consistent and joinable across systems. Where they are not, a resolution layer is required that maps variants to a canonical identifier. This is foundational. Without it, no join across systems is reliable.
- Governed event timestamps. Order date, confirmed ship date, actual ship date, receipt date, and delivery date all need to be present, time-zone consistent, and captured at the same granularity across every system contributing to the model. Missing or inconsistent timestamps are the single most common cause of broken OTIF and lead-time calculations.
- Documented exception definitions. What counts as a disruption? What threshold defines a late delivery? What qualifies as an inventory hotspot? These rules need to be defined before data is modelled, not inferred after. AI models trained on inconsistently defined exceptions produce inconsistently reliable outputs.
- Lineage and auditability. Every data product feeding the AI layer should carry full lineage: where the data originated, when it was ingested, what transformations were applied, and who approved the definition it uses. This is not optional for regulated or multi-divisional environments. It is the audit trail that makes AI outputs defensible.
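The resolution layer described in the first component can be sketched as a crosswalk table keyed by source system and local identifier. Everything in this Python example is hypothetical: the system names, the local IDs, the canonical `SUP-…` format, and the record layout.

```python
from typing import Optional

# Hypothetical crosswalk: (source system, local supplier ID) -> canonical ID.
# In practice this table is itself a governed, stewarded data product.
CROSSWALK = {
    ("erp_a", "0000104512"): "SUP-000017",
    ("erp_b", "V-104512"): "SUP-000017",
    ("erp_c", "ACME_GMBH"): "SUP-000017",
    ("erp_a", "0000220001"): "SUP-000042",
}


def resolve(source_system: str, local_id: str) -> Optional[str]:
    """Return the canonical supplier ID, or None if no mapping is governed yet."""
    return CROSSWALK.get((source_system, local_id.strip()))


def spend_by_canonical_supplier(records):
    """Aggregate PO value per canonical supplier; set aside unresolved rows."""
    totals = {}
    unresolved = []
    for record in records:
        canonical_id = resolve(record["system"], record["supplier"])
        if canonical_id is None:
            unresolved.append(record)  # surfaced for stewardship, never guessed
            continue
        totals[canonical_id] = totals.get(canonical_id, 0.0) + record["po_value"]
    return totals, unresolved


records = [
    {"system": "erp_a", "supplier": "0000104512", "po_value": 1200.0},
    {"system": "erp_b", "supplier": "V-104512", "po_value": 800.0},
    {"system": "erp_c", "supplier": "UNKNOWN_LTD", "po_value": 50.0},
]

totals, unresolved = spend_by_canonical_supplier(records)
print(totals)           # {'SUP-000017': 2000.0}
print(len(unresolved))  # 1
```

Unmapped identifiers are returned separately rather than fuzzy-matched on the fly, so every join that does happen rests on a mapping somebody approved.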
None of these components requires a full ERP rationalisation programme. They require a data product discipline applied incrementally, starting with the specific entities and time windows needed for the first pilot use cases.
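For the timestamp component listed above, the sketch below shows why time-zone consistency matters for a lead-time figure. The plant names and offsets are made up. Fixed UTC offsets keep the example dependency-free, whereas a production pipeline would more likely use named zones (for example via Python's `zoneinfo`) so that daylight saving is handled.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical plants with fixed UTC offsets (an assumption for illustration).
PLANT_UTC_OFFSET_HOURS = {"plant_munich": 1, "plant_chicago": -6}


def to_utc(local_text: str, plant: str) -> datetime:
    """Interpret a naive local timestamp at the given plant and convert to UTC."""
    offset = timezone(timedelta(hours=PLANT_UTC_OFFSET_HOURS[plant]))
    naive = datetime.fromisoformat(local_text)
    return naive.replace(tzinfo=offset).astimezone(timezone.utc)


ship_local = "2025-03-10T22:30:00"     # recorded in Chicago local time
receipt_local = "2025-03-12T09:00:00"  # recorded in Munich local time

# Naive subtraction treats both values as if they shared a clock.
naive_lead_time = datetime.fromisoformat(receipt_local) - datetime.fromisoformat(ship_local)

# Normalising to UTC first gives the elapsed time that actually passed.
lead_time = to_utc(receipt_local, "plant_munich") - to_utc(ship_local, "plant_chicago")

print(naive_lead_time)  # 1 day, 10:30:00
print(lead_time)        # 1 day, 3:30:00
```

The seven-hour gap between the two figures is the offset difference between the plants. That is enough to flip a delivery from on time to late when the threshold is tight.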
The Federated Approach: Harmonise Without Forcing Uniformity
One of the most common objections to multi-ERP harmonisation is that it sounds like a mandate for standardisation that will meet divisional resistance. That concern is legitimate but addressable.
The federated approach separates the question of what data means from the question of how each division manages its own systems. A division can continue operating its own ERP, with its own internal processes and configurations, while also publishing governed data products that conform to shared definitions at the point of ingestion into the harmonised layer. The division does not need to change how it works. It needs to agree on how its data is described when it leaves its own environment.
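A rough Python sketch of conformance "at the point of ingestion" follows. The required fields, the `ingest` function, and the lineage keys prefixed with `_` are assumptions made for illustration, not a description of any specific platform.

```python
from datetime import datetime, timezone

# Hypothetical shared contract that every division's data product must satisfy.
REQUIRED_FIELDS = {
    "canonical_supplier_id": str,
    "promised_date": str,
    "received_date": str,
}


def validate_record(record: dict) -> list[str]:
    """Return a list of contract violations; an empty list means conformant."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}")
    return errors


def ingest(division: str, records: list[dict], definition_version: int):
    """Accept conformant records, stamping basic lineage; reject the rest."""
    accepted, rejected = [], []
    for record in records:
        errors = validate_record(record)
        if errors:
            rejected.append((record, errors))
            continue
        accepted.append({
            **record,
            "_source_division": division,
            "_definition_version": definition_version,
            "_ingested_at": datetime.now(timezone.utc).isoformat(),
        })
    return accepted, rejected


accepted, rejected = ingest(
    "division_north",
    [
        {"canonical_supplier_id": "SUP-000017",
         "promised_date": "2025-03-10", "received_date": "2025-03-12"},
        {"canonical_supplier_id": "SUP-000042", "promised_date": "2025-03-11"},
    ],
    definition_version=2,
)
print(len(accepted), len(rejected))  # 1 1
print(rejected[0][1])                # ['missing field: received_date']
```

The division's ERP is never touched. The contract applies only to what the division publishes, which is where the shared definitions take effect.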
This is how national-scale federated programmes, including the NHS Federated Data Platform, have operated successfully across dozens of autonomous organisations with fundamentally different underlying systems. Consistent semantics at the data product level, divisional autonomy at the operational level. The governance framework defines the former without imposing constraints on the latter.
Why This Is the Prerequisite, Not the Precursor
There is a tendency in enterprise AI programmes to treat data harmonisation as a pre-project activity, something that happens before the real work begins. This framing creates two problems.
The first is sequencing risk. Teams that treat harmonisation as a precursor often spend months in data preparation before any AI value is demonstrated, which erodes executive patience and budget confidence. A better approach is to scope the data foundation tightly around a specific thin-slice use case, harmonise only the entities and time window that use case requires, demonstrate value, and then expand the foundation incrementally as additional use cases are onboarded.
The second is underinvestment. When harmonisation is treated as a pre-project task rather than a core deliverable, it tends to receive less rigour, less documentation, and less governance than it deserves. The result is a data layer that works for the pilot but does not scale when a second division or a second use case is added. Treating harmonisation as the foundational product (governed, versioned, and owned by identifiable data stewards) is what makes the AI investment durable rather than fragile.
95% of AI supply chain initiatives that failed to deliver sustained ROI cited fragmented data, siloed systems, and undocumented workflows as the primary cause, not model performance. (SCMR, 2026)
The Competitive Implication
Organisations that solve the multi-ERP harmonisation problem do not just enable better AI outputs. They build a compounding operational advantage. Once a governed semantic layer exists, every subsequent AI use case (disruption prediction, inventory optimisation, procurement analytics, quality monitoring) can be onboarded at a fraction of the cost and time of the first. The data foundation becomes a platform, not a project.
Those that skip this step, layering AI models directly onto inconsistent, poorly governed data, produce outputs their operational teams cannot trust, act on, or audit. The models may be sophisticated. The results will not be.
In 2026, the supply chain AI leaders are not the organisations with the most advanced models. They are the ones that built the right foundation first.
About VE3
VE3 is an enterprise AI and technology consultancy with deep delivery expertise in multi-ERP data environments, Palantir Foundry, Azure data platforms, and SAP integration. VE3 has built governed data foundations for national-scale programmes including the NHS Federated Data Platform, and applies the same data product discipline to complex, multi-divisional supply chain environments. To discuss your data harmonisation and AI readiness programme, contact the VE3 team.





