In any Oracle data warehouse migration, one task dwarfs the rest: the transformation layer. A mature estate often carries well over a thousand Oracle Data Integrator (ODI) mappings, each encoding business logic that the organisation depends on and that almost nobody has fully documented. ODI migration is where programmes either find their stride or quietly stall, and the reason they stall is almost always the same: an attempt to lift-and-shift logic that was never designed to move.

This piece sets out why ODI mappings resist a straight port, and a practical approach to re-engineering thousands of them into modern, governed, reusable cloud pipelines without losing the meaning buried inside them.

Why ODI mappings don't lift-and-shift cleanly

ODI is an ELT tool: rather than transforming data in a separate engine, it pushes transformation logic down to execute inside the target database, orchestrated through reusable Knowledge Modules. That design is efficient on Oracle, but it means the logic is tightly coupled to Oracle's SQL dialect and to ODI's own repository constructs. There is no clean one-to-one equivalent in a cloud lakehouse.

Three consequences follow. The logic is dialect-specific, so it cannot simply be copied across. Much of it is undocumented, living in mappings and Knowledge Modules rather than in any specification. And a large share of mappings are near-duplicates - the same pattern repeated across dozens of tables. Treat each as a unique artefact to hand-rewrite and the programme drowns; treat them all as identical and you lose the genuinely complex minority that matter most.

The governing principle: re-express intent, not syntax

The goal is not to translate ODI line by line. It is to capture what each mapping does - its intent - and re-express that intent in cloud-native pipelines and a metadata-driven pattern. Done well, a thousand bespoke mappings collapse into a much smaller set of reusables, parameterised patterns plus a manageable tail of genuine one-offs. Here is how to get there.

Step 1 - Inventory and classify the mappings

You cannot plan from a guess. Parse the ODI repository to produce a complete register of every mapping, its sources and targets, its Knowledge Modules, and its dependencies. Then classify each by complexity - simple pass-through, standard transformation, or genuinely bespoke logic - and by how often the same pattern recurs.

This single step turns "we have about 1,200 mappings" into evidence: how many are trivial, how many share a pattern, and where the real complexity concentrates. Everything downstream is scoped from this register.

Step 2 - Decide a path for each mapping

With the register in hand, assign each mapping a path against clear criteria - functional fit, complexity, recurrence, data quality and ownership:

Templatise. The high-volume, low-complexity majority that follow a common pattern. These are re-expressed once as a parameterised pipeline and applied across many tables.

Re-engineer. Moderately complex mappings that need adapting but follow recognisable logic. Converted with review.

Redesign. The bespoke minority carrying real business rules. Reworked deliberately with the relevant data owner, and often improved in the process rather than faithfully reproduced.

Deciding this explicitly - rather than rewriting everything or lifting everything - is where the budget and timeline are won.

Step 3 - Re-platform the data movement

Separate the movement of data from the transformation of it. In ODI both are entangled; in a modern platform they are distinct concerns. Replace ODI's extract-and-load with cloud-native ingestion: mirroring or change-data-capture for low-latency relational sources, and metadata-driven Data Factory pipelines using the Oracle connector - which can unload tables in parallel by partition - for bulk movement. Land everything raw in the Bronze layer with full audit metadata, so the capture is always replayable and the transformation logic has a clean, immutable starting point.

Step 4 - Re-express the transformation logic

Now rebuild the logic itself in the right place in the medallion structure. Conformance, cleansing and history-building belong in Silver, expressed as PySpark notebooks; curated business rules and aggregations belong in Gold, expressed in the warehouse's T-SQL. Where ODI relied on Knowledge Modules to apply a repeated pattern, that pattern becomes a reusable notebook or stored procedure rather than copies scattered across mappings.

The high-volume, templatised majority can be largely automated - parsed from the source and generated into the target form, with human review. Specialist engineering effort is then concentrated where it earns its keep: the complex, business-critical logic that genuinely needs a person. Critically, place logic deliberately and document it as you go, so it is owned and testable rather than reburied in opaque jobs.

Step 5 - Make it metadata-driven and reusable

This is the difference between modernising and merely re-hosting. Instead of a thousand standalone jobs, build a small number of parameterised, config-driven pipelines that read what to do from metadata - which source, which keys, which load pattern - and apply it consistently. New sources and tables are then onboarded by adding configuration, not by writing another bespoke pipeline.

The payoff is durable: a far smaller, consistent, observable estate that your in-house team can actually understand and operate, rather than the sprawling, undocumented set you started with.

Step 6 - Prove equivalence through reconciliation

Re-engineering is only trustworthy if you can prove the new logic produces the same answers. Run old and new in parallel and reconcile at the mapping and output level - row counts, column-level content, business metrics - each against an explicit tolerance, with the most important outputs signed off by their data owner. (This connects directly to migrating without breaking statutory or peak-season reporting: reconciliation is the evidence base that lets you decommission the legacy with confidence.)

Common pitfalls to avoid

The recurring mistakes are predictable: hand-rewriting every mapping (slow and unaffordable); lifting them all unchanged (you inherit the debt and the dialect lock-in); ignoring the long tail of near-duplicate mappings that should have been templatised; losing undocumented business logic in translation; and skipping reconciliation until the end. The approach above exists to avoid every one of them.

Facing a wall of ODI mappings? Ask us for an ETL re-engineering assessment that inventories and classifies your estate before a line of code is written