This problem exists in every system migration. The examples here are from SAP — the mathematics applies everywhere.
Somewhere in your programme, there is a dashboard. It shows mapping coverage. It is green — or at least amber trending green. The steering committee reviewed it last Thursday. The programme director referenced it in an email to the CFO. Ninety-five percent mapped. Nearly there.
Here is what that dashboard does not show: whether anyone has tested the actual data.
Not a sample of it. Not a representative subset. The actual data — every record, every field, every transformation. In most migration programmes, the answer is no. What has been tested is typically two to five percent of the total dataset. The remaining ninety-five to ninety-eight percent has been mapped, but never verified. It has been assumed to be safe because the sample was safe.
That assumption is the single most expensive risk in enterprise migration. And it has been the industry standard for twenty years.
Why migration assessments sample
The reason is practical, not malicious. A typical enterprise migration involves hundreds of thousands of records across dozens of object types — suppliers, materials, purchase orders, goods receipts, invoices, payments, GL accounts, cost centres, customer masters, bills of material. Testing every record through a full load cycle requires environment capacity, transformation logic, and elapsed time that most programmes cannot afford during assessment.
So the standard practice is to sample. Select a representative set. Load it into a sandbox. Check whether the load succeeded. Report the results. Extrapolate to the full dataset.
This approach is rational. It is also dangerous — because migration failures are not randomly distributed.
Failures cluster around edge cases. Unusual payment terms that exist in the source but not the target. Non-standard units of measure that pass validation but lose precision in conversion. Legacy classification codes that were meaningful in the old system but collapse into a single value in the new one. Country codes that predate ISO standardisation. Records with null values in fields that the target system requires.
These edge cases are precisely the records a random sample is least likely to include. And they are precisely the records that will fail on cutover weekend.
The arithmetic is straightforward. A two percent sample of one hundred thousand records examines two thousand. If the true failure rate is five percent, five thousand records will fail in production. Even if failures were spread uniformly, the sample would catch only about a hundred of them; because failures cluster in edge cases the sample rarely includes, it catches perhaps forty. The other four thousand nine hundred and sixty arrive on cutover weekend, when the cost of remediation is ten to fifty times higher than catching them during assessment.
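The arithmetic above can be sketched in a few lines. This is a minimal illustration using the article's own figures (one hundred thousand records, a two percent sample, a five percent failure rate), under the optimistic assumption that failures are spread uniformly; clustered failures make the sample's catch rate worse, not better.

```python
total_records = 100_000
sample_size = 2_000            # a two percent sample
failure_rate = 0.05            # five percent of records will fail in production

failures_total = int(total_records * failure_rate)      # 5,000 failing records
# Best case: failures spread uniformly, so the sample catches its pro-rata share.
expected_caught = int(sample_size * failure_rate)       # about 100
escaped_to_cutover = failures_total - expected_caught   # about 4,900

print(f"Failures in the full dataset:     {failures_total}")
print(f"Caught by the sample (best case): {expected_caught}")
print(f"Arriving on cutover weekend:      {escaped_to_cutover}")
```

Even in the best case, ninety-eight percent of the failures are never seen during assessment.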
The difference between coverage and safety
Most programme dashboards track mapping coverage — the percentage of source fields that have been assigned a target in the new system. This is a useful progress metric. It answers the question: have we decided where each field goes?
It does not answer a different and more important question: does the data survive the move?
A field can be mapped to a target and still lose information in transit. Consider a simple example. The source system has two supplier classifications — one for trade vendors and one for one-time vendors. Purchasing processes depend on this distinction. Reports filter by it. Approval workflows reference it. Payment terms differ by classification.
During mapping, both classifications are assigned to a single grouping in the target. The mapping is complete. The dashboard shows green. Both values have a target.
But the distinction has been lost. In the target system, trade vendors and one-time vendors are indistinguishable. Every purchasing report that relied on this classification is silently wrong from day one. The mapping did not fail. It succeeded — and in succeeding, it destroyed a business-critical signal that nobody tested for.
This is a lossy transformation. The data moved. The meaning did not.
A dashboard that tracks coverage will never catch this. Coverage asks whether a field has been assigned. It does not ask whether the assignment preserves meaning. The gap between those two questions is where migrations fail.
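The supplier-classification example above can be made concrete. The sketch below uses hypothetical classification codes (`TRADE`, `ONE_TIME`, `GENERAL`); the point is that attempting to build the inverse mapping is itself the test, because a lossy mapping has no inverse.

```python
# Forward mapping collapses two source classifications into one target value.
FORWARD = {"TRADE": "GENERAL", "ONE_TIME": "GENERAL"}

# Building the inverse exposes the problem: "GENERAL" has two preimages,
# so no inverse function exists and the transformation cannot be lossless.
inverse = {}
collisions = []
for source_value, target_value in FORWARD.items():
    if target_value in inverse:
        collisions.append((inverse[target_value], source_value, target_value))
    inverse[target_value] = source_value

for first, second, target in collisions:
    print(f"Lossy mapping: {first!r} and {second!r} both collapse into {target!r}")
```

The mapping is one hundred percent complete, and provably not one hundred percent safe.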
Four ways "mapped" data fails silently
When a programme reports ninety-five percent mapping coverage, the real risk is not in the unmapped five percent. It is in the mapped ninety-five percent — records that have been assigned a target but never tested for transformation integrity.
Lossy transformations. Two distinct values in the source collapse into one value in the target. The mapping is technically complete — both values have a destination — but the distinction is destroyed. Downstream processes that depended on that distinction behave incorrectly without generating any error message. The failure is silent and permanent.
Broken dependency chains. A supplier is migrated, but its purchasing organisation assignment is missing — perhaps because the configuration was handled by a different workstream, or because the assignment existed in ECC but did not map cleanly to the target structure. The supplier exists in S/4HANA. The purchase order references it. But when the PO is created, it fails — because the supplier has no valid purchasing organisation for that company code. The supplier is present. The PO cannot be posted. The goods receipt cannot follow. The invoice has nothing to reference. One missing link in the chain and every object downstream is blocked. The data arrived. The dependency did not.
Default-masked gaps. A field in the source is null or contains a non-standard value. The transformation assigns a default — immediate payment terms, for instance. The record loads successfully. The dashboard stays green. Six months later, Accounts Payable discovers that four hundred suppliers are on immediate payment terms who should be on sixty-day terms. The company has been paying invoices on receipt instead of holding cash for two months. Across four hundred suppliers, that is potentially millions in working capital that left the business sixty days earlier than it needed to. The CFO sees a cash flow problem. The treasury team investigates. Nobody connects it to a default value assigned during migration eighteen months ago. The transformation did not fail — it lied. And the lie cost the business real money every single day it went undetected.
Loaded-but-not-usable records. A material master is loaded with all mandatory fields populated. It passes the load check. But the combination of material type, planning profile, storage location, classification, and characteristics creates a configuration that does not exist in the target plant. The material is in the new system. It cannot be used in a production order. Worse — when downstream processes reference it, the invalid configuration propagates. A bill of materials inherits it. A production order references the BOM. A goods movement attempts to post against a storage location that the plant does not recognise. The corruption is not contained. It spreads through every process that touches the record. The material is technically migrated, operationally useless, and actively poisoning the data around it.
Every one of these failures passes the mapping coverage check. Every one produces a green row on the dashboard. Every one causes operational damage that surfaces weeks or months after go-live, when the cost of fixing it is vastly higher than catching it before cutover.
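One of these failure modes, the broken dependency chain, lends itself to a simple illustration. The record shapes and identifiers below are hypothetical; the check mirrors the supplier-to-purchase-order link described above, where a supplier can be present yet unusable.

```python
# Hypothetical migrated records: both suppliers exist, but one lost its
# purchasing organisation assignment in transit.
suppliers = {
    "S-1001": {"name": "Acme Ltd", "purchasing_orgs": []},        # missing link
    "S-1002": {"name": "Bolt GmbH", "purchasing_orgs": ["P100"]},
}
purchase_orders = [
    {"po": "PO-1", "supplier": "S-1001", "purchasing_org": "P100"},
    {"po": "PO-2", "supplier": "S-1002", "purchasing_org": "P100"},
]

def check_chain(pos, sups):
    """Flag POs whose supplier lacks the purchasing org the PO references."""
    broken = []
    for po in pos:
        supplier = sups.get(po["supplier"])
        if supplier is None or po["purchasing_org"] not in supplier["purchasing_orgs"]:
            broken.append(po["po"])
    return broken

print(check_chain(purchase_orders, suppliers))
```

Both suppliers pass a record-level load check; only the chain walk reveals that PO-1 can never be posted.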
What verification actually requires
The gap in current practice is not effort or intention. It is method. Sampling tests whether a subset of records can be loaded. It does not test whether the transformation preserves meaning across the full dataset.
Verification requires a different approach: for each record, apply the forward transformation — converting the source into the target format. Then apply the inverse transformation — converting the target back into the source format. Then compare. If the roundtrip recovers the exact original, the transformation is provably lossless. If it does not, something was lost — and whatever was lost is exactly what will break in production.
Think of it like a hospital thermometer. A patient's temperature reads 37.5°C — not a fever, but worth monitoring. The reading is converted to Fahrenheit for the US-based system: 99.5°F. So far, so good. Now the system converts it back to Celsius for the night shift handover — but the receiving system stores whole degrees only, and truncates. 99.5°F becomes 37°C. The patient's record now shows a perfectly normal temperature. The slight elevation that warranted monitoring has vanished. No alert is triggered. No follow-up is scheduled. The data moved between systems. The clinical signal did not survive the move.
This is exactly what happens in a data migration — except instead of one patient, it is one hundred thousand records. Instead of a temperature, it is a supplier classification, a payment term, or a currency code. And instead of a missed fever, it is a missed business rule that surfaces on cutover weekend when nobody remembers why.
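The thermometer roundtrip is small enough to run. A minimal sketch, assuming the receiving system stores whole degrees:

```python
def c_to_f(celsius: float) -> float:
    return celsius * 9 / 5 + 32

def f_to_c_truncating(fahrenheit: float) -> int:
    # The lossy step: the receiving system keeps whole degrees only.
    return int((fahrenheit - 32) * 5 / 9)

original = 37.5                            # slightly elevated, worth monitoring
fahrenheit = c_to_f(original)              # 99.5
roundtrip = f_to_c_truncating(fahrenheit)  # 37, which reads as normal

print(f"{original} °C -> {fahrenheit} °F -> {roundtrip} °C")
print("lossless" if roundtrip == original else "signal lost in roundtrip")
```

The roundtrip comparison is the whole test: if the recovered value is not exactly the original, something was discarded on the way through.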
This is not a new concept in mathematics. It is called a bijective proof — a demonstration that applying a function and then its inverse recovers exactly the original input. It is the standard for verifying lossless transformations in fields from cryptography to data compression to signal processing.
It has never been systematically applied to data migration.
The reason is not mathematical complexity. It is architectural. Running a bijective proof on every record requires three things most migration programmes do not have: a forward transformation function for every object type, a corresponding inverse function for every object type, and an engine that can execute both at scale across hundreds of thousands of records. Building these requires deep knowledge of both the source and target system architectures — not just the field mappings, but the structural dependencies, the value transformations, the entity relationships, and the API constraints.
This is what we have built at Migration Proof.
What full verification looks like in practice
Migration Proof runs a bijective proof on every record in your dataset. Not a sample. Not a representative subset. Every supplier, every material, every purchase order, every invoice.
The engine walks the full dependency chain — invoice to goods receipt to purchase order to material to supplier — and proves each link. It checks formal preconditions on every field before transformation is even attempted. Records that fail preconditions are quarantined at the gate — they do not enter the transformation pipeline. They are diagnosed with the exact field, the exact value, the exact rule violated, and the specific remediation action.
Records that pass preconditions are transformed forward, then inverse-transformed back, then compared field by field. If the roundtrip matches, the record is certified lossless and logged in an Ownership Ledger with a cryptographic hash and timestamp. If it does not match, the record is flagged as untransformable with the precise point of failure.
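The certify-or-flag step described above can be sketched as follows. The forward and inverse functions here are hypothetical two-field transformations (the SAP-style field names `lifnr` and `land1` stand in for a real mapping), and the ledger entry is reduced to a hash and a timestamp:

```python
import hashlib
import json
from datetime import datetime, timezone

def forward(record: dict) -> dict:
    # Hypothetical source-to-target transformation.
    return {"SUPPLIER_ID": record["lifnr"], "COUNTRY": record["land1"]}

def inverse(target: dict) -> dict:
    # The corresponding inverse: target format back to source format.
    return {"lifnr": target["SUPPLIER_ID"], "land1": target["COUNTRY"]}

def certify(record: dict) -> dict:
    """Roundtrip the record; certify it only if the roundtrip is exact."""
    roundtrip = inverse(forward(record))
    if roundtrip != record:
        return {"status": "untransformable", "record": record}
    payload = json.dumps(record, sort_keys=True).encode()
    return {
        "status": "lossless",
        "hash": hashlib.sha256(payload).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

entry = certify({"lifnr": "S-1001", "land1": "DE"})
print(entry["status"])
```

A record with a field the forward transformation silently drops would fail the comparison and be flagged rather than certified, which is exactly the behaviour a coverage dashboard cannot provide.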
The result is not a traffic-light dashboard. It is a mathematical proof applied to one hundred percent of your data. And a mathematical proof does not require trust in the person who produced it. The inverse function is there. Anyone can verify it. Run it yourself.
Currently, this proof is available for SAP ECC-to-S/4HANA migrations. The mathematics applies to any system-to-system transformation.
The question your programme should be asking
If your programme is reporting high mapping coverage, the natural assumption is that the migration is on track. That assumption may be correct. But it is untested — unless someone has verified that the transformations are lossless, not just complete.
The question is not "have we mapped everything?" The question is "does everything survive?"
If the answer to the second question is unknown, then the programme is carrying risk that no dashboard is measuring. And that risk arrives on cutover weekend, when the options are: fix it under pressure, or roll back and explain why.
There is a simpler path. Run a read-only diagnostic on your own data. Download a short extractor report. Execute it against your source system. Upload the output. In under twenty minutes, see which records are provably lossless, which dependency chains are intact, which records are untransformable, and what percentage of your data is genuinely safe — not just mapped.
No writes to your source system. No disruption to your programme timeline. No consultants. Just mathematics applied to your data.
The dashboard tells you how far your programme has come. The proof tells you whether your programme will succeed. One of those takes months and costs six figures. The other takes twenty minutes and costs a fraction.
Migration Proof builds transformation integrity — not mapping coverage — for data migrations. Read-only diagnostic. Bijective proof on every record. Results in under twenty minutes. Launching shortly for SAP S/4HANA. The mathematics applies everywhere.
A note from us
If this article stopped you in your tracks — if it made you nod, or raised a question, or made you look at your own programme dashboard differently — we would love to hear from you.
Migration Proof is an AI-native operation. Five specialised AI personas run the chain walk, precondition checks, transformation, proof, and reporting. Behind them, twenty-five years of enterprise system experience shaped every rule they apply.
We are mostly agents — and we are proud of that, because agents prove every record, not a two percent sample. When you write to us, a human replies.
hello@migrationproof.io
We read every message. We reply to every question.