Sebastien Rousseau

Post-Quantum Payments Infrastructure: Why Banks May Replace Rather Than Retrofit Legacy Rails

ML-KEM and ML-DSA do not fit cleanly inside the rails that carry SWIFT MT and ISO 20022. The honest engineering answer is that retrofit is a controlled migration plan with a short shelf life, and replacement is the only stable destination.

The cryptographic primitives that authenticate and protect every wholesale payment in production today (RSA, ECDSA, ECDH) have an expiry date. The US Quantum Computing Cybersecurity Preparedness Act wrote that expiry date into federal procurement law in late 2022. The BIS Working Paper No. 1208 put the same expiry into the supervisory frame for central banks. NIST FIPS 203 and FIPS 204 published the replacements in August 2024.

Payment infrastructure has not yet absorbed what that means.

This article is the engineering case for replacement over retrofit. It is written for architects who already understand the algorithms and need to decide what to do with SWIFT MT, ISO 20022 pacs and pain messages, RTGS interfaces, HSM estates, and the certificate hierarchies underneath all of it.


Executive Summary / Key Takeaways

  • Harvest-now-decrypt-later (HNDL) is the operational threat. Adversaries record encrypted payment traffic in 2026 to decrypt it once a cryptanalytically relevant quantum computer (CRQC) exists. The captured traffic includes settlement instructions, beneficiary data, and authentication material with long-lived sensitivity.
  • NIST has standardised the replacements. ML-KEM (FIPS 203) for key encapsulation and ML-DSA (FIPS 204) for digital signatures are the defaults. SLH-DSA (FIPS 205) covers the stateless hash-based fallback.
  • The size delta breaks legacy assumptions. Public keys and signatures are 5–20× larger than RSA-2048 equivalents. That collides with MTU on payment networks, fixed-buffer assumptions in MT message handlers, and the cryptographic throughput of installed HSM fleets.
  • Hybrid (classical + PQC) is the migration vehicle, not the destination. Hybrid TLS and hybrid X.509 buy two to three years of interoperability while production rails are replaced. They do not solve the underlying capacity problem.
  • PKI is the load-bearing wall. A certificate authority whose signature algorithm becomes forgeable invalidates every certificate beneath it. The bank's institutional exposure is the chain, not any single endpoint.
  • Crypto-agility is the architectural property to engineer for. Algorithm identifiers, key formats, signature envelopes, and HSM partitions must all be parameterisable. Anything pinned to RSA at compile time is technical debt that will fall due simultaneously.

Harvest Now, Decrypt Later: The Threat Model That Removes the Option to Wait

HNDL inverts the usual cryptographic timeline. Conventional risk assessment asks when the threat materialises. HNDL asks when the data captured today becomes useful to an adversary. For payment messages — beneficiary identities, account numbers, structured remittance data, sanction-screening payloads, intra-bank settlement instructions — the window of sensitivity is years to decades. Most of that traffic is recorded somewhere right now.

The NSA's CNSA 2.0 timeline gives national-security systems until 2035 to complete the transition. Financial supervisors are moving on faster schedules: the PRA's expectations on operational resilience treat cryptographic agility as a third-party concentration risk. The expectation in 2026 is that material payment rails publish their PQC migration plan in their resilience self-attestation.

The HNDL adversary does not need a CRQC today. The adversary needs:

  1. Network position. Submarine-cable taps, ISP-level capture, and compromised middleboxes are all in scope. Wholesale payment traffic concentrates through a small number of network paths.
  2. Storage. A petabyte of structured payment data is a manageable archive in 2026.
  3. Patience. The capture costs nothing per intercepted message. The yield arrives later.

The migration argument is therefore not "quantum computers may arrive in 2035." It is "any TLS session that completes tonight with RSA-2048 key exchange is exposed for as long as the data inside it remains sensitive."
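The timeline inversion can be put as simple arithmetic. The sketch below is illustrative only: the function name `exposed_years`, the sensitivity lifetimes, and the 2035 CRQC year are assumptions chosen for the example, not forecasts.

```python
def exposed_years(capture_year: int, sensitivity_years: int, crqc_year: int) -> int:
    """Years in which captured traffic is both still sensitive and decryptable.

    crqc_year is a planning assumption, not a prediction.
    """
    sensitive_until = capture_year + sensitivity_years
    return max(0, sensitive_until - crqc_year)


# Hypothetical inputs: traffic recorded in 2026, CRQC assumed to exist in 2035.
EXAMPLES = [
    ("short-lived session data", 1),
    ("settlement instruction", 10),
    ("beneficiary identity data", 25),
]

for label, lifetime in EXAMPLES:
    print(f"{label}: {exposed_years(2026, lifetime, 2035)} exposed years")
```

Under these assumptions, anything whose sensitivity outlives the assumed CRQC date carries a non-zero exposure from the moment it is captured, which is why the clock starts at capture and not at the arrival of the machine.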

The Size Problem Is the Engineering Problem

Public discussion of PQC migration tends to focus on algorithm selection. The harder problem is dimensional.

| Primitive | Public key | Signature / ciphertext |
| --- | --- | --- |
| RSA-2048 | 256 bytes | 256 bytes (signature) |
| ECDSA P-256 | 64 bytes | 64 bytes (signature) |
| ML-KEM-768 | 1,184 bytes | 1,088 bytes (ciphertext) |
| ML-DSA-65 | 1,952 bytes | 3,309 bytes (signature) |
| SLH-DSA-128f | 32 bytes | 17,088 bytes (signature) |

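Those numbers map directly onto failure modes that legacy payment infrastructure was never designed for: handshakes and signed messages that exceed the MTU on payment networks, fixed-buffer assumptions in MT message handlers, and the cryptographic throughput of installed HSM fleets.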

The retrofit path is to triage these constraints individually — bigger buffers here, faster HSMs there, fragmentation tolerance in the middleboxes. That is a defensible six-month bridge. It is not an architecture.
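The size arithmetic behind those constraints can be sketched in a few lines. The byte counts are copied from the table above. The two limits (`MTU_PAYLOAD` and `HANDLER_BUFFER`) are hypothetical values picked for illustration and would need to be measured on each real rail.

```python
import math

# Sizes in bytes from the table: (public key, signature or ciphertext).
SIZES = {
    "RSA-2048": (256, 256),
    "ECDSA P-256": (64, 64),
    "ML-KEM-768": (1184, 1088),
    "ML-DSA-65": (1952, 3309),
    "SLH-DSA-128f": (32, 17088),
}

# Hypothetical limits, for illustration only.
MTU_PAYLOAD = 1500     # assumed per-packet budget, ignoring header overhead
HANDLER_BUFFER = 512   # assumed fixed signature buffer in a legacy handler


def triage(name: str) -> dict:
    """Compare one primitive's output with RSA-2048 and the assumed limits."""
    _, output = SIZES[name]
    baseline = SIZES["RSA-2048"][1]
    return {
        "ratio_vs_rsa": round(output / baseline, 1),
        "packets": math.ceil(output / MTU_PAYLOAD),
        "fits_buffer": output <= HANDLER_BUFFER,
    }


for name in SIZES:
    print(name, triage(name))
```

With these assumed limits, an ML-DSA-65 signature alone spans three packets and overflows the buffer that an RSA-2048 signature fits comfortably, before any certificate chain or message payload is counted.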

Retrofit Versus Replace: The Decision That Defines the Programme

The honest framing is that retrofit is a controlled migration plan with a short shelf life, and replacement is the only stable destination. The decision is which one the bank funds first, and how long the retrofit window stays open before it becomes a permanent kludge.

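Retrofit means triaging the size constraints rail by rail: bigger buffers in the message handlers, faster HSMs, and fragmentation tolerance in the middleboxes.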

That work can be done. It does not fix the underlying problem, which is that SWIFT MT and many ISO 20022 implementations encode the cryptographic envelope inside a message format that pins the algorithm. The next algorithm transition — and there will be one, when ML-KEM eventually shows weakness or a new standard supersedes it — runs the same migration again on the same rails.

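Replacement means accepting that the cryptographic layer is not a property of the message format. It is a property of a separable envelope service that the message format calls into. Concretely, the envelope is separated from the payload, signing happens at a service boundary rather than inside the message-format handler, and algorithm identifiers are treated as data rather than as schema.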

The replacement design survives the next algorithm change without re-touching the rail.
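A minimal sketch of what a separable envelope service might look like follows. Everything in it is hypothetical: the `REGISTRY`, `seal`, and `open_envelope` names, the JSON envelope layout, and the toy payload. The two registered "signers" are keyed HMACs from the Python standard library, used only so the sketch runs without a PQC library. They are stand-ins and are not ML-DSA or any other signature scheme.

```python
import hashlib
import hmac
import json
from typing import Callable, Dict

Signer = Callable[[bytes, bytes], bytes]

# Stand-in algorithms (NOT post-quantum signatures). A real service would
# register HSM-backed ML-DSA or hybrid signers under their own identifiers.
REGISTRY: Dict[str, Signer] = {
    "demo-hmac-sha256": lambda key, msg: hmac.new(key, msg, hashlib.sha256).digest(),
    "demo-hmac-sha512": lambda key, msg: hmac.new(key, msg, hashlib.sha512).digest(),
}


def seal(payload: bytes, alg: str, key: bytes) -> str:
    """Wrap a payload in an envelope that carries its algorithm as data."""
    tag = REGISTRY[alg](key, payload)
    return json.dumps({"alg": alg, "sig": tag.hex(), "payload": payload.decode("utf-8")})


def open_envelope(envelope: str, key: bytes) -> bytes:
    """Verify with the declared algorithm if it is registered, then return the payload."""
    doc = json.loads(envelope)
    signer = REGISTRY.get(doc["alg"])
    if signer is None:
        raise ValueError(f"unsupported algorithm: {doc['alg']}")
    payload = doc["payload"].encode("utf-8")
    if not hmac.compare_digest(signer(key, payload), bytes.fromhex(doc["sig"])):
        raise ValueError("signature check failed")
    return payload


# The message handler never names an algorithm; only configuration does.
message = b'{"type": "pacs.008", "amount": "100.00"}'  # toy payload
for alg in REGISTRY:
    print(alg, open_envelope(seal(message, alg, b"demo-key"), b"demo-key") == message)
```

The point of the design is that adding or retiring an algorithm is a change to the registry, which also acts as the allow-list of accepted identifiers, and not a change to the message handler or the message schema.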

The Crypto-Agile Architecture, Layer by Layer

The infrastructure layers that matter for PQC migration are not the business layers of "data, control, economics" that suit a generic banking narrative. The layers that matter are cryptographic.

| Layer | What it does | The PQC question | Architectural directive |
| --- | --- | --- | --- |
| HSM / key-management | Generates, stores, and operates on key material under hardware isolation | Does the installed HSM firmware support ML-KEM, ML-DSA, and a hybrid key-encapsulation API? What is the signing throughput delta versus ECDSA on the same hardware? | Inventory every HSM partition by algorithm support and per-second capacity. Decommission anything that pins to RSA without a firmware path. Stand up dedicated PQC partitions before production cutover. |
| PKI / certificate authority | Issues, revokes, and chains trust through X.509 certificates | Can the CA sign with ML-DSA today? Is there a tested process for rotating the root and re-issuing the chain? Are CRL and OCSP responders sized for ML-DSA signature weight? | Treat the CA stack as the load-bearing wall. Establish a PQC-capable subordinate now. Time the root rotation for the longest-lived certificate dependency, not for convenience. |
| Transport / network | Terminates TLS, IPsec, and MACsec between payment endpoints | Does the load balancer, WAF, and middlebox path tolerate hybrid handshakes that exceed legacy MTU? Are session-resumption tickets sized for PQC keys? | Move TLS termination to a crypto-agile boundary (sidecar or mesh). Raise MTU policy on payment VPNs. Test the full path with fragmentation deliberately induced. |
| Application / message payload | Carries SWIFT MT, ISO 20022 pacs / pain / camt messages and their cryptographic envelopes | Does the rail's message handler tolerate ML-DSA-sized signed envelopes? Are intermediate parsers algorithm-aware or do they truncate on length? | Separate envelope from payload. Sign at a service boundary, not inside the message-format handler. Treat algorithm identifiers as data, not as schema. |
| Audit / evidence | Produces the cryptographic chain of custody supervisors and clients rely on | Are historical signed records still verifiable once the signing algorithm is deprecated? Is there a long-term archival signature plan? | Counter-sign archives with a hash-based primitive (SLH-DSA) for assurance that survives any single algorithm break. Treat the audit chain as a regulated artefact, not as a build by-product. |

The discipline is to make every algorithm choice a configuration value at every layer. The institution that hard-codes RSA-2048 at any of those layers inherits a coordinated end-of-life event when that algorithm falls.
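One way to make that discipline checkable is an inventory that records, per component, which algorithms it supports and whether the choice is configurable. The sketch below is illustrative: the `Component` fields, the `findings` function, and the example estate are all hypothetical, and a real inventory would be fed from HSM, CA, and deployment tooling.

```python
from dataclasses import dataclass
from typing import List, Set

REQUIRED = {"ML-KEM", "ML-DSA"}  # the default primitives named in this article


@dataclass
class Component:
    layer: str
    name: str
    algorithms: Set[str]
    configurable: bool  # False: the algorithm is fixed at build time


def findings(components: List[Component]) -> List[str]:
    """Report components that are pinned or lack the required primitives."""
    report = []
    for c in components:
        missing = sorted(REQUIRED - c.algorithms)
        if not c.configurable:
            report.append(f"{c.layer}/{c.name}: algorithm pinned at build time")
        elif missing:
            report.append(f"{c.layer}/{c.name}: missing {', '.join(missing)}")
    return report


# Hypothetical estate, for illustration only.
ESTATE = [
    Component("hsm", "partition-a", {"RSA", "ECDSA"}, configurable=True),
    Component("pki", "issuing-ca", {"RSA", "ML-DSA", "ML-KEM"}, configurable=True),
    Component("payload", "mt-handler", {"RSA"}, configurable=False),
]

for line in findings(ESTATE):
    print(line)
```

Run against a real estate, a report like this shows where the coordinated end-of-life event would land: every pinned component is a code change, while every configurable one is a configuration change.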

What This Means by Bank Type

The exposure profile differs by institution. The directives differ accordingly.

Global Banks

Global banks operate the largest installed HSM fleets, the longest certificate chains, and the most complex network paths between counterparties. The dominant risk is not algorithm selection — it is the coordination cost of changing algorithms across hundreds of internal services and dozens of external counterparties simultaneously.

The directive is to fund the PQC-capable CA, the crypto-agile transport boundary, and the algorithm-parameterised signing service as 2026 work, before any single rail is retrofitted. The retrofit then becomes a routine production change inside a known framework. Without the framework, every rail retrofit re-litigates the same architectural decisions.

Regional Banks

Regional banks have less algorithmic surface area but proportionally fewer specialist staff. The dominant risk is HSM vendor lock-in to algorithms the vendor has not committed to support.

The directive is to write PQC support — specifically ML-KEM and ML-DSA, with a tested firmware upgrade path — into every HSM contract renewal from 2026 onward. Banks without that clause inherit a forced hardware replacement on the vendor's timetable, not their own.

Fintechs and PSPs

Payment service providers and fintechs typically sit between a bank counterparty and a merchant or end-user system. Their cryptographic exposure is the API boundary on both sides.

The directive is to publish a hybrid TLS interface — classical plus ML-KEM — on the bank-facing side as table stakes in 2026 commercial conversations. The fintech that arrives with PQC interoperability already demonstrated wins integration cycles against the fintech that has not yet started.
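The core idea of a hybrid handshake is that the session secret depends on both a classical and a post-quantum shared secret, so an attacker has to break both. A minimal sketch of that combiner idea follows, assuming the two secrets are concatenated and passed through HKDF (RFC 5869). The `hybrid_secret` name, the zero salt, and the `info` label are assumptions for illustration, the two input secrets are random stand-ins, and this is not the TLS 1.3 key schedule.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with SHA-256 (RFC 5869): extract a key, then expand it."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hybrid_secret(classical_ss: bytes, pq_ss: bytes) -> bytes:
    """Derive one session secret that depends on both inputs."""
    return hkdf_sha256(classical_ss + pq_ss, salt=b"\x00" * 32, info=b"hybrid-demo")


# Random stand-ins for an ECDH shared secret and an ML-KEM shared secret.
ecdh_ss = os.urandom(32)
mlkem_ss = os.urandom(32)

session = hybrid_secret(ecdh_ss, mlkem_ss)
print(len(session), session != hybrid_secret(ecdh_ss, os.urandom(32)))
```

Changing either input changes the derived secret, which is the property that lets a hybrid interface keep classical interoperability during the migration window without relying on the classical half alone.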

Corporate Treasurers

Treasurers do not operate cryptographic infrastructure directly. They do consume it — every bank API, every secure file transfer, every signed confirmation depends on the bank's PKI.

The directive is to add three questions to every bank RFP in 2026: which PQC algorithms is the bank using today in customer-facing TLS, what is the bank's plan for ML-DSA-signed payment confirmations, and how does the bank intend to preserve verifiability of historical signed records once RSA is deprecated. Banks that cannot answer those questions are signalling something about their underlying engineering readiness.

What Happens Next

The first wave of PQC deployment in payments will be invisible to end users. Hybrid TLS appears in the handshake, certificate chains grow, HSM signing latency creeps up by a few milliseconds, and the rails continue to operate. That is the success path.

The visible failures will be retrofit-driven: a rail that cannot accept an ML-DSA-signed envelope without truncation, a CA whose CRL distribution point chokes on the new signature weight, a middlebox that drops or reorders the segments of a hybrid ClientHello too large for a single packet. Those failures will land in production through 2027.

The architectural decision in 2026 is whether to fund the replacement infrastructure that makes the retrofit irrelevant, or to fund a sequence of rail-specific fixes that each look cheaper individually and aggregate into a longer, more expensive migration. The bank that picks the first path will run quieter operations through the transition. The bank that picks the second will spend the rest of the decade explaining incident reviews to supervisors.

PQC is not a cryptography problem dressed as an infrastructure problem. It is an infrastructure problem that cryptography happens to have started.

Questions? Answers.

Is there a deadline that forces this work?

The hard regulatory deadlines are jurisdictional. The US Quantum Computing Cybersecurity Preparedness Act binds federal systems. The NSA CNSA 2.0 timeline targets 2035 for national-security systems. The BIS Project Leap publication and the FSB's work programme are pulling that horizon forward for systemic payment infrastructure. HNDL means the operational clock started running well before any of those nominal dates.

Why is ML-KEM the recommended key encapsulation rather than something faster?

ML-KEM (the standardised version of CRYSTALS-Kyber) had the strongest combination of small ciphertext and key sizes among the lattice candidates, with mature implementations and side-channel hardening. NIST published it as FIPS 203. Faster candidates exist, but they carry larger keys and ciphertexts or inspire less confidence in their security parameters.

Why not use SLH-DSA everywhere instead of ML-DSA?

SLH-DSA (the standardised version of SPHINCS+) is hash-based and therefore relies only on hash-function security, which is the most conservative assumption available. Its signatures are several times larger than ML-DSA's: 17,088 bytes for SLH-DSA-128f against 3,309 bytes for ML-DSA-65, roughly 5×. That is acceptable for archival counter-signing, but unworkable for transactional signing where size matters per message. The standard pattern is ML-DSA for production signing and SLH-DSA for archival assurance.
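To see why hash-based signatures rest only on a hash function, and why they are large, it can help to look at the simplest member of the family. The sketch below is a Lamport one-time signature. It is not SLH-DSA, it must never sign more than one message per key, and it is shown purely as an illustration of the principle.

```python
import hashlib
import os
from typing import List, Tuple

Key = List[Tuple[bytes, bytes]]


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen() -> Tuple[Key, Key]:
    """One pair of secrets per digest bit; the public key is their hashes."""
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(256)]
    pk = [(h(zero), h(one)) for zero, one in sk]
    return sk, pk


def digest_bits(message: bytes) -> List[int]:
    d = h(message)
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk: Key, message: bytes) -> List[bytes]:
    """Reveal one secret per bit of the message digest."""
    return [sk[i][bit] for i, bit in enumerate(digest_bits(message))]


def verify(pk: Key, message: bytes, signature: List[bytes]) -> bool:
    """Hash each revealed secret and compare it with the published value."""
    if len(signature) != 256:
        return False
    bits = digest_bits(message)
    return all(h(s) == pk[i][bit] for i, (s, bit) in enumerate(zip(signature, bits)))


sk, pk = keygen()
sig = sign(sk, b"archive batch")
print(verify(pk, b"archive batch", sig), sum(len(s) for s in sig))
```

Forging a signature here means inverting the hash function, and nothing else. The cost is visible in the output: one 32-byte secret per digest bit, 8,192 bytes for a single one-time signature, which is the same trade that makes SLH-DSA suited to archival counter-signing and not to per-message transactional signing.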

Can a bank just wait until the rails publish PQC profiles?

A bank that waits inherits the migration window the rail publishes, which is shorter than the bank's own internal change cycle. By the time SWIFT, the local RTGS operator, and the relevant CCPs each publish their PQC profile, the migration window will be twelve to twenty-four months. Banks that have not pre-built their CA, transport, and HSM capability will not meet it without operational shortcuts.

What is the single highest-leverage thing to fund first?

A PQC-capable subordinate certificate authority, integrated into the existing PKI, that can issue dual-algorithm certificates (RSA plus ML-DSA) without disrupting production trust. That establishes the rotation primitive. Everything else — transport upgrades, HSM partition planning, message-envelope changes — can be scheduled around it.
