Sebastien Rousseau

CASE STUDY

BankStatementParser — unified transaction intelligence for treasury

Role: Author and maintainer

Period: 2024 – present

Status: Production, actively maintained

Problem

Corporate treasury teams receive bank statements from dozens of banks in CAMT, PAIN.001, MT940, OFX, and CSV formats, and as scanned PDFs. Each format carries different field semantics, encodings, and ambiguities. Most teams hand-build brittle per-bank parsers, which blocks real-time cash forecasting, fraud detection, and audit-ready reconciliation.

What I built

An open-source Python toolkit that unifies the common bank statement formats into a single, normalised transaction stream. It combines schema-validated CAMT / MT940 / PAIN parsers, an OCR fallback for scanned PDFs, deterministic field mapping, and SR 11-7-grade audit evidence for every transformation step.
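To illustrate what normalising into a single transaction stream can look like, here is a minimal sketch that maps the entries of a CAMT.053 document onto one flat record type. The Transaction dataclass, the parse_camt053 function, and the trimmed sample XML are hypothetical and are not BankStatementParser's actual API; only the element names (Ntry, Amt, CdtDbtInd, BookgDt) and the namespace come from the ISO 20022 camt.053.001.02 schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Namespace of the camt.053.001.02 schema; other versions use a different suffix.
NS = {"c": "urn:iso:std:iso:20022:tech:xsd:camt.053.001.02"}


@dataclass(frozen=True)
class Transaction:
    """Hypothetical unified record; field names are illustrative only."""
    booking_date: date
    amount: Decimal       # signed: credits positive, debits negative
    currency: str
    source_format: str


def parse_camt053(xml_text: str) -> list[Transaction]:
    """Map every <Ntry> element to a Transaction (assumed mapping rules)."""
    root = ET.fromstring(xml_text)
    transactions = []
    for entry in root.iterfind(".//c:Ntry", NS):
        amt = entry.find("c:Amt", NS)
        is_debit = entry.findtext("c:CdtDbtInd", namespaces=NS) == "DBIT"
        booked = entry.findtext("c:BookgDt/c:Dt", namespaces=NS)
        transactions.append(Transaction(
            booking_date=date.fromisoformat(booked),
            amount=Decimal(amt.text) * (-1 if is_debit else 1),
            currency=amt.get("Ccy"),
            source_format="camt.053",
        ))
    return transactions


# Trimmed sample: a real statement carries many more mandatory elements.
SAMPLE = """<Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02">
  <BkToCstmrStmt><Stmt>
    <Ntry><Amt Ccy="EUR">125.50</Amt><CdtDbtInd>CRDT</CdtDbtInd>
      <BookgDt><Dt>2024-03-01</Dt></BookgDt></Ntry>
    <Ntry><Amt Ccy="EUR">40.00</Amt><CdtDbtInd>DBIT</CdtDbtInd>
      <BookgDt><Dt>2024-03-02</Dt></BookgDt></Ntry>
  </Stmt></BkToCstmrStmt>
</Document>"""

transactions = parse_camt053(SAMPLE)
```

Amounts are held as Decimal rather than float so that monetary values survive the mapping without rounding drift. A parser for MT940 or OFX would, under the same assumption, return the same Transaction type, which is what makes the downstream stream format-agnostic.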

Engineering rigour
Signal                 Evidence
Formats supported      CAMT (.052, .053, .054), MT940, OFX, CSV, scanned PDF (OCR)
Normalisation target   Single unified transaction record schema
Audit trail            Per-field provenance: source format and parser version logged per row
License                Apache-2.0 / MIT
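
The per-field provenance listed in the audit-trail row can be pictured as each normalised value carrying a note of where it came from. The following is a minimal sketch under stated assumptions: the CSV layout, the FIELD_MAP mapping, the FieldProvenance record, and the PARSER_VERSION placeholder are all hypothetical and do not describe the toolkit's real data model.

```python
import csv
import io
from dataclasses import dataclass

PARSER_VERSION = "0.0.0-example"  # placeholder, not a real release number

# Hypothetical fixed mapping: unified field name -> source CSV column.
FIELD_MAP = {"booking_date": "Date", "amount": "Amount", "description": "Details"}


@dataclass(frozen=True)
class FieldProvenance:
    """One normalised value plus the evidence of where it came from."""
    field: str
    value: str
    source_format: str
    source_column: str
    source_row: int        # 1-based data row, header excluded
    parser_version: str


def normalise_with_provenance(csv_text: str) -> list[list[FieldProvenance]]:
    """Return one list of FieldProvenance entries per CSV data row."""
    rows = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        rows.append([
            FieldProvenance(field, row[column].strip(), "csv",
                            column, row_number, PARSER_VERSION)
            for field, column in FIELD_MAP.items()
        ])
    return rows


SAMPLE_CSV = (
    "Date,Amount,Details\n"
    "2024-03-01,125.50,Invoice 1001\n"
    "2024-03-02,-40.00,Bank fee\n"
)
audit_rows = normalise_with_provenance(SAMPLE_CSV)
```

Because the mapping is a fixed table rather than inferred per file, the same input always yields the same output, and an auditor can trace any value back to its source column, row, and parser version.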

External validation

  • Featured in the 2026-06-14 article: From Bank Statements to Unified Transaction Intelligence
  • Designed to satisfy BCBS 239 risk-data aggregation requirements

Standards

  • ISO 20022 CAMT
  • SWIFT MT940
  • OFX
  • BCBS 239