Why Real-Time CDC Is Non-Negotiable in Finance
Batch pipelines were the industry norm for a reason: they're predictable, debuggable, and operationally simple. But the moment your business moves from "we need yesterday's numbers by 9 AM" to "we need the last 30 seconds of transaction state to flag a suspicious pattern" — batch is architecturally dead. It cannot be tuned to sub-minute latency without becoming something it was never designed to be.
In Finance and FinTech specifically, the consequences are measured in dollars and regulatory fines. Real-time risk scoring, instant fraud detection, near-real-time SWIFT reconciliation, and Basel III reporting windows are requirements that batch pipelines structurally fail. Change Data Capture (CDC) with a streaming backbone is the correct answer — but only if it's engineered properly.
At Vipra Software, we've built and operated CDC pipelines for financial institutions across Europe, South Asia, and Asia-Pacific — processing anywhere from 3 GB to 10 GB of incremental data per day with end-to-end latencies under 800ms. What follows is the unfiltered engineering reality of making those systems work at production scale, not just in a demo environment.
Business Use Cases: Why Finance Can't Wait
Before we get into the engineering, it's worth anchoring this in business outcomes. CDC isn't a technology project — it's an enabler for specific financial capabilities that have hard latency requirements.
Real-Time Risk Monitoring
Credit exposure, counterparty risk, and intraday VaR calculations need a view of positions that is seconds old — not hours. Every CDC event from a trade booking system or settlement engine must land in a queryable store fast enough to influence the next decision. A 30-minute lag means you're running risk calculations against stale reality. With a Debezium → Kafka → Flink pipeline, position changes are visible in downstream risk models less than a second after the database commit.
Transaction Reconciliation
Cross-border payments, card network settlements, and internal funding legs need continuous reconciliation. Batch-based reconciliation catches errors the next morning — which is too late when a payment network expects a same-day response. Streaming CDC enables continuous matching between internal ledger states and external settlement files, surfacing breaks in near real-time rather than at day-end batch close.
Regulatory Reporting (Near Real-Time)
MiFID II, EMIR, and emerging real-time reporting mandates from the FCA and RBI require near-instantaneous transaction reporting windows. CDC pipelines that land trade events in a compliant data store within seconds of execution — with a complete audit trail — dramatically reduce regulatory reporting risk. The alternative is a fragile, nightly batch export that fails silently and only reveals data gaps at month-end.
Customer 360 Analytics
Hyper-personalised product recommendations, real-time credit limit adjustments, and churn-risk scoring all depend on having an up-to-date customer view. A CDC pipeline that propagates changes from CRM, transaction, and behavioural systems in near real-time enables operational analytics at a level that batch-fed data warehouses simply cannot support.
End-to-End Architecture: Source to Warehouse
The canonical CDC architecture is straightforward to diagram and genuinely hard to operate. Here's the full stack as we deploy it for financial clients, followed by the strategic decisions that determine whether it survives production.
Historical Load Strategy
The initial snapshot load is where most teams make their first serious mistake. The common approach — point Debezium at the source and enable snapshot.mode=initial — works fine for tables with 10 million rows. It does not work for a financial ledger with 2 billion rows in a production database during business hours.
Our recommended pattern for large-scale historical migrations:
- Parallel bulk export via JDBC source connector or native database export (pg_dump, mysqldump with streaming output) into staging S3/GCS, partitioned by a natural key range.
- Spark-based batch ingestion from the staging layer into the target warehouse — BigQuery load jobs, Snowflake COPY INTO, or Delta Lake MERGE — at high throughput with explicit write idempotency via primary key deduplication.
- Cutover point logging: record the exact binlog position (MySQL: binlog_file + binlog_pos; Postgres: LSN) at which the bulk export was initiated. This becomes the Debezium starting offset for the incremental phase.
- Start the CDC connector from the recorded offset only after the bulk load is verified complete and consistent. This ensures zero gap and zero double-processing.
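To make the cutover handoff concrete, here is a minimal Python sketch of the offset record you would seed into Kafka Connect's offsets topic before starting the connector. The connector and server names are hypothetical, and the exact envelope fields vary by Debezium version, so verify the format against your deployment:

```python
import json

def debezium_mysql_start_offset(connector_name: str, server_name: str,
                                binlog_file: str, binlog_pos: int) -> tuple[str, str]:
    """Build the (key, value) JSON pair to seed into Kafka Connect's
    offsets topic so Debezium resumes from the recorded cutover point.
    Envelope fields are version-dependent; check your deployed connector."""
    key = json.dumps([connector_name, {"server": server_name}])
    value = json.dumps({"file": binlog_file, "pos": binlog_pos})
    return key, value

# Hypothetical names: a ledger connector resuming from the position
# recorded at the start of the bulk export.
key, value = debezium_mysql_start_offset(
    "ledger-connector", "ledger-db", "mysql-bin.000412", 157_882_911)
```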
Never run Debezium's initial snapshot against a large production OLTP database during business hours. The REPEATABLE READ transaction it holds for snapshot consistency will accumulate lock wait timeouts and degraded query performance. Always pre-export via an offline replica.
Incremental CDC Pipeline Design
Once the historical load is complete, the CDC pipeline takes over. Debezium reads the database transaction log (binlog / WAL), converts each change event into a structured Kafka message — with a before/after envelope and full schema metadata — and publishes it to a per-table topic.
The Flink job consumes these topics, applies transformations (PII masking, enrichment, deduplication), and writes to the target. The critical design decisions here:
Exactly-Once vs At-Least-Once
Flink's exactly-once semantics require a combination of: source offsets being checkpointed atomically with Flink state, and the sink supporting transactional writes. BigQuery's Storage Write API and Snowflake's connector via Kafka Connect both support this pattern. For Delta Lake on S3/GCS, exactly-once is achieved through Delta's ACID transaction log combined with Flink's two-phase commit sink.
The honest take: exactly-once has overhead. Checkpoint intervals, two-phase commit coordination, and transactional overhead at the sink add latency. For most financial pipelines, we recommend exactly-once with checkpointing every 30–60 seconds, accepting that this is your minimum latency floor for guaranteed delivery, and using idempotent writes with deduplication keys as an additional safety layer.
Idempotency Design
Every record written to the target must carry a deterministic composite key: typically table_name + primary_key + transaction_id + lsn. This key enables the sink to safely handle re-delivered events — a connector restart or Flink task failure will re-deliver events from the last checkpoint, and a MERGE/UPSERT operation at the target treats re-delivery as a no-op rather than a duplicate insert.
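A minimal sketch of such a composite key, assuming SHA-256 over a pipe-delimited tuple (the separator and hash choice are ours, not a standard):

```python
import hashlib

def dedup_key(table: str, primary_key: str, txn_id: str, lsn: int) -> str:
    """Deterministic composite key: an identical CDC event (e.g. one
    re-delivered after a checkpoint restore) always hashes to the same
    key, so a MERGE on this column turns re-delivery into a no-op."""
    raw = f"{table}|{primary_key}|{txn_id}|{lsn}"
    return hashlib.sha256(raw.encode()).hexdigest()

# Re-delivery of the same event yields the same key -> safe upsert.
k1 = dedup_key("ledger_entries", "acct-42", "txn-9001", 553_211)
k2 = dedup_key("ledger_entries", "acct-42", "txn-9001", 553_211)
assert k1 == k2
```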
The Hard Parts: What Breaks in Production
This is the section the tutorials skip. Every item below is something we've personally debugged in a production financial environment. Some of these cost hours; some cost days. None of them are covered in the official quickstart documentation.
1. Connector Restarts and Binlog Position Loss
Debezium's MySQL connector runs on Kafka Connect, which stores the connector's binlog position in its internal offsets topic (offset.storage.topic, typically named connect-offsets). When the connector restarts after an unexpected crash, it reads its stored offset and resumes. The failure mode that catches teams off-guard: the database's binlog has rotated and the stored offset no longer exists.
MySQL default binlog retention is 7 days (or less with aggressive purge settings). If your Debezium connector is down for longer than the retention window — due to an extended incident, a protracted deployment, or a misconfigured health check that didn't alert — the connector will fail to resume with a cryptic error: The connector is trying to read binlog starting at ..., but this is no longer available on the server.
Recovery from binlog gap requires re-running the historical snapshot from the recorded baseline offset, which means a partial or full re-migration. In a FinTech environment, this is a major incident. Prevention is the only acceptable strategy: set binlog_expire_logs_seconds to at least 30 days, monitor connector lag with aggressive alerting (alert at 4 hours of downtime, page at 24), and maintain a documented recovery runbook.
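On MySQL 8.0, the retention floor is a one-line server setting; 30 days is our recommended minimum, not a server default:

```sql
-- Retain binlogs for 30 days (2,592,000 s) so a multi-day connector
-- outage cannot outlive the log segment the connector must resume from.
SET PERSIST binlog_expire_logs_seconds = 2592000;
```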
2. Kafka Offset Drift and Reprocessing Nightmares
Offset drift happens when the consumer group's committed offset diverges from the actual processing state — typically caused by auto-commit being enabled, a consumer crash between processing and commit, or topic compaction removing messages the consumer hasn't yet consumed.
The consequence in a financial pipeline: either you miss events (offset jumps forward) or you reprocess events (offset resets backward). Both are bad. Missing a transaction event means your downstream state is inconsistent. Reprocessing produces duplicates, which are only safe if deduplication is implemented correctly end-to-end.
Auto-commit in Kafka consumers commits offsets on a timer, regardless of whether downstream processing succeeded. A consumer crash after auto-commit but before successful sink write means that event is permanently lost from the consumer's perspective. Always use manual offset management with explicit commit after confirmed downstream write.
Flink's Kafka source handles this correctly when checkpointing is enabled — offsets are committed to Kafka only when a Flink checkpoint succeeds, giving the at-least-once guarantee you need as the foundation for idempotent sinks. But this means your Kafka topic retention must be long enough to cover the checkpoint interval plus any operational delay. Minimum recommended retention: 7 days for CDC topics.
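The difference is easiest to see in a toy model (plain Python, not real Kafka): a consumer crashes mid-stream on a poison event, and we check which offset survives under each commit strategy:

```python
def run(events, sink, autocommit: bool):
    """Toy consumer loop: the sink 'crashes' on the marked event, and we
    observe which offset would be committed at restart time."""
    committed = 0
    for offset, event in enumerate(events):
        if autocommit:
            committed = offset + 1   # committed on a timer, before the write
        if event == "poison":
            return committed         # crash before the sink write succeeds
        sink.append(event)
        if not autocommit:
            committed = offset + 1   # committed only after confirmed write
    return committed

events = ["e0", "e1", "poison"]
# Auto-commit: the crash leaves offset 3 committed, so "poison" is never
# retried -- it is silently lost on restart.
assert run(events, [], autocommit=True) == 3
# Manual commit-after-write: restart resumes at offset 2 and retries.
assert run(events, [], autocommit=False) == 2
```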
3. Schema Registry Conflicts: The Silent Data Corruptor
Schema Registry provides a central versioned store for Avro/Protobuf/JSON Schema schemas, enabling schema evolution without breaking consumers. It sounds like the solution to schema management. In practice, it introduces its own category of production failure that is uniquely difficult to debug.
The scenario we've seen most frequently: a source database engineer adds a NOT NULL column to a production table without telling the data engineering team. Debezium detects the schema change, attempts to register a new schema version, and the Schema Registry — depending on its compatibility setting — either rejects the new schema or accepts it in a way that breaks existing consumers.
| Compatibility Mode | What It Allows | Risk in CDC |
|---|---|---|
| BACKWARD | New schema can read old data | Old consumers break on new fields |
| FORWARD | Old schema can read new data | New fields silently ignored by old consumers |
| FULL | Both backward and forward | Safest: requires all new fields to be nullable with defaults |
| NONE | No compatibility checks | Consumer deserialization exceptions at runtime |
Our recommendation for financial pipelines: enforce FULL compatibility at the registry level, combined with a schema change approval gate in your CI/CD pipeline. No schema change should reach production without a registered, compatible schema version. Use Debezium's schema.history.internal.kafka.topic to persist full schema history independently of the registry, enabling recovery from registry corruption.
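The FULL rule can be enforced mechanically in such a CI gate. A toy check over simplified schemas ({field: has_default} maps, far simpler than real Avro resolution, but encoding the same constraint):

```python
def full_compatible(old: dict, new: dict) -> bool:
    """Toy FULL-compatibility check: any field that is added OR removed
    must carry a default, so both old and new readers keep working.
    Real schema-registry resolution also checks types, aliases, etc."""
    added = new.keys() - old.keys()
    removed = old.keys() - new.keys()
    return all(new[f] for f in added) and all(old[f] for f in removed)

old = {"id": False, "amount": False}
# Nullable column with a default: both directions still resolve.
assert full_compatible(old, {**old, "channel": True})
# The classic incident: a NOT NULL column with no default is rejected.
assert not full_compatible(old, {**old, "channel": False})
```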
4. Late-Arriving Events and Event-Time vs Processing-Time
In streaming systems, there are two clocks: event time (when the event actually occurred, from the database transaction timestamp) and processing time (when Flink processes the event). In healthy systems, these are close. In financial systems under load — or when a connector restarts and replays a backlog — they can diverge by minutes.
Late-arriving events cause window aggregations to produce incorrect results. A transaction that was committed at 14:55:59 but arrives in Flink at 15:02:00 will miss a 5-minute tumbling window that closed at 15:01:00. For risk aggregations, this means an undercount. For regulatory reporting, this may constitute a breach.
Use WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(30)) for CDC streams where connector lag is the primary lateness source. Pair this with a side output for events that exceed the lateness threshold — these must be handled in a separate reconciliation job, not silently dropped. For compliance-sensitive aggregations, consider using processing-time windows with a corrective batch job rather than attempting to guarantee completeness in the stream.
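A toy version of the bounded-out-of-orderness watermark with a side output (timestamps in seconds; Flink's actual watermark generation differs in detail, but the routing decision is the same):

```python
def route(events, max_out_of_orderness=30):
    """Toy watermark: watermark = max event time seen minus the bound.
    Events at or behind the watermark go to a side output for a
    reconciliation job instead of being silently dropped."""
    watermark = float("-inf")
    on_time, side_output = [], []
    for ts in events:
        if ts <= watermark:
            side_output.append(ts)   # late: beyond the 30 s bound
        else:
            on_time.append(ts)
        watermark = max(watermark, ts - max_out_of_orderness)
    return on_time, side_output

# An event trailing the stream by more than 30 s lands in the side output.
on_time, late = route([1000, 1020, 1361, 999])
assert late == [999]
```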
5. Out-of-Order Data: The Kafka Partition Problem
Kafka guarantees ordering within a partition, not across partitions. If a single entity's events — say, a customer account's UPDATE and DELETE — land on different partitions (because you used a non-key-based partitioning strategy, or because a rebalance changed partition assignment), Flink may process the DELETE before the UPDATE, leaving a ghost record in your target.
For CDC pipelines, always partition Kafka topics by the source table's primary key hash. Debezium supports this natively via the message.key.columns configuration. This ensures all events for a given row land on the same partition and are consumed in order by a single Flink operator subtask.
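The property that matters is stability of the key-to-partition mapping, sketched here with MD5 as a stand-in for Kafka's murmur2 partitioner:

```python
import hashlib

def partition_for(key: bytes, num_partitions: int = 12) -> int:
    """Stable key -> partition mapping. Kafka's default partitioner uses
    murmur2; MD5 here is a stand-in that demonstrates the property that
    matters: the same primary key always maps to the same partition."""
    return int.from_bytes(hashlib.md5(key).digest()[:4], "big") % num_partitions

# Every event for this (hypothetical) account key lands on one partition,
# so its UPDATE can never be consumed after its DELETE.
p = partition_for(b"account:8812734")
assert all(partition_for(b"account:8812734") == p for _ in range(1000))
```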
6. Data Duplication and Deduplication at Scale
At-least-once delivery — which is what you get from Debezium + Kafka under normal connector restart scenarios — means duplicates are not an exception; they're expected. Your deduplication strategy must be correct and scalable.
For Flink-based deduplication, we use a KeyedProcessFunction with a MapState<String, Long> storing the last-seen transaction sequence number per key. Events arriving with a lower-or-equal sequence number are discarded. State TTL is set to 24 hours to bound state store growth without risking incorrect deduplication of valid late arrivals within the watermark window.
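Stripped of Flink's state machinery, the dedup logic itself is small. A pure-Python sketch, with a dict standing in for MapState and the 24-hour TTL omitted:

```python
class SeqDedup:
    """Sketch of the keyed dedup operator: keep the highest sequence
    number seen per key; drop anything lower-or-equal. In Flink this
    state lives in MapState with a 24 h TTL; here it is a plain dict."""
    def __init__(self):
        self.last_seen: dict[str, int] = {}

    def accept(self, key: str, seq: int) -> bool:
        if seq <= self.last_seen.get(key, -1):
            return False             # duplicate or stale re-delivery
        self.last_seen[key] = seq
        return True

d = SeqDedup()
assert d.accept("acct-42", 7)       # first delivery: passes
assert not d.accept("acct-42", 7)   # re-delivered after restart: dropped
assert d.accept("acct-42", 8)       # genuinely new change: passes
```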
At the target layer, both Snowflake and BigQuery support MERGE statements with explicit deduplication keys. Delta Lake's MERGE INTO with WHEN MATCHED THEN UPDATE achieves the same. This provides a second line of defence: even if a duplicate slips through Flink's stateful dedup, it is absorbed by the idempotent sink write.
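The shape of that idempotent write, in BigQuery-flavoured SQL with illustrative table and column names:

```sql
-- Re-delivered events match on dedup_key and become no-op updates
-- rather than duplicate rows.
MERGE INTO analytics.ledger_entries AS t
USING staging.cdc_batch AS s
  ON t.dedup_key = s.dedup_key
WHEN MATCHED THEN
  UPDATE SET t.amount = s.amount, t.updated_at = s.updated_at
WHEN NOT MATCHED THEN
  INSERT (dedup_key, account_id, amount, updated_at)
  VALUES (s.dedup_key, s.account_id, s.amount, s.updated_at)
```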
7. Backfill vs Live Stream Conflicts
When you need to re-derive a historical window — a regulatory requirement, a data model change, a bug fix — you're running a Flink batch backfill job while the live CDC stream is simultaneously writing to the same target. Without coordination, you will have writes from two jobs conflicting over the same partitions or rows.
The cleanest pattern: write backfill output to a staging table with a job-run identifier suffix, validate it, then use a single atomic SWAP or RENAME operation to replace the production partition. Never write a backfill directly into a live production table that a streaming job is actively updating. The window of conflict is small but the consequences — corrupted aggregate state in a risk model — are not.
Best Practices: What Actually Works
Pipeline Design Patterns
The pipeline pattern we use for FinTech deployments combines a thin Flink enrichment and routing layer with business-logic separation in the target warehouse, rather than overloading the stream processor with complex transformation logic. Flink should handle: ordering guarantees, deduplication, PII masking, and routing to the correct sink topic. Business logic — joins, aggregations, derived metrics — should live in the warehouse using scheduled dbt models fed by the CDC stream.
This keeps the Flink job stateless enough to be restart-safe and operationally simple, while keeping business logic in a version-controlled, testable, SQL-based layer.
Monitoring and Observability
A CDC pipeline with no observability is not a production system — it's an unmonitored time bomb. Critical metrics to instrument:
- Debezium connector lag: the number of binlog events buffered but not yet published to Kafka. Alert at >10,000 events; page at >50,000.
- Kafka consumer group lag: the number of Kafka messages published but not yet consumed by Flink. Use Kafka's built-in kafka-consumer-groups.sh or expose via JMX → Prometheus → Grafana.
- Flink checkpoint duration and failure rate: checkpoint duration exceeding 60 seconds indicates state store pressure. Failure rate above zero in a 5-minute window warrants immediate investigation.
- End-to-end latency: measure the delta between the database transaction timestamp (from the Debezium event envelope) and the timestamp of confirmed sink write. Target: <1 second at p50, <5 seconds at p99.
- Schema Registry registration failures: alert on any failed schema registration attempt — this is always a symptom of an unanticipated source schema change.
Retry and Dead-Letter Queue Strategy
Not every event can be processed successfully on first delivery. Malformed events, downstream sink timeouts, and transient network failures are facts of life. Every Flink job should route failed events to a dedicated Dead-Letter Queue (DLQ) Kafka topic with the full event payload, the error reason, and the processing timestamp. A separate monitoring job reads the DLQ and alerts on accumulation. A scheduled remediation job replays DLQ events after the root cause is resolved.
Never silently drop events. In a financial pipeline, every dropped event is a potential audit finding.
Flink Checkpointing and State Management
Flink's state backend choice is critical for financial pipelines with large keyed state. RocksDB is the correct choice for CDC deduplication state — it spills to disk, supports incremental checkpoints, and handles state sizes measured in tens of GB per task slot without the OOM risk of the in-memory HashMap backend. Configure incremental checkpoints to S3 or GCS with a retention of 3 completed checkpoints minimum.
Scalability and Performance at 3–10 GB/Day
At 3–10 GB/day of incremental ingestion, you're operating in a throughput range that is well within a single well-configured Kafka + Flink cluster's capabilities — but the configuration details matter enormously. Misconfigurations that are invisible at 100 MB/day become latency killers at 5 GB/day.
Kafka Topic Partitioning Strategy
For CDC topics, partition count should be set to match the maximum expected parallelism of the Flink consumer, not the current throughput. This avoids the painful (and downtime-inducing) operation of repartitioning a topic once consumer parallelism needs to be increased. A reasonable starting point for a financial transaction topic: 12 partitions. Segment size: 256 MB. Retention: 7 days minimum.
Compaction is not appropriate for CDC topics — you need the full event history for replay, not just the latest value per key. Reserve log compaction for aggregate-state materialisation topics, not the raw CDC stream.
Throughput vs Latency Trade-offs
The fundamental tension: higher Flink operator throughput (more records per second) requires larger buffer sizes and batch writes, which increases latency. Lower latency requires smaller buffers and more frequent, smaller writes, which reduces throughput and increases sink load.
For FinTech pipelines with sub-second latency targets, we configure Flink's network buffer timeout to 50ms and BigQuery/Snowflake sink batch size to a maximum of 500 records or 1-second time window, whichever triggers first. This gives p50 latency under 200ms while keeping sink write amplification manageable.
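The count-or-time trigger can be sketched as a small batcher (illustrative only: a real sink would also flush from a background timer, not just inside add):

```python
import time

class Batcher:
    """Flush on N records or a max buffer age, whichever triggers first.
    A sketch of the sink batching described above, not a real
    BigQuery/Snowflake sink; the time check only runs on add() here."""
    def __init__(self, max_records=500, max_age_s=1.0, flush=print):
        self.max_records, self.max_age_s, self.flush = max_records, max_age_s, flush
        self.buf, self.opened_at = [], None

    def add(self, record):
        if not self.buf:
            self.opened_at = time.monotonic()   # first record opens the window
        self.buf.append(record)
        if (len(self.buf) >= self.max_records
                or time.monotonic() - self.opened_at >= self.max_age_s):
            self.flush(self.buf)
            self.buf = []

flushed = []
b = Batcher(max_records=3, flush=flushed.append)
for r in range(7):
    b.add(r)
assert flushed == [[0, 1, 2], [3, 4, 5]]   # 6 is buffered, awaiting the timer
```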
Horizontal Scaling
Flink scales horizontally by adding task managers. When you need to increase throughput: first increase parallelism of the Flink job (without adding task managers, to confirm the bottleneck is compute, not network), then add task managers. Monitor task manager CPU and GC pressure. RocksDB compaction becomes a hidden bottleneck at high parallelism — monitor RocksDB's write stall metrics if you observe unexpected latency spikes without obvious CPU or network causes.
Security and Compliance: Finance-Grade Requirements
Data Encryption in Transit and at Rest
All Kafka traffic should use TLS 1.2+ between Debezium connectors, brokers, and Flink consumers. Confluent Cloud and Redpanda Cloud enforce this by default; self-managed Kafka requires explicit keystore/truststore configuration in all client and broker properties. Kafka topics containing PII or financial data must be encrypted at rest — Confluent Cloud provides envelope encryption via KMS integration; for self-managed clusters, use filesystem-level encryption on the broker data directories.
PII Masking in the Stream
PII masking must happen at the Flink processing layer, before data lands in the target warehouse. This is non-negotiable in GDPR-regulated markets (EU, UK, India's DPDPA). Our implementation uses a Flink MapFunction that applies field-level masking rules loaded from a configuration store — account numbers are SHA-256 hashed with a rotating pepper, card numbers are truncated to last 4 digits, and customer identifiers are pseudonymised with a deterministic tokenisation scheme that supports re-identification only via a secured key management service.
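A minimal sketch of such masking rules (field names and the pepper handling are illustrative; production code would load both from a secured config store and rotate the pepper via a KMS):

```python
import hashlib

def mask_record(rec: dict, pepper: bytes) -> dict:
    """Illustrative field-level masking: hash account numbers with a
    pepper; truncate card numbers to the last 4 digits."""
    out = dict(rec)
    if "account_number" in out:
        out["account_number"] = hashlib.sha256(
            pepper + out["account_number"].encode()).hexdigest()
    if "card_number" in out:
        out["card_number"] = "*" * 12 + out["card_number"][-4:]
    return out

masked = mask_record(
    {"account_number": "IE29AIBK93115212345678",
     "card_number": "4111111111111111"},
    pepper=b"rotate-me")
assert masked["card_number"] == "************1111"
```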
Audit Trails
CDC's core value proposition for compliance is that it naturally produces an immutable audit trail — every insert, update, and delete in the source database becomes a versioned event in Kafka. Preserving this trail requires a dedicated audit archive topic with infinite retention (or very long retention — minimum 7 years for financial records), compressed with LZ4 or ZSTD, and backed up to cold storage (S3 Glacier, GCS Coldline/Archive, Azure Archive) on a nightly schedule.
The Debezium event envelope includes: operation type (c/u/d), before-state, after-state, source metadata (database, table, LSN/binlog position), and transaction timestamp. This is a compliance-ready audit record with no additional instrumentation required at the application layer.
Financial clients with multi-tenancy requirements can route CDC streams to different cloud targets per data classification: Snowflake (structured data warehouse, strong RBAC, column-level security for PII), BigQuery on Google Cloud Platform (near-real-time analytics with Authorized Views per tenant), Delta Lake on S3/GCS/ADLS (cost-efficient archive with schema evolution support), or a combination via Flink routing rules keyed on tenant ID and data classification tag.
Why Vipra Software
The patterns described in this article are not theoretical. They are the operational playbook we've developed and refined across engagements with financial institutions — from a Dublin-based payments processor handling €2B in daily transaction volume, to a FinTech lending platform in South Asia migrating 400M historical loan records to BigQuery on Google Cloud Platform while keeping a live CDC stream running without a single missed event.
We don't parachute in with a pre-built template. Every pipeline we build is designed to the specific source database characteristics, target schema requirements, compliance constraints, and latency SLAs of the client. We bring production scars — the failed connector restarts, the 3 AM schema conflicts, the offset management decisions that don't show up in any tutorial — and we bake that knowledge into the architecture from day one.
We audit your existing or planned CDC architecture against production failure modes — connector configuration, offset management, schema evolution gaps, and compliance requirements. You get a prioritised risk register and remediation roadmap within two weeks.
From Debezium connector configuration to Flink job implementation, Schema Registry setup, and multi-cloud target integration (Snowflake, BigQuery on GCP, Delta Lake), we build production-grade CDC pipelines with full observability, alerting, and documented runbooks.
Our migration methodology handles datasets from 50M to 5B+ rows with a zero-data-loss, zero-downtime cutover pattern. We've executed this in regulated environments where even a 30-second service interruption requires board-level approval.
Post-launch, our data platform engineers monitor pipeline health, manage schema evolution, handle capacity planning, and continuously optimise throughput and cost — so your team focuses on business value, not infrastructure emergencies.
Conclusion: Real-Time Data Is an Engineering Problem
The Debezium + Kafka + Flink stack is genuinely production-capable for financial data pipelines. It handles the scale, the latency requirements, the schema evolution complexity, and the compliance needs that enterprises face. But it requires engineers who have seen it fail — and who have built the operational muscle to prevent those failures from recurring.
The biggest risk isn't that the technology doesn't work. It's that teams underestimate the operational surface area: the binlog retention race condition, the schema registry compatibility mode that silently drops fields, the Flink state backend choice that works fine in staging and causes GC pauses in production under load.
Real-time data systems are not a feature you ship — they're an infrastructure investment that compounds. Done right, a CDC pipeline is the backbone of every real-time capability your business will build for the next five years: risk systems, personalisation, compliance reporting, fraud detection. Done poorly, it's a source of intermittent, hard-to-diagnose data quality issues that erode trust in every system downstream.
If you're building or planning a CDC pipeline for a financial application, we'd welcome the conversation. Not a sales call — an architecture review. Bring your schema, your volume numbers, your latency targets, and your compliance constraints. We'll tell you where the landmines are.
Ready to Build a Production-Grade CDC Pipeline?
Book a free architecture review with our data engineering team. We'll assess your current or planned pipeline, identify production risks, and outline a build roadmap — no obligation, no boilerplate pitch.