🧭 Analogy
Imagine a shop that must update its ledger and also file a slip in the outbox for the courier. Doing them as two separate acts means a power cut between them leaves the books and the courier disagreeing. The outbox pattern staples both into one notarized action — either the page is signed and the slip is filed together, or neither happens — and the courier later picks up slips idempotently, even if he comes by twice.
The problem: the dual write
The naive way to emit an event when state changes is to write the database and publish to the stream as two separate operations. This is the dual write, and it is an anti-pattern: there is no atomic guarantee between the two. It works most of the time — which is precisely the danger. The rare failure (crash between the two writes) leaves the database and stream inconsistent; missing or phantom data surfaces weeks later and is nearly impossible to trace. Dual writes are acceptable only for loss-tolerant measurement data; for anything that matters, use the outbox.
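The failure mode can be reproduced in a few lines. This is a minimal sketch, not a real integration: an in-memory SQLite database stands in for the database, a Python list stands in for the stream, and `SimulatedCrash`, `dual_write`, and the `orders` table are hypothetical names introduced only for illustration.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Stands in for the process dying between the two writes."""


def dual_write(db, stream, order_id, crash_before_publish=False):
    # Write 1: the database commit succeeds on its own.
    db.execute("INSERT INTO orders (id, status) VALUES (?, 'PLACED')", (order_id,))
    db.commit()
    if crash_before_publish:
        raise SimulatedCrash("process died after the DB commit")
    # Write 2: a separate operation with no atomic link to write 1.
    stream.append({"type": "OrderPlaced", "order_id": order_id})


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
stream = []  # hypothetical stand-in for the event stream

dual_write(db, stream, 1)
try:
    dual_write(db, stream, 2, crash_before_publish=True)
except SimulatedCrash:
    pass

rows_in_db = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rows_in_db, len(stream))  # 2 rows in the DB, 1 event on the stream
```

Order 2 exists in the database but was never announced, and nothing in either system records that the event is missing.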
graph TD
App["Application"] -->|"1. write DB ✓"| DB[("Database")]
App -->|"2. publish ✗ (crash here)"| Str["Stream"]
DB -.->|"DB and stream now disagree"| Gap["Phantom / lost event"]

The transactional outbox
An outbox is a dedicated table written within the same transaction as the internal state change, so the two are atomic. An asynchronous process then ships the outbox rows to the stream.
graph TD
APP["Application"] -->|"single transaction"| TX["BEGIN ... COMMIT"]
TX --> ROW["Update business table"]
TX --> OB["Insert into outbox table"]
OB -->|"async relay (CDC / poller)"| STREAM["Event stream"]
STREAM --> C["Consumers"]
The outbox needs a strict ordering ID (autoincrementing) plus a created_at timestamp. Crucially, it need not map 1:1 to internal tables — a major benefit for isolating the internal model: you can denormalize at insertion, or keep a 1:1 mapping and denormalize downstream (Bellemare’s eventification pattern, with a private namespace for raw events and a public namespace for the published result).
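As a rough illustration of the write path and a polling relay, the sketch below uses SQLite from the Python standard library. The table layout, the `accounts.public` topic name, and the `deposit` and `relay` helpers are assumptions made for this example; a production relay would more likely be CDC-based and would persist its cursor durably.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE outbox (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- strict ordering ID
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        topic      TEXT NOT NULL,
        payload    TEXT NOT NULL
    );
    INSERT INTO accounts (id, balance) VALUES (1, 100);
""")


def deposit(db, account_id, amount, fail=False):
    # "with db" wraps both statements in one transaction:
    # commit on success, rollback if anything raises.
    with db:
        db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                   (amount, account_id))
        if fail:
            raise RuntimeError("simulated failure before the outbox insert")
        event = {"type": "Deposited", "account_id": account_id, "amount": amount}
        db.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                   ("accounts.public", json.dumps(event)))


def relay(db, stream, last_shipped_id):
    # Polling relay: ship rows newer than the cursor, in ID order.
    rows = db.execute(
        "SELECT id, topic, payload FROM outbox WHERE id > ? ORDER BY id",
        (last_shipped_id,)).fetchall()
    for row_id, topic, payload in rows:
        stream.append((topic, json.loads(payload)))
        last_shipped_id = row_id
    return last_shipped_id


deposit(db, 1, 50)
try:
    deposit(db, 1, 999, fail=True)  # rolled back: no balance change, no event
except RuntimeError:
    pass

stream = []  # hypothetical stand-in for the event stream
cursor = relay(db, stream, last_shipped_id=0)
balance = db.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance, len(stream), cursor)  # 150 1 1
```

The failed deposit leaves neither a balance change nor an outbox row. If the relay crashed after publishing but before saving its cursor, it would ship the same row again on restart, which is why consumers still need to tolerate duplicates.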
For schema compatibility, serialize before committing rather than after writing. Serializing first gives the strongest guarantee: a serialization failure rolls back the whole transaction, and it enables a single shared outbox. Serializing after the write risks unserializable events accumulating in the table. Debezium’s outbox event router is the recommended tooling: it suppresses DELETE propagation, can still emit tombstones, and auto-cleans the table.
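Serialize-before-commit can be sketched as follows, assuming JSON as a stand-in for a schema-aware serializer such as Avro or Protobuf. The `serialize` and `place_order` helpers and the table layout are hypothetical; the point is only that the serializer runs inside the transaction, so its failure aborts the state change too.

```python
import json
import sqlite3
from datetime import datetime

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        payload    BLOB NOT NULL
    );
""")


def serialize(event):
    # Stand-in for a schema-aware serializer; json.dumps raises
    # TypeError on values it cannot encode.
    return json.dumps(event).encode("utf-8")


def place_order(db, order_id, extra):
    with db:  # one transaction for the state change and the outbox row
        db.execute("INSERT INTO orders (id, status) VALUES (?, 'PLACED')",
                   (order_id,))
        # Serialization happens before the commit, inside the transaction.
        payload = serialize({"type": "OrderPlaced", "order_id": order_id, **extra})
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))


place_order(db, 1, {"note": "ok"})
try:
    # A datetime is not JSON-serializable, so serialize() raises.
    place_order(db, 2, {"placed_at": datetime(2024, 1, 1)})
except TypeError:
    pass

orders = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
outbox = db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(orders, outbox)  # 1 1
```

Order 2 never reaches the `orders` table because its event could not be serialized, so the outbox only ever holds rows that are already valid bytes.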
The key insight
The outbox is the cleanest way to be event-first without a CDC framework. Because the app owns its own capture, there is no cross-team dependency on a shared connector platform, and the internal model stays hidden behind the outbox’s public schema.
Processing effectively once
Even with perfect production, consumers can see duplicates (retries, or a crash before the offset commit). The goal is effectively once: updates to the source of truth are consistently applied despite failures — loosely called “exactly once,” though processing and side effects may re-execute after a crash. Two routes:
- With transactions (full support currently only in Kafka): wrap output events, changelog-backed state updates, and the consumer offset commit in a single atomic transaction; consumers reading with read_committed isolation skip events from transactions that have not committed.
- Without transactions: identify and filter duplicates yourself. Generate a dedupe ID (a hash of high-cardinality properties), maintain a per-partition state store of processed IDs (best-effort, with TTL/max-size since perfect dedup is prohibitively expensive), dedup only within a single partition, and always produce with a key plus idempotent writes.
Idempotent writes (Kafka and Pulsar) ensure an event is written once even on producer or broker failure — the foundation either route builds on.
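The route without transactions can be sketched as a per-partition filter. Everything here is an assumption for illustration: the event fields (`account_id`, `txn_id`, `amount`), the `PartitionDedupeStore` class, and the default TTL and size bound. The store is in-memory and best-effort, so a duplicate that arrives after its ID has expired or been evicted will slip through.

```python
import hashlib
import time
from collections import OrderedDict


def dedupe_id(event):
    # Hash of high-cardinality properties; the field names are assumptions.
    raw = f"{event['account_id']}|{event['txn_id']}|{event['amount']}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class PartitionDedupeStore:
    """Best-effort record of processed IDs for ONE partition."""

    def __init__(self, ttl_seconds=3600.0, max_size=100_000, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_size = max_size
        self.clock = clock
        self.seen = OrderedDict()  # dedupe ID -> time first seen, oldest first

    def first_time(self, event_id):
        now = self.clock()
        # Expire entries older than the TTL (oldest entries sit at the front).
        while self.seen and next(iter(self.seen.values())) <= now - self.ttl:
            self.seen.popitem(last=False)
        if event_id in self.seen:
            return False
        self.seen[event_id] = now
        # Enforce the size bound by evicting the oldest entry.
        if len(self.seen) > self.max_size:
            self.seen.popitem(last=False)
        return True


stores = {}  # partition number -> its own dedupe store


def process(partition, event, balances):
    store = stores.setdefault(partition, PartitionDedupeStore())
    if not store.first_time(dedupe_id(event)):
        return False  # duplicate within this partition: skip it
    account = event["account_id"]
    balances[account] = balances.get(account, 0) + event["amount"]
    return True


balances = {}
event = {"account_id": "a1", "txn_id": "t-42", "amount": 50}
results = [process(0, event, balances), process(0, event, balances)]
print(results, balances)  # [True, False] {'a1': 50}
```

Producing with a key matters here because it routes every copy of the same logical event to the same partition, which is the only place this filter can see both copies.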
Committing offsets before processing loses data
Always commit offsets after the work — producing output and applying state — completes. Committing at the start means a crash mid-processing advances the offset past unprocessed events, silently dropping them. After-the-fact commit gives the strongest at-least-once guarantee, which idempotency then upgrades toward effectively once.
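The difference between the two orderings shows up in a toy simulation. This sketch models the log as a Python list and the committed offset as an integer; `consume` and `run_with_restart` are hypothetical helpers, not a real consumer API, and the crash is simulated at one fixed offset.

```python
def consume(log, committed_offset, sink, commit_first, crash_at=None):
    """Process log[committed_offset:] and return the committed offset on exit."""
    offset = committed_offset
    while offset < len(log):
        if commit_first:
            committed_offset = offset + 1  # offset advanced before the work
        if offset == crash_at:
            return committed_offset  # simulated crash mid-processing
        sink.append(log[offset])  # the "work": output and state updates
        if not commit_first:
            committed_offset = offset + 1  # offset advanced after the work
        offset += 1
    return committed_offset


def run_with_restart(log, commit_first):
    sink = []
    committed = consume(log, 0, sink, commit_first, crash_at=1)
    consume(log, committed, sink, commit_first)  # restart from the committed offset
    return sink


log = ["e0", "e1", "e2"]
print(run_with_restart(log, commit_first=True))   # ['e0', 'e2']
print(run_with_restart(log, commit_first=False))  # ['e0', 'e1', 'e2']
```

With commit-first, `e1` is skipped on restart and never processed. With commit-after, a crash that lands after the work but before the commit would instead replay an event, and that duplicate is what idempotent processing absorbs.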
See also
- Event-first design — why teams owning their own capture matters.
- Event processing topologies — changelogs and transactional state.
- Schemas and evolution — serialize-before-commit for safety.
When to use it — and when not
✅ Reach for it when
- A state change and its event must both happen or neither (no lost or phantom events)
- You want to isolate the internal data model from the published event
- Consumers may receive duplicates and must apply each effect once
⛔ Think twice when
- Dual writes — writing to the DB and the stream separately with no atomic guarantee
- Loss-tolerant measurement data where occasional drops are acceptable (dual write may suffice)
- You cannot modify the application or the store at all (use CDC/query-based liberation instead)
Related topics
- Event-first design: Make the event stream the first place shared data lands, aligning bounded contexts to business requirements and adopting the stream-first mindset over application-first thinking.
- Event Processing Topologies: Stateless and stateful stream processing, covering transformations, repartitioning and copartitioning, materialized state, changelogs, windowing, and handling out-of-order and late events.
- Schemas and Evolution: The data contract behind every event, covering explicit schemas, compatibility types (forward, backward, full), the schema registry, and how to handle breaking changes.
Check your understanding
1. The defining requirement of the transactional outbox is that…
Atomicity is the whole point: both writes commit together or roll back together, eliminating the dual-write gap. An async process then ships the outbox rows to the stream.
2. Why is the 'dual write' an anti-pattern?
Dual writes work most of the time, which is exactly what makes them dangerous: the rare failure leaves the DB and stream inconsistent with no trace.
3. 'Effectively once' processing means…
Loosely called 'exactly once,' it really means the net effect is applied once; a crash before offset commit can cause re-execution, so side effects must be idempotent.
4. Without broker transactions, how do you achieve effectively-once application?
Perfect dedup is prohibitively expensive, so it's best-effort with TTL/max-size, scoped per partition, paired with keys and idempotent writes.