🧭 Analogy
A mailroom processes a morning’s mail in bursts: one clerk sorts incoming items (work queue), photocopies some for multiple departments (copier), discards junk (filter), routes invoices and letters to different desks (splitter), and a supervisor finally tallies the day’s totals from every desk (reduce). Batch computational patterns are that mailroom — short-lived, parallel, and assembled from a few reusable roles.
Where serving patterns run long-lived servers, batch computational patterns are short-lived processes that handle large amounts of data quickly via parallelism — telemetry aggregation, daily reports, video transcoding. MapReduce is the famous example, but several smaller patterns compose into it.
The work queue
A work queue is the simplest batch pattern: a batch of wholly independent work items processed within a time bound, with workers scaled up or down to keep pace. Most of its logic is independent of the work itself, so it is built from reusable library containers with two interfaces:
- Source interface — supplies the stream of items, implemented as an ambassador to the generic queue manager. A RESTful API exposes
GET /api/v1/items(list) andGET /api/v1/items/<name>(detail). Notably it has no completion endpoint — the manager tracks processed vs. pending itself, keeping logic generic. (Always version APIs asv1up front.) - Worker interface — processes one item. It is a one-shot call in its own container group needing isolation against malicious injection, so it uses a file-based API: a
WORK_ITEM_FILEenv var points to a file (mountable via ConfigMap) containing the item’s data — also easier for shell-script workers.
🛠️ Let the orchestrator do the bookkeeping
The manager loads available work, diffs it against already-processed/in-progress state, and spawns workers for the rest. Built on Kubernetes Job objects (which reliably run a worker to completion even across machine failures) and annotations marking which item each job handles, the entire work queue needs no storage of its own.
The dynamic scaling math is concrete: for stability, (per-item processing time / parallelism) must be less than the interarrival time. An item arriving every minute taking 30 s keeps up (2:1) and drains backlogs; taking 1 minute is perfectly balanced but slackless; taking 2 minutes loses ground unboundedly. Four-way parallelism turns a 1-minute item into an effective 15 s. A multi-worker pattern (a specialization of the adapter) composes several reusable worker containers into one — detect faces → tag identities → blur faces.
Event-driven batch: coordination primitives
When processing needs more than one action per item, work queues are linked so one queue’s output feeds another — forming workflows (a DAG of stages coordinated by completion events). Beyond the trivial output→input chain, five coordination patterns:
graph TD IN["Input queue"] --> COP["Copier: duplicate stream"] COP --> Q1["Job A queue"] COP --> Q2["Job B queue"] IN2["Input queue"] --> FIL["Filter: drop non-matching"] IN3["Input queue"] --> SPL["Splitter: route by type (no drops)"] SPL --> EMAIL["Email queue"] SPL --> SMS["SMS queue"] IN4["Input queue"] --> SHA["Sharder: even shards"] M1["Repo 1"] --> MER["Merger: many to one"] M2["Repo 2"] --> MER --> BUILD["Shared build queue"]
- Copier — duplicates one stream into two or more identical streams (transcode a video to 4K, 1080p, mobile, and a GIF thumbnail).
- Filter — drops items not meeting a criterion (keep only users who opted into promotional email); ideally an ambassador wrapping the source.
- Splitter — routes different inputs to different queues without dropping any (shipped-order notifications into email and SMS queues).
- Sharder — divides one queue into evenly distributed shards via a sharding function, for reliability (a bad deploy affects only ~25% with four shards — staged rollouts) and even resource distribution across regions; it spills to healthy shards as they shrink.
- Merger — the opposite of a copier: combines many queues into one (many repos’ commits merged into a single build-and-test stream).
The data stream between stages is managed by publish/subscribe (pub/sub) infrastructure — topics that publishers write and subscribers read, with reliable storage and delivery (Amazon SQS, Azure EventGrid, or open-source Kafka with --replication-factor and --partitions). See asynchronous messaging at scale.
Coordinated batch: join and reduce
Where the previous patterns split and chain, coordinated patterns pull parallel outputs back together — the reduce half of MapReduce (the map step being a sharded work queue).
- Join (barrier synchronization) — holds work until all parallel items finish, guaranteeing no data is missing before an aggregation phase (enabling whole-set statistics). The cost: it forces all data through the previous stage first, reducing parallelism and adding latency.
- Reduce — does not wait; it optimistically merges parallel outputs pairwise, each step combining several into one, repeated until a single output remains. Crucially it can start while map/shard is still running, finishing the whole computation faster.
graph TD subgraph Join JA["Item A"] --> BAR["Barrier<br/>wait for all"] JB["Item B"] --> BAR JC["Item C"] --> BAR BAR --> JAGG["Aggregate<br/>complete set"] end subgraph Reduce RA["Item A"] --> R1["Merge"] RB["Item B"] --> R1 RC["Item C"] --> R2["Merge"] R1 --> R2 R2 --> ROUT["Final output"] end
💡 Join for completeness, reduce for speed
Use a join when correctness demands the complete set (e.g. wait for all license-plate blurring before deleting originals, so a failure allows a full rerun). Use reduce when you can combine incrementally — counting words across sharded pages, summing populations across counties, or weighting per-town histograms into a national one. Reduce’s overlap with the map stage is what makes MapReduce fast.
The hands-on image-processing pipeline ties it all together: a multi-worker plate-detector + blurrer, sharded for reliability; a join before deleting originals; a copier into delete-originals and recognition queues; recognition again sharded with a multi-worker group; and a final reduce summing everything into the aggregate count.
See also
- Serving patterns — the long-running counterpart to batch.
- Single-node patterns — the ambassador (source) and adapter (multi-worker) reused here.
- Asynchronous messaging at scale — the pub/sub that connects batch stages.
When to use it — and when not
✅ Reach for it when
- When a large batch of mostly independent work items must be processed within a time bound.
- When processing needs multiple stages, chained so one queue's output feeds another.
- When parallel outputs must be combined into an aggregate result (the reduce of MapReduce).
⛔ Think twice when
- For long-running serving systems — use the serving patterns instead.
- When a single item needs a synchronous immediate response rather than queued processing.
Related topics
The three multi-node serving topologies — replicate to scale requests, shard to scale data, scatter/gather to scale time — plus the readiness, hot-sharding, and straggler realities that govern them.
ds-patternsSingle-Node Patterns: Sidecar, Ambassador, AdapterThe three co-located container patterns — sidecar augments, ambassador brokers, adapter normalizes — that turn one machine's containers into reusable distributed-system building blocks.
ds-scalabilityAsynchronous Messaging at ScaleDecoupling producers from consumers with queues and logs — persistence and delivery guarantees, pub/sub, competing consumers, dead-letter queues, and the event-log shift Kafka makes.
Check your understanding
Score: 0 / 41. Why does a work queue use a FILE-based interface for workers but an HTTP interface for the source?
The source is an ambassador exposing a RESTful list/detail API; the worker is a remote one-shot call needing isolation and simplicity, so it receives a WORK_ITEM_FILE env var pointing to mountable file data.
2. For a stable work queue, what must hold between processing time and interarrival time?
If effective processing time (per-item time / parallelism) is below interarrival time, the queue keeps up and drains backlogs; if above, the backlog grows unboundedly.
3. What does a sharder do in an event-driven batch pipeline?
A sharder splits a queue via a sharding function so a bad worker update or failure affects only a fraction (e.g. 25% with four shards), and work spreads evenly across regions; it spills to healthy shards as they shrink.
4. How does reduce differ from join (barrier synchronization)?
Join guarantees completeness (no missing data) at the cost of parallelism and latency; reduce repeatedly combines outputs pairwise until one remains and can start before all map work finishes.
Comments
Sign in with GitHub to join the discussion.