Many machines, one system
Distributed Systems
A distributed system is one where a machine you didn’t know existed can break yours. This track covers the fallacies you inherit the moment you cross a network, the fundamentals of consistency and scale, and the reusable patterns — sidecar, sharding, scatter-gather — that tame them.
Foundations
The hard truths — the fallacies of distributed computing, communication models, consistency and CAP, time and ordering, and consensus.
The dangerous assumptions every distributed system must unlearn — why networks fail partially, clocks lie, and 'it worked on one machine' stops being true.
✦ Complete · ⏱ 5 min
2 · Beginner · Communication and Consistency: CAP and the Models
What CAP really forces you to choose, and the spectrum from eventual consistency to strict serializability, so you pick a consistency model on purpose, not by accident.
✦ Complete · ⏱ 5 min
3 · Intermediate · Time and Ordering in Distributed Systems
Why wall-clock timestamps can't order events across machines, and how logical clocks, version vectors, and anti-entropy reason about order instead.
✦ Complete · ⏱ 5 min
4 · Advanced · Consensus and Coordination
How nodes agree on one value despite failures: two-phase commit, Raft, Paxos, and the ownership-election primitives that make leaders safe.
✦ Complete · ⏱ 5 min
Scalability & Data
Scaling out — load balancing and statelessness, caching, distributed databases, replication and partitioning, and asynchronous messaging at scale.
What scaling actually means — scale up vs out vs down, the twin principles of replication and optimization, statelessness, and why Amdahl's law caps your gains.
✦ Complete · ⏱ 5 min
6 · Intermediate · Load Balancing and Elasticity
How a load balancer spreads requests across stateless replicas: Layer 4 vs Layer 7, distribution policies, health checks, elastic autoscaling, and the cascading-failure defenses that keep it all standing.
✦ Complete · ⏱ 5 min
7 · Intermediate · Distributed Caching
How caching buys capacity by not asking the database: cache-aside vs read/write-through, TTLs and eviction, hit-rate economics, and HTTP/CDN caching at the edge.
✦ Complete · ⏱ 5 min
8 · Advanced · Distributed Databases: Replication and Sharding
Scaling the data tier: read replicas, partitioning and sharding, leader-follower vs leaderless replication, NoSQL data models, and the consistency knobs real engines expose.
✦ Complete · ⏱ 5 min
9 · Intermediate · Asynchronous Messaging at Scale
Decoupling producers from consumers with queues and logs: persistence and delivery guarantees, pub/sub, competing consumers, dead-letter queues, and the event-log shift Kafka makes.
✦ Complete · ⏱ 5 min
System Patterns
Reusable building blocks — single-node patterns (sidecar, ambassador, adapter), serving patterns (replication, sharding, scatter-gather), and batch patterns.
The three co-located container patterns — sidecar augments, ambassador brokers, adapter normalizes — that turn one machine's containers into reusable distributed-system building blocks.
✦ Complete · ⏱ 5 min
11 · Advanced · Serving Patterns: Replicated, Sharded, Scatter/Gather
The three multi-node serving topologies (replicate to scale requests, shard to scale data, scatter/gather to scale time) plus the readiness, hot-sharding, and straggler realities that govern them.
✦ Complete · ⏱ 5 min
12 · Advanced · Batch Computational Patterns: Work Queues, Event-Driven, Coordinated
Patterns for short-lived, parallel data processing: the work queue, the event-driven coordination primitives (copier, filter, splitter, sharder, merger), and the coordinated join/reduce that produces aggregates.
✦ Complete · ⏱ 6 min
🌐 The network is not reliable: design for partial failure
The eight fallacies of distributed computing all reduce to one: the network will fail in ways a single process never does. Latency is nonzero, messages get lost and duplicated, and parts of the system go dark while others stay up. Idempotency, timeouts, retries and backpressure aren’t extras — they’re the baseline.
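To make the baseline concrete, here is a minimal sketch of two of those tools working together: retries with exponential backoff on the client, and an idempotency key on the server so a retried request is not applied twice. Everything named here (`FlakyPaymentServer`, `charge`, `call_with_retries`, and their parameters) is hypothetical and invented for illustration; the "network" is simulated in memory by raising `TimeoutError` after the work has already been done. Backpressure is not shown.

```python
import time
import uuid


class FlakyPaymentServer:
    """Hypothetical in-memory server. It applies each request, but the reply
    to the first `drop_replies` calls is lost, as if the network timed out."""

    def __init__(self, drop_replies):
        self.drop_replies = drop_replies
        self.balance = 0
        self.seen = {}  # idempotency key -> result already computed

    def charge(self, key, amount):
        if key not in self.seen:      # first time we see this key: apply once
            self.balance += amount
            self.seen[key] = self.balance
        if self.drop_replies > 0:     # the work happened, but the reply is lost
            self.drop_replies -= 1
            raise TimeoutError("no reply before deadline")
        return self.seen[key]         # duplicates get the stored result


def call_with_retries(op, max_attempts=4, base_delay=0.01, sleep=time.sleep):
    """Retry `op` on TimeoutError with exponential backoff.

    Re-raises the last TimeoutError once `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        try:
            return op()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


server = FlakyPaymentServer(drop_replies=2)
key = str(uuid.uuid4())  # one key per logical operation, reused on every retry
result = call_with_retries(lambda: server.charge(key, 50))
```

In this sketch the client sends the same charge three times, because the first two replies are lost, yet the balance moves only once. Without the `seen` lookup, each retry would charge again, which is why retries and idempotency have to be designed together.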