Architecture & Core Concepts

Brokers, topics, partitions, producers, consumers, and the commit log.

Beginner 35 min read 📨 Kafka

The Distributed Commit Log

At its core, Kafka is a distributed commit log. Think of it like a shared notebook that multiple writers can append to but nobody can erase. Every message written to Kafka is appended to the end of a log, given a sequential number (offset), and stored durably on disk. This simple concept is what makes Kafka so powerful — it's an append-only, ordered, persistent record of everything that happened.

Unlike traditional message queues (RabbitMQ, SQS) where messages are deleted after consumption, Kafka retains messages for a configurable period (default 7 days). This means multiple consumers can read the same data independently, at their own pace, and even replay history.

Why this matters

The commit log model means Kafka isn't just a message bus — it's a source of truth. You can rebuild application state by replaying the log. This is the foundation of event sourcing, stream processing, and change data capture (CDC).

Brokers — The Servers

A Kafka broker is a server that stores data and serves client requests. A Kafka cluster is a group of brokers working together. In production, you typically run 3 or more brokers for fault tolerance — if one broker dies, the others take over.

Each broker is identified by a unique numeric ID. Brokers handle millions of reads and writes per second. They store data on disk (not just in memory), which sounds slow but Kafka uses sequential I/O and OS page cache to achieve throughput that rivals in-memory systems.

# List all brokers in a cluster
kafka-metadata.sh --snapshot /var/kafka-logs/__cluster_metadata-0/00000000000000000000.log --broker-list

# Describe broker configuration
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-name 0 --describe
Key Takeaway: A single broker can handle hundreds of MB/s of throughput. But always run at least 3 brokers in production — Kafka is designed for distributed operation and loses its fault-tolerance guarantees with fewer than 3 nodes.

Topics — Named Feeds

A topic is a named category or feed of messages. Think of topics as tables in a database — you might have a user-signups topic, a page-views topic, and an orders topic. Producers write to topics, consumers read from topics.

Topics are logical — they don't live on a single broker. Instead, each topic is divided into partitions that are spread across multiple brokers. This is how Kafka achieves parallelism and scalability.

# Create a topic with 3 partitions and replication factor 3
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic user-events \
  --partitions 3 --replication-factor 3

# List all topics
kafka-topics.sh --bootstrap-server localhost:9092 --list

# Describe a topic (shows partitions, replicas, ISR)
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic user-events
Output
Topic: user-events	TopicId: abc123	PartitionCount: 3	ReplicationFactor: 3
  Topic: user-events	Partition: 0	Leader: 1	Replicas: 1,2,3	Isr: 1,2,3
  Topic: user-events	Partition: 1	Leader: 2	Replicas: 2,3,1	Isr: 2,3,1
  Topic: user-events	Partition: 2	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2

Partitions — The Unit of Parallelism

A partition is an ordered, immutable sequence of messages. Each partition is a separate log file on disk. When a producer sends a message, it goes to exactly one partition of the topic (determined by the message key or round-robin).

Partitions are Kafka's secret to scalability. If you have 3 partitions and 3 consumers, each consumer reads from one partition in parallel. More partitions = more parallelism = more throughput. But don't go overboard — each partition has overhead (file handles, memory, replication traffic).

PartitionsMax Parallel ConsumersUse Case
11Low volume, strict ordering needed
3-63-6Most applications
12-3012-30High throughput
100+100+Extreme scale (LinkedIn, Netflix)
Kafka partition log structure showing sequential offsets and append-only writes
Kafka partition log anatomy — Source: kafka.apache.org (Apache 2.0)

Offsets — Your Position in the Stream

An offset is a unique, sequential number assigned to each message within a partition. It's like a page number in a book. Offset 0 is the first message, offset 1 is the second, and so on. Offsets are never reused, even after messages expire.

Each consumer tracks its own offset — "I've read up to offset 42 in partition 0." This is how Kafka enables multiple independent consumers to read the same topic at different speeds without interfering with each other.

# Check consumer group offsets (where each consumer is in each partition)
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group my-app
Output
GROUP    TOPIC         PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
my-app   user-events   0          1042            1050            8
my-app   user-events   1          998             998             0
my-app   user-events   2          1105            1200            95

The LAG column is crucial — it shows how far behind each consumer is. Partition 2 has a lag of 95, meaning 95 messages are waiting to be processed. High lag means your consumer can't keep up.

Replication — Fault Tolerance

Every partition is replicated across multiple brokers. One replica is the leader (handles all reads and writes), and the others are followers (passively replicate from the leader). If the leader dies, a follower is automatically promoted to leader.

The ISR (In-Sync Replicas) is the set of replicas that are fully caught up with the leader. The setting min.insync.replicas controls how many replicas must acknowledge a write before it's considered committed. With acks=all and min.insync.replicas=2, you're guaranteed that at least 2 brokers have the data before the producer gets a success response.

⚠️ Common Mistake: replication-factor = 1

Wrong: Creating topics with --replication-factor 1 in production.

Why: If the single broker holding that partition dies, the data is gone. No recovery possible.

Instead: Always use --replication-factor 3 in production. This tolerates 1 broker failure (or 2 with min.insync.replicas=2).

ZooKeeper vs KRaft

Traditionally, Kafka used ZooKeeper to manage cluster metadata (broker list, topic configs, leader election). Starting with Kafka 3.3, KRaft mode (Kafka Raft) replaces ZooKeeper with a built-in consensus protocol. KRaft is simpler to operate (one less system to manage), scales better, and is now the recommended approach.

FeatureZooKeeperKRaft
DependencySeparate ZooKeeper clusterBuilt into Kafka
ScalingLimited by ZKMillions of partitions
Recovery timeMinutesSeconds
StatusDeprecated (removal in 4.0)Production ready
🔍 Deep Dive: How Kafka Achieves High Throughput

Kafka achieves millions of messages per second through several design choices: (1) Sequential I/O — appending to the end of a file is much faster than random writes. (2) Zero-copy — data goes from disk to network socket without copying through application memory. (3) Batching — messages are sent in batches, amortizing network overhead. (4) Compression — batches are compressed (gzip, snappy, lz4, zstd). (5) Page cache — the OS caches recently written data in memory, so consumers reading recent data hit memory, not disk.

Kafka Architecture: Producer → Broker Cluster → Consumer
Producer
sends messages
Broker 1
leader
Broker 2
follower
Broker 3
follower
Consumer
reads messages
Core Concepts
Topic
Named feed of messages
Partition
Ordered, immutable log
Offset
Position in partition
Broker
Server in cluster
Replica
Partition copy
Consumer Group
Parallel readers

Practice Exercises

Easy Draw the Architecture

Sketch a 3-broker Kafka cluster with 1 topic, 3 partitions, replication factor 2. Label leaders and followers.

Medium Partition Assignment

If you have 6 partitions and 3 consumers in a group, how are partitions assigned? What if a consumer fails?

Practice Exercises

Easy Hello World Variant

Modify the example to accept user input and print a personalized greeting.

Easy Code Reading

Read through the code examples above and predict the output before running them.

Medium Extend the Example

Take one code example and add error handling, input validation, or a new feature.