Multi-Cluster & Replication

MirrorMaker 2, cross-datacenter replication, and disaster recovery.

Advanced 35 min read 📨 Kafka

Why Multi-Cluster?

A single Kafka cluster is enough for most applications. Multiple clusters become necessary when you have data residency requirements (for example, EU data must stay in the EU), disaster recovery needs, latency requirements across regions, or a need for organizational isolation.

MirrorMaker 2 (MM2)

MirrorMaker 2 is Kafka's built-in tool for replicating data between clusters. It is built on the Kafka Connect framework and replicates topics (data and configuration), consumer group offsets, and topic ACLs from one cluster to another.

# mm2.properties
clusters = us-east, eu-west

us-east.bootstrap.servers = kafka-us:9092
eu-west.bootstrap.servers = kafka-eu:9092

# Replicate all topics from us-east to eu-west
us-east->eu-west.enabled = true
us-east->eu-west.topics = .*

# Sync consumer group offsets to the target cluster
sync.group.offsets.enabled = true
# Emit heartbeat records for monitoring replication health
emit.heartbeats.enabled = true

Figure: MirrorMaker 2 cross-cluster replication between two Kafka clusters. Source: kafka.apache.org (Apache 2.0)
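
The settings above can also be scoped to one replication flow and to specific topics and groups. The following is a minimal sketch under stated assumptions, not a recommended production configuration: the topic pattern `orders.*`, the group pattern, and the numeric values are illustrative, and `sync.group.offsets.enabled` assumes Kafka 2.7 or later.

```properties
# mm2.properties (sketch; cluster aliases and hosts reuse the example above)
clusters = us-east, eu-west
us-east.bootstrap.servers = kafka-us:9092
eu-west.bootstrap.servers = kafka-eu:9092

us-east->eu-west.enabled = true
# Assumed topic pattern: replicate only topics whose names start with "orders"
us-east->eu-west.topics = orders.*
# Assumed group pattern: checkpoint offsets for all consumer groups
us-east->eu-west.groups = .*

# Copy topic configuration and topic ACLs alongside the data
us-east->eu-west.sync.topic.configs.enabled = true
us-east->eu-west.sync.topic.acls.enabled = true

# Emit checkpoints and write translated offsets to the target cluster
us-east->eu-west.emit.checkpoints.enabled = true
us-east->eu-west.sync.group.offsets.enabled = true
# Illustrative interval; tune it against your recovery point objective
us-east->eu-west.sync.group.offsets.interval.seconds = 60

# Replication factor for topics MM2 creates on the target (illustrative value)
replication.factor = 3
```

A file like this is typically passed to the `bin/connect-mirror-maker.sh` script that ships with the Kafka distribution. Note that MM2 only writes synced offsets for consumer groups that have no active members on the target cluster, so this setting is mainly useful for standby consumers.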

Replication Patterns

Pattern         Description                                           Use Case
Active-Passive  One cluster active, other is standby                  Disaster recovery
Active-Active   Both clusters serve traffic, replicate to each other  Multi-region low latency
Hub-Spoke       Edge clusters replicate to central hub                Aggregating data from many locations
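
As a rough illustration of the Active-Active pattern, the sketch below enables replication in both directions between the two clusters from the earlier example. The topic pattern `orders.*` is an assumption made for illustration, and the sketch relies on MM2's default replication policy, which prefixes replicated topics with the source cluster alias.

```properties
# mm2.properties (active-active sketch; topic pattern is illustrative)
clusters = us-east, eu-west
us-east.bootstrap.servers = kafka-us:9092
eu-west.bootstrap.servers = kafka-eu:9092

# Replicate in both directions
us-east->eu-west.enabled = true
us-east->eu-west.topics = orders.*
eu-west->us-east.enabled = true
eu-west->us-east.topics = orders.*

# With the default replication policy, a topic "orders" produced in us-east
# appears on eu-west as "us-east.orders" (and vice versa). MM2 does not
# replicate a prefixed topic back to the cluster it came from, which
# prevents replication loops.
```

Under these assumptions, producers in each region write to their local `orders` topic, and a consumer that needs the combined view would subscribe to both the local topic and the prefixed remote topic. Active-Passive is the same file with only one direction enabled, and Hub-Spoke adds one flow per edge cluster pointing at the hub.
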
Key Takeaway: Start with a single cluster. Add multi-cluster only when you have clear requirements (DR, data residency, latency). MM2 handles replication, but test your failover procedure regularly.

Practice Exercises

Production Scenario (Hard)

Design a solution using these concepts for a real-world production system.

Performance Analysis (Hard)

Benchmark two different approaches and explain which is better and why.