Understanding Delivery Semantics
In distributed systems, message delivery has three possible guarantees. Understanding the differences is crucial for choosing the right approach for your use case:
| Semantic | Guarantee | Mechanism | Use Case |
|---|---|---|---|
| At-most-once | Messages may be lost, never duplicated | Producer doesn't retry on failure | Metrics, logs you can afford to lose |
| At-least-once | Messages never lost, may be duplicated | Producer retries; consumer processes before committing its offset | Most applications (with idempotent consumers) |
| Exactly-once | Messages never lost AND never duplicated | Requires coordination between producer, broker, and consumer | Financial transactions, billing, inventory |
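To make the first two rows concrete, here is a broker-free Python simulation (all names are illustrative, not a Kafka API) showing how the order of "process" and "commit" determines whether a crash causes loss or duplication:

```python
# Illustrative simulation: the order of processing vs. offset commit
# decides whether a crash loses or duplicates a message.
def consume(messages, commit_first, crash_at):
    """Deliver messages; crash between the two steps on index `crash_at`,
    then restart from the last committed offset."""
    processed, committed = [], 0
    for run in range(2):                       # original run + restart
        i = committed
        while i < len(messages):
            if commit_first:
                committed = i + 1              # at-most-once: commit first
            else:
                processed.append(messages[i])  # at-least-once: process first
            if run == 0 and i == crash_at:
                break                          # crash before the second step
            if commit_first:
                processed.append(messages[i])
            else:
                committed = i + 1
            i += 1
    return processed

consume(['a', 'b', 'c'], commit_first=True, crash_at=1)   # → ['a', 'c']: 'b' lost
consume(['a', 'b', 'c'], commit_first=False, crash_at=1)  # → ['a', 'b', 'b', 'c']: 'b' duplicated
```

Committing before processing can only under-deliver; processing before committing can only over-deliver. That single ordering choice is the whole difference between the first two semantics.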
Idempotent Producers
The first step to exactly-once is preventing duplicate messages from the producer side. When a producer retries a send (e.g., due to a network timeout), the broker might receive the same message twice. Idempotent producers solve this.
```python
# Enable idempotent producer (confluent-kafka)
from confluent_kafka import Producer

producer = Producer({
    'bootstrap.servers': 'localhost:9092',
    'enable.idempotence': True,   # Prevents duplicates on retry
    'acks': 'all',                # Required for idempotence
    'retries': 2147483647,        # Max retries (set automatically)
    'max.in.flight.requests.per.connection': 5,  # Safe with idempotence
})
```
With enable.idempotence=true, Kafka assigns each producer a PID (Producer ID) and each message a sequence number. If the broker receives a message with a PID+sequence it's already seen, it silently discards the duplicate. This is transparent — no code changes needed.
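The broker-side bookkeeping can be sketched in a few lines of plain Python. This is an illustrative model, not Kafka's actual implementation; `Broker` and its fields are invented for the example:

```python
# Sketch of broker-side dedup via producer ID (PID) + sequence number.
class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}  # highest sequence number seen per PID

    def append(self, pid, seq, message):
        """Append unless this (pid, seq) was already seen, i.e. a retry."""
        if seq <= self.last_seq.get(pid, -1):
            return 'duplicate'          # silently discard the retried send
        self.log.append(message)
        self.last_seq[pid] = seq
        return 'appended'

broker = Broker()
broker.append(pid=1, seq=0, message='order-42')
broker.append(pid=1, seq=0, message='order-42')  # network timeout retry: discarded
broker.append(pid=1, seq=1, message='order-43')
# broker.log is now ['order-42', 'order-43'] — no duplicate
```

The key point: deduplication keys on (PID, sequence), not on message content, so even byte-identical legitimate messages with fresh sequence numbers are still appended.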
Transactional Messaging
For end-to-end exactly-once across multiple topics, use transactions. A transaction groups multiple sends into an atomic unit — either all messages are committed or none are.
```java
// Transactional producer
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("transactional.id", "order-processor-1"); // Must be unique per instance
props.put("enable.idempotence", true);

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.initTransactions();

try {
    producer.beginTransaction();
    // All these sends are atomic
    producer.send(new ProducerRecord<>("orders", orderId, orderJson));
    producer.send(new ProducerRecord<>("inventory", productId, inventoryUpdate));
    producer.send(new ProducerRecord<>("notifications", userId, notification));
    producer.commitTransaction(); // All or nothing
} catch (Exception e) {
    producer.abortTransaction(); // None of the messages become visible
}
```
The read_committed Isolation Level
Consumers must set isolation.level=read_committed to only see messages from committed transactions. Without this, consumers see ALL messages including those from aborted transactions.
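Sticking with the confluent-kafka style used earlier, the consumer configuration looks like this (a config fragment; the group id and topic name are illustrative):

```python
# Consumer settings for reading only committed transactional messages.
consumer_config = {
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'order-consumers',        # hypothetical consumer group
    'isolation.level': 'read_committed',  # hide messages from aborted transactions
    'enable.auto.commit': False,
}
# In real use: confluent_kafka.Consumer(consumer_config).subscribe(['orders'])
```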
Enable idempotence on every producer (there is no reason not to). Use transactions only when you need atomic writes across multiple topics. Kafka Streams gets exactly-once for free with processing.guarantee=exactly_once_v2.

⚠️ Common Mistake: Exactly-once doesn't mean your consumer is idempotent
Kafka's exactly-once guarantees that messages are delivered exactly once. But if your consumer writes to an external database, the database write might succeed while the offset commit fails — causing the consumer to reprocess. Your consumer logic must still be idempotent (writing the same data twice produces the same result).
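A common way to close that gap is to write to the external store with a natural key and upsert, so reprocessing the same message is a no-op. A minimal sketch with an in-memory stand-in for the database (all names illustrative):

```python
# Idempotent consumer write: a keyed upsert makes reprocessing harmless.
db = {}  # stand-in for an external table keyed by order_id

def handle(message):
    """Upsert by key: applying the same message twice yields the same state."""
    db[message['order_id']] = message['total']

msg = {'order_id': 'o-1', 'total': 99.50}
handle(msg)
handle(msg)  # redelivered after an offset-commit failure: state unchanged
```

An append-only `INSERT` would have produced two rows here; the upsert-by-key design is what makes the duplicate delivery invisible downstream.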
Practice Exercises
- **Medium:** Build a Mini Project. Combine concepts from this tutorial to build a small utility or tool.
- **Medium:** Debug Challenge. Introduce a bug in one of the code examples and practice finding and fixing it.
- **Hard:** Refactoring Exercise. Rewrite one example using a different approach and compare the tradeoffs.