Kafka Producers

Sending messages with keys, partitioning strategies, serialization, and delivery guarantees.

Beginner · 30 min read · Kafka

How Producers Work

A Kafka producer is any application that sends messages to Kafka. When you call producer.send() (producer.produce() in the Python client used later in this section), the message doesn't go directly to a broker. It passes through several stages: serialization, partitioning, batching, and finally transmission. Understanding this pipeline is key to configuring producers for your throughput and durability needs.
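The four stages can be traced in plain Python without a broker. This is a toy model under stated assumptions, not how any client library is implemented: `serialize`, `choose_partition`, and `ToyProducer` are hypothetical names, JSON is an assumed wire format, and CRC32 stands in for whatever hash a real client uses.

```python
import json
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for the illustration


def serialize(obj):
    """Stage 1: turn a Python object into bytes (JSON is an assumed format)."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def choose_partition(key_bytes, num_partitions):
    """Stage 2: map a key to a partition (CRC32 is a stand-in hash)."""
    return zlib.crc32(key_bytes) % num_partitions


class ToyProducer:
    """Stages 3 and 4: accumulate per-partition batches, then 'transmit' them."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.batches = defaultdict(list)  # partition -> pending messages
        self.sent = []                    # (partition, batch) pairs handed to the "network"

    def send(self, key, value):
        key_bytes = key.encode("utf-8")
        partition = choose_partition(key_bytes, self.num_partitions)
        self.batches[partition].append((key_bytes, serialize(value)))
        return partition

    def flush(self):
        # Hand every pending batch over in one go, then start fresh.
        for partition, batch in sorted(self.batches.items()):
            self.sent.append((partition, batch))
        self.batches.clear()


producer = ToyProducer(NUM_PARTITIONS)
p1 = producer.send("user_42", {"action": "login"})
p2 = producer.send("user_42", {"action": "logout"})
producer.flush()
```

Both messages share a key, so they land in the same partition and leave the toy producer as a single two-message batch.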

The Producer Record

Every message you send is a ProducerRecord with these fields:

| Field | Required | Purpose |
|-------|----------|---------|
| topic | Yes | Which topic to send to |
| value | Yes | The message payload (bytes) |
| key | No | Used for partitioning. Same key = same partition = ordering |
| partition | No | Override automatic partitioning |
| headers | No | Key-value metadata (like HTTP headers) |
| timestamp | No | Event time (defaults to current time) |
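As a rough illustration of these fields, the record can be modeled as a small data class. `RecordSketch` is a hypothetical stand-in written for this example; it is not a class from confluent-kafka or any other client library.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class RecordSketch:
    topic: str                        # required
    value: bytes                      # required payload
    key: Optional[bytes] = None       # optional; drives partitioning
    partition: Optional[int] = None   # optional; overrides the partitioner
    headers: List[Tuple[str, bytes]] = field(default_factory=list)
    # Defaults to "now" in epoch milliseconds when no event time is given.
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


# Only the required fields.
minimal = RecordSketch(topic="user-events", value=b'{"action": "login"}')

# Every optional field except an explicit partition.
full = RecordSketch(
    topic="user-events",
    value=b'{"action": "login"}',
    key=b"user_42",
    headers=[("source", b"web")],
    timestamp_ms=1705314600000,  # 2024-01-15T10:30:00Z in epoch milliseconds
)
```

In the Python client used later in this section, these fields correspond to arguments of `produce()` rather than to a separate record object.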
Figure: Producer to consumer flow via partitioned topic (a Kafka producer sending messages to topic partitions consumed by a consumer group). Source: kafka.apache.org (Apache 2.0)

Partitioning Strategies

The partitioner decides which partition a message goes to. This is critical because ordering is only guaranteed within a single partition. As long as the topic's partition count does not change, messages with the same key always go to the same partition, which gives ordered processing per entity (e.g., all events for user 42 go to partition 3).

# Python producer example (confluent-kafka)
from confluent_kafka import Producer

producer = Producer({'bootstrap.servers': 'localhost:9092'})

# Key-based partitioning: all events for user_42 go to the same partition
producer.produce(
    topic='user-events',
    key='user_42',          # Determines partition (hash of key % num_partitions)
    value='{"action": "login", "timestamp": "2024-01-15T10:30:00Z"}',
)

# No key: the client picks the partition itself, spreading messages across partitions (no ordering guarantee)
producer.produce(
    topic='page-views',
    value='{"url": "/home", "duration": 30}',
)

producer.flush()  # Wait for all messages to be delivered
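To see why a key pins an entity to one partition, and why that mapping depends on the partition count, here is a minimal sketch. `partition_for` is a hypothetical helper, and `zlib.crc32` is a stand-in hash; real clients use their own hash functions, so actual partition numbers will differ.

```python
import zlib


def partition_for(key, num_partitions):
    """Hypothetical helper: hash the key bytes, then take the remainder."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = ["user_1", "user_2", "user_42", "user_99"]

# Same key, same partition count: the result never changes between calls.
first = [partition_for(k, 6) for k in keys]
second = [partition_for(k, 6) for k in keys]

# Growing the topic from 6 to 8 partitions can move keys to other partitions.
after_resize = [partition_for(k, 8) for k in keys]
moved = [k for k, a, b in zip(keys, first, after_resize) if a != b]

print("6 partitions:", dict(zip(keys, first)))
print("8 partitions:", dict(zip(keys, after_resize)))
print("keys that moved:", moved)
```

Any key that moves after a resize loses its ordering relative to its earlier messages, which is one reason to choose the partition count carefully up front.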

Acknowledgment Modes (Durability vs Speed)

The acks setting controls how many brokers must confirm receipt before the producer considers a message "sent." This is the fundamental tradeoff between durability and speed:

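| Setting | Behavior | Durability | Latency | Use When |
|---------|----------|------------|---------|----------|
| acks=0 | Don't wait for any confirmation | May lose data | Fastest | Metrics, logs you can afford to lose |
| acks=1 | Leader confirms write | Lose data if leader dies before replication | Fast | Data where rare loss is acceptable and latency matters |
| acks=all | All in-sync replicas confirm | No data loss while at least one in-sync replica survives (set min.insync.replicas to 2 or more) | Slower | Financial data, orders, critical events |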

⚠️ Common Mistake: Using acks=0 or acks=1 for critical data

Wrong: Sending payment events with acks=1.

Why: If the leader broker dies after acknowledging but before replicating, the message is lost. For a payment, that means money moved but you have no record of it.

Instead: Use acks=all on the producer, together with min.insync.replicas=2 on the topic (or broker) and a replication factor of at least 3, for any data you can't afford to lose.
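The failure described here can be traced in a toy model. This is a deliberately simplified sketch, not Kafka's replication protocol: `ToyPartition` and `survives_leader_failure` are hypothetical names, and the model assumes the worst-case timing where the leader fails before any follower has copied the message on its own.

```python
class ToyPartition:
    """One leader plus followers; followers only catch up when replicate() runs."""

    def __init__(self, num_followers):
        self.leader_log = []
        self.follower_logs = [[] for _ in range(num_followers)]

    def write(self, message, acks):
        """Append to the leader and return True once the producer would be acked."""
        self.leader_log.append(message)
        if acks == "all":
            # With acks=all the ack is withheld until followers hold the message.
            self.replicate()
        return True

    def replicate(self):
        for log in self.follower_logs:
            log[:] = self.leader_log

    def fail_leader(self):
        """The leader dies; the first follower takes over with whatever it has."""
        self.leader_log = self.follower_logs.pop(0)


def survives_leader_failure(acks):
    partition = ToyPartition(num_followers=2)
    acked = partition.write("payment-123", acks)
    partition.fail_leader()  # worst case: failure right after the ack
    return acked and "payment-123" in partition.leader_log


results = {acks: survives_leader_failure(acks) for acks in ("1", "all")}
print(results)
```

In this model the acks=1 write is acknowledged and then lost, while the acks=all write is still present on the new leader.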

Batching and Compression

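Producers don't send messages one at a time; they group them into per-partition batches for efficiency. Two key settings control this: batch.size caps how large a batch can grow (in bytes), and linger.ms sets how long the producer waits for a batch to fill before sending it. Compression is applied to whole batches, so fuller batches usually compress better.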

# High-throughput producer configuration
producer = Producer({
    'bootstrap.servers': 'kafka1:9092,kafka2:9092,kafka3:9092',
    'acks': 'all',
    'compression.type': 'lz4',
    'batch.size': 65536,      # 64KB batches
    'linger.ms': 10,          # Wait 10ms to fill batches
    'retries': 3,
    'enable.idempotence': True,  # Prevent duplicates on retry
})

Key Takeaway: For most applications, set acks=all, enable.idempotence=true, compression.type=lz4, and linger.ms=5. This gives you durability, deduplication, compression, and good throughput out of the box.
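How a size limit and a linger time interact can be sketched with a toy accumulator. `ToyBatcher` is a hypothetical class written for this illustration, and time is passed in explicitly so the behavior is deterministic; real clients run this logic on a background thread and differ in detail.

```python
class ToyBatcher:
    """Sends a batch when it is full or when it has waited long enough."""

    def __init__(self, batch_size_bytes, linger_ms):
        self.batch_size_bytes = batch_size_bytes
        self.linger_ms = linger_ms
        self.pending = []
        self.pending_bytes = 0
        self.first_append_ms = None
        self.sent_batches = []

    def append(self, payload, now_ms):
        # If this payload would overflow the batch, send what we have first.
        if self.pending and self.pending_bytes + len(payload) > self.batch_size_bytes:
            self._send()
        if not self.pending:
            self.first_append_ms = now_ms  # the linger clock starts with the first message
        self.pending.append(payload)
        self.pending_bytes += len(payload)

    def tick(self, now_ms):
        # Send a partially filled batch once it has lingered long enough.
        if self.pending and now_ms - self.first_append_ms >= self.linger_ms:
            self._send()

    def _send(self):
        self.sent_batches.append(self.pending)
        self.pending = []
        self.pending_bytes = 0
        self.first_append_ms = None


batcher = ToyBatcher(batch_size_bytes=10, linger_ms=10)
batcher.append(b"aaaa", now_ms=0)  # 4 bytes pending
batcher.append(b"bbbb", now_ms=1)  # 8 bytes pending
batcher.append(b"cccc", now_ms=2)  # would exceed 10 bytes, so the first batch is sent
batcher.tick(now_ms=5)             # only 3 ms since "cccc" arrived: keep waiting
batcher.tick(now_ms=12)            # 10 ms elapsed: send the partial batch
```

The first batch goes out because it is full, the second because its linger time expired. Raising linger.ms trades a little latency for fuller batches.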

🔍 Deep Dive: Idempotent Producers

Network failures can cause a producer to retry a message that the broker actually received, creating duplicates. enable.idempotence=true prevents this by assigning each producer a unique ID and each message a per-partition sequence number. The broker detects and discards duplicate deliveries. This is transparent: your code doesn't change. Note that it covers retries within a single producer session; it does not deduplicate messages that your application itself sends twice. Always enable it unless you have a specific reason not to.
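The deduplication idea can be sketched in a few lines. `ToyBroker` is a hypothetical class for illustration only; a real broker tracks sequence numbers per producer ID and partition and handles more cases (such as gaps in the sequence) than this simplified model does.

```python
class ToyBroker:
    """Remembers the last sequence number per producer and drops repeats."""

    def __init__(self):
        self.log = []
        self.last_sequence = {}  # producer_id -> highest sequence appended so far

    def append(self, producer_id, sequence, message):
        if sequence <= self.last_sequence.get(producer_id, -1):
            # Already in the log: acknowledge again without appending.
            return "duplicate"
        self.log.append(message)
        self.last_sequence[producer_id] = sequence
        return "appended"


broker = ToyBroker()
first_try = broker.append(producer_id=7, sequence=0, message="order-1")
# Suppose the acknowledgment is lost in transit, so the producer retries
# the same message with the same sequence number.
retry = broker.append(producer_id=7, sequence=0, message="order-1")
next_message = broker.append(producer_id=7, sequence=1, message="order-2")
```

The retry is acknowledged but not written a second time, so the log ends up with each order exactly once.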

Producer Message Flow

Application (create record) → Serializer (key + value) → Partitioner (choose partition) → Batch (accumulate) → Broker (write to log)

Acknowledgment Modes (acks)

acks=0: Fire and forget
acks=1: Leader confirms
acks=all: All in-sync replicas confirm

Practice Exercises

Easy: Hello World Variant

Modify the example to accept user input and print a personalized greeting.

Easy: Code Reading

Read through the code examples above and predict the output before running them.

Medium: Extend the Example

Take one code example and add error handling, input validation, or a new feature.