Creating and Managing Topics
Topics are the fundamental unit of organization in Kafka. Before producing or consuming messages, you need to create a topic. You can create topics via the CLI, programmatically, or let Kafka auto-create them (not recommended for production).
```bash
# Create a topic
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic orders \
  --partitions 6 --replication-factor 3

# List all topics
kafka-topics.sh --bootstrap-server localhost:9092 --list

# Describe topic details
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic orders

# Delete a topic
kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic old-topic
```
Example `--describe` output:

```
Topic: orders  PartitionCount: 6  ReplicationFactor: 3
  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3
  Partition: 1  Leader: 2  Replicas: 2,3,1  Isr: 2,3,1
  Partition: 2  Leader: 3  Replicas: 3,1,2  Isr: 3,1,2
  Partition: 3  Leader: 1  Replicas: 1,3,2  Isr: 1,3,2
  Partition: 4  Leader: 2  Replicas: 2,1,3  Isr: 2,1,3
  Partition: 5  Leader: 3  Replicas: 3,2,1  Isr: 3,2,1
```
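A common health check on the describe output is verifying that every partition's in-sync replica set (Isr) matches its full replica list; any mismatch means a partition is under-replicated. Here is a small Python sketch of that check (the `under_replicated` helper and its whitespace-separated parsing are assumptions based on the output layout shown above, not part of the Kafka tooling):

```python
# Sample partition lines in the key/value layout of kafka-topics.sh --describe.
# Partition 1 is deliberately missing broker 3 from its Isr.
describe_output = """\
Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2,1
"""

def under_replicated(text: str) -> list[int]:
    """Return partition ids whose Isr set is smaller than their replica set."""
    flagged = []
    for line in text.splitlines():
        tokens = line.split()
        # Pair up "Key:" tokens with the values that follow them.
        fields = dict(zip(tokens[::2], tokens[1::2]))
        replicas = set(fields["Replicas:"].split(","))
        isr = set(fields["Isr:"].split(","))
        if isr != replicas:
            flagged.append(int(fields["Partition:"]))
    return flagged

print(under_replicated(describe_output))  # [1]
```

In a healthy cluster every partition's Isr equals its replica list, so this returns an empty list.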
Choosing Partition Count
Partition count determines the maximum parallelism for consumers. If you have 6 partitions, at most 6 consumers in a group can read in parallel. But more partitions also mean more overhead — more file handles, more memory for replication, and longer leader elections.
Rule of thumb: Start with max(expected_throughput_MB/s / 10, expected_consumer_count), where the division by 10 assumes each partition can sustain roughly 10 MB/s. You can always add partitions later, but you cannot reduce them without recreating the topic.
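The rule of thumb above can be sketched as a small function (the name `recommended_partitions` and the 10 MB/s-per-partition default are illustrative assumptions, not Kafka APIs):

```python
def recommended_partitions(throughput_mb_s: float, consumer_count: int,
                           mb_s_per_partition: float = 10.0) -> int:
    """Rule-of-thumb partition count: enough partitions to carry the
    expected throughput, and at least one per consumer in the group."""
    by_throughput = -(-throughput_mb_s // mb_s_per_partition)  # ceiling division
    return int(max(by_throughput, consumer_count))

print(recommended_partitions(120, 8))   # throughput-bound: -> 12
print(recommended_partitions(30, 12))   # consumer-bound:   -> 12
```

At 120 MB/s the throughput term (12) dominates; at 30 MB/s with 12 consumers, the consumer count does.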
⚠️ You cannot reduce partition count
Once a topic has 12 partitions, you can increase to 24 but never go back to 6. Plan carefully. Over-partitioning wastes resources; under-partitioning limits scalability.
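One reason adding partitions needs planning: the default key-to-partition mapping depends on the partition count, so growing the topic can send existing keys to different partitions, breaking per-key ordering across the boundary. A sketch of the effect (Kafka's Java client actually uses murmur2 hashing; `zlib.crc32` here is just a reproducible stand-in, and `partition_for` is a hypothetical helper):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Simplified keyed partitioner: hash the key, mod the partition count."""
    return zlib.crc32(key) % num_partitions

keys = [b"user-1", b"user-2", b"user-3", b"user-4"]
before = {k: partition_for(k, 6) for k in keys}    # mapping with 6 partitions
after = {k: partition_for(k, 12) for k in keys}    # mapping after growing to 12
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed partition")
```

Any key whose hash lands differently under the new modulus moves, which is why compacted or strictly-ordered topics should be sized generously up front.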
Data Retention
Kafka doesn't keep messages forever (unless you configure it to). Retention is governed by a time limit, a size limit, and a cleanup policy:
| Setting | Default | Purpose |
|---|---|---|
| `retention.ms` | 604800000 (7 days) | How long to keep messages, by time |
| `retention.bytes` | -1 (unlimited) | Max size per partition before deletion |
| `cleanup.policy` | delete | `delete` removes old segments; `compact` keeps the latest record per key |
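With the `delete` policy, Kafka removes whole log segments once their newest record falls outside the retention window. A minimal sketch of that eligibility check (the `expired_segments` helper and the dict-based segment model are assumptions for illustration):

```python
RETENTION_MS = 604_800_000  # 7 days, the retention.ms default

def expired_segments(segments, now_ms, retention_ms=RETENTION_MS):
    """Return segments eligible for deletion: a segment expires once its
    largest record timestamp is older than the retention window."""
    return [s for s in segments if now_ms - s["max_timestamp_ms"] > retention_ms]

now = 10_000_000_000
segments = [
    {"name": "00000000.log", "max_timestamp_ms": now - 800_000_000},  # ~9.3 days old
    {"name": "00017000.log", "max_timestamp_ms": now - 100_000},      # fresh
]
print([s["name"] for s in expired_segments(segments, now)])  # ['00000000.log']
```

Note the granularity: deletion happens per segment, not per message, so records can outlive `retention.ms` slightly until their segment rolls and expires.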
Log Compaction
With cleanup.policy=compact, Kafka keeps only the latest value for each key. Older values with the same key are removed during compaction. This is perfect for maintaining current state — like a user's latest profile or a product's current price.
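The effect of compaction can be modeled in a few lines of Python (the `compact` function is a simplified illustration; real compaction works on log segments and also handles tombstones, where a null value deletes a key):

```python
def compact(log):
    """Keep only the latest value per key, preserving the relative order
    of the surviving records -- a simplified model of log compaction."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later records overwrite earlier ones
    return [(key, value) for key, (offset, value) in
            sorted(latest.items(), key=lambda kv: kv[1][0])]

log = [
    ("user-1", "alice@old.example"),
    ("user-2", "bob@example.com"),
    ("user-1", "alice@new.example"),   # supersedes the first record
]
print(compact(log))
# [('user-2', 'bob@example.com'), ('user-1', 'alice@new.example')]
```

A new consumer reading the compacted topic from the beginning still sees the current value for every key, which is exactly what a state-style topic needs.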
```bash
# Set retention to 30 days for a topic
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name orders \
  --add-config retention.ms=2592000000

# Enable log compaction
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name user-profiles \
  --add-config cleanup.policy=compact
```
Use `delete` retention for event streams (logs, clicks, metrics). Use `compact` for state (user profiles, product catalog, configuration). You can combine both: `compact,delete` keeps the latest record per key AND deletes segments older than the retention window.

Practice Exercises
Easy: Hello World Variant
Modify the example to accept user input and print a personalized greeting.

Easy: Code Reading
Read through the code examples above and predict the output before running them.

Medium: Extend the Example
Take one code example and add error handling, input validation, or a new feature.