Why Schema Registry?
Kafka stores messages as raw bytes. Without schemas, producers and consumers must agree on the message format out-of-band. This breaks in practice — a producer adds a field, a consumer doesn't know about it, and deserialization fails at 3am on a Saturday.
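Schema Registry solves this by providing a central registry of message schemas. Producers register a schema (or look up its ID) before sending, and each message carries that schema ID. Consumers use the ID to fetch the writer's schema before deserializing. The registry enforces compatibility rules, so a breaking schema change is rejected when it is registered rather than discovered by a failing consumer.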
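
To make the "consumers fetch schemas" step more concrete, here is a minimal sketch of the framing that Confluent's serializers put around each payload: a magic byte (0), then a 4-byte big-endian schema ID, then the serialized data. The function names `frame` and `unframe` are hypothetical and only for illustration; real client libraries do this inside their serializers.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of the Confluent wire format


def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and the schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, payload)."""
    if len(message) < 5:
        raise ValueError("message too short to contain a schema ID header")
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


# Hypothetical schema ID 42; the payload would normally be Avro-encoded bytes.
message = frame(42, b"payload")
schema_id, payload = unframe(message)
```

A consumer would use the recovered `schema_id` to look up the writer's schema in the registry (usually cached after the first lookup) and then decode `payload` with it.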
Apache Avro
Avro is the most common serialization format used with Schema Registry. It's compact (binary), fast, and supports schema evolution. Unlike JSON, Avro messages don't include field names: the schema defines the structure, and the data is just the values, written in schema order.
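
To illustrate what "just values" means, the sketch below hand-encodes a two-field record (an int `id` and a string `name`) using Avro's binary rules: ints are zigzag varints, and strings are a length followed by UTF-8 bytes. This is a simplified illustration under those assumptions, not a replacement for an Avro library, and the helper names are hypothetical.

```python
import json


def zigzag_varint(n: int) -> bytes:
    """Encode an integer as Avro does for int/long: zigzag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small unsigned values
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro strings are a length (as a long) followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


def encode_record(user_id: int, name: str) -> bytes:
    """Fields are written in schema order, with no names or delimiters."""
    return zigzag_varint(user_id) + encode_string(name)


avro_bytes = encode_record(7, "Ada")  # b'\x0e\x06Ada', 5 bytes
json_bytes = json.dumps({"id": 7, "name": "Ada"}).encode("utf-8")  # 24 bytes
```

The Avro bytes are meaningless without the schema, which is exactly why the reader needs to know which schema the writer used.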
{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "age", "type": ["null", "int"], "default": null}
  ]
}
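The age field uses a union type ["null", "int"] with a default of null. This makes age optional: when a consumer using this schema reads an old message that was written without an age field, Avro fills in the default (null) instead of failing. null is listed first in the union because a union's default value must match its first branch.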
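
The default-filling behavior can be sketched as follows. This is a deliberately simplified model that works on already-decoded dictionaries and only handles missing fields; real Avro schema resolution also covers type promotion, aliases, and nested types. The function name `resolve` is hypothetical.

```python
from typing import Any, Dict

# The reader's schema, written as a Python dict (JSON null becomes None).
READER_SCHEMA: Dict[str, Any] = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "email", "type": "string"},
        {"name": "age", "type": ["null", "int"], "default": None},
    ],
}


def resolve(decoded: Dict[str, Any], reader_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Project a record decoded with the writer's schema onto the reader's schema."""
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in decoded:
            result[name] = decoded[name]
        elif "default" in field:
            result[name] = field["default"]  # writer did not have this field
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return result


# A record written before the age field existed.
old_record = {"id": 1, "name": "Ada", "email": "ada@example.com"}
resolved = resolve(old_record, READER_SCHEMA)  # age is filled in as None
```

If `age` had been declared without a default, the same call would raise, which is the failure the compatibility checks are designed to catch before deployment.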
Compatibility Modes
Schema Registry enforces compatibility between schema versions to prevent breaking consumers:
| Mode | Rule | Safe Changes | Unsafe Changes |
|---|---|---|---|
| BACKWARD | New schema can read old data | Add optional field, delete field | Add required field |
| FORWARD | Old schema can read new data | Delete optional field, add field | Delete required field |
| FULL | Both directions | Add/delete optional fields only | Any required field change |
| NONE | No checks | Anything goes | — |
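
The rules in the table can be approximated with a small checker. The sketch below assumes flat record schemas and only looks at added and removed fields; it ignores type changes, which the real registry also validates. The function names are hypothetical.

```python
from typing import Any, Dict

Schema = Dict[str, Any]


def _fields(schema: Schema) -> Dict[str, Dict[str, Any]]:
    return {f["name"]: f for f in schema["fields"]}


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if every field the reader expects was written or has a default."""
    written = _fields(writer)
    return all(
        name in written or "default" in field
        for name, field in _fields(reader).items()
    )


def is_compatible(mode: str, new: Schema, old: Schema) -> bool:
    """Check a new schema version against the previous one under a given mode."""
    if mode == "NONE":
        return True
    backward = can_read(reader=new, writer=old)  # new schema reads old data
    forward = can_read(reader=old, writer=new)   # old schema reads new data
    if mode == "BACKWARD":
        return backward
    if mode == "FORWARD":
        return forward
    if mode == "FULL":
        return backward and forward
    raise ValueError(f"unknown compatibility mode: {mode}")


v1 = {"type": "record", "name": "User", "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
]}
# Adds an optional field (has a default).
v2_optional = {"type": "record", "name": "User", "fields": v1["fields"] + [
    {"name": "age", "type": ["null", "int"], "default": None},
]}
# Adds a required field (no default).
v2_required = {"type": "record", "name": "User", "fields": v1["fields"] + [
    {"name": "email", "type": "string"},
]}

ok = is_compatible("BACKWARD", new=v2_optional, old=v1)        # True
rejected = is_compatible("BACKWARD", new=v2_required, old=v1)  # False
```

Note that adding the required `email` field passes under FORWARD, because old consumers simply ignore a field they don't know about.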
Use BACKWARD compatibility (the default). This ensures that consumers upgraded to the new schema can still read data written with the previous one. Always add new fields as optional with defaults.
# Register a schema
curl -X POST http://localhost:8081/subjects/users-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{\"type\":\"record\",\"name\":\"User\",\"fields\":[{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"name\",\"type\":\"string\"}]}"}'

# Check compatibility before registering
curl -X POST http://localhost:8081/compatibility/subjects/users-value/versions/latest \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "..."}'

# List all subjects
curl http://localhost:8081/subjects
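
The escaped quotes in the register call exist because the schema is sent as a JSON string inside a JSON object. Building the request in code avoids escaping by hand. The sketch below uses only the Python standard library and mirrors the URL, subject, and header from the curl example; the helper name `build_register_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_register_request(base_url: str, subject: str, schema: dict) -> urllib.request.Request:
    """Build the POST request that registers a schema under a subject."""
    # The schema is JSON-encoded twice: once as a string, once inside the body.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


user_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
    ],
}

request = build_register_request("http://localhost:8081", "users-value", user_schema)

# To actually send it (assumes a registry is running at that address):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The same helper could be pointed at the compatibility endpoint by changing the URL path, which makes it easy to run the check in CI before registering.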
🔍 Deep Dive: Protobuf and JSON Schema
While Avro is the most common format, Schema Registry also supports Protobuf (popular in gRPC services) and JSON Schema (human-readable but less efficient). Choose Avro for new projects, Protobuf if your team already uses gRPC, and JSON Schema only if human readability of raw messages is critical for debugging.
Practice Exercises
Medium: Build a Mini Project
Combine concepts from this tutorial to build a small utility or tool.
Medium: Debug Challenge
Introduce a bug in one of the code examples and practice finding and fixing it.
Hard: Refactoring Exercise
Rewrite one example using a different approach and compare the tradeoffs.