Consistent Hashing & Partitioning

Hard · 25 min read

Why Consistent Hashing Matters

The Problem: When you distribute data across N servers using hash(key) % N, adding or removing a server remaps almost every key. If you go from 10 to 11 servers, ~91% of keys move to a different server -- invalidating caches and triggering massive data migration.

The Solution: Consistent hashing maps both keys and servers onto a circular hash space. Adding or removing a server only remaps ~1/N of the keys (e.g., ~10% with 10 servers), minimizing disruption.

Real Impact: Amazon DynamoDB, Apache Cassandra, Akamai CDN, and Discord all use consistent hashing to distribute data across nodes with minimal reshuffling during scaling events.

Real-World Analogy

Imagine students in a circular seating arrangement:

  • Hash ring = A round table with seats numbered 0 to 360
  • Servers = Students sitting at specific seats (say seats 90, 180, 270)
  • Data keys = Papers that land on random seat numbers
  • Assignment rule = Each paper goes to the next student clockwise
  • Adding a student = Only papers between the new student and the previous one move

The Problem with Naive Hashing

| Scenario | Naive hash(key) % N | Consistent Hashing |
|---|---|---|
| 10 servers, add 1 | ~91% of keys remapped | ~9% of keys remapped |
| 10 servers, remove 1 | ~90% of keys remapped | ~10% of keys remapped |
| 100 servers, add 1 | ~99% of keys remapped | ~1% of keys remapped |
| Cache invalidation | Near-total cache miss storm | Minimal cache disruption |
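These percentages are easy to verify empirically. A quick sketch (key names invented) comparing naive modular bucket assignments before and after going from 10 to 11 servers:

```python
import hashlib

def bucket(key, n):
    # Stable hash of the key, then naive modular placement
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 10) != bucket(k, 11))
print(f"{moved / len(keys):.0%} of keys remapped")  # roughly 91%
```

A key stays put only when its hash gives the same remainder mod 10 and mod 11, which happens for about 10 in every 110 hash values, so roughly 91% of keys move.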

The Cache Stampede Problem

Problem: With naive hashing, adding a server remaps most keys. Every remapped key is a cache miss. If thousands of keys miss simultaneously, they all hit the database, potentially crashing it.

Solution: Consistent hashing limits remapping to ~1/N of keys. Combine with request coalescing (only one request fetches the data; others wait) to further reduce database load during node changes.
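Request coalescing can be sketched with a per-key event in process (a minimal illustration, not a production library; error handling and entry expiry are omitted):

```python
import threading

class Coalescer:
    """Ensure only one caller fetches a missing key; others reuse its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> Event signaling the fetch is done
        self._results = {}

    def get(self, key, fetch):
        with self._lock:
            event = self._inflight.get(key)
            if event is None:
                # First caller for this key becomes the leader
                event = threading.Event()
                self._inflight[key] = event
                leader = True
            else:
                leader = False
        if leader:
            self._results[key] = fetch(key)  # single database hit
            event.set()
        else:
            event.wait()  # piggyback on the leader's fetch
        return self._results[key]
```

However many concurrent callers miss on the same key, `fetch` runs once; the rest block until the leader's result is available.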

How Consistent Hashing Works

Hash Ring with Virtual Nodes
[Diagram: a hash ring with Server A, Server B, and Server C, each with a virtual node (A', B', C'); keys key1 through key4 land at positions on the ring and are assigned to the next server clockwise, e.g. key1 -> A, key2 -> B, key4 -> C'.]

Algorithm Steps

  • Step 1: Hash each server name to a position on a circular ring (0 to 2^32 - 1)
  • Step 2: Hash each data key to a position on the same ring
  • Step 3: Walk clockwise from the key's position until you find a server -- that server owns the key
  • Step 4: When adding a server, only keys between the new server and its predecessor are remapped
  • Step 5: When removing a server, its keys are reassigned to the next server clockwise

Virtual Nodes

With only a few physical servers on the ring, data distribution can be uneven. Virtual nodes solve this by placing each physical server at multiple positions on the ring. A server with 150 virtual nodes gets 150 positions, resulting in much more even data distribution.

Better Balance

With 3 physical nodes and 0 virtual nodes, one server might own 50% of the ring. With 150 virtual nodes each, distribution approaches a uniform 33% per server.
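The effect is measurable: build the ring twice, once with a single position per server and once with many, then count how many sample keys each server owns (a rough sketch; server and key names are made up):

```python
import hashlib
from bisect import bisect_right

def ring_shares(servers, vnodes, num_keys=10_000):
    """Fraction of sample keys owned by each server on the ring."""
    h = lambda s: int(hashlib.md5(s.encode()).hexdigest(), 16)
    # Each server appears at `vnodes` positions on the ring
    ring = sorted((h(f"{s}-vnode-{i}"), s) for s in servers for i in range(vnodes))
    points = [p for p, _ in ring]
    counts = {s: 0 for s in servers}
    for k in range(num_keys):
        idx = bisect_right(points, h(f"key:{k}")) % len(points)  # clockwise, wrap
        counts[ring[idx][1]] += 1
    return {s: c / num_keys for s, c in counts.items()}

print(ring_shares(["a", "b", "c"], vnodes=1))    # often badly skewed
print(ring_shares(["a", "b", "c"], vnodes=150))  # close to 1/3 each
```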

Heterogeneous Hardware

A powerful server can have more virtual nodes than a weaker one. A server with 32GB RAM gets 200 vnodes; one with 16GB gets 100 vnodes -- proportional to capacity.
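Capacity-weighting is just a matter of scaling the vnode count. A sketch, using the RAM baseline from the example above (the 100-vnodes-per-16GB ratio is illustrative, not a standard):

```python
def vnodes_for(capacity_gb, vnodes_per_16gb=100):
    # Vnode count proportional to RAM, so ring ownership tracks capacity
    return round(capacity_gb / 16 * vnodes_per_16gb)

print(vnodes_for(32))  # 200
print(vnodes_for(16))  # 100
```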

Smoother Rebalancing

When a node leaves, its virtual nodes are scattered around the ring. Load spreads evenly across many surviving nodes instead of doubling load on one neighbor.

consistent_hashing.py
import hashlib
from bisect import bisect_right

class ConsistentHash:
    """Consistent hashing with virtual nodes."""

    def __init__(self, num_vnodes=150):
        self.num_vnodes = num_vnodes
        self.ring = {}          # hash_value -> node_name
        self.sorted_keys = []   # sorted hash values on the ring

    def _hash(self, key):
        """Generate a consistent hash for a key."""
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16)

    def add_node(self, node):
        """Add a node with virtual nodes to the ring."""
        for i in range(self.num_vnodes):
            vnode_key = f"{node}-vnode-{i}"
            hash_val = self._hash(vnode_key)
            self.ring[hash_val] = node
            self.sorted_keys.append(hash_val)
        self.sorted_keys.sort()

    def remove_node(self, node):
        """Remove a node and all its virtual nodes."""
        for i in range(self.num_vnodes):
            vnode_key = f"{node}-vnode-{i}"
            hash_val = self._hash(vnode_key)
            del self.ring[hash_val]
            self.sorted_keys.remove(hash_val)

    def get_node(self, key):
        """Find which node owns a given key."""
        if not self.ring:
            return None

        hash_val = self._hash(key)
        # Find the first node clockwise on the ring
        idx = bisect_right(self.sorted_keys, hash_val)
        if idx == len(self.sorted_keys):
            idx = 0  # Wrap around to start of ring
        return self.ring[self.sorted_keys[idx]]

# Usage
ch = ConsistentHash(num_vnodes=150)
ch.add_node("server-1")
ch.add_node("server-2")
ch.add_node("server-3")

print(ch.get_node("user:1001"))  # e.g. "server-2" (depends on hash values)
print(ch.get_node("user:1002"))  # e.g. "server-1"

# Add a new server -- only ~33% of keys remap
ch.add_node("server-4")
print(ch.get_node("user:1001"))  # likely still "server-2"

Data Partitioning Strategies

| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Hash Partitioning | hash(key) determines partition | Even distribution, good for point lookups | Range queries require scatter-gather |
| Range Partitioning | Key ranges assigned to partitions (A-M, N-Z) | Efficient range queries, sorted data | Hot spots if keys are not uniformly distributed |
| Consistent Hashing | Hash ring with minimal remapping | Smooth scaling, minimal data movement | Slightly more complex implementation |
| Geographic | Data partitioned by region/location | Low latency for local users, data sovereignty | Cross-region queries are expensive |
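Range partitioning, the main alternative in the table, is also a sorted-list lookup: partitions are chosen by binary search over split points (the boundaries below are invented for illustration):

```python
from bisect import bisect_right

# Upper-bound split keys: partition 0 holds keys < "h",
# partition 1 holds keys < "p", partition 2 holds the rest
SPLITS = ["h", "p"]

def range_partition(key):
    """Pick a partition by binary search over the split points."""
    return bisect_right(SPLITS, key)

print(range_partition("alice"))    # 0
print(range_partition("mallory"))  # 1
print(range_partition("zara"))     # 2
```

Note how this preserves key order across partitions, which is exactly what makes range scans cheap and skewed key distributions (hot spots) a risk.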

Consistent Hashing in Practice

Amazon Dynamo / DynamoDB

The Dynamo paper (the system behind DynamoDB) popularized consistent hashing for partitioning data across storage nodes, with virtual nodes ensuring even distribution. When a node fails, its token ranges are redistributed to surviving nodes automatically.

Apache Cassandra

Each node owns a set of token ranges on the ring. The Murmur3 partitioner hashes partition keys to tokens. Adding a node requires streaming only the relevant token ranges.

Memcached / Redis Cluster

Memcached client libraries (e.g. ketama) use consistent hashing to determine which cache server holds a key, so adding a server only invalidates ~1/N of cached items instead of all of them. Redis Cluster takes a related approach, mapping every key to one of 16384 fixed hash slots that are assigned to nodes.

Akamai CDN

Consistent hashing determines which edge server caches a particular URL. This ensures the same URL is always routed to the same server, maximizing cache hit rates.

Practice Problems

Medium Cache Cluster Scaling

You run a 10-node Memcached cluster using consistent hashing. During Black Friday, you need to add 5 more nodes:

  1. Calculate the percentage of keys that will be remapped
  2. Design a warm-up strategy to minimize cache miss impact
  3. Choose the optimal number of virtual nodes

With consistent hashing, adding K nodes to N existing nodes remaps approximately K/(N+K) of the keys. For warmup, pre-populate new nodes by reading from old ones before switching traffic.

# 1. Keys remapped: K/(N+K) = 5/(10+5) = 33%
# About 33% of keys will map to new nodes (cache miss)

# 2. Warm-up strategy:
def warm_up_new_nodes(old_ring, new_ring, keys_sample):
    for key in keys_sample:
        old_node = old_ring.get_node(key)
        new_node = new_ring.get_node(key)
        if old_node != new_node:
            # Copy data from old node to new node
            value = old_node.get(key)
            if value:
                new_node.set(key, value)

# 3. Virtual nodes: 100-200 per physical node
# With 150 vnodes and 15 physical nodes = 2250 points on ring
# Standard deviation of load: ~5% (good enough)

Hard Replication on the Hash Ring

Design a replication strategy for a distributed key-value store using consistent hashing:

  1. Each key should be replicated on 3 distinct physical nodes
  2. Handle the case where virtual nodes of the same physical server are adjacent on the ring
  3. Implement a read repair mechanism

Walk clockwise from the key's position and pick the next 3 distinct physical nodes (skip virtual nodes of already-chosen physical servers). For read repair, compare versions from all replicas and update stale ones.

from bisect import bisect_right

def get_replica_nodes(ring, key, num_replicas=3):
    """Get N distinct physical nodes for replication."""
    hash_val = ring._hash(key)
    idx = bisect_right(ring.sorted_keys, hash_val)
    replicas = []
    seen_physical = set()

    for _ in range(len(ring.sorted_keys)):
        pos = idx % len(ring.sorted_keys)
        node = ring.ring[ring.sorted_keys[pos]]
        if node not in seen_physical:
            replicas.append(node)
            seen_physical.add(node)
        if len(replicas) == num_replicas:
            break
        idx += 1

    return replicas

# Read repair: compare versions from replicas
def read_with_repair(key, replicas):
    responses = [node.get(key) for node in replicas]
    latest = max(responses, key=lambda r: r.version)
    for node, resp in zip(replicas, responses):
        if resp.version < latest.version:
            node.put(key, latest)  # Repair stale replica
    return latest.value

Hard Hot Key Mitigation

A celebrity post goes viral and one key receives 100x normal traffic, overloading its assigned node:

  1. Explain why consistent hashing alone does not solve hot keys
  2. Design a solution that distributes hot key reads across multiple nodes
  3. Ensure writes still go to a single authoritative node

Append a random suffix to the key for reads (key:shard_0, key:shard_1, etc.) to spread across multiple nodes. Writes go to the primary; a background process fans out to read replicas.

import random

NUM_READ_SHARDS = 10

def write_hot_key(ring, key, value):
    """Write to primary + fan out to read shards."""
    primary = ring.get_node(key)
    primary.put(key, value)
    # Async fan-out to read shards
    for i in range(NUM_READ_SHARDS):
        shard_key = f"{key}:shard_{i}"
        node = ring.get_node(shard_key)
        node.put(shard_key, value)

def read_hot_key(ring, key):
    """Read from a random shard to distribute load."""
    shard = random.randint(0, NUM_READ_SHARDS - 1)
    shard_key = f"{key}:shard_{shard}"
    node = ring.get_node(shard_key)
    return node.get(shard_key)

Quick Reference

| Operation | Time Complexity | Notes |
|---|---|---|
| Lookup key | O(log N) with binary search | N = total virtual nodes on ring |
| Add node | O(V log N), where V = vnodes | Insert V entries into sorted ring |
| Remove node | O(V log N) | Remove V entries from sorted ring |
| Keys remapped on add | ~K/(N+K) of total keys | K = nodes added, N = existing nodes |

Key Takeaways

  • Naive modular hashing remaps 90%+ of keys when nodes change; consistent hashing remaps only ~1/N
  • Virtual nodes solve the uneven distribution problem with few physical servers
  • Replication walks clockwise and picks distinct physical nodes
  • Hot keys require application-level sharding, not just consistent hashing
  • Used in DynamoDB, Cassandra, Memcached, Akamai, Discord, and many more