Caching Strategies


Why Caching Matters

The Problem: Database queries and API calls are slow and expensive. If 1,000 users request the same product page, fetching it from the database 1,000 times is wasteful.

The Solution: Store frequently accessed data in fast, in-memory storage (cache). Serve subsequent requests from the cache instead of the slow data source.

Real Impact: Facebook uses Memcached to handle billions of requests. A cache hit takes ~1ms vs ~50ms for a database query -- a 50x improvement.

Real-World Analogy

Think of caching like a chef's prep station:

  • Database = The walk-in refrigerator in the back (lots of storage, slow to access)
  • Cache = The prep counter right in front of the chef (limited space, instant access)
  • Cache hit = Ingredient is already on the prep counter -- grab it immediately
  • Cache miss = Need to walk to the fridge, get it, bring it to the counter
  • Eviction = Counter is full -- remove something to make space for new ingredients

Cache Hierarchy

The Cache Hierarchy (Fastest to Slowest)
  • L1 CPU Cache: ~0.5 ns | 64 KB
  • L2 CPU Cache: ~7 ns | 256 KB
  • L3 CPU Cache: ~30 ns | 8-32 MB
  • Application Cache (Redis / Memcached): ~0.1 ms | GBs of RAM
  • CDN Cache (Edge Servers): ~10-50 ms | TBs distributed globally
  • Database (Disk Storage): ~1-100 ms | TBs-PBs on disk

Caching Strategies

There are several patterns for how your application interacts with the cache. The right choice depends on your read/write ratio and consistency requirements.

Cache-Aside (Lazy Loading)

The application checks the cache first; on a miss, it fetches from the database and stores the result in the cache. This is the most common pattern: simple to implement, and the cache only holds data that has actually been requested.

Write-Through

Every write goes to both cache and DB synchronously. Cache is always consistent with DB. Higher write latency but no stale data. Good for read-heavy workloads.

Write-Back (Write-Behind)

Writes go to cache only, then asynchronously flushed to DB. Very fast writes but risk of data loss if cache crashes before flush. Good for write-heavy workloads.
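
As a rough sketch of the idea, a minimal in-process write-back buffer might look like this (the `WriteBackCache` class and `flush_to_db` callback are illustrative names, not a real library API):

```python
import threading

class WriteBackCache:
    """Buffer writes in memory; flush them to the DB in batches."""

    def __init__(self, flush_to_db, batch_size=100):
        self.data = {}              # in-memory cache of current values
        self.dirty = set()          # keys written but not yet persisted
        self.flush_to_db = flush_to_db
        self.batch_size = batch_size
        self.lock = threading.Lock()

    def write(self, key, value):
        with self.lock:
            self.data[key] = value  # fast: memory only, no DB round-trip
            self.dirty.add(key)
            if len(self.dirty) >= self.batch_size:
                self._flush()       # one batched DB write, not many small ones

    def _flush(self):
        batch = {k: self.data[k] for k in self.dirty}
        self.flush_to_db(batch)
        self.dirty.clear()

# Demo: collect flushed batches in a list instead of a real DB
flushed = []
cache = WriteBackCache(flushed.append, batch_size=3)
for i in range(3):
    cache.write(f"event:{i}", i)
print(len(flushed[0]))  # the 3 writes arrive as a single batch
```

Note the trade-off the pattern describes: keys sitting in `dirty` are lost if the process crashes before `_flush` runs.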

Write-Around

Writes go directly to DB, bypassing cache. Cache only populated on reads (cache-aside for reads). Prevents cache pollution from data that is written but rarely read.
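
Write-around needs little machinery: writes skip the cache (optionally invalidating any stale copy), and reads use cache-aside. A sketch with plain dicts standing in for the database and cache:

```python
db = {}     # stand-in for the database
cache = {}  # stand-in for Redis

def write_around(key, value):
    db[key] = value
    cache.pop(key, None)    # invalidate any stale cached copy

def read(key):
    if key in cache:        # cache-aside read path
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value  # cache is populated only on reads
    return value

write_around("post:1", "hello")  # goes to the DB, not the cache
assert "post:1" not in cache
print(read("post:1"))            # first read populates the cache
```

Data that is written once and never read back never occupies cache space, which is the pattern's whole point.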

Cache-Aside Pattern Flow
  Application → Cache (Redis, fast: ~0.1 ms) → Database (slow: ~10 ms)

  1. Check the cache
  2. Cache miss: query the database
  3. Store the result in the cache for next time

Next request for the same data: step 1 returns a cache hit (fast!)

Cache Eviction Policies

Caches have limited memory. When the cache is full and a new item needs to be stored, an eviction policy decides which existing item to remove.

Policy | Strategy                                                          | Best For                            | Complexity
LRU    | Least Recently Used -- evict the least recently accessed item     | General purpose, most common        | O(1)
LFU    | Least Frequently Used -- evict the least frequently accessed item | Workloads with a stable hot set     | O(log n)
FIFO   | First In, First Out -- evict the oldest inserted item             | Simple, time-based data             | O(1)
TTL    | Time To Live -- expire after a fixed duration                     | Session data, tokens                | O(1)
Random | Evict a random item                                               | Surprisingly effective, very simple | O(1)
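
As a sketch, LRU's O(1) behavior can be demonstrated with Python's `OrderedDict`, whose `move_to_end` and `popitem` operations run in constant time:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: O(1) get/put via an ordered hash map."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now most recently used
cache.put("c", 3)      # evicts "b", the least recently used key
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

Redis implements an approximate version of this (`maxmemory-policy allkeys-lru`) by sampling keys rather than maintaining a full ordering.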

Distributed Caching with Redis

Redis is the most popular distributed caching solution. It is an in-memory data store that supports strings, hashes, lists, sets, and sorted sets with built-in TTL support.

redis_caching.py
import redis
import json
import time

# Connect to Redis
cache = redis.Redis(host="localhost", port=6379, db=0)

def get_user_profile(user_id):
    """Cache-aside pattern for user profiles."""
    cache_key = f"user:{user_id}"

    # Step 1: Check cache
    cached = cache.get(cache_key)
    if cached:
        print("Cache HIT")
        return json.loads(cached)

    # Step 2: Cache miss - fetch from database
    # (`db` is a placeholder for your database client)
    print("Cache MISS - querying database")
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Step 3: Store in cache with 5 minute TTL
    cache.setex(
        cache_key,
        300,  # TTL in seconds (5 minutes)
        json.dumps(user)
    )

    return user

def update_user_profile(user_id, new_data):
    """Write-through: update DB and cache together."""
    # Update database first
    db.execute(
        "UPDATE users SET name=%s, email=%s WHERE id=%s",
        (new_data["name"], new_data["email"], user_id)
    )

    # Invalidate cache (or update it)
    cache_key = f"user:{user_id}"
    cache.delete(cache_key)  # Invalidation approach
    # OR: cache.setex(cache_key, 300, json.dumps(new_data))

def get_leaderboard(limit=10):
    """Use Redis sorted set for a real-time leaderboard."""
    # Add scores
    cache.zadd("leaderboard", {"alice": 1500, "bob": 2100, "carol": 1800})

    # Get top N players (descending)
    top_players = cache.zrevrange("leaderboard", 0, limit - 1, withscores=True)
    return [(name.decode(), score) for name, score in top_players]

# Measure cache performance
start = time.time()
profile = get_user_profile(42)  # Miss: ~10ms
print(f"First call: {(time.time()-start)*1000:.1f}ms")

start = time.time()
profile = get_user_profile(42)  # Hit: ~0.1ms
print(f"Second call: {(time.time()-start)*1000:.1f}ms")

Common Pitfall: Cache Stampede

Problem: A popular cache key expires. Hundreds of requests simultaneously hit the database to regenerate it.

Solutions:

  • Locking: Only one request fetches from DB; others wait for cache to be populated
  • Early refresh: Refresh the cache before it expires (e.g., at 80% of TTL)
  • Jittered TTL: Add random variance to TTL so keys do not expire simultaneously

Practice Problems

Easy Choose the Caching Strategy

For each scenario, which caching strategy is best and why?

  1. A product catalog page (read 99%, write 1%)
  2. A logging service writing millions of events/second
  3. A user session store

Consider the read/write ratio and consistency needs. Cache-aside works for reads, write-back for writes, write-through for consistency.

# 1. Product catalog: Cache-Aside
#    - Read-heavy (99% reads)
#    - Stale data for a few minutes is OK
#    - Set TTL to 5-10 minutes
#    - Products rarely change

# 2. Logging service: Write-Back
#    - Extremely write-heavy
#    - Buffer writes in cache, flush to DB in batches
#    - Acceptable to lose a few seconds of logs
#    - Reduces DB write pressure by 100x+

# 3. Session store: Write-Through with TTL
#    - Every session update must be persisted
#    - Reads need to be fast (every request)
#    - TTL = session expiry time (e.g., 30 min)
#    - Redis is perfect for this use case

Medium Cache Hit Ratio Optimization

Your cache has a 60% hit ratio. Each cache hit costs 1ms and each miss costs 50ms (cache check + DB query). With 10,000 requests/second:

  1. What is the average response time?
  2. If you increase the hit ratio to 90%, what is the new average?
  3. What strategies would you use to improve from 60% to 90%?

Average = (hit_ratio * hit_time) + (miss_ratio * miss_time). To improve hit ratio: increase cache size, better TTL, pre-warm cache, analyze access patterns.

# 1. Average with 60% hit ratio:
avg_60 = (0.60 * 1) + (0.40 * 50)
print(f"60% hit: {avg_60}ms avg")  # 20.6ms

# 2. Average with 90% hit ratio:
avg_90 = (0.90 * 1) + (0.10 * 50)
print(f"90% hit: {avg_90}ms avg")  # 5.9ms

# That's a 3.5x improvement!

# 3. Strategies to improve hit ratio:
# a) Increase cache size (more items fit)
# b) Tune TTL (too short = unnecessary misses)
# c) Pre-warm cache on startup with popular items
# d) Use cache-aside + write-through together
# e) Analyze access patterns, cache hot data only
# f) Use consistent hashing for distributed cache

Medium Design a Caching Layer

Design a multi-level caching layer for a news feed that serves 1M active users:

  1. What would you cache at each level?
  2. What TTL for each cache level?
  3. How do you handle cache invalidation when a new post is created?

Use CDN for static assets, Redis for feed data, local memory for hot users. Invalidation can use pub/sub to notify cache nodes.

# Multi-level caching for news feed

# Level 1: CDN (static assets)
# - Profile images, CSS, JS, video thumbnails
# - TTL: 24 hours (versioned URLs)
# - Invalidate via CDN purge API

# Level 2: Redis Cluster (feed data)
# - Pre-computed feed: user_id -> [post_ids]
# - Post content: post_id -> {title, body, author}
# - TTL: 5 minutes for feeds, 1 hour for posts

# Level 3: App Server Memory (hot data)
# - Top 100 trending posts (shared across requests)
# - TTL: 30 seconds

# Cache invalidation on new post:
# 1. Write post to database
# 2. Publish event to Redis Pub/Sub channel
# 3. Feed workers subscribe to channel:
#    - Update cached feeds for post author's followers
#    - Use fan-out-on-write for users with <10K followers
#    - Use fan-out-on-read for celebrities (>10K followers)

Quick Reference

Caching Strategy Selection Guide

Strategy      | Read Perf            | Write Perf         | Consistency | Best For
Cache-Aside   | High (after warm-up) | N/A (reads only)   | Eventual    | Read-heavy, general purpose
Write-Through | High                 | Lower (dual write) | Strong      | Read-heavy with consistency
Write-Back    | High                 | Very High          | Eventual    | Write-heavy workloads
Write-Around  | High (for hot data)  | High               | Eventual    | Write-once, read-later data

Key Takeaways

  • Cache-aside is the most common and safest starting point
  • Always set a TTL -- unbounded caches lead to stale data and memory issues
  • LRU is the default eviction policy for most use cases
  • Cache hit ratio above 90% is the target; below 80% investigate your access patterns
  • Guard against cache stampede, especially for popular keys