Design a URL Shortener

Medium · 30 min read

Problem Statement & Requirements

Why URL Shorteners Matter

The Problem: Long URLs are unwieldy, hard to share, and look unprofessional in messages, tweets, and printed materials.

The Solution: A URL shortener converts long URLs into compact, shareable links that redirect to the original destination.

Real Impact: Bitly processes over 600 million clicks per month. TinyURL has shortened billions of URLs since 2002.

Real-World Analogy

Think of a URL shortener like a coat check system:

  • Long URL = Your bulky coat that is hard to carry around
  • Short code = The small numbered ticket you receive
  • Database = The coat rack where items are stored and indexed
  • Redirect = Handing in your ticket to retrieve your coat
  • Expiration = After a period, unclaimed coats are donated

Functional Requirements

URL Shortening

Given a long URL, generate a unique short URL. The short link should be as compact as possible (7 characters or fewer).

URL Redirection

When a user accesses a short URL, redirect them to the original long URL with minimal latency (under 10ms).

Custom Aliases

Users can optionally choose a custom short key for their URL (e.g., liziu.co/my-brand).

Link Expiration

URLs should have a configurable expiration time. Default expiry is 5 years.

Non-Functional Requirements

Key Constraints

  • High Availability: The service should always be available (99.99% uptime)
  • Low Latency: Redirects must happen in real time (sub-10ms)
  • Read Heavy: Read-to-write ratio of approximately 100:1
  • Non-Guessable: Short URLs should not be predictable
  • Scalability: Must handle 100M+ new URLs per month

Back-of-Envelope Estimation

Always Start with Numbers

Back-of-envelope estimation is critical in system design interviews. It sets the scale and helps you make informed decisions about storage, bandwidth, and caching.

Metric | Estimate | Calculation
New URLs per month | 100 million | Given requirement
New URLs per second | ~40 | 100M / (30 * 24 * 3600)
Reads per second | ~4,000 | 40 * 100 (100:1 ratio)
URLs over 5 years | 6 billion | 100M * 12 * 5
Storage per URL | ~500 bytes | short_key + long_url + metadata
Total storage (5 yrs) | ~3 TB | 6B * 500 bytes
Cache memory (20% of daily reads) | ~35 GB | 4,000 * 86,400 * 0.2 * 500 bytes
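The table's arithmetic can be sanity-checked in a few lines of Python, using the same rough assumptions the table makes (30-day months, 500 bytes per record, 100:1 read/write ratio):

```python
# Sanity-check the estimates above. Figures are deliberately rough.
new_urls_per_month = 100_000_000
seconds_per_month = 30 * 24 * 3600              # 2,592,000

writes_per_sec = new_urls_per_month / seconds_per_month
reads_per_sec = writes_per_sec * 100            # 100:1 ratio

urls_5_years = new_urls_per_month * 12 * 5      # 6 billion
bytes_per_url = 500
storage_tb = urls_5_years * bytes_per_url / 1e12

# Cache the hottest 20% of one day's reads:
cache_gb = reads_per_sec * 86_400 * 0.2 * bytes_per_url / 1e9

print(f"~{writes_per_sec:.0f} writes/sec")   # ~39
print(f"~{reads_per_sec:.0f} reads/sec")     # ~3858
print(f"~{storage_tb:.0f} TB storage")       # ~3
print(f"~{cache_gb:.0f} GB cache")           # ~33
```

The exact cache figure lands nearer 33 GB than 35 GB; the table rounds reads/sec up to 4,000 before multiplying.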

System API Design

url_shortener_api.py
# REST API Design for URL Shortener

# 1. Create Short URL
# POST /api/v1/urls
# Request Body:
# {
#   "long_url": "https://example.com/very/long/path",
#   "custom_alias": "my-link",       (optional)
#   "expiry_days": 365               (optional)
# }
# Response: { "short_url": "https://liziu.co/Ab3xK9" }

# 2. Redirect Short URL
# GET /{short_key}
# Response: 301 Redirect to original URL

# 3. Delete Short URL
# DELETE /api/v1/urls/{short_key}
# Response: 204 No Content

from flask import Flask, redirect

app = Flask(__name__)

# Note: self.db, self.cache, generate_short_key, and ConflictError below
# are stand-ins; a real service would wire in concrete storage, cache,
# and error types.

class URLShortenerAPI:
    """REST API for URL Shortener Service"""

    def create_short_url(self, long_url: str,
                         custom_alias: str = None,
                         expiry_days: int = 365) -> dict:
        """
        Create a shortened URL.

        Args:
            long_url: The original URL to shorten
            custom_alias: Optional custom short key
            expiry_days: Days until the link expires

        Returns:
            dict with short_url and metadata
        """
        if custom_alias:
            if self.db.exists(custom_alias):
                raise ConflictError("Alias already taken")
            short_key = custom_alias
        else:
            short_key = self.generate_short_key(long_url)

        self.db.save(short_key, long_url, expiry_days)
        return {"short_url": f"https://liziu.co/{short_key}"}

    def redirect_url(self, short_key: str):
        """Look up short key and redirect to original URL."""
        # Check cache first, then database
        long_url = self.cache.get(short_key)
        if not long_url:
            long_url = self.db.get(short_key)
            self.cache.set(short_key, long_url)
        return redirect(long_url, code=301)

301 vs 302 Redirect

301 (Permanent): Browser caches the redirect. Reduces server load but you lose analytics on repeat visits.

302 (Temporary): Browser always hits your server. Better for analytics but higher server load.

Recommendation: Use 301 for maximum performance, 302 if you need click analytics.

Database Schema

schema.sql
-- URL mapping table
CREATE TABLE urls (
    id            BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_key     VARCHAR(7) UNIQUE NOT NULL,
    long_url      VARCHAR(2048) NOT NULL,
    created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at    TIMESTAMP,
    user_id       BIGINT,
    click_count   BIGINT DEFAULT 0,
    INDEX idx_short_key (short_key),
    INDEX idx_expires_at (expires_at)
);

-- User table (optional, for authenticated users)
CREATE TABLE users (
    id            BIGINT PRIMARY KEY AUTO_INCREMENT,
    email         VARCHAR(255) UNIQUE NOT NULL,
    api_key       VARCHAR(64) UNIQUE NOT NULL,
    created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

NoSQL vs SQL for URL Shortener

A NoSQL database like DynamoDB or Cassandra is a strong choice here because:

  • Simple key-value lookups (short_key to long_url)
  • No complex joins or transactions needed
  • Horizontal scaling is straightforward
  • Billions of records with predictable performance

However, SQL works well too if you need analytics queries or your scale is moderate.

URL Shortening Algorithm

Approach 1: Base62 Encoding

Why Base62?

Base62 uses characters [a-z, A-Z, 0-9] to encode numbers. With 7 characters, we can represent 62^7 = 3.5 trillion unique URLs -- far more than we need.

Base62 Encoding Flow
(Diagram) An auto-increment ID (e.g., 2009215674938) is repeatedly divided by 62, each remainder mapped into [a-zA-Z0-9], yielding a 7-character short key. Character set: a-z (0-25), A-Z (26-51), 0-9 (52-61); 7 characters give 62^7 = 3,521,614,606,208 unique URLs.
base62_encoding.py
import string

# Base62 character set: a-z, A-Z, 0-9
CHARSET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = 62

def base62_encode(num: int) -> str:
    """
    Convert a numeric ID to a Base62 string.

    Example: 2009215674938 -> 'JxjonmE'

    Time:  O(log_62(n)) -- typically 7 iterations
    Space: O(1)
    """
    if num == 0:
        return CHARSET[0]

    result = []
    while num > 0:
        remainder = num % BASE
        result.append(CHARSET[remainder])
        num //= BASE

    return ''.join(reversed(result))


def base62_decode(short_key: str) -> int:
    """Convert a Base62 string back to a numeric ID."""
    num = 0
    for char in short_key:
        num = num * BASE + CHARSET.index(char)
    return num


# Example usage
unique_id = 2009215674938
short = base62_encode(unique_id)
print(f"Encoded: {unique_id} -> {short}")
# Encoded: 2009215674938 -> JxjonmE

decoded = base62_decode(short)
print(f"Decoded: {short} -> {decoded}")
# Decoded: JxjonmE -> 2009215674938

Approach 2: MD5 Hash + Truncation

md5_approach.py
import hashlib

from base62_encoding import base62_encode  # defined in the snippet above

def hash_based_shorten(long_url: str, length: int = 7) -> str:
    """
    Generate a short key using MD5 hash.

    Pros: No need for a centralized counter
    Cons: Potential collisions require retry logic
    """
    hash_hex = hashlib.md5(long_url.encode()).hexdigest()

    # Take the first 11 hex digits (44 bits) -- enough to cover the
    # 62^7 (~3.5 trillion) key space -- then trim to `length` chars
    hash_int = int(hash_hex[:11], 16)
    short_key = base62_encode(hash_int)[:length]

    return short_key

# Collision handling: if key exists, append counter and re-hash
def shorten_with_retry(long_url: str, db) -> str:
    """Handle collisions by retrying with modified input."""
    for attempt in range(5):
        candidate = hash_based_shorten(long_url + str(attempt))
        if not db.exists(candidate):
            return candidate
    raise Exception("Could not generate unique key after 5 attempts")

Algorithm Comparison

Approach | Pros | Cons | Best For
Base62 + Counter | No collisions, deterministic | Needs distributed counter | High-throughput systems
MD5 Hash | Stateless, no coordination | Collisions possible | Distributed generation
Pre-generated Keys | Fast, no computation at write | Key management overhead | Extremely high write volume

Read/Write Flow

URL Shortener System Architecture
(Diagram) Client (browser/app) → Load Balancer (round robin) → App Servers 1-3. App servers check a Redis cache of hot URLs, fall back to the URL-mapping database (with a read-only replica), and draw pre-generated keys from a Key Generation service. Writes arrive as POST /api/v1/urls; reads as GET /{short_key}.

Write Path (Create Short URL)

Step-by-Step Write Flow

  • Step 1: Client sends POST request with the long URL
  • Step 2: Load balancer routes to an available app server
  • Step 3: App server requests a unique key from the Key Generation Service
  • Step 4: App server writes the mapping (short_key, long_url) to the database
  • Step 5: Response with the short URL is returned to the client
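The write steps can be sketched end to end with in-memory stand-ins; `KeyPool` plays the Key Generation Service and the `db` dict plays the database (both illustrative, not the article's production components):

```python
import itertools

# Step-by-step write path with in-memory stand-ins.
CHARSET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def base62_encode(num: int) -> str:
    """Same Base62 scheme as the algorithm section above."""
    if num == 0:
        return CHARSET[0]
    digits = []
    while num:
        num, rem = divmod(num, 62)
        digits.append(CHARSET[rem])
    return "".join(reversed(digits))

class KeyPool:
    """Step 3: hands each caller a unique key from a shared counter."""
    def __init__(self, start: int = 1_000_000):
        self._counter = itertools.count(start)

    def next_key(self) -> str:
        return base62_encode(next(self._counter))

db: dict[str, str] = {}
pool = KeyPool()

def create_short_url(long_url: str) -> str:
    short_key = pool.next_key()             # Step 3: get a unique key
    db[short_key] = long_url                # Step 4: persist the mapping
    return f"https://liziu.co/{short_key}"  # Step 5: return short URL

print(create_short_url("https://example.com/very/long/path"))
# https://liziu.co/emjc
```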

Read Path (Redirect)

Step-by-Step Read Flow

  • Step 1: Client sends GET request for the short URL
  • Step 2: Load balancer routes to an app server
  • Step 3: App server checks Redis cache for the short key
  • Step 4: On cache hit, return the long URL immediately. On cache miss, query the database
  • Step 5: Populate cache with the result, then return a 301/302 redirect

Scaling & Optimization

Database Partitioning

Range-Based Sharding

Partition by first character of short key. Simple but can lead to hot partitions if keys are not uniformly distributed.

Hash-Based Sharding

Apply a hash function to the short key and distribute across shards. Ensures even distribution but makes range queries harder.
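A hash-based shard picker is a one-liner; `crc32` here is just one possible stable hash (an assumption), chosen over Python's built-in `hash()` because the latter is randomized across processes:

```python
import zlib

NUM_SHARDS = 8  # illustrative shard count

def shard_for(short_key: str) -> int:
    # crc32 is stable across processes and restarts, unlike hash()
    return zlib.crc32(short_key.encode()) % NUM_SHARDS
```

Every app server computes the same shard for the same key, so no routing table is needed; the downside is that changing NUM_SHARDS moves almost every key.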

Consistent Hashing

Use consistent hashing for dynamic shard addition/removal with minimal data redistribution. Best for growing systems.
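A minimal consistent-hash ring can be sketched as below; MD5 and 100 virtual nodes per shard are assumptions for the sketch, and production systems tune both:

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes per shard."""

    def __init__(self, shards: list[str], vnodes: int = 100):
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def shard_for(self, short_key: str) -> str:
        # First ring point clockwise of the key's hash (with wraparound)
        idx = bisect.bisect(self._points, self._hash(short_key))
        return self._ring[idx % len(self._ring)][1]

ring = HashRing(["db-1", "db-2", "db-3"])
shard = ring.shard_for("Ab3xK9")  # deterministic: always the same shard
```

Adding a fourth shard remaps only about a quarter of the keys, whereas `hash % N` sharding would move nearly all of them.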

Caching Strategy

caching_strategy.py
import redis

class URLCache:
    """
    Caching layer for URL Shortener.

    Strategy: Cache the top 20% most accessed URLs.
    With 4,000 reads/sec, we cache ~35 GB of hot data.
    """

    def __init__(self):
        self.client = redis.Redis(
            host='redis-cluster',
            port=6379,
            decode_responses=True
        )
        self.TTL = 86400  # 24 hours

    def get_url(self, short_key: str) -> str:
        """Look up URL in cache. Returns None on miss."""
        return self.client.get(f"url:{short_key}")

    def set_url(self, short_key: str, long_url: str):
        """Cache URL with TTL. Uses LRU eviction policy."""
        self.client.setex(
            f"url:{short_key}",
            self.TTL,
            long_url
        )

    def invalidate(self, short_key: str):
        """Remove URL from cache (e.g., on deletion)."""
        self.client.delete(f"url:{short_key}")

Rate Limiting

Preventing Abuse

Without rate limiting, malicious users could exhaust your key space or overload the write path. Implement:

  • IP-based rate limiting: Max 100 URLs per hour per IP address
  • API key limits: Authenticated users get higher limits (1,000/hour)
  • Token bucket algorithm: Smooth rate limiting with burst capacity
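The token bucket from the last bullet can be sketched in a few lines; this is a per-client instance, and wiring buckets to IPs or API keys is omitted (the numbers mirror the 100-URLs-per-hour IP limit above):

```python
import time

class TokenBucket:
    """Continuously refilling bucket; each request spends one token."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec     # refill speed, tokens/second
        self.capacity = capacity     # burst size
        self.tokens = capacity       # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

ip_bucket = TokenBucket(rate_per_sec=100 / 3600, capacity=10)
# 10 rapid requests pass (burst); the 11th is throttled until refill.
```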

Additional Optimizations

Production Considerations

  • Analytics: Stream click events to Kafka for async processing (don't block redirects)
  • Cleanup: Background job to delete expired URLs and reclaim keys
  • CDN: Place a CDN in front for popular short URLs to reduce origin load
  • Monitoring: Track cache hit ratio, p99 latency, and error rates
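The "don't block redirects" rule from the analytics bullet can be sketched with a queue and a background worker; the in-process queue stands in for a Kafka producer/consumer pair, and `record_click` / `click_counts` are illustrative names:

```python
import queue
import threading

events: queue.Queue = queue.Queue()
click_counts: dict[str, int] = {}

def record_click(short_key: str) -> None:
    """Called on the redirect path -- enqueue and return immediately."""
    events.put(short_key)

def worker() -> None:
    """Background consumer: does the slow aggregation work."""
    while True:
        key = events.get()
        if key is None:              # shutdown sentinel
            break
        click_counts[key] = click_counts.get(key, 0) + 1

t = threading.Thread(target=worker, daemon=True)
t.start()

for _ in range(3):
    record_click("Ab3xK9")

events.put(None)                     # flush and stop the worker
t.join()
print(click_counts)                  # {'Ab3xK9': 3}
```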

Practice Problems

Key Generation at Scale (Medium)

Design a Key Generation Service (KGS) that pre-generates unique keys and hands them out to app servers on demand. Consider:

  1. How do you ensure no two servers get the same key?
  2. What happens if the KGS goes down?
  3. How do you handle key exhaustion?

Think about maintaining two tables: one for unused keys and one for used keys. App servers can fetch keys in batches. Consider using a "used" vs "unused" key pool pattern.

# Key Generation Service Design
# 1. Pre-generate millions of unique 7-char keys
# 2. Store in two tables: unused_keys, used_keys
# 3. App servers request batch of keys (e.g., 1000)

class KeyGenerationService:
    def __init__(self, db):
        self.db = db
        self.batch_size = 1000

    def get_keys(self, count: int) -> list:
        """Atomically move keys from unused to used."""
        with self.db.transaction():
            keys = self.db.query(
                "SELECT key FROM unused_keys LIMIT %s FOR UPDATE",
                count
            )
            self.db.execute(
                "INSERT INTO used_keys SELECT * FROM unused_keys WHERE ..."
            )
            self.db.execute(
                "DELETE FROM unused_keys WHERE ..."
            )
        return keys

# Reliability: Run 2 KGS replicas with separate key ranges
# Key exhaustion: Monitor unused key count, generate more at threshold

Analytics Dashboard (Medium)

Extend the URL shortener to provide real-time analytics showing click counts, geographic distribution, and referral sources. How would you design the analytics pipeline?

  1. How do you collect click data without slowing down redirects?
  2. What storage would you use for analytics data?
  3. How do you handle real-time vs batch analytics?

Consider an event-driven architecture. Publish click events to a message queue and process them asynchronously. Think about time-series databases for analytics storage.

# Analytics Pipeline Design
# 1. On redirect, publish click event to Kafka (async, non-blocking)
# 2. Stream processors aggregate data in real-time
# 3. Store in ClickHouse or Apache Druid for fast OLAP queries

from dataclasses import dataclass

@dataclass
class ClickEvent:
    short_key: str
    timestamp: int
    ip_address: str
    user_agent: str
    referrer: str
    country: str  # derived from IP via GeoIP

# Kafka topic: "url-clicks"
# Real-time: Flink/Spark Streaming -> Redis counters
# Batch: Kafka -> S3 -> Spark -> Data warehouse
# Dashboard: Query ClickHouse for aggregations

Global URL Shortener (Hard)

Scale the URL shortener to serve users across multiple continents with sub-5ms redirect latency. Design the multi-region architecture.

  1. How do you replicate data across regions?
  2. How do you handle writes from different regions?
  3. What consistency model would you choose?

Consider eventual consistency for reads (redirects) and a single-leader model for writes. DNS-based routing can direct users to the nearest region. Think about how CDNs cache redirects.

# Multi-Region Architecture
# Regions: US-East, US-West, EU-West, AP-Southeast

# Writes: Route to nearest region, replicate async
# - Each region has its own key range (no conflicts)
# - US-East: keys starting a-m
# - EU-West: keys starting n-z
# - Cross-region replication lag: ~200ms

# Reads: Serve from local cache/DB
# - Local Redis cache (sub-1ms)
# - Local DB replica (sub-5ms)
# - Fallback to origin region (~100ms)

# DNS: GeoDNS routes users to nearest PoP
# CDN: Cache 301 redirects at edge (CloudFront, Cloudflare)
# Consistency: Eventual for reads, strong for writes per region

Quick Reference

URL Shortener Design Summary

Component | Technology Choice | Rationale
Database | DynamoDB / Cassandra | Key-value lookups, horizontal scaling
Cache | Redis Cluster | Sub-ms reads, LRU eviction
Load Balancer | AWS ALB / Nginx | Distribute across app servers
Key Generation | Base62 + Snowflake ID | No collisions, distributed-friendly
Analytics | Kafka + ClickHouse | Async processing, fast aggregations
CDN | Cloudflare / CloudFront | Cache redirects at edge

Key Takeaways

Interview Tips

  • Always start with requirements clarification and back-of-envelope estimation
  • Reads outnumber writes roughly 100:1 -- optimize the read path first
  • Base62 encoding with a distributed ID generator is the recommended approach
  • Caching the top 20% of URLs handles 80% of traffic (Pareto principle)
  • Consider 301 vs 302 redirect tradeoffs early in the discussion
  • Don't forget rate limiting, monitoring, and cleanup of expired URLs