
Introduction to Microservices Patterns

Design patterns are proven solutions to common problems. Think of them as recipes that successful companies use to build reliable systems.

Saga Pattern - Managing Distributed Transactions

The Problem

In a monolith, you can wrap everything in a database transaction. In microservices, each service has its own database - how do you ensure consistency?

Solution: Break the transaction into a series of local transactions, each with a compensating action if something fails.

Python - Simple Saga Pattern
class OrderSaga:
    def create_order(self, order_data):
        # Step 1: Create order
        order = order_service.create(order_data)

        try:
            # Step 2: Reserve inventory
            inventory_service.reserve(order.items)
        except Exception:
            # Compensate: Cancel order
            order_service.cancel(order.id)
            raise

        try:
            # Step 3: Process payment
            payment_service.charge(order.total)
        except Exception:
            # Compensate: Release inventory and cancel order
            inventory_service.release(order.items)
            order_service.cancel(order.id)
            raise

        return order

CQRS - Command Query Responsibility Segregation

Separate read and write operations for better performance and scalability.

  • Commands: Create, Update, Delete (Write operations)
  • Queries: Read operations
  • Benefit: Optimize each side independently

Python - CQRS Example
# Write Model - Commands
class OrderCommandService:
    def create_order(self, order_data):
        order = Order(**order_data)
        db.session.add(order)
        db.session.commit()
        # Publish event
        event_bus.publish('OrderCreated', order)

# Read Model - Queries
class OrderQueryService:
    def get_order_summary(self, order_id):
        # Optimized read from denormalized view
        return read_db.query(OrderSummary).filter_by(id=order_id).first()

Core Architecture Patterns

Patterns for complex scenarios like event sourcing, strangler fig migration, and more.

Event Sourcing

Store all changes as a sequence of events instead of just the current state.

Traditional | Event Sourcing
Store current state only | Store all state changes
Lost history | Complete audit trail
Can't rebuild past state | Replay events to rebuild any state

Python - Event Sourcing
from dataclasses import dataclass

@dataclass
class Event:
    type: str
    account_id: int
    amount: float

class AccountEventStore:
    def __init__(self):
        self.events = []

    def apply_event(self, event):
        self.events.append(event)

    def get_current_state(self, account_id):
        # Rebuild state from events
        balance = 0
        for event in self.events:
            if event.account_id == account_id:
                if event.type == 'DEPOSITED':
                    balance += event.amount
                elif event.type == 'WITHDRAWN':
                    balance -= event.amount
        return balance

# Usage
store = AccountEventStore()
store.apply_event(Event('DEPOSITED', account_id=123, amount=100))
store.apply_event(Event('WITHDRAWN', account_id=123, amount=30))
current_balance = store.get_current_state(123)  # Returns 70

Strangler Fig Pattern

Gradually migrate from monolith to microservices without a big-bang rewrite.

Amazon's Strangler Fig Migration

Amazon migrated their monolith over years:

  • Step 1: Identify bounded contexts
  • Step 2: Extract one service at a time
  • Step 3: Route new traffic to new service
  • Step 4: Migrate old traffic gradually
  • Step 5: Decommission monolith code
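
The routing decision in steps 3 and 4 can be sketched as a thin facade: request paths that have already been extracted go to a new service, while everything else still reaches the monolith. Service names and URLs below are hypothetical.

```python
# Hypothetical routing facade for a Strangler Fig migration: prefixes already
# extracted from the monolith map to new services; all other paths fall
# through to the legacy system.
MIGRATED_PREFIXES = {
    "/catalog": "http://catalog-service:8001",
    "/cart": "http://cart-service:8002",
}
MONOLITH_URL = "http://legacy-monolith:8000"

def resolve_backend(path):
    """Return the backend that should serve this request path."""
    for prefix, backend in MIGRATED_PREFIXES.items():
        if path.startswith(prefix):
            return backend
    return MONOLITH_URL

print(resolve_backend("/cart/items"))  # routed to the extracted cart service
print(resolve_backend("/checkout"))    # still served by the monolith
```

As each feature is migrated, its prefix moves into the routing table; when the table covers everything, the monolith can be decommissioned.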

Anti-Patterns to Avoid
  • Distributed Monolith: Services too tightly coupled
  • Shared Database: Services sharing same database
  • Chatty Services: Too many network calls

Resilience & Infrastructure Patterns

Patterns that make your microservices robust, fault-tolerant, and production-ready.

Circuit Breaker Pattern

Prevent cascading failures by failing fast when a service is unavailable.

Why It Matters

Without circuit breakers, one failing service can bring down your entire system through cascading failures and resource exhaustion.

States:

  • Closed: Normal operation, requests pass through
  • Open: Too many failures, all requests fail fast
  • Half-Open: Test if service recovered

Python - Circuit Breaker Implementation
from enum import Enum
from datetime import datetime, timedelta

class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"

class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = CircuitState.CLOSED

    def call(self, func, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            if self._should_attempt_reset():
                self.state = CircuitState.HALF_OPEN
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except Exception as e:
            self._on_failure()
            raise e

    def _on_success(self):
        self.failures = 0
        self.state = CircuitState.CLOSED

    def _on_failure(self):
        self.failures += 1
        self.last_failure_time = datetime.now()
        # A failed trial call in HALF_OPEN re-opens the circuit immediately
        if self.state == CircuitState.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = CircuitState.OPEN

    def _should_attempt_reset(self):
        # total_seconds() handles gaps longer than a day; .seconds does not
        return (datetime.now() - self.last_failure_time).total_seconds() >= self.timeout

# Usage
breaker = CircuitBreaker(failure_threshold=3, timeout=30)
try:
    result = breaker.call(external_service.get_data)
except Exception:
    # Fall back to cached data or default response
    result = get_cached_data()

Bulkhead Pattern

Isolate resources to prevent one failing component from consuming all resources.

Python - Thread Pool Bulkhead
from concurrent.futures import ThreadPoolExecutor

class Bulkhead:
    def __init__(self, max_workers=10):
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def execute(self, func, *args, **kwargs):
        future = self.executor.submit(func, *args, **kwargs)
        return future.result(timeout=5)  # 5 second timeout

# Separate bulkheads for different services
payment_bulkhead = Bulkhead(max_workers=5)
inventory_bulkhead = Bulkhead(max_workers=10)

# Payment service gets 5 threads max
payment_result = payment_bulkhead.execute(payment_service.charge, order)

# Inventory service gets 10 threads max
inventory_result = inventory_bulkhead.execute(inventory_service.reserve, items)

Retry Pattern with Exponential Backoff

Automatically retry failed requests with increasing delays.

Python - Retry with Backoff
import time
from functools import wraps

import requests

def retry_with_backoff(max_retries=3, base_delay=1, max_delay=60):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Out of attempts: propagate the last error
                    if attempt == max_retries:
                        raise
                    print(f"Retry {attempt}/{max_retries} after {delay}s")
                    # Exponential backoff: 1s, 2s, 4s, 8s...
                    time.sleep(min(delay, max_delay))
                    delay *= 2
        return wrapper
    return decorator

@retry_with_backoff(max_retries=3, base_delay=1)
def call_external_api():
    response = requests.get("https://api.example.com/data")
    response.raise_for_status()
    return response.json()

Rate Limiting Pattern

Control request rate to prevent overload and ensure fair usage.

Algorithm | How It Works | Best For
Token Bucket | Tokens refill at fixed rate, consume per request | Smooth traffic with bursts
Leaky Bucket | Process requests at constant rate | Strict rate control
Fixed Window | X requests per time window | Simple implementation
Sliding Window | Rolling time window | More accurate rate limiting
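
As a minimal sketch of the Token Bucket algorithm from the table above (illustrative only; a production limiter would also need thread safety and per-client buckets):

```python
import time

# Tokens refill at a fixed rate and each request consumes one; a full
# bucket allows a short burst, then traffic is throttled to the refill rate.
class TokenBucket:
    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity
        self.clock = clock              # injectable clock for testing
        self.last_refill = clock()

    def allow(self):
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# 3-request burst allowed, then throttled to 1 request/second
bucket = TokenBucket(capacity=3, refill_rate=1)
print([bucket.allow() for _ in range(5)])  # [True, True, True, False, False]
```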

API Gateway Pattern

Single entry point that routes requests, handles cross-cutting concerns.

Responsibilities:

  • Request routing to appropriate microservices
  • Authentication and authorization
  • Rate limiting and throttling
  • Request/response transformation
  • Caching
  • Load balancing

Hands-On Implementation

Complete, production-ready implementations of core patterns.

Saga Pattern - Complete Orchestration

Python - Saga Orchestrator
class SagaStep:
    def __init__(self, action, compensation):
        self.action = action
        self.compensation = compensation

class SagaOrchestrator:
    def __init__(self):
        self.steps = []
        self.completed_steps = []

    def add_step(self, action, compensation):
        self.steps.append(SagaStep(action, compensation))

    def execute(self):
        try:
            for step in self.steps:
                result = step.action()
                self.completed_steps.append(step)
            return {"status": "success"}
        except Exception as e:
            # Compensate in reverse order
            for step in reversed(self.completed_steps):
                try:
                    step.compensation()
                except Exception as comp_error:
                    print(f"Compensation failed: {comp_error}")
            raise e

# Order Creation Saga
def create_order_saga(order_data):
    saga = SagaOrchestrator()
    # IDs produced by earlier steps, needed by the compensations. Assumes
    # create(), charge(), and create_shipment() return the relevant IDs.
    ctx = {}

    # Step 1: Create Order
    saga.add_step(
        action=lambda: ctx.update(order_id=order_service.create(order_data)),
        compensation=lambda: order_service.cancel(ctx['order_id'])
    )

    # Step 2: Reserve Inventory
    saga.add_step(
        action=lambda: inventory_service.reserve(order_data['items']),
        compensation=lambda: inventory_service.release(order_data['items'])
    )

    # Step 3: Process Payment
    saga.add_step(
        action=lambda: ctx.update(transaction_id=payment_service.charge(order_data['total'])),
        compensation=lambda: payment_service.refund(ctx['transaction_id'])
    )

    # Step 4: Ship Order
    saga.add_step(
        action=lambda: ctx.update(shipment_id=shipping_service.create_shipment(ctx['order_id'])),
        compensation=lambda: shipping_service.cancel_shipment(ctx['shipment_id'])
    )

    return saga.execute()

CQRS with Event-Driven Updates

Python - CQRS Implementation
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime

# Command Side (Write Model)
@dataclass
class CreateOrderCommand:
    customer_id: str
    items: list
    total: float

class OrderCommandHandler:
    def __init__(self, event_bus):
        self.event_bus = event_bus

    def handle(self, command: CreateOrderCommand):
        # Create order in write database
        order = Order(
            id=generate_id(),
            customer_id=command.customer_id,
            items=command.items,
            total=command.total,
            status="PENDING",
            created_at=datetime.now()
        )

        write_db.orders.insert(order)

        # Publish event for read model
        self.event_bus.publish(OrderCreatedEvent(
            order_id=order.id,
            customer_id=order.customer_id,
            total=order.total,
            timestamp=order.created_at
        ))

        return order.id

# Query Side (Read Model)
class OrderQueryHandler:
    def get_order_summary(self, order_id):
        # Query optimized read model
        return read_db.order_summaries.find_one({"order_id": order_id})

    def get_customer_orders(self, customer_id):
        # Denormalized view for fast queries
        return read_db.customer_orders.find({"customer_id": customer_id})

# Event Handler - Updates Read Model
class OrderEventHandler:
    def on_order_created(self, event: OrderCreatedEvent):
        # Update denormalized read model
        read_db.order_summaries.insert({
            "order_id": event.order_id,
            "customer_id": event.customer_id,
            "total": event.total,
            "status": "PENDING",
            "created_at": event.timestamp
        })

        read_db.customer_orders.insert({
            "customer_id": event.customer_id,
            "order_id": event.order_id,
            "total": event.total
        })

Java Circuit Breaker with Resilience4j

Java - Resilience4j Circuit Breaker
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.vavr.control.Try;

import java.time.Duration;
import java.util.function.Supplier;

public class PaymentService {
    private final CircuitBreaker circuitBreaker;

    public PaymentService() {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
            .failureRateThreshold(50) // 50% failure rate triggers open
            .waitDurationInOpenState(Duration.ofSeconds(30))
            .slidingWindowSize(10) // Last 10 calls
            .permittedNumberOfCallsInHalfOpenState(3)
            .build();

        CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
        this.circuitBreaker = registry.circuitBreaker("paymentService");
    }

    public PaymentResult processPayment(Order order) {
        return circuitBreaker.executeSupplier(() -> {
            // Call external payment gateway
            return externalPaymentGateway.charge(order.getTotal());
        });
    }

    public PaymentResult processPaymentWithFallback(Order order) {
        Supplier<PaymentResult> decorated = CircuitBreaker.decorateSupplier(
            circuitBreaker,
            () -> externalPaymentGateway.charge(order.getTotal()));

        // Resilience4j has no built-in fallback; combine with Vavr's Try
        return Try.ofSupplier(decorated)
            .recover(throwable -> {
                // Fallback: Queue payment for later processing
                paymentQueue.enqueue(order);
                return new PaymentResult(Status.QUEUED, "Payment queued");
            })
            .get();
    }
}

Node.js API Gateway with Rate Limiting

Node.js - API Gateway
const express = require('express');
const rateLimit = require('express-rate-limit');
const axios = require('axios');

const app = express();
app.use(express.json());

// Rate limiting middleware
const limiter = rateLimit({
    windowMs: 15 * 60 * 1000, // 15 minutes
    max: 100, // Max 100 requests per window
    message: 'Too many requests from this IP'
});

app.use('/api/', limiter);

// Service registry
const services = {
    users: 'http://user-service:3001',
    products: 'http://product-service:3002',
    orders: 'http://order-service:3003'
};

// Route requests to appropriate service
app.all('/api/:service/*', async (req, res) => {
    const serviceName = req.params.service;
    const serviceUrl = services[serviceName];

    if (!serviceUrl) {
        return res.status(404).json({ error: 'Service not found' });
    }

    const targetUrl = req.url.replace(`/api/${serviceName}`, '');

    try {
        const response = await axios({
            method: req.method,
            url: serviceUrl + targetUrl,
            data: req.body,
            headers: {
                'Authorization': req.headers.authorization
            },
            timeout: 5000
        });

        res.status(response.status).json(response.data);
    } catch (error) {
        if (error.code === 'ECONNABORTED') {
            res.status(504).json({ error: 'Service timeout' });
        } else {
            res.status(500).json({ error: 'Service unavailable' });
        }
    }
});

app.listen(8000, () => console.log('API Gateway running on port 8000'));

Event Sourcing with Go

Go - Event Store
package main

import (
    "time"
)

type Event struct {
    ID          string
    AggregateID string
    Type        string
    Data        map[string]interface{}
    Timestamp   time.Time
}

type EventStore struct {
    events []Event
}

func NewEventStore() *EventStore {
    return &EventStore{events: make([]Event, 0)}
}

func (es *EventStore) AppendEvent(event Event) {
    event.Timestamp = time.Now()
    es.events = append(es.events, event)
}

func (es *EventStore) GetEvents(aggregateID string) []Event {
    result := make([]Event, 0)
    for _, event := range es.events {
        if event.AggregateID == aggregateID {
            result = append(result, event)
        }
    }
    return result
}

// Account Aggregate
type Account struct {
    ID      string
    Balance float64
}

func (a *Account) ApplyEvent(event Event) {
    switch event.Type {
    case "DEPOSITED":
        a.Balance += event.Data["amount"].(float64)
    case "WITHDRAWN":
        a.Balance -= event.Data["amount"].(float64)
    }
}

func ReconstructAccount(accountID string, store *EventStore) *Account {
    account := &Account{ID: accountID, Balance: 0}
    events := store.GetEvents(accountID)

    for _, event := range events {
        account.ApplyEvent(event)
    }

    return account
}

// Usage
func main() {
    store := NewEventStore()

    store.AppendEvent(Event{
        AggregateID: "acc-123",
        Type:        "DEPOSITED",
        Data:        map[string]interface{}{"amount": 100.0},
    })

    store.AppendEvent(Event{
        AggregateID: "acc-123",
        Type:        "WITHDRAWN",
        Data:        map[string]interface{}{"amount": 30.0},
    })

    account := ReconstructAccount("acc-123", store)
    _ = account // account.Balance == 70.0; discard to keep the example compiling
}

Try It Yourself

Clone these examples and experiment:

  1. Modify the Circuit Breaker thresholds and observe behavior
  2. Add more steps to the Saga and test compensation logic
  3. Implement a simple rate limiter using the Token Bucket algorithm
  4. Create event sourcing for a shopping cart aggregate

Practice Exercises

Apply patterns through hands-on challenges.

Exercise 1: Implement Saga Pattern for E-commerce

Objective: Build a Saga for order processing with compensation logic

Scenario: E-commerce order flow

  1. Create Order (compensation: cancel order)
  2. Reserve Inventory (compensation: release inventory)
  3. Process Payment (compensation: refund payment)
  4. Schedule Shipping (compensation: cancel shipping)

Requirements:

  • If any step fails, run compensations in reverse order
  • Log each step and compensation
  • Return success/failure with details

Solution:

Python Solution
class ECommerceSaga:
    def __init__(self):
        self.steps_executed = []
        self.order_id = None
        self.reservation_id = None
        self.transaction_id = None
        self.shipment_id = None

    def execute(self, order_data):
        try:
            # Step 1: Create Order
            self.order_id = self._create_order(order_data)
            self.steps_executed.append('create_order')

            # Step 2: Reserve Inventory
            self.reservation_id = self._reserve_inventory(order_data['items'])
            self.steps_executed.append('reserve_inventory')

            # Step 3: Process Payment
            self.transaction_id = self._process_payment(order_data['total'])
            self.steps_executed.append('process_payment')

            # Step 4: Schedule Shipping
            self.shipment_id = self._schedule_shipping(self.order_id)
            self.steps_executed.append('schedule_shipping')

            return {"status": "success", "order_id": self.order_id}

        except Exception as e:
            print(f"Saga failed: {e}")
            self._compensate()
            return {"status": "failed", "error": str(e)}

    def _compensate(self):
        print("Starting compensation...")

        for step in reversed(self.steps_executed):
            try:
                if step == 'schedule_shipping':
                    shipping_service.cancel(self.shipment_id)
                    print("✓ Cancelled shipping")

                elif step == 'process_payment':
                    payment_service.refund(self.transaction_id)
                    print("✓ Refunded payment")

                elif step == 'reserve_inventory':
                    inventory_service.release(self.reservation_id)
                    print("✓ Released inventory")

                elif step == 'create_order':
                    order_service.cancel(self.order_id)
                    print("✓ Cancelled order")

            except Exception as comp_error:
                print(f"✗ Compensation failed for {step}: {comp_error}")

    def _create_order(self, order_data):
        # Implementation
        return order_service.create(order_data)

    def _reserve_inventory(self, items):
        # Implementation
        return inventory_service.reserve(items)

    def _process_payment(self, amount):
        # Implementation
        return payment_service.charge(amount)

    def _schedule_shipping(self, order_id):
        # Implementation
        return shipping_service.schedule(order_id)

Exercise 2: Build Circuit Breaker from Scratch

Objective: Implement a Circuit Breaker with all three states

Requirements:

  1. Track failure count and success count
  2. Implement CLOSED, OPEN, HALF_OPEN states
  3. Configurable failure threshold (e.g., 5 failures triggers OPEN)
  4. Configurable timeout (e.g., 60 seconds before trying HALF_OPEN)
  5. In HALF_OPEN, allow limited requests to test recovery

Test cases:

  • 5 consecutive failures should open the circuit
  • Requests during OPEN state should fail immediately
  • After timeout, circuit should transition to HALF_OPEN
  • Successful requests in HALF_OPEN should close the circuit

Hint: Use the example from the Hands-On section as a starting point

Exercise 3: Design CQRS for Blog Platform

Objective: Separate read and write models for a blog system

Scenario: Blog platform with posts, comments, and likes

Write Model Commands:

  • CreatePost
  • UpdatePost
  • DeletePost
  • AddComment
  • LikePost

Read Model Queries:

  • GetPostDetail (post + comments + like count)
  • GetUserFeed (personalized feed with denormalized data)
  • GetTrendingPosts (sorted by likes, recent activity)

Task:

  1. Design the write model (normalized database)
  2. Design the read model (denormalized views)
  3. Define events that sync write → read
  4. Implement one command handler and one query handler

Hint: Write model focuses on data integrity, read model on query performance

Pattern Catalog & Reference

Production-grade patterns for complex distributed systems.

Choreography vs Orchestration

Aspect | Choreography | Orchestration
Control | Decentralized | Centralized
Communication | Event-driven | Command-driven
Coupling | Loose | Tighter
Complexity | Harder to trace | Easier to understand

Python - Orchestration with Temporal
from datetime import timedelta

from temporalio import workflow

@workflow.defn
class OrderWorkflow:
    @workflow.run
    async def run(self, order_data):
        # Orchestrator controls the flow
        order = await workflow.execute_activity(
            create_order, order_data, start_to_close_timeout=timedelta(seconds=30)
        )

        inventory = await workflow.execute_activity(
            reserve_inventory, order.items, start_to_close_timeout=timedelta(seconds=30)
        )

        payment = await workflow.execute_activity(
            process_payment, order.total, start_to_close_timeout=timedelta(seconds=30)
        )

        await workflow.execute_activity(
            ship_order, order.id, start_to_close_timeout=timedelta(seconds=30)
        )

        return order
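
For contrast with the orchestrated workflow above, a choreographed version of the same flow has no central coordinator: each service subscribes to the events it cares about and emits its own events in response. The in-process event bus below is an illustrative stand-in for a real broker such as Kafka.

```python
# Minimal choreography sketch: services register handlers for event types,
# and publishing an event fans it out to every subscriber.
class EventBus:
    def __init__(self):
        self.handlers = {}

    def subscribe(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers.get(event_type, []):
            handler(payload)

bus = EventBus()
log = []

def on_order_created(order):
    # Inventory service reacts to OrderCreated, then announces its own event
    log.append("inventory reserved")
    bus.publish("InventoryReserved", order)

def on_inventory_reserved(order):
    # Payment service reacts to InventoryReserved
    log.append("payment charged")

bus.subscribe("OrderCreated", on_order_created)
bus.subscribe("InventoryReserved", on_inventory_reserved)

bus.publish("OrderCreated", {"order_id": 1})
print(log)  # ['inventory reserved', 'payment charged']
```

Notice that the order flow emerges from the chain of subscriptions rather than from any one component, which is exactly why choreography is harder to trace than orchestration.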

Spotify's Saga Pattern at Scale

  • Use Case: Playlist creation across multiple services
  • Pattern: Event-driven choreography
  • Services Involved: 12+ microservices
  • Events/sec: 100,000+
  • Key Learning: Idempotency is critical for event replay

Complete Pattern Catalog

Pattern | Problem | Solution | When to Use | Trade-offs
Circuit Breaker | Cascading failures | Prevent calls to failing service | External dependencies, slow services | Complexity vs. resilience
Saga | Distributed transactions | Compensating transactions | Multi-service workflows | Eventual consistency
CQRS | Read/write performance | Separate read/write models | Complex queries, high read load | Data synchronization complexity
Event Sourcing | Audit trail, temporal queries | Store events, not state | Financial systems, compliance | Storage overhead, complexity
Strangler Fig | Legacy migration | Gradually replace old system | Monolith to microservices | Dual maintenance period
API Gateway | Client complexity, cross-cutting | Single entry point | Mobile apps, public APIs | Single point of failure
Service Mesh | Service-to-service communication | Infrastructure layer for networking | Large-scale microservices | Operational complexity
Bulkhead | Resource exhaustion | Isolate resources per service | Shared resources, critical services | Resource inefficiency
Retry Pattern | Transient failures | Automatic retry with backoff | Network glitches, temporary outages | Increased latency, thundering herd
Rate Limiting | Service abuse, overload | Throttle requests per client | Public APIs, DoS protection | Legitimate traffic may be blocked
BFF (Backend for Frontend) | Different client needs | Custom backend per client type | Mobile/web/IoT different requirements | Code duplication
Sidecar | Cross-cutting concerns | Co-located helper process | Logging, monitoring, proxying | Resource overhead per service
Service Discovery | Dynamic service locations | Registry for service lookup | Cloud, containerized environments | Additional infrastructure dependency
Database per Service | Tight coupling via shared DB | Each service owns its data | Independent deployability needed | Data consistency challenges
Outbox Pattern | Dual-write problem | Transactional outbox table | Database + messaging atomic writes | Polling overhead, latency

Real-World Case Studies

Netflix Chaos Engineering

  • Challenge: Ensure resilience in distributed system with 1000+ microservices
  • Patterns Used: Circuit Breaker (Hystrix), Bulkhead, Retry, Fallback
  • Innovation: Chaos Monkey - randomly terminates production instances
  • Impact: 99.99% uptime despite constant failures
  • Tech Stack: Spring Cloud, Hystrix, Ribbon, Eureka
  • Key Metric: Serves 230M+ subscribers across 190+ countries
  • Lesson: "Embrace failure as a feature, not a bug"

Uber's Saga Pattern at Scale

  • Challenge: Coordinate ride booking across payment, dispatch, driver, rider services
  • Pattern: Saga with Orchestration (Cadence workflow engine)
  • Workflow Steps: Validate rider → Match driver → Process payment → Start trip
  • Compensation: Refund payment if driver cancels, release driver if payment fails
  • Scale: 18M+ trips/day, sub-second booking confirmation
  • Tech: Cadence (now Temporal), Go, Node.js microservices
  • Key Learning: Orchestration better than choreography for complex workflows

Amazon's Strangler Fig Migration

  • Challenge: Migrate monolithic e-commerce platform to microservices
  • Pattern: Strangler Fig + API Gateway
  • Approach: Route new features to microservices, legacy to monolith
  • Duration: 5+ year gradual migration
  • Result: 2-tier SOA → hundreds of microservices
  • Services: Product Catalog, Cart, Checkout, Recommendations (all separate)
  • Impact: Deploy every 11.7 seconds, 99.99% availability

Capital One's CQRS + Event Sourcing

  • Use Case: Banking transaction processing and fraud detection
  • Pattern: CQRS + Event Sourcing
  • Write Side: Process transactions, store events in event store
  • Read Side: Fraud detection models, account balance views, transaction history
  • Benefits: Complete audit trail, temporal queries for compliance
  • Scale: Billions of events, real-time fraud detection
  • Tech: Kafka for event streaming, Cassandra for event store

Anti-Patterns to Avoid

Distributed Monolith

Problem: Microservices with tight coupling, shared database, synchronous dependencies

Symptoms: Can't deploy independently, cascading failures, slow deployments

Solution: Database per service, async communication, bounded contexts

Chatty Services

Problem: Too many synchronous inter-service calls for a single user request

Symptoms: High latency, network congestion, timeout cascades

Solution: API composition, data duplication, event-driven communication

Shared Database

Problem: Multiple services reading/writing same database tables

Symptoms: Schema changes break multiple services, tight coupling

Solution: Database per service, data replication via events

Microservices for Every Small Feature

Problem: Over-engineering with too many tiny services

Symptoms: Operational nightmare, debugging complexity, network overhead

Solution: Start with modular monolith, extract services based on team/scaling needs

Pattern Combinations & When to Use

Scenario | Recommended Patterns | Rationale
E-commerce Order Flow | Saga + Circuit Breaker + Retry | Saga for multi-step workflow, Circuit Breaker for payment gateway, Retry for transient failures
Banking/Finance | Event Sourcing + CQRS + Outbox | Complete audit trail, read/write optimization, guaranteed message delivery
Social Media Feed | CQRS + Cache-Aside + Rate Limiting | Fast reads, write optimization, prevent abuse
IoT Platform | Event-Driven + Bulkhead + Service Mesh | Handle massive events, isolate device types, secure service-to-service
Legacy Migration | Strangler Fig + BFF + API Gateway | Gradual migration, client-specific APIs, routing layer
Public API Platform | API Gateway + Rate Limiting + Circuit Breaker | Single entry, abuse prevention, backend protection
Real-time Analytics | Event Sourcing + CQRS + Materialized Views | Event replay, query optimization, pre-computed aggregates

Pattern Selection Decision Tree

Start Here: Do you need microservices?
  • NO if: Small team (< 5 devs), simple domain, startup/MVP stage → Use modular monolith
  • YES if: Multiple teams, scaling bottlenecks, polyglot needs, independent deployability

Multi-Service Workflows?
  • Simple flow (2-3 steps): Event-driven choreography
  • Complex flow (4+ steps with compensation): Saga with orchestration
  • Long-running (days/weeks): Temporal/Cadence workflow engine

Read vs Write Performance Issues?
  • Heavy writes, simple reads: Database optimization, write-through cache
  • Heavy reads, complex queries: CQRS with materialized views
  • Audit/compliance required: Event Sourcing + CQRS

Resilience Concerns?
  • External service failures: Circuit Breaker + Fallback
  • Transient network errors: Retry with Exponential Backoff
  • Resource exhaustion: Bulkhead + Rate Limiting
  • Cascading failures: Circuit Breaker + Timeout + Bulkhead

Quick Pattern Selector
  1. Start small: Implement Circuit Breaker + Retry for all external calls
  2. Add workflow: Use Saga when you have 3+ service coordination
  3. Scale reads: Add CQRS when read load >> write load (10x+)
  4. Audit trail: Add Event Sourcing only if compliance requires it
  5. API layer: Add API Gateway when you have 3+ client types