Deployment

Hard · 45 min read

Packaging with pyproject.toml

Why Deployment Matters

The Problem: 'Works on my machine' is where most production incidents start. Manual deploys are slow, error-prone, and unscalable.

The Solution: Multi-stage Docker images + uvicorn/gunicorn + GitHub Actions + Kubernetes turn deployment into a one-button operation with reproducible builds and rollback.

Real Impact: Investing in deployment plumbing once saves every team member time forever — and is the price of admission for shipping to real users.

Real-World Analogy

Think of deployment as packing a moving truck:

  • pyproject.toml = the inventory list — exactly what's in each box
  • Docker image = the truck itself — your app sealed with everything it needs
  • CI/CD = the moving company that always packs the truck the same way
  • Kubernetes = the dispatch center routing trucks to warehouses, replacing broken ones
  • Observability = the GPS + dashcam telling you each truck is healthy in real time

A minimal pyproject.toml for an application:

[project]
name = "myapp"
version = "1.0.0"
requires-python = ">=3.12"
dependencies = [
    "fastapi>=0.110",
    "uvicorn[standard]",
    "sqlalchemy>=2",
]

[project.optional-dependencies]
dev = ["pytest", "ruff", "mypy"]

[project.scripts]
myapp = "myapp.cli:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

$ pip install build
$ python -m build              # produces dist/*.whl and dist/*.tar.gz
$ pip install dist/myapp-1.0.0-py3-none-any.whl

Pinning Dependencies for Production

For libraries, use loose constraints in pyproject.toml. For applications, also pin exact versions in a lockfile for reproducibility.

$ pip install pip-tools
$ pip-compile pyproject.toml -o requirements.txt   # exact pins
$ pip-sync requirements.txt                         # install exactly these

Modern alternatives: uv (Rust-based, fast), poetry (full workflow), pipenv, pdm.

Dockerizing a Python App

# Dockerfile
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt ./
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY src ./src
ENV PATH=/root/.local/bin:$PATH PYTHONUNBUFFERED=1
EXPOSE 8000
CMD ["uvicorn", "myapp.main:app", "--host", "0.0.0.0", "--port", "8000"]

$ docker build -t myapp:1.0 .
$ docker run -p 8000:8000 myapp:1.0
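A .dockerignore keeps the build context small and the layer cache stable. The entries below are a sketch — adjust them to your repo layout:

```
# .dockerignore — exclude files the image never needs
.git
.venv/
__pycache__/
*.pyc
dist/
tests/
.github/
```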

Multi-stage builds shrink images

Build wheels in a fat stage, copy only the runtime artifacts to a slim stage. A typical Python service image drops from ~1.2GB to ~150MB this way.

Running in Production

uvicorn — ASGI Server

$ uvicorn myapp.main:app --host 0.0.0.0 --port 8000 --workers 4

gunicorn + uvicorn workers

For Linux production, the standard is gunicorn managing uvicorn workers — robust process management, graceful reloads, and signal handling.

$ gunicorn myapp.main:app \
    -w 4 \
    -k uvicorn.workers.UvicornWorker \
    --bind 0.0.0.0:8000 \
    --access-logfile - \
    --error-logfile -

How many workers?

Rule of thumb: (2 × CPU cores) + 1 workers; lean higher for I/O-bound services. With async workers, even a few processes can handle thousands of concurrent connections.
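The rule of thumb is easy to encode — a small sketch (the function name is illustrative):

```python
import os

def recommended_workers() -> int:
    """Gunicorn's rule-of-thumb default: (2 * CPU cores) + 1."""
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    return 2 * cpus + 1

print(recommended_workers())
```

Treat the result as a starting point; load-test and tune per workload.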

CI/CD with GitHub Actions

# .github/workflows/ci.yml
name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - run: pip install -e ".[dev]"
      - run: ruff check .
      - run: mypy src
      - run: pytest --cov=myapp

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t myapp:${{ github.sha }} .
      - run: docker push myapp:${{ github.sha }}
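As written, the push step will fail without registry authentication and a registry-qualified tag. A sketch using docker/login-action and GitHub's container registry — the registry and `OWNER` image path are assumptions, substitute your own:

```yaml
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - run: docker build -t ghcr.io/OWNER/myapp:${{ github.sha }} .
      - run: docker push ghcr.io/OWNER/myapp:${{ github.sha }}
```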

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: app
          image: myapp:1.0
          ports:
            - containerPort: 8000
          readinessProbe:
            httpGet: { path: /health, port: 8000 }
          livenessProbe:
            httpGet: { path: /health, port: 8000 }
          resources:
            requests: { cpu: "100m", memory: "128Mi" }
            limits:   { cpu: "500m", memory: "512Mi" }
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector: { app: myapp }
  ports:
    - port: 80
      targetPort: 8000

Add an Ingress, a HorizontalPodAutoscaler, and you have an autoscaling, load-balanced service. See the Kubernetes track for the full picture.
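For reference, a minimal HorizontalPodAutoscaler targeting the Deployment above — the 70% CPU target is an arbitrary example, tune it to your traffic:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```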

Production Checklist

Before shipping

  • Health endpoint: /health returning 200 when ready, used by load balancer + Kubernetes probes.
  • Structured JSON logs to stdout — let your platform aggregate.
  • Metrics endpoint: /metrics for Prometheus.
  • Graceful shutdown: handle SIGTERM, drain in-flight requests.
  • Secrets: from env or secret manager, never committed.
  • Pinned versions: reproducible builds — no surprises in prod.
  • Resource limits: CPU and memory caps prevent runaway containers.
  • Read-only filesystem: container with no write access reduces blast radius.

🎯 Practice Exercises

Exercise 1: Build a wheel

Take a small Python app. Add pyproject.toml. Run python -m build. Install the resulting wheel in a fresh venv.

Exercise 2: Dockerize

Write a multi-stage Dockerfile. Measure final image size. Run the container locally.

Exercise 3: CI pipeline

Create a GitHub Actions workflow that runs ruff, mypy, and pytest on every push.

Exercise 4: Kubernetes

Write a Deployment + Service for your app. Apply to a local cluster (kind/minikube). Hit the service from a port-forward.