Threads in Python
Why Threading Matters
The Problem: Sequential code waits idle while a network call returns — wasting capacity. Doing 100 HTTP calls one at a time takes 100x as long as it should.
The Solution: Threads let one process handle many I/O-bound waits concurrently. The GIL prevents true parallel CPU execution, but for I/O — where most apps spend their time — threading is the simplest concurrency model.
Real Impact: Use ThreadPoolExecutor for I/O-bound work, multiprocessing for CPU-bound. Knowing the GIL distinction prevents the most common Python performance mistake.
Real-World Analogy
Think of threads as call-center agents sharing one phone book:
- Thread = an agent who handles one customer at a time but can switch when the customer is on hold
- GIL = only one agent may read the phone book at a time — fine if everyone's on hold
- Lock = a sticky note reserving a phone book page for one agent
- ThreadPoolExecutor = the supervisor who hands out customers from a queue
- Daemon thread = a background agent that gets sent home when the office closes
The threading module provides OS-level threads. Threads share memory and are great for I/O-bound workloads (network, disk). Due to the GIL (covered below), they don't speed up CPU-bound Python code.
import threading
import time
def worker(n: int):
    print(f"worker {n} starting")
    time.sleep(1)
    print(f"worker {n} done")
threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads: t.start()
for t in threads: t.join() # wait for completion
Daemon Threads
t = threading.Thread(target=poll_loop, daemon=True)
t.start()
# Daemon threads die when the main program exits — useful for background loops
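A minimal self-contained sketch of the difference (heartbeat is a made-up background loop, not part of the original example):

```python
import threading
import time

def heartbeat():
    # Runs forever; because the thread is a daemon, it is simply
    # terminated when the main thread exits -- no join() required
    while True:
        time.sleep(0.1)

t = threading.Thread(target=heartbeat, daemon=True)
t.start()

print(t.daemon)  # True
time.sleep(0.3)
# Main thread ends here; the interpreter exits without waiting for t
```

A non-daemon thread running the same infinite loop would hang the program at exit, since Python waits for all non-daemon threads to finish.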
The Global Interpreter Lock (GIL)
CPython's GIL ensures only one thread executes Python bytecode at a time. This means:
- I/O-bound work: Threading helps. While one thread waits on a socket, another can run.
- CPU-bound work: Threading does NOT help — the GIL serializes execution. Use multiprocessing instead.
- C extensions (NumPy, etc.): Release the GIL during heavy work — threading helps there.
# CPU-bound — threading provides ~0 speedup
def crunch():
    return sum(i * i for i in range(10_000_000))

# I/O-bound — threading provides big speedup
def fetch(url):
    return requests.get(url).text
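A rough timing sketch of that contrast, using time.sleep as a stand-in for real I/O (function names are illustrative; exact numbers vary by machine):

```python
import threading
import time

def cpu_task():
    sum(i * i for i in range(2_000_000))   # holds the GIL while computing

def io_task():
    time.sleep(0.5)                        # releases the GIL while sleeping

def timed(target, n=4):
    # Run n copies of target in threads and measure wall-clock time
    threads = [threading.Thread(target=target) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

print(f"CPU-bound, 4 threads: {timed(cpu_task):.2f}s")  # roughly 4x one task: serialized
print(f"I/O-bound, 4 threads: {timed(io_task):.2f}s")   # close to 0.5s: sleeps overlap
```

The four sleeps overlap almost perfectly, while the four computations take about as long as running them one after another.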
Python 3.13+ free-threaded build
Python 3.13 ships an experimental no-GIL build. Once mature, this will allow real parallel Python execution. For now, the GIL is the rule.
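One way to check which build you are running (assuming CPython; Py_GIL_DISABLED is the build-time flag behind the free-threaded variant):

```python
import sysconfig

# 1 on a free-threaded (no-GIL) build; 0 or None on a standard build
print(sysconfig.get_config_var("Py_GIL_DISABLED"))
```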
concurrent.futures — High-Level API
The recommended way to use threads in modern Python. Submit functions to an executor, get futures back.
from concurrent.futures import ThreadPoolExecutor
import requests
urls = ["https://example.com"] * 20
with ThreadPoolExecutor(max_workers=10) as ex:
    # map: results come back in input order
    for result in ex.map(lambda u: requests.get(u).status_code, urls):
        print(result)

    # Or submit individually for finer control
    futures = [ex.submit(requests.get, u) for u in urls]
    for f in futures:
        print(f.result().status_code)
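To handle results as soon as each one finishes rather than in submission order, as_completed is the usual tool. A sketch with a dummy task standing in for requests.get (fake_fetch and the URLs are illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_fetch(url: str) -> str:
    time.sleep(0.1)  # stand-in for network latency
    return f"body of {url}"

urls = [f"https://example.com/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=3) as ex:
    # Map each future back to its URL so we know which result is which
    futures = {ex.submit(fake_fetch, u): u for u in urls}
    for fut in as_completed(futures):  # yields futures as they finish
        print(futures[fut], "->", fut.result())
```

This matters when tasks have very different durations: a slow URL no longer blocks you from processing the fast ones.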
Synchronization Primitives
Even with the GIL, compound operations like counter += 1 are not atomic — they compile to several bytecodes, and the interpreter can switch threads between them. Use locks for shared mutable state.
import threading
counter = 0
lock = threading.Lock()
def increment(n):
    global counter
    for _ in range(n):
        with lock:  # automatic acquire/release
            counter += 1
Primitives
| Primitive | Use |
|---|---|
| Lock | Mutex — one holder at a time |
| RLock | Reentrant — same thread can acquire multiple times |
| Semaphore(n) | Up to n holders simultaneously |
| Event | One thread signals, others wait |
| Condition | Lock + signal — wait until a predicate becomes true |
| Barrier(n) | All n threads wait until everyone arrives |
| queue.Queue | Thread-safe FIFO queue — preferred for producer/consumer |
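A small sketch of Event from the table — several workers block until one signal releases them all (names here are illustrative):

```python
import threading
import time

ready = threading.Event()
results = []

def waiter(name: str) -> None:
    ready.wait()          # blocks until ready.set() is called
    results.append(name)

workers = [threading.Thread(target=waiter, args=(f"w{i}",)) for i in range(3)]
for w in workers:
    w.start()

time.sleep(0.1)           # all workers are parked on wait()
print(results)            # [] — nothing has run past wait() yet
ready.set()               # one signal releases every waiter
for w in workers:
    w.join()
print(sorted(results))    # ['w0', 'w1', 'w2']
```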
Producer/Consumer with Queue
import queue, threading
q = queue.Queue(maxsize=100)
def producer():
    for i in range(1000):
        q.put(i)  # blocks if full
    q.put(None)  # sentinel

def consumer():
    while True:
        item = q.get()
        if item is None: break
        process(item)
        q.task_done()
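A complete runnable variant with several consumers — one sentinel per consumer so each shuts down cleanly (here the process step is just collecting items into a list):

```python
import queue
import threading

q = queue.Queue(maxsize=100)
results = []
results_lock = threading.Lock()
NUM_CONSUMERS = 3

def producer():
    for i in range(1000):
        q.put(i)            # blocks while the queue is full
    for _ in range(NUM_CONSUMERS):
        q.put(None)         # one sentinel per consumer

def consumer():
    while True:
        item = q.get()
        if item is None:
            break
        with results_lock:
            results.append(item)  # stand-in for process(item)

consumers = [threading.Thread(target=consumer) for _ in range(NUM_CONSUMERS)]
for c in consumers:
    c.start()
producer()                  # run the producer in the main thread
for c in consumers:
    c.join()

print(len(results))  # 1000
```

The bounded queue gives backpressure for free: if consumers fall behind, the producer blocks instead of filling memory.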
Thread-Local Storage
Each thread gets its own attributes — useful for request-scoped data without explicit plumbing.
import threading
local_data = threading.local()
def handle_request(req):
    local_data.user = req.user
    process()

def process():
    print(local_data.user)  # whatever was set in THIS thread
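A self-contained demonstration that each thread sees only its own value (the request objects are simulated as plain strings):

```python
import threading

local_data = threading.local()
seen = {}

def handle(user):
    local_data.user = user        # visible only in this thread
    seen[user] = local_data.user  # each thread reads back its own value

threads = [threading.Thread(target=handle, args=(u,)) for u in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen.items()))         # [('alice', 'alice'), ('bob', 'bob')]
print(hasattr(local_data, "user"))  # False: the attribute was never set in the main thread
```

The final line is the key point: local_data.user does not exist in the main thread at all, because each thread gets an independent set of attributes.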
🎯 Practice Exercises
Exercise 1: URL fetcher
Fetch 50 URLs with ThreadPoolExecutor. Compare wall-clock time against a sequential version.
Exercise 2: Counter race
Spawn 10 threads incrementing a counter 100k times. Show the count is less than 1M without a lock. Fix with Lock.
Exercise 3: Producer-consumer
Build a thread pool where producers fill a Queue and consumers drain it. Use a sentinel value to shut down cleanly.
Exercise 4: CPU vs I/O
Run a CPU-bound function with threads and time it. Repeat with an I/O-bound function. Observe why one scales and the other doesn't.