FastAPI 102: Async vs Sync, What Actually Happens
In Part 1 we traced the full path a request
takes through FastAPI: ASGI server, middleware, dependency injection, handler. One question I kicked
down the road: should your handler be async def or def? Most developers pick
one and stick with it. The honest answer depends on what your handler does. Pick wrong and you either
kill throughput silently or pay for thread pool overhead you didn't need. Part 2 is about what actually
happens at the runtime level for each choice.
The event loop: one thread, cooperative multitasking
Uvicorn, the ASGI server most FastAPI apps run under, drives each worker process with a single-threaded
event loop (Python's asyncio loop, or uvloop when it is installed). Single-threaded does not mean one
request at a time. It means all requests in that process share the same thread and switch between each
other cooperatively.
The key word is cooperatively. The event loop runs one coroutine at a time. A coroutine
"yields" control back to the event loop when it hits an await expression. While it's
waiting (for a network response, a DB query, a file read), the event loop runs other coroutines.
When the awaited operation completes, the coroutine is resumed.
Event Loop Thread:
  T=0ms   Handle request A: start processing
  T=5ms   Request A hits "await db.query()", yields to event loop
  T=5ms   Handle request B: start processing
  T=8ms   Request B hits "await http_client.get()", yields
  T=12ms  DB query for A completes, resume request A
  T=15ms  HTTP call for B completes, resume request B
  T=16ms  Request A sends response
  T=18ms  Request B sends response
All of this happened on ONE thread.
This is why async shines on I/O-bound work. While you wait for the database, you aren't burning a thread. The thread is handling other requests.
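The timeline above can be reproduced without FastAPI at all. The sketch below is a stdlib-only simulation, not FastAPI code: handle and simulate are made-up names, and asyncio.sleep stands in for a database or HTTP wait.

```python
import asyncio


async def handle(name: str, io_seconds: float, log: list[str]) -> None:
    """Pretend request handler: record a start, wait on 'I/O', record the end."""
    log.append(f"{name}: start")
    # asyncio.sleep stands in for awaiting a DB query or HTTP call.
    # The coroutine suspends here and the event loop runs something else.
    await asyncio.sleep(io_seconds)
    log.append(f"{name}: done")


async def simulate() -> list[str]:
    log: list[str] = []
    # Both "requests" are scheduled on the same loop, in the same thread.
    await asyncio.gather(handle("A", 0.2, log), handle("B", 0.05, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(simulate()))
    # ['A: start', 'B: start', 'B: done', 'A: done']
```

B starts before A has finished, and finishes first because its wait is shorter. Nothing ran in parallel; the loop simply switched at each await.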
What actually happens with async def
When you define a route with async def, FastAPI calls it as a coroutine directly on
the event loop. No thread pool, no executor hop. The event loop runs your code until it hits an
await, at which point it yields and picks up other work.
import httpx
from fastapi import FastAPI
app = FastAPI()
# Good: async def that awaits an async I/O operation
@app.get("/users/{id}")
async def get_user(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
        # While waiting for HTTP response, event loop handles other requests
    return response.json()
When await client.get(...) is called, the coroutine suspends, the event loop runs
other coroutines, and when the HTTP response arrives, your coroutine resumes. No extra threads are used,
so a single CPU core can keep thousands of requests in flight as long as they spend their time waiting on I/O.
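To get a feel for that claim, here is a rough stdlib-only sketch. It is a simulation under assumptions, not a benchmark: fake_request and serve_many are hypothetical names, and each "request" is just a 100 ms asyncio.sleep standing in for an awaited I/O call.

```python
import asyncio
import threading
import time


async def fake_request(i: int) -> str:
    # Stand-in for an awaited I/O call that takes about 100 ms.
    await asyncio.sleep(0.1)
    # Report which thread resumed this coroutine.
    return threading.current_thread().name


async def serve_many(n: int = 1000) -> tuple[float, set[str]]:
    start = time.perf_counter()
    names = await asyncio.gather(*(fake_request(i) for i in range(n)))
    elapsed = time.perf_counter() - start
    return elapsed, set(names)


if __name__ == "__main__":
    elapsed, threads = asyncio.run(serve_many())
    print(f"{elapsed:.2f}s on threads: {threads}")
```

Run back to back, a thousand 100 ms waits would take about 100 seconds. Here they overlap on one thread, so the total stays close to a single wait.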
The silent killer: blocking inside async def
This is where people get burned. If you call a blocking function inside async def,
you don't get an error. The code just runs. And while it runs, the entire event loop is frozen until
your blocking call returns. I have shipped this bug. I will probably ship it again.
import requests  # sync HTTP library
# Wrong: requests.get() is blocking. It blocks the entire event loop.
# ALL other concurrent requests wait while this network call is in progress.
@app.get("/users/{id}")
async def get_user_broken(id: int):
    response = requests.get(f"https://api.example.com/users/{id}")
    # While this takes 200ms, every other request is frozen
    return response.json()
# Right: httpx.AsyncClient is async-native
@app.get("/users/{id}")
async def get_user_fixed(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
    return response.json()
Same trap with database drivers. A synchronous SQLAlchemy session inside async def
blocks the event loop for every query, and you only notice under load.
from fastapi import Depends
from sqlalchemy import select
from sqlalchemy.orm import Session
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
# get_sync_db, get_async_db and the Order model are defined elsewhere in the app
# Bad: sync SQLAlchemy inside async def blocks the event loop
@app.get("/orders")
async def get_orders_broken(db: Session = Depends(get_sync_db)):
    return db.query(Order).all()  # Blocking DB call, freezes event loop
# Good: async SQLAlchemy
@app.get("/orders")
async def get_orders_fixed(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()
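The freeze is easy to observe without any web framework. In this stdlib-only sketch (ticker and count_ticks are hypothetical names), a heartbeat coroutine ticks every 10 ms while the "handler" waits 200 ms, once with a blocking time.sleep and once with await asyncio.sleep.

```python
import asyncio
import time


async def ticker(ticks: list[float], stop: asyncio.Event) -> None:
    """Heartbeat: records a timestamp every 10 ms while the loop is free."""
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)


async def count_ticks(blocking: bool) -> int:
    ticks: list[float] = []
    stop = asyncio.Event()
    task = asyncio.create_task(ticker(ticks, stop))
    await asyncio.sleep(0)  # yield once so the ticker records its first tick
    if blocking:
        time.sleep(0.2)           # blocks the event loop thread itself
    else:
        await asyncio.sleep(0.2)  # suspends only this coroutine
    stop.set()
    await task
    return len(ticks)


if __name__ == "__main__":
    print("blocking:    ", asyncio.run(count_ticks(True)))
    print("non-blocking:", asyncio.run(count_ticks(False)))
```

With the blocking sleep the heartbeat gets no chance to run after its first tick. With the awaited sleep it keeps ticking for the whole wait. Swap the heartbeat for other users' requests and you have the production bug.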
What actually happens with def
When you define a route with plain def, FastAPI does not call it on the event
loop. It hands it to a worker thread through Starlette's run_in_threadpool, which wraps
anyio.to_thread.run_sync. That pool is governed by AnyIO's default thread limiter
(40 threads unless you change it), not by asyncio's default ThreadPoolExecutor.
# FastAPI internally does something like:
from starlette.concurrency import run_in_threadpool
# When it encounters a def route:
result = await run_in_threadpool(your_sync_handler, **kwargs)
This means:
- Your function runs in a separate thread, so blocking I/O doesn't freeze the event loop.
- The event loop keeps handling other requests while your thread is blocked.
- You pay for it: thread pooling, context switching, GIL contention. None of it is free.
- The thread pool is capped. Under heavy concurrency, requests queue up waiting for a thread.
import requests
from sqlalchemy.orm import Session
# Good: def with sync libraries runs in a thread pool
# Blocking requests.get() is fine here, it blocks the thread, not the event loop
@app.get("/users/{id}")
def get_user_sync(id: int, db: Session = Depends(get_sync_db)):
    response = requests.get(f"https://api.example.com/users/{id}")
    user = db.query(User).filter(User.id == id).first()
    return {"user": user, "external": response.json()}
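The "pool is capped" point is worth seeing in numbers. The sketch below is an illustration with a plain ThreadPoolExecutor rather than FastAPI's own pool, and blocking_call and run_batch are made-up names. It pushes four 200 ms blocking calls through pools of different sizes.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(seconds: float) -> None:
    # Stand-in for a sync HTTP or DB call that holds its thread while waiting.
    time.sleep(seconds)


async def run_batch(n_calls: int, max_workers: int, seconds: float = 0.2) -> float:
    """Run n_calls blocking calls through a capped pool and return the elapsed time."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        start = time.perf_counter()
        await asyncio.gather(
            *(loop.run_in_executor(pool, blocking_call, seconds) for _ in range(n_calls))
        )
        return time.perf_counter() - start


if __name__ == "__main__":
    print(f"4 calls, 4 threads: {asyncio.run(run_batch(4, max_workers=4)):.2f}s")
    print(f"4 calls, 2 threads: {asyncio.run(run_batch(4, max_workers=2)):.2f}s")
```

With four threads every call gets a thread at once and the batch takes about one call's duration. With two threads the last two calls queue behind the first two and the batch takes about twice as long. The same queueing happens to def routes once concurrent requests outnumber available threads.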
The decision framework
Here's the decision tree for every route and dependency you write:
What does my function do?
│
├─► I/O-bound work (DB, HTTP, file, cache)
│   │
│   ├─► Async library available (httpx, asyncpg, redis.asyncio)?
│   │   └─► async def + await (best performance)
│   │
│   └─► Only sync library available (requests, psycopg2)?
│       └─► def (thread pool) (safe, not optimal)
│
├─► CPU-bound work (image processing, ML inference, heavy computation)
│   └─► async def + ProcessPoolExecutor
│       (thread pool won't help, Python GIL limits CPU parallelism)
│
└─► Fast/trivial computation (no I/O)
    └─► Either works. Prefer async def: it skips the thread pool hop.
import asyncio
import io
from concurrent.futures import ProcessPoolExecutor
from fastapi import UploadFile
from fastapi.responses import StreamingResponse
executor = ProcessPoolExecutor()
# CPU-bound: image resizing, ML inference, etc.
@app.post("/resize")
async def resize_image(file: UploadFile):
    image_bytes = await file.read()
    # Offload CPU work to a worker process: the event loop stays free and the
    # work does not compete for this process's GIL
    loop = asyncio.get_running_loop()
    # resize_sync (defined elsewhere) must be a picklable, module-level function
    resized = await loop.run_in_executor(executor, resize_sync, image_bytes)
    return StreamingResponse(io.BytesIO(resized), media_type="image/jpeg")
Real code comparisons
Requests vs httpx
import requests
import httpx
# SYNC: use inside def routes or run_in_executor
def fetch_user_sync(user_id: int) -> dict:
    response = requests.get(f"https://api.example.com/users/{user_id}")
    response.raise_for_status()
    return response.json()
# ASYNC: use inside async def routes
async def fetch_user_async(user_id: int) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        return response.json()
# Performance comparison under 100 concurrent requests
# (your numbers will vary with upstream latency and thread pool size):
# requests (in def, thread pool): ~2.1s total (thread overhead, GIL contention)
# httpx async (in async def): ~0.4s total (single thread, cooperative switching)
SQLAlchemy sync vs async
from fastapi import Depends
from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session, sessionmaker
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
# Sync engine, use with def routes
sync_engine = create_engine("postgresql://user:pass@localhost/db")
SyncSession = sessionmaker(bind=sync_engine)
def get_sync_db():
    db = SyncSession()
    try:
        yield db
    finally:
        db.close()
# Async engine, use with async def routes
async_engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(async_engine, class_=AsyncSession)
async def get_async_db():
    async with AsyncSessionLocal() as session:
        yield session
# Usage
@app.get("/orders/sync")
def list_orders_sync(db: Session = Depends(get_sync_db)):
    return db.execute(select(Order)).scalars().all()
@app.get("/orders/async")
async def list_orders_async(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()
asyncio.gather: concurrent vs sequential awaits
In async def, awaiting operations sequentially means waiting for each one to complete
before starting the next. asyncio.gather() runs them concurrently. All calls start
immediately, and you wait for the last one to finish.
import asyncio
import httpx
# Placeholder base URL (an assumption) so the relative paths below resolve
API_BASE = "https://api.example.com"
# Sequential: total time = sum of all call durations
@app.get("/dashboard/slow")
async def dashboard_slow(user_id: int):
    async with httpx.AsyncClient(base_url=API_BASE) as client:
        orders = await client.get(f"/api/orders/{user_id}")    # 80ms
        profile = await client.get(f"/api/profile/{user_id}")  # 60ms
        balance = await client.get(f"/api/balance/{user_id}")  # 50ms
    # Total: ~190ms, each call waits for the previous
    return {
        "orders": orders.json(),
        "profile": profile.json(),
        "balance": balance.json()
    }
# Concurrent: total time = max of all call durations
@app.get("/dashboard/fast")
async def dashboard_fast(user_id: int):
    async with httpx.AsyncClient(base_url=API_BASE) as client:
        # Not awaited yet: these are coroutine objects that gather() schedules together
        orders_task = client.get(f"/api/orders/{user_id}")
        profile_task = client.get(f"/api/profile/{user_id}")
        balance_task = client.get(f"/api/balance/{user_id}")
        orders, profile, balance = await asyncio.gather(
            orders_task, profile_task, balance_task
        )
    # Total: ~80ms, all calls are in flight at the same time
    return {
        "orders": orders.json(),
        "profile": profile.json(),
        "balance": balance.json()
    }
I reach for asyncio.gather() constantly on aggregation endpoints. Dashboard views that
pull from three services are the canonical case.
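The sum-versus-max arithmetic can be checked with a stdlib-only sketch. Here fake_call is a hypothetical stand-in for an HTTP call, and the durations mirror the 80/60/50 ms figures used above.

```python
import asyncio
import time

# Simulated latencies in seconds for the three upstream calls
DURATIONS = {"orders": 0.08, "profile": 0.06, "balance": 0.05}


async def fake_call(name: str) -> str:
    await asyncio.sleep(DURATIONS[name])  # stands in for an awaited HTTP request
    return name


async def sequential() -> float:
    start = time.perf_counter()
    for name in DURATIONS:
        await fake_call(name)  # each call starts only after the previous one ends
    return time.perf_counter() - start


async def concurrent() -> float:
    start = time.perf_counter()
    await asyncio.gather(*(fake_call(name) for name in DURATIONS))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {asyncio.run(sequential()) * 1000:.0f} ms")  # roughly the sum
    print(f"concurrent: {asyncio.run(concurrent()) * 1000:.0f} ms")  # roughly the max
```

The concurrent version is bounded by the slowest call, so trimming the 80 ms call helps and trimming the 50 ms call does not.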
run_in_executor: escaping blocking code
Sometimes you have a sync library you can't replace. A legacy SDK, a C extension, a vendor client
with no async version. run_in_executor lets you call it from an async def
route without freezing the event loop.
import asyncio
from concurrent.futures import ThreadPoolExecutor
from functools import partial
thread_pool = ThreadPoolExecutor(max_workers=20)
# Legacy sync function you can't change
def legacy_payment_process(amount: float, card_token: str) -> dict:
    return old_payment_sdk.charge(amount, card_token)  # Blocking
@app.post("/payment")
async def process_payment(payment: PaymentRequest):
    loop = asyncio.get_running_loop()
    # Run blocking function in thread pool, event loop stays free
    result = await loop.run_in_executor(
        thread_pool,
        legacy_payment_process,
        payment.amount,
        payment.card_token
    )
    return result
# run_in_executor forwards positional args only. Inside the handler, wrap the
# call in functools.partial when you need keyword arguments:
result = await loop.run_in_executor(
    thread_pool,
    partial(legacy_payment_process, payment.amount, card_token=payment.card_token)
)
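Thread pool size matters. A ThreadPoolExecutor created without max_workers defaults to
min(32, os.cpu_count() + 4) threads. For I/O-bound work (most cases), you can usually set it higher,
sized to the number of blocking calls you expect in flight. For CPU-bound work, use ProcessPoolExecutor instead.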
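On Python 3.9+, asyncio.to_thread is a shorter way to make the same hop when the loop's default executor is good enough. It forwards keyword arguments too, so no partial is needed. A minimal sketch, with legacy_charge as a made-up stand-in for a blocking SDK call:

```python
import asyncio
import threading
import time


def legacy_charge(amount: float, card_token: str, *, currency: str = "USD") -> dict:
    """Hypothetical blocking SDK call; time.sleep simulates network latency."""
    time.sleep(0.05)
    return {
        "amount": amount,
        "currency": currency,
        "thread": threading.current_thread().name,
    }


async def charge() -> dict:
    # Runs legacy_charge in the loop's default thread pool and awaits the result.
    return await asyncio.to_thread(legacy_charge, 10.0, "tok_test", currency="EUR")


if __name__ == "__main__":
    print(asyncio.run(charge()))
```

The trade-off is control: to_thread always uses the default executor, so reach for run_in_executor with your own ThreadPoolExecutor when you need a dedicated or differently sized pool.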
Background tasks: what they don't do
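Common misconception: background tasks run in parallel with the response. They don't. They run after the response is sent, inside the same server process. An async def task runs on the event loop, so blocking code in it freezes every request. A plain def task is pushed to the same thread pool that def routes use.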
from fastapi import BackgroundTasks
import asyncio
async def send_analytics(user_id: int, action: str):
    # Runs AFTER response sent, but on the same event loop
    await analytics_client.track(user_id, action)  # async, OK
    # time.sleep(5) here would freeze the event loop for 5 seconds, very bad
@app.post("/purchase")
async def purchase(item_id: int, bg: BackgroundTasks, user: User = Depends(get_current_user)):
    order = await create_order(user.id, item_id)
    bg.add_task(send_analytics, user.id, "purchase")
    return order  # Client gets response; send_analytics runs after
# For fire-and-forget tasks that need true parallelism:
# Use a task queue (Celery, ARQ, Dramatiq) instead of BackgroundTasks
Quiz
Q1. A def route calls requests.get(). Does this block the event loop?
No. FastAPI runs def routes in a thread pool executor. The blocking requests.get() call blocks the thread it's running in, not the event loop. The event loop is free to handle other async requests while this thread waits for the HTTP response.
The cost: a thread is occupied for the duration of the call. Under high concurrency, you can exhaust the thread pool. For moderate loads, def plus a sync library is perfectly fine.
Q2. You have an async def route that calls time.sleep(2). What happens to other requests during those 2 seconds?
They all freeze. time.sleep() is a blocking call. Inside async def, it runs on the event loop thread. For 2 seconds, the event loop is completely blocked. No other coroutines run, no other requests get handled.
Fix: use await asyncio.sleep(2) instead. This suspends the coroutine and yields control to the event loop, which can run other requests while waiting. If you need to sleep in a sync context, use def, which runs in a thread pool so the sleep only blocks that thread.
Q3. You need to call a CPU-heavy image compression function inside a FastAPI route. Should you use async def or def, and with which executor?
async def with ProcessPoolExecutor. CPU-bound work is bounded by Python's GIL. Even in a thread pool, only one thread executes Python bytecode at a time, so multiple threads don't parallelize CPU work.
A ProcessPoolExecutor spawns separate processes that sidestep the GIL, so you get true parallel CPU execution. Use await loop.run_in_executor(process_executor, compress_image, data) from an async def route to run CPU work in a process while the event loop stays free.
Plain def runs in a thread pool. It won't freeze the event loop, but it also won't give you CPU parallelism.
Q4. You use asyncio.gather() to fire 5 HTTP requests simultaneously from a route. One of them raises an exception. What happens to the others?
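By default, the first exception propagates immediately to the code awaiting gather(), but the other awaitables are not cancelled. They keep running in the background, and their results (or later exceptions) are never delivered to you. Siblings are only cancelled if the gather() call itself is cancelled. If you want fail-fast cancellation, asyncio.TaskGroup (Python 3.11+) does that.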
To handle errors per-task instead, use return_exceptions=True:
results = await asyncio.gather(
    task1(), task2(), task3(), task4(), task5(),
    return_exceptions=True  # Returns exceptions as values instead of raising
)
# results is a list, some may be exceptions, some may be actual values
for r in results:
    if isinstance(r, Exception):
        log.error(f"Task failed: {r}")
    else:
        process(r)
Use return_exceptions=True when you want best-effort behavior and want as many results back as possible, even if some calls fail.
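A small stdlib-only experiment makes gather's default failure behaviour observable. The names ok, boom, and demo are made up for this sketch, and asyncio.sleep stands in for the HTTP calls.

```python
import asyncio


async def ok(name: str, delay: float, finished: list[str]) -> str:
    await asyncio.sleep(delay)  # stands in for a slow upstream call
    finished.append(name)       # only reached if this coroutine was not cancelled
    return name


async def boom() -> None:
    await asyncio.sleep(0.01)
    raise ValueError("upstream failed")


async def demo() -> tuple[str, list[str]]:
    finished: list[str] = []
    error = ""
    try:
        await asyncio.gather(ok("a", 0.05, finished), boom(), ok("b", 0.05, finished))
    except ValueError as exc:
        error = str(exc)  # raised after ~10 ms, before "a" and "b" are done
    # Give the surviving awaitables time to finish, then see what ran to completion.
    await asyncio.sleep(0.1)
    return error, finished


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The ValueError reaches the caller first, yet both "a" and "b" still appear in finished afterwards. If those calls have side effects or cost money, keep that in mind before assuming a failed gather stopped everything.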
Lifecycle and async/sync are the two concepts behind most FastAPI production bugs I've seen. Next: Part 3: API design, pagination, filtering, and error handling at scale. Cursor vs offset pagination, Pydantic query param filtering, and enforcing one error shape across the entire API.