
FastAPI 102: Async vs Sync, What Actually Happens

March 29, 2026 · 10 min read · Part 02 / 18

In Part 1 we traced the full path a request takes through FastAPI: ASGI server, middleware, dependency injection, handler. One question I kicked down the road: should your handler be async def or def? Most developers pick one and stick with it. The honest answer depends on what your handler does. Pick wrong and you either kill throughput silently or pay for thread pool overhead you didn't need. Part 2 is about what actually happens at the runtime level for each choice.

The event loop: one thread, cooperative multitasking

Uvicorn, the ASGI server most FastAPI deployments use, runs your app on a single-threaded event loop: Python's asyncio loop, or uvloop when it is installed. Single-threaded does not mean one request at a time. It means all requests share the same thread and switch between each other cooperatively.

The key word is cooperatively. The event loop runs one coroutine at a time. A coroutine "yields" control back to the event loop when it hits an await expression. While it's waiting (for a network response, a DB query, a file read), the event loop runs other coroutines. When the awaited operation completes, the coroutine is resumed.

Event Loop Thread:

T=0ms   Handle request A: start processing
T=5ms   Request A hits "await db.query()", yields to event loop
T=5ms   Handle request B: start processing
T=8ms   Request B hits "await http_client.get()", yields
T=12ms  DB query for A completes, resume request A
T=15ms  HTTP call for B completes, resume request B
T=16ms  Request A sends response
T=18ms  Request B sends response

All of this happened on ONE thread.

This is why async shines on I/O-bound work. While you wait for the database, you aren't burning a thread. The thread is handling other requests.
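To make the timeline concrete, here is a minimal stdlib-only sketch. The handle coroutine and its sleep durations are made up for illustration: asyncio.sleep stands in for the DB query and the HTTP call, and each "request" records the thread it ran on.

```python
import asyncio
import threading
import time


async def handle(name: str, io_seconds: float, log: list) -> None:
    # Record which OS thread is running this coroutine.
    log.append((name, "start", threading.get_ident()))
    # Simulated I/O wait: the coroutine yields to the event loop here.
    await asyncio.sleep(io_seconds)
    log.append((name, "done", threading.get_ident()))


async def main() -> tuple[list, float]:
    log: list = []
    started = time.perf_counter()
    # Both "requests" are in flight at once on the same event loop.
    await asyncio.gather(handle("A", 0.2, log), handle("B", 0.1, log))
    return log, time.perf_counter() - started


log, elapsed = asyncio.run(main())
for name, event, thread_id in log:
    print(name, event, thread_id)
print(f"elapsed: {elapsed:.2f}s")
```

B starts while A is still waiting and finishes first, and every entry carries the same thread id. The total is close to the longest wait rather than the sum of both.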

What actually happens with async def

When you define a route with async def, FastAPI calls it as a coroutine directly on the event loop. No thread pool, no executor hop. The event loop runs your code until it hits an await, at which point it yields and picks up other work.

import httpx
from fastapi import FastAPI

app = FastAPI()

# Good: async def that awaits an async I/O operation
@app.get("/users/{id}")
async def get_user(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
        # While waiting for HTTP response, event loop handles other requests
    return response.json()

When await client.get(...) is called, the coroutine suspends and the event loop runs other coroutines. When the HTTP response arrives, your coroutine resumes. No extra threads are involved, which is how one CPU core can keep thousands of requests in flight, as long as they spend most of their time waiting on I/O.

The silent killer: blocking inside async def

This is where people get burned. If you call a blocking function inside async def, you don't get an error. The code just runs. And while it runs, the entire event loop is frozen until your blocking call returns. I have shipped this bug. I will probably ship it again.

import requests  # sync HTTP library

# Wrong: requests.get() is blocking. It blocks the entire event loop.
# ALL other concurrent requests wait while this network call is in progress.
@app.get("/users/{id}")
async def get_user_broken(id: int):
    response = requests.get(f"https://api.example.com/users/{id}")
    # While this takes 200ms, every other request is frozen
    return response.json()

# Right: httpx.AsyncClient is async-native
@app.get("/users/{id}")
async def get_user_fixed(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
    return response.json()

Same trap with database drivers. A synchronous SQLAlchemy session inside async def blocks the event loop for every query, and you only notice under load.

from fastapi import Depends
from sqlalchemy import select
from sqlalchemy.orm import Session
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine

# Bad: sync SQLAlchemy inside async def blocks the event loop
@app.get("/orders")
async def get_orders_broken(db: Session = Depends(get_sync_db)):
    return db.query(Order).all()  # Blocking DB call, freezes event loop

# Good: async SQLAlchemy
@app.get("/orders")
async def get_orders_fixed(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()
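Since the bug is silent, it helps to see the freeze as a number. The sketch below is my own illustration, not something FastAPI provides: a hypothetical max_heartbeat_gap helper runs a 10 ms heartbeat coroutine next to either a blocking time.sleep or an awaited asyncio.sleep and reports the longest pause between ticks.

```python
import asyncio
import contextlib
import time


async def max_heartbeat_gap(blocking: bool) -> float:
    """Return the longest gap (seconds) between ticks of a 10 ms heartbeat."""
    gaps: list[float] = []

    async def heartbeat() -> None:
        last = time.perf_counter()
        while True:
            await asyncio.sleep(0.01)
            now = time.perf_counter()
            gaps.append(now - last)
            last = now

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0.05)        # let the heartbeat tick a few times
    if blocking:
        time.sleep(0.3)              # blocking call: nothing else can run
    else:
        await asyncio.sleep(0.3)     # cooperative wait: heartbeat keeps ticking
    await asyncio.sleep(0.05)
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return max(gaps)


blocked_gap = asyncio.run(max_heartbeat_gap(blocking=True))
free_gap = asyncio.run(max_heartbeat_gap(blocking=False))
print(f"blocking call:  longest gap {blocked_gap * 1000:.0f} ms")
print(f"awaited sleep:  longest gap {free_gap * 1000:.0f} ms")
```

With the blocking call, the longest gap is at least the full 300 ms. In a real service, every other request is stuck inside that gap. Running the loop in asyncio debug mode (for example asyncio.run(..., debug=True)) also logs callbacks that hold the loop for too long, which can help find these calls.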

What actually happens with def

When you define a route with plain def, FastAPI does not call it on the event loop. It hands the call to a worker thread through Starlette's run_in_threadpool, which wraps anyio.to_thread.run_sync. The worker threads come from AnyIO's pool, limited to 40 by default, not from the event loop's default ThreadPoolExecutor.

# FastAPI internally does something like (simplified):
from starlette.concurrency import run_in_threadpool

# When it encounters a def route:
result = await run_in_threadpool(your_sync_handler, **kwargs)

This means:

Your function runs in a separate thread, so blocking I/O doesn't freeze the event loop.
The event loop keeps handling other requests while your thread is blocked.
You pay for it: thread pooling, context switching, GIL contention. None of it is free.
The thread pool is capped (40 threads by default, enforced by AnyIO's limiter). Under heavy concurrency, requests queue up waiting for a thread.

import requests
from sqlalchemy.orm import Session

# Good: def with sync libraries runs in a thread pool
# Blocking requests.get() is fine here, it blocks the thread, not the event loop
@app.get("/users/{id}")
def get_user_sync(id: int, db: Session = Depends(get_sync_db)):
    response = requests.get(f"https://api.example.com/users/{id}")
    user = db.query(User).filter(User.id == id).first()
    return {"user": user, "external": response.json()}
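The queueing behind a capped pool is easy to reproduce with a deliberately tiny one. This is a stdlib-only sketch: the ThreadPoolExecutor here is a stand-in I chose for illustration, not FastAPI's own pool, and blocking_call and run_batch are hypothetical names.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(seconds: float) -> float:
    time.sleep(seconds)  # stands in for requests.get() or a sync DB query
    return seconds


async def run_batch(pool_size: int, calls: int, seconds: float) -> float:
    """Push `calls` blocking calls through `pool_size` threads; return elapsed seconds."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        started = time.perf_counter()
        await asyncio.gather(
            *(loop.run_in_executor(pool, blocking_call, seconds) for _ in range(calls))
        )
        return time.perf_counter() - started


# Four 200 ms calls: two threads need two waves, four threads need one.
small_pool = asyncio.run(run_batch(pool_size=2, calls=4, seconds=0.2))
big_pool = asyncio.run(run_batch(pool_size=4, calls=4, seconds=0.2))
print(f"2 threads: {small_pool:.2f}s, 4 threads: {big_pool:.2f}s")
```

Once every thread is busy, the remaining calls wait in line. The same thing happens to def routes when more requests arrive than the pool has threads.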

The decision framework

Here's the decision tree for every route and dependency you write:

What does my function do?
    │
    ├─► I/O-bound work (DB, HTTP, file, cache)
    │       │
    │       ├─► Async library available (httpx, asyncpg, redis.asyncio)?
    │       │       └─► async def + await (best performance)
    │       │
    │       └─► Only sync library available (requests, psycopg2)?
    │               └─► def (thread pool) (safe, not optimal)
    │
    ├─► CPU-bound work (image processing, ML inference, heavy computation)
    │       └─► async def + ProcessPoolExecutor
    │           (thread pool won't help, Python GIL limits CPU parallelism)
    │
    └─► Fast/trivial computation (no I/O)
            └─► Either works. async def if you're already in an async context.

import asyncio
import io
from concurrent.futures import ProcessPoolExecutor

from fastapi import UploadFile
from fastapi.responses import StreamingResponse

executor = ProcessPoolExecutor()

# CPU-bound: image resizing, ML inference, etc.
# resize_sync is your own blocking, picklable function (bytes in, bytes out).
@app.post("/resize")
async def resize_image(file: UploadFile):
    image_bytes = await file.read()
    # Offload CPU work to a worker process: the event loop stays free,
    # and the work isn't serialized behind this process's GIL
    loop = asyncio.get_running_loop()
    resized = await loop.run_in_executor(executor, resize_sync, image_bytes)
    return StreamingResponse(io.BytesIO(resized), media_type="image/jpeg")

Real code comparisons

Requests vs httpx

import requests
import httpx

# SYNC: use inside def routes or run_in_executor
def fetch_user_sync(user_id: int) -> dict:
    response = requests.get(f"https://api.example.com/users/{user_id}")
    response.raise_for_status()
    return response.json()

# ASYNC: use inside async def routes
async def fetch_user_async(user_id: int) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        return response.json()

# Performance comparison under 100 concurrent requests:
# requests (in def, thread pool):   ~2.1s total (thread overhead, GIL contention)
# httpx async (in async def):       ~0.4s total (single thread, cooperative switching)

SQLAlchemy sync vs async

from fastapi import Depends
from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session, sessionmaker
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker

# Sync engine, use with def routes
sync_engine = create_engine("postgresql://user:pass@localhost/db")
SyncSession = sessionmaker(bind=sync_engine)

def get_sync_db():
    db = SyncSession()
    try:
        yield db
    finally:
        db.close()

# Async engine, use with async def routes
async_engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(async_engine, class_=AsyncSession)

async def get_async_db():
    async with AsyncSessionLocal() as session:
        yield session

# Usage
@app.get("/orders/sync")
def list_orders_sync(db: Session = Depends(get_sync_db)):
    return db.execute(select(Order)).scalars().all()

@app.get("/orders/async")
async def list_orders_async(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()

asyncio.gather: concurrent vs sequential awaits

In async def, awaiting operations sequentially means waiting for each one to complete before starting the next. asyncio.gather() runs them concurrently. All calls start immediately, and you wait for the last one to finish.

import asyncio
import httpx
import time

# base_url is a placeholder: httpx rejects relative URLs unless the client has one
API_BASE = "https://api.example.com"

# Sequential: total time = sum of all call durations
@app.get("/dashboard/slow")
async def dashboard_slow(user_id: int):
    async with httpx.AsyncClient(base_url=API_BASE) as client:
        orders = await client.get(f"/api/orders/{user_id}")      # 80ms
        profile = await client.get(f"/api/profile/{user_id}")    # 60ms
        balance = await client.get(f"/api/balance/{user_id}")    # 50ms
    # Total: ~190ms, each call waits for the previous
    return {
        "orders": orders.json(),
        "profile": profile.json(),
        "balance": balance.json()
    }

# Concurrent: total time = max of all call durations
@app.get("/dashboard/fast")
async def dashboard_fast(user_id: int):
    async with httpx.AsyncClient(base_url=API_BASE) as client:
        # client.get(...) only builds a coroutine; gather schedules all three at once
        orders, profile, balance = await asyncio.gather(
            client.get(f"/api/orders/{user_id}"),
            client.get(f"/api/profile/{user_id}"),
            client.get(f"/api/balance/{user_id}"),
        )
    # Total: ~80ms, all three calls are in flight at the same time
    return {
        "orders": orders.json(),
        "profile": profile.json(),
        "balance": balance.json()
    }

I reach for asyncio.gather() constantly on aggregation endpoints. Dashboard views that pull from three services are the canonical case.
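If you want to check the sum-versus-max claim without three real services, a rough stand-in works. In this sketch fake_call is a hypothetical helper that sleeps for the quoted latency instead of making an HTTP request.

```python
import asyncio
import time


async def fake_call(name: str, seconds: float) -> dict:
    # Simulated network latency instead of a real HTTP request.
    await asyncio.sleep(seconds)
    return {"source": name}


async def sequential() -> float:
    started = time.perf_counter()
    await fake_call("orders", 0.08)
    await fake_call("profile", 0.06)
    await fake_call("balance", 0.05)
    return time.perf_counter() - started


async def concurrent() -> tuple[list, float]:
    started = time.perf_counter()
    results = await asyncio.gather(
        fake_call("orders", 0.08),
        fake_call("profile", 0.06),
        fake_call("balance", 0.05),
    )
    return results, time.perf_counter() - started


sequential_time = asyncio.run(sequential())
results, concurrent_time = asyncio.run(concurrent())
print(f"sequential: {sequential_time * 1000:.0f} ms")
print(f"concurrent: {concurrent_time * 1000:.0f} ms")
print([r["source"] for r in results])
```

gather returns results in the order the awaitables were passed in, not the order they finished, so unpacking into orders, profile, balance is safe.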

run_in_executor: escaping blocking code

Sometimes you have a sync library you can't replace. A legacy SDK, a C extension, a vendor client with no async version. run_in_executor lets you call it from an async def route without freezing the event loop.

import asyncio
from concurrent.futures import ThreadPoolExecutor

thread_pool = ThreadPoolExecutor(max_workers=20)

# Legacy sync function you can't change
def legacy_payment_process(amount: float, card_token: str) -> dict:
    return old_payment_sdk.charge(amount, card_token)  # Blocking

@app.post("/payment")
async def process_payment(payment: PaymentRequest):
    loop = asyncio.get_running_loop()

    # Run blocking function in thread pool, event loop stays free
    result = await loop.run_in_executor(
        thread_pool,
        legacy_payment_process,
        payment.amount,
        payment.card_token
    )
    return result

# run_in_executor forwards positional args only.
# For keyword arguments, wrap the call in functools.partial (inside the handler):
from functools import partial

result = await loop.run_in_executor(
    thread_pool,
    partial(legacy_payment_process, amount=payment.amount, card_token=payment.card_token)
)

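Thread pool size matters. A ThreadPoolExecutor created without max_workers defaults to min(32, os.cpu_count() + 4) threads. That is a separate pool from the one FastAPI uses for def routes, which belongs to AnyIO and is limited to 40 threads by default. For I/O-bound work (most cases), you can set the size higher. For CPU-bound work, use ProcessPoolExecutor instead.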
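If you don't need a dedicated pool, asyncio.to_thread (Python 3.9+) is a shorter route to the same place: it runs the function in the loop's default executor and forwards keyword arguments as well. A minimal sketch, where legacy_lookup is a made-up blocking function standing in for the SDK call:

```python
import asyncio
import threading
import time


def legacy_lookup(key: str, *, uppercase: bool = False) -> tuple[str, int]:
    """Hypothetical blocking function; returns a value and the thread it ran on."""
    time.sleep(0.05)  # simulated blocking I/O
    value = f"value-for-{key}"
    return (value.upper() if uppercase else value), threading.get_ident()


async def main() -> tuple[str, bool]:
    loop_thread = threading.get_ident()
    # Positional and keyword arguments are passed straight through.
    value, worker_thread = await asyncio.to_thread(legacy_lookup, "abc", uppercase=True)
    return value, worker_thread != loop_thread


value, ran_off_loop = asyncio.run(main())
print(value, ran_off_loop)
```

The trade-off is control: to_thread shares the default executor with everything else that uses it, while an explicit ThreadPoolExecutor lets you size and isolate the pool for one slow dependency.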

Background tasks: what they don't do

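Common misconception: background tasks run in parallel with the response. They don't. They run after the response is sent. An async def task runs on the event loop, so a blocking call inside it freezes every request. A plain def task is pushed to the same thread pool that def routes use.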

from fastapi import BackgroundTasks
import asyncio

async def send_analytics(user_id: int, action: str):
    # Runs AFTER response sent, but on the same event loop
    await analytics_client.track(user_id, action)  # async, OK
    # time.sleep(5) here would freeze the event loop for 5 seconds, very bad

@app.post("/purchase")
async def purchase(item_id: int, bg: BackgroundTasks, user: User = Depends(get_current_user)):
    order = await create_order(user.id, item_id)
    bg.add_task(send_analytics, user.id, "purchase")
    return order  # Client gets response; send_analytics runs after

# For fire-and-forget tasks that need true parallelism:
# Use a task queue (Celery, ARQ, Dramatiq) instead of BackgroundTasks

Quiz

Q1. A def route calls requests.get(). Does this block the event loop?

No. FastAPI runs def routes in a thread pool executor. The blocking requests.get() call blocks the thread it's running in, not the event loop. The event loop is free to handle other async requests while this thread waits for the HTTP response.

The cost: a thread is occupied for the duration of the call. Under high concurrency, you can exhaust the thread pool. For moderate loads, def plus a sync library is perfectly fine.

Q2. You have an async def route that calls time.sleep(2). What happens to other requests during those 2 seconds?

They all freeze. time.sleep() is a blocking call. Inside async def, it runs on the event loop thread. For 2 seconds, the event loop is completely blocked. No other coroutines run, no other requests get handled.

Fix: use await asyncio.sleep(2) instead. This suspends the coroutine and yields control to the event loop, which can run other requests while waiting. If you need to sleep in a sync context, use def, which runs in a thread pool so the sleep only blocks that thread.

Q3. You need to call a CPU-heavy image compression function inside a FastAPI route. Should you use async def or def, and with which executor?

async def with ProcessPoolExecutor. CPU-bound work is bounded by Python's GIL. Even in a thread pool, only one thread executes Python bytecode at a time, so multiple threads don't parallelize CPU work.

A ProcessPoolExecutor spawns separate processes that sidestep the GIL, so you get true parallel CPU execution. Use await loop.run_in_executor(process_executor, compress_image, data) from an async def route to run CPU work in a process while the event loop stays free.

Plain def runs in a thread pool. It won't freeze the event loop, but it also won't give you CPU parallelism.

Q4. You use asyncio.gather() to fire 5 HTTP requests simultaneously from a route. One of them raises an exception. What happens to the others?

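By default, the first exception propagates immediately to the code awaiting gather(), but the other awaitables are not cancelled. They keep running in the background, and their results (or later exceptions) are never delivered to you. If you want the siblings cancelled on the first failure, use asyncio.TaskGroup (Python 3.11+) instead.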

To handle errors per-task instead, use return_exceptions=True:

results = await asyncio.gather(
    task1(), task2(), task3(), task4(), task5(),
    return_exceptions=True  # Returns exceptions as values instead of raising
)
# results is a list, some may be exceptions, some may be actual values
for r in results:
    if isinstance(r, Exception):
        log.error(f"Task failed: {r}")
    else:
        process(r)

Use return_exceptions=True when you want best-effort behavior and want as many results back as possible, even if some calls fail.
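A small stdlib-only experiment makes it easy to check what gather does with the siblings when one awaitable fails. The coroutine names are invented for the sketch: fails_fast raises after 10 ms, and slow_sibling records that it finished.

```python
import asyncio


async def fails_fast() -> None:
    await asyncio.sleep(0.01)
    raise ValueError("boom")


async def slow_sibling(finished: list) -> str:
    await asyncio.sleep(0.05)
    finished.append("sibling")
    return "ok"


async def default_gather() -> tuple[str, list]:
    finished: list = []
    try:
        await asyncio.gather(fails_fast(), slow_sibling(finished))
        outcome = "no error"
    except ValueError as exc:
        outcome = f"raised {exc}"
    # Wait a little so a still-running sibling gets the chance to finish.
    await asyncio.sleep(0.1)
    return outcome, finished


async def collecting_gather() -> list:
    return await asyncio.gather(
        fails_fast(), slow_sibling([]), return_exceptions=True
    )


outcome, finished = asyncio.run(default_gather())
results = asyncio.run(collecting_gather())
print(outcome, finished)
print(results)
```

In the default mode the ValueError reaches the caller after about 10 ms, yet the sibling still runs to completion afterwards. With return_exceptions=True the same failure comes back as a value in the first slot of the list, next to the sibling's result.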

Lifecycle and async/sync are the two concepts behind most FastAPI production bugs I've seen. Next: Part 3: API design, pagination, filtering, and error handling at scale. Cursor vs offset pagination, Pydantic query param filtering, and enforcing one error shape across the entire API.
