FastAPI 102: Async vs Sync — What Actually Happens


March 29, 2026 · 10 min read · Part 02 / 18

In Part 1 we traced the full path a request takes through FastAPI — ASGI server, middleware, dependency injection, handler. One of the key questions left open: should your handler be async def or def? Most developers pick one and stick with it. The actual answer depends on what your handler does — and getting it wrong either kills your throughput silently or adds unnecessary overhead. This is Part 2: what actually happens at the runtime level for each choice.

The event loop: one thread, cooperative multitasking

Uvicorn (the ASGI server most commonly used with FastAPI) runs each worker process on a single-threaded event loop — Python's asyncio event loop. "Single-threaded" doesn't mean it can only handle one request at a time — it means it handles all requests on one thread, switching between them cooperatively.

The key word is cooperatively. The event loop runs one coroutine at a time. A coroutine "yields" control back to the event loop when it hits an await expression. While it's waiting (for a network response, a DB query, a file read), the event loop runs other coroutines. When the awaited operation completes, the coroutine is resumed.

Event Loop Thread:

T=0ms   Handle request A: start processing
T=5ms   Request A hits "await db.query()" — yields to event loop
T=5ms   Handle request B: start processing
T=8ms   Request B hits "await http_client.get()" — yields
T=12ms  DB query for A completes → resume request A
T=15ms  HTTP call for B completes → resume request B
T=16ms  Request A sends response
T=18ms  Request B sends response

All of this happened on ONE thread.

This is why async is powerful for I/O-bound work. While waiting for the database, you're not burning a thread — the thread is handling other requests.
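That interleaving is reproducible with plain asyncio, no FastAPI required. A minimal sketch, with asyncio.sleep standing in for the awaited DB and HTTP calls:

```python
import asyncio

log = []

async def handler(name: str, io_delay: float):
    log.append(f"{name}: start")
    await asyncio.sleep(io_delay)  # stands in for `await db.query()`
    log.append(f"{name}: resumed")

async def main():
    # Both "requests" are in flight concurrently on one thread;
    # B resumes first because its "I/O" finishes sooner
    await asyncio.gather(handler("A", 0.02), handler("B", 0.01))

asyncio.run(main())
print(log)  # ['A: start', 'B: start', 'B: resumed', 'A: resumed']
```

A starts first, yields at its await, B starts and yields, then they resume in the order their waits complete — exactly the timeline above.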

What actually happens with async def

When you define a route with async def, FastAPI calls it as a coroutine directly on the event loop. No thread pool, no overhead — just the event loop running your code until it hits an await, at which point it yields and handles other work.

import httpx
from fastapi import FastAPI

app = FastAPI()

# ✅ Correct async def — awaits an async I/O operation
@app.get("/users/{id}")
async def get_user(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
        # While waiting for HTTP response, event loop handles other requests
    return response.json()

When await client.get(...) is called, the coroutine suspends, the event loop runs other coroutines, and when the HTTP response arrives, your coroutine resumes. No extra threads are consumed, and thousands of concurrent requests are possible on one CPU core.
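A minimal sketch (plain asyncio, no FastAPI) that verifies the claim: a thousand concurrent handlers, all recording which thread ran them:

```python
import asyncio
import threading

thread_ids = set()

async def handler(i: int) -> int:
    thread_ids.add(threading.get_ident())  # record which thread ran us
    await asyncio.sleep(0.05)              # stands in for awaited I/O
    return i

async def main():
    # 1,000 "requests" in flight at once
    return await asyncio.gather(*(handler(i) for i in range(1000)))

results = asyncio.run(main())
print(len(results), len(thread_ids))  # 1000 handlers, exactly 1 thread
```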

The silent killer: blocking inside async def

Here's where developers get burned. If you call a blocking function inside async def, you don't get an error — the code just runs. But blocking the event loop means nothing else can run until your blocking call returns.

import requests  # sync HTTP library

# ❌ WRONG — requests.get() is blocking. It blocks the entire event loop.
# ALL other concurrent requests wait while this network call is in progress.
@app.get("/users/{id}")
async def get_user_broken(id: int):
    response = requests.get(f"https://api.example.com/users/{id}")
    # While this takes 200ms, every other request is frozen
    return response.json()

# ✅ CORRECT — httpx.AsyncClient is async-native
@app.get("/users/{id}")
async def get_user_fixed(id: int):
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{id}")
    return response.json()
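The cost of the broken version is measurable with the stdlib alone. In this sketch, time.sleep plays the role of the blocking requests.get():

```python
import asyncio
import time

async def blocking_handler():
    time.sleep(0.2)  # ❌ synchronous sleep: holds the event loop hostage

async def well_behaved():
    await asyncio.sleep(0.05)  # ✅ yields to the event loop

async def main():
    start = time.perf_counter()
    # well_behaved needs only 50ms of wall time, but it can't even start
    # until blocking_handler's 200ms synchronous sleep releases the loop
    await asyncio.gather(blocking_handler(), well_behaved())
    return time.perf_counter() - start

elapsed = asyncio.run(main())
print(round(elapsed, 2))  # ~0.25s: the 50ms coroutine paid the 200ms tax
```

Swap time.sleep for await asyncio.sleep and the total drops to ~0.2s, because the two coroutines overlap.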

The same trap exists with database drivers. Calling a synchronous SQLAlchemy session inside async def blocks the event loop for every query.

from fastapi import Depends
from sqlalchemy import select
from sqlalchemy.orm import Session
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine

# ❌ Sync SQLAlchemy inside async def — blocks event loop
@app.get("/orders")
async def get_orders_broken(db: Session = Depends(get_sync_db)):
    return db.query(Order).all()  # Blocking DB call — freezes event loop

# ✅ Async SQLAlchemy — correct
@app.get("/orders")
async def get_orders_fixed(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()

What actually happens with def

When you define a route with def (no async), FastAPI does not call it on the event loop. Instead, it offloads it to a worker thread: under the hood, Starlette's run_in_threadpool, which delegates to anyio.to_thread.run_sync (a thread pool capped at 40 threads by default).

# FastAPI internally does something like:
import asyncio
from functools import partial

# When it encounters a def route:
result = await loop.run_in_executor(None, partial(your_sync_handler, **kwargs))

This means:

  • Your function runs in a separate thread — blocking I/O doesn't freeze the event loop
  • The event loop is free to handle other requests while your thread is blocked
  • But there's overhead: thread creation/pooling, context switching, GIL interaction
  • The thread pool has a limited size — under high concurrency, requests queue waiting for a thread

import requests
from sqlalchemy.orm import Session

# ✅ def with sync libraries — FastAPI runs this in a thread pool
# Blocking requests.get() is fine here — it blocks the thread, not the event loop
@app.get("/users/{id}")
def get_user_sync(id: int, db: Session = Depends(get_sync_db)):
    response = requests.get(f"https://api.example.com/users/{id}")
    user = db.query(User).filter(User.id == id).first()
    return {"user": user, "external": response.json()}
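FastAPI's offloading can be imitated with run_in_executor directly. A sketch of five blocking calls overlapping in the default thread pool, the way five simultaneous hits on a def route would:

```python
import asyncio
import time

def blocking_io() -> str:
    time.sleep(0.2)  # stands in for a blocking requests.get()
    return "done"

async def main():
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    # Five "def route" calls dispatched to the default thread pool at once;
    # each blocks its own thread while the event loop stays free
    results = await asyncio.gather(
        *(loop.run_in_executor(None, blocking_io) for _ in range(5))
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 1))  # five results in ~0.2s, not 1.0s
```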

The decision framework

Here's the decision tree for every route and dependency you write:

What does my function do?
    │
    ├─► I/O-bound work (DB, HTTP, file, cache)
    │       │
    │       ├─► Async library available (httpx, asyncpg, aioredis)?
    │       │       └─► async def + await ✅ (best performance)
    │       │
    │       └─► Only sync library available (requests, psycopg2)?
    │               └─► def (thread pool) ✅ (safe, not optimal)
    │
    ├─► CPU-bound work (image processing, ML inference, heavy computation)
    │       └─► def + ProcessPoolExecutor ✅
    │           (thread pool won't help — Python GIL limits CPU parallelism)
    │
    └─► Fast/trivial computation (no I/O)
            └─► Either works — async def if already in async context

import asyncio
import io
from concurrent.futures import ProcessPoolExecutor

from fastapi import UploadFile
from fastapi.responses import StreamingResponse

executor = ProcessPoolExecutor()

# CPU-bound: image resizing, ML inference, etc.
# resize_sync is your synchronous, CPU-heavy function (e.g. Pillow-based)
@app.post("/resize")
async def resize_image(file: UploadFile):
    image_bytes = await file.read()
    # Offload CPU work to process pool — won't block event loop or GIL
    loop = asyncio.get_running_loop()
    resized = await loop.run_in_executor(executor, resize_sync, image_bytes)
    return StreamingResponse(io.BytesIO(resized), media_type="image/jpeg")

Real code comparisons

Requests vs httpx

import requests
import httpx

# SYNC — use inside def routes or run_in_executor
def fetch_user_sync(user_id: int) -> dict:
    response = requests.get(f"https://api.example.com/users/{user_id}")
    response.raise_for_status()
    return response.json()

# ASYNC — use inside async def routes
async def fetch_user_async(user_id: int) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.example.com/users/{user_id}")
        response.raise_for_status()
        return response.json()

# Performance comparison under 100 concurrent requests:
# requests (in def, thread pool):   ~2.1s total (thread overhead, GIL contention)
# httpx async (in async def):       ~0.4s total (single thread, cooperative switching)

SQLAlchemy sync vs async

from fastapi import Depends
from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session, sessionmaker
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker

# Sync engine — use with def routes
sync_engine = create_engine("postgresql://user:pass@localhost/db")
SyncSession = sessionmaker(bind=sync_engine)

def get_sync_db():
    db = SyncSession()
    try:
        yield db
    finally:
        db.close()

# Async engine — use with async def routes
async_engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
AsyncSessionLocal = async_sessionmaker(async_engine, class_=AsyncSession)

async def get_async_db():
    async with AsyncSessionLocal() as session:
        yield session

# Usage
@app.get("/orders/sync")
def list_orders_sync(db: Session = Depends(get_sync_db)):
    return db.execute(select(Order)).scalars().all()

@app.get("/orders/async")
async def list_orders_async(db: AsyncSession = Depends(get_async_db)):
    result = await db.execute(select(Order))
    return result.scalars().all()

asyncio.gather: concurrent vs sequential awaits

In async def, awaiting operations sequentially means waiting for each one to complete before starting the next. asyncio.gather() runs them concurrently — all start immediately, you wait for all to finish.

import asyncio
import httpx

# ❌ Sequential — total time = sum of all call durations
@app.get("/dashboard/slow")
async def dashboard_slow(user_id: int):
    async with httpx.AsyncClient(base_url="https://api.example.com") as client:
        orders = await client.get(f"/api/orders/{user_id}")      # 80ms
        profile = await client.get(f"/api/profile/{user_id}")    # 60ms
        balance = await client.get(f"/api/balance/{user_id}")    # 50ms
    # Total: ~190ms — each call waits for the previous
    return {
        "orders": orders.json(),
        "profile": profile.json(),
        "balance": balance.json()
    }

# ✅ Concurrent — total time = max of all call durations
@app.get("/dashboard/fast")
async def dashboard_fast(user_id: int):
    async with httpx.AsyncClient(base_url="https://api.example.com") as client:
        orders_task = client.get(f"/api/orders/{user_id}")
        profile_task = client.get(f"/api/profile/{user_id}")
        balance_task = client.get(f"/api/balance/{user_id}")

        orders, profile, balance = await asyncio.gather(
            orders_task, profile_task, balance_task
        )
    # Total: ~80ms — all calls run in parallel
    return {
        "orders": orders.json(),
        "profile": profile.json(),
        "balance": balance.json()
    }

asyncio.gather() is one of the highest-use async patterns for APIs that aggregate data from multiple sources.
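The difference is easy to measure with asyncio.sleep standing in for the three HTTP calls; the delays below are illustrative:

```python
import asyncio
import time

async def fetch(delay: float) -> float:
    await asyncio.sleep(delay)  # stands in for an awaited HTTP call
    return delay

async def main():
    t0 = time.perf_counter()
    await fetch(0.08)  # sequential: each await finishes before the next starts
    await fetch(0.06)
    await fetch(0.05)
    sequential = time.perf_counter() - t0

    t0 = time.perf_counter()
    # concurrent: all three start immediately, wait for the slowest
    await asyncio.gather(fetch(0.08), fetch(0.06), fetch(0.05))
    concurrent = time.perf_counter() - t0
    return sequential, concurrent

sequential, concurrent = asyncio.run(main())
print(round(sequential, 2), round(concurrent, 2))  # ~0.19 (sum) vs ~0.08 (max)
```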

run_in_executor: escaping blocking code

Sometimes you have a sync library you can't replace — a legacy SDK, a C extension, a third-party client with no async version. run_in_executor lets you call it from an async def route without blocking the event loop.

import asyncio
from concurrent.futures import ThreadPoolExecutor

thread_pool = ThreadPoolExecutor(max_workers=20)

# Legacy sync function you can't change
def legacy_payment_process(amount: float, card_token: str) -> dict:
    return old_payment_sdk.charge(amount, card_token)  # Blocking

@app.post("/payment")
async def process_payment(payment: PaymentRequest):
    loop = asyncio.get_running_loop()

    # Run blocking function in thread pool — event loop stays free
    result = await loop.run_in_executor(
        thread_pool,
        legacy_payment_process,
        payment.amount,
        payment.card_token
    )
    return result

# Cleaner with functools.partial for multiple args (inside the route):
from functools import partial

result = await loop.run_in_executor(
    thread_pool,
    partial(legacy_payment_process, payment.amount, payment.card_token)
)

Thread pool size matters. Default is min(32, os.cpu_count() + 4). For I/O-bound work (most cases), you can set it higher. For CPU-bound, use ProcessPoolExecutor instead.
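A sketch of supplying a larger pool of your own; the pool size and the sum() workload here are illustrative, not a recommendation:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# For I/O-bound sync calls, a bigger pool than the default
# min(32, os.cpu_count() + 4) lets more blocking calls overlap
io_pool = ThreadPoolExecutor(max_workers=64)

async def main():
    loop = asyncio.get_running_loop()
    # Pass the executor explicitly; None would mean the loop's default pool
    return await loop.run_in_executor(io_pool, sum, range(100))

result = asyncio.run(main())
print(result)  # 4950
io_pool.shutdown()
```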

Background tasks: what they don't do

A common misconception: background tasks run in parallel with the response. They don't — they run after the response is sent. And they share the event loop.

from fastapi import BackgroundTasks
import asyncio

async def send_analytics(user_id: int, action: str):
    # Runs AFTER response sent — but on the same event loop
    await analytics_client.track(user_id, action)  # async — OK
    # time.sleep(5) here would freeze the event loop for 5 seconds — very bad

@app.post("/purchase")
async def purchase(item_id: int, bg: BackgroundTasks, user: User = Depends(get_current_user)):
    order = await create_order(user.id, item_id)
    bg.add_task(send_analytics, user.id, "purchase")
    return order  # Client gets response; send_analytics runs after

# For fire-and-forget tasks that need true parallelism:
# Use a task queue (Celery, ARQ, Dramatiq) instead of BackgroundTasks

Quiz

Q1. A def route calls requests.get(). Does this block the event loop?

No. FastAPI runs def routes in a thread pool executor. The blocking requests.get() call blocks the thread it's running in, not the event loop. The event loop is free to handle other async requests while this thread waits for the HTTP response.

The cost: a thread is occupied for the duration of the call. Under high concurrency, you can exhaust the thread pool. But for moderate loads, def + sync library is perfectly fine.

Q2. You have an async def route that calls time.sleep(2). What happens to other requests during those 2 seconds?

They all freeze. time.sleep() is a blocking call. Inside async def, it runs on the event loop thread. For 2 seconds, the event loop is completely blocked — no other coroutines can run, no other requests can be handled.

Fix: use await asyncio.sleep(2) instead. This suspends the coroutine and yields control to the event loop, which can run other requests while waiting. If you need to sleep in a sync context, use def (runs in thread pool, sleep only blocks that thread).

Q3. You need to call a CPU-heavy image compression function inside a FastAPI route. Should you use async def or def, and with which executor?

async def with ProcessPoolExecutor. CPU-bound work is bounded by Python's GIL — even in a thread pool, only one thread can execute Python bytecode at a time. Multiple threads don't parallelize CPU work.

A ProcessPoolExecutor spawns separate processes that bypass the GIL, enabling true parallel CPU execution. Use await loop.run_in_executor(process_executor, compress_image, data) from an async def route to run CPU work in a process while the event loop stays free.

Plain def runs in a thread pool — it would work (blocks a thread, not the event loop), but doesn't give you CPU parallelism.

Q4. You use asyncio.gather() to fire 5 HTTP requests simultaneously from a route. One of them raises an exception. What happens to the others?

By default, the first exception propagates immediately to the code awaiting asyncio.gather(). The other tasks are not cancelled; they keep running in the background, but their results are never collected.

To handle errors per-task instead, use return_exceptions=True:

results = await asyncio.gather(
    task1(), task2(), task3(), task4(), task5(),
    return_exceptions=True  # Returns exceptions as values instead of raising
)
# results is a list — some may be exceptions, some may be actual values
for r in results:
    if isinstance(r, Exception):
        log.error(f"Task failed: {r}")
    else:
        process(r)

Use return_exceptions=True when you want best-effort behavior — get as many results as possible even if some calls fail.

That covers the two biggest conceptual hurdles in FastAPI: the full request lifecycle (Part 1) and the async vs sync model (Part 2). Together they explain most of the production bugs and performance surprises you'll encounter. Next up — Part 3: API Design — Pagination, Filtering & Error Handling at Scale. Cursor vs offset pagination, Pydantic query param filtering, and enforcing one error shape across the entire API.
