The Shared State Trap: How a FastAPI 'Optimisation' Leaked User Data
We replaced Flask's request-scoped g with a plain module-level dict during
our FastAPI migration. It worked in tests. It worked in staging. In production, under
concurrent load, it silently served one tenant's data to a completely different user.
For three days. The first signal was a support ticket with a screenshot.
The rewrite nobody questioned
Four years into running a Flask 1.x reporting API, we decided to rewrite it in FastAPI. The pitch was sound. Native async for slow I/O, Pydantic validation, OpenAPI docs that might actually stay in sync with reality. Management approved. Engineering was excited. Two sprints later tests were green, slow endpoints were 40% faster in load tests, and we deployed on a Wednesday.
The migration looked like a success. Dashboards green. Seventy-two hours later, the support tickets started coming in.
Wrong data, zero errors
"I'm seeing reports that don't belong to my company."
The first ticket we dismissed as a frontend cache glitch. The second made us nervous. The third had a screenshot and confirmed a multi-tenant data leak. Users were getting valid, well-formed API responses with HTTP 200 status codes and data that belonged to a different organisation.
The logs were completely clean. No exceptions. No 500s. No suspicious query patterns. No latency spikes. Just a steady stream of healthy 200 responses containing the wrong org's data. From a monitoring perspective, there was nothing to alert on.
I spent the next afternoon adding deep instrumentation. The org_id from
the JWT at auth time. The org_id passed to each database query. The
org_id on every returned row. Deployed and waited. When the next incident
hit, the log line read:
auth.org_id=2041 → query.org_id=2041 → result.org_id=1038
Auth was correct. Query filter was correct. The data that came back wasn't. Which
meant either the database was lying (unlikely) or the org_id my logging
statement was reading wasn't the same org_id my query was reading. I sat
with that for a minute before I understood what it meant.
The "Optimisation" that broke everything
In the old Flask codebase, we used flask.g extensively. Flask's
request-scoped proxy for storing per-request data through the life of a request. It's
how we avoided threading context (org ID, user ID, request metadata) through every
function signature in the codebase. Convenient, idiomatic Flask, and ran fine for
four years.
During the FastAPI migration, one of the team replaced flask.g with what
seemed like an equivalent: a module-level dictionary. Cleaner, they thought. No Flask
import, more "Pythonic." I reviewed the PR. I didn't flag it either.
# Looked harmless. Was catastrophic.
import asyncio

_request_context: dict = {}  # one dict, shared by every concurrent request

def set_context(org_id: int, user_id: int) -> None:
    _request_context["org_id"] = org_id
    _request_context["user_id"] = user_id

def get_org_id() -> int:
    return _request_context.get("org_id")

# Used in the route handler:
@router.get("/reports/{report_id}")
async def get_report(
    report_id: int,
    token: TokenData = Depends(verify_token),
):
    set_context(token.org_id, token.user_id)  # Set context for this "request"
    await asyncio.sleep(0)                    # Yield to event loop (batching)
    report = await fetch_report(report_id)    # Calls get_org_id() internally
    return report
In Flask, this pattern is safe. Flask uses Werkzeug's LocalProxy backed by
threading.local(). With thread-per-request, each thread has its own
isolated copy of any thread-local variable. Flask's g is inherently scoped
to one request, one thread.
FastAPI is different. It runs on an async event loop. One OS thread handles thousands
of concurrent requests. That module-level _request_context dict is one
object in memory, shared across every concurrent coroutine. When two requests write to
the same keys, the last write wins and whoever reads next gets the wrong value.
How the corruption happens
To understand why this fails, you need to see how Python's async event loop interleaves
coroutines. When a coroutine hits an await, it yields control back to the
event loop, which picks up another coroutine. Cooperative scheduling is why async is
fast. It's also why shared mutable state is a trap.
BROKEN: Module-level dict, two concurrent requests
Time │ Request A (org=2041) Request B (org=1038)
─────┼──────────────────────────────────────────────────────
t1 │ set_context(org_id=2041)
│ _request_context = {"org_id": 2041}
t2 │ await asyncio.sleep(0) ──────► yields to event loop
t3 │ set_context(org_id=1038)
│ _request_context = {"org_id": 1038}
t4 │ await db.fetch(...) ──► yields
t5 │ ◄────────────────────────────── event loop resumes A
t6 │ get_org_id()
t7 │ returns 1038 ✗ ← B overwrote A's key!
t8 │ query: WHERE org_id = 1038
t9 │ → org 1038's data returned to org 2041's user
_request_context = {"org_id": 1038}
─────────────────
One shared dict. All requests.
Any await is a potential interleave point. Our handler set the context,
then immediately awaited something (a cache lookup, a DB call, sometimes just
asyncio.sleep(0) for batching). In that window, another request could
write to the same dict. When the first request resumed, it read the wrong org ID,
queried with the wrong filter, and returned the wrong tenant's data.
Under low load, the timing almost never aligned. Under production load with dozens of concurrent requests, it happened constantly. Responses were structurally valid (correct JSON shape, HTTP 200, real data) so no automated monitor caught anything. There was nothing to catch. From the system's perspective, everything was working.
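The interleaving above can be reproduced in a few lines. This is a hypothetical minimal sketch, not our production code: a module-level dict shared by two asyncio tasks, with `asyncio.sleep(0)` forcing the same yield point our handler had.

```python
import asyncio

# Shared module-level state, mimicking the broken set_context/get_org_id pair.
_request_context: dict = {}

async def handle_request(org_id: int) -> int:
    _request_context["org_id"] = org_id  # "set context" for this request
    await asyncio.sleep(0)               # yield: another task can run here
    return _request_context["org_id"]    # may now hold another task's value

async def main() -> list[int]:
    # Two concurrent "requests" for different tenants.
    return await asyncio.gather(handle_request(2041), handle_request(1038))

results = asyncio.run(main())
print(results)  # [1038, 1038]: the last write won, request A reads B's org
```

Both tasks report org 1038, because task B's write landed between task A's write and task A's read. That is the whole bug.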
The fix: contextvars
Python 3.7 introduced contextvars, which is built for exactly this problem.
A ContextVar is automatically scoped to the current async task (or OS
thread). Each coroutine gets its own isolated binding. It's the async-native equivalent
of thread-local storage, and it works correctly across await boundaries.
from contextvars import ContextVar
from typing import Optional

# Each async task gets its own isolated copy of these values.
# ContextVar is safe across await boundaries — no shared state.
_org_id_var: ContextVar[Optional[int]] = ContextVar("org_id", default=None)
_user_id_var: ContextVar[Optional[int]] = ContextVar("user_id", default=None)

def set_context(org_id: int, user_id: int) -> None:
    _org_id_var.set(org_id)
    _user_id_var.set(user_id)

def get_org_id() -> int:
    org_id = _org_id_var.get()
    if org_id is None:
        raise RuntimeError("org_id not set — is set_context() missing from this path?")
    return org_id

def get_user_id() -> int:
    user_id = _user_id_var.get()
    if user_id is None:
        raise RuntimeError("user_id not set — is set_context() missing from this path?")
    return user_id
When Request A calls _org_id_var.set(2041), Python's async runtime stores
that binding in A's execution context, which is a lightweight namespace the event loop
maintains per coroutine. When Request B calls _org_id_var.set(1038), it
writes to B's context. The two never touch.
FIXED: ContextVar, two concurrent requests
Time │ Request A (org=2041) Request B (org=1038)
─────┼──────────────────────────────────────────────────────
t1 │ _org_id_var.set(2041)
│ Context A: { _org_id_var → 2041 }
t2 │ await asyncio.sleep(0) ──────► yields to event loop
t3 │ _org_id_var.set(1038)
│ Context B: { _org_id_var → 1038 }
t4 │ await db.fetch(...) ──► yields
t5 │ ◄────────────────────────────── event loop resumes A
t6 │ _org_id_var.get()
t7 │ returns 2041 ✓ ← reads from A's own context
t8 │ query: WHERE org_id = 2041
t9 │ → org 2041's data returned to org 2041's user ✓
Context A: { _org_id_var: 2041 } ← isolated
Context B: { _org_id_var: 1038 } ← isolated
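The isolation is easy to verify. A minimal sketch, mirroring the broken reproduction but with a `ContextVar`: each task created by `asyncio.gather` runs in its own copied context, so concurrent writes never collide.

```python
import asyncio
from contextvars import ContextVar

# Each asyncio task gets its own binding of this variable.
_org_id_var: ContextVar[int] = ContextVar("org_id")

async def handle_request(org_id: int) -> int:
    _org_id_var.set(org_id)  # bound in this task's own context
    await asyncio.sleep(0)   # yield: other tasks run and set their own value
    return _org_id_var.get() # still this task's value

async def main() -> list[int]:
    return await asyncio.gather(handle_request(2041), handle_request(1038))

results = asyncio.run(main())
print(results)  # [2041, 1038]: each task reads its own binding
```

Same yield points, same interleaving, correct answers: each task reads the value it set.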
One import swap. One class change. That was the entire fix. The damage it caused took considerably longer to clean up.
An honest post-mortem
We ran a full audit of every affected request. Three days of logs, cross-referenced against support tickets and org ID mismatches in our access logs. We identified seventeen tenants who had received at least one response containing another tenant's data. We disclosed to every one of them individually, revoked the affected report exports, and filed a GDPR incident report.
The disclosure calls were some of the most uncomfortable conversations I've had with clients. The data involved wasn't especially sensitive (aggregated analytics, not financial records or PII), but that didn't really soften anything. Data isolation is a contract.
What we changed after
Beyond the immediate fix, we made three structural changes to prevent a recurrence:
- We deprecated the context helpers entirely on new endpoints. org_id and user_id are now injected via FastAPI's Depends() system as typed parameters. Every function that needs the org ID receives it explicitly. The data flow is visible in every signature instead of hidden in a global.
- We added cross-tenant isolation tests. Integration tests fire two concurrent requests for different orgs and assert each response contains only data belonging to the requesting org. They run in CI on every PR, took about three hours to write, and would have caught this bug in staging immediately.
- We added a custom Pylint rule that flags any mutable module-level dict or list inside services/. Module-level state is fine for config and constants, not for per-request data. The linter makes the distinction enforced instead of advisory.
The broader lesson
The mistake wasn't carelessness. The developer who introduced it was experienced. The pattern of storing request context in a "global" is completely normal in Flask, Django, and every other thread-per-request framework. It's how you avoid prop-drilling context through twenty function signatures. For four years it had worked fine.
The problem was translating a thread-safe pattern into an async context without understanding what made it thread-safe in the first place.
Flask's g isn't just a dict. It's backed by LocalProxy, which wraps threading.local(). The safety is invisible unless you've read the source. When we copied the pattern without copying the mechanism, we got all of the convenience and none of the isolation.
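The mechanism that made the Flask version safe can be shown in isolation. A minimal sketch of threading.local(): two threads write to the "same" attribute, and a barrier guarantees both writes happen before either read, yet neither thread sees the other's value.

```python
import threading

# threading.local() gives each OS thread an isolated namespace, which is
# why thread-per-request "globals" like Flask's g are safe.
_local = threading.local()
results: dict[str, int] = {}

def handle_request(name: str, org_id: int, barrier: threading.Barrier) -> None:
    _local.org_id = org_id  # write lands in this thread's own copy
    barrier.wait()          # both threads have written before either reads
    results[name] = _local.org_id

barrier = threading.Barrier(2)
t1 = threading.Thread(target=handle_request, args=("a", 2041, barrier))
t2 = threading.Thread(target=handle_request, args=("b", 1038, barrier))
t1.start(); t2.start(); t1.join(); t2.join()
print(results)  # {'a': 2041, 'b': 1038}: no cross-thread overwrite
```

Swap the execution model from one-thread-per-request to one-thread-for-everything and this guarantee evaporates, which is exactly what the migration did.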
When you migrate from sync to async, every piece of "ambient" state deserves a hard look. Thread-local storage, request-local proxies, singleton caches. They all behave differently when the execution model changes. What was safe in thread-per-request can become a data leak in async.
If you're running FastAPI and passing context through your call chain via anything
other than explicit parameters or ContextVar, go audit it today. Not
tomorrow. I'm serious. Silent data leaks wait for the right concurrency timing, and
then they show up in a support ticket with a screenshot.