The CodePush Update That Silently Bricked 40,000 React Native Users for 72 Hours
Tuesday, 11:47 PM. I'm about to close my laptop when Sentry fires. One error, isolated, from a single device. I dismiss it as a flaky emulator and pour another coffee.
By 12:10 AM, that single error has cascaded into 8,000 concurrent crashes across iOS and Android. By 2 AM, we have 40,000 users staring at a white screen. The React Native app, fully functional 90 minutes before that first alert, is completely dead. And we hadn't deployed a single line of native code.
Production failure
We were 14 months into a React Native rewrite of a B2C platform that had grown to 140,000 monthly active users. The CD pipeline was something we were proud of: push to main, CI runs tests, build uploads to CodePush, bundle rolls out to users silently in the background. No App Store review delay, no forced update prompts, no deployment windows. Ship code like a web app.
The deployment that killed us looked completely routine. A PR merged at 10:15 PM: upgrade of react-native-camera from v3.44 to v4.0, plus a new QR-code scanning feature for the in-store loyalty program. All 214 tests passed. Build succeeded. CodePush deployment triggered automatically. Bundle was live.
At 11:47 PM, the first Sentry event arrived:
TypeError: null is not an object (evaluating 'RNCameraModule.getConstants')
at node_modules/react-native-camera/src/RNCamera.js:38:44
at RNCamera.componentDidMount (src/screens/ScanScreen.tsx:22:12)
Sessions affected: 1
Device: iPhone 14 Pro (iOS 17.2)
App version: 3.8.1 (binary) + CodePush bundle cd7a3f2
By the time I actually read that error at 12:10 AM, Sentry showed 8,000 affected sessions. CodePush had been silently delivering the bundle in the background to every app that launched during the previous two hours. Update sync was fast, under 3 seconds per device. We had achieved perfect delivery at scale, of the wrong code.
False assumptions
Our entire mobile release philosophy rested on a few beliefs that were each technically true and collectively catastrophic.
"CodePush updates are safe because they're just JavaScript." True, in the sense that a JS bundle can't change native modules. But it can absolutely call native module APIs that don't exist in the binary installed on the user's device.
"We can roll back in seconds." CodePush does have a rollback command. Rollback deploys a new bundle, which still needs to be downloaded, applied, and the app restarted. For a user whose device crashes on launch, the app never runs long enough to check for updates. No check, no rollback.
"Our tests cover the critical paths." They covered business logic. Not one test booted an actual React Native runtime and verified that native module bindings resolved correctly.
The missing assumption was simpler: CodePush replaces JavaScript, not native modules. If the new JS bundle calls a native API the installed binary doesn't expose, you have a crash on every launch with no in-app mechanism to recover automatically.
Investigation: what actually happened at the native layer
react-native-camera v4.0 shipped a breaking change: it reorganized its native module exports. The v3 native code registered its module as RNCamera and exposed constants through getConstants() on that name. The v4 JavaScript layer looks the module up as RNCameraModule and reads a restructured constants API.
Our CodePush bundle was built against the v4 npm package. Our App Store binary was built against v3 native code, the version from 6 weeks ago, when we last submitted a native release. The v4 JavaScript called RNCameraModule.getConstants() across the bridge, but the v3 binary registered no module under that name, so the lookup returned null. Every component that imported the camera module crashed on mount, and because our lazy loading was poor, the camera import sat in the dependency graph of the App.tsx root component. Every launch, every user, immediate white screen.
THE MISMATCH: Binary vs Bundle Native Module API
─────────────────────────────────────────────────────────────────────
App Store Binary (installed on 140k devices)
┌────────────────────────────────────────────────────────┐
│ react-native-camera v3 NATIVE MODULE │
│ │
│ Registered as: "RNCamera" │
│ Methods: takePicture(), recordVideo() │
│ Constants: via getConstants() on "RNCamera" │
└────────────────────────────────────────────────────────┘
↑ bridge ↑
┌────────────────────────────────────────────────────────┐
│ CodePush Bundle (deployed 10:17 PM) │
│ react-native-camera v4 JS LAYER │
│ │
│ Calls: RNCameraModule.getConstants() ← WRONG NAME │
│ ↳ resolves to: null │
│ ↳ null.getConstants() → TypeError │
│ ↳ React error boundary catches it │
│ ↳ Root boundary → white screen │
└────────────────────────────────────────────────────────┘
RNCameraModule does not exist in the v3 binary.
Error propagates to root. App shows nothing. Every. Single. Launch.
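The mismatch in the diagram can be reproduced in plain JavaScript. The sketch below is illustrative only: nativeModulesInV3Binary is a hypothetical stand-in for React Native's NativeModules object, and getNativeModuleOrNull is a helper invented for this example, not part of react-native-camera or React Native. It shows why the unguarded lookup throws and how a guarded lookup could turn the same condition into a disabled feature instead of a dead root component.

```javascript
// Hypothetical stand-in for NativeModules as exposed by the v3 binary in the
// diagram: "RNCamera" is registered, "RNCameraModule" is not.
const nativeModulesInV3Binary = {
  RNCamera: { getConstants: () => ({ Type: { back: 0, front: 1 } }) },
};

// What the v4 JS layer effectively did: look up a name and call it blindly.
function unguardedLookup(nativeModules) {
  return nativeModules.RNCameraModule.getConstants();
}

// Guarded lookup (hypothetical helper): returns null instead of throwing, so
// the caller can hide the feature rather than take down the root component.
function getNativeModuleOrNull(nativeModules, name, requiredMethods = []) {
  const mod = nativeModules[name];
  if (mod == null) return null;
  const complete = requiredMethods.every((m) => typeof mod[m] === 'function');
  return complete ? mod : null;
}

let crashMessage = null;
try {
  unguardedLookup(nativeModulesInV3Binary);
} catch (err) {
  crashMessage = err.message; // a TypeError; exact wording varies by JS engine
}

const camera = getNativeModuleOrNull(
  nativeModulesInV3Binary,
  'RNCameraModule',
  ['getConstants']
);
const scanFeatureEnabled = camera !== null;
console.log({ crashed: crashMessage !== null, scanFeatureEnabled });
```

A guard like this does not make an incompatible bundle correct. It only narrows the damage from "app will not launch" to "one feature is unavailable", which keeps the update check reachable.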
The kicker: CI ran tests in a Jest environment that mocks all native modules. jest.mock('react-native-camera') meant every test passed with a perfectly functional mock, completely disconnected from the binary that would actually run on devices.
Root cause: the rollback that couldn't
At 12:20 AM, my first instinct was right: roll back. I ran:
# Roll back to the previous bundle
appcenter codepush rollback -a MyOrg/MyApp-iOS Production
appcenter codepush rollback -a MyOrg/MyApp-Android Production
CodePush confirmed the rollback. Nothing recovered. Here's why: CodePush rollback works by pushing a new deployment pointing at the previous bundle hash. The app must launch, reach the CodePush sync() call in App.tsx, download the rollback bundle, and restart. Our app crashed before App.tsx finished mounting. The sync() call never ran. The rollback bundle was never fetched.
For users who hadn't opened the app yet, the rollback worked perfectly: they received the previous bundle and never saw the bad one. The 40,000 users who had already cached the broken bundle on-device were stuck. Their only escape was one of:
- Delete and reinstall the app (losing local data)
- Wait for us to release a new App Store binary (3 to 5 day review minimum)
- Or, if we got lucky, use CodePush's rollbackOnError auto-revert
We had not enabled rollbackOnError. We hadn't thought we'd need it.
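The underlying problem is ordering: the update check lived behind code that could fail. A minimal sketch of one alternative follows, assuming a launch sequence where the update check starts before any feature code loads. Every name here (bootstrap, checkForUpdate, loadApp, showFallback) is hypothetical and injected so the example runs under plain Node; in a real React Native app the update check would be the CodePush sync call, and a mount-time failure would surface through an error boundary rather than a try/catch.

```javascript
// Hypothetical launch sequence: start the update check first, then load the
// app, and keep the update check alive even if loading the app fails.
async function bootstrap({ checkForUpdate, loadApp, showFallback }) {
  // 1. Kick off the update check before touching any feature code.
  const updateCheck = checkForUpdate().catch(() => 'update-check-failed');

  // 2. Load the app lazily so a failure is caught here, not at module scope.
  try {
    loadApp();
    return { screen: 'app', update: await updateCheck };
  } catch (err) {
    // 3. The app is broken, but the update check still runs to completion.
    showFallback(err);
    return { screen: 'fallback', update: await updateCheck };
  }
}

// Usage with stand-ins that simulate the incident: the app fails to load,
// yet the rollback bundle is still fetched.
const events = [];
const pending = bootstrap({
  checkForUpdate: async () => {
    events.push('sync');
    return 'rollback-downloaded';
  },
  loadApp: () => {
    throw new TypeError('null is not an object');
  },
  showFallback: (err) => events.push(`fallback: ${err.message}`),
});
pending.then((result) => console.log(result, events));
```

The design choice is that nothing the bundle ships can sit between process start and the update check. Whether that is achievable depends on how the app's entry file is structured.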
The actual fix path took 72 hours:
INCIDENT TIMELINE
─────────────────────────────────────────────────────────────────────
10:17 PM CodePush bundle deployed (cd7a3f2)
10:17 PM Bundle begins propagating to active devices
11:47 PM First Sentry crash reported (1 device)
12:10 AM 8,000 concurrent crashes — incident declared
12:20 AM CodePush rollback issued — new bundle deploying
12:35 AM Realization: rollback doesn't help crashed users
12:50 AM Emergency App Store submission prepared (reverted binary)
01:15 AM iOS expedited review requested (Apple)
01:30 AM Android emergency release submitted (Google Play)
02:00 AM Peak impact: 40,127 users on broken bundle
+6h Workaround bundle: disable camera import at root level
→ deploy via CodePush to users who CAN still reach sync()
→ ~31,000 users recover (app launches, shows limited UI)
+18h Apple expedited review approved — binary live
+29h Google Play review complete — binary live
+72h Last affected devices clear broken bundle cache
(force-close + reopen triggers fresh CodePush check)
─────────────────────────────────────────────────────────────────
Total hours of degraded experience: 72
Users who had to reinstall: ~1,200 (never opened app to get fix)
Support tickets filed: 4,300+
Refunds issued: 214 (users who couldn't complete purchases)
The fix: bundle validation before it ships
We rebuilt the CodePush pipeline with a validation gate that runs before any bundle reaches production. You can test native module resolution without a physical device: build the actual JS bundle (not Jest mocks) and check every native module it references against a manifest of the modules the shipped binary registers.
/**
 * Native module binding validator.
 * Builds the JS bundle, then checks that every native module the bundle
 * references is registered by the LAST SHIPPED binary, as recorded in
 * native-module-manifest.json.
 */
const { execSync } = require('child_process');
const path = require('path');
const fs = require('fs');

const MANIFEST_PATH = path.join(__dirname, '../native-module-manifest.json');
const BUNDLE_OUTPUT = '/tmp/rn-validate-bundle.js';

// Step 1: Build the actual Metro bundle (not jest, not mocks)
console.log('Building Metro bundle for validation...');
execSync(
  `npx react-native bundle --platform ios --dev false --entry-file index.js --bundle-output ${BUNDLE_OUTPUT} --assets-dest /tmp/rn-validate-assets`,
  { stdio: 'inherit' }
);

// Step 2: Extract native module references from the bundle.
// Matches both NativeModules.Foo and NativeModules["Foo"]; destructured or
// aliased access is not detected by this simple scan.
const bundle = fs.readFileSync(BUNDLE_OUTPUT, 'utf8');
const NATIVE_MODULE_REF =
  /NativeModules(?:\.([A-Za-z_$][\w$]*)|\[["']([^"']+)["']\])/g;
const nativeModuleRefs = [
  ...new Set([...bundle.matchAll(NATIVE_MODULE_REF)].map(m => m[1] ?? m[2])),
];

// Step 3: Compare against last-known-good binary manifest
const manifest = JSON.parse(fs.readFileSync(MANIFEST_PATH, 'utf8'));
const missing = nativeModuleRefs.filter(m => !manifest.modules.includes(m));

if (missing.length > 0) {
  console.error('\n❌ BUNDLE VALIDATION FAILED');
  console.error('The following native modules are called in the JS bundle');
  console.error('but are NOT registered in the current App Store binary:\n');
  missing.forEach(m => console.error(`  • ${m}`));
  console.error('\nYou must ship a native binary update BEFORE this CodePush bundle.');
  process.exit(1);
}

console.log(`✅ All ${nativeModuleRefs.length} native module references validated.`);
We generate native-module-manifest.json as part of every native binary release. The CI pipeline for CodePush now runs this validation before any bundle is promoted. If validation fails, the deployment is blocked and the error message tells you exactly what native release must ship first.
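For illustration, here is a minimal sketch of what producing that manifest could look like. The only part taken from the validator is the modules array; the binaryVersion field, the function names, and the example module names are assumptions made for this sketch. In a real app the input would be React Native's NativeModules object, read inside a build of the exact binary being released. Lazily registered modules may not be enumerable that way, so treat the key listing as a starting point rather than a guaranteed inventory.

```javascript
// Hypothetical manifest builder. `nativeModules` is passed in so the sketch
// runs under plain Node; in the app it would be React Native's NativeModules.
function buildNativeModuleManifest(nativeModules, binaryVersion) {
  return {
    binaryVersion, // assumed field, recorded for traceability
    modules: Object.keys(nativeModules).sort(), // the list the validator reads
  };
}

// The same comparison as Step 3 of the validator, as a reusable function.
function findMissingModules(referencedModules, manifest) {
  const registered = new Set(manifest.modules);
  return [...new Set(referencedModules)].filter((name) => !registered.has(name));
}

// Example names only; not an inventory of any real binary.
const manifest = buildNativeModuleManifest(
  { RNCamera: {}, RNCAsyncStorage: {} },
  '3.8.1'
);
const missing = findMissingModules(['RNCamera', 'RNCameraModule'], manifest);
console.log(JSON.stringify(manifest, null, 2));
console.log('missing:', missing);
```

Committing the generated file alongside the native release tag would keep the manifest tied to a specific binary, so the validator always compares against what users actually have installed.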
We also enabled three CodePush safeguards we had never activated.
import CodePush, { CodePushOptions } from 'react-native-code-push';

const codePushOptions: CodePushOptions = {
  // Rollback retry window: up to 3 attempts, at least 10 minutes apart
  rollbackRetryOptions: {
    delayInHours: 0.167, // 10 minutes
    maxRetryAttempts: 3,
  },
  // Don't apply the update immediately; wait for the next restart.
  // This prevents mid-session disruption and means a broken bundle
  // can't crash a session that is already running.
  installMode: CodePush.InstallMode.ON_NEXT_RESTART,
  // Staged rollout: push to 5% of users first, monitor for 1 hour
  // before promoting to 100%
  // (set in appcenter-config.json, controlled via CD pipeline)
  deploymentKey: process.env.CODEPUSH_DEPLOYMENT_KEY,
};
Staged rollout alone would have contained this. With a 5% rollout (7,000 of our 140,000 users) and a 1-hour Sentry error-rate watch, the pipeline would have halted promotion at roughly 350 affected users instead of 40,000. That number still haunts me a little.
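The promotion decision itself can be a small pure function. The sketch below is an assumption about how such a gate might be written, not the pipeline described here: the function names, the minimum-sample rule, and the 1% crash-rate threshold are all illustrative, and the session counts would come from whatever crash reporter the pipeline queries.

```javascript
// How many users a staged rollout can reach at most.
function expectedExposure(activeUsers, rolloutPercent) {
  return Math.round((activeUsers * rolloutPercent) / 100);
}

// Hypothetical promotion gate: wait for enough sessions on the new bundle,
// then halt if the crash rate exceeds the allowed threshold.
function rolloutDecision({ sessions, crashedSessions, minSessions, maxCrashRate }) {
  if (sessions < minSessions) return 'wait';
  return crashedSessions / sessions > maxCrashRate ? 'halt' : 'promote';
}

const exposureAtFivePercent = expectedExposure(140000, 5);

// A bundle that crashes on every launch: every session on it is a crash.
const brokenBundle = rolloutDecision({
  sessions: 350,
  crashedSessions: 350,
  minSessions: 200,
  maxCrashRate: 0.01,
});

// A healthy bundle with a handful of unrelated crashes.
const healthyBundle = rolloutDecision({
  sessions: 5000,
  crashedSessions: 12,
  minSessions: 200,
  maxCrashRate: 0.01,
});

console.log({ exposureAtFivePercent, brokenBundle, healthyBundle });
```

The minimum-sample rule matters as much as the threshold: without it, a single early crash on a near-empty cohort would halt every release.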
Results after the fix
In the 10 months since, the validation script has caught 3 bundles that would have caused similar crashes. All caught in CI, before any user saw them. Each time the fix was the same: ship the native binary first, then the CodePush bundle.
Lessons learned
CodePush is not a safe escape hatch from the App Store review process. It's a powerful tool with a hard constraint: your JavaScript bundle must be compatible with the native binary installed on the device. Violate that constraint and you have a crash with no automatic recovery path.
Rollback is not instant recovery. CodePush rollback requires the app to run long enough to call sync(). If the crash is at launch, rollback reaches exactly zero affected users. Enable rollbackRetryOptions from day one. It's the only automatic recovery for this failure mode.
Jest mocks are a lie you tell your CI. jest.mock('react-native-camera') means tests pass regardless of whether the real native module is compatible. Build the actual Metro bundle and validate native module references against your shipped binary manifest. No emulator required.
Staged rollouts are mandatory. Every CodePush deployment should start at 5 to 10 percent of users. Monitor error rates for at least 30 minutes before promoting. At 5 percent, this caps the worst case at around 2,000 users instead of 40,000, and realistically catches issues at a few hundred.
Native binary releases and CodePush releases must be coordinated explicitly. We now maintain native-module-manifest.json, which records every native module registered by the last shipped binary. Any CodePush bundle referencing a module not in that manifest is blocked.