Why your app runs locally but crashes in production
If you’ve ever said “it works on my machine” with complete confidence—only to watch the app fall over in production—you’re not alone. This isn’t bad luck or incompetence. It’s a structural reality of how modern software is built, tested, and deployed.
Local environments are controlled, forgiving, and quiet. Production is none of those things.
Below are the real reasons production-only crashes happen—and how experienced teams think about them.
1. Your local environment is lying to you
Development machines are optimized for comfort, not realism.
Locally, you usually have:
- More CPU and memory than production
- Fewer background processes competing for resources
- A single user, not thousands
- Clean state (fresh databases, empty caches)
- Different OS libraries and kernel behavior
Production, on the other hand, is constrained by quotas, container limits, orchestration rules, and shared infrastructure. A memory allocation that looks harmless locally can cross a container limit and trigger an instant kill in production—with no stack trace and no graceful shutdown.
When your app “just exits” in prod, it’s often not crashing. It’s being terminated.
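One practical defense is making memory pressure visible before the kernel acts. Here is a minimal sketch in Node, assuming a hypothetical 512 MiB container limit (in a real container you would read the limit from the cgroup filesystem):

```javascript
// Sketch: compare the process's resident set size against a container limit
// so memory pressure shows up in logs *before* an OOM kill.
// CONTAINER_LIMIT_BYTES is a placeholder value, not read from the real cgroup.
const CONTAINER_LIMIT_BYTES = 512 * 1024 * 1024;

function memoryReport(limit = CONTAINER_LIMIT_BYTES) {
  const { rss } = process.memoryUsage(); // resident set size in bytes
  const pct = (rss / limit) * 100;
  return {
    rssMiB: Math.round(rss / (1024 * 1024)),
    percentOfLimit: Math.round(pct),
    nearLimit: pct > 80, // warn well before the kernel steps in
  };
}

// Emit it on an interval so the last report survives in your log stream
// even when the process dies without a stack trace, e.g.:
// setInterval(() => console.log(JSON.stringify(memoryReport())), 10_000);
console.log(JSON.stringify(memoryReport()));
```

The point is not the exact threshold; it is that a SIGKILL cannot be caught, so the only evidence you will have is whatever the process logged beforehand.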
2. Concurrency reveals bugs you didn’t write tests for
Most local testing is single-user and sequential. Production is parallel, noisy, and impatient.
Under real load:
- Requests overlap
- Shared state is accessed concurrently
- Timeouts stack up
- Retries amplify traffic
- Queues back up faster than they drain
This is where race conditions, deadlocks, and subtle ordering bugs surface. Code that is logically correct in isolation can fail spectacularly when executed concurrently.
If a bug only appears sometimes in production, assume concurrency until proven otherwise.
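The classic shape of such a bug is a lost update: read shared state, yield to I/O, write back a stale value. A minimal Node sketch (the bank-balance scenario is illustrative):

```javascript
// Sketch of a lost update: two "requests" each read shared state, await
// something (simulating a DB or network hop), then write back a stale value.
// Each call is correct in isolation; run concurrently, one update vanishes.
let balance = 100;

async function withdraw(amount) {
  const current = balance;                   // read shared state
  await new Promise((r) => setImmediate(r)); // yield, as real I/O would
  balance = current - amount;                // write back a stale value
}

async function demo() {
  balance = 100;
  await Promise.all([withdraw(30), withdraw(30)]); // overlapping requests
  return balance; // 70, not 40 — one withdrawal was silently lost
}

demo().then((b) => console.log(`final balance: ${b}`));
```

Both calls read `balance` as 100 before either writes, so the final balance is 70 instead of 40. Sequential tests would never catch this; only overlap exposes it.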
3. Real data is messier than test data
Test fixtures are polite. Production data is not.
In production you’ll encounter:
- Unexpected nulls
- Oversized payloads
- Deeply nested objects
- Corrupt rows
- Legacy values from old versions of the app
Most production crashes happen at the boundaries: parsing, serialization, validation, and transformation. Not because the logic is wrong—but because the assumptions are.
If your app has never processed actual production-shaped data locally, you’re testing a different application.
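Defending those boundaries means normalizing input instead of trusting it. A sketch, with illustrative field names and a hypothetical size cap:

```javascript
// Sketch of boundary validation: normalize production-shaped input rather
// than assuming fixture-shaped input. Field names and the cap are illustrative.
const MAX_NAME_LENGTH = 256; // hypothetical limit on oversized payloads

function normalizeUser(raw) {
  if (raw == null || typeof raw !== "object") {
    throw new TypeError("expected an object payload");
  }
  // Unexpected nulls: fall back explicitly instead of crashing downstream.
  const name = typeof raw.name === "string" ? raw.name : "";
  // Legacy values: suppose an old app version stored age as a string.
  const age = Number.parseInt(raw.age, 10);
  return {
    name: name.slice(0, MAX_NAME_LENGTH), // truncate oversized input
    age: Number.isNaN(age) ? null : age,  // keep "unknown" explicit
  };
}

console.log(normalizeUser({ name: null, age: "42" })); // { name: '', age: 42 }
```

Every decision here encodes an assumption the happy-path code left implicit: what null means, how big is too big, what old versions wrote.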
4. Infrastructure failures don’t look like application errors
Some of the most confusing production crashes aren’t bugs in your code at all.
Common examples:
- OOM kills reported only as `Killed`
- Container restarts with no logs
- Network timeouts that look like random failures
- DNS resolution delays under load
- Disk or inode exhaustion
From inside the process, these failures can be silent. Without proper logging and metrics, you’re left guessing.
This is why “it didn’t log anything” is not a mystery—it’s a signal that the process didn’t get the chance to.
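Exit codes are one of the few clues such a death leaves behind. By Unix convention, a process killed by signal N exits with status 128 + N, so 137 means SIGKILL (often the OOM killer) and 143 means SIGTERM. A small decoding sketch (the helper name is illustrative):

```javascript
// Sketch: decode a container exit code using the Unix convention that a
// process killed by signal N exits with status 128 + N.
const SIGNALS = {
  9: "SIGKILL (often the OOM killer)",
  15: "SIGTERM (orchestrator-initiated shutdown)",
};

function explainExitCode(code) {
  if (code > 128) {
    const sig = code - 128;
    return `killed by signal ${sig}: ${SIGNALS[sig] ?? "see kill -l"}`;
  }
  return code === 0 ? "clean exit" : `application error (exit ${code})`;
}

console.log(explainExitCode(137)); // killed by signal 9: SIGKILL (often the OOM killer)
console.log(explainExitCode(143)); // killed by signal 15: SIGTERM (orchestrator-initiated shutdown)
```

When the "crash" decodes to a signal, the investigation shifts from your code to your limits and your orchestrator.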
5. Observability is not optional in production
Local debugging relies on breakpoints and stack traces. Production debugging relies on context.
At minimum, production systems need:
- Structured logs (not just `console.log`)
- Correlation or trace IDs
- Request-level metadata
- Resource metrics (CPU, memory, GC)
- Visibility into restarts and crashes
Without observability, production failures feel random. With it, patterns emerge quickly.
Most “unpredictable” production crashes are only unpredictable because nothing is watching.
The core insight
Production doesn’t expose new bugs—it exposes unhandled realities.
Your app isn’t failing because it’s broken. It’s failing because production introduces:
- Scale
- Pressure
- Imperfect data
- Limited resources
- Real users behaving unexpectedly
Local success means your logic works.
Production stability means your assumptions hold.
Those are very different achievements.
Final takeaway
If your application crashes only in production, don’t ask:
“Why does production behave differently?”
Ask instead:
“Which conditions exist in production that I never tested for?”
That question leads to answers.