Load Testing
Exercising a system under expected workload to answer “can we handle load X?” or “what is the maximum Y we can handle?”. Not the same as stress testing, which cranks pressure until things break.
Why?
Scalability goals (1 user to 100 to 10 million) require numbers, not hunches. C-level questions like “can we handle 10x users?” need evidence-backed answers.
Two workload schedules:
- Steady load: constant arrival rate held for the duration
- Stepwise load: incrementally increasing rate in discrete steps
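The two schedules can be sketched as request-arrival timestamp generators. A minimal sketch (function names and parameters are illustrative, not from the source):

```python
def steady_schedule(rate_per_s: float, duration_s: float) -> list[float]:
    """Constant arrival rate: one request every 1/rate seconds,
    held for the whole duration."""
    interval = 1.0 / rate_per_s
    return [i * interval for i in range(int(duration_s * rate_per_s))]

def stepwise_schedule(start_rate: float, step: float,
                      step_duration_s: float, n_steps: int) -> list[float]:
    """Rate increases by `step` req/s after each step window,
    so pressure rises in discrete increments."""
    times, offset = [], 0.0
    for k in range(n_steps):
        rate = start_rate + k * step
        times += [offset + i / rate for i in range(int(step_duration_s * rate))]
        offset += step_duration_s
    return times
```

A driver would then fire one request at each timestamp; stepwise schedules are the natural fit for “what is the maximum Y?” questions, since you can see at which step the system degrades.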
Start with why [Mel21]
The reason drives the design:
- New system: establish average workload + safety buffer
- Expected growth (10x users): find bottlenecks
- Spike (tax season, Black Friday): hard part is simulating the spike
- High uptime (99.99%): endurance test on top of load test
- “Performance testing” is on the checklist with no real reason: bail
What to test
- Not 100% coverage. Start with what observability flags as slow, or the critical path per product requirements
- Compute-heavy workflows, UX-sensitive flows (if signup takes > 2 s, users quit), hard external deadlines (1 s to approve/decline)
- Low current utilization means you have to guess the rate-limiting step and revise as you ramp up
How to test
Hardware principle [Liu09]
Test on production-equivalent hardware. A 16 GB laptop is not a 128 GB server, and limiting factors differ wildly. Otherwise you waste time optimizing RAM when RAM is not the problem.
Reality principle
Use real workload shapes. Legal may block real customer data, so use the best approximation.
War story (JZ)
A DB migration timed out on prod because the test DB was much smaller. “Managers run the report monthly” turned into “managers run it hourly to watch their team.” Plan types, customer sizes, entity counts all matter.
Volume principle
“More is the new more.” You cannot fake real pressure. Faking CPU pressure (encoding video in a loop) does not reveal lock contention that only shows up with 500 real users.
Reproducibility
Two runs on the same code should produce similar results. Unlike unit tests, load tests have real randomness (generated data, scheduler, luck), so aim for similarity, not identity.
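“Similar, not identical” can be made concrete with a tolerance check on an aggregate metric. A minimal sketch, assuming latency samples in milliseconds and a hypothetical 10% drift budget:

```python
import statistics

def runs_similar(run_a: list[float], run_b: list[float],
                 rel_tol: float = 0.10) -> bool:
    """Compare two load-test runs by median latency.
    Real randomness (generated data, scheduler, luck) means the bar
    is similarity within rel_tol, not identical numbers."""
    a, b = statistics.median(run_a), statistics.median(run_b)
    return abs(a - b) <= rel_tol * max(a, b)
```

The median is used here because it is robust to a few outliers; the same check can be applied to p95/p99 if those are the metrics you report.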
Endurance tests
Running analogy
Jeff can run 10 km/h for 1 hour, not 4. Looking at a 15-min sample you’d conclude he could run forever.
CPUs don’t get tired, but software accumulates “fatigue”: memory leaks (java.lang.OutOfMemoryError), swap thrashing, file handle exhaustion, disk fill, log growth.
Holiday freeze (JZ)
Services ran unrestarted long enough to hit internal-resource limits. Fix was rolling restarts. This is an endurance problem even at low load.
Picking duration has no universal rule. Use product requirements (e-commerce: 5 days across Thanksgiving to Cyber Monday) or maintenance windows (SLA allows downtime Sun 02:00 to 03:00, so must survive at least a week).
Evaluating success
Raw results rarely suffice. Post-process, aggregate, correlate with external factors. Criteria:
- Total work completed within total time limit?
- Individual item time met 99% of the time?
For endurance, look at the trend: does the “yes” stay yes across the whole window?
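The two criteria can be combined into one post-processing check. A minimal sketch, assuming sequentially processed items with latencies in ms (so total time is the sum) — limits and names are illustrative:

```python
def run_succeeded(latencies_ms: list[float],
                  item_limit_ms: float,
                  total_limit_ms: float) -> bool:
    """Success = total work finished within the total time budget,
    AND at least 99% of individual items met the per-item limit."""
    within = sum(1 for lat in latencies_ms if lat <= item_limit_ms)
    return (sum(latencies_ms) <= total_limit_ms
            and within / len(latencies_ms) >= 0.99)
```

For an endurance run, evaluate this per window (say, hourly) and inspect the sequence of booleans: a run that flips from True to False partway through has a fatigue problem, not a capacity problem.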
When you fail
Apply course techniques to the specific slow scenario, re-test, repeat. Software has limits (Jeff will not beat Kipchoge’s 2:01:09). If you have hit the wall, consider redesign, or rethink the constraints: don’t bill all customers on the same day if you can spread billing across the month.
Constant vigilance
Repeat load tests regularly. Software grows in complexity faster than hardware improves (in the current era), so catch regressions before prod does. Example: https://arewefastyet.com tracks Firefox perf.