Performance testing basics
Load, stress, spike, and soak testing explained simply, the metrics that matter (response time, throughput, error rate, percentiles like p95), and Core Web Vitals in plain English.
A website can be perfectly correct and still be terrible. If every button works but each page takes nine seconds to load, users leave before they ever see it work. Performance testing asks a different question from the usual "does it work?". It asks "does it work well, and does it keep working when lots of people show up at once?" It's the difference between a shop that sells the right products and a shop where the queue is out the door.
Four ways to push a system
There isn't one kind of performance test; there are several, and they answer different questions. Think of them as four ways to stress a bridge before you trust real traffic on it.
| Type | Question it answers | Everyday picture |
|---|---|---|
| Load testing | Does it stay fast under the traffic we expect? | A normal busy lunch rush at a cafe. |
| Stress testing | Where does it break, and how badly? | Cramming in more and more people until something gives. |
| Spike testing | Can it survive a sudden surge? | A flash sale or a viral post hitting all at once. |
| Soak testing | Does it hold up over a long time? | Running steadily for hours to catch slow leaks. |
Soak testing is the sneaky-important one. Some problems only appear after hours of running: memory slowly filling up, a small leak that's invisible in a five-minute test but crashes the server overnight. A quick test would never catch it.
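One way to see the difference between the four types is to write down how many simulated users each test would aim for over time. The sketch below is only an illustration: the function names, the 100-user "expected" level, the step sizes, and the timings are all made-up assumptions, not settings from any particular tool.

```python
# Target number of simulated users at minute `t` for each test type.
# Every number here (100 users, steps, surge size) is an illustrative assumption.

EXPECTED_USERS = 100  # hypothetical "normal busy" traffic level


def load_users(t: int) -> int:
    """Load test: hold the traffic we expect."""
    return EXPECTED_USERS


def stress_users(t: int) -> int:
    """Stress test: add 50 users every 5 minutes until something gives."""
    return EXPECTED_USERS + 50 * (t // 5)


def spike_users(t: int) -> int:
    """Spike test: normal traffic, with a 10x surge during minutes 10 and 11."""
    return EXPECTED_USERS * 10 if 10 <= t < 12 else EXPECTED_USERS


def soak_users(t: int) -> int:
    """Soak test: same level as a load test; the difference is running for hours."""
    return EXPECTED_USERS


for minute in (0, 10, 20):
    print(minute, load_users(minute), stress_users(minute),
          spike_users(minute), soak_users(minute))
```

Note that the load and soak profiles look identical minute by minute. What separates them is how long you keep going, which is exactly why a short test cannot stand in for a soak test.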
The metrics that matter
To measure performance you need numbers. A few matter most, and they tell a story together:
- Response time: how long one request takes to answer. Lower is better. This is what a user feels as speed.
- Throughput: how many requests the system handles per second. Higher means it can serve more people at once.
- Error rate: the share of requests that fail. Under heavy load, watch this climb; a system that stays fast but starts failing isn't actually coping.
- Percentiles: the honest way to report response time (see below).
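As a rough sketch of how the first three numbers come out of raw test results, the snippet below summarizes a list of request records. The `Request` record, the `summarize` helper, and the sample data are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    seconds: float  # how long the request took to answer
    ok: bool        # False if the request failed


def summarize(requests: list[Request], test_duration_s: float) -> dict[str, float]:
    """Compute mean response time, throughput, and error rate for one test run."""
    total = len(requests)
    failed = sum(1 for r in requests if not r.ok)
    return {
        "mean_response_s": sum(r.seconds for r in requests) / total,
        "throughput_rps": total / test_duration_s,  # requests per second
        "error_rate": failed / total,               # share of requests that failed
    }


# Hypothetical run: 200 requests over 10 seconds, all fast, but 10 of them failed.
sample = [Request(0.2, True)] * 190 + [Request(0.2, False)] * 10
summary = summarize(sample, test_duration_s=10.0)
print(summary)
```

In this made-up run the response time looks healthy, yet 5% of requests failed, which is the "fast but failing" situation that only shows up when the three numbers are read together.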
Say 99 people load a page in 1 second and one person waits 30 seconds. The average is about 1.3 seconds, which sounds fine but hides a disaster. A percentile is fairer. p95 = 2 seconds means 95% of page loads finished in 2 seconds or less (and 5% took longer). Teams track p95 and p99 precisely because they reveal the unlucky users an average sweeps under the rug.
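To make the arithmetic concrete, here is a minimal sketch using the nearest-rank method, which is one of several common ways to compute a percentile (other tools interpolate and can give slightly different answers). The `percentile` helper and the second data set are assumptions made up for this example.

```python
import math
from statistics import mean


def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# The example from the text: 99 loads at 1 second, one load at 30 seconds.
text_example = [1.0] * 99 + [30.0]
print(round(mean(text_example), 2))  # 1.29, the "about 1.3 seconds" average
print(max(text_example))             # 30.0, the wait the average hides

# Hypothetical sample: 90 loads at 1 s, 5 loads at 2 s, 5 loads at 30 s.
sample = [1.0] * 90 + [2.0] * 5 + [30.0] * 5
print(mean(sample))            # 2.5
print(percentile(sample, 50))  # 1.0
print(percentile(sample, 95))  # 2.0
print(percentile(sample, 99))  # 30.0
```

In the hypothetical sample, the average of 2.5 seconds describes nobody's actual experience, while the p95 and p99 show where the slow loads begin and how slow they get.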
Whenever someone quotes you a single "average load time," gently ask for the p95. The average blends the good cases and the bad ones into a single comfortable-looking number; the p95 tells you how bad the bad cases get, and the bad cases are the ones people complain about.
Core Web Vitals: performance from the user's eyes
There's a second kind of performance, closer to home: how fast and smooth a single page feels in the browser. Google defined three plain measures for this, called Core Web Vitals. They matter both for users and, because search engines factor them in, for how easily people find a site at all.
| Vital | Full name | What it really measures |
|---|---|---|
| LCP | Largest Contentful Paint | How long until the main content shows up. "When can I actually see the page?" |
| INP | Interaction to Next Paint | How quickly the page reacts when you tap or click. "Does it feel snappy or laggy?" |
| CLS | Cumulative Layout Shift | How much things jump around as the page loads. "Did the button move just as I went to tap it?" |
That last one, CLS, is the maddening experience of going to tap a link and having an ad load above it, shoving the whole page down so you tap the wrong thing. Low CLS means a calm, stable page. The three vitals together capture loading, responsiveness, and visual stability: the trio that decides whether a page feels good.
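Google publishes "good" and "poor" cut-offs for each vital, and a small sketch shows how a measurement would be rated against them. The thresholds below are the ones Google documents at the time of writing (they may change, so check the current guidance); the `rate` helper and the page measurements are hypothetical.

```python
# Published Core Web Vitals thresholds: (good at or below, poor above).
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "INP": (200, 500),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def rate(vital: str, value: float) -> str:
    """Rate one measurement as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[vital]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


# Hypothetical measurements for a single page.
page = {"LCP": 2.1, "INP": 350, "CLS": 0.31}
for vital, value in page.items():
    print(vital, rate(vital, value))
```

This made-up page loads quickly but responds sluggishly and jumps around, so it would rate good on LCP, needs improvement on INP, and poor on CLS. Google's own assessment applies these cut-offs to the 75th percentile of real visits rather than to a single measurement.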
Performance is not a one-time check you pass and forget. Pages get heavier as features are added, and a site that was fast at launch quietly slows over months. Measure it regularly, the way you'd weigh yourself on a scale, so you catch the drift early instead of after users start leaving.
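A regular measurement only helps if something compares it with where you started. The sketch below flags drift against a launch baseline; the `has_regressed` helper, the 10% tolerance, and the monthly figures are all assumptions chosen for illustration.

```python
def has_regressed(baseline_p95_s: float, current_p95_s: float,
                  tolerance: float = 0.10) -> bool:
    """True if the current p95 is more than `tolerance` (10% by default) slower than the baseline."""
    return current_p95_s > baseline_p95_s * (1 + tolerance)


# Hypothetical monthly p95 page-load times in seconds, starting at launch.
history = [1.8, 1.85, 1.9, 2.05, 2.3]
baseline = history[0]

for month, p95 in enumerate(history):
    if has_regressed(baseline, p95):
        print(f"month {month}: p95 of {p95}s has drifted past the {baseline}s baseline")
```

No single month here looks alarming next to the one before it, which is why the comparison is made against the launch baseline rather than against the previous reading.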
- Performance testing asks "does it work well, and under pressure?", not just "does it work?"
- Load (expected traffic), stress (find the breaking point), spike (sudden surge), and soak (long haul) each push the system differently.
- Track response time, throughput, and error rate together: staying fast while failing isn't coping.
- Averages hide the unlucky users; report percentiles like p95 and p99 to see how bad the bad cases get.
- Core Web Vitals capture how a page feels to a real person: LCP (can I see it?), INP (does it respond?), and CLS (does it stay still?).