P90, P95 & P99 Latency Simulator

Learn latency percentiles with an interactive simulator. Generate request samples, visualize distributions, compare average latency with P50, P90, P95, and P99, and see how tail latency affects real users.

What You Will Learn

Understand that P90, P95, and P99 are percentile rank cutoffs
Visualize how latency distributions create long-tail behavior
Compare average latency, median latency, and percentile latency
See why P99 exposes slow requests that averages can hide
Practice choosing the right latency metric for SLOs and incident response

Explore P90, P95, and P99 latency

Generate a latency sample, sort every request, and watch how a tiny slow tail changes the numbers your dashboards show.
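As a rough illustration of what "generate a latency sample" could mean, here is a minimal sketch. The model is an assumption made for this example, not the simulator's actual generator: `generate_latencies`, `tail_fraction`, `jitter`, and `base_ms` are hypothetical names, and the slow path is modeled as a 2x to 4x multiplier.

```python
import random

def generate_latencies(n=400, tail_fraction=0.02, jitter=0.5, base_ms=60.0, seed=1):
    """Return n synthetic latencies in ms (hypothetical model, not the simulator's)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same sample
    samples = []
    for _ in range(n):
        # Normal path: base latency scaled by a bounded random jitter factor.
        latency = base_ms * (1.0 + jitter * rng.uniform(-0.5, 0.5))
        # Rare slow path: a small share of requests takes 2x to 4x longer.
        if rng.random() < tail_fraction:
            latency *= rng.uniform(2.0, 4.0)
        samples.append(latency)
    return samples

latencies = generate_latencies()
ordered = sorted(latencies)  # percentiles are read from this sorted list
print(f"fastest={ordered[0]:.0f}ms slowest={ordered[-1]:.0f}ms")
```

Raising `tail_fraction` or widening the multiplier range stretches the slow tail while leaving most requests untouched, which is the effect the controls below let you explore.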

Shape the latency sample

Requests

400

Slow tail

Jitter

50%

Highlight a percentile

P95 means 95% of requests finished in 90ms or less. The remaining 5% were slower.

Only the slowest 5% are above this line.
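The statement above can be checked in code. This sketch assumes the nearest-rank definition of a percentile (the smallest value with at least p% of samples at or below it); the simulator may use a different interpolation rule. The function name `percentile_nearest_rank` and the toy data are made up for illustration.

```python
import math

def percentile_nearest_rank(samples, p):
    """Smallest sample value with at least p percent of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank in the sorted list
    return ordered[rank - 1]

# Toy data: 19 requests between 50ms and 68ms, plus one slow request.
latencies_ms = [50 + i for i in range(19)] + [400]
p95 = percentile_nearest_rank(latencies_ms, 95)
slower = sum(1 for x in latencies_ms if x > p95)
print(p95, slower)  # 68 1
```

With 20 samples, P95 is the 19th sorted value, so exactly one request (5%) sits above the cutoff.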

Average: 64ms. Can hide a painful tail.
P50: 60ms. Median request.
P90: 80ms. 10% are slower.
P95: 90ms. 5% are slower.
P99: 193ms. 1% are slower.

Latency distribution

The bars show request counts by latency bucket. Vertical lines mark percentile cutoffs.

Highlighted cutoff: P95 = 90ms. Marked cutoffs: P90, P95, P99. Latency axis: 0ms to 242ms.
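A bucketed distribution like the one described here can be approximated with fixed-width buckets. This is a sketch only: the bucket width, the helper name `bucket_counts`, and the sample values are assumptions, not the chart's real settings.

```python
from collections import Counter

def bucket_counts(latencies_ms, bucket_width_ms=20):
    """Count requests per fixed-width latency bucket, keyed by bucket start in ms."""
    counts = Counter(
        (int(x) // bucket_width_ms) * bucket_width_ms for x in latencies_ms
    )
    return dict(sorted(counts.items()))

# Toy sample with a cluster of normal requests and two slow outliers.
sample = [42, 55, 58, 61, 63, 79, 88, 193, 242]
width = 20
for start, count in bucket_counts(sample, width).items():
    # One '#' per request in the bucket gives a crude text histogram.
    print(f"{start:>3}ms to {start + width - 1:>3}ms {'#' * count}")
```

Empty buckets are simply absent from the result, which makes the gap between the main cluster and the slow outliers easy to spot.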

Request stream

Each square is one request in arrival order.

Showing the first 144 requests.

Legend: fast, normal, above P95, above P99.

Read the tail before you tune

P99 is 3.2x the median in this sample. If average and P50 look fine while P99 jumps, optimize the rare slow path: cache misses, cold starts, retries, lock waits, noisy nodes, or downstream calls.

Sorted request ruler

Requests are sorted fastest to slowest. Percentiles are rank positions, not averages.

Highlighted rank: 380 / 400. Marked cutoffs: P90, P95, P99.

First 90%: Normal user experience. This is the range that average-based dashboards over-emphasize.
Next 9%: Slow requests. P95 falls in this band, and it is often where support tickets start to appear.
Slowest 1%: The tail. P99 tells you whether rare paths are painful.
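The three bands can be reproduced by slicing a sorted list at the P90 and P99 rank positions. The sketch below assumes nearest-rank cutoffs; `split_into_bands` and the band names are hypothetical.

```python
def split_into_bands(latencies_ms):
    """Split sorted latencies into the first 90%, the next 9%, and the slowest 1%."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    cut90 = (90 * n + 99) // 100  # integer form of ceil(0.90 * n)
    cut99 = (99 * n + 99) // 100  # integer form of ceil(0.99 * n)
    return {
        "first_90": ordered[:cut90],
        "next_9": ordered[cut90:cut99],
        "slowest_1": ordered[cut99:],
    }

# Toy sample: 400 requests with latencies 1ms through 400ms.
bands = split_into_bands(range(1, 401))
print({name: len(values) for name, values in bands.items()})
```

For 400 requests the cut points fall at ranks 360 and 396, so the bands hold 360, 36, and 4 requests.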

Incident readout

Which metric would catch the user pain?

P90: 80ms. 10% of requests are slower than this cutoff.
P95: 90ms. 5% of requests are slower than this cutoff.
P99: 193ms. 1% of requests are slower than this cutoff.

Takeaway: P90 is enough for this scenario. The tail is present, but it is not dramatically separated.

What the numbers say

P90: 80ms means 90% of sampled requests completed in 80ms or less. It describes the upper edge of normal.

P95: 90ms leaves only 5% of requests above it. It is a practical SLO metric because it catches recurring pain without being as jumpy as P99.

P99: 193ms isolates the slowest 1%. In this run, that group averages 212ms, while the median is only 60ms.
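To see how an average of the slowest 1% can be computed and compared with the median, here is a minimal sketch. It assumes the tail is everything ranked above the nearest-rank P99 position; `tail_mean` and the toy values are invented for this example and are not the numbers from the run above.

```python
import math
import statistics

def tail_mean(latencies_ms, tail_percent=1):
    """Mean of the requests ranked above the (100 - tail_percent) cutoff."""
    ordered = sorted(latencies_ms)
    cutoff_rank = math.ceil((100 - tail_percent) * len(ordered) / 100)
    tail = ordered[cutoff_rank:]  # everything slower than the cutoff rank
    if not tail:
        raise ValueError("sample too small to have a tail at this percentile")
    return statistics.mean(tail)

# Toy data: 198 requests at 50ms and two slow requests.
latencies_ms = [50] * 198 + [300, 500]
median_ms = statistics.median(latencies_ms)
slowest_mean_ms = tail_mean(latencies_ms)
print(median_ms, slowest_mean_ms)
```

In this toy sample the median is 50ms while the slowest 1% averages 400ms, the same kind of gap the readout describes.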

How P90, P95, and P99 latency work

Percentiles are sorted ranks

P90: sort every request by latency. The P90 value is the point where 90% of requests are at or below that value.
P95: leaves only the slowest 5% above the cutoff, which makes it useful for SLOs and user-facing dashboards.
P99: focuses on the slowest 1%, so it exposes rare but painful paths that averages often hide.

Why average is not enough

A few very slow requests can stay invisible when the average and the median both look normal: in the sample above, the average is 64ms and the median is 60ms, while P99 is 193ms.
Tail latency often comes from retries, lock waits, cold starts, cache misses, or slow dependencies.
Percentiles show what share of requests is affected, not just how slow the typical request is.
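A small worked example makes the point concrete. The data is invented for illustration, and P99 is read with the nearest-rank rule, which is an assumption about the definition in use.

```python
import statistics

# Toy data: 98 fast requests and 2 very slow ones.
latencies_ms = [50] * 98 + [2000] * 2
ordered = sorted(latencies_ms)

mean_ms = statistics.mean(latencies_ms)
median_ms = statistics.median(latencies_ms)
p99_ms = ordered[99 - 1]  # nearest-rank P99 of 100 samples is the 99th value

# Share of requests that took longer than one second.
slow_share = sum(1 for x in latencies_ms if x > 1000) / len(latencies_ms)
print(mean_ms, median_ms, p99_ms, slow_share)
```

The mean (89ms) and median (50ms) suggest a healthy service, while P99 (2000ms) shows that 2% of requests took two seconds.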

How to use them in practice

Use P50 to understand the normal request path.
Use P90 or P95 to track broad user experience and SLO health.
Use P99 to investigate tail events, capacity limits, and reliability regressions.
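One way to put these guidelines into practice is to compare observed percentiles against per-percentile targets. The sketch below is illustrative only: the targets in `SLO_TARGETS_MS` are hypothetical values, `slo_report` is a made-up helper, and percentiles use the nearest-rank rule.

```python
import math

# Hypothetical SLO targets in milliseconds; choose values that fit your service.
SLO_TARGETS_MS = {50: 100, 95: 250, 99: 500}

def nearest_rank(ordered, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def slo_report(latencies_ms, targets=SLO_TARGETS_MS):
    """Map each percentile to (observed_ms, target_ms, within_target)."""
    ordered = sorted(latencies_ms)
    report = {}
    for p, target in targets.items():
        observed = nearest_rank(ordered, p)
        report[p] = (observed, target, observed <= target)
    return report

# Toy data: mostly fast requests, a slower band, and two very slow requests.
latencies = [80] * 90 + [200] * 8 + [900] * 2
report = slo_report(latencies)
for p, (observed, target, ok) in report.items():
    print(f"P{p}: {observed}ms (target {target}ms) {'ok' if ok else 'BREACH'}")
```

Here P50 and P95 stay within their targets while P99 breaches, which is the pattern that points to a rare slow path rather than a general slowdown.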
