mid
intermediate
SRE

SLI, SLO, and SLA Definitions

Question

Explain the difference between SLI, SLO, and SLA with examples.

Answer

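An SLI (Service Level Indicator) is a quantitative measure of service behavior as users experience it (e.g., request latency, error rate). An SLO (Service Level Objective) is an internal target for an SLI over a time window (e.g., 99.9% availability over 30 days). An SLA (Service Level Agreement) is an external contract with customers that specifies consequences, such as service credits, for missing agreed targets. SLA thresholds are usually set looser than the internal SLOs, so the team gets a warning before a contractual breach. Example: SLI = request latency, SLO = 95% of requests complete in under 200 ms, SLA = credits if monthly availability drops below 99.5%.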
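
To make the example concrete, here is a minimal Python sketch of the three ideas. The function names and the sample data are hypothetical; only the 200 ms, 95%, and 99.5% figures come from the example above.

```python
def latency_sli(latencies_ms: list[float], threshold_ms: float = 200.0) -> float:
    """SLI: fraction of requests that completed faster than the threshold."""
    if not latencies_ms:
        raise ValueError("need at least one request to compute an SLI")
    good = sum(1 for ms in latencies_ms if ms < threshold_ms)
    return good / len(latencies_ms)


def meets_slo(sli: float, target: float = 0.95) -> bool:
    """SLO: the internal target the measured SLI is compared against."""
    return sli >= target


def sla_credit_due(monthly_availability: float, sla_threshold: float = 0.995) -> bool:
    """SLA: a contractual consequence applies below the agreed threshold."""
    return monthly_availability < sla_threshold


# Hypothetical sample: 19 of 20 requests finish in under 200 ms.
samples = [120.0] * 19 + [450.0]
sli = latency_sli(samples)
print(f"SLI={sli:.2%}, SLO met: {meets_slo(sli)}")
```

Note that the SLI is only a measurement; the SLO and SLA are two different thresholds applied to measurements, one internal and one contractual.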

Why This Matters

These concepts form the foundation of reliability engineering. SLIs help measure user experience, SLOs balance reliability with feature velocity, and SLAs formalize business commitments. Without clear SLOs, teams either over-engineer for unnecessary reliability or ship too fast and burn out on incidents.

Code Examples

SLO definition example
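
One possible way to write an SLO down as data is sketched below in Python. The `SLO` class, its field names, and the sample values are assumptions made for illustration, not a standard schema. Rejecting a target of exactly 1.0 is a design choice of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    name: str          # short identifier for the objective
    sli: str           # human-readable description of the indicator
    target: float      # required fraction of good events, e.g. 0.999
    window_days: int   # length of the rolling evaluation window

    def __post_init__(self) -> None:
        # A target of 1.0 (100%) would leave no room for any failure.
        if not 0.0 < self.target < 1.0:
            raise ValueError("target must be strictly between 0 and 1")
        if self.window_days <= 0:
            raise ValueError("window_days must be positive")


# Hypothetical definition matching the latency example in the answer.
api_latency = SLO(
    name="api-latency",
    sli="requests served in under 200 ms",
    target=0.95,
    window_days=30,
)
print(api_latency)
```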


SLI measurement in Prometheus
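
A common way to express an availability SLI in Prometheus is the rate of non-5xx requests divided by the rate of all requests. The Python sketch below only builds such a PromQL expression as a string. The metric name `http_requests_total`, the `status` label, and the helper `availability_sli_query` are assumptions, so adjust them to whatever your services actually export.

```python
def availability_sli_query(
    metric: str = "http_requests_total",  # assumed counter name
    status_label: str = "status",         # assumed label holding the HTTP status code
    window: str = "5m",                   # range-vector window passed to rate()
) -> str:
    """Build a PromQL expression for the fraction of non-5xx requests."""
    good = f'sum(rate({metric}{{{status_label}!~"5.."}}[{window}]))'
    total = f"sum(rate({metric}[{window}]))"
    return f"{good} / {total}"


print(availability_sli_query())
```

The resulting expression evaluates to a value between 0 and 1, which can be compared directly against an SLO target such as 0.999.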

Common Mistakes
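  • Setting SLOs at 100% (unattainable in practice, and it leaves no error budget for deploys or other changes)
  • Confusing SLOs (internal targets) with SLAs (customer contracts)
  • Defining an SLO but not tracking its error budget, so the target never influences release or prioritization decisions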
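
The error budget behind the first and third mistakes is simple arithmetic: it is the share of events the SLO allows to fail, 1 minus the target. A minimal Python sketch follows; the function names and the sample numbers are hypothetical.

```python
def error_budget(target: float) -> float:
    """Fraction of events that may fail while still meeting the SLO."""
    return 1.0 - target


def allowed_downtime_minutes(target: float, window_days: int = 30) -> float:
    """Error budget expressed as minutes of full downtime in the window."""
    return error_budget(target) * window_days * 24 * 60


def budget_remaining(target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent (negative means overspent)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = error_budget(target) * total_events
    if allowed_bad == 0:
        raise ValueError("a 100% target leaves no error budget to track")
    return 1.0 - bad_events / allowed_bad


# A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
print(round(allowed_downtime_minutes(0.999), 1))
# 250 failures out of 1,000,000 requests spends about a quarter of the budget.
print(round(budget_remaining(0.999, 1_000_000, 250), 2))
```

A 100% target makes the budget zero, which is why the sketch refuses to compute a remaining fraction for it.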
Follow-up Questions
Interviewers often ask these as follow-up questions
  • How do you handle situations when you're close to exhausting your error budget?
  • What's the difference between availability and reliability?
  • How do you choose appropriate SLO targets?
Tags
sre
reliability
monitoring
slo
business