SLA Reference and Uptime Calculator
Uptime percentage tables, allowed downtime calculations, error budget guidance, and clear definitions for the three pillars of service reliability.
SLI vs SLO vs SLA
| Term | Full Name | Definition | Example |
|---|---|---|---|
| SLI | Service Level Indicator | A quantitative measure of a specific aspect of the service. This is what you measure. | Proportion of requests completed in under 200ms |
| SLO | Service Level Objective | A target value or range for an SLI. This is what you aim for internally. | 99.9% of requests complete in under 200ms over a 30-day window |
| SLA | Service Level Agreement | A contractual commitment with consequences for missing targets. This is what you promise externally. | 99.9% uptime; service credits issued for breaches exceeding 0.1% downtime |
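
To make the SLI and SLO rows concrete, here is a minimal Python sketch that computes a latency SLI from a list of request durations and compares it with an SLO target. The function names `latency_sli` and `meets_slo` and the sample data are hypothetical, chosen for illustration; only the 200 ms threshold and the 99.9% target come from the table above.

```python
def latency_sli(latencies_ms, threshold_ms=200.0):
    """Return the fraction of requests that completed in under threshold_ms."""
    if not latencies_ms:
        raise ValueError("need at least one request to compute an SLI")
    fast = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return fast / len(latencies_ms)


def meets_slo(sli, slo_target=0.999):
    """Return True when the measured SLI is at or above the SLO target."""
    return sli >= slo_target


# Hypothetical sample: 1,000 requests, 2 of them slower than 200 ms.
sample = [120.0] * 998 + [350.0, 800.0]
sli = latency_sli(sample)
print(f"SLI = {sli:.4f}, meets 99.9% SLO: {meets_slo(sli)}")
```

In this sample the SLI is 0.998, which falls short of the 99.9% objective even though only two requests were slow.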
Best practice: Set your SLO stricter than your SLA. If your SLA promises 99.9%, aim for an SLO of 99.95%. The gap is your safety margin.
Uptime Percentage and Allowed Downtime
| Uptime % | Common Name | Downtime per Day | Downtime per Month | Downtime per Year |
|---|---|---|---|---|
| 99% | Two nines | 14 min 24 sec | 7 hr 18 min | 3 days 15 hr 40 min |
| 99.5% | -- | 7 min 12 sec | 3 hr 39 min | 1 day 19 hr 50 min |
| 99.9% | Three nines | 1 min 26 sec | 43 min 50 sec | 8 hr 45 min 57 sec |
| 99.95% | Three and a half nines | 43 sec | 21 min 55 sec | 4 hr 22 min 58 sec |
| 99.99% | Four nines | 8.6 sec | 4 min 23 sec | 52 min 36 sec |
| 99.999% | Five nines | 0.86 sec | 26 sec | 5 min 15 sec |
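
The figures in this table can be derived from the uptime percentage alone. The sketch below is one way to do that in Python; it assumes a 365.25-day year and a month of one twelfth of that, which is the convention that matches the monthly and yearly columns above. The names `PERIOD_DAYS` and `allowed_downtime` are illustrative.

```python
from datetime import timedelta

DAYS_PER_YEAR = 365.25  # assumption: matches the table's monthly/yearly figures
PERIOD_DAYS = {
    "day": 1.0,
    "month": DAYS_PER_YEAR / 12,
    "year": DAYS_PER_YEAR,
}


def allowed_downtime(uptime_percent, period):
    """Return the allowed downtime for one period as a timedelta."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    downtime_fraction = 1 - uptime_percent / 100
    return timedelta(days=PERIOD_DAYS[period] * downtime_fraction)


for period in PERIOD_DAYS:
    print(period, allowed_downtime(99.9, period))
```

Because the month here is about 30.44 days long, the monthly values are slightly larger than figures computed for a fixed 30-day window.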
Error Budget Calculation
An error budget is the complement of your SLO: it quantifies how much unreliability your service can tolerate before breaching commitments. Error budgets enable data-driven decisions about feature velocity versus reliability investment.
Formula
Error Budget = 1 - SLO target
Example: SLO = 99.9%
Error Budget = 1 - 0.999 = 0.001 = 0.1%
In a 30-day month: 0.001 * 30 * 24 * 60 = 43.2 minutes of allowed downtime
| SLO | Error Budget | Monthly Budget (minutes) | Example Outage Impact |
|---|---|---|---|
| 99.9% | 0.1% | 43.2 min | A 15-min outage consumes 34.7% of monthly budget |
| 99.95% | 0.05% | 21.6 min | A 15-min outage consumes 69.4% of monthly budget |
| 99.99% | 0.01% | 4.32 min | A 5-min outage consumes about 116% of monthly budget, exceeding it |
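
The formula and the table rows above can be checked with a few lines of Python. This is a sketch under the same 30-day window used in the example; `error_budget_minutes` and `budget_consumed_percent` are hypothetical helper names, not part of any library.

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Return the allowed downtime in minutes for the window."""
    budget_fraction = 1 - slo_percent / 100
    return budget_fraction * window_days * 24 * 60


def budget_consumed_percent(outage_minutes, slo_percent, window_days=30):
    """Return the share of the window's error budget used by one outage."""
    return 100 * outage_minutes / error_budget_minutes(slo_percent, window_days)


# The three (SLO, outage length) pairs from the table above.
for slo, outage in [(99.9, 15), (99.95, 15), (99.99, 5)]:
    budget = error_budget_minutes(slo)
    used = budget_consumed_percent(outage, slo)
    print(f"SLO {slo}%: budget {budget:.2f} min, {outage}-min outage uses {used:.1f}%")
```

A result above 100% means a single outage of that length already exceeds the budget for the whole window.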
Practical Guidance
Choose the right number of nines
Each additional nine cuts the allowed downtime by a factor of ten, and the cost of achieving it usually rises steeply. Moving from 99.9% to 99.99% often requires redundant infrastructure, multi-region deployment, and sophisticated failover. Make sure the business value justifies the engineering investment.
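
As a rough illustration of why redundancy is the usual route to an extra nine, the Python sketch below estimates the availability of several replicas when any one healthy replica can serve traffic. It assumes replica failures are statistically independent and that failover is instantaneous, which real systems only approximate, so treat the result as an upper bound. `combined_availability` is a hypothetical helper name.

```python
def combined_availability(replica_availability, replicas):
    """Estimate availability when any one of `replicas` copies is enough.

    Assumption: failures are independent and failover is instantaneous.
    """
    if not 0 <= replica_availability <= 1:
        raise ValueError("replica_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at the same time.
    return 1 - (1 - replica_availability) ** replicas


for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these idealized assumptions, two 99% replicas reach roughly 99.99%; correlated failures and imperfect failover reduce that figure in practice.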
Define "downtime" precisely
Is a 500ms response time "down"? Is partial degradation "down"? Your SLA should define exactly what constitutes a breach. Common approaches: error rate above threshold, latency above threshold, or complete unavailability.
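
One way to remove ambiguity is to encode the breach definition as an explicit rule. The Python sketch below combines the three approaches mentioned above; the `DowntimePolicy` class, the `is_down` function, and both threshold values are hypothetical and would need to match whatever your SLA actually states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DowntimePolicy:
    """Hypothetical breach thresholds; real values belong in the SLA text."""
    max_error_rate: float = 0.05       # more than 5% failed requests counts as down
    max_p95_latency_ms: float = 500.0  # p95 latency above 500 ms counts as down


def is_down(reachable, error_rate, p95_latency_ms, policy=DowntimePolicy()):
    """Classify one measurement interval as down (True) or up (False)."""
    if not reachable:
        return True  # complete unavailability
    return (
        error_rate > policy.max_error_rate
        or p95_latency_ms > policy.max_p95_latency_ms
    )


print(is_down(True, 0.01, 300.0))   # healthy interval
print(is_down(True, 0.01, 650.0))   # slow interval, counted as down
print(is_down(False, 0.0, 0.0))     # unreachable interval
```

With the comparison written as strictly greater than, a p95 latency of exactly 500 ms does not count as down under this policy; the point is that the rule answers the question unambiguously.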
Use error budgets to negotiate
When the error budget is healthy, teams can take more risks with deployments. When the error budget is low, freeze non-critical changes and focus on reliability. This creates an objective framework for the speed-vs-stability tradeoff.
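
The policy described above can be sketched as a simple decision function in Python. The 25% `freeze_below` threshold and the names `remaining_budget_fraction` and `deployment_policy` are assumptions made for illustration; each team should pick its own threshold.

```python
def remaining_budget_fraction(slo_percent, downtime_minutes, window_days=30):
    """Return the fraction of the window's error budget still unspent."""
    budget_minutes = (1 - slo_percent / 100) * window_days * 24 * 60
    return 1 - downtime_minutes / budget_minutes


def deployment_policy(remaining_fraction, freeze_below=0.25):
    """Map remaining budget to a release stance (threshold is hypothetical)."""
    if remaining_fraction < freeze_below:
        return "freeze non-critical changes"
    return "normal release cadence"


# A 99.9% SLO with 15 and then 40 minutes of downtime in a 30-day window.
for downtime in (15, 40):
    remaining = remaining_budget_fraction(99.9, downtime)
    print(f"{downtime} min down: {remaining:.0%} left -> {deployment_policy(remaining)}")
```

A negative remaining fraction means the budget is already overspent, which this sketch also treats as a freeze.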