Status Page Design Best Practices

Design patterns and UX principles for building status pages that communicate clearly during both normal operations and incidents.

Component Hierarchy

Organize your status page components in a hierarchy that matches how users think about your service. Group related components and show rollup status at each level.

Degraded -- Overall Status
    Operational -- Web Application
        Dashboard, API Console, Account Settings
    Operational -- API
        REST API, GraphQL API, Webhooks
    Degraded -- Infrastructure
        US-East, US-West, EU-Central (degraded)

Rollup logic: A parent component's status is the worst status among its children. For example, if any child is "Major Outage," the parent shows "Major Outage."
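The rollup rule can be sketched as a small recursive function. This is a minimal illustration, not a reference implementation: the Component class and the SEVERITY ordering are assumptions made for the example, and the status identifiers are borrowed from the API section of this page. Where "underMaintenance" ranks relative to the other statuses is a policy decision.

```python
from dataclasses import dataclass, field

# Assumed severity order, least to most severe. Adjust to your own policy.
SEVERITY = [
    "operational",
    "underMaintenance",
    "degradedPerformance",
    "partialOutage",
    "majorOutage",
]


@dataclass
class Component:
    """Hypothetical component node; leaves carry a status, parents derive one."""
    name: str
    status: str = "operational"  # only meaningful for leaf components
    children: list = field(default_factory=list)  # list of Component


def rollup(component: Component) -> str:
    """Return the worst status found anywhere in this component's subtree."""
    if not component.children:
        return component.status
    child_statuses = (rollup(child) for child in component.children)
    return max(child_statuses, key=SEVERITY.index)


infrastructure = Component("Infrastructure", children=[
    Component("US-East"),
    Component("US-West"),
    Component("EU-Central", status="degradedPerformance"),
])
overall = Component("Overall Status", children=[
    Component("API", children=[Component("REST API"), Component("Webhooks")]),
    infrastructure,
])
print(rollup(overall))  # degradedPerformance
```

Computing the parent status on read, rather than storing it, keeps the parent from drifting out of sync with its children.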

Granularity tradeoff: Too few components and users cannot tell which part is affected. Too many and the page becomes noisy. Aim for 5-15 top-level components.

Design Approaches Comparison

Approach	Structure	Best For	Drawbacks
Simple	Single overall status indicator with a list of recent incidents	Single-product services, small teams, early-stage products	Cannot communicate partial outages; all-or-nothing
Grouped	Components grouped by category (e.g., API, Dashboard, Infrastructure) with per-component status	Multi-service platforms, SaaS products with distinct features	Requires maintenance; users must understand your component names
Detailed	Per-component status with uptime graphs, latency metrics, and historical incident timeline	Infrastructure providers, API-first products, enterprise customers	Information-dense; can overwhelm non-technical users

Metric Display Patterns

Uptime bar chart (90-day)

Show a horizontal bar of 90 vertical segments, one per day. Green for fully operational, yellow for degraded, red for outage, gray for no data. This is a widely recognized pattern, so most users can read it without a legend. Hovering over a segment shows the date and any incidents from that day.
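One way to produce the data behind such a bar is sketched below. The function name, the status keys ("operational", "degraded", "outage"), and the shape of the input dictionary are assumptions for illustration; rendering is left to the front end.

```python
from datetime import date, timedelta

# Assumed mapping from a day's worst status to a segment color.
COLORS = {"operational": "green", "degraded": "yellow", "outage": "red"}


def uptime_bar(daily_status: dict, today: date, days: int = 90) -> list:
    """Return (day, color) pairs, oldest first, one per day ending at `today`.

    `daily_status` maps a date to that day's worst status. Days missing
    from the mapping are treated as "no data" and colored gray.
    """
    start = today - timedelta(days=days - 1)
    segments = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        color = COLORS.get(daily_status.get(day), "gray")
        segments.append((day, color))
    return segments


history = {
    date(2026, 4, 12): "degraded",
    date(2026, 4, 11): "operational",
}
bar = uptime_bar(history, today=date(2026, 4, 12))
print(len(bar), bar[-1][1], bar[-2][1], bar[0][1])  # 90 yellow green gray
```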

Response time graph

Line chart showing p50 and p95 response times over the last 24 hours or 7 days. Use a consistent y-axis scale. Mark incident periods with a subtle background highlight so users can correlate latency spikes with known issues.

Uptime percentage display

Show the current uptime percentage for each component over 30, 60, and 90 day windows. Display to two decimal places (e.g., 99.98%). Avoid showing 100.00% unless it is literally true -- round down rather than up to maintain trust.
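The round-down rule is easy to get wrong with ordinary float formatting, which rounds to nearest. A minimal sketch using integer arithmetic follows; the function name and the choice of seconds as the unit are assumptions.

```python
def uptime_percent(up_seconds: int, total_seconds: int) -> str:
    """Format uptime as a percentage with two decimals, always rounding down."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    # Work in hundredths of a percent; floor division never rounds up.
    hundredths = (up_seconds * 10000) // total_seconds
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


THIRTY_DAYS = 30 * 24 * 60 * 60  # 2,592,000 seconds

print(uptime_percent(THIRTY_DAYS, THIRTY_DAYS))      # 100.00%
print(uptime_percent(THIRTY_DAYS - 1, THIRTY_DAYS))  # 99.99%
```

With nearest rounding, a single second of downtime in 30 days would still display as 100.00%; flooring shows 99.99% instead.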

Incident Timeline UX

The incident timeline is the most-read section of your status page during an outage. Design it for scannability.

Reverse chronological order

Newest updates at the top. Users arriving during an active incident want the latest information first.

Timestamps in UTC and local time

Display timestamps in UTC with the user's local time in parentheses. Use relative time (e.g., "12 minutes ago") for recent updates, absolute time for older entries.
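A sketch of this formatting rule is shown below. The one-hour cutoff between relative and absolute display is an assumption, as are the function name and the exact output format; on a real page the viewer's time zone would typically come from the browser.

```python
from datetime import datetime, timedelta, timezone

RELATIVE_CUTOFF = timedelta(hours=1)  # assumed threshold for "recent"


def format_timestamp(ts: datetime, now: datetime, local_tz: timezone) -> str:
    """Relative time for recent updates, UTC plus local time otherwise."""
    age = now - ts
    if timedelta(0) <= age < RELATIVE_CUTOFF:
        minutes = int(age.total_seconds() // 60)
        if minutes == 0:
            return "just now"
        unit = "minute" if minutes == 1 else "minutes"
        return f"{minutes} {unit} ago"
    utc = ts.astimezone(timezone.utc)
    local = ts.astimezone(local_tz)
    return f"{utc:%Y-%m-%d %H:%M} UTC ({local:%Y-%m-%d %H:%M} local)"


viewer_tz = timezone(timedelta(hours=-4))  # hypothetical viewer offset
posted = datetime(2026, 4, 12, 10, 15, tzinfo=timezone.utc)

print(format_timestamp(posted, posted + timedelta(minutes=12), viewer_tz))
# 12 minutes ago
print(format_timestamp(posted, posted + timedelta(days=2), viewer_tz))
# 2026-04-12 10:15 UTC (2026-04-12 06:15 local)
```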

Status badges

Color-coded badges (Investigating, Identified, Monitoring, Resolved) provide instant visual scanning. Use consistent colors across the entire status page.

Collapse resolved incidents

After 24-48 hours, collapse resolved incidents to a single line to keep the current view focused. Provide an "Incident History" page for the full archive.
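The collapse rule amounts to a single age check on resolved incidents. In this sketch the incident dictionary shape and the "resolvedAt" field are hypothetical, and the 24-hour threshold is one choice from the suggested range.

```python
from datetime import datetime, timedelta, timezone

COLLAPSE_AFTER = timedelta(hours=24)  # pick a value in the 24-48 hour range


def is_collapsed(incident: dict, now: datetime) -> bool:
    """True if the incident is resolved and old enough to show as one line."""
    resolved_at = incident.get("resolvedAt")  # hypothetical field
    if incident["status"] != "resolved" or resolved_at is None:
        return False
    return now - resolved_at >= COLLAPSE_AFTER


now = datetime(2026, 4, 14, 12, 0, tzinfo=timezone.utc)
old = {"status": "resolved", "resolvedAt": now - timedelta(hours=30)}
fresh = {"status": "resolved", "resolvedAt": now - timedelta(hours=2)}
active = {"status": "monitoring"}

print(is_collapsed(old, now), is_collapsed(fresh, now), is_collapsed(active, now))
# True False False
```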

Subscriber Notifications

Channel	Use Case	Latency	Considerations
Email	All incidents and maintenance windows	Minutes	Highest reach; risk of spam filtering
Webhook	Integration with internal tools (Slack, PagerDuty)	Seconds	Requires implementation; most flexible
RSS/Atom	Technical users and monitoring tools	Depends on poll interval	Low maintenance; no user management needed
SMS	Critical incidents (SEV1/SEV2) only	Seconds	Higher cost; use sparingly to avoid notification fatigue
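The table's routing rules can be expressed as a small function. This is a sketch under assumptions: the function name is hypothetical, webhooks are assumed to receive every event (the table does not restrict them by severity), and RSS/Atom is omitted because subscribers pull it rather than being pushed to.

```python
def channels_for(severity: str, is_maintenance: bool = False) -> list:
    """Return the push channels to notify for an incident or maintenance window."""
    # Email covers all incidents and maintenance windows.
    # Assumption: webhooks also receive everything.
    channels = ["email", "webhook"]
    # SMS is reserved for critical incidents to limit cost and fatigue.
    if not is_maintenance and severity in {"SEV1", "SEV2"}:
        channels.append("sms")
    return channels


print(channels_for("SEV1"))  # ['email', 'webhook', 'sms']
print(channels_for("SEV3"))  # ['email', 'webhook']
```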

API Status Endpoint Design

Provide a machine-readable API endpoint so customers can programmatically check your status and integrate it into their own monitoring.

Recommended response format

GET /api/v1/status

{
  "status": {
    "indicator": "minor",
    "description": "Minor System Outage"
  },
  "components": [
    {
      "id": "api",
      "name": "REST API",
      "status": "operational",
      "updatedAt": "2026-04-12T10:30:00Z"
    },
    {
      "id": "dashboard",
      "name": "Web Dashboard",
      "status": "degradedPerformance",
      "updatedAt": "2026-04-12T10:25:00Z"
    }
  ],
  "activeIncidents": [
    {
      "id": "inc-2026-0412",
      "title": "Elevated dashboard latency",
      "status": "identified",
      "severity": "SEV3",
      "createdAt": "2026-04-12T10:15:00Z",
      "updatedAt": "2026-04-12T10:25:00Z"
    }
  ],
  "scheduledMaintenances": []
}

Status values: operational, degradedPerformance, partialOutage, majorOutage, underMaintenance
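On the consuming side, a customer's monitoring script might parse this response roughly as follows. The sketch assumes the response shape shown above; the function name is hypothetical, and the HTTP request itself is omitted so the example stays self-contained.

```python
import json


def affected_components(status_json: str) -> list:
    """Return (name, status) for every component that is not operational."""
    data = json.loads(status_json)
    return [
        (component["name"], component["status"])
        for component in data["components"]
        if component["status"] != "operational"
    ]


# Trimmed copy of the sample response above.
sample = """
{
  "status": {"indicator": "minor", "description": "Minor System Outage"},
  "components": [
    {"id": "api", "name": "REST API", "status": "operational"},
    {"id": "dashboard", "name": "Web Dashboard", "status": "degradedPerformance"}
  ],
  "activeIncidents": [],
  "scheduledMaintenances": []
}
"""
print(affected_components(sample))  # [('Web Dashboard', 'degradedPerformance')]
```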

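Response headers: Include a Cache-Control header with a max-age of 30-60 seconds and an ETag, so that clients and intermediate caches can reuse or revalidate responses instead of fetching the full body on every poll.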
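A framework-agnostic sketch of these headers follows. The function name and return shape are assumptions, and deriving the ETag from a hash of the serialized body is one common approach rather than a requirement. A client that sends back the ETag in If-None-Match receives a 304 with no body when nothing has changed.

```python
import hashlib
import json


def build_response(payload: dict, if_none_match=None, max_age: int = 30):
    """Return (status_code, headers, body) for a status API response."""
    # Serialize deterministically so identical payloads yield identical ETags.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"Cache-Control": f"public, max-age={max_age}", "ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is current; send no body
    headers["Content-Type"] = "application/json"
    return 200, headers, body


payload = {"status": {"indicator": "minor"}, "components": []}
code, headers, body = build_response(payload)
print(code, headers["Cache-Control"])  # 200 public, max-age=30

# A repeat poll that echoes the ETag gets an empty 304.
code2, _, body2 = build_response(payload, if_none_match=headers["ETag"])
print(code2, len(body2))  # 304 0
```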

Availability: Host the status API on separate infrastructure from your main service so it remains available during outages.