Status Page Design Best Practices

Design patterns and UX principles for building status pages that communicate clearly during both normal operations and incidents.

Component Hierarchy

Organize your status page components in a hierarchy that matches how users think about your service. Group related components and show rollup status at each level.

Degraded -- Overall Status
  Operational -- Web Application
    Dashboard, API Console, Account Settings
  Operational -- API
    REST API, GraphQL API, Webhooks
  Degraded -- Infrastructure
    US-East, US-West, EU-Central (degraded)

Rollup logic: A parent component's status is the worst status among its children. If any child is "Major Outage," the parent shows "Major Outage."
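The rollup rule can be sketched as a small recursive function. This is a minimal illustration, not a reference implementation: the `Component` class and `rollup` function are hypothetical names, and the severity ordering (in particular where `underMaintenance` ranks) is an assumption you should adapt to your own policy.

```python
from dataclasses import dataclass, field

# Least to most severe. The ordering is an assumption for illustration;
# where underMaintenance belongs depends on your own policy.
SEVERITY = [
    "operational",
    "underMaintenance",
    "degradedPerformance",
    "partialOutage",
    "majorOutage",
]


@dataclass
class Component:
    name: str
    status: str = "operational"
    children: list["Component"] = field(default_factory=list)


def rollup(component: Component) -> str:
    """Return a leaf's own status, or the worst status among a parent's children."""
    if not component.children:
        return component.status
    return max((rollup(child) for child in component.children), key=SEVERITY.index)
```

With three regions under an "Infrastructure" parent, one of them set to `degradedPerformance`, `rollup` reports `degradedPerformance` for the parent and for every ancestor above it.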

Granularity tradeoff: Too few components and users cannot tell which part is affected. Too many and the page becomes noisy. Aim for 5-15 top-level components.

Design Approaches Comparison

| Approach | Structure | Best For | Drawbacks |
|---|---|---|---|
| Simple | Single overall status indicator with a list of recent incidents | Single-product services, small teams, early-stage products | Cannot communicate partial outages; all-or-nothing |
| Grouped | Components grouped by category (e.g., API, Dashboard, Infrastructure) with per-component status | Multi-service platforms, SaaS products with distinct features | Requires maintenance; users must understand your component names |
| Detailed | Per-component status with uptime graphs, latency metrics, and historical incident timeline | Infrastructure providers, API-first products, enterprise customers | Information-dense; can overwhelm non-technical users |

Metric Display Patterns

Uptime bar chart (90-day)

Show a horizontal bar of 90 vertical segments, one per day. Green for fully operational, yellow for degraded, red for outage, gray for no data. This is the most recognized pattern in the industry. Hovering a segment shows the date and any incidents.
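As a rough sketch of the data behind the bar, the function below turns a per-day status map into 90 colored segments, oldest first. The `uptime_bar` name, the `daily_status` input shape, and the three status keys are assumptions made for illustration; days with no recorded status fall back to gray.

```python
from datetime import date, timedelta

# Hypothetical mapping from a day's worst status to a segment color.
COLORS = {"operational": "green", "degraded": "yellow", "outage": "red"}


def uptime_bar(daily_status: dict[date, str], today: date, days: int = 90) -> list[tuple[date, str]]:
    """Return one (day, color) pair per day in the window, oldest first."""
    start = today - timedelta(days=days - 1)
    segments = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        # Missing days (or unknown statuses) render as gray: "no data".
        color = COLORS.get(daily_status.get(day), "gray")
        segments.append((day, color))
    return segments
```

The rendering layer can then map each pair to a vertical segment and use the date for the hover tooltip.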

Response time graph

Line chart showing p50 and p95 response times over the last 24 hours or 7 days. Use a consistent y-axis scale. Mark incident periods with a subtle background highlight so users can correlate latency spikes with known issues.
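For reference, p50 and p95 can be computed from raw latency samples as sketched below. This uses the nearest-rank definition, which is only one of several common percentile definitions; the `percentile` helper is a hypothetical name, and a production metrics pipeline would more likely rely on its own aggregation.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Multiply before dividing to keep the rank arithmetic exact for whole-number inputs.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Plotting both series on the same axis makes it visible when the tail (p95) degrades while the median (p50) stays flat.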

Uptime percentage display

Show the current uptime percentage for each component over 30-, 60-, and 90-day windows. Display to two decimal places (e.g., 99.98%). Avoid showing 100.00% unless it is literally true -- round down rather than up to maintain trust.
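One way to guarantee round-down behavior is to truncate with decimal arithmetic instead of relying on default float formatting, which rounds to nearest. The sketch below assumes uptime is tracked as whole seconds; `uptime_percent` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_DOWN


def uptime_percent(up_seconds: int, total_seconds: int) -> str:
    """Format uptime as a percentage truncated (never rounded up) to two decimals."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    pct = Decimal(up_seconds) * 100 / Decimal(total_seconds)
    return f"{pct.quantize(Decimal('0.01'), rounding=ROUND_DOWN)}%"
```

With this approach, a single second of downtime in a 30-day window displays as 99.99% rather than 100.00%.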

Incident Timeline UX

The incident timeline is the most-read section of your status page during an outage. Design it for scannability.

Reverse chronological order

Newest updates at the top. Users arriving during an active incident want the latest information first.

Timestamps in UTC and local time

Display timestamps in UTC with the user's local time in parentheses. Use relative time (e.g., "12 minutes ago") for recent updates, absolute time for older entries.
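A minimal sketch of this formatting rule follows. The one-hour cutoff between relative and absolute display is an assumption (the guideline above does not fix a threshold), and `format_timestamp` is a hypothetical helper; in a browser, the local time zone would normally come from the client.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Assumed threshold: updates newer than this are shown as relative time.
RELATIVE_CUTOFF = timedelta(hours=1)


def format_timestamp(ts: datetime, now: datetime, local_tz: tzinfo) -> str:
    """Relative time for recent updates, UTC plus local time for older ones."""
    age = now - ts
    if timedelta(0) <= age < RELATIVE_CUTOFF:
        minutes = int(age.total_seconds() // 60)
        if minutes < 1:
            return "just now"
        unit = "minute" if minutes == 1 else "minutes"
        return f"{minutes} {unit} ago"
    utc = ts.astimezone(timezone.utc)
    local = ts.astimezone(local_tz)
    return f"{utc:%Y-%m-%d %H:%M} UTC ({local:%Y-%m-%d %H:%M} local)"
```

Both `ts` and `now` are expected to be timezone-aware; mixing naive and aware datetimes raises a `TypeError` in Python.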

Status badges

Color-coded badges (Investigating, Identified, Monitoring, Resolved) provide instant visual scanning. Use consistent colors across the entire status page.

Collapse resolved incidents

After 24-48 hours, collapse resolved incidents to a single line to keep the current view focused. Provide an "Incident History" page for the full archive.
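The ordering and collapse rules can be combined in one pass over the incident list. The sketch below is illustrative only: the `Incident` fields and the `timeline` function are hypothetical, and the 48-hour threshold is one choice from the 24-48 hour range suggested above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold, taken from the upper end of the 24-48 hour range.
COLLAPSE_AFTER = timedelta(hours=48)


@dataclass
class Incident:
    title: str
    updated_at: datetime
    resolved_at: Optional[datetime] = None  # None while the incident is active


def timeline(incidents: list[Incident], now: datetime) -> list[tuple[str, bool]]:
    """Return (title, collapsed) pairs, most recently updated first."""
    ordered = sorted(incidents, key=lambda inc: inc.updated_at, reverse=True)
    return [
        (
            inc.title,
            # Only resolved incidents older than the threshold collapse.
            inc.resolved_at is not None and now - inc.resolved_at >= COLLAPSE_AFTER,
        )
        for inc in ordered
    ]
```

Active incidents never collapse, regardless of age, because `resolved_at` stays `None` until they are closed.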

Subscriber Notifications

| Channel | Use Case | Latency | Considerations |
|---|---|---|---|
| Email | All incidents and maintenance windows | Minutes | Highest reach; risk of spam filtering |
| Webhook | Integration with internal tools (Slack, PagerDuty) | Seconds | Requires implementation; most flexible |
| RSS/Atom | Technical users and monitoring tools | Depends on poll interval | Low maintenance; no user management needed |
| SMS | Critical incidents (SEV1/SEV2) only | Seconds | Higher cost; use sparingly to avoid notification fatigue |
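The channel rules in the table can be expressed as a simple routing function. This is a sketch under assumptions: `channels_for` is a hypothetical name, and sending every incident to webhook and RSS subscribers is an assumed default, since the table only restricts SMS by severity.

```python
# Severities that justify an SMS, per the table above.
SMS_SEVERITIES = {"SEV1", "SEV2"}


def channels_for(severity: str) -> list[str]:
    """Return the notification channels to use for an incident of the given severity."""
    # Assumption: email, webhook, and RSS/Atom receive every incident.
    channels = ["email", "webhook", "rss"]
    if severity in SMS_SEVERITIES:
        channels.append("sms")
    return channels
```

Keeping the SMS condition in one place makes it easy to audit how often subscribers are paged.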

API Status Endpoint Design

Provide a machine-readable API endpoint so customers can programmatically check your status and integrate it into their own monitoring.

Recommended response format

GET /api/v1/status

{
  "status": {
    "indicator": "minor",
    "description": "Minor System Outage"
  },
  "components": [
    {
      "id": "api",
      "name": "REST API",
      "status": "operational",
      "updatedAt": "2026-04-12T10:30:00Z"
    },
    {
      "id": "dashboard",
      "name": "Web Dashboard",
      "status": "degradedPerformance",
      "updatedAt": "2026-04-12T10:25:00Z"
    }
  ],
  "activeIncidents": [
    {
      "id": "inc-2026-0412",
      "title": "Elevated dashboard latency",
      "status": "identified",
      "severity": "SEV3",
      "createdAt": "2026-04-12T10:15:00Z",
      "updatedAt": "2026-04-12T10:25:00Z"
    }
  ],
  "scheduledMaintenances": []
}

Status values: operational, degradedPerformance, partialOutage, majorOutage, underMaintenance

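Response headers: Include a Cache-Control header with a max-age of 30-60 seconds and an ETag so that clients can revalidate with conditional requests instead of re-downloading the body, which reduces polling load.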
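A framework-independent sketch of these caching headers is shown below. The `status_response` function is hypothetical, deriving the ETag from a hash of the serialized body is one possible choice, and the `If-None-Match` handling is simplified to a single exact match (it ignores lists, weak validators, and `*`).

```python
import hashlib
import json
from typing import Optional


def status_response(payload: dict, if_none_match: Optional[str] = None, max_age: int = 30):
    """Return (status_code, headers, body) for a status API request."""
    # Serialize deterministically so identical payloads yield identical ETags.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"Cache-Control": f"public, max-age={max_age}", "ETag": etag}
    if if_none_match == etag:
        # Client already holds the current representation: send no body.
        return 304, headers, b""
    headers["Content-Type"] = "application/json"
    return 200, headers, body
```

A polling client that echoes the previous ETag in `If-None-Match` receives an empty 304 response until the status actually changes.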

Availability: Host the status API on separate infrastructure from your main service so it remains available during outages.