Reliability Overview

The Reliability area is where you set, monitor, and respond to reliability promises about your services. Define a service-level objective (SLO), track compliance over time, and get paged before the error budget runs out.

This page is the workspace’s reliability home — the hub you land on when you click Reliability in the left nav.

From top to bottom, the Reliability Overview hub contains:

  1. Status strip — three clickable chips showing total SLOs, breached SLOs, and currently firing burn-rate alerts. Each chip deep-links into the matching filtered view.
  2. Burn-rate alerts firing now — cards for every currently firing burn-rate alert, deduplicated by parent SLO. Each card opens the parent SLO detail. When nothing is firing, the empty state shows a message such as “All alerts quiet”.
  3. Breached SLOs — the top 5 breached SLOs, shown as compact cards with compliance, target, and window. A “See all N →” link appears when there are more than 5 breached SLOs.
  4. Quick start — three action cards:
  5. How SLOs work — a single concept paragraph for first-timers.

Reliability Overview hub

An SLO is a reliability promise written as “this service should achieve X% over a Y-day window.” You pick the signal you care about (the SLI), the target percentage, and the time window. KloudMate evaluates compliance on a schedule and alerts you when the error budget runs low.
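The arithmetic behind that promise can be sketched in a few lines of Python. This is an illustration only, not KloudMate's evaluation logic: it assumes an event-based SLI (good events out of total events), and the `SLO` class, its field names, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """Hypothetical SLO definition; field names are illustrative, not KloudMate's schema."""
    name: str
    target_pct: float  # X in "X% over a Y-day window"
    window_days: int   # Y


def compliance_pct(good_events: int, total_events: int) -> float:
    """Percentage of events in the window that met the SLI."""
    if total_events == 0:
        return 100.0  # assumption: a window with no traffic counts as compliant
    return 100.0 * good_events / total_events


def error_budget_remaining(slo: SLO, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent; negative once the SLO is breached."""
    allowed_bad = (1 - slo.target_pct / 100.0) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    slo = SLO(name="checkout-availability", target_pct=99.9, window_days=30)
    good, total = 999_750, 1_000_000
    print(f"compliance: {compliance_pct(good, total):.3f}%")
    print(f"budget remaining: {error_budget_remaining(slo, good, total):.0%}")
```

With a 99.9% target and 1,000,000 requests in the window, the budget allows about 1,000 failed requests. Having 250 failures leaves roughly three quarters of the budget unspent, and more than 1,000 failures drives the remaining budget below zero, which corresponds to a breached SLO.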