Kubernetes Component SLI Metrics

    By default, Kubernetes 1.27 publishes Service Level Indicator (SLI) metrics for each Kubernetes component binary. This metric endpoint is exposed on the serving HTTPS port of each component, at the path /metrics/slis. The ComponentSLIs feature gate defaults to enabled for each Kubernetes component as of v1.27.
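
    As a minimal sketch, the endpoint can be scraped with a plain HTTPS GET. The example assumes the caller authenticates with a bearer token that is authorized to read the path; the function names, host, token, and CA file are hypothetical placeholders, and your cluster may require a different authentication method.

```python
import ssl
import urllib.request


def build_slis_request(host: str, port: int, token: str) -> urllib.request.Request:
    """Build a GET request for a component's /metrics/slis endpoint."""
    url = f"https://{host}:{port}/metrics/slis"
    # Assumption: bearer-token authentication is accepted by the component.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_slis(host: str, port: int, token: str, ca_file: str) -> str:
    """Return the raw Prometheus text exposition served by the endpoint."""
    # Verify the serving certificate against the CA bundle given by the caller.
    context = ssl.create_default_context(cafile=ca_file)
    request = build_slis_request(host, port, token)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read().decode("utf-8")
```

    For the API server, which serves HTTPS on port 6443 by default, a call might look like fetch_slis("127.0.0.1", 6443, token, "/path/to/ca.crt"), where the token and CA path are supplied by you.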

    With SLI metrics enabled, each Kubernetes component exposes two metrics, labeled per healthcheck:

    • a gauge (which represents the current state of the healthcheck)
    • a counter (which records the cumulative counts observed for each healthcheck state)

    You can use the metric information to calculate per-component availability statistics. For example, the API server checks the health of etcd, so you can work out and report how available or unavailable etcd has been, as reported by its client, the API server.
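
    Because the counter is cumulative, availability over a time window comes from the difference between two scrapes. The sketch below is one way to express that; the function name and the snapshot values in the usage note are hypothetical, and only the "success" and "error" status values are assumed.

```python
from typing import Dict, Optional


def availability(earlier: Dict[str, int], later: Dict[str, int]) -> Optional[float]:
    """Fraction of successful checks between two scrapes of one healthcheck.

    Each argument maps a status ("success" or "error") to its cumulative count.
    """
    success = later.get("success", 0) - earlier.get("success", 0)
    error = later.get("error", 0) - earlier.get("error", 0)
    # A negative delta means the counter was reset (for example, the component
    # restarted), so the window cannot be evaluated from these two scrapes.
    if success < 0 or error < 0:
        return None
    total = success + error
    # No checks were recorded in the window.
    if total == 0:
        return None
    return success / total
```

    With illustrative snapshots, availability({"success": 15, "error": 0}, {"success": 44, "error": 1}) covers 30 checks, 29 of which succeeded.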

    The counter data looks like this:

    # HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthcheck.
    kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1
    kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14
    kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
    kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
    kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1
    kubernetes_healthchecks_total{name="log",status="success",type="healthz"} 15
    kubernetes_healthchecks_total{name="log",status="success",type="readyz"} 15
    kubernetes_healthchecks_total{name="ping",status="success",type="healthz"} 15
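
    The counter output above can be turned into a success ratio per healthcheck. The following is a rough sketch that parses a few of those lines with regular expressions; it is a simplified reader for this one metric, not a full Prometheus text-format parser, and the function names are hypothetical. The ratio covers the whole lifetime of the counters, not a recent window.

```python
import re
from collections import defaultdict

# A few lines copied from the sample counter output.
SAMPLE = """\
# HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthcheck.
kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14
kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1
"""

# Simplification: label values containing "}" or escaped quotes are not handled.
LINE_RE = re.compile(r'^kubernetes_healthchecks_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_healthcheck_counters(text):
    """Map (name, type) to a dict of status -> cumulative count."""
    counts = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips comments such as the HELP line, and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (labels["name"], labels["type"])
        counts[key][labels["status"]] = float(match.group("value"))
    return dict(counts)


def success_ratio(by_status):
    """Share of successful checks, or None if nothing was recorded."""
    total = sum(by_status.values())
    return by_status.get("success", 0.0) / total if total else None


if __name__ == "__main__":
    for (name, check_type), by_status in sorted(parse_healthcheck_counters(SAMPLE).items()):
        print(name, check_type, success_ratio(by_status))
```

    For the sample, the autoregister-completion readyz check succeeded 14 times out of 15, while the etcd healthz check succeeded every time.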

    Using this data