Kubernetes Component SLI Metrics

    As an alpha feature, Kubernetes lets you configure Service Level Indicator (SLI) metrics for each Kubernetes component binary. This metric endpoint is exposed on the serving HTTPS port of each component, at the path /metrics/slis. You must enable the ComponentSLIs feature gate for every component from which you want to scrape SLI metrics.
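    As a minimal sketch of scraping this endpoint, the Python below builds and sends an authenticated HTTPS request to /metrics/slis. The host, port, bearer token, and CA bundle are assumptions for illustration; none of them come from this page, and your cluster's authentication setup may differ.

```python
import ssl
import urllib.request


def build_sli_request(host, port, token):
    """Build an HTTPS GET request for a component's /metrics/slis endpoint."""
    url = f"https://{host}:{port}/metrics/slis"
    request = urllib.request.Request(url, method="GET")
    # Assumption: the component accepts a bearer token that is authorized
    # to read the metrics path.
    request.add_header("Authorization", f"Bearer {token}")
    return request


def fetch_sli_metrics(request, ca_file, timeout=5.0):
    """Send the request and return the metrics text.

    ca_file is a hypothetical path to the CA bundle that signed the
    component's serving certificate.
    """
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return resp.read().decode("utf-8")
```

    For example, build_sli_request("localhost", 6443, token) targets a locally reachable API server, assuming it serves HTTPS on port 6443. The request only returns SLI data if the ComponentSLIs feature gate is enabled on that component.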

    With the feature gate enabled, each component exposes two metrics, labeled per healthcheck:

    • a gauge (which represents the current state of the healthcheck)
    • a counter (which records the cumulative counts observed for each healthcheck state)

    You can use the metric information to calculate per-component availability statistics. For example, the API server checks the health of etcd. You can work out and report how available or unavailable etcd has been, as reported by its client, the API server.

    The counter data looks like this:

    # HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthcheck.
    kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1
    kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14
    kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
    kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
    kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1
    kubernetes_healthchecks_total{name="log",status="success",type="healthz"} 15
    kubernetes_healthchecks_total{name="log",status="success",type="readyz"} 15
    kubernetes_healthchecks_total{name="ping",status="success",type="healthz"} 15
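
    One way to turn the counter samples above into per-check availability is to divide the "success" count by the total count for each (name, type) pair. The sketch below is illustrative only: the parsing regexes and the success_ratios helper are hypothetical names, not part of Kubernetes, and the parser handles just the simple label format shown in the listing.

```python
import re
from collections import defaultdict

# Counter samples copied from the listing above (the HELP line is omitted).
SAMPLE = """\
kubernetes_healthchecks_total{name="autoregister-completion",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="autoregister-completion",status="success",type="readyz"} 14
kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
kubernetes_healthchecks_total{name="informer-sync",status="error",type="readyz"} 1
kubernetes_healthchecks_total{name="log",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="log",status="success",type="readyz"} 15
kubernetes_healthchecks_total{name="ping",status="success",type="healthz"} 15
"""

# One sample line: metric name, label set in braces, then a numeric value.
LINE_RE = re.compile(
    r'^kubernetes_healthchecks_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def success_ratios(text):
    """Return {(name, type): fraction of recorded checks that succeeded}."""
    totals = defaultdict(float)
    successes = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (labels["name"], labels["type"])
        value = float(match.group("value"))
        totals[key] += value
        if labels["status"] == "success":
            successes[key] += value
    return {key: successes[key] / totals[key] for key in totals if totals[key] > 0}


if __name__ == "__main__":
    for (name, check_type), ratio in sorted(success_ratios(SAMPLE).items()):
        print(f"{name:25s} {check_type:8s} {ratio:.3f}")
```

    Because the counter is cumulative, these ratios cover the whole period since the counter started. To measure availability over a specific window, subtract the counts from an earlier scrape before dividing.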

    Using this data