Kubernetes Metrics Reference

    This page details the metrics that different Kubernetes components export. You can query the metrics endpoint for these components using an HTTP scrape, and fetch the current metrics data in Prometheus format.
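As a rough illustration of the scrape described above, the following Python sketch fetches a metrics endpoint and parses the Prometheus text format into help strings, types, and samples. The `fetch_metrics` and `parse_metrics` helpers, the bearer-token authentication, the optional TLS context, and the sample values in `SAMPLE` are assumptions made for this example; they are not part of any Kubernetes client library, and the parser covers only the common cases of the format (it does not unescape label values).

```python
import re
import ssl
import urllib.request
from typing import Dict, List, NamedTuple, Optional, Tuple


class Sample(NamedTuple):
    name: str
    labels: Dict[str, str]
    value: float


# metric_name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+\S+)?$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(url: str, token: str,
                  context: Optional[ssl.SSLContext] = None) -> str:
    """Hypothetical helper: GET a metrics endpoint and return the response body.

    Assumes bearer-token authentication; pass an SSLContext that trusts the
    cluster CA if the endpoint uses a private certificate authority.
    """
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request, timeout=10, context=context) as response:
        return response.read().decode("utf-8")


def parse_metrics(text: str) -> Tuple[Dict[str, str], Dict[str, str], List[Sample]]:
    """Split Prometheus text-format output into help strings, types, and samples."""
    help_text: Dict[str, str] = {}
    types: Dict[str, str] = {}
    samples: List[Sample] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# HELP "):
            name, _, doc = line[len("# HELP "):].partition(" ")
            help_text[name] = doc
        elif line.startswith("# TYPE "):
            name, _, kind = line[len("# TYPE "):].partition(" ")
            types[name] = kind
        elif line.startswith("#"):
            continue  # any other comment line
        else:
            match = SAMPLE_RE.match(line)
            if match is None:
                continue  # skip lines this simplified parser does not understand
            labels = dict(LABEL_RE.findall(match.group(2) or ""))
            samples.append(Sample(match.group(1), labels, float(match.group(3))))
    return help_text, types, samples


# Illustrative input only: the metric names come from the list on this page,
# but the values are invented for the example.
SAMPLE = """\
# HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 12
# HELP apiserver_audit_level_total Counter of policy levels for audit events (1 per request).
# TYPE apiserver_audit_level_total counter
apiserver_audit_level_total{level="Metadata"} 7
apiserver_audit_level_total{level="None"} 5
"""

if __name__ == "__main__":
    _, metric_types, parsed = parse_metrics(SAMPLE)
    for sample in parsed:
        print(sample.name, metric_types.get(sample.name), sample.labels, sample.value)
```

Against a live cluster, the text returned by `fetch_metrics` could be passed straight to `parse_metrics`; the endpoint URL and the credentials depend on the component and on how the cluster is configured.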

    List of Alpha Kubernetes Metrics

    Name | Stability Level | Type | Help | Labels | Const Labels
    aggregator_openapi_v2_regeneration_countALPHACounterCounter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.
    apiservice
    reason
    None
    aggregator_openapi_v2_regeneration_durationALPHAGaugeGauge of OpenAPI v2 spec regeneration duration in seconds.
    reason
    None
    aggregator_unavailable_apiserviceALPHACustomGauge of APIServices which are marked as unavailable broken down by APIService name.
    name
    None
    aggregator_unavailable_apiservice_totalALPHACounterCounter of APIServices which are marked as unavailable broken down by APIService name and reason.
    name
    reason
    None
    apiextensions_openapi_v2_regeneration_countALPHACounterCounter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.
    crd
    reason
    None
    apiextensions_openapi_v3_regeneration_countALPHACounterCounter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.
    crd
    group
    reason
    version
    None
    apiserver_admission_step_admission_duration_seconds_summaryALPHASummaryAdmission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).
    operation
    rejected
    type
    None
    apiserver_admission_webhook_fail_open_countALPHACounterAdmission webhook fail open count, identified by name and broken out for each admission type (validating or mutating).
    name
    type
    None
    apiserver_admission_webhook_rejection_countALPHACounterAdmission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
    error_type
    name
    operation
    rejection_code
    type
    None
    apiserver_admission_webhook_request_totalALPHACounterAdmission webhook request total, identified by name and broken out for each admission type (validating or mutating) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
    code
    name
    operation
    rejected
    type
    None
    apiserver_audit_error_totalALPHACounterCounter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.
    plugin
    None
    apiserver_audit_event_totalALPHACounterCounter of audit events generated and sent to the audit backend.NoneNone
    apiserver_audit_level_totalALPHACounterCounter of policy levels for audit events (1 per request).
    level
    None
    apiserver_audit_requests_rejected_totalALPHACounterCounter of apiserver requests rejected due to an error in audit logging backend.NoneNone
    apiserver_cache_list_fetched_objects_totalALPHACounterNumber of objects read from watch cache in the course of serving a LIST request
    index
    resource_prefix
    None
    apiserver_cache_list_returned_objects_totalALPHACounterNumber of objects returned for a LIST request from watch cache
    resource_prefix
    None
    apiserver_cache_list_totalALPHACounterNumber of LIST requests served from watch cache
    index
    resource_prefix
    None
    apiserver_cel_compilation_duration_secondsALPHAHistogramNoneNone
    apiserver_cel_evaluation_duration_secondsALPHAHistogramNoneNone
    apiserver_certificates_registry_csr_honored_duration_totalALPHACounterTotal number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)
    signerName
    None
    apiserver_certificates_registry_csr_requested_duration_totalALPHACounterTotal number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)
    signerName
    None
    apiserver_client_certificate_expiration_secondsALPHAHistogramDistribution of the remaining lifetime on the certificate used to authenticate a request.NoneNone
    apiserver_crd_webhook_conversion_duration_secondsALPHAHistogramCRD webhook conversion duration in seconds
    crd_name
    from_version
    succeeded
    to_version
    None
    apiserver_current_inqueue_requestsALPHAGaugeMaximal number of queued requests in this apiserver per request kind in last second.
    request_kind
    None
    apiserver_delegated_authn_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
    code
    None
    apiserver_delegated_authn_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
    code
    None
    apiserver_delegated_authz_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
    code
    None
    apiserver_delegated_authz_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
    code
    None
    apiserver_egress_dialer_dial_duration_secondsALPHAHistogramDial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)
    protocol
    transport
    None
    apiserver_egress_dialer_dial_failure_countALPHACounterDial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed
    protocol
    stage
    transport
    None
    apiserver_envelope_encryption_dek_cache_fill_percentALPHAGaugePercent of the cache slots currently occupied by cached DEKs.NoneNone
    apiserver_envelope_encryption_dek_cache_inter_arrival_time_secondsALPHAHistogramTime (in seconds) of inter arrival of transformation requests.
    transformation_type
    None
    apiserver_flowcontrol_current_executing_requestsALPHAGaugeNumber of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_current_inqueue_requestsALPHAGaugeNumber of requests currently pending in queues of the API Priority and Fairness subsystem
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_current_rALPHAGaugeR(time of last change)
    priority_level
    None
    apiserver_flowcontrol_dispatch_rALPHAGaugeR(time of last dispatch)
    priority_level
    None
    apiserver_flowcontrol_dispatched_requests_totalALPHACounterNumber of requests executed by API Priority and Fairness subsystem
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_epoch_advance_totalALPHACounterNumber of times the queueset’s progress meter jumped backward
    priority_level
    success
    None
    apiserver_flowcontrol_latest_sALPHAGaugeS(most recently dispatched request)
    priority_level
    None
    apiserver_flowcontrol_next_discounted_s_boundsALPHAGaugemin and max, over queues, of S(oldest waiting request in queue) - estimated work in progress
    bound
    priority_level
    None
    apiserver_flowcontrol_next_s_boundsALPHAGaugemin and max, over queues, of S(oldest waiting request in queue)
    bound
    priority_level
    None
    apiserver_flowcontrol_priority_level_request_utilizationALPHATimingRatioHistogramObservations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)
    phase
    priority_level
    None
    apiserver_flowcontrol_priority_level_seat_utilizationALPHATimingRatioHistogramObservations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)
    priority_level
    map[phase:executing]
    apiserver_flowcontrol_read_vs_write_current_requestsALPHATimingRatioHistogramObservations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution
    phase
    request_kind
    None
    apiserver_flowcontrol_rejected_requests_totalALPHACounterNumber of requests rejected by API Priority and Fairness subsystem
    flow_schema
    priority_level
    reason
    None
    apiserver_flowcontrol_request_concurrency_in_useALPHAGaugeConcurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_request_concurrency_limitALPHAGaugeShared concurrency limit in the API Priority and Fairness subsystem
    priority_level
    None
    apiserver_flowcontrol_request_dispatch_no_accommodation_totalALPHACounterNumber of times a dispatch attempt resulted in a non accommodation due to lack of available seats
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_request_execution_secondsALPHAHistogramDuration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem
    flow_schema
    priority_level
    type
    None
    apiserver_flowcontrol_request_queue_length_after_enqueueALPHAHistogramLength of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_request_wait_duration_secondsALPHAHistogramLength of time a request spent waiting in its queue
    execute
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_watch_count_samplesALPHAHistogramcount of watchers for mutating requests in API Priority and Fairness
    flow_schema
    priority_level
    None
    apiserver_flowcontrol_work_estimated_seatsALPHAHistogramNumber of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness
    flow_schema
    priority_level
    None
    apiserver_init_events_totalALPHACounterCounter of init events processed in watch cache broken by resource type.
    resource
    None
    apiserver_kube_aggregator_x509_insecure_sha1_totalALPHACounterCounts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)NoneNone
    apiserver_kube_aggregator_x509_missing_san_totalALPHACounterCounts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension (either/or, based on the runtime environment)NoneNone
    apiserver_request_aborts_totalALPHACounterNumber of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope
    group
    resource
    scope
    subresource
    verb
    version
    None
    apiserver_request_body_sizesALPHAHistogramApiserver request body sizes broken out by size.
    resource
    verb
    None
    apiserver_request_filter_duration_secondsALPHAHistogramRequest filter latency distribution in seconds, for each filter type
    filter
    None
    apiserver_request_post_timeout_totalALPHACounterTracks the activity of the request handlers after the associated requests have been timed out by the apiserver
    source
    status
    None
    apiserver_request_slo_duration_secondsALPHAHistogramResponse latency distribution (not counting webhook duration) in seconds for each verb, group, version, resource, subresource, scope and component.
    component
    group
    resource
    scope
    subresource
    verb
    version
    None
    apiserver_request_terminations_totalALPHACounterNumber of requests which apiserver terminated in self-defense.
    code
    component
    group
    resource
    scope
    subresource
    verb
    version
    None
    apiserver_request_timestamp_comparison_timeALPHAHistogramTime taken for comparison of old vs new objects in UPDATE or PATCH requests
    code_path
    None
    apiserver_selfrequest_totalALPHACounterCounter of apiserver self-requests broken out for each verb, API resource and subresource.
    resource
    subresource
    verb
    None
    apiserver_storage_data_key_generation_duration_secondsALPHAHistogramLatencies in seconds of data encryption key(DEK) generation operations.NoneNone
    apiserver_storage_data_key_generation_failures_totalALPHACounterTotal number of failed data encryption key(DEK) generation operations.NoneNone
    apiserver_storage_db_total_size_in_bytesALPHAGaugeTotal size of the storage database file physically allocated in bytes.
    endpoint
    None
    apiserver_storage_envelope_transformation_cache_misses_totalALPHACounterTotal number of cache misses while accessing key decryption key(KEK).NoneNone
    apiserver_storage_list_evaluated_objects_totalALPHACounterNumber of objects tested in the course of serving a LIST request from storage
    resource
    None
    apiserver_storage_list_fetched_objects_totalALPHACounterNumber of objects read from storage in the course of serving a LIST request
    resource
    None
    apiserver_storage_list_returned_objects_totalALPHACounterNumber of objects returned for a LIST request from storage
    resource
    None
    apiserver_storage_list_totalALPHACounterNumber of LIST requests served from storage
    resource
    None
    apiserver_storage_transformation_duration_secondsALPHAHistogramLatencies in seconds of value transformation operations.
    transformation_type
    None
    apiserver_storage_transformation_operations_totalALPHACounterTotal number of transformations.
    status
    transformation_type
    transformer_prefix
    None
    apiserver_terminated_watchers_totalALPHACounterCounter of watchers closed due to unresponsiveness broken by resource type.
    resource
    None
    apiserver_tls_handshake_errors_totalALPHACounterNumber of requests dropped with ‘TLS handshake error from’ errorNoneNone
    apiserver_validating_admission_policy_check_duration_secondsALPHAHistogramValidation admission latency for individual validation expressions in seconds, labeled by policy and param resource, further including binding, state and enforcement action taken.
    enforcement_action
    params
    policy
    policy_binding
    state
    validation_expression
    None
    apiserver_validating_admission_policy_check_totalALPHACounterValidation admission policy check total, labeled by policy and param resource, and further identified by binding, validation expression, enforcement action taken, and state.
    enforcement_action
    params
    policy
    policy_binding
    state
    validation_expression
    None
    apiserver_validating_admission_policy_definition_totalALPHACounterValidation admission policy count total, labeled by state and enforcement action.
    enforcement_action
    state
    None
    apiserver_watch_cache_events_dispatched_totalALPHACounterCounter of events dispatched in watch cache broken by resource type.
    resource
    None
    apiserver_watch_cache_initializations_totalALPHACounterCounter of watch cache initializations broken by resource type.
    resource
    None
    apiserver_watch_events_sizesALPHAHistogramWatch event size distribution in bytes
    group
    kind
    version
    None
    apiserver_watch_events_totalALPHACounterNumber of events sent in watch clients
    group
    kind
    version
    None
    apiserver_webhooks_x509_insecure_sha1_totalALPHACounterCounts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)NoneNone
    apiserver_webhooks_x509_missing_san_totalALPHACounterCounts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension (either/or, based on the runtime environment)NoneNone
    attachdetach_controller_forced_detachesALPHACounterNumber of times the A/D Controller performed a forced detachNoneNone
    attachdetach_controller_total_volumesALPHACustomNumber of volumes in A/D Controller
    plugin_name
    state
    None
    authenticated_user_requestsALPHACounterCounter of authenticated requests broken out by username.
    username
    None
    authentication_attemptsALPHACounterCounter of authenticated attempts.
    result
    None
    authentication_duration_secondsALPHAHistogramAuthentication duration in seconds broken out by result.
    result
    None
    authentication_token_cache_active_fetch_countALPHAGauge
    status
    None
    authentication_token_cache_fetch_totalALPHACounter
    status
    None
    authentication_token_cache_request_duration_secondsALPHAHistogram
    status
    None
    authentication_token_cache_request_totalALPHACounter
    status
    None
    cloudprovider_aws_api_request_duration_secondsALPHAHistogramLatency of AWS API calls
    request
    None
    cloudprovider_aws_api_request_errorsALPHACounterAWS API errors
    request
    None
    cloudprovider_aws_api_throttled_requests_totalALPHACounterAWS API throttled requests
    operation_name
    None
    cloudprovider_azure_api_request_duration_secondsALPHAHistogramLatency of an Azure API call
    request
    resource_group
    source
    subscription_id
    None
    cloudprovider_azure_api_request_errorsALPHACounterNumber of errors for an Azure API call
    request
    resource_group
    source
    subscription_id
    None
    cloudprovider_azure_api_request_ratelimited_countALPHACounterNumber of rate limited Azure API calls
    request
    resource_group
    source
    subscription_id
    None
    cloudprovider_azure_api_request_throttled_countALPHACounterNumber of throttled Azure API calls
    request
    resource_group
    source
    subscription_id
    None
    cloudprovider_azure_op_duration_secondsALPHAHistogramLatency of an Azure service operation
    request
    resource_group
    source
    subscription_id
    None
    cloudprovider_azure_op_failure_countALPHACounterNumber of failed Azure service operations
    request
    resource_group
    source
    subscription_id
    None
    cloudprovider_gce_api_request_duration_secondsALPHAHistogramLatency of a GCE API call
    region
    request
    version
    zone
    None
    cloudprovider_gce_api_request_errorsALPHACounterNumber of errors for an API call
    region
    request
    version
    zone
    None
    cloudprovider_vsphere_api_request_duration_secondsALPHAHistogramLatency of vsphere api call
    request
    None
    cloudprovider_vsphere_api_request_errorsALPHACountervsphere Api errors
    request
    None
    cloudprovider_vsphere_operation_duration_secondsALPHAHistogramLatency of vsphere operation call
    operation
    None
    cloudprovider_vsphere_operation_errorsALPHACountervsphere operation errors
    operation
    None
    cloudprovider_vsphere_vcenter_versionsALPHACustomVersions for connected vSphere vCenters
    hostname
    version
    build
    None
    container_cpu_usage_seconds_totalALPHACustomCumulative cpu time consumed by the container in core-seconds
    container
    pod
    namespace
    None
    container_memory_working_set_bytesALPHACustomCurrent working set of the container in bytes
    container
    pod
    namespace
    None
    container_start_time_secondsALPHACustomStart time of the container since unix epoch in seconds
    container
    pod
    namespace
    None
    cronjob_controller_cronjob_job_creation_skew_duration_secondsALPHAHistogramTime between when a cronjob is scheduled to be run, and when the corresponding job is createdNoneNone
    csi_operations_secondsALPHAHistogramContainer Storage Interface operation duration with gRPC error code status total
    driver_name
    grpc_status_code
    method_name
    migrated
    None
    endpoint_slice_controller_changesALPHACounterNumber of EndpointSlice changes
    operation
    None
    endpoint_slice_controller_desired_endpoint_slicesALPHAGaugeNumber of EndpointSlices that would exist with perfect endpoint allocationNoneNone
    endpoint_slice_controller_endpoints_added_per_syncALPHAHistogramNumber of endpoints added on each Service syncNoneNone
    endpoint_slice_controller_endpoints_desiredALPHAGaugeNumber of endpoints desiredNoneNone
    endpoint_slice_controller_endpoints_removed_per_syncALPHAHistogramNumber of endpoints removed on each Service syncNoneNone
    endpoint_slice_controller_endpointslices_changed_per_syncALPHAHistogramNumber of EndpointSlices changed on each Service sync
    topology
    None
    endpoint_slice_controller_num_endpoint_slicesALPHAGaugeNumber of EndpointSlicesNoneNone
    endpoint_slice_controller_syncsALPHACounterNumber of EndpointSlice syncs
    result
    None
    endpoint_slice_mirroring_controller_addresses_skipped_per_syncALPHAHistogramNumber of addresses skipped on each Endpoints sync due to being invalid or exceeding MaxEndpointsPerSubsetNoneNone
    endpoint_slice_mirroring_controller_changesALPHACounterNumber of EndpointSlice changes
    operation
    None
    endpoint_slice_mirroring_controller_desired_endpoint_slicesALPHAGaugeNumber of EndpointSlices that would exist with perfect endpoint allocationNoneNone
    endpoint_slice_mirroring_controller_endpoints_added_per_syncALPHAHistogramNumber of endpoints added on each Endpoints syncNoneNone
    endpoint_slice_mirroring_controller_endpoints_desiredALPHAGaugeNumber of endpoints desiredNoneNone
    endpoint_slice_mirroring_controller_endpoints_removed_per_syncALPHAHistogramNumber of endpoints removed on each Endpoints syncNoneNone
    endpoint_slice_mirroring_controller_endpoints_sync_durationALPHAHistogramDuration of syncEndpoints() in secondsNoneNone
    endpoint_slice_mirroring_controller_endpoints_updated_per_syncALPHAHistogramNumber of endpoints updated on each Endpoints syncNoneNone
    endpoint_slice_mirroring_controller_num_endpoint_slicesALPHAGaugeNumber of EndpointSlicesNoneNone
    ephemeral_volume_controller_create_failures_totalALPHACounterNumber of PersistentVolumeClaims creation requestsNoneNone
    ephemeral_volume_controller_create_totalALPHACounterNumber of PersistentVolumeClaims creation requestsNoneNone
    etcd_bookmark_countsALPHAGaugeNumber of etcd bookmarks (progress notify events) split by kind.
    resource
    None
    etcd_lease_object_countsALPHAHistogramNumber of objects attached to a single etcd lease.NoneNone
    etcd_request_duration_secondsALPHAHistogramEtcd request latency in seconds for each operation and object type.
    operation
    type
    None
    etcd_version_infoALPHAGaugeEtcd server’s binary version
    binary_version
    None
    field_validation_request_duration_secondsALPHAHistogramResponse latency distribution in seconds for each field validation value and whether field validation is enabled or not
    enabled
    field_validation
    None
    garbagecollector_controller_resources_sync_error_totalALPHACounterNumber of garbage collector resources sync errorsNoneNone
    get_token_countALPHACounterCounter of total Token() requests to the alternate token sourceNoneNone
    get_token_fail_countALPHACounterCounter of failed Token() requests to the alternate token sourceNoneNone
    job_controller_job_finished_totalALPHACounterThe number of finished jobs
    completion_mode
    reason
    result
    None
    job_controller_job_pods_finished_totalALPHACounterThe number of finished Pods that are fully tracked
    completion_mode
    result
    None
    job_controller_job_sync_duration_secondsALPHAHistogramThe time it took to sync a job
    action
    completion_mode
    result
    None
    job_controller_job_sync_totalALPHACounterThe number of job syncs
    action
    completion_mode
    result
    None
    job_controller_pod_failures_handled_by_failure_policy_totalALPHACounter
    action
    None
    job_controller_terminated_pods_tracking_finalizer_totalALPHACounterThe number of terminated pods (phase=Failed|Succeeded) that have the finalizer batch.kubernetes.io/job-tracking. The event label can be "add" or "delete".
    event
    None
    kube_apiserver_clusterip_allocator_allocated_ipsALPHAGaugeGauge measuring the number of allocated IPs for Services
    cidr
    None
    kube_apiserver_clusterip_allocator_allocation_errors_totalALPHACounterNumber of errors trying to allocate Cluster IPs
    cidr
    scope
    None
    kube_apiserver_clusterip_allocator_allocation_totalALPHACounterNumber of Cluster IPs allocations
    cidr
    scope
    None
    kube_apiserver_clusterip_allocator_available_ipsALPHAGaugeGauge measuring the number of available IPs for Services
    cidr
    None
    kube_apiserver_pod_logs_pods_logs_backend_tls_failure_totalALPHACounterTotal number of requests for pods/logs that failed due to kubelet server TLS verificationNoneNone
    kube_apiserver_pod_logs_pods_logs_insecure_backend_totalALPHACounterTotal number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
    usage
    None
    kube_pod_resource_limitALPHACustomResources limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
    namespace
    pod
    node
    scheduler
    priority
    resource
    unit
    None
    kube_pod_resource_requestALPHACustomResources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
    namespace
    pod
    node
    scheduler
    priority
    resource
    unit
    None
    kubelet_certificate_manager_client_expiration_renew_errorsALPHACounterCounter of certificate renewal errors.NoneNone
    kubelet_certificate_manager_client_ttl_secondsALPHAGaugeGauge of the TTL (time-to-live) of the Kubelet’s client certificate. The value is in seconds until certificate expiry (negative if already expired). If client certificate is invalid or unused, the value will be +INF.NoneNone
    kubelet_certificate_manager_server_rotation_secondsALPHAHistogramHistogram of the number of seconds the previous certificate lived before being rotated.NoneNone
    kubelet_certificate_manager_server_ttl_secondsALPHAGaugeGauge of the shortest TTL (time-to-live) of the Kubelet’s serving certificate. The value is in seconds until certificate expiry (negative if already expired). If serving certificate is invalid or unused, the value will be +INF.NoneNone
    kubelet_cgroup_manager_duration_secondsALPHAHistogramDuration in seconds for cgroup manager operations. Broken down by method.
    operation_type
    None
    kubelet_container_log_filesystem_used_bytesALPHACustomBytes used by the container’s logs on the filesystem.
    uid
    namespace
    pod
    container
    None
    kubelet_containers_per_pod_countALPHAHistogramThe number of containers per pod.NoneNone
    kubelet_cpu_manager_pinning_errors_totalALPHACounterThe number of cpu core allocations which required pinning and failed.NoneNone
    kubelet_cpu_manager_pinning_requests_totalALPHACounterThe number of cpu core allocations which required pinning.NoneNone
    kubelet_device_plugin_alloc_duration_secondsALPHAHistogramDuration in seconds to serve a device plugin Allocation request. Broken down by resource name.
    resource_name
    None
    kubelet_device_plugin_registration_totalALPHACounterCumulative number of device plugin registrations. Broken down by resource name.
    resource_name
    None
    kubelet_eviction_stats_age_secondsALPHAHistogramTime between when stats are collected, and when pod is evicted based on those stats by eviction signal
    eviction_signal
    None
    kubelet_evictionsALPHACounterCumulative number of pod evictions by eviction signal
    eviction_signal
    None
    kubelet_graceful_shutdown_end_time_secondsALPHAGaugeLast graceful shutdown end time since unix epoch in secondsNoneNone
    kubelet_graceful_shutdown_start_time_secondsALPHAGaugeLast graceful shutdown start time since unix epoch in secondsNoneNone
    kubelet_http_inflight_requestsALPHAGaugeNumber of the inflight http requests
    long_running
    method
    path
    server_type
    None
    kubelet_http_requests_duration_secondsALPHAHistogramDuration in seconds to serve http requests
    long_running
    method
    path
    server_type
    None
    kubelet_http_requests_totalALPHACounterNumber of the http requests received since the server started
    long_running
    method
    path
    server_type
    None
    kubelet_kubelet_credential_provider_plugin_durationALPHAHistogramDuration of execution in seconds for credential provider plugin
    plugin_name
    None
    kubelet_kubelet_credential_provider_plugin_errorsALPHACounterNumber of errors from credential provider plugin
    plugin_name
    None
    kubelet_lifecycle_handler_http_fallbacks_totalALPHACounterThe number of times lifecycle handlers successfully fell back to http from https.NoneNone
    kubelet_managed_ephemeral_containersALPHAGaugeCurrent number of ephemeral containers in pods managed by this kubelet. Ephemeral containers will be ignored if disabled by the EphemeralContainers feature gate, and this number will be 0.NoneNone
    kubelet_node_nameALPHAGaugeThe node’s name. The count is always 1.
    node
    None
    kubelet_pleg_discard_eventsALPHACounterThe number of discard events in PLEG.NoneNone
    kubelet_pleg_last_seen_secondsALPHAGaugeTimestamp in seconds when PLEG was last seen active.NoneNone
    kubelet_pleg_relist_duration_secondsALPHAHistogramDuration in seconds for relisting pods in PLEG.NoneNone
    kubelet_pleg_relist_interval_secondsALPHAHistogramInterval in seconds between relisting in PLEG.NoneNone
    kubelet_pod_resources_endpoint_errors_get_allocatableALPHACounterNumber of requests to the PodResource GetAllocatableResources endpoint which returned error. Broken down by server api version.
    server_api_version
    None
    kubelet_pod_resources_endpoint_errors_listALPHACounterNumber of requests to the PodResource List endpoint which returned error. Broken down by server api version.
    server_api_version
    None
    kubelet_pod_resources_endpoint_requests_get_allocatableALPHACounterNumber of requests to the PodResource GetAllocatableResources endpoint. Broken down by server api version.
    server_api_version
    None
    kubelet_pod_resources_endpoint_requests_listALPHACounterNumber of requests to the PodResource List endpoint. Broken down by server api version.
    server_api_version
    None
    kubelet_pod_resources_endpoint_requests_totalALPHACounterCumulative number of requests to the PodResource endpoint. Broken down by server api version.
    server_api_version
    None
    kubelet_pod_start_duration_secondsALPHAHistogramDuration in seconds from kubelet seeing a pod for the first time to the pod starting to runNoneNone
    kubelet_pod_status_sync_duration_secondsALPHAHistogramDuration in seconds to sync a pod status update. Measures time from detection of a change to pod status until the API is successfully updated for that pod, even if multiple intervening changes to pod status occur.NoneNone
    kubelet_pod_worker_duration_secondsALPHAHistogramDuration in seconds to sync a single pod. Broken down by operation type: create, update, or sync
    operation_type
    None
    kubelet_pod_worker_start_duration_secondsALPHAHistogramDuration in seconds from kubelet seeing a pod to starting a worker.NoneNone
    kubelet_preemptionsALPHACounterCumulative number of pod preemptions by preemption resource
    preemption_signal
    None
    kubelet_run_podsandbox_duration_secondsALPHAHistogramDuration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.
    runtime_handler
    None
    kubelet_run_podsandbox_errors_totalALPHACounterCumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.
    runtime_handler
    None
    kubelet_running_containersALPHAGaugeNumber of containers currently running
    container_state
    None
    kubelet_running_podsALPHAGaugeNumber of pods that have a running pod sandboxNoneNone
    kubelet_runtime_operations_duration_secondsALPHAHistogramDuration in seconds of runtime operations. Broken down by operation type.
    operation_type
    None
    kubelet_runtime_operations_errors_totalALPHACounterCumulative number of runtime operation errors by operation type.
    operation_type
    None
    kubelet_runtime_operations_totalALPHACounterCumulative number of runtime operations by operation type.
    operation_type
    None
    kubelet_server_expiration_renew_errorsALPHACounterCounter of certificate renewal errors.NoneNone
    kubelet_started_containers_errors_totalALPHACounterCumulative number of errors when starting containers
    code
    container_type
    None
    kubelet_started_containers_totalALPHACounterCumulative number of containers started
    container_type
    None
    kubelet_started_host_process_containers_errors_totalALPHACounterCumulative number of errors when starting hostprocess containers. This metric will only be collected on Windows and requires WindowsHostProcessContainers feature gate to be enabled.
    code
    container_type
    None
    kubelet_started_host_process_containers_totalALPHACounterCumulative number of hostprocess containers started. This metric will only be collected on Windows and requires WindowsHostProcessContainers feature gate to be enabled.
    container_type
    None
    kubelet_started_pods_errors_totalALPHACounterCumulative number of errors when starting podsNoneNone
    kubelet_started_pods_totalALPHACounterCumulative number of pods startedNoneNone
    kubelet_volume_metric_collection_duration_secondsALPHAHistogramDuration in seconds to calculate volume stats
    metric_source
    None
    kubelet_volume_stats_available_bytesALPHACustomNumber of available bytes in the volume
    namespace
    persistentvolumeclaim
    None
    kubelet_volume_stats_capacity_bytesALPHACustomCapacity in bytes of the volume
    namespace
    persistentvolumeclaim
    None
    kubelet_volume_stats_health_status_abnormalALPHACustomAbnormal volume health status. The count is either 1 or 0. 1 indicates the volume is unhealthy, 0 indicates volume is healthy
    namespace
    persistentvolumeclaim
    None
    kubelet_volume_stats_inodesALPHACustomMaximum number of inodes in the volume
    namespace
    persistentvolumeclaim
    None
    kubelet_volume_stats_inodes_freeALPHACustomNumber of free inodes in the volume
    namespace
    persistentvolumeclaim
    None
    kubelet_volume_stats_inodes_usedALPHACustomNumber of used inodes in the volume
    namespace
    persistentvolumeclaim
    None
    kubelet_volume_stats_used_bytesALPHACustomNumber of used bytes in the volume
    namespace
    persistentvolumeclaim
    None
    kubeproxy_network_programming_duration_secondsALPHAHistogramIn Cluster Network Programming Latency in secondsNoneNone
    kubeproxy_sync_proxy_rules_duration_secondsALPHAHistogramSyncProxyRules latency in secondsNoneNone
    kubeproxy_sync_proxy_rules_endpoint_changes_pendingALPHAGaugePending proxy rules Endpoint changesNoneNone
    kubeproxy_sync_proxy_rules_endpoint_changes_totalALPHACounterCumulative proxy rules Endpoint changesNoneNone
    kubeproxy_sync_proxy_rules_iptables_restore_failures_totalALPHACounterCumulative proxy iptables restore failuresNoneNone
    kubeproxy_sync_proxy_rules_iptables_totalALPHAGaugeNumber of proxy iptables rules programmed
    table
    None
    kubeproxy_sync_proxy_rules_last_queued_timestamp_secondsALPHAGaugeThe last time a sync of proxy rules was queuedNoneNone
    kubeproxy_sync_proxy_rules_last_timestamp_secondsALPHAGaugeThe last time proxy rules were successfully syncedNoneNone
    kubeproxy_sync_proxy_rules_no_local_endpoints_totalALPHAGaugeNumber of services with a Local traffic policy and no endpoints
    traffic_policy
    None
    kubeproxy_sync_proxy_rules_service_changes_pendingALPHAGaugePending proxy rules Service changesNoneNone
    kubeproxy_sync_proxy_rules_service_changes_totalALPHACounterCumulative proxy rules Service changesNoneNone
    kubernetes_build_infoALPHAGaugeA metric with a constant ‘1’ value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.
    build_date
    compiler
    git_commit
    git_tree_state
    git_version
    go_version
    major
    minor
    platform
    None
    kubernetes_feature_enabledALPHAGaugeThis metric records the data about the stage and enablement of a k8s feature.
    name
    stage
    None
    kubernetes_healthcheckALPHAGaugeThis metric records the result of a single healthcheck.
    name
    type
    None
    kubernetes_healthchecks_totalALPHACounterThis metric records the results of all healthchecks.
    name
    status
    type
    None
    leader_election_master_statusALPHAGaugeGauge of whether the reporting system is master of the relevant lease: 0 indicates backup, 1 indicates master. ‘name’ is the string used to identify the lease. Please make sure to group by name.
    name
    None
    node_authorizer_graph_actions_duration_secondsALPHAHistogramHistogram of duration of graph actions in node authorizer.
    operation
    None
    node_collector_evictions_numberALPHACounterNumber of Node evictions that happened since the current instance of NodeController started. This metric is replaced by node_collector_evictions_total.
    zone
    None
    node_collector_unhealthy_nodes_in_zoneALPHAGaugeGauge measuring number of not Ready Nodes per zone.
    zone
    None
    node_collector_zone_healthALPHAGaugeGauge measuring percentage of healthy nodes per zone.
    zone
    None
    node_collector_zone_sizeALPHAGaugeGauge measuring number of registered Nodes per zone.
    zone
    None
    node_cpu_usage_seconds_totalALPHACustomCumulative cpu time consumed by the node in core-secondsNoneNone
    node_ipam_controller_cidrset_allocation_tries_per_requestALPHAHistogramNumber of endpoints added on each Service sync
    clusterCIDR
    None
    node_ipam_controller_cidrset_cidrs_allocations_totalALPHACounterCounter measuring total number of CIDR allocations.
    clusterCIDR
    None
    node_ipam_controller_cidrset_cidrs_releases_totalALPHACounterCounter measuring total number of CIDR releases.
    clusterCIDR
    None
    node_ipam_controller_cidrset_usage_cidrsALPHAGaugeGauge measuring percentage of allocated CIDRs.
    clusterCIDR
    None
    node_ipam_controller_multicidrset_allocation_tries_per_requestALPHAHistogramHistogram measuring CIDR allocation tries per request.
    clusterCIDR
    None
    node_ipam_controller_multicidrset_cidrs_allocations_totalALPHACounterCounter measuring total number of CIDR allocations.
    clusterCIDR
    None
    node_ipam_controller_multicidrset_cidrs_releases_totalALPHACounterCounter measuring total number of CIDR releases.
    clusterCIDR
    None
    node_ipam_controller_multicidrset_usage_cidrsALPHAGaugeGauge measuring percentage of allocated CIDRs.
    clusterCIDR
    None
    node_memory_working_set_bytesALPHACustomCurrent working set of the node in bytesNoneNone
    number_of_l4_ilbsALPHAGaugeNumber of L4 ILBs
    feature
    None
    plugin_manager_total_pluginsALPHACustomNumber of plugins in Plugin Manager
    socket_path
    state
    None
    pod_cpu_usage_seconds_totalALPHACustomCumulative cpu time consumed by the pod in core-seconds
    pod
    namespace
    None
    pod_memory_working_set_bytesALPHACustomCurrent working set of the pod in bytes
    pod
    namespace
    None
    pod_security_errors_totalALPHACounterNumber of errors preventing normal evaluation. Non-fatal errors may result in the latest restricted profile being used for evaluation.
    fatal
    request_operation
    resource
    subresource
    None
    pod_security_evaluations_totalALPHACounterNumber of policy evaluations that occurred, not counting ignored or exempt requests.
    decision
    mode
    policy_level
    policy_version
    request_operation
    resource
    subresource
    None
    pod_security_exemptions_totalALPHACounterNumber of exempt requests, not counting ignored or out of scope requests.
    request_operation
    resource
    subresource
    None
    prober_probe_duration_secondsALPHAHistogramDuration in seconds for a probe response.
    container
    namespace
    pod
    probe_type
    None
    prober_probe_totalALPHACounterCumulative number of a liveness, readiness or startup probe for a container by result.
    container
    namespace
    pod
    pod_uid
    probe_type
    result
    None
    pv_collector_bound_pv_countALPHACustomGauge measuring number of persistent volumes currently bound
    storage_class
    None
    pv_collector_bound_pvc_countALPHACustomGauge measuring number of persistent volume claims currently bound
    namespace
    None
    pv_collector_total_pv_countALPHACustomGauge measuring total number of persistent volumes
    plugin_name
    volume_mode
    None
    pv_collector_unbound_pv_countALPHACustomGauge measuring number of persistent volumes currently unbound
    storage_class
    None
    pv_collector_unbound_pvc_countALPHACustomGauge measuring number of persistent volume claims currently unbound
    namespace
    None
    replicaset_controller_sorting_deletion_age_ratioALPHAHistogramThe ratio of chosen deleted pod’s ages to the current youngest pod’s age (at the time). Should be <2. The intent of this metric is to measure the rough efficacy of the LogarithmicScaleDown feature gate’s effect on the sorting (and deletion) of pods when a replicaset scales down. This only considers Ready pods when calculating and reporting.NoneNone
    rest_client_exec_plugin_call_totalALPHACounterNumber of calls to an exec plugin, partitioned by the type of event encountered (no_error, plugin_execution_error, plugin_not_found_error, client_internal_error) and an optional exit code. The exit code will be set to 0 if and only if the plugin call was successful.
    call_status
    code
    None
    rest_client_exec_plugin_certificate_rotation_ageALPHAHistogramHistogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data.NoneNone
    rest_client_exec_plugin_ttl_secondsALPHAGaugeGauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF.NoneNone
    rest_client_rate_limiter_duration_secondsALPHAHistogramClient side rate limiter latency in seconds. Broken down by verb, and host.
    host
    verb
    None
    rest_client_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by verb, and host.
    host
    verb
    None
    rest_client_request_size_bytesALPHAHistogramRequest size in bytes. Broken down by verb and host.
    host
    verb
    None
    rest_client_requests_totalALPHACounterNumber of HTTP requests, partitioned by status code, method, and host.
    code
    host
    method
    None
    rest_client_response_size_bytesALPHAHistogramResponse size in bytes. Broken down by verb and host.
    host
    verb
    None
    retroactive_storageclass_errors_totalALPHACounterTotal number of failed retroactive StorageClass assignments to persistent volume claimsNoneNone
    retroactive_storageclass_totalALPHACounterTotal number of retroactive StorageClass assignments to persistent volume claimsNoneNone
    root_ca_cert_publisher_sync_duration_secondsALPHAHistogramNumber of namespace syncs happened in root ca cert publisher.
    code
    None
    root_ca_cert_publisher_sync_totalALPHACounterNumber of namespace syncs happened in root ca cert publisher.
    code
    None
    running_managed_controllersALPHAGaugeIndicates where instances of a controller are currently running
    manager
    name
    None
    scheduler_e2e_scheduling_duration_secondsALPHAHistogramE2e scheduling latency in seconds (scheduling algorithm + binding). This metric is replaced by scheduling_attempt_duration_seconds.
    profile
    result
    None
    scheduler_goroutinesALPHAGaugeNumber of running goroutines split by the work they do such as binding.
    operation
    None
    scheduler_permit_wait_duration_secondsALPHAHistogramDuration of waiting on permit.
    result
    None
    scheduler_plugin_execution_duration_secondsALPHAHistogramDuration for running a plugin at a specific extension point.
    extension_point
    plugin
    status
    None
    scheduler_scheduler_cache_sizeALPHAGaugeNumber of nodes, pods, and assumed (bound) pods in the scheduler cache.
    type
    None
    scheduler_scheduler_goroutinesALPHAGaugeNumber of running goroutines split by the work they do such as binding. This metric is replaced by the “goroutines” metric.
    work
    None
    scheduler_scheduling_algorithm_duration_secondsALPHAHistogramScheduling algorithm latency in secondsNoneNone
    scheduler_unschedulable_podsALPHAGaugeThe number of unschedulable pods broken down by plugin name. A pod will increment the gauge for all plugins that caused it to not schedule, so this metric has meaning only when broken down by plugin.
    plugin
    profile
    None
    scheduler_volume_binder_cache_requests_totalALPHACounterTotal number of requests to the volume binding cache
    operation
    None
    scheduler_volume_scheduling_stage_error_totalALPHACounterVolume scheduling stage error count
    operation
    None
    scrape_errorALPHACustom1 if there was an error while getting container metrics, 0 otherwiseNoneNone
    service_controller_nodesync_latency_secondsALPHAHistogramA metric measuring the latency for nodesync which updates loadbalancer hosts on cluster node updates.NoneNone
    service_controller_update_loadbalancer_host_latency_secondsALPHAHistogramA metric measuring the latency for updating the hosts of each load balancer.NoneNone
    serviceaccount_legacy_tokens_totalALPHACounterCumulative legacy service account tokens usedNoneNone
    serviceaccount_stale_tokens_totalALPHACounterCumulative stale projected service account tokens usedNoneNone
    serviceaccount_valid_tokens_totalALPHACounterCumulative valid projected service account tokens usedNoneNone
    storage_count_attachable_volumes_in_useALPHACustomMeasure number of volumes in use
    node
    volume_plugin
    None
    storage_operation_duration_secondsALPHAHistogramStorage operation duration
    migrated
    operation_name
    status
    volume_plugin
    None
    ttl_after_finished_controller_job_deletion_duration_secondsALPHAHistogramThe time it took to delete the job since it became eligible for deletionNoneNone
    volume_manager_selinux_container_errors_totalALPHAGaugeNumber of errors when kubelet cannot compute SELinux context for a container. Kubelet can’t start such a Pod and will retry, so the value of this metric may not represent the actual number of containers.NoneNone
    volume_manager_selinux_container_warnings_totalALPHAGaugeNumber of errors when kubelet cannot compute SELinux context for a container that are ignored. They will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.NoneNone
    volume_manager_selinux_pod_context_mismatch_errors_totalALPHAGaugeNumber of errors when a Pod defines different SELinux contexts for its containers that use the same volume. Kubelet can’t start such a Pod and will retry, so the value of this metric may not represent the actual number of Pods.NoneNone
    volume_manager_selinux_pod_context_mismatch_warnings_totalALPHAGaugeNumber of errors when a Pod defines different SELinux contexts for its containers that use the same volume. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.NoneNone
    volume_manager_selinux_volume_context_mismatch_errors_totalALPHAGaugeNumber of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. Kubelet can’t start such a Pod and will retry, so the value of this metric may not represent the actual number of Pods.NoneNone
    volume_manager_selinux_volume_context_mismatch_warnings_totalALPHAGaugeNumber of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.NoneNone
    volume_manager_selinux_volumes_admitted_totalALPHAGaugeNumber of volumes whose SELinux context was fine and will be mounted with mount -o context option.NoneNone
    volume_manager_total_volumesALPHACustomNumber of volumes in Volume Manager
    plugin_name
    state
    None
    volume_operation_total_errorsALPHACounterTotal volume operation errors
    operation_name
    plugin_name
    None
    volume_operation_total_secondsALPHAHistogramStorage operation end to end duration in seconds
    operation_name
    plugin_name
    None
    watch_cache_capacityALPHAGaugeTotal capacity of watch cache broken down by resource type.
    resource
    None
    watch_cache_capacity_decrease_totalALPHACounterTotal number of watch cache capacity decrease events broken down by resource type.
    resource
    None
    watch_cache_capacity_increase_totalALPHACounterTotal number of watch cache capacity increase events broken down by resource type.
    resource
    None
    workqueue_adds_totalALPHACounterTotal number of adds handled by workqueue
    name
    None
    workqueue_depthALPHAGaugeCurrent depth of workqueue
    name
    None
    workqueue_longest_running_processor_secondsALPHAGaugeHow many seconds has the longest running processor for workqueue been running.
    name
    None
    workqueue_queue_duration_secondsALPHAHistogramHow long in seconds an item stays in workqueue before being requested.
    name
    None
    workqueue_retries_totalALPHACounterTotal number of retries handled by workqueue
    name
    None
    workqueue_unfinished_work_secondsALPHAGaugeHow many seconds of work has been done that is in progress and hasn’t been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
    name
    None
    workqueue_work_duration_secondsALPHAHistogramHow long in seconds processing an item from workqueue takes.
    name
    None
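
    As noted at the top of this page, components expose these metrics in Prometheus text format. The following is a minimal sketch, not an official client, of how scraped text could be split into (name, labels, value) samples and used to rank workqueues by workqueue_depth using the name label listed above. The function names parse_samples and deepest_workqueues, the regular expressions, and the sample values in EXAMPLE are all assumptions made for illustration; a real scrape would come from a component's metrics endpoint, and a production parser should use a maintained Prometheus client library instead.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
# Simplified on purpose; it ignores optional trailing timestamps.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value" (escaped quotes allowed).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def deepest_workqueues(samples, top=3):
    """Rank workqueue_depth samples by value, grouped by the 'name' label."""
    depths = [
        (labels.get("name", ""), value)
        for metric, labels, value in samples
        if metric == "workqueue_depth"
    ]
    return sorted(depths, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical scrape output; queue names and numbers are made up.
EXAMPLE = """\
# HELP workqueue_depth Current depth of workqueue
# TYPE workqueue_depth gauge
workqueue_depth{name="deployment"} 4
workqueue_depth{name="job"} 0
workqueue_depth{name="endpoint"} 11
workqueue_adds_total{name="job"} 120
"""

if __name__ == "__main__":
    for queue, depth in deepest_workqueues(parse_samples(EXAMPLE)):
        print(queue, depth)
```

    Because workqueue_depth is a Gauge, comparing current values directly is meaningful. For the Counter metrics in the table (for example workqueue_adds_total), a rate computed over two scrapes is usually more informative than the raw cumulative value.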