Managing metrics

    In OKD 4.13, cluster components are monitored by scraping metrics exposed through service endpoints. You can also configure metrics collection for user-defined projects.

    You can define the metrics that you want to provide for your own workloads by using Prometheus client libraries at the application level.
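    To illustrate what those libraries produce, the following stdlib-only Python sketch renders a counter in the Prometheus text exposition format. In a real workload you would use a Prometheus client library (for example, prometheus_client) rather than formatting metrics by hand; this sketch only shows the shape of the output:

    ```python
    # Minimal sketch of the Prometheus text exposition format that client
    # libraries emit on the /metrics endpoint. The metric name and samples
    # mirror the example output shown later in this section.

    def render_counter(name, help_text, samples):
        """Render a counter metric in Prometheus text exposition format."""
        lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
        for labels, value in samples:
            label_str = ",".join(f'{k}="{v}"' for k, v in labels.items())
            lines.append(f"{name}{{{label_str}}} {value}")
        return "\n".join(lines)

    exposition = render_counter(
        "http_requests_total",
        "Count of all HTTP requests",
        [({"code": "200", "method": "get"}, 4),
         ({"code": "404", "method": "get"}, 2)],
    )
    print(exposition)
    ```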

    In OKD, metrics are exposed through an HTTP service endpoint under the /metrics canonical name. You can list all available metrics for a service by running a curl query against http://<endpoint>/metrics. For instance, you can expose a route to the prometheus-example-app example service and then query that route to view all of its available metrics:
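    Such a query might look like the following sketch, where <example_app_route> is a placeholder for the host of the exposed route:

    ```shell
    $ curl http://<example_app_route>/metrics
    ```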

    Example output

    # HELP http_requests_total Count of all HTTP requests
    # TYPE http_requests_total counter
    http_requests_total{code="200",method="get"} 4
    http_requests_total{code="404",method="get"} 2
    # HELP version Version information about this binary
    # TYPE version gauge
    version{version="v0.1.0"} 1

    Setting up metrics collection for user-defined projects

    You can create a ServiceMonitor resource to scrape metrics from a service endpoint in a user-defined project. This assumes that your application uses a Prometheus client library to expose metrics to the /metrics canonical name.

    This section describes how to deploy a sample service in a user-defined project and then create a ServiceMonitor resource that defines how that service should be monitored.

    To test monitoring of a service in a user-defined project, you can deploy a sample service.

    Procedure

    1. Create a YAML file for the service configuration. In this example, it is called prometheus-example-app.yaml.

    2. Add the following deployment and service configuration details to the file:

      apiVersion: v1
      kind: Namespace
      metadata:
        name: ns1
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        labels:
          app: prometheus-example-app
        name: prometheus-example-app
        namespace: ns1
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: prometheus-example-app
        template:
          metadata:
            labels:
              app: prometheus-example-app
          spec:
            containers:
            - image: ghcr.io/rhobs/prometheus-example-app:0.4.1
              imagePullPolicy: IfNotPresent
              name: prometheus-example-app
      ---
      apiVersion: v1
      kind: Service
      metadata:
        labels:
          app: prometheus-example-app
        name: prometheus-example-app
        namespace: ns1
      spec:
        ports:
        - port: 8080
          protocol: TCP
          targetPort: 8080
          name: web
        selector:
          app: prometheus-example-app
        type: ClusterIP

      This configuration deploys a service named prometheus-example-app in the user-defined ns1 project. This service exposes the custom version metric.

    3. Apply the configuration to the cluster:

      $ oc apply -f prometheus-example-app.yaml

      It takes some time to deploy the service.

    4. Check that the pod is running:

      $ oc get pod -n ns1

      Example output

      NAME                                      READY   STATUS    RESTARTS   AGE
      prometheus-example-app-7857545cb7-sbgwq   1/1     Running   0          81m

    Specifying how a service is monitored

    To use the metrics exposed by your service, you must configure OKD monitoring to scrape metrics from the /metrics endpoint. You can do this using a ServiceMonitor custom resource definition (CRD) that specifies how a service should be monitored, or a PodMonitor CRD that specifies how a pod should be monitored. The former requires a Service object, while the latter does not, allowing Prometheus to directly scrape metrics from the metrics endpoint exposed by a pod.
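    For comparison, a PodMonitor resource for the same sample workload might look like the following sketch. It uses the monitoring.coreos.com/v1 API and assumes that the pod's container port is named web:

    ```yaml
    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: prometheus-example-monitor
      namespace: ns1
    spec:
      podMetricsEndpoints:
      - interval: 30s
        port: web
      selector:
        matchLabels:
          app: prometheus-example-app
    ```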

    This procedure shows you how to create a ServiceMonitor resource for a service in a user-defined project.

    Prerequisites

    • You have access to the cluster as a user with the cluster-admin role or the monitoring-edit role.

    • You have enabled monitoring for user-defined projects.

    • For this example, you have deployed the prometheus-example-app sample service in the ns1 project.
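    For reference, monitoring for user-defined projects is enabled through the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace; a minimal sketch:

    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: true
    ```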

    Procedure

    1. Create a YAML file for the ServiceMonitor resource configuration. In this example, the file is called example-app-service-monitor.yaml. Add the following configuration details to the file:

      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        labels:
          k8s-app: prometheus-example-monitor
        name: prometheus-example-monitor
        namespace: ns1
      spec:
        endpoints:
        - interval: 30s
          port: web
          scheme: http
        selector:
          matchLabels:
            app: prometheus-example-app

      This defines a ServiceMonitor resource that scrapes the metrics exposed by the prometheus-example-app sample service, which includes the version metric.

    2. Apply the configuration to the cluster:

      $ oc apply -f example-app-service-monitor.yaml

      It takes some time to deploy the ServiceMonitor resource.

    3. Check that the ServiceMonitor resource is created:

      $ oc -n ns1 get servicemonitor

      Example output

      NAME                         AGE
      prometheus-example-monitor   81m

    Getting a list of available metrics

    As a cluster administrator or as a user with view permissions for all projects, you can view a list of metrics available in a cluster and output the list in JSON format.

    Prerequisites

    • You are a cluster administrator, or you have access to the cluster as a user with the cluster-monitoring-view role.

    • You have installed the OKD CLI (oc).

    • You have obtained the OKD API route for Thanos Querier.

    • You are able to get a bearer token by using the oc whoami -t command.

    Procedure

    1. If you have not obtained the OKD API route for Thanos Querier, run the following command:

      $ oc get routes -n openshift-monitoring thanos-querier -o jsonpath='{.status.ingress[0].host}'
    2. Retrieve a list of metrics in JSON format from the Thanos Querier API route by running the following command. This command uses oc to authenticate with a bearer token.
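      A sketch of such a query, assuming the standard Prometheus label-values API exposed by Thanos Querier, where <thanos_querier_route> is the host returned by the previous step:

      ```shell
      $ curl -k -H "Authorization: Bearer $(oc whoami -t)" "https://<thanos_querier_route>/api/v1/label/__name__/values"
      ```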