Managing metrics
You can define the metrics that you want to provide for your own workloads by using Prometheus client libraries at the application level.
In OKD, metrics are exposed through an HTTP service endpoint under the /metrics canonical name. You can list all available metrics for a service by running a curl query against http://<endpoint>/metrics. For instance, you can expose a route to the prometheus-example-app example application and then run the following to view all of its available metrics:
$ curl http://<endpoint>/metrics
Example output
# HELP http_requests_total Count of all HTTP requests
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 4
http_requests_total{code="404",method="get"} 2
# HELP version Version information about this binary
# TYPE version gauge
version{version="v0.1.0"} 1
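If you want to work with this output programmatically, the text format is simple enough to read line by line. The following is a minimal sketch, not a full parser: it assumes well-formed sample lines, leaves escape sequences in label values as they are, and ignores anything it does not recognize. The function name parse_metrics is hypothetical.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# One label pair inside the braces, for example code="200".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


EXAMPLE = '''\
# HELP http_requests_total Count of all HTTP requests
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 4
http_requests_total{code="404",method="get"} 2
# HELP version Version information about this binary
# TYPE version gauge
version{version="v0.1.0"} 1
'''

samples = parse_metrics(EXAMPLE)
# Sum the counter across all of its label combinations.
total = sum(value for name, _, value in samples if name == "http_requests_total")
print(total)
```

Summing a counter across its label sets, as in the last lines, gives the total number of requests regardless of response code.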
Additional resources
- See the Prometheus documentation for details on Prometheus client libraries.
You can create a ServiceMonitor resource to scrape metrics from a service endpoint in a user-defined project. This assumes that your application uses a Prometheus client library to expose metrics to the /metrics canonical name.
This section describes how to deploy a sample service in a user-defined project and then create a ServiceMonitor resource that defines how that service should be monitored.
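To make the /metrics contract concrete, the following sketch serves a metrics page using only the Python standard library. It is a hand-rolled stand-in for illustration, not a replacement for a Prometheus client library, which handles metric registration and concurrency for you. The names REQUEST_COUNTS, render_metrics, MetricsHandler, and serve are hypothetical, and the counts are made-up values.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter state: (code, method) -> count.
REQUEST_COUNTS = {("200", "get"): 4, ("404", "get"): 2}


def render_metrics(counts, version="v0.1.0"):
    """Render a counter and a version gauge in the Prometheus text format."""
    lines = [
        "# HELP http_requests_total Count of all HTTP requests",
        "# TYPE http_requests_total counter",
    ]
    for (code, method), count in sorted(counts.items()):
        lines.append(
            f'http_requests_total{{code="{code}",method="{method}"}} {count}'
        )
    lines += [
        "# HELP version Version information about this binary",
        "# TYPE version gauge",
        f'version{{version="{version}"}} 1',
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with the rendered text; everything else is 404."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Serve /metrics until interrupted. Call this manually to try it out."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and then running curl against port 8080 on the /metrics path should return text in the same shape as the example output shown earlier.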
To test monitoring of a service in a user-defined project, you can deploy a sample service.
Procedure
Create a YAML file for the service configuration. In this example, it is called prometheus-example-app.yaml.
Add the following deployment and service configuration details to the file:
apiVersion: v1
kind: Namespace
metadata:
  name: ns1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus-example-app
  name: prometheus-example-app
  namespace: ns1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-example-app
  template:
    metadata:
      labels:
        app: prometheus-example-app
    spec:
      containers:
      - image: ghcr.io/rhobs/prometheus-example-app:0.3.0
        imagePullPolicy: IfNotPresent
        name: prometheus-example-app
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-example-app
  name: prometheus-example-app
  namespace: ns1
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    name: web
  selector:
    app: prometheus-example-app
  type: ClusterIP
This configuration deploys a service named prometheus-example-app in the user-defined ns1 project. This service exposes the custom version metric.
Apply the configuration to the cluster:
$ oc apply -f prometheus-example-app.yaml
It takes some time to deploy the service.
You can check that the pod is running:
$ oc -n ns1 get pod
Example output
NAME READY STATUS RESTARTS AGE
prometheus-example-app-7857545cb7-sbgwq 1/1 Running 0 81m
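If you script this check, you can read the same table instead of inspecting it by eye. The following sketch assumes the default column layout shown in the example output. The function name pods_not_ready is hypothetical.

```python
def pods_not_ready(table):
    """Return names of pods that are not Running with all containers ready."""
    rows = [line.split() for line in table.strip().splitlines()]
    header, body = rows[0], rows[1:]
    # Locate columns by header name rather than by fixed position.
    name_i = header.index("NAME")
    ready_i = header.index("READY")
    status_i = header.index("STATUS")
    failing = []
    for row in body:
        ready, wanted = row[ready_i].split("/")
        if row[status_i] != "Running" or ready != wanted:
            failing.append(row[name_i])
    return failing


EXAMPLE = """\
NAME READY STATUS RESTARTS AGE
prometheus-example-app-7857545cb7-sbgwq 1/1 Running 0 81m
"""

print(pods_not_ready(EXAMPLE))
```

An empty list means every listed pod is running and ready. To use it against a live cluster, pass in the text captured from the oc -n ns1 get pod command.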
To use the metrics exposed by your service, you must configure OKD monitoring to scrape metrics from the endpoint. You can do this using a ServiceMonitor custom resource definition (CRD) that specifies how a service should be monitored, or a PodMonitor CRD that specifies how a pod should be monitored. The former requires a Service object, while the latter does not, allowing Prometheus to directly scrape metrics from the metrics endpoint exposed by a pod.
This procedure shows you how to create a ServiceMonitor resource for a service in a user-defined project.
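For comparison, the PodMonitor alternative selects pods by label and needs no Service object. The following sketch builds such a manifest as a Python dictionary and prints it as JSON, which oc apply -f accepts. The resource name example-pod-monitor and the function name pod_monitor are hypothetical. The sketch also assumes that the container declares a named port called web, which the sample deployment in this section does not do, so treat it as an illustration only.

```python
import json


def pod_monitor(name, namespace, match_labels, port_name):
    """Build a PodMonitor manifest that scrapes pods selected by label."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PodMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods are matched directly by label; no Service is involved.
            "selector": {"matchLabels": match_labels},
            # port refers to a named container port (assumed to exist).
            "podMetricsEndpoints": [{"port": port_name, "path": "/metrics"}],
        },
    }


manifest = pod_monitor(
    name="example-pod-monitor",  # hypothetical resource name
    namespace="ns1",
    match_labels={"app": "prometheus-example-app"},
    port_name="web",  # assumed named container port
)
print(json.dumps(manifest, indent=2))
```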
Prerequisites
You have access to the cluster as a user with the cluster-admin role or the monitoring-edit role.
You have enabled monitoring for user-defined projects.
For this example, you have deployed the prometheus-example-app sample service in the ns1 project.
Procedure
Create a YAML file for the ServiceMonitor resource configuration. In this example, the file is called example-app-service-monitor.yaml.
Add the ServiceMonitor resource configuration details to the file. The configuration defines a ServiceMonitor resource that scrapes the metrics exposed by the prometheus-example-app sample service, which includes the version metric.
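A ServiceMonitor configuration for this example might look like the following sketch. It builds the manifest as a Python dictionary and prints it as JSON, which is also valid YAML, so you can save the output under the file name used in this procedure. The namespace, the web port name, and the app label come from the sample service. The resource name prometheus-example-monitor and the 30s scrape interval are assumptions made for illustration.

```python
import json

# The resource name and the 30s interval are illustrative assumptions.
# The namespace, port name, and label match the sample service.
SERVICE_MONITOR = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "ServiceMonitor",
    "metadata": {
        "name": "prometheus-example-monitor",  # hypothetical name
        "namespace": "ns1",
    },
    "spec": {
        # Scrape the Service port named "web" over plain HTTP.
        "endpoints": [{"port": "web", "scheme": "http", "interval": "30s"}],
        # Select Services that carry the sample application's label.
        "selector": {"matchLabels": {"app": "prometheus-example-app"}},
    },
}


def render_manifest(manifest):
    """Serialize a manifest as indented JSON, which is also valid YAML."""
    return json.dumps(manifest, indent=2) + "\n"


if __name__ == "__main__":
    print(render_manifest(SERVICE_MONITOR), end="")
```

Redirecting the output to example-app-service-monitor.yaml gives you a file that you can apply in the next step.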
Apply the configuration to the cluster:
$ oc apply -f example-app-service-monitor.yaml
You can check that the ServiceMonitor resource is running:
$ oc -n ns1 get servicemonitor
Additional resources
- See the Prometheus Operator API documentation for more information on ServiceMonitor and PodMonitor resources.
The OKD monitoring dashboard enables you to run Prometheus Query Language (PromQL) queries to examine metrics visualized on a plot. This functionality provides information about the state of a cluster and any user-defined workloads that you are monitoring.
As a cluster administrator, you can query metrics for all core OKD and user-defined projects.
As a developer, you must specify a project name when querying metrics. You must have the required privileges to view metrics for the selected project.
As a cluster administrator or as a user with view permissions for all projects, you can access metrics for all default OKD and user-defined projects in the Metrics UI.
Prerequisites
You have access to the cluster as a user with the cluster-admin role or with view permissions for all projects.
You have installed the OpenShift CLI (oc).
Procedure
In the Administrator perspective within the OKD web console, select Monitoring → Metrics.
Select Insert Metric at Cursor to view a list of predefined queries.
To create a custom query, add your Prometheus Query Language (PromQL) query to the Expression field.
To add multiple queries, select Add Query.
To delete a query, select the options menu next to the query, then choose Delete query.
To disable a query from being run, select the options menu next to the query and choose Disable query.
Select Run Queries to run the queries that you have created. The metrics from the queries are visualized on the plot. If a query is invalid, the UI shows an error message.
Queries that operate on large amounts of data might time out or overload the browser when drawing time series graphs. To avoid this, select Hide graph and calibrate your query using only the metrics table. Then, after finding a feasible query, enable the plot to draw the graphs.
Optional: The page URL now contains the queries you ran. To use this set of queries again in the future, save this URL.
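The PromQL expressions that you enter in the Expression field can also be sent to the Prometheus HTTP API, which accepts instant queries at /api/v1/query. The following sketch only builds the request URL. The base URL is a placeholder, the function name instant_query_url is hypothetical, and the route and authentication needed to reach the API in your cluster are not covered here.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql, at=None):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = {"query": promql}
    if at is not None:
        # Optional evaluation time as a Unix timestamp in seconds.
        params["time"] = f"{at:.3f}"
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Per-code request rate of the sample application over the last 5 minutes.
QUERY = 'sum by (code) (rate(http_requests_total{namespace="ns1"}[5m]))'

# Placeholder host: replace it with the query endpoint of your cluster.
url = instant_query_url("https://prometheus.example.com", QUERY)
print(url)
```

Encoding the expression with urlencode matters because PromQL routinely contains braces, quotes, and spaces that are not safe in a raw URL.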
Additional resources
- See the Prometheus query documentation for more information about creating PromQL queries.
You can access metrics for a user-defined project as a developer or as a user with view permissions for the project.
In the Developer perspective, the Metrics UI includes some predefined CPU, memory, bandwidth, and network packet queries for the selected project. You can also run custom Prometheus Query Language (PromQL) queries for CPU, memory, bandwidth, network packet and application metrics for the project.
Prerequisites
You have enabled monitoring for user-defined projects.
You have deployed a service in a user-defined project.
You have created a ServiceMonitor custom resource definition (CRD) for the service to define how the service is monitored.
Procedure
From the Developer perspective in the OKD web console, select Monitoring → Metrics.
Select the project that you want to view metrics for in the Project: list.
Choose a query from the Select Query list, or run a custom PromQL query by selecting Show PromQL.
In the Developer perspective, you can only run one query at a time.
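In PromQL terms, restricting a custom query to one project corresponds to a matcher on the namespace label. The following sketch composes such a selector. It assumes that the samples carry a namespace label, and the function names are hypothetical. The console applies your project selection for you, so this is mainly useful for understanding or reusing queries elsewhere.

```python
def _escape(value):
    """Escape backslashes and double quotes for a PromQL label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def scope_to_namespace(metric, namespace, **labels):
    """Return a PromQL selector for a metric restricted to one namespace."""
    # The namespace matcher always wins over a same-named keyword argument.
    matchers = {**labels, "namespace": namespace}
    rendered = ",".join(
        f'{key}="{_escape(val)}"' for key, val in sorted(matchers.items())
    )
    return f"{metric}{{{rendered}}}"


print(scope_to_namespace("http_requests_total", "ns1", code="404"))
# prints: http_requests_total{code="404",namespace="ns1"}
```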
Additional resources
- See the Prometheus query documentation for more information about creating PromQL queries.
- See the section on querying metrics for user-defined projects as a developer for details on accessing non-cluster metrics as a developer or a privileged user.
After running the queries, the metrics are displayed on an interactive plot. The X-axis in the plot represents time and the Y-axis represents metric values. Each metric is shown as a colored line on the graph. You can manipulate the plot interactively and explore the metrics.
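The data behind such a plot has a simple shape: one series per label set, each holding a list of timestamped values. The following sketch converts a range-query response from the Prometheus HTTP API (result type matrix) into that shape. The sample response contains made-up values, and the function name series_from_matrix is hypothetical.

```python
def series_from_matrix(response):
    """Map each series' label set to its list of (timestamp, value) points."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    data = response["data"]
    if data["resultType"] != "matrix":
        raise ValueError("expected a range-query (matrix) result")
    series = {}
    for item in data["result"]:
        # A sorted tuple of label pairs is hashable and order independent.
        key = tuple(sorted(item["metric"].items()))
        # The API returns sample values as strings, so convert them to floats.
        series[key] = [(float(ts), float(val)) for ts, val in item["values"]]
    return series


# Made-up response for illustration: one series with two points.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"code": "200", "method": "get"},
                "values": [[1700000000, "4"], [1700000030, "5"]],
            }
        ],
    },
}

for labels, points in series_from_matrix(SAMPLE).items():
    print(dict(labels), points)
```

Each key corresponds to one colored line on the plot, and each point supplies the time on the X-axis and the value on the Y-axis.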
Procedure
In the Administrator perspective:
Initially, all metrics from all enabled queries are shown on the plot. You can select which metrics are shown.
To hide all metrics from a query, click the options menu for the query and click Hide all series.
To hide a specific metric, go to the query table and click the colored square near the metric name.
To zoom into the plot and change the time range, do one of the following:
Visually select the time range by clicking and dragging on the plot horizontally.
Use the menu in the upper-left corner to select the time range.
To reset the time range, select Reset Zoom.
To display outputs for all queries at a specific point in time, hold the mouse cursor on the plot at that point. The query outputs will appear in a pop-up box.
To hide the plot, select Hide Graph.
In the Developer perspective:
To zoom into the plot and change the time range, do one of the following:
Visually select the time range by clicking and dragging on the plot horizontally.
Use the menu in the upper-left corner to select the time range.
To reset the time range, select Reset Zoom.
Additional resources
- See the Querying metrics section on using the PromQL interface