Verifying node health
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
You have installed the OpenShift CLI (oc).
Procedure
List the name, status, and role for all nodes in the cluster:
$ oc get nodes
Summarize CPU and memory usage for each node within the cluster:
$ oc adm top nodes
Summarize CPU and memory usage for a specific node:
$ oc adm top node my-node
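The checks above can be combined with a small filter. The following is a minimal sketch, not part of the OpenShift tooling: the function names not_ready_nodes and busy_nodes and the default threshold of 80 are hypothetical, and the parsing assumes the default column layouts of oc get nodes (NAME STATUS ROLES AGE VERSION) and oc adm top nodes (NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%).

```shell
# Hypothetical helpers. Both read command output from stdin and never call oc themselves.

# Print the names of nodes whose STATUS column is not exactly "Ready".
# Assumes the default `oc get nodes` columns: NAME STATUS ROLES AGE VERSION.
# A cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Print the names of nodes whose CPU% or MEMORY% is at or above a threshold.
# Assumes the default `oc adm top nodes` columns:
# NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%.
busy_nodes() {
  local threshold="${1:-80}"
  awk -v t="$threshold" 'NR > 1 {
    cpu = $3; mem = $5
    sub(/%/, "", cpu); sub(/%/, "", mem)   # strip the percent sign
    if (cpu + 0 >= t || mem + 0 >= t) print $1
  }'
}
```

Under those assumptions, the helpers would be used as oc get nodes | not_ready_nodes and oc adm top nodes | busy_nodes 90.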
You can review cluster node health status, resource consumption statistics, and node logs. Additionally, you can query kubelet status on individual nodes.
Prerequisites
You have installed the OpenShift CLI (oc).
Procedure
The kubelet is managed using a systemd service on each node. Review the kubelet’s status by querying the kubelet systemd service within a debug pod.
Start a debug pod for a node:
$ oc debug node/my-node
Set /host as the root directory within the debug shell. The debug pod mounts the host’s root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:
# chroot /host
Check whether the kubelet systemd service is active on the node:
# systemctl is-active kubelet
Output a more detailed kubelet.service status summary:
# systemctl status kubelet
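The interactive steps in this procedure can also be run in a single pass. The following is a minimal sketch, assuming that oc debug runs the command given after -- inside the debug pod and then exits; kubelet_state is a hypothetical helper name, and the node name is supplied by the caller.

```shell
# Hypothetical helper: print the kubelet systemd state for one node.
# Runs the same commands as the interactive procedure, without opening a shell.
kubelet_state() {
  local node="${1:-}"
  if [ -z "$node" ]; then
    echo "usage: kubelet_state <node-name>" >&2
    return 2
  fi
  # chroot /host makes the host's systemctl binary available inside the debug pod.
  oc debug "node/${node}" -- chroot /host systemctl is-active kubelet
}
```

For example, kubelet_state my-node would print the state reported by systemctl is-active for that node.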
You can gather journald unit logs and other logs within /var/log on individual cluster nodes.
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
You have installed the OpenShift CLI (oc).
You have SSH access to your hosts.
Procedure
Query kubelet journald unit logs from OKD cluster nodes. The following example queries control plane nodes only:
$ oc adm node-logs --role=master -u kubelet
Collect logs from specific subdirectories under /var/log/ on cluster nodes.
Retrieve a list of logs contained within a /var/log/ subdirectory. The following example lists files in /var/log/openshift-apiserver/ on all control plane nodes:
$ oc adm node-logs --role=master --path=openshift-apiserver
Inspect a specific log within a /var/log/ subdirectory. The following example outputs /var/log/openshift-apiserver/audit.log contents from all control plane nodes:
$ oc adm node-logs --role=master --path=openshift-apiserver/audit.log
If the API is not functional, review the logs on each node using SSH instead. The following example tails /var/log/openshift-apiserver/audit.log:
$ ssh core@<master-node>.<cluster_name>.<base_domain> sudo tail -f /var/log/openshift-apiserver/audit.log
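The fallback from the API to SSH can be sketched as a single function. This is an illustration under assumptions, not an official tool: tail_audit_log is a hypothetical name, the host argument stands for <master-node>.<cluster_name>.<base_domain>, and a non-zero exit status from oc adm node-logs is treated as a sign that the API is not functional. The sketch prints the last lines with tail -n rather than following with tail -f, so that it terminates.

```shell
# Hypothetical helper: read the API server audit log from one control plane node,
# trying the API first and falling back to SSH.
# Usage: tail_audit_log <node-name> <ssh-host> [lines]
tail_audit_log() {
  local node="${1:-}" host="${2:-}" lines="${3:-100}"
  local log="openshift-apiserver/audit.log"
  local out
  if [ -z "$node" ] || [ -z "$host" ]; then
    echo "usage: tail_audit_log <node-name> <ssh-host> [lines]" >&2
    return 2
  fi
  # Preferred path: fetch the log through the API.
  # Assumption: a failing oc command means the API is unavailable.
  if out="$(oc adm node-logs "$node" --path="$log" 2>/dev/null)"; then
    printf '%s\n' "$out" | tail -n "$lines"
    return 0
  fi
  # Fallback: read the same file directly on the host over SSH.
  ssh "core@${host}" sudo tail -n "$lines" "/var/log/${log}"
}
```

Capturing the whole log in a variable keeps the sketch short but is wasteful for large audit logs; a production script would likely stream the output instead.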