Verifying node health

Prerequisites

You have access to the cluster as a user with the cluster-admin role.
You have installed the OpenShift CLI (oc).

Procedure

List the name, status, and role for all nodes in the cluster:
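A minimal sketch of this step, assuming the standard oc get nodes subcommand and an active login session. The function names list_nodes and list_unready_nodes are hypothetical wrappers, not part of oc:

```shell
# list_nodes is a hypothetical wrapper name.
# "oc get nodes" prints one row per node with NAME, STATUS, ROLES, AGE, and VERSION.
list_nodes() {
  oc get nodes
}

# list_unready_nodes is a hypothetical helper.
# It prints the name and status of every node whose STATUS column
# is not exactly "Ready".
list_unready_nodes() {
  oc get nodes --no-headers | awk '$2 != "Ready" { print $1, $2 }'
}
```

Run list_nodes for the full table. The second helper is an optional filter that compares the STATUS column against the literal string Ready, so a cordoned node (Ready,SchedulingDisabled) is reported as well.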
Summarize CPU and memory usage for each node within the cluster:
```
$ oc adm top nodes
```
Summarize CPU and memory usage for a specific node:
```
$ oc adm top node my-node
```

You can review cluster node health status, resource consumption statistics, and node logs. Additionally, you can query kubelet status on individual nodes.

Prerequisites

You have installed the OpenShift CLI (oc).

Procedure

The kubelet is managed using a systemd service on each node. Review the kubelet’s status by querying the kubelet systemd service within a debug pod.
1. Start a debug pod for a node by running oc debug node/my-node, where my-node is the name of the node.
2. Set /host as the root directory within the debug shell. The debug pod mounts the host’s root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:
```
# chroot /host
```
3. Check whether the kubelet systemd service is active on the node:
```
# systemctl is-active kubelet
```
4. Output a more detailed kubelet.service status summary:
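The steps above can be sketched non-interactively, assuming oc debug is given a command after -- instead of opening a shell. The helper name kubelet_status and its node argument are hypothetical. Inside an interactive debug shell, the equivalent of step 4 would be systemctl status kubelet:

```shell
# kubelet_status is a hypothetical helper name; pass the node name as $1.
# "oc debug node/<name>" starts a debug pod on the node, and everything
# after "--" runs inside that pod instead of an interactive shell.
kubelet_status() {
  local node="$1"
  # Step 3: prints "active" when the kubelet systemd service is running.
  oc debug "node/${node}" -- chroot /host systemctl is-active kubelet
  # Step 4: prints the detailed kubelet.service status summary.
  oc debug "node/${node}" -- chroot /host systemctl status kubelet
}
```

For example, kubelet_status my-node checks the node used in the earlier examples. Each oc debug call in this sketch starts its own short-lived debug pod, which is slower than running both systemctl commands in one interactive session.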

You can gather journald unit logs and other logs within /var/log on individual cluster nodes.

Prerequisites

You have access to the cluster as a user with the cluster-admin role.
You have installed the OpenShift CLI (oc).
You have SSH access to your hosts.

Procedure

Query kubelet journald unit logs from OKD cluster nodes. The following example queries control plane nodes only:
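A minimal sketch of this query, assuming the -u flag of oc adm node-logs selects the journald unit and --role=master selects control plane nodes, as in the later examples. The wrapper name node_unit_logs and its two arguments are hypothetical:

```shell
# node_unit_logs is a hypothetical wrapper name.
# $1: node role to query (for example, master for control plane nodes)
# $2: journald unit whose logs are returned (for example, kubelet)
node_unit_logs() {
  local role="$1" unit="$2"
  oc adm node-logs --role="${role}" -u "${unit}"
}
```

Calling node_unit_logs master kubelet corresponds to the control plane example described above. Passing a different unit name as the second argument would query that unit instead.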
Collect logs from specific subdirectories under /var/log/ on cluster nodes.
1. Retrieve a list of logs contained within a /var/log/ subdirectory. The following example lists files in /var/log/openshift-apiserver/ on all control plane nodes:
```
$ oc adm node-logs --role=master --path=openshift-apiserver
```
2. Inspect a specific log within a /var/log/ subdirectory by passing its relative path to --path. For example, --path=openshift-apiserver/audit.log outputs /var/log/openshift-apiserver/audit.log contents from all control plane nodes.
3. If the API is not functional, review the logs on each node using SSH instead. The following example tails /var/log/openshift-apiserver/audit.log:
```
$ ssh core@<master-node>.<cluster_name>.<base_domain> sudo tail -f /var/log/openshift-apiserver/audit.log
```
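Steps 1 and 2 differ only in the value given to --path, which is resolved relative to /var/log/ on each node. A short sketch of both, assuming the API is functional; the wrapper name node_log_path and its arguments are hypothetical:

```shell
# node_log_path is a hypothetical wrapper name.
# $1: node role to query (for example, master for control plane nodes)
# $2: path relative to /var/log/ on the node; a directory lists its files,
#     a file outputs its contents
node_log_path() {
  local role="$1" path="$2"
  oc adm node-logs --role="${role}" --path="${path}"
}
```

With this sketch, node_log_path master openshift-apiserver lists the available files, and node_log_path master openshift-apiserver/audit.log outputs the audit log. If the API is unavailable, the wrapper cannot work and the SSH command from step 3 is the fallback.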