Investigating pod issues
After a pod is defined, it is assigned to run on a node until its containers exit, or until it is removed. Depending on policy and exit code, pods are either removed after exiting or retained so that their logs can be accessed.
The first thing to check when pod issues arise is the pod’s status. If an explicit pod failure has occurred, observe the pod’s error state to identify specific image, container, or pod network issues. Focus diagnostic data collection according to the error state. Review pod event messages, as well as pod and container log information. Diagnose issues dynamically by accessing running pods on the command line, or start a debug pod with root access based on a problematic pod’s deployment configuration.
Pod failures return explicit error states that can be observed in the STATUS field in the output of oc get pods. Pod error states cover image, container, and container network related failures.
The following table provides a list of pod error states along with their descriptions.
Reviewing pod status
You can query pod status and error states. You can also query a pod’s associated deployment configuration and review base image availability.
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
You have installed the OpenShift CLI (oc).
skopeo is installed.
Procedure
Switch into a project:
$ oc project <project_name>
List pods running within the namespace, as well as pod status, error states, restarts, and age:
$ oc get pods
Determine whether the namespace is managed by a deployment configuration:
$ oc status
If the namespace is managed by a deployment configuration, the output includes the deployment configuration name and a base image reference.
Inspect the base image referenced in the preceding command’s output:
$ skopeo inspect docker://<image_reference>
If the base image reference is not correct, update the reference in the deployment configuration:
$ oc edit deployment/my-deployment
When you save the deployment configuration changes and exit the editor, the configuration automatically redeploys. Watch pod status as the deployment progresses to determine whether the issue has been resolved:
$ oc get pods -w
Review events within the namespace for diagnostic information relating to pod failures:
$ oc get events
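When a namespace holds many pods, it can help to narrow the oc get pods listing to the pods that need attention. The following is a minimal sketch, assuming the default column order (NAME, READY, STATUS, RESTARTS, AGE); list_unhealthy_pods is a hypothetical helper name, not an oc subcommand.

```shell
# Print the name and status of every pod whose STATUS is neither
# Running nor Completed. Reads `oc get pods --no-headers` output on
# standard input and assumes STATUS is the third column.
# list_unhealthy_pods is a hypothetical helper name for this sketch.
list_unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (requires a logged-in oc session):
#   oc get pods --no-headers | list_unhealthy_pods
```

The helper only filters text, so it does not contact the cluster itself. A pod that reports Running can still be unready; check the READY column separately.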
Inspecting pod and container logs
You can inspect pod and container logs for warnings and error messages related to explicit pod failures. Depending on policy and exit code, pod and container logs remain available after pods have been terminated.
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
Your API service is still functional.
You have installed the OpenShift CLI (oc).
Procedure
Query logs for a specific pod:
$ oc logs <pod_name>
Query logs for a specific container within a pod:
$ oc logs <pod_name> -c <container_name>
Logs retrieved using the preceding oc logs commands are composed of messages sent to stdout within pods or containers.
Inspect logs contained in /var/log/ within a pod.
List log files and subdirectories contained in /var/log within a pod:
$ oc exec <pod_name> -- ls -alh /var/log
Query a specific log file contained in /var/log within a pod:
$ oc exec <pod_name> -- cat /var/log/<path_to_log>
List log files and subdirectories contained in /var/log within a specific container:
$ oc exec <pod_name> -c <container_name> -- ls /var/log
Query a specific log file contained in /var/log within a specific container:
$ oc exec <pod_name> -c <container_name> -- cat /var/log/<path_to_log>
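When a pod runs several containers, the per-container oc logs query can be repeated in a loop. The following is a minimal sketch, assuming the container names are read from the pod specification with a jsonpath query; collect_pod_logs and the output file naming are hypothetical choices made for this example.

```shell
# Save the logs of every container in a pod to
# <out_dir>/<pod>-<container>.log.
# collect_pod_logs is a hypothetical helper name for this sketch.
# The jsonpath expression reads only spec.containers, so init
# containers are not included.
collect_pod_logs() {
  pod="$1"
  out_dir="$2"
  mkdir -p "$out_dir" || return 1
  containers=$(oc get pod "$pod" -o jsonpath='{.spec.containers[*].name}') || return 1
  for container in $containers; do
    oc logs "$pod" -c "$container" > "$out_dir/$pod-$container.log"
  done
}

# Usage (requires a logged-in oc session and the correct project):
#   collect_pod_logs <pod_name> ./pod-logs
```

Writing one file per container keeps the output easy to attach to a support case or compare between restarts.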
Accessing running pods
You can review running pods dynamically by opening a shell inside a pod or by gaining network access through port forwarding.
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
Your API service is still functional.
You have installed the OpenShift CLI (oc).
Procedure
Switch into the project that contains the pod you would like to access. This is necessary because the oc rsh command does not accept the -n namespace option:
$ oc project <namespace>
Start a remote shell into a pod:
$ oc rsh <pod_name>
If a pod has multiple containers, oc rsh defaults to the first container unless -c <container_name> is specified.
Create a port forwarding session to a port on a pod:
$ oc port-forward <pod_name> <host_port>:<pod_port>
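A port forwarding session runs in the foreground until it is cancelled, so a scripted check needs to start it in the background and stop it afterwards. The following is a minimal sketch of that pattern; with_port_forward and the PF_WAIT variable are hypothetical names, and the fixed wait is a crude stand-in for a proper readiness check.

```shell
# Run a command while a port forwarding session is open, then close
# the session and return the command's exit status.
# with_port_forward and PF_WAIT are hypothetical names for this sketch.
with_port_forward() {
  pod="$1"
  ports="$2"
  shift 2
  # Start the session in the background and remember its process ID.
  oc port-forward "$pod" "$ports" >/dev/null 2>&1 &
  pf_pid=$!
  # Crude wait for the session to come up; adjust with PF_WAIT (seconds).
  sleep "${PF_WAIT:-2}"
  status=0
  "$@" || status=$?
  # Stop the session regardless of how the command ended.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Usage (assumes curl is installed and the pod serves HTTP on port 8080):
#   with_port_forward <pod_name> 8080:8080 curl -s http://localhost:8080/
```

Returning the wrapped command's exit status lets the helper be used directly in a condition or a larger diagnostic script.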
Starting debug pods with root access
You can start a debug pod with root access, based on a problematic pod’s deployment or deployment configuration. Pod users typically run with non-root privileges, but running troubleshooting pods with temporary root privileges can be useful during issue investigation.
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
Your API service is still functional.
You have installed the OpenShift CLI (oc).
Procedure
Start a debug pod with root access, based on a deployment.
Obtain a project’s deployment name:
$ oc get deployment -n <project_name>
Start a debug pod with root privileges, based on the deployment:
$ oc debug deployment/my-deployment --as-root -n <project_name>
Start a debug pod with root access, based on a deployment configuration.
Obtain a project’s deployment configuration name:
$ oc get deploymentconfigs -n <project_name>
Start a debug pod with root privileges, based on the deployment configuration:
$ oc debug deploymentconfig/my-deployment-configuration --as-root -n <project_name>
You can append |
Copying files to and from pods and containers
You can copy files to and from a pod to test configuration changes or gather diagnostic information.
Prerequisites
You have access to the cluster as a user with the cluster-admin role.
Your API service is still functional.
You have installed the OpenShift CLI (oc).
Procedure
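Copy a file to a pod:
$ oc cp <local_path> <pod_name>:<remote_path> -c <container_name>
Copy a file from a pod:
$ oc cp <pod_name>:<remote_path> <local_path> -c <container_name>
The first container in a pod is selected if the -c option is not specified.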
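When gathering diagnostic files, it can be convenient to wrap the copy step so that each file lands in a known local directory under its original name. The following is a minimal sketch using oc cp; fetch_from_pod and its argument order are hypothetical choices made for this example. Note that oc cp relies on the tar binary being present in the container.

```shell
# Copy one file out of a pod into a local directory, keeping its base name.
# fetch_from_pod is a hypothetical helper name for this sketch.
fetch_from_pod() {
  pod="$1"
  remote_path="$2"
  dest_dir="$3"
  container="${4:-}"   # optional; oc cp uses the first container when omitted
  mkdir -p "$dest_dir" || return 1
  dest="$dest_dir/$(basename "$remote_path")"
  if [ -n "$container" ]; then
    oc cp "$pod:$remote_path" "$dest" -c "$container"
  else
    oc cp "$pod:$remote_path" "$dest"
  fi
}

# Usage (requires a logged-in oc session and the correct project):
#   fetch_from_pod <pod_name> /var/log/<path_to_log> ./diagnostics <container_name>
```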