Application Introspection and Debugging

    For this example we’ll use a Deployment to create two pods, similar to the earlier example.

    application/nginx-with-request.yaml
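
    The linked file is applied by URL in the next command, so it is not reproduced in full here. A minimal sketch of what such a manifest looks like, consistent with the describe output further down (an nginx Deployment named nginx-deployment with two replicas and 500m CPU / 128Mi memory limits), though not necessarily the exact contents of the linked file:

        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: nginx-deployment
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: nginx
          template:
            metadata:
              labels:
                app: nginx
            spec:
              containers:
              - name: nginx
                image: nginx
                resources:
                  limits:
                    cpu: 500m
                    memory: 128Mi
                ports:
                - containerPort: 80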

    Create the Deployment by running the following command:

        kubectl apply -f https://k8s.io/examples/application/nginx-with-request.yaml

        deployment.apps/nginx-deployment created

    Check the status of the Pods with the following command:

        kubectl get pods

        NAME                                READY     STATUS    RESTARTS   AGE
        nginx-deployment-1006230814-6winp   1/1       Running   0          11s
        nginx-deployment-1006230814-fmgu3   1/1       Running   0          11s

    We can retrieve a lot more information about each of these pods using kubectl describe pod. For example:

        kubectl describe pod nginx-deployment-1006230814-6winp

        Name:           nginx-deployment-1006230814-6winp
        Namespace:      default
        Node:           kubernetes-node-wul5/10.240.0.9
        Start Time:     Thu, 24 Mar 2016 01:39:49 +0000
        Labels:         app=nginx,pod-template-hash=1006230814
        Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"nginx-deployment-1956810328","uid":"14e607e7-8ba1-11e7-b5cb-fa16" ...
        Status:         Running
        IP:             10.244.0.6
        Controllers:    ReplicaSet/nginx-deployment-1006230814
        Containers:
          nginx:
            Container ID:   docker://90315cc9f513c724e9957a4788d3e625a078de84750f244a40f97ae355eb1149
            Image:          nginx
            Image ID:       docker://6f62f48c4e55d700cf3eb1b5e33fa051802986b77b874cc351cce539e5163707
            Port:           80/TCP
            QoS Tier:
              cpu:          Guaranteed
              memory:       Guaranteed
            Limits:
              cpu:          500m
              memory:       128Mi
            Requests:
              memory:       128Mi
              cpu:          500m
            State:          Running
              Started:      Thu, 24 Mar 2016 01:39:51 +0000
            Ready:          True
            Restart Count:  0
            Environment:    <none>
            Mounts:
              /var/run/secrets/kubernetes.io/serviceaccount from default-token-5kdvl (ro)
        Conditions:
          Type           Status
          Initialized    True
          Ready          True
          PodScheduled   True
        Volumes:
          default-token-4bcbi:
            Type:        Secret (a volume populated by a Secret)
            SecretName:  default-token-4bcbi
            Optional:    false
        QoS Class:       Guaranteed
        Node-Selectors:  <none>
        Tolerations:     <none>
        Events:
          FirstSeen  LastSeen  Count  From                            SubobjectPath           Type    Reason     Message
          ---------  --------  -----  ----                            -------------           ----    ------     -------
          54s        54s       1      {default-scheduler }                                    Normal  Scheduled  Successfully assigned nginx-deployment-1006230814-6winp to kubernetes-node-wul5
          54s        54s       1      {kubelet kubernetes-node-wul5}  spec.containers{nginx}  Normal  Pulling    pulling image "nginx"
          53s        53s       1      {kubelet kubernetes-node-wul5}  spec.containers{nginx}  Normal  Pulled     Successfully pulled image "nginx"
          53s        53s       1      {kubelet kubernetes-node-wul5}  spec.containers{nginx}  Normal  Created    Created container with docker id 90315cc9f513
          53s        53s       1      {kubelet kubernetes-node-wul5}  spec.containers{nginx}  Normal  Started    Started container with docker id 90315cc9f513

    Here you can see configuration information about the container(s) and Pod (labels, resource requirements, etc.), as well as status information about the container(s) and Pod (state, readiness, restart count, events, etc.).

    Ready tells you whether the container passed its last readiness probe. (In this case the container does not have a readiness probe configured; a container without a readiness probe is assumed to be ready.)
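
    If you do want a readiness probe, it is configured per container in the Pod template. A minimal sketch (the path, port, and timing values are placeholders, not taken from the example above):

        # Added under the container entry in the Pod template
        # (spec.template.spec.containers[0] in the Deployment above):
        readinessProbe:
          httpGet:
            path: /      # path and port are placeholders for whatever the app serves
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10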

    Restart Count tells you how many times the container has been restarted; this information can be useful for detecting crash loops in containers that are configured with a restart policy of ‘Always’.
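
    To read a container’s restart count directly, without scanning the whole describe output, you can use kubectl’s JSONPath output (a sketch using the pod name from the example above):

        kubectl get pod nginx-deployment-1006230814-6winp \
          -o jsonpath='{.status.containerStatuses[*].restartCount}'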

    The Conditions section lists the Pod’s lifecycle conditions (Initialized, Ready, and PodScheduled in the output above). The binary Ready condition indicates that the Pod is able to serve requests and should be added to the load balancing pools of all matching Services.
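
    The conditions can also be read straight from the Pod’s status, for example with a JSONPath expression (a sketch using the pod name from the example above):

        kubectl get pod nginx-deployment-1006230814-6winp \
          -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'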

    Lastly, you see a log of recent events related to your Pod. The system compresses multiple identical events by indicating the first and last time it was seen and the number of times it was seen. “From” indicates the component that is logging the event, “SubobjectPath” tells you which object (e.g. container within the pod) is being referred to, and “Reason” and “Message” tell you what happened.
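
    To see only the events that involve a particular object, you can filter with a field selector (shown here for the pod from the example above):

        kubectl get events --field-selector involvedObject.name=nginx-deployment-1006230814-6winp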

    A common scenario that you can detect using events is when you’ve created a Pod that won’t fit on any node. For example, the Pod might request more resources than are free on any node, or it might specify a node selector that doesn’t match any nodes. Let’s say we created the previous Deployment with 5 replicas (instead of 2), with each Pod requesting 600 millicores instead of 500, on a four-node cluster where each (virtual) machine has 1 CPU. In that case some of the Pods will not be able to schedule. (Note that because of the cluster addon pods such as fluentd, skydns, etc., that run on each node, if we requested 1000 millicores then none of the Pods would be able to schedule.)
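
    As a sketch of how you might set up this scenario with standard kubectl commands (the 600m figure comes from the description above):

        kubectl scale deployment/nginx-deployment --replicas=5
        kubectl set resources deployment/nginx-deployment --requests=cpu=600m

    Listing the Pods again then produces output like the following: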

        kubectl get pods

        NAME                                READY     STATUS    RESTARTS   AGE
        nginx-deployment-1006230814-6winp   1/1       Running   0          7m
        nginx-deployment-1006230814-fmgu3   1/1       Running   0          7m
        nginx-deployment-1370807587-6ekbw   1/1       Running   0          1m
        nginx-deployment-1370807587-fg172   0/1       Pending   0          1m
        nginx-deployment-1370807587-fz9sd   0/1       Pending   0          1m

    To find out why the nginx-deployment-1370807587-fz9sd pod is not running, we can use kubectl describe pod on the pending Pod and look at its events:

        kubectl describe pod nginx-deployment-1370807587-fz9sd

        Name:           nginx-deployment-1370807587-fz9sd
        Namespace:      default
        Node:           /
        Labels:         app=nginx,pod-template-hash=1370807587
        Status:         Pending
        IP:
        Controllers:    ReplicaSet/nginx-deployment-1370807587
        Containers:
          nginx:
            Image:      nginx
            Port:       80/TCP
            QoS Tier:
              memory:   Guaranteed
              cpu:      Guaranteed
            Limits:
              cpu:      1
              memory:   128Mi
            Requests:
              cpu:      1
              memory:   128Mi
            Environment Variables:
        Volumes:
          default-token-4bcbi:
            Type:        Secret (a volume populated by a Secret)
            SecretName:  default-token-4bcbi
        Events:
          FirstSeen  LastSeen  Count  From                  SubobjectPath  Type     Reason            Message
          ---------  --------  -----  ----                  -------------  ----     ------            -------
          1m         48s       7      {default-scheduler }                 Warning  FailedScheduling  pod (nginx-deployment-1370807587-fz9sd) failed to fit in any node
                                        fit failure on node (kubernetes-node-6ta5): Node didn't have enough resource: CPU, requested: 1000, used: 1420, capacity: 2000
                                        fit failure on node (kubernetes-node-wul5): Node didn't have enough resource: CPU, requested: 1000, used: 1100, capacity: 2000

    To correct this situation, you can use kubectl scale to update your Deployment to specify four or fewer replicas. (Or you could leave the one Pod pending, which is harmless.)
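
    For example (assuming the Deployment is still named nginx-deployment, as above):

        kubectl scale deployment/nginx-deployment --replicas=4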

    Events such as the ones you saw at the end of kubectl describe pod are persisted in etcd and provide high-level information on what is happening in the cluster. To list all events you can use

        kubectl get events

    but you have to remember that events are namespaced. This means that if you’re interested in events for some namespaced object (e.g. what happened with Pods in namespace my-namespace) you need to explicitly provide a namespace to the command:

        kubectl get events --namespace=my-namespace

    To see events from all namespaces, you can use the --all-namespaces argument.
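
    For example:

        kubectl get events --all-namespaces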

    In addition to kubectl describe pod, another way to get extra information about a pod (beyond what is provided by kubectl get pod) is to pass the -o yaml output format flag to kubectl get pod. This gives you, in YAML format, even more information than kubectl describe pod: essentially all of the information the system has about the Pod. Here you will see things like annotations (key-value metadata that is free of the restrictions placed on labels and is used internally by Kubernetes system components), the restart policy, ports, and volumes.

        kubectl get pod nginx-deployment-1006230814-6winp -o yaml
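
    If you only need a couple of fields rather than the full object, the same -o flag also accepts JSONPath expressions; for example, a sketch that prints the Pod’s IP and restart policy (both standard fields of the Pod object):

        kubectl get pod nginx-deployment-1006230814-6winp \
          -o jsonpath='{.status.podIP}{"\n"}{.spec.restartPolicy}{"\n"}'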

    Sometimes when debugging it can be useful to look at the status of a node — for example, because you’ve noticed strange behavior of a Pod that’s running on the node, or to find out why a Pod won’t schedule onto the node. As with Pods, you can use kubectl describe node and kubectl get node -o yaml to retrieve detailed information about nodes. For example, here’s what you’ll see if a node is down (disconnected from the network, or kubelet dies and won’t restart, etc.). Notice the events that show the node is NotReady, and also notice that the pods are no longer running (they are evicted after five minutes of NotReady status).

        kubectl get nodes

        NAME                   STATUS     ROLES    AGE   VERSION
        kubernetes-node-861h   NotReady   <none>   1h    v1.13.0
        kubernetes-node-bols   Ready      <none>   1h    v1.13.0
        kubernetes-node-st6x   Ready      <none>   1h    v1.13.0
        kubernetes-node-unaj   Ready      <none>   1h    v1.13.0
        kubectl describe node kubernetes-node-861h

        Name:               kubernetes-node-861h
        Role:
        Labels:             kubernetes.io/arch=amd64
                            kubernetes.io/os=linux
                            kubernetes.io/hostname=kubernetes-node-861h
        Annotations:        node.alpha.kubernetes.io/ttl=0
                            volumes.kubernetes.io/controller-managed-attach-detach=true
        Taints:             <none>
        CreationTimestamp:  Mon, 04 Sep 2017 17:13:23 +0800
        Phase:
        Conditions:
          Type            Status   LastHeartbeatTime                LastTransitionTime               Reason             Message
          ----            ------   -----------------                ------------------               ------             -------
          OutOfDisk       Unknown  Fri, 08 Sep 2017 16:04:28 +0800  Fri, 08 Sep 2017 16:20:58 +0800  NodeStatusUnknown  Kubelet stopped posting node status.
          MemoryPressure  Unknown  Fri, 08 Sep 2017 16:04:28 +0800  Fri, 08 Sep 2017 16:20:58 +0800  NodeStatusUnknown  Kubelet stopped posting node status.
          DiskPressure    Unknown  Fri, 08 Sep 2017 16:04:28 +0800  Fri, 08 Sep 2017 16:20:58 +0800  NodeStatusUnknown  Kubelet stopped posting node status.
          Ready           Unknown  Fri, 08 Sep 2017 16:04:28 +0800  Fri, 08 Sep 2017 16:20:58 +0800  NodeStatusUnknown  Kubelet stopped posting node status.
        Addresses:          10.240.115.55,104.197.0.26
        Capacity:
          cpu:              2
          hugePages:        0
          memory:           4046788Ki
          pods:             110
        Allocatable:
          cpu:              1500m
          hugePages:        0
          pods:             110
        System Info:
          Machine ID:                 8e025a21a4254e11b028584d9d8b12c4
          System UUID:                349075D1-D169-4F25-9F2A-E886850C47E3
          Boot ID:                    5cd18b37-c5bd-4658-94e0-e436d3f110e0
          Kernel Version:             4.4.0-31-generic
          OS Image:                   Debian GNU/Linux 8 (jessie)
          Operating System:           linux
          Architecture:               amd64
          Container Runtime Version:  docker://1.12.5
          Kubelet Version:            v1.6.9+a3d1dfa6f4335
          Kube-Proxy Version:         v1.6.9+a3d1dfa6f4335
        ExternalID:                   15233045891481496305
        Non-terminated Pods:          (9 in total)
          Namespace  Name  CPU Requests  CPU Limits  Memory Requests  Memory Limits
          ---------  ----  ------------  ----------  ---------------  -------------
          ......
        Allocated resources:
          (Total limits may be over 100 percent, i.e., overcommitted.)
          CPU Requests  CPU Limits    Memory Requests   Memory Limits
          ------------  ----------    ---------------   -------------
          900m (60%)    2200m (146%)  1009286400 (66%)  5681286400 (375%)
        Events:         <none>
        kubectl get node kubernetes-node-861h -o yaml

        apiVersion: v1
        kind: Node
        metadata:
          creationTimestamp: 2015-07-10T21:32:29Z
          labels:
            kubernetes.io/hostname: kubernetes-node-861h
          name: kubernetes-node-861h
          resourceVersion: "757"
          uid: 2a69374e-274b-11e5-a234-42010af0d969
        spec:
          externalID: "15233045891481496305"
          podCIDR: 10.244.0.0/24
          providerID: gce://striped-torus-760/us-central1-b/kubernetes-node-861h
        status:
          addresses:
          - address: 10.240.115.55
            type: InternalIP
          - address: 104.197.0.26
            type: ExternalIP
          capacity:
            cpu: "1"
            memory: 3800808Ki
            pods: "100"
          conditions:
          - lastHeartbeatTime: 2015-07-10T21:34:32Z
            lastTransitionTime: 2015-07-10T21:35:15Z
            reason: Kubelet stopped posting node status.
            status: Unknown
            type: Ready
          nodeInfo:
            bootID: 4e316776-b40d-4f78-a4ea-ab0d73390897
            containerRuntimeVersion: docker://Unknown
            kernelVersion: 3.16.0-0.bpo.4-amd64
            kubeProxyVersion: v0.21.1-185-gffc5a86098dc01
            kubeletVersion: v0.21.1-185-gffc5a86098dc01
            machineID: ""
            osImage: Debian GNU/Linux 7 (wheezy)
            systemUUID: ABE5F6B4-D44B-108B-C46A-24CCE16C8B6E
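
    As a quick complement to the per-node commands above, you can also check the Ready condition across every node at once with a JSONPath query (a sketch; Ready is the same condition type shown in the output above):

        kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'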