Understanding node rebooting

Nodes that run critical infrastructure, such as the router or the registry, need extra care when they are rebooted. The same node evacuation process applies, but it is important to understand certain edge cases.

When rebooting nodes that host critical OKD infrastructure components, such as router pods, registry pods, and monitoring pods, ensure that there are at least three nodes available to run these components.

The following scenario demonstrates how service interruptions can occur with applications running on OKD when only two nodes are available:

Node A is marked unschedulable and all pods are evacuated.
The registry pod running on that node is now redeployed on node B. Node B is now running both registry pods.
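Node B is now marked unschedulable and is evacuated.
The service exposing the two pod endpoints on node B loses all endpoints for a brief period, until the pods are redeployed on node A.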

When using three nodes for infrastructure components, this process does not result in a service disruption. However, due to pod scheduling, the last node that is evacuated and brought back into rotation does not have a registry pod. One of the other nodes has two registry pods. To schedule the third registry pod on the last node, use pod anti-affinity to prevent the scheduler from locating two registry pods on the same node.
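
To check whether registry pods have ended up on the same node after a round of reboots, a small helper along the following lines can be used. This is a minimal sketch: the function name `nodes_with_multiple_pods` is hypothetical, and the usage comment assumes that the registry pods carry the label `registry=default` and live in the current project.

```shell
#!/bin/sh
# Print each node name that appears more than once on standard input.
# Input: one node name per line, one line per pod.
nodes_with_multiple_pods() {
  sort | uniq -d
}

# Example usage against a cluster (the label and project are assumptions):
#   oc get pods -l registry=default \
#     -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
#     | nodes_with_multiple_pods
```

Empty output means that no node hosts more than one matching pod.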

Additional information

For more information on pod anti-affinity, see Placing pods relative to other pods using affinity and anti-affinity rules.

Pod anti-affinity is slightly different from node anti-affinity. Node anti-affinity can be violated if there are no other suitable locations to deploy a pod. Pod anti-affinity can be set to either required or preferred.

Procedure

To reboot a node using pod anti-affinity:

Edit the pod specification to configure pod anti-affinity.
This procedure assumes that the container image registry pod has the label registry=default. Pod anti-affinity can use any Kubernetes match expression.
Enable the MatchInterPodAffinity scheduler predicate in the scheduling policy file.
Perform a graceful restart of the node.
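
The anti-affinity configuration in the first step might look like the following pod specification fragment. It is a sketch rather than a complete manifest: it assumes the registry=default label mentioned above, uses the preferred (soft) form of the rule, and the weight of 100 is only an illustrative value.

```yaml
# Fragment of a pod specification; not a complete manifest.
spec:
  affinity:
    podAntiAffinity:
      # "preferred" still lets the scheduler co-locate pods when no other node fits.
      # For a hard rule, use requiredDuringSchedulingIgnoredDuringExecution instead,
      # which takes a list of pod affinity terms without a weight.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                # illustrative value; the valid range is 1-100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: registry        # assumes the label registry=default
              operator: In
              values:
              - default
          # Treat each node (by host name) as a separate placement domain.
          topologyKey: kubernetes.io/hostname
```

With the preferred form, the scheduler tries to keep pods labeled registry=default on different nodes but can still place two on one node when nothing else is available, which matters while a node is cordoned.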

In most cases, a pod running an OKD router exposes a host port.

The PodFitsPorts scheduler predicate ensures that no two router pods using the same port can run on the same node, so pod anti-affinity is achieved. If the routers rely on IP failover for high availability, nothing else is needed.

For router pods relying on an external service such as AWS Elastic Load Balancing for high availability, it is that service’s responsibility to react to router pod restarts.

In rare cases, a router pod may not have a host port configured. In those cases, it is important to follow the recommended restart process for infrastructure nodes.
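
One way to find out whether a given router pod declares a host port is to query its specification. The following is a minimal sketch: the function name `pod_host_ports` and the overridable `OC` variable are assumptions of this example, and the pod name and namespace must be supplied by the caller.

```shell
#!/bin/sh
OC="${OC:-oc}"   # client binary; making it overridable is an assumption of this sketch

# Print the host ports declared by a pod's containers, or "none" if there are none.
# Arguments: pod name, namespace.
pod_host_ports() {
  host_ports=$("$OC" get pod "$1" -n "$2" \
    -o jsonpath='{.spec.containers[*].ports[*].hostPort}') || return 1
  if [ -n "$host_ports" ]; then
    echo "$host_ports"
  else
    echo "none"
  fi
}
```

If the function prints "none", the pod is not protected by host port conflicts, and the graceful restart process for infrastructure nodes applies.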

Procedure

To perform a graceful restart of a node:

Mark the node as unschedulable:
```
$ oc adm cordon <node1>
```
Drain the node to remove all the running pods:
Restart the node:
Mark the node as schedulable after the reboot is complete:
```
$ oc adm uncordon <node1>
```

Verify that the node is ready:

Example output


<node1> Ready   worker   6d22h   v1.18.3+b0068a8
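
The restart procedure above can be sketched as two shell functions, one for the steps before the reboot and one for the steps after it. The function names and the overridable `OC` variable are hypothetical. The `--ignore-daemonsets` flag is an assumption about what a typical drain needs, and pods that use emptyDir volumes or are not managed by a controller may require further flags, so check `oc adm drain --help` for your version. The reboot itself is left to the operator; one common approach is to open a debug shell with `oc debug node/<node1>` and run `chroot /host` followed by `systemctl reboot`.

```shell
#!/bin/sh
OC="${OC:-oc}"   # client binary; making it overridable is an assumption of this sketch

# Mark the node unschedulable, then evict its pods so that it can be rebooted.
# Argument: node name.
prepare_node_for_reboot() {
  "$OC" adm cordon "$1" || return 1
  # Daemon set pods cannot be moved to another node, so the drain skips them.
  "$OC" adm drain "$1" --ignore-daemonsets || return 1
}

# After the reboot, make the node schedulable again and print its status.
# Argument: node name.
return_node_to_service() {
  "$OC" adm uncordon "$1" || return 1
  "$OC" get node "$1"
}
```

Call `prepare_node_for_reboot` with the node name, reboot the node, and then call `return_node_to_service` with the same name. The final `oc get node` call prints the node's status line, which should report Ready.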

Additional information