Guide for scheduling Windows containers in Kubernetes

Configure an example deployment to run Windows containers on the Windows node
Highlight Windows specific functionality in Kubernetes

Before you begin

Create a Kubernetes cluster that includes a control plane and a worker node running Windows Server
It is important to note that creating and deploying services and workloads on Kubernetes behaves in much the same way for Linux and Windows containers. to interface with the cluster are identical. The example in the section below is provided to jumpstart your experience with Windows containers.

The example YAML file below deploys a simple webserver application running inside a Windows container.

Create a service spec named with the contents below:

Note: Port mapping is also supported, but for simplicity this example exposes port 80 of the container directly to the Service.

Check that all nodes are healthy:
```
kubectl get nodes
```
Deploy the service and watch for pod updates:

When the service is deployed correctly both Pods are marked as Ready. To exit the watch command, press Ctrl+C.
Check that the deployment succeeded. To verify:
- Two pods listed from the Linux control plane node, use kubectl get pods
- Node-to-pod communication across the network, curl port 80 of your pod IPs from the Linux control plane node to check for a web server response
- Pod-to-pod communication, ping between pods (and across hosts, if you have more than one Windows node) using docker exec or kubectl exec
- Service-to-pod communication, curl the virtual service IP (seen under kubectl get services) from the Linux control plane node and from individual pods
- Service discovery, curl the service name with the Kubernetes
- Inbound connectivity, curl the NodePort from the Linux control plane node or machines outside of the cluster

Note: Windows container hosts are not able to access the IP of services scheduled on them due to current platform limitations of the Windows networking stack. Only Windows pods are able to access service IPs.

Observability

Follow the instructions in the LogMonitor GitHub page to copy its binaries and configuration files to all your containers and add the necessary entrypoints for LogMonitor to push your logs to STDOUT.

Using configurable Container usernames

Windows containers can be configured to run their entrypoints and processes with different usernames than the image defaults. Learn more about it here.

Windows container workloads can be configured to use Group Managed Service Accounts (GMSA). Group Managed Service Accounts are a specific type of Active Directory account that provide automatic password management, simplified service principal name (SPN) management, and the ability to delegate the management to other administrators across multiple servers. Containers configured with a GMSA can access external Active Directory Domain resources while carrying the identity configured with the GMSA. Learn more about configuring and using GMSA for Windows containers here.

Taints and Tolerations

Users need to use some combination of taints and node selectors in order to schedule Linux and Windows workloads to their respective OS-specific nodes. The recommended approach is outlined below, with one of its main goals being that this approach should not break compatibility for existing Linux workloads.

Starting from 1.25, you can (and should) set .spec.os.name for each Pod, to indicate the operating system that the containers in that Pod are designed for. For Pods that run Linux containers, set .spec.os.name to linux. For Pods that run Windows containers, set .spec.os.name to windows.

Note: Starting from 1.25, the IdentifyPodOS feature is in GA stage and defaults to be enabled.

The scheduler does not use the value of .spec.os.name when assigning Pods to nodes. You should use normal Kubernetes mechanisms for assigning pods to nodes to ensure that the control plane for your cluster places pods onto nodes that are running the appropriate operating system.

The .spec.os.name value has no effect on the scheduling of the Windows pods, so taints and tolerations and node selectors are still required to ensure that the Windows pods land onto appropriate Windows nodes.

Ensuring OS-specific workloads land on the appropriate container host

kubernetes.io/os = [windows|linux]
kubernetes.io/arch = [amd64|arm64|…]

If a Pod specification does not specify a nodeSelector like "kubernetes.io/os": windows, it is possible the Pod can be scheduled on any host, Windows or Linux. This can be problematic since a Windows container can only run on Windows and a Linux container can only run on Linux. The best practice is to use a nodeSelector.

However, we understand that in many cases users have a pre-existing large number of deployments for Linux containers, as well as an ecosystem of off-the-shelf configurations, such as community Helm charts, and programmatic Pod generation cases, such as with Operators. In those situations, you may be hesitant to make the configuration change to add nodeSelectors. The alternative is to use Taints. Because the kubelet can set Taints during registration, it could easily be modified to automatically add a taint when running on Windows only.

For example: --register-with-taints='os=windows:NoSchedule'

By adding a taint to all Windows nodes, nothing will be scheduled on them (that includes existing Linux Pods). In order for a Windows Pod to be scheduled on a Windows node, it would need both the nodeSelector and the appropriate matching toleration to choose Windows.

nodeSelector:
    kubernetes.io/os: windows
    node.kubernetes.io/windows-build: '10.0.17763'
    - key: "os"
      operator: "Equal"
      value: "windows"
      effect: "NoSchedule"

The Windows Server version used by each pod must match that of the node. If you want to use multiple Windows Server versions in the same cluster, then you should set additional node labels and nodeSelectors.

Kubernetes 1.17 automatically adds a new label node.kubernetes.io/windows-build to simplify this. If you’re running an older version, then it’s recommended to add this label manually to Windows nodes.

This label reflects the Windows major, minor, and build number that need to match for compatibility. Here are values used today for each Windows Server version.

Simplifying with RuntimeClass

RuntimeClass can be used to simplify the process of using taints and tolerations. A cluster administrator can create a RuntimeClass object which is used to encapsulate these taints and tolerations.

Save this file to runtimeClasses.yml. It includes the appropriate nodeSelector for the Windows OS, architecture, and version.

Run kubectl create -f runtimeClasses.yml using as a cluster administrator
Add runtimeClassName: windows-2019 as appropriate to Pod specs

apiVersion: apps/v1
kind: Deployment
metadata:
  name: iis-2019
  labels:
    app: iis-2019
spec:
  replicas: 1
  template:
    metadata:
      name: iis-2019
      labels:
        app: iis-2019
      runtimeClassName: windows-2019
      - name: iis
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
        resources:
          limits:
            cpu: 1
            memory: 800Mi
          requests:
            cpu: .1
            memory: 300Mi
        ports:
          - containerPort: 80
 selector:
    matchLabels:
      app: iis-2019
---
apiVersion: v1
kind: Service
metadata:
  name: iis
spec:
  type: LoadBalancer
  ports:
  - protocol: TCP
    port: 80