Migrate Replicated Control Plane To Use Cloud Controller Manager

    The cloud-controller-manager is a Kubernetes control plane component that embeds cloud-specific control logic. The cloud controller manager lets you link your cluster into your cloud provider’s API, and separates out the components that interact with that cloud platform from components that only interact with your cluster.

    By decoupling the interoperability logic between Kubernetes and the underlying cloud infrastructure, the cloud-controller-manager component enables cloud providers to release features at a different pace compared to the main Kubernetes project.

    As part of the , all cloud specific controllers must be moved out of the kube-controller-manager. All existing clusters that run cloud controllers in the kube-controller-manager must migrate to instead run the controllers in a cloud provider specific cloud-controller-manager.

    Leader Migration provides a mechanism in which HA clusters can safely migrate “cloud specific” controllers between the kube-controller-manager and the cloud-controller-manager via a shared resource lock between the two components while upgrading the replicated control plane. For a single-node control plane, or if unavailability of controller managers can be tolerated during the upgrade, Leader Migration is not needed and this guide can be ignored.

    Leader Migration can be enabled by setting --enable-leader-migration on kube-controller-manager or cloud-controller-manager. Leader Migration only applies during the upgrade and can be safely disabled or left enabled after the upgrade is complete.

    This guide walks you through the manual process of upgrading the control plane from kube-controller-manager with built-in cloud provider to running both kube-controller-manager and cloud-controller-manager. If you use a tool to administrator the cluster, please refer to the documentation of the tool and the cloud provider for more details.

    It is assumed that the control plane is running Kubernetes version N and to be upgraded to version N + 1. Although it is possible to migrate within the same version, ideally the migration should be performed as part of an upgrade so that changes of configuration can be aligned to each release. The exact versions of N and N + 1 depend on each cloud provider. For example, if a cloud provider builds a cloud-controller-manager to work with Kubernetes 1.22, then N can be 1.21 and N + 1 can be 1.22.

    The control plane nodes should run kube-controller-manager with Leader Election enabled through --leader-elect=true. As of version N, an in-tree cloud privider must be set with --cloud-provider flag and cloud-controller-manager should not yet be deployed.

    The out-of-tree cloud provider must have built a cloud-controller-manager with Leader Migration implementation. If the cloud provider imports k8s.io/cloud-provider and k8s.io/controller-manager of version v0.21.0 or later, Leader Migration will be available. However, for version before v0.22.0, Leader Migration is alpha and requires feature gate ControllerManagerLeaderMigration to be enabled.

    For authorization, this guide assumes that the cluster uses RBAC. If another authorization mode grants permissions to kube-controller-manager and cloud-controller-manager components, please grant the needed access in a way that matches the mode.

    The default permissions of the controller manager allow only accesses to their main Lease. In order for the migration to work, accesses to another Lease are required.

    You can grant kube-controller-manager full access to the leases API by modifying the system::leader-locking-kube-controller-manager role. This task guide assumes that the name of the migration lease is cloud-provider-extraction-migration.

    Do the same to the system::leader-locking-cloud-controller-manager role.

    kubectl patch -n kube-system role 'system::leader-locking-cloud-controller-manager' -p '{"rules": [ {"apiGroups":[ "coordination.k8s.io"], "resources": ["leases"], "resourceNames": ["cloud-provider-extraction-migration"], "verbs": ["create", "list", "get", "update"] } ]}' --type=merge

    Initial Leader Migration configuration

    Leader Migration optionally takes a configuration file representing the state of controller-to-manager assignment. At this moment, with in-tree cloud provider, kube-controller-manager runs route, service, and cloud-node-lifecycle. The following example configuration shows the assignment.

    Leader Migration can be enabled without a configuration. Please see for details.

    On each control plane node, save the content to /etc/leadermigration.conf, and update the manifest of kube-controller-manager so that the file is mounted inside the container at the same location. Also, update the same manifest to add the following arguments:

    • --enable-leader-migration to enable Leader Migration on the controller manager
    • --leader-migration-config=/etc/leadermigration.conf to set configuration file

    In version N + 1, the desired state of controller-to-manager assignment can be represented by a new configuration file, shown as follows. Please note component field of each controllerLeaders changing from kube-controller-manager to cloud-controller-manager.

    1. kind: LeaderMigrationConfiguration
    2. apiVersion: controllermanager.config.k8s.io/v1beta1
    3. leaderName: cloud-provider-extraction-migration
    4. resourceLock: leases
    5. - name: route
    6. component: cloud-controller-manager
    7. - name: service
    8. component: cloud-controller-manager
    9. - name: cloud-node-lifecycle
    10. component: cloud-controller-manager

    When creating control plane nodes of version N + 1, the content should be deploy to . The manifest of cloud-controller-manager should be updated to mount the configuration file in the same manner as kube-controller-manager of version N. Similarly, add --feature-gates=ControllerManagerLeaderMigration=true,--enable-leader-migration, and --leader-migration-config=/etc/leadermigration.conf to the arguments of cloud-controller-manager.

    Create a new control plane node of version N + 1 with the updated cloud-controller-manager manifest, and with the --cloud-provider flag unset for kube-controller-manager. kube-controller-manager of version N + 1 MUST NOT have Leader Migration enabled because, with an external cloud provider, it does not run the migrated controllers anymore and thus it is not involved in the migration.

    Please refer to Cloud Controller Manager Administration for more detail on how to deploy cloud-controller-manager.

    Upgrade Control Plane

    The control plane now contains nodes of both version N and N + 1. The nodes of version N run kube-controller-manager only, and these of version N + 1 run both kube-controller-manager and cloud-controller-manager. The migrated controllers, as specified in the configuration, are running under either kube-controller-manager of version N or cloud-controller-manager of version N + 1 depending on which controller manager holds the migration lease. No controller will ever be running under both controller managers at any time.

    In a rolling manner, create a new control plane node of version N + 1 and bring down one of version N + 1 until the control plane contains only nodes of version N + 1. If a rollback from version N + 1 to N is required, add nodes of version N with Leader Migration enabled for kube-controller-manager back to the control plane, replacing one of version N + 1 each time until there are only nodes of version N.

    Now that the control plane has been upgraded to run both kube-controller-manager and cloud-controller-manager of version N + 1, Leader Migration has finished its job and can be safely disabled to save one Lease resource. It is safe to re-enable Leader Migration for the rollback in the future.

    In a rolling manager, update manifest of cloud-controller-manager to unset both --enable-leader-migration and --leader-migration-config= flag, also remove the mount of /etc/leadermigration.conf, and finally remove /etc/leadermigration.conf. To re-enable Leader Migration, recreate the configuration file and add its mount and the flags that enable Leader Migration back to cloud-controller-manager.

    Default Configuration

    Starting Kubernetes 1.22, Leader Migration provides a default configuration suitable for the default controller-to-manager assignment. The default configuration can be enabled by setting --enable-leader-migration but without --leader-migration-config=.

    • Read the enhancement proposal