Advanced DaemonSet

    If you are not familiar with the Kubernetes DaemonSet, we strongly recommend reading its documentation before learning about Advanced DaemonSet.

    Note that Advanced DaemonSet extends the CRD schema of the default DaemonSet with newly added fields. The CRD kind name is still DaemonSet. This is done on purpose so that users can easily migrate workloads from the default DaemonSet to the Advanced DaemonSet. For example, after installing Kruise manager, one may simply change the value of apiVersion in the DaemonSet YAML file from apps/v1 to apps.kruise.io/v1alpha1.

    These new fields have been added into RollingUpdateDaemonSet:

    const (
    +   // StandardRollingUpdateType replaces the old daemons with new ones using a rolling update, i.e. replaces them on each node one after the other.
    +   // This is the default type for RollingUpdate.
    +   StandardRollingUpdateType RollingUpdateType = "Standard"

    +   // InplaceRollingUpdateType updates the container image without killing the pod if possible.
    +   InplaceRollingUpdateType RollingUpdateType = "InPlaceIfPossible"
    )

    // Spec to control the desired behavior of daemon set rolling update.
    type RollingUpdateDaemonSet struct {
    +   // Type is to specify which kind of rollingUpdate.
    +   Type RollingUpdateType `json:"rollingUpdateType,omitempty" protobuf:"bytes,1,opt,name=rollingUpdateType"`

        // ...
        MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty" protobuf:"bytes,2,opt,name=maxUnavailable"`

        // ...
        MaxSurge *intstr.IntOrString `json:"maxSurge,omitempty" protobuf:"bytes,7,opt,name=maxSurge"`

    +   // A label query over nodes that are managed by the daemon set RollingUpdate.
    +   // Must match in order to be controlled.
    +   // It must match the node's labels.
    +   Selector *metav1.LabelSelector `json:"selector,omitempty" protobuf:"bytes,3,opt,name=selector"`

    +   // The number of DaemonSet pods remained to be old version.
    +   // Default value is 0.
    +   // Maximum value is status.DesiredNumberScheduled, which means no pod will be updated.
    +   // +optional
    +   Partition *int32 `json:"partition,omitempty" protobuf:"varint,4,opt,name=partition"`

    +   // Indicates that the daemon set is paused and will not be processed by the
    +   // daemon set controller.
    +   // +optional
    +   Paused *bool `json:"paused,omitempty" protobuf:"varint,5,opt,name=paused"`
    }

    Advanced DaemonSet has a rollingUpdateType field in spec.updateStrategy.rollingUpdate which controls how the rolling update is performed.

    • Standard (default): the controller updates daemon Pods by recreating them. This is the same behavior as the upstream DaemonSet. You can use maxUnavailable or maxSurge to control the order in which old Pods are deleted and new Pods are created.
    • InPlaceIfPossible: the controller tries to update Pods in place instead of recreating them, if possible. See the in-place update documentation for more details. Note that with this type you can only use maxUnavailable, not maxSurge.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          rollingUpdateType: Standard

    Selector for rolling update

    The selector lets users update Pods only on the nodes whose labels match it.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          selector:
            matchLabels:
              nodeType: canary
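
    To make the matching rule concrete, here is a minimal sketch of how a matchLabels selector can be compared against a node's labels. The function name matchesLabels is hypothetical and this is not Kruise's controller code; it only covers matchLabels (not matchExpressions) and assumes an empty map matches every node.

```go
package main

import "fmt"

// matchesLabels reports whether every key/value pair in matchLabels is
// present in nodeLabels. An empty matchLabels map matches every node
// (an assumption of this sketch).
func matchesLabels(matchLabels, nodeLabels map[string]string) bool {
	for key, want := range matchLabels {
		if got, ok := nodeLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"nodeType": "canary"}
	canary := map[string]string{"nodeType": "canary", "zone": "a"}
	stable := map[string]string{"nodeType": "stable"}

	fmt.Println(matchesLabels(selector, canary)) // true
	fmt.Println(matchesLabels(selector, stable)) // false
}
```

    With the selector from the manifest above, only Pods on nodes labeled nodeType=canary would be candidates for the update.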

    This strategy defines rules for calculating the priority of updating Pods. partition is the number of DaemonSet Pods that should remain on the old version.
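
    The arithmetic behind partition can be sketched as follows. This is an illustration rather than Kruise's actual controller logic: the function name podsAllowedToUpdate is hypothetical, and treating a negative value like the default of 0 is an assumption.

```go
package main

import "fmt"

// podsAllowedToUpdate returns how many Pods may move to the new version,
// given the desired number of scheduled Pods and the partition value.
// A partition of 0 (the default) allows all Pods to update; a partition
// equal to or above the desired number allows none.
func podsAllowedToUpdate(desiredNumberScheduled, partition int32) int32 {
	if partition <= 0 { // assumption: negative values behave like 0
		return desiredNumberScheduled
	}
	if partition >= desiredNumberScheduled {
		return 0
	}
	return desiredNumberScheduled - partition
}

func main() {
	fmt.Println(podsAllowedToUpdate(10, 0))  // 10
	fmt.Println(podsAllowedToUpdate(10, 3))  // 7
	fmt.Println(podsAllowedToUpdate(10, 10)) // 0
}
```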

    Paused for rolling update

    paused indicates that Pod updating is paused: the controller will not update Pods, but will still maintain the number of replicas.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      # ...
      updateStrategy:
        rollingUpdate:
          paused: true

    FEATURE STATE: Kruise v1.3.0

    If you enabled the PreDownloadImageForDaemonSetUpdate feature-gate during Kruise installation or upgrade, the DaemonSet controller will automatically pre-download the image you want to update to onto the nodes of all old Pods. This is quite useful for speeding up application upgrades.

    By default, the DaemonSet pre-downloads each new image with a parallelism of 1, which means the image is downloaded onto nodes one at a time. You can change the parallelism with the apps.kruise.io/image-predownload-parallelism annotation on the DaemonSet, according to the capacity of your image registry. For registries with more bandwidth and P2P image downloading ability, a larger parallelism can speed up the pre-download process.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    metadata:
      annotations:
        apps.kruise.io/image-predownload-parallelism: "10"
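
    As an illustration of how such an annotation could be interpreted, the sketch below reads the value and falls back to the default of 1. The function name predownloadParallelism is hypothetical, and falling back to 1 for a missing, non-numeric, or non-positive value is an assumption of this sketch, not documented Kruise behavior.

```go
package main

import (
	"fmt"
	"strconv"
)

const parallelismAnnotation = "apps.kruise.io/image-predownload-parallelism"

// predownloadParallelism returns the parallelism requested through the
// annotation. It falls back to 1 when the annotation is missing or is
// not a positive integer (an assumption of this sketch).
func predownloadParallelism(annotations map[string]string) int {
	value, ok := annotations[parallelismAnnotation]
	if !ok {
		return 1
	}
	n, err := strconv.Atoi(value)
	if err != nil || n < 1 {
		return 1
	}
	return n
}

func main() {
	annotations := map[string]string{parallelismAnnotation: "10"}
	fmt.Println(predownloadParallelism(annotations)) // 10
	fmt.Println(predownloadParallelism(nil))         // 1
}
```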

    Lifecycle hook

    FEATURE STATE: Kruise v1.1.0

    This is similar to the lifecycle hook of CloneSet.

    type LifecycleStateType string

    // Lifecycle contains the hooks for Pod lifecycle.
    type Lifecycle struct {
        // PreDelete is the hook before Pod to be deleted.
        PreDelete *LifecycleHook `json:"preDelete,omitempty"`
    }

    type LifecycleHook struct {
        LabelsHandler     map[string]string `json:"labelsHandler,omitempty"`
        FinalizersHandler []string          `json:"finalizersHandler,omitempty"`

        /********************** FEATURE STATE: 1.2.0 ************************/
        // MarkPodNotReady = true means:
        // - Pod will be set to 'NotReady' at preparingDelete/preparingUpdate state.
        // - Pod will be restored to 'Ready' at Updated state if it was set to 'NotReady' at preparingUpdate state.
        // Default to false.
        MarkPodNotReady bool `json:"markPodNotReady,omitempty"`
        /*********************************************************************/
    }

    Examples:

    • When Advanced DaemonSet deletes a Pod (including scale-in and recreate update):
      • It deletes the Pod directly if there is no lifecycle hook definition or the Pod does not match the preDelete hook.
      • Otherwise, Advanced DaemonSet first updates the Pod to the PreparingDelete state and waits for the user controller to remove the label/finalizer, so that the Pod no longer matches the preDelete hook.

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        example.io/block-deleting: "true" # the pod is hooked by PreDelete hook label
        lifecycle.apps.kruise.io/state: PreparingDelete # so we update it to `PreparingDelete` state and wait for user controller to do something and remove the label

    MarkPodNotReady

    FEATURE STATE: Kruise v1.2.0

    lifecycle:
      preDelete:
        markPodNotReady: true
        finalizersHandler:
        - example.io/unready-blocker

    If you set markPodNotReady=true for preDelete, Kruise will try to set the KruisePodReady condition to False when Pods enter the PreparingDelete lifecycle state. The Pods then become NotReady while their containers keep running.

    One can use this markPodNotReady feature to drain service traffic before terminating containers.

    Note: this feature only works when the Pod has the KruisePodReady ReadinessGate.

    Example for user controller logic

    As in the YAML example above, we should first define the example.io/block-deleting label in both the template and the lifecycle of the Advanced DaemonSet.

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    spec:
      template:
        metadata:
          labels:
            example.io/block-deleting: "true"
      # ...
      lifecycle:
        preDelete:
          labelsHandler:
            example.io/block-deleting: "true"

    • For a Pod in the PreparingDelete state, check whether its Node exists, do something (for example, reserve resources), and then remove the label.
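
    The user controller step above can be sketched in plain Go as follows. This is an illustration under stated assumptions, not a real controller: the Pod struct is a hypothetical stand-in for the Kubernetes Pod object, releaseIfReady and its callbacks are hypothetical names, and skipping the cleanup when the Node no longer exists is an assumption about what the Node check is for. A real controller would read and patch Pods through the Kubernetes API.

```go
package main

import "fmt"

const (
	stateLabel      = "lifecycle.apps.kruise.io/state"
	preparingDelete = "PreparingDelete"
	hookLabel       = "example.io/block-deleting"
)

// Pod is a hypothetical, minimal stand-in for a Kubernetes Pod object.
type Pod struct {
	NodeName string
	Labels   map[string]string
}

// releaseIfReady handles one Pod: if it is in PreparingDelete and still
// carries the hook label, it runs the cleanup and then removes the label.
// It returns true when the label was removed.
func releaseIfReady(pod *Pod, nodeExists func(string) bool, cleanup func(*Pod) error) (bool, error) {
	if pod.Labels[stateLabel] != preparingDelete {
		return false, nil
	}
	if _, hooked := pod.Labels[hookLabel]; !hooked {
		return false, nil
	}
	// Assumption: cleanup only makes sense while the Node still exists.
	if nodeExists(pod.NodeName) {
		if err := cleanup(pod); err != nil {
			return false, err // keep the label so deletion stays blocked
		}
	}
	delete(pod.Labels, hookLabel)
	return true, nil
}

func main() {
	pod := &Pod{NodeName: "node-1", Labels: map[string]string{
		stateLabel: preparingDelete,
		hookLabel:  "true",
	}}
	removed, err := releaseIfReady(pod,
		func(name string) bool { return name == "node-1" },
		func(*Pod) error { return nil }, // e.g. reserve resources here
	)
	fmt.Println(removed, err)
}
```

    Once the label is gone, the Pod no longer matches the preDelete hook, so Advanced DaemonSet can proceed with the deletion.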