Configuring scale to zero

    The scale to zero value controls whether Knative allows replicas to scale down to zero (if set to true), or stops at 1 replica (if set to false).

    NOTE: For more information about scale bounds configuration per revision, see the documentation on Configuring scale bounds.

    • Global key: enable-scale-to-zero
    • Per-revision annotation key: No per-revision setting.
    • Possible values: boolean
    • Default: true

    Example:

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            autoscaler:
              # Illustrative value; the default is "true".
              enable-scale-to-zero: "false"
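
    When Knative Serving is not managed through the Operator resource, the same global key can be set directly in the autoscaler ConfigMap. The following is a minimal sketch, assuming the ConfigMap is named config-autoscaler and lives in the knative-serving namespace; the value shown is only illustrative.

    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: config-autoscaler
      namespace: knative-serving
    data:
      # Illustrative value; "false" keeps at least 1 replica running.
      enable-scale-to-zero: "false"
    ```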

    The scale-to-zero grace period sets an upper bound on how long the system waits internally for the scale-from-zero machinery to be in place before the last replica is removed.

    IMPORTANT: This value controls how long internal network programming is allowed to take. Adjust it only if you have experienced dropped requests while a revision was scaling to zero replicas.

    • Global key: scale-to-zero-grace-period
    • Per-revision annotation key: n/a
    • Possible values: Duration
    • Default: 30s

    Example:

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            autoscaler:
              scale-to-zero-grace-period: "40s"
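
    For installations configured through ConfigMaps rather than the Operator resource, the grace period would be set under the same global key. This is a sketch under the assumption that the autoscaler ConfigMap is named config-autoscaler in the knative-serving namespace.

    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: config-autoscaler
      namespace: knative-serving
    data:
      # Raises the grace period from the 30s default to 40s.
      scale-to-zero-grace-period: "40s"
    ```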

    The scale-to-zero pod retention period determines the minimum amount of time that the last pod remains active after the Autoscaler decides to scale pods to zero.

    • Global key: scale-to-zero-pod-retention-period
    • Per-revision annotation key: autoscaling.knative.dev/scaleToZeroPodRetentionPeriod
    • Possible values: Non-negative duration string
    • Default: 0s

    Example:

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          # Illustrative value; the default is "0s".
          scale-to-zero-pod-retention-period: "42s"
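
    To override the retention period for a single revision, the per-revision annotation key listed above can be placed on the revision template. The following is a minimal sketch: the Service name, namespace, container image, and the "1m5s" value are assumptions chosen for illustration and are not taken from this page.

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go        # hypothetical Service name
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            # Keep the last pod for at least 1m5s after the scale-to-zero decision.
            autoscaling.knative.dev/scaleToZeroPodRetentionPeriod: "1m5s"
        spec:
          containers:
            - image: ghcr.io/knative/helloworld-go:latest   # assumed sample image; substitute your own
    ```

    Because the annotation sits on the revision template, it affects only revisions created from this Service and leaves the global default untouched.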