Configuring scale bounds

    You can configure upper and lower bounds to control autoscaling behavior. You can also specify the initial scale that a Revision is scaled to immediately after creation. Each of these can be set as a default for all Revisions, or for a specific Revision using an annotation.

    Lower bound

    This value controls the minimum number of replicas that each Revision should have. Knative will attempt to never have fewer than this number of replicas at any one point in time.

    • Global key: min-scale
    • Per-revision annotation key: autoscaling.knative.dev/min-scale
    • Possible values: integer
    • Default: 0 if scale-to-zero is enabled and class KPA is used, 1 otherwise

    Note

    For more information, see the documentation on scale-to-zero configuration.

    Example:

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          min-scale: "3"

    Global (Operator):

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            autoscaler:
              min-scale: "3"
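
    The lower bound can also be set on a single Revision with the autoscaling.knative.dev/min-scale annotation listed above. A minimal per-revision sketch, assuming the same helloworld-go sample Service used elsewhere on this page:

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            # Keep at least three replicas of this Revision running.
            autoscaling.knative.dev/min-scale: "3"
        spec:
          containers:
            - image: gcr.io/knative-samples/helloworld-go
    ```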

    Upper bound

    This value controls the maximum number of replicas that each revision should have. Knative will attempt to never have more than this number of replicas running, or in the process of being created, at any one point in time.

    If the max-scale-limit global key is set, Knative ensures that neither the global max scale nor the per-revision max scale for new revisions exceeds this value. When max-scale-limit is set to a positive value, a revision with a max scale above that value (or set to 0, which means unlimited) is disallowed.

    • Global key: max-scale
    • Per-revision annotation key: autoscaling.knative.dev/max-scale
    • Possible values: integer
    • Default: 0 which means unlimited

    Per Revision:

        apiVersion: serving.knative.dev/v1
        kind: Service
        metadata:
          name: helloworld-go
          namespace: default
        spec:
          template:
            metadata:
              annotations:
                autoscaling.knative.dev/max-scale: "3"
            spec:
              containers:
                - image: gcr.io/knative-samples/helloworld-go

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          max-scale: "3"
          max-scale-limit: "100"
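
    The same global upper bound can presumably be set through the Operator as well. The following is a sketch, not a confirmed configuration: it assumes the KnativeServing resource accepts max-scale and max-scale-limit under spec.config.autoscaler in the same way as the other autoscaler keys shown on this page.

    ```yaml
    apiVersion: operator.knative.dev/v1alpha1
    kind: KnativeServing
    metadata:
      name: knative-serving
    spec:
      config:
        autoscaler:
          # Default upper bound applied to Revisions without their own annotation.
          max-scale: "3"
          # With this limit set, new revisions that request a max scale above 100,
          # or 0 (unlimited), are disallowed.
          max-scale-limit: "100"
    ```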

    Initial scale

    This value controls the initial target scale that a Revision must reach immediately after it is created, before it is marked as Ready. After the Revision has reached this scale one time, this value is ignored. This means that the Revision scales down after reaching the initial target scale if the traffic it actually receives needs fewer replicas.

    When the Revision is created, the larger of initial scale and lower bound is automatically chosen as the initial target scale.

    • Global key: initial-scale in combination with allow-zero-initial-scale
    • Per-revision annotation key: autoscaling.knative.dev/initial-scale
    • Possible values: integer
    • Default: 1

    Example:

    Per Revision:

        apiVersion: serving.knative.dev/v1
        kind: Service
        metadata:
          name: helloworld-go
          namespace: default
        spec:
          template:
            metadata:
              annotations:
                autoscaling.knative.dev/initial-scale: "0"
            spec:
              containers:
                - image: gcr.io/knative-samples/helloworld-go

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          initial-scale: "0"
          allow-zero-initial-scale: "true"

    Global (Operator):

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            autoscaler:
              initial-scale: "0"
              allow-zero-initial-scale: "true"
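
    To illustrate how the initial target scale is chosen, the sketch below sets both the initial scale and the lower bound on one Revision. The values 5 and 2 are illustrative assumptions. Because the larger of the two is used as the initial target, this Revision would be expected to start at 5 replicas and could later scale down toward 2 if traffic allows.

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            # Lower bound: never drop below two replicas.
            autoscaling.knative.dev/min-scale: "2"
            # Initial scale: larger than the lower bound, so it sets the initial target.
            autoscaling.knative.dev/initial-scale: "5"
        spec:
          containers:
            - image: gcr.io/knative-samples/helloworld-go
    ```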

    Scale Up Minimum

    This value controls the minimum number of replicas that will be created when the Revision scales up from zero.

    • Global key: n/a
    • Per-revision annotation key: autoscaling.knative.dev/activation-scale
    • Possible values: integer
    • Default: 1

    Example:

    Per Revision:

        apiVersion: serving.knative.dev/v1
        kind: Service
        metadata:
          name: helloworld-go
          namespace: default
        spec:
          template:
            metadata:
              annotations:
                autoscaling.knative.dev/activation-scale: "5"
            spec:
              containers:
                - image: gcr.io/knative-samples/helloworld-go

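    Scale Down Delay

    Scale down delay specifies a time window that must pass at reduced concurrency before a scale-down decision is applied.

    Note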

    Only supported for the default KPA autoscaler class.

    • Global key: scale-down-delay
    • Per-revision annotation key: autoscaling.knative.dev/scale-down-delay
    • Possible values: Duration, 0s <= value <= 1h
    • Default: 0s (no delay)

    Example:

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          scale-down-delay: "15m"

    Global (Operator):

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            autoscaler:
              scale-down-delay: "15m"
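
    The delay can also be set on a single Revision with the autoscaling.knative.dev/scale-down-delay annotation listed above. A minimal per-revision sketch, assuming the same helloworld-go sample Service and the same 15m value as the global examples:

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            # Duration between 0s and 1h; the default of 0s means no delay.
            autoscaling.knative.dev/scale-down-delay: "15m"
        spec:
          containers:
            - image: gcr.io/knative-samples/helloworld-go
    ```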

    Stable window

    The stable window defines the sliding time window over which metrics are averaged to provide the input for scaling decisions when the autoscaler is not in panic mode.

    • Global key: stable-window
    • Per-revision annotation key: autoscaling.knative.dev/window
    • Possible values: Duration, 6s <= value <= 1h
    • Default: 60s

    Note

    During scale down, in most cases the last Replica is removed after there has been no traffic to the Revision for the entire duration of the stable window.

    Example:

    Per Revision:

        apiVersion: serving.knative.dev/v1
        kind: Service
        metadata:
          name: helloworld-go
          namespace: default
        spec:
          template:
            metadata:
              annotations:
                autoscaling.knative.dev/window: "40s"
            spec:
              containers:
                - image: gcr.io/knative-samples/helloworld-go

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving