Configuring scale to zero

The enable-scale-to-zero value controls whether Knative allows replicas to scale down to zero (if set to true) or stops at 1 replica (if set to false).

NOTE: For more information about scale bounds configuration per revision, see the documentation on Configuring scale bounds.

Global key: enable-scale-to-zero
Per-revision annotation key: No per-revision setting.
Possible values: boolean
Default: true

Global (Operator)

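apiVersion: operator.knative.dev/v1alpha1
kind: KnativeServing
metadata:
  name: knative-serving
spec:
  config:
    autoscaler:
      # Example value: "false" stops scale-down at 1 replica (the default is "true").
      enable-scale-to-zero: "false"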
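
The same global key can also be set directly in the autoscaler ConfigMap. The following is a minimal sketch, assuming the config-autoscaler ConfigMap in the knative-serving namespace that is used elsewhere on this page. The value "false" is only an illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Illustrative value: disables scale to zero cluster-wide (the default is "true").
  enable-scale-to-zero: "false"
```

ConfigMap values are strings, so the boolean is quoted.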

The scale-to-zero-grace-period setting specifies an upper bound on the time that the system waits internally for the scale-from-zero machinery to be in place before the last replica is removed.

IMPORTANT: This value controls how long internal network programming is allowed to take. Adjust it only if you have experienced issues with requests being dropped while a revision was scaling to zero replicas.

Global key: scale-to-zero-grace-period
Per-revision annotation key: n/a
Possible values: Duration
Default: 30s

Example:

Global (Operator)

apiVersion: operator.knative.dev/v1alpha1
kind: KnativeServing
metadata:
  name: knative-serving
spec:
  config:
    autoscaler:
      scale-to-zero-grace-period: "40s"
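
For clusters that are not managed through the Operator, the equivalent ConfigMap form might look like the following sketch. It assumes the same config-autoscaler ConfigMap in the knative-serving namespace and reuses the "40s" value from the Operator example above.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Same value as the Operator example; the default is "30s".
  scale-to-zero-grace-period: "40s"
```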

The scale-to-zero-pod-retention-period flag determines the minimum amount of time that the last pod remains active after the Autoscaler decides to scale pods to zero.

Global key: scale-to-zero-pod-retention-period
Per-revision annotation key: autoscaling.knative.dev/scaleToZeroPodRetentionPeriod
Possible values: Non-negative duration string
Default: 0s

Example:

Global (ConfigMap)

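apiVersion: v1
kind: ConfigMap
metadata:
 name: config-autoscaler
 namespace: knative-serving
data:
 # Example value; any non-negative duration string is valid (the default is "0s").
 scale-to-zero-pod-retention-period: "42s"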
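
Because this setting also has a per-revision annotation key, it can be set on an individual Service. The following is a minimal sketch: the Service name, namespace, container image, and the "1m5s" value are placeholders chosen for illustration, and the annotation key is the one listed above.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go        # hypothetical Service name
  namespace: default
spec:
  template:
    metadata:
      annotations:
        # Keep the last pod for at least 1m5s after the scale-to-zero decision (illustrative value).
        autoscaling.knative.dev/scaleToZeroPodRetentionPeriod: "1m5s"
    spec:
      containers:
        - image: ghcr.io/knative/helloworld-go:latest   # placeholder image
```

The annotation goes under spec.template.metadata so that it applies to each revision created from the template, not to the Service object itself.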