Configuring concurrency

    For per-revision concurrency, configure the autoscaling.knative.dev/target annotation for a soft limit, or the containerConcurrency field for a hard limit.

    For global concurrency, you can set the container-concurrency-target-default value.

    It is possible to set either a soft or hard concurrency limit.

    Note

    If both a soft and a hard limit are specified, the smaller of the two values will be used. This prevents the Autoscaler from having a target value that is not permitted by the hard limit value.

    The soft limit is a targeted limit rather than a strictly enforced bound. In some situations, particularly if there is a sudden burst of requests, this value can be exceeded.

    Warning

    Using a hard limit configuration is only recommended if there is a clear use case for it with your application. Having a low hard limit specified may have a negative impact on the throughput and latency of an application, and may cause additional cold starts.

    • Global key: container-concurrency-target-default
    • Per-revision annotation key: autoscaling.knative.dev/target
    • Possible values: An integer.
    • Default: "100"

    Example:

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          container-concurrency-target-default: "200"

    Global (Operator):

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            autoscaler:
              container-concurrency-target-default: "200"
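
    The global examples above have a per-revision counterpart: the soft limit is set as an annotation on the revision template. The following is a minimal sketch, assuming a hypothetical Service named example-service and a placeholder container image (neither comes from this page):

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: example-service          # hypothetical name, for illustration only
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            # Soft limit: the Autoscaler targets 200 concurrent requests per replica.
            autoscaling.knative.dev/target: "200"
        spec:
          containers:
            - image: registry.example.com/example-app:latest   # placeholder image
    ```

    Annotation values are strings, so the number is quoted.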

    Hard limit

    The hard limit is specified using the containerConcurrency field on the Revision spec. This setting is not an annotation.

    There is no global setting for the hard limit in the autoscaling ConfigMap, because containerConcurrency has implications outside of autoscaling, such as on buffering and queuing of requests. However, a default value can be set for the Revision’s containerConcurrency field in config-defaults.yaml.

    • Global key: container-concurrency (in config-defaults.yaml)
    • Per-revision spec key: containerConcurrency
    • Possible values: integer
    • Default: 0, meaning no limit

    Example:

    Global (Defaults ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-defaults
          namespace: knative-serving
        data:
          container-concurrency: "50"

    Global (Operator):

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config:
            defaults:
              container-concurrency: "50"
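
    For a single revision, the hard limit is a field on the revision template spec rather than an annotation. The following is a minimal sketch, assuming a hypothetical Service named example-service and a placeholder container image (neither comes from this page):

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: example-service          # hypothetical name, for illustration only
      namespace: default
    spec:
      template:
        spec:
          # Hard limit: at most 50 concurrent requests reach each replica.
          # A value of 0 (the default) means no limit.
          containerConcurrency: 50
          containers:
            - image: registry.example.com/example-app:latest   # placeholder image
    ```

    Unlike the soft-limit annotation, containerConcurrency is an integer field, so the value is not quoted.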

    Target utilization

    In addition to the literal settings explained previously, concurrency values can be further adjusted by using a target utilization value.

    This value specifies what percentage of the previously specified target should actually be targeted by the Autoscaler. This is also known as specifying the hotness at which a replica runs, which causes the Autoscaler to scale up before the defined hard limit is reached.

    For example, if containerConcurrency is set to 10, and the target utilization value is set to 70 (percent), the Autoscaler will create a new replica when the average number of concurrent requests across all existing replicas reaches 7. Requests numbered 7 to 10 will still be sent to the existing replicas, but this allows for additional replicas to be started in anticipation of being needed when the containerConcurrency limit is reached.

    • Global key: container-concurrency-target-percentage
    • Per-revision annotation key: autoscaling.knative.dev/target-utilization-percentage
    • Possible values: float
    • Default: 70
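
    The arithmetic in the example above can be expressed per revision by combining the hard limit with the utilization annotation. The following is a minimal sketch, assuming a hypothetical Service named example-service and a placeholder container image (neither comes from this page):

    ```yaml
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: example-service          # hypothetical name, for illustration only
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            # Target 70% of the configured concurrency value.
            autoscaling.knative.dev/target-utilization-percentage: "70"
        spec:
          # Hard limit of 10, so the Autoscaler aims for
          # 10 * 0.70 = 7 concurrent requests per replica.
          containerConcurrency: 10
          containers:
            - image: registry.example.com/example-app:latest   # placeholder image
    ```

    With these values, each replica still accepts up to 10 concurrent requests, but scale-up starts once the average reaches 7.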

    Example:

    Global (ConfigMap):

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: config-autoscaler
          namespace: knative-serving
        data:
          container-concurrency-target-percentage: "80"

    Global (Operator):

        apiVersion: operator.knative.dev/v1alpha1
        kind: KnativeServing
        metadata:
          name: knative-serving
        spec:
          config: