The Horizontal Pod Autoscaler - Testing HPAs with kubectl - 《Rancher 2.5.7-2.5.8 Documentation》

For HPA to work correctly, service deployments should have resources request definitions for containers. Follow this hello-world example to test if HPA is working correctly.

Copy the hello-world deployment manifest below.

Hello World Manifest

Deploy it to your cluster.

# kubectl create -f <HELLO_WORLD_MANIFEST>

Copy one of the HPAs below based on the metric type you’re using:

Hello World HPA: Resource Metrics

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 1000Mi

Hello World HPA: Custom Metrics

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 100Mi
  - type: Pods
    pods:
      metricName: cpu_system
      targetAverageValue: 20m

View the HPA info and description. Confirm that metric data is shown.

Resource Metrics

# kubectl get hpa
NAME          REFERENCE                TARGETS                     MINPODS   MAXPODS   REPLICAS   AGE
hello-world   Deployment/hello-world   1253376 / 100Mi, 0% / 50%   1         10        1          6m
# kubectl describe hpa
Name:                                                  hello-world
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 23 Jul 2018 20:21:16 +0200
Reference:                                             Deployment/hello-world
Metrics:                                               ( current / target )
  resource memory on pods:                             1253376 / 100Mi
  resource cpu on pods  (as a percentage of request):  0% (0) / 50%
Min replicas:                                          1
Max replicas:                                          10
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:           <none>

Custom Metrics

Enter the following command.
```
# kubectl describe hpa
```
You should receive the output that follows.

Generate a load for the service to test that your pods autoscale as intended. You can use any load-testing tool (Hey, Gatling, etc.), but we’re using .

Test that pod autoscaling works as intended.

To Test Autoscaling Using Resource Metrics:

Upscale to 2 Pods: CPU Usage Up to Target

Use your load testing tool to scale up to two pods based on CPU Usage.

View your HPA.

# kubectl describe hpa

You should receive output similar to what follows.

Name:                                                  hello-world
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 23 Jul 2018 22:22:04 +0200
Reference:                                             Deployment/hello-world
Metrics:                                               ( current / target )
  resource memory on pods:                             10928128 / 100Mi
  resource cpu on pods  (as a percentage of request):  56% (280m) / 50%
Min replicas:                                          1
Max replicas:                                          10
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  13s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target

  # kubectl get pods

You should receive output similar to what follows:

  NAME                                                     READY     STATUS    RESTARTS   AGE
  hello-world-54764dfbf8-k8ph2                             1/1       Running   0          1m
  hello-world-54764dfbf8-q6l4v                             1/1       Running   0          3h

Upscale to 3 pods: CPU Usage Up to Target

Use your load testing tool to upscale to 3 pods based on CPU usage with horizontal-pod-autoscaler-upscale-delay set to 3 minutes.

Enter the following command.
```
# kubectl describe hpa
```
You should receive output similar to what follows

Enter the following command to confirm three pods are running.

# kubectl get pods

You should receive output similar to what follows.

  NAME                                                     READY     STATUS    RESTARTS   AGE
  hello-world-54764dfbf8-f46kh                             0/1       Running   0          1m
  hello-world-54764dfbf8-k8ph2                             1/1       Running   0          5m
  hello-world-54764dfbf8-q6l4v                             1/1       Running   0          3h

Downscale to 1 Pod: All Metrics Below Target

Use your load testing to scale down to 1 pod when all metrics are below target for horizontal-pod-autoscaler-downscale-delay (5 minutes by default).

Enter the following command.

# kubectl describe hpa

  Name:                                                  hello-world
  Labels:                                                <none>
  Annotations:                                           <none>
  CreationTimestamp:                                     Mon, 23 Jul 2018 22:22:04 +0200
  Reference:                                             Deployment/hello-world
  Metrics:                                               ( current / target )
    resource cpu on pods  (as a percentage of request):  0% (0) / 50%
  Min replicas:                                          1
  Max replicas:                                          10
  Conditions:
    Type            Status  Reason              Message
    ----            ------  ------              -------
    AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
    ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
    ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
  Events:
    Type    Reason             Age   From                       Message
    ----    ------             ----  ----                       -------
    Normal  SuccessfulRescale  10m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
    Normal  SuccessfulRescale  6m    horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
    Normal  SuccessfulRescale  1s    horizontal-pod-autoscaler  New size: 1; reason: All metrics below target

**To Test Autoscaling Using Custom Metrics:**
 Upscale to 2 Pods: CPU Usage Up to Target
Use your load testing tool to upscale two pods based on CPU usage.
1.  Enter the following command.
    ```
    # kubectl describe hpa
    ```
    You should receive output similar to what follows.
    ```
    Name:                                                  hello-world
    Namespace:                                             default
    Labels:                                                <none>
    Annotations:                                           <none>
    CreationTimestamp:                                     Tue, 24 Jul 2018 18:01:11 +0200
    Reference:                                             Deployment/hello-world
    Metrics:                                               ( current / target )
      resource memory on pods:                             8159232 / 100Mi
      "cpu_system" on pods:                                7m / 20m
      resource cpu on pods  (as a percentage of request):  64% (321m) / 50%
    Min replicas:                                          1
    Max replicas:                                          10
    Conditions:
      Type            Status  Reason              Message
      ----            ------  ------              -------
      AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
      ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
      ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
    Events:
      Type    Reason             Age   From                       Message
      ----    ------             ----  ----                       -------
      Normal  SuccessfulRescale  16s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
    ```
2.  Enter the following command to confirm two pods are running.
    ```
    # kubectl get pods
    ```
    You should receive output similar to what follows.
    ```
        NAME                           READY     STATUS    RESTARTS   AGE
        hello-world-54764dfbf8-5pfdr   1/1       Running   0          3s
        hello-world-54764dfbf8-q6l82   1/1       Running   0          6h
    ```
 Upscale to 3 Pods: CPU Usage Up to Target
Use your load testing tool to scale up to three pods when the cpu\_system usage limit is up to target.
1.  Enter the following command.
    ```
    # kubectl describe hpa
    ```
    You should receive output similar to what follows:
    ```
      Name:                                                  hello-world
      Namespace:                                             default
      Labels:                                                <none>
      Annotations:                                           <none>
      CreationTimestamp:                                     Tue, 24 Jul 2018 18:01:11 +0200
      Reference:                                             Deployment/hello-world
      Metrics:                                               ( current / target )
        resource memory on pods:                             8374272 / 100Mi
        "cpu_system" on pods:                                27m / 20m
        resource cpu on pods  (as a percentage of request):  71% (357m) / 50%
      Min replicas:                                          1
      Max replicas:                                          10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  3s    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
    ```
2.  Enter the following command to confirm three pods are running.
    ```
    # kubectl get pods
    ```
    You should receive output similar to what follows:
    ```
      # kubectl get pods
      NAME                           READY     STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-5pfdr   1/1       Running   0          3m
      hello-world-54764dfbf8-m2hrl   1/1       Running   0          1s
      hello-world-54764dfbf8-q6l82   1/1       Running   0          6h
    ```
 Upscale to 4 Pods: CPU Usage Up to Target
Use your load testing tool to upscale to four pods based on CPU usage. `horizontal-pod-autoscaler-upscale-delay` is set to three minutes by default.
1.  Enter the following command.
    ```
    # kubectl describe hpa
    ```
    You should receive output similar to what follows.
    ```
      Name:                                                  hello-world
      Namespace:                                             default
      Labels:                                                <none>
      Annotations:                                           <none>
      CreationTimestamp:                                     Tue, 24 Jul 2018 18:01:11 +0200
      Reference:                                             Deployment/hello-world
      Metrics:                                               ( current / target )
        resource memory on pods:                             8374272 / 100Mi
        "cpu_system" on pods:                                27m / 20m
        resource cpu on pods  (as a percentage of request):  71% (357m) / 50%
      Min replicas:                                          1
      Max replicas:                                          10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
        Normal  SuccessfulRescale  4s    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
    ```
2.  Enter the following command to confirm four pods are running.
    ```
    # kubectl get pods
    ```
    You should receive output similar to what follows.
    ```
      NAME                           READY     STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-2p9xb   1/1       Running   0          5m
      hello-world-54764dfbf8-5pfdr   1/1       Running   0          2m
      hello-world-54764dfbf8-m2hrl   1/1       Running   0          1s
      hello-world-54764dfbf8-q6l82   1/1       Running   0          6h
    ```
 Downscale to 1 Pod: All Metrics Below Target
Use your load testing tool to scale down to one pod when all metrics below target for `horizontal-pod-autoscaler-downscale-delay`.
1.  Enter the following command.
    ```
    # kubectl describe hpa
    ```
    You should receive similar output to what follows.
    ```
        Name:                                                  hello-world
        Namespace:                                             default
        Labels:                                                <none>
        Annotations:                                           <none>
        CreationTimestamp:                                     Tue, 24 Jul 2018 18:01:11 +0200
        Reference:                                             Deployment/hello-world
        Metrics:                                               ( current / target )
          resource memory on pods:                             8101888 / 100Mi
          "cpu_system" on pods:                                8m / 20m
          resource cpu on pods  (as a percentage of request):  0% (0) / 50%
        Min replicas:                                          1
        Max replicas:                                          10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age   From                       Message
          ----    ------             ----  ----                       -------
          Normal  SuccessfulRescale  10m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  8m    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
          Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
          Normal   SuccessfulRescale             13s               horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
    ```
2.  Enter the following command to confirm a single pods is running.
    ```
        # kubectl get pods
    ```
    You should receive output similar to what follows.
    ```
        NAME                           READY     STATUS    RESTARTS   AGE
    ```