For HPA to work correctly, the containers in your service deployments should define resource requests. Follow this hello-world example to test whether HPA is working correctly.
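As background for why the requests matter: a CPU target such as `targetAverageUtilization: 50` is evaluated as usage divided by the container's CPU request, so without a request there is nothing to divide by. The Python sketch below illustrates only that arithmetic. The function name `cpu_utilization_percent` is hypothetical, and the 500m request is an assumption that happens to be consistent with figures such as `56% (280m)` in the sample output later in this example; it is not taken from the hello-world manifest.

```python
from typing import Optional


def cpu_utilization_percent(usage_millicores: int,
                            request_millicores: Optional[int]) -> Optional[int]:
    """Return CPU usage as a whole-number percentage of the CPU request.

    Returns None when no request is defined, because a percentage of a
    missing request cannot be computed.
    """
    if not request_millicores:
        return None
    # Integer division: this sketch truncates the percentage.
    return usage_millicores * 100 // request_millicores


# Assumed request of 500m (hypothetical; not taken from the manifest).
print(cpu_utilization_percent(280, 500))   # 56
print(cpu_utilization_percent(280, None))  # None
```

With the assumed 500m request, 280m of usage works out to 56%, which is above a 50% target.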

    1. Copy the hello-world deployment manifest below.

      Hello World Manifest

    2. Deploy it to your cluster.

          # kubectl create -f <HELLO_WORLD_MANIFEST>

    3. Copy one of the HPAs below based on the metric type you’re using:

      Hello World HPA: Resource Metrics

          apiVersion: autoscaling/v2beta1
          kind: HorizontalPodAutoscaler
          metadata:
            name: hello-world
            namespace: default
          spec:
            scaleTargetRef:
              apiVersion: extensions/v1beta1
              kind: Deployment
              name: hello-world
            minReplicas: 1
            maxReplicas: 10
            metrics:
            - type: Resource
              resource:
                name: cpu
                targetAverageUtilization: 50
            - type: Resource
              resource:
                name: memory
                targetAverageValue: 100Mi

      Hello World HPA: Custom Metrics

          apiVersion: autoscaling/v2beta1
          kind: HorizontalPodAutoscaler
          metadata:
            name: hello-world
            namespace: default
          spec:
            scaleTargetRef:
              apiVersion: extensions/v1beta1
              kind: Deployment
              name: hello-world
            minReplicas: 1
            maxReplicas: 10
            metrics:
            - type: Resource
              resource:
                name: cpu
                targetAverageUtilization: 50
            - type: Resource
              resource:
                name: memory
                targetAverageValue: 100Mi
            - type: Pods
              pods:
                metricName: cpu_system
                targetAverageValue: 20m

    4. View the HPA info and description. Confirm that metric data is shown.

      Resource Metrics

          # kubectl get hpa
          NAME          REFERENCE                TARGETS                     MINPODS   MAXPODS   REPLICAS   AGE
          hello-world   Deployment/hello-world   1253376 / 100Mi, 0% / 50%   1         10        1          6m
          # kubectl describe hpa
          Name: hello-world
          Namespace: default
          Labels: <none>
          Annotations: <none>
          CreationTimestamp: Mon, 23 Jul 2018 20:21:16 +0200
          Reference: Deployment/hello-world
          Metrics: ( current / target )
            resource memory on pods: 1253376 / 100Mi
            resource cpu on pods (as a percentage of request): 0% (0) / 50%
          Min replicas: 1
          Max replicas: 10
          Conditions:
            Type            Status  Reason              Message
            ----            ------  ------              -------
            AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
            ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
            ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
          Events: <none>

      Custom Metrics

      1. Enter the following command.

            # kubectl describe hpa

        The output should include the custom metric in addition to the resource metrics.

    5. Generate a load for the service to test that your pods autoscale as intended. You can use any load-testing tool, such as Hey or Gatling.

    6. Test that pod autoscaling works as intended.

      To Test Autoscaling Using Resource Metrics:

      Upscale to 2 Pods: CPU Usage Up to Target

      Use your load testing tool to scale up to two pods based on CPU Usage.

      1. View your HPA.

            # kubectl describe hpa

        You should receive output similar to what follows.

            Name: hello-world
            Namespace: default
            Labels: <none>
            Annotations: <none>
            CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
            Reference: Deployment/hello-world
            Metrics: ( current / target )
              resource memory on pods: 10928128 / 100Mi
              resource cpu on pods (as a percentage of request): 56% (280m) / 50%
            Min replicas: 1
            Max replicas: 10
            Conditions:
              Type            Status  Reason              Message
              ----            ------  ------              -------
              AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
              ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
              ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
            Events:
              Type    Reason             Age   From                       Message
              ----    ------             ----  ----                       -------
              Normal  SuccessfulRescale  13s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target

      2. Enter the following command to confirm two pods are running.

            # kubectl get pods

        You should receive output similar to what follows:

            NAME                           READY   STATUS    RESTARTS   AGE
            hello-world-54764dfbf8-k8ph2   1/1     Running   0          1m
            hello-world-54764dfbf8-q6l4v   1/1     Running   0          3h

      Upscale to 3 Pods: CPU Usage Up to Target

      Use your load testing tool to upscale to 3 pods based on CPU usage with `horizontal-pod-autoscaler-upscale-delay` set to 3 minutes.

      1. Enter the following command.

            # kubectl describe hpa

        The events should report a rescale to three pods (`New size: 3`).

      2. Enter the following command to confirm three pods are running.

            # kubectl get pods

        You should receive output similar to what follows.

            NAME                           READY   STATUS    RESTARTS   AGE
            hello-world-54764dfbf8-f46kh   0/1     Running   0          1m
            hello-world-54764dfbf8-k8ph2   1/1     Running   0          5m
            hello-world-54764dfbf8-q6l4v   1/1     Running   0          3h

      Downscale to 1 Pod: All Metrics Below Target

      Reduce or stop the load from your load testing tool so that the HPA scales down to 1 pod once all metrics have been below target for `horizontal-pod-autoscaler-downscale-delay` (5 minutes by default).

      1. Enter the following command.

            # kubectl describe hpa

        You should receive output similar to what follows.

            Name: hello-world
            Labels: <none>
            Annotations: <none>
            CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
            Reference: Deployment/hello-world
            Metrics: ( current / target )
              resource cpu on pods (as a percentage of request): 0% (0) / 50%
            Min replicas: 1
            Max replicas: 10
            Conditions:
              Type            Status  Reason              Message
              ----            ------  ------              -------
              AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
              ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
              ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
            Events:
              Type    Reason             Age   From                       Message
              ----    ------             ----  ----                       -------
              Normal  SuccessfulRescale  10m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
              Normal  SuccessfulRescale  6m    horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
              Normal  SuccessfulRescale  1s    horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
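
The rescale events in the resource-metrics test follow from the replica calculation described in the Kubernetes documentation: the desired count is the current count multiplied by the ratio of the current metric value to its target, rounded up, with no change when the ratio is within a tolerance of 1.0. Below is a minimal Python sketch of that rule. The function name `desired_replicas` is hypothetical, and the 0.1 tolerance is assumed to be the controller default, which your cluster may override.

```python
import math


def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Sketch of the HPA replica rule for a single metric."""
    ratio = current_value / target_value
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas / maxReplicas bounds of the HPA.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(1, 56, 50))  # 2
print(desired_replicas(1, 52, 50))  # 1
print(desired_replicas(2, 0, 50))   # 1
```

With one pod at 56% against a 50% target, the ratio is 1.12, which rounds up to two pods and matches the `New size: 2` event. The sketch ignores the upscale and downscale delays, which affect only when a change is applied.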

      **To Test Autoscaling Using Custom Metrics:**

      Upscale to 2 Pods: CPU Usage Up to Target

      Use your load testing tool to upscale to two pods based on CPU usage.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows.

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8159232 / 100Mi
          "cpu_system" on pods: 7m / 20m
          resource cpu on pods (as a percentage of request): 64% (321m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age   From                       Message
          ----    ------             ----  ----                       -------
          Normal  SuccessfulRescale  16s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        ```

      2. Enter the following command to confirm two pods are running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows.

        ```
        NAME                           READY   STATUS    RESTARTS   AGE
        hello-world-54764dfbf8-5pfdr   1/1     Running   0          3s
        hello-world-54764dfbf8-q6l82   1/1     Running   0          6h
        ```

      Upscale to 3 Pods: cpu_system Usage Up to Target

      Use your load testing tool to scale up to three pods when `cpu_system` usage is up to target.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows:

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8374272 / 100Mi
          "cpu_system" on pods: 27m / 20m
          resource cpu on pods (as a percentage of request): 71% (357m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age   From                       Message
          ----    ------             ----  ----                       -------
          Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  3s    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
        ```

      2. Enter the following command to confirm three pods are running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows:

        ```
        NAME                           READY   STATUS    RESTARTS   AGE
        hello-world-54764dfbf8-5pfdr   1/1     Running   0          3m
        hello-world-54764dfbf8-m2hrl   1/1     Running   0          1s
        hello-world-54764dfbf8-q6l82   1/1     Running   0          6h
        ```

      Upscale to 4 Pods: CPU Usage Up to Target

      Use your load testing tool to upscale to four pods based on CPU usage. `horizontal-pod-autoscaler-upscale-delay` is set to three minutes by default.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows.

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8374272 / 100Mi
          "cpu_system" on pods: 27m / 20m
          resource cpu on pods (as a percentage of request): 71% (357m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age   From                       Message
          ----    ------             ----  ----                       -------
          Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
          Normal  SuccessfulRescale  4s    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
        ```

      2. Enter the following command to confirm four pods are running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows.

        ```
        NAME                           READY   STATUS    RESTARTS   AGE
        hello-world-54764dfbf8-2p9xb   1/1     Running   0          5m
        hello-world-54764dfbf8-5pfdr   1/1     Running   0          2m
        hello-world-54764dfbf8-m2hrl   1/1     Running   0          1s
        hello-world-54764dfbf8-q6l82   1/1     Running   0          6h
        ```

      Downscale to 1 Pod: All Metrics Below Target

      Reduce or stop the load from your load testing tool so that the HPA scales down to one pod once all metrics have been below target for `horizontal-pod-autoscaler-downscale-delay`.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows.

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8101888 / 100Mi
          "cpu_system" on pods: 8m / 20m
          resource cpu on pods (as a percentage of request): 0% (0) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age   From                       Message
          ----    ------             ----  ----                       -------
          Normal  SuccessfulRescale  10m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  8m    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
          Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  13s   horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
        ```

      2. Enter the following command to confirm a single pod is running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows.

        ```
        NAME                           READY   STATUS    RESTARTS   AGE
        ```
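
When an HPA lists several metrics, as the custom-metrics manifest does, Kubernetes evaluates each metric separately and uses the largest resulting replica count. The Python sketch below illustrates this with the metric values from the three-pod `describe` output in the custom-metrics test, assuming two pods were running when those values were sampled. The helper names are hypothetical, and the rule is simplified: it omits the tolerance, the scaling delays, and the `minReplicas`/`maxReplicas` bounds.

```python
import math
from typing import Dict, Tuple

MIB = 1024 * 1024  # bytes in one mebibyte ("Mi")


def replicas_for_metric(current_replicas: int,
                        current: float,
                        target: float) -> int:
    """Replica count that one metric asks for, rounded up."""
    return math.ceil(current_replicas * current / target)


def desired_replicas_multi(
        current_replicas: int,
        metrics: Dict[str, Tuple[float, float]]) -> Tuple[int, Dict[str, int]]:
    """Return the largest per-metric proposal plus every proposal."""
    per_metric = {
        name: replicas_for_metric(current_replicas, current, target)
        for name, (current, target) in metrics.items()
    }
    return max(per_metric.values()), per_metric


# (current, target) pairs copied from the sample output.
metrics = {
    "memory": (8374272, 100 * MIB),  # bytes
    "cpu_system": (27, 20),          # millicores
    "cpu": (71, 50),                 # percent of request
}

count, proposals = desired_replicas_multi(2, metrics)
print(count)      # 3
print(proposals)  # {'memory': 1, 'cpu_system': 3, 'cpu': 3}
```

Memory is far below its target and would ask for a single pod on its own, but the larger proposal wins, so the result is three pods.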