For HPA to work correctly, service deployments should have resources request definitions for containers. Follow this hello-world example to test if HPA is working correctly.
Copy the
hello-world
deployment manifest below.Hello World Manifest
Deploy it to your cluster.
# kubectl create -f <HELLO_WORLD_MANIFEST>
Copy one of the HPAs below based on the metric type you’re using:
Hello World HPA: Resource Metrics
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: hello-world
namespace: default
spec:
scaleTargetRef:
apiVersion: extensions/v1beta1
kind: Deployment
name: hello-world
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
targetAverageUtilization: 50
- type: Resource
resource:
name: memory
targetAverageValue: 1000Mi
Hello World HPA: Custom Metrics
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: hello-world
namespace: default
spec:
scaleTargetRef:
apiVersion: extensions/v1beta1
kind: Deployment
name: hello-world
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
targetAverageUtilization: 50
- type: Resource
resource:
name: memory
targetAverageValue: 100Mi
- type: Pods
pods:
metricName: cpu_system
targetAverageValue: 20m
View the HPA info and description. Confirm that metric data is shown.
Resource Metrics
-
# kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
hello-world Deployment/hello-world 1253376 / 100Mi, 0% / 50% 1 10 1 6m
# kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 23 Jul 2018 20:21:16 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 1253376 / 100Mi
resource cpu on pods (as a percentage of request): 0% (0) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale the last scale time was sufficiently old as to warrant a new scale
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events: <none>
Custom Metrics
Enter the following command.
# kubectl describe hpa
You should receive the output that follows.
-
Generate a load for the service to test that your pods autoscale as intended. You can use any load-testing tool (Hey, Gatling, etc.), but we’re using .
Test that pod autoscaling works as intended.
To Test Autoscaling Using Resource Metrics:
Upscale to 2 Pods: CPU Usage Up to Target
Use your load testing tool to scale up to two pods based on CPU Usage.
View your HPA.
# kubectl describe hpa
You should receive output similar to what follows.
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 10928128 / 100Mi
resource cpu on pods (as a percentage of request): 56% (280m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 2
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 13s horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
-
# kubectl get pods
You should receive output similar to what follows:
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-k8ph2 1/1 Running 0 1m
hello-world-54764dfbf8-q6l4v 1/1 Running 0 3h
Upscale to 3 pods: CPU Usage Up to Target
Use your load testing tool to upscale to 3 pods based on CPU usage with
horizontal-pod-autoscaler-upscale-delay
set to 3 minutes.Enter the following command.
# kubectl describe hpa
You should receive output similar to what follows
Enter the following command to confirm three pods are running.
# kubectl get pods
You should receive output similar to what follows.
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-f46kh 0/1 Running 0 1m
hello-world-54764dfbf8-k8ph2 1/1 Running 0 5m
hello-world-54764dfbf8-q6l4v 1/1 Running 0 3h
Downscale to 1 Pod: All Metrics Below Target
Use your load testing to scale down to 1 pod when all metrics are below target for
horizontal-pod-autoscaler-downscale-delay
(5 minutes by default).Enter the following command.
# kubectl describe hpa
Name: hello-world
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 0% (0) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 1
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 10m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 6m horizontal-pod-autoscaler New size: 3; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 1s horizontal-pod-autoscaler New size: 1; reason: All metrics below target
**To Test Autoscaling Using Custom Metrics:**
Upscale to 2 Pods: CPU Usage Up to Target
Use your load testing tool to upscale two pods based on CPU usage.
1. Enter the following command.
```
# kubectl describe hpa
```
You should receive output similar to what follows.
```
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8159232 / 100Mi
"cpu_system" on pods: 7m / 20m
resource cpu on pods (as a percentage of request): 64% (321m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 2
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 16s horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
```
2. Enter the following command to confirm two pods are running.
```
# kubectl get pods
```
You should receive output similar to what follows.
```
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-5pfdr 1/1 Running 0 3s
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h
```
Upscale to 3 Pods: CPU Usage Up to Target
Use your load testing tool to scale up to three pods when the cpu\_system usage limit is up to target.
1. Enter the following command.
```
# kubectl describe hpa
```
You should receive output similar to what follows:
```
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8374272 / 100Mi
"cpu_system" on pods: 27m / 20m
resource cpu on pods (as a percentage of request): 71% (357m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 3
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 3m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 3s horizontal-pod-autoscaler New size: 3; reason: pods metric cpu_system above target
```
2. Enter the following command to confirm three pods are running.
```
# kubectl get pods
```
You should receive output similar to what follows:
```
# kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-5pfdr 1/1 Running 0 3m
hello-world-54764dfbf8-m2hrl 1/1 Running 0 1s
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h
```
Upscale to 4 Pods: CPU Usage Up to Target
Use your load testing tool to upscale to four pods based on CPU usage. `horizontal-pod-autoscaler-upscale-delay` is set to three minutes by default.
1. Enter the following command.
```
# kubectl describe hpa
```
You should receive output similar to what follows.
```
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8374272 / 100Mi
"cpu_system" on pods: 27m / 20m
resource cpu on pods (as a percentage of request): 71% (357m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 3
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 5m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 3m horizontal-pod-autoscaler New size: 3; reason: pods metric cpu_system above target
Normal SuccessfulRescale 4s horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target
```
2. Enter the following command to confirm four pods are running.
```
# kubectl get pods
```
You should receive output similar to what follows.
```
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-2p9xb 1/1 Running 0 5m
hello-world-54764dfbf8-5pfdr 1/1 Running 0 2m
hello-world-54764dfbf8-m2hrl 1/1 Running 0 1s
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h
```
Downscale to 1 Pod: All Metrics Below Target
Use your load testing tool to scale down to one pod when all metrics below target for `horizontal-pod-autoscaler-downscale-delay`.
1. Enter the following command.
```
# kubectl describe hpa
```
You should receive similar output to what follows.
```
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8101888 / 100Mi
"cpu_system" on pods: 8m / 20m
resource cpu on pods (as a percentage of request): 0% (0) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 1
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 10m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 8m horizontal-pod-autoscaler New size: 3; reason: pods metric cpu_system above target
Normal SuccessfulRescale 5m horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 13s horizontal-pod-autoscaler New size: 1; reason: All metrics below target
```
2. Enter the following command to confirm a single pods is running.
```
# kubectl get pods
```
You should receive output similar to what follows.
```
NAME READY STATUS RESTARTS AGE
```