Kubernetes HPA (Horizontal Pod Autoscaling) on KubeSphere
The Kubernetes HPA feature automatically adjusts the number of Pods to maintain average resource usage (CPU and memory) of Pods around preset values. For details about how HPA functions, see the official Kubernetes document.
This document uses HPA based on CPU usage as an example. Operations for HPA based on memory usage are similar.
- You need to .
- You need to create a workspace, a project, and a user (for example, project-regular), who must be invited to the project and assigned the operator role. For more information, see Create Workspaces, Projects, Users and Roles.
Create a Service
Log in to the KubeSphere web console as project-regular and go to your project.
In the Create Service dialog box, click Stateless Service.
Set the Service name (for example, ) and click Next.
Click Add Container, set Image to mirrorgooglecontainers/hpa-example, and click Use Default Ports.
Set the CPU request (for example, 0.15 cores) for each container, click √, and click Next.
Note
- To use HPA based on CPU usage, you must set the CPU request for each container, which is the minimum CPU resource reserved for each container (for details, see the official Kubernetes document). The HPA feature compares the average Pod CPU usage with a target percentage of the average Pod CPU request.
- For HPA based on memory usage, you do not need to configure the memory request.
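The comparison described in the note can be sketched in Python. This is a minimal illustration of the replica formula given in the Kubernetes HPA documentation (desired replicas = ceil(current replicas × current utilization / target utilization)), not the controller's actual code. The function name, the millicore units, the usage figures, and the maximum of 10 replicas are assumptions chosen for illustration; the 0.1 tolerance is the Kubernetes default and may be configured differently in your cluster.

```python
import math


def desired_replicas(current_replicas, avg_usage_millicores, request_millicores,
                     target_percent, min_replicas, max_replicas, tolerance=0.1):
    """Estimate the replica count the HPA would aim for (illustrative sketch).

    avg_usage_millicores: average CPU usage per Pod.
    request_millicores:   CPU request per Pod (0.15 cores = 150 millicores).
    target_percent:       Target CPU Usage (%) relative to the CPU request.
    """
    # Ratio of current utilization to target utilization.
    ratio = (avg_usage_millicores * 100) / (request_millicores * target_percent)
    # Small deviations from the target do not trigger scaling.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is always kept within the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# One Pod requesting 150m but using 300m, with a 60% target:
# utilization is 200%, so ceil(1 * 200 / 60) = 4 Pods.
print(desired_replicas(1, 300, 150, 60, min_replicas=1, max_replicas=10))
```

With a 0.15-core request and a 60% target, the HPA aims for roughly 90 millicores of average usage per Pod.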
Select Deployments in Workloads on the left navigation bar and click the HPA Deployment (for example, hpa-v1) on the right.
Click More and select Edit Autoscaling from the drop-down menu.
In the Horizontal Pod Autoscaling dialog box, configure the HPA parameters and click OK.
- Target CPU Usage (%): Target percentage of the average Pod CPU request.
- Minimum Replicas: Minimum number of Pods.
- Maximum Replicas: Maximum number of Pods.
In this example, Target CPU Usage (%) is set to 60, Minimum Replicas is set to 1, and Maximum Replicas is set to .
Note
Ensure that the cluster can provide sufficient resources for all Pods when the number of Pods reaches the maximum. Otherwise, the creation of some Pods will fail.
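For readers who want to see what these three parameters correspond to outside the console, the following sketch builds an equivalent plain Kubernetes HorizontalPodAutoscaler manifest (autoscaling/v2) as a Python dictionary and prints it as JSON. It is an assumption that this matches what the KubeSphere console creates; the function name, the demo-project namespace, and the maximum of 5 replicas are hypothetical values chosen for illustration.

```python
import json


def build_hpa_manifest(deployment, namespace, target_cpu_percent,
                       min_replicas, max_replicas):
    """Return an autoscaling/v2 HPA manifest as a plain dictionary (sketch)."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": deployment, "namespace": namespace},
        "spec": {
            # The workload whose replica count is adjusted.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,   # Minimum Replicas
            "maxReplicas": max_replicas,   # Maximum Replicas
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    # Target CPU Usage (%), relative to the CPU request.
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": target_cpu_percent,
                    },
                },
            }],
        },
    }


# max_replicas=5 is a hypothetical value, not taken from this document.
manifest = build_hpa_manifest("hpa-v1", "demo-project", 60, 1, 5)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be saved to a file and inspected or applied on a test cluster.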
Verify HPA
This section uses a Deployment that sends requests to the HPA Service to verify that HPA automatically adjusts the number of Pods to meet the resource usage target.
Select Workloads in Application Workloads on the left navigation bar and click Create on the right.
In the Create Deployment dialog box, set the Deployment name (for example, load-generator) and click Next.
Scroll down in the dialog box, select Start Command, set Command to sh,-c, and set Parameters to a loop that sends requests to the HPA Service (for example, while true; do wget -q -O- http://hpa.demo-project.svc.cluster.local; done).
Click √ and click Next.
Click Next on the Volume Settings tab and click Create on the Advanced Settings tab.
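The start command above is a shell loop that requests the HPA Service as fast as it can. As a rough Python equivalent, useful if you prefer to generate load from a script inside the cluster, the sketch below does the same thing with the standard library. The function names and the max_requests and fetch_fn parameters are hypothetical additions so that the loop can be bounded and exercised without a network; the URL is the example from this document and resolves only inside the cluster.

```python
import urllib.request


def fetch(url, timeout=5.0):
    """Send one GET request and return the response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def generate_load(url, max_requests=None, fetch_fn=fetch):
    """Request url repeatedly; loop forever when max_requests is None.

    Returns the number of requests attempted.
    """
    sent = 0
    while max_requests is None or sent < max_requests:
        try:
            fetch_fn(url)
        except OSError:
            # Like the shell loop, keep going when a request fails
            # (urllib's URLError and HTTPError are subclasses of OSError).
            pass
        sent += 1
    return sent


# Example (runs until interrupted, so it is left commented out):
# generate_load("http://hpa.demo-project.svc.cluster.local")
```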
View the HPA Deployment status
After the load generator Deployment is created, go to Workloads in Application Workloads on the left navigation bar and click the HPA Deployment (for example, hpa-v1) on the right. The number of Pods displayed on the page automatically increases to meet the resource usage target.
Choose Workloads in Application Workloads on the left navigation bar, click on the right of the load generator Deployment (for example, load-generator-v1), and choose Delete from the drop-down list. After the load-generator Deployment is deleted, check the status of the HPA Deployment again. The number of Pods decreases to the minimum.
Note
The system may require a few minutes to adjust the number of Pods and collect data.
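Part of this delay comes from the scale-down stabilization window described in the Kubernetes HPA documentation: by default, the controller acts on the highest replica recommendation from the last 300 seconds, so the Pod count drops only after the load has stayed low for that long. The sketch below illustrates the idea; the function name and the sample history are hypothetical, and the window may be configured differently in your cluster.

```python
def stabilized_replicas(recommendations, now, window_seconds=300):
    """Pick the replica count to apply when scaling down (illustrative sketch).

    recommendations: list of (timestamp_seconds, recommended_replicas),
                     containing at least one entry inside the window.
    now:             current time in seconds.
    """
    # Only recommendations inside the stabilization window are considered.
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    # The highest recent recommendation wins, which delays scale-down.
    return max(recent)


# Hypothetical history: load stops shortly after t=0.
history = [(0, 4), (60, 3), (120, 1), (240, 1), (360, 1)]

print(stabilized_replicas(history, now=240))  # still 4: the t=0 entry is in the window
print(stabilized_replicas(history, now=361))  # 1: only low recommendations remain
```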
You can repeat the preceding configuration steps to edit the HPA configuration.
Cancel HPA
Choose Workloads in Application Workloads on the left navigation bar and click the HPA Deployment (for example, hpa-v1) on the right.