A high-availability Kubernetes installation, defined as an installation of Rancher on a Kubernetes cluster with at least three nodes, should be used in any production installation of Rancher, as well as any installation deemed “important.” Multiple Rancher instances running on multiple nodes ensure high availability that cannot be accomplished with a single node environment.
If you are installing Rancher in a vSphere environment, refer to the best practices documented here.
When you set up your high-availability Rancher installation, consider the following:
Make sure nodes are configured correctly for Kubernetes
It’s important to follow K8s and etcd best practices when deploying your nodes, including disabling swap, double checking you have full network connectivity between all machines in the cluster, using unique hostnames, MAC addresses, and product_uuids for every node, checking that all correct ports are opened, and deploying with ssd backed etcd. More details can be found in the kubernetes docs and
RKE keeps record of the cluster state in a file called . This file is important for the recovery of a cluster and/or the continued maintenance of the cluster through RKE. Because this file contains certificate material, we strongly recommend encrypting this file before backing up. After each run of rke up
you should backup the state file.
Run All Nodes in the Cluster in the Same Datacenter
For best performance, run all three of your nodes in the same geographic datacenter. If you are running nodes in the cloud, such as AWS, run each node in a separate Availability Zone. For example, launch node 1 in us-west-2a, node 2 in us-west-2b, and node 3 in us-west-2c.
Monitor Your Clusters to Plan Capacity
The Rancher server’s Kubernetes cluster should run within the system and hardware requirements as closely as possible. The more you deviate from the system and hardware requirements, the more risk you take.
However, metrics-driven capacity planning analysis should be the ultimate guidance for scaling Rancher, because the published requirements take into account a variety of workload types.
Using Rancher, you can monitor the state and processes of your cluster nodes, Kubernetes components, and software deployments through integration with Prometheus, a leading open-source monitoring solution, and Grafana, which lets you visualize the metrics from Prometheus.