Upgrade Consul on Kubernetes
If you make a change to your Helm values file, you will need to perform a helm upgrade for those changes to take effect.
For example, if you’ve installed Consul with the following:
And you wish to set connectInject.enabled to true:
global:
  name: consul
connectInject:
-  enabled: false
+  enabled: true
Perform the following steps:
- Determine your current installed chart version.
helm list -f consul
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
consul default 2 2020-09-30 ... deployed consul-0.24.0 1.8.2
In this example, version 0.24.0 (from consul-0.24.0) is being used.
- Perform a helm upgrade:
helm upgrade consul hashicorp/consul --version 0.24.0 -f /path/to/my/values.yaml
Before performing the upgrade, be sure you’ve read the other sections on this page, continuing at Determining What Will Change.
NOTE: It’s important to always set the --version flag, because otherwise Helm will use the most up-to-date version in its local cache, which may result in an unintended upgrade.
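To avoid mistyping the version when pinning --version, the installed chart version can be read out of the helm list output. The following is a minimal sketch, not part of the official tooling: chart_version is a hypothetical helper name, and it assumes the CHART column has the form <chart>-<version> as in the example output above.

```shell
# Hypothetical helper: print the installed chart version.
# Reads `helm list` output on stdin; $1 is the chart name (e.g. "consul").
chart_version() {
  awk -v chart="$1" '
    NR > 1 {                                  # skip the header row
      for (i = 1; i <= NF; i++) {
        rest = substr($i, length(chart) + 2)  # text after "<chart>-"
        if (index($i, chart "-") == 1 && rest ~ /^[0-9]+\.[0-9]+\.[0-9]+/) {
          print rest
          exit
        }
      }
    }'
}

# Example usage (requires helm and an installed release):
#   version="$(helm list -f consul | chart_version consul)"
#   helm upgrade consul hashicorp/consul --version "$version" -f /path/to/my/values.yaml
```

The helper scans every field rather than a fixed column, because the UPDATED column can contain spaces.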
You may wish to upgrade your Helm chart version to take advantage of new features or bug fixes, or because the Consul version you want to upgrade to requires a certain Helm chart version.
- Update your local Helm repository cache:
helm repo update
- List all available versions, for example with helm search repo hashicorp/consul -l. In this example, the latest version is 0.24.1.
- To determine which version you have installed, issue the following command:
helm list -f consul
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
consul default 2 2020-09-30 ... deployed consul-0.24.0 1.8.2
In this example, version 0.24.0 (from consul-0.24.0) is being used. If you want to upgrade to the latest 0.24.1 version, use the following procedure:
- Check the changelog for any breaking changes from that version and any versions in between.
- Upgrade by performing a helm upgrade with the --version flag:
helm upgrade consul hashicorp/consul --version 0.24.1 -f /path/to/my/values.yaml
Before performing the upgrade, be sure you’ve read the other sections on this page, continuing at Determining What Will Change.
If a new version of Consul is released, you will need to perform a Helm upgrade to update to the new version.
- Ensure you’ve read any upgrade instructions specific to the version you’re upgrading to, as well as the Consul changelog for that version.
- Read our compatibility matrix to ensure your current Helm chart version supports this Consul version. If it does not, you may need to also upgrade your Helm chart version at the same time.
- Set global.image in your values.yaml to the desired version:
global:
  image: consul:1.8.3
- Determine your current installed chart version:
helm list -f consul
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
consul default 2 2020-09-30 ... deployed consul-0.24.0 1.8.2
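In this example, version 0.24.0 (from consul-0.24.0) is being used.
- Perform a helm upgrade, passing the chart version you determined above (0.24.0 in this example) with the --version flag and your updated values.yaml with -f.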
Before performing the upgrade, be sure you’ve read the other sections on this page, continuing at Determining What Will Change.
NOTE: It’s important to always set the --version flag, because otherwise Helm will use the most up-to-date version in its local cache, which may result in an unintended upgrade.
Determining What Will Change
Before upgrading, it’s important to understand what changes will be made to your cluster. For example, you will need to take more care if your upgrade will result in the Consul server statefulset being redeployed.
There is no built-in functionality in Helm that shows what a helm upgrade will change. There is, however, a Helm plugin helm-diff that can be used.
- Install helm-diff with:
helm plugin install https://github.com/databus23/helm-diff
- If you are updating your values.yaml file, do so now.
- Take the same helm upgrade command you were planning to issue but perform helm diff upgrade instead of helm upgrade:
helm diff upgrade consul hashicorp/consul --version 0.24.1 -f /path/to/your/values.yaml
This will print out the manifests that will be updated and their diffs.
- To see only the objects that will be updated, add | grep "has changed":
helm diff upgrade consul hashicorp/consul --version 0.24.1 -f /path/to/your/values.yaml |
  grep "has changed"
If either the Consul server statefulset or the Consul client daemonset is being redeployed, we will follow the same pattern for upgrades as on other platforms: the servers will be redeployed one-by-one, and then the clients will be redeployed in batches. Read the general Upgrading Consul documentation and then continue reading below.
If neither the client daemonset nor the server statefulset is being redeployed, then you can continue with the helm upgrade without any specific sequence to follow.
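The check described above can be scripted. This is a rough sketch under stated assumptions: needs_ordered_upgrade is a hypothetical helper name, and it assumes helm-diff reports each modified object on a line containing its kind (for example StatefulSet or DaemonSet) followed by "has changed". Verify this against the output of your helm-diff version before relying on it.

```shell
# Hypothetical helper: succeed (exit 0) if the diff output on stdin reports
# a changed StatefulSet or DaemonSet, fail (exit 1) otherwise.
# Assumes helm-diff summary lines contain the object kind and "has changed".
needs_ordered_upgrade() {
  awk '/has changed/ && /StatefulSet|DaemonSet/ { found = 1 }
       END { exit !found }'
}

# Example usage (requires helm and the helm-diff plugin):
#   if helm diff upgrade consul hashicorp/consul --version 0.24.1 \
#        -f /path/to/your/values.yaml | needs_ordered_upgrade; then
#     echo "Servers or clients will be redeployed: follow the ordered upgrade."
#   fi
```

The pattern matches any StatefulSet or DaemonSet in the release, so it errs on the side of recommending the ordered procedure.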
Service Mesh
If you are using Consul’s service mesh features, as opposed to the service sync functionality, you must be aware of the behavior of the service mesh during upgrades.
When a Consul client pod is restarted, it will deregister itself from Consul when it stops. When the pod restarts, it will re-register itself with Consul. Thus, during the period between the Consul client on a node stopping and restarting, the following will occur:
- The node will be deregistered from Consul. It will not show up in the Consul UI nor in API requests.
- Because the node is deregistered, all service pods that were on that node will also be deregistered. This means they will not receive service mesh traffic until the Consul client pod restarts.
- Service pods on that node can continue to make requests through the service mesh because each Envoy proxy maintains a cache of the locations of upstream services. However, if the upstream services change IPs, Envoy will not be able to refresh its cluster information until its local Consul client is restarted. Services can therefore continue to make requests without downtime for a short period of time, but it’s important for the local Consul client to be restarted as quickly as possible.
Once the local Consul client pod restarts, each service pod needs to re-register itself with the client. This is done automatically by the consul-connect-lifecycle-sidecar sidecar container that is injected alongside each service.
Because service mesh pods are briefly deregistered during a Consul client restart, it’s important that you do not restart all Consul clients at once. Otherwise you may experience downtime because no replicas of a specific service will be in the mesh.
In addition, it’s important that you have multiple replicas for each service. If you only have one replica, then during restart of the Consul client on the node hosting that replica, it will be briefly deregistered from the mesh. Since it’s the only replica, other services will not be able to make calls to that service. (NOTE: This can also be avoided by stopping that replica so it is rescheduled to a node whose Consul client has already been updated.)
Given the above, we recommend that after Consul servers are upgraded, the Consul client daemonset is set to use the OnDelete update strategy and Consul clients are deleted one by one or in batches. See Upgrading Consul Servers and Upgrading Consul Clients for more details.
Upgrading Consul Servers
To initiate the upgrade:
- Change the global.image value to the desired Consul version.
- Set the server.updatePartition value equal to the number of server replicas. By default there are 3 servers, so you would set this value to 3.
- Set the updateStrategy for clients to OnDelete.
global:
  image: 'consul:123.456'
server:
  updatePartition: 3
client:
  updateStrategy: |
    type: OnDelete
The updatePartition value controls how many instances of the server cluster are updated. Only instances with an index greater than or equal to the updatePartition value are updated (zero-indexed). Therefore, by setting it equal to the number of replicas, none should update yet.
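To make the partition rule concrete, here is a small sketch that lists which server ordinals would be updated for a given partition value. It relies on the Kubernetes StatefulSet behavior that pods with an ordinal greater than or equal to the partition are updated. pods_updated is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print the zero-indexed server ordinals that a
# partitioned StatefulSet rolling update would touch.
# $1 = number of replicas, $2 = updatePartition value.
pods_updated() {
  replicas=$1
  partition=$2
  out=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Kubernetes updates pods whose ordinal is >= the partition.
    if [ "$i" -ge "$partition" ]; then
      out="$out${out:+ }$i"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

pods_updated 3 3   # prints an empty line: no server is updated yet
pods_updated 3 2   # prints "2": only the highest-ordinal server is updated
pods_updated 3 0   # prints "0 1 2": all servers are updated
```

This is why the procedure starts with the partition equal to the replica count and lowers it one step at a time.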
The updateStrategy controls how Kubernetes rolls out changes to the client daemonset. By setting it to OnDelete, no clients will be restarted until their pods are deleted. Without this, they would be redeployed alongside the servers because their Docker image versions have changed. This is not desirable because we want the Consul servers to be upgraded before the clients.
- Next, perform the upgrade by running helm upgrade with your updated values.yaml, as in the sections above.
This will not cause the servers to redeploy (although the resource will be updated). If everything is stable, begin by decreasing the updatePartition value by one, and performing helm upgrade again. This will cause the first Consul server to be stopped and restarted with the new image.
- Wait until the Consul server cluster is healthy again (30s to a few minutes). This can be confirmed by issuing consul members on one of the previous servers, and ensuring that all servers are listed and are alive.
- Decrease updatePartition by one and upgrade again. Continue until updatePartition is 0. At this point, you may remove the updatePartition configuration. Your server upgrade is complete.
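The sequence of upgrades above can be written out ahead of time. The sketch below only prints the commands for review and does not run them. server_upgrade_plan is a hypothetical helper name, and overriding server.updatePartition with Helm's --set flag instead of editing values.yaml each time is an assumption of this sketch, not something the procedure above requires.

```shell
# Hypothetical helper: print one helm upgrade command per server,
# lowering server.updatePartition from replicas-1 down to 0.
# $1 = number of server replicas, $2 = chart version, $3 = values file path.
server_upgrade_plan() {
  replicas=$1
  chart_version=$2
  values=$3
  p=$((replicas - 1))
  while [ "$p" -ge 0 ]; do
    echo "helm upgrade consul hashicorp/consul --version $chart_version -f $values --set server.updatePartition=$p"
    p=$((p - 1))
  done
}

# Print the plan for the default 3 servers:
server_upgrade_plan 3 0.24.1 /path/to/your/values.yaml
```

Run the printed commands one at a time, and confirm with consul members that all servers are alive before moving on to the next one.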
Upgrading Consul Clients
With the servers upgraded, it is time to upgrade the clients. If you are using Consul’s service mesh features, you will want to be careful restarting the clients as outlined in Service Mesh.
You can either:
- Manually issue kubectl delete pod <id> for each consul daemonset pod.
- Set the updateStrategy to rolling update with a small number:
client:
  updateStrategy: |
    rollingUpdate:
      maxUnavailable: 2
Then, run helm upgrade. This will upgrade the clients in batches, waiting until the clients come up healthy before continuing.
- Cordon and drain each node to ensure there are no connect pods active on it, and then delete the consul client pod on that node.
NOTE: If you are using only the Service Sync functionality, you can perform an upgrade without following a specific sequence since that component is more resilient to brief restarts of Consul clients.
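For the manual option, deleting client pods in small batches can be planned with a short script. This is a sketch only: client_delete_plan is a hypothetical helper name, the pod names are supplied by you on stdin (one per line), and the script prints the kubectl delete pod commands for review without running them.

```shell
# Hypothetical helper: group pod names read from stdin into batches and
# print one `kubectl delete pod` command per batch.
# $1 = batch size (number of client pods to delete at once).
client_delete_plan() {
  awk -v n="$1" '
    NF { batch = batch " " $1; count++ }     # collect non-empty lines
    count == n {                             # batch is full: emit it
      print "kubectl delete pod" batch
      batch = ""; count = 0
    }
    END { if (count > 0) print "kubectl delete pod" batch }'
}

# Example usage with placeholder pod names:
#   printf 'consul-a\nconsul-b\nconsul-c\n' | client_delete_plan 2
```

Wait for the replacement client pods in one batch to come up healthy before running the command for the next batch.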