Circuit Breaker
Circuit breakers, unlike active Health Checks, do not send additional traffic to our data plane proxies; instead, they inspect the existing service traffic. They are also commonly used to prevent cascading failures in our services.
Like a real-world circuit breaker, when the circuit is closed, traffic between a source and a destination data plane proxy is allowed to flow freely through it, and when it is open, the traffic is interrupted.
The conditions that determine when a circuit breaker is closed or open are configured in what we call “detectors”. This policy provides 5 different types of detectors, which are triggered by deviations in the upstream service behavior. All detectors can coexist on the same outbound interface.
Once one of the detectors has been triggered, the corresponding data plane proxy is ejected from the load balancer set for a period equal to baseEjectionTime. Every further ejection of the same data plane proxy extends the ejection period to baseEjectionTime multiplied by the number of ejections: for example, the 4th ejection will last for 4 * baseEjectionTime.
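As a rough illustration of how the ejection period grows, the following Python sketch computes the period for the n-th ejection. The helper name ejection_period is hypothetical and is not part of Kuma or Envoy; it only mirrors the multiplication rule stated here and does not model any other limit the proxy may apply.

```python
from datetime import timedelta


def ejection_period(base_ejection_time: timedelta, ejection_count: int) -> timedelta:
    """Return how long a proxy stays ejected on its n-th ejection.

    Hypothetical helper: mirrors the rule "base time multiplied by the
    number of ejections"; it is not a Kuma or Envoy API.
    """
    if ejection_count < 1:
        raise ValueError("ejection_count starts at 1")
    return base_ejection_time * ejection_count


base = timedelta(seconds=30)  # the documented default baseEjectionTime
# Ejection periods, in seconds, for the 1st through 4th ejection.
periods = [ejection_period(base, n).total_seconds() for n in range(1, 5)]
print(periods)
```

With the default base time of 30s, the first ejection lasts 30s and the second 60s, in line with the multiplication rule.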
Available detectors:
- Total Errors
- Gateway Errors
- Local Errors
- Standard Deviation
- Failures
As usual, we can apply sources and destinations selectors to determine how circuit breakers will be applied across our data plane proxies.
For example:
We will apply the configuration with kubectl apply -f [..].
The example demonstrates a complete configuration. A CircuitBreaker can also be configured in a simpler way by relying on Envoy's default values for any property that is not explicitly defined, for instance:
We will apply the configuration with kubectl apply -f [..] on Kubernetes, or with kumactl apply -f [..] or via the HTTP API on Universal.
In this example, when we get five errors in a row of any type (5 is the default Envoy value for totalErrors.consecutive), the data plane proxy will be ejected for 30s the first time, 60s the second time, and so on.
In the current version of Kuma, destinations only supports the tag.
interval
Time interval between ejection analysis sweeps. Defaults to 10s.
baseEjectionTime
The base time that a data plane proxy is ejected for. The real time is equal to the base time multiplied by the number of times the data plane proxy has been ejected. Defaults to 30s.
maxEjectionPercent
The maximum percent of an upstream Envoy cluster that can be ejected due to outlier detection. Defaults to 10% but will eject at least one data plane proxy regardless of the value.
Split Mode: There are two types of errors that might occur in a circuit breaker:
- Locally originated: errors triggered locally when establishing a connection at the TCP layer (e.g. connection refused, connection reset).
- Externally originated: errors returned by the upstream service in response to a request (e.g. a 5xx status code).
If Split Mode is off, Kuma won’t distinguish errors by their origin and they will be counted together. If Split Mode is on, different parameters can be used to fine-tune the detectors. All detectors count errors according to the state of this parameter.
Detectors
Below is a list of available detectors that can be configured in Kuma.
Total Errors
Errors with status code 5xx and locally originated errors; in Split Mode, only errors with status code 5xx.
consecutive - how many errors in a row will trigger the detector. Defaults to 5.
Gateway Errors
Subset of totalErrors related to gateway errors (502, 503 or 504 status code).
consecutive - how many errors in a row will trigger the detector. Defaults to 5.
Local Errors
The number of locally originated errors. Taken into account only in Split Mode.
consecutive - how many errors in a row will trigger the detector. Defaults to 5.
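The consecutive parameter shared by the error detectors above can be pictured as a simple streak counter. The Python sketch below is a minimal model under stated assumptions: the class ConsecutiveErrorDetector is hypothetical (not a Kuma or Envoy API), and it assumes that any successful response resets the streak and that counting starts again after the detector triggers.

```python
class ConsecutiveErrorDetector:
    """Hypothetical model of a "consecutive" detector; not a Kuma or Envoy API."""

    def __init__(self, consecutive: int = 5) -> None:
        self.consecutive = consecutive  # threshold; 5 is the documented default
        self.streak = 0  # errors seen in a row so far

    def observe(self, is_error: bool) -> bool:
        """Record one response and return True when the detector triggers."""
        if not is_error:
            self.streak = 0  # assumption: any success breaks the streak
            return False
        self.streak += 1
        if self.streak >= self.consecutive:
            self.streak = 0  # assumption: counting restarts after triggering
            return True
        return False


detector = ConsecutiveErrorDetector(consecutive=5)
# Four errors, one success, then five errors: only the final error triggers.
outcomes = [True] * 4 + [False] + [True] * 5
triggered = [detector.observe(is_error) for is_error in outcomes]
print(triggered)
```

Which responses count as errors depends on the detector and on Split Mode, so the caller decides what to pass as is_error.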
Standard Deviation
Detection based on success rate, aggregated from every data plane proxy in the Envoy cluster.
requestVolume - ignore data plane proxies with fewer requests than requestVolume. Defaults to 100.
minimumHosts - skip computing the success rate for an Envoy cluster if the number of data plane proxies with the required requestVolume is less than minimumHosts. Defaults to .
factor - the resulting threshold equals mean - (stdev * factor). Defaults to 1.9.
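To make the threshold formula concrete, here is a small Python sketch of success-rate outlier detection. It is an illustration only: success_rate_outliers is a hypothetical helper (not a Kuma or Envoy API), the population standard deviation is an assumption, and the minimum_hosts value used in the example is an illustrative choice rather than a documented default.

```python
from statistics import mean, pstdev


def success_rate_outliers(hosts, request_volume, minimum_hosts, factor):
    """Return names of hosts whose success rate is below mean - (stdev * factor).

    hosts maps a host name to (successful_requests, total_requests).
    Hypothetical helper, not a Kuma or Envoy API. Assumes the population
    standard deviation.
    """
    # Ignore hosts that served fewer requests than request_volume.
    rates = {
        name: 100.0 * ok / total
        for name, (ok, total) in hosts.items()
        if total >= request_volume
    }
    # Skip the whole cluster when too few hosts qualify.
    if len(rates) < minimum_hosts:
        return []
    values = list(rates.values())
    threshold = mean(values) - pstdev(values) * factor
    return sorted(name for name, rate in rates.items() if rate < threshold)


cluster = {
    "a": (200, 200), "b": (200, 200), "c": (200, 200),
    "d": (200, 200), "e": (200, 200),
    "f": (100, 200),  # 50% success rate
    "g": (3, 10),     # ignored: fewer than 100 requests
}
outliers = success_rate_outliers(cluster, request_volume=100, minimum_hosts=5, factor=1.9)
print(outliers)
```

In this example the mean success rate of the six eligible hosts is about 91.7 and the standard deviation about 18.6, so the threshold is roughly 56.3 and only host f falls below it.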
Failures
Detection based on success rate with an explicit threshold (unlike Standard Deviation).
requestVolume - ignore data plane proxies with fewer requests than requestVolume. Defaults to 50.
minimumHosts - skip computing the success rate for an Envoy cluster if the number of data plane proxies with the required requestVolume is less than minimumHosts. Defaults to 5.
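The contrast with a computed threshold can be sketched in Python as follows. This is only an illustration: failure_outliers is a hypothetical helper (not a Kuma or Envoy API), and both the threshold argument and the ">=" comparison on the failure percentage are assumptions made for the example.

```python
def failure_outliers(hosts, threshold, request_volume=50, minimum_hosts=5):
    """Return names of hosts whose failure percentage reaches the threshold.

    hosts maps a host name to (failed_requests, total_requests).
    Hypothetical helper, not a Kuma or Envoy API; the threshold argument
    and the ">=" comparison are assumptions made for illustration.
    """
    # Ignore hosts that served fewer requests than request_volume.
    eligible = {
        name: 100.0 * failed / total
        for name, (failed, total) in hosts.items()
        if total >= request_volume
    }
    # Skip the whole cluster when too few hosts qualify.
    if len(eligible) < minimum_hosts:
        return []
    return sorted(name for name, pct in eligible.items() if pct >= threshold)


cluster = {
    "a": (0, 60), "b": (3, 60), "c": (6, 60), "d": (0, 60),
    "e": (54, 60),  # 90% failures
    "f": (9, 10),   # ignored: fewer than 50 requests
}
flagged = failure_outliers(cluster, threshold=85)
print(flagged)
```

Unlike the Standard Deviation sketch, the cut-off here is fixed up front and does not depend on how the other hosts in the cluster behave.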