Slow start mode
Slow start mode is a mechanism that affects the load balancing weight of upstream endpoints and can be configured per upstream cluster. Currently, slow start is supported in the Round Robin and Least Request load balancer types.
Users can specify a slow start window parameter (in seconds). If an endpoint's “cluster membership duration” (the amount of time since it joined the cluster) is within the configured window, the endpoint enters slow start mode. During the slow start window, the load balancing weight of that endpoint is scaled by a time factor, e.g.:
\[NewWeight = Weight \cdot TimeFactor^{\frac{1}{Aggression}}\]
where,
\[TimeFactor = \frac{\max(TimeSinceStartInSeconds, 1)}{SlowStartWindowInSeconds}\]
As time progresses, more and more traffic is sent to an endpoint that is within the slow start window.
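The scaling above can be sketched in a few lines of Python. This is a minimal illustration and not the load balancer's actual implementation: the function name `slow_start_weight` and its parameter names are hypothetical, and it assumes the exponent applies to the time factor only and that the window is at least one second long.

```python
def slow_start_weight(weight, time_since_start_s, window_s, aggression=1.0):
    """Return the scaled load balancing weight for an endpoint.

    Hypothetical helper mirroring the two formulas above; all names are
    illustrative assumptions.
    """
    # Outside the slow start window (or with slow start disabled),
    # the endpoint keeps its regular weight.
    if window_s <= 0 or time_since_start_s >= window_s:
        return weight

    # TimeFactor = max(TimeSinceStartInSeconds, 1) / SlowStartWindowInSeconds
    time_factor = max(time_since_start_s, 1) / window_s

    # NewWeight = Weight * TimeFactor ^ (1 / Aggression)
    # (assumption: the exponent applies to the time factor only)
    return weight * time_factor ** (1.0 / aggression)


if __name__ == "__main__":
    # Weight ramp for a 60-second window with aggression 1.0.
    for t in (0, 15, 30, 45, 60):
        print(t, slow_start_weight(100, t, 60))
```

With aggression 1.0 the ramp is linear in time. Under the assumption stated above, an aggression greater than 1.0 lets the weight rise faster early in the window.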
Once the slow start window elapses, the upstream endpoint exits slow start mode and receives its regular share of traffic according to the load balancing algorithm. Its load balancing weight is no longer scaled with runtime bias and aggression. An endpoint also exits slow start mode if it leaves the cluster.
To reiterate, an endpoint enters slow start mode:
If no active healthcheck is configured per cluster: immediately, provided its cluster membership duration is within the slow start window.
An endpoint exits slow start mode when:
It does not pass an active healthcheck configured per cluster. The endpoint can later re-enter slow start if it passes an active healthcheck and its creation time is within the slow start window.
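The enter and exit conditions described in this section can be summarized as a single predicate. The sketch below is an assumption-laden illustration: `in_slow_start` and its parameters are hypothetical names, and "within the window" is interpreted as strictly less than the window length.

```python
def in_slow_start(membership_s, window_s, active_hc_configured,
                  passing_hc=True, in_cluster=True):
    """Return True if an endpoint should currently be in slow start mode.

    Hypothetical helper; parameter names are illustrative only.
    """
    # An endpoint that has left the cluster is no longer in slow start.
    if not in_cluster:
        return False

    # The window has elapsed (assumption: strict comparison).
    if membership_s >= window_s:
        return False

    # With an active healthcheck configured, a failing endpoint is out of
    # slow start; it can re-enter once it passes again within the window.
    if active_hc_configured and not passing_hc:
        return False

    return True
```

For example, `in_slow_start(10, 60, active_hc_configured=True, passing_hc=False)` evaluates to `False`, and flips back to `True` once the endpoint passes its healthcheck while still inside the window.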
Enabling slow start mode is not recommended in scenarios with low traffic or a high number of endpoints. Potential drawbacks include:
Spurious (non-gradual) increase of traffic per endpoint: whenever a starving endpoint receives a request and sufficient time has passed within the slow start window, its load balancing weight increases non-linearly due to the time factor.
Below is an example of how the resulting load balancing weight would look for endpoints in the same priority with the Round Robin load balancer type, a slow start window of 60 seconds, no active healthcheck, and an aggression of 1.0. Once endpoints E1 and E2 exit slow start mode, their load balancing weight remains constant:
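As a rough worked check of this scenario (60-second window, aggression 1.0), consider an endpoint that has been in the cluster for 30 seconds. The 30-second value is chosen purely for illustration, and the exponent is assumed to apply to the time factor only:
\[TimeFactor = \frac{\max(30, 1)}{60} = 0.5\]
\[NewWeight = Weight \cdot 0.5^{\frac{1}{1.0}} = 0.5 \cdot Weight\]
After 60 seconds the endpoint exits slow start mode, so its weight is no longer scaled and stays at Weight.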