Service Resiliency

    We will make pod recommendation-v2 fail 100% of the time. The following command looks up the name of one recommendation-v2 pod in your system and opens a shell in it:

        oc exec -it -n tutorial $(oc get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation -- /bin/bash
        or
        kubectl exec -it -n tutorial $(kubectl get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation -- /bin/bash

    You will be inside the application container of your pod recommendation-v2-2036617847-spdrb. Now execute:

        curl localhost:8080/misbehave
        exit

    This is a special endpoint that will make our application return only 503s.

    You will see it works every time because Istio will retry the recommendation service automatically and it will land on v1 only.

        ./scripts/
        customer => preference => recommendation v1 from '2036617847-m9glz': 196
        customer => preference => recommendation v1 from '2036617847-m9glz': 197
        customer => preference => recommendation v1 from '2036617847-m9glz': 198
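
The retries described above happen without any configuration, but a retry policy can also be stated explicitly on a VirtualService. The following is a minimal sketch and not one of the tutorial's files: the scratch file name and the values for `attempts` and `perTryTimeout` are illustrative assumptions, while `retries`, `attempts`, `perTryTimeout` and `retryOn` are fields of Istio's VirtualService HTTP route. The script only writes the file; applying it is left as a commented step.

```shell
#!/bin/sh
# Sketch only: write a VirtualService with an explicit retry policy to a
# scratch file. The file name and the values below are assumptions.
out="${TMPDIR:-/tmp}/virtual-service-recommendation-retry-sketch.yml"

cat > "$out" <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
  - recommendation
  http:
  - route:
    - destination:
        host: recommendation
    retries:
      attempts: 3          # assumed value: up to 3 retries per request
      perTryTimeout: 2s    # assumed value: time budget for each attempt
      retryOn: 5xx         # retry when the upstream answers with a 5xx, such as the 503 above
EOF

echo "wrote $out"
# To try it on a cluster (not executed by this sketch):
#   kubectl apply -f "$out" -n tutorial
```

If you do apply something like this, remember to delete it again before moving on, because later steps assume no other virtual service is in effect.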

    If you open Kiali, you will notice that v2 still receives requests, but the failing response is never returned to the user: preference retries the call to recommendation, and v1 replies.

        open http://kiali-istio-system.{appdomain}/kiali

    In Kiali, go to Graph, select the recommendation square, and place the mouse over the red sign, as in the picture below.

    Now, make the pod v2 behave well again:

        oc exec -it -n tutorial $(oc get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation -- /bin/bash
        or
        kubectl exec -it -n tutorial $(kubectl get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation -- /bin/bash

    You will be inside the application container of your pod recommendation-v2-2036617847-spdrb. Now execute:

        curl localhost:8080/behave
        exit

    The application is back to round-robin load-balancing between v1 and v2:

        ./scripts/ http://istio-ingressgateway-istio-system.$(minishift ip)
        customer => preference => recommendation v1 from '2039379827-h58vw': 129
        customer => preference => recommendation v2 from '2036617847-m9glz': 207
        customer => preference => recommendation v1 from '2039379827-h58vw': 130

    A timeout makes the caller wait only N seconds before giving up and failing. At this point, no other virtual service or destination rule (in the tutorial namespace) should be in effect.

    To check, run kubectl get virtualservice and kubectl get destinationrule. If any are listed, remove them with kubectl delete virtualservice virtualservicename -n tutorial and kubectl delete destinationrule destinationrulename -n tutorial.
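
The check above can be wrapped in a small helper so it is easy to repeat before each exercise. This is a sketch under assumptions: `list_istio_leftovers` and `count_istio_leftovers` are hypothetical names, not part of the tutorial's scripts, and the helper relies only on `kubectl get ... -o name`, which prints one line per resource.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the tutorial scripts).

# Print every virtual service and destination rule found in the namespace,
# one "kind/name" line per resource; prints nothing when the namespace is clean.
list_istio_leftovers() {
  ns="$1"
  for kind in virtualservice destinationrule; do
    kubectl get "$kind" -n "$ns" -o name 2>/dev/null
  done
}

# Print how many such resources exist (0 means nothing needs deleting).
count_istio_leftovers() {
  list_istio_leftovers "$1" | wc -l | tr -d ' '
}

# Example use against a cluster (not executed by this sketch):
#   count_istio_leftovers tutorial
#   list_istio_leftovers tutorial | xargs -r -n 1 kubectl delete -n tutorial
```

The functions only read from the cluster; deletion stays an explicit, separate command so nothing is removed by accident.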

    You will deploy docker images that were previously built for this tutorial. If you want to build recommendation with Quarkus to add a timeout visit:

    If you have not built the images on your own then let’s deploy the customer pod with its sidecar using the already built images for this tutorial:

    First, introduce some wait time in recommendation-v2 by making it a slow performer with a 3-second delay. Run the command:

        oc patch deployment recommendation-v2 -p '{"spec":{"template":{"spec":{"containers":[{"name":"recommendation", "image":""}]}}}}' -n tutorial

    Hit the customer endpoint a few times to see the load-balancing between v1 and v2, with v2 taking a bit of time to respond.

      With the timeout virtual service (istiofiles/virtual-service-recommendation-timeout.yml) in place, you will see it return v1 after waiting about 1 second. You no longer see v2, because its response arrives after the timeout period has expired and is never returned.

          ./scripts/ http://istio-ingressgateway-istio-system.$(minishift ip)
          customer => preference => recommendation v1 from '6976858b48-cs2rt': 2907
          customer => preference => recommendation v1 from '6976858b48-cs2rt': 2908
          customer => preference => recommendation v1 from '6976858b48-cs2rt': 2909

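
For orientation, a timeout of this kind is expressed with the `timeout` field of a VirtualService HTTP route. The sketch below is an assumption about what such a resource can look like, not a copy of the tutorial's istiofiles/virtual-service-recommendation-timeout.yml; the 1s value mirrors the "about 1 second" wait described above, and the scratch file name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: write a VirtualService with a request timeout to a scratch file.
# This is not the tutorial's own file; name and layout are assumptions.
out="${TMPDIR:-/tmp}/virtual-service-recommendation-timeout-sketch.yml"

cat > "$out" <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: recommendation
spec:
  hosts:
  - recommendation
  http:
  - route:
    - destination:
        host: recommendation
    timeout: 1s   # give up on any recommendation call that takes longer than 1 second
EOF

echo "wrote $out"
# To try it on a cluster (not executed by this sketch):
#   kubectl apply -f "$out" -n tutorial
```

With v2 delayed by 3 seconds, every call routed to it exceeds the 1-second budget, which is why only v1 responses reach the client.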
      You will deploy docker images that were previously built. If you want to build recommendation with Quarkus to remove the timeout visit:

      Change the implementation of v2 back to the image that responds without the delay of 3 seconds:

          oc patch deployment recommendation-v2 -p '{"spec":{"template":{"spec":{"containers":[{"name":"recommendation", "image":""}]}}}}' -n tutorial

      Then delete the virtual service created for timeout by:

          kubectl delete -f istiofiles/virtual-service-recommendation-timeout.yml -n tutorial

      or you can run:

          ./scripts/ tutorial

      Let’s perform a load test in our system with siege. We’ll have 4 concurrent clients, each sending 10 requests:

          siege -r 10 -c 4 -v http://istio-ingressgateway-istio-system.$(minishift ip)

      You should see an output similar to this:

      [Figure: siege output with all successful requests]

      All of the requests to our system were successful.

      Now let’s make things a bit more interesting.

      We will make pod recommendation-v2 fail 100% of the time. The following command looks up the name of one recommendation-v2 pod in your system and opens a shell in it:

          oc exec -it -n tutorial $(oc get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation -- /bin/bash
          or
          kubectl exec -it -n tutorial $(kubectl get pods -n tutorial|grep recommendation-v2|awk '{ print $1 }'|head -1) -c recommendation -- /bin/bash

      You will be inside the application container of your pod recommendation-v2-2036617847-spdrb. Now execute:

          curl localhost:8080/misbehave
          exit

      Open a new terminal window to inspect the logs of this failing pod.

      First you need the pod name:

          oc get pods -n tutorial
          or
          kubectl get pods -n tutorial

          customer-3600192384-fpljb 2/2 Running 0 17m
          preference-243057078-8c5hz 2/2 Running 0 15m
          recommendation-v1-60483540-9snd9 2/2 Running 0 12m
          recommendation-v2-2815683430-vpx4p 2/2 Running 0 15s

      Take the pod name of recommendation-v2. In the output above, it is recommendation-v2-2815683430-vpx4p.
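
Instead of copying the name by hand, the lookup can be scripted with a label selector. This is a minimal sketch: `first_v2_pod` is a hypothetical helper name, and it assumes the pods carry the labels app=recommendation and version=v2, which are the labels used by the delete command at the end of this section.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first recommendation v2 pod.
# Assumes the pods are labelled app=recommendation,version=v2.
first_v2_pod() {
  ns="${1:-tutorial}"   # namespace defaults to "tutorial"
  kubectl get pods -n "$ns" -l app=recommendation,version=v2 \
    -o jsonpath='{.items[0].metadata.name}'
}

# Example use against a cluster (not executed by this sketch):
#   kubectl logs "$(first_v2_pod)" -c recommendation -n tutorial
```

When several v2 pods are running, this returns only the first one listed, which may or may not be the misbehaving instance, so check the logs of each if in doubt.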

      Then check its log:

          oc logs recommendation-v2-2815683430-vpx4p -c recommendation -n tutorial
          or
          kubectl logs recommendation-v2-2815683430-vpx4p -c recommendation -n tutorial

          recommendation request from '99634814-sf4cl': 10
          recommendation request from '99634814-sf4cl': 12

      Now you’ve got one instance of recommendation-v2 that is misbehaving and another one that is working correctly. Let’s redirect all traffic to recommendation-v2:

          kubectl create -f -n tutorial
          kubectl create -f istiofiles/virtual-service-recommendation-v2.yml -n tutorial

      Let’s perform a load test in our system with siege. We’ll have 4 concurrent clients, each sending 10 requests:

          siege -r 10 -c 4 -v

      You should see an output similar to this:

      All of the requests to our system were successful.

      So the automatic retries are working as expected. So far so good: the error is never sent back to the client. But inspect the logs of the failing pod again, substituting your own pod name:

          oc logs recommendation-v2-2815683430-vpx4p -c recommendation -n tutorial
          or
          kubectl logs recommendation-v2-2815683430-vpx4p -c recommendation -n tutorial

          recommendation request from '99634814-sf4cl': 35
          recommendation request from '99634814-sf4cl': 36
          recommendation request from '99634814-sf4cl': 37
          recommendation request from '99634814-sf4cl': 38

      Notice that the request counter has grown by more than 20 (from 12 to 38 in the output above). The reason is that requests still reach the failing service: even though every request to the failing pod fails, Istio keeps sending traffic to it.
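
To compare the counter before and after a load test without reading the log by eye, the last counter value can be extracted from the log lines. This is a small sketch: `last_request_count` and `request_delta` are hypothetical helper names, and the parsing assumes the log format shown above (`recommendation request from '<id>': <n>`).

```shell
#!/bin/sh
# Hypothetical helpers for comparing log counters around a load test.

# Read log lines on stdin and print the counter of the last matching line.
# Prints nothing if no line matches the expected format.
last_request_count() {
  awk -F': ' '/^recommendation request from/ { n = $NF } END { if (n != "") print n }'
}

# Print how many requests were logged between two counter readings.
request_delta() {
  echo $(( $2 - $1 ))
}

# Example use against a cluster (not executed by this sketch):
#   before=$(kubectl logs <pod> -c recommendation -n tutorial | last_request_count)
#   ... run the load test ...
#   after=$(kubectl logs <pod> -c recommendation -n tutorial | last_request_count)
#   request_delta "$before" "$after"
```

For the sample output in this section, the readings 12 and 38 give a difference of 26 requests that reached the failing pod.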

      This is where the circuit breaker comes into play.

      Circuit breaker and pool ejection are used to avoid reaching a failing pod for a specified amount of time. When a number of consecutive errors occur, the failing pod is ejected from the pool of eligible pods, and all further requests are sent to a healthy instance instead.
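
In Istio, pool ejection is configured through the `outlierDetection` block of a DestinationRule. The sketch below is an assumption about what such a policy can look like and is not a copy of the tutorial's destination-rule-recommendation_cb_policy_version_v2.yml: the subset name, the scratch file name and all values are illustrative, while `consecutive5xxErrors`, `interval`, `baseEjectionTime` and `maxEjectionPercent` are fields of Istio's outlier detection settings.

```shell
#!/bin/sh
# Sketch only: write a DestinationRule with outlier detection to a scratch file.
# Not the tutorial's own file; subset name and values are assumptions.
out="${TMPDIR:-/tmp}/destination-rule-recommendation-cb-sketch.yml"

cat > "$out" <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: recommendation
spec:
  host: recommendation
  subsets:
  - name: version-v2
    labels:
      version: v2
    trafficPolicy:
      outlierDetection:
        consecutive5xxErrors: 1   # eject after one 5xx response (older Istio releases use consecutiveErrors)
        interval: 1s              # how often hosts are checked for ejection
        baseEjectionTime: 2m      # minimum time an ejected pod stays out of the pool
        maxEjectionPercent: 100   # allow every failing pod to be ejected
EOF

echo "wrote $out"
# To try it on a cluster (not executed by this sketch):
#   kubectl apply -f "$out" -n tutorial
```

A low error threshold and a long ejection time suit a demo where one pod fails every request; a production policy would usually tolerate a few errors before ejecting.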

      Apply the circuit breaker policy, then repeat the load test:

          kubectl replace -f istiofiles/destination-rule-recommendation_cb_policy_version_v2.yml -n tutorial
          siege -r 10 -c 4 -v

      You should see an output similar to this:

      [Figure: siege output with all successful requests]

      All of the requests to our system were successful.

      But now inspect the logs of the failing pod again:

          oc logs recommendation-v2-2815683430-vpx4p -c recommendation -n tutorial
          or
          kubectl logs recommendation-v2-2815683430-vpx4p -c recommendation -n tutorial

          recommendation request from '99634814-sf4cl': 38
          recommendation request from '99634814-sf4cl': 39
          recommendation request from '99634814-sf4cl': 40

      Now requests are sent to this pod only once or twice before the circuit is tripped and the pod is ejected. After this, no further requests are sent to the failing pod.

      Remove Istio resources:

          kubectl delete -f istiofiles/destination-rule-recommendation_cb_policy_version_v2.yml -n tutorial
          kubectl delete -f -n tutorial

          oc scale deployment recommendation-v2 --replicas=1 -n tutorial
          or
          kubectl scale deployment recommendation-v2 --replicas=1 -n tutorial

      Restart recommendation-v2 pod:

          oc delete pod -l app=recommendation,version=v2
          or