Circuit Breaking

    Circuit breaking is an important pattern for building resilient microservice applications. It allows you to write applications that limit the impact of failures, latency spikes, and other undesirable effects of network peculiarities.

    In this task, you will configure circuit breaking rules and then test the configuration by intentionally "tripping" the circuit breaker.

    • Follow the installation guide to install Istio.

    • Start the httpbin sample application.

      If you have enabled automatic sidecar injection, deploy the httpbin service with the following command:
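      The deployment command itself appears to have been dropped here; assuming the same sample manifest used in the manual-injection step below, it would presumably be:

      ```shell
      $ kubectl apply -f samples/httpbin/httpbin.yaml
      ```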

      Otherwise, you must manually inject the sidecar before deploying the httpbin application:


      $ kubectl apply -f <(istioctl kube-inject -f samples/httpbin/httpbin.yaml)

    The httpbin application serves as the backend service for this task.

    1. If your Istio installation has mutual TLS authentication enabled, you must add a TLS traffic policy mode: ISTIO_MUTUAL to the DestinationRule before applying it. Otherwise, requests will fail with a 503 error.

      $ kubectl apply -f - <<EOF
      apiVersion: networking.istio.io/v1alpha3
      kind: DestinationRule
      metadata:
        name: httpbin
      spec:
        host: httpbin
        trafficPolicy:
          connectionPool:
            tcp:
              maxConnections: 1
            http:
              http1MaxPendingRequests: 1
              maxRequestsPerConnection: 1
          outlierDetection:
            consecutive5xxErrors: 1
            interval: 1s
            baseEjectionTime: 3m
            maxEjectionPercent: 100
      EOF
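      For reference, when mutual TLS is enabled, the trafficPolicy above would additionally carry a TLS block. A sketch of the relevant fragment (field placement follows the DestinationRule schema):

      ```yaml
      trafficPolicy:
        tls:
          mode: ISTIO_MUTUAL   # required when mutual TLS is enabled, per the note above
        connectionPool:
          tcp:
            maxConnections: 1
      ```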
    2. Verify that the destination rule was created correctly:

      $ kubectl get destinationrule httpbin -o yaml
      apiVersion: networking.istio.io/v1beta1
      kind: DestinationRule
      ...
      spec:
        host: httpbin
        trafficPolicy:
          connectionPool:
            http:
              http1MaxPendingRequests: 1
              maxRequestsPerConnection: 1
            tcp:
              maxConnections: 1
          outlierDetection:
            baseEjectionTime: 3m
            consecutive5xxErrors: 1
            interval: 1s
            maxEjectionPercent: 100

    Create a client to send traffic to the httpbin service. The client is a load-testing tool called Fortio, which lets you control the number of connections, concurrency, and delays of outgoing HTTP requests. You will use this client to "trip" the circuit breaker policies you set in the DestinationRule.

    1. Inject the client with the Istio sidecar proxy so that its network interactions are governed by Istio:

      If you have enabled automatic sidecar injection, you can deploy the fortio application directly:
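      The direct-deployment command is missing here; given the manifest path used in the manual-injection step below, it is presumably:

      ```shell
      $ kubectl apply -f samples/httpbin/sample-client/fortio-deploy.yaml
      ```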


      Otherwise, you must manually inject the sidecar before deploying the fortio application:

      $ kubectl apply -f <(istioctl kube-inject -f samples/httpbin/sample-client/fortio-deploy.yaml)
    2. Log in to the client pod and use the Fortio tool to call the httpbin service. Pass in -curl to indicate that you just want to make one call:

      $ export FORTIO_POD=$(kubectl get pods -l app=fortio -o 'jsonpath={.items[0].metadata.name}')
      $ kubectl exec "$FORTIO_POD" -c fortio -- /usr/bin/fortio curl -quiet http://httpbin:8000/get
      HTTP/1.1 200 OK
      server: envoy
      date: Tue, 25 Feb 2020 20:25:52 GMT
      content-type: application/json
      access-control-allow-origin: *
      x-envoy-upstream-service-time: 36

      {
        "args": {},
        "headers": {
          "Content-Length": "0",
          "Host": "httpbin:8000",
          "User-Agent": "fortio.org/fortio-1.3.1",
          "X-B3-Parentspanid": "8fc453fb1dec2c22",
          "X-B3-Sampled": "1",
          "X-B3-Spanid": "071d7f06bc94943c",
          "X-B3-Traceid": "86a929a0e76cda378fc453fb1dec2c22",
          "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/httpbin;Hash=68bbaedefe01ef4cb99e17358ff63e92d04a4ce831a35ab9a31d3c8e06adb038;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/default"
        },
        "origin": "127.0.0.1",
        "url": "http://httpbin:8000/get"
      }

    In the DestinationRule settings, you specified maxConnections: 1 and http1MaxPendingRequests: 1. These rules mean that if you exceed one concurrent connection and request, the istio-proxy will block the excess requests and connections as it opens them.

    1. Call the service with two concurrent connections (-c 2) and send 20 requests (-n 20):

      $ kubectl exec "$FORTIO_POD" -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning http://httpbin:8000/get
      20:33:46 I logger.go:97> Log level is now 3 Warning (was 2 Info)
      Fortio 1.3.1 running at 0 queries per second, 6->6 procs, for 20 calls: http://httpbin:8000/get
      Starting at max qps with 2 thread(s) [gomax 6] for exactly 20 calls (10 per thread + 0)
      20:33:46 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:33:47 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:33:47 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      Ended after 59.8524ms : 20 calls. qps=334.16
      Aggregated Function Time : count 20 avg 0.0056869 +/- 0.003869 min 0.000499 max 0.0144329 sum 0.113738
      # range, mid point, percentile, count
      >= 0.000499 <= 0.001 , 0.0007495 , 10.00, 2
      > 0.001 <= 0.002 , 0.0015 , 15.00, 1
      > 0.003 <= 0.004 , 0.0035 , 45.00, 6
      > 0.004 <= 0.005 , 0.0045 , 55.00, 2
      > 0.005 <= 0.006 , 0.0055 , 60.00, 1
      > 0.006 <= 0.007 , 0.0065 , 70.00, 2
      > 0.007 <= 0.008 , 0.0075 , 80.00, 2
      > 0.008 <= 0.009 , 0.0085 , 85.00, 1
      > 0.011 <= 0.012 , 0.0115 , 90.00, 1
      > 0.012 <= 0.014 , 0.013 , 95.00, 1
      > 0.014 <= 0.0144329 , 0.0142165 , 100.00, 1
      # target 50% 0.0045
      # target 75% 0.0075
      # target 90% 0.012
      # target 99% 0.0143463
      # target 99.9% 0.0144242
      Sockets used: 4 (for perfect keepalive, would be 2)
      Code 200 : 17 (85.0 %)
      Code 503 : 3 (15.0 %)
      Response Header Sizes : count 20 avg 195.65 +/- 82.19 min 0 max 231 sum 3913
      Response Body/Total Sizes : count 20 avg 729.9 +/- 205.4 min 241 max 817 sum 14598
      All done 20 calls (plus 0 warmup) 5.687 ms avg, 334.2 qps

      Interestingly, almost all of the requests made it through! The istio-proxy does allow for some leeway.

    2. Bring the number of concurrent connections up to 3:

      $ kubectl exec "$FORTIO_POD" -c fortio -- /usr/bin/fortio load -c 3 -qps 0 -n 30 -loglevel Warning http://httpbin:8000/get
      20:32:30 I logger.go:97> Log level is now 3 Warning (was 2 Info)
      Fortio 1.3.1 running at 0 queries per second, 6->6 procs, for 30 calls: http://httpbin:8000/get
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      20:32:30 W http_client.go:679> Parsed non ok code 503 (HTTP/1.1 503)
      Ended after 51.9946ms : 30 calls. qps=576.98
      Aggregated Function Time : count 30 avg 0.0040001633 +/- 0.003447 min 0.0004298 max 0.015943 sum 0.1200049
      # range, mid point, percentile, count
      >= 0.0004298 <= 0.001 , 0.0007149 , 16.67, 5
      > 0.001 <= 0.002 , 0.0015 , 36.67, 6
      > 0.002 <= 0.003 , 0.0025 , 50.00, 4
      > 0.003 <= 0.004 , 0.0035 , 60.00, 3
      > 0.004 <= 0.005 , 0.0045 , 66.67, 2
      > 0.005 <= 0.006 , 0.0055 , 76.67, 3
      > 0.006 <= 0.007 , 0.0065 , 83.33, 2
      > 0.007 <= 0.008 , 0.0075 , 86.67, 1
      > 0.008 <= 0.009 , 0.0085 , 90.00, 1
      > 0.009 <= 0.01 , 0.0095 , 96.67, 2
      > 0.014 <= 0.015943 , 0.0149715 , 100.00, 1
      # target 50% 0.003
      # target 75% 0.00583333
      # target 90% 0.009
      # target 99% 0.0153601
      # target 99.9% 0.0158847
      Sockets used: 20 (for perfect keepalive, would be 3)
      Code 200 : 11 (36.7 %)
      Code 503 : 19 (63.3 %)
      Response Header Sizes : count 30 avg 84.366667 +/- 110.9 min 0 max 231 sum 2531
      Response Body/Total Sizes : count 30 avg 451.86667 +/- 277.1 min 241 max 817 sum 13556
      All done 30 calls (plus 0 warmup) 4.000 ms avg, 577.0 qps

      Now you start to see the expected circuit breaking behavior. Only 36.7% of the requests succeeded, and the rest were trapped by circuit breaking:

      Code 200 : 11 (36.7 %)
      Code 503 : 19 (63.3 %)
    3. Query the istio-proxy stats to see more circuit breaking detail:

      $ kubectl exec "$FORTIO_POD" -c istio-proxy -- pilot-agent request GET stats | grep httpbin | grep pending
      cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.default.remaining_pending: 1
      cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.default.rq_pending_open: 0
      cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.high.rq_pending_open: 0
      cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_active: 0
      cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
      cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 21
      cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 29

      You can see a value of 21 for upstream_rq_pending_overflow, which means 21 calls so far have been flagged for circuit breaking.

    1. Remove the rules:
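      The cleanup command itself is absent from this step; since the destination rule created earlier in this task is named httpbin, removing it would presumably be:

      ```shell
      $ kubectl delete destinationrule httpbin
      ```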

    2. Shut down the httpbin service and the client:

      $ kubectl delete deploy httpbin fortio-deploy