Large deployments of K8s

    • Tune Ansible settings and timeout vars to fit the large number of nodes being deployed.

    • Override containers’ foo_image_repo vars to point to an intranet registry.

    • Set download_run_once: true and/or download_localhost: true. See download modes for details.

    • Tune parameters for DNS-related applications. Those are dns_replicas, dns_cpu_limit, dns_cpu_requests, and dns_memory_requests. Please note that limits must always be greater than or equal to requests.

    • Tune CPU/memory limits and requests. Those are located in the roles’ defaults and named like foo_memory_limit, foo_memory_requests and foo_cpu_limit, foo_cpu_requests. Note that ‘Mi’ memory units for K8s will be submitted as ‘M’ if applied for docker run, and K8s CPU units will end up with the ‘m’ skipped for docker as well. This is required as docker does not understand K8s units well.

    • Tune kubelet_status_update_frequency to increase the reliability of the kubelet, and tune kube_controller_node_monitor_grace_period and kube_controller_pod_eviction_timeout for better Kubernetes reliability.

    • Add calico-rr nodes if you are deploying with Calico or Canal. Nodes recover from host/network interruption much quicker with calico-rr. Note that the calico-rr role must be on a host without the kube-master or kube-node role (but the etcd role is okay).

    • Check out the Inventory section of the Getting started guide for tips on creating a large-scale Ansible inventory.

    • Set etcd_events_cluster_setup: true to store events in a separate, dedicated etcd instance.
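
The overrides listed above could be collected in a single vars file. The following is a minimal sketch, assuming the variables are set in the inventory's group_vars. Every value shown is an illustrative placeholder rather than a recommendation, "foo" stands in for a real component name as in the list above, and registry.example.internal is a hypothetical host.

```yaml
# Sketch of large-deployment overrides (assumed to live in group_vars).
# All values are placeholders; tune them for your own cluster.

# Pull images from an intranet registry.
# "foo" is a stand-in component name; the registry host is hypothetical.
foo_image_repo: "registry.example.internal/foo"

# Download behaviour (see download modes for what each flag does).
download_run_once: true
download_localhost: true

# DNS applications. Limits must be greater than or equal to requests.
dns_replicas: 3
dns_cpu_requests: 100m
dns_cpu_limit: 300m
dns_memory_requests: 70Mi

# Per-role resources follow the foo_* naming pattern from the roles' defaults.
foo_cpu_requests: 100m
foo_cpu_limit: 500m
foo_memory_requests: 256Mi
foo_memory_limit: 512Mi

# Node status and eviction timing.
kubelet_status_update_frequency: 10s
kube_controller_node_monitor_grace_period: 40s
kube_controller_pod_eviction_timeout: 5m0s

# Store events in a separate, dedicated etcd instance.
etcd_events_cluster_setup: true
```

Keeping each limit at or above its matching request, as in the pairs above, satisfies the constraint noted for the DNS and role resource settings.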