ETCD Administration



    To destroy an etcd cluster, use the etcd_purge subtask of etcd.yml:

    ./etcd.yml -t etcd_purge                         # remove the entire cluster
    ./etcd.yml -t etcd_purge -e etcd_safeguard=false # force the purge with the safeguard disabled

    !> THINK BEFORE YOU TYPE! A PURGE MAY LEAD TO THE DEMOTION OF ALL PGSQL PRIMARIES!
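Because a purge is this destructive, one option is to put a confirmation prompt in front of it. The sketch below is only an illustration: `confirm_purge` is a hypothetical helper, not part of the etcd.yml playbook, and it assumes the operator confirms by retyping the cluster name.

```shell
# Hypothetical helper (not part of Pigsty): ask the operator to retype the
# cluster name before a destructive purge. Returns 0 only on an exact match.
confirm_purge() {
    cluster="$1"
    printf 'Type the cluster name "%s" to confirm the purge: ' "$cluster" >&2
    IFS= read -r answer || return 1
    [ "$answer" = "$cluster" ]
}

# Usage (commented out so the sketch has no side effects):
# confirm_purge etcd && ./etcd.yml -t etcd_purge
```

The prompt goes to stderr, so the helper can still be used when stdout is piped or captured.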


    If the etcd cluster membership changes, the etcd endpoint references must be refreshed in the following places:

    • config file of existing etcd members
    • etcdctl client environment variables
    • patroni dcs endpoint config
    • vip-manager dcs endpoint config

    To refresh etcd config file /etc/etcd/etcd.conf on existing members:

    ./etcd.yml -t etcd_conf                           # refresh /etc/etcd/etcd.conf with latest status
    ansible etcd -f 1 -b -a 'systemctl restart etcd'  # optional: restart etcd

    To refresh the etcdctl client environment variables:

    ./etcd.yml -t etcd_env # refresh /etc/profile.d/etcdctl.sh
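To illustrate the kind of value that changes with membership, the sketch below derives a comma-separated client endpoint list from member IPs. It assumes https on client port 2379 (as in the member listings in this document) and the standard `ETCDCTL_ENDPOINTS` variable read by etcdctl. `etcd_endpoints` is a hypothetical helper, and the actual /etc/profile.d/etcdctl.sh generated by the playbook may look different.

```shell
# Hypothetical helper: join member IPs into an etcdctl endpoint list.
# Assumes https and client port 2379; adjust if your cluster differs.
etcd_endpoints() {
    result=""
    for ip in "$@"; do
        result="${result:+$result,}https://${ip}:2379"
    done
    printf '%s\n' "$result"
}

# Example: export the endpoint list for a two-member cluster.
ETCDCTL_ENDPOINTS="$(etcd_endpoints 10.10.10.10 10.10.10.11)"
export ETCDCTL_ENDPOINTS
echo "$ETCDCTL_ENDPOINTS"
```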

    To update etcd endpoints reference on patroni:

    To update the etcd endpoint reference on vip-manager (optional, only needed if you are using an L2 VIP):

    ./pgsql.yml -t pg_vip_config                               # regenerate vip-manager config
    ansible all -f 1 -b -a 'systemctl restart vip-manager'     # restart vip-manager to use new config
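The refresh commands in this section can be chained into one script. This is a sketch rather than an official tool: `run`, `refresh_etcd_refs`, and the `DRY_RUN` switch are hypothetical names. It only uses commands that appear in this document, so the patroni step and the optional etcd restart are left out. With `DRY_RUN=1` it prints the commands instead of executing them.

```shell
# Hypothetical wrapper: run a command, or just print it when DRY_RUN=1.
# Note: printing flattens the arguments, so quoting is not preserved.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Refresh etcd endpoint references after a membership change.
# Stops at the first failing step.
refresh_etcd_refs() {
    run ./etcd.yml -t etcd_conf || return 1       # config file on etcd members
    run ./etcd.yml -t etcd_env || return 1        # etcdctl environment variables
    run ./pgsql.yml -t pg_vip_config || return 1  # vip-manager config
    run ansible all -f 1 -b -a 'systemctl restart vip-manager'
}
```

Running `DRY_RUN=1 refresh_etcd_refs` first is a cheap way to review what would be executed.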

    You can add new members to an existing etcd cluster in 4 steps:

    1. update the inventory group etcd with the new instance
    2. init the new member with etcd_init=existing, so that it joins the existing cluster rather than creating a new one (VERY IMPORTANT)
    3. promote the new member from learner to follower
    4. update the etcd endpoint references as described above

    Short Version

    etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380
    ./etcd.yml -l <new_ins_ip> -e etcd_init=existing
    etcdctl member promote <new_ins_server_id>

    Let’s start with a single etcd instance and add a second one.

    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
        10.10.10.11: { etcd_seq: 2 } # <--- add this new member definition to inventory
      vars: { etcd_cluster: etcd }

    Add etcd-2 to the cluster as a learner, then init, launch, and promote it. Start by announcing the new member:

    # tell the existing cluster that a new member etcd-2 is coming
    $ etcdctl member add etcd-2 --learner=true --peer-urls=https://10.10.10.11:2380
    Member 33631ba6ced84cf8 added to cluster 6646fbcf5debc68f
    ETCD_NAME="etcd-2"
    ETCD_INITIAL_CLUSTER="etcd-2=https://10.10.10.11:2380,etcd-1=https://10.10.10.10:2380"
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.10.11:2380"
    ETCD_INITIAL_CLUSTER_STATE="existing"
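The member ID printed by `etcdctl member add` is needed later for the promote step. As a small sketch, the ID can be extracted from the first output line with sed. `added_member_id` is a hypothetical helper and assumes the output format shown above.

```shell
# Hypothetical helper: read `etcdctl member add` output on stdin and print
# the new member's ID, taken from the "Member <id> added to cluster" line.
added_member_id() {
    sed -n 's/^Member \([0-9a-f]\{1,\}\) added to cluster .*/\1/p'
}

# Example using the output line shown above:
echo 'Member 33631ba6ced84cf8 added to cluster 6646fbcf5debc68f' | added_member_id
```

In practice the output of the real command would be piped into the helper and the result kept in a variable for the later `etcdctl member promote` call.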

    Check the member list with etcdctl member list (or em list); it now shows an unstarted member:

    Init the new etcd instance etcd-2 with the etcd.yml playbook; once it finishes, the new member is started:

    $ ./etcd.yml -l 10.10.10.11 -e etcd_init=existing
    ...

    Promote the new member from learner to follower:

    $ etcdctl member promote 33631ba6ced84cf8 # promote the new learner
    Member 33631ba6ced84cf8 promoted in cluster 6646fbcf5debc68f
    $ em list # check again, the new member is started
    33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
    429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
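To confirm a promotion from a script, the last column of the `member list` output (the learner flag) can be checked. A minimal sketch, assuming the comma-separated format shown above; `is_learner` is a hypothetical helper.

```shell
# Hypothetical helper: read `etcdctl member list` (simple format) on stdin
# and succeed only if the member with the given name is still a learner.
# Columns: ID, status, name, peer URLs, client URLs, is-learner.
is_learner() {
    awk -F', *' -v name="$1" '
        $3 == name { found = 1; learner = ($6 == "true") }
        END { exit (found && learner) ? 0 : 1 }
    '
}

# Usage (commented out; needs a live cluster):
# etcdctl member list | is_learner etcd-2 || echo "etcd-2 is a voting member"
```

The helper also returns non-zero when the name is not found, so a successful promotion and a missing member are not distinguished here.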

    Repeat the steps above to add more members. Remember to use at least 3 members in production.
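The three-member recommendation follows from majority arithmetic in Raft, the consensus protocol etcd uses: a cluster of n voting members needs floor(n/2)+1 of them for a quorum, so it tolerates floor((n-1)/2) failures. The snippet below simply works through the numbers; `quorum` and `fault_tolerance` are illustrative helper names.

```shell
# Majority needed for a cluster of n voting members.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a majority remains.
fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum "$n") tolerates=$(fault_tolerance "$n")"
done
```

Clusters of 1 or 2 members tolerate no failure at all, and an even member count adds no tolerance over the odd count just below it.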


    To remove a member, you have to remove it from the inventory first:

    1. remove it from the inventory and reload the config
    2. remove it with the etcdctl member remove <server_id> command
    3. add it back to the inventory and purge that instance, then remove it from the inventory permanently

    To refresh the config, comment out the member you want to remove in the inventory, then reload the config as if the member were already removed.

    etcd:
      hosts:
        10.10.10.10: { etcd_seq: 1 }
        10.10.10.11: { etcd_seq: 2 }
        10.10.10.12: { etcd_seq: 3 } # <---- comment / uncomment this line
      vars: { etcd_cluster: etcd }

    Then remove it from the cluster with the etcdctl member remove command:

    $ etcdctl member list
    429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
    33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
    93fcf23b220473fb, started, etcd-3, https://10.10.10.12:2380, https://10.10.10.12:2379, false # <--- remove this
    $ etcdctl member remove 93fcf23b220473fb # kick it from cluster
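Copying the member ID by hand is error-prone, so the lookup can be scripted. The sketch below resolves a member name to its ID from `member list` output, assuming the comma-separated format shown above; `member_id_by_name` is a hypothetical helper.

```shell
# Hypothetical helper: read `etcdctl member list` (simple format) on stdin
# and print the ID of the member with the given name (empty if not found).
member_id_by_name() {
    awk -F', *' -v name="$1" '$3 == name { print $1 }'
}

# Usage (commented out; needs a live cluster):
# id="$(etcdctl member list | member_id_by_name etcd-3)"
# [ -n "$id" ] && etcdctl member remove "$id"
```

Checking that the ID is non-empty before calling `etcdctl member remove` guards against a mistyped member name.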

    Finally, shut down the instance and purge it from the node. To do so, uncomment the member in the inventory temporarily, then purge it with the etcd.yml playbook:

    After that, remove the member from the inventory permanently. All clear!

    Last modified 2023-02-27