Monitoring ArangoDB Cluster network usage

    Solution

    As we already run collectd as our metric hub, we want to use it to give us these figures as well. A very cheap way to generate these values is to read the counters in the iptables firewall of our system.

    For this recipe you need to install the following tools:

    Now we need to find out the current configuration of our cluster. For the time being we assume you simply issued

    to get you set up. So you know you’ve got two DB-Servers, one Coordinator, and one Agent:

    arangod 21406 1 1 16:59 pts/14 00:00:00 bin/etcd-arango --data-dir /var/tmp/tmp-21550-1347489353/shell_server/agentarango4001 --name agentarango4001 --bind-addr 127.0.0.1:4001 --addr 127.0.0.1:4001 --peer-bind-addr 127.0.0.1:7001 --peer-addr 127.0.0.1:7001 --initial-cluster-state new --initial-cluster agentarango4001=http://127.0.0.1:7001
    arangod 21408 1 4 16:56 pts/14 00:00:01 bin/arangod --database.directory cluster/data8629 --cluster.agency-endpoint tcp://localhost:4001 --cluster.my-address tcp://localhost:8629 --server.endpoint tcp://localhost:8629 --log.file cluster/8629.log
    arangod 21410 1 5 16:56 pts/14 00:00:02 bin/arangod --database.directory cluster/data8630 --cluster.agency-endpoint tcp://localhost:4001 --cluster.my-address tcp://localhost:8630 --server.endpoint tcp://localhost:8630 --log.file cluster/8630.log
    arangod 21416 1 5 16:56 pts/14 00:00:02 bin/arangod --database.directory cluster/data8530 --cluster.agency-endpoint tcp://localhost:4001 --cluster.my-address tcp://localhost:8530 --server.endpoint tcp://localhost:8530 --log.file cluster/8530.log

    We can now check which ports they occupy:

    • The Agent has 7001 and 4001. Since it’s running in single-server mode, its cluster port (7001) should not show any traffic; port 4001 is the interesting one.
    • Claus - This is the Coordinator. Your application will talk to it on port 8530.
    • Pavel - This is the first DB-Server; Claus will talk to it on port 8629.
    • Perry - This is the second DB-Server; Claus will talk to it on port 8630.
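
    The port lookup above can also be done mechanically. The following Python sketch pulls the listening ports out of command lines like the ones in the process listing. The sample lines are shortened copies of that listing, and the helper name listening_ports is made up for this illustration; it only understands the three options named in the regular expression.

```python
import re

# Shortened copies of the command lines from the process listing.
PROCESS_LINES = [
    "bin/etcd-arango --name agentarango4001 --bind-addr 127.0.0.1:4001 "
    "--peer-bind-addr 127.0.0.1:7001",
    "bin/arangod --database.directory cluster/data8629 --server.endpoint tcp://localhost:8629",
    "bin/arangod --database.directory cluster/data8630 --server.endpoint tcp://localhost:8630",
    "bin/arangod --database.directory cluster/data8530 --server.endpoint tcp://localhost:8530",
]

# Options whose value ends in ":<port>".
PORT_OPTION = re.compile(
    r"--(server\.endpoint|bind-addr|peer-bind-addr)\s+\S*?:(\d+)(?=\s|$)"
)


def listening_ports(command_line):
    """Return {option name: port} for the endpoint options in one command line."""
    return {name: int(port) for name, port in PORT_OPTION.findall(command_line)}


if __name__ == "__main__":
    for line in PROCESS_LINES:
        print(line.split()[0], listening_ports(line))
```

    Running it prints one dictionary per process, which makes it easy to compare the ports against the firewall rules configured next.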

    According to the ports we found in the last section, we will configure our firewall in /etc/ferm/ferm.conf, and put the identities into the comments so we have a persistent naming scheme:

    # blindly forward these to the accounting chain:
    @def $ARANGO_RANGE=4000:9000;

    @def &TCP_ACCOUNTING($PORT, $COMMENT, $SRCCHAIN) = {
        @def $FULLCOMMENT=@cat($COMMENT, "_", $SRCCHAIN);
        dport $PORT mod comment comment $FULLCOMMENT NOP;
    }

    @def &ARANGO_ACCOUNTING($CHAINNAME) = {
        # The Coordinators:
        &TCP_ACCOUNTING(8530, "Claus", $CHAINNAME);
        # The DB-Servers:
        &TCP_ACCOUNTING(8629, "Pavel", $CHAINNAME);
        &TCP_ACCOUNTING(8630, "Perry", $CHAINNAME);
        # The Agency:
        &TCP_ACCOUNTING(4001, "etcd_client", $CHAINNAME);
        # it shouldn't talk to itself if it is only running with a single instance:
        # NOTE: the Agent's peer port in the process list above is 7001, while
        # this rule matches 7007; use 7001 to count peer traffic.
        &TCP_ACCOUNTING(7007, "etcd_cluster", $CHAINNAME);
    }

    table filter {
        chain INPUT {
            proto tcp dport $ARANGO_RANGE @subchain "Accounting" {
                &ARANGO_ACCOUNTING("input");
            }
            policy DROP;

            # connection tracking
            mod state state INVALID DROP;

            # allow local packet
            interface lo ACCEPT;

            # respond to ping
            proto icmp ACCEPT;

            # allow IPsec
            proto udp dport 500 ACCEPT;
            proto (esp ah) ACCEPT;

            # allow SSH connections
            proto tcp dport ssh ACCEPT;
        }
        chain OUTPUT {
            policy ACCEPT;
            proto tcp dport $ARANGO_RANGE @subchain "Accounting" {
                &ARANGO_ACCOUNTING("output");
            }

            # connection tracking
            #mod state state INVALID DROP;
            mod state state (ESTABLISHED RELATED) ACCEPT;
        }
        chain FORWARD {
            policy DROP;

            # connection tracking
            mod state state (ESTABLISHED RELATED) ACCEPT;
        }
    }

    Note: This is a very basic configuration, mainly intended to demonstrate the accounting feature, so don’t run it in production.
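
    To show what the TCP_ACCOUNTING macro boils down to, the following Python sketch prints roughly equivalent plain iptables commands. It is an illustration under assumptions, not the output of ferm itself: the names ROLES, accounting_rule and all_rules are hypothetical, and the Accounting chain is assumed to exist already. The key point is that a rule without a -j target leaves the packet alone and only increments its own counters, which is what NOP expresses in the ferm configuration.

```python
import shlex

# Port and comment pairs, copied from the ferm configuration.
ROLES = [
    (8530, "Claus"),
    (8629, "Pavel"),
    (8630, "Perry"),
    (4001, "etcd_client"),
    (7007, "etcd_cluster"),
]


def accounting_rule(port, comment, src_chain):
    """Build an iptables argument list for one counting-only rule."""
    # Mirrors @cat($COMMENT, "_", $SRCCHAIN) in the ferm macro.
    full_comment = f"{comment}_{src_chain}"
    return [
        "iptables", "-A", "Accounting",
        "-p", "tcp", "--dport", str(port),
        "-m", "comment", "--comment", full_comment,
        # No "-j <target>": the rule matches, counts, and falls through.
    ]


def all_rules():
    """Rules for both callers of the Accounting chain, input first."""
    return [
        accounting_rule(port, comment, src_chain)
        for src_chain in ("input", "output")
        for port, comment in ROLES
    ]


if __name__ == "__main__":
    for rule in all_rules():
        print(shlex.join(rule))
```

    The sketch only prints the commands; it does not touch the firewall.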

    After activating it interactively with

    We now use the iptables command line utility directly to review the status of our current settings:

    iptables -L -nvx
    pkts bytes target prot opt in out source destination
    7636 1821798 Accounting tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpts:4000:9000
    0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
    14700 14857709 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
    130 7800 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
    0 0 ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0
    0 0 ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 udp dpt:500
    0 0 ACCEPT esp -- * * 0.0.0.0/0 0.0.0.0/0
    0 0 ACCEPT ah -- * * 0.0.0.0/0 0.0.0.0/0
    0 0 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:22

    Chain FORWARD (policy DROP 0 packets, 0 bytes)
    pkts bytes target prot opt in out source destination
    0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
    0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED

    Chain OUTPUT (policy ACCEPT 296 packets, 19404 bytes)
    pkts bytes target prot opt in out source destination
    7720 1882404 Accounting tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpts:4000:9000
    14575 14884356 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED

    Chain Accounting (2 references)
    pkts bytes target prot opt in out source destination
    204 57750 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8530 /* Claus_input */
    20 17890 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8629 /* Pavel_input */
    262 97352 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8630 /* Perry_input */
    2604 336184 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:4001 /* etcd_client_input */
    0 0 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:7007 /* etcd_cluster_input */
    204 57750 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8530 /* Claus_output */
    20 17890 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8629 /* Pavel_output */
    262 97352 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8630 /* Perry_output */
    2604 336184 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:4001 /* etcd_client_output */
    0 0 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:7007 /* etcd_cluster_output */

    You can clearly see the Accounting sub-chain with our comments, which should be straightforward to match. We also see the pkts and bytes columns; they contain the current values of these counters on your system.
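
    Matching the comments to their counters can be sketched in a few lines of Python. The example below parses rule lines from the Accounting chain as printed by iptables -L -nvx and returns the pkts and bytes values per comment. The sample text is a shortened copy of the listing; the function name parse_accounting and the regular expression are assumptions made for this illustration, and rules without a comment are simply skipped.

```python
import re

# Shortened copy of the Accounting chain from the listing.
SAMPLE = """\
Chain Accounting (2 references)
pkts bytes target prot opt in out source destination
204 57750 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8530 /* Claus_input */
20 17890 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8629 /* Pavel_input */
2604 336184 tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:4001 /* etcd_client_input */
"""

# Two leading numbers (pkts, bytes) and a trailing /* comment */.
RULE = re.compile(r"^\s*(\d+)\s+(\d+)\s.*?/\*\s*(\S+)\s*\*/\s*$")


def parse_accounting(listing):
    """Return {comment: {"pkts": int, "bytes": int}} for commented rules."""
    counters = {}
    for line in listing.splitlines():
        match = RULE.match(line)
        if match:
            pkts, nbytes, comment = match.groups()
            counters[comment] = {"pkts": int(pkts), "bytes": int(nbytes)}
    return counters


if __name__ == "__main__":
    for comment, values in parse_accounting(SAMPLE).items():
        print(comment, values["pkts"], values["bytes"])
```

    Header lines do not start with two numbers followed by a comment, so they are ignored without special handling.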

    Since your system now generates these numbers, we want to configure collectd with its iptables plugin to aggregate them.

    We do so in /etc/collectd/collectd.conf.d/iptables.conf:

    Now we restart collectd and watch the syslog for errors. If everything is OK, our values should show up in:

    /var/lib/collectd/rrd/localhost/iptables-filter-Accounting/ipt_packets-Claus_output.rrd
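
    The file name follows directly from the table, the chain and the rule comment, so the paths for the other rules can be derived the same way. The Python sketch below builds them from the pattern of the path shown above. The helper name rrd_path is hypothetical, and the ipt_bytes variant is an assumption based on collectd's type naming rather than something shown in this recipe, so check your own rrd directory for the exact names.

```python
from pathlib import PurePosixPath

# Root directory taken from the path shown in this recipe.
RRD_ROOT = PurePosixPath("/var/lib/collectd/rrd")

# Rule comments as they appear in the Accounting chain.
COMMENTS = [
    f"{name}_{chain}"
    for chain in ("input", "output")
    for name in ("Claus", "Pavel", "Perry", "etcd_client", "etcd_cluster")
]


def rrd_path(host, comment, value_type="ipt_packets",
             table="filter", chain="Accounting"):
    """Build the expected RRD file path for one accounting rule."""
    return RRD_ROOT / host / f"iptables-{table}-{chain}" / f"{value_type}-{comment}.rrd"


if __name__ == "__main__":
    for comment in COMMENTS:
        print(rrd_path("localhost", comment))
        # Assumed sibling file holding the byte counter.
        print(rrd_path("localhost", comment, value_type="ipt_bytes"))
```

    Comparing this list against the files that actually exist is a quick way to spot a rule that collectd is not picking up.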

    We can inspect our values with kcollectd: