Docker Swarm

    Docker Swarm service discovery contains 3 different roles: nodes, services, and tasks.

    The first role, nodes, represents the hosts that are part of the Swarm. It can be used to automatically monitor the Docker daemons or the Node Exporters that run on the Swarm hosts.

    The second role, tasks, represents any individual container deployed in the swarm. Each task gets its associated service labels. One service can be backed by one or multiple tasks.

    The third role, services, discovers the services deployed in the swarm, along with the ports they expose. Usually you will want to use the tasks role instead of this one.

    Prometheus will only discover tasks and services that expose ports.
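
    As a rough sketch of how a role is selected, each entry under dockerswarm_sd_configs takes a role field set to one of the three values above. The job names here are hypothetical, and the socket path is the one used later in this guide.

    ```yaml
    scrape_configs:
      # Hypothetical job names; one job per discovery role.
      - job_name: 'swarm-nodes'
        dockerswarm_sd_configs:
          - host: unix:///var/run/docker.sock
            role: nodes      # one target per Swarm host
      - job_name: 'swarm-tasks'
        dockerswarm_sd_configs:
          - host: unix:///var/run/docker.sock
            role: tasks      # targets for individual containers
      - job_name: 'swarm-services'
        dockerswarm_sd_configs:
          - host: unix:///var/run/docker.sock
            role: services   # targets for service ports
    ```

    In practice a single job with the tasks role is often enough, as noted above.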

    NOTE: The rest of this post assumes that you have a Swarm running.

    Setting up Prometheus

    For this guide, you need to set up Prometheus. We will assume that Prometheus runs on a Docker Swarm manager node and has access to the Docker socket at .

    Let’s dive into the service discovery itself.

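    You can enable the Docker daemon's metrics endpoint by editing /etc/docker/daemon.json and setting the metrics-addr property, for example to 0.0.0.0:9323 (older Docker releases also need experimental set to true).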

    Instead of 0.0.0.0, you can set the IP of the Docker Swarm node.

    A restart of the daemon is required to take the new configuration into account.

    The Docker documentation contains more info about this.

    Then, you can configure Prometheus to scrape the Docker daemon by providing the following prometheus.yml file:

        scrape_configs:
          # Make Prometheus scrape itself for metrics.
          - job_name: 'prometheus'
            static_configs:
              - targets: ['localhost:9090']

          # Create a job for Docker daemons (illustrative job name).
          - job_name: 'docker'
            dockerswarm_sd_configs:
              - host: unix:///var/run/docker.sock
                role: nodes
            relabel_configs:
              # Fetch metrics on port 9323.
              - source_labels: [__meta_dockerswarm_node_address]
                target_label: __address__
                replacement: $1:9323
              # Set hostname as instance label.
              - source_labels: [__meta_dockerswarm_node_hostname]
                target_label: instance

    For the nodes role, you can also use the port parameter of dockerswarm_sd_configs. However, using relabel_configs is recommended as it enables Prometheus to reuse the same API calls across identical Docker Swarm configurations.
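
    For comparison, here is a sketch of the port-based variant mentioned above. It assumes the daemon metrics listen on port 9323, as elsewhere in this guide; the job name is hypothetical.

    ```yaml
    scrape_configs:
      - job_name: 'docker'   # hypothetical job name
        dockerswarm_sd_configs:
          - host: unix:///var/run/docker.sock
            role: nodes
            port: 9323       # port used to build each node's scrape address
        relabel_configs:
          # Set hostname as instance label.
          - source_labels: [__meta_dockerswarm_node_hostname]
            target_label: instance
    ```

    With this form the port becomes part of the discovery configuration itself, so two jobs that differ only in port no longer have identical dockerswarm_sd_configs entries.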

    Monitoring Containers

    Let’s now deploy a service in our Swarm. We will deploy cadvisor, which exposes container resource metrics:

        docker service create --name cadvisor -l prometheus-job=cadvisor \
            --mode=global --publish target=8080,mode=host \
            --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock,ro \
            --mount type=bind,src=/,dst=/rootfs,ro \
            --mount type=bind,src=/var/run,dst=/var/run \
            --mount type=bind,src=/sys,dst=/sys,ro \
            google/cadvisor -docker_only

    This is a minimal prometheus.yml file to monitor it:
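
    A sketch of what such a file could look like, assembled from the relabel rules discussed in the rest of this section. The job name dockerswarm matches the default job label mentioned further down; treat the exact layout as an assumption rather than the canonical file.

    ```yaml
    scrape_configs:
      # Make Prometheus scrape itself for metrics.
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

      # Create a job for Docker Swarm containers.
      - job_name: 'dockerswarm'
        dockerswarm_sd_configs:
          - host: unix:///var/run/docker.sock
            role: tasks
        relabel_configs:
          # Only keep containers that should be running.
          - source_labels: [__meta_dockerswarm_task_desired_state]
            regex: running
            action: keep
          # Only keep containers that have a prometheus-job label.
          - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
            regex: .+
            action: keep
          # Use the prometheus-job Swarm label as the Prometheus job label.
          - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
            target_label: job
    ```

    Each relabel rule is examined in turn below.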

        - source_labels: [__meta_dockerswarm_task_desired_state]
          regex: running
          action: keep

    Docker Swarm exposes the desired state of the tasks over the API. In our example, we only keep the targets that should be running. This prevents monitoring tasks that should be shut down.

        - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
          regex: .+
          action: keep

    When we deployed cadvisor, we added the label prometheus-job=cadvisor. As Prometheus fetches the task labels, we can instruct it to keep only the targets that have a prometheus-job label.

    The last part of the relabel configuration takes the prometheus-job label of the task and turns it into a target label, overwriting the default dockerswarm job label that comes from the scrape config.
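
    The rule this paragraph describes can be sketched as follows, assuming the same service label as in the keep rule above:

    ```yaml
    # Copy the prometheus-job service label into the job label of the target.
    - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
      target_label: job
    ```

    No action is given because replace is the default relabel action.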

    The Prometheus documentation contains the full list of labels, but here are other relabel configs that you might find useful.

    To scrape tasks only through a given network (here, ingress):

        - source_labels: [__meta_dockerswarm_network_name]
          regex: ingress
          action: keep

    Global tasks run on every daemon. To keep only global tasks that publish their port in host mode:

        - source_labels: [__meta_dockerswarm_service_mode]
          regex: global
          action: keep
        - source_labels: [__meta_dockerswarm_task_port_publish_mode]
          regex: host
          action: keep

    Connecting to the Docker Swarm

    The dockerswarm_sd_configs entries above have a host field:

        host: unix:///var/run/docker.sock

    This uses the Docker socket. Prometheus can also connect to Swarm over HTTP or HTTPS, if you prefer that over the unix socket.
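
    As an illustration only, the host field could point at a TCP endpoint instead. The hostname swarm-manager.example.com and the certificate paths are hypothetical, and the sketch assumes the Docker API has been exposed on the conventional ports 2375 (plain HTTP) and 2376 (TLS). The two entries are alternatives shown side by side; a real configuration would use only one of them.

    ```yaml
    dockerswarm_sd_configs:
      # Alternative 1: plain HTTP (hypothetical hostname; no encryption).
      - host: http://swarm-manager.example.com:2375
        role: tasks
      # Alternative 2: HTTPS with client certificates
      # (hypothetical hostname and file paths).
      - host: https://swarm-manager.example.com:2376
        role: tasks
        tls_config:
          ca_file: /etc/prometheus/docker/ca.pem
          cert_file: /etc/prometheus/docker/cert.pem
          key_file: /etc/prometheus/docker/key.pem
    ```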

    There are many discovery labels you can play with to better determine which targets to monitor and how; for the tasks role alone, there are more than 25 labels available. Don’t hesitate to look at the “Service Discovery” page of your Prometheus server (under the “Status” menu) to see all the discovered labels.