Associating secondary interface metrics with network attachments

    Exposed metrics contain the interface but do not specify where the interface originates. This is workable when there are no additional interfaces, but if a secondary interface is added, it is difficult to make use of the metrics since it is hard to identify the interfaces using only the interface name as an identifier.

    When adding secondary interfaces, their names depend on the order in which they are added, and different secondary interfaces might belong to different networks and can be used for different purposes.

    With pod_network_name_info it is possible to extend the current metrics with additional information that identifies the interface type. This makes it possible to aggregate the metrics and to add specific alarms for specific interface types.

    The network type is generated from the name of the related NetworkAttachmentDefinition, which in turn is used to differentiate classes of secondary networks. For example, interfaces that belong to different networks or that use different CNIs use different network attachment definition names.

    The Network Metrics Daemon is a daemon component that collects and publishes network-related metrics.

    The kubelet already publishes network-related metrics that you can observe. These metrics include:

    • container_network_receive_bytes_total

    • container_network_receive_packets_total

    • container_network_transmit_bytes_total

    • container_network_transmit_errors_total

    • container_network_transmit_packets_dropped_total

    The labels in these metrics contain, among others:

    • Pod namespace

    These metrics work well until new interfaces are added to the pod, for example via Multus, as it is not clear what the interface names refer to.

    The interface label refers to the interface name, but it is not clear what that interface is meant for. With many different interfaces, it becomes impossible to tell which network the metrics you are monitoring refer to.

    This is addressed by introducing the new pod_network_name_info metric described in the following section.

    Metrics with network name

    This daemon set publishes a pod_network_name_info gauge metric with a fixed value of 0.

    The network name label is produced using the annotation added by Multus. It is the concatenation of the namespace the network attachment definition belongs to, plus the name of the network attachment definition.
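
    As an illustration, a selector along the following lines would match one such series. The label names namespace, pod, interface, and network_name are the ones the join queries in this section rely on. Every label value shown is a hypothetical placeholder, and the "/" separator between the two parts of network_name is an assumption rather than something stated above.

```promql
# Hypothetical label values; only the label names are taken from this section.
# The "/" separator inside network_name is assumed.
pod_network_name_info{
  namespace="example-namespace",
  pod="example-pod",
  interface="net1",
  network_name="example-nad-namespace/example-nad"
}
```

    If such a series exists, the selector returns it with the fixed value 0; only its labels carry information.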

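    Using PromQL queries like the following, it is possible to get a new metric containing both the original value and the network name retrieved from the annotation. Because pod_network_name_info always has the value 0, adding it leaves the original value unchanged while group_left copies the network_name label onto the result: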

    1. (container_network_receive_bytes_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
    2. (container_network_receive_errors_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
    3. (container_network_receive_packets_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
    4. (container_network_receive_packets_dropped_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
    5. (container_network_transmit_bytes_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
    6. (container_network_transmit_errors_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
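
    Building on the joins above, the enriched series can be aggregated per network, which is one way to approach the aggregation and alarm use case described earlier. The following is a sketch, not a recommended rule: the 5m rate window and the "> 0" threshold are illustrative assumptions to tune for your environment.

```promql
# Receive throughput in bytes per second, summed per network attachment.
# Adding the constant 0 leaves the rate unchanged and copies network_name across.
sum by (network_name) (
  rate(container_network_receive_bytes_total[5m])
  + on(namespace,pod,interface) group_left(network_name)
  pod_network_name_info
)

# Possible alert expression: networks whose interfaces report transmit errors.
sum by (network_name) (
  rate(container_network_transmit_errors_total[5m])
  + on(namespace,pod,interface) group_left(network_name)
  pod_network_name_info
) > 0
```

    Series that have no matching pod_network_name_info entry are dropped by the join, so these results cover only interfaces for which a network name was published.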