Connect Custom Proxy Integration

    You can extend any proxy to support Connect. Consul ships with a built-in proxy suitable for an out-of-the-box development experience, but you may require a more robust proxy solution for production environments.

    The proxy you integrate must be able to accept inbound connections and/or establish outbound connections identified as a particular service. In some cases, either ability may be acceptable, but both are generally required to support full sidecar functionality.

    Sidecar proxies may support L4 or L7 network functionality. L4 integration is simpler and adequate for securing all traffic. L4 treats all traffic as TCP, however, so advanced routing or metrics features are not supported.

    Full L7 support is built on top of L4 support. An L7 proxy integration supports most or all of the L7 traffic routing features in Connect by dynamically configuring routing, retries, and other L7 features. The built-in proxy only supports L4, while Envoy supports the full L7 feature set.

    Areas where the integration approach differs between L4 and L7 are identified in this topic.

    The proxy must accept TLS connections on some port to accept inbound connections.

    Call the API endpoint to obtain the client certificate, e.g.:

    The client certificate from the inbound connection must be validated against the Connect CA root certificates. Call the /v1/agent/connect/ca/roots endpoint to obtain the root certificates from the Connect CA, e.g.:
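
    A minimal sketch of reading the roots response, assuming the body is a JSON object with a "Roots" list whose entries carry PEM text in a "RootCert" field. Those field names, the agent_addr default, and both function names are assumptions for illustration, not taken from the text above.

```python
import json
import urllib.request

ROOTS_PATH = "/v1/agent/connect/ca/roots"  # endpoint named in the text above


def trusted_root_pems(roots_body: bytes) -> list:
    """Return the PEM-encoded root certificates from a roots response body.

    Assumes a JSON object with a "Roots" list whose entries hold the PEM
    text in a "RootCert" field (assumed field names).
    """
    doc = json.loads(roots_body)
    roots = doc.get("Roots") or []
    return [root["RootCert"] for root in roots if root.get("RootCert")]


def fetch_trusted_root_pems(agent_addr: str = "http://127.0.0.1:8500") -> list:
    """Fetch the roots from the local agent (agent_addr is an assumed default)."""
    with urllib.request.urlopen(agent_addr + ROOTS_PATH, timeout=5) as resp:
        return trusted_root_pems(resp.read())
```

    The sketch keeps every returned root rather than a single one, on the assumption that more than one root can be trusted at a time, for example while a rotation is in progress.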

    After validating the client certificate from the caller, the proxy can authorize the connection. Depending on the protocol of the proxied service, authorization is performed either per connection (L4) or per request (L7). Authentication is based on “service identity” (TLS) and is implemented at the transport layer.

    Note: Some features, such as (local) rate limiting or max connections, are expected to be proxy-level configurations enforced separately when authorization calls are made. Proxies can enforce the configurations based on information about request rates and other states that should already be available.

    The proxy can authorize the connection by either calling the /v1/agent/connect/authorize API endpoint or by querying the endpoint.

    The /v1/agent/connect/authorize endpoint should be called in the connection path for each received connection. If the local Consul agent is down or unresponsive, the success rate of new connections will be compromised. The agent uses locally-cached data to authorize the connection and typically responds in microseconds. As a result, the TLS handshake typically spans microseconds.

    Note: This endpoint is only suitable for L4 (e.g., TCP) integration. During evaluation, the endpoint always treats intentions that define L7 criteria as deny intentions.
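
    The per-connection authorization call could be sketched as follows. The request fields (Target, ClientCertURI, ClientCertSerial) and response fields (Authorized, Reason) are assumptions about the endpoint's JSON shape, and the function names and one-second timeout are illustrative choices. The sketch denies the connection whenever the agent cannot be reached.

```python
import json
import urllib.request

AUTHORIZE_PATH = "/v1/agent/connect/authorize"  # endpoint named in the text above


def authorize_payload(target: str, client_cert_uri: str, client_cert_serial: str) -> bytes:
    """Build the JSON request body (field names are assumptions)."""
    return json.dumps({
        "Target": target,
        "ClientCertURI": client_cert_uri,
        "ClientCertSerial": client_cert_serial,
    }).encode("utf-8")


def parse_decision(body: bytes) -> tuple:
    """Return (authorized, reason); anything but an explicit true is a deny."""
    doc = json.loads(body)
    if not isinstance(doc, dict):
        return False, "unexpected response shape"
    return doc.get("Authorized") is True, doc.get("Reason", "")


def authorize(agent_addr: str, target: str, client_cert_uri: str, client_cert_serial: str) -> tuple:
    """Ask the local agent about one inbound connection; fail closed on errors."""
    req = urllib.request.Request(
        agent_addr + AUTHORIZE_PATH,
        data=authorize_payload(target, client_cert_uri, client_cert_serial),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=1) as resp:
            return parse_decision(resp.read())
    except (OSError, ValueError) as exc:  # network failure, HTTP error, or bad JSON
        return False, "authorization unavailable: %s" % exc
```

    Failing closed matches the observation above that an unresponsive agent compromises the success rate of new connections.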

    The proxy can query the endpoint on startup to retrieve a list of intentions that match the proxy destination. The matches should be stored in the native filter configuration of the proxy, such as RBAC for Envoy.

    Persistent TCP connections and intentions

    For a proxied service configured with the TCP protocol, potentially long-lived TCP connections are authorized only when they are initially established. Because many services, such as databases, typically use persistent connection pools, changing intentions to deny access does not terminate existing connections. This behavior violates the updated intention, and it may appear as if the intention is not being enforced.

    Implement one of the following strategies to close connections:

    1. Configure connections to terminate after a maximum lifetime, e.g., several hours. This balances the overhead of establishing new connections against how long existing connections can remain open after an intention changes.

    2. Periodically re-authorize every open connection. The authorization call is inexpensive and should be a local, in-memory operation on the Consul agent. Periodically authorizing thousands of open connections (e.g., once every minute) is likely to add negligible overhead, but doing so enforces a tighter upper bound on how long it takes to enforce intention changes without affecting the protocol efficiency of persistent connections.
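
    Both strategies can be combined in one periodic sweep. The sketch below is illustrative only: the open_conns layout, the is_authorized callback, and the function name are hypothetical, and a real proxy would close each returned connection itself.

```python
import time


def connections_to_close(open_conns: dict, is_authorized, max_lifetime_s: float, now=None) -> list:
    """Return the IDs of connections that should be closed on this sweep.

    open_conns maps a connection ID to a dict with "opened_at" (seconds on
    the same clock as now) and "identity" (whatever is_authorized needs to
    repeat the authorization check for that connection).
    """
    now = time.monotonic() if now is None else now
    doomed = []
    for conn_id, conn in open_conns.items():
        too_old = now - conn["opened_at"] >= max_lifetime_s  # strategy 1
        if too_old or not is_authorized(conn["identity"]):   # strategy 2
            doomed.append(conn_id)
    return doomed
```

    Running the sweep once a minute bounds the delay before an intention change takes effect to roughly that interval, while the lifetime cap still applies if re-authorization is skipped.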

    Certificate serial in authorization

    Intentions currently use TLS URI Subject Alternative Name (SAN) for enforcement. The AuthZ API in the Go SDK contains a field for passing the serial number. Proxies may provide this value during authorization.
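
    As an illustration of where these two values come from, the sketch below pulls the URI SAN and serial number out of a dictionary shaped like the one Python's ssl.SSLSocket.getpeercert() returns for a validated peer certificate. The function name is hypothetical, and requiring exactly one URI SAN is an assumption based on how SPIFFE identities are normally encoded.

```python
def identity_from_peer_cert(peer_cert: dict) -> tuple:
    """Return (uri_san, serial) from a getpeercert()-style dictionary.

    Raises ValueError unless the certificate carries exactly one URI SAN.
    """
    sans = peer_cert.get("subjectAltName", ())
    uris = [value for kind, value in sans if kind == "URI"]
    if len(uris) != 1:
        raise ValueError("expected exactly one URI SAN, found %d" % len(uris))
    # serialNumber is a hex string in getpeercert() output; empty if absent.
    return uris[0], peer_cert.get("serialNumber", "")
```

    Note that getpeercert() returns an empty dictionary when the peer certificate was not validated, in which case this function raises.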

    The API endpoints described in this section operate on agent-local data that is updated in the background. The leaf, roots, and intentions should be updated in the background by the proxy.

    The leaf cert, root cert, and intentions endpoints support blocking queries, which should be used to get near-immediate updates for root key rotations, new leaf certs before expiry, and intention changes.
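
    A blocking-query loop could look like the sketch below. It assumes the usual Consul convention of passing index and wait query parameters and reading the next index from the X-Consul-Index response header. The fetch callback, the rounds limit, and all function names are hypothetical and exist only to keep the example self-contained.

```python
import urllib.parse


def blocking_url(url: str, index: int, wait: str = "5m") -> str:
    """Append the blocking-query parameters to a URL."""
    sep = "&" if "?" in url else "?"
    return url + sep + urllib.parse.urlencode({"index": index, "wait": wait})


def next_index(previous: int, header_value) -> int:
    """Choose the index for the next request from the X-Consul-Index header."""
    try:
        current = int(header_value)
    except (TypeError, ValueError):
        return 0  # missing or malformed header: start over
    if current < previous or current < 1:
        return 0  # index went backwards: reset instead of blocking on a stale value
    return current


def watch(url: str, fetch, on_change, rounds: int) -> None:
    """Long-poll url; fetch(url) must return (body, x_consul_index_header)."""
    index = 0
    for _ in range(rounds):
        body, header = fetch(blocking_url(url, index))
        new_index = next_index(index, header)
        if index == 0 or new_index != index:
            on_change(body)  # first response, or the data changed
        index = new_index
```

    A request that times out without changes returns the same index, so on_change is not called again for unchanged data.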

    Although Consul follows the SPIFFE spec for certificates, some CA providers do not allow strict adherence. For example, CA certificates may not have the correct trust-domain SPIFFE URI SAN for the cluster. If the proxy performs SPIFFE validation, it should be possible to opt out. Otherwise, certain CA providers supported by Consul will not be compatible with that proxy. Neither Envoy nor the built-in proxy currently validates the SPIFFE URI of the chain beyond the leaf certificate.

    For outbound connections, the proxy should communicate with a Connect-capable endpoint for a service and provide a client certificate from the API endpoint. The proxy must use the root certificate obtained from the /v1/agent/connect/ca/roots endpoint to verify the certificate served from the destination endpoint.

    The API endpoint enables any proxy to discover proxy configurations registered with a local service. This endpoint supports hash-based blocking, which enables long-polling for changes to the registration/configuration. Any changes to the registration/config will result in the new config being returned immediately.

    Refer to the built-in proxy for an example implementation. Using the Go SDK, the proxy calls the HTTP “pull” API via the watch package.

    The discovery chain for each upstream service should be fetched from the API endpoint. This will return a compiled graph of configurations a sidecar needs for a particular upstream service.

    If you are only implementing L4 support in your proxy, set the OverrideProtocol value to tcp when fetching the discovery chain so that L7 features, such as HTTP routing rules, are not returned.
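
    A sketch of such a request follows. The chain_url argument is a placeholder for the discovery chain endpoint of one upstream, and sending the override as a JSON body in a POST request is an assumption about how the endpoint accepts overrides. The function name is hypothetical.

```python
import json
import urllib.request


def l4_discovery_chain_request(chain_url: str) -> urllib.request.Request:
    """Build a request for the chain with the protocol forced to tcp.

    chain_url is a placeholder for the discovery chain endpoint of one
    upstream service; it is not filled in here.
    """
    body = json.dumps({"OverrideProtocol": "tcp"}).encode("utf-8")
    return urllib.request.Request(
        chain_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: overrides travel in a POST body
    )
```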

    For each target in the resulting discovery chain, a list of healthy, Connect-capable endpoints may be fetched from the /v1/health/connect/:service_id API endpoint.

    Proxies can use Consul’s service discovery API to return all available, Connect-capable endpoints for a given service. This endpoint supports a cached query parameter, which uses caching to improve performance. The API package provides a UseCache query option to leverage caching. In addition to performance improvements, using the cache makes the mesh more resilient to Consul server outages, because the mesh “fails static”: the last known set of service instances is still used instead of new connections returning errors.
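
    The sketch below builds a query against /v1/health/connect/ with the cached parameter and reduces the response to address and port pairs. The passing filter, the Node and Service field names, and the fallback from an empty service address to the node address are assumptions about the health API's response shape. The function names are hypothetical.

```python
import json
import urllib.parse


def connect_health_url(agent_addr: str, service: str, use_cache: bool = True) -> str:
    """Build the URL that lists Connect-capable instances of a service."""
    path = "/v1/health/connect/" + urllib.parse.quote(service, safe="")
    params = ["passing"]  # assumed filter: only instances whose checks pass
    if use_cache:
        params.append("cached")  # the cached parameter described above
    return agent_addr + path + "?" + "&".join(params)


def endpoints(health_body: bytes) -> list:
    """Return (address, port) pairs from a health response body."""
    result = []
    for entry in json.loads(health_body) or []:
        service = entry.get("Service") or {}
        node = entry.get("Node") or {}
        # Fall back to the node address when the service address is empty.
        address = service.get("Address") or node.get("Address")
        if address and service.get("Port"):
            result.append((address, service["Port"]))
    return result
```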

    Proxies can decide whether to perform just-in-time queries to the API when a new connection needs to be routed, or to use blocking queries to load the current set of endpoints for a service and keep that list updated. The SDK and built-in proxy currently use just-in-time resolution. However, many existing proxies are likely to find it easier to integrate by pulling the set of endpoints and maintaining it in local memory using blocking queries.

    Upstreams may be defined with the Prepared Query target type. These upstreams should use Consul’s API to determine a list of upstream endpoints for the service. Note that the API does not support blocking, so proxies choosing to populate endpoints in memory will need to poll the endpoint at a suitable and, ideally, configurable frequency.

    Note: The service-resolver configuration entry has long-term support and will completely replace prepared queries in future versions of Consul. In some instances, however, prepared queries are still used.

    Consul does not start or manage sidecar proxy processes. Proxies running on a physical host or VM are designed to be started and run by process supervisor systems, such as init, systemd, supervisord, etc. If deployed within a cluster scheduler (Kubernetes, Nomad), proxies should run as a sidecar container in the same namespace.

    Proxies will use the CONSUL_HTTP_TOKEN and CONSUL_HTTP_ADDR environment variables to contact Consul and fetch certificates. This works only if the CONSUL_HTTP_TOKEN environment variable contains a Consul ACL token that has the necessary permissions to read configurations for that service. If you use the Go API package, the environment variables are read and the client is configured for you automatically.

    Alternatively, you may also use the flags -token or -token-file to provide the Consul ACL token.
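
    For a proxy that does not use the Go API package, reading the two environment variables could be sketched as follows. The 127.0.0.1:8500 default, the plain-HTTP fallback, and the X-Consul-Token header name are assumptions, and the function name is hypothetical.

```python
import os


def consul_client_settings(environ=None) -> dict:
    """Derive the agent base URL and request headers from the environment."""
    environ = os.environ if environ is None else environ
    addr = environ.get("CONSUL_HTTP_ADDR") or "127.0.0.1:8500"  # assumed default
    if "://" not in addr:
        addr = "http://" + addr  # assumption: plain HTTP unless a scheme is given
    headers = {}
    token = environ.get("CONSUL_HTTP_TOKEN")
    if token:
        headers["X-Consul-Token"] = token  # assumed header name for ACL tokens
    return {"base_url": addr.rstrip("/"), "headers": headers}
```

    Passing a dictionary instead of relying on os.environ keeps the function easy to exercise in tests.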

    If TLS is enabled on Consul, you will also need to add the following environment variables prior to starting the proxy:

    The CONSUL_CACERT and CONSUL_CLIENT_KEY values can also be provided as CLI flags. Refer to the consul connect proxy documentation for details.

    The proxy service ID comes from the user. You can use the -proxy-id flag to specify the ID of the proxy service you have already registered with the local agent.