Monitoring Blocked and Passthrough External Service Traffic

How you can use Istio to monitor blocked and passthrough external traffic.

Sep 28, 2019 | By Neeraj Poddar - Aspen Mesh

Understanding, controlling and securing your external service access is one of the key benefits that you get from a service mesh like Istio. From a security and operations point of view, it is critical to monitor which external service traffic is being blocked, as it might surface a misconfiguration, or a security vulnerability if an application is attempting to communicate with a service that it should not be allowed to reach. Similarly, if you currently have a policy of allowing any external service access, it is beneficial to monitor the traffic so you can incrementally add explicit Istio configuration to allow access and better secure your cluster. In either case, having visibility into this traffic via telemetry is quite helpful, as it enables you to create alerts and dashboards and to better reason about your security posture. This was a highly requested feature by production users of Istio, and we are excited that support for it was added in release 1.3.

To implement this, the Istio default metrics are augmented with explicit labels to capture blocked and passthrough external service traffic. This blog will cover how you can use these augmented metrics to monitor all external service traffic.

The Istio control plane configures the sidecar proxy with predefined clusters called BlackHoleCluster and PassthroughCluster, which block or allow all traffic, respectively. To understand these clusters, let’s start with what external and internal services mean in the context of an Istio service mesh.

External and internal services

Internal services are defined as services which are part of your platform and are considered to be in the mesh. For internal services, the Istio control plane provides all the required configuration to the sidecars by default. For example, in Kubernetes clusters, Istio configures the sidecars for all Kubernetes services to preserve the default Kubernetes behavior of all services being able to communicate with each other.

External services are services which are not part of your platform, i.e. services which are outside of the mesh. For external services, Istio provides two options: block all external service access (enabled by setting global.outboundTrafficPolicy.mode to REGISTRY_ONLY) or allow all access to external services (enabled by setting global.outboundTrafficPolicy.mode to ALLOW_ANY). The default for this setting (as of Istio 1.3) is to allow all external service access. This option can be configured via the mesh configuration.

This is where the BlackHole and Passthrough clusters are used.
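As a rough mental model, the effect of the two policy modes can be sketched in Python. This is only an illustration and not how Istio or Envoy implement routing; the names OutboundTrafficPolicy, fallback_cluster and route are hypothetical, and the registry is assumed to be a plain set of known host names.

```python
from enum import Enum
from typing import Set


class OutboundTrafficPolicy(Enum):
    """The two values of global.outboundTrafficPolicy.mode."""
    REGISTRY_ONLY = "REGISTRY_ONLY"
    ALLOW_ANY = "ALLOW_ANY"


def fallback_cluster(policy: OutboundTrafficPolicy) -> str:
    """Return the predefined cluster that receives traffic to unknown hosts."""
    if policy is OutboundTrafficPolicy.REGISTRY_ONLY:
        return "BlackHoleCluster"    # traffic is blocked
    return "PassthroughCluster"      # traffic is allowed through unchanged


def route(host: str, registry: Set[str], policy: OutboundTrafficPolicy) -> str:
    """Simplified model: hosts known to the mesh are routed normally,
    everything else falls through to one of the predefined clusters."""
    if host in registry:
        return host
    return fallback_cluster(policy)


if __name__ == "__main__":
    known = {"reviews.default.svc.cluster.local"}
    for mode in OutboundTrafficPolicy:
        print(mode.value, "->", route("httpbin.org", known, mode))
```

In this model, a host that is not in the registry never gets a route of its own, which is why telemetry for such traffic ends up attributed to one of the two predefined clusters.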

What are BlackHole and Passthrough clusters?

Prior to Istio 1.3, either no metrics were reported when traffic hit these clusters, or the metrics that were reported carried no explicit labels, resulting in a lack of visibility into traffic flowing through the mesh.

The next section covers how to take advantage of this enhancement, since the metrics and labels emitted depend on whether the virtual outbound listener or an explicit port/protocol listener is hit.

Using the augmented metrics

To capture all external service traffic in either case (BlackHole or Passthrough), you will need to monitor the istio_requests_total and istio_tcp_connections_closed_total metrics. Depending upon the Envoy listener type that gets invoked (TCP proxy or HTTP proxy), one of these metrics will be incremented.
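One way to watch both metrics is to generate a Prometheus query per metric and cluster. The helper below is a minimal sketch: external_traffic_query is a hypothetical name, and it assumes that the destination_service_name label carries the cluster name (BlackHoleCluster or PassthroughCluster) and that source_workload is available, as in Istio's default telemetry. Verify the label names against your own Prometheus before relying on it.

```python
EXTERNAL_CLUSTERS = ("BlackHoleCluster", "PassthroughCluster")
METRICS = ("istio_requests_total", "istio_tcp_connections_closed_total")


def external_traffic_query(metric: str, cluster: str, window: str = "5m") -> str:
    """Build a PromQL query for the per-workload rate of traffic hitting a cluster.

    Assumption: destination_service_name is set to the cluster name.
    """
    if metric not in METRICS:
        raise ValueError(f"unexpected metric: {metric}")
    if cluster not in EXTERNAL_CLUSTERS:
        raise ValueError(f"unexpected cluster: {cluster}")
    return (
        f"sum by (source_workload) "
        f'(rate({metric}{{destination_service_name="{cluster}"}}[{window}]))'
    )


if __name__ == "__main__":
    # Print one query per metric/cluster pair, e.g. for pasting into a dashboard.
    for metric in METRICS:
        for cluster in EXTERNAL_CLUSTERS:
            print(external_traffic_query(metric, cluster))
```

Grouping by source_workload is a design choice made here so that an alert points at the application making the call; other groupings may suit your dashboards better.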

Additionally, in the case of a TCP proxy listener, to see the IP address of the external service that is blocked or allowed via the BlackHole or Passthrough cluster, you will need to add the destination_ip label to the istio_tcp_connections_closed_total metric. In this scenario, the host name of the external service is not captured. This label is not added by default, but it can be added by augmenting the Istio configuration for attribute generation and the Prometheus handler. Be careful about cardinality explosion in time series if you have many services with non-stable IP addresses.
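To get a feel for the cardinality risk before enabling the label everywhere, you could count how many distinct destination_ip values appear per cluster in a sample of series. The sketch below assumes each series is available as a plain dictionary of label names to values (for example, extracted from a Prometheus API response); distinct_destination_ips is a hypothetical helper, not part of Istio or Prometheus.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set


def distinct_destination_ips(series: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count distinct destination_ip values per destination_service_name.

    Each distinct IP adds at least one time series, so a large count is a
    warning sign for cardinality.
    """
    ips: Dict[str, Set[str]] = defaultdict(set)
    for labels in series:
        ip = labels.get("destination_ip")
        if ip is None:
            continue  # series without the extra label do not add cardinality
        ips[labels.get("destination_service_name", "unknown")].add(ip)
    return {name: len(values) for name, values in ips.items()}


if __name__ == "__main__":
    # Made-up sample data using documentation-only IP ranges.
    sample = [
        {"destination_service_name": "PassthroughCluster", "destination_ip": "203.0.113.10"},
        {"destination_service_name": "PassthroughCluster", "destination_ip": "203.0.113.11"},
        {"destination_service_name": "BlackHoleCluster", "destination_ip": "198.51.100.7"},
    ]
    print(distinct_destination_ips(sample))
```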

PassthroughCluster metrics

This section explains the metrics and the labels emitted based on the listener type invoked in Envoy.

BlackHoleCluster metrics

Similar to the PassthroughCluster, this section explains the metrics and the labels emitted based on the listener type invoked in Envoy.

Monitoring these metrics can help operators easily understand all the external services consumed by the applications in their cluster.