Troubleshoot connectivity issues with ztunnel

This guide describes some options for monitoring the ztunnel proxy configuration and datapath. This information can also help with some high level troubleshooting and in identifying information that would be useful to collect and provide in a bug report if there are any problems.

Viewing ztunnel proxy state

The ztunnel proxy gets configuration and discovery information from the istiod control plane via xDS APIs.

The istioctl x ztunnel-config command allows you to view discovered workloads as seen by a ztunnel proxy.

In the first example, you see all the workloads and control plane components that ztunnel is currently tracking, including information about the IP address and protocol to use when connecting to that component and whether there is a waypoint proxy associated with that workload.

$ istioctl x ztunnel-config workloads
NAMESPACE          POD NAME                                IP          NODE                  WAYPOINT PROTOCOL
default            bookinfo-gateway-istio-59dd7c96db-q9k6v 10.244.1.11 ambient-worker        None     TCP
default            details-v1-cf74bb974-5sqkp              10.244.1.5  ambient-worker        None     HBONE
default            notsleep-5c785bc478-zpg7j               10.244.2.7  ambient-worker2       None     HBONE
default            productpage-v1-87d54dd59-fn6vw          10.244.1.10 ambient-worker        None     HBONE
default            ratings-v1-7c4bbf97db-zvkdw             10.244.1.6  ambient-worker        None     HBONE
default            reviews-v1-5fd6d4f8f8-knbht             10.244.1.16 ambient-worker        None     HBONE
default            reviews-v2-6f9b55c5db-c94m2             10.244.1.17 ambient-worker        None     HBONE
default            reviews-v3-7d99fd7978-7rgtd             10.244.1.18 ambient-worker        None     HBONE
default            sleep-7656cf8794-r7zb9                  10.244.1.12 ambient-worker        None     HBONE
istio-system       istiod-7ff4959459-qcpvp                 10.244.2.5  ambient-worker2       None     TCP
istio-system       ztunnel-6hvcw                           10.244.1.4  ambient-worker        None     TCP
istio-system       ztunnel-mf476                           10.244.2.6  ambient-worker2       None     TCP
istio-system       ztunnel-vqzf9                           10.244.0.6  ambient-control-plane None     TCP
kube-system        coredns-76f75df574-2sms2                10.244.0.3  ambient-control-plane None     TCP
kube-system        coredns-76f75df574-5bf9c                10.244.0.2  ambient-control-plane None     TCP
local-path-storage local-path-provisioner-7577fdbbfb-pslg6 10.244.0.4  ambient-control-plane None     TCP

The ztunnel-config command can be used to view the secrets holding the TLS certificates that the ztunnel proxy has received from the istiod control plane to use for mTLS.

$ istioctl x ztunnel-config certificates "$ZTUNNEL".istio-system
CERTIFICATE NAME                                              TYPE     STATUS        VALID CERT     SERIAL NUMBER                        NOT AFTER                NOT BEFORE
spiffe://cluster.local/ns/default/sa/bookinfo-details         Leaf     Available     true           c198d859ee51556d0eae13b331b0c259     2024-05-05T09:17:47Z     2024-05-04T09:15:47Z
spiffe://cluster.local/ns/default/sa/bookinfo-details         Root     Available     true           bad086c516cce777645363cb8d731277     2034-04-24T03:31:05Z     2024-04-26T03:31:05Z
spiffe://cluster.local/ns/default/sa/bookinfo-productpage     Leaf     Available     true           64c3828993c7df6f85a601a1615532cc     2024-05-05T09:17:47Z     2024-05-04T09:15:47Z
spiffe://cluster.local/ns/default/sa/bookinfo-productpage     Root     Available     true           bad086c516cce777645363cb8d731277     2034-04-24T03:31:05Z     2024-04-26T03:31:05Z
spiffe://cluster.local/ns/default/sa/bookinfo-ratings         Leaf     Available     true           720479815bf6d81a05df8a64f384ebb0     2024-05-05T09:17:47Z     2024-05-04T09:15:47Z
spiffe://cluster.local/ns/default/sa/bookinfo-ratings         Root     Available     true           bad086c516cce777645363cb8d731277     2034-04-24T03:31:05Z     2024-04-26T03:31:05Z
spiffe://cluster.local/ns/default/sa/bookinfo-reviews         Leaf     Available     true           285697fb2cf806852d3293298e300c86     2024-05-05T09:17:47Z     2024-05-04T09:15:47Z
spiffe://cluster.local/ns/default/sa/bookinfo-reviews         Root     Available     true           bad086c516cce777645363cb8d731277     2034-04-24T03:31:05Z     2024-04-26T03:31:05Z
spiffe://cluster.local/ns/default/sa/sleep                    Leaf     Available     true           fa33bbb783553a1704866842586e4c0b     2024-05-05T09:25:49Z     2024-05-04T09:23:49Z
spiffe://cluster.local/ns/default/sa/sleep                    Root     Available     true           bad086c516cce777645363cb8d731277     2034-04-24T03:31:05Z     2024-04-26T03:31:05Z

Using these commands, you can check that ztunnel proxies are configured with all the expected workloads and TLS certificate. Additionally, missing information can be used for troubleshooting any networking errors.

You may use the all option to view all parts of the ztunnel-config with a single CLI command:

$ istioctl x ztunnel-config all -o json

You can also view the raw configuration dump of a ztunnel proxy via a curl to an endpoint inside its pod:

$ kubectl debug -it $ZTUNNEL -n istio-system --image=curlimages/curl -- curl localhost:15000/config_dump

Viewing Istiod state for ztunnel xDS resources

Sometimes you may wish to view the state of ztunnel proxy config resources as maintained in the istiod control plane, in the format of the xDS API resources defined specially for ztunnel proxies. This can be done by exec-ing into the istiod pod and obtaining this information from port 15014 for a given ztunnel proxy as shown in the example below. This output can then also be saved and viewed with a JSON pretty print formatter utility for easier browsing (not shown in the example).

$ export ISTIOD=$(kubectl get pods -n istio-system -l app=istiod -o=jsonpath='{.items[0].metadata.name}')
$ kubectl debug -it $ISTIOD -n istio-system --image=curlimages/curl -- curl localhost:15014/debug/config_dump?proxyID="$ZTUNNEL".istio-system

Verifying ztunnel traffic through logs

ztunnel’s traffic logs can be queried using the standard Kubernetes log facilities.

$ kubectl -n default exec deploy/sleep -- sh -c 'for i in $(seq 1 10); do curl -s -I http://productpage:9080/; done'
HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.12.1
--snip--

The response displayed confirms the client pod receives responses from the service. You can now check logs of the ztunnel pods to confirm the traffic was sent over the HBONE tunnel.

$ kubectl -n istio-system logs -l app=ztunnel | grep -E "inbound|outbound"
2024-05-04T09:59:05.028709Z info    access  connection complete src.addr=10.244.1.12:60059 src.workload="sleep-7656cf8794-r7zb9" src.namespace="default" src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.1.10:9080 dst.hbone_addr="10.244.1.10:9080" dst.service="productpage.default.svc.cluster.local" dst.workload="productpage-v1-87d54dd59-fn6vw" dst.namespace="productpage" dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-productpage" direction="inbound" bytes_sent=175 bytes_recv=80 duration="1ms"
2024-05-04T09:59:05.028771Z info    access  connection complete src.addr=10.244.1.12:58508 src.workload="sleep-7656cf8794-r7zb9" src.namespace="default" src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.1.10:15008 dst.hbone_addr="10.244.1.10:9080" dst.service="productpage.default.svc.cluster.local" dst.workload="productpage-v1-87d54dd59-fn6vw" dst.namespace="productpage" dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-productpage" direction="outbound" bytes_sent=80 bytes_recv=175 duration="1ms"
--snip--

These log messages confirm the traffic was sent via the ztunnel proxy. Additional fine-grained monitoring can be done by checking logs on the specific ztunnel proxy instances that are on the same nodes as the source and destination pods of the traffic. If these logs are not seen, then a possibility is that traffic redirection may not be working correctly.

Verifying ztunnel load balancing

The ztunnel proxy automatically performs client-side load balancing if the destination is a service with multiple endpoints. No additional configuration is needed. The load balancing algorithm is an internally fixed L4 Round Robin algorithm that distributes traffic based on L4 connection state, and is not user configurable.

By calling a service with multiple backends, we can validate that client traffic is balanced across the service replicas.

$ kubectl -n default exec deploy/sleep -- sh -c 'for i in $(seq 1 10); do curl -s -I http://reviews:9080/; done'
$ kubectl -n istio-system logs -l app=ztunnel | grep -E "outbound"
--snip--
2024-05-04T10:11:04.964851Z info    access  connection complete src.addr=10.244.1.12:35520 src.workload="sleep-7656cf8794-r7zb9" src.namespace="default" src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.1.9:15008 dst.hbone_addr="10.244.1.9:9080" dst.service="reviews.default.svc.cluster.local" dst.workload="reviews-v3-7d99fd7978-zznnq" dst.namespace="reviews" dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-reviews" direction="outbound" bytes_sent=84 bytes_recv=169 duration="2ms"
2024-05-04T10:11:04.969578Z info    access  connection complete src.addr=10.244.1.12:35526 src.workload="sleep-7656cf8794-r7zb9" src.namespace="default" src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.1.9:15008 dst.hbone_addr="10.244.1.9:9080" dst.service="reviews.default.svc.cluster.local" dst.workload="reviews-v3-7d99fd7978-zznnq" dst.namespace="reviews" dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-reviews" direction="outbound" bytes_sent=84 bytes_recv=169 duration="2ms"
2024-05-04T10:11:04.974720Z info    access  connection complete src.addr=10.244.1.12:35536 src.workload="sleep-7656cf8794-r7zb9" src.namespace="default" src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.1.7:15008 dst.hbone_addr="10.244.1.7:9080" dst.service="reviews.default.svc.cluster.local" dst.workload="reviews-v1-5fd6d4f8f8-26j92" dst.namespace="reviews" dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-reviews" direction="outbound" bytes_sent=84 bytes_recv=169 duration="2ms"
2024-05-04T10:11:04.979462Z info    access  connection complete src.addr=10.244.1.12:35552 src.workload="sleep-7656cf8794-r7zb9" src.namespace="default" src.identity="spiffe://cluster.local/ns/default/sa/sleep" dst.addr=10.244.1.8:15008 dst.hbone_addr="10.244.1.8:9080" dst.service="reviews.default.svc.cluster.local" dst.workload="reviews-v2-6f9b55c5db-c2dtw" dst.namespace="reviews" dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-reviews" direction="outbound" bytes_sent=84 bytes_recv=169 duration="2ms"

This is a round robin load balancing algorithm and is separate from and independent of any load balancing algorithm that may be configured within a VirtualService’s TrafficPolicy field, since as discussed previously, all aspects of VirtualService API objects are instantiated on the Waypoint proxies and not the ztunnel proxies.

Observability of ambient mode traffic

In addition to checking ztunnel logs and other monitoring options noted above, you can also use normal Istio monitoring and telemetry functions to monitor application traffic using the ambient data plane mode.

If a service is only using the secure overlay provided by ztunnel, the Istio metrics reported will only be the L4 TCP metrics (namely istio_tcp_sent_bytes_total, istio_tcp_received_bytes_total, istio_tcp_connections_opened_total, istio_tcp_connections_closed_total). The full set of Istio and Envoy metrics will be reported if a waypoint proxy is used.

Was this information useful?
Do you have any suggestions for improvement?

Thanks for your feedback!