Security Model

This document aims to describe the security posture of Istio’s various components, and how possible attacks can impact the system.

Components

Istio comes with a variety of optional components that will be covered here. For a high-level overview, see Istio Architecture. Note that Istio deployments are highly flexible; below, we will primarily assume the worst-case scenarios.

Istiod

Istiod serves as the core control plane component of Istio, often acting as both the XDS server and the mesh mTLS Certificate Authority.

Istiod is considered a highly privileged component, similar to the Kubernetes API server itself.

  • It has high Kubernetes RBAC privileges, typically including Secret read access and webhook write access.
  • When acting as the CA, it can provision arbitrary certificates.
  • When acting as the XDS control plane, it can program proxies to perform arbitrary behavior.

As such, the security of the cluster is tightly coupled to the security of Istiod. Following Kubernetes security best practices around Istiod access is paramount.

Istio CNI plugin

Istio can optionally be deployed with the Istio CNI Plugin DaemonSet. This DaemonSet is responsible for setting up networking rules in Istio to ensure traffic is transparently redirected as needed. This is an alternative to the istio-init container discussed below.

Because the CNI DaemonSet modifies networking rules on the node, it requires an elevated securityContext. However, unlike Istiod, this is a node-local privilege. The implications of this are discussed below.

Because this consolidates the elevated privileges required to set up networking into a single pod, rather than every pod, this option is generally recommended.

Sidecar Proxies

Istio may optionally deploy a sidecar proxy next to an application.

The sidecar proxy needs the network to be programmed to direct all traffic through the proxy. This can be done with the Istio CNI plugin or by deploying an initContainer (istio-init) on the pod (this is done automatically if the CNI plugin is not deployed). The istio-init container requires NET_ADMIN and NET_RAW capabilities. However, these capabilities are only present during initialization; the primary sidecar container is completely unprivileged.

Additionally, the sidecar proxy does not require any associated Kubernetes RBAC privileges at all.

Each sidecar proxy is authorized to request a certificate for the associated Pod Service Account.

Gateways and Waypoints

Gateways and Waypoints act as standalone proxy deployments. Unlike sidecars, they do not require any networking modifications, and thus don’t require any privilege.

These components run with their own service accounts, distinct from application identities.

Ztunnel

Ztunnel acts as a node-level proxy. This task requires the NET_ADMIN, SYS_ADMIN, and NET_RAW capabilities. Like the Istio CNI Plugin, these are node-local privileges only. The Ztunnel does not have any associated Kubernetes RBAC privileges.

Ztunnel is authorized to request certificates for any Service Accounts of pods running on the same node. Similar to kubelet, this explicitly does not allow requesting arbitrary certificates. This, again, ensures these privileges are node-local only.
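The node-local scoping described above can be pictured with a small model. This is a conceptual sketch only, not Istio's CA code: the `Pod` type, the `may_issue` function, and the node, namespace, and service account names are all hypothetical. It assumes the CA knows which pods are scheduled on which node, and honors a request from a node proxy only when a pod with the requested service account runs on that proxy's node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    """Hypothetical view of a scheduled pod (illustrative, not an Istio type)."""
    node: str
    namespace: str
    service_account: str


def may_issue(pods, requester_node, namespace, service_account):
    """Return True only if a pod with the requested identity runs on the requester's node."""
    return any(
        p.node == requester_node
        and p.namespace == namespace
        and p.service_account == service_account
        for p in pods
    )


# Assumed cluster state: one workload per node.
pods = [
    Pod("node-a", "shop", "cart"),
    Pod("node-b", "shop", "payments"),
]

# The proxy on node-a asks for the identity of a co-located workload.
print(may_issue(pods, "node-a", "shop", "cart"))
# The proxy on node-a asks for an identity that only exists on node-b.
print(may_issue(pods, "node-a", "shop", "payments"))
```

In this model the second request is refused, which is the property that keeps a single node proxy from obtaining identities for the whole cluster.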

Traffic Capture Properties

When a pod is enrolled in the mesh, all incoming TCP traffic will be redirected to the proxy. This includes both mTLS/HBONE traffic and plaintext traffic. Any applicable policies for the workload will be enforced before forwarding the traffic to the workload.

However, Istio does not currently guarantee that outgoing traffic is redirected to the proxy. See traffic capture limitations. As such, care must be taken to follow the securing egress traffic steps if outbound policies are required.

Mutual TLS Properties

Mutual TLS provides the basis for much of Istio’s security posture. The sections below explain the properties that mutual TLS contributes to it.

Certificate Authority

Istio comes out of the box with its own Certificate Authority.

By default, the CA allows authenticating clients based on any of the options below:

  • A Kubernetes JWT token, with an audience of istio-ca, verified with a Kubernetes TokenReview. This is the default method in Kubernetes Pods.
  • An existing mutual TLS certificate.
  • Custom JWT tokens, verified using OIDC (requires configuration).

The CA will only issue certificates that are requested for identities that a client is authenticated for.
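As a rough illustration of this rule, the sketch below compares the identity a client has authenticated as with the identity named in its certificate request. Workload identities in Istio take the SPIFFE form `spiffe://<trust domain>/ns/<namespace>/sa/<service account>`; the function names `spiffe_id` and `authorize_csr`, and the `shop` namespace and its service accounts, are hypothetical and are not taken from Istio's implementation.

```python
def spiffe_id(trust_domain, namespace, service_account):
    """Build a workload identity in SPIFFE form."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def authorize_csr(authenticated_identity, requested_identity):
    """Hypothetical check: sign only when the request matches the caller's own identity."""
    return authenticated_identity == requested_identity


# Identity derived from the caller's verified credentials, for example a validated token.
caller = spiffe_id("cluster.local", "shop", "cart")

# A request for the caller's own identity is accepted.
print(authorize_csr(caller, spiffe_id("cluster.local", "shop", "cart")))
# A request for some other service account is rejected.
print(authorize_csr(caller, spiffe_id("cluster.local", "shop", "payments")))
```

The sketch covers only the simple one-identity case; components that are authorized for several identities would need a set membership check instead of equality.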

Istio can also integrate with a variety of third party CAs; please refer to any of their security documentation for more information on how they behave.

Client mTLS

In sidecar mode, the client sidecar will automatically use TLS when connecting to a service that is detected to support mTLS. This can also be explicitly configured. Note that this automatic detection relies on Istio associating the traffic to a Service. Unsupported traffic types or configuration scoping can prevent this.

When connecting to a backend, the set of allowed identities is computed, at the Service level, based on the union of all backends’ identities.
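The union can be sketched in a few lines. This is an illustrative model under stated assumptions, not Istio's code: the `allowed_identities` function and the backend records, including the `reviews` and `reviews-canary` service accounts, are made up for the example.

```python
def allowed_identities(backends):
    """Union the identities of every backend that belongs to one Service."""
    return {identity for backend in backends for identity in backend["identities"]}


# Two hypothetical backends of the same Service, running under different service accounts.
backends = [
    {"pod": "reviews-v1", "identities": {"spiffe://cluster.local/ns/default/sa/reviews"}},
    {"pod": "reviews-v2", "identities": {"spiffe://cluster.local/ns/default/sa/reviews-canary"}},
]

allowed = allowed_identities(backends)
print(sorted(allowed))
```

Because the set is computed per Service rather than per endpoint, a client in this model would accept either identity from either backend.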

Server mTLS

By default, Istio will accept mTLS and non-mTLS traffic (often called “permissive mode”). Users can opt in to strict enforcement by writing PeerAuthentication or AuthorizationPolicy rules requiring mTLS.
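For example, strict enforcement for one namespace can be requested with a PeerAuthentication resource whose `mtls.mode` is `STRICT`. The sketch below builds such a manifest as a Python dictionary and prints it as JSON. The helper name `strict_peer_authentication` and the `shop` namespace are assumptions for illustration, and the `security.istio.io/v1` API version is assumed to be served by the installed Istio release (older releases may only serve `v1beta1`).

```python
import json


def strict_peer_authentication(namespace):
    """Build a namespace-wide PeerAuthentication manifest that requires mTLS."""
    return {
        "apiVersion": "security.istio.io/v1",  # assumption: may be v1beta1 on older releases
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": "STRICT"}},
    }


# "shop" is a hypothetical namespace name.
manifest = strict_peer_authentication("shop")
print(json.dumps(manifest, indent=2))
```

Since Kubernetes accepts JSON manifests, the printed output could be piped to `kubectl apply -f -` once the namespace and API version have been checked against the target cluster.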

When mTLS connections are established, the peer certificate is verified. Additionally, the peer identity is verified to be within the same trust domain. To verify only specific identities are allowed, an AuthorizationPolicy can be used.
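A minimal sketch of such a policy follows, again built as a Python dictionary and printed as JSON. The helper name `allow_principals`, the policy name, the `app` label, and the `shop` namespace with its `cart` and `payments` workloads are hypothetical. Principals are written in the `<trust domain>/ns/<namespace>/sa/<service account>` form, and the `security.istio.io/v1` API version is assumed to be available on the cluster.

```python
import json


def allow_principals(name, namespace, app_label, principals):
    """Build an ALLOW AuthorizationPolicy limited to the given peer identities."""
    return {
        "apiVersion": "security.istio.io/v1",  # assumption: may be v1beta1 on older releases
        "kind": "AuthorizationPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app_label}},
            "action": "ALLOW",
            "rules": [{"from": [{"source": {"principals": principals}}]}],
        },
    }


# Hypothetical example: only the "cart" service account may call "payments".
policy = allow_principals(
    name="payments-allow-cart",
    namespace="shop",
    app_label="payments",
    principals=["cluster.local/ns/shop/sa/cart"],
)
print(json.dumps(policy, indent=2))
```

Because the rule matches on the peer's principal, which is only known for mTLS connections, plaintext callers would not satisfy it.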

Compromise types explored

Based on the above overview, we will consider the impact on the cluster if various parts of the system are compromised. In the real world, there are a variety of different variables around any security attack:

  • How easy it is to execute
  • What prior privileges are required
  • How often it can be exploited
  • What the impact is (total remote execution, denial of service, etc).

In this document, we will primarily consider the worst case scenario: a compromised component means an attacker has complete remote code execution capabilities.

Workload compromise

In this scenario, an application workload (pod) is compromised.

A pod may have access to its service account token. If so, an attacker can move laterally from compromising a single pod to compromising the entire service account.

In the sidecar model, the proxy is co-located with the pod, and runs within the same trust boundary. A compromised application can tamper with the proxy through the admin API or other surfaces, including exfiltration of private key material, allowing another agent to impersonate the workload. It should be assumed that a compromised workload also includes a compromise of the sidecar proxy.

Given this, a compromised workload may:

  • Send arbitrary traffic, with or without mutual TLS. These may bypass any proxy configuration, or even the proxy entirely. Note that Istio does not offer egress-based authorization policies, so there is no egress authorization policy bypass occurring.
  • Accept traffic that was already destined to the application. It may bypass policies that were configured in the sidecar proxy.

The key takeaway here is that while the compromised workload may behave maliciously, this does not give it the ability to bypass policies in other workloads.

Istio offers a variety of features that can limit the impact of such a compromise:

  • Observability features can be used to identify the attack.
  • Policies can be used to restrict what type of traffic a workload can send or receive.

Proxy compromise - Sidecars

In this scenario, a sidecar proxy is compromised. Because the sidecar and application reside within the same trust boundary, this is functionally equivalent to the Workload compromise.

Proxy compromise - Waypoint

In this scenario, a waypoint proxy is compromised. While waypoints do not have any privileges for a hacker to exploit, they do serve (potentially) many different services and workloads. A compromised waypoint will receive all traffic for these, which it can view, modify, or drop.

Istio offers the flexibility of configuring the granularity of a waypoint deployment. Users may consider deploying more isolated waypoints if they require stronger isolation.

Because waypoints run with a distinct identity from the applications they serve, a compromised waypoint does not imply the user’s applications can be impersonated.

Proxy compromise - Ztunnel

In this scenario, a ztunnel proxy is compromised.

A compromised ztunnel gives the attacker control of the networking of the node.

Ztunnel has access to private key material for each application running on its node. A compromised ztunnel could have these exfiltrated and used elsewhere. However, lateral movement to identities beyond co-located workloads is not possible; each ztunnel is only authorized to access certificates for workloads running on its node, scoping the blast radius of a compromised ztunnel.

Node compromise

In this scenario, the Kubernetes Node is compromised. Both Kubernetes and Istio are designed to limit the blast radius of a single node compromise, such that the compromise of a single node does not lead to a cluster-wide compromise.

However, the attacker does have complete control over any workloads running on that node. For instance, they can compromise any co-located waypoints, the local ztunnel, any sidecars, any co-located Istiod instances, etc.

Cluster (API Server) compromise

A compromise of the Kubernetes API Server effectively means the entire cluster and mesh are compromised. Unlike most other attack vectors, there isn’t much Istio can do to control the blast radius of such an attack. A compromised API Server gives a hacker complete control over the cluster, including actions such as running kubectl exec on arbitrary pods, removing any Istio AuthorizationPolicies, or even uninstalling Istio entirely.

Istiod compromise

A compromise of Istiod generally leads to the same result as an API Server compromise. Istiod is a highly privileged component that should be strongly protected. Following the security best practices is crucial to maintaining a secure cluster.
