Oh no! You’re having trouble? Below is a list of solutions to common problems.
If you don’t find what you need here, be sure to check out our help page.
Istio is installed and everything seems to be working except there are no traces showing up in Zipkin when there should be.
This may be caused by a known Docker issue where the time inside containers may skew significantly from the time on the host machine. If this is the case, when you select a very long date range in Zipkin you will see the traces appearing as much as several days too early.
You can also confirm this problem by comparing the date inside a Docker container with the date on the host:
docker run --entrypoint date gcr.io/istio-testing/ubuntu-16-04-slave:latest
Sun Jun 11 11:44:18 UTC 2017
date -u
Thu Jun 15 02:25:42 UTC 2017
To fix the problem, you’ll need to shut down and then restart Docker before reinstalling Istio.
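To put a number on the skew rather than comparing the two dates by eye, the comparison above can be sketched as a small shell helper. This is only an illustration: `clock_skew` is a hypothetical name, `IMAGE` is a placeholder, and it assumes the `date` binary on both sides accepts `+%s` (epoch seconds).

```shell
# clock_skew: print the difference in seconds between two epoch timestamps
# (host minus container). A positive result means the container clock is behind.
clock_skew() {
  container_epoch=$1
  host_epoch=$2
  echo $(( host_epoch - container_epoch ))
}

# Usage (IMAGE is a placeholder for any image that ships the date binary):
#   clock_skew "$(docker run --rm --entrypoint date IMAGE +%s)" "$(date -u +%s)"
```

A result of more than a few seconds suggests the container clock has drifted and Docker should be restarted as described above.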
Envoy requires HTTP/1.1 or HTTP/2 traffic for upstream services. For example, when using NGINX to serve traffic behind Envoy, you will need to set the proxy_http_version directive in your NGINX config to “1.1”, since the NGINX default is 1.0.
Example config:
upstream http_backend {
    server 127.0.0.1:8080;
    keepalive 16;
}
server {
    ...
    location /http/ {
        proxy_pass http://http_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        ...
    }
}
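As a quick sanity check, a sketch like the following can confirm that the directive is actually present in the configuration NGINX loads. The function name `uses_http11` is hypothetical, and the check is deliberately coarse: it only looks for an uncommented directive anywhere in the input, not for one in every `location` block.

```shell
# uses_http11: succeed if the NGINX configuration on stdin contains an
# active (uncommented) "proxy_http_version 1.1;" directive.
uses_http11() {
  grep -Eq '^[[:space:]]*proxy_http_version[[:space:]]+1\.1[[:space:]]*;'
}

# Usage, assuming nginx is on the PATH (nginx -T dumps the effective config):
#   nginx -T 2>/dev/null | uses_http11 && echo "HTTP/1.1 enabled for upstreams"
```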
Validate the client and server date and time match.
The time of the web client (e.g. Chrome) affects the output from Grafana. A simple solution to this problem is to verify that a time synchronization service is running correctly within the Kubernetes cluster and that the web client machine is also correctly using a time synchronization service. Some common time synchronization systems are NTP and Chrony. This is especially problematic in engineering labs with firewalls. In these scenarios, NTP may not be configured properly to point at the lab-based NTP services.
The expected flow of metrics is:
The default installations of Mixer ship with a Prometheus adapter, as well as configuration for generating a basic set of metric values and sending them to the Prometheus adapter. The Prometheus add-on also supplies configuration for an instance of Prometheus to scrape Mixer for metrics.
If you do not see the expected metrics in the Istio Dashboard and/or via Prometheus queries, there may be an issue at any of the steps in the flow listed above. Below is a set of instructions to troubleshoot each of those steps.
Mixer generates metrics for monitoring the behavior of Mixer itself. Check these metrics.
Establish a connection to the Mixer self-monitoring endpoint.
In Kubernetes environments, execute the following command:
kubectl -n istio-system port-forward <mixer pod> 9093 &
Verify successful report calls.
On the Mixer self-monitoring endpoint, search for grpc_server_handled_total.
You should see something like:
grpc_server_handled_total{grpc_code="OK",grpc_method="Report",grpc_service="istio.mixer.v1.Mixer",grpc_type="unary"} 68
If you do not see any data for grpc_server_handled_total with a grpc_method="Report", then Mixer is not being called by Envoy to report telemetry. In this case, ensure that the services have been properly integrated into the mesh (via either automatic or manual sidecar injection).
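The search above can also be scripted. The sketch below reads metrics in the Prometheus text format on stdin and sums the successful Report calls. `report_ok_total` is a hypothetical helper name, and the `/metrics` path in the usage line is an assumption about where the self-monitoring endpoint serves its data.

```shell
# report_ok_total: read Prometheus text-format metrics on stdin and print the
# sum of grpc_server_handled_total samples for Report calls with grpc_code="OK".
report_ok_total() {
  awk '/^grpc_server_handled_total[{]/ && /grpc_method="Report"/ && /grpc_code="OK"/ { sum += $NF }
       END { print sum + 0 }'
}

# Usage, with the port-forward from the previous step running
# (the /metrics path is an assumption):
#   curl -s http://localhost:9093/metrics | report_ok_total
```

A result of 0 corresponds to the "no data" case described above.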
Verify Mixer rules exist.
In Kubernetes environments, issue the following command:
kubectl get rules --all-namespaces
With the default configuration, you should see something like:
NAMESPACE      NAME       KIND
istio-system   promhttp   rule.v1alpha2.config.istio.io
istio-system   promtcp    rule.v1alpha2.config.istio.io
istio-system   stdio      rule.v1alpha2.config.istio.io
If you do not see anything named promhttp or promtcp, then there is no Mixer configuration for sending metric instances to a Prometheus adapter. You will need to supply configuration for rules that connect Mixer metric instances to a Prometheus handler (example).
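For scripted checks, a helper along these lines can report which expected names are missing from a `kubectl get ... --all-namespaces` listing. `missing_names` is a hypothetical function, and it assumes the NAME value is the second whitespace-separated column, as in the output shown above.

```shell
# missing_names: read a `kubectl get ... --all-namespaces` listing on stdin and
# print each expected name (passed as an argument) that is absent from the
# NAME column. Prints nothing when every expected name is present.
missing_names() {
  listing=$(cat)
  for name in "$@"; do
    printf '%s\n' "$listing" \
      | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
      || printf '%s\n' "$name"
  done
}

# Usage:
#   kubectl get rules --all-namespaces | missing_names promhttp promtcp
```

The same helper works for any listing with this column layout, for example by passing the expected metric instance names to it.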
Verify Prometheus handler config exists.
In Kubernetes environments, issue the following command:
kubectl get prometheuses.config.istio.io --all-namespaces
The expected output is:
NAMESPACE      NAME      KIND
istio-system   handler   prometheus.v1alpha2.config.istio.io
If there are no Prometheus handlers configured, you will need to reconfigure Mixer with the appropriate handler configuration (example).
Verify Mixer metric instances config exists.
In Kubernetes environments, issue the following command:
kubectl get metrics.config.istio.io --all-namespaces
The expected output is:
NAMESPACE      NAME                         KIND
istio-system   requestcount                 metric.v1alpha2.config.istio.io
istio-system   requestduration              metric.v1alpha2.config.istio.io
istio-system   requestsize                  metric.v1alpha2.config.istio.io
istio-system   responsesize                 metric.v1alpha2.config.istio.io
istio-system   stackdriverrequestcount      metric.v1alpha2.config.istio.io
istio-system   stackdriverrequestduration   metric.v1alpha2.config.istio.io
istio-system   stackdriverrequestsize       metric.v1alpha2.config.istio.io
istio-system   stackdriverresponsesize      metric.v1alpha2.config.istio.io
istio-system   tcpbytereceived              metric.v1alpha2.config.istio.io
istio-system   tcpbytesent                  metric.v1alpha2.config.istio.io
If there are no metric instances configured, you will need to reconfigure Mixer with the appropriate instance configuration (example).
Verify Mixer configuration resolution is working for your service.
Establish a connection to the Mixer self-monitoring endpoint.
Set up a port-forward to the Mixer self-monitoring port as described in Verify Mixer is receiving Report calls.
On the Mixer self-monitoring port, search for mixer_config_resolve_count.
You should find something like:
mixer_config_resolve_count{error="false",target="details.default.svc.cluster.local"} 56
mixer_config_resolve_count{error="false",target="ingress.istio-system.svc.cluster.local"} 67
mixer_config_resolve_count{error="false",target="mongodb.default.svc.cluster.local"} 18
mixer_config_resolve_count{error="false",target="productpage.default.svc.cluster.local"} 59
mixer_config_resolve_count{error="false",target="ratings.default.svc.cluster.local"} 26
mixer_config_resolve_count{error="false",target="reviews.default.svc.cluster.local"} 54
Validate that there are values for mixer_config_resolve_count where target="<your service>" and error="false".
If the only instances where target="<your service>" have error="true", there is likely an issue with the Mixer configuration for your service. Log information is needed to debug further.
In Kubernetes environments, retrieve the Mixer logs via:
kubectl -n istio-system logs <mixer pod> mixer
Look for errors related to your configuration or your service in the returned logs.
More on viewing Mixer configuration can be found here
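The mixer_config_resolve_count check can be sketched as a filter that totals the successful and failed resolutions for a single target. `resolve_counts` is a hypothetical helper, and the `/metrics` path in the usage line is an assumption about the self-monitoring endpoint.

```shell
# resolve_counts: read Mixer self-monitoring metrics on stdin and print the
# summed mixer_config_resolve_count for one target, split by the error label,
# in the form "ok=<n> failed=<n>".
resolve_counts() {
  awk -v target="$1" '
    /^mixer_config_resolve_count[{]/ && index($0, "target=\"" target "\"") {
      if (index($0, "error=\"false\"")) ok += $NF; else failed += $NF
    }
    END { printf "ok=%d failed=%d\n", ok, failed }'
}

# Usage (the /metrics path is an assumption):
#   curl -s http://localhost:9093/metrics | resolve_counts reviews.default.svc.cluster.local
```

A result with `ok=0` and a non-zero `failed` count matches the error-only case described above and is the signal to inspect the Mixer logs.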
Establish a connection to the Mixer self-monitoring endpoint.
Set up a port-forward to the Mixer self-monitoring port as described in Verify Mixer is receiving Report calls.
On the Mixer self-monitoring port, search for mixer_adapter_dispatch_count.
You should find something like:
mixer_adapter_dispatch_count{adapter="prometheus",error="false",handler="handler.prometheus.istio-system",meshFunction="metric",response_code="OK"} 114
mixer_adapter_dispatch_count{adapter="prometheus",error="true",handler="handler.prometheus.default",meshFunction="metric",response_code="INTERNAL"} 4
mixer_adapter_dispatch_count{adapter="stdio",error="false",handler="handler.stdio.istio-system",meshFunction="logentry",response_code="OK"} 104
Validate that there are values for mixer_adapter_dispatch_count where adapter="prometheus" and error="false".
If there are no recorded dispatches to the Prometheus adapter, there is likely a configuration issue. Please see Verify Mixer metrics configuration exists.
If dispatches to the Prometheus adapter are reporting errors, check the Mixer logs to determine the source of the error. Most likely, there is a configuration issue for the handler listed in mixer_adapter_dispatch_count.
In Kubernetes environments, check the Mixer logs via:
kubectl -n istio-system logs <mixer pod> mixer
Filter for lines including something like Report 0 returned with: INTERNAL (1 error occurred: (with some surrounding context) to find more information regarding Report dispatch failures.
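To find which handler to look for in the logs, the dispatch metrics can be filtered for failing handlers. The sketch below is illustrative only: `failing_handlers` is a hypothetical name, and the `/metrics` path in the usage line is an assumption about the self-monitoring endpoint.

```shell
# failing_handlers: read Mixer self-monitoring metrics on stdin and print each
# handler that has error="true" dispatches, followed by its error count.
failing_handlers() {
  awk '/^mixer_adapter_dispatch_count[{]/ && /error="true"/ {
         if (match($0, /handler="[^"]*"/)) {
           # Skip the 9-character prefix handler=" and drop the closing quote.
           print substr($0, RSTART + 9, RLENGTH - 10), $NF
         }
       }'
}

# Usage (the /metrics path is an assumption):
#   curl -s http://localhost:9093/metrics | failing_handlers
```

Each handler name printed this way is a candidate for the configuration issue described above.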
Connect to the Prometheus UI and verify that it can successfully scrape Mixer.
In Kubernetes environments, set up port-forwarding as follows:
kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=prometheus -o jsonpath='{.items[0].metadata.name}') 9090:9090 &
Visit http://localhost:9090/config.
Confirm that an entry exists that looks like:
- job_name: 'istio-mesh'
  # Override the global default and scrape targets from this job every 5 seconds.
  scrape_interval: 5s
  # metrics_path defaults to '/metrics'
  # scheme defaults to 'http'.
  static_configs:
  - targets: ['istio-mixer.istio-system:42422']
Visit http://localhost:9090/targets.
Confirm that target istio-mesh has a status of UP.
To debug Istio with gdb, you will need to run the debug images of Envoy / Mixer / Pilot. A recent gdb and the golang extensions (for Mixer/Pilot or other golang components) are required.
kubectl exec -it PODNAME -c [proxy | mixer | pilot]
Tcpdump doesn’t work in the sidecar pod because the container doesn’t run as root. However, any other container in the same pod will see all the packets, since the network namespace is shared. iptables will also see the pod-wide config.
Communication between Envoy and the app happens on 127.0.0.1, and is not encrypted.
Check your ulimit -a. Many systems have a 1024 open file descriptor limit by default, which will cause Envoy to assert and crash with:
[2017-05-17 03:00:52.735][14236][critical][assert] assert failure: fd_ != -1: external/envoy/source/common/network/connection_impl.cc:58
Make sure to raise your ulimit. Example: ulimit -n 16384
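A startup script could guard against this before launching Envoy. The following is a minimal sketch: `fd_limit_ok` is a hypothetical helper, its optional second argument exists only so the check can be exercised without changing the real limit, and 16384 is simply the example value used above.

```shell
# fd_limit_ok: succeed if the open-file limit is at least the given minimum.
# The limit defaults to the current shell's `ulimit -n`; "unlimited" always passes.
fd_limit_ok() {
  minimum=$1
  current=${2:-$(ulimit -n)}
  [ "$current" = "unlimited" ] || [ "$current" -ge "$minimum" ]
}

# Usage:
#   fd_limit_ok 16384 || echo "open file limit too low; run: ulimit -n 16384" >&2
```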