Debugging an Istio Traffic Policy That Isn't Working
You applied a VirtualService that splits traffic 80/20 between v1 and v2 of a service, but in production all traffic still goes to v1. Walk me through how you'd debug it.
I'd work backwards from the sidecar, because that's where the routing actually happens. First, I'd run `istioctl proxy-config route` on the calling pod, filtered to the destination port. That dumps the effective Envoy route config and shows whether the 80/20 split is actually programmed into the proxy. If the route isn't there, the VirtualService isn't being picked up at all, usually because of a namespace, gateway, or host mismatch.

If the route is there but pointing to one subset, I'd check the DestinationRule subsets against the pod labels: `kubectl get pods --show-labels`, confirming that `version: v2` is actually on the v2 pods. A subset that matches zero pods doesn't raise an error when you apply it. Its cluster just has no endpoints, so requests routed to it fail with a 503 instead of reaching v2. After that, I'd run `istioctl proxy-config endpoints` to confirm the v2 subset has real endpoints. If the endpoint list is empty, that points to the label mismatch.

If everything looks right, I'd check for a competing VirtualService: two VirtualServices for the same host with conflicting rules can produce surprising results, and the order matters. I'd also look at the sidecar access logs to see which cluster requests are actually hitting. And `istioctl analyze` is good for catching obvious config errors before going deeper.
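The last two checks in the answer, looking for a competing VirtualService and reading the sidecar access logs, could be sketched roughly as below. The pod name and namespace are hypothetical placeholders, the helper function names are made up for illustration, and the log parsing assumes access logging is enabled and that the upstream cluster appears in the log line in the usual `outbound|port|subset|host` form.

```shell
# Hypothetical values; substitute your own calling pod and namespace.
CALLER_POD="productpage-v1-abc123"
CALLER_NS="default"

# List every VirtualService in the cluster so overlapping hosts can be spotted.
list_virtualservices() {
  kubectl get virtualservices --all-namespaces
}

# Count requests per upstream cluster as recorded in the sidecar access log.
# Assumes the log line contains a cluster name such as
# outbound|9080|v1|reviews.default.svc.cluster.local
count_upstream_clusters() {
  kubectl logs "${CALLER_POD}" -n "${CALLER_NS}" -c istio-proxy \
    | grep -o 'outbound|[0-9]*|[^|]*|[^ ]*' \
    | sort | uniq -c
}
```

Running `list_virtualservices` and then `count_upstream_clusters` shows whether more than one VirtualService claims the host and how many requests each subset cluster actually received.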
This is the troubleshooting question that separates engineers who've actually run Istio in production from those who've only configured it from tutorials. The interviewer is listening for a structured approach that starts with the data plane (what Envoy actually thinks) instead of guessing from the control plane YAML. Bonus signal if the candidate mentions checking for conflicting resources or namespace scope issues.
Check the route config the sidecar actually has
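A minimal sketch of this check, assuming a hypothetical calling pod `productpage-v1-abc123` in the `default` namespace and a destination port of 9080 (none of these values come from the original question). The `show_routes` wrapper is only an illustrative name.

```shell
# Hypothetical values; substitute your own calling pod, namespace, and port.
CALLER_POD="productpage-v1-abc123"
CALLER_NS="default"
DEST_PORT="9080"

# Dump the routes Envoy actually holds for the destination port.
# In the JSON output, look for the weighted clusters and their 80 and 20 weights.
show_routes() {
  istioctl proxy-config routes "${CALLER_POD}.${CALLER_NS}" --name "${DEST_PORT}" -o json
}
```

Call `show_routes` from a shell with cluster access. If only a single cluster shows up for the host, the split never reached the proxy.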
Confirm subsets resolve to real endpoints
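One way this might look, assuming the destination is a hypothetical `reviews` service in `default` whose pods carry `app=reviews` and `version` labels, and whose v2 subset cluster follows Istio's `outbound|port|subset|host` naming. The function names are illustrative only.

```shell
# Hypothetical values; substitute your own pod, namespace, and service details.
CALLER_POD="productpage-v1-abc123"
CALLER_NS="default"
DEST_NS="default"
SUBSET_CLUSTER="outbound|9080|v2|reviews.default.svc.cluster.local"

# Show pods that carry the labels the v2 subset is expected to select.
show_v2_pods() {
  kubectl get pods -n "${DEST_NS}" -l app=reviews,version=v2 --show-labels
}

# Show the endpoints the calling sidecar holds for the v2 subset cluster.
show_v2_endpoints() {
  istioctl proxy-config endpoints "${CALLER_POD}.${CALLER_NS}" --cluster "${SUBSET_CLUSTER}"
}
```

If `show_v2_pods` returns nothing, the label selector in the DestinationRule subset and the labels on the pods do not line up. An empty result from `show_v2_endpoints` confirms the sidecar has nowhere to send v2 traffic.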
Static analysis for common config problems
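A short sketch, assuming the VirtualService and DestinationRule live in a hypothetical `default` namespace. The wrapper names are illustrative; the second function widens the scan to the whole cluster, which helps when the conflicting resource sits in another namespace.

```shell
# Hypothetical namespace; substitute the one holding your Istio resources.
TARGET_NS="default"

# Analyze the Istio configuration in a single namespace.
analyze_namespace() {
  istioctl analyze --namespace "${TARGET_NS}"
}

# Analyze every namespace, useful for cross-namespace conflicts.
analyze_all() {
  istioctl analyze --all-namespaces
}
```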
- Reading only the YAML you applied instead of checking what the sidecar actually programmed
- Assuming a subset with zero pods will throw an error: it applies cleanly, and requests routed to it fail with a 503 rather than reaching v2
- Ignoring conflicting VirtualServices in other namespaces or with overlapping hosts
- If `istioctl proxy-config route` shows the right config but traffic still misbehaves, what would you check next?
- How would two VirtualServices for the same host interact, and which one wins?
- What's the difference between a missing DestinationRule subset and a subset with zero matching pods, and how would you tell them apart?