Weighted Canary Rollout with Istio
Walk me through how you'd canary a new version of a service with Istio. Say you want to start at 5% traffic to v2 and ramp up.
First, both versions need to be running with distinct labels: `version: v1` on the stable pods and `version: v2` on the new ones. Then I write a DestinationRule that declares both subsets so Istio knows which pods v1 and v2 refer to. Next, a VirtualService with a weighted route: 95 to v1, 5 to v2. To ramp, I bump the weight values and re-apply. Most teams script this or wire it into Flagger or Argo Rollouts so the ramp is automated based on success-rate and latency metrics from Prometheus.
The key thing to verify after each step is that the subset labels actually match real pods. A common failure mode is shifting weight to v2 when the v2 subset has zero healthy pods, for example because the pods are labeled `app: reviews-v2` instead of `version: v2`. You'll see 503s spike on roughly the share of traffic you shifted.
Also remember that weights are percentages of the traffic that reaches that route, not of all traffic to the service. If your VirtualService has a header match rule ahead of the weighted route, the weights only apply to requests that fell through to the weighted rule.
This is one of the most common real-world Istio use cases. The interviewer wants to hear that the candidate understands the two-CRD pattern (DestinationRule plus VirtualService), the label-matching gotcha, and ideally that weighted routes are not an SLO-aware progressive delivery system on their own; that is what Flagger or Argo Rollouts add on top. Bonus points for mentioning sticky sessions or session affinity considerations.
Initial 95/5 split with DestinationRule subsets
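A minimal sketch of that initial split. It assumes a service named `reviews` in the current namespace (the name is illustrative, borrowed from the label example in the answer) and the `networking.istio.io/v1beta1` API version; adjust both to match your cluster.

```yaml
# DestinationRule: tells Istio which pods belong to each named subset.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews            # assumed service name
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1          # must match the label on the stable pods
  - name: v2
    labels:
      version: v2          # must match the label on the canary pods
---
# VirtualService: splits traffic between the two subsets by weight.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 95
    - destination:
        host: reviews
        subset: v2
      weight: 5
```

To ramp, you would edit the two `weight` values (for example 90/10, then 75/25) and re-apply the VirtualService; the DestinationRule stays unchanged.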
Verify the split is hitting both subsets
Flagger Canary for automated ramp
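A hedged sketch of what such a Canary resource might look like. It assumes Flagger is installed with the Istio provider and has access to Prometheus, and that the workload is a Deployment named `reviews` serving on port 9080; the names, port, and thresholds are illustrative, not recommendations.

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: reviews              # assumed name
spec:
  targetRef:                 # the Deployment Flagger watches for changes
    apiVersion: apps/v1
    kind: Deployment
    name: reviews
  service:
    port: 9080               # assumed container/service port
  analysis:
    interval: 1m             # how often to run the checks
    threshold: 5             # failed checks tolerated before rollback
    stepWeight: 5            # start at 5% and add 5% per successful interval
    maxWeight: 50            # promote once the canary has held 50%
    metrics:
    - name: request-success-rate   # built-in metric, percentage
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration       # built-in metric, milliseconds
      thresholdRange:
        max: 500
      interval: 1m
```

Note that Flagger is designed to generate and manage the Istio routing objects itself from this one resource, so you would not hand-edit VirtualService weights alongside it.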
- Forgetting to label the v2 pods with `version: v2`, so traffic shifts to an empty subset
- Manually editing weights in production instead of using Flagger or Argo Rollouts
- Assuming weights apply to all traffic when an earlier match rule has already routed some of the requests
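The last mistake is easier to see in a manifest. The sketch below assumes a service named `reviews`, an existing DestinationRule that defines subsets `v1` and `v2`, and a hypothetical `x-canary` request header; none of these names come from a real deployment.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews              # assumed service name
spec:
  hosts:
  - reviews
  http:
  # Rule 1: evaluated first. Requests carrying the header go straight to v2
  # and never reach the weighted rule below.
  - match:
    - headers:
        x-canary:            # hypothetical header name
          exact: "true"
    route:
    - destination:
        host: reviews
        subset: v2
  # Rule 2: everything that did not match rule 1. The 95/5 split applies
  # only to this remaining traffic, not to all requests for the service.
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 95
    - destination:
        host: reviews
        subset: v2
      weight: 5
```

If a large share of requests carries the header, v2 receives far more than 5% of total traffic even though the weighted rule still reads 95/5.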
- How would you stick a specific user or header to v2 while everyone else stays on v1?
- What metrics would you watch during the ramp, and where would you get them?
- How does this change if your service holds in-memory session state?