Question

Your security team sees that Litmus can delete pods and inject network faults cluster-wide, and they want it gone. How do you scope Litmus so you can still run chaos in production without handing it the keys to the cluster?

Accepted Answer

There are two levers, RBAC and blast radius, and you pull both.

On RBAC: Litmus ships litmus-admin, a cluster-wide service account that is fine for a sandbox and wrong for production. Each ChaosExperiment already declares exactly the verbs and resources its fault needs, so build a least-privilege ServiceAccount per experiment or per team namespace, using a Role and RoleBinding instead of a ClusterRole wherever the fault allows it. Run Litmus in namespaced scope so a fault cannot cross tenant boundaries.

On blast radius: the appns and applabel fields under appinfo in the ChaosEngine narrow the target set, PODS_AFFECTED_PERC and KILL_COUNT cap how many pods go at once, and for node or infra faults NODE_LABEL fences which nodes can be touched. Sequence multiple faults serially rather than in parallel so you do not compound failures you cannot reason about.

Then layer operational safety on top: run chaos in its own namespace, set resource quotas, attach probes in Continuous mode that fail fast and abort the run when an SLO breaks, and rehearse the whole thing in staging first. And know the kill switch cold: set engineState to stop or delete the ChaosEngine, either of which reverts the chaos on a best-effort basis and tears down the runner.

The pitch back to security is simple. A least-privilege service account, namespaced scope, a bounded blast radius, and an auto-halt probe add up to a smaller, known standing risk than the unknown failure modes you are already shipping to production blind.
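The RBAC and blast-radius rules in this answer can be encoded as a pre-flight check that runs before a ChaosEngine is applied. The following is a minimal sketch, not part of Litmus: the function names, the 25 percent cap, and the team-checkout, checkout, and pod-delete-sa names are all hypothetical, and it assumes a pod-level fault (node faults would be fenced with NODE_LABEL instead). The manifest is written as a Python dict that mirrors the ChaosEngine fields named above, with probe inputs left out for brevity.

```python
import json

# Hypothetical team policy, not a Litmus default.
MAX_PODS_AFFECTED_PERC = 25


def env_value(experiment, name):
    """Return the value of an env var set on one experiment entry, or None."""
    env = experiment.get("spec", {}).get("components", {}).get("env", [])
    for item in env:
        if item.get("name") == name:
            return item.get("value")
    return None


def blast_radius_violations(engine):
    """List the ways a ChaosEngine manifest breaks the assumed production policy."""
    spec = engine.get("spec", {})
    appinfo = spec.get("appinfo", {})
    problems = []

    # RBAC: the cluster-wide account is sandbox-only.
    if spec.get("chaosServiceAccount") == "litmus-admin":
        problems.append("uses the cluster-wide litmus-admin service account")

    # Target set: must be narrowed, and must stay inside the engine's namespace.
    if not appinfo.get("appns") or not appinfo.get("applabel"):
        problems.append("appinfo.appns and appinfo.applabel must both be set")
    elif appinfo["appns"] != engine.get("metadata", {}).get("namespace"):
        problems.append("targets a namespace other than the engine's own")

    for exp in spec.get("experiments", []):
        name = exp.get("name", "<unnamed>")

        # Cap on how many pods go at once (env values are strings in the manifest).
        perc = env_value(exp, "PODS_AFFECTED_PERC")
        if perc is None or not str(perc).isdigit() or int(perc) > MAX_PODS_AFFECTED_PERC:
            problems.append(
                f"{name}: PODS_AFFECTED_PERC missing or above {MAX_PODS_AFFECTED_PERC}"
            )

        # Auto-halt: require at least one probe that runs throughout the fault.
        probes = exp.get("spec", {}).get("probe", [])
        if not any(p.get("mode") == "Continuous" for p in probes):
            problems.append(f"{name}: no Continuous-mode probe attached")

    return problems


# Example manifest; every name below is illustrative.
engine = {
    "apiVersion": "litmuschaos.io/v1alpha1",
    "kind": "ChaosEngine",
    "metadata": {"name": "checkout-pod-delete", "namespace": "team-checkout"},
    "spec": {
        "engineState": "active",  # set to "stop" to halt a running experiment
        "appinfo": {
            "appns": "team-checkout",
            "applabel": "app=checkout",
            "appkind": "deployment",
        },
        "chaosServiceAccount": "pod-delete-sa",  # bound via Role + RoleBinding
        "experiments": [
            {
                "name": "pod-delete",
                "spec": {
                    "components": {
                        "env": [
                            {"name": "TOTAL_CHAOS_DURATION", "value": "30"},
                            {"name": "PODS_AFFECTED_PERC", "value": "25"},
                        ]
                    },
                    # Probe inputs and run properties omitted in this sketch.
                    "probe": [
                        {"name": "checkout-slo", "type": "httpProbe", "mode": "Continuous"}
                    ],
                },
            }
        ],
    },
}

if __name__ == "__main__":
    for problem in blast_radius_violations(engine):
        print("REJECT:", problem)
    # JSON is accepted wherever a YAML manifest is, so this can be piped to kubectl.
    print(json.dumps(engine, indent=2))
```

A check along these lines could sit in CI in front of the apply step, so that a manifest using litmus-admin or an uncapped pod percentage never reaches the cluster. It complements the namespaced Role and RoleBinding rather than replacing them: RBAC is what actually enforces the boundary, while the check catches mistakes earlier.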

Scoping Litmus Safely: RBAC and Blast Radius

Also worth your time on this topic

Running Your First Chaos Engineering Experiment with Litmus

Litmus Building Blocks: ChaosEngine vs ChaosExperiment
