Question

You have 15 microservices and 4 environments (dev, staging, preprod, prod), and prod runs in 3 regional clusters. That is 6 deployment targets per service, so potentially 90 Application CRs. How do you structure the manifests repo so it doesn't fall apart?

Accepted Answer

At this size, hand-written Application CRs are dead weight. Switch to ApplicationSets generated from the directory layout, push environment differences into structured parameter files, and split the repo into a workloads side and a platform side. Three changes get you there.

First, layout. Two axes: app and environment-or-cluster. I keep the app axis as the primary directory split because most changes happen inside one app.

workloads/
  checkout/
    base/
    overlays/
      dev/
      staging/
      preprod/
      prod-us-east/
      prod-eu-west/
      prod-ap-south/
  ledger/
    base/
    overlays/
      ... same shape
  ...
clusters/
  dev/
    config.yaml          # cluster metadata: name, server, region, labels
  staging/
  preprod/
  prod-us-east/
  prod-eu-west/
  prod-ap-south/
argocd/
  applicationsets/
    workloads.yaml       # one ApplicationSet that fans out across apps x envs
  projects/

Second, one ApplicationSet does the fan-out. A matrix generator combines the app directories under workloads/ (a Git directory generator) with the list of registered clusters. The template produces one Application per (app, cluster) pair and points it at the matching overlay. Add a new app: drop a directory under workloads/. Add a new region: register the cluster and add an overlay for it in each app. Both flows are pure Git, with no duplicated Application YAML.
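A minimal sketch of that ApplicationSet, not a drop-in manifest. The repo URL, the `workloads` project name, and the `fleet`, `env`, and `region` cluster labels are all assumptions made for illustration. It also assumes each cluster is registered in Argo CD under a name that matches its overlay directory (dev, prod-eu-west, and so on).

```yaml
# argocd/applicationsets/workloads.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: workloads
  namespace: argocd
spec:
  goTemplate: true
  goTemplateOptions: ["missingkey=error"]
  generators:
    - matrix:
        generators:
          # One parameter set per app directory: workloads/checkout, workloads/ledger, ...
          - git:
              repoURL: https://git.example.com/platform/manifests.git  # hypothetical URL
              revision: main
              directories:
                - path: workloads/*
          # One parameter set per registered cluster carrying the (hypothetical) fleet label
          - clusters:
              selector:
                matchLabels:
                  fleet: workloads
  template:
    metadata:
      # e.g. checkout-prod-eu-west
      name: '{{.path.basename}}-{{.name}}'
      labels:
        app: '{{.path.basename}}'
        # Assumes every selected cluster secret has env and region labels
        env: '{{.metadata.labels.env}}'
        region: '{{.metadata.labels.region}}'
    spec:
      project: workloads  # hypothetical AppProject
      source:
        repoURL: https://git.example.com/platform/manifests.git
        targetRevision: main
        # Cluster name doubles as the overlay directory name
        path: '{{.path.path}}/overlays/{{.name}}'
      destination:
        server: '{{.server}}'
        namespace: '{{.path.basename}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

Because the matrix emits every (app, cluster) combination, an app that lacks an overlay for some cluster yields an Application that fails to render. Either keep the overlay set complete or narrow the cluster selector.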

Third, push the actual environment differences into well-named files, not into a wall of overlay patches. Each overlay holds a values.yaml or a config.yaml with the parameters that differ: replica count, resource requests, DNS suffix, feature flags. The overlay's kustomization.yaml is a thin shim that pulls in the base, applies a couple of patches, and merges the values file into a ConfigMap. When someone wants to know 'what is different about eu-west prod', they read one file, not five.
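A thin overlay along those lines might look like the sketch below. The file names (`config.yaml`, `replicas-patch.yaml`), the ConfigMap name, and the parameter values are illustrative assumptions, not taken from a real repo.

```yaml
# workloads/checkout/overlays/prod-eu-west/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  # Hypothetical patch file setting replica count and resource requests
  - path: replicas-patch.yaml
    target:
      kind: Deployment
      name: checkout
configMapGenerator:
  # Publishes the parameter file as a ConfigMap key named config.yaml
  - name: checkout-config
    files:
      - config.yaml
---
# workloads/checkout/overlays/prod-eu-west/config.yaml
# The one file to read to see what differs in eu-west prod (values are made up)
dnsSuffix: eu-west.example.com
featureFlags:
  newCheckoutFlow: false
```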

Things that go wrong at this scale if you do not plan for them:

1. Argo CD performance. 90 Applications refreshing against the same Git repo put steady load on the repo server. Configure a Git webhook so commits trigger an immediate refresh, then raise timeout.reconciliation in argocd-cm so periodic polling becomes a fallback rather than the main trigger.

2. Repo server memory. Rendering Kustomize or Helm for 90 Applications eats RAM. Add argocd-repo-server replicas, give them more memory, and consider running a separate Argo CD instance for production clusters so a burst of dev commits cannot starve prod reconciliation.

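3. Blast radius of a base change. A change to workloads/checkout/base/deployment.yaml hits all 6 overlays at once. Sync waves only order resources inside a single Application, so stagger regions another way: use sync windows on the AppProject so prod-eu-west does not sync at the same moment as prod-us-east, or use the ApplicationSet progressive sync (RollingSync) strategy where your Argo CD version supports it. For the riskiest changes, have the prod Applications track a release branch instead of main, so you can promote regions one at a time.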

4. Discoverability. With 90 Applications, the Argo CD UI gets hard to navigate. Use labels on the ApplicationSet template (app, env, region, team) and rely on the UI's label filters instead of scrolling.
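Items 1 and 3 come down to a small amount of configuration. The sketch below shows a longer reconciliation interval in argocd-cm and staggered sync windows on an AppProject. The 600s value, the cron schedules, the project name `workloads-prod`, and the repo URL are assumptions chosen for illustration; pick values that fit your own change cadence.

```yaml
# Longer polling interval, assuming Git webhooks already trigger refreshes
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  timeout.reconciliation: 600s
---
# Staggered sync windows so the prod regions never sync at the same time
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: workloads-prod  # hypothetical project holding the prod Applications
  namespace: argocd
spec:
  sourceRepos:
    - https://git.example.com/platform/manifests.git  # hypothetical URL
  destinations:
    - name: prod-us-east
      namespace: '*'
    - name: prod-eu-west
      namespace: '*'
  syncWindows:
    - kind: allow
      schedule: '0 8 * * *'   # cron expression, evaluated in UTC unless timeZone is set
      duration: 2h
      clusters:
        - prod-us-east
      manualSync: true
    - kind: allow
      schedule: '0 12 * * *'
      duration: 2h
      clusters:
        - prod-eu-west
      manualSync: true
```

With allow windows defined, automated syncs for the matching clusters run only inside their window, which gives each region a separate slot for picking up a base change.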

One pattern I avoid at this size: per-cluster manifests repos. It feels clean and it kills you the first time you need to roll out a security patch to 6 clusters at once. One repo, many Applications, generated from one template. Drift between regions is the enemy.

Scaling a Manifests Repo Across Many Services, Environments, and Clusters
