When Kubernetes Is the Wrong Default
TLDR
Kubernetes has become the default infrastructure choice for new projects, but it's often the wrong decision for teams under 30 engineers. This guide provides a decision framework based on team size, workload characteristics, and operational maturity. Most teams would ship faster with managed platforms like Heroku, Render, or DigitalOcean App Platform. VMs work better for stateful workloads and legacy applications. Kubernetes makes sense when you need multi-region deployment, complex networking, or have dedicated platform engineers.
The industry treats Kubernetes as a default choice. Job postings list it as a requirement. Conference talks assume you're running it. Cloud providers optimize their offerings around it. This creates pressure to adopt Kubernetes even when it slows your team down.
The cost of premature Kubernetes adoption isn't just the learning curve. It's the 3-6 month delay getting your first production deployment right. It's the full-time platform engineer you hire at month 8 when the team realizes they can't maintain it themselves. It's the velocity tax where simple changes require updating Helm charts, waiting for CI/CD pipelines, and debugging pod networking.
This guide gives you a framework for making infrastructure decisions based on what actually matters: team size, workload characteristics, and time to value.
The Three Tiers of Infrastructure Complexity
Infrastructure choices exist on a complexity spectrum. Each tier trades operational burden for control:
MANAGED PLATFORMS VMs KUBERNETES
(Heroku, Render, etc.) (EC2, Droplets) (EKS, GKE, etc.)
Complexity: ██ ████████ ████████████████
Control: ████ ████████████ ████████████████████
Time-to-Ship: 🚀 Days 📦 Weeks ⏳ Months
Team Size: 1-15 5-30 20+
Managed Platforms abstract away infrastructure entirely. You push code, they handle runtime, scaling, SSL, logging, and deployment. Best for web applications, APIs, and background workers.
VMs give you control over the OS, networking, and installed software. You configure once, then deploy applications through standard tooling. Best for stateful workloads, legacy apps, and teams that need OS-level control without orchestration complexity.
Kubernetes provides container orchestration with advanced scheduling, networking, and deployment features. Best for multi-region deployments, complex microservices, or teams with platform engineering resources.
The key insight: moving right on this spectrum doesn't automatically make your infrastructure better. It makes it more flexible and more expensive to operate.
Decision Framework: What Matters More Than You Think
Stop asking "should we use Kubernetes?" and start asking these questions:
1. How Many Engineers Are Building Product?
Team size predicts infrastructure capacity better than anything else:
- 1-10 engineers: You can't afford dedicated platform work. Every hour spent on infrastructure is an hour not building product. Use managed platforms.
- 10-30 engineers: You might have one person spending 50% time on infrastructure. VMs or managed platforms work. Kubernetes requires too much ongoing maintenance.
- 30-60 engineers: Platform work becomes a full-time role. Kubernetes becomes viable if workload characteristics justify it.
- 60+ engineers: Platform team can justify itself. Multiple infrastructure patterns coexist. Kubernetes makes sense for appropriate workloads.
The math is simple: if Kubernetes requires 1.5 full-time engineers to operate, that's 10-15% of a 10-person team's capacity. You can't ship product at that burn rate.
2. What Do Your Workloads Actually Look Like?
Infrastructure should match workload shape, not resume trends:
Managed platforms win for:
- Web applications with stateless HTTP servers
- APIs serving JSON over HTTPS
- Background job processors (Sidekiq, Celery, etc.)
- Scheduled tasks and cron jobs
- Standard databases (Postgres, MySQL, Redis)
VMs work better for:
- Stateful applications with local disk requirements
- Legacy applications built for bare metal/VMs
- Applications requiring specific kernel versions or system libraries
- GPU workloads without Kubernetes expertise
- Applications sensitive to network overhead
Kubernetes justified for:
- Multi-region active-active deployments
- Complex service mesh requirements (mTLS, advanced routing)
- Applications requiring sophisticated autoscaling (HPA + VPA + custom metrics)
- Workloads with complex scheduling constraints (affinity, taints, tolerations)
- Multi-tenancy isolation requirements
If 90% of your workload is "web app + background workers + Postgres," Kubernetes is architectural gold-plating.
3. How Fast Do You Need to Ship?
Time-to-production matters more than most teams admit:
- Managed platform: Push code → production in 5-10 minutes. First deployment Day 1.
- VMs with configuration management: Deploy with Ansible/Terraform. First deployment Week 1-2.
- Kubernetes: Cluster setup, ingress config, cert-manager, secrets management, CI/CD integration, monitoring. First deployment Month 1-3.
That 3-month difference is 25% of a year. For early-stage companies, that's the difference between validating product-market fit and running out of runway.
4. What's Your Operational Maturity?
Be honest about where your team is:
Signals you're NOT ready for Kubernetes:
- No one on the team has operated Kubernetes in production
- You don't have CI/CD pipelines for current infrastructure
- Deployments require manual steps
- You've never handled a production incident requiring deep debugging
- Your monitoring consists of "check if the website is up"
Signals you MIGHT be ready:
- Someone has battle-tested Kubernetes experience (>1 year production operations)
- You have observability stacks deployed (metrics, logs, traces)
- Deployments are automated and reproducible
- You practice incident response and have runbooks
- You've outgrown simpler infrastructure and hit real limitations
Kubernetes won't teach you operational maturity. It will expose every gap in your practices at 3x speed.
Real-World Scenarios: What You Should Actually Choose
Scenario A: 8-Person Startup, Series A, Building SaaS Product
Workload:
- Next.js frontend
- Node.js API backend
- PostgreSQL database
- Redis for caching/sessions
- Background job processing
Wrong choice: Set up EKS with Helm charts, Ingress NGINX, cert-manager, and external-secrets. Cost: 2 engineers, 6 weeks, ongoing 40% time maintenance.
Right choice: Render, Railway, or DigitalOcean App Platform. Deploy frontend + backend + managed Postgres + managed Redis. Cost: 1 engineer, 2 days, <5% ongoing maintenance.
Why: The team needs to validate product-market fit, not build infrastructure. Every week spent on Kubernetes is a week not iterating on the product. The monthly cost difference ($200-500 platform premium vs $150-300 raw compute) is negligible compared to engineer time.
When to switch: At 25-30 engineers, when platform costs reach $3-5K/month and you have someone who can dedicate full-time to infrastructure.
Scenario B: 15-Person Team, High-Traffic API, Predictable Load
Workload:
- Python FastAPI serving 50K requests/minute
- Heavy CPU/memory usage (data processing)
- PostgreSQL with read replicas
- Scheduled batch jobs
Wrong choice: Kubernetes because "we need to scale." The orchestration overhead doesn't match the scaling pattern (predictable, steady traffic).
Right choice: VMs behind a load balancer. Use Terraform to provision 6-8 compute instances, Ansible for configuration, systemd for process management. HAProxy or cloud load balancer in front.
Cost comparison:
- Kubernetes path: $800/month compute + $150 EKS control plane + 60 engineer-hours/month = ~$8,000/month total cost
- VM path: $800/month compute + 8 engineer-hours/month = ~$1,600/month total cost
Why: VMs provide the control and predictability needed. The workload doesn't benefit from Kubernetes features (no complex routing, no need for rapid container churn, no multi-region). Operational simplicity wins.
When to switch: When traffic becomes unpredictable and autoscaling becomes a major operational burden. Or when you expand to multi-region and need orchestration.
Scenario C: 40-Person Team, Microservices, Multi-Region
Workload:
- 15 microservices with independent release cycles
- Multi-region deployment (US, EU, APAC)
- Complex routing requirements (canary deployments, A/B tests)
- Service-to-service authentication requirements
- 2 dedicated platform engineers
Wrong choice: Try to manage this with VMs and shell scripts. The complexity outgrows the tooling.
Right choice: Kubernetes (GKE/EKS) with service mesh (Istio or Linkerd), GitOps (ArgoCD or Flux), centralized observability (Datadog or Grafana stack).
Why: The workload characteristics justify orchestration complexity:
- Multi-region needs sophisticated traffic management
- Independent release cycles benefit from container isolation
- Service mesh solves authentication and observability at scale
- Team size supports dedicated platform investment
Cost: 2 full-time platform engineers + $3-5K/month cloud costs. This is justified because:
- 15 microservices × 3 regions = 45 deployment targets would be unmaintainable with VMs
- Platform team enables 38 product engineers to ship independently
- ROI is clear: platform unlocks velocity, not just "best practices"
Scenario D: 20-Person Team, ML/GPU Workloads, Batch Processing
Workload:
- Training jobs requiring GPU instances
- Inference API with variable traffic
- Data processing pipelines (Spark, Airflow)
- Model registry and versioning
Wrong choice: Force everything into Kubernetes because "that's what the industry uses for ML."
Better choice: Hybrid approach:
- Inference API: Managed platform (Render, Modal, or Replicate) handles scaling and serving
- Training jobs: Orchestrate with Airflow on VMs or use managed ML platforms (SageMaker, Vertex AI)
- Data pipelines: Dedicated compute instances or serverless (Lambda, Cloud Functions)
Why: Kubernetes GPU support is notoriously finicky. Node autoscaling with GPUs takes 5-10 minutes. Managed ML platforms handle this complexity.
When Kubernetes helps: When you have 3+ ML engineers who specifically need Kubernetes features (custom schedulers, Ray clusters, multi-tenancy for different teams).
The Hidden Costs of Kubernetes Nobody Talks About
Beyond the obvious learning curve, Kubernetes imposes ongoing costs:
1. Cognitive Load on Every Engineer
Kubernetes creates a second API layer every engineer must understand:
- Deployments vs StatefulSets vs DaemonSets
- Services vs Ingress vs Gateway API
- ConfigMaps vs Secrets vs External Secrets Operator
- Resource requests vs limits vs QoS classes
- Network policies, pod security policies, admission controllers
Each of these concepts requires training, documentation, and ongoing support. At a 15-person team, that's 1-2 hours per engineer per week = 15-30 hours/week = nearly 1 full-time engineer just answering questions.
2. Increased Deployment Complexity
Simple changes get complicated:
Managed platform:
git push origin main
# Done. Deployed in 3 minutes.
Kubernetes:
# Update Dockerfile
# Update Kubernetes manifests or Helm values
# Update CI/CD pipeline configuration
git push origin main
# Wait for image build (5-10 min)
# Wait for Kubernetes rollout (5-10 min)
# Check pod status, logs, events
# Debug if something goes wrong
# 20-30 minutes minimum, 2 hours if issues
That friction compounds. If deployments take 10x longer, teams deploy less frequently, batching changes, increasing risk.
3. Incident Response Complexity
When things break:
Managed platform: Check application logs, check platform status page. Clear ownership (platform handles infrastructure, you handle app code).
Kubernetes: Is it the application? The pod? The node? The ingress? The network policy? The service mesh sidecar? The CNI plugin? The cloud provider's networking layer?
Incidents that take 15 minutes with simpler infrastructure take 2-4 hours in Kubernetes until your team builds institutional knowledge.
4. Maintenance Burden
Kubernetes clusters require ongoing maintenance:
- Control plane upgrades (quarterly)
- Node OS patching (monthly)
- Add-on updates (cert-manager, ingress controllers, monitoring agents)
- Certificate rotation
- RBAC management
- Security policy updates
Estimate 40-60 engineer-hours/month for a production cluster. That's 30% of one engineer.
Progressive Adoption: How to Grow Into Kubernetes
If you decide Kubernetes is in your future, don't jump straight there:
Phase 1: Managed Platform (0-15 engineers)
- Ship product
- Learn operational basics (monitoring, logging, deployments)
- Validate product-market fit
Phase 2: VMs with Automation (15-30 engineers)
- Terraform for infrastructure
- Ansible or similar for configuration
- Containerize applications (Docker Compose or similar)
- Build observability muscle (metrics, logs, traces)
- Practice incident response
Phase 3: Hybrid Approach (30-50 engineers)
- Move stateless workloads to managed Kubernetes (GKE Autopilot, EKS Fargate)
- Keep databases and stateful apps on VMs or managed services
- Hire someone with production Kubernetes experience
- Invest in platform tooling slowly
Phase 4: Full Kubernetes (50+ engineers)
- Dedicated platform team
- GitOps workflows (ArgoCD/Flux)
- Service mesh if workload justifies it
- Multi-cluster/multi-region setup
Skipping phases doesn't save time. It compounds technical debt and slows velocity.
When Kubernetes Actually Makes Sense
Kubernetes is the right choice when you need features that simpler infrastructure can't provide:
1. Multi-region active-active deployment
Running identical infrastructure in 3+ regions with traffic management, failover, and data synchronization requires orchestration that Kubernetes provides.
2. Complex microservices topologies
When you have 20+ services with independent release cycles, service mesh integration, and sophisticated routing (canary, blue/green, traffic splitting), Kubernetes shines.
3. Multi-tenancy isolation
If you're building a platform where customers deploy their own code (PaaS, CI/CD runners, notebook servers), Kubernetes namespaces and network policies provide isolation.
4. Advanced autoscaling requirements
When you need combined horizontal + vertical + cluster autoscaling based on custom metrics (queue depth, request latency, business metrics), Kubernetes HPA/VPA/Karpenter deliver.
5. Dedicated platform engineering capacity
If you have 2+ full-time engineers whose job is to build internal platforms, Kubernetes becomes a reasonable foundation to build on.
If you don't have 3+ of these signals, simpler infrastructure will ship product faster.
Key Takeaways
Team size predicts infrastructure capacity: Below 30 engineers, Kubernetes diverts too much capacity from product work.
Match infrastructure to workload shape: Most SaaS products are web apps + APIs + background jobs. These don't need orchestration.
Time-to-production matters: Kubernetes adds 1-3 months to first production deployment vs managed platforms.
Operational complexity is ongoing cost: Kubernetes requires 30-50% of one engineer ongoing, plus training overhead for the whole team.
Progressive adoption reduces risk: Grow from managed platforms → VMs → Kubernetes as team size and workload complexity justify it.
Kubernetes should solve real problems: Multi-region, complex microservices, or advanced autoscaling justify complexity. Resume-driven development doesn't.
The goal isn't to avoid Kubernetes forever. It's to adopt complexity when it solves real problems, not when it's fashionable.
Most teams ship faster, learn more, and build better products by starting simple. You can always add complexity later. You can't easily remove it once you're locked in.
We earn commissions when you shop through the links below.
DigitalOcean
Cloud infrastructure for developers
Simple, reliable cloud computing designed for developers
DevDojo
Developer community & tools
Join a community of developers sharing knowledge and tools
Acronis
The most secure backup
Acronis: the most secure backup solution for your data
Pluralsight
Technology skills platform
Expert-led courses in software development, IT ops, data, and cybersecurity
Want to support DevOps Daily and reach thousands of developers?
Become a SponsorFound an issue?