Docker in Production
Learn how to prepare, deploy, and maintain Docker containers in production environments
Moving Docker from development to production requires careful planning and consideration of factors like performance, reliability, security, and scalability. This section covers essential practices for running Docker in production environments and introduces container orchestration platforms that can help manage your containerized applications at scale.
Production Readiness Checklist
Before deploying Docker to production, ensure you've addressed these key areas:
Container Optimization
- Images are minimal in size and have limited attack surface
- Multi-stage builds used to reduce final image size
- Non-root users configured for containers
- Resource limits set for CPU and memory
- Health checks implemented
- Graceful shutdown handling in place
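Several of these items can be addressed directly in the Dockerfile. Here is a minimal sketch for a hypothetical Node.js service (the app name, paths, and port are illustrative, not from a real project):

```dockerfile
# Stage 1: build with the full toolchain
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: minimal runtime image, smaller attack surface
FROM node:20-alpine
WORKDIR /app
# Run as the unprivileged 'node' user that ships with the base image
USER node
COPY --from=build --chown=node:node /app/dist ./dist
COPY --from=build --chown=node:node /app/node_modules ./node_modules
# Let the orchestrator detect an unhealthy container
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD wget -qO- http://localhost:8000/health || exit 1
# Exec-form CMD so SIGTERM reaches the process directly for graceful shutdown
CMD ["node", "dist/server.js"]
```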
Security Considerations
- Container security scanning is part of the pipeline
- All images use specific version tags (never latest)
- Proper network segmentation implemented
- Secrets management in place (no hardcoded secrets)
- Security patches applied promptly
- Container runtime security tools deployed
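Several of these controls can also be applied when a container is started. A sketch using standard docker run flags (the image name is illustrative):

```shell
# Read-only root filesystem with a writable /tmp only, all Linux
# capabilities dropped, privilege escalation blocked, non-root UID.
docker run -d \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --user 1000:1000 \
  myapp:1.2.3
```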
High Availability
- Services designed for horizontal scaling
- Load balancing configured
- Persistent data properly managed with volumes
- Automatic container restarts configured
- Data backup and restore procedures in place
Monitoring and Logging
- Centralized logging system implemented
- Container and host monitoring in place
- Alerting configured for critical thresholds
- Application performance monitoring integrated
Container Orchestration Platforms
For most production workloads, you'll want to use a container orchestration platform rather than managing individual Docker containers.
Docker Swarm
Docker Swarm is Docker's native clustering and orchestration solution. It's easier to set up than alternatives like Kubernetes and works well for smaller deployments.
To initialize a Swarm:
docker swarm init --advertise-addr <MANAGER-IP>
To deploy a service to the Swarm:
docker service create --name web --replicas 3 -p 80:80 nginx:alpine
Key features of Docker Swarm:
- Integrated with Docker
- Simple setup and management
- Rolling updates and rollbacks
- Secret management
- Load balancing
- Service scaling
Example Docker Stack file (similar to Docker Compose but for Swarm):
version: '3.8'

services:
  web:
    image: nginx:alpine
    ports:
      - '80:80'
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    healthcheck:
      test: ['CMD', 'curl', '-f', 'http://localhost']
      interval: 30s
      timeout: 10s
      retries: 3

  api:
    image: myapi:1.2.0
    ports:
      - '8000:8000'
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    secrets:
      - db_password
    environment:
      - DB_HOST=db
      - DB_USER=admin

  db:
    image: postgres:13
    volumes:
      - db-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD_FILE=/run/secrets/db_password
    secrets:
      - db_password
    deploy:
      placement:
        constraints: [node.role == manager]

volumes:
  db-data:

secrets:
  db_password:
    external: true
To deploy the stack:
docker stack deploy -c docker-stack.yml myapp
Kubernetes
Kubernetes has become the industry standard for container orchestration. It offers more advanced features than Docker Swarm but with increased complexity.
A simple Pod definition in Kubernetes:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.21-alpine
      ports:
        - containerPort: 80
      resources:
        limits:
          memory: '128Mi'
          cpu: '500m'
      livenessProbe:
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 3
        periodSeconds: 3
Key features of Kubernetes:
- Advanced scheduling and auto-healing
- Horizontal pod autoscaling
- Ingress and network policies
- StatefulSets for stateful applications
- ConfigMaps and Secrets
- RBAC for access control
- Extensive ecosystem and tooling
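In production you rarely create bare Pods: a Deployment manages a replicated set of Pods and gives you self-healing and rolling updates. A minimal Deployment wrapping the Pod spec above might look like:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21-alpine
          ports:
            - containerPort: 80
```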
Managed Container Services
Many cloud providers offer managed container services that reduce the operational overhead:
- Amazon ECS/EKS: AWS's container services (ECS is proprietary, EKS is Kubernetes)
- Google Kubernetes Engine (GKE): Managed Kubernetes on Google Cloud
- Azure Kubernetes Service (AKS): Managed Kubernetes on Azure
- DigitalOcean Kubernetes: Simplified Kubernetes service
- DigitalOcean App Platform: PaaS for containerized applications
These services handle much of the infrastructure management for you, allowing you to focus on your applications.
Deployment Strategies
How you deploy updates to your containers affects the availability and stability of your applications.
Rolling Updates
Rolling updates gradually replace instances of the old version with the new version:
# Docker Swarm
docker service update --image myapp:2.0 --update-parallelism 1 --update-delay 30s myapp
# Kubernetes
kubectl set image deployment/myapp container=myapp:2.0
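Both platforms can also roll a service back if the new version misbehaves:

```shell
# Docker Swarm: revert the service to its previous spec
docker service rollback myapp

# Kubernetes: watch the rollout, and undo it if needed
kubectl rollout status deployment/myapp
kubectl rollout undo deployment/myapp
```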
Blue-Green Deployment
Blue-green deployment involves running two identical environments, switching traffic when the new version is ready:
- Deploy the new version alongside the existing version
- Test the new version
- Switch traffic from old to new
- Keep the old version running temporarily for quick rollback if needed
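In Kubernetes, one common way to implement the traffic switch is a Service whose selector points at either the blue or the green Deployment. The labels here are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue # change to 'green' to switch traffic
  ports:
    - port: 80
      targetPort: 8000
```

Switching is then a single selector change, for example `kubectl patch service myapp -p '{"spec":{"selector":{"version":"green"}}}'`, and rolling back is the same command with `blue`.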
Canary Deployments
Canary deployments route a small percentage of traffic to the new version before full rollout:
- Deploy the new version with minimal capacity
- Route a small percentage of traffic to it
- Monitor for issues
- Gradually increase traffic to the new version
- Complete the rollout if no issues arise
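Without a service mesh, a rough canary can be approximated in Kubernetes by running two Deployments behind one Service and using the replica ratio as the traffic split (here roughly 10% to the canary; names, labels, and image tags are illustrative):

```yaml
# Stable version: 9 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: myapp
      track: stable
  template:
    metadata:
      labels:
        app: myapp # shared label: the Service selects on 'app' only
        track: stable
    spec:
      containers:
        - name: myapp
          image: myapp:1.0
---
# Canary: 1 replica receives roughly 10% of requests
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      track: canary
  template:
    metadata:
      labels:
        app: myapp
        track: canary
    spec:
      containers:
        - name: myapp
          image: myapp:2.0
```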
High Availability Patterns
Service Discovery
In dynamic container environments, service discovery helps applications find each other:
- Docker Swarm provides DNS-based service discovery automatically
- Kubernetes offers Service resources and CoreDNS
- Third-party tools like Consul or etcd can be used
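For example, a Kubernetes Service gives a stable DNS name and virtual IP to a set of Pods, and any Pod in the cluster can then reach them by name:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
    - port: 8000
      targetPort: 8000
# Other Pods can now reach this service at http://api:8000
# (or api.<namespace>.svc.cluster.local from another namespace).
```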
Load Balancing
Distribute traffic among multiple container instances:
- External load balancers: HAProxy, NGINX, cloud load balancers
- Internal load balancing: Built into orchestration platforms like Kubernetes Services
Example NGINX load balancer configuration:
upstream backend {
    server backend1:8000;
    server backend2:8000;
    server backend3:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
Health Checks and Circuit Breaking
Implement health checks to detect and remove unhealthy containers:
HEALTHCHECK --interval=5s --timeout=3s --retries=3 CMD curl -f http://localhost/health || exit 1
For microservices, implement circuit breaking to handle failures gracefully:
# Istio circuit breaker configuration example
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: api-circuit-breaker
spec:
  host: api
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 10
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 30s
Scaling Containers in Production
Horizontal Scaling
Increase the number of container instances to handle more load:
# Docker Swarm
docker service scale web=5
# Kubernetes
kubectl scale deployment web --replicas=5
Vertical Scaling
Allocate more resources to each container:
# Kubernetes example
resources:
  requests:
    memory: '256Mi'
    cpu: '500m'
  limits:
    memory: '512Mi'
    cpu: '1000m'
Autoscaling
Configure automatic scaling based on metrics:
# Kubernetes Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Storage in Production
Persistent Storage
For data that must survive container restarts and relocations:
# Kubernetes PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Storage Classes
Different applications have different storage requirements:
- Block storage: For databases and applications that need low-latency access
- File storage: For shared access across multiple containers
- Object storage: For large volumes of unstructured data
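In Kubernetes these tiers map to StorageClasses, which a PersistentVolumeClaim references by name. A sketch using DigitalOcean's block storage CSI driver as the provisioner (substitute the driver your platform provides):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block
provisioner: dobs.csi.digitalocean.com # platform-specific CSI driver
reclaimPolicy: Retain                  # keep the volume after the claim is deleted
allowVolumeExpansion: true
```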
Backup and Disaster Recovery
Regular backups are essential:
# Example database backup script for a container
docker exec db-container pg_dump -U postgres mydb > backup_$(date +%Y%m%d).sql
Consider using storage snapshots for point-in-time recovery.
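Backups are only useful alongside a retention policy. A minimal sketch that prunes SQL dumps older than seven days (the directory path is an assumption; adjust it to where your backups land):

```shell
#!/bin/sh
# Delete .sql backups older than 7 days; BACKUP_DIR defaults to ./backups.
BACKUP_DIR="${BACKUP_DIR:-./backups}"
mkdir -p "$BACKUP_DIR"
find "$BACKUP_DIR" -type f -name '*.sql' -mtime +7 -delete
```

Run it from cron (or a Kubernetes CronJob) right after the backup step.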
Production Monitoring
Container Metrics
Monitor key container metrics:
- CPU and memory usage
- Network I/O
- Disk I/O
- Container status and health
Popular monitoring tools:
- Prometheus and Grafana
- Datadog
- New Relic
- Sysdig
Example Prometheus configuration for Docker monitoring:
scrape_configs:
  - job_name: 'docker'
    static_configs:
      - targets: ['localhost:9323']
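The localhost:9323 target assumes the Docker daemon's built-in metrics endpoint has been enabled in /etc/docker/daemon.json (historically gated behind the experimental flag):

```json
{
  "metrics-addr": "127.0.0.1:9323",
  "experimental": true
}
```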
Logging
Centralize logs from all containers:
# Example Docker run with logging configuration
docker run --log-driver=fluentd \
  --log-opt fluentd-address=fluentd-host:24224 \
  --log-opt tag=docker.{{.Name}} \
  myapp:latest
Popular logging stacks:
- ELK (Elasticsearch, Logstash, Kibana)
- Graylog
- Loki with Grafana
Application Performance Monitoring (APM)
Implement tracing to monitor application performance:
- Jaeger
- Zipkin
- Datadog APM
- New Relic APM
Security in Production
Runtime Security
Monitor and enforce security at runtime:
- Container behavior monitoring
- Network activity monitoring
- File integrity monitoring
- Process monitoring
Tools like Falco can help:
# Falco rule example
- rule: Terminal shell in container
  desc: A shell was spawned in a container
  condition: container.id != "" and proc.name = bash
  output: 'Shell spawned in container (container=%container.name)'
  priority: WARNING
Image Scanning in CI/CD
Automate security scanning before deployment:
# GitHub Actions example with Trivy
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'myapp:${{ github.sha }}'
          format: 'table'
          exit-code: '1'
          ignore-unfixed: true
          severity: 'CRITICAL,HIGH'
Network Security
Apply network segmentation and security:
# Kubernetes Network Policy example
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - protocol: TCP
          port: 5432
Compliance and Governance
Image Registry Policies
Enforce policies on which images can be deployed:
- Require images from trusted registries
- Enforce signing and verification
- Block images with critical vulnerabilities
Tagging Strategies
Implement a consistent tagging strategy:
- Use semantic versioning
- Include build information
- Consider using digests for immutability
docker tag myapp:latest myapp:1.2.3-$(git rev-parse --short HEAD)
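To pin an image immutably, resolve its content digest and reference that instead of a mutable tag:

```shell
# Print the immutable digest of a local image
docker inspect --format '{{index .RepoDigests 0}}' myapp:1.2.3
# The resulting name@sha256:... reference can be used in FROM lines
# and deployment manifests so the image can never silently change.
```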
Documentation and Change Management
Maintain documentation of your containerized environment:
- Architecture diagrams
- Deployment procedures
- Troubleshooting guides
- Change logs
Cloud Deployment with DigitalOcean
DigitalOcean provides a user-friendly platform for running container workloads in production:
- DigitalOcean Kubernetes: Managed Kubernetes service
- DigitalOcean App Platform: PaaS for containerized applications
- Droplets with Docker: Self-managed Docker hosts
Sign up with DigitalOcean to get $200 in free credits and deploy your Docker applications to production with confidence.
Disaster Recovery Planning
Container Recovery
Plan for different failure scenarios:
- Container failure → Automatic restart
- Host failure → Orchestrator reschedules containers
- Data corruption → Restore from backups
Multi-Region Deployments
For critical applications, consider multi-region deployments:
- Deploy to multiple data centers/regions
- Implement global load balancing
- Replicate data across regions
Cost Optimization
Resource Utilization
Monitor and optimize resource usage:
- Right-size container resource allocations
- Use autoscaling to match demand
- Consider spot instances for non-critical workloads
Example resource optimization check:
# Find containers with excessive resource allocation
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"
Image Storage Optimization
Optimize image storage costs:
- Clean up unused images regularly
- Implement image retention policies
- Use minimal base images
# Clean up unused images
docker image prune -a --filter "until=24h"
Continuous Learning and Improvement
The container ecosystem evolves rapidly. Stay current by:
- Attending industry events like DockerCon and KubeCon
- Following blogs and newsletters about container technologies
- Participating in community forums and discussion groups
- Reviewing your architecture periodically against best practices
- Testing new technologies in development before production
Final Thoughts
Running Docker in production requires careful planning and attention to detail, but the benefits are substantial: improved consistency, simplified deployments, better resource utilization, and greater scalability.
Start with simple workloads as you build your confidence and expertise. As your team's container knowledge grows, you can gradually tackle more complex deployments.
Remember that containerization is a means to an end: delivering reliable, secure, and scalable applications to your users. Always keep your application's requirements and user needs at the center of your container strategy.
We hope this guide has provided you with a solid foundation for your Docker journey. Happy containerizing!