Load Balancing and High Availability
Distributing traffic across multiple servers and handling failures gracefully to maintain service reliability.
Single points of failure kill applications. When your only web server crashes, everyone gets error pages. When your database goes down, your entire application stops working. Load balancing and high availability techniques distribute traffic and eliminate these single points of failure.
Prerequisites
- Understanding of networking fundamentals and firewalls
- Access to multiple servers or containers for testing
- Basic web server configuration knowledge
Why Load Balancing Matters
Let's start with a simple example. You have one web server handling all your traffic:
Internet Users          Single Web Server

┌─────────┐
│ User 1  │──────┐
└─────────┘      │       ┌──────────┐
┌─────────┐      │       │  nginx   │
│ User 2  │──────┼──────►│  :80     │
└─────────┘      │       │ CPU: 95% │
┌─────────┐      │       │ Memory:  │
│ User 3  │──────┤       │ 4GB/4GB  │
└─────────┘      │       └──────────┘
┌─────────┐      │             │
│ User N  │──────┘             ▼
└─────────┘            ❌ OVERLOADED
As traffic grows, this single server becomes overwhelmed. Load balancing solves this by distributing requests across multiple servers:
Internet Users        Load Balancer        Backend Servers

┌─────────┐                                ┌─────────┐
│ User 1  │──────┐                    ┌───►│ nginx-1 │
└─────────┘      │                    │    │ :8080   │
┌─────────┐      │    ┌─────────┐     │    │ CPU:30% │
│ User 2  │──────┼───►│  nginx  │─────┤    └─────────┘
└─────────┘      │    │  LB     │     │    ┌─────────┐
┌─────────┐      │    │  :80    │     ├───►│ nginx-2 │
│ User 3  │──────┤    │ Routes  │     │    │ :8080   │
└─────────┘      │    │ Traffic │     │    │ CPU:25% │
┌─────────┐      │    │ Evenly  │     │    └─────────┘
│ User N  │──────┘    └─────────┘     │    ┌─────────┐
└─────────┘                           └───►│ nginx-3 │
                                           │ :8080   │
                                           │ CPU:35% │
                                           └─────────┘
Check current connections to your server:
# Check current connections to your server
netstat -an | grep :80 | grep ESTABLISHED | wc -l
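On newer distributions netstat may not be installed; ss reports the same count:
# Same count with ss (-H drops the header line)
ss -Htn state established '( sport = :80 )' | wc -l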
Types of Load Balancing
Layer 4 (Transport Layer): Distributes based on IP addresses and ports
# Client connects to load balancer IP:80
# Load balancer forwards to backend servers
# 192.168.1.10:80 → 10.0.1.100:8080
# 192.168.1.10:80 → 10.0.1.101:8080
Layer 7 (Application Layer): Distributes based on content like HTTP headers or URLs
# Route based on URL path
# /api/* → API servers (10.0.1.100-102)
# /static/* → Static file servers (10.0.2.100-102)
# /* → Web application servers (10.0.1.200-202)
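Here's a minimal nginx sketch of that path-based routing. The upstream names mirror the comments above and are illustrative:
upstream api_servers {
    server 10.0.1.100:8080;
    server 10.0.1.101:8080;
    server 10.0.1.102:8080;
}
upstream static_servers {
    server 10.0.2.100:8080;
    server 10.0.2.101:8080;
    server 10.0.2.102:8080;
}
upstream app_servers {
    server 10.0.1.200:8080;
    server 10.0.1.201:8080;
    server 10.0.1.202:8080;
}
server {
    listen 80;
    # Longest-prefix location match decides which pool serves the request
    location /api/ { proxy_pass http://api_servers; }
    location /static/ { proxy_pass http://static_servers; }
    location / { proxy_pass http://app_servers; }
}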
Setting Up Basic Load Balancing
nginx Load Balancer
nginx makes an excellent HTTP load balancer. Here's a basic configuration:
# /etc/nginx/sites-available/load-balancer
upstream backend_servers {
server 10.0.1.100:8080;
server 10.0.1.101:8080;
server 10.0.1.102:8080;
}
server {
listen 80;
server_name app.example.com;
location / {
proxy_pass http://backend_servers;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Enable and test the configuration:
# Enable the site
sudo ln -s /etc/nginx/sites-available/load-balancer \
/etc/nginx/sites-enabled/
# Test configuration
sudo nginx -t
# Reload nginx
sudo systemctl reload nginx
# Test load balancing
curl -H "Host: app.example.com" http://your-load-balancer
HAProxy Configuration
HAProxy provides more advanced load balancing features:
# /etc/haproxy/haproxy.cfg
global
    daemon
    # Stats socket used by the "show stat" query further down
    stats socket /var/lib/haproxy/stats mode 660 level admin
defaults
mode http
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
frontend web_frontend
bind *:80
default_backend web_servers
backend web_servers
balance roundrobin
option httpchk GET /health
server web1 10.0.1.100:8080 check
server web2 10.0.1.101:8080 check
server web3 10.0.1.102:8080 check
Start HAProxy and test:
# Test configuration
sudo haproxy -f /etc/haproxy/haproxy.cfg -c
# Start HAProxy
sudo systemctl start haproxy
# Check status
echo "show stat" | sudo socat stdio /var/lib/haproxy/stats
Load Balancing Algorithms
Different algorithms distribute traffic in different ways:
Round Robin
Requests go to each server in order:
upstream backend {
server server1.example.com;
server server2.example.com;
server server3.example.com;
}
Request 1 → server1, Request 2 → server2, Request 3 → server3, Request 4 → server1...
Weighted Round Robin
Some servers get more traffic:
upstream backend {
server server1.example.com weight=3;
server server2.example.com weight=2;
server server3.example.com weight=1;
}
Out of 6 requests: 3 to server1, 2 to server2, 1 to server3.
Least Connections
New requests go to the server with fewest active connections:
upstream backend {
least_conn;
server server1.example.com;
server server2.example.com;
server server3.example.com;
}
IP Hash
Requests from the same IP always go to the same server:
upstream backend {
ip_hash;
server server1.example.com;
server server2.example.com;
server server3.example.com;
}
This maintains session affinity without requiring shared session storage.
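Conceptually, ip_hash behaves like this Python sketch: hash the client address and map it onto the server list, so a given IP always lands on the same backend. nginx's actual implementation differs in details (for IPv4 it hashes only the first three octets and it accounts for weights and down servers):
# Conceptual sketch of IP-hash affinity (not nginx's exact algorithm)
import hashlib

servers = ['server1.example.com', 'server2.example.com', 'server3.example.com']

def pick_server(client_ip: str) -> str:
    # A stable hash of the client IP always selects the same backend
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

assert pick_server('203.0.113.7') == pick_server('203.0.113.7')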
Health Checks
Load balancers should only send traffic to healthy servers.
HTTP Health Checks
Configure health check endpoints in your applications:
# Simple health check endpoint
import os

from flask import Flask, jsonify
import psycopg2

app = Flask(__name__)
database_url = os.environ['DATABASE_URL']  # e.g. postgresql://user:pass@db:5432/app

@app.route('/health')
def health_check():
    try:
        # Check database connection
        conn = psycopg2.connect(database_url)
        cursor = conn.cursor()
        cursor.execute('SELECT 1')
        cursor.close()
        conn.close()
        return jsonify({'status': 'healthy'}), 200
    except Exception as e:
        return jsonify({'status': 'unhealthy', 'error': str(e)}), 503

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
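With the app running (port 8080 in the sketch above), verify the endpoint by hand:
curl -i http://localhost:8080/health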
Configure the load balancer to use this endpoint:
upstream backend {
    # Open-source nginx only does passive health checks: after max_fails
    # failed requests, a server is skipped for fail_timeout seconds
    server 10.0.1.100:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.101:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.102:8080 max_fails=3 fail_timeout=30s;
}
# Active health checks (probing /health on a schedule) require
# nginx Plus or external monitoring
HAProxy has built-in health checking:
backend web_servers
option httpchk GET /health
http-check expect status 200
server web1 10.0.1.100:8080 check inter 5s
server web2 10.0.1.101:8080 check inter 5s
server web3 10.0.1.102:8080 check inter 5s
TCP Health Checks
For non-HTTP services, use TCP health checks:
backend database_servers
mode tcp
option tcp-check
server db1 10.0.1.200:5432 check port 5432
server db2 10.0.1.201:5432 check port 5432
SSL Termination
Load balancers can handle SSL encryption, reducing load on backend servers.
SSL Termination at Load Balancer
server {
listen 443 ssl http2;
server_name app.example.com;
ssl_certificate /etc/ssl/certs/app.example.com.crt;
ssl_certificate_key /etc/ssl/private/app.example.com.key;
# Modern SSL configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
location / {
proxy_pass http://backend_servers;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto https;
}
}
SSL Pass-Through
Sometimes you want end-to-end encryption, with TLS terminating on the backend servers rather than the load balancer. nginx's stream module can forward the encrypted bytes untouched (note that the backends then see the load balancer's address as the client IP unless you also enable the PROXY protocol):
stream {
    upstream backend_ssl {
        server 10.0.1.100:443;
        server 10.0.1.101:443;
    }
    server {
        listen 443;
        # No ssl_certificate and no decryption here: the TLS session
        # is passed through to the backends intact
        proxy_pass backend_ssl;
    }
}
Database High Availability
Databases require special consideration for high availability.
Read Replicas
Distribute read queries across multiple database replicas:
# Database connection routing
import random
WRITE_DB = "postgresql://user:pass@primary-db:5432/app"
READ_DBS = [
"postgresql://user:pass@replica1-db:5432/app",
"postgresql://user:pass@replica2-db:5432/app",
"postgresql://user:pass@replica3-db:5432/app"
]
def get_db_connection(read_only=False):
if read_only:
return random.choice(READ_DBS)
else:
return WRITE_DB
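Callers then connect with the string that matches the kind of query:
# Usage: reads can hit any replica, writes go to the primary
import psycopg2

read_conn = psycopg2.connect(get_db_connection(read_only=True))
write_conn = psycopg2.connect(get_db_connection())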
Primary/Secondary Failover
Implement automatic failover for database writes:
# PostgreSQL streaming replication setup
# Primary server: postgresql.conf
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB   # PostgreSQL 13+; older versions use wal_keep_segments

# Standby server (PostgreSQL 12+): create an empty standby.signal file
# in the data directory, then set in postgresql.conf:
primary_conninfo = 'host=primary-db port=5432 user=replicator'
# (PostgreSQL 11 and earlier used recovery.conf with standby_mode = 'on')
Use tools like Patroni for automatic failover:
# patroni.yml
scope: postgres-cluster
name: postgres-node1
restapi:
listen: 0.0.0.0:8008
connect_address: 10.0.1.100:8008
postgresql:
listen: 0.0.0.0:5432
connect_address: 10.0.1.100:5432
data_dir: /var/lib/postgresql/data
bootstrap:
dcs:
postgresql:
parameters:
wal_level: replica
hot_standby: 'on'
max_wal_senders: 5
Container Load Balancing
Container environments have their own load balancing considerations.
Docker Swarm Load Balancing
Docker Swarm provides built-in load balancing:
# Create a service with multiple replicas
# (the stock nginx image listens on port 80 inside the container)
docker service create --name web-service \
  --replicas 3 \
  --publish 80:80 \
  nginx

# Swarm's routing mesh load balances between replicas; the published
# port is reachable on any node, not just the manager
curl http://swarm-manager-ip
Kubernetes Services
Kubernetes Services provide load balancing for pods:
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80   # the stock nginx image listens on port 80
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
          ports:
            - containerPort: 80
Deploy and test:
# Apply configuration
kubectl apply -f web-service.yaml
# Check service status
kubectl get services
# Test load balancing
kubectl get endpoints web-service
Global Load Balancing
For applications with users worldwide, distribute traffic geographically.
DNS-Based Global Load Balancing
Use DNS to route users to the nearest datacenter:
# Route53 geolocation routing example: the same name resolves to a
# different address depending on where the user queries from
app.example.com  A  54.230.1.1   # US East
app.example.com  A  54.230.2.1   # US West
app.example.com  A  54.230.3.1   # Europe
app.example.com  A  54.230.4.1   # Asia
CDN Load Balancing
Content Delivery Networks provide global load balancing:
# CloudFlare configuration
# Users connect to nearest edge location
# Edge locations route to healthy origin servers
Monitoring Load Balancer Performance
Track key metrics to ensure your load balancing is working effectively.
nginx Monitoring
Enable nginx status module:
server {
listen 8080;
server_name localhost;
location /nginx_status {
stub_status on;
access_log off;
allow 127.0.0.1;
deny all;
}
}
Check status:
curl http://localhost:8080/nginx_status
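The response is a handful of counters, along these lines:
Active connections: 3
server accepts handled requests
 120 120 345
Reading: 0 Writing: 1 Waiting: 2
If handled ever falls below accepts, nginx is dropping connections, usually because a resource limit such as worker_connections was hit.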
HAProxy Statistics
Enable HAProxy stats page:
# Add to haproxy.cfg
listen stats
bind *:8404
stats enable
stats uri /stats
stats refresh 5s
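    # "if TRUE" allows admin actions for every visitor; add access
    # control (e.g. stats auth) before exposing this beyond localhost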
stats admin if TRUE
View statistics at http://your-server:8404/stats
Application Metrics
Monitor application-level metrics:
# Track requests per backend server
from collections import defaultdict

request_counts = defaultdict(int)
response_times = defaultdict(list)

def track_request(server, response_time):
    request_counts[server] += 1
    response_times[server].append(response_time)

def get_stats():
    # Servers only appear in response_times once they have at least one
    # sample, so the division below never sees an empty list
    return {
        'request_counts': dict(request_counts),
        'avg_response_times': {
            server: sum(times) / len(times)
            for server, times in response_times.items()
        }
    }
Troubleshooting Load Balancer Issues
Uneven Traffic Distribution
Check if requests are being distributed evenly:
# Monitor backend server logs
tail -f /var/log/nginx/access.log | grep "backend_server_ip"
# Check connection counts per server
netstat -an | grep :8080 | grep ESTABLISHED | wc -l
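To compare per-backend request counts over a whole log file, a short script helps. This sketch assumes your nginx log_format includes $upstream_addr, so backend addresses appear on each line:
# Count requests per backend in an nginx access log that logs $upstream_addr
import re
from collections import Counter

counts = Counter()
with open('/var/log/nginx/access.log') as log:
    for line in log:
        # Backend servers in this setup live on port 8080
        for upstream in re.findall(r'\d+\.\d+\.\d+\.\d+:8080', line):
            counts[upstream] += 1

for server, total in counts.most_common():
    print(f'{server}: {total} requests')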
Session Persistence Problems
If users get logged out randomly, session data might not be shared:
# Solution 1: Use sticky sessions (ip_hash)
upstream backend {
ip_hash;
server server1.example.com;
server server2.example.com;
}
# Solution 2: Use shared session storage
# Configure Redis for session storage
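A minimal sketch of the shared-storage approach, assuming the flask-session and redis packages; every backend reads sessions from the same Redis instance, so any server can handle any user:
# Store Flask sessions in shared Redis instead of per-server memory
import redis
from flask import Flask, session
from flask_session import Session

app = Flask(__name__)
app.config['SESSION_TYPE'] = 'redis'
app.config['SESSION_REDIS'] = redis.from_url('redis://10.0.1.50:6379')
Session(app)

@app.route('/visit')
def visit():
    # Works no matter which backend the load balancer picks
    session['visits'] = session.get('visits', 0) + 1
    return {'visits': session['visits']}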
Health Check Failures
If healthy servers are marked as down:
# Check health check endpoint manually
curl -I http://backend-server:8080/health
# Review health check configuration
# Ensure timeouts are appropriate for your application
Disaster Recovery Planning
High availability extends beyond load balancing to complete disaster recovery.
Multi-Region Deployment
Deploy applications across multiple geographic regions:
# Primary region: us-east-1
# Secondary region: us-west-2
# Tertiary region: eu-west-1
# DNS failover routes traffic to healthy regions
Data Replication
Ensure data is replicated across regions:
# Database replication
# File storage replication
# Configuration management
Automated Failover
Implement automated failover procedures:
#!/bin/bash
# Simple failover script: if the primary fails its health check and the
# secondary is healthy, repoint DNS at the secondary
PRIMARY_HEALTH_URL="http://primary.example.com/health"
SECONDARY_HEALTH_URL="http://secondary.example.com/health"

if ! curl -fsS --max-time 10 "$PRIMARY_HEALTH_URL" >/dev/null; then
    if curl -fsS --max-time 10 "$SECONDARY_HEALTH_URL" >/dev/null; then
        echo "Primary unhealthy, switching DNS to secondary"
        # Update DNS records to point to secondary
        aws route53 change-resource-record-sets --hosted-zone-id Z123 \
            --change-batch file://failover.json
    else
        echo "Primary and secondary both unhealthy; not failing over"
    fi
fi
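Run a check like this from cron or a systemd timer on a host outside the primary region, so the watcher doesn't fail together with the infrastructure it's watching.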
In the next section, we'll explore container networking - how Docker and Kubernetes handle load balancing and service discovery in containerized environments.
Load balancing is about more than just distributing traffic. It's about creating resilient systems that gracefully handle failures and provide consistent performance for your users.
Happy load balancing!