Working with Docker Volumes

Manage persistent data storage for your Docker containers

Containers are ephemeral by nature: any data written to a container's writable layer is lost when the container is removed. For applications that need to persist data, Docker provides volumes. Volumes are the preferred mechanism for persisting data generated and used by Docker containers.

Understanding Docker Storage Options

Docker provides several storage options:

  1. Volumes: Docker-managed storage outside the container's filesystem
  2. Bind mounts: Mount a file or directory from the host into the container
  3. tmpfs mounts: Store data in the host system's memory (volatile)

Volumes are the recommended approach because they:

  • Are completely managed by Docker
  • Do not depend on the host machine's directory structure, unlike bind mounts
  • Can be backed up and restored easily
  • Work on both Linux and Windows containers
  • Can be more safely shared among multiple containers

Creating and Managing Volumes

Creating a Volume

To create a named volume:

docker volume create my-data

This creates a volume that persists independently of any container.

Listing Volumes

To see all existing volumes:

docker volume ls

Example output:

DRIVER    VOLUME NAME
local     my-data

Inspecting Volumes

To get detailed information about a volume:

docker volume inspect my-data

This returns a JSON object with information about the volume:

[
  {
    "CreatedAt": "2023-05-16T22:44:35Z",
    "Driver": "local",
    "Labels": {},
    "Mountpoint": "/var/lib/docker/volumes/my-data/_data",
    "Name": "my-data",
    "Options": {},
    "Scope": "local"
  }
]
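
When a script needs only one field from that output, the --format flag accepts a Go template. The following is a minimal sketch that assumes the my-data volume from above exists; the helper name volume_mountpoint is made up for this illustration:

```shell
# Print only the host path where Docker stores a volume's data.
# volume_mountpoint is a hypothetical helper name, not a Docker command.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Usage: volume_mountpoint my-data
```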

Removing Volumes

To remove a volume:

docker volume rm my-data

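To remove unused volumes (on Docker Engine 23.0 and later, this removes only unused anonymous volumes unless you add the --all flag):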

docker volume prune
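
Because pruning cannot be undone, it can help to preview the candidates first. The sketch below lists volumes that no container references, using the dangling=true filter; the function name is an assumption of this example:

```shell
# List the names of volumes not referenced by any container.
# list_unused_volumes is a hypothetical helper name.
list_unused_volumes() {
  docker volume ls --quiet --filter dangling=true
}

# Usage: list_unused_volumes
```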

Using Volumes with Containers

You attach a volume to a container using the -v or --mount flag when running a container.

Using a Named Volume

docker run -d --name nginx-with-data -v my-data:/usr/share/nginx/html nginx:latest

This command:

  1. Uses (or creates if it doesn't exist) a volume named my-data
  2. Mounts it at /usr/share/nginx/html inside the container
  3. Stores any data written to that path in the volume

Using the --mount Flag

The newer --mount syntax offers a more explicit alternative:

docker run -d --name nginx-with-mount --mount source=my-data,target=/usr/share/nginx/html nginx:latest

Both approaches have the same effect, but --mount has a clearer syntax for more complex configurations.
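
To illustrate how --mount spells out each setting as a key=value pair, here is a small sketch that assembles a mount specification for a named volume, optionally read-only. The helper mount_spec and the container name nginx-readonly are hypothetical; the keys type, source, target, and readonly are the ones --mount accepts for volumes:

```shell
# Build a --mount specification for a named volume.
# mount_spec is a hypothetical helper; pass "ro" as the third argument
# to request a read-only mount.
mount_spec() {
  spec="type=volume,source=$1,target=$2"
  if [ "${3:-}" = "ro" ]; then
    spec="$spec,readonly"
  fi
  printf '%s\n' "$spec"
}

# Usage:
#   docker run -d --name nginx-readonly \
#     --mount "$(mount_spec my-data /usr/share/nginx/html ro)" \
#     nginx:latest
```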

Practical Examples with Volumes

Example 1: Persistent Database

Let's run a MySQL container with persistent data storage:

# Create a volume for MySQL data
docker volume create mysql-data

# Run MySQL with the volume
docker run -d \
  --name mysql-db \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  -e MYSQL_DATABASE=myapp \
  -e MYSQL_USER=myuser \
  -e MYSQL_PASSWORD=mypassword \
  -v mysql-data:/var/lib/mysql \
  mysql:8.0

Now, even if you remove the container, your database data will remain in the mysql-data volume:

# Remove the container
docker rm -f mysql-db

# Create a new container using the same volume
docker run -d \
  --name mysql-db-new \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  -v mysql-data:/var/lib/mysql \
  mysql:8.0

The new container will have all the data from the previous one.
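
One way to check this is to list the databases in the new container. The sketch below assumes the root password from the example above and that MySQL has had a few seconds to start; show_databases is a hypothetical helper name. The myapp database created by the first container should appear in the output.

```shell
# List databases in a running MySQL container to confirm data survived.
# show_databases is a hypothetical helper; the password matches the
# MYSQL_ROOT_PASSWORD used in the example above.
show_databases() {
  docker exec "$1" mysql -uroot -pmy-secret-pw -e 'SHOW DATABASES;'
}

# Usage: show_databases mysql-db-new
```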

Example 2: Web Application with Configuration

For a web server with custom configuration:

# Create volumes for content and configuration
docker volume create web-content
docker volume create web-config

# Run nginx with both volumes
docker run -d \
  --name web-server \
  -p 8080:80 \
  -v web-content:/usr/share/nginx/html \
  -v web-config:/etc/nginx/conf.d \
  nginx:latest

Now you can populate these volumes with your content and configuration.
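
One simple way to do that is docker cp, which copies files from the host into a container path; since that path is backed by a volume, the files end up in the volume. A minimal sketch, assuming the web-server container above is running (publish_page is a hypothetical helper name):

```shell
# Copy a local file into the web-server container. The target directory
# is backed by the web-content volume, so the file is stored there.
# publish_page is a hypothetical helper name.
publish_page() {
  docker cp "$1" web-server:/usr/share/nginx/html/
}

# Usage: publish_page ./index.html
```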

Using Bind Mounts for Development

During development, it's often useful to mount local code directories into containers for real-time editing:

docker run -d \
  --name dev-environment \
  -p 3000:3000 \
  -v "$(pwd)":/app \
  node:18-alpine \
  sh -c "cd /app && npm install && npm start"

This mounts your current directory into the /app directory in the container.

For Windows PowerShell, use:

docker run -d `
  --name dev-environment `
  -p 3000:3000 `
  -v ${PWD}:/app `
  node:18-alpine `
  sh -c "cd /app && npm install && npm start"

Read-Only Volumes

For added security, you can mount volumes as read-only:

docker run -d \
  --name secure-nginx \
  -v web-content:/usr/share/nginx/html:ro \
  -p 8080:80 \
  nginx:latest

The :ro suffix makes the volume read-only within the container.

Volume Drivers for Remote Storage

Docker supports various volume drivers that enable storing data on remote systems:

# Create a volume using a specific driver
docker volume create --driver=rexray/ebs \
  --opt size=10 \
  --opt volumetype=gp2 \
  remote-storage

# Use the volume with a container
docker run -d \
  --name remote-data-app \
  -v remote-storage:/data \
  myapp:latest

This example uses the third-party rexray/ebs driver to create an Amazon EBS volume. It assumes the plugin is already installed and configured with AWS credentials.

Backup and Restore with Volumes

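You can back up a volume by running a temporary container that archives its contents. For a database volume, stop the container that writes to it first so the archive is consistent: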

# Backup
docker run --rm \
  -v mysql-data:/source:ro \
  -v "$(pwd)":/backup \
  alpine:latest \
  tar czvf /backup/mysql-backup.tar.gz -C /source .

And restore the backup to a new volume:

# Create a new volume
docker volume create mysql-data-restored

# Restore the backup
docker run --rm \
  -v mysql-data-restored:/target \
  -v "$(pwd)":/backup \
  alpine:latest \
  sh -c "tar xzvf /backup/mysql-backup.tar.gz -C /target"
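
For regular backups, the same command can be wrapped in a function that stamps each archive with the current date. This is a sketch under the assumptions of the example above (an alpine helper container and a host directory for the archives); backup_volume is a hypothetical helper name:

```shell
# Archive a volume into a dated tarball in the given host directory.
# backup_volume is a hypothetical helper name.
backup_volume() {
  volume="$1"
  dest_dir="$2"
  archive="$volume-$(date +%Y%m%d).tar.gz"
  docker run --rm \
    -v "$volume":/source:ro \
    -v "$dest_dir":/backup \
    alpine:latest \
    tar czf "/backup/$archive" -C /source .
}

# Usage: backup_volume mysql-data "$(pwd)"
```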

Data Volume Containers

Another pattern is to create a container specifically for holding data volumes, then mount its volumes in other containers:

# Create a data container
docker create \
  --name data-container \
  -v data-volume:/data \
  alpine:latest \
  /bin/true

# Use volumes from the data container
docker run -d \
  --name app1 \
  --volumes-from data-container \
  myapp:latest

The advantage is that you can manage related volumes together.

Volume Lifecycle Management

Volumes are not deleted automatically when a container is removed, which is usually what you want for persistent data. Anonymous volumes are the exception when you start a container with the --rm flag: Docker removes them together with the container when it exits:

docker run --rm -v /data alpine:latest echo "This container creates and removes an anonymous volume"

For named volumes, you need to manually remove them as shown earlier.
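
For a container that was started without --rm, the docker rm command has a --volumes (-v) flag that removes the container's anonymous volumes along with it. A minimal sketch, with a hypothetical helper name:

```shell
# Remove a stopped container together with its anonymous volumes.
# Named volumes are left in place. remove_with_anonymous_volumes is a
# hypothetical helper name.
remove_with_anonymous_volumes() {
  docker rm --volumes "$1"
}

# Usage: remove_with_anonymous_volumes my-container
```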

Sharing Volumes Between Containers

Multiple containers can use the same volume simultaneously:

# Create a shared volume
docker volume create shared-data

# Run container 1
docker run -d \
  --name app1 \
  -v shared-data:/data \
  myapp:latest

# Run container 2
docker run -d \
  --name app2 \
  -v shared-data:/data \
  myapp:latest

Both containers have access to the same data in the volume.
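
The sharing can be observed with two short-lived containers: one writes a file into the volume and the other reads it back. This sketch assumes the shared-data volume above; the helper names and the file name message.txt are made up for the illustration. Note that Docker does not coordinate concurrent writes, so applications that write to the same files need their own locking.

```shell
# Write a message into the shared volume from a temporary container.
# The message is passed as a positional argument to the inner shell.
write_message() {
  docker run --rm -v shared-data:/data alpine:latest \
    sh -c 'echo "$1" > /data/message.txt' sh "$1"
}

# Read the message back from a second container that mounts the same
# volume read-only.
read_message() {
  docker run --rm -v shared-data:/data:ro alpine:latest \
    cat /data/message.txt
}

# Usage:
#   write_message "hello from container 1"
#   read_message
```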

Cloud-Based Volume Storage

When running containers in the cloud, you can use platform-specific volume solutions. DigitalOcean's Block Storage provides SSD-based volumes that can be attached to your Droplets running Docker.


tmpfs Mounts for Sensitive Data

For sensitive data that you don't want to persist, you can use tmpfs mounts, which store data in the host's memory and are available only on Linux hosts:

docker run -d \
  --name secure-app \
  --tmpfs /app/secrets:rw,noexec,nosuid,size=10M \
  myapp:latest

This creates a tmpfs mount at /app/secrets with specific options:

  • rw: Read-write access
  • noexec: Cannot execute binaries from this mount
  • nosuid: Ignores setuid and setgid bits on files in this mount
  • size=10M: Limits the size to 10 MiB

Best Practices for Docker Volumes

  1. Use named volumes: Named volumes are easier to track and manage than anonymous volumes.

  2. Document volume usage: Keep track of which volumes are used for what purpose.

  3. Back up important data: Regularly back up critical data volumes.

  4. Consider read-only access: When a container only needs to read data, mount the volume as read-only for added security.

  5. Set volume ownership: For applications running as non-root users, ensure proper permissions on volume data:

    docker run --rm -v my-data:/data alpine chown -R 1000:1000 /data
    
  6. Use volume labels: Add metadata to volumes with labels:

    docker volume create --label project=myapp --label environment=production app-data
    
  7. Separate data from application: This separation allows you to update the application without affecting the data.
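
The labels from item 6 can then be used to find related volumes. A minimal sketch using the label filter of docker volume ls (volumes_with_label is a hypothetical helper name):

```shell
# List the names of volumes carrying a given label.
# volumes_with_label is a hypothetical helper name.
volumes_with_label() {
  docker volume ls --quiet --filter "label=$1"
}

# Usage: volumes_with_label project=myapp
```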

In the next section, we'll cover Docker networking to enable communication between containers.
