Docker

Why This Matters

It is 2 AM and the pager goes off. The application that works perfectly on every developer's laptop is crashing in production. "It works on my machine" has become a meme because the gap between development and production environments has caused more outages than anyone can count. Different library versions, missing configuration files, subtle OS differences -- the list goes on.

Docker solved this problem by packaging an application together with everything it needs -- libraries, dependencies, configuration, runtime -- into a single portable unit called a container image. If it runs in a Docker container on your laptop, it will run the same way on your production server, your colleague's machine, or a CI/CD pipeline.

Docker did not invent containerization (LXC existed before it, and namespaces/cgroups are kernel features we covered in Chapter 62), but Docker made containers accessible. It gave the world a simple CLI, a standardized image format, and a public registry (Docker Hub) that transformed how software is built, shipped, and run.


Try This Right Now

If Docker is already installed on your system:

$ docker run --rm hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
...

That single command just:

  1. Checked for the hello-world image locally
  2. Downloaded it from Docker Hub (if not found locally)
  3. Created a container from that image
  4. Ran the container (which printed the message)
  5. Removed the container (--rm flag)

If Docker is not installed, the next section walks you through installation.


Docker Architecture

Docker uses a client-server architecture:

┌──────────────────────────────────────────────────────────────┐
│                                                              │
│   docker CLI ──────────► dockerd (daemon) ──────► containerd │
│   (client)                (Docker Engine)          │         │
│       │                        │                   │         │
│       │ REST API               │                   ▼         │
│       │ (unix socket)          │               runc          │
│       │                        │           (OCI runtime)     │
│       │                        │                             │
│       │                   ┌────┴────┐                        │
│       │                   │ Images  │                        │
│       │                   │ Volumes │                        │
│       │                   │Networks │                        │
│       │                   └─────────┘                        │
│       │                                                      │
│       └──────────────► Docker Hub / Registry                 │
│                        (pull/push images)                    │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Key components:

  • docker CLI -- the command-line tool you interact with
  • dockerd -- the Docker daemon that manages containers, images, volumes, and networks
  • containerd -- the container runtime that manages container lifecycle
  • runc -- the low-level OCI runtime that actually creates containers using Linux namespaces and cgroups
  • Docker Hub -- the default public image registry

The docker CLI communicates with dockerd via a Unix socket at /var/run/docker.sock.
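
You can talk to that socket directly. Here is a minimal Python sketch (assuming the default socket path; a running Docker daemon is needed to get real data back) that queries the same /version endpoint the CLI uses:

```python
import http.client
import json
import os
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a Unix socket -- the same transport the docker CLI uses."""

    def __init__(self, sock_path):
        super().__init__("localhost")
        self.sock_path = sock_path

    def connect(self):
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        s.connect(self.sock_path)
        self.sock = s


def docker_version(sock_path="/var/run/docker.sock"):
    """Return the daemon's /version response, or None if the socket is absent."""
    if not os.path.exists(sock_path):
        return None
    conn = UnixHTTPConnection(sock_path)
    conn.request("GET", "/version")
    return json.loads(conn.getresponse().read())


info = docker_version()
print(info["Version"] if info else "Docker socket not found")
```

Everything the docker CLI does -- run, build, logs -- is ultimately a REST call like this one against dockerd.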


Installing Docker

Debian/Ubuntu

# Remove old versions
$ sudo apt remove docker docker-engine docker.io containerd runc 2>/dev/null

# Install prerequisites
$ sudo apt update
$ sudo apt install -y ca-certificates curl gnupg

# Add Docker's official GPG key
$ sudo install -m 0755 -d /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
    sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add the repository
$ echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker
$ sudo apt update
$ sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

# Start and enable Docker
$ sudo systemctl enable --now docker

# Add your user to the docker group (log out and back in after)
$ sudo usermod -aG docker $USER

Fedora/RHEL

# Add Docker repo
$ sudo dnf config-manager --add-repo \
    https://download.docker.com/linux/fedora/docker-ce.repo

# Install Docker
$ sudo dnf install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

# Start and enable
$ sudo systemctl enable --now docker

# Add your user to docker group
$ sudo usermod -aG docker $USER

Distro Note: On Arch Linux, install with sudo pacman -S docker, then enable the daemon with sudo systemctl enable --now docker.

Safety Warning: Adding a user to the docker group grants them root-equivalent access to the system. The Docker daemon runs as root, and anyone who can talk to it can mount the host filesystem, access any file, or escalate privileges. In multi-user environments, consider rootless Docker or Podman instead.

Verify the installation:

$ docker version
$ docker info
$ docker run --rm hello-world

Images vs Containers

This distinction is fundamental:

  • An image is a read-only template containing the application, libraries, and filesystem. Think of it as a class in object-oriented programming.
  • A container is a running instance of an image. Think of it as an object (instance of a class).

Image (read-only template)          Container (running instance)
┌─────────────────────┐            ┌─────────────────────┐
│  Layer 4: App code  │            │  Writable layer     │ ← changes here
├─────────────────────┤            ├─────────────────────┤
│  Layer 3: pip install│           │  Layer 4: App code  │
├─────────────────────┤            ├─────────────────────┤
│  Layer 2: apt install│           │  Layer 3: pip install│
├─────────────────────┤            ├─────────────────────┤
│  Layer 1: Ubuntu base│           │  Layer 2: apt install│
└─────────────────────┘            ├─────────────────────┤
                                   │  Layer 1: Ubuntu base│
You can create many                └─────────────────────┘
containers from one image.         Has its own writable layer.

Images are built from layers. Each layer represents a filesystem change. Layers are shared between images, saving disk space and download time.
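
The savings are easy to model. A toy sketch (image and layer names are made up) that counts each shared layer once, the way Docker's content-addressed layer store does:

```python
def total_disk_usage(images):
    """Sum layer sizes, counting each unique layer digest only once --
    a simplified model of how Docker's layer store deduplicates storage."""
    unique = {}
    for layers in images.values():
        for digest, size_mb in layers:
            unique[digest] = size_mb
    return sum(unique.values())


# Two hypothetical images that share the same base and apt layers
images = {
    "web-app": [("sha-base", 78), ("sha-apt", 120), ("sha-web", 15)],
    "worker":  [("sha-base", 78), ("sha-apt", 120), ("sha-worker", 9)],
}

naive = sum(size for layers in images.values() for _, size in layers)
print(f"naive total: {naive} MB, actual on disk: {total_disk_usage(images)} MB")
# naive total: 420 MB, actual on disk: 222 MB
```

Pulling the second image downloads only the 9 MB layer it does not already have.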

# List local images
$ docker images
REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
ubuntu        22.04     a8780b506fa4   2 weeks ago    77.8MB
nginx         latest    a6bd71f48f68   3 weeks ago    187MB
hello-world   latest    9c7a54a9a43c   6 months ago   13.3kB

# List running containers
$ docker ps

# List all containers (including stopped)
$ docker ps -a

Running Containers

Basic docker run

# Run Ubuntu interactively
$ docker run -it ubuntu:22.04 bash
root@a1b2c3d4:/# cat /etc/os-release
root@a1b2c3d4:/# exit

# Run in the background (detached)
$ docker run -d --name my-nginx -p 8080:80 nginx

# Now access http://localhost:8080 in your browser

# View logs
$ docker logs my-nginx

# Follow logs in real-time
$ docker logs -f my-nginx

# Execute a command in a running container
$ docker exec -it my-nginx bash
root@e5f6g7h8:/# nginx -v
root@e5f6g7h8:/# exit

# Stop the container
$ docker stop my-nginx

# Remove the container
$ docker rm my-nginx

Key docker run flags:

Flag                        Meaning
-it                         Interactive + TTY (for shell access)
-d                          Detached (run in background)
--name                      Give the container a name
-p 8080:80                  Map host port 8080 to container port 80
--rm                        Remove container when it stops
-e VAR=value                Set environment variable
-v /host:/container         Bind mount a directory
--memory=256m               Memory limit
--cpus=0.5                  CPU limit (half a core)
--restart=unless-stopped    Restart policy

Think About It: When you run docker run -p 8080:80 nginx, Docker creates a network namespace for the container with its own network stack. Port 80 inside the container's namespace is mapped to port 8080 on the host via iptables rules. The -p flag is networking namespace plumbing made simple.


The Dockerfile

A Dockerfile is a text file containing instructions to build an image. Each instruction creates a layer.

Anatomy of a Dockerfile

# Start from a base image
FROM python:3.12-slim

# Set metadata
LABEL maintainer="you@example.com"
LABEL description="A simple Python web application"

# Set the working directory inside the container
WORKDIR /app

# Copy dependency file first (for cache efficiency)
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code
COPY . .

# Create a non-root user
RUN useradd --create-home appuser
USER appuser

# Document which port the app uses
EXPOSE 8000

# Define the startup command
CMD ["python", "app.py"]

Key Dockerfile instructions:

Instruction    Purpose
FROM           Base image to build upon
RUN            Execute a command during build (creates a layer)
COPY           Copy files from host into the image
ADD            Like COPY but can handle URLs and tar extraction
WORKDIR        Set the working directory for subsequent instructions
ENV            Set environment variables
EXPOSE         Document which port the app listens on
CMD            Default command to run when container starts
ENTRYPOINT     Configure the container to run as an executable
USER           Set the user to run subsequent commands as
VOLUME         Create a mount point for persistent data
ARG            Define build-time variables

ENTRYPOINT vs CMD

This catches many people off guard:

  • CMD provides default arguments that can be overridden: docker run myimage /bin/sh replaces the CMD.
  • ENTRYPOINT sets the main executable that always runs. CMD then provides default arguments to it.

# CMD only -- easy to override
CMD ["python", "app.py"]
# docker run myimage              → python app.py
# docker run myimage /bin/bash    → /bin/bash (CMD replaced)

# ENTRYPOINT + CMD -- flexible and robust
ENTRYPOINT ["python"]
CMD ["app.py"]
# docker run myimage              → python app.py
# docker run myimage test.py      → python test.py (CMD replaced, ENTRYPOINT stays)
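
The resolution rule fits in a few lines. A simplified model (exec-form only; not Docker's actual implementation):

```python
def effective_command(entrypoint, cmd, run_args):
    """How Docker derives a container's command: arguments passed to
    `docker run` replace CMD, and ENTRYPOINT is always prepended."""
    return (entrypoint or []) + (run_args or cmd or [])


# CMD only: run arguments replace the whole command
assert effective_command(None, ["python", "app.py"], None) == ["python", "app.py"]
assert effective_command(None, ["python", "app.py"], ["/bin/bash"]) == ["/bin/bash"]

# ENTRYPOINT + CMD: run arguments replace CMD, ENTRYPOINT stays
assert effective_command(["python"], ["app.py"], None) == ["python", "app.py"]
assert effective_command(["python"], ["app.py"], ["test.py"]) == ["python", "test.py"]
```

(You can still override ENTRYPOINT itself with the --entrypoint flag, which this model ignores.)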

Hands-On: Build an Image

Create a simple Python application:

$ mkdir -p ~/docker-demo && cd ~/docker-demo

Create app.py:

from http.server import HTTPServer, SimpleHTTPRequestHandler
import os

class Handler(SimpleHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        hostname = os.uname().nodename
        self.wfile.write(f"Hello from container {hostname}\n".encode())

if __name__ == '__main__':
    server = HTTPServer(('0.0.0.0', 8000), Handler)
    print("Server running on port 8000...")
    server.serve_forever()

Create requirements.txt (empty for this example):

# No external dependencies

Create Dockerfile:

FROM python:3.12-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

RUN useradd --create-home appuser
USER appuser

EXPOSE 8000
CMD ["python", "app.py"]

Build and run:

# Build the image
$ docker build -t my-python-app .

# Watch the layers being built
Step 1/9 : FROM python:3.12-slim
 ---> a1b2c3d4e5f6
Step 2/9 : WORKDIR /app
 ---> Running in f6e5d4c3b2a1
...
Successfully built 9a8b7c6d5e4f
Successfully tagged my-python-app:latest

# Run it
$ docker run -d --name myapp -p 8000:8000 my-python-app

# Test it
$ curl http://localhost:8000
Hello from container a1b2c3d4e5f6

# View the layers
$ docker history my-python-app
IMAGE          CREATED          CREATED BY                                      SIZE
9a8b7c6d5e4f   30 seconds ago   CMD ["python" "app.py"]                         0B
...

# Clean up
$ docker stop myapp && docker rm myapp

Docker Compose

Docker Compose lets you define and run multi-container applications with a single YAML file.

docker-compose.yml Anatomy

# docker-compose.yml
services:
  web:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgres://appuser:secretpassword@db:5432/myapp
    depends_on:
      - db
    restart: unless-stopped

  db:
    image: postgres:16
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: secretpassword
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    restart: unless-stopped

volumes:
  pgdata:

Hands-On: Running a Compose Stack

$ mkdir -p ~/compose-demo && cd ~/compose-demo

Create a docker-compose.yml:

services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
    volumes:
      - ./html:/usr/share/nginx/html:ro
    depends_on:
      - api

  api:
    image: python:3.12-slim
    working_dir: /app
    command: python -m http.server 5000
    expose:
      - "5000"

# Create content for the web server
$ mkdir -p html
$ echo "<h1>Hello from Docker Compose</h1>" > html/index.html

# Start all services
$ docker compose up -d

# View running services
$ docker compose ps

# View logs from all services
$ docker compose logs

# View logs from one service
$ docker compose logs web

# Stop all services
$ docker compose down

# Stop and remove volumes too
$ docker compose down -v

Common Docker Compose commands:

$ docker compose up -d          # Start in background
$ docker compose down           # Stop and remove containers
$ docker compose ps             # List running services
$ docker compose logs -f        # Follow logs
$ docker compose exec web sh    # Shell into a running service
$ docker compose build          # Rebuild images
$ docker compose pull           # Pull latest images
$ docker compose restart        # Restart all services

Volumes and Bind Mounts

Containers are ephemeral. When a container is removed, any data written inside it is lost. Volumes solve this.

Bind Mount:                         Named Volume:
(host path → container path)        (Docker-managed storage)

Host filesystem                     Docker storage area
/home/user/data/  ──────────►       /var/lib/docker/volumes/
                   mount              mydata/_data/  ──────────►
                                                      mount
Container sees:                     Container sees:
/app/data/                          /app/data/

# Named volume (Docker manages the storage location)
$ docker volume create mydata
$ docker run -d -v mydata:/app/data my-app

# Bind mount (you specify the host path)
$ docker run -d -v /home/user/config:/app/config:ro my-app
#                                                ^^
#                                          read-only mount

# List volumes
$ docker volume ls

# Inspect a volume
$ docker volume inspect mydata

# Remove unused volumes
$ docker volume prune

Safety Warning: Bind mounts give the container access to host files. A container running as root with a bind mount to / has full access to your host filesystem. Always use the :ro (read-only) flag unless write access is truly needed.


Docker Networking

Docker creates several networks by default:

$ docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
a1b2c3d4e5f6   bridge    bridge    local
f6e5d4c3b2a1   host      host      local
9a8b7c6d5e4f   none      null      local
Network    Description
bridge     Default. Containers get their own IP on a private network. Accessed via port mapping.
host       Container shares the host's network stack. No isolation, but no port mapping needed.
none       No networking. Container is completely isolated.

Bridge Network (default):

┌─────────────────────────────────────────────────┐
│  Host                                            │
│                                                  │
│  ┌───────────┐    ┌───────────┐                  │
│  │Container A│    │Container B│                  │
│  │172.17.0.2 │    │172.17.0.3 │                  │
│  └─────┬─────┘    └─────┬─────┘                  │
│        │                │                        │
│  ──────┴────────────────┴───────                 │
│        docker0 bridge (172.17.0.1)               │
│                │                                  │
│        NAT (iptables masquerade)                 │
│                │                                  │
│          eth0 (host NIC)                         │
└────────────────┼────────────────────────────────┘
                 │
            Internet

User-Defined Bridge Networks

The default bridge network does not provide DNS resolution between containers. Create a user-defined network for that:

# Create a custom network
$ docker network create mynet

# Run containers on the custom network
$ docker run -d --name web --network mynet nginx
$ docker run -d --name api --network mynet python:3.12-slim \
    python -m http.server 5000

# Containers can reach each other by name! (ping is not installed in
# many slim images, so use getent to test DNS resolution instead.)
$ docker exec web getent hosts api
172.18.0.3      api

# Clean up
$ docker network rm mynet

Think About It: Docker's networking is built on the Linux network namespaces and veth pairs we explored in Chapter 62. Each container gets its own network namespace. The docker0 bridge connects them. iptables rules handle port mapping and NAT. The complexity is hidden behind simple flags.


Essential Docker Commands Reference

# Container lifecycle
$ docker run                     # Create and start a container
$ docker start <container>       # Start a stopped container
$ docker stop <container>        # Graceful stop (SIGTERM, then SIGKILL)
$ docker kill <container>        # Immediate stop (SIGKILL)
$ docker rm <container>          # Remove a stopped container
$ docker rm -f <container>       # Force remove (stop + remove)

# Inspection
$ docker ps                      # List running containers
$ docker ps -a                   # List all containers
$ docker logs <container>        # View container logs
$ docker logs -f <container>     # Follow logs
$ docker inspect <container>     # Detailed JSON info
$ docker stats                   # Real-time resource usage
$ docker top <container>         # Running processes in container

# Interaction
$ docker exec -it <container> bash   # Shell into running container
$ docker cp file.txt container:/path  # Copy file to container
$ docker cp container:/path file.txt  # Copy file from container

# Images
$ docker images                  # List local images
$ docker pull <image>            # Download an image
$ docker build -t name .         # Build image from Dockerfile
$ docker rmi <image>             # Remove an image
$ docker image prune             # Remove unused images

# System
$ docker system df               # Disk usage
$ docker system prune            # Remove all unused data
$ docker system prune -a         # Remove everything unused
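
One detail from the lifecycle commands above is worth internalizing: docker stop sends SIGTERM and, after a grace period (10 seconds by default), SIGKILL. An application that traps SIGTERM gets a chance to shut down cleanly. A minimal Python sketch (simulating the signal locally rather than via Docker):

```python
import os
import signal

shutting_down = False


def on_sigterm(signum, frame):
    # In a real app: flush buffers, close connections, finish in-flight work
    global shutting_down
    shutting_down = True


signal.signal(signal.SIGTERM, on_sigterm)

# Simulate what `docker stop` sends to PID 1 in the container
os.kill(os.getpid(), signal.SIGTERM)
print("graceful shutdown flag:", shutting_down)
```

This is one reason the exec-form CMD ["python", "app.py"] matters: the shell form wraps your app in /bin/sh, which does not forward SIGTERM, so the app never sees the signal and gets SIGKILLed after the grace period.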

Image Registries

Docker Hub is the default public registry, but you can use other registries or run your own.

# Pull from Docker Hub (default)
$ docker pull nginx:latest

# Pull from a specific registry
$ docker pull ghcr.io/owner/image:tag
$ docker pull quay.io/organization/image:tag

# Tag an image for a registry
$ docker tag my-app:latest registry.example.com/my-app:v1.0

# Push to a registry (requires login)
$ docker login registry.example.com
$ docker push registry.example.com/my-app:v1.0

Security Best Practices

Running containers securely requires deliberate choices:

1. Do Not Run as Root

# BAD: runs as root by default
FROM python:3.12-slim
COPY app.py .
CMD ["python", "app.py"]

# GOOD: create and use a non-root user
FROM python:3.12-slim
RUN useradd --create-home appuser
WORKDIR /home/appuser
COPY --chown=appuser:appuser app.py .
USER appuser
CMD ["python", "app.py"]

2. Use Minimal Base Images

# Larger attack surface (~1GB)
FROM python:3.12

# Smaller attack surface (~130MB)
FROM python:3.12-slim

# Smallest attack surface (~50MB), but musl-based -- some Python
# packages must be compiled from source instead of using prebuilt wheels
FROM python:3.12-alpine

3. Do Not Store Secrets in Images

# BAD: secret baked into the image forever
ENV API_KEY=supersecret123

# GOOD: pass secrets at runtime
# docker run -e API_KEY=supersecret123 my-app

4. Pin Image Versions

# BAD: unpredictable, could change any time
FROM python:latest

# GOOD: specific version
FROM python:3.12.1-slim

# BEST: pin to a digest
FROM python@sha256:abc123def456...

5. Use .dockerignore

Create a .dockerignore file to prevent sensitive files from being copied into the image:

.git
.env
*.secret
node_modules
__pycache__
*.pyc
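
The matching semantics are close to shell globbing (the real implementation follows Go's filepath.Match plus ** support). A rough Python approximation -- not Docker's actual matcher -- to check which paths would be excluded:

```python
import fnmatch


def is_ignored(path, patterns):
    """Approximate .dockerignore matching: a pattern matches the path
    itself, or names a directory that is a prefix of the path."""
    for pat in patterns:
        if fnmatch.fnmatch(path, pat) or path.startswith(pat.rstrip("/") + "/"):
            return True
    return False


patterns = [".git", ".env", "*.secret", "node_modules", "__pycache__", "*.pyc"]

print(is_ignored("api.secret", patterns))   # True  (matches *.secret)
print(is_ignored(".git/config", patterns))  # True  (inside ignored .git/)
print(is_ignored("app.py", patterns))       # False (copied into the image)
```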

6. Scan Images for Vulnerabilities

# Docker Scout (built-in scanning)
$ docker scout cve my-app:latest

# Trivy (open-source scanner)
$ trivy image my-app:latest

Debug This

A developer's container starts but the application inside is not accessible:

$ docker run -d --name web -p 8080:80 my-web-app
$ curl http://localhost:8080
curl: (56) Recv failure: Connection reset by peer

Diagnosis steps:

# Is the container actually running?
$ docker ps
# Yes, it shows as running

# Check the logs
$ docker logs web
# "Server running on port 8000..."

# So the application is listening on port 8000 inside the container,
# but we mapped host:8080 → container:80 -- nothing listens on 80

The problem: The port mapping says to forward host port 8080 to container port 80. But the application inside the container listens on port 8000, not port 80.

Fix:

$ docker rm -f web
$ docker run -d --name web -p 8080:8000 my-web-app
$ curl http://localhost:8080
# Works!

The host port and container port in -p are HOST:CONTAINER. The container port must match what the application actually listens on.
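
A quick way to keep the order straight is to model the -p syntax. A hypothetical parser (simplified -- the real syntax also allows an IP prefix and a /tcp or /udp suffix):

```python
def parse_publish(spec):
    """Parse a `-p HOST:CONTAINER` port mapping (simplified)."""
    host, container = spec.split(":")
    return {"host_port": int(host), "container_port": int(container)}


# The broken mapping: traffic is forwarded to container port 80,
# but the app listens on 8000
print(parse_publish("8080:80"))
# {'host_port': 8080, 'container_port': 80}

# The fix: the container side must match the app's listen port
print(parse_publish("8080:8000"))
# {'host_port': 8080, 'container_port': 8000}
```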


What Just Happened?

┌─────────────────────────────────────────────────────────────┐
│                    CHAPTER RECAP                             │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Docker packages applications into portable containers.     │
│                                                             │
│  Architecture: CLI → dockerd → containerd → runc            │
│                                                             │
│  Image = read-only template (layers).                       │
│  Container = running instance + writable layer.             │
│                                                             │
│  Dockerfile: FROM, RUN, COPY, CMD, ENTRYPOINT, USER         │
│    → Each instruction creates an image layer.               │
│                                                             │
│  Docker Compose: multi-container apps in one YAML file.     │
│                                                             │
│  Volumes persist data beyond container lifecycle.           │
│  Bind mounts map host paths into containers.               │
│                                                             │
│  Networking: bridge (default, NAT), host, none.             │
│  User-defined networks provide DNS between containers.     │
│                                                             │
│  Security: non-root users, minimal images, no secrets       │
│  in images, pinned versions, .dockerignore, scanning.       │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Try This

  1. Build and run: Create a Dockerfile for a simple web application (use Python, Node.js, or any language you like). Build it, run it, and verify it responds to HTTP requests.

  2. Multi-container app: Write a docker-compose.yml that runs a web application with a PostgreSQL database and a Redis cache. Verify the web application can connect to both.

  3. Volume persistence: Run a PostgreSQL container with a named volume. Insert some data. Stop and remove the container. Start a new PostgreSQL container with the same volume. Verify your data survived.

  4. Networking exploration: Create a custom Docker network. Run two containers on it. From one container, ping the other by container name. Then inspect the network with docker network inspect and find the IP addresses.

  5. Image optimization: Take a Dockerfile that uses a full python:3.12 base image. Rewrite it to use python:3.12-slim. Compare the image sizes with docker images. How much space did you save?

  6. Bonus Challenge: Write a multi-stage Dockerfile. Use a build stage with full build tools to compile your application, then copy only the compiled binary into a minimal final image (like alpine or scratch). This is how production Go and Rust applications are containerized.