Inter-Process Communication

Why This Matters

You type cat access.log | grep "404" | sort | uniq -c | sort -rn | head -10 and instantly get the top 10 most common 404 URLs from a log file. Five separate programs just cooperated seamlessly to produce that result. How?

Or consider this: your web browser talks to a local caching proxy. Your application connects to a PostgreSQL database running on the same machine. Your container runtime communicates with its daemon. None of these use the network -- they all use Inter-Process Communication (IPC) mechanisms built into the kernel.

Linux provides multiple IPC mechanisms, each suited to different scenarios. This chapter covers the ones you will encounter daily -- pipes, named pipes, redirections, process substitution -- and introduces the ones you need to know about for deeper work: shared memory, Unix domain sockets, and message queues.

Try This Right Now

# A pipeline: three processes communicating through pipes
echo "hello world" | tr 'a-z' 'A-Z' | rev
# Output: DLROW OLLEH

# How many processes were involved?
# Three: echo, tr, rev -- all connected by pipes

# See a pipe in action with /proc
sleep 100 | sleep 200 &
# pgrep needs -f to match against the full command line
# ("sleep 100"), not just the process name ("sleep")
ls -l /proc/$(pgrep -f -n "sleep 100")/fd/
# fd/1 (stdout) will be a pipe
ls -l /proc/$(pgrep -f -n "sleep 200")/fd/
# fd/0 (stdin) will be a pipe

# Clean up
kill %1 2>/dev/null

Pipes: The Unix Superpower

The pipe (|) is the single most important IPC mechanism in Unix. It connects the standard output of one process to the standard input of the next:

+---------+    pipe     +---------+    pipe     +---------+
| Process | stdout ---> | Process | stdout ---> | Process |
|    A    |   ---> stdin|    B    |   ---> stdin|    C    |
+---------+             +---------+             +---------+

How Pipes Work Internally

When the shell sees cmd1 | cmd2, it:

  1. Creates a pipe (a small kernel buffer, typically 64KB on Linux)
  2. Forks two child processes
  3. Connects cmd1's stdout (fd 1) to the write end of the pipe
  4. Connects cmd2's stdin (fd 0) to the read end of the pipe
  5. cmd1 writes data into the pipe; cmd2 reads data from the pipe

          Kernel Pipe Buffer (64KB)
         +------------------------+
cmd1 --> | data flows this way -> | --> cmd2
  fd 1   +------------------------+   fd 0
 (write)                             (read)
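
Step 5's blocking behavior is easy to observe. In this sketch, the reader (sleep) holds its end of the pipe open but never reads, so the writer fills the 64KB buffer and then goes to sleep inside write():

```shell
yes | sleep 30 &    # 'sleep' never reads; 'yes' writes until the buffer fills
sleep 1             # give 'yes' a moment to fill the 64KB buffer
ps -o stat=,comm= -p "$(pgrep -x -n yes)"
# STAT "S" (interruptible sleep): yes is blocked in write(),
# waiting for the reader to drain the pipe
kill %1
```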

Key characteristics:

  • Pipes are unidirectional -- data flows one way only
  • Pipes are anonymous -- they exist only while the processes are running
  • If the pipe buffer is full, the writer blocks until the reader consumes data
  • If the reader exits, the writer gets SIGPIPE
  • Pipes connect processes that share a common ancestor (usually the shell)
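
That SIGPIPE behavior is easy to observe: yes writes "y" forever, but head exits after one line, and the kernel kills yes on its next write. Bash reports signal deaths as 128 + signal number, and SIGPIPE is signal 13:

```shell
yes | head -n 1
# Output: y

yes | head -n 1 > /dev/null; echo "${PIPESTATUS[@]}"
# Output: 141 0   (yes died from SIGPIPE: 128 + 13; head exited with 0)
```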

Pipeline Examples

# Count lines in a file
cat /etc/passwd | wc -l
# Better: wc -l < /etc/passwd (no useless 'cat')

# Find the 10 largest files in /var/log
du -sh /var/log/* 2>/dev/null | sort -rh | head -10

# Show unique shells used on the system
cut -d: -f7 /etc/passwd | sort | uniq -c | sort -rn

# Monitor a log file and filter for errors
tail -f /var/log/syslog | grep --line-buffered "error"

# Count processes per user
ps aux | awk '{print $1}' | sort | uniq -c | sort -rn

# Generate a random password
cat /dev/urandom | tr -dc 'A-Za-z0-9!@#$' | head -c 20; echo

Pipeline Exit Status

By default, the shell reports the exit status of the LAST command in a pipeline:

false | true
echo $?
# Output: 0 (true's exit code, not false's)

To get the exit status of every command in the pipeline:

# Enable pipefail -- the pipeline fails if ANY command fails
set -o pipefail

false | true
echo $?
# Output: 1 (false's exit code)

# Or check individual statuses with PIPESTATUS (bash-specific)
cat nonexistent_file 2>/dev/null | sort | head
echo "${PIPESTATUS[@]}"
# Output: 1 0 0

Think About It: Why does grep pattern file | wc -l work but might give the wrong answer if grep fails? How does set -o pipefail help in scripts?


Redirection: Controlling File Descriptors

Every process starts with three open file descriptors:

+-----+--------+---------------------+
| FD  | Name   | Default             |
+-----+--------+---------------------+
|  0  | stdin  | Keyboard / terminal |
|  1  | stdout | Terminal screen     |
|  2  | stderr | Terminal screen     |
+-----+--------+---------------------+

Redirection lets you change where these point.
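
Redirections also apply beyond fds 0-2: exec attaches them to the current shell itself, which lets a script open a descriptor once and reuse it. (A sketch; fd 3 and the file path are arbitrary choices.)

```shell
exec 3> /tmp/extra.log      # open fd 3 for writing in the current shell
echo "first entry" >&3      # any later command can write via fd 3
echo "second entry" >&3
exec 3>&-                   # close fd 3
cat /tmp/extra.log
# Output:
# first entry
# second entry
```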

Output Redirection

# Redirect stdout to a file (overwrites)
echo "hello" > output.txt

# Redirect stdout to a file (appends)
echo "world" >> output.txt

# Redirect stderr to a file
ls /nonexistent 2> errors.txt

# Redirect stderr to the same place as stdout
ls /nonexistent /tmp 2>&1

# Redirect both stdout and stderr to a file
ls /nonexistent /tmp > all_output.txt 2>&1
# Or the modern shorthand (bash 4+):
ls /nonexistent /tmp &> all_output.txt

# Append both stdout and stderr
command &>> logfile.txt

The Order Matters

This is a classic gotcha:

# WRONG: stderr goes to terminal, not the file
ls /nonexistent /tmp 2>&1 > output.txt
# Why? 2>&1 duplicates fd 2 to where fd 1 currently points (terminal)
# Then > output.txt redirects fd 1 to the file
# So fd 2 still points to the terminal!

# RIGHT: redirect stdout first, then stderr to the same place
ls /nonexistent /tmp > output.txt 2>&1
# > output.txt redirects fd 1 to the file
# 2>&1 duplicates fd 2 to where fd 1 now points (the file)
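
The same left-to-right duplication rule enables a classic trick: swapping stdout and stderr through a temporary descriptor (fd 3 here is an arbitrary unused fd):

```shell
# 3>&1 saves stdout in fd 3; 1>&2 points stdout at stderr's target;
# 2>&3 points stderr at the saved stdout; 3>&- closes the temporary fd
{ echo "to stdout"; echo "to stderr" >&2; } 3>&1 1>&2 2>&3 3>&-
# "to stderr" now arrives on stdout, and "to stdout" on stderr
```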

Input Redirection

# Read from a file instead of the keyboard
sort < unsorted_list.txt

# Here document -- inline multi-line input
cat << 'EOF'
This is line one.
This is line two.
Variables like $HOME are NOT expanded (single-quoted delimiter).
EOF

# Here document with variable expansion
cat << EOF
Your home directory is $HOME
Your shell is $SHELL
EOF

# Here string -- single-line input
grep "pattern" <<< "search in this string"

/dev/null -- The Black Hole

/dev/null is a special file that discards everything written to it and produces nothing when read:

# Discard stdout (keep only errors)
find / -name "*.conf" > /dev/null

# Discard stderr (keep only normal output)
find / -name "*.conf" 2>/dev/null

# Discard everything
command > /dev/null 2>&1

# Check if a command succeeds without seeing output
if grep -q "pattern" file.txt 2>/dev/null; then
    echo "Found it"
fi

Redirecting to Multiple Places with tee

tee reads stdin and writes to both stdout AND a file:

# See output AND save it to a file
make 2>&1 | tee build.log

# Append instead of overwrite
df -h | tee -a disk_report.txt

# Write to multiple files
echo "log entry" | tee file1.log file2.log file3.log

                   +----------> Terminal (stdout)
                   |
stdin --> [ tee ] -+
                   |
                   +----------> file.log

Named Pipes (FIFOs)

Regular pipes are anonymous -- they exist only within a pipeline. Named pipes (FIFOs) are visible in the filesystem and can connect unrelated processes:

# Create a named pipe
mkfifo /tmp/mypipe

# Check it
ls -l /tmp/mypipe
# prw-r--r-- 1 alice alice 0 ... /tmp/mypipe
# The 'p' at the start means "pipe"

Using Named Pipes

Named pipes are blocking: a writer blocks until a reader opens the pipe, and vice versa. You need two terminals (or background processes):

# Terminal 1: Write to the pipe (this will block until someone reads)
echo "Hello from Terminal 1" > /tmp/mypipe

# Terminal 2: Read from the pipe
cat < /tmp/mypipe
# Output: Hello from Terminal 1
# Both commands complete
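
If you only have one terminal, backgrounding the reader works just as well (the paths here are arbitrary):

```shell
mkfifo /tmp/demo_pipe

cat /tmp/demo_pipe &                  # reader: blocks until a writer appears
echo "hello fifo" > /tmp/demo_pipe    # writer: both sides now proceed
wait                                  # reader prints the line and exits
# Output: hello fifo

rm /tmp/demo_pipe
```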

Practical Use Case: Log Processing Pipeline

# Create a named pipe for log processing
mkfifo /tmp/log_pipe

# Terminal 1: Tail a log into the pipe
tail -f /var/log/syslog > /tmp/log_pipe 2>/dev/null &

# Terminal 2: Process logs from the pipe
grep --line-buffered "error" < /tmp/log_pipe | while read -r line; do
    echo "[ALERT] $line"
done

# Clean up
rm /tmp/log_pipe

Practical Use Case: Parallel Processing

# Create a named pipe
mkfifo /tmp/pipe_a

# Split processing: checksum and compress simultaneously
tee /tmp/pipe_a < largefile.bin | md5sum > checksum.txt &
gzip < /tmp/pipe_a > largefile.bin.gz &
wait

# Clean up
rm /tmp/pipe_a

Think About It: What happens if you open a named pipe for writing but no process ever opens it for reading? How is this different from writing to a regular file?


Process Substitution

Process substitution lets you use a process's output (or input) as if it were a file. This is a bash feature (not POSIX sh).

Output Process Substitution: <(command)

The <(command) syntax runs command and makes its output available as a file path:

# Compare the output of two commands
diff <(ls /dir1) <(ls /dir2)

# Compare sorted lists without creating temporary files
diff <(sort file1.txt) <(sort file2.txt)

# What is the "file"?
echo <(echo hello)
# Output: /dev/fd/63 (or similar -- it's a file descriptor)

This is incredibly powerful because many commands expect file arguments, not piped input:

# paste needs two files -- use process substitution
paste <(cut -d: -f1 /etc/passwd) <(cut -d: -f7 /etc/passwd)

# Feed two data streams to a command that expects files
join <(sort file1.txt) <(sort file2.txt)

# Load data from a command into a while loop without subshell issues
while read -r user shell; do
    echo "User $user uses $shell"
done < <(awk -F: '{print $1, $7}' /etc/passwd)

Input Process Substitution: >(command)

The >(command) syntax creates a file path that feeds into a command's stdin:

# Write to two destinations simultaneously
echo "log entry" | tee >(gzip > compressed.gz) >(wc -c > byte_count.txt)

# Send output to both a file and a log processor
some_command > >(tee output.log) 2> >(tee error.log >&2)

Why Not Just Use Pipes?

Process substitution solves problems that pipes cannot:

# Problem: compare output of two commands
# With pipes -- impossible (pipes are linear, not branching)
# With process substitution:
diff <(find /dir1 -type f | sort) <(find /dir2 -type f | sort)

# Problem: the while-read-pipe subshell issue
# This LOSES the variable outside the loop:
count=0
cat file.txt | while read -r line; do
    count=$((count + 1))
done
echo "$count"  # Prints 0! The while loop ran in a subshell

# Process substitution avoids the subshell:
count=0
while read -r line; do
    count=$((count + 1))
done < <(cat file.txt)
echo "$count"  # Prints the correct count
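
Another subshell-free pattern combines process substitution with bash's mapfile builtin, which slurps lines into an array in one step:

```shell
# Load every username into an array -- no pipe, so no subshell
mapfile -t users < <(cut -d: -f1 /etc/passwd)
echo "Found ${#users[@]} users; first is ${users[0]}"
```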

Shared Memory Overview

Shared memory is the fastest IPC mechanism. Two or more processes map the same region of physical memory into their address spaces:

  Process A                     Process B
  +----------+                 +----------+
  | Address  |                 | Address  |
  | Space    |                 | Space    |
  |          |                 |          |
  | Shared   |---+         +---| Shared   |
  | Region   |   |         |   | Region   |
  +----------+   |         |   +----------+
                 v         v
            +------------------+
            | Physical Memory  |
            | (shared segment) |
            +------------------+

There is no copying of data -- both processes read and write to the same memory. This makes it extremely fast but also means you need synchronization (mutexes, semaphores) to prevent data corruption.

POSIX Shared Memory

# List existing shared memory segments
ls /dev/shm/

# See System V shared memory segments
ipcs -m

# See all IPC resources (shared memory, semaphores, message queues)
ipcs -a

Shared memory is commonly used by:

  • Database systems (PostgreSQL shared buffers)
  • Web servers (shared worker state)
  • Audio/video processing (passing frames between processes)
  • tmpfs mounts (/dev/shm is a tmpfs)

# /dev/shm is a tmpfs mount -- a RAM-based filesystem
df -h /dev/shm
mount | grep shm

# You can use it for fast temporary storage
echo "fast data" > /dev/shm/temp_data
# But remember: it vanishes on reboot!

Distro Note: On both Debian/Ubuntu and RHEL/Fedora, /dev/shm defaults to 50% of RAM. You can resize it: sudo mount -o remount,size=2G /dev/shm.
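
To get a feel for the speed difference, time a large write to /dev/shm against one to a disk-backed directory. A rough sketch (/var/tmp is used because /tmp is itself a tmpfs on some systems; conv=fsync forces the second write to actually reach the disk):

```shell
# dd prints its throughput summary on stderr; keep just that line
dd if=/dev/zero of=/dev/shm/speed_test bs=1M count=100 2>&1 | tail -n1
dd if=/dev/zero of=/var/tmp/speed_test bs=1M count=100 conv=fsync 2>&1 | tail -n1
rm /dev/shm/speed_test /var/tmp/speed_test
```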


Unix Domain Sockets

Unix domain sockets are like network sockets but for local communication only. They are faster than TCP/IP sockets (no network stack overhead) and support both stream and datagram modes.

# Find Unix domain sockets on your system
ss -xl  # or: ss -x

# Or look in common locations
ls -l /var/run/*.sock 2>/dev/null
ls -l /run/*.sock 2>/dev/null
ls -l /tmp/*.sock 2>/dev/null

Common examples you will encounter:

# Docker daemon socket
ls -l /var/run/docker.sock

# MySQL/MariaDB socket
ls -l /var/run/mysqld/mysqld.sock

# PostgreSQL socket
ls -l /var/run/postgresql/.s.PGSQL.5432

# systemd-journald socket
ls -l /run/systemd/journal/socket

# D-Bus system socket
ls -l /run/dbus/system_bus_socket

How They Differ from Pipes

+---------------------+-------------------+---------------------+
| Feature             | Pipes             | Unix Sockets        |
+---------------------+-------------------+---------------------+
| Direction           | Unidirectional    | Bidirectional       |
| Connections         | One-to-one        | Many-to-one         |
| Related processes   | Required (pipes)  | Not required        |
| needed?             | Optional (FIFO)   |                     |
| File on disk        | No (pipes)        | Yes (socket file)   |
|                     | Yes (FIFOs)       |                     |
| Protocol support    | Byte stream only  | Stream or datagram  |
| Permissions         | Via fd inheritance| Via file permissions|
+---------------------+-------------------+---------------------+

Practical Example: Communicating with Docker via Socket

# Docker CLI talks to dockerd via Unix socket
# You can do the same with curl:
sudo curl --unix-socket /var/run/docker.sock http://localhost/version
# Returns JSON with Docker version info

# List containers via the socket API
sudo curl --unix-socket /var/run/docker.sock http://localhost/containers/json

Creating a Simple Unix Domain Socket (with socat)

# Install socat if not present
sudo apt install socat    # Debian/Ubuntu
sudo dnf install socat    # RHEL/Fedora

# Terminal 1: Create a socket server
socat UNIX-LISTEN:/tmp/test.sock,fork EXEC:/bin/cat

# Terminal 2: Connect and send data
echo "Hello, socket!" | socat - UNIX-CONNECT:/tmp/test.sock

# Clean up
rm -f /tmp/test.sock

Message Queues Overview

Message queues allow processes to exchange discrete messages through a kernel-maintained queue. Unlike pipes (byte streams), messages maintain their boundaries.

  Process A                               Process B
  +----------+                           +----------+
  |  Send    |     +----------------+    | Receive  |
  |  msg 1   |---->| Kernel Message |--->|  msg 1   |
  |  msg 2   |---->|    Queue       |--->|  msg 2   |
  |  msg 3   |---->|                |--->|  msg 3   |
  +----------+     +----------------+    +----------+

Key characteristics:

  • Messages have types and priorities
  • Messages maintain boundaries (unlike pipes, which are byte streams)
  • The queue persists until explicitly removed (survives process death)
  • Kernel enforces queue size limits

Viewing Message Queues

# List POSIX message queues
ls /dev/mqueue/ 2>/dev/null

# List System V message queues
ipcs -q

# Show all IPC objects with details
ipcs -a

Linux supports both System V IPC (older: shmget, msgget, semget) and POSIX IPC (newer: shm_open, mq_open, sem_open). New code should prefer POSIX. Use ipcs / ipcrm to manage System V resources, and browse /dev/shm / /dev/mqueue for POSIX resources.
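
You can exercise the System V side from the shell with util-linux's ipcmk (a sketch; the 4096-byte segment size is arbitrary):

```shell
# Create a System V shared memory segment
out=$(ipcmk -M 4096)
echo "$out"                  # e.g. "Shared memory id: 32768"
id=${out##* }                # the last word is the numeric id

ipcs -m | grep -w "$id"      # the new segment appears in the listing
ipcrm -m "$id"               # remove it again
```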


Debug This: Mystery Broken Pipe

A user reports that their script fails with "Broken pipe" errors:

#!/bin/bash
generate_report | head -5
echo "Report generated successfully"

The script works but prints write error: Broken pipe to stderr.

Diagnosis:

generate_report produces many lines of output. head -5 reads only 5 lines and then exits, closing the read end of the pipe. When generate_report tries to write the next line, the kernel sends it SIGPIPE and it dies.

Solutions:

# Option 1: Suppress the error message
generate_report 2>/dev/null | head -5

# Option 2: Consume the leftover output so the writer never
# sees a closed pipe
generate_report | { head -5; cat > /dev/null; }

# Option 3: Ignore SIGPIPE (if you control the script doing the
# writing); writes then fail with EPIPE instead of killing it
trap '' PIPE

# Option 4: In many cases, this is harmless -- the exit code
# of the pipeline is 0 (head succeeded), and the SIGPIPE
# is just the kernel cleaning up efficiently

Think About It: Is a broken pipe actually an error? Or is it the kernel's efficient way of saying "nobody is listening, so stop wasting effort"?


Hands-On: Building an IPC Pipeline

Let us build a practical log processing system using different IPC mechanisms:

# Step 1: Create a named pipe for log ingestion
mkfifo /tmp/log_pipe

# Step 2: Write a log producer (simulates an application logging)
for i in $(seq 1 100); do
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$(shuf -n1 -e INFO WARN ERROR)] Message $i"
    sleep 0.1
done > /tmp/log_pipe &
PRODUCER_PID=$!

# Step 3: Process the logs -- split by severity into separate files
tee >(grep "ERROR" >> /tmp/errors.log) \
    >(grep "WARN" >> /tmp/warnings.log) \
    > /tmp/all.log < /tmp/log_pipe

# Step 4: Wait for the producer to finish
wait $PRODUCER_PID 2>/dev/null

# Step 5: Check results
echo "=== Error count ==="
wc -l < /tmp/errors.log
echo "=== Warning count ==="
wc -l < /tmp/warnings.log
echo "=== Total messages ==="
wc -l < /tmp/all.log

# Step 6: Compare error and warning files
diff <(cut -d']' -f2 /tmp/errors.log | sort) \
     <(cut -d']' -f2 /tmp/warnings.log | sort) | head -20

# Clean up
rm -f /tmp/log_pipe /tmp/errors.log /tmp/warnings.log /tmp/all.log

IPC Mechanism Selection Guide

+-------------------+-------------+----------+----------+-----------+
| Mechanism         | Direction   | Speed    | Related  | Best For  |
|                   |             |          | Procs?   |           |
+-------------------+-------------+----------+----------+-----------+
| Pipe (|)          | One-way     | Fast     | Yes      | Pipelines |
| Named Pipe (FIFO) | One-way     | Fast     | No       | Producer- |
|                   |             |          |          | consumer  |
| Unix Socket       | Two-way     | Fast     | No       | Client-   |
|                   |             |          |          | server    |
| Shared Memory     | Both read/  | Fastest  | No       | Large data|
|                   | write       |          |          | sharing   |
| Message Queue     | One-way     | Moderate | No       | Discrete  |
|                   | per queue   |          |          | messages  |
| Signals           | One-way     | Fast     | No       | Simple    |
|                   | (notify)    |          |          | events    |
| Files             | Both        | Slow     | No       | Persistent|
|                   |             | (disk)   |          | data      |
+-------------------+-------------+----------+----------+-----------+

What Just Happened?

+------------------------------------------------------------------+
|  Chapter 12 Recap: Inter-Process Communication                   |
|------------------------------------------------------------------|
|                                                                  |
|  - Pipes (|) connect stdout of one process to stdin of next.     |
|  - Pipes are anonymous, unidirectional, and kernel-buffered.     |
|  - Redirections (>, >>, 2>&1, <) control file descriptors.       |
|  - /dev/null discards output; tee duplicates it.                 |
|  - Named pipes (mkfifo) persist on disk and connect any two      |
|    processes.                                                    |
|  - Process substitution <() and >() treat command output as      |
|    file paths.                                                   |
|  - Shared memory is the fastest IPC (no data copying).           |
|  - Unix domain sockets provide bidirectional local IPC.          |
|  - Message queues preserve message boundaries.                   |
|  - Use pipefail in scripts to catch pipeline errors.             |
|  - Order matters in redirections: > file 2>&1 is correct.        |
|                                                                  |
+------------------------------------------------------------------+

Try This

Exercise 1: Redirection Mastery

Write a command that runs find / -name "*.conf" and saves normal output to found.txt, errors to errors.txt, and also displays both on the terminal simultaneously. (Hint: you need tee and redirection.)

Exercise 2: Named Pipe Chat

Create a simple two-way chat system using two named pipes. Two terminals should be able to send messages to each other. Each terminal reads from one pipe and writes to the other.

Exercise 3: Process Substitution Power

Without creating any temporary files, find all files that exist in /etc but not in /usr/etc (or any two directories). Use diff with process substitution and find.

Exercise 4: Pipeline Analysis

Run cat /etc/passwd | cut -d: -f7 | sort | uniq -c | sort -rn. Then rewrite it without cat (using input redirection). Then use ${PIPESTATUS[@]} to verify all pipeline stages succeeded.

Bonus Challenge

Write a script that uses a named pipe to implement a simple job queue. One terminal acts as the "dispatcher" writing commands to the pipe, and another terminal acts as the "worker" reading and executing them one at a time. Include error handling and logging of each job's exit status.