Memory-Mapped I/O

Instead of copying data between kernel and user space with read() and write(), you can map a file directly into your process's address space. The kernel handles paging data in and out transparently. This is mmap() -- one of the most powerful system calls on Linux.

How mmap Works

Traditional I/O:

  User space            Kernel space             Disk
  +--------+           +------------+          +------+
  | buffer | <--copy-- | page cache | <--DMA-- | file |
  +--------+           +------------+          +------+
     read() copies data from kernel to user buffer

Memory-mapped I/O:

  User space
  +--------+
  | mapped |  <-- page fault --> kernel loads page from disk
  | region |                     directly into this address range
  +--------+
     No copy -- your pointer IS the data

When you access a mapped page for the first time, a page fault occurs. The kernel reads the data from disk into the page cache (unless it is already cached) and maps that physical page into your address space. Subsequent accesses hit the page directly -- no syscall overhead at all.

mmap in C

/* mmap_read.c -- read a file via mmap */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return 1; }

    if (st.st_size == 0) {
        printf("(empty file)\n");
        close(fd);
        return 0;
    }

    void *addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return 1;
    }

    /* We can close fd now -- the mapping keeps the file open internally */
    close(fd);

    const char *data = (const char *)addr;
    printf("first 80 chars:\n");
    size_t len = (size_t)st.st_size < 80 ? (size_t)st.st_size : 80;
    fwrite(data, 1, len, stdout);
    printf("\n");

    size_t lines = 0;
    for (off_t i = 0; i < st.st_size; i++) {
        if (data[i] == '\n') lines++;
    }
    printf("total lines: %zu\n", lines);

    munmap(addr, st.st_size);
    return 0;
}

$ gcc -Wall -o mmap_read mmap_read.c && ./mmap_read /etc/passwd
first 80 chars:
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/n
total lines: 42

The mmap Arguments

void *mmap(
    void   *addr,    /* suggested address (NULL = let kernel choose) */
    size_t  length,  /* how many bytes to map */
    int     prot,    /* protection: PROT_READ, PROT_WRITE, PROT_EXEC */
    int     flags,   /* MAP_SHARED, MAP_PRIVATE, MAP_ANONYMOUS, ... */
    int     fd,      /* file descriptor (-1 with MAP_ANONYMOUS) */
    off_t   offset   /* offset within file (must be page-aligned) */
);

Flag            Meaning
-------------   -------
MAP_PRIVATE     Copy-on-write: writes go to private copy, not file
MAP_SHARED      Writes go through to the file (visible to others)
MAP_ANONYMOUS   No file backing; memory initialized to zero
MAP_FIXED       Use exact address (dangerous if misused)

Protection      Meaning
-------------   -------
PROT_READ       Pages can be read
PROT_WRITE      Pages can be written
PROT_EXEC       Pages can be executed
PROT_NONE       No access (guard pages)

Caution: MAP_FIXED will silently overwrite any existing mapping at that address, including your heap or stack. Almost never use it in application code.

MAP_SHARED vs MAP_PRIVATE

MAP_PRIVATE (copy-on-write):

  Process A          Process B
  +--------+        +--------+
  | page 1 |--+  +--| page 1 |    Both point to same physical pages
  | page 2 |--+--+--| page 2 |    (read-only until a write)
  +--------+        +--------+

  When A writes to page 1:
  +--------+        +--------+
  | page 1'| (new)  | page 1 |    A gets a private copy
  | page 2 |--+--+--| page 2 |    page 2 still shared
  +--------+        +--------+

MAP_SHARED:

  Process A          Process B
  +--------+        +--------+
  | page 1 |--+--+--| page 1 |    Same physical pages, writable
  | page 2 |--+--+--| page 2 |    Writes by A visible to B
  +--------+        +--------+

Writing with mmap

/* mmap_write.c -- modify a file via mmap */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>

int main(void)
{
    const char *path = "/tmp/mmap_write_demo.txt";

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    const char *initial = "Hello, World! This is memory-mapped.\n";
    size_t len = strlen(initial);
    if (write(fd, initial, len) != (ssize_t)len) {
        perror("write");
        close(fd);
        return 1;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    close(fd);

    char *data = (char *)addr;
    /* The mapping is not NUL-terminated, so print exactly len bytes */
    printf("before: %.*s", (int)len, data);

    memcpy(data, "HOWDY", 5);
    printf("after:  %.*s", (int)len, data);

    /* Ensure changes reach disk */
    msync(addr, len, MS_SYNC);
    munmap(addr, len);

    /* Verify by reading normally */
    fd = open(path, O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }
    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n == -1) { perror("read"); close(fd); return 1; }
    buf[n] = '\0';
    printf("verify: %s", buf);
    close(fd);

    return 0;
}

$ gcc -Wall -o mmap_write mmap_write.c && ./mmap_write
before: Hello, World! This is memory-mapped.
after:  HOWDY, World! This is memory-mapped.
verify: HOWDY, World! This is memory-mapped.

Caution: You cannot extend a file by writing past its end via mmap. The mapping size is fixed at mmap() time. To grow a file, use ftruncate() first, then remap.

msync: Flushing to Disk

msync() ensures that modifications to a MAP_SHARED mapping are written back to the underlying file on disk.

Flag            Meaning
-------------   -------
MS_SYNC         Block until write is complete
MS_ASYNC        Initiate write, return immediately
MS_INVALIDATE   Invalidate other mappings (force re-read)

Without msync, the kernel will eventually flush dirty pages, but the timing is unpredictable. For data integrity, always msync before considering data durable.

Anonymous mmap: Shared Memory Without a File

MAP_ANONYMOUS creates a mapping not backed by any file. The memory is initialized to zero. Combined with MAP_SHARED, it survives across fork() and allows parent-child communication.

/* anon_mmap.c -- shared memory between parent and child */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int *shared = mmap(NULL, sizeof(int),
                       PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS,
                       -1, 0);
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    *shared = 0;

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); return 1; }

    if (pid == 0) {
        *shared = 42;
        printf("child set *shared = %d\n", *shared);
        fflush(stdout);   /* _exit() does not flush stdio buffers */
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    printf("parent reads *shared = %d\n", *shared);

    munmap(shared, sizeof(int));
    return 0;
}

$ gcc -Wall -o anon_mmap anon_mmap.c && ./anon_mmap
child set *shared = 42
parent reads *shared = 42

Driver Prep: Kernel drivers often use remap_pfn_range() to map device memory or DMA buffers into user space. The user-space side calls mmap() on the device file. Understanding MAP_SHARED here is essential preparation.

Large File Processing with mmap and madvise

mmap is ideal for processing large files. The kernel pages data in on demand and can evict pages under memory pressure.

/* mmap_large.c -- count bytes in a large file via mmap */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return 1; }
    if (st.st_size == 0) { printf("empty file\n"); close(fd); return 0; }

    const char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    close(fd);

    /* Advise kernel we will read sequentially */
    madvise((void *)data, st.st_size, MADV_SEQUENTIAL);

    unsigned char target = 'e';
    size_t count = 0;
    for (off_t i = 0; i < st.st_size; i++) {
        if ((unsigned char)data[i] == target) count++;
    }

    printf("'%c' appears %zu times in %s (%ld bytes)\n",
           target, count, argv[1], (long)st.st_size);

    munmap((void *)data, st.st_size);
    return 0;
}

madvise() hints to the kernel how you plan to access the data:

Hint              Meaning
---------------   -------
MADV_SEQUENTIAL   Will read sequentially; prefetch aggressively
MADV_RANDOM       Will read randomly; do not prefetch
MADV_WILLNEED     Will need these pages soon; start loading
MADV_DONTNEED     Done with these pages; can be reclaimed

Try It: Map /usr/share/dict/words (if available) and count how many words start with the letter 'z'. Compare the speed against read() in a loop.

Rust: The memmap2 Crate

The Rust standard library does not include mmap. The memmap2 crate provides a cross-platform wrapper; creating a mapping is still an unsafe operation. Add to Cargo.toml:

[dependencies]
memmap2 = "0.9"

// mmap_read.rs -- read a file via mmap in Rust
use memmap2::Mmap;
use std::fs::File;

fn main() -> std::io::Result<()> {
    let path = std::env::args().nth(1).expect("usage: mmap_read <file>");

    let file = File::open(&path)?;
    let mmap = unsafe { Mmap::map(&file)? };

    // mmap implements Deref<Target=[u8]>, so we can use it as a byte slice
    println!("file size: {} bytes", mmap.len());

    let preview = std::cmp::min(80, mmap.len());
    let text = String::from_utf8_lossy(&mmap[..preview]);
    println!("first {} bytes:\n{}", preview, text);

    let lines = mmap.iter().filter(|&&b| b == b'\n').count();
    println!("total lines: {}", lines);

    Ok(())
    // mmap is automatically unmapped when dropped
}

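Rust Note: Mmap::map() is unsafe because the mapped bytes can change underneath you. If another process modifies the file while you hold the mapping, the contents of a supposedly immutable &[u8] change, which Rust treats as undefined behavior. If the file is truncated, touching pages past the new end raises SIGBUS. The same risks exist in C -- mmap is inherently a shared-memory interface.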

Writable mmap in Rust

// mmap_write.rs -- modify a file via mmap in Rust
use memmap2::MmapMut;
use std::fs::OpenOptions;

fn main() -> std::io::Result<()> {
    let path = "/tmp/mmap_write_rs.txt";
    std::fs::write(path, b"Hello, World! Memory-mapped Rust.\n")?;

    let file = OpenOptions::new().read(true).write(true).open(path)?;
    let mut mmap = unsafe { MmapMut::map_mut(&file)? };

    println!("before: {}", String::from_utf8_lossy(&mmap[..]));

    mmap[..5].copy_from_slice(b"HOWDY");
    mmap.flush()?;

    println!("after:  {}", String::from_utf8_lossy(&mmap[..]));

    let contents = std::fs::read_to_string(path)?;
    print!("verify: {}", contents);
    Ok(())
}

mprotect: Guard Pages

mprotect() changes the protection on an existing mapping. One use case is guard pages -- regions marked PROT_NONE that cause a segfault on access, used to detect stack overflows or buffer overruns.

/* guard_page.c -- use mprotect to create a guard page */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

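static void handler(int sig, siginfo_t *info, void *ctx)
{
    (void)ctx;
    /* printf() is not async-signal-safe. It is tolerable here only because
       the demo exits immediately and stdout is a line-buffered terminal;
       _exit() would discard the output if stdout were a pipe or file. */
    printf("caught %s at address %p\n",
           sig == SIGSEGV ? "SIGSEGV" : "SIGBUS",
           info->si_addr);
    _exit(1);
}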

int main(void)
{
    long page_size = sysconf(_SC_PAGESIZE);

    void *region = mmap(NULL, 2 * page_size,
                        PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }

    void *guard = (char *)region + page_size;
    if (mprotect(guard, page_size, PROT_NONE) == -1) {
        perror("mprotect");
        return 1;
    }

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    char *usable = (char *)region;
    usable[0] = 'A';
    printf("wrote to usable page OK\n");

    printf("about to touch guard page...\n");
    char *bad = (char *)guard;
    bad[0] = 'B';  /* triggers SIGSEGV */

    munmap(region, 2 * page_size);
    return 0;
}

$ gcc -Wall -o guard_page guard_page.c && ./guard_page
wrote to usable page OK
about to touch guard page...
caught SIGSEGV at address 0x7f8a12341000

Memory layout:
+-------------------+-------------------+
|   usable page     |   guard page      |
|   PROT_READ |     |   PROT_NONE       |
|   PROT_WRITE      |   (any access =   |
|                   |    SIGSEGV)       |
+-------------------+-------------------+
^                   ^
region              region + page_size

mmap for Device Register Access (Preview)

In embedded and driver work, hardware registers live at fixed physical addresses. User-space programs can access them by mapping /dev/mem:

/* devmem_preview.c -- concept only, requires root */
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdint.h>

int main(void)
{
    /* Hypothetical GPIO base. On 32-bit targets, build with
       -D_FILE_OFFSET_BITS=64 so off_t can hold this value. */
    off_t phys_addr = 0xFE200000;
    size_t page_size = sysconf(_SC_PAGESIZE);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd == -1) { perror("open /dev/mem (need root)"); return 1; }

    void *map = mmap(NULL, page_size,
                     PROT_READ | PROT_WRITE, MAP_SHARED,
                     fd, phys_addr);
    if (map == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    volatile uint32_t *regs = (volatile uint32_t *)map;
    uint32_t val = regs[0];
    printf("register at 0x%llx = 0x%08x\n",
           (unsigned long long)phys_addr, (unsigned)val);

    munmap(map, page_size);
    close(fd);
    return 0;
}

Caution: Writing to the wrong physical address via /dev/mem can crash your system, corrupt data, or damage hardware. Production systems use proper kernel drivers with request_mem_region() and ioremap().

The volatile keyword is critical. Without it, the compiler may optimize away reads and writes to hardware registers. Hardware registers are side-effectful -- reading a status register may clear an interrupt flag.

Driver Prep: Device drivers map hardware registers into user space via the driver's mmap file_operation, which calls remap_pfn_range(). The user-space pattern is always: open device fd, mmap it, read/write through pointers.

Comparing I/O Methods

Method          Copies    Syscalls/access   Best for
-----------     ------    ---------------   --------
read()/write()  1-2       1 per call        Small files, streaming
stdio (fread)   2         ~1 per buffer     General purpose
mmap            0         0 (after fault)   Large files, random access,
                                            shared memory, device regs

mmap wins on: zero-copy access, automatic caching, cheap random access, and inter-process shared memory. It loses on: overhead for small files (minimum one page), harder error handling (SIGBUS), inability to grow without remapping, and inability to work with pipes, sockets, or non-seekable files.

Quick Knowledge Check

  1. What is the difference between MAP_SHARED and MAP_PRIVATE when you write to a mapped page?

  2. Why must you call msync() if you need to guarantee data has reached disk?

  3. What signal does the kernel deliver if you access an mmap'd region after the file has been truncated shorter than the mapping?

Common Pitfalls

  • Mapping a zero-length file. mmap with length 0 fails with EINVAL. Always check st_size before mapping.

  • Forgetting munmap(). Leaked mappings consume virtual address space. In long-running processes this eventually causes mmap to fail.

  • Ignoring SIGBUS. If the file is truncated while mapped, accessing beyond the new end delivers SIGBUS, not SIGSEGV.

  • Using MAP_FIXED casually. It silently overwrites existing mappings.

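  • Writing past the mapping size. The kernel maps whole pages, so the mapping extends to the next page boundary, but only the length you requested is yours to use. Stores past end-of-file inside the last page never reach the file, and touching an unmapped page beyond it typically raises SIGSEGV.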

  • Missing volatile on device registers. The compiler will optimize away your hardware accesses without it.

  • Forgetting O_SYNC for device memory. Without it, the kernel may use caching that reorders stores to device registers.