Memory-Mapped I/O
Instead of copying data between kernel and user space with read() and
write(), you can map a file directly into your process's address space.
The kernel handles paging data in and out transparently. This is mmap() --
one of the most powerful system calls on Linux.
How mmap Works
Traditional I/O:
User space           Kernel space            Disk
+--------+           +------------+          +------+
| buffer | <--copy-- | page cache | <--DMA-- | file |
+--------+           +------------+          +------+
read() copies data from kernel to user buffer
Memory-mapped I/O:
User space
+--------+
| mapped | <-- page fault --> kernel loads page from disk
| region |                    directly into this address range
+--------+
No copy -- your pointer IS the data
When you access a mapped page for the first time, a page fault occurs. The kernel loads the data from disk into a physical page and maps it into your address space. Subsequent accesses hit that page directly -- no syscall overhead at all.
mmap in C
/* mmap_read.c -- read a file via mmap */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return 1; }
    if (st.st_size == 0) {
        printf("(empty file)\n");
        close(fd);
        return 0;
    }

    void *addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return 1;
    }

    /* We can close fd now -- the mapping keeps the file open internally */
    close(fd);

    const char *data = (const char *)addr;
    printf("first 80 chars:\n");
    size_t len = (size_t)st.st_size < 80 ? (size_t)st.st_size : 80;
    fwrite(data, 1, len, stdout);
    printf("\n");

    size_t lines = 0;
    for (off_t i = 0; i < st.st_size; i++) {
        if (data[i] == '\n') lines++;
    }
    printf("total lines: %zu\n", lines);

    munmap(addr, st.st_size);
    return 0;
}
$ gcc -Wall -o mmap_read mmap_read.c && ./mmap_read /etc/passwd
first 80 chars:
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/n
total lines: 42
The mmap Arguments
void *mmap(
    void  *addr,    /* suggested address (NULL = let kernel choose) */
    size_t length,  /* how many bytes to map */
    int    prot,    /* protection: PROT_READ, PROT_WRITE, PROT_EXEC */
    int    flags,   /* MAP_SHARED, MAP_PRIVATE, MAP_ANONYMOUS, ... */
    int    fd,      /* file descriptor (-1 with MAP_ANONYMOUS) */
    off_t  offset   /* offset within file (must be page-aligned) */
);
| Flag | Meaning |
|---|---|
| MAP_PRIVATE | Copy-on-write: writes go to private copy, not file |
| MAP_SHARED | Writes go through to the file (visible to others) |
| MAP_ANONYMOUS | No file backing; memory initialized to zero |
| MAP_FIXED | Use exact address (dangerous if misused) |
| Protection | Meaning |
|---|---|
| PROT_READ | Pages can be read |
| PROT_WRITE | Pages can be written |
| PROT_EXEC | Pages can be executed |
| PROT_NONE | No access (guard pages) |
Caution: MAP_FIXED will silently overwrite any existing mapping at that address, including your heap or stack. Almost never use it in application code.
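Because the offset argument must be page-aligned, mapping a range that starts at an arbitrary byte takes a little arithmetic. The following is a minimal sketch, not a definitive implementation: `struct range_map` and `map_file_range()` are hypothetical names introduced here for illustration. It rounds the offset down to a page boundary, maps slightly more than requested, and hands back a pointer to the byte the caller actually asked for.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: keeps the real mapping so it can be unmapped. */
struct range_map {
    void       *base;     /* page-aligned address returned by mmap() */
    size_t      map_len;  /* length that was passed to mmap() */
    const char *data;     /* first byte the caller asked for */
};

/* Map `len` bytes of `path` read-only, starting at an arbitrary `offset`.
 * Returns 0 on success, -1 on failure. */
int map_file_range(const char *path, off_t offset, size_t len,
                   struct range_map *out)
{
    assert(offset >= 0 && len > 0);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t  page    = (off_t)sysconf(_SC_PAGESIZE);
    off_t  aligned = offset - (offset % page);   /* round down to a page */
    size_t slack   = (size_t)(offset - aligned); /* bytes before the request */

    void *base = mmap(NULL, len + slack, PROT_READ, MAP_PRIVATE, fd, aligned);
    close(fd);                                   /* mapping outlives the fd */
    if (base == MAP_FAILED)
        return -1;

    out->base    = base;
    out->map_len = len + slack;
    out->data    = (const char *)base + slack;
    return 0;
}
```

When finished, unmap with `munmap(m.base, m.map_len)`, not with `m.data`, since only `base` is the address the kernel returned.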
MAP_SHARED vs MAP_PRIVATE
MAP_PRIVATE (copy-on-write):
Process A         Process B
+--------+        +--------+
| page 1 |--+  +--| page 1 |   Both point to same physical pages
| page 2 |--+--+--| page 2 |   (read-only until a write)
+--------+        +--------+
When A writes to page 1:
+--------+        +--------+
| page 1'| (new)  | page 1 |   A gets a private copy
| page 2 |--+--+--| page 2 |   page 2 still shared
+--------+        +--------+
MAP_SHARED:
Process A         Process B
+--------+        +--------+
| page 1 |--+--+--| page 1 |   Same physical pages, writable
| page 2 |--+--+--| page 2 |   Writes by A visible to B
+--------+        +--------+
Writing with mmap
/* mmap_write.c -- modify a file via mmap */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>
int main(void)
{
    const char *path = "/tmp/mmap_write_demo.txt";

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    const char *initial = "Hello, World! This is memory-mapped.\n";
    size_t len = strlen(initial);
    /* The file must already be len bytes long before we map it */
    if (write(fd, initial, len) != (ssize_t)len) {
        perror("write");
        close(fd);
        return 1;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    close(fd);

    char *data = (char *)addr;
    printf("before: %s", data);
    memcpy(data, "HOWDY", 5);
    printf("after: %s", data);

    /* Ensure changes reach disk */
    if (msync(addr, len, MS_SYNC) == -1) perror("msync");
    munmap(addr, len);

    /* Verify by reading normally */
    fd = open(path, O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }
    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n == -1) { perror("read"); close(fd); return 1; }
    buf[n] = '\0';
    printf("verify: %s", buf);
    close(fd);
    return 0;
}
$ gcc -Wall -o mmap_write mmap_write.c && ./mmap_write
before: Hello, World! This is memory-mapped.
after: HOWDY, World! This is memory-mapped.
verify: HOWDY, World! This is memory-mapped.
Caution: You cannot extend a file by writing past its end via mmap. The mapping size is fixed at mmap() time. To grow a file, use ftruncate() first, then remap.
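The grow-then-remap sequence might look like the sketch below. `grow_mapping()` is a hypothetical helper, and it assumes the file was mapped from offset 0 with MAP_SHARED and that `fd` is still open for reading and writing.

```c
#include <assert.h>
#include <fcntl.h>      /* open() flags, for callers */
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Grow the file behind `fd` to `new_size` bytes and replace the old mapping
 * with one covering the whole file. Returns the new mapping or MAP_FAILED.
 * The old pointer must not be used after this call. */
void *grow_mapping(int fd, void *old_addr, size_t old_size, size_t new_size)
{
    assert(new_size > old_size);

    if (ftruncate(fd, (off_t)new_size) == -1)    /* extend the file first */
        return MAP_FAILED;

    if (old_addr != NULL && munmap(old_addr, old_size) == -1)
        return MAP_FAILED;

    /* Remap from offset 0; the added bytes read back as zeros. */
    return mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

The new mapping may land at a different address, so any pointers into the old region become invalid. On Linux, mremap() can resize a mapping in one call, but the file itself still has to be extended with ftruncate() first.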
msync: Flushing to Disk
msync() ensures that modifications to a MAP_SHARED mapping are written
back to the underlying file on disk.
| Flag | Meaning |
|---|---|
| MS_SYNC | Block until write is complete |
| MS_ASYNC | Initiate write, return immediately |
| MS_INVALIDATE | Invalidate other mappings (force re-read) |
Without msync, the kernel will eventually flush dirty pages, but the
timing is unpredictable. For data integrity, always msync before considering
data durable.
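msync() does not have to cover the whole mapping, but its address argument must be a multiple of the page size. A minimal sketch of flushing only the bytes that changed follows. `flush_range()` is a hypothetical helper, and it assumes `map_base` is the address returned by mmap() for a MAP_SHARED file mapping.

```c
#include <assert.h>
#include <fcntl.h>      /* open() flags, for callers */
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Synchronously flush bytes [offset, offset + len) of a MAP_SHARED mapping.
 * msync() needs a page-aligned start, so round down and widen the range. */
int flush_range(void *map_base, size_t offset, size_t len)
{
    assert(map_base != NULL && len > 0);

    uintptr_t page    = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start   = (uintptr_t)map_base + offset;
    uintptr_t aligned = start & ~(page - 1);   /* page size is a power of two */

    return msync((void *)aligned, len + (start - aligned), MS_SYNC);
}
```

For example, after changing one byte at offset 5000, `flush_range(map, 5000, 1)` writes back only the page containing that byte instead of the entire mapping.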
Anonymous mmap: Shared Memory Without a File
MAP_ANONYMOUS creates a mapping not backed by any file. The memory is
initialized to zero. Combined with MAP_SHARED, the mapping is inherited
across fork() and stays shared rather than becoming copy-on-write, so
parent and child see each other's writes.
/* anon_mmap.c -- shared memory between parent and child */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
int main(void)
{
    int *shared = mmap(NULL, sizeof(int),
                       PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS,
                       -1, 0);
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    *shared = 0;

    pid_t pid = fork();
    if (pid == -1) { perror("fork"); return 1; }

    if (pid == 0) {
        *shared = 42;
        printf("child set *shared = %d\n", *shared);
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    printf("parent reads *shared = %d\n", *shared);

    munmap(shared, sizeof(int));
    return 0;
}
$ gcc -Wall -o anon_mmap anon_mmap.c && ./anon_mmap
child set *shared = 42
parent reads *shared = 42
Driver Prep: Kernel drivers often use remap_pfn_range() to map device memory or DMA buffers into user space. The user-space side calls mmap() on the device file. Understanding MAP_SHARED here is essential preparation.
Large File Processing with mmap and madvise
mmap is ideal for processing large files. The kernel pages data in on demand and can evict pages under memory pressure.
/* mmap_large.c -- count bytes in a large file via mmap */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return 1; }
    if (st.st_size == 0) { printf("empty file\n"); close(fd); return 0; }

    const char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    close(fd);

    /* Advise kernel we will read sequentially (a hint, so failure is not fatal) */
    if (madvise((void *)data, st.st_size, MADV_SEQUENTIAL) == -1)
        perror("madvise");

    unsigned char target = 'e';
    size_t count = 0;
    for (off_t i = 0; i < st.st_size; i++) {
        if ((unsigned char)data[i] == target) count++;
    }
    printf("'%c' appears %zu times in %s (%ld bytes)\n",
           target, count, argv[1], (long)st.st_size);

    munmap((void *)data, st.st_size);
    return 0;
}
madvise() hints to the kernel how you plan to access the data:
| Hint | Meaning |
|---|---|
| MADV_SEQUENTIAL | Will read sequentially; prefetch aggressively |
| MADV_RANDOM | Will read randomly; do not prefetch |
| MADV_WILLNEED | Will need these pages soon; start loading |
| MADV_DONTNEED | Done with these pages; can be reclaimed |
Try It: Map /usr/share/dict/words (if available) and count how many words start with the letter 'z'. Compare the speed against read() in a loop.
Rust: The memmap2 Crate
The Rust standard library does not include mmap. The memmap2 crate provides
a safe wrapper. Add to Cargo.toml:
[dependencies]
memmap2 = "0.9"
// mmap_read.rs -- read a file via mmap in Rust
use memmap2::Mmap;
use std::fs::File;

fn main() -> std::io::Result<()> {
    let path = std::env::args().nth(1).expect("usage: mmap_read <file>");
    let file = File::open(&path)?;
    let mmap = unsafe { Mmap::map(&file)? };

    // mmap implements Deref<Target=[u8]>, so we can use it as a byte slice
    println!("file size: {} bytes", mmap.len());

    let preview = std::cmp::min(80, mmap.len());
    let text = String::from_utf8_lossy(&mmap[..preview]);
    println!("first {} bytes:\n{}", preview, text);

    let lines = mmap.iter().filter(|&&b| b == b'\n').count();
    println!("total lines: {}", lines);

    Ok(())
    // mmap is automatically unmapped when dropped
}
Rust Note: Mmap::map() is unsafe because the file could be modified by another process or truncated while you hold the mapping, causing undefined behavior (SIGBUS). This is the same risk as in C -- mmap is inherently a shared-memory interface.
Writable mmap in Rust
// mmap_write.rs -- modify a file via mmap in Rust
use memmap2::MmapMut;
use std::fs::OpenOptions;

fn main() -> std::io::Result<()> {
    let path = "/tmp/mmap_write_rs.txt";
    std::fs::write(path, b"Hello, World! Memory-mapped Rust.\n")?;

    let file = OpenOptions::new().read(true).write(true).open(path)?;
    let mut mmap = unsafe { MmapMut::map_mut(&file)? };

    println!("before: {}", String::from_utf8_lossy(&mmap[..]));
    mmap[..5].copy_from_slice(b"HOWDY");
    mmap.flush()?;
    println!("after: {}", String::from_utf8_lossy(&mmap[..]));

    let contents = std::fs::read_to_string(path)?;
    print!("verify: {}", contents);

    Ok(())
}
mprotect: Guard Pages
mprotect() changes the protection on an existing mapping. One use case is
guard pages -- regions marked PROT_NONE that cause a segfault on access,
used to detect stack overflows or buffer overruns.
/* guard_page.c -- use mprotect to create a guard page */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
static void handler(int sig, siginfo_t *info, void *ctx)
{
    (void)ctx;
    /* printf() is not async-signal-safe; tolerable here only because
       this demo exits immediately afterwards */
    printf("caught %s at address %p\n",
           sig == SIGSEGV ? "SIGSEGV" : "SIGBUS",
           info->si_addr);
    _exit(1);
}

int main(void)
{
    long page_size = sysconf(_SC_PAGESIZE);

    void *region = mmap(NULL, 2 * page_size,
                        PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }

    /* Second page becomes the guard page */
    void *guard = (char *)region + page_size;
    if (mprotect(guard, page_size, PROT_NONE) == -1) {
        perror("mprotect");
        return 1;
    }

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    char *usable = (char *)region;
    usable[0] = 'A';
    printf("wrote to usable page OK\n");

    printf("about to touch guard page...\n");
    char *bad = (char *)guard;
    bad[0] = 'B'; /* triggers SIGSEGV */

    munmap(region, 2 * page_size);
    return 0;
}
$ gcc -Wall -o guard_page guard_page.c && ./guard_page
wrote to usable page OK
about to touch guard page...
caught SIGSEGV at address 0x7f8a12341000
Memory layout:
+-------------------+-------------------+
|    usable page    |    guard page     |
|    PROT_READ |    |    PROT_NONE      |
|    PROT_WRITE     |    (any access =  |
|                   |     SIGSEGV)      |
+-------------------+-------------------+
^                   ^
region              region + page_size
mmap for Device Register Access (Preview)
In embedded and driver work, hardware registers live at fixed physical
addresses. User-space programs can access them by mapping /dev/mem:
/* devmem_preview.c -- concept only, requires root */
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdint.h>
int main(void)
{
    off_t phys_addr = 0xFE200000; /* hypothetical GPIO base */
    size_t page_size = sysconf(_SC_PAGESIZE);

    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd == -1) { perror("open /dev/mem (need root)"); return 1; }

    void *map = mmap(NULL, page_size,
                     PROT_READ | PROT_WRITE, MAP_SHARED,
                     fd, phys_addr);
    if (map == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    volatile uint32_t *regs = (volatile uint32_t *)map;
    uint32_t val = regs[0];
    printf("register at 0x%lx = 0x%08x\n", (unsigned long)phys_addr, val);

    munmap(map, page_size);
    close(fd);
    return 0;
}
Caution: Writing to the wrong physical address via /dev/mem can crash your system, corrupt data, or damage hardware. Production systems use proper kernel drivers with request_mem_region() and ioremap().
The volatile keyword is critical. Without it, the compiler may optimize
away reads and writes to hardware registers. Hardware registers are
side-effectful -- reading a status register may clear an interrupt flag.
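One common way to keep volatile from being forgotten is to route every register access through small accessor functions. The sketch below uses hypothetical helper names and assumes 32-bit registers at word-aligned byte offsets. It can be exercised against an ordinary array standing in for a mapped register block, so it runs without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 32-bit register at byte offset `off` from a mapped block. */
static inline uint32_t reg_read32(volatile void *base, size_t off)
{
    assert(off % 4 == 0);   /* assumed: registers are word-aligned */
    return *(volatile uint32_t *)((volatile char *)base + off);
}

/* Write `val` to the 32-bit register at byte offset `off`. */
static inline void reg_write32(volatile void *base, size_t off, uint32_t val)
{
    assert(off % 4 == 0);
    *(volatile uint32_t *)((volatile char *)base + off) = val;
}

/* Read-modify-write: set the bits in `mask`, leave the others untouched. */
static inline void reg_set_bits32(volatile void *base, size_t off,
                                  uint32_t mask)
{
    reg_write32(base, off, reg_read32(base, off) | mask);
}
```

Note that volatile only stops the compiler from removing or merging the accesses. It is not a memory barrier, so ordering against other memory operations may still need explicit barriers on some architectures.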
Driver Prep: Device drivers map hardware registers into user space via the driver's mmap file operation, which calls remap_pfn_range(). The user-space pattern is always: open device fd, mmap it, read/write through pointers.
Comparing I/O Methods
Method           Copies   Syscalls/access   Best for
--------------   ------   ---------------   --------
read()/write()   1-2      1 per call        Small files, streaming
stdio (fread)    2        ~1 per buffer     General purpose
mmap             0        0 (after fault)   Large files, random access,
                                            shared memory, device regs
mmap wins on: zero-copy access, automatic caching, cheap random access, and inter-process shared memory. It loses on: overhead for small files (minimum one page), harder error handling (SIGBUS), inability to grow without remapping, and inability to work with pipes, sockets, or non-seekable files.
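To make the comparison concrete, here is the same task written both ways: counting newlines with a read() loop and with mmap. This is a sketch for side-by-side reading rather than a benchmark. The function names and the 4096-byte buffer size are arbitrary choices made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count newlines with read(): each call copies a chunk into `buf`. */
long count_newlines_read(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    char buf[4096];
    long lines = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0) {   /* one syscall per chunk */
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n') lines++;
    }
    close(fd);
    return n == -1 ? -1 : lines;
}

/* Count newlines with mmap(): scan the mapped pages in place. */
long count_newlines_mmap(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) { close(fd); return -1; }
    if (st.st_size == 0)      { close(fd); return 0; }  /* cannot map 0 bytes */

    const char *data = mmap(NULL, (size_t)st.st_size, PROT_READ,
                            MAP_PRIVATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED)
        return -1;

    long lines = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (data[i] == '\n') lines++;

    munmap((void *)data, (size_t)st.st_size);
    return lines;
}
```

Both functions return the same count for a regular file. Only the read() version also works on pipes and sockets, which matches the limitation noted above.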
Quick Knowledge Check
- What is the difference between MAP_SHARED and MAP_PRIVATE when you write to a mapped page?
- Why must you call msync() if you need to guarantee data has reached disk?
- What signal does the kernel deliver if you access an mmap'd region after the file has been truncated shorter than the mapping?
Common Pitfalls
- Mapping a zero-length file. mmap with length 0 returns an error. Always check st_size before mapping.
- Forgetting munmap(). Leaked mappings consume virtual address space. In long-running processes this eventually causes mmap to fail.
- Ignoring SIGBUS. If the file is truncated while mapped, accessing beyond the new end delivers SIGBUS, not SIGSEGV.
- Using MAP_FIXED casually. It silently overwrites existing mappings.
- Writing past the mapping size. The kernel maps whole pages, so a write just past the requested length may not fault at all: it lands in the tail of the last page, where it either changes file bytes you did not mean to touch or is silently dropped if it lies beyond the end of the file. Only a write past the last mapped page faults, typically with SIGSEGV.
- Missing volatile on device registers. The compiler may optimize away your hardware accesses without it.
- Forgetting O_SYNC for device memory. Without it, the kernel may use caching that reorders stores to device registers.