Anatomy of a Process Address Space
Type this right now
// save as regions.c, then compile: gcc -g -o regions regions.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int initialized_global = 42; // .data
int uninitialized_global; // .bss
const char *string_lit = "I live in .rodata"; // string bytes in .rodata (the pointer itself is in .data)

int main(void) {
    int stack_var = 1;
    int *heap_var = malloc(64);
    if (heap_var == NULL) {
        return 1; // allocation failed, nothing to show
    }

    printf("--- Memory Regions ---\n");
    printf("Text (main): %p\n", (void *)main);
    printf("Rodata (string): %p\n", (void *)string_lit);
    printf("Data (init): %p\n", (void *)&initialized_global);
    printf("BSS (uninit): %p\n", (void *)&uninitialized_global);
    printf("Heap (malloc): %p\n", (void *)heap_var);
    printf("Stack (local): %p\n", (void *)&stack_var);
    printf("PID: %d (inspect /proc/%d/maps)\n", (int)getpid(), (int)getpid());

    sleep(30); // time to inspect
    free(heap_var);
    return 0;
}
Compile and run it. While it sleeps, open another terminal and run
cat /proc/<PID>/maps. You'll see every region we're about to discuss.
THE Diagram
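This is the (simplified) memory layout of a running process on x86-64 Linux. Exact addresses change from run to run because of ASLR, but the ordering of the regions holds. Commit it to memory.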
0xFFFF_FFFF_FFFF_FFFF ┌─────────────────────────────────────────────┐
                      │                                             │
                      │                Kernel Space                 │
                      │     (mapped into every process, but you     │
                      │      can't touch it: ring 0 only)           │
                      │                                             │
0xFFFF_8000_0000_0000 ├─────────────────────────────────────────────┤
                      │                                             │
                      │         (non-canonical address gap)         │
                      │                                             │
0x0000_7FFF_FFFF_FFFF ├─────────────────────────────────────────────┤
                      │                                             │
                      │                Stack [rw-p]                 │
                      │              grows ↓ downward               │
                      │   (local vars, return addrs, saved regs)    │
                      │                                             │
                      ├ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┤
                      │        Guard page [---p] (unmapped)         │
                      ├─────────────────────────────────────────────┤
                      │                                             │
                      │            Memory-mapped region             │
                      │  (shared libraries: libc.so, ld-linux.so)   │
                      │       (mmap'd files, anonymous mmap)        │
                      │                                             │
                      ├─────────────────────────────────────────────┤
                      │                                             │
                      │                    (gap)                    │
                      │                                             │
                      ├─────────────────────────────────────────────┤
                      │                                             │
                      │                 Heap [rw-p]                 │
                      │               grows ↑ upward                │
                      │    (malloc, calloc, Box::new, Vec::new)     │
                      │                                             │
                      ├─────────────────────────────────────────────┤
                      │                 BSS [rw-p]                  │
                      │   (uninitialized globals, zeroed at load)   │
                      ├─────────────────────────────────────────────┤
                      │                 Data [rw-p]                 │
                      │      (initialized globals: int x = 42)      │
                      ├─────────────────────────────────────────────┤
                      │                Rodata [r--p]                │
                      │       (string literals, const arrays)       │
                      ├─────────────────────────────────────────────┤
                      │                 Text [r-xp]                 │
                      │    (your compiled code: machine instr.)     │
0x0000_0000_0000_0000 └─────────────────────────────────────────────┘
Now let's walk through each region from bottom to top.
Text: your compiled code
The .text section holds your program's machine instructions — the compiled output of every
function you wrote.
| Property | Value |
|---|---|
| Permissions | r-xp (read, execute, no write) |
| Source | Loaded from the ELF binary |
| Lifetime | Entire process lifetime |
| Who manages | OS loader maps it from disk |
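Read and execute, but not writable. The hardware enforces this through the permission bits in the page tables. If code pages were writable, any bug that lets an attacker corrupt memory could be used to overwrite instructions in place. Modern systems therefore follow a W^X (write XOR execute) policy: by default, a page is mapped as writable or executable, not both. The CPU enforces whatever permissions the page tables specify; the policy itself comes from the OS and the toolchain, and a program can still ask for a page that is both writable and executable (JIT compilers do).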
🧠 What do you think happens?
What if you cast a function pointer to int * and try to write to it?
int *p = (int *)main;
*p = 0x90909090; // NOP sled?
Try it. The CPU will raise a fault before the write completes.
Data: initialized globals
int answer = 42; // C: goes in .data
static mut ANSWER: i32 = 42; // Rust: goes in .data (an immutable static would go in .rodata)
This section holds global and static variables that have explicit initial values. The values are
stored in the ELF binary itself: if you hex-dump the binary (for example with xxd), the bytes
2a 00 00 00 (42 as a little-endian 32-bit integer) are literally sitting in the file.
| Property | Value |
|---|---|
| Permissions | rw-p (read + write) |
| Source | Values loaded from ELF binary |
| Lifetime | Entire process lifetime |
| Who manages | OS loader |
BSS: uninitialized globals
int counter; // C: goes in .bss (implicitly zero)
static int buffer[4096]; // C: 16KB of zeros — in .bss
static mut COUNTER: i32 = 0; // Rust: goes in .bss (explicitly zero)
BSS stands for "Block Started by Symbol" — an old assembler directive. What matters: the OS zeroes this memory at load time. The values are not stored in the binary.
| Property | Value |
|---|---|
| Permissions | rw-p (read + write) |
| Source | Zeroed by OS at load time — NOT stored on disk |
| Lifetime | Entire process lifetime |
| Who manages | OS loader |
💡 Fun Fact
If you declare static int bigarray[1000000]; in C, your binary does NOT grow by 4MB. The ELF file just records "I need 4,000,000 bytes of BSS." The OS allocates and zeroes them when the process starts. This is why BSS exists: it would be absurd to store millions of zeros on disk.
To see the savings yourself:
$ readelf -S regions | grep -E "\.data|\.bss"
[24] .data PROGBITS 0000000000004000 003000 000008 0 WA 0 0 8
[25] .bss NOBITS 0000000000004008 003008 000004 0 WA 0 0 4
Notice .bss is NOBITS. Zero bytes on disk. Full size in memory.
Heap: dynamic allocation
int *p = malloc(100); // C: heap allocation
let p = Box::new(42); // Rust: heap allocation
let v = vec![1, 2, 3]; // Rust: heap allocation (via Vec)
The heap is where dynamic allocations live. It starts above BSS (at a randomized offset when ASLR is enabled) and grows upward toward higher addresses.
| Property | Value |
|---|---|
| Permissions | rw-p (read + write) |
| Source | Allocated at runtime via brk or mmap system calls |
| Lifetime | Until explicitly freed (C) or dropped (Rust) |
| Who manages | The allocator (malloc/free), kernel provides pages |
The heap is managed in two layers:
Your code       malloc(100) / Box::new(42)
                     │
                     ▼
Allocator       glibc malloc / jemalloc / etc.
(user space)    Maintains free lists, splits/merges blocks
                     │
                     ▼
Kernel          brk() for small allocations
                mmap() for large allocations (>128KB)
We'll dissect the allocator in Chapter 20. For now, know that malloc doesn't call the kernel
every time — it maintains its own pool.
Stack: function call frames
void foo() {
    int x = 10; // lives on the stack
    int arr[100]; // 400 bytes on the stack
}
fn foo() {
    let x: i32 = 10; // lives on the stack
    let arr = [0i32; 100]; // 400 bytes on the stack
}
The stack starts near the top of user space and grows downward toward lower addresses.
| Property | Value |
|---|---|
| Permissions | rw-p (read + write, no execute) |
| Source | Allocated by the OS when the process starts |
| Lifetime | Until the function returns |
| Who manages | The CPU (rsp register), compiler (frame layout) |
| Size limit | Default 8MB (ulimit -s) |
Every function call pushes a frame onto the stack: return address, saved registers, local
variables. Every return pops it. The stack pointer (rsp) moves up and down — that's it. No
allocator, no free lists, no fragmentation. One register, one instruction to allocate, one
instruction to free.
That's why the stack is fast.
Memory-mapped regions
Between the heap and the stack, you'll find memory-mapped regions. These include:
- Shared libraries: libc.so, ld-linux-x86-64.so, libpthread.so
- Anonymous mappings: large malloc calls (>128KB) use mmap instead of brk
- File mappings: mmap() can map a file directly into your address space
7f8a12000000-7f8a12200000 r--p /usr/lib/x86_64-linux-gnu/libc.so.6
7f8a12200000-7f8a12395000 r-xp /usr/lib/x86_64-linux-gnu/libc.so.6
7f8a12395000-7f8a123ed000 r--p /usr/lib/x86_64-linux-gnu/libc.so.6
Notice libc has multiple entries — different sections (code, read-only data, writable data) are mapped with different permissions. Same library, different protection levels.
Kernel space: here be dragons
0xFFFF_8000_0000_0000 and above
The top half of the address space is reserved for the kernel. Traditionally it is mapped into every
process's page table, with the page table entries marked supervisor only. (On kernels with page-table
isolation enabled, most of those mappings are removed while user code runs, but the range stays off
limits either way.) The CPU checks your current privilege level (ring 3 for user code) against the
page permissions (ring 0 required for kernel pages). If you try to access kernel memory from user
code, the CPU raises a page fault, and the kernel handles it by sending your process a SIGSEGV.
You interact with kernel space only through system calls — read, write, mmap, brk. Those
switch the CPU to ring 0, run kernel code, then switch back. That boundary is absolute.
Rust: same layout, different guarantees
Here's the key insight: Rust programs have the exact same memory layout as C programs.
static GLOBAL: i32 = 42; // .rodata, like a C const global (a static mut would go in .data)
static UNINIT: std::sync::atomic::AtomicI32 = std::sync::atomic::AtomicI32::new(0); // .bss (zero-initialized)
fn main() { // .text, same as C
    let local = 10; // stack, same as C
    let boxed = Box::new(20); // heap, same as C
}
Ownership, borrowing, lifetimes — they exist only at compile time. The generated machine code uses
the same stack, the same heap, the same text/data/bss sections. rustc doesn't invent a new memory
model. It enforces rules about how you use the one that already exists.
💡 Fun Fact
You can link Rust and C code together into a single binary. They share the same address space, the same heap, the same stack. A Rust function can call a C function (via extern "C") and the stack frames interleave seamlessly. There's no boundary at runtime, only at compile time.
🔧 Task
Write a program (in C or Rust — or both) that places data in every region:
- A function → text
- An initialized global → data
- An uninitialized global → BSS
- A string literal → rodata
- A malloc/Box::new → heap
- A local variable → stack
Print the address of each. Then, while the program is sleeping, run:
$ cat /proc/<PID>/maps
For each printed address, find the corresponding line in the maps output. Verify:
- The address falls within the range on that line
- The permissions match what you'd expect (code is r-xp, globals are rw-p, etc.)
- The pathname column tells you whether it's from your binary, a library, or anonymous
Bonus: Use readelf -S ./regions to list all sections. Find .text, .data, .bss, and .rodata. Compare their sizes with what you'd predict from your code.