The Heap: Flexible and Dangerous
Type this right now
// save as heap_intro.c — compile: gcc -o heap_intro heap_intro.c
#include <stdio.h>
#include <stdlib.h>
int main() {
int *p = malloc(100);
printf("malloc(100) returned: %p\n", (void *)p);
printf("\nProcess maps (heap region):\n");
FILE *f = fopen("/proc/self/maps", "r");
char line[256];
while (fgets(line, sizeof(line), f)) {
if (strstr(line, "[heap]")) printf(" %s", line);
}
fclose(f);
free(p);
return 0;
}
$ gcc -o heap_intro heap_intro.c && ./heap_intro
malloc(100) returned: 0x55a8b7e002a0
Process maps (heap region):
55a8b7e00000-55a8b7e21000 rw-p 00000000 00:00 0 [heap]
You asked for 100 bytes. The kernel gave the allocator 135,168 bytes (0x21000 = 132KB). That discrepancy is the whole story of this chapter.
The two sides of dynamic allocation
C — explicit control, explicit danger:
int *p = malloc(100); // allocate 100 bytes
*p = 42; // use it
free(p); // you MUST remember to do this
Rust — same allocation, automatic cleanup:
#![allow(unused)] fn main() { let p = Box::new(42); // allocate on heap println!("{}", *p); // use it // p is dropped here — free() called automatically }
Vec, String, Box — every Rust heap type calls the allocator underneath. The difference is
that Rust's ownership system guarantees free happens exactly once, at the right time.
The allocator as middleman
Your code never talks to the kernel directly for heap memory. There's a middleman:
┌─────────────────┐
│ Your Code │ malloc(100) / Box::new(42)
│ │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Allocator │ glibc malloc / jemalloc / mimalloc
│ (user space) │ Maintains free lists, bins, arenas
│ │ Splits large blocks, coalesces freed ones
└────────┬────────┘
│ Needs more memory?
▼
┌─────────────────┐
│ Kernel │ brk() — move the program break
│ │ mmap() — map new anonymous pages
└─────────────────┘
The allocator requests large chunks from the kernel, then carves them up to serve your small
malloc calls. This amortizes the cost of system calls — calling the kernel is expensive (hundreds
of cycles), but adjusting a pointer in a free list is cheap (a few cycles).
brk and sbrk: the old way
The simplest heap mechanism is brk/sbrk. The kernel maintains a pointer called the
program break — the end of the data segment. You can move it:
Before any allocation:
┌──────────────┐
│ Text │
│ Data │
│ BSS │
└──────┬───────┘
│ ← program break (brk)
│
▼ (unmapped)
After sbrk(4096):
┌──────────────┐
│ Text │
│ Data │
│ BSS │
├──────────────┤
│ Heap (4KB) │ ← new memory
└──────┬───────┘
│ ← program break moved up by 4096
▼ (unmapped)
#include <unistd.h>
void *old_brk = sbrk(0); // get current break
sbrk(4096); // move break up by 4096 bytes
void *new_brk = sbrk(0); // verify
printf("grew by: %ld\n", (char *)new_brk - (char *)old_brk);
Simple. Contiguous. But limited — you can only grow or shrink from one end. You can't free memory in the middle and return it to the OS.
mmap anonymous: the modern way
For large allocations (glibc's threshold is typically 128KB), the allocator skips brk entirely
and asks the kernel for pages directly:
#include <sys/mman.h>
void *p = mmap(NULL, 1048576, // 1 MB
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1, 0);
// use p...
munmap(p, 1048576); // return to kernel
mmap returns memory from anywhere in the address space — it doesn't need to be contiguous with
the heap. And when you munmap, the pages go back to the kernel immediately. No fragmentation
problem at the kernel level.
💡 Fun Fact
When you
malloc(1000000)in glibc (about 1MB), it doesn't usebrk. It callsmmapinternally. You can observe this withstrace:$ strace -e brk,mmap ./your_program mmap(NULL, 1003520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f...Small allocations use
brk. Large ones usemmap. The crossover point is tunable (mallopt(M_MMAP_THRESHOLD, ...)).
Fragmentation: the heap's worst enemy
Here's where the heap gets ugly. Watch this allocation sequence:
1. Allocate A (32 bytes), B (64 bytes), C (32 bytes):
┌────┬──────────┬────┐
│ A │ B │ C │
│ 32 │ 64 │ 32 │
└────┴──────────┴────┘
2. Free B:
┌────┬──────────┬────┐
│ A │ (free) │ C │
│ 32 │ 64 │ 32 │
└────┴──────────┴────┘
3. Try to allocate D (96 bytes):
┌────┬──────────┬────┬────────────┐
│ A │ (free) │ C │ D │
│ 32 │ 64 │ 32 │ 96 │
└────┴──────────┴────┴────────────┘
^ ^^^^^^^^
│ 64 bytes wasted — big enough
│ for some things, but not for D
The 64-byte hole can never be used for anything that needs more than 64 bytes. Over time, your heap becomes Swiss cheese — lots of small holes, no large contiguous blocks. Total free memory might be 10MB, but the largest contiguous block is 64 bytes.
This is external fragmentation, and it's one of the hardest problems in allocator design.
There's also internal fragmentation: you ask for 100 bytes, the allocator gives you 128 (rounded up for alignment). Those 28 extra bytes are wasted.
Internal fragmentation:
┌──────────────────────┐
│ Your 100 bytes │ pad │ ← allocator gives you a 128-byte block
│ │ 28 │ 28 bytes wasted inside the block
└──────────────────────┘
C's four deadly heap bugs
The heap in C is a minefield. Here are the four classic ways to corrupt it, each with code and a diagram.
1. Use-after-free
int *p = malloc(sizeof(int));
*p = 42;
free(p); // p is now a dangling pointer
*p = 99; // BUG: writing to freed memory
After malloc: After free: After *p = 99:
┌──────────┐ ┌──────────┐ ┌──────────┐
│ p → [42] │ allocated │ p → [??] │ freed │ p → [99] │ CORRUPTED
└──────────┘ └──────────┘ └──────────┘
Allocator may have Someone else may
recycled this block own this memory now
The pointer p still holds the old address, but that memory may now belong to a different
allocation. Writing to it silently corrupts someone else's data. This is the most common source
of security vulnerabilities in C programs.
2. Double-free
int *p = malloc(sizeof(int));
free(p);
free(p); // BUG: freeing the same block twice
Free list before: After free(p): After free(p) again:
HEAD → [blk_A] HEAD → [p] → [blk_A] HEAD → [p] → [p] → [blk_A]
^^^^^^^^^^^^
Circular! Allocator metadata
is now corrupted.
Double-free corrupts the allocator's internal free list. The next two malloc calls may return
the same pointer, causing two parts of your program to stomp on each other's data.
3. Buffer overflow (heap)
int *p = malloc(10 * sizeof(int)); // 40 bytes
for (int i = 0; i <= 10; i++) { // BUG: off-by-one, writes 11 elements
p[i] = i;
}
Heap memory layout:
┌──────────────────────────────┬──────────────────────┐
│ p's allocation (40 bytes) │ allocator metadata │
│ [0][1][2]...[9] │ [size|flags|next] │
└──────────────────────────────┴──────────────────────┘
^^^^^^^^^^^^^^^^
p[10] writes HERE
Corrupts the allocator's
bookkeeping for the next block
You wrote past the end of your allocation. The bytes that follow are typically allocator metadata
for the next block — size, flags, free-list pointers. Corrupting them means the next malloc or
free behaves unpredictably. The crash often happens far from the bug, making it nightmarish
to debug.
4. Memory leak
void process_data() {
int *data = malloc(1024 * 1024); // 1 MB
// ... use data ...
return; // BUG: forgot to call free(data)
}
After calling process_data() 1000 times:
Heap:
┌──────┬──────┬──────┬──────┬──────┬──────┬─── ─┬──────┐
│ 1 MB │ 1 MB │ 1 MB │ 1 MB │ 1 MB │ 1 MB │ ... │ 1 MB │
│ LOST │ LOST │ LOST │ LOST │ LOST │ LOST │ LOST │ LOST │
└──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘
← 1 GB of memory, no way to free it →
Process RSS grows without bound. Eventually: OOM killer.
No pointer to the memory means no way to free it. The memory is allocated but unreachable. In a long-running server, this is death by a thousand cuts.
Rust prevents all four
Here's the remarkable thing: Rust's ownership system makes all four bugs compile-time errors.
Use-after-free: impossible
#![allow(unused)] fn main() { let p = Box::new(42); drop(p); // explicitly free println!("{}", *p); // COMPILE ERROR: use of moved value `p` }
error[E0382]: borrow of moved value: `p`
--> src/main.rs:4:20
|
2 | let p = Box::new(42);
| - move occurs because `p` has type `Box<i32>`
3 | drop(p);
| - value moved here
4 | println!("{}", *p);
| ^^ value borrowed here after move
After drop, the compiler knows p is gone. You can't use it. Period.
Double-free: impossible
#![allow(unused)] fn main() { let p = Box::new(42); drop(p); drop(p); // COMPILE ERROR: use of moved value `p` }
Same error. After the first drop, p is moved. You can't drop it again.
Buffer overflow: caught at runtime
#![allow(unused)] fn main() { let v = vec![0; 10]; println!("{}", v[10]); // RUNTIME PANIC: index out of bounds }
thread 'main' panicked at 'index out of bounds: the len is 10 but the index is 10'
Rust bounds-checks every array and vector access in safe code. You can opt out with
get_unchecked() in unsafe blocks, but the default is safety.
Memory leak: prevented by Drop
#![allow(unused)] fn main() { fn process_data() { let data = vec![0u8; 1024 * 1024]; // 1 MB on the heap // ... use data ... } // data is dropped here — Vec's Drop impl calls free() }
When data goes out of scope, Rust calls its Drop implementation, which frees the heap memory.
You didn't write free. You didn't need to remember. The compiler inserted it for you.
🧠 What do you think happens?
Rust makes leaking memory safe — you can intentionally leak with
Box::leak()ormem::forget(). Why would Rust allow this? Because leaking memory doesn't cause undefined behavior — it's wasteful, but it can't corrupt other data or violate memory safety. The safety guarantee is about soundness, not resource efficiency.
Side by side: the bug and the save
| Bug | C code | What happens | Rust equivalent | What happens |
|---|---|---|---|---|
| Use-after-free | free(p); *p = 1; | Silent corruption | drop(p); *p = 1; | Compile error |
| Double-free | free(p); free(p); | Allocator corruption | drop(p); drop(p); | Compile error |
| Buffer overflow | p[11] on size-10 | Silent write past end | v[11] on size-10 | Panic with message |
| Memory leak | Forget to free | Memory grows forever | Scope exit | Drop runs automatically |
Seeing the allocator in action
You can watch malloc call the kernel using strace:
$ strace -e brk,mmap,munmap ./heap_intro
brk(NULL) = 0x55a8b7e00000
brk(0x55a8b7e21000) = 0x55a8b7e21000
...
That first brk(NULL) queries the current program break. The second moves it up by 0x21000 bytes
(132KB). That's the allocator requesting its initial arena from the kernel — even though you only
asked for 100 bytes.
For a larger allocation:
void *big = malloc(256 * 1024); // 256 KB
$ strace -e brk,mmap ./big_alloc
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f...
256KB > 128KB threshold, so malloc chose mmap instead of brk.
Heap metadata: what the allocator tracks
Every malloc block has hidden metadata. The allocator stores bookkeeping right next to your
data:
What malloc(100) actually looks like in memory:
┌─────────────────┬──────────────────────────────────┐
│ Chunk header │ Your 100 bytes (user data) │
│ (16 bytes) │ │
│ size | flags │ │
└─────────────────┴──────────────────────────────────┘
^ ^
│ └── pointer returned by malloc
└── actual start of the allocation
The pointer you get back is PAST the header.
This is why buffer overflows are so catastrophic — write past your allocation and you corrupt the next chunk's header. The allocator trusts its own metadata. If you corrupt it, all bets are off.
💡 Fun Fact
glibc's malloc stores the size of the current chunk AND uses the low bits of the size field as flags (since chunks are always aligned to 16 bytes, the bottom 4 bits are free for flags). One bit indicates "previous chunk is in use." This clever trick is why glibc malloc can coalesce adjacent free blocks efficiently — but it's also why heap corruption is so devastating.
Valgrind: your heap safety net in C
Since C can't catch heap bugs at compile time, you use runtime tools. Valgrind is the classic:
// save as uaf.c — compile: gcc -g -o uaf uaf.c
#include <stdlib.h>
int main() {
int *p = malloc(sizeof(int));
*p = 42;
free(p);
*p = 99; // use-after-free
return 0;
}
$ valgrind ./uaf
==12345== Invalid write of size 4
==12345== at 0x401156: main (uaf.c:7)
==12345== Address 0x4a47040 is 0 bytes inside a block of size 4 free'd
==12345== at 0x483CA3F: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12345== by 0x40114D: main (uaf.c:6)
Valgrind tells you exactly what happened, where, and when the block was freed. It's the closest C gets to Rust's compile-time guarantees — but it only catches bugs you actually trigger at runtime, and it makes your program 20-50x slower.
Rust's allocator: same mechanism, different interface
Rust uses the system allocator by default (glibc malloc on Linux). Box::new(42) calls malloc
internally. drop(box_value) calls free.
Rust C equivalent
─────────────────────────────────────────────
Box::new(42) malloc(4); *p = 42;
Vec::with_capacity(100) malloc(100 * elem_size)
String::from("hello") malloc(5); memcpy(...)
drop(x) free(x)
Same allocator. Same brk/mmap calls. Same heap metadata. The only difference is that Rust's
type system ensures you call free exactly once, at exactly the right time.
You can even swap in a different allocator in Rust:
#![allow(unused)] fn main() { use std::alloc::System; #[global_allocator] static GLOBAL: System = System; // use the system allocator explicitly // or use jemalloc, mimalloc, etc. via crates }
🔧 Task
Write a C program that demonstrates use-after-free:
int *p = malloc(sizeof(int)); *p = 42; free(p); printf("After free: %d\n", *p); // What prints? *p = 99;Compile and run it normally. Does it crash? (Probably not — that's what makes it dangerous.) Now run it under Valgrind:
valgrind ./uaf. Observe the report.Try the same pattern in Rust:
#![allow(unused)] fn main() { let p = Box::new(42); drop(p); println!("{}", *p); }Does it compile? Read the compiler error carefully — it tells you exactly what went wrong.
Use
strace -e brk,mmapon a program that does:
malloc(100)— observebrkmalloc(256 * 1024)— observemmap- Why the different system calls?
Challenge: Write a C program that allocates and frees memory in a pattern that creates fragmentation. Allocate 1000 blocks of 1KB, free the even-numbered ones, then try to allocate a single 500KB block. Does it fit in the holes? Use
/proc/self/mapsto observe the heap size before and after.