Undefined Behavior: C's Silent Killer
Type this right now
// save as ub_demo.c — compile TWICE with different optimization levels
#include <stdio.h>
#include <limits.h>
int main() {
int x = INT_MAX;
printf("x = %d\n", x);
printf("x + 1 = %d\n", x + 1);
if (x + 1 > x) {
printf("Overflow detected? x + 1 > x is TRUE\n");
} else {
printf("No overflow? x + 1 > x is FALSE\n");
}
return 0;
}
$ gcc -O0 -o ub_O0 ub_demo.c && ./ub_O0
x = 2147483647
x + 1 = -2147483648
No overflow? x + 1 > x is FALSE
$ gcc -O2 -o ub_O2 ub_demo.c && ./ub_O2
x = 2147483647
x + 1 = -2147483648
Overflow detected? x + 1 > x is TRUE
Read that again. The same code, compiled with different flags, produces opposite results.
The -O0 build wraps around and gives FALSE. The -O2 build says TRUE — the compiler
removed the comparison entirely because signed overflow is undefined behavior, so the compiler
assumes it cannot happen, and therefore x + 1 > x must always be true.
This isn't a bug in GCC. This is exactly what the C standard permits. Welcome to undefined behavior.
What undefined behavior actually means
The C standard defines three categories of problematic code:
- Implementation-defined: The behavior varies by compiler, but the compiler must document what it does (e.g., the size of int, right-shifting a negative signed value).
- Unspecified: The compiler can choose among several options and doesn't have to tell you which (e.g., the order of evaluation of function arguments).
- Undefined: The standard imposes no requirements whatsoever.
That last one is what kills you. When your program has undefined behavior, the compiler is allowed to:
- Produce the result you expected
- Produce a different result
- Crash
- Delete your code
- Make demons fly out of your nose (the community joke)
- Optimize as if the UB can never happen
That last point is the real danger. Modern optimizing compilers actively exploit undefined behavior. They don't just ignore your bug — they use it to justify removing your safety checks.
Example 1: signed integer overflow
The C standard says signed integer overflow is undefined. The compiler exploits this:
int check_overflow(int x) {
if (x + 1 > x) // compiler: "signed overflow can't happen"
return 1; // "so x + 1 is always > x"
else // "this branch is dead code"
return 0; // "I'll remove it"
}
At -O2, GCC compiles this to:
check_overflow:
mov eax, 1 ; just return 1, always
ret
The entire if statement is gone. The compiler reasoned: "Signed overflow is UB. I'm allowed to
assume the programmer never does UB. Therefore x + 1 is always greater than x. Therefore
the function always returns 1."
The logic is airtight given the assumption. The assumption is wrong for INT_MAX. But the
standard says you must never reach INT_MAX + 1, so the compiler is technically correct.
🧠 What do you think happens?
What if you change int to unsigned int in the example above? Unsigned overflow IS defined in C — it wraps around modulo 2^n. The compiler can no longer optimize away the check. Try it.
Example 2: null pointer "optimization"
void process(int *ptr) {
int value = *ptr; // dereference ptr — if ptr is NULL, this is UB
if (ptr == NULL) { // null check AFTER dereference
printf("ptr is NULL!\n");
return;
}
printf("value = %d\n", value);
}
A human reads this and thinks "the null check is there for safety." The compiler reads it differently:
Compiler's reasoning:
1. Line 2 dereferences ptr
2. If ptr were NULL, that would be UB
3. I'm allowed to assume UB doesn't happen
4. Therefore ptr is NOT NULL at line 2
5. Therefore ptr is NOT NULL at line 3
6. Therefore the NULL check always fails
7. I'll remove it
At -O2:
process:
mov eax, DWORD PTR [rdi] ; dereference ptr (no null check)
; the if (ptr == NULL) block is GONE
mov esi, eax
lea rdi, .LC0 ; "value = %d\n"
jmp printf
The null check was deleted. If ptr is NULL, the program crashes with no diagnostic. The "safety"
code you carefully wrote is not in the binary.
This is not a contrived example. The Linux kernel had exactly this bug in a network driver (CVE-2009-1897). A null check was removed by GCC because a dereference appeared earlier in the function.
Example 3: use-after-free, the optimizer's playground
#include <stdlib.h>
#include <stdio.h>
int main() {
int *p = malloc(sizeof(int));
*p = 42;
free(p);
// UB: accessing freed memory
// The compiler may reuse p's register for something else,
// or the allocator may recycle the memory.
printf("value = %d\n", *p);
return 0;
}
This might print 42, or 0, or 1735289204, or crash, depending on:
- Optimization level
- Allocator implementation
- Whether another thread allocated between the free and the *p
- Phase of the moon (only slightly joking)
The insidious part: it might work perfectly in testing and crash only in production. UB doesn't guarantee a crash. It guarantees nothing.
Example 4: uninitialized variables
int foo() {
int x; // uninitialized — reading it is UB
return x; // what value?
}
You might expect "whatever was on the stack." But the compiler is allowed to assume you never read uninitialized memory. This means:
#include <stdio.h>

void bar(void) {
int x; // uninitialized — reading it is UB
if (x == 0) {
printf("zero\n");
}
if (x != 0) {
printf("nonzero\n");
}
}
The compiler can print both, neither, or one of these messages. It's not required to be consistent
— x doesn't have to have the same value in both checks. GCC and Clang have both been observed
making different choices at different optimization levels.
💡 Fun Fact
The phrase "nasal demons" comes from a 1992 comp.std.c Usenet post. Someone argued that undefined behavior could cause "demons to fly out of your nose." It became a running joke, but it captures a real truth: the standard truly places NO constraints on what happens. The joke persists because the reality is hard to believe.
UB on Godbolt: see it live
You can see these optimizations yourself at godbolt.org. Paste the signed
overflow example, select x86-64 GCC with -O2, and watch the assembly. The comparison vanishes.
Try these experiments:
- Change -O2 to -O0 — the comparison reappears
- Change int to unsigned int — the comparison stays even at -O2
- Add -fwrapv (tells GCC to treat signed overflow as wrapping) — the comparison stays
-fwrapv is GCC's escape hatch: it makes signed overflow defined (as two's complement wrapping). Some projects opt out of this entire class of UB — the Linux kernel, for example, has built with the closely related -fno-strict-overflow for years.
Rust's answer: no undefined behavior in safe code
Rust makes a bold guarantee: safe Rust has no undefined behavior.
This isn't a suggestion or a best practice. It's a hard property of the language. The compiler rejects programs that could exhibit UB, or inserts runtime checks where compile-time prevention isn't possible.
Signed integer overflow:
fn main() {
let x: i32 = i32::MAX;
let y = x + 1; // In debug: PANIC at runtime
println!("{}", y);
}
$ cargo run
thread 'main' panicked at 'attempt to add with overflow'
$ cargo run --release
# In release mode: wraps to -2147483648 (defined behavior, not UB)
Rust made a choice: in debug mode, overflow panics (catches bugs). In release mode, overflow wraps (for performance). Either way, the behavior is defined. The compiler cannot exploit it for optimizations that break your code.
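And when overflow is a possibility you actually care about, Rust lets you say so explicitly. A short sketch of the standard library's overflow-handling methods, all of which are fully defined in both debug and release builds:

```rust
fn main() {
    let x: i32 = i32::MAX;

    // Each method names its overflow policy in the call site:
    assert_eq!(x.checked_add(1), None);                 // Option: None on overflow
    assert_eq!(x.wrapping_add(1), i32::MIN);            // two's-complement wrap
    assert_eq!(x.saturating_add(1), i32::MAX);          // clamp at the boundary
    assert_eq!(x.overflowing_add(1), (i32::MIN, true)); // (value, did_overflow)

    println!("every overflow mode is specified");
}
```

This is the deeper point: instead of one silent global rule, the intended behavior is written at the call site, and the compiler can never exploit it against you.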
Null pointers:
// Rust doesn't have null pointers.
// Option<&T> is the equivalent, and you MUST check it:
fn process(ptr: Option<&i32>) {
match ptr {
Some(value) => println!("value = {}", value),
None => println!("ptr is None!"),
}
}
// The compiler enforces the check. You can't dereference without matching.
Uninitialized variables:
fn main() {
let x: i32;
println!("{}", x); // COMPILE ERROR: use of possibly-uninitialized variable
}
error[E0381]: used binding `x` isn't initialized
--> src/main.rs:3:20
|
2 | let x: i32;
| - binding declared here but left uninitialized
3 | println!("{}", x);
| ^ `x` used here but it isn't initialized
No guessing. No "whatever was on the stack." The compiler rejects it outright.
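The analysis is flow-sensitive, not a blanket ban on deferred initialization: declaring now and assigning later is fine, as long as the compiler can prove every path assigns before the first read. A small sketch (pick is a made-up helper):

```rust
// Deferred initialization in safe Rust: legal because the compiler
// verifies that every branch assigns x before it is read.
fn pick(flag: bool) -> i32 {
    let x: i32;          // declared, not yet initialized
    if flag {
        x = 1;
    } else {
        x = 0;
    }
    x                    // OK: x is definitely initialized on all paths
}

fn main() {
    println!("{}", pick(true));
    println!("{}", pick(false));
}
```

Delete the else branch and the function stops compiling — the read of x on the flag == false path is no longer provably initialized.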
The unsafe boundary
Rust does allow operations that could cause UB — but only inside unsafe blocks:
fn main() {
let x: i32;
// This is safe Rust — the compiler prevents UB
// println!("{}", x); // won't compile

unsafe {
// In unsafe, you can do things that might cause UB
let ptr: *const i32 = 0x1234 as *const i32;
// let val = *ptr; // UB if address is invalid
}
}
The unsafe keyword is:
- Opt-in: You must explicitly ask for it
- Explicit: It marks exactly which code has elevated risk
- Auditable: You can grep for unsafe in any codebase
- Contained: The responsibility for correctness is localized
The idea is that 95% of your code is safe Rust (no UB possible), and 5% is unsafe (UB is your
problem). You audit the 5%. In C, you audit 100%.
🧠 What do you think happens?
If you have 10,000 lines of Rust and UB occurs, where do you look? You grep for unsafe — maybe 50 lines. In C, if you have 10,000 lines and UB occurs, where do you look? Everywhere. That's the practical value of the safe/unsafe boundary.
UB in unsafe Rust: still possible
unsafe Rust can still invoke UB. The most common causes:
unsafe {
// 1. Dereferencing a raw pointer to invalid memory
let ptr: *const i32 = std::ptr::null();
let _val = *ptr; // UB: null dereference

// 2. Creating an invalid reference
let _ref: &i32 = &*ptr; // UB: references must never be null

// 3. Data races
//    Two threads writing to the same memory without synchronization

// 4. Breaking aliasing rules
//    Having &T and &mut T to the same data simultaneously

// 5. Calling a function with the wrong ABI or invalid arguments
}
Unsafe Rust is roughly as dangerous as C for the code inside the unsafe block. The difference is
that the blast radius is contained — safe code can rely on invariants that unsafe code must
uphold.
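This containment is what makes sound abstractions possible: bury the unsafe block inside a function whose safe interface makes the UB unreachable from outside. A sketch (first_element is a hypothetical name for illustration):

```rust
// A safe wrapper around an unsafe operation. Callers can never trigger
// the UB, because the function checks its own precondition.
fn first_element(slice: &[i32]) -> Option<i32> {
    if slice.is_empty() {
        None
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        Some(unsafe { *slice.get_unchecked(0) })
    }
}

fn main() {
    println!("{:?}", first_element(&[7, 8, 9]));
    println!("{:?}", first_element(&[]));
}
```

Auditing this code means checking one thing: that the SAFETY comment's claim actually holds. That's the 5% you audit.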
The complete comparison
| Undefined Behavior in C | What happens | Rust equivalent | What happens |
|---|---|---|---|
| Signed overflow: INT_MAX + 1 | Compiler removes your checks | i32::MAX + 1 | Debug: panic. Release: wraps (defined) |
| Null deref: *NULL | Compiler removes null checks | No null pointers | Option<&T> forces you to check |
| Use-after-free: free(p); *p | Silent corruption | drop(p); *p | Compile error |
| Uninitialized read: int x; return x; | Anything — optimizer goes wild | let x: i32; x | Compile error |
| Buffer overflow: a[11] on a 10-element array | Silent corruption | a[11] on a 10-element array | Panic: index out of bounds |
| Double-free: free(p); free(p) | Allocator corruption | drop(p); drop(p) | Compile error |
| Data race: two threads, no sync | Torn reads, corruption | &mut T aliasing rules | Compile error |
| Invalid enum value | Optimizer makes wrong branch assumptions | Invalid enum via unsafe only | Safe code can't create one |
Compiler flags that help
If you must write C, these flags make UB less likely to bite you:
# Compile-time warnings:
gcc -Wall -Wextra -Wpedantic -Werror
# Runtime sanitizers (debug builds):
gcc -fsanitize=undefined # UBSan: catches UB at runtime
gcc -fsanitize=address # ASan: catches memory bugs
gcc -fsanitize=thread # TSan: catches data races
# UB-safe overflow handling:
gcc -fwrapv # Signed overflow wraps (like unsigned)
gcc -ftrapv # Signed overflow traps (abort)
UBSan in action:
$ gcc -fsanitize=undefined -o ub_demo ub_demo.c && ./ub_demo
ub_demo.c:7:22: runtime error: signed integer overflow:
2147483647 + 1 cannot be represented in type 'int'
UBSan caught it. But remember: sanitizers only catch UB that actually executes. Code paths you don't test can still harbor UB. Rust's approach — preventing UB at compile time — catches it whether you test that path or not.
💡 Fun Fact
The LLVM optimizer (used by both Clang and rustc) has a concept called "poison values." When you compute something with UB (like signed overflow), the result is "poison" — a special marker that infects everything it touches. If a poison value reaches a branch condition, both branches become valid. If it reaches a store, the stored value is undefined. This is the formal mechanism by which UB propagates through your program.
🔧 Task
1. Compile the signed overflow example from the start of this chapter at -O0 and -O2:

   gcc -O0 -o ub_O0 ub_demo.c && ./ub_O0
   gcc -O2 -o ub_O2 ub_demo.c && ./ub_O2

   Observe the different output. The compiler is not broken — it's exploiting UB.

2. Add -fwrapv to the -O2 build:

   gcc -O2 -fwrapv -o ub_wrap ub_demo.c && ./ub_wrap

   Does the output match -O0 now? Why?

3. Compile with UBSan:

   gcc -fsanitize=undefined -o ub_san ub_demo.c && ./ub_san

   Read the sanitizer's report carefully.

4. Write the equivalent in Rust:

   fn main() {
       let x: i32 = i32::MAX;
       let y = x + 1;
       println!("{}", y);
   }

   Run in debug mode (cargo run) and release mode (cargo run --release). Compare. Neither is undefined behavior — one panics, the other wraps. Both are specified outcomes.

5. Challenge: Find the null-pointer example on Godbolt. Compile with -O0 and -O2. At -O2, confirm the null check is deleted from the assembly. Then add -fno-delete-null-pointer-checks and verify it comes back.