From Source to Binary

When you type gcc main.c -o main, four distinct stages run in sequence. Understanding each stage turns opaque compiler errors into something you can reason about -- and makes debugging linker failures, ABI mismatches, and cross-compilation issues far less painful.

The Four Stages

 Source (.c)
    |
    v
 [Preprocessor]  -->  Expanded source (.i)
    |
    v
 [Compiler]       -->  Assembly (.s)
    |
    v
 [Assembler]      -->  Object file (.o)
    |
    v
 [Linker]         -->  Executable (ELF)

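Each stage is a separate step, and GCC acts as a driver that runs them in order (modern GCC performs preprocessing and compilation in one program, cc1, then invokes the assembler and the linker). You can stop after any stage and inspect the output.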

Stage 1: Preprocessing

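The preprocessor handles #include, #define, #ifdef, and macro expansion. Its output is plain C with every directive resolved; the only lines that still start with # are linemarkers, which record the original file and line number for later diagnostics.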

/* version.h */
#ifndef VERSION_H
#define VERSION_H
#define APP_VERSION "1.0.3"
#define MAX_RETRIES 5
#endif

/* stage1.c */
#include <stdio.h>
#include "version.h"

#ifdef DEBUG
  #define LOG(msg) fprintf(stderr, "DEBUG: %s\n", msg)
#else
  #define LOG(msg) ((void)0)
#endif

int main(void) {
    LOG("starting up");
    printf("App version: %s\n", APP_VERSION);
    printf("Max retries: %d\n", MAX_RETRIES);
    return 0;
}

Stop after preprocessing:

gcc -E stage1.c -o stage1.i

Open stage1.i -- it will be hundreds of lines long, or more, because <stdio.h> and everything it includes get fully expanded (the exact count depends on your C library). Scroll to the bottom and you will see your code with all macros replaced:

int main(void) {
    ((void)0);
    printf("App version: %s\n", "1.0.3");
    printf("Max retries: %d\n", 5);
    return 0;
}

The string "1.0.3" is inlined. LOG became ((void)0) because DEBUG was not defined. Now try:

gcc -E -DDEBUG stage1.c -o stage1_debug.i

The LOG call now expands to an actual fprintf.
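
Because macro expansion is plain text substitution, operator precedence in the expanded code can differ from what the macro call suggests. The sketch below is a hypothetical file (macro_subst.c, with made-up macros SQUARE_BAD and SQUARE_OK) that is not part of the chapter's examples; it only illustrates what gcc -E would reveal.

```c
/* macro_subst.c -- hypothetical example, not one of the chapter's files */

/* Unparenthesized: the argument text is pasted in exactly as written. */
#define SQUARE_BAD(x) x * x

/* Parenthesized: safe for any expression argument. */
#define SQUARE_OK(x) ((x) * (x))

int bad_square_of_sum(void) {
    /* Expands to: 1 + 2 * 1 + 2, which evaluates to 5 */
    return SQUARE_BAD(1 + 2);
}

int ok_square_of_sum(void) {
    /* Expands to: ((1 + 2) * (1 + 2)), which evaluates to 9 */
    return SQUARE_OK(1 + 2);
}
```

Running gcc -E macro_subst.c shows both expansions side by side, which is usually the quickest way to diagnose a macro that misbehaves.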

Try It: Add a #define PLATFORM "linux" to version.h and use it in main. Run gcc -E and confirm the string appears in the .i file.

Stage 2: Compilation (to Assembly)

The compiler translates the preprocessed C into assembly for the target architecture. On x86-64:

gcc -S stage1.c -o stage1.s

A smaller file makes the assembly easier to read:

/* arith.c */
int add(int a, int b) {
    return a + b;
}

int square(int x) {
    return x * x;
}

gcc -S -O0 arith.c -o arith.s

The output (simplified, x86-64):

add:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    %edi, -4(%rbp)
    movl    %esi, -8(%rbp)
    movl    -4(%rbp), %edx
    movl    -8(%rbp), %eax
    addl    %edx, %eax
    popq    %rbp
    ret

square:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    %edi, -4(%rbp)
    movl    -4(%rbp), %eax
    imull   %eax, %eax
    popq    %rbp
    ret

Now try with optimization:

gcc -S -O2 arith.c -o arith_opt.s

The optimized output is dramatically shorter -- the compiler may skip the frame pointer entirely and use registers directly.

Try It: Compile arith.c with -O0, -O1, -O2, and -O3. Compare the assembly output with diff. Notice how the compiler eliminates unnecessary memory operations at higher levels.

Stage 3: Assembly (to Object Code)

The assembler translates assembly into machine code, producing an ELF object file:

gcc -c arith.c -o arith.o

Inspect it:

file arith.o
# arith.o: ELF 64-bit LSB relocatable, x86-64, ...

objdump -d arith.o

The object file contains machine instructions, but addresses are not yet resolved. Function calls to external symbols are placeholders.

/* caller.c */
#include <stdio.h>

extern int add(int a, int b);
extern int square(int x);

int main(void) {
    printf("add(3,4) = %d\n", add(3, 4));
    printf("square(5) = %d\n", square(5));
    return 0;
}

gcc -c caller.c -o caller.o
objdump -d caller.o

In the disassembly, calls to add, square, and printf show placeholder addresses (often all zeros). Each placeholder has a matching relocation entry, and the linker uses those entries to fill in the real addresses later.

Stage 4: Linking

The linker combines object files, resolves symbols, and produces the final executable:

gcc caller.o arith.o -o program
./program

Output:

add(3,4) = 7
square(5) = 25

Symbols and the Symbol Table

Every object file carries a symbol table. View it with nm:

nm arith.o
0000000000000000 T add
0000000000000014 T square

T means the symbol is in the text (code) section and is globally visible.

nm caller.o
                 U add
0000000000000000 T main
                 U printf
                 U square

U means undefined -- these symbols must be provided by another object file or library at link time.

Relocations

View relocations with readelf:

readelf -r caller.o

Each relocation entry says: "At offset X in section Y, insert the address of symbol Z." The linker processes every relocation in every object file.
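
To make "insert the address" concrete: for a call instruction on x86-64 the linker usually stores a PC-relative value rather than an absolute address, computed as S + A - P (symbol address, plus addend, minus the address of the field being patched). The sketch below models that arithmetic. SimpleReloc and apply_pc32 are hypothetical names chosen for illustration; a real linker works on Elf64_Rela entries and handles many more relocation types.

```c
/* reloc_sketch.c -- simplified model of a PC-relative relocation.
 * SimpleReloc and apply_pc32 are hypothetical; real linkers read
 * Elf64_Rela entries from the object file. */
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t offset;  /* where in the section the 4-byte field lives */
    int64_t  addend;  /* constant stored in the relocation entry (A) */
} SimpleReloc;

/* Patch a 32-bit PC-relative field with S + A - P, where
 * S = final address of the symbol and
 * P = final address of the field being patched. */
int32_t apply_pc32(uint8_t *section, uint64_t section_addr,
                   const SimpleReloc *r, uint64_t symbol_addr) {
    uint64_t place = section_addr + r->offset;                    /* P */
    int32_t value = (int32_t)((int64_t)symbol_addr + r->addend
                              - (int64_t)place);
    memcpy(section + r->offset, &value, sizeof value);  /* overwrite the zeros */
    return value;
}
```

For a call, the addend is typically -4 because the CPU measures the displacement from the end of the 4-byte field, not from its start. With a call opcode at address 0x401150 and a target at 0x401136, the patched displacement is -31.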

+------------------+     +------------------+
|   caller.o       |     |   arith.o        |
|                  |     |                  |
|  main            |     |  add       [T]   |
|  calls add   [U] |---->|  square    [T]   |
|  calls square[U] |---->|                  |
|  calls printf[U] |--+  +------------------+
+------------------+  |
                      |  +------------------+
                      +->|   libc.so        |
                         |  printf     [T]  |
                         +------------------+

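Caution: If you see "undefined reference to ..." at link time, it means the linker cannot find a definition for a symbol. Check that you are passing all required object files and libraries. Order matters with static libraries: the linker scans its inputs left to right and pulls from an archive only the members that satisfy symbols still undefined at that point, so list libraries after the object files that use them.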

Examining the Final Executable

file program
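# program: ELF 64-bit LSB executable, x86-64, ...
# (toolchains that default to position-independent executables report "pie executable" or "shared object" here)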

readelf -h program     # ELF header
readelf -l program     # program headers (segments)
readelf -S program     # section headers
objdump -d program     # full disassembly
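
The word that file prints after "LSB" comes from the e_type field of the ELF header, which sits right after the 16-byte e_ident array. The sketch below reads that field from a buffer. elf_kind is a hypothetical helper written for this illustration, and it assumes a little-endian ELF file; the numeric values are the standard ET_* constants.

```c
/* elf_kind.c -- classify an ELF image by its e_type field.
 * elf_kind is a hypothetical helper; it assumes a little-endian file. */
#include <stddef.h>
#include <stdint.h>

const char *elf_kind(const uint8_t *buf, size_t len) {
    /* e_ident occupies bytes 0..15; e_type is the 16-bit field after it. */
    if (len < 18) return "too short";
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return "not ELF";
    uint16_t e_type = (uint16_t)(buf[16] | (buf[17] << 8));
    switch (e_type) {
    case 1:  return "relocatable";    /* ET_REL: a .o file */
    case 2:  return "executable";     /* ET_EXEC: fixed-address executable */
    case 3:  return "shared object";  /* ET_DYN: a .so, or a PIE executable */
    case 4:  return "core";           /* ET_CORE: a core dump */
    default: return "unknown";
    }
}
```

Feeding it the first bytes of arith.o should yield "relocatable", matching the file output shown in Stage 3; readelf -h reports the same field as "Type".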

Key sections in an ELF binary:

+-------------------+
| .text             |  Executable code
+-------------------+
| .rodata           |  Read-only data (string literals)
+-------------------+
| .data             |  Initialized global/static variables
+-------------------+
| .bss              |  Zero-initialized or uninitialized global/static variables
+-------------------+
| .symtab           |  Symbol table
+-------------------+
| .strtab           |  String table for symbols
+-------------------+
| .rela.text        |  Relocations for .text (in .o files; named .rel.text on some 32-bit targets)
+-------------------+
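
As a rough guide to how source-level definitions map onto these sections, here is a hypothetical file (sections.c, with made-up names) annotated with where each definition usually lands under GCC on Linux. The placement is typical behavior rather than a guarantee, and the compiler is free to choose differently.

```c
/* sections.c -- hypothetical file showing where typical definitions land.
 * Section placement is the usual GCC/Linux behavior, not a guarantee. */

const char banner[] = "pipeline";  /* .rodata: read-only data */
int retries = 5;                   /* .data:   initialized and writable */
int hits;                          /* .bss with GCC 10+; older versions may
                                      emit a common symbol instead */
static int calls;                  /* .bss:    zero-initialized, local symbol */

int record_hit(void) {             /* .text:   machine code */
    calls++;
    hits += retries;
    return hits;
}
```

After gcc -c sections.c, nm sections.o should show matching type letters: R for banner, D for retries, B for hits, b for calls, and T for record_hit. Objects in .bss take no space in the file because only their size is recorded.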

Driver Prep: Kernel modules are ELF relocatable objects (.ko files). The kernel's module loader performs its own linking at insmod time, resolving symbols against the running kernel's symbol table. Understanding relocations now pays off directly when debugging module load failures.

Rust's Compilation Model

Rust does not follow the same four-stage pipeline. Instead:

 Source (.rs)
    |
    v
 [rustc frontend]   -->  HIR --> MIR
    |
    v
 [LLVM backend]     -->  Object files (.o) or LLVM IR (.ll)
    |
    v
 [Linker]           -->  Executable (ELF)

The Rust compiler (rustc) handles preprocessing-like tasks (macro expansion, conditional compilation with cfg) internally. There is no separate preprocessor.

A Rust Example

// arith.rs
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn square(x: i32) -> i32 {
    x * x
}

fn main() {
    println!("add(3,4) = {}", add(3, 4));
    println!("square(5) = {}", square(5));
}

rustc arith.rs -o arith_rust
./arith_rust

Viewing Intermediate Representations

Emit LLVM IR:

rustc --emit=llvm-ir arith.rs

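This produces arith.ll, LLVM's textual intermediate representation. The IR instruction set is largely architecture-neutral, although the emitted file still records a target triple and data layout for the platform you compiled for.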

Emit assembly:

rustc --emit=asm arith.rs

Emit object file only (no linking):

rustc --emit=obj arith.rs

Inspect the resulting object file the same way:

nm arith.o
objdump -d arith.o

Rust Note: Rust mangles symbol names by default. You will see names like _ZN5arith3add17h...E rather than plain add. Use #[no_mangle] and extern "C" when you need C-compatible symbol names. We cover this in Chapter 26.
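
The legacy mangled form is easier to read once you know its shape: the prefix _ZN, then a series of length-prefixed path components, then a closing E. The last component is a hash that starts with h. The C sketch below splits such a name into its components. demangle_legacy is a hypothetical helper, the hash in the usage example is made up, and the code ignores escape sequences such as $LT$ that real symbols can contain; in practice a tool such as rustfilt does this job.

```c
/* demangle_sketch.c -- splits a legacy-style Rust symbol into path parts.
 * demangle_legacy is a hypothetical helper. It handles only the
 * "_ZN<len><name>...E" shape and ignores escapes such as "$LT$". */
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Writes "a::b::c" into out; returns 0 on success, -1 on malformed input
 * or when the output buffer is too small. */
int demangle_legacy(const char *sym, char *out, size_t cap) {
    size_t used = 0;
    if (strncmp(sym, "_ZN", 3) != 0) return -1;
    sym += 3;
    while (*sym != 'E') {
        if (!isdigit((unsigned char)*sym)) return -1;
        size_t len = 0;
        while (isdigit((unsigned char)*sym))        /* decimal length prefix */
            len = len * 10 + (size_t)(*sym++ - '0');
        if (len == 0 || strlen(sym) < len) return -1;
        size_t sep = used ? 2 : 0;                  /* "::" between parts */
        if (used + sep + len + 1 > cap) return -1;  /* +1 for the NUL */
        if (sep) { memcpy(out + used, "::", 2); used += 2; }
        memcpy(out + used, sym, len);
        used += len;
        sym += len;
    }
    if (used == 0) return -1;                       /* no components at all */
    out[used] = '\0';
    return 0;
}
```

Given the made-up symbol _ZN5arith3add17h0123456789abcdefE, the helper produces arith::add::h0123456789abcdef: crate name, function name, and the trailing hash.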

Crates and Incremental Compilation

Rust's unit of compilation is the crate, not the individual .rs file. A crate can contain many modules spread across multiple files, but rustc compiles the entire crate as one unit.

Cargo enables incremental compilation by default for development (debug) builds: when you change one function, only the affected parts of the crate are recompiled. Incremental data is cached in target/debug/incremental/.

cargo build          # first build -- compiles everything
# edit one function
cargo build          # incremental -- only recompiles the changed parts

Compare with C, where each .c file is compiled independently into a .o file, and the build system (Make) decides which files to recompile based on timestamps.

C model:                     Rust/Cargo model:

file1.c --> file1.o          +------------------+
file2.c --> file2.o    vs    |  entire crate    |---> crate .rlib
file3.c --> file3.o          |  (all .rs files) |
   \       |      /         +------------------+
    \      |     /
     v     v    v
     [  linker  ]
     [executable]
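
The timestamp rule on the C side is simple enough to sketch in a few lines. needs_rebuild below is a hypothetical helper, assuming a POSIX system; it models the decision for a single source and object pair, whereas a real build tool also tracks headers and other prerequisites.

```c
/* stale.c -- the timestamp rule a Make-style build applies to one target.
 * needs_rebuild is a hypothetical helper; it assumes a POSIX system. */
#include <sys/stat.h>

/* Returns 1 if the object must be rebuilt, 0 if it is up to date,
 * and -1 if the source file itself cannot be found. */
int needs_rebuild(const char *source, const char *object) {
    struct stat src, obj;
    if (stat(source, &src) != 0) return -1;  /* nothing to build from */
    if (stat(object, &obj) != 0) return 1;   /* object missing: build it */
    return src.st_mtime > obj.st_mtime;      /* source newer than object */
}
```

Calling needs_rebuild("file1.c", "file1.o") after editing file1.c returns 1, while untouched files return 0. Cargo's incremental mode works at a finer grain than this, tracking changes inside the crate rather than comparing whole files.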

Comparing Object Files from C and Rust

Let us compile equivalent functions in both languages and compare:

/* cfunc.c */
#include <stdint.h>

int32_t multiply(int32_t a, int32_t b) {
    return a * b;
}

// rfunc.rs
#[no_mangle]
pub extern "C" fn multiply(a: i32, b: i32) -> i32 {
    a * b
}

gcc -c -O2 cfunc.c -o cfunc.o
rustc --crate-type=staticlib --emit=obj -C opt-level=2 rfunc.rs -o rfunc.o

objdump -d cfunc.o
objdump -d rfunc.o

At -O2, both produce nearly identical machine code for this simple function:

multiply:
    movl    %edi, %eax
    imull   %esi, %eax
    ret

The LLVM backend (used by Rust) and GCC's backend produce equivalent output for straightforward arithmetic. Differences appear with more complex code -- different inlining decisions, vectorization strategies, and so on.

Try It: Write a function that sums an array of integers in both C and Rust. Compile with -O2 / -C opt-level=2 and compare the assembly. Does one auto-vectorize and the other not?
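
One possible starting point for the C half of this exercise is sketched below. The file name sum.c and the function name sum_array are assumptions made for illustration; any equivalent loop works.

```c
/* sum.c -- a possible C half of the exercise; sum_array is a hypothetical name. */
#include <stddef.h>
#include <stdint.h>

int32_t sum_array(const int32_t *values, size_t count) {
    int32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];  /* simple reduction loop, a common vectorization candidate */
    }
    return total;
}
```

Compile it with gcc -S -O2 sum.c and look for SIMD registers (xmm or ymm on x86-64) in the loop body. Whether the loop is vectorized depends on the compiler version and flags, so try -O3 as well before drawing conclusions.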

Practical: Walking Through All Four Stages

Here is a complete C program that we will take through every stage manually:

/* pipeline.c */
#include <stdio.h>

#define GREETING "Hello from the pipeline"

static int helper(int n) {
    return n * 2 + 1;
}

int main(void) {
    int result = helper(21);
    printf("%s: result = %d\n", GREETING, result);
    return 0;
}

Run each stage explicitly:

# Stage 1: Preprocess
gcc -E pipeline.c -o pipeline.i
wc -l pipeline.i        # hundreds of lines or more

# Stage 2: Compile to assembly
gcc -S pipeline.i -o pipeline.s
wc -l pipeline.s         # tens of lines

# Stage 3: Assemble to object
gcc -c pipeline.s -o pipeline.o
nm pipeline.o            # main is T, printf is U, helper may be t (static)

# Stage 4: Link
gcc pipeline.o -o pipeline
./pipeline
# Hello from the pipeline: result = 43

Notice that helper appears as t (lowercase) in nm output, unless the optimizer inlined it away. The lowercase letter means it is a local symbol (because of static). Local symbols are not visible to the linker from other object files.

The static Keyword and Symbol Visibility

/* visibility.c */
static int internal_func(void) {  /* local to this file */
    return 42;
}

int public_func(void) {  /* visible to linker */
    return internal_func();
}

gcc -c visibility.c -o visibility.o
nm visibility.o
0000000000000000 t internal_func
000000000000000b T public_func

Lowercase t = local. Uppercase T = global.

In Rust, the equivalent is pub vs non-pub:

// visibility.rs
fn internal_func() -> i32 {
    42
}

pub fn public_func() -> i32 {
    internal_func()
}

Non-pub functions are not exported from the crate. When generating a C-ABI library, only #[no_mangle] pub extern "C" functions appear as global symbols.

Knowledge Check

  1. What does the preprocessor do with #include <stdio.h>? What does the resulting .i file contain?

  2. An object file contains a call to printf but its address is all zeros. What mechanism resolves this to the real address?

  3. In nm output, what is the difference between T and U?

Common Pitfalls

  • Forgetting to link all object files. If main.o calls add defined in arith.o, you must pass both to the linker.

  • Confusing compilation errors with linker errors. "undefined reference" is a linker error, not a compiler error. The code compiled fine; the symbol is just missing at link time.

  • Assuming identical assembly from C and Rust. Different compiler backends (GCC versus the LLVM backend that rustc uses) make different optimization choices. Close does not mean identical.

  • Ignoring static visibility. A static function in one .c file cannot be called from another. This is intentional encapsulation, not a bug.

  • Stripping binaries during development. Tools such as nm, objdump, and gdb rely on the symbol table, so keep symbols while developing and strip only release builds.