The Preprocessor and Macros

Before the C compiler sees your code, the preprocessor runs a text-substitution pass over it. This is powerful, dangerous, and entirely unlike anything in most modern languages. Rust replaces the preprocessor with a hygienic macro system and feature flags. This chapter covers both.

The C Preprocessor: A Text Rewriting Engine

The preprocessor operates on text, not on syntax trees. It knows nothing about types, scopes, or semantics. Every directive starts with #.

  Source file (.c)
      |
      v
  [Preprocessor]  -- #include, #define, #ifdef
      |
      v
  Translation unit (expanded text)
      |
      v
  [Compiler]  -- parsing, type checking, codegen
      |
      v
  Object file (.o)

#include and Include Guards

#include literally copies the contents of another file into the current position. No module system, no namespacing -- just text insertion. Without protection, including the same header twice causes redefinition errors. An include guard makes the second copy expand to nothing:

/* sensor.h */
#ifndef SENSOR_H
#define SENSOR_H

struct sensor {
    int id;
    float value;
};

int sensor_read(struct sensor *s);

#endif /* SENSOR_H */

Many compilers also support #pragma once as a non-standard shortcut.

#define: Constants and Macros

Simple Constants

/* constants.c */
#include <stdio.h>

#define MAX_SENSORS  16
#define PI           3.14159265358979
#define VERSION      "1.0.3"

int main(void)
{
    printf("Max sensors: %d\n", MAX_SENSORS);
    printf("Pi: %.5f\n", PI);
    printf("Version: %s\n", VERSION);
    return 0;
}

The preprocessor replaces every occurrence of MAX_SENSORS with the literal text 16. No type checking. No scoping.

Caution: #define constants have no type. MAX_SENSORS is not an int -- it is the text 16. Prefer enum or static const for typed constants in modern C.

Function-Like Macros

/* macros.c */
#include <stdio.h>

#define MIN(a, b)  ((a) < (b) ? (a) : (b))
#define MAX(a, b)  ((a) > (b) ? (a) : (b))
#define SQUARE(x)  ((x) * (x))

int main(void)
{
    printf("min(3, 7) = %d\n", MIN(3, 7));
    printf("max(3, 7) = %d\n", MAX(3, 7));
    printf("square(5) = %d\n", SQUARE(5));
    printf("square(2+3) = %d\n", SQUARE(2 + 3));  /* 25, correct with parens */
    return 0;
}

Compile and run:

gcc -Wall -o macros macros.c && ./macros

Every parameter is wrapped in parentheses and the whole expression is wrapped in outer parentheses. Without this, operator precedence causes silent bugs.

Caution: Macro arguments are evaluated each time they appear. Consider SQUARE(i++) -- this expands to ((i++) * (i++)), which increments i twice and invokes undefined behavior. This is the most infamous macro pitfall in C. Never pass expressions with side effects to C macros.

Useful Kernel Patterns

The Linux kernel defines several macros that every systems programmer should know:

/* kernel_patterns.c */
#include <stdio.h>
#include <stddef.h>

/* Array size -- works only on true arrays, not pointers */
#define ARRAY_SIZE(arr)  (sizeof(arr) / sizeof((arr)[0]))

/* Build-time assertion (simplified) */
#define BUILD_BUG_ON(cond) \
    ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Container-of: get the parent struct from a member pointer */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct device {
    int id;
    char name[32];
};

int main(void)
{
    int data[] = {10, 20, 30, 40, 50};
    printf("Array size: %zu\n", ARRAY_SIZE(data));

    BUILD_BUG_ON(sizeof(int) != 4);  /* passes on most platforms */
    /* BUILD_BUG_ON(sizeof(int) == 4);  -- would fail to compile */

    struct device dev = { .id = 42, .name = "sensor" };
    char *name_ptr = dev.name;
    struct device *dev_ptr = container_of(name_ptr, struct device, name);
    printf("Device ID via container_of: %d\n", dev_ptr->id);

    return 0;
}

Compile and run:

gcc -Wall -o kernel_patterns kernel_patterns.c && ./kernel_patterns

Output:

Array size: 5
Device ID via container_of: 42

Driver Prep: container_of is used throughout the Linux kernel source. When you have a pointer to a member of a struct (like a list_head), this macro recovers the pointer to the enclosing struct.

Stringification and Token Pasting

The preprocessor has two special operators: # turns a macro argument into a string, and ## pastes two tokens together.

/* stringify.c */
#include <stdio.h>

#define STRINGIFY(x)  #x
#define TO_STRING(x)  STRINGIFY(x)

#define CONCAT(a, b)  a##b

#define DEBUG_VAR(var)  printf(#var " = %d\n", var)

#define VERSION_MAJOR 2
#define VERSION_MINOR 7

int main(void)
{
    int count = 42;
    DEBUG_VAR(count);  /* expands to: printf("count" " = %d\n", count); */

    int xy = 100;
    printf("CONCAT(x, y) = %d\n", CONCAT(x, y));  /* becomes: xy */

    /* Two-level stringification for expanding macros */
    printf("Version: %s.%s\n",
           TO_STRING(VERSION_MAJOR),
           TO_STRING(VERSION_MINOR));

    return 0;
}

Compile and run:

gcc -Wall -o stringify stringify.c && ./stringify

Output:

count = 42
CONCAT(x, y) = 100
Version: 2.7

Note the two-level TO_STRING / STRINGIFY trick. If you write STRINGIFY(VERSION_MAJOR), you get the string "VERSION_MAJOR". The extra indirection forces macro expansion first.

Variadic Macros

/* variadic_macro.c */
#include <stdio.h>

#define LOG(fmt, ...) \
    fprintf(stderr, "[LOG] " fmt "\n", ##__VA_ARGS__)

int main(void)
{
    LOG("Starting up");
    LOG("Sensor %d: value = %.2f", 3, 27.5);
    LOG("Shutting down with code %d", 0);
    return 0;
}

Compile and run:

gcc -Wall -o variadic_macro variadic_macro.c && ./variadic_macro

The ##__VA_ARGS__ form is a GNU extension (GCC and Clang both accept it) that removes the trailing comma when no variadic arguments are passed. C23 adds __VA_OPT__ as a standard way to get the same effect.

X-Macros: Code Generation

X-macros generate repetitive code from a single list definition.

/* x_macro.c */
#include <stdio.h>

#define ERROR_LIST \
    X(ERR_NONE,     "no error")       \
    X(ERR_IO,       "I/O error")      \
    X(ERR_PARSE,    "parse error")    \
    X(ERR_OVERFLOW, "overflow")       \
    X(ERR_TIMEOUT,  "timeout")

/* Generate the enum */
#define X(code, str) code,
typedef enum {
    ERROR_LIST
} error_code_t;
#undef X

/* Generate the string table */
#define X(code, str) [code] = str,
static const char *error_strings[] = {
    ERROR_LIST
};
#undef X

const char *error_to_string(error_code_t e)
{
    if (e < 0 || (size_t)e >= sizeof(error_strings) / sizeof(error_strings[0]))
        return "unknown error";
    return error_strings[e];
}

int main(void)
{
    for (int i = ERR_NONE; i <= ERR_TIMEOUT; i++) {
        printf("%d: %s\n", i, error_to_string(i));
    }
    return 0;
}

Compile and run:

gcc -Wall -o x_macro x_macro.c && ./x_macro

Output:

0: no error
1: I/O error
2: parse error
3: overflow
4: timeout

The error codes and their string representations are defined in one place. You can never add a code and forget its string, or vice versa.

Conditional Compilation

/* conditional.c */
#include <stdio.h>

#ifdef __linux__
    #define PLATFORM "Linux"
#elif defined(_WIN32)
    #define PLATFORM "Windows"
#elif defined(__APPLE__)
    #define PLATFORM "macOS"
#else
    #define PLATFORM "Unknown"
#endif

#ifndef NDEBUG
    #define DBG(fmt, ...) fprintf(stderr, "DBG: " fmt "\n", ##__VA_ARGS__)
#else
    #define DBG(fmt, ...) ((void)0)
#endif

int main(void)
{
    printf("Platform: %s\n", PLATFORM);
    DBG("This only prints in debug mode");
    DBG("x = %d", 42);
    return 0;
}

Compile and run:

gcc -Wall -o conditional conditional.c && ./conditional
gcc -Wall -DNDEBUG -o conditional_rel conditional.c && ./conditional_rel

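In the release build (-DNDEBUG), the DBG macro expands to ((void)0), a no-op expression. Its arguments disappear from the program entirely, so they are never evaluated.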

Try It: Add a #define VERBOSE flag. When defined, make the LOG macro also print the file name and line number using __FILE__ and __LINE__.

Rust: macro_rules! -- Pattern-Matching Macros

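Rust macros operate on parsed tokens and syntax fragments (such as an expression matched by $a:expr), not on raw text. They are hygienic: a local variable introduced inside a macro cannot accidentally capture, or clash with, a variable of the same name at the call site.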

// rust_macros.rs

macro_rules! min {
    ($a:expr, $b:expr) => {{
        let a = $a;
        let b = $b;
        if a < b { a } else { b }
    }};
}

macro_rules! debug_var {
    ($var:expr) => {
        eprintln!("{} = {:?}", stringify!($var), $var);
    };
}

macro_rules! make_vec {
    ( $( $elem:expr ),* $(,)? ) => {{
        let mut v = Vec::new();
        $( v.push($elem); )*
        v
    }};
}

fn main() {
    let x = 10;
    let y = 3;
    println!("min({}, {}) = {}", x, y, min!(x, y));

    // Safe with side effects -- each argument evaluated once
    let mut counter = 0;
    let result = min!({ counter += 1; counter }, 5);
    println!("result = {}, counter = {}", result, counter);
    // counter is exactly 1, not 2

    debug_var!(x + y);
    debug_var!("hello");

    let v = make_vec![1, 2, 3, 4, 5];
    println!("vec: {:?}", v);
}

Compile and run:

rustc rust_macros.rs && ./rust_macros

Output:

min(10, 3) = 3
result = 1, counter = 1
x + y = 13
"hello" = "hello"
vec: [1, 2, 3, 4, 5]

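Rust Note: min! evaluates each argument once because its body binds $a and $b to local variables with let. Rust does not do this automatically: a macro body that mentions $x twice evaluates that expression twice. Hygiene is a separate property. It guarantees that the a and b declared inside min! cannot clash with variables named a or b at the call site. Rust also has no ++ operator, and evaluating an expression twice is well-defined rather than undefined behavior, so the SQUARE(i++) bug has no direct Rust equivalent.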

Rust: cfg Attributes for Conditional Compilation

Rust replaces #ifdef with the cfg attribute system:

// cfg_demo.rs

#[cfg(target_os = "linux")]
fn platform() -> &'static str {
    "Linux"
}

#[cfg(target_os = "windows")]
fn platform() -> &'static str {
    "Windows"
}

#[cfg(target_os = "macos")]
fn platform() -> &'static str {
    "macOS"
}

#[cfg(not(any(target_os = "linux", target_os = "windows", target_os = "macos")))]
fn platform() -> &'static str {
    "Unknown"
}

fn main() {
    println!("Platform: {}", platform());

    if cfg!(debug_assertions) {
        println!("Debug mode is ON");
    } else {
        println!("Release mode");
    }
}

Compile and run:

rustc cfg_demo.rs && ./cfg_demo

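The cfg! macro expands to a literal true or false at compile time. Both branches must still compile on every target; the optimizer then removes the dead one. An item excluded by #[cfg(...)], by contrast, is not compiled at all.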

Rust: Feature Flags in Cargo

Cargo supports feature flags for conditional compilation:

# Cargo.toml
[package]
name = "myapp"
version = "0.1.0"
edition = "2021"

[features]
default = ["json"]
json = ["dep:serde_json"]
verbose_logging = []

[dependencies]
serde_json = { version = "1", optional = true }

// src/main.rs (Cargo project)

#[cfg(feature = "json")]
fn parse_config(data: &str) {
    let v: serde_json::Value = serde_json::from_str(data).unwrap();
    println!("Parsed JSON: {}", v);
}

#[cfg(not(feature = "json"))]
fn parse_config(_data: &str) {
    println!("JSON support not compiled in");
}

fn main() {
    parse_config(r#"{"key": "value"}"#);
}

Build with different features:

cargo run                              # default features (json)
cargo run --no-default-features        # no json
cargo run --features verbose_logging   # default + verbose

Procedural Macros: A Brief Overview

Rust also has procedural macros: Rust functions that transform token streams at compile time. The three kinds are derive macros (#[derive(Debug, Serialize)]), attribute macros (#[route("GET", "/users")]), and function-like macros (sql!(SELECT * FROM users)). They are defined in a separate crate. Derive macros are by far the most common.

Side-by-Side: C Preprocessor vs Rust Macros

+----------------------------+----------------------------------+
| C Preprocessor             | Rust Macros                      |
+----------------------------+----------------------------------+
| Text substitution          | Syntax tree transformation       |
| No hygiene                 | Hygienic -- no name capture      |
| Arguments re-evaluated     | Evaluated once only if let-bound |
| No type safety             | Type-checked after expansion     |
| #ifdef for platforms       | #[cfg()] attributes              |
| #define constants          | const / static                   |
| Include guards needed      | Module system handles it         |
| Errors point to expanded   | Errors point to macro call site  |
|   code (unreadable)        |   (usually readable)             |
+----------------------------+----------------------------------+

Debugging Macro Expansions

In C, use gcc -E to see the preprocessed output:

gcc -E macros.c | tail -20

In Rust, use cargo expand (requires the cargo-expand tool):

cargo install cargo-expand
cargo expand

Try It: Write a C macro CLAMP(x, lo, hi) that clamps a value to a range. Then write the Rust equivalent using macro_rules!. Verify that the Rust version is safe with side effects by passing { counter += 1; counter } as an argument.

Knowledge Check

  1. What happens if you write #define SQUARE(x) x * x without parentheses and then call SQUARE(2 + 3)?

  2. Why does SQUARE(i++) cause undefined behavior in C, and what must the body of a Rust macro do to evaluate its argument only once?

  3. What is the difference between cfg!(target_os = "linux") and #[cfg(target_os = "linux")] in Rust?

Common Pitfalls

  • Missing parentheses in C macros. Always wrap every parameter and the entire expression: #define M(x) ((x) + 1).
  • Multi-statement macros without do-while. Use do { ... } while(0) for macros that expand to multiple statements, or they break if/else chains.
  • Macro arguments with side effects. Never pass i++ or function calls to C macros unless you know the argument is used exactly once.
  • Include guard name collisions. Using a common name like UTILS_H in two different libraries causes silent header suppression.
  • Over-using macros when functions work. Modern C compilers inline small functions automatically. Use static inline instead of function-like macros when possible.
  • Overcomplicating Rust macros. If a function does the job, use a function. Macros are for cases where you need syntax flexibility (variadic arguments, code generation, compile-time string manipulation).