The Preprocessor and Macros
Before the C compiler sees your code, the preprocessor runs a text-substitution pass over it. This is powerful, dangerous, and entirely unlike anything in most modern languages. Rust replaces the preprocessor with a hygienic macro system and feature flags. This chapter covers both.
The C Preprocessor: A Text Rewriting Engine
The preprocessor operates on text, not on syntax trees. It knows nothing about
types, scopes, or semantics. Every directive starts with #.
Source file (.c)
|
v
[Preprocessor] -- #include, #define, #ifdef
|
v
Translation unit (expanded text)
|
v
[Compiler] -- parsing, type checking, codegen
|
v
Object file (.o)
#include and Include Guards
#include literally copies the contents of another file into the current
position. No module system, no namespacing -- just text insertion. Without
protection, including the same header twice causes redefinition errors:
/* sensor.h */
#ifndef SENSOR_H
#define SENSOR_H
struct sensor {
int id;
float value;
};
int sensor_read(struct sensor *s);
#endif /* SENSOR_H */
Many compilers also support #pragma once as a non-standard shortcut.
#define: Constants and Macros
Simple Constants
/* constants.c */
#include <stdio.h>
#define MAX_SENSORS 16
#define PI 3.14159265358979
#define VERSION "1.0.3"
int main(void)
{
printf("Max sensors: %d\n", MAX_SENSORS);
printf("Pi: %.5f\n", PI);
printf("Version: %s\n", VERSION);
return 0;
}
The preprocessor replaces every occurrence of MAX_SENSORS with the literal
text 16. No type checking. No scoping.
Caution:
#define constants have no type. MAX_SENSORS is not an int -- it is the text 16. Prefer enum or static const for typed constants in modern C.
Function-Like Macros
/* macros.c */
#include <stdio.h>
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define SQUARE(x) ((x) * (x))
int main(void)
{
printf("min(3, 7) = %d\n", MIN(3, 7));
printf("max(3, 7) = %d\n", MAX(3, 7));
printf("square(5) = %d\n", SQUARE(5));
printf("square(2+3) = %d\n", SQUARE(2 + 3)); /* 25, correct with parens */
return 0;
}
Compile and run:
gcc -Wall -o macros macros.c && ./macros
Every parameter is wrapped in parentheses and the whole expression is wrapped in outer parentheses. Without this, operator precedence causes silent bugs.
Caution: Macro arguments are evaluated each time they appear. Consider
SQUARE(i++) -- this expands to ((i++) * (i++)), which modifies i twice with no sequence point in between and therefore invokes undefined behavior. This is the most infamous macro pitfall in C. Never pass expressions with side effects to C macros.
Useful Kernel Patterns
The Linux kernel defines several macros that every systems programmer should know:
/* kernel_patterns.c */
#include <stdio.h>
#include <stddef.h>
/* Array size -- works only on true arrays, not pointers */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
/* Build-time assertion (simplified) */
#define BUILD_BUG_ON(cond) \
((void)sizeof(char[1 - 2 * !!(cond)]))
/* Container-of: get the parent struct from a member pointer */
#define container_of(ptr, type, member) \
((type *)((char *)(ptr) - offsetof(type, member)))
struct device {
int id;
char name[32];
};
int main(void)
{
int data[] = {10, 20, 30, 40, 50};
printf("Array size: %zu\n", ARRAY_SIZE(data));
BUILD_BUG_ON(sizeof(int) != 4); /* passes on most platforms */
/* BUILD_BUG_ON(sizeof(int) == 4); -- would fail to compile */
struct device dev = { .id = 42, .name = "sensor" };
char *name_ptr = dev.name;
struct device *dev_ptr = container_of(name_ptr, struct device, name);
printf("Device ID via container_of: %d\n", dev_ptr->id);
return 0;
}
Compile and run:
gcc -Wall -o kernel_patterns kernel_patterns.c && ./kernel_patterns
Output:
Array size: 5
Device ID via container_of: 42
Driver Prep:
container_of is used throughout the Linux kernel source. When you have a pointer to a member of a struct (like a list_head), this macro recovers the pointer to the enclosing struct.
Stringification and Token Pasting
The preprocessor has two special operators: # turns a macro argument into a
string, and ## pastes two tokens together.
/* stringify.c */
#include <stdio.h>
#define STRINGIFY(x) #x
#define TO_STRING(x) STRINGIFY(x)
#define CONCAT(a, b) a##b
#define DEBUG_VAR(var) printf(#var " = %d\n", var)
#define VERSION_MAJOR 2
#define VERSION_MINOR 7
int main(void)
{
int count = 42;
DEBUG_VAR(count); /* expands to: printf("count" " = %d\n", count); */
int xy = 100;
printf("CONCAT(x, y) = %d\n", CONCAT(x, y)); /* becomes: xy */
/* Two-level stringification for expanding macros */
printf("Version: %s.%s\n",
TO_STRING(VERSION_MAJOR),
TO_STRING(VERSION_MINOR));
return 0;
}
Compile and run:
gcc -Wall -o stringify stringify.c && ./stringify
Output:
count = 42
CONCAT(x, y) = 100
Version: 2.7
Note the two-level TO_STRING / STRINGIFY trick. If you write
STRINGIFY(VERSION_MAJOR), you get the string "VERSION_MAJOR". The extra
indirection forces macro expansion first.
Variadic Macros
/* variadic_macro.c */
#include <stdio.h>
#define LOG(fmt, ...) \
fprintf(stderr, "[LOG] " fmt "\n", ##__VA_ARGS__)
int main(void)
{
LOG("Starting up");
LOG("Sensor %d: value = %.2f", 3, 27.5);
LOG("Shutting down with code %d", 0);
return 0;
}
Compile and run:
gcc -Wall -o variadic_macro variadic_macro.c && ./variadic_macro
The ##__VA_ARGS__ form is a GNU extension (supported by GCC and Clang) that
removes the trailing comma when no variadic arguments are passed. C23
standardizes __VA_OPT__ for the same purpose.
X-Macros: Code Generation
X-macros generate repetitive code from a single list definition.
/* x_macro.c */
#include <stdio.h>
#define ERROR_LIST \
X(ERR_NONE, "no error") \
X(ERR_IO, "I/O error") \
X(ERR_PARSE, "parse error") \
X(ERR_OVERFLOW, "overflow") \
X(ERR_TIMEOUT, "timeout")
/* Generate the enum */
#define X(code, str) code,
typedef enum {
ERROR_LIST
} error_code_t;
#undef X
/* Generate the string table */
#define X(code, str) [code] = str,
static const char *error_strings[] = {
ERROR_LIST
};
#undef X
const char *error_to_string(error_code_t e)
{
if (e < 0 || (size_t)e >= sizeof(error_strings) / sizeof(error_strings[0]))
return "unknown error";
return error_strings[e];
}
int main(void)
{
for (int i = ERR_NONE; i <= ERR_TIMEOUT; i++) {
printf("%d: %s\n", i, error_to_string(i));
}
return 0;
}
Compile and run:
gcc -Wall -o x_macro x_macro.c && ./x_macro
Output:
0: no error
1: I/O error
2: parse error
3: overflow
4: timeout
The error codes and their string representations are defined in one place. You can never add a code and forget its string, or vice versa.
Conditional Compilation
/* conditional.c */
#include <stdio.h>
#ifdef __linux__
#define PLATFORM "Linux"
#elif defined(_WIN32)
#define PLATFORM "Windows"
#elif defined(__APPLE__)
#define PLATFORM "macOS"
#else
#define PLATFORM "Unknown"
#endif
#ifndef NDEBUG
#define DBG(fmt, ...) fprintf(stderr, "DBG: " fmt "\n", ##__VA_ARGS__)
#else
#define DBG(fmt, ...) ((void)0)
#endif
int main(void)
{
printf("Platform: %s\n", PLATFORM);
DBG("This only prints in debug mode");
DBG("x = %d", 42);
return 0;
}
Compile and run:
gcc -Wall -o conditional conditional.c && ./conditional
gcc -Wall -DNDEBUG -o conditional_rel conditional.c && ./conditional_rel
In the release build (-DNDEBUG), the DBG macro expands to ((void)0), a no-op expression, so the call sites still compile but produce no code or output.
Try It: Add a
#define VERBOSE flag. When defined, make the LOG macro also print the file name and line number using __FILE__ and __LINE__.
Rust: macro_rules! -- Pattern-Matching Macros
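Rust macros operate on token trees that are matched against syntax fragments such as expressions, identifiers, and types, not on raw text. They are hygienic: a local variable introduced inside a macro cannot collide with a variable of the same name at the call site.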
// rust_macros.rs
macro_rules! min {
    ($a:expr, $b:expr) => {{
        let a = $a;
        let b = $b;
        if a < b { a } else { b }
    }};
}
macro_rules! debug_var {
    ($var:expr) => {
        eprintln!("{} = {:?}", stringify!($var), $var);
    };
}
macro_rules! make_vec {
    ( $( $elem:expr ),* $(,)? ) => {{
        let mut v = Vec::new();
        $( v.push($elem); )*
        v
    }};
}
fn main() {
    let x = 10;
    let y = 3;
    println!("min({}, {}) = {}", x, y, min!(x, y));
    // Safe with side effects -- each argument evaluated once
    let mut counter = 0;
    let result = min!({ counter += 1; counter }, 5);
    println!("result = {}, counter = {}", result, counter);
    // counter is exactly 1, not 2
    debug_var!(x + y);
    debug_var!("hello");
    let v = make_vec![1, 2, 3, 4, 5];
    println!("vec: {:?}", v);
}
Compile and run:
rustc rust_macros.rs && ./rust_macros
Output:
min(10, 3) = 3
result = 1, counter = 1
x + y = 13
"hello" = "hello"
vec: [1, 2, 3, 4, 5]
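Rust Note: The min! macro evaluates each argument once because it binds $a and $b to local variables. This is a choice made by the macro author, not an automatic guarantee: a macro written as $x * $x would still evaluate its argument twice. Hygiene is what makes the binding safe: the a and b introduced inside the macro cannot collide with variables named a or b at the call site. The SQUARE(i++) bug from C is therefore easy to avoid in Rust.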
Rust: cfg Attributes for Conditional Compilation
Rust replaces #ifdef with the cfg attribute system:
// cfg_demo.rs
#[cfg(target_os = "linux")]
fn platform() -> &'static str { "Linux" }
#[cfg(target_os = "windows")]
fn platform() -> &'static str { "Windows" }
#[cfg(target_os = "macos")]
fn platform() -> &'static str { "macOS" }
#[cfg(not(any(target_os = "linux", target_os = "windows", target_os = "macos")))]
fn platform() -> &'static str { "Unknown" }
fn main() {
    println!("Platform: {}", platform());
    if cfg!(debug_assertions) {
        println!("Debug mode is ON");
    } else {
        println!("Release mode");
    }
}
Compile and run:
rustc cfg_demo.rs && ./cfg_demo
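The cfg! macro expands to a literal true or false at compile time, so the
optimizer can remove the dead branch. Unlike #[cfg(...)], which removes the
annotated item before compilation, both branches of an if cfg!(...) must
still compile.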
Rust: Feature Flags in Cargo
Cargo supports feature flags for conditional compilation:
# Cargo.toml
[package]
name = "myapp"
version = "0.1.0"
edition = "2021"
[features]
default = ["json"]
json = ["dep:serde_json"]
verbose_logging = []
[dependencies]
serde_json = { version = "1", optional = true }
// src/main.rs (Cargo project)
#[cfg(feature = "json")]
fn parse_config(data: &str) {
    let v: serde_json::Value = serde_json::from_str(data).unwrap();
    println!("Parsed JSON: {}", v);
}
#[cfg(not(feature = "json"))]
fn parse_config(_data: &str) {
    println!("JSON support not compiled in");
}
fn main() {
    parse_config(r#"{"key": "value"}"#);
}
Build with different features:
cargo run # default features (json)
cargo run --no-default-features # no json
cargo run --features verbose_logging # default + verbose
Procedural Macros: A Brief Overview
Rust also has procedural macros: Rust functions that transform token streams at
compile time. The three kinds are derive macros (#[derive(Debug, Serialize)]),
attribute macros (#[route("GET", "/users")]), and function-like macros
(sql!(SELECT * FROM users)). They are defined in a separate crate. Derive
macros are by far the most common.
Side-by-Side: C Preprocessor vs Rust Macros
+----------------------------+----------------------------------+
| C Preprocessor | Rust Macros |
+----------------------------+----------------------------------+
| Text substitution | Syntax tree transformation |
| No hygiene | Hygienic -- no name capture |
| Arguments re-evaluated     | Evaluated once if bound with let |
| No type safety | Type-checked after expansion |
| #ifdef for platforms | #[cfg()] attributes |
| #define constants | const / static |
| Include guards needed | Module system handles it |
| Errors point to expanded | Errors point to macro call site |
| code (unreadable) | (usually readable) |
+----------------------------+----------------------------------+
Debugging Macro Expansions
In C, use gcc -E to see the preprocessed output:
gcc -E macros.c | tail -20
In Rust, use cargo expand (requires the cargo-expand tool):
cargo install cargo-expand
cargo expand
Try It: Write a C macro
CLAMP(x, lo, hi) that clamps a value to a range. Then write the Rust equivalent using macro_rules!. Verify that the Rust version is safe with side effects by passing { counter += 1; counter } as an argument.
Knowledge Check
- What happens if you write #define SQUARE(x) x * x without parentheses and then call SQUARE(2 + 3)?
- Why does SQUARE(i++) cause undefined behavior in C, while the Rust min! macro in this chapter handles an argument with side effects safely?
- What is the difference between cfg!(target_os = "linux") and #[cfg(target_os = "linux")] in Rust?
Common Pitfalls
- Missing parentheses in C macros. Always wrap every parameter and the entire expression: #define M(x) ((x) + 1).
- Multi-statement macros without do-while. Use do { ... } while (0) for macros that expand to multiple statements, or they break if/else chains.
- Macro arguments with side effects. Never pass i++ or function calls to C macros unless you know the argument is used exactly once.
- Include guard name collisions. Using a common name like UTILS_H in two different libraries causes silent header suppression.
- Over-using macros when functions work. Modern C compilers inline small functions automatically. Use static inline instead of function-like macros when possible.
- Overcomplicating Rust macros. If a function does the job, use a function. Macros are for cases where you need syntax flexibility (variadic arguments, code generation, compile-time string manipulation).