Structs, Enums, and Unions
Primitive types only get you so far. Real programs model real things: a network packet has a source, a destination, and a payload. A device register has named bit fields. This chapter covers the composite types that make systems programming possible.
C Structs
A struct groups related values under one name.
/* struct_basic.c */
#include <stdio.h>
#include <math.h>
typedef struct {
    double x;
    double y;
} Point;
double distance(Point a, Point b)
{
    double dx = a.x - b.x;
    double dy = a.y - b.y;
    return sqrt(dx * dx + dy * dy);
}
int main(void)
{
    Point a = { .x = 0.0, .y = 0.0 };
    Point b = { .x = 3.0, .y = 4.0 };
    printf("distance = %f\n", distance(a, b));
    return 0;
}
$ gcc -Wall -std=c17 -o struct_basic struct_basic.c -lm && ./struct_basic
distance = 5.000000
typedef lets you write Point instead of struct Point everywhere. The .x
syntax in the initializer is a C99 designated initializer.
Driver Prep: The Linux kernel uses structs constantly: struct file, struct inode, struct task_struct, struct sk_buff. Understanding struct layout and how structs are passed is foundational.
Rust Structs
Named-field struct
// struct_basic.rs
struct Point {
    x: f64,
    y: f64,
}
fn distance(a: &Point, b: &Point) -> f64 {
    let dx = a.x - b.x;
    let dy = a.y - b.y;
    (dx * dx + dy * dy).sqrt()
}
fn main() {
    let a = Point { x: 0.0, y: 0.0 };
    let b = Point { x: 3.0, y: 4.0 };
    println!("distance = {}", distance(&a, &b));
}
No typedef needed. The struct name is the type name directly.
Tuple struct and unit struct
// tuple_struct.rs
struct Color(u8, u8, u8);
struct Meters(f64);
struct Marker; // unit struct, zero-sized
fn main() {
    let red = Color(255, 0, 0);
    println!("R={}, G={}, B={}", red.0, red.1, red.2);
    let height = Meters(1.82);
    println!("height = {} m", height.0);
    let _m = Marker;
    println!("size of Marker = {}", std::mem::size_of::<Marker>());
}
Tuple structs are useful for the "newtype" pattern -- wrapping a value in a distinct type for type safety. Unit structs take no memory at runtime.
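The newtype pattern is easiest to see with two units that share the same underlying representation. The following is a minimal sketch, not part of the chapter's own examples: the Feet type and the to_feet and add_meters functions are hypothetical names introduced only for illustration.

```rust
// newtype_sketch.rs
// Hypothetical illustration: Feet, to_feet, and add_meters are not from the chapter.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Feet(f64);

// Conversion must be requested explicitly, so units cannot be mixed by accident.
fn to_feet(m: Meters) -> Feet {
    Feet(m.0 * 3.28084)
}

// Only accepts Meters; passing a Feet value is a type error.
fn add_meters(a: Meters, b: Meters) -> Meters {
    Meters(a.0 + b.0)
}

fn main() {
    let total = add_meters(Meters(1.5), Meters(0.5));
    println!("total = {:?}", total);
    println!("in feet = {:?}", to_feet(total));
    // add_meters(Meters(1.0), Feet(3.0)); // does NOT compile: mismatched types
    // The wrapper adds no storage on top of the wrapped f64.
    println!("size of Meters = {}", std::mem::size_of::<Meters>());
}
```

Both wrappers hold a single f64, yet the compiler treats them as unrelated types, which is the whole point of the pattern.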
Methods (impl blocks in Rust)
Rust attaches methods to structs via impl. C has no methods; you pass the struct
to a function manually.
C: functions that take a struct pointer
/* rect_c.c */
#include <stdio.h>
typedef struct {
    double width;
    double height;
} Rect;
double rect_area(const Rect *r)
{
    return r->width * r->height;
}
int main(void)
{
    Rect r = { .width = 5.0, .height = 3.0 };
    printf("area = %f\n", rect_area(&r));
    return 0;
}
Rust: methods with self
// rect_rust.rs
struct Rect {
    width: f64,
    height: f64,
}
impl Rect {
    fn area(&self) -> f64 {
        self.width * self.height
    }
    fn new(width: f64, height: f64) -> Rect {
        Rect { width, height }
    }
}
fn main() {
    let r = Rect::new(5.0, 3.0);
    println!("area = {}", r.area());
}
C struct "method" call: rect_area(&r)
Rust method call: r.area()
Under the hood, both pass a pointer to the struct.
&self == const Rect*
&mut self == Rect*
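One way to see that a method is just a function with an explicit receiver is to call it with the type-qualified syntax. The sketch below uses a hypothetical Counter type (not from the chapter) so it does not give away the exercise that follows.

```rust
// method_call_forms.rs
// Hypothetical Counter type, used only to show the two call forms.
struct Counter {
    count: u32,
}

impl Counter {
    // Shared borrow: read-only access, comparable to const Counter* in C.
    fn get(&self) -> u32 {
        self.count
    }

    // Exclusive borrow: may modify the struct, comparable to Counter* in C.
    fn increment(&mut self) {
        self.count += 1;
    }
}

fn main() {
    let mut c = Counter { count: 0 };
    c.increment();              // method syntax, borrow taken automatically
    Counter::increment(&mut c); // same function, receiver passed by hand
    println!("count = {}", Counter::get(&c));
}
```

The second form looks much like the C call rect_area(&r): the struct's address is an ordinary first argument.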
Try It: Add a scale method to the Rust Rect that takes &mut self and a factor: f64, and multiplies both width and height by the factor.
C Enums
In C, enums are just named integer constants.
/* enum_c.c */
#include <stdio.h>
enum Direction { NORTH = 0, SOUTH = 1, EAST = 2, WEST = 3 };
const char *direction_name(enum Direction d)
{
    switch (d) {
    case NORTH: return "North";
    case SOUTH: return "South";
    case EAST:  return "East";
    case WEST:  return "West";
    default:    return "Unknown";
    }
}
int main(void)
{
    enum Direction d = EAST;
    printf("direction = %s (%d)\n", direction_name(d), d);
    /* C allows any integer -- no type safety */
    enum Direction invalid = 99;
    printf("invalid = %s (%d)\n", direction_name(invalid), invalid);
    return 0;
}
Caution: C enums provide no type safety. You can assign any integer to an enum variable. The default case is your only defense.
Rust Enums: Algebraic Data Types
Rust enums are fundamentally more powerful. Each variant can carry data.
Simple enum
// enum_simple.rs
#[derive(Debug)]
enum Direction {
    North,
    South,
    East,
    West,
}
fn direction_name(d: &Direction) -> &str {
    match d {
        Direction::North => "North",
        Direction::South => "South",
        Direction::East => "East",
        Direction::West => "West",
    }
}
fn main() {
    let d = Direction::East;
    println!("direction = {} ({:?})", direction_name(&d), d);
    // let invalid: Direction = 99; // does NOT compile
}
The match is exhaustive. Add a fifth variant and the compiler forces you to handle
it everywhere.
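When a raw integer arrives from outside the program (a file, a register, a C API), it has to be checked before it can become an enum value. A common way to do that is the standard TryFrom trait. This is a sketch under assumptions: the explicit discriminants, the choice of u8 as the raw type, and the direction_from_raw helper are illustrative and not part of the chapter's example.

```rust
// enum_from_int.rs
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum Direction {
    North = 0,
    South = 1,
    East = 2,
    West = 3,
}

impl TryFrom<u8> for Direction {
    // On failure, hand back the rejected raw value.
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Direction::North),
            1 => Ok(Direction::South),
            2 => Ok(Direction::East),
            3 => Ok(Direction::West),
            other => Err(other),
        }
    }
}

// Hypothetical convenience wrapper around the trait method.
fn direction_from_raw(raw: u8) -> Result<Direction, u8> {
    Direction::try_from(raw)
}

fn main() {
    println!("{:?}", direction_from_raw(2));  // a valid discriminant
    println!("{:?}", direction_from_raw(99)); // rejected instead of silently accepted
    println!("West as u8 = {}", Direction::West as u8);
}
```

Going from enum to integer is a plain cast; going from integer to enum is fallible and the caller has to handle the error case.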
Enums with data
// enum_data.rs
#[derive(Debug)]
enum Shape {
    Circle(f64),
    Rectangle(f64, f64),
    Triangle { base: f64, height: f64 },
}
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Rectangle(w, h) => w * h,
        Shape::Triangle { base, height } => 0.5 * base * height,
    }
}
fn main() {
    let shapes = vec![
        Shape::Circle(5.0),
        Shape::Rectangle(4.0, 6.0),
        Shape::Triangle { base: 3.0, height: 8.0 },
    ];
    for s in &shapes {
        println!("{:?} -> area = {:.2}", s, area(s));
    }
}
$ rustc enum_data.rs && ./enum_data
Circle(5.0) -> area = 78.54
Rectangle(4.0, 6.0) -> area = 24.00
Triangle { base: 3.0, height: 8.0 } -> area = 12.00
This is impossible in C with plain enums. You would need a struct that holds a tag and a union.
Option and Result
Rust's standard library uses enums for two critical types.
Option replaces null pointers:
// option_demo.rs
fn find_first_negative(nums: &[i32]) -> Option<usize> {
    for (i, &n) in nums.iter().enumerate() {
        if n < 0 {
            return Some(i);
        }
    }
    None
}
fn main() {
    let data = [10, 20, -5, 30];
    match find_first_negative(&data) {
        Some(idx) => println!("first negative at index {}", idx),
        None => println!("no negatives found"),
    }
}
Result replaces error codes:
// result_demo.rs
use std::num::ParseIntError;
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n * 2)
}
fn main() {
    match parse_and_double("21") {
        Ok(val) => println!("success: {}", val),
        Err(e) => println!("error: {}", e),
    }
    match parse_and_double("abc") {
        Ok(val) => println!("success: {}", val),
        Err(e) => println!("error: {}", e),
    }
}
Rust Note: Option<T> is enum { Some(T), None }. Result<T, E> is enum { Ok(T), Err(E) }. These are ordinary enums with generics. The power comes from match and the ? operator.
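The ? operator works on both types: on Result it returns early with the Err, and on Option it returns early with None. A minimal sketch follows; the function names sum_strs and first_char_upper are hypothetical, chosen only for illustration.

```rust
// question_mark_sketch.rs
use std::num::ParseIntError;

// Each ? either unwraps the Ok value or returns the Err to the caller.
fn sum_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.parse()?;
    let y: i32 = b.parse()?;
    Ok(x + y)
}

// On Option, ? returns None early if there is no value.
fn first_char_upper(s: &str) -> Option<char> {
    let c = s.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", sum_strs("2", "40"));
    println!("{:?}", sum_strs("2", "x"));
    println!("{:?}", first_char_upper("rust"));
    println!("{:?}", first_char_upper(""));
}
```

Without ?, each of these lines would need its own match that forwards the failure case by hand.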
C Unions
A union stores different types in the same memory. Only one field is valid at a time.
/* union_c.c */
#include <stdio.h>
#include <string.h>
typedef struct {
    enum { INT_VAL, FLOAT_VAL, STR_VAL } tag;
    union {
        int i;
        double f;
        char s[32];
    } data;
} Value;
void print_value(const Value *v)
{
    switch (v->tag) {
    case INT_VAL:   printf("int: %d\n", v->data.i); break;
    case FLOAT_VAL: printf("float: %f\n", v->data.f); break;
    case STR_VAL:   printf("str: %s\n", v->data.s); break;
    }
}
int main(void)
{
    Value a = { .tag = INT_VAL, .data.i = 42 };
    Value b = { .tag = FLOAT_VAL, .data.f = 3.14 };
    Value c = { .tag = STR_VAL };
    strncpy(c.data.s, "hello", sizeof(c.data.s) - 1);
    print_value(&a);
    print_value(&b);
    print_value(&c);
    return 0;
}
Union memory layout:
+------+------+------+------+------+------+------+------+
| shared memory (32 bytes) |
+------+------+------+------+------+------+------+------+
When tag == INT_VAL: first 4 bytes hold int
When tag == FLOAT_VAL: first 8 bytes hold double
When tag == STR_VAL: all 32 bytes hold char[32]
sizeof(union) = size of largest member, rounded up to the union's alignment = 32
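The size rule can be checked from Rust with a #[repr(C)] union that mirrors the C one. This is a sketch assuming a typical 64-bit target where f64 is 8-byte aligned; the Data and Data33 names are hypothetical, and Data33 exists only to show the size being rounded up to the alignment.

```rust
// union_size.rs
use std::mem::{align_of, size_of};

// Mirrors the anonymous union inside the C Value struct.
#[allow(dead_code)]
#[repr(C)]
union Data {
    i: i32,
    f: f64,
    s: [u8; 32],
}

// Hypothetical variant: the largest member is 33 bytes, not a multiple of 8.
#[allow(dead_code)]
#[repr(C)]
union Data33 {
    f: f64,
    s: [u8; 33],
}

fn main() {
    println!("size of Data   = {}", size_of::<Data>());
    println!("align of Data  = {}", align_of::<Data>());
    println!("size of Data33 = {}", size_of::<Data33>());
}
```

Under the stated assumption this reports 32, 8, and 40: the 33-byte member forces the second union up to the next multiple of 8.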
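Caution: The tag field is a convention, not an enforcement. There is no runtime check. Reading a member other than the one last written reinterprets the stored bytes as the other type; in C the result may be a meaningless value or a trap representation, and in C++ it is undefined behavior.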
Driver Prep: Type punning through unions is common in low-level code -- reading hardware registers, parsing binary protocols. The Linux kernel uses unions in structures like union sigval and union nf_inet_addr.
Rust's Safe Alternative to Unions
Rust enums with data are tagged unions with the tag built in and enforced by the compiler:
// tagged_union_rust.rs
enum Value {
    Int(i32),
    Float(f64),
    Str(String),
}
fn print_value(v: &Value) {
    match v {
        Value::Int(i) => println!("int: {}", i),
        Value::Float(f) => println!("float: {}", f),
        Value::Str(s) => println!("str: {}", s),
    }
}
fn main() {
    let values = vec![
        Value::Int(42),
        Value::Float(3.14),
        Value::Str(String::from("hello")),
    ];
    for v in &values {
        print_value(v);
    }
}
For low-level type punning, Rust has raw union types (access requires unsafe):
// raw_union.rs
union FloatBits {
    f: f32,
    u: u32,
}
fn main() {
    let fb = FloatBits { f: 1.0 };
    let bits = unsafe { fb.u };
    println!("float 1.0 as bits: 0x{:08X}", bits);
}
Rust Note: Raw Rust unions exist primarily for C interop (FFI). In pure Rust code, prefer enums. The unsafe block signals that the programmer is taking responsibility for correctness.
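For the specific case of inspecting a float's bit pattern, the standard library already offers safe conversions, f32::to_bits and f32::from_bits, so no union is needed. The sketch below compares the two approaches; the helper names bits_via_union and bits_via_std are hypothetical.

```rust
// float_bits_safe.rs
union FloatBits {
    f: f32,
    u: u32,
}

// Type punning through a raw union: requires unsafe.
fn bits_via_union(x: f32) -> u32 {
    let fb = FloatBits { f: x };
    unsafe { fb.u }
}

// Same result through the standard library: no unsafe.
fn bits_via_std(x: f32) -> u32 {
    x.to_bits()
}

fn main() {
    println!("union: 0x{:08X}", bits_via_union(1.0));
    println!("std:   0x{:08X}", bits_via_std(1.0));
    println!("back:  {}", f32::from_bits(0x3F80_0000));
}
```

Both helpers return 0x3F800000 for 1.0, the IEEE 754 single-precision encoding of that value.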
Memory Layout Comparison
/* layout_c.c */
#include <stdio.h>
#include <stddef.h>
typedef struct {
    char a;   /* 1 byte */
    int b;    /* 4 bytes */
    char c;   /* 1 byte */
    double d; /* 8 bytes */
} Example;
int main(void)
{
    printf("sizeof(Example) = %zu\n", sizeof(Example));
    printf("offset of a = %zu\n", offsetof(Example, a));
    printf("offset of b = %zu\n", offsetof(Example, b));
    printf("offset of c = %zu\n", offsetof(Example, c));
    printf("offset of d = %zu\n", offsetof(Example, d));
    return 0;
}
C struct layout (with padding):
Byte: 0 1 2 3 4 5 6 7
+----+----+----+----+----+----+----+----+
| a | pad| pad| pad| b | b | b | b |
+----+----+----+----+----+----+----+----+
Byte: 8 9 10 11 12 13 14 15
+----+----+----+----+----+----+----+----+
| c | pad| pad| pad| pad| pad| pad| pad|
+----+----+----+----+----+----+----+----+
Byte: 16 17 18 19 20 21 22 23
+----+----+----+----+----+----+----+----+
| d | d | d | d | d | d | d | d |
+----+----+----+----+----+----+----+----+
Total: 24 bytes (10 bytes of padding!)
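The padding in the diagram follows from a simple rule used by common C ABIs: each field starts at the next multiple of its alignment, and the total size is rounded up to the largest alignment. The sketch below implements that rule under those assumptions; the c_layout function and the (size, alignment) pairs are illustrative and assume a typical 64-bit target.

```rust
// layout_rule.rs
// Round n up to the next multiple of align.
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

// Given (size, alignment) for each field in declaration order,
// return the field offsets and the total struct size.
fn c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize) {
    let mut offsets = Vec::new();
    let mut end = 0;
    let mut max_align = 1;
    for &(size, align) in fields {
        let offset = round_up(end, align);
        offsets.push(offset);
        end = offset + size;
        if align > max_align {
            max_align = align;
        }
    }
    (offsets, round_up(end, max_align))
}

fn main() {
    // char, int, char, double in declaration order.
    let declared = [(1, 1), (4, 4), (1, 1), (8, 8)];
    println!("declared order: {:?}", c_layout(&declared));
    // Same fields, largest alignment first.
    let sorted = [(8, 8), (4, 4), (1, 1), (1, 1)];
    println!("sorted order:   {:?}", c_layout(&sorted));
}
```

For the declared order the rule gives offsets 0, 4, 8, 16 and a size of 24, matching the diagram. Sorting by descending alignment gives a size of 16.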
Rust's default layout leaves field order unspecified, and the compiler typically reorders fields to minimize padding:
// layout_rust.rs
use std::mem;
struct Example { a: u8, b: i32, c: u8, d: f64 }
fn main() {
    println!("size of Example = {}", mem::size_of::<Example>());
}
Rust struct layout (one ordering the compiler may choose):
Byte: 0 1 2 3 4 5 6 7
+----+----+----+----+----+----+----+----+
| d | d | d | d | d | d | d | d |
+----+----+----+----+----+----+----+----+
Byte: 8 9 10 11 12 13 14 15
+----+----+----+----+----+----+----+----+
| b | b | b | b | a | c | pad| pad|
+----+----+----+----+----+----+----+----+
Total: 16 bytes (2 bytes of padding)
To force C-compatible layout, use #[repr(C)]:
// repr_c.rs
#[repr(C)]
struct Example { a: u8, b: i32, c: u8, d: f64 }
fn main() {
    println!("size (#[repr(C)]) = {}", std::mem::size_of::<Example>());
    // prints 24, same as C
}
Driver Prep: When passing structs to the kernel or hardware, you must control the layout. Use #[repr(C)] in Rust. In C, use __attribute__((packed)) if you need to eliminate padding entirely.
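Rust has a counterpart to the packed attribute as well: #[repr(C, packed)] keeps the declared field order and removes the padding. The sketch below compares the two on the chapter's Example fields, assuming a typical 64-bit target; the PackedExample name is hypothetical.

```rust
// packed_size.rs
use std::mem::{align_of, size_of};

// Declared order with natural alignment, as a C compiler would lay it out.
#[allow(dead_code)]
#[repr(C)]
struct Example { a: u8, b: i32, c: u8, d: f64 }

// Declared order with no padding between or after fields.
#[allow(dead_code)]
#[repr(C, packed)]
struct PackedExample { a: u8, b: i32, c: u8, d: f64 }

fn main() {
    println!("repr(C)         size = {}", size_of::<Example>());
    println!("repr(C, packed) size = {}", size_of::<PackedExample>());
    println!("repr(C, packed) align = {}", align_of::<PackedExample>());
}
```

The packed version is 14 bytes, the plain sum of the field sizes, against 24 for #[repr(C)]. The cost is that packed fields may be misaligned, so they should be copied out by value rather than borrowed by reference.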
Try It: Reorder the fields in the C struct to minimize padding manually. What is the smallest sizeof you can achieve?
Enum Memory Layout
// enum_size.rs
use std::mem;
enum Color { Red, Green, Blue }
fn main() {
    println!("size of Color = {}", mem::size_of::<Color>());
    println!("size of Option<u8> = {}", mem::size_of::<Option<u8>>());
    println!("size of Option<Box<i32>> = {}", mem::size_of::<Option<Box<i32>>>());
}
$ rustc enum_size.rs && ./enum_size
size of Color = 1
size of Option<u8> = 2
size of Option<Box<i32>> = 8
Rust uses the smallest discriminant that fits. Color needs only 1 byte. A C enum
is typically 4 bytes (int-sized).
Option<Box<i32>> is the same size as Box<i32> -- Rust uses "niche optimization":
since Box can never be null, the null bit pattern represents None.
Option<Box<i32>> layout:
Some(ptr): | non-zero pointer value (8 bytes) |
None: | 0x0000000000000000 (8 bytes) |
No extra tag byte needed.
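The same optimization applies to other types that have a forbidden bit pattern, such as references and the NonZero integer types, but not to plain integers, where every bit pattern is a valid value. A short sketch to compare them follows, with the sizes assuming a 64-bit target.

```rust
// niche_sizes.rs
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // References are never null, so None can use the null pattern.
    println!("&u8 = {}, Option<&u8> = {}",
             size_of::<&u8>(), size_of::<Option<&u8>>());
    // NonZeroU32 excludes 0, so None can use 0.
    println!("NonZeroU32 = {}, Option<NonZeroU32> = {}",
             size_of::<NonZeroU32>(), size_of::<Option<NonZeroU32>>());
    // u32 has no spare pattern, so a separate tag is needed.
    println!("u32 = {}, Option<u32> = {}",
             size_of::<u32>(), size_of::<Option<u32>>());
}
```

The first two pairs print equal sizes. Option<u32> grows to 8 bytes: 4 for the value, plus a tag padded out to the 4-byte alignment.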
Quick Knowledge Check
- What is the difference between a C union and a Rust enum with data?
- Why does Rust reorder struct fields by default?
- What does #[repr(C)] do?
Common Pitfalls
- Reading the wrong union member in C. There is no runtime check; the stored bytes are simply reinterpreted, which can yield a meaningless value or a trap representation (and is undefined behavior in C++). Use a tag field and validate it on every access.
- Forgetting padding in C structs. sizeof(struct) may be larger than the sum of field sizes. Use offsetof to check.
- Assuming C enum values are contiguous. You can assign arbitrary values: enum E { A = 0, B = 100 }. Do not use them as array indices without bounds checks.
- Forgetting pub on Rust struct fields. The struct may be public, but fields are private by default.
- Using #[repr(C)] everywhere in Rust. Only use it when you need C-compatible layout (FFI, memory-mapped I/O). Otherwise let the compiler optimize.
- Ignoring niche optimization. Option<&T> is the same size as &T. Do not wrap references in custom tagged enums when Option already does it for free.
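The field-visibility pitfall is easy to reproduce with a small module. In this sketch the geometry module and the Circle type with its radius and id fields are hypothetical names used only for illustration.

```rust
// field_visibility.rs
mod geometry {
    pub struct Circle {
        pub radius: f64, // readable and writable outside the module
        id: u32,         // private: only code inside geometry can touch it
    }

    impl Circle {
        // A public constructor is needed because outside code
        // cannot name the private field in a struct literal.
        pub fn new(radius: f64) -> Circle {
            Circle { radius, id: 1 }
        }

        // Read-only access to the private field.
        pub fn id(&self) -> u32 {
            self.id
        }
    }
}

fn main() {
    let c = geometry::Circle::new(2.0);
    println!("radius = {}, id = {}", c.radius, c.id());
    // let d = geometry::Circle { radius: 1.0, id: 2 }; // does NOT compile: id is private
    // println!("{}", c.id);                            // does NOT compile either
}
```

Marking the struct pub exposes only the type name; each field needs its own pub to be visible outside the module.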