Preparing for Kernel Space

Everything in this book has been user-space code. But every concept -- pointers, bit manipulation, function pointers, state machines, memory layout -- was chosen because it maps directly to kernel programming. This chapter connects the dots: what changes when you cross into kernel space, and how your user-space skills translate.

What Changes When You Cross the Boundary

User space                          Kernel space
+----------------------------------+----------------------------------+
| libc available                   | No libc                          |
| malloc/free                      | kmalloc/kfree (with GFP flags)   |
| printf                           | printk                           |
| Segfaults caught by kernel       | Bugs crash the whole system      |
| Virtual address space per process| Shared address space, all memory |
| Floating point available         | No floating point (usually)      |
| Large stack (8 MB default)       | Tiny stack (8-16 KB)             |
| User can be preempted freely     | Must think about preemption      |
| Errors return -1 and set errno   | Functions return negative errno  |
+----------------------------------+----------------------------------+

The kernel is freestanding C. No standard library, no heap by default, no safety net. Every technique we've practiced -- careful memory management, understanding alignment, defensive error handling -- becomes critical.

The Kernel's C Dialect

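Kernel C is GNU C rather than strict ISO C. Since Linux 5.18 the kernel builds with -std=gnu11 (earlier versions used gnu89), and it relies on GCC/Clang extensions while forbidding things user-space code takes for granted.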

No Standard Library

You do not get #include <stdio.h>. Instead:

/* Kernel equivalents */
#include <linux/kernel.h>    /* printk, container_of */
#include <linux/slab.h>      /* kmalloc, kfree */
#include <linux/string.h>    /* memcpy, strcmp (kernel versions) */
#include <linux/types.h>     /* u8, u16, u32, u64, etc. */

printk replaces printf:

/* User space */
printf("value = %d\n", x);

/* Kernel space */
printk(KERN_INFO "value = %d\n", x);
/* or modern style: */
pr_info("value = %d\n", x);

No Floating Point

The kernel does not save and restore user FPU state every time it is entered; that state is only handled when tasks are switched. Floating point used in ordinary kernel code therefore silently corrupts the FPU registers of whichever user process was interrupted.

/* WRONG in kernel code: */
double ratio = bytes / 1024.0;  /* will corrupt user FPU state */

/* CORRECT: use integer math */
unsigned long ratio = bytes / 1024;
unsigned long remainder = bytes % 1024;

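If you absolutely need floating point (rare), you must bracket it so the kernel saves and restores the FPU state. On x86 the calls are:

#include <asm/fpu/api.h>     /* x86-specific */

kernel_fpu_begin();
/* ... floating point operations; must not sleep in here ... */
kernel_fpu_end();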
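
In most cases integer math is enough, even when the result has a fractional part. The sketch below is a user-space illustration of the idea, not kernel code; kib_x100() and print_kib() are hypothetical helper names chosen for this example. It carries the value in hundredths of a KiB so that no floating-point instruction is ever needed.

```c
#include <stdio.h>

/* Value of `bytes` in hundredths of a KiB, truncated; integer-only. */
static unsigned long kib_x100(unsigned long bytes)
{
    /* Split into whole and remainder first so the multiplication
     * by 100 stays small even for large byte counts. */
    return (bytes / 1024) * 100 + (bytes % 1024) * 100 / 1024;
}

/* Print as "W.FF KiB" without touching the FPU. */
static int print_kib(unsigned long bytes)
{
    unsigned long v = kib_x100(bytes);

    return printf("%lu.%02lu KiB\n", v / 100, v % 100);
}
```

Calling print_kib(1536) prints 1.50 KiB. In a kernel module the same arithmetic would feed pr_info() instead of printf().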

Limited Stack

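The kernel stack is small and fixed in size: 8 KB on 32-bit x86 (two pages) and 16 KB on x86-64. Allocating large arrays on the stack will overflow it. Kernels built with CONFIG_VMAP_STACK have a guard page that turns the overflow into an immediate oops; without it there is no guard page, just corruption.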

/* WRONG in kernel code: */
char buffer[8192];  /* might overflow the entire kernel stack */

/* CORRECT: allocate on the heap */
char *buffer = kmalloc(8192, GFP_KERNEL);
if (!buffer)
    return -ENOMEM;
/* ... use buffer ... */
kfree(buffer);

GFP Flags

kmalloc takes a flags argument that specifies allocation context:

/* Can sleep (normal context, not in interrupt) */
ptr = kmalloc(size, GFP_KERNEL);

/* Cannot sleep (interrupt context, spinlock held) */
ptr = kmalloc(size, GFP_ATOMIC);

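/* Legacy: memory from ZONE_DMA for devices with narrow address limits.
 * GFP_DMA is a zone modifier, so combine it with a context flag. */
ptr = kmalloc(size, GFP_KERNEL | GFP_DMA);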

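Using GFP_KERNEL in interrupt context can deadlock the system, because the allocator may sleep where sleeping is forbidden. Using GFP_ATOMIC where GFP_KERNEL would do needlessly draws on emergency memory reserves and is more likely to fail. Getting this right is essential.

Caution: A wrong GFP flag is one of the most common kernel bugs. If you hold a spinlock or are in an interrupt handler, you must use GFP_ATOMIC. With GFP_KERNEL in that context the allocator may sleep, and sleeping while holding a spinlock can deadlock the CPU. Kernels built with CONFIG_DEBUG_ATOMIC_SLEEP report this as "BUG: sleeping function called from invalid context".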
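
The rule can be modeled in user space to make it concrete. Everything in the sketch below is hypothetical: the MODEL_ names, the model_in_atomic flag, and model_kmalloc() are stand-ins invented for illustration and are not the kernel's definitions. The model simply refuses a may-sleep allocation requested while "in atomic context" and counts the violation.

```c
#include <stdbool.h>
#include <stdlib.h>

/* User-space model only: these are NOT the kernel's definitions. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

static bool model_in_atomic;      /* true while "holding a spinlock" */
static unsigned int model_bugs;   /* sleeping-in-atomic violations seen */

/* Pick the flag that the rule in the text calls for. */
static enum model_gfp model_pick_gfp(void)
{
    return model_in_atomic ? MODEL_GFP_ATOMIC : MODEL_GFP_KERNEL;
}

/* Allocate, rejecting a may-sleep request made from atomic context. */
static void *model_kmalloc(size_t size, enum model_gfp flags)
{
    if (flags == MODEL_GFP_KERNEL && model_in_atomic) {
        model_bugs++;   /* a real kernel might warn or deadlock here */
        return NULL;
    }
    return malloc(size);
}
```

A real driver usually knows its context statically, so it hard-codes the flag rather than picking it at run time; the helper here only exists to make the rule testable.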

Module Basics (Conceptual)

A kernel module is a .ko file loaded at runtime. The minimal structure:

/* hello.c -- conceptual, not compilable without kernel headers */
#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("Hello world kernel module");

static int __init hello_init(void)
{
    pr_info("hello: module loaded\n");
    return 0;  /* 0 = success */
}

static void __exit hello_exit(void)
{
    pr_info("hello: module unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);
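
Build it against the running kernel's headers, then load it, check the log, and unload it. This assumes the source file is hello.c and sits next to a one-line Makefile containing obj-m += hello.o:

$ make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
$ sudo insmod hello.ko
$ dmesg | tail -1
[12345.678] hello: module loaded
$ sudo rmmod hello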

The __init macro lets the kernel free the init code once loading has finished. The __exit macro lets it omit the exit code entirely for built-in (non-modular) drivers, since built-in code is never unloaded.

How Your User-Space Skills Map to Kernel Code

list_head: The Kernel's Real Linked List

In Chapter 16, we built linked lists in C. The kernel uses struct list_head -- an intrusive doubly-linked list that is embedded inside the data structure.

/* User space (from Ch16): */
struct node {
    int data;
    struct node *next;
};

/* Kernel: */
#include <linux/list.h>

struct my_item {
    int data;
    struct list_head list;  /* embedded list node */
};

/* Usage: */
LIST_HEAD(my_list);

struct my_item *item = kmalloc(sizeof(*item), GFP_KERNEL);
if (!item)
    return -ENOMEM;  /* kmalloc can fail; always check */
item->data = 42;
list_add(&item->list, &my_list);

/* Iterate: */
struct my_item *pos;
list_for_each_entry(pos, &my_list, list) {
    pr_info("data = %d\n", pos->data);
}

The container_of macro (which you may have implemented in Ch16) converts a list_head pointer back to the containing structure. This is the same technique, used everywhere in the kernel.

struct my_item layout (64-bit):

+-----------+----------+--------------------+
| data (4B) | pad (4B) | list_head (16B)    |
|           |          | .next    | .prev   |
+-----------+----------+----------+---------+
^                      ^
|                      |
item                   &item->list

container_of(&item->list, struct my_item, list) == item
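
To see that nothing here is kernel magic, the same mechanism can be rebuilt in user space. The following is a simplified re-creation for illustration, assuming plain ISO C: the kernel's own versions in <linux/list.h> add type checking and many more helpers, and init_list_head() and sum_items() are names made up for this sketch.

```c
#include <stddef.h>

/* Simplified user-space re-creation of the kernel's list node. */
struct list_head { struct list_head *next, *prev; };

/* Step back from a member pointer to the start of its container. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points at itself. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert `node` right after `head`, as the kernel's list_add() does. */
static void list_add(struct list_head *node, struct list_head *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

struct my_item {
    int data;
    struct list_head list;   /* embedded list node */
};

/* Walk the list and sum each item's data via container_of. */
static int sum_items(struct list_head *head)
{
    int sum = 0;

    for (struct list_head *p = head->next; p != head; p = p->next)
        sum += container_of(p, struct my_item, list)->data;
    return sum;
}
```

The list code never mentions struct my_item, which is why one list implementation can serve every structure in the kernel: the node knows nothing about its container, and container_of recovers it from the member offset.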

Function Pointer vtables Become file_operations

In Chapter 18, we built vtables from function pointers. The kernel uses exactly the same pattern for its driver interfaces.

/* User space (from Ch18): */
struct Shape {
    double (*area)(void *self);
    void   (*draw)(void *self);
};

/* Kernel: file_operations for a character device */
#include <linux/fs.h>

static int     mydev_open(struct inode *i, struct file *f) { return 0; }
static ssize_t mydev_read(struct file *f, char __user *buf,
                          size_t len, loff_t *off) { return 0; }
static ssize_t mydev_write(struct file *f, const char __user *buf,
                           size_t len, loff_t *off) { return len; }
static long    mydev_ioctl(struct file *f, unsigned int cmd,
                           unsigned long arg) { return 0; }
static int     mydev_release(struct inode *i, struct file *f) { return 0; }

static const struct file_operations mydev_fops = {
    .owner          = THIS_MODULE,
    .open           = mydev_open,
    .read           = mydev_read,
    .write          = mydev_write,
    .unlocked_ioctl = mydev_ioctl,
    .release        = mydev_release,
};

This is the same struct-of-function-pointers pattern. The kernel dispatches open(), read(), write(), and ioctl() through these pointers. Block devices and network devices follow the same pattern with their own tables (struct block_device_operations and struct net_device_ops).
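
The dispatch side is worth seeing too. The sketch below is a user-space model under stated assumptions: struct model_fops, struct model_file, and the model_vfs_ functions are hypothetical names, and the real VFS does far more (locking, permission checks, position handling). It shows the core idea only: the caller goes through the table, and a missing operation becomes an error code instead of a crash.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

struct model_file;

/* Hypothetical ops table; not the kernel's struct file_operations. */
struct model_fops {
    ssize_t (*read)(struct model_file *f, char *buf, size_t len);
    ssize_t (*write)(struct model_file *f, const char *buf, size_t len);
};

struct model_file {
    const struct model_fops *ops;
    size_t written;      /* per-file state for the example driver */
};

/* Dispatchers: call through the table, or fail if the op is absent. */
static ssize_t model_vfs_read(struct model_file *f, char *buf, size_t len)
{
    if (!f->ops->read)
        return -EINVAL;
    return f->ops->read(f, buf, len);
}

static ssize_t model_vfs_write(struct model_file *f, const char *buf,
                               size_t len)
{
    if (!f->ops->write)
        return -EINVAL;
    return f->ops->write(f, buf, len);
}

/* A "driver" that implements only write and counts the bytes. */
static ssize_t counter_write(struct model_file *f, const char *buf,
                             size_t len)
{
    (void)buf;
    f->written += len;
    return (ssize_t)len;
}

static const struct model_fops counter_fops = {
    .write = counter_write,   /* .read left NULL on purpose */
};
```

Leaving a member NULL is how a driver says "not supported", which is why designated initializers suit these tables so well.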

Similarly, platform drivers:

/* Kernel: platform driver operations */
#include <linux/platform_device.h>

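static int  mydrv_probe(struct platform_device *pdev)  { return 0; }
/* Recent kernels changed .remove to return void; check your target version. */
static int  mydrv_remove(struct platform_device *pdev) { return 0; }

static struct platform_driver mydrv = {
    .probe  = mydrv_probe,
    .remove = mydrv_remove,
    .driver = {
        .name = "my_device",
    },
};
module_platform_driver(mydrv);  /* generates the module init/exit pair */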

State Machines Become Driver Lifecycle

In Chapter 19, we built explicit state machines. Kernel drivers are state machines:

Driver Lifecycle State Machine:

  [UNLOADED]
      |
      v  module_init()
  [LOADED]
      |
      v  probe()
  [BOUND TO DEVICE]
      |
      +----> suspend()  --> [SUSPENDED]
      |                         |
      |      resume()   <-------+
      |
      v  remove()
  [UNBOUND]
      |
      v  module_exit()
  [UNLOADED]

Every transition has a corresponding callback in the driver structure. The patterns you practiced -- clear states, explicit transitions, error handling at each step -- are exactly what kernel drivers require.
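
The diagram translates directly into the table-driven style from Chapter 19. The sketch below is a user-space model, not kernel code: the enum names and drv_step() are hypothetical, and the edges are exactly the arrows drawn above. Any event with no matching edge is rejected and the state is left unchanged.

```c
#include <errno.h>
#include <stddef.h>

enum drv_state {
    DRV_UNLOADED, DRV_LOADED, DRV_BOUND, DRV_SUSPENDED, DRV_UNBOUND
};

enum drv_event {
    EV_MODULE_INIT, EV_PROBE, EV_SUSPEND, EV_RESUME, EV_REMOVE,
    EV_MODULE_EXIT
};

/* One row per legal arrow in the lifecycle diagram. */
static const struct {
    enum drv_state from;
    enum drv_event ev;
    enum drv_state to;
} edges[] = {
    { DRV_UNLOADED,  EV_MODULE_INIT, DRV_LOADED    },
    { DRV_LOADED,    EV_PROBE,       DRV_BOUND     },
    { DRV_BOUND,     EV_SUSPEND,     DRV_SUSPENDED },
    { DRV_SUSPENDED, EV_RESUME,      DRV_BOUND     },
    { DRV_BOUND,     EV_REMOVE,      DRV_UNBOUND   },
    { DRV_UNBOUND,   EV_MODULE_EXIT, DRV_UNLOADED  },
};

/* Apply one event: 0 on success, -EINVAL for an illegal transition. */
static int drv_step(enum drv_state *s, enum drv_event ev)
{
    for (size_t i = 0; i < sizeof(edges) / sizeof(edges[0]); i++) {
        if (edges[i].from == *s && edges[i].ev == ev) {
            *s = edges[i].to;
            return 0;
        }
    }
    return -EINVAL;   /* state is left unchanged */
}
```

In a real driver the driver core enforces this ordering for you, so a remove() callback can assume probe() succeeded earlier. The model is useful mainly as a way to reason about which callbacks may follow which.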

Bit Manipulation Becomes Register Access

In Part III (Chapters 11-13), we covered bitwise operations, masks, and bit fields. In kernel drivers, you use these to read and write hardware registers.

/* User space (from Part III): */
#define BIT(n)          (1UL << (n))
#define SET_BIT(val, n) ((val) | BIT(n))
#define CLR_BIT(val, n) ((val) & ~BIT(n))

/* Kernel: register access */
#include <linux/bits.h>      /* the kernel already provides BIT() */
#include <linux/io.h>

#define REG_CONTROL   0x00
#define REG_STATUS    0x04
#define CTRL_ENABLE   BIT(0)
#define CTRL_IRQ_EN   BIT(1)
#define STATUS_BUSY   BIT(7)

static void __iomem *base;  /* memory-mapped register base */

/* Enable the device */
u32 val = readl(base + REG_CONTROL);
val |= CTRL_ENABLE | CTRL_IRQ_EN;
writel(val, base + REG_CONTROL);

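/* Wait for not busy. Real drivers bound this loop with a timeout,
 * for example with readl_poll_timeout() from <linux/iopoll.h>. */
while (readl(base + REG_STATUS) & STATUS_BUSY)
    cpu_relax();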

readl and writel are memory-mapped I/O accessors that handle memory barriers and prevent compiler reordering. The bit manipulation is identical to what you learned.
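
The register logic can be exercised without hardware by pointing it at a fake register file. The sketch below is a user-space model: fake_regs, model_readl(), model_writel(), dev_enable(), and dev_wait_idle() are hypothetical stand-ins, and a plain volatile array gives none of the ordering guarantees of real MMIO accessors. It also bounds the busy-wait, since an unbounded poll hangs the CPU if the hardware never clears the bit.

```c
#include <errno.h>
#include <stdint.h>

#define BIT(n)        (UINT32_C(1) << (n))
#define REG_CONTROL   0x00
#define REG_STATUS    0x04
#define CTRL_ENABLE   BIT(0)
#define CTRL_IRQ_EN   BIT(1)
#define STATUS_BUSY   BIT(7)

/* Fake register file standing in for a memory-mapped device. */
static volatile uint32_t fake_regs[2];

/* Stand-ins for readl()/writel(); offsets are in bytes. */
static uint32_t model_readl(unsigned int off)
{
    return fake_regs[off / 4];
}

static void model_writel(uint32_t val, unsigned int off)
{
    fake_regs[off / 4] = val;
}

/* Read-modify-write: set the enable bits, leave the others alone. */
static void dev_enable(void)
{
    uint32_t val = model_readl(REG_CONTROL);

    val |= CTRL_ENABLE | CTRL_IRQ_EN;
    model_writel(val, REG_CONTROL);
}

/* Poll until not busy, giving up after max_polls reads. */
static int dev_wait_idle(unsigned int max_polls)
{
    while (max_polls--) {
        if (!(model_readl(REG_STATUS) & STATUS_BUSY))
            return 0;
    }
    return -ETIMEDOUT;
}
```

Keeping the register logic behind small accessor functions like this is also what makes it testable off-target.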

Error Handling in the Kernel

The kernel returns negative errno values, not -1 with a separate errno variable:

/* User space: */
int fd = open(path, O_RDONLY);
if (fd < 0) {
    perror("open");  /* uses errno */
}

/* Kernel: */
static int mydrv_probe(struct platform_device *pdev)
{
    void *buf;
    int irq;

    buf = kmalloc(1024, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;  /* return the negative errno directly */

    irq = platform_get_irq(pdev, 0);
    if (irq < 0) {
        kfree(buf);      /* release what we already hold */
        return irq;      /* pass through the error */
    }

    /* ... */
    return 0;  /* success */
}
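
Nothing stops you from practicing this convention in user space first. The sketch below uses two hypothetical functions, parse_percent() and set_brightness(), invented for illustration: each returns 0 on success or a negative errno, delivers its result through an out-parameter, and passes a callee's error up unchanged.

```c
#include <errno.h>

/* Parse a decimal percentage (0-100). Returns 0 or a negative errno. */
static int parse_percent(const char *s, int *out)
{
    int v = 0;

    if (!s || !*s)
        return -EINVAL;
    for (; *s; s++) {
        if (*s < '0' || *s > '9')
            return -EINVAL;
        v = v * 10 + (*s - '0');
        if (v > 100)
            return -ERANGE;   /* checked each digit, so v cannot overflow */
    }
    *out = v;                 /* only written on success */
    return 0;
}

/* Caller: propagate the callee's error code unchanged. */
static int set_brightness(const char *arg, int *level)
{
    int ret = parse_percent(arg, level);

    if (ret)
        return ret;
    return 0;
}
```

A user-space caller can still produce a readable message with strerror(-ret). Because the error travels in the return value, there is no global errno to clobber between the failure and the report.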

The goto-based cleanup pattern from earlier chapters is the standard kernel idiom:

static int mydrv_probe(struct platform_device *pdev)
{
    int ret;

    void *buf = kmalloc(1024, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;

    ret = register_something();
    if (ret)
        goto err_free_buf;

    ret = setup_irq();
    if (ret)
        goto err_unregister;

    return 0;

err_unregister:
    unregister_something();
err_free_buf:
    kfree(buf);
    return ret;
}

This pattern appears in virtually every kernel driver probe function.
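
A runnable user-space version makes the unwind order checkable. In the sketch below, struct model_dev, model_probe(), and model_remove() are hypothetical, and the fail_at parameter is a test hook that stands in for register_something() or setup_irq() failing; real code would call those functions instead.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state; each field is one acquired resource. */
struct model_dev {
    void *buf;
    int registered;
    int irq_ready;
};

/* fail_at selects which step fails: 0 = none, 1 = register, 2 = irq. */
static int model_probe(struct model_dev *d, int fail_at)
{
    int ret;

    d->buf = malloc(1024);
    if (!d->buf)
        return -ENOMEM;

    ret = (fail_at == 1) ? -EBUSY : 0;   /* stand-in: register_something() */
    if (ret)
        goto err_free_buf;
    d->registered = 1;

    ret = (fail_at == 2) ? -EIO : 0;     /* stand-in: setup_irq() */
    if (ret)
        goto err_unregister;
    d->irq_ready = 1;

    return 0;

err_unregister:
    d->registered = 0;
err_free_buf:
    free(d->buf);
    d->buf = NULL;
    return ret;
}

/* Normal teardown releases everything in reverse order of probe. */
static void model_remove(struct model_dev *d)
{
    d->irq_ready = 0;
    d->registered = 0;
    free(d->buf);
    d->buf = NULL;
}
```

The labels are named for what they undo, and they fall through from the latest resource to the earliest. A failure at any step therefore leaves the structure exactly as it was before the call.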

Concurrency in the Kernel

The kernel is massively concurrent: multiple CPUs, interrupts, softirqs, workqueues. Everything from the chapters on threads and synchronization (Ch41-46) applies, but with kernel primitives:

+---------------------+-------------------------------+
| User space          | Kernel                        |
+---------------------+-------------------------------+
| pthread_mutex_t     | struct mutex                  |
| pthread_spinlock_t  | spinlock_t                    |
| sem_t               | struct semaphore              |
| atomic_int          | atomic_t, atomic_long_t       |
| pthread_cond_t      | wait_queue_head_t             |
| read-write lock     | rwlock_t, struct rw_semaphore |
+---------------------+-------------------------------+

The key difference: in interrupt context, you cannot sleep, so you must use spinlocks rather than mutexes.
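
Why a spinlock is legal where a mutex is not becomes clear from how one is built. The sketch below is an illustrative user-space spinlock on C11 atomics, with hypothetical model_ names. The kernel's spinlock_t is considerably more involved: it also disables preemption and has architecture-specific implementations. The shared idea is that a waiter burns CPU cycles rather than going to sleep.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative only; not how the kernel implements spinlock_t. */
struct model_spinlock {
    atomic_flag locked;
};

#define MODEL_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the lock was acquired. */
static bool model_spin_trylock(struct model_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until acquired: never sleeps, never yields to a scheduler. */
static void model_spin_lock(struct model_spinlock *l)
{
    while (!model_spin_trylock(l))
        ;   /* spin */
}

static void model_spin_unlock(struct model_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Because a waiter spins instead of sleeping, critical sections under a spinlock must be short and must never call anything that can sleep.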

Rust in the Kernel

The Linux kernel has experimental Rust support. Kernel Rust has the same restrictions as kernel C: no standard library, no heap unless explicitly allocated, no floating point.

// Conceptual kernel module in Rust (requires Rust-for-Linux)
use kernel::prelude::*;

module! {
    type: MyModule,
    name: "my_module",
    license: "GPL",
}

struct MyModule;

impl kernel::Module for MyModule {
    fn init(_module: &'static ThisModule) -> Result<Self> {
        pr_info!("Hello from Rust kernel module!\n");
        Ok(MyModule)
    }
}

impl Drop for MyModule {
    fn drop(&mut self) {
        pr_info!("Goodbye from Rust kernel module!\n");
    }
}

Rust kernel modules use:

  • kernel:: crate instead of std::
  • pr_info! instead of println!
  • Result<T> with kernel error types
  • Box backed by kmalloc

On top of these, the borrow checker prevents most use-after-free and data race bugs at compile time.

Rust Note: Rust in the kernel is not a replacement for C. It's an additional language option for new drivers and subsystems, and there is no plan to rewrite the existing C code. Understanding C kernel programming is still essential even if you plan to write Rust kernel modules.

The Complete Mapping

Here is how every major topic from this book connects to kernel programming:

+------------------------------+----------------------------------------+
| Book Chapter / Topic         | Kernel Equivalent                      |
+------------------------------+----------------------------------------+
| Pointers (Ch6-7)             | __user pointers, void __iomem *        |
| Structs (Ch8-9)              | Every kernel data structure            |
| Bit manipulation (Ch11-13)   | Register access, flag fields           |
| Linked lists (Ch16)          | struct list_head, hlist_head           |
| Function pointers (Ch18)     | file_operations, driver ops            |
| State machines (Ch19)        | Driver probe/remove/suspend/resume     |
| Opaque types (Ch20)          | struct device, struct file (internals) |
| Build system (Ch24-27)       | Kbuild, Kconfig, make menuconfig       |
| File descriptors (Ch28-31)   | struct file, VFS layer                 |
| Processes (Ch32-34)          | kthread, workqueue                     |
| Signals (Ch35-37)            | Kernel signal delivery                 |
| Memory mapping (Ch38-40)     | ioremap, DMA mapping                   |
| Threads (Ch41-43)            | kthread, per-cpu variables             |
| Synchronization (Ch44-46)    | spinlock, mutex, RCU                   |
| Networking (Ch47-49)         | sk_buff, net_device, socket layer      |
| Optimization (Ch50)          | Cache-aligned structs, likely/unlikely |
| Arenas/pools (Ch51)          | Slab allocator (kmem_cache)            |
| Atomics (Ch52)               | atomic_t, memory barriers              |
| /proc and /sys (Ch53)        | Creating procfs/sysfs entries          |
| ioctl (Ch54)                 | Implementing .unlocked_ioctl           |
| Netlink (Ch55)               | genl_register_family()                 |
+------------------------------+----------------------------------------+

What to Study Next

  1. Linux Device Drivers, 3rd edition (LDD3) -- the classic reference. It targets Linux 2.6.10, so many APIs have changed, but the concepts still apply.

  2. The kernel source itself -- drivers/ contains thousands of real examples. Start with simple ones like drivers/misc/.

  3. QEMU + buildroot -- build a minimal Linux system and test your modules in a VM. No risk of crashing your real machine.

  4. Kernel documentation -- Documentation/ in the kernel tree. Especially driver-api/ and core-api/.

  5. Rust for Linux -- if you want to write kernel modules in Rust, follow the rust-for-linux project.

  6. Write a character device driver -- your first kernel project should be a simple character device that implements open, read, write, and ioctl. You already know every concept required.

Driver Prep: This is it. You've learned the user-space foundations. Every concept in this book -- from pointers to atomics, from bit manipulation to state machines -- was chosen because it's essential in kernel and driver code. You're ready.

Try It: Download the kernel source. Navigate to drivers/misc/dummy-irq.c or drivers/misc/eeprom/at24.c. Read the code. No single driver uses every pattern, but you should recognize several of them: module init/exit, probe/remove, ops tables of function pointers, error handling with goto, bit manipulation for registers. If you can read and understand a real kernel driver, you've succeeded.

Quick Knowledge Check

  1. Why can't you use printf in kernel code?
  2. What happens if you use GFP_KERNEL inside an interrupt handler?
  3. How does container_of work, and why is it essential for kernel linked lists?

Common Pitfalls

  • Using malloc in kernel code. There is no libc. Use kmalloc.
  • Large stack allocations. The kernel stack is 8-16 KB. Allocate large buffers with kmalloc.
  • Sleeping in atomic context. If you hold a spinlock or are in an interrupt handler, you must not call anything that might sleep (kmalloc(GFP_KERNEL), mutex_lock(), copy_from_user() -- yes, even that can sleep).
  • Forgetting to free on error paths. The goto cleanup pattern exists for a reason. Every resource acquired must be released in reverse order.
  • Accessing user pointers directly. Always use copy_from_user() / copy_to_user(). Direct access crashes on SMAP-enabled CPUs and is a security vulnerability.
  • No error checking. Every kernel function that can fail must be checked. The kernel does not tolerate ignored errors the way user space sometimes does.
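
The user-pointer pitfall has a standard idiom: bound the length, copy through the accessor, and map any shortfall to -EFAULT. The sketch below models that idiom in user space. All names are hypothetical: model_user_mem plays the role of the user address space, and model_copy_from_user() imitates only the return convention of copy_from_user(), which reports the number of bytes it could not copy (0 means success).

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Pretend "user memory": only addresses inside this array are valid. */
static char model_user_mem[64];

/* Stand-in for copy_from_user(): returns bytes NOT copied. */
static unsigned long model_copy_from_user(void *to, const void *from,
                                          unsigned long n)
{
    uintptr_t lo = (uintptr_t)model_user_mem;
    uintptr_t hi = lo + sizeof(model_user_mem);
    uintptr_t p = (uintptr_t)from;

    if (p < lo || p > hi || n > hi - p)
        return n;            /* range not fully valid: copy nothing */
    memcpy(to, from, n);
    return 0;
}

/* Write-handler idiom: check length, copy, translate failure. */
static long model_write(const void *ubuf, unsigned long len)
{
    char kbuf[16];

    if (len > sizeof(kbuf))
        return -EINVAL;      /* never trust a user-supplied length */
    if (model_copy_from_user(kbuf, ubuf, len))
        return -EFAULT;      /* bad user pointer */
    return (long)len;
}
```

The length check comes first on purpose: the copy routine validates the source pointer, but only the driver knows how large its own destination buffer is.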