Netlink Sockets

Netlink is the primary Linux mechanism for exchanging network configuration and events between the kernel and user-space processes. Unlike ioctl, it uses a socket interface with structured messages, multicast groups, and asynchronous notifications. This chapter shows how to read the routing table, subscribe to network events, and build a simple link monitor.

Netlink is a socket family, AF_NETLINK. Instead of connecting to a remote host, you exchange messages with the kernel.

User space                        Kernel
+------------------+              +-------------------+
| netlink socket   | <----------> | netlink subsystem |
| AF_NETLINK       |   messages   | (routing, link,   |
| SOCK_DGRAM       |              |  firewall, ...)   |
+------------------+              +-------------------+

Key properties:

  • Message-based (like UDP, not like TCP streams).
  • Supports multicast -- subscribe to kernel event groups.
  • Bidirectional -- query state or receive notifications.
  • Replaces many ioctl-based network configuration interfaces.

Every netlink message starts with struct nlmsghdr:

struct nlmsghdr {
    __u32 nlmsg_len;    /* Total message length (including header) */
    __u16 nlmsg_type;   /* Message type */
    __u16 nlmsg_flags;  /* Flags: NLM_F_REQUEST, NLM_F_DUMP, etc. */
    __u32 nlmsg_seq;    /* Sequence number (for matching replies) */
    __u32 nlmsg_pid;    /* Sender port ID (often the PID; 0 for the kernel) */
};

Message layout:

+------------------+-------------------+------------------+
| nlmsghdr         | payload           | padding          |
| (16 bytes)       | (variable)        | (to 4-byte align)|
+------------------+-------------------+------------------+
|<-------- nlmsg_len -------->|

For route messages, the payload is struct rtmsg followed by route attributes. For link messages, it's struct ifinfomsg followed by link attributes.
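To make the "header, fixed struct, then attributes" layout concrete, here is a minimal sketch that appends attributes to a route message and walks them again with the standard RTA_* macros. The helper names add_attr and find_route_attr are hypothetical, chosen for this illustration; the caller is assumed to supply a buffer that is 4-byte aligned and large enough.

```c
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: append one attribute after the current end of the
 * message. Returns 0 on success, -1 if it would not fit in maxlen bytes. */
static int add_attr(struct nlmsghdr *nlh, size_t maxlen,
                    unsigned short type, const void *data, size_t len)
{
    size_t off = NLMSG_ALIGN(nlh->nlmsg_len);   /* attributes start aligned */

    if (off + RTA_SPACE(len) > maxlen)
        return -1;

    struct rtattr *rta = (struct rtattr *)((char *)nlh + off);
    memset(rta, 0, RTA_SPACE(len));             /* zero header and padding */
    rta->rta_type = type;
    rta->rta_len  = RTA_LENGTH(len);            /* attr header + payload */
    memcpy(RTA_DATA(rta), data, len);

    nlh->nlmsg_len = off + RTA_SPACE(len);      /* grow the message */
    return 0;
}

/* Hypothetical helper: first attribute of the given type in a route
 * message (nlmsghdr + rtmsg + attributes), or NULL if there is none. */
static struct rtattr *find_route_attr(struct nlmsghdr *nlh, unsigned short type)
{
    struct rtattr *rta = RTM_RTA(NLMSG_DATA(nlh));
    int remaining = RTM_PAYLOAD(nlh);

    for (; RTA_OK(rta, remaining); rta = RTA_NEXT(rta, remaining))
        if (rta->rta_type == type)
            return rta;
    return NULL;
}
```

A message that starts with nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg)) (28 bytes) grows by 8 bytes for each 4-byte attribute such as RTA_DST or RTA_OIF: a 4-byte attribute header plus the 4-byte value.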

Protocol Families

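+------------------------+--------------------------------------+
| Protocol               | Purpose                              |
+------------------------+--------------------------------------+
| NETLINK_ROUTE          | Routing, addresses, links, neighbors |
| NETLINK_GENERIC        | Generic netlink (extensible)         |
| NETLINK_NETFILTER      | Firewall (nftables, conntrack)       |
| NETLINK_KOBJECT_UEVENT | Device hotplug events                |
| NETLINK_AUDIT          | Kernel audit subsystem               |
+------------------------+--------------------------------------+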

NETLINK_ROUTE is by far the most common.

Reading the Routing Table

/* netlink_routes.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <arpa/inet.h>

#define BUFSIZE 8192

struct nl_request {
    struct nlmsghdr hdr;
    struct rtmsg    msg;
};

static void parse_route(struct nlmsghdr *nlh) {
    struct rtmsg *rtm = NLMSG_DATA(nlh);

    /* Only show main table IPv4 routes */
    if (rtm->rtm_family != AF_INET)
        return;
    if (rtm->rtm_table != RT_TABLE_MAIN)
        return;

    char dst[INET_ADDRSTRLEN] = "0.0.0.0";
    char gw[INET_ADDRSTRLEN]  = "*";
    int  oif = 0;

    struct rtattr *rta = RTM_RTA(rtm);
    int rta_len = RTM_PAYLOAD(nlh);

    while (RTA_OK(rta, rta_len)) {
        switch (rta->rta_type) {
        case RTA_DST:
            inet_ntop(AF_INET, RTA_DATA(rta), dst, sizeof(dst));
            break;
        case RTA_GATEWAY:
            inet_ntop(AF_INET, RTA_DATA(rta), gw, sizeof(gw));
            break;
        case RTA_OIF:
            oif = *(int *)RTA_DATA(rta);
            break;
        }
        rta = RTA_NEXT(rta, rta_len);
    }

    /* Column widths match the header printed in main() */
    printf("  %-18s %-17s dev index %d  /%d\n",
           dst, gw, oif, rtm->rtm_dst_len);
}

int main(void) {
    int sock = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE);
    if (sock < 0) { perror("socket"); return 1; }

    /* Bind to netlink */
    struct sockaddr_nl sa = {
        .nl_family = AF_NETLINK,
        .nl_pid    = getpid(),
    };
    if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        perror("bind");
        close(sock);
        return 1;
    }

    /* Request a dump of the routing table */
    struct nl_request req = {
        .hdr = {
            .nlmsg_len   = NLMSG_LENGTH(sizeof(struct rtmsg)),
            .nlmsg_type  = RTM_GETROUTE,
            .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,
            .nlmsg_seq   = 1,
            .nlmsg_pid   = getpid(),
        },
        .msg = {
            .rtm_family = AF_INET,
            .rtm_table  = RT_TABLE_MAIN,
        },
    };

    if (send(sock, &req, req.hdr.nlmsg_len, 0) < 0) {
        perror("send");
        close(sock);
        return 1;
    }

    /* Read the response */
    printf("IPv4 Routing Table:\n");
    printf("  %-18s %-17s %-15s %s\n",
           "Destination", "Gateway", "Dev Index", "Prefix");

    char buf[BUFSIZE];
    int done = 0;
    while (!done) {
        ssize_t len = recv(sock, buf, sizeof(buf), 0);
        if (len < 0) { perror("recv"); break; }

        struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
        while (NLMSG_OK(nlh, len)) {
            if (nlh->nlmsg_type == NLMSG_DONE) {
                done = 1;
                break;
            }
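            if (nlh->nlmsg_type == NLMSG_ERROR) {
                /* Payload is struct nlmsgerr; error holds a negative errno */
                struct nlmsgerr *err = NLMSG_DATA(nlh);
                fprintf(stderr, "Netlink error: %s\n", strerror(-err->error));
                done = 1;
                break;
            }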
            parse_route(nlh);
            nlh = NLMSG_NEXT(nlh, len);
        }
    }

    close(sock);
    return 0;
}

$ gcc -O2 -o netlink_routes netlink_routes.c && ./netlink_routes
IPv4 Routing Table:
  Destination        Gateway           Dev Index       Prefix
  0.0.0.0            192.168.1.1       dev index 2  /0
  192.168.1.0        *                 dev index 2  /24
  172.17.0.0         *                 dev index 3  /16

Try It: Modify the program to also show IPv6 routes. Change rtm_family to AF_INET6 in the request, relax the AF_INET check in parse_route, and use inet_ntop(AF_INET6, ...) with INET6_ADDRSTRLEN-sized buffers.

Monitoring Network Events

Netlink supports multicast. Subscribe to groups to receive real-time notifications when links go up/down or addresses change.

/* netlink_monitor.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <net/if.h>

#define BUFSIZE 8192

static const char *msg_type_str(int type) {
    switch (type) {
    case RTM_NEWLINK: return "NEW_LINK";
    case RTM_DELLINK: return "DEL_LINK";
    case RTM_NEWADDR: return "NEW_ADDR";
    case RTM_DELADDR: return "DEL_ADDR";
    case RTM_NEWROUTE: return "NEW_ROUTE";
    case RTM_DELROUTE: return "DEL_ROUTE";
    default: return "UNKNOWN";
    }
}

int main(void) {
    int sock = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE);
    if (sock < 0) { perror("socket"); return 1; }

    struct sockaddr_nl sa = {
        .nl_family = AF_NETLINK,
        .nl_pid    = getpid(),
        .nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV4_ROUTE,
    };

    if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        perror("bind");
        close(sock);
        return 1;
    }

    printf("Monitoring network events (Ctrl+C to stop)...\n");

    char buf[BUFSIZE];
    while (1) {
        ssize_t len = recv(sock, buf, sizeof(buf), 0);
        if (len < 0) { perror("recv"); break; }

        struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
        while (NLMSG_OK(nlh, len)) {
            printf("[%s] ", msg_type_str(nlh->nlmsg_type));

            if (nlh->nlmsg_type == RTM_NEWLINK ||
                nlh->nlmsg_type == RTM_DELLINK) {
                struct ifinfomsg *ifi = NLMSG_DATA(nlh);
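                char ifname[IF_NAMESIZE];
                /* The lookup fails if the interface no longer exists */
                if (!if_indextoname(ifi->ifi_index, ifname))
                    snprintf(ifname, sizeof(ifname), "?");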
                printf("Interface: %s (index %d), flags=0x%x %s\n",
                       ifname, ifi->ifi_index, ifi->ifi_flags,
                       (ifi->ifi_flags & IFF_UP) ? "UP" : "DOWN");
            } else {
                printf("type=%d len=%d\n",
                       nlh->nlmsg_type, nlh->nlmsg_len);
            }

            nlh = NLMSG_NEXT(nlh, len);
        }
    }

    close(sock);
    return 0;
}

$ gcc -O2 -o netlink_monitor netlink_monitor.c && sudo ./netlink_monitor
Monitoring network events (Ctrl+C to stop)...
[NEW_LINK] Interface: eth0 (index 2), flags=0x1002 DOWN
[NEW_LINK] Interface: eth0 (index 2), flags=0x1003 UP

In another terminal, toggle an interface:

$ sudo ip link set eth0 down
$ sudo ip link set eth0 up

Both changes arrive as RTM_NEWLINK messages; the kernel sends RTM_DELLINK only when an interface is removed.
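The nl_groups field is a 32-bit mask, so it can only express the first 32 multicast groups. Groups can also be joined one at a time by number with setsockopt() and NETLINK_ADD_MEMBERSHIP, using the RTNLGRP_* constants. The sketch below shows how the two numbering schemes relate; group_mask and join_group are hypothetical helper names, and the SOL_NETLINK fallback definition is only there for older C libraries that do not provide it.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270   /* fallback for older libc headers */
#endif

/* Hypothetical helper: the nl_groups bit that corresponds to a group
 * number. Only groups 1..32 have a bit; anything else yields 0. */
static unsigned int group_mask(unsigned int group)
{
    return (group >= 1 && group <= 32) ? 1u << (group - 1) : 0;
}

/* Hypothetical helper: join one multicast group by number on a netlink
 * socket. Returns 0 on success, -1 with errno set on failure. */
static int join_group(int sock, unsigned int group)
{
    return setsockopt(sock, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
                      &group, sizeof(group));
}
```

With these helpers, group_mask(RTNLGRP_LINK) equals RTMGRP_LINK, and join_group(sock, RTNLGRP_LINK) is an alternative to setting that bit in nl_groups before bind().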

Netlink vs ioctl

+------------------+----------------------------+---------------------------+
| Feature          | Netlink                    | ioctl                     |
+------------------+----------------------------+---------------------------+
| Async events     | Yes (multicast groups)     | No (must poll)            |
| Bulk queries     | Yes (NLM_F_DUMP)           | One item at a time        |
| Extensibility    | Attributes (TLV format)    | Fixed struct size         |
| Atomicity        | Can batch operations       | One operation per call    |
| Modern tools     | ip, iw use netlink         | ifconfig uses ioctl       |
| Complexity       | Higher (message parsing)   | Simpler (struct + call)   |
+------------------+----------------------------+---------------------------+

The ip command uses netlink. The old ifconfig command uses ioctl. Netlink is the modern, preferred interface.

Caution: Netlink messages must be properly aligned (NLMSG_ALIGN). Sending a message with wrong length or alignment can cause the kernel to reject it silently or return EINVAL.
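As a small illustration of the length rules, the sketch below wraps the two macros from <linux/netlink.h> that are easiest to confuse. The function names msg_len and msg_space are made up for this example.

```c
#include <stddef.h>
#include <linux/netlink.h>

/* Value to store in nlmsg_len: the 16-byte header plus the payload,
 * without trailing padding. */
static size_t msg_len(size_t payload_len)
{
    return NLMSG_LENGTH(payload_len);
}

/* Bytes the message occupies in a buffer, rounded up to a 4-byte
 * boundary. The next message in the same buffer starts at this offset. */
static size_t msg_space(size_t payload_len)
{
    return NLMSG_SPACE(payload_len);
}
```

For a 5-byte payload, msg_len gives 21 and msg_space gives 24. For struct rtmsg (12 bytes) both give 28, because the payload already ends on a 4-byte boundary.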

The following program combines the pieces above into a small tool that watches for interface changes and prints each interface's state.

/* link_watch.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <time.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <net/if.h>

#define BUFSIZE 8192

int main(void) {
    int sock = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE);
    if (sock < 0) { perror("socket"); return 1; }

    struct sockaddr_nl sa = {
        .nl_family = AF_NETLINK,
        .nl_groups = RTMGRP_LINK,
    };

    if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        perror("bind");
        close(sock);
        return 1;
    }

    printf("%-20s %-10s %-8s %s\n",
           "Time", "Interface", "Event", "State");
    printf("%-20s %-10s %-8s %s\n",
           "----", "---------", "-----", "-----");

    char buf[BUFSIZE];
    while (1) {
        ssize_t len = recv(sock, buf, sizeof(buf), 0);
        if (len < 0) { perror("recv"); break; }

        struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
        while (NLMSG_OK(nlh, len)) {
            if (nlh->nlmsg_type == RTM_NEWLINK) {
                struct ifinfomsg *ifi = NLMSG_DATA(nlh);
                char ifname[IF_NAMESIZE] = "???";
                if_indextoname(ifi->ifi_index, ifname);

                /* Get timestamp */
                time_t now = time(NULL);
                struct tm *tm = localtime(&now);
                char timebuf[20];
                strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);

                const char *state;
                if (ifi->ifi_flags & IFF_RUNNING)
                    state = "RUNNING";
                else if (ifi->ifi_flags & IFF_UP)
                    state = "UP (no carrier)";
                else
                    state = "DOWN";

                printf("%-20s %-10s %-8s %s\n",
                       timebuf, ifname, "CHANGE", state);
                fflush(stdout);
            }
            nlh = NLMSG_NEXT(nlh, len);
        }
    }

    close(sock);
    return 0;
}
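Link messages also carry the interface name as an IFLA_IFNAME attribute, which avoids a separate if_indextoname() lookup and still works when the interface has just been removed. The following is a minimal sketch of reading it; link_name_from_msg is a hypothetical helper, and it assumes the name in the message is NUL-terminated, as the kernel sends it.

```c
#include <stddef.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Hypothetical helper: return the IFLA_IFNAME string inside an
 * RTM_NEWLINK or RTM_DELLINK message, or NULL if it is absent.
 * The returned pointer refers into the message buffer itself. */
static const char *link_name_from_msg(struct nlmsghdr *nlh)
{
    struct ifinfomsg *ifi = NLMSG_DATA(nlh);
    struct rtattr *rta = IFLA_RTA(ifi);       /* first attribute */
    int remaining = IFLA_PAYLOAD(nlh);        /* bytes of attributes */

    for (; RTA_OK(rta, remaining); rta = RTA_NEXT(rta, remaining))
        if (rta->rta_type == IFLA_IFNAME)
            return RTA_DATA(rta);
    return NULL;
}
```

In link_watch.c this could replace the if_indextoname() call, falling back to "???" when the function returns NULL.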

In Rust, the netlink-sys and netlink-packet-route crates provide structured netlink access. The low-level version below uses only the libc crate, to mirror the C code.

// Cargo.toml dependencies:
// libc = "0.2"
// (for structured messages, add netlink-sys = "0.8",
//  netlink-packet-core = "0.7", netlink-packet-route = "0.17")

use std::io;

fn main() -> io::Result<()> {
    // Low-level: use raw socket like the C version
    let sock = unsafe {
        libc::socket(libc::AF_NETLINK, libc::SOCK_DGRAM, libc::NETLINK_ROUTE)
    };
    if sock < 0 {
        return Err(io::Error::last_os_error());
    }

    // Bind with RTMGRP_LINK group
    let mut sa: libc::sockaddr_nl = unsafe { std::mem::zeroed() };
    sa.nl_family = libc::AF_NETLINK as u16;
    sa.nl_groups = 1; // RTMGRP_LINK

    let ret = unsafe {
        libc::bind(
            sock,
            &sa as *const _ as *const libc::sockaddr,
            std::mem::size_of::<libc::sockaddr_nl>() as u32,
        )
    };
    if ret < 0 {
        return Err(io::Error::last_os_error());
    }

    println!("Monitoring link events (Ctrl+C to stop)...");

    let mut buf = [0u8; 8192];
    loop {
        let len = unsafe {
            libc::recv(sock, buf.as_mut_ptr() as *mut _, buf.len(), 0)
        };
        if len < 0 {
            return Err(io::Error::last_os_error());
        }
        println!("Received {} bytes of netlink data", len);
    }
}

For a higher-level approach, use the rtnetlink crate:

// Cargo.toml: rtnetlink = "0.14", futures = "0.3",
//             tokio = { version = "1", features = ["full"] }
use futures::stream::TryStreamExt;
use rtnetlink::new_connection;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let (connection, handle, _) = new_connection()?;
    // Drive the netlink connection in the background
    tokio::spawn(connection);

    // List all links
    let mut links = handle.link().get().execute();

    loop {
        match links.try_next().await {
            Ok(Some(link)) => {
                let index = link.header.index;
                println!("Link index {}: {:?}", index, link.attributes);
            }
            Ok(None) => break, // end of the dump
            Err(e) => {
                eprintln!("Error: {e}");
                break;
            }
        }
    }

    Ok(())
}

Rust Note: The rtnetlink crate is async and uses tokio. It provides a much higher-level API than raw netlink sockets, with proper message parsing and type safety. For production code, this is strongly preferred over raw socket manipulation.

Generic netlink allows kernel modules and user-space programs to define custom message families without allocating a dedicated protocol number.

Flow:
1. Kernel module registers a generic netlink family ("my_family")
2. User-space resolves the family name to an ID via the controller
3. Communication proceeds using that dynamic ID

Interfaces such as nl80211 (Wi-Fi configuration) and taskstats are built on generic netlink.
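Step 2 of the flow, resolving a family name, is itself an ordinary netlink request addressed to the controller family (GENL_ID_CTRL). Below is a minimal sketch of building that request in memory; build_getfamily is a hypothetical helper, the interface version of 1 is an assumption of this example, and sending the message and parsing the CTRL_ATTR_FAMILY_ID reply are left out.

```c
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Hypothetical helper: fill buf with a CTRL_CMD_GETFAMILY request for
 * the family called name. Returns the message length, or 0 if buf is
 * too small. buf is assumed to be 4-byte aligned. */
static size_t build_getfamily(void *buf, size_t buflen,
                              const char *name, unsigned int seq)
{
    size_t name_len = strlen(name) + 1;   /* the attribute carries the NUL */
    size_t total = NLMSG_SPACE(GENL_HDRLEN) + NLA_ALIGN(NLA_HDRLEN + name_len);

    if (total > buflen)
        return 0;
    memset(buf, 0, total);

    struct nlmsghdr *nlh = buf;
    nlh->nlmsg_len   = total;
    nlh->nlmsg_type  = GENL_ID_CTRL;      /* the controller's fixed family ID */
    nlh->nlmsg_flags = NLM_F_REQUEST;
    nlh->nlmsg_seq   = seq;

    struct genlmsghdr *gh = NLMSG_DATA(nlh);
    gh->cmd     = CTRL_CMD_GETFAMILY;
    gh->version = 1;                      /* assumed interface version */

    struct nlattr *attr = (struct nlattr *)((char *)gh + GENL_HDRLEN);
    attr->nla_type = CTRL_ATTR_FAMILY_NAME;
    attr->nla_len  = NLA_HDRLEN + name_len;
    memcpy((char *)attr + NLA_HDRLEN, name, name_len);

    return total;
}
```

For the name "nl80211" the request is 32 bytes: a 16-byte nlmsghdr, a 4-byte genlmsghdr, and a 12-byte attribute. It would be sent on a socket created with socket(AF_NETLINK, SOCK_DGRAM, NETLINK_GENERIC).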

$ genl-ctrl-list   # (from libnl-utils)
0x0010 nlctrl version 2
0x0015 devlink version 1
0x001b nl80211 version 1
...

Driver Prep: Kernel modules that need a user-space communication channel often use generic netlink. When you write kernel modules, you'll use genl_register_family() to create a netlink family, and user-space programs will talk to your module via generic netlink sockets. This is the modern alternative to creating a custom character device for every module.

Try It: Run netlink_monitor in one terminal. In another terminal, run sudo ip addr add 10.99.99.1/24 dev lo and sudo ip addr del 10.99.99.1/24 dev lo. Watch the NEW_ADDR and DEL_ADDR events appear.

Quick Knowledge Check

  1. What advantages does netlink have over ioctl for network configuration?
  2. What does nl_groups in sockaddr_nl control?
  3. Why does netlink use NLMSG_ALIGN and NLMSG_NEXT macros instead of simple pointer arithmetic?

Common Pitfalls

  • Forgetting NLM_F_DUMP for bulk queries. Without it, you get one entry instead of the full table.
  • Not checking NLMSG_DONE. The kernel sends multi-part responses. You must loop until you see NLMSG_DONE.
  • Buffer too small. Netlink dumps can be large. Use at least 8KB buffers, or better, 32KB.
  • Wrong nl_pid. Set it to getpid() or 0 (let the kernel assign). Using a conflicting PID causes EADDRINUSE.
  • Ignoring NLMSG_ERROR. The kernel reports errors as netlink messages. Always check for error responses.
  • Assuming message order. Multicast events can arrive between dump responses. Use sequence numbers to match requests with replies.
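
Several of these pitfalls come down to how each received message is classified. The sketch below shows one way to do that for a single message; the enum and the classify_reply function are hypothetical names for this illustration, and it assumes the caller has already validated the message with NLMSG_OK.

```c
#include <errno.h>
#include <stdint.h>
#include <linux/netlink.h>

enum reply_kind {
    REPLY_DATA,        /* one part of the answer to our request */
    REPLY_DONE,        /* NLMSG_DONE: end of a multi-part dump */
    REPLY_ACK,         /* NLMSG_ERROR with error == 0 */
    REPLY_ERROR,       /* NLMSG_ERROR with a real error */
    REPLY_UNRELATED    /* sequence number does not match the request */
};

/* Hypothetical helper: classify one message received after sending a
 * request with sequence number seq. On REPLY_ERROR, *err is set to a
 * positive errno value; otherwise it is set to 0. */
static enum reply_kind classify_reply(const struct nlmsghdr *nlh,
                                      uint32_t seq, int *err)
{
    *err = 0;
    if (nlh->nlmsg_seq != seq)
        return REPLY_UNRELATED;           /* e.g. a multicast event */
    if (nlh->nlmsg_type == NLMSG_DONE)
        return REPLY_DONE;
    if (nlh->nlmsg_type == NLMSG_ERROR) {
        if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(struct nlmsgerr))) {
            *err = EBADMSG;               /* truncated error message */
            return REPLY_ERROR;
        }
        const struct nlmsgerr *e = NLMSG_DATA(nlh);
        if (e->error == 0)
            return REPLY_ACK;
        *err = -e->error;                 /* kernel stores a negative errno */
        return REPLY_ERROR;
    }
    return REPLY_DATA;
}
```

A receive loop would call this for every message that passes NLMSG_OK, stop on REPLY_DONE or REPLY_ERROR, and skip REPLY_UNRELATED messages instead of parsing them as part of the dump.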