Multiplexing with select and poll

Blocking I/O is simple: call read(), wait for data, process it. But a server with 100 clients cannot call read() on all 100 sockets at the same time. It blocks on the first one and ignores the other 99. Fork-per-connection and thread-per-connection solve this, but they are expensive. I/O multiplexing lets a single thread monitor many file descriptors and act only on the ones that are ready.

This chapter covers select() and poll(), their APIs, and their limitations.

The Problem

  Thread blocked on fd 3:       fds 4, 5, 6 have data waiting
  +---+                         +---+---+---+
  | 3 | <-- read() blocks       | 4 | 5 | 6 |  data piling up
  +---+                         +---+---+---+

  With multiplexing:
  +---+---+---+---+
  | 3 | 4 | 5 | 6 |  <-- "which of these are ready?"
  +---+---+---+---+
       |
       v
  "fd 4 and fd 6 are ready to read"
       |
       v
  read(4, ...)    read(6, ...)   <-- no blocking

select() in C

select() watches three sets of file descriptors: readable, writable, and exceptional. It blocks until at least one fd is ready or a timeout expires.

/* select_server.c -- single-threaded multi-client echo with select() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    int opt = 1;
    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    struct sockaddr_in addr = {0};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(7878);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);
    printf("select server on port 7878 (max %d fds)\n", FD_SETSIZE);

    fd_set master_set;
    FD_ZERO(&master_set);
    FD_SET(listen_fd, &master_set);
    int max_fd = listen_fd;

    for (;;) {
        fd_set read_set = master_set;   /* select modifies the set */

        int ready = select(max_fd + 1, &read_set, NULL, NULL, NULL);
        if (ready < 0) { perror("select"); break; }

        for (int fd = 0; fd <= max_fd; fd++) {
            if (!FD_ISSET(fd, &read_set))
                continue;

            if (fd == listen_fd) {
                /* New connection */
                struct sockaddr_in client;
                socklen_t clen = sizeof(client);
                int conn = accept(listen_fd,
                                  (struct sockaddr *)&client, &clen);
                if (conn < 0) { perror("accept"); continue; }
                if (conn >= FD_SETSIZE) {
                    fprintf(stderr, "fd %d exceeds FD_SETSIZE\n", conn);
                    close(conn);
                    continue;
                }

                FD_SET(conn, &master_set);
                if (conn > max_fd) max_fd = conn;

                char ip[INET_ADDRSTRLEN];
                inet_ntop(AF_INET, &client.sin_addr, ip, sizeof(ip));
                printf("+ %s:%d (fd %d)\n",
                       ip, ntohs(client.sin_port), conn);
            } else {
                /* Data from existing client */
                char buf[1024];
                ssize_t n = read(fd, buf, sizeof(buf));
                if (n <= 0) {
                    printf("- fd %d disconnected\n", fd);
                    close(fd);
                    FD_CLR(fd, &master_set);
                } else {
                    write(fd, buf, n);
                }
            }
        }
    }

    close(listen_fd);
    return 0;
}

The fd_set API

  Macro/Function              Purpose
  ------------------------    ----------------------------------
  FD_ZERO(&set)               Clear all bits
  FD_SET(fd, &set)            Add fd to set
  FD_CLR(fd, &set)            Remove fd from set
  FD_ISSET(fd, &set)          Test if fd is in set
  select(nfds, r, w, e, t)    Block until fd(s) ready or timeout

The first argument to select() is the highest fd number plus one. The kernel scans from 0 to nfds-1.
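The nfds rule is easy to get wrong when the set holds several descriptors, so here is a minimal sketch of it. The helper name first_readable and its return codes are assumptions made for illustration, not part of any library.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: return the first readable fd in fds[0..count-1],
 * -1 on timeout, -2 on error or if an fd does not fit in an fd_set. */
int first_readable(const int *fds, int count, int timeout_ms)
{
    fd_set rset;
    FD_ZERO(&rset);
    int max_fd = -1;

    for (int i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -2;
        FD_SET(fds[i], &rset);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }

    struct timeval tv;
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest fd plus one, not the number of fds in the set */
    int ready = select(max_fd + 1, &rset, NULL, NULL, &tv);
    if (ready < 0)
        return -2;
    if (ready == 0)
        return -1;

    for (int i = 0; i < count; i++)
        if (FD_ISSET(fds[i], &rset))
            return fds[i];
    return -1;
}
```

Passing the read ends of two pipes and writing to only one of them should return that pipe's read fd. With nothing written, the call should return -1 once the timeout expires.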

Caution: FD_SETSIZE is typically 1024 on Linux. If your server opens fd 1024 or higher, FD_SET writes out of bounds, corrupting memory silently. This is undefined behavior, not a clean error. For servers that may handle more than ~1000 connections, use poll() or epoll instead.
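One way to avoid the out-of-bounds write is to route every FD_SET through a checked wrapper. The sketch below assumes a hypothetical helper named fd_set_add. It mirrors the bounds check in select_server.c and also keeps max_fd current.

```c
#include <sys/select.h>

/* Hypothetical helper: add fd to set only if it fits.
 * Returns 0 on success, -1 if fd is outside [0, FD_SETSIZE). */
int fd_set_add(int fd, fd_set *set, int *max_fd)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;          /* FD_SET here would be undefined behavior */
    FD_SET(fd, set);
    if (fd > *max_fd)
        *max_fd = fd;
    return 0;
}
```

A caller that gets -1 back would typically close the new connection, as the accept path in select_server.c does.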

Try It: Connect 5 clients to the select server using nc 127.0.0.1 7878. Type in different terminals and verify they all echo independently with no threads.

select() with Timeout

/* select_timeout.c -- wait for stdin with a 3-second timeout */
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void)
{
    printf("Type something within 3 seconds...\n");

    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(STDIN_FILENO, &fds);

    struct timeval tv;
    tv.tv_sec  = 3;
    tv.tv_usec = 0;

    int ret = select(STDIN_FILENO + 1, &fds, NULL, NULL, &tv);
    if (ret > 0 && FD_ISSET(STDIN_FILENO, &fds)) {
        char buf[256];
        ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
        if (n < 0) { perror("read"); return 1; }  /* avoid buf[-1] below */
        buf[n] = '\0';
        printf("You typed: %s", buf);
    } else if (ret == 0) {
        printf("Timeout!\n");
    } else {
        perror("select");
    }
    return 0;
}

Caution: On Linux, select() modifies the timeval struct to reflect the time remaining. POSIX permits this but does not require it, and many other systems leave the struct untouched, so code that relies on either behavior is not portable. Re-initialize the struct before every call.
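A loop that calls select() repeatedly might therefore look like the sketch below, which rebuilds both the fd_set and the timeval on every iteration. The helper name count_timeouts is hypothetical, and the code assumes fd is below FD_SETSIZE.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: call select() `rounds` times, each with a fresh
 * timeout of timeout_ms. Returns how many rounds timed out, -1 on error.
 * Assumes 0 <= fd < FD_SETSIZE. */
int count_timeouts(int fd, int rounds, int timeout_ms)
{
    int timeouts = 0;
    for (int i = 0; i < rounds; i++) {
        fd_set rset;            /* rebuilt: select() overwrites the set */
        FD_ZERO(&rset);
        FD_SET(fd, &rset);

        struct timeval tv;      /* rebuilt: select() may overwrite it too */
        tv.tv_sec  = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int ret = select(fd + 1, &rset, NULL, NULL, &tv);
        if (ret < 0)
            return -1;
        if (ret == 0)
            timeouts++;
    }
    return timeouts;
}
```

If tv were declared once outside the loop, then on Linux the first timeout would leave it at zero and every later call would return immediately.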

poll() in C

poll() fixes the fd limit problem. Instead of a fixed-size bitmask, it takes an array of struct pollfd. You can monitor as many fds as the system allows.

/* poll_server.c -- single-threaded multi-client echo with poll() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAX_FDS 4096

int main(void)
{
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    int opt = 1;
    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    struct sockaddr_in addr = {0};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(7879);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);
    printf("poll server on port 7879\n");

    struct pollfd fds[MAX_FDS];
    int nfds = 0;

    /* First entry: the listening socket */
    fds[0].fd     = listen_fd;
    fds[0].events = POLLIN;
    nfds = 1;

    for (;;) {
        int ready = poll(fds, nfds, -1);  /* -1 = block forever */
        if (ready < 0) { perror("poll"); break; }

        /* Check listening socket first */
        if (fds[0].revents & POLLIN) {
            struct sockaddr_in client;
            socklen_t clen = sizeof(client);
            int conn = accept(listen_fd,
                              (struct sockaddr *)&client, &clen);
            if (conn >= 0 && nfds < MAX_FDS) {
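                fds[nfds].fd      = conn;
                fds[nfds].events  = POLLIN;
                fds[nfds].revents = 0;  /* poll() has not filled this slot yet,
                                           but the client loop below reads it */
                nfds++;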

                char ip[INET_ADDRSTRLEN];
                inet_ntop(AF_INET, &client.sin_addr, ip, sizeof(ip));
                printf("+ %s:%d (fd %d, slot %d)\n",
                       ip, ntohs(client.sin_port), conn, nfds - 1);
            } else {
                if (conn >= 0) close(conn);  /* too many fds */
            }
        }

        /* Check client sockets */
        for (int i = 1; i < nfds; i++) {
            if (fds[i].revents & (POLLIN | POLLERR | POLLHUP)) {
                char buf[1024];
                ssize_t n = read(fds[i].fd, buf, sizeof(buf));
                if (n <= 0) {
                    printf("- fd %d disconnected\n", fds[i].fd);
                    close(fds[i].fd);
                    /* Swap with last entry to compact array */
                    fds[i] = fds[--nfds];
                    i--;  /* re-check this slot */
                } else {
                    write(fds[i].fd, buf, n);
                }
            }
        }
    }

    close(listen_fd);
    return 0;
}

struct pollfd

struct pollfd {
    int   fd;       /* file descriptor */
    short events;   /* requested events (input) */
    short revents;  /* returned events (output) */
};

  Flag        Meaning
  --------    -----------------------------
  POLLIN      Data available to read
  POLLOUT     Writing will not block
  POLLERR     Error condition (output only)
  POLLHUP     Hang up (output only)
  POLLNVAL    Invalid fd (output only)

Caution: POLLERR and POLLHUP are always monitored even if you do not set them in events. When they fire, you must handle them -- typically by closing the fd.
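The "always monitored" behavior can be observed directly with a pipe. The sketch below uses a hypothetical helper, poll_revents_unrequested, that polls with an empty events mask and returns whatever the kernel reports anyway.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: poll fd with an empty events mask and return
 * revents, or -1 if poll() itself fails. */
int poll_revents_unrequested(int fd, int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd      = fd;
    pfd.events  = 0;      /* request nothing */
    pfd.revents = 0;

    if (poll(&pfd, 1, timeout_ms) < 0)
        return -1;
    return pfd.revents;   /* can still hold POLLERR, POLLHUP, or POLLNVAL */
}
```

On Linux, calling this on the read end of a pipe whose write end has been closed should return a value with POLLHUP set, even though events was 0.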

select vs poll

  select()                              poll()
  +------------------------------------+------------------------------------+
  | Fixed fd limit (FD_SETSIZE=1024)   | No fd limit (array of pollfd)      |
  | Bitmask modified on each call      | revents field written, events kept |
  | Must rebuild fd_set each iteration | Array persists between calls       |
  | O(max_fd) scanning                 | O(nfds) scanning                   |
  | POSIX; Windows (sockets only)      | POSIX; Windows has WSAPoll instead |
  +------------------------------------+------------------------------------+

  Both share the fundamental limitation:
  The kernel scans the ENTIRE fd list on every call, even if only one fd is ready.
  At 10,000 fds, both spend most of their time scanning fds that have no events.
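The "array persists between calls" row can be illustrated with a short sketch. The helper poll_rounds is hypothetical. It calls poll() several times on the same array without resetting anything in between, because only revents is rewritten.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: poll the same array `rounds` times without touching
 * the events fields. Returns how many rounds reported a ready fd, or -1. */
int poll_rounds(struct pollfd *fds, nfds_t nfds, int rounds, int timeout_ms)
{
    int hits = 0;
    for (int i = 0; i < rounds; i++) {
        int ready = poll(fds, nfds, timeout_ms);  /* events left as-is */
        if (ready < 0)
            return -1;
        if (ready > 0)
            hits++;
    }
    return hits;
}
```

The equivalent select() loop would need a fresh copy of the fd_set before every call, as select_server.c does with master_set.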

Monitoring for Writability

Sometimes you need to know when a socket is ready for writing -- for example, after a connect() in non-blocking mode, or when an output buffer was full.

/* poll_write.c -- detect when a non-blocking connect() completes */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <poll.h>
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* Set non-blocking */
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(80);
    /* Formerly example.com; the address may have changed, so substitute
       any reachable host when running this. */
    inet_pton(AF_INET, "93.184.216.34", &addr.sin_addr);

    int ret = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
    if (ret < 0 && errno != EINPROGRESS) {
        perror("connect"); return 1;
    }

    /* Wait for connection to complete */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ready = poll(&pfd, 1, 5000);  /* 5-second timeout */

    if (ready > 0 && (pfd.revents & POLLOUT)) {
        int err = 0;
        socklen_t elen = sizeof(err);
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen);
        if (err == 0) {
            printf("Connected!\n");
        } else {
            printf("Connection failed: %s\n", strerror(err));
        }
    } else {
        printf("Timeout or error\n");
    }

    close(fd);
    return 0;
}
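The other case mentioned at the start of this section, a full output buffer, can be sketched with a non-blocking pipe instead of a socket. All three helper names below are hypothetical, and the pipe is only a stand-in for a socket whose send buffer has filled up.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: put fd in non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: 1 if fd accepts a write right now, 0 if not,
 * -1 on poll() failure. A zero timeout makes poll() return at once. */
int writable_now(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int ready = poll(&pfd, 1, 0);
    if (ready < 0)
        return -1;
    return (ready > 0 && (pfd.revents & POLLOUT)) ? 1 : 0;
}

/* Hypothetical helper: write to a non-blocking fd until the kernel buffer
 * is full (EAGAIN/EWOULDBLOCK). Returns bytes written, -1 on other errors. */
long fill_until_full(int fd)
{
    char chunk[4096] = {0};
    long total = 0;
    for (;;) {
        ssize_t n = write(fd, chunk, sizeof(chunk));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;
        return -1;
    }
}
```

In a server, the usual pattern is to set POLLOUT in events only while unsent data is queued for that fd and clear it once the queue drains. An idle socket is almost always writable, so leaving POLLOUT set makes poll() return immediately on every call.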

Driver Prep: The kernel's internal poll mechanism (struct file_operations.poll) works on the same principle. When you write a character device driver, you implement a poll callback so that userspace select()/poll() works on your device fd.

Rust: Using nix for select and poll

The Rust standard library does not expose select() or poll() directly. The nix crate provides safe wrappers.

poll with nix

// poll_server.rs -- multi-client echo with nix::poll
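// Cargo.toml: nix = { version = "0.26", features = ["poll", "net"] }
// Written against the RawFd-based API of nix 0.26. Later releases take
// borrowed fds and a PollTimeout, so this will not compile unchanged there.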
use nix::poll::{poll, PollFd, PollFlags};
use std::collections::HashMap;
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::os::fd::AsRawFd;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("0.0.0.0:7879")?;
    listener.set_nonblocking(true)?;
    println!("Rust poll server on port 7879");

    let mut poll_fds: Vec<PollFd> = vec![
        PollFd::new(listener.as_raw_fd(), PollFlags::POLLIN),
    ];
    let mut clients: HashMap<i32, TcpStream> = HashMap::new();

    loop {
        let _ready = poll(&mut poll_fds, -1)
            .expect("poll failed");

        let mut new_fds: Vec<PollFd> = Vec::new();
        let mut remove_fds: Vec<i32> = Vec::new();

        for pfd in &poll_fds {
            let revents = pfd.revents().unwrap_or(PollFlags::empty());
            let fd = pfd.as_raw_fd();

            if fd == listener.as_raw_fd() {
                if revents.contains(PollFlags::POLLIN) {
                    // Accept all pending connections
                    loop {
                        match listener.accept() {
                            Ok((stream, addr)) => {
                                println!("+ {}", addr);
                                stream.set_nonblocking(true).ok();
                                let raw = stream.as_raw_fd();
                                new_fds.push(
                                    PollFd::new(raw, PollFlags::POLLIN)
                                );
                                clients.insert(raw, stream);
                            }
                            Err(_) => break,
                        }
                    }
                }
            } else if revents.intersects(
                PollFlags::POLLIN | PollFlags::POLLERR | PollFlags::POLLHUP
            ) {
                let mut buf = [0u8; 1024];
                if let Some(stream) = clients.get_mut(&fd) {
                    match stream.read(&mut buf) {
                        Ok(0) | Err(_) => {
                            println!("- fd {}", fd);
                            remove_fds.push(fd);
                        }
                        Ok(n) => {
                            let _ = stream.write_all(&buf[..n]);
                        }
                    }
                }
            }
        }

        // Remove disconnected clients
        for fd in &remove_fds {
            clients.remove(fd);
            poll_fds.retain(|p| p.as_raw_fd() != *fd);
        }

        // Add new connections
        poll_fds.extend(new_fds);
    }
}

Rust Note: Rust's ownership model helps with the common C bug of using a closed fd. Once the TcpStream is removed from the HashMap, it is dropped, and the fd is closed without an explicit close() call. The poll set still holds the raw fd number, though, so the retain call is what removes the stale entry before the next poll().

select with nix

// select_demo.rs -- wait for stdin with timeout using nix::select
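// Cargo.toml: nix = { version = "0.26", features = ["select"] }
// Written against the RawFd-based API of nix 0.26. Later releases take
// borrowed fds, so this will not compile unchanged there.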
use nix::sys::select::{select, FdSet};
use nix::sys::time::TimeVal;
use std::io::Read;
use std::os::fd::AsRawFd;

fn main() {
    println!("Type something within 3 seconds...");

    let stdin_fd = std::io::stdin().as_raw_fd();
    let mut read_fds = FdSet::new();
    read_fds.insert(stdin_fd);

    let mut timeout = TimeVal::new(3, 0);

    match select(
        stdin_fd + 1,
        Some(&mut read_fds),
        None,
        None,
        Some(&mut timeout),
    ) {
        Ok(n) if n > 0 => {
            let mut buf = [0u8; 256];
            let n = std::io::stdin().read(&mut buf).unwrap();
            print!("You typed: {}", String::from_utf8_lossy(&buf[..n]));
        }
        Ok(_) => println!("Timeout!"),
        Err(e) => eprintln!("select error: {}", e),
    }
}

When to Use What

  Connections     Recommendation
  -----------     -------------------------------------------
  < 10            select() is fine, simple and portable
  10 - 1000       poll() removes the fd limit
  > 1000          epoll (next chapter) -- O(1) notification

Both select and poll have the same fundamental scaling problem: on every call, the kernel walks the entire list of file descriptors to check which are ready. With 10,000 fds, this linear scan dominates the server's CPU time. The next chapter introduces epoll, which solves this.

Knowledge Check

  1. What is FD_SETSIZE and why is it dangerous to exceed it with select()?
  2. How does poll() avoid the fd limit problem of select()?
  3. Why do both select() and poll() have O(n) per-call overhead?

Common Pitfalls

  • Not re-initializing fd_set -- select() modifies the set in place. You must copy the master set before each call.
  • Exceeding FD_SETSIZE -- silent memory corruption. No error, no warning, just data corruption and crashes.
  • Forgetting to handle POLLERR/POLLHUP -- poll() reports these on every call until the fd is closed, whether or not you asked for them. A loop that only checks POLLIN never closes the fd and spins in a busy-loop.
  • Not compacting the pollfd array -- leaving closed fds in the array with fd = -1 works (poll ignores them) but wastes scanning time.
  • Assuming select() timeout is preserved -- on Linux, timeval is updated to reflect remaining time. Reuse without reinitializing gives shorter and shorter timeouts until you are busy-polling.
  • Using select() for high-fd-count servers -- it was designed in 1983 for a handful of file descriptors. Use poll() or epoll instead.
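The fd = -1 alternative to compaction mentioned in the list above can be sketched as follows. The helper name pollfd_disable is hypothetical. It trades the swap-with-last step used in poll_server.c for a slot that poll() skips.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: close the fd in a slot and mark the slot unused.
 * poll() ignores entries with a negative fd and sets their revents to 0. */
void pollfd_disable(struct pollfd *slot)
{
    if (slot->fd >= 0)
        close(slot->fd);
    slot->fd      = -1;
    slot->revents = 0;
}
```

This keeps slot indices stable, which helps if other data structures refer to them. The cost is that dead slots are still scanned until something reuses or compacts them.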