File Descriptors & Resource Limits
Why This Matters
It is 2 PM on a busy day. Your web server suddenly starts rejecting connections. The error log is full of messages like:
accept4(): Too many open files
The server is not out of CPU or memory. It has hit its file descriptor limit. Every network connection, every open file, every pipe, every socket -- each one consumes a file descriptor. When you run out, the process cannot open anything new. No new connections. No new files. No new logs.
This is one of the most common production issues, and it catches people off guard because "too many open files" sounds like a disk problem when it is actually a kernel resource limit problem. Understanding file descriptors and resource limits is essential for running any kind of server under load.
Try This Right Now
# How many files does your shell have open?
$ ls -l /proc/$$/fd
total 0
lrwx------ 1 user user 64 Jan 18 14:00 0 -> /dev/pts/0
lrwx------ 1 user user 64 Jan 18 14:00 1 -> /dev/pts/0
lrwx------ 1 user user 64 Jan 18 14:00 2 -> /dev/pts/0
lr-x------ 1 user user 64 Jan 18 14:00 255 -> /dev/pts/0
# What are your current limits?
$ ulimit -n # max open files for this shell
1024
# How many files are open system-wide?
$ cat /proc/sys/fs/file-nr
3456 0 9223372036854775807
#  ^   ^ ^
#  |   | System-wide maximum
#  |   Allocated but currently unused
#  Currently allocated
# What is the system-wide maximum?
$ cat /proc/sys/fs/file-max
9223372036854775807
What File Descriptors Are
A file descriptor (FD) is a small non-negative integer that the kernel uses to identify an open file, socket, pipe, or device within a process. It is an index into the process's table of open files.
┌──────────────────────────────────────────────────────┐
│            PROCESS FILE DESCRIPTOR TABLE             │
│                                                      │
│  FD   Points To         Purpose                      │
│  ──   ─────────         ───────                      │
│  0    /dev/pts/0        stdin (keyboard)             │
│  1    /dev/pts/0        stdout (screen)              │
│  2    /dev/pts/0        stderr (screen)              │
│  3    /var/log/app.log  log file                     │
│  4    socket:[12345]    TCP connection               │
│  5    socket:[12346]    TCP connection               │
│  6    pipe:[67890]      pipe to child proc           │
│  7    /etc/app.conf     config file                  │
│  ...                                                 │
│                                                      │
│  Every open() or socket() call returns the lowest    │
│  free FD number. When an FD is closed, its number    │
│  is freed and can be reused.                         │
└──────────────────────────────────────────────────────┘
The Three Standard File Descriptors
Every process starts with three file descriptors already open:
| FD | Name | Default | Shell Symbol |
|---|---|---|---|
| 0 | stdin | Terminal (keyboard) | < or 0< |
| 1 | stdout | Terminal (screen) | > or 1> |
| 2 | stderr | Terminal (screen) | 2> |
This is why shell redirection works the way it does:
# Redirect stdout (FD 1) to a file
$ command > output.txt # same as: command 1> output.txt
# Redirect stderr (FD 2) to a file
$ command 2> errors.txt
# Redirect both stdout and stderr
$ command > output.txt 2>&1 # stderr goes where stdout goes
# Redirect stdin (FD 0)
$ command < input.txt # same as: command 0< input.txt
# Discard stderr
$ command 2>/dev/null
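One subtlety worth internalizing: redirections are applied left to right, and 2>&1 duplicates whatever FD 1 points to at that moment, so the order of `> file` and `2>&1` matters. A minimal sketch (the /tmp path is arbitrary):

```shell
#!/usr/bin/env bash
# Redirections apply left to right; 2>&1 copies FD 1's current target.

out=/tmp/redir-order-demo.txt

# FD 1 -> file first, then FD 2 duplicates FD 1: both streams captured.
{ echo "to stdout"; echo "to stderr" >&2; } > "$out" 2>&1
both=$(wc -l < "$out")

# FD 2 duplicates FD 1 while it still points at the terminal, and only
# then is FD 1 pointed at the file: stderr never reaches the file.
{ echo "to stdout"; echo "to stderr" >&2; } 2>&1 > "$out"
only_out=$(wc -l < "$out")

echo "correct order captured $both lines; reversed order captured $only_out"
```

This is why `command 2>&1 > output.txt` is a classic bug: it looks equivalent to `command > output.txt 2>&1` but leaves stderr on the terminal.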
What Consumes File Descriptors?
Everything that involves I/O in Linux uses file descriptors:
- Regular files (open())
- Directories
- Network sockets (TCP, UDP, Unix domain)
- Pipes (between processes)
- Devices (/dev/*)
- Event descriptors (eventfd, epoll, inotify)
- Timer descriptors (timerfd)
- Signal descriptors (signalfd)
A busy web server might have:
- Hundreds of client TCP connections (one FD each)
- Connections to databases (FDs for each)
- Open log files
- Pipes to CGI processes
- epoll FD for event monitoring
Think About It: A web server handles 10,000 concurrent connections, has 5 log files open, 10 database connections, and a few internal pipes. Approximately how many file descriptors does it need? Is the default limit of 1024 sufficient?
Exploring /proc/PID/fd
Every process has its open file descriptors listed under /proc/PID/fd/:
# Find a process to inspect (e.g., your shell)
$ echo $$
1234
# List its file descriptors
$ ls -la /proc/1234/fd/
total 0
dr-x------ 2 user user 0 Jan 18 14:00 .
dr-xr-xr-x 9 user user 0 Jan 18 14:00 ..
lrwx------ 1 user user 64 Jan 18 14:00 0 -> /dev/pts/0
lrwx------ 1 user user 64 Jan 18 14:00 1 -> /dev/pts/0
lrwx------ 1 user user 64 Jan 18 14:00 2 -> /dev/pts/0
lr-x------ 1 user user 64 Jan 18 14:00 255 -> /dev/pts/0
# Count file descriptors for a process
$ ls /proc/1234/fd/ | wc -l
4
# Inspect nginx's file descriptors
$ sudo ls -la /proc/$(pgrep -f "nginx: master" | head -1)/fd/
lrwx------ 1 root root 64 Jan 18 14:00 0 -> /dev/null
lrwx------ 1 root root 64 Jan 18 14:00 1 -> /dev/null
l-wx------ 1 root root 64 Jan 18 14:00 2 -> /var/log/nginx/error.log
lrwx------ 1 root root 64 Jan 18 14:00 3 -> socket:[45678]
l-wx------ 1 root root 64 Jan 18 14:00 4 -> /var/log/nginx/access.log
lrwx------ 1 root root 64 Jan 18 14:00 5 -> socket:[45679]
lrwx------ 1 root root 64 Jan 18 14:00 6 -> socket:[45680]
# See file descriptor limits for a process
$ cat /proc/1234/limits | grep "Max open files"
Max open files            1024                 1048576              files
#                         ^^^^                 ^^^^^^^
#                         Soft limit           Hard limit
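Those two numbers -- current FD count and the soft limit -- are worth comparing in one step. A small sketch; `fd_usage` is a hypothetical helper, not a standard command, and the awk field position matches the /proc/PID/limits layout shown above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare a process's open-FD count with its
# soft limit. Usage: fd_usage [pid]  (defaults to the current shell)

fd_usage() {
    local pid=${1:-$$}
    local used soft
    used=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    # "Max open files  <soft>  <hard>  files" -> field 4 is the soft limit
    soft=$(awk '/Max open files/ {print $4}' "/proc/$pid/limits")
    echo "PID $pid: $used of $soft file descriptors in use"
}

fd_usage $$
```

Run it periodically against a busy server process and you can see how close it is drifting toward its limit.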
lsof: List Open Files
lsof (List Open Files) is the Swiss Army knife for investigating file descriptors.
Basic lsof Usage
# List all open files (WARNING: huge output!)
$ sudo lsof | wc -l
25678
# List open files for a specific process
$ sudo lsof -p 1234
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 1234 root cwd DIR 253,0 4096 2 /
nginx 1234 root rtd DIR 253,0 4096 2 /
nginx 1234 root txt REG 253,0 1234567 12345 /usr/sbin/nginx
nginx 1234 root mem REG 253,0 234567 23456 /lib/x86_64-linux-gnu/libc.so.6
nginx 1234 root 0u CHR 1,3 0t0 5 /dev/null
nginx 1234 root 1u CHR 1,3 0t0 5 /dev/null
nginx 1234 root 2w REG 253,0 45678 34567 /var/log/nginx/error.log
nginx 1234 root 3u IPv4 45678 0t0 TCP *:80 (LISTEN)
nginx 1234 root 4w REG 253,0 123456 45678 /var/log/nginx/access.log
FD Column Meanings
| FD | Meaning |
|---|---|
| cwd | Current working directory |
| rtd | Root directory |
| txt | Program text (executable) |
| mem | Memory-mapped file |
| 0u, 1u, 2w | FD number with access mode (r=read, w=write, u=read/write) |
Common lsof Queries
# What process has a specific file open?
$ sudo lsof /var/log/syslog
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 890 root 7w REG 253,0 456789 12345 /var/log/syslog
# What files does a specific user have open?
$ sudo lsof -u www-data | head -20
# What network connections does a process have?
$ sudo lsof -i -p 1234
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 1234 root 3u IPv4 45678 0t0 TCP *:http (LISTEN)
nginx 1234 root 5u IPv4 45679 0t0 TCP myhost:http->client:54321 (ESTABLISHED)
# What is listening on a specific port?
$ sudo lsof -i :80
$ sudo lsof -i :443
# Count open files per process
$ sudo lsof | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
4567 nginx
2345 mysqld
1234 java
567 sshd
234 systemd
# Find processes with the most FDs
$ for fddir in /proc/[0-9]*/fd; do
    count=$(ls "$fddir" 2>/dev/null | wc -l)
    procname=$(cat "${fddir%/fd}/comm" 2>/dev/null)
    echo "$count $procname (${fddir%/fd})"
  done | sort -rn | head -10
# Find deleted files that are still held open (common disk space issue!)
$ sudo lsof +L1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
java 5678 app 12w REG 253,0 5583457280 0 56789 /var/log/app.log (deleted)
That last one is critical: a deleted file that is still held open by a process continues to consume disk space until the process closes it or exits. This is a common cause of "the disk is full but I cannot find what is using the space."
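You can reproduce the deleted-but-open situation safely, and also see the usual escape hatch: truncating the file through /proc instead of restarting the process. A sketch (the file name is arbitrary):

```shell
#!/usr/bin/env bash
# Reproduce a deleted-but-open file, then reclaim its space via /proc.

f=/tmp/ghost-demo.log
echo "some log data" > "$f"
exec 3<"$f"      # hold the file open on FD 3
rm "$f"          # unlink it -- the inode survives while FD 3 is open

link=$(readlink /proc/$$/fd/3)
echo "$link"     # the target is reported with "(deleted)" appended

# The escape hatch: reopen the still-live inode via /proc and truncate it.
: > "/proc/$$/fd/3"
size=$(stat -Lc %s "/proc/$$/fd/3")
echo "size after truncate: $size"

exec 3<&-        # close the FD; now the inode is truly freed
```

In production the same trick (`: > /proc/PID/fd/N` on the deleted log) lets you free disk space without killing the process, though the clean fix is log rotation that the application cooperates with.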
ulimit: Per-Process Resource Limits
ulimit controls resource limits for the current shell and its child processes.
Viewing Limits
# Show all limits
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63340
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 1024 ← This one!
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 63340
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
# Show just the open files limit
$ ulimit -n
1024
# Show hard limit (maximum the soft limit can be raised to)
$ ulimit -Hn
1048576
# Show soft limit (currently enforced)
$ ulimit -Sn
1024
Soft vs Hard Limits
┌──────────────────────────────────────────────────────────┐
│ SOFT vs HARD LIMITS │
│ │
│ Hard Limit: Maximum ceiling. Only root can raise it. │
│ Soft Limit: Currently enforced limit. Users can raise │
│ it up to the hard limit. │
│ │
│ Example: │
│ Hard limit = 65536 │
│ Soft limit = 1024 │
│ │
│ A regular user can do: │
│ ulimit -n 65536 ← raises soft to hard limit (OK) │
│ ulimit -n 100000 ← exceeds hard limit (DENIED) │
│ │
│ Root can do: │
│ ulimit -Hn 100000 ← raises hard limit (OK) │
│ ulimit -n 100000 ← then raises soft limit (OK) │
│ │
└──────────────────────────────────────────────────────────┘
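Because limits are per-process and inherited, you can experiment in a subshell without affecting your main shell. A minimal sketch:

```shell
#!/usr/bin/env bash
# Soft limits can always be lowered, and the change is confined to the
# process that made it (here: a subshell) and that process's children.

parent_soft=$(ulimit -Sn)

# Lower the soft limit inside a subshell only.
child_soft=$( (ulimit -Sn 64; ulimit -Sn) )

echo "child saw: $child_soft"           # 64
echo "parent still has: $(ulimit -Sn)"  # unchanged
```

Note that lowering the *hard* limit is also allowed for a non-root process, but it is a one-way door: that process can never raise it again.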
Changing Limits
# Raise the soft limit for the current shell (up to hard limit)
$ ulimit -n 65536
# This only affects the current shell and its children
# To make it permanent, use limits.conf or systemd
/etc/security/limits.conf
This file sets limits for users and groups at login time (via PAM).
$ sudo vim /etc/security/limits.conf
# /etc/security/limits.conf
#
# Format: <domain> <type> <item> <value>
#
# domain: username, @groupname, or * for everyone
# type: soft or hard
# item: nofile, nproc, memlock, etc.
# value: the limit
# Set limits for the nginx user
nginx soft nofile 65536
nginx hard nofile 65536
# Set limits for all users in the webapps group
@webapps soft nofile 65536
@webapps hard nofile 65536
# Set limits for all users
* soft nofile 4096
* hard nofile 65536
# Limit max processes for regular users (fork bomb protection)
* soft nproc 4096
* hard nproc 8192
# Allow the database user to lock memory
mysql soft memlock unlimited
mysql hard memlock unlimited
You can also use drop-in files in /etc/security/limits.d/:
$ sudo vim /etc/security/limits.d/99-custom.conf
# Custom limits for web applications
www-data soft nofile 65536
www-data hard nofile 65536
Distro Note: On systems using systemd, limits.conf only applies to user login sessions (via SSH or console). For services managed by systemd, you must use the systemd service configuration instead.
System-Wide Limits
Beyond per-process limits, there are system-wide kernel parameters:
fs.file-max
The maximum number of file descriptors the kernel will allocate system-wide.
# View current limit
$ cat /proc/sys/fs/file-max
9223372036854775807
# View current usage
$ cat /proc/sys/fs/file-nr
3456 0 9223372036854775807
# allocated free max
# Set a new limit (rarely needed on modern kernels)
$ sudo sysctl -w fs.file-max=2000000
$ echo 'fs.file-max = 2000000' | sudo tee -a /etc/sysctl.d/99-file-max.conf
fs.nr_open
The maximum number of file descriptors a single process can have. This is the upper bound for ulimit -n.
$ cat /proc/sys/fs/nr_open
1048576
# Increase it if you need per-process limits higher than 1M
$ sudo sysctl -w fs.nr_open=2000000
Relationship Between Limits
┌──────────────────────────────────────────────────────────┐
│ LIMIT HIERARCHY │
│ │
│ fs.file-max (system-wide total) │
│ │ │
│ ├── fs.nr_open (per-process maximum) │
│ │ │ │
│ │ ├── hard limit (per-user/group, in limits.conf)│
│ │ │ │ │
│ │ │ └── soft limit (currently enforced) │
│ │ │ │
│ │ └── soft limit <= hard limit <= nr_open │
│ │ │
│ └── sum of all processes' open FDs <= file-max │
│ │
│ Example chain: │
│ fs.file-max = 2000000 │
│ fs.nr_open = 1048576 │
│ hard limit = 65536 │
│ soft limit = 1024 │
│ │
│ A process can open up to 1024 files (soft limit). │
│ User can raise to 65536 (hard limit). │
│ Root can raise to 1048576 (nr_open). │
│ Total across all processes: up to 2000000 (file-max). │
└──────────────────────────────────────────────────────────┘
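You can verify this ordering holds on any Linux box with a few reads. A sketch:

```shell
#!/usr/bin/env bash
# Check the invariant from the diagram: soft <= hard <= fs.nr_open.

soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
nr_open=$(cat /proc/sys/fs/nr_open)

# ulimit prints "unlimited" if no hard cap is set; treat that as
# nr_open, since the kernel caps RLIMIT_NOFILE at fs.nr_open anyway.
[ "$hard" = "unlimited" ] && hard=$nr_open

echo "soft=$soft hard=$hard nr_open=$nr_open"
[ "$soft" -le "$hard" ] && [ "$hard" -le "$nr_open" ] \
    && echo "invariant holds"
```

(fs.file-max is deliberately left out of the check: it constrains the system-wide total, not any single process.)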
Troubleshooting "Too Many Open Files"
This is one of the most common production issues. Here is a systematic approach:
Step 1: Identify the Affected Process
# Check which process is hitting the limit
$ sudo dmesg | grep -i "too many open files"
$ journalctl -xe | grep -i "too many open files"
# Or check the application's error log
$ grep -i "too many open files" /var/log/nginx/error.log
Step 2: Check Current FD Usage
# Count open FDs for the process
$ ls /proc/$(pgrep -f nginx | head -1)/fd | wc -l
1023
# Check the process's limit
$ cat /proc/$(pgrep -f nginx | head -1)/limits | grep "Max open files"
Max open files            1024                 1024                 files
#                         ^^^^ soft            ^^^^ hard
The process has 1023 of 1024 file descriptors in use -- it is at the limit.
Step 3: Investigate What the FDs Are
# Are they all network connections?
$ sudo lsof -p $(pgrep -f nginx | head -1) | awk '{print $5}' | sort | uniq -c | sort -rn
890 IPv4 ← 890 network connections!
45 REG ← 45 regular files
12 unix ← 12 unix sockets
3 DIR ← 3 directories
2 CHR ← 2 character devices
1 FIFO ← 1 pipe
# Is there a file descriptor leak? (Are FDs increasing over time?)
$ while true; do echo "$(date): $(ls /proc/$(pgrep -f nginx | head -1)/fd | wc -l)"; sleep 10; done
Step 4: Fix It
# Option A: Increase limits in systemd service file
$ sudo systemctl edit nginx.service
[Service]
LimitNOFILE=65536
$ sudo systemctl daemon-reload
$ sudo systemctl restart nginx
# Verify the new limit
$ cat /proc/$(pgrep -f nginx | head -1)/limits | grep "Max open files"
Max open files 65536 65536 files
# Option B: If using limits.conf (for non-systemd processes)
$ echo "nginx soft nofile 65536" | sudo tee -a /etc/security/limits.d/nginx.conf
$ echo "nginx hard nofile 65536" | sudo tee -a /etc/security/limits.d/nginx.conf
# Option C: If it is a file descriptor LEAK, the real fix is fixing the application
# (Increasing limits just delays the inevitable)
systemd Resource Controls
For services managed by systemd, resource limits are set in the service unit file:
[Service]
# File descriptor limit
LimitNOFILE=65536
# Max processes
LimitNPROC=4096
# Max locked memory (for databases)
LimitMEMLOCK=infinity
# Max core dump size
LimitCORE=infinity
# CPU time limit (in seconds)
LimitCPU=infinity
# Max file size
LimitFSIZE=infinity
# Max address space
LimitAS=infinity
# Apply to an existing service without editing the main file
$ sudo systemctl edit nginx.service
# This creates an override file at
# /etc/systemd/system/nginx.service.d/override.conf
# Verify the effective limits
$ sudo systemctl show nginx.service | grep LimitNOFILE
LimitNOFILE=65536
LimitNOFILESoft=65536
Checking Resource Usage of systemd Services
# See resource usage of a service
$ systemctl status nginx.service
Tasks: 5 (limit: 4096)
Memory: 12.5M
CPU: 2.345s
# Detailed cgroup view
$ systemd-cgtop
Control Group Tasks %CPU Memory
/ 245 5.2 4.5G
/system.slice 78 3.1 2.1G
/system.slice/ngin 5 0.8 12.5M
/user.slice 167 2.1 2.4G
Debug This
A Java application fails to start with:
java.io.IOException: Too many open files
You check and find:
$ ulimit -n
1024
$ cat /etc/security/limits.d/java.conf
javauser soft nofile 65536
javauser hard nofile 65536
The limits.conf looks correct. But the limit is still 1024. Why?
Common causes:
- The application is started by systemd, and systemd does not read limits.conf. Fix: add LimitNOFILE=65536 to the systemd service file.
- The PAM module is not loaded. limits.conf requires pam_limits.so. Check:
$ grep pam_limits /etc/pam.d/common-session
session required pam_limits.so
- The user running the process is different from what you think. Check:
$ ps aux | grep java
root 5678 ... java -jar app.jar
# Running as root, not as javauser!
- The shell session was started before the limits.conf change. Log out and back in, or start a new session.
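A quick way to rule out several of these causes at once is to ask the running process itself rather than reasoning from config files. A sketch; `whose_limits` is a hypothetical helper, not a standard command:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report who a process actually runs as and what
# nofile limits it actually has. Usage: whose_limits <pid>

whose_limits() {
    local pid=$1
    local user soft hard
    user=$(stat -c %U "/proc/$pid")     # owner of the /proc entry
    # "Max open files <soft> <hard> files" -> fields 4 and 5
    set -- $(awk '/Max open files/ {print $4, $5}' "/proc/$pid/limits")
    soft=$1 hard=$2
    echo "PID $pid runs as '$user' with nofile soft=$soft hard=$hard"
}

whose_limits $$
```

If the reported user or limits differ from what your limits.conf entry targets, you have found the mismatch.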
Think About It: If you increase the file descriptor limit to 1,000,000 for a process, does that mean it will actually use that many? What is the cost of a higher limit if the FDs are not actually used?
┌──────────────────────────────────────────────────────────┐
│ What Just Happened? │
├──────────────────────────────────────────────────────────┤
│ │
│ File descriptors are how Linux tracks open files: │
│ - FD 0 = stdin, FD 1 = stdout, FD 2 = stderr │
│ - Every file, socket, pipe, device uses an FD │
│ - /proc/PID/fd shows a process's open FDs │
│ - lsof lists open files and their FD details │
│ │
│ Resource limits control FD usage: │
│ - ulimit -n: per-process soft limit │
│ - /etc/security/limits.conf: persistent limits │
│ - LimitNOFILE in systemd: for managed services │
│ - fs.file-max: system-wide total │
│ - fs.nr_open: max per-process ceiling │
│ │
│ "Too many open files" troubleshooting: │
│ 1. Find the process (dmesg, app logs) │
│ 2. Count its FDs (ls /proc/PID/fd | wc -l) │
│ 3. Check its limits (cat /proc/PID/limits) │
│ 4. Determine if it is a limit issue or a leak │
│ 5. Fix via systemd LimitNOFILE or limits.conf │
│ │
└──────────────────────────────────────────────────────────┘
Try This
- Explore your FDs: Run ls -la /proc/$$/fd/ to see your shell's file descriptors. Open a file with exec 3>/tmp/testfile, run ls -la /proc/$$/fd/ again, and see FD 3 appear. Close it with exec 3>&-.
- lsof investigation: Use lsof -u $USER to see all files you have open. Count them. How many are regular files vs sockets vs pipes?
- Limit testing: Set ulimit -n 10 in a shell. Then try to open many files with a simple script. Observe the "Too many open files" error.
- Process FD counting: Write a one-liner that finds the process with the most open file descriptors on your system. Use /proc/*/fd and wc -l.
- Bonus challenge: Create a systemd service that runs a simple script. Set LimitNOFILE=100 in the service file. Have the script try to open 200 files. Check journalctl for the failure. Then increase the limit to 300 and verify it works.