File Descriptors & I/O¶
What is a File Descriptor¶
A file descriptor (FD) is a non-negative integer (0 is valid) that represents an open file, socket, pipe, or any other I/O resource.
Key Point: It's just an index into the process's file descriptor table!
int fd = open("/tmp/file.txt", O_RDONLY);
// fd is typically 3 (if 0,1,2 are taken)
read(fd, buffer, size);
close(fd);
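Because the descriptor is only an index, open() hands out the lowest-numbered free slot, which is why the first file opened is typically FD 3. The following is a minimal sketch of that behaviour in a single-threaded program; the helper name lowest_fd_is_reused is hypothetical, and /dev/null is assumed to exist.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a freshly opened descriptor reuses the number that was
 * just closed, 0 if it does not, and -1 if open() fails.
 * Assumes no other thread opens or closes descriptors concurrently. */
int lowest_fd_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    close(first);                 /* slot `first` is free again */

    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    int reused = (second == first);
    close(second);
    return reused;
}
```

Code that closes STDOUT_FILENO and then calls open() relies on the same rule to land the new file on FD 1.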
File Descriptor Tables¶
Three Levels of Tables¶
graph TD
A[Process A FD Table] -->|FD 3| B[File Table Entry 1]
A -->|FD 4| C[File Table Entry 2]
D[Process B FD Table] -->|FD 3| B
B -->|inode| E[Inode X]
C -->|inode| F[Inode Y]
1. Per-Process FD Table
- Each process has its own
- Array of pointers to file table entries
- Index is the file descriptor number
2. System-Wide File Table
- Kernel maintains one
- Tracks: file offset, access mode, reference count
- Multiple FDs can point to same entry (dup, fork)
3. Inode Table
- One per file/device
- Contains file metadata and data location
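The difference between sharing a file table entry and sharing only an inode can be observed through the file offset. The sketch below assumes a writable /tmp; the helper name compare_offsets and the scratch-file template are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* After 4 bytes are read through the original descriptor:
 *   offsets[0] = offset seen through a dup()ed descriptor
 *   offsets[1] = offset seen through a separate open() of the same file
 * Returns 0 on success, -1 on error. */
int compare_offsets(off_t offsets[2])
{
    char path[] = "/tmp/fd_demo_XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    int alias = -1, other = -1, rc = -1;
    char buf[4];

    if (write(fd, "0123456789", 10) == 10 && lseek(fd, 0, SEEK_SET) == 0) {
        alias = dup(fd);                   /* same file table entry */
        other = open(path, O_RDONLY);      /* new entry, same inode */
        if (alias >= 0 && other >= 0 && read(fd, buf, sizeof buf) == 4) {
            offsets[0] = lseek(alias, 0, SEEK_CUR);
            offsets[1] = lseek(other, 0, SEEK_CUR);
            rc = 0;
        }
    }

    if (alias >= 0) close(alias);
    if (other >= 0) close(other);
    close(fd);
    unlink(path);
    return rc;
}
```

The dup()ed descriptor reports offset 4 because it shares the entry that was advanced, while the independently opened descriptor still reports 0.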
Underlying Structure of the FDT¶
graph LR
subgraph "Process: PID 1234"
direction TB
FDT[File Descriptor Table] --> FD0["fd[0]"]
FDT --> FD1["fd[1]"]
FDT --> FD2["fd[2]"]
FDT --> FD3["fd[3]"]
FDT --> FD4["fd[4]"]
FDT --> FD5["fd[5]"]
end
subgraph "Open File Objects (struct file)"
direction TB
TTY0["struct file<br/>f_path: /dev/pts/0<br/>f_pos: N/A (tty)<br/>type: character device"]
TTY1["struct file<br/>f_path: /dev/pts/1<br/>f_pos: N/A (tty)<br/>type: character device"]
TTY2["struct file<br/>f_path: /dev/pts/2<br/>f_pos: N/A (tty)<br/>type: character device"]
FILE["struct file<br/>f_path: /path/to/data.txt<br/>f_pos: 2048<br/>type: regular file"]
SOCK["struct file<br/>f_path: socket:[12345]<br/>f_pos: N/A (stream socket)<br/>type: socket"]
PIPE_W["struct file<br/>f_path: pipe:[6789]<br/>f_pos: N/A (pipe)<br/>type: pipe (write end)"]
end
%% Connections
FD0 -->|stdin| TTY0
FD1 -->|stdout| TTY1
FD2 -->|stderr| TTY2
FD3 -->|regular file| FILE
FD4 -->|TCP socket| SOCK
FD5 -->|"pipe (write)"| PIPE_W
Example¶
// Process 1
int fd1 = open("file.txt", O_RDONLY); // FD 3 → file table entry A → inode 12345
// Process 2
int fd2 = open("file.txt", O_RDONLY); // FD 3 → file table entry B → inode 12345
// Different file table entries (independent offsets), same inode
// After fork()
if (fork() == 0) {
// Child shares same file table entry!
// Parent and child share one file offset: a read in either advances it for both
}
Standard File Descriptors¶
By convention, every process starts with three open file descriptors, inherited from its parent:
| FD | Name | Description |
|---|---|---|
| 0 | stdin | Standard input |
| 1 | stdout | Standard output |
| 2 | stderr | Standard error |
// These are equivalent
read(0, buffer, size);
read(STDIN_FILENO, buffer, size);
write(1, "Hello\n", 6);
write(STDOUT_FILENO, "Hello\n", 6);
write(2, "Error\n", 6);
write(STDERR_FILENO, "Error\n", 6);
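Note that write() may transfer fewer bytes than requested, especially on pipes and sockets, so robust code usually loops. The helper below is a minimal sketch of that pattern; write_all is an illustrative name, not a standard function.

```c
#include <errno.h>
#include <unistd.h>

/* Writes exactly `len` bytes to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (errno is left set by write()). */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

For example, write_all(STDERR_FILENO, "Error\n", 6) behaves like the write() call above but does not silently drop a partial write.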
lsof - List Open Files¶
lsof lists open files and the processes holding them, including file descriptors, working directories, and memory-mapped files.
# List all open files
lsof
# Files opened by process
lsof -p <PID>
# Processes using a file
lsof /path/to/file
# Network connections
lsof -i
lsof -i :80 # Port 80
lsof -i TCP # TCP only
# Files in directory
lsof +D /var/log
# User's files
lsof -u username
Example output:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 1234 user cwd DIR 252,0 4096 123 /home/user
bash 1234 user 0u CHR 136,0 0t0 3 /dev/pts/0
bash 1234 user 1u CHR 136,0 0t0 3 /dev/pts/0
bash 1234 user 2u CHR 136,0 0t0 3 /dev/pts/0
bash 1234 user 255u CHR 136,0 0t0 3 /dev/pts/0
FD column:
- cwd: Current working directory
- txt: Program text (code)
- mem: Memory-mapped file
- 0u: FD 0, read/write mode
- 1w: FD 1, write only
I/O Redirection¶
Shell Redirection¶
# Redirect stdout to file
command > output.txt
# Redirect stderr to file
command 2> error.txt
# Redirect both (order matters: 2>&1 must come after > output.txt)
command > output.txt 2>&1
command &> output.txt # Bash shortcut
# Append
command >> output.txt
# Redirect stdin
command < input.txt
# Here document
command << EOF
line 1
line 2
EOF
# Pipe
command1 | command2
Programmatic Redirection¶
// Redirect stdout to file
int fd = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, STDOUT_FILENO); // Now FD 1 points to output.txt
close(fd);
printf("This goes to file\n");
// To restore stdout later, save a copy of it before redirecting
int saved_stdout = dup(STDOUT_FILENO);
// ... redirect ...
dup2(saved_stdout, STDOUT_FILENO);
close(saved_stdout);
dup2() usage:
int dup2(int oldfd, int newfd);
// Makes newfd a copy of oldfd
// If newfd is open, it's closed first
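Putting the save, redirect, and restore steps in order gives something like the sketch below. The function name print_to_file is hypothetical; the fflush() calls are there because stdio buffers output in user space, so unflushed text could otherwise end up on the wrong side of the redirection.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Temporarily points stdout at `path`, prints `msg`, then restores
 * the original stdout. Returns 0 on success, -1 on error. */
int print_to_file(const char *path, const char *msg)
{
    fflush(stdout);                        /* flush text meant for the old stdout */

    int saved = dup(STDOUT_FILENO);        /* 1. save */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }

    int rc = -1;
    if (dup2(fd, STDOUT_FILENO) >= 0) {    /* 2. redirect */
        fputs(msg, stdout);
        fflush(stdout);                    /* push it out before restoring */
        rc = 0;
    }
    close(fd);

    if (dup2(saved, STDOUT_FILENO) < 0)    /* 3. restore */
        rc = -1;
    close(saved);
    return rc;
}
```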
Blocking vs Non-Blocking I/O¶
Blocking I/O (Default)¶
Operations wait until complete.
int fd = open("file.txt", O_RDONLY);
char buf[100];
// Blocks until data available or EOF
ssize_t n = read(fd, buf, sizeof(buf));
Characteristics:
- Simple to program
- Process sleeps while waiting
- Good for sequential operations
Non-Blocking I/O¶
Operations return immediately, even if not complete.
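// Note: on Linux, O_NONBLOCK only matters for FIFOs, sockets, and devices;
// it has no effect on regular files, so substitute such a path for file.txt
int fd = open("file.txt", O_RDONLY | O_NONBLOCK);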
char buf[100];
ssize_t n = read(fd, buf, sizeof(buf));
if (n == -1 && errno == EAGAIN) {
// No data available now, try again later
}
Set non-blocking on existing FD:
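The usual approach is to read the current status flags with fcntl(F_GETFL) and write them back with O_NONBLOCK added, so that other flags are preserved. A minimal sketch (the helper name set_nonblocking is illustrative, not part of any library):

```c
#include <fcntl.h>

/* Adds O_NONBLOCK to the file status flags of an open descriptor.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

Because the flag lives in the file table entry, descriptors created with dup() or inherited across fork() see the change as well.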
Characteristics:
- Returns EAGAIN/EWOULDBLOCK if would block
- Need to poll or use I/O multiplexing
- More complex but better for concurrent operations
I/O Multiplexing¶
Handle multiple file descriptors simultaneously.
select()¶
Original Unix multiplexing.
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(fd1, &readfds);
FD_SET(fd2, &readfds);
struct timeval timeout = {5, 0}; // 5 seconds
int max_fd = (fd1 > fd2) ? fd1 : fd2; // select() takes the highest FD number plus one
int ready = select(max_fd + 1, &readfds, NULL, NULL, &timeout);
if (ready > 0) {
if (FD_ISSET(fd1, &readfds)) {
// fd1 is ready to read
}
if (FD_ISSET(fd2, &readfds)) {
// fd2 is ready to read
}
}
Limitations:
- FD_SETSIZE limit (typically 1024)
- Must scan all FDs to find ready ones
- O(n) complexity
poll()¶
Improvement over select.
struct pollfd fds[2];
fds[0].fd = fd1;
fds[0].events = POLLIN;
fds[1].fd = fd2;
fds[1].events = POLLIN;
int ready = poll(fds, 2, 5000); // 5 second timeout
if (ready > 0) {
if (fds[0].revents & POLLIN) {
// fd1 is ready
}
if (fds[1].revents & POLLIN) {
// fd2 is ready
}
}
Improvements:
- No FD_SETSIZE limit (bounded only by the process's FD limit)
- Clearer API
- Still O(n) complexity
epoll()¶
Linux-specific, most efficient.
// Create epoll instance
int epfd = epoll_create1(0);
// Add FDs to monitor
struct epoll_event ev;
ev.events = EPOLLIN;
ev.data.fd = fd1;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd1, &ev);
ev.data.fd = fd2;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd2, &ev);
// Wait for events
struct epoll_event events[10];
int ready = epoll_wait(epfd, events, 10, 5000); // 5 second timeout
for (int i = 0; i < ready; i++) {
int fd = events[i].data.fd;
if (events[i].events & EPOLLIN) {
// fd is ready to read
}
}
close(epfd);
Advantages:
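- Cost of epoll_wait() scales with the number of ready FDs, not the number watched (often summarized as O(1))
- No FD_SETSIZE limit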
- Edge-triggered or level-triggered
- Most scalable
Trigger Modes:
- Level-triggered (default): Event reported while condition exists
- Edge-triggered (EPOLLET): Event reported only on state change
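In edge-triggered mode a descriptor is reported once per state change, so the handler is expected to keep reading until read() fails with EAGAIN; otherwise leftover data may never trigger another event. The sketch below shows that drain loop together with an edge-triggered registration. The names drain and watch_edge_triggered are hypothetical, and the descriptor is assumed to be non-blocking already.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Reads from a non-blocking fd until EOF or EAGAIN.
 * Returns the total number of bytes consumed, or -1 on error. */
ssize_t drain(int fd)
{
    char buf[4096];
    ssize_t total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;           /* real code would process buf here */
            continue;
        }
        if (n == 0)
            return total;         /* EOF */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;         /* nothing left for now */
        return -1;
    }
}

/* Registers fd with epfd for edge-triggered read events.
 * Returns 0 on success, -1 on error. */
int watch_edge_triggered(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}
```

With level-triggered mode the drain loop is optional, since epoll_wait() keeps reporting the descriptor while unread data remains.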
Comparison¶
| Feature | select() | poll() | epoll() |
|---|---|---|---|
| FD limit | FD_SETSIZE (typically 1024) | Process FD limit | Process FD limit |
| Performance | O(n) in watched FDs | O(n) in watched FDs | O(ready FDs) |
| Scalability | Poor | Medium | Excellent |
| Portability | High | High | Linux only |
Asynchronous I/O¶
Operations return immediately; completion notification comes later.
POSIX AIO¶
#include <aio.h>
#include <errno.h>   // EINPROGRESS
#include <string.h>  // memset
struct aiocb cb;
memset(&cb, 0, sizeof(cb));
cb.aio_fildes = fd;
cb.aio_buf = buffer;
cb.aio_nbytes = size;
cb.aio_offset = 0;
// Start async read
aio_read(&cb);
// Do other work
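// Check if complete (busy-polling for illustration only;
// aio_suspend() can block until the request finishes instead)
while (aio_error(&cb) == EINPROGRESS) {
// Still in progress
}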
// Get result
ssize_t n = aio_return(&cb);
io_uring (Modern)¶
Linux 5.1+ high-performance async I/O, used here through the liburing helper library (io_uring_prep_read needs kernel 5.6+).
#include <liburing.h>
struct io_uring ring;
io_uring_queue_init(32, &ring, 0);
// Submit read request
struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd, buffer, size, offset);
io_uring_submit(&ring);
// Wait for completion
struct io_uring_cqe *cqe;
io_uring_wait_cqe(&ring, &cqe);
ssize_t n = cqe->res;
io_uring_cqe_seen(&ring, cqe);
io_uring_queue_exit(&ring);
Benefits:
- True async I/O
- Shared memory ring buffers
- Very high performance
File Descriptor Limits¶
# Per-process limits
ulimit -n # Soft limit
ulimit -Hn # Hard limit
# View for process
cat /proc/<PID>/limits
# System-wide limits
cat /proc/sys/fs/file-max
cat /proc/sys/fs/file-nr # allocated, allocated-but-unused, max
# Change limits
ulimit -n 4096 # Current shell
# or in /etc/security/limits.conf
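The same per-process limits are available programmatically through getrlimit() and setrlimit() with RLIMIT_NOFILE. The sketch below uses two hypothetical helper names, fd_limits and raise_fd_limit_to_hard; an unprivileged process may raise its soft limit only up to the hard limit.

```c
#include <sys/resource.h>

/* Stores the soft and hard limits on open descriptors.
 * Returns 0 on success, -1 on error. */
int fd_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;          /* what `ulimit -n` prints */
    *hard = rl.rlim_max;          /* what `ulimit -Hn` prints */
    return 0;
}

/* Raises the soft limit to the current hard limit.
 * Returns 0 on success, -1 on error. */
int raise_fd_limit_to_hard(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Servers that expect many connections often call something like raise_fd_limit_to_hard() at startup rather than depending on the launching shell's ulimit.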
File Descriptor Leaks¶
Forgetting to close FDs causes resource leaks; once the process reaches its FD limit, open() fails with EMFILE.
// BAD: Leak
for (int i = 0; i < 1000; i++) {
int fd = open("file.txt", O_RDONLY);
// ... use fd ...
// Forgot close(fd)!
}
// GOOD
int fd = open("file.txt", O_RDONLY);
if (fd >= 0) {
// ... use fd ...
close(fd);
}
Detect leaks:
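One way to spot a leak from inside a program, assuming a Linux system with /proc mounted, is to count the entries in /proc/self/fd before and after a suspect operation; a count that keeps growing points to descriptors that are never closed. The helper name count_open_fds below is hypothetical.

```c
#include <dirent.h>
#include <stdlib.h>

/* Counts the descriptors currently open in this process by listing
 * /proc/self/fd. Returns the count, or -1 if /proc is unavailable. */
int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int self = dirfd(dir);        /* the listing itself uses one FD */
    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue;             /* skip "." and ".." */
        if (atoi(ent->d_name) == self)
            continue;             /* do not count our own directory FD */
        count++;
    }
    closedir(dir);
    return count;
}
```

From outside the process, listing /proc/<PID>/fd or running lsof -p <PID> at intervals gives the same information.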
Practice Questions¶
- What is a file descriptor and how does it relate to an inode?
- Explain the three levels of file descriptor tables.
- What happens to file descriptors after fork()?
- How does dup2() work and what is it used for?
- What is the difference between blocking and non-blocking I/O?
- Compare select(), poll(), and epoll().
- Why is epoll() more scalable than select()?
- How do you detect file descriptor leaks?
Further Reading¶
- man 2 open, man 2 read, man 2 close
- man 2 select, man 2 poll, man 7 epoll
- man 8 lsof
- "The Linux Programming Interface" Chapters 4-5, 63