File Descriptors & I/O

What is a File Descriptor

A file descriptor (FD) is a non-negative integer (0 is a valid value) that identifies an open file, socket, pipe, or other I/O resource within a process.

Key Point: It's just an index into the process's file descriptor table!

int fd = open("/tmp/file.txt", O_RDONLY);
// open() returns the lowest unused FD: typically 3 when 0, 1, 2 are taken, or -1 on error
read(fd, buffer, size);
close(fd);

File Descriptor Tables

Three Levels of Tables

graph TD
    A[Process A FD Table] -->|FD 3| B[File Table Entry 1]
    A -->|FD 4| C[File Table Entry 2]
    D[Process B FD Table] -->|FD 3| B
    B -->|inode| E[Inode X]
    C -->|inode| F[Inode Y]

1. Per-Process FD Table

  • Each process has its own
  • Array of pointers to file table entries
  • Index is the file descriptor number

2. System-Wide File Table

  • Kernel maintains one
  • Tracks: file offset, access mode, reference count
  • Multiple FDs can point to same entry (dup, fork)

3. Inode Table

  • One per file/device
  • Contains file metadata and data location

Underlying Structure of the FDT

graph LR
    subgraph "Process: PID 1234"
        direction TB
        FDT[File Descriptor Table] --> FD0["fd[0]"]
        FDT --> FD1["fd[1]"]
        FDT --> FD2["fd[2]"]
        FDT --> FD3["fd[3]"]
        FDT --> FD4["fd[4]"]
        FDT --> FD5["fd[5]"]
    end

    subgraph "Open File Objects (struct file)"
        direction TB

        TTY0["struct file<br/>f_path: /dev/pts/0<br/>f_pos: N/A (tty)<br/>type: character device"]
        TTY1["struct file<br/>f_path: /dev/pts/1<br/>f_pos: N/A (tty)<br/>type: character device"]
        TTY2["struct file<br/>f_path: /dev/pts/2<br/>f_pos: N/A (tty)<br/>type: character device"]
        FILE["struct file<br/>f_path: /path/to/data.txt<br/>f_pos: 2048<br/>type: regular file"]
        SOCK["struct file<br/>f_path: socket:[12345]<br/>f_pos: N/A (stream socket)<br/>type: socket"]
        PIPE_W["struct file<br/>f_path: pipe:[6789]<br/>f_pos: N/A (pipe)<br/>type: pipe (write end)"]
    end

    %% Connections
    FD0 -->|stdin| TTY0
    FD1 -->|stdout| TTY1
    FD2 -->|stderr| TTY2
    FD3 -->|regular file| FILE
    FD4 -->|TCP socket| SOCK
    FD5 -->|"pipe (write)"| PIPE_W

Example

// Process 1
int fd1 = open("file.txt", O_RDONLY);  // FD 3 → file table entry A → inode 12345

// Process 2
int fd2 = open("file.txt", O_RDONLY);  // FD 3 → file table entry B → inode 12345
// Different file table entries (independent offsets), same inode

// After fork()
if (fork() == 0) {
    // Child shares same file table entry!
    // Parent and child have synchronized file offset
}

Standard File Descriptors

By convention, every process starts with three open file descriptors, inherited from its parent:

FD   Name     Description
0    stdin    Standard input
1    stdout   Standard output
2    stderr   Standard error

// These are equivalent
read(0, buffer, size);
read(STDIN_FILENO, buffer, size);

write(1, "Hello\n", 6);
write(STDOUT_FILENO, "Hello\n", 6);

write(2, "Error\n", 6);
write(STDERR_FILENO, "Error\n", 6);

lsof - List Open Files

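lsof lists open files per process: numbered file descriptors plus other entries such as the working directory, program text, and memory-mapped files.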

# List all open files
lsof

# Files opened by process
lsof -p <PID>

# Processes using a file
lsof /path/to/file

# Network connections
lsof -i
lsof -i :80        # Port 80
lsof -i TCP        # TCP only

# Files in directory
lsof +D /var/log

# User's files
lsof -u username

Example output:

COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
bash      1234 user  cwd    DIR  252,0     4096  123 /home/user
bash      1234 user    0u   CHR  136,0      0t0    3 /dev/pts/0
bash      1234 user    1u   CHR  136,0      0t0    3 /dev/pts/0
bash      1234 user    2u   CHR  136,0      0t0    3 /dev/pts/0
bash      1234 user   255u   CHR  136,0      0t0    3 /dev/pts/0

FD column:

  • cwd: Current working directory
  • txt: Program text (code)
  • mem: Memory-mapped file
  • 0u: FD 0, read/write mode
  • 1w: FD 1, write only

I/O Redirection

Shell Redirection

# Redirect stdout to file
command > output.txt

# Redirect stderr to file
command 2> error.txt

# Redirect both
command > output.txt 2>&1
command &> output.txt        # Bash shortcut

# Append
command >> output.txt

# Redirect stdin
command < input.txt

# Here document
command << EOF
line 1
line 2
EOF

# Pipe
command1 | command2

Programmatic Redirection

// Save the current stdout first so it can be restored later
int saved_stdout = dup(STDOUT_FILENO);

// Redirect stdout to file
int fd = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, STDOUT_FILENO);  // Now FD 1 points to output.txt
close(fd);                // FD 1 keeps the file open
printf("This goes to file\n");

// Restore stdout
fflush(stdout);           // Flush stdio's buffer while FD 1 is still the file
dup2(saved_stdout, STDOUT_FILENO);
close(saved_stdout);

dup2() usage:

int dup2(int oldfd, int newfd);
// Makes newfd a copy of oldfd
// If newfd is open, it's closed first

Blocking vs Non-Blocking I/O

Blocking I/O (Default)

A call does not return until it can make progress: read() waits for data or EOF, write() waits for buffer space.

int fd = open("file.txt", O_RDONLY);
char buf[100];
// Blocks until data available or EOF
ssize_t n = read(fd, buf, sizeof(buf));

Characteristics:

  • Simple to program
  • Process sleeps while waiting
  • Good for sequential operations

Non-Blocking I/O

Operations return immediately, even if not complete.

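// O_NONBLOCK matters for FIFOs, pipes, sockets, and terminals;
// it has no effect on reads from regular files.
int fd = open("my_fifo", O_RDONLY | O_NONBLOCK);  // "my_fifo" is a placeholder FIFO path
char buf[100];
ssize_t n = read(fd, buf, sizeof(buf));
if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
    // No data available now, try again later
}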

Set non-blocking on existing FD:

int flags = fcntl(fd, F_GETFL);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);

Characteristics:

  • Returns EAGAIN/EWOULDBLOCK if would block
  • Need to poll or use I/O multiplexing
  • More complex but better for concurrent operations

I/O Multiplexing

Handle multiple file descriptors simultaneously.

select()

The original multiplexing call, introduced in BSD Unix and later standardized by POSIX.

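fd_set readfds;
FD_ZERO(&readfds);
FD_SET(fd1, &readfds);
FD_SET(fd2, &readfds);
int max_fd = (fd1 > fd2) ? fd1 : fd2;  // select() takes the highest FD + 1

struct timeval timeout = {5, 0};  // 5 seconds
// select() overwrites readfds and may modify timeout: rebuild both before every call
int ready = select(max_fd + 1, &readfds, NULL, NULL, &timeout);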

if (ready > 0) {
    if (FD_ISSET(fd1, &readfds)) {
        // fd1 is ready to read
    }
    if (FD_ISSET(fd2, &readfds)) {
        // fd2 is ready to read
    }
}

Limitations:

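  • FD_SETSIZE limit (typically 1024): FDs numbered at or above it cannot be monitored
  • Caller must test every FD with FD_ISSET to find the ready ones
  • O(n) work per call, where n is the highest FD number being watched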

poll()

Improvement over select.

struct pollfd fds[2];
fds[0].fd = fd1;
fds[0].events = POLLIN;
fds[1].fd = fd2;
fds[1].events = POLLIN;

int ready = poll(fds, 2, 5000);  // 5 second timeout

if (ready > 0) {
    if (fds[0].revents & POLLIN) {
        // fd1 is ready
    }
    if (fds[1].revents & POLLIN) {
        // fd2 is ready
    }
}

Improvements:

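  • No FD_SETSIZE cap: the array can hold any FD the process is allowed to open
  • Clearer API: requested events and returned events are separate fields
  • Still O(n) per call: the whole array is passed in and scanned each time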

epoll()

Linux-specific; the most efficient of the three when monitoring many FDs.

// Create epoll instance
int epfd = epoll_create1(0);

// Add FDs to monitor
struct epoll_event ev;
ev.events = EPOLLIN;
ev.data.fd = fd1;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd1, &ev);

ev.data.fd = fd2;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd2, &ev);

// Wait for events
struct epoll_event events[10];
int ready = epoll_wait(epfd, events, 10, 5000);  // 5 second timeout

for (int i = 0; i < ready; i++) {
    int fd = events[i].data.fd;
    if (events[i].events & EPOLLIN) {
        // fd is ready to read
    }
}

close(epfd);

Advantages:

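  • epoll_wait() cost scales with the number of ready FDs, not the number monitored (often summarized as O(1))
  • No fixed FD limit (bounded by the process FD limit and /proc/sys/fs/epoll/max_user_watches)
  • Edge-triggered or level-triggered
  • Most scalable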

Trigger Modes:

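  • Level-triggered (default): Event reported on every epoll_wait() while the condition exists (e.g., unread data remains)
  • Edge-triggered (EPOLLET): Event reported only when the state changes (e.g., new data arrives); use non-blocking FDs and read or write until EAGAIN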

Comparison

Feature       select()            poll()           epoll()
FD limit      FD_SETSIZE (1024)   No fixed limit   No fixed limit
Performance   O(n)                O(n)             O(ready FDs)
Scalability   Poor                Medium           Excellent
Portability   High                High             Linux only

Asynchronous I/O

Operations return immediately; completion notification comes later.

POSIX AIO

#include <aio.h>
#include <errno.h>    // EINPROGRESS
#include <string.h>   // memset

struct aiocb cb;
memset(&cb, 0, sizeof(cb));
cb.aio_fildes = fd;
cb.aio_buf = buffer;
cb.aio_nbytes = size;
cb.aio_offset = 0;

// Start async read
aio_read(&cb);

// Do other work

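// Check if complete (busy-polling for illustration; aio_suspend() can block instead)
while (aio_error(&cb) == EINPROGRESS) {
    // Still in progress
}

// Get result: bytes read, or -1 on failure (call only once per request)
ssize_t n = aio_return(&cb);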

io_uring (Modern)

High-performance asynchronous I/O interface, available since Linux 5.1. The example uses the liburing helper library.

#include <liburing.h>

struct io_uring ring;
io_uring_queue_init(32, &ring, 0);

// Submit read request
struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd, buffer, size, offset);
io_uring_submit(&ring);

// Wait for completion
struct io_uring_cqe *cqe;
io_uring_wait_cqe(&ring, &cqe);
ssize_t n = cqe->res;  // bytes read, or a negative errno value on failure
io_uring_cqe_seen(&ring, cqe);

io_uring_queue_exit(&ring);

Benefits:

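  • Asynchronous for both buffered and direct I/O, on files and sockets
  • Submission and completion queues are ring buffers shared between user space and the kernel
  • Many requests can be batched per system call, which keeps overhead low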

File Descriptor Limits

# Per-process limits
ulimit -n          # Soft limit
ulimit -Hn         # Hard limit

# View for process
cat /proc/<PID>/limits

# System-wide limits
cat /proc/sys/fs/file-max
cat /proc/sys/fs/file-nr    # allocated, allocated but unused, max

# Change limits
ulimit -n 4096                  # Current shell
# or in /etc/security/limits.conf

File Descriptor Leaks

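Forgetting to close FDs leaks them: each one stays in the process's FD table until the process exits, and once the per-process limit is reached, open(), socket(), and similar calls fail with EMFILE.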

// BAD: Leak
for (int i = 0; i < 1000; i++) {
    int fd = open("file.txt", O_RDONLY);
    // ... use fd ...
    // Forgot close(fd)!
}

// GOOD
int fd = open("file.txt", O_RDONLY);
if (fd >= 0) {
    // ... use fd ...
    close(fd);
}

Detect leaks:

# Monitor FD count (approximate: lsof also prints a header line and non-FD rows such as cwd, txt, mem)
watch -n 1 'lsof -p <PID> | wc -l'

# List all FDs
ls -l /proc/<PID>/fd/

Practice Questions

  1. What is a file descriptor and how does it relate to an inode?
  2. Explain the three levels of file descriptor tables.
  3. What happens to file descriptors after fork()?
  4. How does dup2() work and what is it used for?
  5. What is the difference between blocking and non-blocking I/O?
  6. Compare select(), poll(), and epoll().
  7. Why is epoll() more scalable than select()?
  8. How do you detect file descriptor leaks?

Further Reading

  • man 2 open, man 2 read, man 2 close
  • man 2 select, man 2 poll, man 7 epoll
  • man 8 lsof
  • "The Linux Programming Interface" Chapters 4-5, 63