Process Management

Overview

Process management is fundamental to understanding how Linux runs programs and allocates system resources.

Process vs Thread

Process

A process is an instance of a running program with its own isolated memory space.

Characteristics:

  • Independent memory space: Cannot directly access other processes' memory
  • Own resources: File descriptors, signal handlers, process ID
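  • Heavy creation: fork() duplicates the address space; copy-on-write (COW) defers copying pages until one side writes, but page tables are still set up
  • Security boundary: Isolated for safety
  • Context switch overhead: Must switch page tables, which usually invalidates TLB entries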
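
The "independent memory space" point can be checked directly. The sketch below is a minimal illustration, assuming a POSIX system; the helper name child_write_is_private is hypothetical. The child changes a variable after fork(), and the parent confirms that its own copy is untouched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write made by the child stays
 * invisible to the parent, 0 if not, -1 on error. */
int child_write_is_private(void)
{
    int value = 1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        value = 2;      /* modifies the child's private copy only */
        _exit(value);   /* report what the child sees via its exit status */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    int child_saw = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return (child_saw == 2 && value == 1) ? 1 : 0;
}
```

The exit status is used as the return channel here only because it needs no IPC setup; real programs would use a pipe or shared memory.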

Thread

A thread is a unit of execution within a process that shares the process's resources.

Characteristics:

  • Shared memory: All threads see same address space
  • Shared resources: File descriptors, signal handlers, heap
  • Private stack: Each thread has its own stack
  • Lightweight: Fast creation, minimal overhead
  • Context switch: Faster between threads of the same process, because the address space (and its TLB entries) stays valid

Comparison

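| Feature        | Process             | Thread                 |
|----------------|---------------------|------------------------|
| Memory space   | Separate            | Shared                 |
| Creation time  | ~100 μs (rough)     | ~10 μs (rough)         |
| Communication  | IPC needed          | Direct memory access   |
| Safety         | Isolated            | Can corrupt each other |
| Resources      | Independent         | Shared                 |
| Context switch | Expensive           | Cheap                  |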

When to Use

Use Processes:

  • Need isolation/security
  • Running untrusted code
  • Want crash isolation
  • Different programs

Use Threads:

  • Shared state needed
  • Parallel computation
  • Performance critical
  • Single application

Process States

A process transitions through various states during its lifetime.

State Diagram

stateDiagram-v2
    [*] --> Ready: fork()
    Ready --> Running: scheduled
    Running --> Ready: preempted
    Running --> Blocked: wait for I/O
    Blocked --> Ready: I/O complete
    Running --> Zombie: exit()
    Zombie --> [*]: parent wait()

State Descriptions

Running

  • Currently executing on CPU
  • At most one per logical CPU at any instant

Ready (Runnable)

  • Ready to run, waiting for CPU
  • In run queue

Blocked (Sleeping)

  • Waiting for event (I/O, signal, resource)
  • Interruptible (S): Can be woken by signal
  • Uninterruptible (D): Cannot be interrupted (usually I/O)

Zombie (Defunct)

  • Terminated but not yet reaped by parent
  • Process Control Block still exists
  • Exit status needs collection

Orphan

  • Not a scheduler state: a still-running process whose parent terminated first
  • Adopted by init (PID 1) or the nearest subreaper

Stopped

  • Suspended (SIGSTOP, SIGTSTP)
  • Can be continued with SIGCONT

# View process states (STAT column)
ps aux

# State codes
# R = Running or runnable (on run queue)
# S = Interruptible sleep
# D = Uninterruptible sleep (usually I/O)
# Z = Zombie (defunct)
# T = Stopped by job control signal
# I = Idle kernel thread
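
The same state letter is available programmatically. The sketch below assumes a Linux system with /proc mounted and reads the third field of /proc/<PID>/stat; the helper name process_state is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the state letter of `pid`
 * (R, S, D, Z, T, ...), or '?' if it cannot be read. */
char process_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Format: "pid (comm) S ...". comm may itself contain spaces or ')',
     * so locate the last ')' and read the field that follows it. */
    char *close_paren = strrchr(buf, ')');
    if (!close_paren || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '?';
    return close_paren[2];
}
```

A process that inspects itself is on the CPU while doing so, so process_state(getpid()) is expected to report R.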

Process Creation Lifecycle

fork() System Call

Creates a new process by duplicating the calling process.

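#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();

    if (pid == -1) {
        // Error: no child was created
        perror("fork failed");
        return 1;
    } else if (pid == 0) {
        // Child process
        printf("I am child, PID: %d\n", (int)getpid());
    } else {
        // Parent process
        printf("I am parent, child PID: %d\n", (int)pid);
    }
    return 0;
}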

What fork() does:

  1. Allocate new PCB: Create Process Control Block
  2. Assign PID: New process ID from kernel
  3. Copy memory: Use Copy-on-Write for efficiency
  4. Copy file descriptors: Reference same file table entries
  5. Copy signal handlers: Inherit parent's handlers
  6. Return twice:
       • Returns 0 to child
       • Returns child PID to parent
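
Step 4 has a visible consequence: parent and child share one file offset. A minimal sketch, assuming a writable /tmp; the helper name offset_after_child_write and the temporary file template are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's file offset after the child
 * writes 3 bytes through the inherited descriptor, or -1 on error. */
off_t offset_after_child_write(void)
{
    char path[] = "/tmp/fork_fd_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);   /* the file vanishes once every descriptor is closed */

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        ssize_t n = write(fd, "abc", 3);
        _exit(n == 3 ? 0 : 1);
    }

    int status;
    off_t offset = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        offset = lseek(fd, 0, SEEK_CUR);   /* parent never wrote, yet the offset moved */
    close(fd);
    return offset;
}
```

Because both descriptors refer to the same open file description, the parent sees offset 3 even though only the child wrote.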

exec() Family

Replaces current process image with new program.

// Replace process with /bin/ls
char *args[] = {"/bin/ls", "-l", NULL};
execv("/bin/ls", args);

// Only reached if exec fails
perror("exec failed");

What exec() does:

  1. Load program: Read executable from disk
  2. Replace memory: Clear old code, data, stack
  3. Initialize new memory: Load new segments
  4. Reset signals: Caught signals revert to their default action; ignored signals stay ignored
  5. Keep PID: Same process, new program
  6. Keep file descriptors: Unless FD_CLOEXEC set
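
The FD_CLOEXEC flag mentioned in step 6 is set per descriptor with fcntl(). The helpers below are a small sketch; the names set_cloexec and is_cloexec are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Marks fd close-on-exec. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Returns 1 if fd is close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    return flags == -1 ? -1 : (flags & FD_CLOEXEC) != 0;
}
```

When opening a new file, passing O_CLOEXEC to open() sets the flag atomically, which avoids a window in which another thread could fork and exec.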

Common Pattern: fork() + exec()

pid_t pid = fork();

if (pid == -1) {
    perror("fork failed");
} else if (pid == 0) {
    // Child: run new program
    execl("/bin/ls", "ls", "-l", (char *)NULL);
    perror("exec failed");
    _exit(127);  // Only if exec fails
} else {
    // Parent: wait for this child
    waitpid(pid, NULL, 0);
}
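
The pattern is often wrapped in a helper that also decodes the wait status. The version below is a sketch rather than a canonical implementation: the name run_and_wait is hypothetical, and the 127 and 128 + signal return values follow shell convention, not anything the kernel mandates.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up via PATH) and returns its
 * exit code, 128 + signal number if a signal killed it, or -1 on error. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);   /* reached only if exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)   /* retry if a signal interrupted the wait */
            return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

For example, calling it with {"sh", "-c", "exit 7", NULL} should return 7, assuming sh is on PATH.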

Process Termination

Normal termination:

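exit(0);        // Library function: runs atexit() handlers, flushes stdio, then calls _exit()
_exit(0);       // System call wrapper: terminates immediately, no user-space cleanup
return 0;       // From main(): equivalent to calling exit(0)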
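
The difference between exit() and _exit() can be observed with an atexit() handler. In this sketch (helper names are hypothetical) the child registers a handler that writes one byte to a pipe, and the parent checks whether the byte arrived.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int notify_fd = -1;

static void on_exit_handler(void)
{
    ssize_t n = write(notify_fd, "x", 1);
    (void)n;   /* nothing useful to do on failure in this sketch */
}

/* Returns 1 if the child's atexit handler ran, 0 if not, -1 on error. */
int atexit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        notify_fd = fds[1];
        atexit(on_exit_handler);
        if (use_underscore_exit)
            _exit(0);   /* straight to the kernel: handler is skipped */
        exit(0);        /* runs the handler, then terminates */
    }

    close(fds[1]);
    char byte;
    ssize_t n = read(fds[0], &byte, 1);   /* 0 means EOF: nothing was written */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : (n == 1);
}
```

This is also why a forked child that fails to exec should call _exit(): exit() would flush stdio buffers inherited from the parent a second time.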

Abnormal termination:

  • Signal (SIGTERM, SIGKILL)
  • Crash (SIGSEGV, SIGABRT)
  • Fatal error

Cleanup steps:

  1. Close file descriptors
  2. Release memory
  3. Send SIGCHLD to parent
  4. Become zombie: Wait for parent to collect status
  5. Parent calls wait(): Release PCB

Zombie Processes

A zombie is a terminated process whose exit status has not yet been collected by its parent.

// Create zombie (bad practice!)
if (fork() == 0) {
    exit(0);  // Child exits
}
// Parent doesn't call wait()
sleep(60);  // Child is zombie for 60 seconds

Problems:

  • Consumes PID
  • Wastes kernel memory (PCB)
  • Too many zombies can exhaust PIDs

Solutions:

// Solution 1: Wait for children
wait(NULL);
waitpid(child_pid, &status, 0);

// Solution 2: Ignore SIGCHLD (kernel reaps children automatically; exit status is lost)
signal(SIGCHLD, SIG_IGN);

// Solution 3: Double fork trick
if (fork() == 0) {
    if (fork() == 0) {
        // Grandchild does work
        sleep(10);
        exit(0);
    }
    exit(0);  // Child exits immediately
}
wait(NULL);  // Reap child, grandchild becomes orphan → init adopts

# Find zombie processes (STAT column starts with Z)
ps aux | awk '$8 ~ /^Z/'
ps -eo stat,pid,ppid,cmd | grep '^Z'

Context Switching

Context switching is the process of saving and restoring CPU state to switch between processes.

What Gets Saved/Restored

CPU Registers:

  • Program counter (PC)
  • Stack pointer (SP)
  • General purpose registers
  • Floating point registers

Process State:

  • Process ID
  • Priority
  • Memory mappings (page table)

Context Switch Steps

  1. Timer interrupt or system call: Trigger switch
  2. Save current process state: Registers, PC to PCB
  3. Select next process: Scheduler decision
  4. Load new process state: Restore registers from PCB
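  5. Switch page tables: Update MMU (only when the next task uses a different address space)
  6. Flush TLB: Invalidate stale translations; CPUs with tagged TLBs (PCID/ASID) can often avoid a full flush
  7. Resume execution: Jump to new PC

graph LR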
    A[Process A Running] --> B[Timer Interrupt]
    B --> C[Save A's State]
    C --> D[Scheduler Selects B]
    D --> E[Load B's State]
    E --> F[Process B Running]

Context Switch Overhead

Direct costs (rough orders of magnitude, hardware-dependent):

  • Save/restore registers: ~1000 cycles
  • TLB flush: ~1000 cycles
  • Cache pollution: ~10,000 cycles

Indirect costs:

  • Cache misses after switch
  • TLB misses after switch
  • Pipeline stalls

# Measure system-wide context switches (cs column)
vmstat 1

# Per-process switches per second (cswch/s, nvcswch/s)
pidstat -w 1

# Per-process context switch totals
grep ctxt_switches /proc/<PID>/status
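
The same counters are exposed to a process through getrusage(). The sketch below (the helper name is hypothetical) counts voluntary switches across a few short sleeps; on Linux each blocking sleep would normally add at least one, though some sandboxed kernels may not populate these fields.

```c
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: returns how many voluntary context switches the
 * calling process incurred while sleeping `naps` times, or -1 on error. */
long voluntary_switches_while_sleeping(int naps)
{
    struct rusage before, after;
    if (getrusage(RUSAGE_SELF, &before) == -1)
        return -1;

    struct timespec nap = { 0, 1000 * 1000 };   /* 1 ms */
    for (int i = 0; i < naps; i++)
        nanosleep(&nap, NULL);   /* blocking gives up the CPU voluntarily */

    if (getrusage(RUSAGE_SELF, &after) == -1)
        return -1;
    return after.ru_nvcsw - before.ru_nvcsw;
}
```

The involuntary counterpart is ru_nivcsw, which grows when the scheduler preempts the process.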

Process Scheduling

The Linux scheduler decides which process runs when.

Completely Fair Scheduler (CFS)

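Linux used CFS for normal (non-real-time) processes from kernel 2.6.23 through 6.5. Kernel 6.6 replaced it with the EEVDF scheduler, which keeps the weighted vruntime accounting described here.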

Key Concepts:

Virtual Runtime (vruntime)

  • Tracks how much CPU time process has used
  • Weighted by priority
  • Process with lowest vruntime runs next

Red-Black Tree

  • Processes sorted by vruntime
  • O(log n) insertion/removal
  • Leftmost node = next to run

Time Slice

  • Not fixed time slice
  • Minimum granularity (on the order of a millisecond, tunable) keeps slices from shrinking too far
  • Fair share based on number of processes
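
The "weighted by priority" idea can be made concrete with a simplified model. The sketch below is not kernel code: the weights are modeled on the kernel's sched_prio_to_weight table (nice 0 = 1024, each step roughly a 1.25x change), the function names are hypothetical, and a linear scan stands in for taking the leftmost node of the red-black tree.

```c
#include <stddef.h>
#include <stdint.h>

#define NICE_0_WEIGHT 1024u

/* Load weights for nice -20..19 (assumed to mirror the kernel table). */
static const uint32_t nice_to_weight[40] = {
    /* -20 */ 88761, 71755, 56483, 46273, 36291,
    /* -15 */ 29154, 23254, 18705, 14949, 11916,
    /* -10 */  9548,  7620,  6100,  4904,  3906,
    /*  -5 */  3121,  2501,  1991,  1586,  1277,
    /*   0 */  1024,   820,   655,   526,   423,
    /*   5 */   335,   272,   215,   172,   137,
    /*  10 */   110,    87,    70,    56,    45,
    /*  15 */    36,    29,    23,    18,    15,
};

/* vruntime advances more slowly for heavier (lower nice) tasks:
 * delta_vruntime = delta_exec * NICE_0_WEIGHT / weight. */
uint64_t vruntime_delta(uint64_t delta_exec_ns, int nice)
{
    if (nice < -20) nice = -20;
    if (nice > 19)  nice = 19;
    return delta_exec_ns * NICE_0_WEIGHT / nice_to_weight[nice + 20];
}

/* Returns the index of the task with the lowest vruntime (count >= 1). */
size_t pick_next(const uint64_t *vruntimes, size_t count)
{
    size_t best = 0;
    for (size_t i = 1; i < count; i++)
        if (vruntimes[i] < vruntimes[best])
            best = i;
    return best;
}
```

Under this model a nice 0 task that runs for 1 ms gains 1 ms of vruntime, while a nice 19 task gains far more and therefore waits longer before it is picked again.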

Scheduling Classes

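  1. SCHED_FIFO: Real-time, first-in-first-out; runs until it blocks, yields, or is preempted by a higher priority
  2. SCHED_RR: Real-time, round-robin with a time quantum
  3. SCHED_NORMAL (SCHED_OTHER in user space): Default, CFS
  4. SCHED_BATCH: CPU-bound batch work, treated as non-interactive
  5. SCHED_IDLE: Lowest priority

# View scheduling policy
chrt -p <PID>

# Set scheduling policy (real-time policies need root or CAP_SYS_NICE)
chrt -f -p 99 <PID>  # FIFO, priority 99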

Priorities and Nice Values

Nice Value (-20 to 19):

  • Lower = higher priority
  • Default = 0
  • Any user can increase it (lower priority)
  • Root (or CAP_SYS_NICE) can decrease it (raise priority)

# View priority and nice
ps -eo pid,ni,pri,comm

# Change nice value
nice -n 10 ./program      # Start with nice 10
renice -n 5 -p <PID>      # Change nice value

# Priority (PRI)
# top shows PR = 20 + nice; ps -o pri shows 19 - nice (higher = more favorable)
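
From inside a program the nice value is read and changed with getpriority() and setpriority(). A minimal sketch; the helper names get_own_nice and be_nicer and the -100 error value are hypothetical choices for this example.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Reads the caller's nice value into *out. Returns 0 on success, -1 on error. */
int get_own_nice(int *out)
{
    errno = 0;   /* -1 is also a valid nice value, so errno disambiguates */
    int value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0)
        return -1;
    *out = value;
    return 0;
}

/* Raises the caller's nice value by `delta` (lowering its priority),
 * capped at 19. Returns the new nice value, or -100 on error. */
int be_nicer(int delta)
{
    int current;
    if (delta < 0 || get_own_nice(&current) == -1)
        return -100;
    int target = current + delta > 19 ? 19 : current + delta;
    if (setpriority(PRIO_PROCESS, 0, target) == -1)
        return -100;
    return target;
}
```

Going the other way (a lower nice value) fails with EACCES for an unprivileged caller unless RLIMIT_NICE permits it.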

CPU Affinity:

# Set CPU affinity
taskset -c 0,1 ./program        # Run on CPUs 0 and 1
taskset -p -c 0,1 <PID>         # Change existing process
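
The programmatic equivalent of taskset is sched_setaffinity(). The sketch below is Linux-specific and needs _GNU_SOURCE for the CPU_* macros; the helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Returns the number of CPUs the caller may run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof set, &set) == -1)
        return -1;
    return CPU_COUNT(&set);
}

/* Pins the caller to the lowest-numbered CPU it is currently allowed on.
 * Returns that CPU number, or -1 on error. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, target;
    if (sched_getaffinity(0, sizeof allowed, &allowed) == -1)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_ZERO(&target);
            CPU_SET(cpu, &target);
            if (sched_setaffinity(0, sizeof target, &target) == -1)
                return -1;
            return cpu;
        }
    }
    return -1;
}
```

Starting from the current mask rather than assuming CPU 0 matters in containers and cgroups, where CPU 0 may not be available to the process.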

Load Average

Load average is an exponentially damped moving average of the number of runnable (R) and uninterruptible (D) tasks.

# View load average
uptime
# load average: 1.23, 0.85, 0.45  (1, 5, 15 minutes)

cat /proc/loadavg
# 1.23 0.85 0.45 2/345 12345
# 2/345 = currently runnable / total scheduling entities; 12345 = most recently assigned PID

Interpretation:

  • Load = # CPUs: System fully utilized
  • Load < # CPUs: CPUs idle
  • Load > # CPUs: Processes waiting

Important:

  • Includes processes in D state (disk I/O)
  • Not just CPU load
  • Don't confuse with CPU utilization %
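
The interpretation rules above amount to comparing the load with the CPU count. A small sketch using getloadavg() and sysconf(); the enum and function names are hypothetical.

```c
#include <stdlib.h>
#include <unistd.h>

enum load_level { LOAD_HEADROOM = -1, LOAD_SATURATED = 0, LOAD_OVERCOMMITTED = 1 };

/* Compares a load average with the number of CPUs. */
enum load_level classify_load(double load, long cpus)
{
    if (cpus < 1)
        cpus = 1;
    if (load < (double)cpus)
        return LOAD_HEADROOM;        /* some CPUs are idle */
    if (load > (double)cpus)
        return LOAD_OVERCOMMITTED;   /* tasks are waiting */
    return LOAD_SATURATED;
}

/* Classifies this machine's 1-minute load average.
 * Returns 0 and stores the result in *out, or -1 if it is unavailable. */
int classify_current_load(enum load_level *out)
{
    double avg[3];
    if (getloadavg(avg, 3) < 1)
        return -1;
    long cpus = sysconf(_SC_NPROCESSORS_ONLN);
    *out = classify_load(avg[0], cpus);
    return 0;
}
```

Because D-state tasks count toward the load, an overcommitted result can point at slow I/O as easily as at CPU contention.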

Process Groups and Sessions

Process Groups

A collection of processes that can receive signals collectively.

# View process group
ps -eo pid,pgid,cmd

# Send signal to entire group
kill -TERM -- -<PGID>  # Negative PID = process group; "--" stops option parsing
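
In C, the same thing is done with setpgid() and a negative PID passed to kill(). The sketch below is illustrative and the helper name signal_child_group is hypothetical. Parent and child both call setpgid() so the group exists whichever of them runs first.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: puts a child in its own process group, signals the
 * group, and returns the signal that terminated the child (or -1). */
int signal_child_group(int signo)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        signal(signo, SIG_DFL);   /* make sure the signal is not ignored */
        setpgid(0, 0);            /* become leader of a new group */
        for (;;)
            pause();              /* wait to be signalled */
    }

    /* Same call from the parent side; harmless if the child got there first.
     * A negative PID then addresses every member of group `pid`. */
    if (setpgid(pid, pid) == -1 || kill(-pid, signo) == -1) {
        kill(pid, SIGKILL);       /* clean up the child on failure */
        waitpid(pid, NULL, 0);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Shells use the same mechanism to deliver Ctrl+C to every process in a pipeline.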

Sessions

A collection of process groups, typically associated with a terminal.

# View session ID
ps -eo pid,sid,cmd

# Create new session
setsid ./program

Job Control

# Foreground job
./program

# Background job
./program &

# List jobs
jobs

# Bring to foreground
fg %1

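# Resume a stopped job in the background
bg %1

# Stop (suspend) the foreground job
# Press Ctrl+Z (sends SIGTSTP)

# Continue a stopped job
# fg resumes it in the foreground, bg in the background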

Daemons

A daemon is a background process that runs independently of a terminal.

Creating a Daemon

Traditional Method:

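#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

void daemonize(void) {
    // 1. Fork and exit parent, so the child is not a process group leader
    pid_t pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);

    // 2. Create new session (detaches from the controlling terminal)
    if (setsid() == -1) exit(EXIT_FAILURE);

    // 3. Fork again and exit, so the daemon cannot reacquire a terminal
    pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);

    // 4. Reset the file mode mask and change working directory
    umask(0);
    if (chdir("/") == -1) exit(EXIT_FAILURE);

    // 5. Open /dev/null
    int fd = open("/dev/null", O_RDWR);
    if (fd == -1) exit(EXIT_FAILURE);

    // 6. Redirect stdin, stdout, stderr to it
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    dup2(fd, STDERR_FILENO);
    if (fd > STDERR_FILENO) close(fd);

    // 7. Do daemon work
    while (1) {
        // ...
    }
}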

Modern Method (systemd):

[Unit]
Description=My Service

[Service]
ExecStart=/usr/bin/myprogram
Restart=always

[Install]
WantedBy=multi-user.target

# Manage with systemd (unit saved as e.g. /etc/systemd/system/myservice.service)
systemctl daemon-reload
systemctl start myservice
systemctl status myservice
systemctl enable myservice

Practice Questions

  1. What is the difference between a process and a thread?
  2. Explain the complete lifecycle of a zombie process.
  3. What happens during a context switch?
  4. How does the CFS scheduler ensure fairness?
  5. What is the difference between fork() and vfork()?
  6. Why is the load average different from CPU utilization?
  7. How would you create a daemon process?
  8. What is the double-fork trick and why is it used?

Further Reading

  • man 2 fork, man 3 exec, man 2 wait
  • man 7 sched
  • man 1 ps, man 1 top
  • "The Linux Programming Interface" Chapters 24-28
  • "Linux Kernel Development" by Robert Love