Tech World by Yashrajsinh

Linux Processes and Signals

Yashrajsinh
13 min read · Intermediate

Every program running on a Linux system exists as a process. When you start a web server, run a build command, or open a terminal, the kernel creates a process with its own memory space, file descriptors, and execution context. Understanding how processes work, how they communicate through signals, and how to manage them effectively is essential for any engineer working with production systems. Whether you are debugging a stuck deployment on AWS EC2, managing background tasks in a Docker container, or writing automation scripts for Jenkins pipelines, process management knowledge gives you the tools to diagnose and resolve issues quickly.

This guide takes you through the complete lifecycle of Linux processes, from creation through execution to termination. You will learn how the kernel schedules processes, how parent-child relationships work, how signals provide inter-process communication, and how to use job control to manage multiple tasks from a single terminal session. By the end, you will be comfortable inspecting process trees, sending targeted signals, managing resource consumption, and writing scripts that handle signals gracefully.

Concept Overview

Every running program in Linux is a process with its own memory space, file descriptors, and scheduling priority. The kernel manages process lifecycle through signals, which are software interrupts that notify processes of events like termination requests, child process completion, or timer expiration.

Step-by-Step Explanation

The following sections cover process management from creation through termination, explaining how signals control process behavior and how job control lets you manage multiple tasks from a single terminal session. Each concept builds toward the monitoring and debugging techniques used in production.

What You Will Learn

By completing this guide you will understand and be able to apply the following concepts:

  • How the Linux kernel creates and manages processes through fork and exec system calls
  • How process states transition between running, sleeping, stopped, and zombie
  • How to inspect process hierarchies using ps, pstree, top, and htop
  • How signals work as an inter-process communication mechanism and what each standard signal does
  • How to send signals to processes using kill, pkill, and killall with appropriate signal choices
  • How job control lets you manage foreground and background processes from a single terminal
  • How to monitor resource consumption including CPU, memory, and file descriptors per process
  • How to write scripts and applications that handle signals gracefully for clean shutdown behavior
  • How to debug process issues including zombie processes, orphaned processes, and resource leaks

These skills connect directly to container management where each container runs as an isolated process tree, to service management with systemd, and to shell scripting where you need to control background processes and handle interrupts properly.

Prerequisites

Before diving into process management, you should be comfortable with basic Linux commands including navigating the filesystem, reading files, and using pipes. Familiarity with running commands in a terminal and basic shell syntax like variables and conditionals will help you follow the scripting examples. Access to a Linux system where you can safely create and terminate processes is essential for practicing these concepts hands-on.

You do not need kernel programming experience or deep systems knowledge. This guide explains concepts from the ground up with practical examples that you can run immediately. Having a second terminal window open while practicing is helpful so you can observe processes from one terminal while controlling them from another.

Process Fundamentals and Lifecycle

A process is the fundamental unit of execution in Linux. Every process has a unique process identifier (PID), a parent process (identified by PPID), an associated user and group, a current working directory, environment variables, open file descriptors, and a memory address space. The kernel maintains all this information in a data structure called the task struct.

Process Creation with Fork and Exec

Linux creates new processes through a two-step mechanism. First, the fork() system call creates an exact copy of the calling process. The new child process gets a new PID but inherits everything else from the parent including open files, environment variables, and memory contents (using copy-on-write for efficiency). Second, the child typically calls exec() to replace its memory image with a new program.

When you type an external command in your shell, the shell forks itself, and the child process executes the command you typed. The parent shell waits for the child to finish before presenting you with a new prompt. This fork-exec pattern is how every external command gets started; shell builtins such as cd and echo run inside the shell process itself without a fork.

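# Observe the fork-exec pattern by tracing a simple command.
# The trailing "echo" stops bash from exec'ing ls directly without forking,
# and trace=process covers the fork/clone/exec family of system calls.
strace -f -e trace=process bash -c "ls /tmp; echo done" 2>&1 | head -20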
 
# View your shell's PID and its parent
echo "Shell PID: $$"
echo "Parent PID: $PPID"
 
# Show the process that started your shell
ps -p $PPID -o pid,ppid,comm
 
# Create a subshell: $$ still reports the parent shell, $BASHPID the subshell itself
(echo "\$\$ inside subshell: $$"; echo "Subshell actual PID: $BASHPID")
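
The same pattern can be observed without strace. The following is a minimal sketch, assuming a POSIX shell and mktemp; the names DEMO_GREETING, child_pid, child_ppid, and child_greeting are made up for illustration. The child started by the shell gets a fresh PID and sees the exported variable it inherited across fork and exec.

```shell
# Hypothetical variable, exported so that child processes inherit it.
DEMO_GREETING="inherited from parent"
export DEMO_GREETING

out=$(mktemp)

# The shell forks; the child execs a new "sh" that reports on itself.
sh -c 'echo "$$ $PPID $DEMO_GREETING"' > "$out"

# First two fields are PIDs, the rest of the line is the greeting.
read -r child_pid child_ppid child_greeting < "$out"
rm -f "$out"

echo "shell PID:          $$"
echo "child PID:          $child_pid"
echo "child's parent PID: $child_ppid"
echo "child saw:          $child_greeting"
```

When this runs as an ordinary script, the child's parent PID should match the shell PID, while the child PID is always different.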

Process States

Every process exists in one of several states at any given moment. Understanding these states helps you diagnose why a process is not behaving as expected:

  • Running (R): The process is either currently executing on a CPU or waiting in the run queue ready to execute. Active computation happens in this state.
  • Sleeping (S): The process is waiting for an event such as I/O completion, a timer, or a signal. Most processes spend the majority of their time sleeping. This is interruptible sleep where signals can wake the process.
  • Uninterruptible Sleep (D): The process is waiting for I/O that cannot be interrupted, typically disk or network operations at the kernel level. Processes in this state cannot be killed until the I/O completes, which is why stuck NFS mounts create unkillable processes.
  • Stopped (T): The process has been paused, usually by receiving SIGSTOP or SIGTSTP (Ctrl+Z). It remains in memory but does not execute until resumed with SIGCONT.
  • Zombie (Z): The process has finished executing but its parent has not yet read its exit status. The process entry remains in the process table consuming a PID slot but no other resources. Zombies indicate a parent that is not properly waiting for its children.

# View process states in the STAT column
ps aux | head -5
 
# Find zombie processes
ps aux | awk '$8 ~ /Z/ {print}'
 
# Find processes in uninterruptible sleep
ps aux | awk '$8 ~ /D/ {print}'
 
# Watch process state changes in real time
watch -n 1 "ps -eo pid,stat,comm | grep -E '^[[:space:]]*[0-9]+ [RSDTZ]'"
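
To watch a single process move between states, the sketch below stops and resumes a throwaway sleep process and reads its state letter from /proc. It assumes a Linux /proc filesystem and a sleep that accepts fractional seconds (GNU coreutils or BusyBox); proc_state and wait_for_state are hypothetical helper names.

```shell
# Read the one-letter state (third field) from /proc/<pid>/stat.
# Assumes a command name without spaces, which holds for "sleep".
proc_state() {
    awk '{print $3}' "/proc/$1/stat"
}

# Poll until the process reaches the expected state (about 5 seconds at most),
# then print whatever state it is in.
wait_for_state() {
    tries=0
    while [ "$(proc_state "$1")" != "$2" ] && [ "$tries" -lt 50 ]; do
        sleep 0.1
        tries=$((tries + 1))
    done
    proc_state "$1"
}

sleep 30 &
demo_pid=$!

sleeping_state=$(wait_for_state "$demo_pid" S)
kill -s STOP "$demo_pid"
stopped_state=$(wait_for_state "$demo_pid" T)
kill -s CONT "$demo_pid"
resumed_state=$(wait_for_state "$demo_pid" S)

kill -s TERM "$demo_pid"
wait "$demo_pid" 2>/dev/null || true   # reap the child so no zombie is left

echo "sleeping=$sleeping_state stopped=$stopped_state resumed=$resumed_state"
```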

Process Hierarchy and Orphans

Every process except PID 1 (init or systemd) has a parent. When a parent process terminates before its children, those children become orphans. Orphans are adopted by the init process, or by the nearest ancestor that has registered itself as a subreaper (a systemd user instance, for example), and the adopting process is responsible for reaping them when they exit. This adoption mechanism prevents zombie accumulation when parent processes crash.

You can visualize the process hierarchy to understand relationships between services, shells, and applications:

# Display the full process tree
pstree -p
 
# Show the tree rooted at a specific PID
pstree -p 1
 
# Show the tree for your current user
pstree -u $(whoami)
 
# Find all children of a specific process
ps --ppid 1234 -o pid,comm
 
# Show ancestry of a specific process
ps -p 5678 -o pid,ppid,comm
cat /proc/5678/status | grep -E "^(Pid|PPid|Name)"
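
The ps command above shows only one level of ancestry. As a rough sketch, assuming a Linux /proc filesystem, the hypothetical helper print_ancestry below follows the PPid field upwards until it runs out of parents. Only the first word of each process name is printed.

```shell
# Print "PID NAME" for a process and each of its ancestors, child first.
# Reads the Name and PPid fields from /proc/<pid>/status.
print_ancestry() {
    current=$1
    while [ -n "$current" ] && [ "$current" -gt 0 ] && [ -r "/proc/$current/status" ]; do
        name=$(awk '/^Name:/ {print $2}' "/proc/$current/status")
        echo "$current $name"
        current=$(awk '/^PPid:/ {print $2}' "/proc/$current/status")
    done
}

# Walk upwards from the current shell.
ancestry=$(print_ancestry "$$")
echo "$ancestry"
```

Inside a container the chain usually ends at the container's own PID 1, whose parent is reported as 0.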

Signals In Depth

Signals are software interrupts delivered to processes. They provide a mechanism for the kernel, other processes, or the process itself to notify a process that an event has occurred. Each signal has a default action (terminate, ignore, stop, or continue) that the process can override by installing a signal handler, except for SIGKILL and SIGSTOP which cannot be caught or ignored.

Standard Signal Reference

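The most important signals for daily engineering work are listed below. The numbers shown are the values on x86 and ARM; a few of them differ on other architectures, so prefer signal names over numbers in scripts: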

  • SIGTERM (15): Polite termination request. The process can catch this signal, perform cleanup (close database connections, flush buffers, remove temporary files), and exit gracefully. This is the default signal sent by kill.
  • SIGKILL (9): Immediate termination. The kernel removes the process without giving it a chance to clean up. Use this only when SIGTERM fails because it can leave resources in an inconsistent state.
  • SIGINT (2): Interrupt from keyboard, sent when you press Ctrl+C. Similar to SIGTERM but specifically indicates user-initiated interruption.
  • SIGTSTP (20): Stop from keyboard, sent when you press Ctrl+Z. Suspends the process which can be resumed later.
  • SIGCONT (18): Continue a stopped process. Sent automatically when you use fg or bg commands.
  • SIGHUP (1): Hangup signal, originally sent when a terminal disconnected. Many daemons interpret this as a request to reload configuration files.
  • SIGCHLD (17): Sent to a parent process when a child terminates or stops. Used internally by shells and process managers.
  • SIGUSR1 (10) and SIGUSR2 (12): User-defined signals with no predefined meaning. Applications use these for custom purposes like toggling debug logging or triggering graceful restarts.
  • SIGPIPE (13): Sent when a process writes to a pipe with no reader. Common in pipelines when the downstream command exits early.
  • SIGALRM (14): Timer signal delivered after a specified interval. Used for implementing timeouts.

# List all available signals on your system
kill -l
 
# Send SIGTERM (default) to a process
kill 12345
 
# Send a specific signal by name
kill -SIGHUP 12345
 
# Send a specific signal by number
kill -9 12345
 
# Send a signal to all processes matching a pattern
pkill -SIGUSR1 -f "node server.js"
 
# Send a signal to all processes owned by a user
pkill -SIGTERM -u deployuser
 
# Send SIGTERM to a process group (negative PID)
kill -- -12345
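
To see what a signal did to a process, look at the exit status the shell reports for it. The sketch below sends several signals to throwaway sleep processes and decodes each status back into a signal name with kill -l. It is an illustration only; SIGINT is left out because non-interactive shells start background jobs with SIGINT ignored.

```shell
results=""
for sig in TERM HUP USR1 KILL; do
    sleep 30 &
    target=$!
    kill -s "$sig" "$target"

    status=0
    wait "$target" 2>/dev/null || status=$?

    # Shells report "terminated by signal N" as exit status 128 + N.
    number=$((status - 128))
    name=$(kill -l "$number")
    echo "sent $sig -> exit status $status -> signal $number ($name)"
    results="$results $sig:$name"
done
```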

Signal Handling in Scripts

Signals provide inter-process communication that controls process behavior without requiring the process to actively poll for instructions. Understanding signal semantics helps you implement graceful shutdown handlers and debug unresponsive processes.

Writing scripts that handle signals properly ensures clean shutdown behavior. This is critical for deployment scripts, long-running batch jobs, and service wrappers:

#!/bin/bash
# A script that handles signals gracefully
 
TEMP_DIR=$(mktemp -d)
PID_FILE="/tmp/myworker.pid"
echo $$ > "$PID_FILE"
 
# Cleanup function called on exit
cleanup() {
    echo "Cleaning up temporary files..."
    rm -rf "$TEMP_DIR"
    rm -f "$PID_FILE"
    echo "Cleanup complete. Exiting."
    exit 0
}
 
# Register signal handlers
trap cleanup SIGTERM SIGINT
trap "echo 'Received SIGHUP, reloading config...'; source /etc/myapp/config" SIGHUP
trap "echo 'Ignoring SIGUSR1'" SIGUSR1
 
echo "Worker started with PID $$"
echo "Temp directory: $TEMP_DIR"
 
# Main work loop
counter=0
while true; do
    counter=$((counter + 1))
    echo "Working... iteration $counter"
    
    # Simulate work
    sleep 5 &
    wait $!  # Wait is interruptible by signals, unlike sleep alone
done
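
One way to check that a handler like this really runs is to signal a worker and inspect what it leaves behind. The sketch below is a reduced stand-in for the script above, not the script itself: work_dir, ready_file, and scratch_file are hypothetical names, and the ready file exists only so the parent does not send SIGTERM before the trap is installed.

```shell
work_dir=$(mktemp -d)
ready_file="$work_dir/ready"
scratch_file="$work_dir/scratch"

(
    # Worker: the trap-plus-wait pattern, reduced to its core.
    trap 'rm -f "$scratch_file"; exit 0' TERM
    : > "$scratch_file"
    : > "$ready_file"          # tell the parent the trap is installed
    while true; do
        sleep 1 &
        wait $!
    done
) &
worker_pid=$!

# Wait (up to about 5 seconds) until the worker has installed its handler.
tries=0
while [ ! -e "$ready_file" ] && [ "$tries" -lt 50 ]; do
    sleep 0.1
    tries=$((tries + 1))
done

kill -s TERM "$worker_pid"
worker_status=0
wait "$worker_pid" || worker_status=$?

if [ -e "$scratch_file" ]; then scratch_left=yes; else scratch_left=no; fi
echo "worker exit status: $worker_status, scratch file left behind: $scratch_left"
rm -rf "$work_dir"
```

Because the handler calls exit 0, the worker reports a clean exit instead of the 143 an unhandled SIGTERM would produce.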

Sending Signals to Process Groups

When you start a pipeline or a group of related processes, they often share a process group ID (PGID). Sending a signal to the negative of the PGID delivers it to every process in the group. The same mechanism is behind Ctrl+C: the terminal driver sends SIGINT to the entire foreground process group, which is why a whole pipeline stops at once:

# Find the process group of a command
ps -o pid,pgid,comm -p 12345
 
# Send SIGTERM to an entire process group
kill -- -$(ps -o pgid= -p 12345 | tr -d ' ')
 
# Start a process in a new session, which also gives it its own process group
setsid ./long-running-task.sh &
 
# View session and process group leaders
ps -eo pid,pgid,sid,comm | head -20

Job Control

Job control lets you manage multiple processes from a single terminal session. You can start processes in the background, suspend running processes, bring them back to the foreground, and check their status. This is particularly useful during development when you need to run a server, watch logs, and execute commands simultaneously.

Background and Foreground Processes

Running commands in parallel or in the background enables concurrent execution that reduces total processing time. Understanding job control and wait semantics ensures your scripts coordinate parallel tasks correctly without race conditions.

# Start a command in the background
npm run dev &
 
# The shell prints the job number and PID
# [1] 23456
 
# List all background jobs
jobs -l
 
# Bring job 1 to the foreground
fg %1
 
# Suspend the current foreground process (Ctrl+Z)
# Then resume it in the background
bg %1
 
# Start a process that survives terminal disconnect
nohup node server.js > /var/log/server.log 2>&1 &
 
# Disown a running background job so it survives shell exit
disown %1
 
# Wait for all background jobs to complete
wait
 
# Wait for a specific background PID
wait 23456
echo "Process exited with status: $?"
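
When several jobs run in parallel, waiting on each PID separately lets a script tell which ones failed. This is a minimal sketch with three made-up tasks, the second of which exits with status 3; it assumes a sleep that accepts fractional seconds.

```shell
# Three hypothetical tasks; the second one fails with exit status 3.
(sleep 0.3; exit 0) &
pid_a=$!
(sleep 0.1; exit 3) &
pid_b=$!
(sleep 0.2; exit 0) &
pid_c=$!

failed=0
statuses=""
for pid in "$pid_a" "$pid_b" "$pid_c"; do
    status=0
    wait "$pid" || status=$?     # exit status of that specific job
    statuses="$statuses $status"
    if [ "$status" -ne 0 ]; then
        failed=$((failed + 1))
    fi
done

echo "exit statuses:$statuses"
echo "failed jobs: $failed"
```

A bare wait with no arguments would also block until all jobs finish, but it discards their individual exit statuses.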

Practical Job Control Workflow

Job control lets you manage multiple processes from a single terminal session, suspending, resuming, and switching between foreground and background execution. This capability is essential when working on remote servers over SSH connections.

A common development workflow involves running multiple services simultaneously. Here is how you might manage a frontend dev server, a backend API, and a database watcher from one terminal:

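# Start the database watcher in the background
./scripts/watch-migrations.sh &
DB_PID=$!
 
# Start the backend API in an explicit subshell so the cd stays local;
# exec makes $! the PID of npm itself rather than of a wrapper subshell
(cd backend && exec npm run dev) &
API_PID=$!
 
# Start the frontend (this one we keep in foreground for logs)
(cd frontend && npm run dev)
 
# After stopping the frontend with Ctrl+C, stop the background services.
# These variables exist only in this shell, not in another terminal.
kill $API_PID $DB_PID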
 
# Or kill all jobs from this shell
kill $(jobs -p)
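
In a script, the same cleanup can be automated with an EXIT trap so that background services never outlive the script. The sketch below uses sleep as a stand-in for the real services and runs the "script" inside a command substitution so that its traps stay contained; stop_all, pids, and tracked are hypothetical names.

```shell
tracked=$(
    pids=""

    # Terminate every tracked process, then reap it.
    stop_all() {
        for pid in $pids; do kill -s TERM "$pid" 2>/dev/null || true; done
        for pid in $pids; do wait "$pid" 2>/dev/null || true; done
    }
    trap stop_all EXIT       # runs on normal exit and after "exit" in a signal trap
    trap 'exit 130' INT
    trap 'exit 143' TERM

    # Stand-ins for the real services; output is discarded.
    sleep 60 >/dev/null &
    pids="$pids $!"
    sleep 60 >/dev/null &
    pids="$pids $!"

    echo "$pids"
    # Foreground work would run here. Leaving the subshell fires the EXIT trap.
)

# Back in the parent: none of the tracked PIDs should still be alive.
still_running=0
for pid in $tracked; do
    if kill -0 "$pid" 2>/dev/null; then
        still_running=$((still_running + 1))
    fi
done
echo "tracked:$tracked, still running: $still_running"
```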

Resource Monitoring Per Process

Resource limits prevent a single service from consuming all available system resources and impacting other workloads. Configuring appropriate CPU and memory constraints through cgroups provides isolation without the overhead of full virtualization.

Understanding how much CPU, memory, and I/O each process consumes helps you identify bottlenecks, detect memory leaks, and plan capacity. Linux provides several tools for per-process resource inspection.

CPU and Memory with Top and Htop

top and htop give a live, sortable view of per-process CPU and memory usage, while ps and pidstat capture point-in-time or sampled figures that are easier to use in scripts.

# Interactive process monitor sorted by CPU
top
 
# Useful top commands while running:
# P - sort by CPU
# M - sort by memory
# k - kill a process
# f - choose display fields
# 1 - show per-CPU usage
 
# Show only processes for a specific user
top -u deployuser
 
# Batch mode for scripting (run once and exit)
top -bn1 | head -20
 
# htop provides a better interface with tree view
htop
 
# Show specific process resource usage
ps -p 12345 -o pid,pcpu,pmem,rss,vsz,comm
 
# Track a process over time
pidstat -p 12345 1 10  # sample every 1 second, 10 times

Memory Analysis

Two figures matter most when analyzing memory: RSS (resident set size), the physical memory a process currently occupies, and VSZ (virtual size), the address space it has reserved. RSS that keeps growing under a steady workload is the usual sign of a leak.

# Detailed memory map of a process
cat /proc/12345/maps | head -20
 
# Memory summary from status file
cat /proc/12345/status | grep -E "^(VmSize|VmRSS|VmSwap|Threads)"
 
# System-wide memory overview
free -h
 
# Find the top 10 memory-consuming processes
ps aux --sort=-%mem | head -11
 
# Check for memory leaks by watching RSS growth
watch -n 5 "ps -p 12345 -o pid,rss,vsz,comm"
 
# View shared memory segments
ipcs -m
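
For a log-friendly alternative to watch, RSS can be sampled straight from /proc. This is a small sketch assuming a Linux /proc filesystem and a sleep that accepts fractional seconds; rss_kb and sample_rss are hypothetical helper names, and the current shell is sampled only because it is guaranteed to exist.

```shell
# Print the resident set size (VmRSS, in kB) of a process from /proc.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Print <samples> timestamped RSS readings, <interval> seconds apart.
sample_rss() {
    target=$1
    samples=$2
    interval=$3
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "$(date +%T) pid=$target rss_kb=$(rss_kb "$target")"
        i=$((i + 1))
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

sample_rss "$$" 3 0.2
current_rss=$(rss_kb "$$")
```

Redirecting the output to a file over a longer period gives a simple record of whether RSS keeps climbing.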

File Descriptors and Open Files

File descriptor inspection reveals which files, sockets, and pipes a process has open. This information diagnoses resource leaks, identifies which process holds a lock on a file, and verifies that services are listening on expected ports.

Each process has a limit on how many files it can open simultaneously. Hitting this limit causes "too many open files" errors that crash applications:

# Show open files for a process
lsof -p 12345
 
# Count open file descriptors
ls /proc/12345/fd | wc -l
 
# Show the file descriptor limit for a process
cat /proc/12345/limits | grep "open files"
 
# Check system-wide file descriptor usage
cat /proc/sys/fs/file-nr
 
# Find processes with the most open files
for fd_dir in /proc/[0-9]*/fd; do
    echo "$(ls "$fd_dir" 2>/dev/null | wc -l) $fd_dir"
done | sort -rn | head -10
 
# Increase the limit for the current shell session
ulimit -n 65536
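
The effect of opening and closing a file shows up directly in the descriptor count. The sketch below measures the current shell, assuming a Linux /proc filesystem and that descriptor 9 is not already in use; fd_count is a hypothetical helper.

```shell
# Count the open file descriptors of a process via /proc.
fd_count() {
    ls "/proc/$1/fd" | wc -l
}

before=$(fd_count "$$")
exec 9</dev/null              # open descriptor 9 in the current shell
during=$(fd_count "$$")
exec 9<&-                     # close it again
after=$(fd_count "$$")

soft_limit=$(ulimit -n)
echo "open descriptors: before=$before during=$during after=$after (soft limit $soft_limit)"
```

A descriptor leak looks like the "during" step repeated without the matching close: the count creeps toward the soft limit until opens start failing.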

Debugging Process Issues

Systematic troubleshooting follows a diagnostic workflow that narrows the problem space efficiently. Starting with service status and journal output before examining configuration and dependencies resolves most issues quickly.

Production systems inevitably encounter process-related problems. Knowing how to diagnose zombie processes, trace system calls, and analyze process behavior saves hours of debugging time.

Dealing with Zombie Processes

Zombies are processes that have exited but whose parent has not called wait() to collect their exit status. A few zombies are harmless, but thousands indicate a parent process with a bug:

# Find zombie processes
ps aux | awk '$8 ~ /^Z/'
 
# Find the parent of zombie processes
ps -eo pid,ppid,stat,comm | awk '$3~/Z/ {print "Zombie PID:",$1,"Parent:",$2}'
 
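# Zombies are already dead, so signalling them does nothing.
# The fix is usually one of:
# 1. Fix the parent to properly wait() for children
# 2. Kill the parent (zombies get adopted by init which reaps them)
# 3. Send SIGCHLD to the parent to remind it to reap; this only helps if the
#    parent has a SIGCHLD handler (replace <parent_pid> with the real PID)
kill -SIGCHLD <parent_pid>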
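
A zombie is easy to create on purpose, which helps when testing detection scripts. In the sketch below, an inner shell starts a short-lived child and then replaces itself with sleep, a program that never calls wait(). It assumes a Linux /proc filesystem; count_zombie_children and careless_parent are hypothetical names.

```shell
# Count the zombie children of a given parent by scanning /proc/<pid>/status.
count_zombie_children() {
    zombie_parent=$1
    zombie_count=0
    for status_file in /proc/[0-9]*/status; do
        if awk -v parent="$zombie_parent" '
            /^State:/ { zombie = ($2 == "Z") }
            /^PPid:/  { ppid = $2 }
            END       { exit !(zombie && ppid == parent) }
        ' "$status_file" 2>/dev/null; then
            zombie_count=$((zombie_count + 1))
        fi
    done
    echo "$zombie_count"
}

# The inner shell forks a short-lived child, then execs "sleep", which
# never waits for children. The exited child stays behind as a zombie.
sh -c 'sleep 0.2 & exec sleep 10' &
careless_parent=$!

sleep 1
zombies_before=$(count_zombie_children "$careless_parent")

# Killing the parent hands the zombie to init (or a subreaper) for reaping.
kill -s TERM "$careless_parent"
wait "$careless_parent" 2>/dev/null || true

echo "zombie children while the parent was alive: $zombies_before"
```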

Tracing System Calls

System call tracing reveals the exact kernel interactions a process makes, providing definitive answers about file access patterns, network operations, and permission failures that application-level logging cannot capture.

When a process behaves unexpectedly, tracing its system calls reveals exactly what it is doing at the kernel level:

# Trace all system calls of a running process
strace -p 12345
 
# Trace only file-related system calls
strace -p 12345 -e trace=file
 
# Trace network-related calls
strace -p 12345 -e trace=network
 
# Trace a command from start with timing
strace -T -o /tmp/trace.log ls -la /var/log
 
# Count system calls by type
strace -c -p 12345
# Press Ctrl+C after a few seconds to see the summary
 
# Follow child processes (forks)
strace -f -p 12345

Process Priority and Scheduling

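Linux's default scheduler shares CPU time among runnable processes, weighted by each process's nice value. Nice values range from -20 (highest priority) to 19 (lowest). Any user can raise the nice value of their own processes to lower their priority, while lowering it normally requires root: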

# Start a process with lower priority (higher nice value)
nice -n 10 ./heavy-computation.sh
 
# Change priority of a running process
renice -n 5 -p 12345
 
# Start a CPU-intensive task at lowest priority
nice -n 19 make -j$(nproc)
 
# View nice values of all processes
ps -eo pid,ni,comm | sort -k2 -n
 
# Set real-time scheduling (requires root)
chrt -f 50 ./realtime-task
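
The value passed to nice -n is an adjustment relative to the caller, not an absolute priority. A quick way to see this, assuming GNU coreutils, where nice with no command prints the niceness it is running at:

```shell
# With no command, GNU "nice" prints the niceness it is running at.
base_nice=$(nice)

# "-n 5" is relative: the child runs at the current niceness plus 5,
# capped at the maximum of 19.
child_nice=$(nice -n 5 nice)

echo "shell niceness: $base_nice"
echo "child niceness: $child_nice"
```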

Real-World Use Cases

Process and signal management appears in numerous engineering scenarios. When deploying applications, you send SIGTERM to the old process and wait for graceful shutdown before starting the new version. This pattern is called graceful restart and is fundamental to zero-downtime deployments. Container orchestrators like Kubernetes send SIGTERM to pods during rolling updates, giving your application a configurable grace period to finish in-flight requests.

In CI/CD pipelines running on Jenkins, build processes spawn child processes for compilation, testing, and packaging. If a build times out, the CI system sends SIGTERM to the process group, and your build scripts need signal handlers to clean up temporary artifacts. Without proper signal handling, you end up with orphaned processes consuming resources on build agents.

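When running applications in Docker containers, the container's PID 1 process must handle signals correctly because Docker sends SIGTERM during docker stop. The kernel does not apply default signal actions to PID 1 of a container, so a SIGTERM that PID 1 has no handler for is silently ignored. If PID 1 does not handle the signal and forward it to child processes, the container hangs for the stop timeout (10 seconds by default) before Docker sends SIGKILL, causing ungraceful shutdowns and potential data loss.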

Monitoring systems track process metrics to detect anomalies. A steadily growing RSS (resident set size) indicates a memory leak. An increasing file descriptor count suggests connections are not being closed. A growing number of zombie children points to a reaping bug. These metrics feed into alerting systems that notify engineers before problems impact users.

Best Practices

Follow these guidelines for effective process management in production environments:

  • Always try SIGTERM before SIGKILL. Give processes a chance to clean up resources, flush data, and close connections gracefully. Only escalate to SIGKILL after a reasonable timeout (typically 10-30 seconds).
  • Write signal handlers in your applications that perform cleanup on SIGTERM. Close database connections, finish writing to files, deregister from service discovery, and drain in-flight requests before exiting.
  • Use process groups and sessions to manage related processes together. When you need to terminate a service and all its children, send the signal to the process group rather than hunting individual PIDs.
  • Monitor file descriptor counts and set appropriate limits. The default limit of 1024 is too low for many server applications. Set limits in systemd unit files or /etc/security/limits.conf rather than relying on ulimit in scripts.
  • Never ignore zombie processes in production. While a few zombies are harmless, they indicate a bug in the parent process that will eventually exhaust the PID space if left unchecked.
  • Use nohup or disown for processes that must survive terminal disconnection, but prefer systemd services for anything that should run permanently. Systemd provides automatic restart, logging, and resource control that ad-hoc background processes lack.
  • Set appropriate nice values for batch processing jobs so they do not starve interactive services of CPU time. Build processes, log rotation, and backup scripts should run at lower priority than user-facing services.
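
The first guideline, SIGTERM followed by SIGKILL after a timeout, can be sketched as a small shell function. This version is an illustration under stated assumptions rather than a hardened implementation: stop_gracefully is a hypothetical name, and because it relies on wait it only works for children of the current shell.

```shell
# Send SIGTERM, then SIGKILL if the child is still alive after the grace period.
# Returns the child's exit status (143 after SIGTERM, 137 after SIGKILL).
stop_gracefully() {
    target=$1
    grace=$2
    kill -s TERM "$target" 2>/dev/null || return 0

    # Watchdog: escalates to SIGKILL once the grace period is over.
    ( sleep "$grace"; kill -s KILL "$target" 2>/dev/null ) &
    watchdog=$!

    stop_status=0
    wait "$target" 2>/dev/null || stop_status=$?

    # The target is gone, so cancel the watchdog before it can fire.
    kill -s TERM "$watchdog" 2>/dev/null || true
    wait "$watchdog" 2>/dev/null || true
    return "$stop_status"
}

# A cooperative process exits on SIGTERM.
sleep 30 &
polite_status=0
stop_gracefully "$!" 2 || polite_status=$?

# A stubborn process ignores SIGTERM and has to be killed.
sh -c 'trap "" TERM; exec sleep 30' &
stubborn_pid=$!
sleep 1   # give it time to start ignoring SIGTERM
stubborn_status=0
stop_gracefully "$stubborn_pid" 1 || stubborn_status=$?

echo "polite=$polite_status stubborn=$stubborn_status"
```

Waiting on the child rather than polling with kill -0 matters here: an exited child that has not yet been reaped still answers kill -0, so a polling loop would always run into the timeout.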

Common Mistakes

Engineers frequently encounter these pitfalls when working with processes and signals:

  • Using kill -9 as the first resort. SIGKILL cannot be caught, so the process cannot clean up. This leads to corrupted files, leaked resources, and inconsistent state. Always start with SIGTERM and wait.
  • Forgetting that bash defers a trap handler until the current foreground command finishes. If the script is blocked in a long sleep, a SIGTERM handler does not run until that sleep ends. Use sleep N & wait $! instead so that signals interrupt the wait and trigger your trap handlers immediately.
  • Not handling SIGPIPE in applications that write to pipes or sockets. When the reader disconnects, SIGPIPE terminates your process by default. Either ignore SIGPIPE or handle it explicitly.
  • Running long-lived processes with & in a shell script without tracking their PIDs. If the script exits abnormally, those background processes become orphans with no management. Store PIDs and clean them up in a trap handler.
  • Confusing process groups with sessions. A session can contain multiple process groups, and signals sent to a process group only affect that group, not the entire session. Use ps -eo pid,pgid,sid,comm to understand the hierarchy.
  • Assuming all processes respond to signals immediately. A process in uninterruptible sleep (state D) does not act on signals, SIGKILL included, until the blocking I/O completes; the signal stays pending in the meantime. This commonly happens with NFS mounts or failing disk operations.
  • Not setting a timeout when waiting for graceful shutdown. If you send SIGTERM and wait indefinitely, a buggy process that ignores the signal will hang your deployment. Always implement a timeout followed by SIGKILL as a fallback.
  • Ignoring the exit status of child processes. When a child is terminated by a signal, the shell reports its exit status as 128 plus the signal number (e.g., 137 for SIGKILL, 143 for SIGTERM). Check exit codes to distinguish between normal completion and signal-induced termination.

Summary

Linux processes and signals form the foundation of how work gets done on any Linux system. Every command you run, every service you deploy, and every container you start creates processes that the kernel schedules, monitors, and eventually cleans up. Signals provide the communication mechanism between processes and between the kernel and processes, enabling graceful shutdown, configuration reload, and process control.

Mastering process management means understanding the fork-exec lifecycle, recognizing process states, using job control effectively, and writing applications that handle signals properly. These skills are directly applicable when managing systemd services, writing shell scripts that coordinate multiple processes, debugging Docker containers that fail to stop gracefully, and operating production services on AWS infrastructure. Combined with the networking tools covered in the companion guide, you have a solid toolkit for diagnosing and resolving most process-related issues in your Linux environments.
