Linux OS and Fundamentals

Linux is the layer that turns hardware into process isolation, files, sockets, memory mappings, timers, signals, device drivers, and resource accounting. Developers need it because code eventually becomes processes, syscalls, pages, sockets, and file descriptors. Operators need it because incidents show up as CPU pressure, memory pressure, IO waits, scheduler latency, kernel logs, device-driver behavior, access failures, broken scheduled jobs, SSH lockouts, and host identity drift.

Concrete commands, configs, and failure patterns are embedded in the topic pages where the operating-system behavior appears: systemd examples live with systemd pages, firewall examples live with netfilter pages, cgroup examples live with containerization and memory pages, and packet-performance examples live with the kernel networking pages.

Mental Model

A Linux system is a set of cooperating boundaries:

Boundary	What It Controls	What Breaks
Process	Execution context, PID, open files, signal handling, credentials.	Stuck workers, zombies, runaway children, bad signal handling.
Virtual memory	Per-process address space mapped to RAM, files, anonymous pages, and swap.	OOM kills, paging storms, leaks, fragmentation.
Scheduler	Which runnable tasks get CPU time.	High load, latency spikes, CPU starvation.
VFS	Common file abstraction over filesystems, devices, sockets, procfs, sysfs.	File descriptor leaks, mount issues, inode pressure.
Network stack	Interfaces, routes, sockets, queues, conntrack, firewall hooks.	Drops, retransmits, queue buildup.
Device drivers	Kernel code that talks to hardware.	GPU resets, NIC errors, storage timeouts.
Cgroups and namespaces	Resource limits and isolation for containers and services.	Container OOM, throttling, misleading host-vs-container metrics.

The kernel is not only “the thing under user space.” It is the shared arbiter for memory, CPU, IO, devices, and security boundaries.

Critical Subtopics

Topic	Why It Matters
Boot and Userspace	Explains how firmware, bootloader, kernel, initramfs, root mounts, and PID 1 turn hardware into services.
Network Boot and Automated Provisioning	Covers PXE, UEFI HTTP Boot, iPXE, DHCP, TFTP, Ubuntu autoinstall, cloud-init NoCloud, Kickstart, inventory-driven provisioning, and reinstall-loop prevention.
Kernel Modules and Devices	Covers loadable and built-in modules, `modprobe`, module aliases and parameters, sysfs, devtmpfs, udev, device nodes, initramfs, Secure Boot, and DKMS.
Filesystems and IO	Connects application file operations to VFS, page cache, mounts, inodes, block devices, and durable writes.
Block Devices and Partitioning	Covers `/dev` block devices, NVMe/SCSI naming, GPT, UUIDs, labels, udev, stable paths, and safe disk identity.
Mounts and fstab	Covers persistent mounts, `findmnt`, `/etc/fstab`, systemd mount units, automounts, bind mounts, and mount options.
Mount Namespaces and Propagation	Covers per-process mount views, bind mounts, shared subtree propagation, chroot, pivot_root, overlay, tmpfs, and container mount debugging.
ext4, XFS, and Repair	Covers ext4/XFS differences, inodes, quotas, journals, online growth, repair tools, TRIM, and safe repair workflows.
Storage Drives, RAID, and Database Performance	Covers SSD, HDD, NVMe, RAID 0/1/5/6/10, striping, mirroring, parity, disk-failure recovery, rebuild risk, and PostgreSQL and Elasticsearch storage tradeoffs.
RAID, Multipath, and Device Mapper	Covers md RAID, dm-crypt/LUKS, device mapper, multipath, NVMe multipath, WWIDs, and layered storage troubleshooting.
Storage Health and Performance	Covers `iostat`, SMART, NVMe health, kernel I/O errors, discard, queueing, latency, saturation, and failure response.
Containerization, OCI, and VMs	Explains containers versus VMs, OCI standards, image layers, overlayfs, namespaces, cgroups, bridge networking, macOS/Windows behavior, KVM, and hypervisors.
Network Stack	Shows how sockets, routes, namespaces, Linux bridges, netfilter, conntrack, queues, and NICs move packets.
Kernel Network Performance	Covers NAPI, softirq, NIC rings, RSS, RPS, RFS, XPS, offloads, qdisc, drops, and packet-processing bottlenecks.
TCP Kernel Tuning	Covers listen queues, `somaxconn`, SYN backlog, socket buffers, ephemeral ports, TIME_WAIT, keepalives, and conntrack limits.
eBPF and Tracing	Covers eBPF programs, maps, tracepoints, kprobes, uprobes, bpftrace, BCC, `tc`, XDP, and production-safe tracing.
Memory Pressure and OOM	Covers RSS, VSZ, page cache, slab, THP, NUMA, swap, cgroup memory, PSI, and OOM killer behavior.
System Call Debugging	Covers `strace`, `errno`, blocking syscalls, file and socket syscalls, `poll`, `epoll`, and syscall-layer failure evidence.
Security Controls	Covers capabilities, seccomp, AppArmor, SELinux, PAM, auditd, sudoers, file capabilities, setuid, and container boundaries.
Package and Boot Recovery	Covers broken packages, bad kernels, initramfs failures, GRUB rescue, emergency mode, chroot repair, and rollback.
Performance Triage Runbooks	Covers high CPU, high load, memory pressure, disk latency, softirq saturation, file descriptor exhaustion, and cgroup throttling.
Sockets and IPC	Covers TCP, UDP, Unix domain sockets, socket files, listen queues, buffers, file descriptors, pipes, shared memory, and IPC troubleshooting.
Processes and Threads	Covers tasks, PIDs, TIDs, thread groups, fork/exec/wait, zombies, signals, PID namespaces, scheduling, and thread debugging.
Debian and Ubuntu Operations	Uses Ubuntu Server as the default distro lens for packages, services, Netplan, UFW, logs, and certificates.
Users, Permissions, and sudo	Covers UIDs, GIDs, `/etc/passwd`, `/etc/shadow`, mode bits, ACLs, service users, and sudo policy.
SSH Access	Covers OpenSSH server policy, keys, host keys, PAM, account state, firewalls, and login troubleshooting.
Logs and Observability	Covers journald, kernel logs, `/var/log`, log rotation, metrics, PSI, and incident evidence collection.
Scheduled Automation	Covers cron jobs, crontab formats, anacron, systemd timers, job safety, idempotency, environment pitfalls, locking, and logs.
Backup and File Transfer	Covers `rsync`, `scp`, snapshot backups, consistency, restore testing, exclusions, permissions, and automation.
Time, Hostname, and Identity	Covers time sync, timezone, hostname, DNS identity, machine-id, certificates, and incident timelines.
Linux GPU Drivers	Covers AMD and NVIDIA GPU stacks, kernel modules, firmware, Mesa, ROCm, CUDA, `nvidia-smi`, Secure Boot, containers, and troubleshooting.
LVM	Covers the storage mapping layer behind many Linux volumes.
systemd	Covers service lifecycle, dependencies, logs, cgroups, timers, and resource controls.
systemd Networking	Covers systemd-networkd, systemd-resolved, `.network`, `.netdev`, `.link`, network targets, and wait-online behavior.
systemd Socket Activation	Covers `.socket` units, `ListenStream`, `ListenDatagram`, `Accept=yes`, service activation, and network service controls.
resolv.conf	Covers local DNS resolver behavior, search paths, and systemd-resolved interactions.

Unless a page is explicitly distro-neutral, examples prefer Debian-family Linux with Ubuntu Server as the default operational target. That means apt, dpkg, systemctl, journalctl, Netplan, UFW, /etc/ssl/certs, update-ca-certificates, OpenSSH, sudo, timedatectl, and systemd timers are the expected baseline tools.

Processes, Forking, Exec, and Wait

A Linux process is an execution context with a PID, virtual address space, file descriptor table, credentials, signal dispositions, and scheduling state.

Common lifecycle:

1. A process calls fork() or clone() to create a child.
2. The child often calls execve() to replace its memory image with a new program.
3. The parent calls wait() or waitpid() to collect the child's exit status, which also reaps the child.

fork() is efficient because Linux uses copy-on-write memory. The child does not immediately copy all parent memory. Instead, parent and child initially share physical pages mapped read-only. When either side writes to a shared page, the kernel copies just that page for the writer.
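The lifecycle above can be sketched in a few lines. This is a minimal illustration, not a production process launcher: the helper name run_child is hypothetical, and os.waitstatus_to_exitcode assumes Python 3.9 or newer.

```python
import os
import sys


def run_child(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the new program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed; exit without running the parent's cleanup handlers.
            os._exit(127)
    # Parent: block until the child exits, which also reaps it.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    code = run_child([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exit code:", code)
```

The child uses os._exit() on the failure path so that it never returns into the parent's code after a failed exec.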

Important process states:

State	Meaning
Running / runnable	On CPU or waiting for CPU.
Interruptible sleep	Waiting for an event and can be interrupted by signals.
Uninterruptible sleep (`D`)	Usually waiting on IO or a kernel path that cannot be interrupted; signals are generally not handled until the wait ends.
Stopped	Paused by job control or tracing.
Zombie (`Z`)	Exited, but parent has not reaped it.

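Zombies do not consume CPU or normal memory, but each one still holds a process-table entry and its PID. A growing number of zombies means the parent is not reaping its children, so the fix belongs in the parent: a zombie has already exited and cannot be killed.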
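A zombie can be observed directly on a Linux host. The sketch below assumes /proc is mounted; the function names are hypothetical. It uses waitid() with WNOWAIT to wait for the child's exit without reaping it, reads the state letter, and only then reaps.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # comm is wrapped in parentheses and may contain spaces,
    # so take the first field after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def observe_zombie():
    """Create a short-lived child; report its state before reaping
    and whether its /proc entry still exists afterwards."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately
    # Wait for the exit but leave the child unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)
    os.waitpid(pid, 0)  # reap: the kernel can now release the entry
    return before, os.path.exists(f"/proc/{pid}")


if __name__ == "__main__":
    print(observe_zombie())
```

Between the exit and the waitpid() call the child should be in state Z; after reaping, its /proc entry disappears.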

File Descriptors

File descriptors point at open files, sockets, pipes, eventfds, epoll instances, devices, and more. Many production failures are FD failures:

a process hits its per-process limit (ulimit -n, RLIMIT_NOFILE),
the system approaches the global file-handle limit (fs.file-max),
an app leaks sockets,
logs rotate but the process keeps the old, deleted file open,
child processes inherit FDs that should have been marked close-on-exec.

Useful commands:

ls -l /proc/<pid>/fd
lsof -p <pid>
ulimit -n
cat /proc/sys/fs/file-nr
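The same checks can be made from inside a process. The following is a rough sketch that assumes a Linux /proc; fd_usage is a hypothetical helper name. It also shows that Python marks new descriptors as non-inheritable (close-on-exec) by default.

```python
import os
import resource


def fd_usage():
    """Return (open_fd_count, soft_limit) for the current process."""
    # Listing the directory briefly holds one descriptor itself,
    # so treat the count as approximate.
    open_count = len(os.listdir("/proc/self/fd"))
    soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_count, soft_limit


if __name__ == "__main__":
    before, limit = fd_usage()
    r, w = os.pipe()  # two new descriptors
    after, _ = fd_usage()
    print(f"open fds: {before} -> {after}, soft limit {limit}")
    # False here means the descriptor would be closed across exec.
    print("pipe inheritable:", os.get_inheritable(r))
    os.close(r)
    os.close(w)
```

Sampling the count periodically and comparing it with the soft limit is one simple way to notice a leak before the process starts failing with EMFILE.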

CPU and Scheduler Basics

High CPU means runnable work is consuming CPU time. High load average means many tasks are either runnable or in uninterruptible sleep, so load can climb during IO stalls while the CPUs sit mostly idle. The two are not identical.

What to separate:

User CPU: application code.
System CPU: kernel work on behalf of processes.
IO wait: CPU idle while tasks wait for IO.
Steal: time a virtual CPU was ready to run but the hypervisor was running something else.
Softirq: deferred network/block/kernel work.

nice adjusts the relative CPU weight of normal (SCHED_OTHER) tasks. Real-time policies (SCHED_FIFO, SCHED_RR) run ahead of all normal tasks and can starve them if misused.

High CPU workflow:

uptime
mpstat -P ALL 1
top -H -p <pid>
pidstat -t -p <pid> 1
perf top
cat /proc/pressure/cpu

First determine whether the CPU is in user code, kernel code, interrupts/softirqs, or virtualization steal. Then drill into process, thread, syscall, or interrupt source.
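The user/system/iowait/steal/softirq split that mpstat reports comes from the counters in /proc/stat. As an illustrative sketch (function names are hypothetical), the breakdown between two samples of the aggregate cpu line can be computed like this:

```python
import time

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # guest and guest_nice are already counted in user and nice,
    # so only the first eight counters are used.
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def cpu_breakdown(first, second):
    """Return each field's share (percent) of ticks between two samples."""
    a, b = parse_cpu_line(first), parse_cpu_line(second)
    deltas = {name: b[name] - a[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    s1 = read_cpu_line()
    time.sleep(1)
    s2 = read_cpu_line()
    for name, pct in cpu_breakdown(s1, s2).items():
        print(f"{name:8s} {pct:5.1f}%")
```

A high system or softirq share points toward kernel and interrupt work; a high steal share points at the hypervisor rather than the guest.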

RAM, Virtual Memory, and Caches

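Linux memory accounting is often misunderstood. Low “free” memory is normal because Linux uses otherwise idle RAM for page cache and buffers, and most of that cache can be reclaimed when applications need memory. The “available” figure (MemAvailable in /proc/meminfo) is the better estimate of headroom.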
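To make the free-versus-available distinction concrete, here is a small sketch that parses /proc/meminfo. The helper names and the summary keys are illustrative choices, not a standard interface.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: value}; sizes are in kB."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Contrast free memory with the kernel's estimate of available memory."""
    return {
        "free_kb": info["MemFree"],
        "available_kb": info["MemAvailable"],
        "cache_and_buffers_kb": info["Cached"] + info.get("Buffers", 0),
        "available_pct": 100.0 * info["MemAvailable"] / info["MemTotal"],
    }


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(memory_summary(parse_meminfo(f.read())))
```

On a busy file server it is common for free_kb to be tiny while available_pct remains comfortable, which is the healthy case described above.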

Key terms:

Term	Meaning
RSS	Resident Set Size: physical memory currently mapped for a process.
VIRT	Virtual address space size (shown as VSZ by ps); not the same as physical memory used.
Page cache	Cached file contents. Usually reclaimable.
Anonymous memory	Heap/stack/private memory not backed by a file.
Slab	Kernel object caches. Some reclaimable, some not.
Swap	Disk-backed space that holds anonymous pages evicted from RAM under memory pressure.
OOM killer	Kernel mechanism that kills processes when memory cannot be reclaimed.

Memory workflow:

free -h
cat /proc/meminfo
vmstat 1
slabtop
ps aux --sort=-rss | head
cat /proc/pressure/memory
dmesg -T | grep -i -E 'oom|out of memory|killed process'

Do not clear caches as a routine “fix.” It can make performance worse by forcing rereads. Clear cache only for controlled experiments when you understand what you are measuring.

Paging, Swap, and Memory Pressure

Paging is the movement of pages between RAM and backing storage. The kernel tracks file-backed and anonymous pages on active and inactive lists; under pressure it drops or writes back file-backed pages and moves anonymous pages to swap. Swap is not automatically bad. It can improve system behavior by moving cold anonymous pages out of RAM. Swap storms are bad: the system spends most of its time moving pages instead of doing useful work.

Signs of dangerous memory pressure:

rising si/so in vmstat,
high memory PSI,
direct reclaim latency,
OOM kills,
application latency during GC/allocation,
cgroup memory events increasing.

Container warning: cgroup memory limits include more than just app heap. Page cache, tmpfs, and some kernel accounting can matter. A container can OOM while the host still has plenty of memory.
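On cgroup v2, the per-cgroup memory.events file records how often a group hit its limits or was OOM-killed. A minimal sketch of comparing two snapshots follows; the function names are hypothetical, and the location of the file depends on how the cgroup hierarchy is laid out on the host.

```python
def parse_memory_events(text):
    """Parse cgroup v2 memory.events ('key value' per line) into ints."""
    events = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value:
            events[key] = int(value)
    return events


def increased_counters(previous, current):
    """Return the counters that grew between two snapshots, with the delta."""
    return {
        key: current[key] - previous.get(key, 0)
        for key in current
        if current[key] > previous.get(key, 0)
    }
```

A rising max counter means the group keeps running into its memory limit; a rising oom_kill counter means the kernel has killed processes inside that group, even if the host as a whole still has free memory.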

Kernel Parameters and sysctl

Kernel parameters are configured in a few places:

boot-time kernel command line: /proc/cmdline,
runtime sysctls: /proc/sys/...,
persistent sysctl files: /etc/sysctl.conf and /etc/sysctl.d/*.conf,
module parameters: /sys/module/<module>/parameters/...,
systemd unit limits and cgroup controls.

Common sysctl areas:

Parameter	What It Affects
`vm.swappiness`	Kernel tendency to swap anonymous memory versus reclaiming page cache.
`vm.dirty_ratio` / `vm.dirty_background_ratio`	Dirty page writeback thresholds.
`vm.max_map_count`	Maximum memory map areas per process, relevant to JVMs, databases, and search engines.
`fs.file-max`	System-wide file handle maximum.
`net.core.somaxconn`	Upper limit on the backlog a process can request with listen().
`net.ipv4.ip_local_port_range`	Ephemeral port range.
`net.ipv4.tcp_tw_reuse`	Whether TIME_WAIT sockets may be reused for new outgoing connections.

Treat sysctl changes as production changes: document the reason, observe before/after, and know whether the workload is host-level or cgroup-limited.

sysctl vm.swappiness
sysctl -a | grep '^vm\.'
cat /proc/cmdline
grep -H . /sys/module/amdgpu/parameters/* 2>/dev/null
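Runtime sysctls are plain files, so a dotted name maps directly onto a path under /proc/sys. The sketch below uses hypothetical helper names and assumes that no name component contains a dot, which is not true for some interface names such as VLAN devices.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its path under /proc/sys."""
    # Assumption: no component contains a dot (e.g. not 'eth0.100').
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a stripped string."""
    return sysctl_path(name, root).read_text().strip()


if __name__ == "__main__":
    for key in ("vm.swappiness", "net.core.somaxconn", "fs.file-max"):
        print(key, "=", read_sysctl(key))
```

Reading values this way before and after a change gives a simple record to attach to the change notes.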

cgroups, Containers, and systemd

Modern Linux systems use cgroups to account for and limit CPU, memory, IO, PIDs, and other resources. systemd places each service in its own cgroup; container runtimes use cgroups for resource limits and namespaces for isolation.

Important signals:

CPU quota throttling can look like mysterious latency.
Memory limits can OOM a service even when host memory is available.
PID limits can prevent fork/exec.
IO limits can make a service slow without high CPU.
Pressure Stall Information (PSI) shows time lost waiting on CPU, memory, or IO pressure.
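CPU quota throttling is visible in the cgroup v2 cpu.stat file, which exposes nr_periods and nr_throttled when the cpu controller is enabled for the group. As a hedged sketch (function names are hypothetical), the share of throttled periods between two snapshots can be computed as:

```python
def parse_cpu_stat(text):
    """Parse cgroup v2 cpu.stat ('key value' per line) into ints."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value:
            stats[key] = int(value)
    return stats


def throttle_ratio(before, after):
    """Fraction of enforcement periods in which the cgroup was throttled."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0
```

A ratio well above zero during a latency incident suggests the service is being paused by its CPU quota rather than running out of host CPU.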

systemctl status <service>
systemd-cgls
systemd-cgtop
cat /proc/pressure/cpu
cat /proc/pressure/memory
cat /proc/pressure/io
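Each PSI file holds "some" and, for most resources, "full" lines with 10, 60, and 300 second averages plus a cumulative stall total in microseconds. A small parser, with a hypothetical function name, might look like this:

```python
def parse_psi(text):
    """Parse a /proc/pressure/* file into {'some': {...}, 'full': {...}}."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *pairs = line.split()
        values = dict(pair.split("=", 1) for pair in pairs)
        result[kind] = {
            "avg10": float(values["avg10"]),
            "avg60": float(values["avg60"]),
            "avg300": float(values["avg300"]),
            "total_us": int(values["total"]),
        }
    return result


if __name__ == "__main__":
    for resource_name in ("cpu", "memory", "io"):
        with open(f"/proc/pressure/{resource_name}") as f:
            print(resource_name, parse_psi(f.read()))
```

The "some" line is the share of time at least one task was stalled on the resource; "full" is the share of time all non-idle tasks were stalled at once.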

GPU Interactions: General Model

Linux GPU stacks involve PCIe enumeration, kernel modules, firmware, device files, display or compute userspace, and sometimes container runtime hooks. GPU incidents often cross user/kernel boundaries. An application may report CUDA, ROCm, Vulkan, OpenGL, or display errors, while the root cause is kernel driver mismatch, missing firmware, permissions, PCIe reset, thermal throttling, or device memory pressure.

General GPU checks:

lspci -nnk | grep -A3 -E 'VGA|3D|Display'
ls -l /dev/dri /dev/nvidia* 2>/dev/null
dmesg -T | grep -i -E 'drm|gpu|amdgpu|nvidia|xid'
lsmod | grep -E 'amdgpu|nvidia'

For deeper AMD, NVIDIA, ROCm, CUDA, Secure Boot, containers, nvidia-smi, and NVIDIA persistence mode details, see Linux GPU Drivers.

High CPU Runbook

1. Check system shape: uptime, top, mpstat -P ALL 1.
2. Determine user vs system vs IO wait vs steal.
3. If one process dominates, inspect threads: top -H -p <pid>.
4. If system CPU is high, inspect syscalls, perf, interrupts, and network softirq.
5. If load is high but CPU is not, look for D-state tasks and IO stalls.
6. Check cgroup CPU throttling for containers and systemd services.
7. Preserve evidence before restarting.
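For the "load is high but CPU is not" case, the tasks in D state can be listed by scanning /proc. This is a rough sketch with hypothetical function names; it inspects only process (thread-group leader) entries, not the individual threads under /proc/<pid>/task.

```python
import os


def parse_stat(data):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm sits in parentheses and may itself contain spaces or ')'.
    left, right = data.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def tasks_in_state(wanted="D", proc="/proc"):
    """List (pid, comm) for processes currently in the given state."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == wanted:
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in tasks_in_state("D"):
        print(pid, comm)
```

Because D state is often brief, running the scan several times and looking for names that persist is usually more informative than a single pass.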

High RAM Runbook

1. Check free -h and /proc/meminfo.
2. Separate application RSS, page cache, slab, tmpfs, and cgroup memory.
3. Check vmstat 1 for swap activity.
4. Check PSI memory pressure.
5. Look for OOM kills in dmesg.
6. For a process, inspect /proc/<pid>/smaps_rollup.
7. For containers, inspect cgroup memory events and limits.
8. Avoid clearing caches unless testing a hypothesis.
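The smaps_rollup file sums a process's mappings into a handful of totals such as Rss, Pss, Anonymous, and Swap. A minimal parsing sketch, with a hypothetical function name, is shown below; reading another user's process normally requires elevated privileges.

```python
def parse_smaps_rollup(text):
    """Parse /proc/<pid>/smaps_rollup into {field: kB}."""
    sizes = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only 'Name: <number> kB' lines; the first line of the
        # file is an address-range header and is skipped by this check.
        if sep and len(fields) == 2 and fields[1] == "kB" and fields[0].isdigit():
            sizes[name] = int(fields[0])
    return sizes


if __name__ == "__main__":
    with open("/proc/self/smaps_rollup") as f:
        for name, kb in parse_smaps_rollup(f.read()).items():
            print(f"{name:16s} {kb:10d} kB")
```

Comparing Rss with Pss indicates how much of the resident memory is shared with other processes, and a non-zero Swap value shows how much of the process has been pushed out of RAM.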

Commands

uname -a
cat /proc/cmdline
ps -eo pid,ppid,stat,comm,%cpu,%mem --sort=-%cpu | head
free -h
vmstat 1
mpstat -P ALL 1
pidstat -durh 1
cat /proc/pressure/{cpu,memory,io}
dmesg -T | tail -100

Study Cards

Question

Why is fork usually cheaper than copying an entire process?

Answer

Linux uses copy-on-write pages, so parent and child initially share physical pages until one writes.

Question

What is the difference between high CPU and high load average?

Answer

High CPU means the CPUs are busy executing work. Load average counts tasks that are runnable plus tasks in uninterruptible sleep, so it can be high even when CPU usage is low.

Question

Why is low free memory not automatically bad on Linux?

Answer

Linux uses available RAM for page cache and buffers, much of which can be reclaimed when applications need memory.

Question

What does vm.swappiness influence?

Answer

The tendency to reclaim anonymous memory via swap versus reclaiming file-backed page cache.

Question

What does a zombie process indicate?

Answer

The child exited, but its parent has not called wait to reap the exit status.

Question

What is Pressure Stall Information useful for?

Answer

It shows time workloads lose because CPU, memory, or IO resources are unavailable.

Question

What is amdgpu?

Answer

The mainline Linux kernel DRM driver for supported AMD Radeon and Instinct GPU families, including GCN, RDNA, and CDNA.

Question

What does NVIDIA persistence mode do?

Answer

It keeps the GPU initialized when no clients are connected, reducing initialization latency and lifecycle churn on Linux GPU systems.
