Tech Study Guide
Logs and Observability
Linux logs, journald, syslog, dmesg, service status, boot logs, log rotation, metrics, and incident evidence collection.
Linux administration requires knowing where evidence lives. The useful signal may be in the systemd journal, kernel ring buffer, application logs, syslog files, service status, metrics, or audit logs. Good operators collect evidence before restarting the thing that is failing.
Command Examples
journalctl -b
journalctl -p warning..alert -b
journalctl -u ssh -b
dmesg -T | tail -n 100
ls -lh /var/log
systemctl status systemd-journald
Example output and meaning:
| Command | Example output | What it does |
|---|---|---|
| journalctl -b | Timestamped kernel, service, denial, OOM, device, or network warnings. | Shows all journal entries from the current boot, giving time-correlated evidence from the host. |
| journalctl -p warning..alert -b | Timestamped kernel, service, denial, OOM, device, or network warnings. | Limits the current boot's entries to priorities warning through alert. |
| journalctl -u ssh -b | Timestamped kernel, service, denial, OOM, device, or network warnings. | Limits the current boot's entries to the ssh unit. |
Journal
journald stores structured events with fields such as unit, boot ID, priority, PID, executable, message, and timestamps. It is usually the fastest path for systemd service incidents.
Useful patterns:
journalctl -u nginx -b
journalctl --since "30 minutes ago"
journalctl -k -b
journalctl -o short-iso-precise
journalctl --list-boots
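The structured fields mentioned above can be inspected by asking journalctl for JSON output. The following is a minimal sketch, assuming the `-o json` format emits one object per line with a string-valued `PRIORITY` field; `count_by_priority` is a hypothetical helper name used only for illustration, and the 200-entry limit is an arbitrary choice.

```shell
# count_by_priority (hypothetical helper): read journal entries as JSON lines
# on stdin and print "<priority>=<count>" for each syslog priority seen.
count_by_priority() {
  sed -n 's/.*"PRIORITY"[[:space:]]*:[[:space:]]*"\([0-7]\)".*/\1/p' |
    sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Only query the journal when journalctl is available on this host.
if command -v journalctl >/dev/null 2>&1; then
  journalctl -b -o json --no-pager -n 200 2>/dev/null | count_by_priority
fi
```

Priorities follow the syslog numbering (0 is emerg, 3 is err, 4 is warning, 6 is info), so a line such as 3=12 would mean twelve error-level entries in the sampled window.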
Kernel Logs
Kernel messages include driver errors, OOM kills, storage faults, network warnings, filesystem remounts, and hardware signals. dmesg reads the kernel ring buffer; journald may also collect kernel messages.
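As one way to pull a specific signal out of the kernel log, the sketch below extracts OOM-kill victims. It assumes the kernel message contains the wording "Killed process <pid> (<name>)", which can vary between kernel versions; `oom_victims` is a hypothetical helper name, and reading dmesg may require elevated privileges on some hosts.

```shell
# oom_victims (hypothetical helper): read kernel log text on stdin and print
# "<pid> <process name>" for each line that reports a killed process.
oom_victims() {
  sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Scan the kernel ring buffer when dmesg is available and readable.
if command -v dmesg >/dev/null 2>&1; then
  dmesg -T 2>/dev/null | oom_victims
fi
```

The same filter can be fed from journalctl -k -b when the ring buffer has already wrapped and older kernel messages survive only in the journal.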
/var/log and Rotation
Ubuntu systems still use files under /var/log for many services. logrotate rotates many traditional logs. A deleted-but-open log can still consume disk until the process closes the file descriptor.
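The deleted-but-open behavior can be reproduced safely with a throwaway file. This is a minimal sketch, assuming a Linux host where /proc is mounted and mktemp and readlink are installed; the descriptor number 3 and the variable names are arbitrary choices for illustration.

```shell
# Create a throwaway file and keep it open on file descriptor 3.
tmp=$(mktemp)
exec 3>"$tmp"
echo "log line written before deletion" >&3

# Unlink the path; the open descriptor keeps the data blocks allocated.
rm -f "$tmp"
if [ -e "$tmp" ]; then path_exists=yes; else path_exists=no; fi

# On Linux, /proc/<pid>/fd/<n> still resolves and is marked "(deleted)".
fd_target=$(readlink "/proc/$$/fd/3" 2>/dev/null || true)
echo "path exists: $path_exists; fd 3 -> ${fd_target:-unknown}"

# Closing the descriptor is what finally releases the space.
exec 3>&-
```

On a real host, lsof +L1 is one way to list open files whose link count has dropped to zero, which helps explain a gap between df and du output.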
Metrics and Pressure
Logs explain events. Metrics show shape over time. For host incidents, combine logs with:
- CPU, memory, IO, and network utilization,
- cgroup metrics,
- Pressure Stall Information,
- service restart counts,
- disk/inode usage,
- time synchronization state.
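Pressure Stall Information from the list above is exposed as text files under /proc/pressure on kernels built with PSI enabled. The sketch below reads the 10-second "some" average for each resource; it assumes the line format `some avg10=... avg60=... avg300=... total=...`, and `psi_some_avg10` is a hypothetical helper name.

```shell
# psi_some_avg10 (hypothetical helper): read PSI-format text on stdin and
# print the avg10 value from the "some" line.
psi_some_avg10() {
  awk '$1 == "some" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^avg10=/) { sub(/^avg10=/, "", $i); print $i }
  }'
}

# Report pressure per resource; skipped quietly when PSI is unavailable.
for res in cpu memory io; do
  f="/proc/pressure/$res"
  if [ -r "$f" ]; then
    val=$(psi_some_avg10 2>/dev/null <"$f" || true)
    printf '%s some avg10=%s\n' "$res" "${val:-n/a}"
  fi
done
```

A nonzero avg10 means some tasks were stalled waiting on that resource during the last ten seconds, which is often an earlier warning than raw utilization.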
Incident Evidence Checklist
- systemctl status <service>
- journalctl -u <service> -b --since ...
- journalctl -p warning..alert -b
- Kernel logs for OOM, IO, filesystem, network, or driver issues.
- App logs under /var/log or application-specific paths.
- Resource snapshots: CPU, memory, disk, network, and cgroups.
- Config state before changing it.
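The checklist above can be captured in one pass before anything is restarted. The following is a rough sketch rather than a complete collector: `collect_evidence`, the output file names, and the default "1 hour ago" window are all assumptions made for illustration, and commands that are missing on a host simply leave their error text in the corresponding file.

```shell
# collect_evidence (hypothetical helper): snapshot service and host evidence.
# Usage: collect_evidence <service> [since] [output-dir]
collect_evidence() {
  svc=$1
  since=${2:-"1 hour ago"}                          # assumed default window
  out=${3:-"./evidence-$(date +%Y%m%dT%H%M%S)"}     # assumed naming scheme
  mkdir -p "$out" || return 1

  # Service state and unit logs for the current boot.
  systemctl status "$svc" --no-pager >"$out/status.txt" 2>&1 || true
  journalctl -u "$svc" -b --since "$since" --no-pager >"$out/unit-journal.txt" 2>&1 || true

  # Host-wide warnings and kernel messages for the current boot.
  journalctl -p warning..alert -b --no-pager >"$out/warnings.txt" 2>&1 || true
  journalctl -k -b --no-pager >"$out/kernel.txt" 2>&1 || true

  # Resource snapshot: load, memory, disk space, and inode usage.
  { uptime; free -m; df -h; df -i; } >"$out/resources.txt" 2>&1 || true

  printf '%s\n' "$out"
}
```

A call such as collect_evidence nginx "30 minutes ago" /tmp/nginx-incident prints the directory it wrote to. Application-specific logs and configuration files would still need to be copied separately.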
Study Cards
Why collect logs before restarting?
Restarting can overwrite or rotate log evidence, clear in-memory process state, and hide the original failure mode.
What does journalctl -u filter by?
A systemd unit, such as a service.
Why pair logs with metrics?
Logs show events; metrics show resource shape and timing across the incident window.