Linux Level 3 – Advanced Operations

This course moves from administration into deeper system behaviour, control, hardening, incident handling, and recovery. It covers advanced process and memory behaviour, firewalling, kernel module management, permission models, backup and recovery strategy, and system protection features.

Course purpose

Develop deeper Linux operational judgement so engineers can understand complex host behaviour, harden systems sensibly, handle incidents with structure, and recover services and platforms safely.

Duration

2 days
Optionally a third day of labs

Target audience

senior Linux engineers
platform engineers
escalation support engineers
security-conscious infrastructure teams
administrators responsible for recovery and hardening

Prerequisites

Linux Engineering or equivalent real-world experience
confidence working at shell and service level
familiarity with LVM, mounts, boot and logs

Learning outcomes

inspect and manage kernel modules and advanced system behaviour
understand Linux process and memory pressure behaviour
configure and reason about host firewalling
explain watchdog and system protection mechanisms
work with standard permissions, ACLs, extended attributes and SELinux/AppArmor concepts
design sensible backup and recovery approaches
troubleshoot complex incidents involving processes, memory, permissions or boot-time protection features

Detailed module structure

Module 1: Kernel modules and advanced kernel interaction

Topics:

kernel modules vs built-in functionality
loading and unloading modules
dependency handling
persistent module configuration
useful tools:
- lsmod
- modprobe
- modinfo
- insmod
- rmmod
common reasons to inspect modules
driver and module troubleshooting
kernel command line awareness
sysctl overview as runtime kernel tuning
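
As a flavour of the module inspection covered here, the following is a minimal sketch of reading text in the /proc/modules format, which is the data lsmod presents. The function names parse_proc_modules and in_use are hypothetical helpers, and SAMPLE is illustrative data rather than output captured from a real host.

```python
# Sketch: parse text in the /proc/modules format (the data behind lsmod).
# SAMPLE is illustrative data, not output captured from a real host.
SAMPLE = """\
ext4 983040 1 - Live 0x0000000000000000
mbcache 16384 1 ext4, Live 0x0000000000000000
jbd2 167936 1 ext4, Live 0x0000000000000000
"""


def parse_proc_modules(text):
    """Return one dict per module line: name, size, refcount, used_by, state."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        name, size, refcount, users, state = fields[:5]
        # The fourth field lists the modules that use this one; "-" means none.
        used_by = [] if users == "-" else [u for u in users.split(",") if u]
        modules.append({
            "name": name,
            "size": int(size),
            # Assumption: treat a non-numeric count as unknown rather than fail.
            "refcount": int(refcount) if refcount.isdigit() else None,
            "used_by": used_by,
            "state": state,
        })
    return modules


def in_use(modules):
    """Names of modules with a non-zero reference count, in input order."""
    return [m["name"] for m in modules if (m["refcount"] or 0) > 0]


modules = parse_proc_modules(SAMPLE)
for m in modules:
    print(m["name"], m["size"], ",".join(m["used_by"]) or "-")
```

On a live host the same function could be fed the contents of /proc/modules. A non-zero reference count is a common reason why rmmod or modprobe -r refuses to unload a module.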

Module 2: Processes, scheduling and system behaviour

Topics:

process lifecycle
parent and child relationships
foreground and background
signals
service processes vs user processes
CPU scheduling awareness
priorities and niceness
zombie and defunct process awareness
process inspection:
- ps
- top
- htop
- pstree
- pgrep
- kill
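
To illustrate zombie awareness and parent and child relationships, here is a minimal sketch that reads lines in the /proc/<pid>/stat format and reports processes in state Z together with their parent PID. The helper names are hypothetical, and the sample lines are made up and shortened for illustration (real lines carry many more fields).

```python
# Sketch: pull pid, command name, state and parent pid out of a line in the
# /proc/<pid>/stat format. The sample lines are made up and truncated.

def parse_stat_line(line):
    """Return (pid, comm, state, ppid) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    rest = line[close_paren + 1:].split()
    state, ppid = rest[0], int(rest[1])
    return pid, comm, state, ppid


def find_zombies(stat_lines):
    """Return (pid, comm, ppid) for every process in state Z (zombie)."""
    zombies = []
    for line in stat_lines:
        pid, comm, state, ppid = parse_stat_line(line)
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


SAMPLE_LINES = [
    "812 (sshd) S 1 812 812 0 -1",
    "4410 (worker (old)) Z 4302 4302 4302 0 -1",
    "4302 (job runner) S 1 4302 4302 0 -1",
]

zombies = find_zombies(SAMPLE_LINES)
for pid, comm, ppid in zombies:
    print(f"zombie pid={pid} comm={comm!r} parent={ppid}")
```

The parent PID matters because a zombie cannot be killed directly; it disappears only once its parent collects the exit status or the parent itself exits.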

Module 3: Memory pressure, OOM and system stability

Topics:

virtual memory basics
swap behaviour
page cache awareness
OOM killer
how memory exhaustion presents
identifying the victim process
prevention and tuning considerations
cgroup awareness if relevant to your environment
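
One way memory exhaustion presents is as shrinking available memory alongside growing swap use. The sketch below, which assumes text in the /proc/meminfo format and a kernel that reports MemAvailable, turns those counters into two percentages. The helper names and the sample values are illustrative only, and no particular threshold is implied.

```python
# Sketch: summarise memory pressure from text in the /proc/meminfo format.
# SAMPLE_MEMINFO holds illustrative values, not readings from a real host.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:          200000 kB
MemAvailable:     400000 kB
SwapTotal:       2000000 kB
SwapFree:         500000 kB
"""


def parse_meminfo(text):
    """Map each field name to its integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def pressure_summary(info):
    """Return available memory and used swap as percentages of their totals."""
    available_pct = 100 * info["MemAvailable"] / info["MemTotal"]
    swap_total = info.get("SwapTotal", 0)
    if swap_total == 0:
        swap_used_pct = 0.0  # no swap configured
    else:
        swap_used_pct = 100 * (swap_total - info.get("SwapFree", 0)) / swap_total
    return {
        "available_pct": round(available_pct, 1),
        "swap_used_pct": round(swap_used_pct, 1),
    }


info = parse_meminfo(SAMPLE_MEMINFO)
summary = pressure_summary(info)
print(summary)
```

MemAvailable is generally a better guide than MemFree here, because MemFree excludes page cache that the kernel can reclaim under pressure.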

Module 4: Firewalling and packet filtering

Topics:

packet filtering concepts
chains, tables and rules at a high level
iptables
firewalld
relationship between legacy (iptables) and modern (nftables) tooling
service-based policy vs direct rule management
persistent firewall configuration
safe change practices to avoid locking yourself out
nftables awareness at a high level

Module 5: Watchdogs, resilience and host protection

Topics:

what a watchdog is
hardware vs software watchdog
when watchdogs are useful
failure scenarios
risks of misconfiguration
automatic recovery concepts
host health monitoring awareness

Module 6: Permissions and access control

Topics:

standard Unix permissions
ownership
setuid, setgid and sticky bit
ACLs
extended attributes
SELinux concepts
SELinux contexts and enforcement modes
AppArmor awareness if relevant to Debian and Ubuntu estates
troubleshooting permission denied beyond simple file mode bits
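
For the setuid, setgid and sticky bit topic, the following minimal sketch decodes numeric modes with Python's standard stat module. The helper names and the example modes are chosen for illustration. It covers mode bits only; ACLs, extended attributes and SELinux or AppArmor policy are separate layers that a mode string does not show.

```python
# Sketch: decode special permission bits from a numeric st_mode value.
import stat


def special_bits(mode):
    """Return the names of the special bits set in mode."""
    names = []
    if mode & stat.S_ISUID:
        names.append("setuid")
    if mode & stat.S_ISGID:
        names.append("setgid")
    if mode & stat.S_ISVTX:
        names.append("sticky")
    return names


def describe(mode):
    """Return the ls-style string, the octal permissions and the special bits."""
    symbolic = stat.filemode(mode)
    octal = format(stat.S_IMODE(mode), "04o")
    special = ",".join(special_bits(mode)) or "none"
    return f"{symbolic} {octal} {special}"


# Illustrative modes: file type bits plus permission bits.
EXAMPLES = {
    "setuid executable": 0o104755,
    "shared directory with setgid": 0o042775,
    "world-writable sticky directory": 0o041777,
    "plain file": 0o100644,
}

for label, mode in EXAMPLES.items():
    print(f"{label}: {describe(mode)}")
```

For a real path, the mode can be taken from os.stat(path).st_mode and passed to the same functions.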

Module 7: Backup, restore and recovery strategy

Topics:

file-level backups vs image-level backups
consistency considerations
what to back up:
- config
- application data
- databases
- logs where needed
restore testing
bare-metal recovery concepts
single-file vs full-system recovery
documenting recovery procedures
integrity checking and retention
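
Integrity checking and restore testing can be sketched with a checksum manifest: record a SHA-256 digest per file at backup time, then compare a restored tree against it. The example below is one possible approach, not a description of any particular backup product. The helper names are hypothetical, and the demo simulates a damaged restore by altering files in a temporary directory.

```python
# Sketch: build a SHA-256 manifest for a tree and check a tree against it.
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return sorted relative paths that are missing or whose digest differs."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_file(target) != expected:
            problems.append(rel)
    return sorted(problems)


# Demo in a throwaway directory: verify a clean tree, then a damaged one.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    (source / "etc").mkdir(parents=True)
    (source / "etc" / "app.conf").write_text("port=8080\n")
    (source / "data.db").write_text("rows\n")
    manifest = build_manifest(source)
    clean_result = verify(source, manifest)
    (source / "etc" / "app.conf").write_text("port=9090\n")  # altered content
    (source / "data.db").unlink()                            # missing file
    damaged_result = verify(source, manifest)

print(clean_result)
print(damaged_result)
```

A check like this confirms that files match what was recorded; it does not show that an application or database was in a consistent state when the backup was taken.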

Module 8: Advanced troubleshooting and incident workflow

Topics:

structured troubleshooting
identifying whether the issue is:
- hardware
- kernel
- service
- filesystem
- permissions
- network
collecting evidence safely
avoiding destructive fixes
handover notes and post-incident review

Capstone focus: this final module brings together logs, processes, permissions and operational judgement into a single incident workflow.

Labs

inspect and load kernel modules
diagnose a missing-driver or module problem
trace a runaway or stuck process
simulate OOM and identify what happened
create firewall rules safely and recover from mistakes
compare standard permissions, ACLs and SELinux effects
perform a restore exercise from backup
troubleshoot a composite incident using logs, processes and permissions

Assessment

Advanced operations practical

inspect modules, processes and memory symptoms
interpret permission and protection-related failures
review a firewall change for safety and persistence
justify an appropriate backup or recovery action

Incident scenario

Troubleshoot a multi-layer Linux incident involving service instability, memory pressure, permissions confusion or protection features, then document the safest recovery path.

Deeper host understanding - Better recovery decisions - Stronger Linux operational control

Built for engineers who need to harden, troubleshoot and recover Linux systems under real operational pressure

Training scope and tailoring

The training plan shown above is provided as a structured guide to the typical scope and direction of the course. Our training content is reviewed and refined over time, so the precise balance of modules, examples and exercises may vary when the course is delivered.

Where there are specific topics, technologies or operational outcomes that are particularly important to your team, these can normally be incorporated into the delivery plan by prior agreement. Training is not treated as a rigid, fixed package; it is adapted where appropriate to reflect the client environment, delegate experience level, group size and the objectives agreed in advance.