Linux System Monitoring

System monitoring is essential for maintaining optimal performance, identifying bottlenecks, and troubleshooting issues. Linux provides comprehensive command-line tools for monitoring CPU, memory, disk, network, and process activity.

Overview

Linux system monitoring involves tracking various system resources and performance metrics:

  • CPU Usage - Processor utilization and load averages
  • Memory Usage - RAM and swap space utilization
  • Disk I/O - Read/write operations and throughput
  • Network Activity - Network traffic and connections
  • Process Activity - Running processes and resource consumption
  • System Load - Overall system performance metrics

Essential Monitoring Tools

Tool      Purpose                      Key Features
top       Real-time process monitor    CPU, memory usage, process list
htop      Interactive process viewer   Color-coded, mouse support, tree view
ps        Process status snapshot      Detailed process information
vmstat    Virtual memory statistics    CPU, memory, I/O, system activity
iostat    I/O statistics               Disk utilization, throughput
free      Memory usage                 RAM, swap, buffers, cache
df        Disk space usage             Filesystem utilization
netstat   Network connections          Active connections, listening ports

Process Monitoring

Using top Command

Basic top usage

top

Real-time display of running processes

Sort by CPU usage

top -o %CPU

Sorts processes by CPU usage (highest first)

Sort by memory usage

top -o %MEM

Sorts processes by memory usage

Show specific user processes

top -u username

Shows processes for specific user only

Using htop (Enhanced top)

Install and run htop

# Install htop
sudo apt install htop    # Ubuntu/Debian
sudo yum install htop    # RHEL/CentOS (may require the EPEL repository)

# Run htop
htop

Interactive process viewer with color coding

Using ps Command

Show all processes

ps aux

Detailed list of all running processes

Show process tree

ps auxf

Shows processes in tree format

Find specific processes

ps aux | grep apache

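Lists processes whose command line contains "apache" (the grep process itself also appears in the output; the service is named apache2 on Debian/Ubuntu and httpd on RHEL/CentOS)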

Show processes by CPU usage

ps aux --sort=-%cpu | head -11

Top 10 processes by CPU usage (the first output line is the column header)

Show processes by memory usage

ps aux --sort=-%mem | head -11

Top 10 processes by memory usage (the first output line is the column header)
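Because ps prints one row per process, a service that runs many worker processes can look small in a per-process list. As a rough sketch, the helper below sums resident memory (RSS, in KiB) per command name. The name rss_by_command is made up for this example, and it assumes the procps version of ps found on most Linux systems. Shared pages are counted once per process, so the totals overstate real usage.

```shell
# rss_by_command is a hypothetical helper name, not a standard tool.
# It reads "RSS COMMAND" rows on stdin and prints the total RSS per command,
# largest first. Only the first word of each command name is used.
rss_by_command() {
    awk '{ rss[$2] += $1 }
         END { for (c in rss) printf "%d %s\n", rss[c], c }' | sort -rn
}

# Usage: ps -eo rss,comm --no-headers | rss_by_command | head -10
```

Piping through head -10 keeps the ten largest groups, which is usually enough to spot a service whose workers add up.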

System Resource Monitoring

CPU Monitoring

Check load averages

uptime

Shows system uptime and load averages

Detailed load information

cat /proc/loadavg

1, 5, and 15-minute load averages, followed by the running/total task counts and the most recently assigned PID

CPU information

cat /proc/cpuinfo

Detailed CPU specifications

Number of CPU cores

nproc

Shows number of available processing units
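A load average only means something relative to the number of cores: a load of 4 is busy on a 2-core machine and comfortable on a 16-core one. The sketch below divides the 1-minute average by the core count. The helper name load_per_core is an assumption for illustration; values persistently above 1.00 suggest tasks are queuing.

```shell
# load_per_core is a hypothetical helper name, not a standard utility.
# $1 = load average, $2 = number of CPU cores
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Read the 1-minute average (first field of /proc/loadavg) and normalise it.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _ < /proc/loadavg
    echo "1-minute load per core: $(load_per_core "$load1" "$(nproc)")"
fi
```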

Memory Monitoring

Memory usage overview

free -h

Human-readable memory usage information

Detailed memory information

cat /proc/meminfo

Comprehensive memory statistics
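For scripting, /proc/meminfo is easier to parse than the output of free. The sketch below reports how much memory is still available as a percentage of the total. It assumes a kernel that exposes the MemAvailable field (3.14 or later); mem_available_pct is a hypothetical helper name.

```shell
# mem_available_pct is a hypothetical helper name.
# $1 = path to a meminfo-format file (defaults to /proc/meminfo)
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", avail / total * 100 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "Available memory: $(mem_available_pct)%"
fi
```

MemAvailable already accounts for reclaimable cache, so it is usually a better "how close to full" signal than the free column alone.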

Memory usage by process

ps aux --sort=-%mem | head -11

Top 10 memory-consuming processes (the first output line is the column header)

Disk Space Monitoring

Filesystem usage

df -h

Human-readable disk space usage

Directory size

du -sh /var/log

Size of specific directory

Largest directories

du -h /var | sort -hr | head -10

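The 10 largest directories under /var (the first entry is the total for /var itself; run with sudo to include directories your user cannot read)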

Inode usage

df -i

Shows inode usage for filesystems
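A filesystem can fill up by running out of either blocks or inodes, so it helps to check both with the same filter. The sketch below flags entries above a percentage limit. The helper name over_threshold and the 85% limit are assumptions for illustration; it relies on the -P (POSIX) output format, where the usage percentage is the fifth column, and it does not handle mount points that contain spaces.

```shell
# over_threshold is a hypothetical helper name.
# $1 = limit in percent; reads `df -P` or `df -Pi` output on stdin.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        # Adding 0 forces a numeric comparison, so "9" is not treated as > "85"
        if (use + 0 > limit + 0) print $6, use "%"
    }'
}

# Usage:
#   df -P  | over_threshold 85    # disk space
#   df -Pi | over_threshold 85    # inodes
```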

Performance Statistics

Using vmstat

System statistics overview

vmstat

Virtual memory, CPU, and I/O statistics

Continuous monitoring

vmstat 2 10

Updates every 2 seconds, 10 times
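The first line of vmstat output reports averages since boot, so a single sample can mislead. One way to summarise a run, sketched below, is to average a chosen column over the remaining samples. The helper name avg_vmstat_column is hypothetical; it locates the column from the header row rather than assuming a fixed position.

```shell
# avg_vmstat_column is a hypothetical helper name.
# $1 = column name from the vmstat header (for example wa, id, si);
# reads vmstat output on stdin and skips the first (since-boot) sample.
avg_vmstat_column() {
    awk -v col="$1" '
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        idx && $1 ~ /^[0-9]+$/ { if (++rows > 1) { sum += $idx; n++ } }
        END { if (n) printf "%.1f\n", sum / n }'
}

# Usage: vmstat 2 10 | avg_vmstat_column wa
```

Averaging wa gives a quick reading of I/O wait; the same call with si or so shows swap activity.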

Memory statistics

vmstat -s

Detailed memory statistics

Using iostat

I/O statistics

iostat

CPU and I/O statistics for devices

Extended I/O statistics

iostat -x 2 5

Extended stats, updated every 2 seconds, 5 times
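The extended report is wide, and its column order differs between sysstat versions. The sketch below picks out devices whose %util exceeds a limit by finding that column in the header row. The helper name busy_devices and the limit of 80 are assumptions for illustration.

```shell
# busy_devices is a hypothetical helper name.
# $1 = %util limit; reads `iostat -x` output on stdin.
busy_devices() {
    awk -v limit="$1" '
        $1 ~ /^Device:?$/ { for (i = 1; i <= NF; i++) if ($i == "%util") idx = i; next }
        NF == 0 { idx = 0 }                      # a blank line ends the device table
        idx && $idx + 0 > limit + 0 { print $1, $idx }'
}

# Usage: iostat -x 2 5 | busy_devices 80
```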

Per-device statistics

iostat -d 2

Device-only statistics, updated every 2 seconds

Using sar (System Activity Reporter)

CPU utilization

sar -u 2 5

CPU utilization every 2 seconds, 5 times

Memory utilization

sar -r 2 5

Memory utilization statistics

I/O statistics

sar -b 2 5

I/O transfer rate statistics

Network Monitoring

Connection Monitoring

Listening ports

netstat -tuln

Shows TCP and UDP listening ports

Established connections

netstat -tun

Shows active TCP and UDP connections
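A long connection list is easier to read as a count per TCP state; a pile of TIME_WAIT or CLOSE_WAIT entries stands out immediately. The sketch below does that with a hypothetical helper named count_tcp_states, assuming the usual net-tools layout where the state is the sixth column of each tcp row.

```shell
# count_tcp_states is a hypothetical helper name.
# Reads `netstat -tan` output on stdin and prints "COUNT STATE", largest first.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { count[$6]++ }
         END { for (s in count) print count[s], s }' | sort -rn
}

# Usage: netstat -tan | count_tcp_states
```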

Connection statistics

netstat -s

Network protocol statistics

Network Interface Monitoring

Interface statistics

cat /proc/net/dev

Network interface statistics

Real-time network usage

watch -n 1 'cat /proc/net/dev'

Updates network statistics every second
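The counters in /proc/net/dev are cumulative totals, so a rate has to be derived from two readings. A minimal sketch of that calculation follows. The helper names iface_bytes and iface_rate are hypothetical, and the interval is assumed to be a whole number of seconds greater than zero.

```shell
# iface_bytes and iface_rate are hypothetical helper names.

# Print "RX_BYTES TX_BYTES" for one interface.
# $1 = interface name, $2 = stats file (defaults to /proc/net/dev)
iface_bytes() {
    # Replace the colon after the interface name with a space so the name
    # and the first counter are always separate fields.
    awk -v ifc="$1" '{ sub(/:/, " ") } $1 == ifc { print $2, $10 }' "${2:-/proc/net/dev}"
}

# Print average receive/transmit rates over an interval.
# $1 = interface name, $2 = interval in seconds (> 0)
iface_rate() {
    before=$(iface_bytes "$1")
    sleep "$2"
    after=$(iface_bytes "$1")
    echo "$before $after" |
        awk -v t="$2" '{ printf "rx=%d B/s tx=%d B/s\n", ($3 - $1) / t, ($4 - $2) / t }'
}

# Usage: iface_rate eth0 5
```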

Bandwidth monitoring with iftop

sudo iftop -i eth0

Real-time bandwidth usage by connection (replace eth0 with your interface name; ip link lists the available interfaces)

System Information

Hardware Information

System information

uname -a

Kernel and system information

Hardware details

lshw -short

Hardware configuration summary

PCI devices

lspci

Lists PCI devices

USB devices

lsusb

Lists USB devices

System Logs

System messages

dmesg | tail -20

Recent kernel messages

System log

tail -f /var/log/syslog

Follow the system log in real time (/var/log/syslog on Debian/Ubuntu; RHEL/CentOS write to /var/log/messages instead)

Monitoring Scripts

System Health Check Script

#!/bin/bash
# System health check script

echo "=== System Health Report ==="
echo "Date: $(date)"
echo

echo "=== System Load ==="
uptime
echo

echo "=== Memory Usage ==="
free -h
echo

echo "=== Disk Usage ==="
df -h
echo

echo "=== Top 5 CPU Processes ==="
ps aux --sort=-%cpu | head -6
echo

echo "=== Top 5 Memory Processes ==="
ps aux --sort=-%mem | head -6

Comprehensive system health check script

Resource Alert Script

#!/bin/bash
# Resource monitoring and alerting

# CPU threshold (percentage)
CPU_THRESHOLD=80
# Memory threshold (percentage)
MEM_THRESHOLD=90
# Disk threshold (percentage)
DISK_THRESHOLD=85

# Check CPU usage: take the idle percentage from top's summary line and
# subtract it from 100, so user, system, and I/O wait time are all counted
CPU_IDLE=$(LC_ALL=C top -bn1 | grep "Cpu(s)" | sed 's/.*, *\([0-9.]*\)[% ]*id.*/\1/')
CPU_USAGE=$(echo "100 - $CPU_IDLE" | bc -l)
if (( $(echo "$CPU_USAGE > $CPU_THRESHOLD" | bc -l) )); then
    echo "ALERT: High CPU usage: ${CPU_USAGE}%"
fi

# Check memory usage (used / total from the Mem: row of free)
MEM_USAGE=$(free | awk '/^Mem:/ {printf("%.1f", $3/$2 * 100.0)}')
if (( $(echo "$MEM_USAGE > $MEM_THRESHOLD" | bc -l) )); then
    echo "ALERT: High memory usage: ${MEM_USAGE}%"
fi

# Check disk usage: -P keeps each filesystem on one line, and $5+0 forces
# a numeric comparison instead of a string comparison
df -P | awk -v limit="$DISK_THRESHOLD" 'NR>1 {gsub(/%/,"",$5); if ($5+0 > limit) print "ALERT: High disk usage on " $6 ": " $5 "%"}'

Automated resource monitoring with alerts

Performance Analysis

Identifying Bottlenecks

CPU bottlenecks

# High load average (> number of CPU cores)
uptime

# High CPU wait time (the "wa" column)
vmstat 1 5

# CPU-intensive processes
ps aux --sort=-%cpu | head -11

Identifying CPU performance issues

Memory bottlenecks

# Memory usage and swap activity (the "si" and "so" columns of vmstat)
free -h
vmstat 1 5

# Memory-intensive processes
ps aux --sort=-%mem | head -11

Identifying memory performance issues

I/O bottlenecks

# I/O wait time and disk utilization
iostat -x 1 5

# Processes causing I/O (iotop usually needs root)
sudo iotop

# Open regular files, listed per process
lsof | grep -E "REG.*[0-9]+.*[0-9]+"

Identifying disk I/O performance issues

Continuous Monitoring

System monitoring dashboard

# Create a simple monitoring dashboard
# Load is read from /proc/loadavg because the field layout of uptime changes with how long the system has been up
watch -n 2 'echo "=== System Monitor ==="; echo "Load: $(cut -d" " -f1-3 /proc/loadavg)"; echo "Memory: $(free -h | grep Mem | awk "{print \$3\"/\"\$2}")"; echo "Disk: $(df -h / | tail -1 | awk "{print \$5}")"; echo "Processes: $(ps -e --no-headers | wc -l)"'

Real-time system monitoring dashboard

Best Practices

System Monitoring Best Practices
  • Establish Baselines - Know normal system behavior patterns
  • Monitor Continuously - Use automated monitoring tools
  • Set Thresholds - Define alert levels for key metrics
  • Document Issues - Keep records of performance problems
  • Regular Reviews - Analyze trends and patterns over time
  • Proactive Monitoring - Identify issues before they become critical

Key Performance Metrics

Critical Metrics to Monitor
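  • Load Average - Sustained values above the number of CPU cores mean tasks are waiting for CPU time or blocked on I/O
  • CPU Utilization - Sustained utilization above 80% suggests a CPU bottleneck
  • Memory Usage - Above 90% usage may cause swapping; judge it by the "available" column of free, since cache is reclaimable
  • Disk Usage - Filesystems more than 85% full require attention
  • I/O Wait - High %iowait indicates a disk bottleneck
  • Network Utilization - Compare throughput with interface capacity and watch for errors and dropped packets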

Common Performance Issues

Typical Performance Problems
  • High Load - Too many processes competing for CPU
  • Memory Leaks - Applications consuming increasing memory
  • Disk Full - Filesystems approaching capacity
  • I/O Bottleneck - Slow disk operations affecting performance
  • Network Congestion - Bandwidth limitations or packet loss
  • Zombie Processes - Dead processes not cleaned up
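Zombie processes from the list above are straightforward to detect: ps marks them with a STAT value beginning with Z. A zombie can only be cleaned up by its parent, so the parent PID is the useful piece of information. The sketch below prints both; list_zombies is a hypothetical helper name.

```shell
# list_zombies is a hypothetical helper name.
# Reads `ps -eo pid,ppid,stat,comm` output on stdin and reports rows whose
# STAT column starts with Z.
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print "zombie pid=" $1 " parent=" $2 " cmd=" $4 }'
}

# Usage: ps -eo pid,ppid,stat,comm | list_zombies
```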
