Linux Log Analysis

Log analysis is crucial for system administration, troubleshooting, and security monitoring. Linux provides powerful command-line tools for processing, analyzing, and monitoring log files efficiently.

Overview

Linux systems generate extensive logs that provide insights into system behavior, errors, security events, and performance. Key tools for log analysis include:

  • grep - Pattern matching and filtering
  • awk - Field processing and calculations
  • sed - Text transformation
  • tail - Real-time monitoring
  • journalctl - Systemd journal analysis
  • sort/uniq - Data organization and counting
  • cut - Column extraction

Common Linux Log Files

Log File                      Description          Content
/var/log/syslog               System messages      General system activity and messages
/var/log/auth.log             Authentication logs  Login attempts, sudo usage, SSH connections
/var/log/kern.log             Kernel messages      Kernel events, hardware issues, driver messages
/var/log/apache2/access.log   Apache access log    Web server requests and responses
/var/log/apache2/error.log    Apache error log     Web server errors and warnings
/var/log/mail.log             Mail server logs     Email server activity and errors
/var/log/cron.log             Cron job logs        Scheduled task execution

Basic Log Analysis

Searching and Filtering

Search for specific patterns

grep "ERROR" /var/log/syslog

Finds all lines containing "ERROR"

Case-insensitive search

grep -i "error" /var/log/syslog

Searches for "error" regardless of case

Search with context

grep -A 3 -B 3 "ERROR" /var/log/syslog

Shows 3 lines before and after each match

Multiple pattern search

grep -E "ERROR|WARN|CRITICAL" /var/log/syslog

Searches for multiple patterns

Exclude patterns

grep -v "INFO" /var/log/syslog

Shows all lines except those containing "INFO"

Time-based Analysis

Filter by date

grep "Jan 21" /var/log/syslog

Shows entries for January 21st

Filter by time range

awk '$3 >= "10:00:00" && $3 <= "11:00:00"' /var/log/syslog

Shows entries between 10:00 and 11:00

Recent entries

tail -n 100 /var/log/syslog

Shows last 100 log entries
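
The time-range filter above can be wrapped in a small function so the bounds are passed as arguments. This is a minimal sketch: the name entries_between is hypothetical, and it assumes the traditional syslog prefix (month, day, HH:MM:SS in the third field) and a range that does not cross midnight.

```shell
# entries_between FILE START END
# Print entries whose third field (HH:MM:SS) lies in [START, END].
# Zero-padded times compare correctly as plain strings, so no date
# parsing is needed. Assumes the "Mon DD HH:MM:SS" syslog prefix.
entries_between() {
    awk -v start="$2" -v end="$3" '$3 >= start && $3 <= end' "$1"
}
```

For example, entries_between /var/log/syslog 10:00:00 11:00:00 prints the same lines as the awk command shown earlier.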

Real-time Log Monitoring

Following Log Files

Monitor single log file

tail -f /var/log/syslog

Continuously displays new log entries

Monitor multiple log files

tail -f /var/log/syslog /var/log/auth.log

Monitors multiple files simultaneously

Monitor with filtering

tail -f /var/log/syslog | grep --line-buffered "ERROR"

Shows only error messages in real-time

Monitor with highlighting

tail -f /var/log/syslog | grep --color=always -E 'ERROR|WARN|$'

Highlights ERROR and WARN while still displaying every line (the empty $ alternative matches all lines)

Advanced Analysis with AWK

Field Processing

Extract specific fields

awk '{print $1, $2, $3, $5}' /var/log/syslog

Extracts timestamp and process information

Count log levels

awk '/ERROR/ {error++} /WARN/ {warn++} /INFO/ {info++} END {print "Errors:", error+0, "Warnings:", warn+0, "Info:", info+0}' /var/log/syslog

Counts different log levels

Analyze by hour

awk '{hour=substr($3,1,2); count[hour]++} END {for(h in count) print h":00 -", count[h], "entries"}' /var/log/syslog | sort

Groups log entries by hour

Top processes by log entries

awk '{process=$5; gsub(/\[.*\]/, "", process); count[process]++} END {for(p in count) print count[p], p}' /var/log/syslog | sort -nr | head -10

Shows processes generating most log entries
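
The process ranking above can be packaged as a reusable function. This is a sketch only: the name top_loggers is hypothetical, and it assumes the process tag is the fifth field, as in the traditional syslog layout. Unlike the one-liner, it also strips the trailing colon so that "sshd[101]:" and "sshd:" are counted together.

```shell
# top_loggers FILE [N]
# Rank processes by number of log lines, printing "count name".
# Assumes field 5 holds the tag, e.g. "sshd[101]:" or "kernel:".
top_loggers() {
    awk '{
        process = $5
        sub(/\[[0-9]+\]/, "", process)   # drop the PID in brackets
        sub(/:$/, "", process)           # drop the trailing colon
        count[process]++
    }
    END { for (p in count) print count[p], p }' "$1" |
        sort -nr | head -n "${2:-10}"
}
```

Running top_loggers /var/log/syslog 5 would list the five most active processes.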

Statistical Analysis

Calculate log entry frequency

awk '{date=$1" "$2" "$3; gsub(/:[0-9][0-9]$/, ":00", date); count[date]++} END {for(d in count) print d, count[d]}' /var/log/syslog | sort

Shows log frequency per minute

Average log entries per hour

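awk '{bucket=$1" "$2" "substr($3,1,2); if (!(bucket in seen)) {seen[bucket]=1; hours++}} END {if (hours) print "Average per hour:", NR/hours}' /var/log/syslog

Calculates average log entries per hour, dividing by the number of distinct hours that actually appear in the file rather than a fixed 24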
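
Per-minute counts are most useful when sorted by volume, since bursts of logging often mark the start of an incident. The following sketch ranks the busiest minutes; the function name busiest_minutes is hypothetical and the traditional syslog timestamp layout is assumed.

```shell
# busiest_minutes FILE [N]
# Print the N minutes with the most log lines as "count Mon DD HH:MM".
busiest_minutes() {
    awk '{
        minute = $1 " " $2 " " substr($3, 1, 5)   # truncate to HH:MM
        count[minute]++
    }
    END { for (m in count) print count[m], m }' "$1" |
        sort -nr | head -n "${2:-5}"
}
```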

Web Server Log Analysis

Apache Access Log Analysis

Top IP addresses

awk '{print $1}' /var/log/apache2/access.log | sort | uniq -c | sort -nr | head -10

Shows top 10 IP addresses by request count

Most requested pages

awk '{print $7}' /var/log/apache2/access.log | sort | uniq -c | sort -nr | head -10

Shows most frequently requested URLs

HTTP status code analysis

awk '{print $9}' /var/log/apache2/access.log | sort | uniq -c | sort -nr

Counts HTTP status codes

404 errors analysis

awk '$9 == 404 {print $7}' /var/log/apache2/access.log | sort | uniq -c | sort -nr

Shows most common 404 errors

Bandwidth analysis

awk '{sum += $10} END {print "Total bytes:", sum, "MB:", sum/1024/1024}' /var/log/apache2/access.log

Calculates total bandwidth usage
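
Individual status codes can be rolled up into classes (2xx, 3xx, 4xx, 5xx) for a quick health check. This is a minimal sketch assuming the common or combined log format, where the status code is the ninth whitespace-separated field; the function name status_classes is hypothetical.

```shell
# status_classes FILE
# Count requests per HTTP status class, printing e.g. "2xx 1500".
# Lines whose ninth field is not a three-digit status are skipped.
status_classes() {
    awk '$9 ~ /^[1-5][0-9][0-9]$/ { class[substr($9, 1, 1) "xx"]++ }
         END { for (c in class) print c, class[c] }' "$1" | sort
}
```

A rising share of 5xx responses in the output of status_classes /var/log/apache2/access.log usually points at the server, while 4xx growth points at clients or broken links.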

Error Log Analysis

Error frequency by type

awk -F'] ' '{print $2}' /var/log/apache2/error.log | cut -d: -f1 | sort | uniq -c | sort -nr

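Groups errors by the second bracketed field (the module in Apache 2.4's default format, the severity level in Apache 2.2)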

Security Log Analysis

Authentication Analysis

Failed login attempts

grep "Failed password" /var/log/auth.log | awk '{for (i=1; i<NF; i++) if ($i == "from") {print $(i+1); break}}' | sort | uniq -c | sort -nr

Shows IP addresses with failed login attempts
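
To flag likely brute-force sources, the counts can be filtered by a threshold. The sketch below is illustrative: the function name failed_login_sources and the default threshold of 5 are assumptions. It takes the address as the word following "from" rather than a fixed field number, because "Failed password for invalid user ..." lines contain two extra words.

```shell
# failed_login_sources FILE [MIN]
# Print "count address" for sources with at least MIN failed logins.
failed_login_sources() {
    grep "Failed password" "$1" |
        awk '{ for (i = 1; i < NF; i++)
                   if ($i == "from") { print $(i + 1); break } }' |
        sort | uniq -c |
        awk -v min="${2:-5}" '$1 >= min { print $1, $2 }' |
        sort -nr
}
```

For example, failed_login_sources /var/log/auth.log 20 lists only addresses with 20 or more failures.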

Successful logins

grep "Accepted password" /var/log/auth.log | awk '{print $1, $2, $3, $9, $11}'

Shows successful login details

SSH connection analysis

grep "sshd" /var/log/auth.log | grep "Connection from" | awk '{print $8}' | sort | uniq -c | sort -nr

Analyzes SSH connection sources (sshd writes "Connection from" lines only at LogLevel VERBOSE or higher)

Sudo usage tracking

grep "sudo.*COMMAND=" /var/log/auth.log | awk '{print $1, $2, $3, $6, substr($0, index($0, "COMMAND="))}'

Tracks sudo command usage (timestamp, invoking user, and the command that was run)

Systemd Journal Analysis

Basic journalctl Usage

View recent entries

journalctl -n 50

Shows last 50 journal entries

Follow journal in real-time

journalctl -f

Continuously displays new journal entries

Filter by service

journalctl -u apache2

Shows entries for Apache service

Filter by priority

journalctl -p err

Shows messages of priority err and above (err, crit, alert, emerg)

Time range filtering

journalctl --since "2025-01-21 10:00:00" --until "2025-01-21 11:00:00"

Shows entries within specific time range

Advanced Journal Analysis

Boot analysis

journalctl -b -1

Shows entries from previous boot

Kernel messages

journalctl -k

Shows kernel messages only

JSON output for processing

journalctl -o json | jq '.MESSAGE'

Outputs journal in JSON format for processing

Log Rotation and Management

Analyzing Rotated Logs

Search across rotated logs

zgrep "ERROR" /var/log/syslog*

Searches compressed and uncompressed log files

Combine multiple log files

grep -h "ERROR" /var/log/syslog.1 /var/log/syslog

Searches the previous and current log files in chronological order (-h suppresses the file name prefix)
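
When several rotations exist, some of them compressed, the same idea can be extended to walk them from oldest to newest. This is a sketch under assumptions: the function name search_rotated is hypothetical, rotations are expected to be named BASE.1, BASE.2.gz and so on with no gaps in the numbering, and gzip must be installed.

```shell
# search_rotated PATTERN BASE
# Search BASE and its numbered rotations, oldest first, so matches
# come out in chronological order.
search_rotated() {
    pattern=$1
    base=$2
    n=0
    # Find the highest rotation number present (plain or gzipped).
    while [ -e "$base.$((n + 1))" ] || [ -e "$base.$((n + 1)).gz" ]; do
        n=$((n + 1))
    done
    {
        while [ "$n" -ge 1 ]; do
            if [ -e "$base.$n.gz" ]; then gzip -cd "$base.$n.gz"; fi
            if [ -e "$base.$n" ]; then cat "$base.$n"; fi
            n=$((n - 1))
        done
        cat "$base"
    } | grep -e "$pattern"
}
```

A call such as search_rotated "ERROR" /var/log/syslog avoids sorting the output, which would misorder entries because month names do not sort chronologically.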

Log Size Analysis

Check log file sizes

du -sh /var/log/*

Shows the size of each file and directory directly under /var/log

Find largest log files

find /var/log -type f -exec du -h {} + | sort -hr | head -10

Lists 10 largest log files

Automated Log Analysis Scripts

Daily Log Summary Script

#!/bin/bash
# Daily log summary
echo "=== Daily Log Summary ==="
echo "Date: $(date)"
echo
echo "Error Count:"
grep -c "ERROR" /var/log/syslog
echo "Warning Count:"
grep -c "WARN" /var/log/syslog
echo "Top 5 Error Sources:"
grep "ERROR" /var/log/syslog | awk '{print $5}' | sort | uniq -c | sort -nr | head -5
echo "Failed Login Attempts:"
grep "Failed password" /var/log/auth.log | wc -l

Example script for daily log analysis

Alert Script for Critical Events

#!/bin/bash
# Monitor for critical events
# tail -F keeps following the file after log rotation replaces it
tail -F /var/log/syslog | while IFS= read -r line; do
    if echo "$line" | grep -qE "CRITICAL|FATAL|EMERGENCY"; then
        echo "ALERT: $line" | mail -s "Critical System Event" [email protected]
    fi
done

Real-time alerting for critical log events
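
Separating the matching logic from the log source makes an alert loop easier to test and reuse. The following is a sketch, not a drop-in replacement: the function name alert_filter is hypothetical, and it only prints alert lines, leaving delivery (mail, chat webhook, pager) to whatever consumes its output. Matching with the shell's case statement avoids starting a grep process for every log line.

```shell
# alert_filter
# Read log lines on stdin; print "ALERT: <line>" for each line that
# contains CRITICAL, FATAL, or EMERGENCY.
alert_filter() {
    while IFS= read -r line; do
        case $line in
            *CRITICAL*|*FATAL*|*EMERGENCY*)
                printf 'ALERT: %s\n' "$line"
                ;;
        esac
    done
}
```

It can be fed from a live log with tail -F /var/log/syslog | alert_filter, or from a saved file to check which lines would have triggered.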

Best Practices

Log Analysis Best Practices
  • Regular Monitoring - Set up automated log monitoring and alerting
  • Log Retention - Maintain appropriate log retention policies
  • Centralized Logging - Use tools like rsyslog or ELK stack for centralization
  • Time Synchronization - Ensure accurate timestamps with NTP
  • Log Rotation - Implement proper log rotation to manage disk space
  • Security - Protect log files from unauthorized access and tampering

Performance Tips

Optimizing Log Analysis Performance
  • Use Specific Patterns - More specific grep patterns are faster
  • Limit Search Scope - Use time ranges and specific log files
  • Pipeline Efficiently - Order commands to filter early in the pipeline
  • Use Fixed Strings - grep -F treats the pattern as a literal string and skips regex processing
  • Compress Old Logs - Use zgrep for compressed log files
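
Two of the tips above, fixed-string matching and filtering early, can be combined in one pipeline. This is a minimal sketch: the function name hits_per_hour is hypothetical, and setting LC_ALL=C is an addition not mentioned above, on the assumption that byte-wise matching is acceptable for the log in question (it is often faster with GNU grep).

```shell
# hits_per_hour FILE STRING
# Count lines containing the literal STRING, grouped by hour.
# grep discards non-matching lines first, so awk and sort only
# process the small matching subset.
hits_per_hour() {
    LC_ALL=C grep -F -e "$2" "$1" |
        awk '{ count[substr($3, 1, 2)]++ }
             END { for (h in count) print h, count[h] }' |
        sort
}
```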

Common Use Cases

Typical Log Analysis Scenarios
  • Troubleshooting - Finding root cause of system issues
  • Security Monitoring - Detecting intrusion attempts and anomalies
  • Performance Analysis - Identifying bottlenecks and resource issues
  • Compliance - Meeting audit and regulatory requirements
  • Capacity Planning - Understanding usage patterns and growth
