awk Command

The awk command is a text-processing tool in Linux built around a small pattern scanning and processing language. It reads input line by line, splits each line into fields, and performs actions on the lines that match given patterns.

Syntax

awk 'pattern { action }' [file...]

Description

awk is a command-line utility for data extraction and reporting. It processes text one record (by default, one line) at a time and applies a set of rules to each record. Each rule pairs a pattern with an action: if the record matches the pattern, the action is executed. A rule with no pattern applies to every record, and a rule with no action prints the matching record.
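The three rule forms (pattern only, action only, and both together) can be sketched as follows. This is a minimal illustration: the sample data and the shell variable names are made up for the example, and the input is generated inline so no file is required.

```shell
# Hypothetical sample input: a name and a number on each line.
sample='alpha 1
beta 22
gamma 333'

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$sample" | awk '/^b/')

# Action only: with no pattern, the action runs for every line.
action_only=$(printf '%s\n' "$sample" | awk '{ print $2 }')

# Pattern and action: print the first field where the second field exceeds 10.
both=$(printf '%s\n' "$sample" | awk '$2 > 10 { print $1 }')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```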

Common uses include:

  • Extracting specific columns from a file
  • Filtering lines based on content
  • Performing calculations on data
  • Generating formatted reports

Common Options and Variables

Option/Variable      Description
-F fs                Sets the field separator (default is whitespace)
$n                   Refers to the n-th field of the current record (e.g., $1 for the first field)
$0                   Refers to the entire current record (line)
NR                   Current record (line) number
NF                   Number of fields in the current record
BEGIN { action }     Action to perform before processing any input lines
END { action }       Action to perform after processing all input lines
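As a rough sketch of how these pieces combine, the following uses -F, $1, NR, NF, BEGIN, and END in a single program. The colon-separated records and the variable names are invented for illustration.

```shell
# Hypothetical colon-separated records with differing field counts.
records='alice:30:admin
bob:25
carol:41:staff:remote'

report=$(printf '%s\n' "$records" | awk -F':' '
  BEGIN { print "line fields name" }   # runs once, before any input is read
        { print NR, NF, $1 }           # runs for every record
  END   { print "records:", NR }       # runs once, after the last record
')

printf '%s\n' "$report"
```

In the END block, NR still holds the number of the last record read, so it doubles as a record count.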

Examples

Print the first column of a file

awk '{print $1}' data.txt

Prints only the first whitespace-separated field of each line in data.txt.

Print specific columns with a custom separator

awk -F',' '{print $1, $3}' data.csv

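Prints the first and third comma-separated fields from data.csv, separated by a space in the output. Splitting on a plain comma does not handle quoted fields that themselves contain commas.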
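To keep commas in the output as well, the built-in OFS (output field separator) variable can be set with the -v option. The following is a minimal sketch with invented sample data, and it assumes simple CSV with no quoted fields.

```shell
# Hypothetical CSV input with a header row.
csv='id,name,score
1,ann,90
2,ben,85'

# -F sets the input separator; OFS is what the comma in print emits between fields.
picked=$(printf '%s\n' "$csv" | awk -F',' -v OFS=',' '{ print $1, $3 }')

printf '%s\n' "$picked"
```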

Filter lines containing a specific pattern

awk '/error/ {print}' logfile.log

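Displays all lines from logfile.log that contain the string "error". The match is case-sensitive and is not limited to whole words, so "errors" matches but "ERROR" does not.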
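When the case of the log messages varies, one portable approach is to lowercase the line with the built-in tolower function before matching. The log lines below are made up for illustration.

```shell
# Hypothetical log lines with mixed capitalization.
log='ERROR disk full
info started
warning: errors found
Error: timeout'

# Plain pattern: matches only the lowercase substring "error".
exact=$(printf '%s\n' "$log" | awk '/error/')

# Lowercase each line first, so the match ignores case.
any_case=$(printf '%s\n' "$log" | awk 'tolower($0) ~ /error/')

printf '%s\n' "$exact" "$any_case"
```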

Sum values in a column

awk '{s+=$2} END {print "Total:", s}' numbers.txt

Calculates the sum of the values in the second column of numbers.txt and prints it once, after the last line has been read.

Print lines longer than 80 characters

awk 'length($0) > 80 {print}' document.txt

Displays lines from document.txt that have more than 80 characters.

See also