awk
awk is a line-oriented text-processing language: it reads input record by record
(usually line by line), splits each record into fields, and runs pattern-action
rules against it. Its power comes from a compact set of built-in variables like
NR, NF and FS, a library of string and numeric functions, and the special
BEGIN/END patterns. This page is a searchable, offline reference to all of
them, each with a working snippet.
How it works
An awk program is a sequence of pattern { action } rules. For every input
record:
- The record is split into fields $1, $2, … up to $NF using the field separator FS; $0 is the whole record.
- Each rule whose pattern matches runs its action. A pattern can be a regex /error/, a boolean expression NR > 1, a range /START/,/END/, or the special BEGIN/END markers.
- Built-in variables track state: NR (record number), NF (field count), FNR (per-file record number), FILENAME, and the separators FS, OFS, RS, ORS.
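As a rough illustration of the rule model described above, the sketch below runs one program that uses each kind of pattern once. The sample records and the variable names sample and out are made up for this example; any POSIX awk should behave the same way.

```shell
# Hypothetical sample input: a header line plus a few log-like records.
sample='level msg
info boot
START job
error disk
END job
info done'

# One rule per pattern kind: BEGIN, boolean expression, regex, range, END.
out=$(printf '%s\n' "$sample" | awk '
  BEGIN         { print "begin" }                      # runs before any input
  NR == 1       { print "header has", NF, "fields" }   # boolean expression
  /error/       { print "regex hit on record", NR ": " $0 }
  /START/,/END/ { print "in range:", $1 }              # inclusive range
  END           { print "records:", NR }               # runs after all input
')
printf '%s\n' "$out"
```

Rules run in program order for each record, so a record that matches both the regex and the range triggers both actions, regex first.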
String functions such as split, gsub, sub, substr, match, index and
sprintf let you reshape text, while int, sqrt, log and friends handle
numbers. print and printf produce output, and getline reads extra input.
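A minimal sketch of a few of these string functions working together, assuming a made-up colon-delimited record; the names parts, year, changed, pos and out are illustrative only.

```shell
# Hypothetical record: first name, last name, ISO date.
out=$(printf '%s\n' 'alice:smith:2024-01-15' | awk '{
  n = split($0, parts, ":")            # fills parts[1..n], returns n
  year = substr(parts[3], 1, 4)        # 4 characters starting at position 1
  changed = gsub(/-/, "/", parts[3])   # edits parts[3], returns replacement count
  pos = index(parts[2], "it")          # 1-based position, 0 when absent
  line = sprintf("%s|%s|%s|%d|%d|%d", parts[1], year, parts[3], n, changed, pos)
  print line
}')
printf '%s\n' "$out"
```

Note that gsub and sub modify their target and return a count, whereas substr, index and sprintf return a value and leave their arguments alone.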
Tips and examples
Sum a column and print the total at the end:
awk '{ sum += $2 } END { print "total:", sum }' data.tsv
Print the last field of each line regardless of how many fields there are:
awk '{ print $NF }' file
Reformat a colon-delimited file into tab-separated output:
awk 'BEGIN { FS=":"; OFS="\t" } { print $1, $3 }' /etc/passwd
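Remember that awk indexes fields and string positions from 1, treats unset
variables as 0 or the empty string depending on context, and rebuilds $0
(joining the fields with OFS) whenever you assign to a field, which is handy
for editing a column on its way to the output. The input file itself is not
modified.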
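These behaviours are easy to check with a small experiment. The sketch below assumes a made-up three-field record; count and label are deliberately never assigned.

```shell
out=$(printf '%s\n' 'a b c' | awk 'BEGIN { OFS = "-" }
{
  print $0              # untouched record is printed as read
  $2 = "X"              # assigning a field rebuilds $0 using OFS
  print $0
  print count + 0       # unset variable in a numeric context
  print "[" label "]"   # unset variable in a string context
}')
printf '%s\n' "$out"
```

Setting OFS alone does not change $0; the record is only rejoined once some field is assigned.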