Awk is a powerful tool designed for text processing. It scans a file line by line, splits each input line into fields, compares the line or its fields against a pattern, and performs an action on the lines that match. It is generally used to transform data files and produce formatted reports.
In this tutorial, we will show some advanced and handy awk one-liners that are useful for day-to-day operations.
Text Conversion
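In this section, we will learn how to use the sub and gsub functions with the awk command to delete spaces and tabs in a file, along with one-liners that remove blank lines, reverse lines or fields, and drop duplicate lines.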
We will use the following text file as the input file for all examples in this article:
cat > contents.txt
hitesh engineer sales 30000
jayesh director account 25000
vyom manager purchase 20000

bhavesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
Delete all blank lines from a file
You can use the awk command with the special NF variable (the number of fields in the current line) to delete all blank lines from a file.
For example, to delete all blank lines from the file contents.txt, run the following command:
awk NF contents.txt
You should see the following output:
hitesh engineer sales 30000
jayesh director account 25000
vyom manager purchase 20000
bhavesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
You can get the same result with the following command:
awk '/./' contents.txt
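The two commands differ in one edge case: NF is zero for a line that holds only spaces or tabs, while /./ matches any line with at least one character. The sketch below illustrates this with a made-up four-line sample (not contents.txt); the variable names are only for illustration.

```shell
# Hypothetical sample: line 2 holds only two spaces, line 3 is empty.
sample=$(printf 'alpha\n  \n\nbeta\n')

# NF is 0 for empty and whitespace-only lines, so both are dropped.
by_nf=$(printf '%s\n' "$sample" | awk 'NF')

# /./ needs at least one character, so the whitespace-only line is kept
# and only the truly empty line is dropped.
by_dot=$(printf '%s\n' "$sample" | awk '/./')

printf 'awk NF keeps:\n%s\n' "$by_nf"
printf 'awk /./ keeps:\n%s\n' "$by_dot"
```

If the file may contain whitespace-only lines that should also go, awk NF is probably the safer choice.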
Delete leading whitespace and tabs from the beginning of each line
You can use the awk command to find one or more spaces or tabs at the beginning of each line and delete them.
Run the following command to delete leading whitespace from the file named contents.txt:
awk '{ sub(/^[ \t]+/, ""); print }' contents.txt
You should see the following output:
hitesh engineer sales 30000
jayesh director account 25000
vyom manager purchase 20000

bhavesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
Delete trailing whitespace and tabs from the end of each line
You can find one or more spaces or tabs at the end of each line in the file and delete them.
Run the following command to delete the spaces and tabs at the end of each line in the file contents.txt:
awk '{ sub(/[ \t]+$/, ""); print }' contents.txt
You should get the following output:
hitesh engineer sales 30000
jayesh director account 25000
vyom manager purchase 20000

bhavesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
Delete both leading and trailing whitespace from each line
You can also delete both leading and trailing whitespace from each line with a single command, as shown below:
awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }' contents.txt
You should get the following output:
hitesh engineer sales 30000
jayesh director account 25000
vyom manager purchase 20000

bhavesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
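Trimmed whitespace is hard to see in terminal output, so it can help to wrap each line in visible markers. The sketch below uses a made-up input line with a leading tab and trailing spaces; the square brackets are added only for display. It uses gsub rather than sub because the pattern can match twice on one line (once at the start, once at the end), and sub would stop after the first match.

```shell
# Hypothetical input: tab + two spaces in front, two spaces + tab behind.
trimmed=$(printf '\t  hitesh engineer  \t\n' |
  awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print "[" $0 "]" }')

# The brackets make any leftover whitespace visible.
echo "$trimmed"
```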
Reverse order of lines
This popular awk one-liner stores every line in an array and then prints the array from the last element to the first.
Run this awk one-liner to print all lines of the file contents.txt in reverse order:
awk '{ a[i++] = $0 } END { for (j=i-1; j>=0;) print a[j--] }' contents.txt
You should get the following output:
deep clerk sales 20000
jay peon purchase 23000
niraj clerk account 20000
rajesh directory sales 40000
bhavesh engineer sales 30000

vyom manager purchase 20000
jayesh director account 25000
hitesh engineer sales 30000
Reverse order of fields in each line
You can use the awk command with the NF variable to print the fields of each line in reverse order:
awk '{ for (i=NF; i>0; i--) printf("%s ", $i); printf ("\n") }' contents.txt
You should get the following output:
30000 sales engineer hitesh
25000 account director jayesh
20000 purchase manager vyom

30000 sales engineer bhavesh
40000 sales directory rajesh
20000 account clerk niraj
23000 purchase peon jay
20000 sales clerk deep
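The printf("%s ", $i) form leaves a trailing space after the last field on every line. If that matters, one possible variant builds the reversed line in a variable first. This is a sketch on a single sample line, and the variable name line is arbitrary.

```shell
reversed=$(printf 'hitesh engineer sales 30000\n' |
  awk '{
    line = $NF                      # start with the last field
    for (i = NF - 1; i > 0; i--)    # append the remaining fields, right to left
      line = line " " $i
    print line                      # no trailing space
  }')

echo "$reversed"
```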
Remove Consecutive duplicate lines
To remove consecutive duplicate lines from the file, run the following command:
awk 'a != $0; { a = $0 }' contents.txt
Remove Nonconsecutive duplicate lines
To remove nonconsecutive duplicate lines from the file, run the following command:
awk '!a[$0]++' contents.txt
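Because contents.txt has no repeated lines, both commands print it unchanged. The difference between them shows up on input with repeats, as in this sketch with made-up sample data:

```shell
# Hypothetical sample with both adjacent and non-adjacent repeats.
sample=$(printf 'sales\nsales\naccount\nsales\naccount\n')

# Drops a line only when it equals the line directly before it.
consecutive=$(printf '%s\n' "$sample" | awk 'a != $0; { a = $0 }')

# Drops every repeat wherever it occurs: a[$0]++ is 0 (false) the first
# time a line is seen, so !a[$0]++ is true only for first occurrences.
anywhere=$(printf '%s\n' "$sample" | awk '!a[$0]++')

printf 'consecutive:\n%s\n' "$consecutive"
printf 'anywhere:\n%s\n' "$anywhere"
```

The second form keeps every distinct line in memory, which may matter for very large files.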
Numbering and Calculations
In this section, we will learn how to use the NF and NR variables with the awk command. NR holds the number of records (lines) read so far and NF holds the number of fields in the current record, which makes them useful for numbering and counting in reports.
Number all lines in a file
You can number all lines in a specific file with the following command:
awk '{ print NR "\t" $0 }' contents.txt
You should get the following output:
1	hitesh engineer sales 30000
2	jayesh director account 25000
3	vyom manager purchase 20000
4
5	bhavesh engineer sales 30000
6	rajesh directory sales 40000
7	niraj clerk account 20000
8	jay peon purchase 23000
9	deep clerk sales 20000
Number lines in a fancy manner
To number all lines in a specific file in fancy format, run the following command:
awk '{ printf("%5d : %s\n", NR, $0) }' contents.txt
You should get the following output:
    1 : hitesh engineer sales 30000
    2 : jayesh director account 25000
    3 : vyom manager purchase 20000
    4 :
    5 : bhavesh engineer sales 30000
    6 : rajesh directory sales 40000
    7 : niraj clerk account 20000
    8 : jay peon purchase 23000
    9 : deep clerk sales 20000
Number only non-blank lines in files
You can number only non-empty lines with the following command:
awk 'NF { $0=++a " :" $0 }; { print }' contents.txt
You should get the following output:
1 : hitesh engineer sales 30000
2 :jayesh director account 25000
3 :vyom manager purchase 20000

4 :bhavesh engineer sales 30000
5 : rajesh directory sales 40000
6 :niraj clerk account 20000
7 :jay peon purchase 23000
8 :deep clerk sales 20000
Print the number of lines that contain a specific string
You can print the total number of lines that have the word engineer with the following command:
awk '/engineer/{n++}; END {print n+0}' contents.txt
You should get the following output:
2
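Two details of this one-liner are worth spelling out. The n+0 in the END block forces a numeric result, so 0 is printed instead of an empty line when nothing matches. And /engineer/ matches the text anywhere on the line, including inside longer words. If only an exact value in one column should count, a field comparison is one option. The sketch below uses made-up sample lines and assumes the job title is in field 2.

```shell
# Hypothetical sample: "engineer" is a job title on line 1,
# but only part of a department name on line 2.
sample=$(printf 'hitesh engineer sales 30000\nraj clerk engineering 20000\n')

# Regex match anywhere on the line: counts both lines.
anywhere=$(printf '%s\n' "$sample" | awk '/engineer/ { n++ } END { print n + 0 }')

# Exact match on field 2 only: counts just the first line.
exact=$(printf '%s\n' "$sample" | awk '$2 == "engineer" { n++ } END { print n + 0 }')

echo "anywhere: $anywhere, exact field match: $exact"
```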
Regular Expressions
In this section, we will show you how to use regular expressions with awk command to filter text or strings in files.
Print lines that match the specified string
To print all lines that match the string engineer in the file contents.txt, run the following command:
awk '/engineer/' contents.txt
You should get the following output:
hitesh engineer sales 30000
bhavesh engineer sales 30000
Print lines that don't match the specified string
To print all lines that don't match the string 'jayesh' in the file contents.txt, run the following command:
awk '!/jayesh/' contents.txt
You should get the following output:
hitesh engineer sales 30000
vyom manager purchase 20000

bhavesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
Print line before the matching string
To print the line before the matching string 'rajesh', run the following command:
awk '/rajesh/{print x};{x=$0}' contents.txt
You should get the following output:
bhavesh engineer sales 30000
Print line after the matching string
To print the line after each line that matches the string 'account', run the following command:
awk '/account/{getline; print}' contents.txt
You should get the following output:
vyom manager purchase 20000
jay peon purchase 23000
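One caveat with getline: the line it reads is consumed without being tested against the pattern, so when two matching lines are adjacent, the line after the second one is never printed. A possible alternative is to remember the match in a flag and print on the next record. This is a sketch with made-up sample lines, and the variable name flag is arbitrary.

```shell
# Hypothetical sample: two adjacent lines contain "account".
sample=$(printf 'jayesh director account\nniraj clerk account\njay peon purchase\n')

# getline version: line 2 is swallowed by getline, so line 3 is missed.
with_getline=$(printf '%s\n' "$sample" | awk '/account/ { getline; print }')

# Flag version: print first if the previous line matched, then test this line.
with_flag=$(printf '%s\n' "$sample" |
  awk 'flag { print; flag = 0 } /account/ { flag = 1 }')

printf 'getline:\n%s\n' "$with_getline"
printf 'flag:\n%s\n' "$with_flag"
```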
Substitution
In this section, we will show you how to use the awk command to search a file for a specific string and replace it with another string.
Substitute one string with another
To substitute the string 'engineer' with 'doctor' in contents.txt, run the following command:
awk '{gsub(/engineer/, "doctor")};{print}' contents.txt
You should get the following output:
hitesh doctor sales 30000
jayesh director account 25000
vyom manager purchase 20000

bhavesh doctor sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
To find the strings 'jayesh', 'hitesh' or 'bhavesh' and replace each of them with the string 'mahesh', run the following command:
awk '{gsub(/jayesh|hitesh|bhavesh/,"mahesh");print}' contents.txt
You should get the following output:
mahesh engineer sales 30000
mahesh director account 25000
vyom manager purchase 20000

mahesh engineer sales 30000
rajesh directory sales 40000
niraj clerk account 20000
jay peon purchase 23000
deep clerk sales 20000
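By default gsub works on the whole line ($0). It also accepts an optional third argument naming the target, which can limit the replacement to a single field. The sketch below uses a made-up sample line and assumes the job title is in field 2.

```shell
# Hypothetical sample: "engineer" appears in field 2 and inside field 3.
scoped=$(printf 'hitesh engineer engineering 30000\n' |
  awk '{ gsub(/engineer/, "doctor", $2); print }')

# Only field 2 is rewritten; field 3 is left alone.
echo "$scoped"
```

Note that assigning to a field makes awk rebuild the line with single spaces between fields, so any original column alignment is lost.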
Find Free Disk Space with Device Name
You can use the awk command with df to display only the device name and the available space on each device.
To do so, run the following command:
df -h | awk '{print $1, $4}'
You should get the following output:
Filesystem Avail
/dev/sda1 235G
none 4.0K
udev 1.9G
tmpfs 377M
none 5.0M
none 1.5G
none 100M
/dev/sda5 135G
/dev/loop0 0
/dev/loop1 0
/dev/loop2 0
/dev/loop4 0
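The header row is printed along with the data. If the output is going to be processed further, the header can be skipped with an NR > 1 condition. The sketch below runs on sample text that stands in for df -h output so the result is reproducible; the Size and Used figures in it are made up.

```shell
# Stand-in for `df -h` output (sizes are illustrative, not real measurements).
df_sample=$(printf '%s\n' \
  'Filesystem Size Used Avail Use% Mounted on' \
  '/dev/sda1 458G 200G 235G 46% /' \
  'tmpfs 377M 0 377M 0% /run')

# NR > 1 skips the first record, which is the header line.
avail=$(printf '%s\n' "$df_sample" | awk 'NR > 1 { print $1, $4 }')

echo "$avail"
```

Against a live system the same filter would be: df -h | awk 'NR > 1 { print $1, $4 }'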
Find Number of open connections per ip
This awk one-liner is useful if you think your server is under attack. It counts the open connections to your server per remote IP address and sorts the result from the most connections to the fewest.
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr
You should get a count of open connections for each remote IP address:
18 103.132.192.30
12 104.18.12.5
11 104.18.5.23
9 104.244.42.3
1 104.244.42.5
1 127.0.0.1
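The cut, sort and uniq steps can also be folded into awk with an associative array. The sketch below is one way to do it, under two assumptions: the foreign address is in field 5 (as in the command above) and the addresses are IPv4, since splitting on ":" would break IPv6 addresses. It runs on sample text standing in for netstat -ntu output, with made-up local addresses and ports.

```shell
# Stand-in for `netstat -ntu` output (addresses and ports are illustrative).
netstat_sample=$(printf '%s\n' \
  'Active Internet connections (w/o servers)' \
  'Proto Recv-Q Send-Q Local Address Foreign Address State' \
  'tcp 0 0 10.0.0.5:22 103.132.192.30:51234 ESTABLISHED' \
  'tcp 0 0 10.0.0.5:22 103.132.192.30:51240 ESTABLISHED' \
  'tcp 0 0 10.0.0.5:80 104.18.12.5:44321 ESTABLISHED')

per_ip=$(printf '%s\n' "$netstat_sample" |
  awk '$1 ~ /^(tcp|udp)/ {          # skip the two header lines
         split($5, part, ":")       # part[1] = address, part[2] = port
         count[part[1]]++
       }
       END { for (ip in count) print count[ip], ip }' |
  sort -nr)                         # for-in order is unspecified, so sort

echo "$per_ip"
```

Filtering on the protocol column also keeps the header words out of the counts, which the plain print $5 version lets through.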
Conclusion
In this tutorial, we walked through a set of awk one-liners with practical examples. I hope they help you with your day-to-day tasks.