How to use loops in awk

Learn how to use different types of loops to run commands on a record multiple times.

Opensource.com

Awk scripts have three main sections: the optional BEGIN and END blocks, and the rules you write that are executed on each record. In a way, the main body of an awk script is a loop, because its rules run once for each record. Sometimes, however, you want to run commands on a record more than once, and for that you must write a loop.

There are several kinds of loops, each serving a unique purpose.

While loop

A while loop tests a condition before each pass and performs commands for as long as the test returns true. As soon as the test returns false, the loop ends.

#!/usr/bin/awk -f

BEGIN {
    # Loop through 1 to 10
    i = 1;
    while (i <= 10) {
        # The commas already insert a space (OFS) between the items
        print i, "to the second power is", i*i;
        i = i + 1;
    }
    exit;
}

In this simple example, awk prints the square of whatever integer is contained in the variable i. The while (i <= 10) phrase tells awk to perform the loop only as long as the value of i is less than or equal to 10. After the final iteration (while i is 10), the loop ends.
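The same construct can walk through the fields of a single record, which is the "more than once per record" case described in the introduction. The following is a minimal sketch, run from a shell for convenience. The sample input line and the shell variable name out are made up for illustration.

```shell
# Print each field of a record on its own line, together with its position.
# The input line "apple red 4" is sample data, not taken from a real file.
out=$(printf 'apple red 4\n' | awk '{
    i = 1
    while (i <= NF) {        # NF is the number of fields in the current record
        print i ": " $i      # $i is the field at position i
        i++
    }
}')
echo "$out"
```

For the sample line, this prints one line per field: "1: apple", "2: red", and "3: 4".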

Do while loop

A do while loop runs the commands that follow the keyword do first, and only then tests the condition. The commands are repeated for as long as the test returns true; as soon as the test fails, the loop ends. Because the test comes after the commands, the body of a do while loop always runs at least once.

#!/usr/bin/awk -f

BEGIN {
    # Loop through 2 to 9; the test runs after each pass
    i = 2;
    do {
        print i, "to the second power is", i*i;
        i = i + 1;
    } while (i < 10)
    exit;
}
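One way to see the difference between the two loop types is to start with a condition that is already false. The sketch below is only an illustration: the starting value 20 and the shell variable names do_out and while_out are arbitrary choices, not part of the original example.

```shell
# A do while body runs once even when the test is false from the start.
do_out=$(awk 'BEGIN {
    i = 20
    do {
        print "do while ran with i = " i
        i++
    } while (i < 10)
}')

# The equivalent while loop tests first, so its body never runs.
while_out=$(awk 'BEGIN {
    i = 20
    while (i < 10) {
        print "while ran with i = " i
        i++
    }
}')

echo "do while: $do_out"
echo "while:    $while_out"
```

The do while version prints its message once, and the while version prints nothing.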

For loops

There are two kinds of for loops in awk.

One kind of for loop initializes a variable, performs a test, and increments the variable together, performing commands while the test is true.

#!/usr/bin/awk -f

BEGIN {
    # Initialize i, test it, and increment it, all in one line
    for (i = 1; i <= 10; i++) {
        print i, "to the second power is", i*i;
    }
    exit;
}
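This kind of for loop is also a compact way to visit every field of a record. As a hedged sketch, the following adds up the numbers on each input line. The two input lines and the shell variable name sum_out are invented for the example.

```shell
# Sum the fields of every record with a counting for loop.
sum_out=$(printf '4 6 99\n3 10 8\n' | awk '{
    total = 0                        # reset the running total for each record
    for (i = 1; i <= NF; i++) {      # NF is the number of fields in the record
        total += $i
    }
    print "line " NR ": " total      # NR is the current record number
}')
echo "$sum_out"
```

With this input, the first line sums to 109 and the second to 21.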

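The other kind of for loop sets a variable to each index of an array in turn and performs a collection of commands for each index. This makes it the natural way to read back data that you have collected into an array while processing records. Awk does not guarantee the order in which the indices are visited.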

This example implements a simplified version of the Unix command uniq. Using each string as a key in an array called a, and incrementing that key's value each time the string occurs, gives you a count of the number of times each string appears (like the --count option of uniq). Unlike uniq, which only compares adjacent lines, this approach counts repeated strings wherever they occur in the input. If you print the keys of the array, you get every string that appears one or more times.

For example, using the demo file colours.txt (from the previous articles):

name       color  amount
apple      red    4
banana     yellow 6
raspberry  red    99
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5

Here is a simple version of uniq -c in awk form:

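#!/usr/bin/awk -f

# Skip the header line and count each value in the second column
NR != 1 {
    a[$2]++
}
# After the last record, print each count followed by its key
END {
    for (key in a) {
        print a[key] " " key
    }
}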
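Because the order of a for (key in a) loop is not specified, the lines can come out in any order. A simple way to get repeatable output is to pipe the result through sort. The sketch below assumes the demo data shown earlier and inlines it in a shell variable so it can run on its own. The variable names data and count_out are chosen for this example only.

```shell
# The demo data from the article, inlined so the example is self-contained.
data='name       color  amount
apple      red    4
banana     yellow 6
raspberry  red    99
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5'

# Count the colors, then sort on the second output column (the color name)
# so the result does not depend on the traversal order awk happens to use.
count_out=$(printf '%s\n' "$data" | awk '
NR != 1 { a[$2]++ }
END {
    for (key in a) {
        print a[key] " " key
    }
}' | sort -k2)
echo "$count_out"
```

Sorted by color, the counts are 2 brown, 1 green, 2 purple, 3 red, and 2 yellow.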

The third column of the sample data file contains the number of items listed in the first column. You can use an array and a for loop to tally the items in the third column by color:

#!/usr/bin/awk -f

BEGIN {
    FS=" ";
    OFS="\t";
    print("color\tsum");
}
NR != 1 {
    a[$2]+=$3;
}
END {
    for (b in a) {
        print b, a[b]
    }
}

As you can see, you are also printing a header row in the BEGIN block (which always runs exactly once) prior to processing the file.
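To check the tally against the demo data, you can run the same logic from a shell and sort the totals from largest to smallest. This is a sketch under a few assumptions: the data is inlined rather than read from colours.txt, the header line is left out so that every output line can be sorted, and the variable names data and tally_out are invented for the example.

```shell
# The demo data from the article, inlined so the example is self-contained.
data='name       color  amount
apple      red    4
banana     yellow 6
raspberry  red    99
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5'

# Add up the third column per color, then sort numerically on the sum,
# largest first. The header is omitted here to keep the output sortable.
tally_out=$(printf '%s\n' "$data" | awk '
BEGIN { OFS = "\t" }
NR != 1 { a[$2] += $3 }
END {
    for (b in a) {
        print b, a[b]
    }
}' | sort -k2,2nr)
echo "$tally_out"
```

For example, red is 4 + 99 + 3 = 106, which puts it first, followed by brown (13), purple (12), yellow (11), and green (8).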

Loops

Loops are a vital part of any programming language, and awk is no exception. Using loops can help you control how your awk script runs, what information it's able to gather, and how it processes your data. Our next article will cover switch statements, continue, and next.


Would you rather listen to this article? It was adapted from an episode of Hacker Public Radio, a community technology podcast by hackers, for hackers.

Seth Kenlon
Seth Kenlon is a UNIX geek, free culture advocate, independent multimedia artist, and D&D nerd. He has worked in the film and computing industry, often at the same time.
Dave Morriss is a retired IT Manager based in Edinburgh, Scotland. He worked in the UK higher education sector providing IT services to students and staff.


This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.