awk awk awk!

Note: This entry has been restored from old archives.

I use awk a lot in my day-to-day work.

Just now: Do a subversion move of all files in the cwd to uppercase file names (I’m not going to explain why):

ls | awk '{u=toupper($1);if(u!=$1){system("svn move "$1" "u);}}'

And regularly I do things like: Get average size of files in a corpus:

find ./corpus/ -type f -exec stat {} \; \
        | grep 'Size:' \
        | awk '{c+=1;d+=$2}END{print d/c/1024 " kB"}'
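The pipeline above leans on stat printing a "Size:" line, which not every stat does. A rough sketch of the same idea using wc -c instead, assuming nothing beyond find, wc and awk; avg_size_kb is a made-up name for illustration, and the guard on c just avoids a division by zero for an empty directory:

```shell
# avg_size_kb is a hypothetical helper: average file size under a directory, in kB.
avg_size_kb() {
    # "wc -c FILE" prints "<bytes> <name>"; awk only looks at the first field.
    find "$1" -type f -exec wc -c {} \; \
        | awk '{c+=1;d+=$1}END{if(c){print d/c/1024 " kB"}}'
}

# Usage: avg_size_kb ./corpus/
```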

Does it ever annoy you that grep doesn’t offer a total sum of matching lines?

grep -rihc 'foo' ./corpus/ | awk '{c+=$1}END{print c}'

Or that cut is only really useful with single char delimiters? The default awk field separator behaves like \s+ (any run of whitespace), so the second example above works as expected. You can also override the field separator with -F, which accepts simple regular expressions too.
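A quick sketch of the -F override on some made-up input (the sample strings are purely illustrative). The first uses a multi-character separator, the second a small regular expression:

```shell
# Multi-character separator: fields are split on "::".
printf 'alpha::beta::gamma\n' | awk -F'::' '{print $2}'

# Regular expression separator: a comma or semicolon followed by optional spaces.
printf 'one, two;three\n' | awk -F'[,;] *' '{print $3}'
```

The first command should print beta and the second three.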

I’ve written some largish bits of code in awk before. It isn’t that far off using the likes of perl, but I’d generally recommend using more modern options for larger scripts.[1] A sample of something I did in a fit of taking cross-platform too far[2] was to create an autoconf macro around this little monkey:

echo "The rabbit-hole went straight on like a \
tunnel for some way, and then dipped suddenly \
down, so suddenly that Alice had not a moment \
to think about stopping herself before she found \
herself falling down a very deep well." \
| awk -v width=50 -v pre='* ' '
BEGIN{
    line=pre;
}
{
    for (i = 1; i <= NF; i++)
    {
        if (length(line)+length($i) > width)
        {
            print line;
            line=pre;
        }
        line=line$i" "
    }
}
END{
    print line;
}'

I’ve modified it to make it into a hideous one-liner. In the original form it is an autoconf macro where the echoed string, the “pre” and the “width” are arguments. All to make failure messages that little bit neater.

What exactly does it do? It wraps a single line of text to a given width with a given prefix (in the process turning lumps of whitespace into single spaces). In this case the output is:

* The rabbit-hole went straight on like a tunnel 
* for some way, and then dipped suddenly down, so 
* suddenly that Alice had not a moment to think 
* about stopping herself before she found herself 
* falling down a very deep well.
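For anyone who wants the same thing outside autoconf, here is a minimal sketch as a plain shell function. The name wrap_text and the argument order (text, width, prefix) are my own invention for illustration, not the original macro's interface. Note that awk -v interprets backslash escapes in the values it is given, so a prefix containing backslashes would need extra care.

```shell
# wrap_text TEXT WIDTH PREFIX (hypothetical helper, not the original macro)
wrap_text() {
    printf '%s\n' "$1" | awk -v width="$2" -v pre="$3" '
    BEGIN { line = pre }
    {
        for (i = 1; i <= NF; i++) {
            # Flush the current line when the next word would pass the width.
            if (length(line) + length($i) > width) {
                print line
                line = pre
            }
            line = line $i " "
        }
    }
    END { print line }'
}

wrap_text "one two three four five six" 12 "> "
```

With those arguments it should produce three lines: "> one two ", "> three four " and "> five six ". As in the original, every line carries a trailing space, and a line can overshoot the width by one character because that trailing space is not counted when the word is added.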

Thank you awk (in fact in my particular case I should thank mawk).

There are plenty of awk examples and docs out there.


[1] I don’t mean to slight awk too much, it is a complete programming language and is quite easy to work with. It’s been around since 1977 and was designed and written by Alfred Aho, Peter Weinberger and Brian Kernighan (you should recognise at least two of those names!).

[2] You never know, there might be a system out there somewhere without perl installed. The project only compiles on Linux? Hrm, but we have to be prepared for the future!