array and for loop in awk

June 25th, 2007 mysurface Posted in awk, Text Manipulation

grep is always a useful tool for analyzing logs. When the logs are neatly organized in rows and columns, awk can be much more efficient than grep. Refer to the simple examples of awk here.

Awk supports arrays and for loops. Let's say I have logs of the IPs that access my servers from time to time, and I want to count how often each IP connects. I can write a simple awk script that uses an array and a for loop to do that.

The IP logs may look like this:


190607 084849   202.178.23.4 ...
190607 084859   164.78.22.64 ...
190607 084909   202.188.3.2   ...
...

Column 1 is the date, column 2 is the time, and column 3 is the IP. Let's say my query is: for 19 June 2007, print every IP that accessed my servers and how many times each one did.

The awk script will look something like this:


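# Count accesses per IP for 19 June 2007 (date is column 1, IP is column 3).
$1 == "190607" { IP[$3]++ }

# After all input has been read, print each IP and its count.
END {
    for (a in IP)
        print a " access " IP[a] " times."
}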

If column 1 equals 190607, column 3 (the IP address) is used as an index of the array IP[], and the value stored under that index is increased by one. After all the logs have been read, awk enters the END block and prints the results using a for loop. The loop sets ‘a’ to each index of the array IP in turn, and prints ‘a’ together with its value in the array. The order in which the loop visits the indexes is not defined, so the output is not sorted. It may seem complicated at first; try reading it a few times, or just try it out. Please note that the opening curly brace must be placed on the same line as the keyword END.
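To see the counting in action without real logs, here is a minimal sketch that pipes a few made-up lines through the same awk program. The sample lines are invented for illustration (they only follow the date, time, IP layout described above), and the trailing sort is an addition so that the output order is predictable.

```shell
# Hypothetical sample data: two hits from one IP on 190607,
# one hit from another IP on 190607, and one hit on a different day.
result=$(printf '%s\n' \
    '190607 084849 202.178.23.4' \
    '190607 084859 164.78.22.64' \
    '190607 084909 202.178.23.4' \
    '200607 090000 202.188.3.2' |
  awk '$1 == "190607" { IP[$3]++ }
       END { for (a in IP) print a " access " IP[a] " times." }' |
  sort)

echo "$result"
```

With this input the 200607 line is skipped, 202.178.23.4 is counted twice and 164.78.22.64 once.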

Save the script as access.awk and run it with awk -f. Assuming all the log file names start with iplogs and have the extension .txt:


awk -f access.awk iplogs*.txt

It is as easy as that. You can even pass external values to awk scripts. Depending on how creative you are, you can write great awk scripts to ease your log analysis.
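One way to pass an external value is awk's -v option, which sets an awk variable before the program starts. The sketch below is only an illustration: the function name count_for_day and the variable name day are made up here, and the two input lines are invented sample data. It replaces the hard-coded date with a value given on the command line.

```shell
# count_for_day DATE: read log lines on standard input and count
# accesses per IP for the given date (hypothetical helper).
count_for_day() {
    # The shell expands "$1" (the function argument); inside the single
    # quotes, $1 and $3 are awk's own field references.
    awk -v day="$1" '$1 == day { IP[$3]++ }
        END { for (a in IP) print a " access " IP[a] " times." }'
}

# Invented sample lines; only the second one matches 200607.
printf '%s\n' \
    '190607 084849 202.178.23.4' \
    '200607 084859 164.78.22.64' |
  count_for_day 200607
```

To run it against real files, the same function could be fed with something like cat iplogs*.txt | count_for_day 190607.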
