
AWK

Lokesh (3054) Kushal (3053)

AWK is a programming language that is designed for processing text-based data.

INTRODUCTION
- Created at Bell Labs in the 1970s.
- The name AWK is derived from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
- AWK is one of the early tools to appear in Version 7 Unix and gained popularity as a way to add computational features to a Unix pipeline.

TASKS
- Tallying information from text files and creating reports from the results.
- Adding additional functions to text editors like "vi".
- Translating files from one format to another.
- Creating small databases.
- Performing mathematical operations on files of numeric data.

PARADIGMS
- Scripting language - Besides the Bourne shell, AWK is the only other scripting language available in a standard Unix environment.
- Procedural language - AWK is an example of a programming language that extensively uses the string datatype and associative arrays.
- Event driven - AWK reads the input a line at a time. Each line is scanned for each pattern in the program, and for each pattern that matches, the associated action is executed.

Structure of an AWK Program


An Awk program consists of:

- An optional BEGIN segment: processing to execute prior to reading input
- pattern - action pairs: processing for input data. For each pattern matched, the corresponding action is taken
- An optional END segment: processing after end of input data

    BEGIN   { action }
    pattern { action }
    pattern { action }
    . . .
    pattern { action }
    END     { action }
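
The three segments can be seen together in a small sketch. The input lines (name, hourly rate, hours worked) are made-up sample data, and the awk program is wrapped in a POSIX shell command only so that it can be run as written.

```shell
# Hypothetical sample data: name, hourly rate, hours worked.
out=$(printf 'Beth 4.00 0\nKathy 4.00 10\nMark 5.00 20\n' | awk '
BEGIN  { print "start" }          # runs once, before any input is read
$3 > 0 { print $1; n = n + 1 }    # runs for each line whose third field is positive
END    { print n, "matched" }     # runs once, after the last line
')
printf '%s\n' "$out"
```

This should print "start", then Kathy and Mark on separate lines, then "2 matched".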

HANDLING TEXT
- One major advantage of Awk is its ability to handle strings as easily as many languages handle numbers
- Awk variables can hold strings of characters as well as numbers, and Awk conveniently translates back and forth as needed
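
A minimal sketch of this back-and-forth conversion, using made-up values and hypothetical variable names s, n, and t:

```shell
out=$(awk 'BEGIN {
    s = "3" "4"       # concatenating two strings gives the string "34"
    n = s + 1         # arithmetic converts "34" to the number 34
    t = n " items"    # concatenation converts 35 back to a string
    print s, n, t
}')
printf '%s\n' "$out"
```

The same variable is treated as a string or a number depending on the operator applied to it; no explicit cast is written.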

BEGIN and END


- Special pattern BEGIN matches before the first input line is read; END matches after the last input line has been read
- This allows for initial and wrap-up processing

    BEGIN { print "NAME RATE HOURS"; print "" }
          { print }
    END   { print "total number of employees is", NR }

Hello world application


    BEGIN { print "Hello, world!" }

Some of the Built-In Variables


- NF - Number of fields in current record
- NR - Number of records read so far
- $0 - Entire line
- $n - Field n
- $NF - Last field of current record
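
As an illustration, the one-line program below prints these variables for each record of a two-line sample input (the input text is invented for the example):

```shell
out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF, $1, $NF, "[" $0 "]" }')
printf '%s\n' "$out"
```

For the first line it reports record 1 with 3 fields, first field a, last field c; for the second, record 2 with 2 fields.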

LIKE THE C LANGUAGE, IT HAS ...

OPERATORS
Operators in Increasing Precedence

- Assignment: =, +=, -=, *=, /=, %=, ^=
- Logical and match: ||, &&, ~, !~
- Relational: <, <=, ==, !=, >=, >
- Concatenation: blank
- Add/Subtract: +, -
- Multiply/divide/mod: *, /, %
- Unary plus, minus, not, exponent (^ or **)
- Increment, decrement
- Field: $expr

String Concatenation

- New strings can be created by combining old ones

    { names = names $1 " " }
    END { print names }

CONTINUED

- Arrays
  - Associative arrays (hash): index can be any value (integer or string)
  - Referencing creates entry: if (arr[x] != "") print "arr[x] exists" creates arr[x] as a side effect; use (x in arr) to test without creating it
- Control flow statements
  - if ( expr ) statement [ else statement ]
  - if ( subscript in array ) ...
  - while ( expr ) statement
  - for ( init_expr; test_expr; increment_expr ) statement
  - for ( subscript in array ) statement
  - do statement while ( expr )
  - break, continue, next, exit, return [expr]
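
The difference between referencing an element and testing membership with "in" can be sketched as follows. The array name count and its keys are hypothetical.

```shell
out=$(awk 'BEGIN {
    count["apple"] = 2
    # "in" tests membership without creating the element
    print (("pear" in count) ? "pear present" : "pear absent")
    # referencing count["pear"] creates it with an empty value
    if (count["pear"] == "") print "pear is empty"
    print (("pear" in count) ? "pear present" : "pear absent")
    n = 0
    for (k in count) n++          # loop over all subscripts
    print n, "entries"
}')
printf '%s\n' "$out"
```

After the reference, the array holds two entries even though only one was ever assigned.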

Command Line Arguments

- Accessed via built-ins ARGC and ARGV
- ARGC is set to the number of command line arguments
- ARGV[ ] contains each of the arguments
- For the command line awk 'program' filename (the program text itself is not counted):
  - ARGC == 2
  - ARGV[0] == "awk"
  - ARGV[1] == "filename"

ARGC/ARGV in Action
    # argv.awk - get a cmd line argument and display
    BEGIN {
        if (ARGC != 2) {
            print "Not enough arguments!"
        } else {
            print "Good evening,", ARGV[1]
        }
    }
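
A further sketch, assuming two made-up arguments alpha and beta, loops over ARGV. Because the program has only a BEGIN segment, awk never tries to open the arguments as input files. ARGV[0] is left out of the output since its value depends on how awk was invoked.

```shell
out=$(awk 'BEGIN {
    print "ARGC =", ARGC
    for (i = 1; i < ARGC; i++)    # ARGV[0] is the command name, so start at 1
        print i, ARGV[i]
}' alpha beta)
printf '%s\n' "$out"
```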

getline
- How do you get input into your awk script other than on the command line?
- The getline function provides input capabilities
- getline is used to read input from either the current input or from a file or pipe
- getline returns 1 if a record was present, 0 if an end-of-file was encountered, and -1 if some error occurred
- A plain getline sets $0, NF, NR, and FNR

getline from stdin


    # getline.awk - demonstrate the getline function
    BEGIN {
        print "What is your first name and major? "
        while (getline > 0)
            print "Hi", $1 ", your major is", $2 "."
    }
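
The return values of getline can also be observed when reading from a named file. This is a sketch under assumptions: the scratch file and its contents are invented, mktemp is assumed to be available, and file, missing, line, and last are hypothetical variable names.

```shell
tmp=$(mktemp)                     # scratch file with made-up contents
printf 'red\ngreen\n' > "$tmp"
out=$(awk -v file="$tmp" -v missing="$tmp.missing" 'BEGIN {
    # getline returns 1 for each record read and 0 at end of file
    while ((getline line < file) > 0) {
        n++
        last = line
    }
    close(file)
    print n, "records, last was", last
    # reading from a file that cannot be opened returns -1
    status = (getline line < missing)
    print status
}')
rm -f "$tmp"
printf '%s\n' "$out"
```

Comparing the result with "> 0" rather than using it directly as the loop condition matters: a bare while (getline line < file) would loop forever on an unreadable file, because -1 counts as true.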

Control Flow Statements


- Awk provides several control flow statements for making decisions and writing loops
- If-Else

    $2 > 6 { n = n + 1; pay = pay + $2 * $3 }
    END {
        if (n > 0)
            print n, "employees, total pay is", pay, "average pay is", pay/n
        else
            print "no employees are paid more than $6/hour"
    }

Loop Control
- While

    # interest - compute compound interest
    #   input: amount rate years
    #   output: compound value at end of each year
    {
        i = 1
        while (i <= $3) {
            printf("\t%.2f\n", $1 * (1 + $2) ^ i)
            i = i + 1
        }
    }

Loop Control
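- For

    # print the fields of each line in reverse order
    { for (i = NF; i > 0; i = i - 1) printf("%s ", $i)
      printf("\n") }

    # print the sum of the fields of each line
    { sum = 0
      for (i = 1; i <= NF; i = i + 1) sum = sum + $i
      print sum }

    # print the sum of all fields of all lines
    { for (i = 1; i <= NF; i = i + 1) sum = sum + $i }
    END { print sum }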

Built-In Functions
- Arithmetic
  - sin, cos, atan2, exp, int, log, rand, sqrt
- String
  - length, substitution, find substrings, split strings
- Output
  - print, printf, print and printf to file
- Special
  - system - executes a Unix command
    - system("clear") to clear the screen
    - Note double quotes around the Unix command
  - exit - stop reading input and go immediately to the END pattern-action pair if it exists, otherwise exit the script
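
The string operations listed above correspond to the standard awk functions length, gsub (and sub), index, substr, and split. A short sketch with made-up strings:

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                  # number of characters
    print index(s, "world")          # position of a substring, counted from 1
    print substr(s, 1, 5)            # 5 characters starting at position 1
    t = s
    gsub(/o/, "0", t)                # replace every match in t
    print t
    n = split("a:b:c", parts, ":")   # split into the array parts
    print n, parts[1], parts[3]
}')
printf '%s\n' "$out"
```

Note that gsub modifies its third argument in place and returns the number of replacements, which is why the string is copied into t first.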

Formatted Output
- printf provides formatted output
- Syntax is printf("format string", var1, var2, ...)
- Format specifiers
  - %c - single character
  - %d - number
  - %f - floating point number
  - %s - string
  - \n - NEWLINE
  - \t - TAB
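
A one-line sketch combining these specifiers; the values are arbitrary, and the width (5) and precision (.2) modifiers are added here for illustration:

```shell
out=$(awk 'BEGIN { printf("%s|%5d|%.2f|%c\n", "pay", 42, 3.14159, "A") }')
printf '%s\n' "$out"
```

Here %5d right-aligns the integer in a field five characters wide and %.2f rounds to two decimal places, giving pay|   42|3.14|A.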

Example: Transpose of matrix

    BEGIN {
        count = 1
        # fill and print a 5 x 3 matrix
        for (row = 1; row <= 5; ++row) {
            for (col = 1; col <= 3; ++col) {
                printf("%4d", count)
                array[row, col] = count++
            }
            printf("\n")
        }
        printf("\n")
        # print the transpose (3 x 5)
        for (col = 1; col <= 3; ++col) {
            for (row = 1; row <= 5; ++row) {
                printf("%4d", array[row, col])
            }
            printf("\n")
        }
        exit
    }

Thank you
