
Perl Practical Extraction and Report Language

Prepared by A Brahmananda Reddy Assistant Professor/CSE VNRVJIET

A.Brahmananda Reddy / Assistant Professor / CSE / VNRVJIET

What is Perl?
Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.

The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).

Its major features are that it is easy to use, supports both procedural and object-oriented (OO) programming, has powerful built-in support for text processing, and has one of the world's most impressive collections of third-party modules.

Perl is interpreted. This means that as soon as you write your program, you can run it; there's no mandatory compilation phase. The same Perl program can run on Unix, Windows, NT, MacOS, DOS, OS/2, VMS and the Amiga.

Perl is collaborative. The CPAN software archive contains free utilities written by the Perl community, so you save time.

Perl is free. Unlike most other languages, Perl is not proprietary. The source code and compiler are free, and will always be free.

Perl is fast. The Perl interpreter is written in C, and more than a decade of optimizations have resulted in a fast executable.

Perl is flexible. The Perl motto is "there's more than one way to do it." The language doesn't force a particular style of programming on you. Write what comes naturally.

Perl is complete. The best support for regular expressions in any language, internal support for hash tables, a built-in debugger, facilities for report generation, networking functions, utilities for CGI scripts, database interfaces and arbitrary-precision arithmetic are all bundled with Perl.

Perl is secure. Perl can perform "taint checking" (to taint something is to spoil or contaminate it) to prevent security breaches. You can also run a program in a "safe" compartment to avoid the risks inherent in executing unknown code.

Perl is open for business. Thousands of corporations rely on Perl for their information processing needs.

Perl is simple to learn. Perl makes easy things easy and hard things possible. Perl handles tedious tasks for you, such as memory allocation and garbage collection.

Perl is concise. Many programs that would take hundreds or thousands of lines in other programming languages can be expressed in a page of Perl.

Perl is object oriented. Inheritance, polymorphism, and encapsulation are all provided by Perl's object oriented capabilities.

Perl is fun. Programming is meant to be fun, not only in the satisfaction of seeing our well-tuned programs do our bidding, but in the literary act of creative writing that yields those programs. With Perl, the journey is as enjoyable as the destination.

Why Perl?

Perl stands for Practical Extraction and Report Language.

Perl is similar to shell scripting, only it is much easier to use and closer in power to high-level programming languages.

Perl is free to download (for example from www.perl.org), so it is very easily accessible.

Perl is also available for MS-DOS, Windows NT and Macintosh.

Basic Concepts
Perl files use the extension .pl
Can create self-executing scripts
Advantage of Perl
Can use system commands
Comment entry
Print stuff on screen

Basics - Running Perl programs


To run a Perl program from the Unix command line:

    perl progname.pl

Alternatively, put one of these as the first line of your script (use the path where perl is installed on your system):

    #!/bin/perl
    #!/usr/bin/env perl

and run the script as /path/to/script.pl. This first line is what makes a Perl file self-executing: it tells the kernel where to look for perl. The .pl extension is only a naming convention. Of course, the script will need to be executable first, so run chmod 755 script.pl (under Unix).

The -w switch tells perl to produce extra warning messages about potentially dangerous constructs.

Basics
The advantage of Perl is that you don't have to compile, create an object file and then execute.

All statements have to end in ";".

You can run Unix commands by using system("unix command");

Eg: system("ls *");

This gives the directory listing on the terminal where the script is running.

Basics
The pound sign "#" is the symbol for a comment. There is no multiline comment syntax, so you have to use a # on each line.

The print command is used to write output on the screen.

Eg: print "this is CSE";
Prints "this is CSE" on the screen. It is similar to the printf statement in C, but takes no format specifiers.

If you want to use formats for printing you can use printf.

Basic syntax overview


A Perl script or program consists of one or more statements. These statements are simply written in the script in a straightforward fashion. There is no need to have a main() function or anything of that kind.

Perl statements end in a semi-colon:

    print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

    # This is a comment

Whitespace is irrelevant:

    print
        "Hello, world"
        ;

... except inside quoted strings:

    # this would print with a linebreak in the middle
    print "Hello
    world";

Double quotes or single quotes may be used around literal strings:

    print "Hello, world";
    print 'Hello, world';

However, only double quotes "interpolate" variables and special characters such as newlines (\n):

    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

    print 42;

You can use parentheses for functions' arguments or omit them according to your personal taste. They are only required occasionally to clarify issues of precedence.

    print("Hello, world\n");
    print "Hello, world\n";

All of the following print the same line:

    print("Hello World!\n");
    print ("Hello World!\n");
    print "Hello World!", "\n";
    print "Hello", " ", "World!", "\n";

Beyond 'Hello world!'


Before we go on to look at Perl in detail, here are some small examples to give a flavour of Perl and some indication of its power.

Example 1: Print lines containing the string 'Shazzam!'

    #!/usr/bin/perl
    while (<STDIN>) { print if /Shazzam!/ };

This one-liner reads from standard input and prints out all the lines that contain the string 'Shazzam!'.

<STDIN> is a bit of Perl magic that delivers the next line of input each time round the loop.

At end-of-file it delivers a special value called undef, which terminates the while loop.

Implied anonymous variable: Example 1 relies on an implied anonymous variable, analogous to the pronoun 'it', and on an implied pattern match: 'print it if it matches /Shazzam!/'. If we wanted to spell everything out, we would change the code to the version shown in Example 2.

Example 2: The same thing the hard way

    while ($line = <STDIN>) { print $line if $line =~ /Shazzam!/ };

Here we use a variable, $line, instead of the anonymous 'it'. The $ indicates that it is a scalar variable (as opposed to an array).

Example 3: Print lines not containing the string 'Shazzam!'

If we want to print all lines except those containing the pattern, we just change if to unless.

    #!/usr/bin/perl
    while (<STDIN>) { print unless /Shazzam!/ };
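The three examples can be tried without typing input by hand. The following is a minimal sketch, assuming a made-up sample text opened as an in-memory file in place of STDIN; the variable names are illustrative and not part of the slides.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for whatever arrives on STDIN.
my $text = "Shazzam! said the wizard\nnothing to see here\nHe cried Shazzam! twice\n";

# Open the string as an in-memory file so it can be read line by line.
open(my $in, '<', \$text) or die "Cannot open in-memory file: $!";

my (@with, @without);
while (my $line = <$in>) {
    if ($line =~ /Shazzam!/) {
        push @with, $line;       # what Examples 1 and 2 would print
    } else {
        push @without, $line;    # what Example 3 would print
    }
}
close($in);

print "Lines containing the pattern:\n", @with;
print "Lines not containing the pattern:\n", @without;
```

Replacing $in with STDIN gives the behaviour of the original one-liners.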

NAMES AND VALUES IN Perl


Like any other procedural language, Perl manipulates variables which have a name (or identifier) and a value: a value is assigned to (or stored in) a variable by an assignment statement of the form

    name = value;

Variable names resemble nouns in English (command names are verbs).
A singular name is associated with a variable that holds a single item of data, called a scalar value.
A plural name is associated with a variable that holds a collection of data items, called an array or hash.
Variable names start with a character that denotes the kind of thing that the name stands for: scalar data ($), array (@), hash (%), subroutine (&) etc.

Scalar data : Strings and Numbers


Internally, numbers are stored as signed integers if possible, and otherwise as double length floating point numbers in the system's native format.

Strings are stored as sequences of bytes of unlimited length. Although we tend to think of strings as sequences of printable characters, Perl attaches no significance to any of the 256 possible values of a byte.

Perl is a dynamically typed language.

Boolean Values: Perl adopts the simple rule that numeric zero, "0" and the empty string ("") mean false, and anything else means true.
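The Boolean rule can be checked directly. This is a small sketch with illustrative sample values (not from the slides); note that strings such as "0.0" and "00" count as true because they are not exactly the string "0".

```perl
use strict;
use warnings;

# Record the truth value of a few sample scalars (the values are illustrative).
my %is_true;
for my $value (0, "0", "", "0.0", "00", " ", 1, "a") {
    $is_true{$value} = $value ? 1 : 0;
    printf "%-6s is %s\n", "'$value'", ($value ? "true" : "false");
}
```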

Numeric and String constants:


Numeric constants (number literals) can be written in a variety of ways, including scientific notation, octal and hexadecimal.

Numerical Literals

    6            Integer
    12.6         Floating point
    1e10         Scientific notation
    6.4E-33      Scientific notation
    4_348_348    Underscores instead of commas for long numbers
    0777         Octal
    0x3fff       Hex

String constants:

String constants (string literals) can be enclosed in single or double quotes.

The string is terminated by the next occurrence of the quote (single or double) which started it, so a single-quoted string can include double quotes and vice versa.

Single quoted strings are treated 'as is' with no interpretation of their contents except the usual use of backslash to remove special meanings, so to include a single quote in such a string it must be preceded with a backslash, and to include a backslash you use two of them.
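The quoting rules above can be sketched as follows; the variable names and sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $name = "world";

my $double  = "Hello, $name\n";          # $name and \n are interpreted
my $single  = 'Hello, $name\n';          # kept exactly as typed
my $escaped = 'It\'s a back\\slash';     # \' and \\ are the only escapes in single quotes
my $nested  = 'She said "hi"';           # double quotes need no escaping here

print $double;
print $single, "\n";
print $escaped, "\n";
print $nested, "\n";
```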

Standard Files
STDIN: the normal input channel for the script.

STDOUT: the normal output channel.

STDERR: the normal output channel for errors.

Example
    #!/usr/local/bin/perl -w
    print "Enter the Text ";
    $input = <STDIN>;                 # Reads the input and stores it in the variable $input
    chomp($input);                    # Removes the newline character
    print "entered text = $input";    # Prints the input on the command line

This program displays:

    Enter the Text Perl is awesome
    entered text = Perl is awesome

Perl reads the input as "Perl is awesome\n": the newline typed at the end of the line is part of the text that <STDIN> returns, so use chomp to remove it.

Variable Interpolation
Interpolation takes place only in double quotation marks.

Example

    #!/usr/local/bin/perl -w
    $x = 12;                      # Assign the value to the variable
    print 'Value of x is $x';     # Prints the output

This program displays:

    Value of x is $x

Single quotation marks will not interpolate the values (no processing is done).

Example
    #!/usr/local/bin/perl -w
    $x = 12;                      # Assign the value to the variable
    print "Value of x is $x";     # Prints the output

This program displays:

    Value of x is 12

Double quotation marks interpolate the values (the variable is replaced by its content).

Integers
Integers are usually expressed in decimal (base 10) but can be specified in several different formats.

    234       decimal integer
    0765      octal integer
    0b1101    binary integer
    0xcae     hexadecimal integer

Converting a number from one base to another can be done using the sprintf function. Values can be displayed in different bases using the printf function.

Example
    #!/usr/local/bin/perl -w
    $bin = 0b1010;
    $hex = sprintf "%x", $bin;
    $oct = sprintf "%o", 45;
    print "binary = $bin\nhexa = $hex\noctal = $oct";

This program displays:

    binary = 10
    hexa = a
    octal = 55

Note that $bin holds the number ten, so it prints as 10; use sprintf "%b" to display its binary digits (1010).

Example

    #!/usr/local/bin/perl -w
    $x = 98;
    printf("Value in decimal = %d\n", $x);
    printf("Value in octal = %o\n", $x);
    printf("Value in binary = %b\n", $x);
    printf("Value in hexadecimal = %x\n", $x);

This program displays:

    Value in decimal = 98
    Value in octal = 142
    Value in binary = 1100010
    Value in hexadecimal = 62
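sprintf and printf convert a number into text in another base. For the opposite direction, Perl's built-in hex and oct functions turn such text back into a number. A minimal sketch with illustrative values:

```perl
use strict;
use warnings;

# Text in another base -> number
my $from_hex    = hex("ff");        # 255
my $from_oct    = oct("755");       # 493
my $from_bin    = oct("0b1010");    # 10  (oct also understands the 0b prefix)
my $from_prefix = oct("0x1f");      # 31  (and the 0x prefix)

# Number -> text in another base
my $as_binary = sprintf("%b", 10);  # "1010"
my $as_hex    = sprintf("%x", 255); # "ff"

print "$from_hex $from_oct $from_bin $from_prefix $as_binary $as_hex\n";
```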

Escaped Sequences
Character strings that are enclosed in double quotes accept escape sequences for special characters. The escape sequences consist of a backslash (\) followed by one or more characters.

    Escape Sequence   Description
    \b                Backspace
    \e                Escape
    \f                Form feed
    \l                Forces the next letter into lower case
    \L                All following letters are lower case
    \u                Forces the next letter into upper case
    \U                All following letters are upper case
    \r                Carriage return
    \v                Vertical tab in C; Perl strings do not recognize \v, so write \x0B instead

Built-in functions : Function and its Description


chomp(): The chomp() function will (usually) remove any newline character from the end of a string. The reason we say usually is that it actually removes any trailing string that matches the current value of $/ (the input record separator), and $/ defaults to a newline.
Ex: chomp($text);

chop(): The chop() function will remove the last character of a string (or group of strings) regardless of what that character is.
Ex: chop($text);
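The difference between the two functions shows up in their return values as well. A short sketch, using made-up sample strings:

```perl
use strict;
use warnings;

my $line = "Perl is awesome\n";
my $removed_first  = chomp($line);   # returns the number of characters removed
my $removed_second = chomp($line);   # nothing left to remove this time

my $word = "Perl";
my $last_char = chop($word);         # returns the character that was removed

print "chomp removed $removed_first, then $removed_second; line is '$line'\n";
print "chop removed '$last_char'; word is '$word'\n";
```

chomp is safe to call repeatedly, whereas each call to chop shortens the string by one more character.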

Built-in functions : Function and its Description(Cont..)


chr(): Returns the character represented by the given number in the character set.
Ex: chr(65) gives A.

ord(): Returns the ASCII numeric value of the first character of the given expression.
Ex: ord('A') gives 65.

Operators
Operators can be broadly divided into 4 types.

Unary operators take one operand. Example: the not operator, i.e. !
Binary operators take two operands. Example: the addition operator, i.e. +
Ternary operators take three operands. Example: the conditional operator, i.e. ?:
List operators take a list of operands. Example: the print operator

Arithmetic Operators
    Operator   Description
    +          Adds two numbers
    -          Subtracts two numbers
    *          Multiplies two numbers
    /          Divides two numbers
    ++         Increments by one (same as in C)
    --         Decrements by one
    %          Gives the remainder (10 % 2 gives 0, 10 % 3 gives 1)
    **         Gives the power of the number. print 2**5;   # prints 32

Shift Operators
Shift operators manipulate integer values as binary numbers, shifting their bits to the left or to the right by the given number of positions.

    Operator   Description
    <<         Left shift.   print 2 << 3;    # left shift by three positions, prints 16
    >>         Right shift.  print 42 >> 2;   # right shift by two positions, prints 10
    x          Repetition operator

    Ex:  print "hi" x 3;            # Output: hihihi
    Ex2: @array = (1, 2, 3) x 3;    # array contains (1,2,3,1,2,3,1,2,3)
    Ex3: @array = (2) x 80;         # 80-element array of value 2

Logical Operators
Logical operators are represented by either symbols or names. These two sets are identical in operation, but have different precedence.

    Operator     Description
    && or and    Returns true if both operands are true
    || or or     Returns true if either operand is true
    xor          Returns true if only one operand is true
    ! or not     (Unary) Returns true if the operand is false

The ! operator has a much higher precedence than even && and ||. The not, and, or, and xor operators have the lowest precedence of all Perl's operators, with not being the highest of the four.
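The precedence difference matters most next to an assignment. The sketch below assumes a hypothetical helper, zero(), that simply returns 0; it shows how the same-looking statement is grouped differently for || and or.

```perl
use strict;
use warnings;

# Hypothetical helper that always returns a false value.
sub zero { return 0 }

# || binds tighter than =, so this is: $with_symbol = (zero() || 5)
my $with_symbol = zero() || 5;

# or binds looser than =, so this is: ($with_word = zero()) or ($fallback_ran = 1)
my $fallback_ran = 0;
my $with_word = zero() or $fallback_ran = 1;

print "with ||: $with_symbol\n";
print "with or: $with_word (fallback ran: $fallback_ran)\n";
```

This is why or is the usual choice after a complete statement, as in open(...) or die, while || is used inside expressions to supply a default value.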

Bitwise Operators
Bitwise operators treat their operands as binary values and perform a logical operation between the corresponding bits of each value.

    Operator   Description
    &          Bitwise AND
    |          Bitwise OR
    ^          Bitwise XOR
    ~          Bitwise NOT

Comparison Operators
The comparison operators are binary, returning a value based on a comparison of the two expressions.

    Operator   Description
    <          Less than
    >          Greater than
    ==         Equality
    !=         Inequality
    <=         Less than or equal
    >=         Greater than or equal
    <=>        Does not return a Boolean value. It returns -1 if left is less than right, 0 if left is equal to right, 1 if left is greater than right

Comparison Operators on strings


    Operator   Description
    eq         Returns true if operands are equal
    ne         Returns true if operands are not equal
    lt         Returns true if left operand is less than right
    le         Returns true if left operand is less than or equal to right
    gt         Returns true if left operand is greater than right
    ge         Returns true if left operand is greater than or equal to right
    cmp        Does not return a Boolean value. It returns -1 if left is less than right, 0 if left is equal to right, 1 if left is greater than right
    . (dot)    Concatenation operator. It takes two strings and joins them

Ex: print "System" . "Verilog";
It prints SystemVerilog.
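The three-way operators <=> and cmp are most often seen inside sort, where their -1/0/1 result decides the ordering. A brief sketch with illustrative values, which also shows why the numeric and string families must not be mixed up:

```perl
use strict;
use warnings;

my @values = (10, 9, 100, 1);

my @numeric_order = sort { $a <=> $b } @values;   # compares as numbers
my @string_order  = sort { $a cmp $b } @values;   # compares character by character

my $numbers_equal = ("10" == 10.0)   ? "yes" : "no";   # numerically equal
my $strings_equal = ("10" eq "10.0") ? "yes" : "no";   # different strings

print "numeric: @numeric_order\n";
print "string:  @string_order\n";
print "== says $numbers_equal, eq says $strings_equal\n";
```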

Binding operator
The binding operator, =~, binds a scalar expression to a pattern match.

String operations like s///, m// and tr/// work with $_ by default. By using the binding operator you can make them work on a scalar variable other than $_.

The value returned from =~ is the return value of the regular expression function; it is a false value if the match failed.

The !~ operator performs a logical negation of the returned value for conditional expressions, that is, it returns 1 if the match fails and '' if it succeeds, in both scalar and list contexts.
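A minimal sketch of the binding operators applied to a named variable rather than $_; the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "Hello, World";

my $has_world = ($text =~ /World/) ? 1 : 0;   # m// bound to $text
my $no_digits = ($text !~ /\d/)    ? 1 : 0;   # true because no digit matches

# Copy first, then bind the substitution to the copy so $text is unchanged.
(my $replaced = $text) =~ s/World/Perl/;

# tr/// bound to a copy: translate lower case to upper case.
(my $upper = $text) =~ tr/a-z/A-Z/;

# tr with an empty replacement list only counts the matching characters.
my $l_count = ($text =~ tr/l//);

print "$has_world $no_digits $replaced $upper $l_count\n";
```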

Operators on Scalar Variables


Numeric and Logic Operators

    Typical: +, -, *, /, %, ++, --, +=, -=, *=, /=, ||, &&, ! etc.
    Not typical: ** for exponentiation

String Operators

    Concatenation: . (similar to strcat)

    $first_name = "Larry";
    $last_name = "Wall";
    $full_name = $first_name . " " . $last_name;

Equality Operators for Strings


Equality / Inequality: eq and ne

    $language = "Perl";
    if ($language == "Perl") ...    # Wrong!
    if ($language eq "Perl") ...    # Correct

Use eq / ne rather than == / != for strings.

Relational Operators for Strings


    Comparison                  Numeric   String
    Greater than                >         gt
    Greater than or equal to    >=        ge
    Less than                   <         lt
    Less than or equal to       <=        le

String Functions
Convert to upper case

$name = uc($name);
Convert only the first char to upper case

$name = ucfirst($name);

Convert to lower case

$name = lc($name);
Convert only the first char to lower case

$name = lcfirst($name);

A String Example Program


    #!/usr/local/bin/perl
    $var1 = "larry";
    $var2 = "moe";
    $var3 = "shemp";

    print ucfirst($var1);         # Prints 'Larry'
    print uc($var2);              # Prints 'MOE'
    print lcfirst(uc($var3));     # Prints 'sHEMP'

Variable Interpolation
Perl looks for variables inside double-quoted strings and replaces them with their value:

    $stooge = "Larry";
    print "$stooge is one of the three stooges.\n";

Produces the output:

    Larry is one of the three stooges.

This does not happen when you use single quotes:

    print '$stooge is one of the three stooges.\n';

Produces the output:

    $stooge is one of the three stooges.\n

Character Interpolation
List of character escapes that are recognized when using double quoted strings:

    \n    newline
    \t    tab
    \r    carriage return

Common example:

    print "Hello\n";    # prints Hello and then a newline

Numbers and Strings are Interchangeable


If a scalar variable looks like a number and Perl needs a number, it will use it as a number.

    $a = 4;            # a number
    print $a + 18;     # prints 22
    $b = "50";         # looks like a string, but ...
    print $b - 10;     # will print 40!
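Which way a value is treated depends on the operator, not on the variable. The following sketch uses illustrative values; the no warnings line is there only because a string such as "3 apples" would otherwise trigger a "not numeric" warning under -w.

```perl
use strict;
use warnings;

my $text = "50";

my $difference = $text - 10;    # numeric operator: string is used as a number
my $joined     = $text . 10;    # string operator: number is used as a string

my $partial;
{
    no warnings 'numeric';      # silence the "isn't numeric" warning for this demo
    $partial = "3 apples" + 2;  # only the leading digits are used
}

print "$difference $joined $partial\n";
```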

Blocks and Conditions


A block is just a sequence of one or more statements enclosed in braces.

The last statement in the block is terminated by the closing brace.

The control structures in Perl use conditions to control the evaluation of one or more blocks (the body of a subroutine is also a block).

Bare block: Blocks can in fact appear almost anywhere that a statement can appear; such a block is sometimes called a bare block.

Blocks and Conditions


A condition is just a Perl expression which is evaluated in a Boolean context: if it evaluates to zero or the empty string the condition is treated as false, otherwise it is treated as true, in accord with the rules already given.

Conditions usually make use of the relational operators, and several simple conditions can be combined into a complex condition using the logical operators.

A condition can be negated using the ! operator:

    ! ($total > 50 and $total < 100)

If-then-else statements
    if ($total > 0) {
        print "$total\n"
    } else {
        print "bad total!\n"
    }

    if ($total > 70) {
        $grade = "A";
    } elsif ($total > 50) {
        $grade = "B";
    } elsif ($total > 40) {
        $grade = "C";
    } else {
        $grade = "F"; $total = 0
    }

Alternatives to if-then-else
A common idiom is to use a conditional expression in place of an if-then-else construct. Thus

    if ($a < 0) {$b = 0} else {$b = 1};

can be written

    $b = ($a < 0) ? 0 : 1;

Another common idiom is, as we have seen, to use the 'or' operator between statements:

    open(IN, $ARGV[0]) or die "Can't open $ARGV[0]\n";

Statement qualifiers
A single statement (but not a block) can be followed by a conditional modifier, as in the English 'I'll come if it's fine'.

    print "OK\n"      if $volts >= 1.5;
    print "Weak\n"    if $volts >= 1.2 and $volts < 1.5;
    print "Replace\n" if $volts < 1.2;

This is readable and self-documenting: compare the following code using conditional expressions, which has the same effect:

    print (($volts >= 1.5) ? "OK\n" : (($volts >= 1.2) ? "Weak\n" : "Replace\n"));

Repetition
Perl provides a variety of repetition mechanisms to suit all tastes, including both testing loops and 'counting' loops.

Testing loops:

    while ($a != $b) {
        if ($a > $b) {
            $a = $a - $b
        } else {
            $b = $b - $a
        }
    }
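The testing loop above repeatedly subtracts the smaller value from the larger until both are equal, which leaves the greatest common divisor. A sketch that wraps it in a subroutine follows; the subroutine name is hypothetical, and the loop only terminates for positive integers.

```perl
use strict;
use warnings;

# Hypothetical helper name; the loop body mirrors the testing loop on the slide.
# Assumes both arguments are positive integers, otherwise the loop never ends.
sub gcd_by_subtraction {
    my ($x, $y) = @_;
    while ($x != $y) {
        if ($x > $y) {
            $x = $x - $y;
        } else {
            $y = $y - $x;
        }
    }
    return $x;
}

print gcd_by_subtraction(48, 18), "\n";   # prints 6
```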

Repetition
while can be replaced by until to give the same effect as explicit negation of the condition. Likewise, statements (but not blocks) can use while and until as statement modifiers to improve readability:

    $a += 2 while $a < $b;
    $a += 2 until $a > $b;

Note particularly, though, that this is purely syntactic sugar, a notational convenience. Although the condition is written after the statement, it is evaluated before the statement is executed, just like any other while/until loop: if the condition is initially false the statement will never be executed (a zero-trip loop).

Repetition
For a loop whose body must be executed at least once, Perl provides the do loop. Strictly speaking, do is a built-in function rather than a syntactic construct. The condition attached to a do loop looks superficially the same as a statement modifier, but the semantics are that the condition is tested after execution of the block, so the block is executed at least once.

    do { ... } while $a != $b;
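The difference between testing before and testing after the body can be seen by counting iterations when the condition is false from the start. A minimal sketch with illustrative variable names:

```perl
use strict;
use warnings;

# The condition ($n < 5) is false from the start in both loops.
my $n = 10;
my $while_runs = 0;
while ($n < 5) {
    $while_runs++;
    $n++;
}

$n = 10;
my $do_runs = 0;
do {
    $do_runs++;
    $n++;
} while ($n < 5);

print "while body ran $while_runs times, do body ran $do_runs times\n";
```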

Repetition
The while can be replaced by until with the obvious meaning, and the modifier can be omitted entirely.

A do statement without a modifier executes the statements of the block and returns the value of the last expression evaluated in the block.

Counting loops: Counting loops use the same syntax as C:

    for ($i = 1; $i <= 10; $i++) {
        $i_square = $i*$i;
        $i_cube = $i**3;
        print "$i\t$i_square\t$i_cube\n";
    }

Repetition
There is also a foreach construct, which takes an explicit list of values for the controlled variable.

    foreach $i (1 .. 10) {
        $i_square = $i*$i;
        $i_cube = $i**3;
        print "$i\t$i_square\t$i_cube\n";
    }

To count backwards, one would write

    foreach $i (reverse 1 .. 10) {
        $i_square = $i*$i;
        $i_cube = $i**3;
        print "$i\t$i_square\t$i_cube\n";
    }

Loop Type and Description


while loop: Repeats a statement or group of statements while a given condition is true. It tests the condition before executing the loop body.

until loop: Repeats a statement or group of statements until a given condition becomes true. It tests the condition before executing the loop body.

for loop: Executes a sequence of statements multiple times and abbreviates the code that manages the loop variable.

Loop Type and Description


foreach loop: The foreach loop iterates over a normal list value and sets the variable VAR to be each element of the list in turn.

do...while loop: Like a while statement, except that it tests the condition at the end of the loop body.

nested loops: You can use one or more loops inside another while, for or do...while loop.

How to Store Values


Scalar variables
List variables: push, pop, shift, unshift, reverse
Hashes: keys, values, each
Read from terminal, command line arguments
Read and write to files

Scalar Variables
Scalar variable names are always preceded by the $ symbol.

There is no need to declare the variable beforehand. There are no data types such as character or numeric. A scalar variable can store only one value at a time: if you treat it as a character it can store a character, if you treat it as a string it can store one string, and if you treat it as a number it can store one number.

Eg: $name = "betty";
The value betty is stored in the scalar variable $name.

Eg: print "$name \n";
The output on the screen will be betty.

The default value for all variables is undef, which is equivalent to null.

Example:

    $var = 1;                # integer
    $var = "Hello_world";    # string
    $var = 2.65;             # decimal number
    $3var = 123;             # Error, shouldn't start with a number

Names begin with $, followed by a letter, then by letters, digits or underscores.

    $animal = "camel";
    $answer = 42;
    print $animal;
    print "The animal is $animal\n";
    print "The square of $answer is ", $answer * $answer, "\n";

Lists

A list is a collection of variables, constants (numbers or strings) or expressions, which is to be treated as a whole. It is written as a comma-separated sequence of values, e.g.

    "red", "green", "blue"
    255, 128, 66
    $a, $b, $c
    $a + $b, $a - $b, $a*$b, $a/$b

A list often appears in a script enclosed in round brackets, to satisfy precedence rules, e.g.

    ("red", "green", "blue")

It is important to appreciate that the brackets are not a required part of the list syntax.

Applying the principle that the language should always do what is natural, 'obvious' shorthand is acceptable in lists, e.g.

    (1 .. 8)
    ("A" .. "H", "O" .. "Z")

and to save tedious typing,

    qw(the quick brown fox)

is a shorthand for

    ("the", "quick", "brown", "fox")

Other delimiters can be used as well:

    qw/the quick brown fox/
    qw|the quick brown fox|

The 'matching brackets' rule also applies to the qw operator.

List magic
Lists are often used in connection with arrays and hashes. A list containing only variables can appear as the target of an assignment and/or as the value to be assigned. This makes it possible to write simultaneous assignments, e.g.

    ($a, $b, $c) = (1, 2, 3);

and to perform swapping or permutation without using a temporary variable, e.g.

    ($a, $b) = ($b, $a);
    ($b, $c, $a) = ($a, $b, $c);
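A short sketch of simultaneous assignment, swapping and rotation. The variables are named $x, $y and $z here only to avoid clashing with the special sort variables $a and $b under use strict.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (1, 2, 3);    # simultaneous assignment

($x, $y) = ($y, $x);            # swap: the right side is evaluated first
my @after_swap = ($x, $y, $z);

($y, $z, $x) = ($x, $y, $z);    # permutation without a temporary variable
my @after_rotate = ($x, $y, $z);

print "after swap:   @after_swap\n";
print "after rotate: @after_rotate\n";
```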

Arrays
An array is an ordered collection of data whose components are identified by an ordinal index: it is usually the value of an array variable. The name of such a variable always starts with an @, e.g. @days_of_week, denoting a separate namespace and establishing a list context.

The association between arrays and lists is a close one: an array stores a collection, and a list is a collection, so it is natural to assign a list to an array, e.g.

    @rainfall = (1.2, 0.4, 0.3, 0.1, 0, 0, 0);

A list can occur as an element of another list. The inner list is inserted in a linear one-level structure, so that

    @foo = (1, 2, 3, "string");
    @foobar = (4, 5, @foo, 6);

gives @foobar the value (4, 5, 1, 2, 3, "string", 6).
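The flattening can be confirmed by counting the elements of the combined array. A minimal sketch using the same arrays:

```perl
use strict;
use warnings;

my @foo    = (1, 2, 3, "string");
my @foobar = (4, 5, @foo, 6);      # @foo is expanded in place, not nested

my $element_count = scalar(@foobar);

print "foobar has $element_count elements: @foobar\n";
```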

Arrays: An array represents a list of values

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed = ("camel", 42, 1.23);

Arrays are zero-indexed. Here's how you get at elements in an array:

    print $animals[0];    # prints "camel"
    print $animals[1];    # prints "llama"

The special variable $#array tells you the index of the last element of an array:

    print $mixed[$#mixed];    # last element, prints 1.23

The elements we're getting from the array start with a $ because we're getting just a single value out of the array.

To get multiple values from an array:

    @animals[0,1];             # gives ("camel", "llama");
    @animals[0..2];            # gives ("camel", "llama", "owl");
    @animals[1..$#animals];    # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

    my @sorted = sort @animals;
    my @backwards = reverse @numbers;
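The indexing, slicing and list operations can be combined into one runnable sketch; the result variable names are illustrative.

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");
my @numbers = (23, 42, 69);

my @first_two = @animals[0, 1];               # array slice
my @rest      = @animals[1 .. $#animals];     # everything except the first element
my $last      = $animals[$#animals];          # single element, hence the $

my @sorted    = sort @animals;                # default sort compares as strings
my @backwards = reverse @numbers;

print "first two: @first_two\n";
print "rest:      @rest\n";
print "last:      $last\n";
print "sorted:    @sorted\n";
print "backwards: @backwards\n";
```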

There are a couple of special arrays too, such as @ARGV (the command line arguments to your script) and @_ (the arguments passed to a subroutine). These are documented in perlvar.


Arrays: List and Scalar Context


Perl performs each operation in either a list context or a scalar context.

List context: the 'target' of an operation is a collection.
Scalar context: the target is a single data item.

In an assignment to an array, the @ of the array name on the left-hand side establishes a list context, but when an element of an array is being accessed the occurrence of the same name but with a leading $ establishes a scalar context.

Some items that can occur in either context modify their behavior depending on the current context.

A list evaluated in a scalar context delivers its last item:

    @foo = (101, 102, 103);    # sets all 3 values of the array foo, but
    $foo = (101, 102, 103);    # sets $foo to 103.

If you assign a scalar value to an array, Perl does the sensible thing and puts brackets round it to make it a list of one element, so

    @a = "candy";

has the same effect as

    @a = ("candy");

If you assign an array to a scalar value, the value assigned is the length of the array:
    $n = @foo;   # assigns the value 3 to $n
List context can be established in other ways.
Eg: the print function establishes a list context, since it expects a list of things to print. But this can lead to unexpected results. Eg:
    $n = @foo;
    print "array foo has $n elements\n";     # prints the length
    print "array foo has @foo elements\n";   # prints the entire contents of the array
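To see the two contexts side by side, here is a minimal sketch. The array @foo comes from the slides; the other variable names are made up for illustration.

```perl
use strict;
use warnings;

my @foo = (101, 102, 103);

my $n        = @foo;           # scalar context: the number of elements
my ($first)  = @foo;           # parentheses on the left give list context
my $explicit = scalar(@foo);   # scalar() forces scalar context
my $joined   = "@foo";         # interpolation joins the elements with spaces

print "array foo has $n elements\n";
print "first element: $first\n";
print "array foo has $joined elements\n";
```

Note how my ($first) = @foo picks up the first element, whereas my $n = @foo picks up the count: the only difference is the pair of parentheses.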

Hashes

Associative arrays or content-addressable arrays. An associative array is one in which each element has two components, a key and a value, the element being 'indexed' by its key (just like a table). Such arrays are usually stored in a hash table to facilitate efficient retrieval.
A particular attraction of hashes is that they are elastic: they expand to accommodate new elements as they are introduced.
Names of hashes in Perl start with a % character: such a name establishes a list context. As with arrays, since each element in a hash is itself a scalar, a reference to an element uses $ as the first character, establishing a scalar context.

The index (or key) is a string enclosed in braces (curly brackets): if the key is a single 'word', i.e. does not contain space characters, explicit quotes are not required.
    $somehash{aaa} = 123;       # The braces establish that the scalar item is being assigned to an element of a hash.
    $somehash{234} = "bbb";     # The key is a three-character string, not a number.
    $somehash{"$a"} = 0;        # The key is the current value of $a.
    %anotherhash = %somehash;   # The leading % establishes a list context: the target of the assignment is the hash itself, not one of its values.

List Variables

They are like arrays. A list variable can be considered as a group of scalar variables.
They are always preceded by the @ symbol.
Eg: @names = ("betty","veronica","tom");
As in C, the index starts from 0. If you want the second name you should use $names[1]. Watch the $ symbol here: it is used because each element is a scalar variable.
Evaluating the list variable in a scalar context gives the length of the list. Eg: $n = @names; here will give you the value 3, while $#names gives the index of the last element, 2.

push, pop, shift, unshift, reverse
These are operators operating on list variables.
push and pop treat the list variable as a stack and operate on it. They act on the end with the highest subscript.
Eg: push(@names,"lily") , now @names will contain ("betty","veronica","tom","lily").
Eg: pop(@names) will return "lily", which is the last value, and @names will contain ("betty","veronica","tom").

shift, unshift, reverse
shift and unshift act on the end with the lowest subscript.

Eg: unshift(@names,"lily") , now @names contains ("lily","betty","veronica","tom").

Eg: shift(@names) returns "lily" and @names contains ("betty","veronica","tom").

reverse reverses the list and returns it.
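The four operators can be checked against the @names example in a few lines. This is a sketch using the same data as the slides; the variables $popped, $shifted and @reversed are introduced only for illustration.

```perl
use strict;
use warnings;

my @names = ("betty", "veronica", "tom");

push(@names, "lily");              # add at the end
my $popped = pop(@names);          # remove from the end

unshift(@names, "lily");           # add at the beginning
my $shifted = shift(@names);       # remove from the beginning

my @reversed = reverse(@names);    # returns a reversed copy; @names is unchanged

print "popped: $popped, shifted: $shifted\n";
print "names: @names\n";
print "reversed: @reversed\n";
```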

Manipulating Lists

Perl provides several built-in functions for list manipulation. Three useful ones are:
shift ARRAY: Returns the first item of ARRAY and moves the remaining items down, reducing the size of ARRAY by 1.
unshift ARRAY, LIST: The opposite of shift: puts the items in LIST at the beginning of ARRAY, moving the original contents up by the required amount.
push ARRAY, LIST: Similar to unshift, but adds the values in LIST to the end of ARRAY.

Hashes, keys, values, each
Hashes are like arrays, but instead of having numbers as their index they can have any scalars as index.
Hashes are preceded by a % symbol.
Eg: we can have %rollnumbers = ("A",1,"B",2,"C",3);

Hashes, keys, values, each
If we want to get the roll number of A we have to say $rollnumbers{"A"}. This will return the value of the roll number of A. (Keys are case-sensitive, so "a" and "A" are different keys.)
keys: returns a list of all the keys of the given hash.
values: returns a list of all the values in a given hash.
each: iterates over the entire hash, returning two scalar values on each call: the first is the key and the second is the value.
Eg: ($firstname, $lastname) = each(%lastname);
Here $firstname and $lastname will get a new key/value pair during each iteration.
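A short sketch of keys, values and each on the %rollnumbers hash from the slides. The hash %seen and the sort calls are additions for illustration: a hash has no defined order, so the lists are sorted before printing.

```perl
use strict;
use warnings;

my %rollnumbers = ("A", 1, "B", 2, "C", 3);

my $roll_a = $rollnumbers{"A"};                       # look up one element
my @names  = sort keys %rollnumbers;                  # all keys, sorted for stable output
my @rolls  = sort { $a <=> $b } values %rollnumbers;  # all values, sorted numerically

# each returns one (key, value) pair per call until the hash is exhausted
my %seen;
while (my ($name, $roll) = each %rollnumbers) {
    $seen{$name} = $roll;
}

print "roll number of A: $roll_a\n";
print "keys: @names\n";
print "values: @rolls\n";
```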

$somehash{aaa} = 123;
The braces establish that the scalar item is being assigned to an element of a hash.

$somehash{234} = "bbb";
The key is a three-character string, not a number.

$somehash{"$a"} = 0;
The key is the current value of $a.

%anotherhash = %somehash;
The leading % establishes a list context: the target of the assignment is the hash itself, not one of its values.

Control Structures
If / unless statements
While / until statements
For statements
Foreach statements
Last, next, redo statements
&& and || as control structures

If / Unless
if is similar to the if in C.
Eg of unless:
    unless (condition) { }
When you want to leave out the then part and have just an else part, use unless: the block is executed only when the condition is false.

While / Until / For
while is similar to the while of C.
Eg of until:
    until (some expression) { }
So the statements are executed until the condition is met.
for is similar to the C implementation.

Foreach Statement
This statement takes a list of values and assigns them one at a time to a scalar variable, executing a block of code with each successive assignment.
Eg: foreach $var (list) { }

Last / Next / Redo
last: is similar to the break statement of C. Whenever you want to quit from a loop you can use this.
next: To skip the rest of the current iteration use the next statement. It immediately jumps to the next iteration of the loop.
redo: The redo statement helps in repeating the same iteration again.

&& and || Controls
if (cond1) { cond2 } can be replaced by cond1 && cond2, and unless (cond1) { cond2 } can be replaced by cond1 || cond2.
Suppose you want to open a file and print a message if the file operation fails. We can write:
    (condition) || print "the file cannot be opened";
This way we can make the control structures smaller and more efficient.
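As a sketch of this short-circuit style, the script below tries to open a file that is assumed not to exist. The path, the $message variable and the $verbose flag are all hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on the machine running this.
my $path    = "/nonexistent/hypothetical-file.txt";
my $message = "";

# The right-hand side of || runs only when open() fails (returns false).
open(my $fh, "<", $path) || ($message = "the file cannot be opened");

# The right-hand side of && runs only when the left-hand side is true.
my $verbose = 1;
my $flag    = 0;
$verbose && ($flag = 1);

print "$message\n" if $message;
print "flag is $flag\n";
```

In everyday Perl the lower-precedence or is often preferred for this idiom (open(...) or die ...), since it avoids the need for parentheses around the open call.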

Iterating over lists
It is common to want to do the same thing to all the items in a list, or to a selection of the items.
foreach: The foreach loop performs a simple iteration over all the elements of a list:
    foreach $item (list) { }
The block is executed repeatedly with the variable $item taking each value from the list in turn. The variable can be omitted, in which case $_ will be used. Since Perl will automatically convert an array into a list if required, an array can be used in place of the list.

map
The Perl map function transforms a list into a new one, and this new list is returned by the function. There is no need to modify the input list.
The Perl map function evaluates a block or an expression for each element of an array and returns a new array with the result of the evaluation. During the evaluation, it locally assigns the $_ variable to each element of the array.
The difference between the grep and map functions: while you can use grep to select elements from an array, you use map to transform the elements of an array. So keep in mind: grep to select, map to transform.

map
Eg: suppose we have a list of words ('cat', 'dog', 'rabbit', 'hamster', 'rat') and we want to create a list of the same words with a terminal 's' added to form the plural: ('cats', 'dogs', 'rabbits', 'hamsters', 'rats').
This could be done with a foreach loop:
    @s = qw/cat dog rabbit hamster rat/;
    @pl = ();
    foreach (@s) { push @pl, $_ . 's'; }
    print @pl;
However, this is such a common idiom that Perl provides an inbuilt function map to do the job:
    @pl = map $_ . 's', @s;
The general form of map is:
    map expression, list;
and
    map BLOCK list;
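"grep to select, map to transform" can be seen on the same word list. This is a sketch; the three-letter filter is an arbitrary condition chosen for the example.

```perl
use strict;
use warnings;

my @s = qw/cat dog rabbit hamster rat/;

# map transforms every element (BLOCK form)
my @pl = map { $_ . 's' } @s;

# grep selects the elements for which the block is true
my @short = grep { length($_) == 3 } @s;

print "plurals: @pl\n";
print "three-letter words: @short\n";
```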

Subroutines
Subroutines (or subs) are named blocks of code, thus:
    sub foobar { statements }
Note that the subroutine definition does not include argument specifications: it just associates the name with the block.
Subroutines are like everything else in Perl, in the sense that they return a value. The value returned is the value of the last expression evaluated in the block that forms the subroutine body, unless the return function is used to return a specific value.

Subroutines are used:
to avoid or reduce redundant code
to improve maintainability and reduce the possibility of errors
to reduce complexity by breaking complex problems into smaller, simpler pieces
to improve readability of the program
An advantage of subroutines in Perl is that you can call a subroutine and leave off optional arguments.

Defining and using subroutines

Subroutines allow us to better structure our code, organize it and reuse it. A subroutine starts with the keyword sub.
Eg: a subroutine which calculates the sum of two numbers:
    #!/usr/bin/perl -w
    $var1 = 100;
    $var2 = 200;
    $result = 0;
    $result = my_sum();
    print "$result\n";
    sub my_sum {
        $tmp = $var1 + $var2;
        return $tmp;
    }
Note: Subroutines might have parameters. When parameters are passed to a subroutine, they are stored in the @_ array. Do not confuse it with $_, which holds the current element of an array in a loop.

Using file parameters (positional parameters)

Sometimes we need to pass parameters to our script files.
@ARGV is an array reserved for the parameters passed to the script. $#ARGV, the index of the last argument, is -1 if no parameters are passed.
    #!/usr/bin/perl -w
    if ($#ARGV < 2) {
        print "You must have at least 3 parameters.\n";
    } else {
        print "Your parameters are: @ARGV[0..$#ARGV]\n";
    }

Calling subroutines: The semantics of a Perl subroutine are better captured by the 'textual substitution' model.
If foobar is defined as a subroutine it can be called without arguments by
    &foobar;
or almost equivalently (remember that there's more than one way to do it)
    &foobar();
(The one difference is that &foobar; makes the caller's current @_ visible to foobar, while &foobar() passes an empty argument list.)
In either case, the effect is as if the block forming the body of foobar had been copied into the script at that point.

The ampersand identifies foobar explicitly as the name of a subroutine, so this form of call can be used even if the subroutine definition occurs later in the script.
If the subroutine has been declared earlier the ampersand can be omitted. A forward declaration takes the form
    sub foobar;
i.e. it is a declaration without a subroutine body, so that the ampersand hardly ever needs to be used.

Subroutine arguments
If a subroutine expects arguments, the call takes the form
    &foobar(arg1, arg2);
or, in the likely case that the subroutine is already declared, we can omit the ampersand:
    foobar(arg1, arg2);
In fact, for a pre-declared subroutine, the idiomatic form of call is
    foobar arg1, arg2;
A subroutine expects to find its arguments as a single flat list of scalars in the anonymous array @_: they can be accessed in the body as $_[0], $_[1] etc.

Note that this has the effect that arguments that are variable names are called by reference. This is because the values stored in @_ are implicit references (aliases) to the actual scalar parameters. Thus if you assign to $_[0], the value of the actual parameter is changed.
A common idiom in a subroutine which is expecting a variable number of arguments is to process each argument in turn:
    foreach $arg (@_) { ... }
Another common idiom is to use shift to peel off one argument at a time. For convenience, shift can be used without an argument in a subroutine body, and Perl will assume that you mean the argument array @_.
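The three points above (aliasing through @_, looping over @_, and peeling arguments off with shift) can be sketched as follows. The subroutine names double_in_place, total and first_arg are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Assigning to $_[0] changes the caller's variable, because @_ aliases the arguments.
sub double_in_place {
    $_[0] *= 2;
    return;
}

# Process a variable number of arguments, one at a time.
sub total {
    my $sum = 0;
    foreach my $arg (@_) {
        $sum += $arg;
    }
    return $sum;
}

# shift with no argument works on @_ inside a subroutine.
sub first_arg {
    my $first = shift;
    return $first;
}

my $n = 21;
double_in_place($n);
my $sum   = total(1, 2, 3);
my $first = first_arg("a", "b");

print "n is now $n\n";
print "total: $sum, first argument: $first\n";
```

Copying the arguments into my variables at the top of a subroutine, as first_arg does, avoids changing the caller's data by accident.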

When a subroutine is called with arguments, e.g.
    foobar arg1, arg2, arg3;
the effect is as if the assignment
    @_ = (arg1, arg2, arg3);
is executed just before the text of the body of foobar is inserted in the script.
The value returned by a subroutine may be a scalar or a single flat list of scalars. It is important to note that in either case, the expression that determines the return value is evaluated in the context (scalar or list) in which the subroutine was called. Thus if we write
    $x = foo($y, $z)
the return value will be evaluated in scalar context, but if we write
    @x = foo($y, $z)
it will be evaluated in list context. This can lead to subtle errors if the expression is such that its value depends on the context in which it is evaluated.
To deal with this the function wantarray is provided: this returns true if the subroutine was called in a list context, and false if it was called in a scalar context. A typical idiom is to use a conditional expression in the return statement:
    return wantarray ? list_value : scalar_value;
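A minimal sketch of the wantarray idiom. The subroutine min_max is a made-up example: in list context it returns the smallest and largest argument, and in scalar context it returns the number of arguments.

```perl
use strict;
use warnings;

sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    # List context: (smallest, largest). Scalar context: how many values there were.
    return wantarray ? ($sorted[0], $sorted[-1]) : scalar(@sorted);
}

my @pair  = min_max(3, 1, 2);   # called in list context
my $count = min_max(3, 1, 2);   # called in scalar context

print "min and max: @pair\n";
print "count: $count\n";
```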

We have seen that a Perl script can be invoked with arguments on the command line. These arguments are placed in an array @ARGV. A common idiom is to use a foreach loop to process each argument in turn:
    #!/usr/bin/perl
    foreach $arg (@ARGV) {
        ...   # process each argument in turn as $arg
    }
An alternative is to peel off the arguments one at a time using the shift operator (outside a subroutine, shift without an argument works on @ARGV):
    #!/usr/bin/perl
    while ($arg = shift) { }

    sub maximum {
        if ($_[0] > $_[1]) {
            $_[0];
        } else {
            $_[1];
        }
    }
    $biggest = &maximum(37, 24);   # Now $biggest is 37

    #!/usr/bin/perl -w
    my $sum;
    $sum = &add_numbers(5, 15);
    print "The sum is $sum.\n";
    sub add_numbers {
        my $num1;
        my $num2;
        my $sum;
        $num1 = $_[0];
        $num2 = $_[1];
        $sum = $num1 + $num2;
        return ($sum);
    }

    #!/usr/bin/perl -w
    &draw_triangle(0);
    sub draw_triangle {
        my $iterations = $_[0];
        my $chars;
        $iterations++;
        if ($iterations == 11) {
            return 0;
        } else {
            for ($chars = 0; $chars < $iterations; $chars++) {
                print " * ";
            }
            print "\n";
            &draw_triangle($iterations);
        }
    }

Functions
Function declaration
Calling a function
Passing parameters
Local variables
Returning values

Function Declaration
The keyword sub introduces the function, so the function should start with the keyword sub.
Eg: sub addnum { ... }
It should preferably be placed either at the end or at the beginning of the main program, to improve readability and make debugging easier.

Function Calls
    $Name = &getname();
The symbol & may precede the function name in a function call. It can be omitted when the call uses parentheses or when the function has already been declared.

Parameters of Functions
We can pass parameters to the function as a list. The parameters are received as a list, which is denoted by @_ inside the function. So if you pass only one parameter, @_ will contain only one element. If you pass two parameters then the size of @_ will be two, and the two parameters can be accessed as $_[0], $_[1] ....

More About Functions
The variables declared in the main program are by default global, so they keep their values inside the function as well.
Local variables are declared by putting 'my' in front of the variable when declaring it.
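The difference between a global variable and a my variable can be sketched as below. The names $counter, bump_local and bump_global are hypothetical; our is used to declare the global so that the script also runs under use strict.

```perl
use strict;
use warnings;

our $counter = 10;   # global (package) variable, visible inside subroutines

sub bump_local {
    my $counter = 0;   # a separate, local $counter; hides the global one here
    $counter++;
    return $counter;
}

sub bump_global {
    $counter++;        # no 'my', so this is the global $counter
    return $counter;
}

my $local_result  = bump_local();
my $global_result = bump_global();

print "local result: $local_result\n";
print "global result: $global_result, global counter: $counter\n";
```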

More About Functions
The result of the last operation is usually the value that is returned, unless there is an explicit return statement returning a particular value.
There are no pointers in Perl, but we can manipulate and even create complicated data structures.

Introduction to regular expressions
Regular expressions (REs) provide a powerful, explicit way of defining patterns.
Strictly, an RE is a notation for describing the strings produced by a regular grammar: it is thus a definition of a (possibly infinite) class of strings.
Certain meta-characters have special meanings in an RE.

Regular expressions in Perl
Alternations: If RE1, RE2 and RE3 are regular expressions, RE1|RE2|RE3 is a regular expression that will match any one of the components.
Eg: /^Date:|^From:|^To:|^Subject:/
Grouping: Round brackets can be used to group items:
Eg: /^(Date:|From:|To:|Subject:)/ , thus 'factoring out' the start-of-line anchor.
    /print the (elder|younger)/    # matches "print the elder" and "print the younger"
    /(([0-9]+[ ])|([a-z]+[ ]))+/   # matches a sequence of one or more items, each of which is
                                   # either a sequence of digits followed by a space or a
                                   # sequence of lower-case letters followed by a space

Repetition counts: In addition to the quantifiers *, ? and +, explicit repetition counts can be added to a component of a regular expression, e.g. /(wet[ ]){2}wet/ matches "wet wet wet".
The full list of possible count modifiers is:
    {n}     must occur exactly n times
    {n,}    must occur at least n times
    {n,m}   must occur at least n times but no more than m times
Thus an Internet IP address is matched by the pattern
    /([0-9]{1,3}\.){3}[0-9]{1,3}/
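The repetition counts can be tried on a few sample strings. This is a sketch: the ^ and $ anchors are an addition so that the whole string must match, and the sample addresses are assumed test data. The pattern checks only the shape of an address, so it also accepts out-of-range values such as 999.999.999.999.

```perl
use strict;
use warnings;

# The slide's IP pattern, with ^ and $ added (an assumption) to match the whole string.
my $ip_pattern = qr/^([0-9]{1,3}\.){3}[0-9]{1,3}$/;

my $is_ip  = ("192.168.0.1" =~ $ip_pattern) ? 1 : 0;   # four groups of digits
my $not_ip = ("192.168.0"   =~ $ip_pattern) ? 1 : 0;   # only three groups
my $wet    = ("wet wet wet" =~ /(wet[ ]){2}wet/) ? 1 : 0;

print "192.168.0.1 matches: $is_ip\n";
print "192.168.0 matches: $not_ip\n";
print "wet wet wet matches: $wet\n";
```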

Non-greedy matching: A pattern including .* matches the longest string it can find. The pattern .*? can be used when the shortest match is required. (In fact, any of the quantifiers can be followed by a ? to specify the shortest match.)
Shorthand: Some character classes occur so often that a shorthand notation is provided: \d matches a digit, \w matches a 'word' character (upper-case letter, lower-case letter, digit or underscore), and \s matches a 'whitespace' character (space, tab, carriage return or newline). Capitalization reverses the sense, e.g. \D matches any non-digit character.
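The difference between the longest and the shortest match shows up clearly on a string with several angle brackets. A minimal sketch, with an assumed sample string:

```perl
use strict;
use warnings;

my $text = "<b>bold</b> and <i>italic</i>";   # assumed sample string

my ($greedy) = $text =~ /<(.*)>/;    # .* runs on to the last > in the string
my ($lazy)   = $text =~ /<(.*?)>/;   # .*? stops at the first > it can

print "greedy: $greedy\n";
print "non-greedy: $lazy\n";
```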

Anchors: We have seen the use of ^ and $ to 'anchor' the match at the start or end of the target respectively. Other anchors can be specified as \b (word boundary) and \B (not a word boundary).
Eg: if the target string contains John and Johnathan as space-separated words, /\bJohn/ will match both John and Johnathan, /\bJohn\b/ will only match John, while /\bJohn\B/ will only match Johnathan.

Back references: Round brackets serve another purpose besides grouping.
They define a series of partial matches that are 'remembered' for use in subsequent processing or in the regular expression itself. In a regular expression, \1, \2 etc. denote the substring that actually matched the first, second etc. sub-pattern, the numbering being determined by the sequence of opening brackets. Thus we can require that a particular sub-pattern occurs in identical form in two or more places in the target string.
If we want the round brackets only to define a grouping, without remembering the sub-string matches, we can use the syntax (?: ... ).
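A small sketch of both kinds of bracket: a remembered group reused through \1 to find a doubled word, and a non-remembering (?: ... ) group. The sample strings are assumptions made for the example.

```perl
use strict;
use warnings;

# \1 must match exactly what the first bracketed group matched: a doubled word.
my $text = "this is is a test";
my $word = "";
if ($text =~ /\b(\w+) \1\b/) {
    $word = $1;
}

# (?: ... ) groups the alternatives without remembering them,
# so the year is still the first remembered group.
my ($year) = "date: 2024-05" =~ /(?:date|when): (\d{4})/;

print "doubled word: $word\n";
print "year: $year\n";
```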

Pattern matching modifiers
The operation of the pattern match operator can be modified by adding trailing qualifiers:
m//i   Ignore case when pattern matching.
m//g   Find all occurrences. Eg:
    $count = 0;
    while ($target =~ m/$substring/g) { $count++ }
m//m   Treat a target string containing newline characters as multiple lines.

m//s   Treat a target string containing newline characters as a single string, i.e. dot matches any character including newline.
m//x   Ignore whitespace characters in the regular expression unless they occur in a character class, or are escaped with a backslash.
m//o   Compile regular expression once only.
The RE engine has to construct a non-deterministic finite automaton which is then used to perform a backtracking match. This is called 'compiling' the RE and may be time consuming for a complex regular expression. The /o modifier ensures that the regular expression is compiled only once, when it is first encountered.

Substitution
Given a file in which each line starts with a four-digit line number followed by a space,
    while (<STDIN>) { s/^\d{4}[ ]//; print; }
will print the file without line numbers. The general syntax is
    s/pattern/subst/
The substitution operator checks for a match between the pattern and the value held in $_, and if a match is found the matching sub-string in $_ is replaced by the string subst.

Substitution modifier: e. This modifier signifies that the substitution string is an expression that is to be evaluated at run-time if the pattern match is successful, to generate a new substitution string dynamically. Obviously this is only useful if the code includes references to the variables $&, $1, $2 etc.
Eg: if the target string contains one or more sequences of decimal digits, the following substitution operation will treat each digit string as an integer and add 1 to it:
    s/\d+/$&+1/eg;
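Both substitutions can be tried on a single line of text. In this sketch the sample line is assumed data, the substitutions are applied to copies so the original stays intact, and a remembered group $1 is used in place of $&.

```perl
use strict;
use warnings;

my $line = "0042 print 7 items on page 9";   # assumed sample line

# Strip a leading four-digit line number and the space after it.
(my $stripped = $line) =~ s/^\d{4}[ ]//;

# /e evaluates the replacement as an expression; /g applies it to every match.
(my $bumped = $stripped) =~ s/(\d+)/$1 + 1/eg;

print "$stripped\n";
print "$bumped\n";
```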

Regular Expression
Split and join
Matching & replacing
Selecting a different target
$&, $' and $`
Parenthesis as memory
Using different delimiter
Others

Split and Join
split is used to form a list from scalar data, depending on the delimiter. The default delimiter is whitespace. It is usually used to get the independent fields from a record.
Eg:
    $linevalue = "R101 tom 89%";
    $_ = $linevalue;
    @data = split();

Split and Join
Here $data[0] will contain R101, $data[1] tom, and $data[2] 89%.
split by default acts on the $_ variable. If split has to operate on some other scalar variable, then the syntax is:
    split(/ /, $linevalue);
If split has to work on some other delimiter, then the syntax is:
    split(/<delimiter>/, $linevalue);

Special Variables
$&   Stores the value which matched the pattern.
$'   Stores the value which came after the pattern in the linevalue.
$`   Stores the value which came before the pattern in the linevalue.

Split and Join
join does the exact opposite job to that of split. It takes a list and joins up all its values into a single scalar variable using the delimiter provided as its first argument.
Eg: $newlinevalue = join(" ", @data);
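split and join can be seen as a round trip on the record used above. A minimal sketch; the comma delimiter and the names $csv and @fields are chosen only for illustration.

```perl
use strict;
use warnings;

my $linevalue = "R101 tom 89%";

my @data   = split(' ', $linevalue);   # split on whitespace
my $csv    = join(",", @data);         # the delimiter comes first, then the list
my @fields = split(/,/, $csv);         # split again on a different delimiter

print "name field: $data[1]\n";
print "joined: $csv\n";
print "number of fields: ", scalar(@fields), "\n";
```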

Matching and Replacing
Suppose you need to look for a pattern and replace it with another one. You can do the same thing as you do in Unix. The command in Perl is:
    s/<pattern>/<replace pattern>/
This by default acts on the $_ variable. If it has to act on a different source variable (eg $newval) then you have to use:
    $newval =~ s/<pattern>/<replace pattern>/;

Parenthesis As Memory
Parenthesis as memory. Eg: /fred(.)Barney\1/
Here the parentheses around the dot after fred make it a memory element. The \1 indicates that the character at that position must be the same as the first memory element, which in this case is whatever character was matched at the position after fred.

Finer points of looping: The continue block
A while loop can have an explicit continue block which is executed at the end of each normal iteration before control returns to re-test the condition:
    while ( ... ) { ... } continue { ... }
The continue block is analogous to the third component in a for loop specification. The for loop can be defined in terms of a while loop with a continue block as follows.

    for ($i = 1; $i < 10; $i++) { ... }
is equivalent to
    $i = 1;
    while ($i < 10) { ... } continue { $i++; }
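One practical consequence is that the continue block still runs when next is used, which a short sketch makes visible. The array @seen and the even-number test are made up for the example.

```perl
use strict;
use warnings;

my @seen;
my $i = 1;
while ($i < 10) {
    next if $i % 2;     # skip odd numbers; the continue block still runs
    push @seen, $i;
} continue {
    $i++;               # plays the role of the third part of a for loop
}

print "even numbers seen: @seen\n";
```

If the increment were placed at the end of the loop body instead, the next statement would skip it and the loop would never finish.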

Last, next and redo revisited
The next keyword is used to skip the rest of the statement block and start the next iteration. Example:
    @array = (0..9);
    print("@array\n");
    for ($index = 0; $index < @array; $index++) {
        if ($index == 3 || $index == 5) {
            next;
        }
        $array[$index] = "*";
    }
    print("@array\n");
This program displays:
    0 1 2 3 4 5 6 7 8 9
    * * * 3 * 5 * * * *

The last keyword is used to exit from a statement block. Example:
    @array = ("A".."Z");
    for ($index = 0; $index < @array; $index++) {
        if ($array[$index] eq "T") {
            last;
        }
    }
    print("$index\n");
This program displays: 19

The redo keyword causes Perl to restart the current statement block. Example:
    {
        print("What is your name? ");
        $name = <STDIN>;
        chop($name);
        if (! length($name)) {
            print("Msg: Zero length input. Please try again\n");
            redo;
        }
        print("Thank you, " . uc($name) . "\n");
    }

Perl does not have a case statement, but we can easily construct something similar using last in an expression:
    SELECT: {
        $red += 1, last SELECT if /red/;
        $green += 1, last SELECT if /green/;
        $blue += 1, last SELECT if /blue/;
    }
The first line parses as
    ($red += 1, last SELECT) if /red/;
This construct increments $red, $green or $blue according as the anonymous variable $_ contains the string 'red', 'green' or 'blue'. It uses last as an operator in an expression to cause exit from the labeled bare block.

Multiple loop variables
A for loop can iterate over two or more variables simultaneously. Eg:
    for ($m = 1, $n = 1; $m < 10; $m++, $n += 2) { ... }
Here we have used the comma operator in a scalar context. In a list context the comma operator is a list constructor, but in a scalar context the comma operator evaluates its left-hand argument, throws away the value returned, then evaluates its right-hand argument.

Subroutine Syntax
Old style, no prototyping
Calling the subroutine:
    &mysub();
Constructing the subroutine:
    sub mysub { }
Note: The ampersand (&) is used before the subroutine call, and no parentheses are used in the function definition.

Prototyping, no arguments
Calling the subroutine:
    mysub();
Constructing the subroutine:
    sub mysub() { }
Contrast that with the following, which expects two scalars. Experiment and note that Perl gripes when your prototype and call don't match.

Prototyping, two string arguments
Calling the subroutine:
    mysub($filename, $title);
Constructing the subroutine:
    sub mysub($$) { }

Finer points of subroutines
Subroutine prototypes appear to allow you to specify the number of arguments a subroutine takes, and the type of the arguments. Eg:
    sub foo($);
specifies that foo is a subroutine that expects one scalar argument, and
    sub bar();
specifies that bar is a subroutine that expects no arguments.
The prototype appears to be there to enable compile-time checking of the number and type of arguments provided.

For scalar arguments, compile-time checking of the number of arguments takes place, and a mismatch results in compilation being aborted. However, type checking does not take place: rather, the $ in sub foo($); tells the compiler that if the single argument is not a scalar, it should be converted into a scalar by fair means or foul.
The real purpose of prototypes is to allow us to define subroutines that behave like the built-in functions when called without brackets round the arguments.
Suppose bar is a subroutine that takes no arguments and we write
    $n = bar -1;
If bar had been defined without a prototype, $n would be assigned the value returned by bar, since the compiler would gobble up the -1 as an argument.

But with the prototype specifying no arguments, the compiler knows that the statement should be parsed as
    $n = bar() - 1;
Similar considerations apply to subroutines taking one scalar argument. Suppose we define foo without a prototype, as follows:
    sub foo { return shift }
Then if we write
    @s = (foo $w1, $w2, $w3);

the value of @s is a list of one element, whose value is that of $w1: $w1, $w2 and $w3 have all been gobbled up as arguments, but $w2 and $w3 are ignored by foo. But if foo is defined with a prototype ($), the compiler knows to collect only one argument, and the value of @s is a list of three items: foo($w1), $w2 and $w3.
This does not extrapolate to subroutines with more than one scalar argument. If we declare
    sub foobar($$);
and write
    @s = (foobar $w1, $w2, $w3);
we get a compile-time error, 'too many arguments supplied'.
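The effect of the () and ($) prototypes on parsing can be sketched as follows. The subroutines ten and shout are hypothetical stand-ins for bar and foo: they return recognisable values so that the parse is visible in the output. Both are defined before they are called, since a prototype only takes effect once the compiler has seen it.

```perl
use strict;
use warnings;

# Prototype (): takes no arguments, so "ten -1" parses as ten() - 1.
sub ten() { return 10; }

# Prototype ($): takes exactly one scalar, so only $w1 is collected as the argument.
sub shout($) { return uc shift; }

my ($w1, $w2, $w3) = ("a", "b", "c");

my $n = ten -1;
my @s = (shout $w1, $w2, $w3);

print "n: $n\n";
print "s: @s\n";
```

Only the first word comes back upper-cased, which shows that $w2 and $w3 went into the list and were not passed to shout.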

Prototypes can also include optional arguments:
    sub foobar($;$)
is a subroutine that expects at least one scalar argument, with an optional second one. The compiler will complain if the call of foobar has no arguments or more than two arguments.
A prototype can specify non-scalar arguments using @, % or &.
Eg: sub foobar(@); specifies a subroutine that expects a variable number of arguments. It can be called without any arguments, and the compiler will not complain: an empty list is still a list.

sub foobar($@); appears to specify a subroutine with two arguments, the first a scalar and the second a list. In fact it just defines a subroutine with at least one argument.
sub foobar(%); appears to specify a subroutine that expects a single argument that is a hash.

REFERENCES
A reference is a scalar value that refers to an entire array or an entire hash (or to just about anything else).
More complex data types can be constructed using references, which allow you to build lists and hashes within lists and hashes.
A reference is a scalar value and can refer to any other Perl data type. So by storing a reference as the value of an array or hash element, you can easily create lists and hashes within lists and hashes.

Syntax
There are just two ways to make a reference, and just two ways to use it once you have it.
Making References
RULE 1: If you put a \ in front of a variable, you get a reference to that variable.
    $aref = \@array;   # $aref now holds a reference to @array
    $href = \%hash;    # $href now holds a reference to %hash

Once the reference is stored in a variable like $aref or $href, you can copy it or store it just the same as any other scalar value:
$xy = $aref;        # $xy now holds a reference to @array
$p[3] = $href;      # $p[3] now holds a reference to %hash
$z = $p[3];         # $z now holds a reference to %hash

RULE2: [ ITEMS ] makes a new, anonymous array, and returns a reference to that array. { ITEMS } makes a new, anonymous hash, and returns a reference to that hash.
$aref = [ 1, "foo", undef, 13 ];    # $aref now holds a reference to an array
$href = { APR => 4, AUG => 8 };     # $href now holds a reference to a hash

The references you get from rule 2 are the same kind of references that you get from rule 1:
# This:
$aref = [ 1, 2, 3 ];
# Does the same as this:
@array = (1, 2, 3);
$aref = \@array;

If you write just [], you get a new, empty anonymous array. If you write just {}, you get a new, empty anonymous hash.
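The two ways of making a reference can be tried together in a small script. This is a minimal sketch; the variable names @p, $anon and $copy are chosen only for the illustration.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my %hash  = (APR => 4, AUG => 8);

my $aref = \@array;                 # rule 1: reference to a named array
my $href = \%hash;                  # rule 1: reference to a named hash
my $anon = [ 1, "foo", undef ];     # rule 2: reference to an anonymous array

# A reference is an ordinary scalar, so it can be stored in a list and copied.
my @p    = ($aref, $href, $anon);
my $copy = $p[0];

# ref() reports what kind of thing each reference points to.
print join(" ", map { ref($_) } @p), "\n";               # ARRAY HASH ARRAY
print $copy == $aref ? "same array\n" : "different\n";   # same array
```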

Using References

A reference is a scalar value, and we've seen that you can store it as a scalar and get it back again just like any scalar. There are just two more ways to use it.

Use Rule 1: You can always use an array reference, in curly braces, in place of the name of an array. For example, @{$aref} instead of @array.

EG: Arrays:
@a              @{$aref}                An array
reverse @a      reverse @{$aref}        Reverse the array
$a[3]           ${$aref}[3]             An element of the array
$a[3] = 17;     ${$aref}[3] = 17        Assigning an element

On each line there are two expressions that do the same thing. The left-hand versions operate on the array @a. The right-hand versions operate on the array that is referred to by $aref.

Using a hash reference is exactly the same:
%h              %{$href}                A hash
keys %h         keys %{$href}           Get the keys from the hash
$h{red}         ${$href}{red}           An element of the hash
$h{red} = 17    ${$href}{red} = 17      Assigning an element

Whatever you want to do with a reference, Use Rule 1 tells you how to do it. Eg: first write the code for the named array,
foreach my $element (@array) { ... }
and then replace the array name, @array, with the reference:
foreach my $element (@{$aref}) { ... }

"How do I print out the contents of a hash when all I have is a reference?" First write the code for printing out a hash:
foreach my $key (keys %hash) {
    print "$key => $hash{$key}\n";
}
And then replace the hash name with the reference:
foreach my $key (keys %{$href}) {
    print "$key => ${$href}{$key}\n";
}
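Use Rule 1 can be exercised in a complete script. The data values below are made up for the illustration, and the keys are sorted only so that the output order is predictable.

```perl
use strict;
use warnings;

my $aref = [ 10, 20, 30, 40 ];
my $href = { red => 1, green => 2 };

${$aref}[3] = 17;                     # assign an element through the reference
my @backwards = reverse @{$aref};     # use the whole array through the reference
print "@backwards\n";                 # 17 30 20 10

# Print a hash when all we have is a reference to it.
foreach my $key (sort keys %{$href}) {
    print "$key => ${$href}{$key}\n";
}
```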

Use Rule 2: The most common thing to do with an array or a hash is to extract a single element, and the Use Rule 1 notation is cumbersome. So there is an abbreviation. ${$aref}[3] is too hard to read, so you can write $aref->[3] instead. ${$href}{red} is too hard to read, so you can write $href->{red} instead.

If $aref holds a reference to an array, then $aref->[3] is the fourth element of the array. Don't confuse this with $aref[3], which is the fourth element of a totally different array, the array @aref. $aref and @aref are unrelated the same way that $item and @item are.

Similarly, $href->{red} is part of the hash referred to by the scalar variable $href, perhaps even one with no name. $href{red} is part of the deceptively named %href hash.

Eg: First, remember that [1, 2, 3] makes an anonymous array containing (1, 2, 3), and gives you a reference to that array. Now think about
@a = ( [1, 2, 3],
       [4, 5, 6],
       [7, 8, 9] );

@a is an array with three elements, and each one is a reference to another array.

$a[1] is one of these references. It refers to an array, the array containing (4, 5, 6), and because it is a reference to an array, Use Rule 2 says that we can write $a[1]->[2] to get the third element from that array. $a[1]->[2] is the 6. Similarly, $a[0]->[1] is the 2.

You can write $a[ROW]->[COLUMN] to get or set the element in any row and any column of the array.
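The row and column notation can be checked with a short runnable sketch that builds the same three-by-three array, reads two elements, sets one, and then walks every row.

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9] );

print $a[1]->[2], "\n";     # 6
print $a[0]->[1], "\n";     # 2

$a[2]->[0] = 70;            # set the element in row 2, column 0

# Each row is itself an array reference, so it is dereferenced with @{...}.
foreach my $row (@a) {
    print join(" ", @{$row}), "\n";
}
```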

References
The concept of the pointer is common in many languages, allowing indirect access to a piece of data: the value of a pointer variable tells us where we will find the data object.
int x;      /* integer */
int *px;    /* pointer to integer */
px = &x;    /* "address of x" */

A pointer whose only use is to locate its referent is better called a reference, since it is a true abstraction of an address.

Hard references and symbolic references


Hard reference: a reference which locates an actual value in memory.

Symbolic reference: a variable whose value is a string that happens to be the name of another variable.

A reference is a scalar value, and can be used in any context in which a scalar is valid. The only operation that can be carried out on a reference is dereferencing, i.e. locating the referent. A Perl variable is an association of an identifier in the symbol table with a reference to a value.

If the identifier appears in an expression, dereferencing takes place to obtain the value. If it occurs on the left-hand side of an assignment, the reference is used to locate the value to be updated.

An explicit reference to a value can be created: this may be a value that is associated with an identifier in the symbol table, or it may be an 'anonymous' value with no associated variable. In the latter case the only way to get to the value is via the reference.

Creating references
References to variables and subroutines:

The backslash operator creates a reference to a named variable or subroutine. Eg:
$ref2foo = \$foo;
$ref2args = \@ARGV;
$ref2sub = \&initialize;

Note: By creating the reference we have two independent references to the same value: one associated with the name in the symbol table, and another which is the value of the reference variable.

References to references can be created to any depth. Eg:
$ref2ref2foo = \\$foo;

References to anonymous values:


An anonymous scalar is just a constant, and it is possible to create a reference to it. Eg:
$pi_ref = \3.1415926536;

This appears at first sight to be rather pointless, but it can be used in combination with a typeglob to create a read-only variable, as described later. More useful are references to anonymous arrays and hashes.

A reference to an anonymous array is created by enclosing a list in square brackets. This array composer creates an anonymous array and returns a reference to it. Eg:
$ref2array = [255, 127, 0];

The square brackets can be nested:
$ref2array2 = [255, 127, 0, [255, 255, 255]];
Here we have a reference to an array with four elements, three integers and a reference to an array.

A reference to an anonymous hash can be created using curly brackets (braces):
$ref2hash = {winter => 'cold', summer => 'hot'};

A reference to an anonymous subroutine is created with the anonymous subroutine composer:
$ref2sub = sub { ... };
This looks like the definition of a subroutine without a name, but it is in fact an expression, hence the semicolon after the closing brace.

Typeglobs and references


A typeglob can represent any type of variable.

It is a generic variable name; when evaluated it produces a scalar value that represents all the data structures assigned to the same variable name. Typeglob variables are preceded by the character *. A typeglob variable can represent variables of type scalar, array, hash and subroutine. Eg: The typeglob variable *tgvar can represent $tgvar, @tgvar, %tgvar or &tgvar.

Typeglobs can be used as an ordinary data type. We can perform operations such as assigning one to a variable, storing it in an array, or passing it as a parameter to a subroutine.

Typeglobs and references


A typeglob is a hash with a fixed set of keys SCALAR, ARRAY, HASH, etc., and associated values which are references to the scalar, array, hash etc. variables that share the name associated with the typeglob. We can write *foo{SCALAR} instead of \$foo if we want to write obscure or unintelligible code.

This equivalence is put to practical effect in the capability to create selective aliases.
*bar = *foo;
makes $bar the same as $foo, @bar the same as @foo, etc. If we write
*bar = \$foo;

then $bar is an alias for $foo, but @bar and @foo remain different arrays. In a similar vein,
*Pi = \3.1415926536;
creates a read-only variable: an attempt to assign to $Pi causes an error.

What is it a reference to? The ref function answers this question. More precisely, ref returns an empty string if its argument is not a reference. If the argument is a reference, ref returns a string specifying what the argument is a reference to: REF, SCALAR, ARRAY, HASH, CODE, or GLOB.
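A selective alias, a read-only variable and the ref function can all be tried in one short script. This is a sketch that assumes package variables declared with our (typeglobs do not apply to my variables); the variable @kinds and the use of eval to trap the error are additions for the illustration.

```perl
use strict;
use warnings;

our ($foo, @foo, $bar, @bar, $Pi);
$foo = "hello";
@foo = (1, 2, 3);

*bar = \$foo;                   # alias only the scalar: $bar is now $foo
$bar = "changed";
print "$foo\n";                 # changed
print scalar(@bar), "\n";       # 0, because @bar is still a separate, empty array

*Pi = \3.1415926536;            # read-only variable
my $ok = eval { $Pi = 3; 1 };   # the assignment dies, so $ok stays undefined
print "assignment failed: $@" unless $ok;

# ref() names the kind of referent, or gives '' for a non-reference.
my @kinds = map { ref($_) } (\$foo, \@foo, {}, sub { 1 }, \\$foo, \*foo, 42);
print join(",", @kinds), "\n";  # SCALAR,ARRAY,HASH,CODE,REF,GLOB,
```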

Dereferencing
The operation of dereferencing applied to a reference delivers the referent. If we write
$ref2foo = \$foo;
then an occurrence of $$ref2foo has the same meaning as $foo. Thus
print "$foo\n";
print "$$ref2foo\n";
both print the value of $foo. We can do dereferencing on the left-hand side of an assignment also, thus:

$ref2foo = \$foo;
$foo = 1;
$$ref2foo = 2;

The value of $foo is now 2, but what we have done is potentially dangerous, since we have two independent ways of referring to the same chunk of memory.

sub foo {
    my($x, $y) = @_;    ## parameters
    my($z);             ## local var
    return($x + $y);
}
...
foo(6, 7)               ## returns 13

@_ : @_ is an array of the passed scalar values. The list assignment copies the scalars into the local variables in parallel.
Flattening : You cannot pass an array as one of the args and keep it separate. Everything gets "flattened" (merged) into the single list @_.
my : my() declares a variable local to the enclosing block.
$main::x : refer to outer "global" variables as $main::x. This is needed if the "use strict 'vars';" option is on.

File Handling
Since perl is particularly well suited to working with textual data it often gets used to process files. A filehandle is a structure which perl uses to allow functions in perl to interact with a file. A filehandle is an abstract name given to a file, device, socket or pipe. A filehandle helps in getting input from and sending output to many different places.

Creating a filehandle
In order to do any work with a file you first need to create a filehandle. You create a filehandle using the open command. This can take a number of arguments:
The name of the filehandle (all in uppercase by convention)
The mode of the filehandle (read, write or append)
The path to the file

When selecting a mode for your filehandle perl uses the symbols:
<   for a read-only filehandle
>   for a writable filehandle
>>  for an appendable filehandle

If you don't specify a mode then perl assumes a read-only filehandle, and for reading files it is usual to not specify a mode.

The code below shows an example of opening a read-only and a writeable filehandle.
open (IN,'C:/readme.txt');
open (OUT,'>','C:/writeme.txt');

By convention filehandle names should be written in all capitals. The mode can also be merged into the path argument:
open (OUT,'>C:/writeme.txt');
This still works, but it is a bad practice to get into.

Use forward slashes in paths, because a backslash inside a double-quoted string starts an escape sequence and \t is read as a tab character:
open (OUT,">","C:\table.txt");
In the above code you've just tried to open a file called [tab]able.txt, which probably isn't what you meant!

# open file for reading
open(IN, "<", $in_filename) or die $!;

# open a different file for writing (opening the input file with ">" would empty it)
open(OUT, ">", $out_filename) or die $!;

# add line numbers to each line
my $line_no = 1;
while(<IN>) {
    print OUT "$line_no: $_";
    $line_no++;
}

# open file for reading
open(my $input_fh, "<", $in_filename) or die $!;

# open a different file for writing
open(my $output_fh, ">", $out_filename) or die $!;

# copy file, adding line numbers
my $line_no = 1;
while(<$input_fh>) {
    print $output_fh "$line_no: $_";
    $line_no++;
}
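The line-numbering copy can be run end to end with scratch files. This is a minimal sketch: the temporary directory from the core File::Temp module and the file names input.txt and numbered.txt are assumptions made so that the example works on any machine.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumption: scratch files in a temporary directory stand in for real paths.
my $dir      = tempdir(CLEANUP => 1);
my $in_file  = "$dir/input.txt";
my $out_file = "$dir/numbered.txt";

# Create some input to work on.
open(my $seed_fh, '>', $in_file) or die "Can't write $in_file: $!";
print $seed_fh "alpha\nbeta\n";
close $seed_fh or die "Failed to close $in_file: $!";

# Copy the file, adding line numbers.
open(my $input_fh,  '<', $in_file)  or die "Can't read $in_file: $!";
open(my $output_fh, '>', $out_file) or die "Can't write $out_file: $!";
my $line_no = 1;
while (my $line = <$input_fh>) {
    print $output_fh "$line_no: $line";
    $line_no++;
}
close $input_fh;
close $output_fh or die "Failed to close $out_file: $!";

# Read the result back to show what was written.
open(my $check_fh, '<', $out_file) or die "Can't read $out_file: $!";
my @numbered = <$check_fh>;
close $check_fh;
print @numbered;
```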

Closing a filehandle
When you've finished with a filehandle it's a good practice to close it using the close function. This is more important for writable filehandles than read-only ones, but it never hurts. If you don't explicitly close your filehandle it will automatically be closed when your program exits. If you perform another open operation on a filehandle which is already open then the first one will automatically be closed when the second one is opened.

Special Filehandles
In addition to creating your own filehandles perl actually comes with three of them already defined for you. These are special filehandles which you can use to interact with whatever process launched your perl script. The special filehandles are:

STDOUT  A writeable filehandle. Used as the default output location for print commands. Is usually directed to the console from which you ran your perl script.
STDIN   A readable filehandle. Used to pass information from the console into perl. Allows you to ask questions on the console and get an answer.
STDERR  Another writable filehandle usually attached to the console but normally only used for unexpected data such as error messages.

Reading from a file


To read data from a file you use the <> operator. You put the identifier of the filehandle you want to read from in between the angle brackets. This reads one line of data from the file and returns it. To be more precise this operator will read data from the file until it hits a certain delimiter. The default delimiter is your system's newline character ("\n"), hence you get one line of data at a time.

my $file = "M:/tale_of_two_cities.txt";
open (IN, $file) or die "Can't read $file: $!";
my $first_line = <IN>;
print $first_line;
print "The end";

This produces the following output:
It was the best of times, it was the worst of times,
The end

You'll notice that even though there is no "\n" at the end of the first print statement, the second one still appears on the next line. This is because the <> operator doesn't remove the delimiter it's looking for when it reads the input filehandle. Normally you want to get rid of this delimiter, and perl has a special function called chomp for doing just this. chomp removes the same delimiter that the <> operator uses, but only if it is at the end of a string.

my $file = "M:/tale_of_two_cities.txt";
open (IN,$file) or die "Can't read $file: $!";
my $first_line = <IN>;
chomp $first_line;
print $first_line;
print "The end";

This code produces:
It was the best of times, it was the worst of times,The end

Note: The more normal way to read a file is to put the <> operator into a while loop so that the reading continues until the end of the file is reached.

my $file = "M:/tale_of_two_cities.txt";
open (IN,$file) or die "Can't read $file: $!";
my $line_count = 1;
while (<IN>) {
    chomp;
    print "$line_count: $_\n";
    last if $line_count == 5;
    ++$line_count;
}

Gives:
1: It was the best of times, it was the worst of times,
2: it was the age of wisdom, it was the age of foolishness,
3: it was the epoch of belief, it was the epoch of incredulity,
4: it was the season of Light, it was the season of Darkness,
5: it was the spring of hope, it was the winter of despair,

OK, so I cheated and bailed out after 5 lines, but this would in theory have worked to number every line of the whole file. It would also have done the right thing had there been fewer than 5 lines in the file, as the while condition would have failed and the loop would have exited.

Note that by not specifying a variable name to assign to in the loop condition statement the assignment goes to the default $_ variable. This in turn means that I don't need to pass an argument to chomp as it (like most scalar functions) operates on $_ by default.
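The while loop with the default $_ variable can be tried without a file on disk. This sketch assumes an in-memory filehandle (open on a reference to a string) and made-up text in place of the novel; the early exit after two lines mirrors the slide's exit after five.

```perl
use strict;
use warnings;

# Assumption: the text comes from an in-memory string instead of a file on disk.
my $text = "first line\nsecond line\nthird line\n";
open(my $in, '<', \$text) or die "Can't read in-memory data: $!";

my @numbered;
my $line_count = 1;
while (<$in>) {
    chomp;                                # strips the trailing "\n" from $_
    push @numbered, "$line_count: $_";
    last if $line_count == 2;             # stop early
    ++$line_count;
}
close $in;

print "$_\n" for @numbered;
```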

Reading from STDIN


A really useful facility is the ability to read in information from the console. This allows you to produce interactive programs which can ask the user questions and receive answers. To achieve this you simply use the STDIN filehandle in the same way as you'd use any other read-only filehandle. You usually only read one line at a time from STDIN (so the input stops when the user presses return), but you can put the read into a while loop which will continue until the user enters the end-of-file symbol (Control+D on most systems, but this can vary).

#!perl
use warnings;
use strict;

print "What is your name? ";
my $name = <STDIN>;
chomp $name;

print "What is your quest? ";
my $quest = <STDIN>;
chomp $quest;

print "What is your favourite colour? ";
my $colour = <STDIN>;
chomp $colour;

if ($colour eq "Blue... no, yellow!") {
    die "Aaaarrrrgggh!";
}

Writing to a file
Writing to a file is really simple. The only function you need to use is print. All of the previous print statements shown so far have actually been sending data to the STDOUT filehandle. If a specific filehandle is not specified then STDOUT is the default location for any print statements.

#!perl
use warnings;
use strict;

open (OUT,'>',"M:/write_test.txt") or die "Can't open file for writing: $!";
print OUT "Sending some data\n";
print OUT <<"END_DATA";
Now I'm sending a lot of data
All in one go.
Just because I can...
END_DATA
close OUT or die "Failed to close file: $!";

The main thing to remember when writing to a file is that in addition to checking that the open function succeeded, you must also check that you don't get an error when you close the filehandle. This is because errors can occur whilst you are writing data, for example if the device you're writing to becomes full whilst you're writing.

File System Operations


Perl also offers a number of other filesystem operations within the language.

Changing directory : Use the chdir function to change directory in perl. As with all file operations you must check the return value of this function to check that it succeeded.
chdir ("M:/Temp") or die "Couldn't move to temp directory: $!";

Deleting files : Perl provides the unlink function for deleting files. This accepts a list of files to delete and will return the number of files successfully deleted. Again you must check that this call succeeded.
# This works:
unlink ("M:/Temp/killme.txt","M:/Temp/metoo.txt") == 2 or die "Couldn't delete file: $!";

# But this is better:
foreach my $file ("M:/Temp/killme.txt","M:/Temp/metoo.txt") {
    unlink $file or die "Couldn't delete $file: $!";
}

Listing files: Another common scenario is that you want to process a set of files. To do this you often want to get a file listing from the filesystem and work with that. Perl provides a facility called file globbing to do this kind of task. There are two ways to perform a file glob; one is just a shortcut to the other.

chdir ("M:/winword") or die "Can't move to word directory: $!";

# Quick easy way
my @files = <*.doc>;
print "I have ",scalar @files," doc files in my word directory\n";

# Longer way to do the same thing
my @files2 = glob("*.rtf");
print "I have ",scalar @files2," rtf files in my word directory\n";

Although you can pull a glob into an array as shown above, a nicer way to use it is actually to treat it a bit like a filehandle and read filenames from it in a while loop.

chdir ("M:/winword") or die "Can't move to word directory: $!";
while (my $file = <*.doc>) {
    print "Found file $file\n";
}

Testing files
If you want to open a file for reading it's much nicer to specifically test if you can read the file rather than just trying to open it and trapping the error. Perl provides a series of simple file test operators to allow you to find out basic information about a file.

The file test operators are:
-e  Tests if a file exists
-r  Tests if a file is readable
-w  Tests if a file is writable
-d  Tests if a file is a directory (directories are just a special kind of file)
-f  Tests if a file is a file (as opposed to a directory)
-T  Tests if a file is a plain text file

All of these tests take a filename as an argument and return either true or false.
chdir ("M:/winword") or die "Can't move to word directory: $!";
while (my $file = <*>) {
    if (-d $file) {
        print "$file is a directory\n";
    }
    elsif (! -w $file) {
        print "$file is write protected\n";
    }
}

Binary vs. text


Binary files are files whose bytes have to be read and written literally, such as a picture file or a sound file. Text files are any files that contain records that end in end-of-line characters. Some operating systems distinguish between binary and text files. Unix and Linux do not, but Windows does. Perl can't tell the difference between binary and text files (it has a Unix heritage).

Handling binary data


When writing binary data to a file, Perl must not convert anything, so you have to use the binmode command with the filehandle to tell Perl this is to be written literally:
open(BFILE, ">file1.dat");
binmode(BFILE);

You only need to specify binmode for a filehandle once, until you close the file. On some operating systems (UNIX/Linux and Macintosh) binmode is ignored as there is no distinction between binary and text files.

Using pack and unpack


The pack function resembles join, in that it combines a list of items into a string. However, it builds a binary structure described by a template which is given as an argument, and returns this as a string. A common use is for converting between ASCII codes and the corresponding characters. Thus if @codes is a list of 10 integers between 0 and 127,
pack "cccccccccc", @codes
or
pack "c10", @codes
produces a string containing the 10 corresponding ASCII characters.

Since the template is a string, and therefore allows variable interpolation, we can generalize this to a list of any length. For a list of unknown length, it is written as follows:
$count = @codes;
pack "c$count", @codes;

unpack:
The unpack function reverses the process, converting a string containing a data structure into a list.

Like pack, it uses a template to define the order and type of the items in the data structure.

Template character   Description
c                    A signed char (8-bit) value.
C                    An unsigned char (octet) value.
s                    A signed short (16-bit) value.
S                    An unsigned short value.
l                    A signed long (32-bit) value.
L                    An unsigned long value.
i                    A signed integer value.
I                    An unsigned integer value.

http://perldoc.perl.org/functions/pack.html
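The round trip between ASCII codes and characters can be checked with a few lines. The four codes below are chosen only for the illustration.

```perl
use strict;
use warnings;

my @codes = (80, 101, 114, 108);        # ASCII codes for P, e, r, l

my $fixed = pack "c4", @codes;          # template with an explicit count
my $count = @codes;
my $any   = pack "c$count", @codes;     # count interpolated into the template

my @back  = unpack "c$count", $any;     # reverse the process

print "$fixed $any @back\n";            # Perl Perl 80 101 114 108
```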

Eval
eval can evaluate any string at run time:
$myString = "print 'Hello World\n';";
$val = eval( $myString );

Evaluate statements from the command line

eval
eval takes an arbitrary string as an operand and evaluates the string at run-time, as a Perl script. Any variables declared with my or local have a lifetime which ends when the eval is finished. The value returned is the value of the last expression evaluated, as with a subroutine; alternatively, return can be used to return a value from inside the eval.

In the event of a syntax error or a run-time error, eval returns the value 'undefined' and places the error message in the variable $@. If there is no error, the value of $@ is an empty string.

Eg: One use of this form of eval is to get the value of a variable whose name is to be computed, as a string, at run-time, e.g.
$myvar = '...';
...
$value = eval "\$$myvar";

Another example is a little shell, which allows you to type in a script which will be run when the end-of-file character (Ctrl-D in UNIX, Ctrl-Z in Windows) is entered:
my @script = <STDIN>;
chomp @script;
eval join ' ', @script;

Closures
Instead of returning data, a Perl subroutine can return a reference to a subroutine.

This is really no different from any other way of passing subroutine references around, except for a somewhat hidden feature involving anonymous subroutines and lexical (my) variables.

An important use for anonymous subroutines is in constructing closures.

sub tricky {
    my $x = shift;
    return sub {print "$x and $_[0]"};
}

The subroutine tricky expects one argument, and returns a reference to an anonymous subroutine, which also expects one argument.

The body of the anonymous subroutine includes a reference to $x, and as this occurred within the scope of the my declaration, the $x in the body of the anonymous subroutine is the local $x.

If we write
$sub = &tricky("green");
then the value of $sub is a reference to an anonymous subroutine whose body contains the value of tricky's $x, in this case the string "green".

If later in the script we have
$x = "yellow";
&$sub("red");
then the output is 'green and red', not 'yellow and red'.

This is because the body of the anonymous subroutine is executed in the environment in which it was declared, not the environment in which it is called. It carries its declaration-time environment with it: this combination of code and environment is what computer scientists call a closure.

Eg: The powers of 2
sub p2gen {
    my $n = 1;
    return sub { $n *= 2; }
}

Now if we set
$powers_of_two = p2gen;
then the call
&$powers_of_two;
will return 2; subsequent calls will return 4, 8, 16 and so on.
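The generator can be run to confirm that each closure keeps its own private copy of $n. The second generator, $another, is an addition for this sketch and shows that the two closures do not share state.

```perl
use strict;
use warnings;

sub p2gen {
    my $n = 1;
    return sub { $n *= 2 };         # the closure keeps its own $n
}

my $powers_of_two = p2gen();
my @first = map { $powers_of_two->() } 1 .. 4;
print "@first\n";                   # 2 4 8 16

my $another = p2gen();              # a fresh generator with a fresh $n
my $restart = $another->();
my $next    = $powers_of_two->();
print "$restart $next\n";           # 2 32
```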

sysread and syswrite


Experienced C programmers working in the UNIX environment are used to accessing the underlying UNIX I/O system calls directly. Perl provides equivalent functions, sysread and syswrite. Perl's sysread is used in exactly the same way as read to read a block of data of known length. syswrite makes it possible to write a block of data of known length. Eg:
$bytes_written = syswrite BINFILE, $binstring, $n;
The value returned is the number of bytes actually written.

sysread
The syntax of sysread function is:
sysread Fhandle, scalar, len, offset

This function reads a block of data (len bytes) from the given filehandle (Fhandle) into the variable scalar. It returns the number of bytes read on success, returns 0 if EOF is reached and returns undef on failure.

The offset parameter is optional. It specifies the position within scalar at which the data read is to be placed. By default, the data is placed at the beginning of scalar. An exception is raised when len is negative or when a negative offset points before the start of the string.

syswrite
The syntax of syswrite function is:
syswrite Fhandle, scalar, len, offset

This function writes len bytes of data from the scalar variable to the given filehandle (Fhandle). It returns the number of bytes written on success and returns undef on failure. The offset parameter is optional. It specifies the position within the string from which writing starts. If scalar is empty, the only offset that can be used is 0. If the offset is negative then writing starts that many bytes backward from the end of the string. An exception is raised when len is negative or when offset is outside the string boundary.
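The len and offset parameters of both functions can be seen at work on a scratch file. This is a minimal sketch: the temporary file from the core File::Temp module and the data strings are assumptions made for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);    # scratch file, removed at exit
binmode $out;

my $binstring = "0123456789";
# Write 4 bytes of $binstring, starting at offset 2 within the string.
my $bytes_written = syswrite($out, $binstring, 4, 2);
defined $bytes_written or die "syswrite failed: $!";
close $out or die "Failed to close $path: $!";

open(my $in, '<', $path) or die "Can't read $path: $!";
binmode $in;
my $buffer = "ab";
# Ask for up to 10 bytes and place them at offset 2 of $buffer.
my $bytes_read = sysread($in, $buffer, 10, 2);
defined $bytes_read or die "sysread failed: $!";

my $rest;
my $at_eof = sysread($in, $rest, 10);        # 0 signals end of file
close $in;

print "$bytes_written $bytes_read $buffer $at_eof\n";   # 4 4 ab2345 0
```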

packages
A package is a collection of code which lives in its own namespace.
Syntax: package package_name;

A namespace is a named collection of unique variable names (also called a symbol table), which determines bindings of names both at compile-time and run-time. (Run-time name lookup happens when symbolic references are de-referenced and during execution of eval.) Namespaces prevent variable name collisions between packages.

The Package Statement
The package statement switches the current naming context to a specified namespace (symbol table). If the named package does not exist, a new namespace is first created. The package stays in effect until either another package statement is invoked, or until the end of the current block or file. You can explicitly refer to variables within a package using the :: package qualifier.

Initially, code runs in the default package main. Variables used in a package are global to that package but invisible in any other package unless a 'fully qualified name' is used.

Thus $A::x is the variable x in package A, while $B::x is the (different) variable x in package B, and $x is the variable x in the package main.

A package is introduced by the package declaration, and extends to the end of the innermost enclosing block, or until another package declaration is encountered. In the latter case the original package declaration is temporarily hidden.

Eg:
$i = 1; print "$i\n";    # Prints "1"
package foo;
$i = 2; print "$i\n";    # Prints "2"
package main;
print "$i\n";            # Prints "1"

$PACKAGE_NAME::VARIABLE_NAME

For Example:
$i = 1; print "$i\n";    # Prints "1"
package foo;
$i = 2; print "$i\n";    # Prints "2"
package main;
print "$i\n";            # Prints "1"
print "$foo::i\n";       # Prints "2"
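The same separation of namespaces holds under use strict, where package variables are declared with our. The package name Counter and its subroutine bump are hypothetical, used only for this sketch.

```perl
use strict;
use warnings;

{
    package Counter;               # hypothetical package, limited to this block
    our $count = 0;                # this is $Counter::count
    sub bump { return ++$count }
}

# Back in package main once the block has ended.
our $count = 100;                  # a different variable: $main::count

Counter::bump() for 1 .. 3;

print "$Counter::count $main::count $count\n";   # 3 100 100
```

The unqualified $count on the last line is the variable of package main, because the package Counter declaration ended with its enclosing block.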

BEGIN and END Blocks
You may define any number of code blocks named BEGIN and END, which act as constructors and destructors respectively.

BEGIN { ... }
END { ... }
BEGIN { ... }
END { ... }

Every BEGIN block is executed as soon as it has been compiled, before the rest of the script is compiled and before any other statement is executed. Every END block is executed just before the perl interpreter exits. BEGIN and END blocks are particularly useful when creating Perl modules.
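A small sketch of the execution order may help here. The array name @order is made up for the illustration. Note that when several END blocks are defined, Perl runs them in the reverse of the order in which they were defined.

```perl
use strict;
use warnings;

our @order;    # package variable, so the BEGIN blocks can fill it at compile time

push @order, 'main code';                  # ordinary statement: runs at run time

BEGIN { push @order, 'first BEGIN' }       # runs as soon as it is compiled
BEGIN { push @order, 'second BEGIN' }

END { print "END defined first, runs last\n" }
END { print "END defined second, runs first\n" }

print join(', ', @order), "\n";            # first BEGIN, second BEGIN, main code
```

Although the push of 'main code' appears first in the file, both BEGIN blocks have already run by the time it executes.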

Libraries and modules


Libraries and modules, on the other hand, are packages contained within a single file, and are units of program reusability.
The power of Perl is immensely enhanced by the existence of a large number of publicly available libraries and modules that provide functionality in specific application areas.


Libraries

A library is a package that contains a collection of related subroutines for some specific purpose, e.g. a collection of mathematical functions. A library is usually declared as a package to give it a private namespace, and is stored in a separate file with the extension .pl. Thus we might have a library of mathematical functions stored in a file Math.pl. These subroutines can be loaded into a program by placing the following statement at the head of the program:

require "Math.pl";

If the library is declared as a package called Math, we can access its subroutines using fully qualified names, e.g.:

$rootx = Math::sqrt($x);
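As a self-contained sketch of the library mechanism, the script below writes a tiny library file to a temporary directory and then loads it with require. The file name Mathlib.pl, the package Mathlib and the subroutine square are all made up for the illustration. In real use the library file would already exist on disk.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a tiny library file (hypothetical name and contents).
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'Mathlib.pl');

open(my $fh, '>', $file) or die "cannot write $file: $!\n";
print {$fh} <<'END_LIB';
package Mathlib;
sub square { my ($n) = @_; return $n * $n; }
1;   # a library file conventionally ends with a true value
END_LIB
close($fh) or die "cannot close $file: $!\n";

require $file;                      # load the library at run time

my $nine = Mathlib::square(3);      # call through the fully qualified name
print "$nine\n";                    # prints 9
```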

Modules
Libraries have been largely superseded by modules, which provide enhanced functionality. A module is just a package that is contained in a separate file whose name is the same as the package name, with the extension .pm. Modules are more reusable than libraries because they follow certain conventions and take advantage of built-in support that makes them a powerful form of reusable software.
From the user's point of view, the great thing about a module is that the names of the subroutines it exports are automatically imported into the namespace of the program using the module.

Eg: a module Math.pm is loaded at the start of the program by writing

use Math;

Now the program can access the subroutines exported by the module as if they were defined in the program itself. The use statement has the effect of writing

BEGIN { require Math; Math->import(); }

A user can also import only selected subroutines of a module:

use Math qw(sqrt sin cos);



The above statement has the effect of writing the following:

BEGIN { require Math; Math->import(qw(sqrt sin cos)); }
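The import mechanism can be tried out in a single file with the sketch below. It assumes the standard Exporter module supplies the import method. The module name Shapes and its subroutines are made up. The assignment to %INC is only a device to stop use from looking for a Shapes.pm file on disk.

```perl
use strict;
use warnings;

BEGIN {
    package Shapes;                          # hypothetical module, defined inline
    use Exporter 'import';                   # borrow Exporter's import method
    our @EXPORT_OK = qw(area perimeter);     # names a user may ask for

    sub area      { my ($w, $h) = @_; return $w * $h; }
    sub perimeter { my ($w, $h) = @_; return 2 * ($w + $h); }

    $INC{'Shapes.pm'} = __FILE__;            # mark the module as already loaded
}

use Shapes qw(area);                         # import only area into this namespace

my $area      = area(3, 4);                  # imported name, no qualifier needed
my $perimeter = Shapes::perimeter(3, 4);     # not imported: needs the full name

print "$area $perimeter\n";                  # prints "12 14"
```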


When the user requests a module with a use statement, Perl searches the directories listed in the array @INC, in order. This array holds a list of directories for the specific platform. (Before Perl 5.26 the list also included the current directory by default.)
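The search path can be inspected and extended from a script, as in this short sketch. The directory /opt/demo/lib is a made-up placeholder. The standard lib pragma adds it to the front of @INC, so it is searched first.

```perl
use strict;
use warnings;
use lib '/opt/demo/lib';     # hypothetical directory, placed at the front of @INC

my $first = $INC[0];         # the directory that will be searched first

print "$_\n" for @INC;       # list every directory Perl will search, in order
```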


Data structures
In C, a two-dimensional array is constructed as an array of arrays, reflected in the syntax for accessing an element, e.g. a[0][1]. This technique does not work in Perl, since it is not possible to create a LISP-like list of lists (a list nested inside a list is flattened into its parent).
However, a similar effect can be obtained by creating an array of references to anonymous arrays. Suppose we write

@colours = ([42, 128, 244], [24, 255, 0], [0, 127, 127]);

The array composer converts each comma-separated list to an anonymous array in memory and returns a reference to it, so that @colours is in fact an array of references to anonymous arrays. When we write an expression like

$colours[0][1] = 64;

$colours[0] is a reference to an array, and applying a subscript to it forces dereferencing, returning the anonymous array from which an element is selected by the second subscript. (In fact, Perl inserts a dereferencing operator -> between an adjacent ] and [.)
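The @colours example can be run as it stands. The sketch below adds a few accesses to show the implicit and explicit forms of dereferencing. The variable names other than @colours are made up.

```perl
use strict;
use warnings;

my @colours = ([42, 128, 244], [24, 255, 0], [0, 127, 127]);

$colours[0][1] = 64;                       # short for $colours[0]->[1] = 64

my $green_of_first = $colours[0]->[1];     # explicit arrow, same element: 64
my $row_count      = scalar @colours;      # number of inner arrays: 3
my @second_row     = @{ $colours[1] };     # copy of one inner array

for my $rgb (@colours) {                   # each $rgb is an array reference
    print join(', ', @$rgb), "\n";
}
```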


Complex data structures


By analogy with the array of arrays, which is actually an array of references to arrays, we can create hashes of (references to) hashes. This is a very common data structure in Perl, since the anonymous hash provides the capability of records (structs) in other languages.
Likewise, we can create arrays of (references to) hashes and hashes of (references to) arrays. By combining all these possibilities, data structures of immense complexity can be created.
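As a minimal sketch of these combinations, the hash below maps a name to a record (an anonymous hash), and one field of each record is itself a reference to an array. All names and values are invented for the illustration.

```perl
use strict;
use warnings;

# Hash of hashes; the langs field holds a reference to an array.
my %staff = (
    alice => { dept => 'CSE', langs => ['Perl', 'C'] },
    bob   => { dept => 'ECE', langs => ['Python'] },
);

my $dept        = $staff{alice}{dept};        # record field
my $second_lang = $staff{alice}{langs}[1];    # element of the nested array

push @{ $staff{bob}{langs} }, 'Perl';         # grow a nested array in place
my $bob_count = scalar @{ $staff{bob}{langs} };

for my $name (sort keys %staff) {
    print "$name ($staff{$name}{dept}): @{ $staff{$name}{langs} }\n";
}
```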


Eg: a doubly linked list
We choose to make each element of the list a hash containing three fields with keys 'L' (left neighbour), 'R' (right neighbour) and 'C' (content).
The values associated with L and R are references to element hashes (or undef for non-existent neighbours); the value associated with C can be anything: a scalar, or a reference to an array, a hash or some more complex structure.


Typically we will have two scalar variables containing references to element hashes:
1. $head, a reference to the first element in the list,
2. $current, a reference to the element of current interest.
The content of the current element is accessed by $current->{'C'}. We can move forwards along the list with

$current = $current->{'R'};

and backwards with

$current = $current->{'L'};

If we create a new element with

$new = {L=>undef, R=>undef, C=>...};

we can insert this element after the current element with

$new->{'R'} = $current->{'R'};
$current->{'R'}->{'L'} = $new;
$current->{'R'} = $new;
$new->{'L'} = $current;

Finally we can delete the current element with

$current->{'L'}->{'R'} = $current->{'R'};
$current->{'R'}->{'L'} = $current->{'L'};
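The list operations above can be collected into a runnable sketch. The helper names insert_after, remove_element and contents are made up. Unlike the statements above, the helpers test for a missing neighbour, because assigning through an undef L or R field would otherwise create an unwanted empty element at the ends of the list.

```perl
use strict;
use warnings;

# A one-element list; $head follows the naming used above.
my $head = { L => undef, R => undef, C => 'a' };

# Insert a new element after $cur (hypothetical helper).
sub insert_after {
    my ($cur, $content) = @_;
    my $new = { L => $cur, R => $cur->{R}, C => $content };
    $cur->{R}{L} = $new if defined $cur->{R};   # skip when $cur is the tail
    $cur->{R}    = $new;
    return $new;
}

# Unlink an element from its neighbours (hypothetical helper).
sub remove_element {
    my ($cur) = @_;
    $cur->{L}{R} = $cur->{R} if defined $cur->{L};
    $cur->{R}{L} = $cur->{L} if defined $cur->{R};
    return $cur->{C};
}

# Walk the list from a starting element, collecting the contents.
sub contents {
    my ($node) = @_;
    my @out;
    for (; defined $node; $node = $node->{R}) { push @out, $node->{C} }
    return @out;
}

my $third  = insert_after($head, 'c');     # a <-> c
my $second = insert_after($head, 'b');     # a <-> b <-> c
my @before = contents($head);

remove_element($second);                   # a <-> c
my @after = contents($head);

print "@before\n";                         # prints "a b c"
print "@after\n";                          # prints "a c"
```

Removing the first element would also require updating $head, which this sketch leaves out.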


Objects
Objects in 'real' object-oriented programming (OOP) are encapsulations of data and methods, defined by classes. Important aspects of OOP technology are inheritance and polymorphism. Objects in Perl provide similar functionality, but in a different way: they use the same terminology as OOP, but the words have different meanings.


Objects: An object is an anonymous data structure (usually, but not necessarily, a hash) that is accessed by a reference, has been assigned to a class, and knows what class it belongs to. We have not defined classes yet: just remember that a Perl class is not the same thing as an OOP class. The object is said to be blessed into a class: this is done by calling the built-in function bless in a constructor.
Constructors: There is no special syntax for constructors: a constructor is just a subroutine that returns a reference to an object.


Classes: A class is a package that provides methods to deal with objects that belong to it. There is no special syntax for class definition.
Methods: Finally, a method is a subroutine that expects an object reference as its first argument (or a package name for class methods). To sum up, an object is just a chunk of referenced data, but the blessing operation ties it to a package that provides the methods to manipulate that data. Thus there is no encapsulation, but there is an association between data and methods.


Constructors

Objects are created by a constructor subroutine which is generally (but not necessarily) called new. Example:

package Animal;
sub new {
    my $ref = {};
    bless $ref;
    return $ref;
}

Remember that {} returns a reference to an anonymous hash. So the new constructor returns a reference to an object that is an empty hash, and knows (via the bless function) that it belongs to the package Animal.
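A slightly fuller sketch of a constructor is given below. It goes beyond the slide in two labelled ways: it uses the two-argument form of bless, which blesses into the class name passed as the first argument rather than into the current package, and it adds made-up fields (name, sound) and a made-up method (speak) to show a method receiving the object reference as its first argument.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class, %args) = @_;              # the class name arrives as the first argument
    my $self = { name => $args{name}, sound => $args{sound} };
    return bless $self, $class;           # two-argument bless: tie the hash to $class
}

sub speak {                               # hypothetical method
    my ($self) = @_;                      # the object reference arrives first
    return "$self->{name} says $self->{sound}";
}

package main;

my $cow  = Animal->new(name => 'Ermyntrude', sound => 'moo');
my $line = $cow->speak;

print "$line\n";                          # prints "Ermyntrude says moo"
```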

Instances

With this constructor new defined we can create instances of the object (strictly speaking, references to instances):

$Dougal = new Animal;
$Ermyntrude = new Animal;

This makes $Dougal and $Ermyntrude references to objects that are empty hashes, and 'know' that they belong to the class Animal. Note that although the right-hand side of the assignment, new Animal, looks superficially like a simple call of a subroutine new with one argument Animal, it is in fact a special syntax for calling the subroutine new in the package Animal (it can equally be written Animal->new).
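The sketch below creates the two instances with the arrow form Animal->new, which is equivalent to new Animal, and uses the built-in ref function to show that each object knows its class. The species field is made up for the illustration.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my $class = shift;
    return bless {}, $class;          # an empty hash blessed into the class
}

package main;

my $Dougal     = Animal->new;         # same effect as: new Animal
my $Ermyntrude = Animal->new;

my $class_name = ref $Dougal;         # the class the object was blessed into

$Dougal->{species} = 'dog';           # each instance has its own hash
my $shared = exists $Ermyntrude->{species} ? 1 : 0;

print "$class_name\n";                # prints "Animal"
print "shared: $shared\n";            # prints "shared: 0"
```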

Interfacing to the Operating System


So that Perl can be used as an alternative to shell scripts, the UNIX implementation of Perl replicates the system calls as built-in functions. How far these calls can be replicated depends on the capabilities of the host operating system; Windows NT provides equivalents for most of the UNIX facilities.
The facilities that are common to the UNIX and NT implementations of Perl include the following.


Environment variables
The current environment variables are stored in the special hash %ENV. A script can read them or change them by accessing this hash. The local operator can be used to make a temporary change to the value of an individual environment variable. Example:

{ local $ENV{PATH} = "new value"; ... }
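A runnable sketch of the same idiom follows. It uses a made-up variable name, DEMO_SETTING, rather than PATH so that running it has no side effects.

```perl
use strict;
use warnings;

$ENV{DEMO_SETTING} = 'original';              # hypothetical environment variable

my $inside;
{
    local $ENV{DEMO_SETTING} = 'temporary';   # change visible only within this block
    $inside = $ENV{DEMO_SETTING};             # commands run here would see 'temporary'
}
my $after = $ENV{DEMO_SETTING};               # old value restored at block exit

print "inside: $inside, after: $after\n";     # inside: temporary, after: original
```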



The commands inside such a block are executed with the new value of PATH; the original value is restored when the block exits.

File system calls
There are many built-in functions for handling the file system. All return a value indicating success or failure, so a typical idiom is

chdir $x or die "cannot chdir to $x\n";


Here we have used the ultra-low precedence or operator; we could equally well have written

chdir $x || die "cannot chdir";

since named unary operators like chdir have higher precedence than logical operators.
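The equivalence holds for named unary operators such as chdir. For list operators, which include user-defined subroutines called without parentheses, || binds to the last argument instead, so or is the safer habit. The sketch below illustrates the difference with a made-up subroutine, always_fails, standing in for a failing call.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a call that fails. A predeclared sub called
# without parentheses is parsed as a list operator.
sub always_fails { return 0 }

my @handled;

# Parsed as: always_fails('x') or push(...)    The failure is noticed.
always_fails 'x' or push @handled, 'or';

# Parsed as: always_fails('x' || push(...))    'x' is true, so push never runs.
always_fails 'x' || push @handled, '||';

print "handled by: @handled\n";    # prints "handled by: or"
```

Writing the call with parentheses, as in always_fails('x') || ..., removes the ambiguity.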

Examples of the more common calls for manipulating the file system are:

chdir $x            Change directory
unlink $x           Same as rm in UNIX or delete in NT
rename($x, $y)      Same as mv in UNIX
link($x, $y)        Same as ln in UNIX. Not in NT
symlink($x, $y)     Same as ln -s in UNIX. Not in NT
mkdir($x, 0755)     Create directory and set modes
rmdir $x            Remove directory
chmod(0644, $x)     Set file permissions
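Several of these calls are combined in the sketch below, each checked with the or die idiom. It works inside a temporary directory created with the standard File::Temp module, and the file and directory names are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $base = tempdir(CLEANUP => 1);                       # scratch area, removed at exit
my $dir  = File::Spec->catdir($base, 'reports');        # hypothetical names
my $old  = File::Spec->catfile($dir, 'draft.txt');
my $new  = File::Spec->catfile($dir, 'final.txt');

mkdir($dir, 0755)        or die "cannot mkdir $dir: $!\n";

open(my $fh, '>', $old)  or die "cannot create $old: $!\n";
print {$fh} "hello\n";
close($fh)               or die "cannot close $old: $!\n";

chmod(0644, $old)        or die "cannot chmod $old: $!\n";
rename($old, $new)       or die "cannot rename $old: $!\n";
my $renamed = (-e $new && !-e $old) ? 1 : 0;            # new name exists, old one gone

unlink($new)             or die "cannot unlink $new: $!\n";
rmdir($dir)              or die "cannot rmdir $dir: $!\n";
my $removed = (-d $dir) ? 0 : 1;                        # directory no longer exists

print "renamed: $renamed, removed: $removed\n";
```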


The system function
A simple example of the system function is

system("date");

The string given as argument is treated as a shell command to be run by /bin/sh on UNIX, and by cmd.exe on NT, with the existing STDIN, STDOUT and STDERR. Any of these can be redirected using Bourne shell notation, e.g.

system("date >datefile") && die "can't redirect";

Shell commands return zero to denote success and a non-zero value to denote failure, which is why the test here uses && rather than or.
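The return convention can be checked with the sketch below. So that it does not depend on any particular external program, it runs the current perl interpreter (whose path is in $^X) as the command, and it passes the arguments as a list, which runs the program directly rather than through a shell. The exit value 3 is arbitrary.

```perl
use strict;
use warnings;

# A command that succeeds: system returns 0.
my $ok = system($^X, '-e', 'exit 0');

# A command that fails with exit value 3: system returns a non-zero status.
my $failed    = system($^X, '-e', 'exit 3');
my $exit_code = $? >> 8;          # the command's exit value is in the high byte of $?

print "ok: $ok, failed: ", ($failed != 0 ? 'yes' : 'no'), ", exit code: $exit_code\n";
```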



Eg:

system("echo textfiles: *.txt");

In this case the *.txt will be expanded to a list of all the files in the current directory with the .txt extension before the echo is executed.

Quoted execution
The output of a shell command can be captured using quoted execution.
Eg1:

$date = `date`;

Here, the output of the date command is assigned to the variable $date.
Eg2:

$date = qx/date/;   # qx: quoted execution
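A portable sketch of quoted execution is given below. It captures the output of the current perl interpreter ($^X) instead of date, so that the result is predictable. In scalar context qx returns the whole output as one string. In list context it returns one element per line.

```perl
use strict;
use warnings;

# Scalar context: the complete output as a single string.
my $answer = qx{"$^X" -e "print 6*7"};

# List context: one element per output line (newlines still attached).
my @lines = qx{"$^X" -le "print for 1..3"};
chomp @lines;

print "$answer\n";      # prints 42
print "@lines\n";       # prints "1 2 3"
```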



Exec
The exec function terminates the current script and executes the program named as its argument, in the same process.
Note: exec never returns if it succeeds, so there is no point in having any Perl statements following it, except possibly an or die clause:

exec "sort $output" or die "Can't exec sort\n";

exec can be used to replace the current script with a new script:

exec "perl -w scr2.pl" or die "Exec failed\n";

If the argument is a list, or is a string containing blank-separated words, argument processing is the same as for the system function.
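Because exec replaces the running script, a demonstration needs a second process to observe the result. The sketch below assumes a system where fork is available (fork is not covered in these slides). The child process replaces itself with a new perl, found through $^X, that exits with the arbitrary value 7, and the parent collects that value.

```perl
use strict;
use warnings;

my $pid = fork();
die "cannot fork: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child: replaced by a new perl process; nothing after exec runs on success.
    exec($^X, '-e', 'exit 7') or die "Exec failed: $!\n";
}

# Parent: wait for the child and read its exit value.
waitpid($pid, 0);
my $child_exit = $? >> 8;

print "child exit status: $child_exit\n";    # prints "child exit status: 7"
```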