What is Perl?
Perl is a general-purpose programming language originally
developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.
The language is intended to be practical and easy to use. It supports both procedural and object-oriented (OO) programming, has powerful built-in support for text processing, and has one of the world's most impressive collections of third-party modules.
A.Brahmananda Reddy / Assistant Professor / CSE / VNRVJIET
program, you can run it; there's no mandatory compilation phase. The same Perl program can run on Unix, Windows, Windows NT, MacOS, DOS, OS/2, VMS and the Amiga.
Perl is collaborative. The CPAN software archive contains free
The source code and compiler are free, and will always be free.
Perl is fast. The Perl interpreter is written in C, and more than a
do it." The language doesn't force a particular style of programming on you. Write what comes naturally.
language, internal support for hash tables, a built-in debugger, facilities for report generation, networking functions, utilities for CGI scripts, database interfaces, arbitrary-precision arithmetic --are all bundled with Perl.
Perl is secure. Perl can perform "taint checking" (to taint is to spoil or contaminate) to prevent security breaches. You can also run a program in a "safe" compartment to avoid the risks inherent in executing unknown code.
Perl is open for business. Thousands of corporations rely on Perl
things possible. Perl handles tedious tasks for you, such as memory allocation and garbage collection.
is object-oriented. Inheritance, polymorphism, and encapsulation are all provided by Perl's object-oriented capabilities.
satisfaction of seeing our well-tuned programs do our bidding, but in the literary act of creative writing that yields those programs. With Perl, the journey is as enjoyable as the destination.
easily accessible.
Perl is also available for MS-DOS, Windows NT and Macintosh.
Basic Concepts
Perl files use the extension .pl
Can create self-executing scripts
Advantage of Perl
Can use system commands
Comment entry
Print stuff on screen
To run a script, pass it to the interpreter:
    perl progname.pl
Alternatively, put this as the first line of your script:
    #!/bin/perl
or
    #!/usr/bin/env perl
This first line tells the kernel where to look for perl, which makes the file self-executable: you can run the script as /path/to/script.pl. The .pl extension is only a naming convention. Of course, the script will need to be executable first, so run chmod 755 script.pl (under Unix).
The -w switch tells perl to produce extra warning messages
Basics
The advantage of Perl is that you don't have to compile create
Basics
The pound sign "#" is the symbol for comment entry. There is no multiline comment entry, so you have to use a # on each line.
The "print command" is used to write outputs on the screen.
Eg: print "this is CSE"; prints "this is CSE" on the screen. It is very similar to the printf statement in C.
If you want to use formats for printing you can use printf.
These statements are simply written in the script in a straightforward fashion. There is no need to have a main() function or anything of that kind.
Perl statements end in a semi-colon:
    print "Hello, world";
Comments start with a hash symbol and run to the end of the line:
    # This is a comment
Whitespace is irrelevant, except inside quoted strings.
Double quotes or single quotes may be used around literal strings:
    print "Hello, world";
    print 'Hello, world';
However, only double quotes "interpolate" variables and special characters such as newlines (\n):
    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally
Numbers don't need quotes around them:
    print 42;
them according to your personal taste. They are only required occasionally to clarify issues of precedence.
    print("Hello, world\n");
    print "Hello, world\n";

    print("Hello World!\n");
    print ("Hello World!\n");
    print "Hello World!", "\n";
    print "Hello", " ", "World!", "\n";
examples to give the flavour of Perl and some indication of its power.
Example 1: Print lines containing the string 'Shazzam!'
    #!/usr/bin/perl
    while (<STDIN>) { print if /Shazzam!/ };
anonymous variable analogous to the pronoun 'it', and the implied pattern match 'print it if it matches /Shazzam!/'. If we wanted to spell it out, we would change the code to the version shown in Example 2.
Example 2: The same thing the hard way
    #!/usr/bin/perl
    while ($line = <STDIN>) { print $line if $line =~ /Shazzam!/ };
Here we use a variable instead of the anonymous 'it'. The $ indicates that it is a scalar variable (as opposed to an array).
Example 3: Print lines not containing the string 'Shazzam!'
If we want to print all lines except those containing the pattern, we just change if to unless.
    #!/usr/bin/perl
    while (<STDIN>) { print unless /Shazzam!/ };
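The same filter can be tried without typing input by hand. The sketch below is only an illustration: it applies the pattern to a hypothetical in-memory list @lines (not part of the original examples) and uses the built-in grep function to collect the matching and the non-matching lines.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for lines read from STDIN.
my @lines = (
    "Shazzam! said the wizard\n",
    "nothing here\n",
    "He cried Shazzam!\n",
);

# The 'if' version: keep the lines that match the pattern.
my @matching = grep { /Shazzam!/ } @lines;

# The 'unless' version: keep the lines that do not match.
my @others = grep { !/Shazzam!/ } @lines;

print "Matching:\n", @matching;
print "Others:\n",   @others;
```

Inside the grep block the pattern is tested against $_, the same anonymous 'it' used by the while loop.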
which have a name (or identifier) and a value: a value is assigned to (or stored in) a variable by an assignment statement of the form
name = value;
Variable names resemble nouns in English (command names are verbs).
A singular name is associated with a variable that holds a single item of data, called a scalar value.
A plural name is associated with a variable that holds a collection of data items, called an array or hash.
Variable names start with a character that denotes the kind of thing that the name stands for: scalar data ($), array (@), hash (%), subroutine (&) etc.
signed integers if possible, and otherwise as double length floating point numbers in the system's native format. Strings are stored as sequences of bytes of unlimited length. Although we tend to think of strings as sequences of printable characters, Perl attaches no significance to any of the 256 possible values of a byte.
0777 0x3fff
String constants:
the quote (single or double) which started it, so a single-quoted string can include double quotes and vice versa.
Single quoted strings are treated 'as is' with no
interpretation of their contents except the usual use of backslash to remove special meanings, so to include a single quote in such a string it must be preceded with a backslash, and to include a backslash you use two of them.
Standard Files
STDIN : It is a normal input channel for the script.
Example
    #!/usr/local/bin/perl -w
    print "Enter the Text: ";
    $input = <STDIN>;                   # reads the input and stores it in $input
    chomp($input);                      # removes the trailing newline character
    print "entered text = $input\n";    # prints the input on the command line
This program displays:
    Enter the Text: Perl is awesome
    entered text = Perl is awesome
Perl reads the typed line as "Perl is awesome\n": the newline from the Enter key is part of the input, so use chomp to remove it.
Variable Interpolation
Interpolation takes place only in double quotation marks.
Example
    #!/usr/local/bin/perl -w
    $x = 12;                        # assign the value to the variable
    print 'Value of x is $x';       # prints the output
This program displays:
    Value of x is $x
Single quotation marks do not interpolate the values (no processing is done).
Example
    #!/usr/local/bin/perl -w
    $x = 12;                        # assign the value to the variable
    print "Value of x is $x";       # prints the output
This program displays:
    Value of x is 12
Integers
Integers are usually expressed as decimal (base 10) but can be specified in several different formats.
    234       decimal integer
    0765      octal integer
    0b1101    binary integer
    0xcae     hexadecimal integer
Converting a number from one base to another can be done using the sprintf function. Values can be displayed in different bases using the printf function.
Example
    #!/usr/local/bin/perl -w
    $num = 0b1010;                  # binary literal (decimal 10)
    $bin = sprintf("%b", $num);
    $hex = sprintf("%x", $num);
    $oct = sprintf("%o", 45);
    print "binary = $bin\nhexa = $hex\noctal = $oct\n";
This program displays:
    binary = 1010
    hexa = a
    octal = 55
Example
    #!/usr/local/bin/perl -w
    $x = 98;
    printf("Value in decimal = %d\n", $x);
    printf("Value in octal = %o\n", $x);
    printf("Value in binary = %b\n", $x);
    printf("Value in hexadecimal = %x\n", $x);
This program displays:
    Value in decimal = 98
    Value in octal = 142
    Value in binary = 1100010
    Value in hexadecimal = 62
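Going the other way, from a string written in another base back to a number, is not covered on the slide. A minimal sketch, assuming the built-in oct and hex functions are used for the reverse conversion:

```perl
use strict;
use warnings;

my $x = 98;

# Number to string in another base, using sprintf.
my $as_bin = sprintf("%b", $x);    # "1100010"
my $as_oct = sprintf("%o", $x);    # "142"
my $as_hex = sprintf("%x", $x);    # "62"

# String in another base back to a number, using oct() and hex().
my $from_bin = oct("0b$as_bin");   # oct() understands the 0b prefix
my $from_oct = oct($as_oct);
my $from_hex = hex($as_hex);

print "$as_bin $as_oct $as_hex\n";
print "$from_bin $from_oct $from_hex\n";   # prints 98 98 98
```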
Escaped Sequences
Character strings that are enclosed in double quotes accept escape sequences for special characters. An escape sequence consists of a backslash (\) followed by one or more characters.
    Escape Sequence    Description
    \b                 Backspace
    \e                 Escape
    \f                 Form feed
    \l                 Forces the next letter into lower case
    \L                 All following letters are lower case
    \u                 Forces the next letter into upper case
    \U                 All following letters are upper case
    \r                 Carriage return
    \v                 Vertical tab (not recognized in Perl string literals; write \x0B instead)
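The case-changing escapes are easiest to understand by trying them. The sketch below is an illustration with made-up strings; it also uses \E, which ends the effect of \L or \U and is not listed in the table.

```perl
use strict;
use warnings;

my $name = "perl";

my $first_up = "\u$name";              # next letter upper case: "Perl"
my $all_up   = "\U$name\E rocks";      # upper case until \E: "PERL rocks"
my $all_low  = "\LSHOUT\E Quietly";    # lower case until \E: "shout Quietly"
my $one_low  = "\lHELLO";              # next letter lower case: "hELLO"

print "$first_up\n$all_up\n$all_low\n$one_low\n";
```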
Ex:chop($text)
Operators
Operators can be broadly divided into 4 types.
Unary operator, which takes one operand. Example: the not operator, i.e. !
Binary operator, which takes two operands. Example: the addition operator, i.e. +
Ternary operator, which takes three operands. Example: the conditional operator, i.e. ?:
List operator, which takes a list of operands. Example: the print operator
Arithmetic Operators
    Operator    Description
    +           Adds two numbers
    -           Subtracts two numbers
    *           Multiplies two numbers
    /           Divides two numbers
    ++          Increments by one (same as in C)
    --          Decrements by one
    %           Gives the remainder (10 % 3 gives 1)
    **          Gives the power of the number: print 2**5; # prints 32
Shift Operators
Shift operators manipulate integer values as binary numbers, shifting their bits to the left and to the right respectively.
    Operator    Description
    <<          Left shift.  print 2 << 3;     # left shift by three positions, prints 16
    >>          Right shift. print 42 >> 2;    # right shift by two positions, prints 10
    x           Repetition operator.
Ex: print "hi" x 3;             # output: hihihi
Ex2: @array = (1, 2, 3) x 3;    # array contains (1,2,3,1,2,3,1,2,3)
Ex3: @array = (2) x 80;         # 80-element array of value 2
Logical Operators
Logical operators are represented by either symbols or names. These two sets are identical in operation, but have different precedence.
    Operator      Description
    && or and     Return true if operands are both true
    || or or      Return true if either operand is true
    xor           Return true if only one operand is true
    ! or not      (Unary) Return true if operand is false
than even && and ||. The not, and, or, and xor operators have the lowest precedence of all Perl's operators, with not being the highest of the four.
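The precedence difference matters most next to an assignment. A small sketch, using hypothetical variable names, of how the same expression behaves with || and with or:

```perl
use strict;
use warnings;
no warnings 'void';    # the 'or' statement below deliberately discards a value

my ($zero, $five) = (0, 5);

# || binds tighter than =, so this is $with_symbol = ($zero || $five).
my $with_symbol = $zero || $five;

# 'or' binds looser than =, so this is ($with_word = $zero) or $five.
my $with_word;
$with_word = $zero or $five;

print "|| gives $with_symbol, or gives $with_word\n";   # || gives 5, or gives 0
```

This is why the named forms are usually kept for flow control between statements, and the symbolic forms for computing values.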
Bitwise Operators
Bitwise operators treat their operands as binary values and perform a logical operation between the corresponding bits of each value.
    Operator    Description
    &           Bitwise AND
    |           Bitwise OR
    ^           Bitwise XOR
    ~           Bitwise NOT
Comparison Operators
The comparison operators are binary, returning a
!=
. (dot) Concatenation operator. It takes two strings and joins them. Ex: print "System" . "Verilog"; prints SystemVerilog.
Binding operator
The binding operator, =~, binds a scalar expression to a pattern match.
String operations like s///, m// and tr/// work with $_ by default. By using the binding operator you can make them work on a scalar variable other than $_.
The value returned from
for conditional expressions, that is 1 for success and '' (the empty string) for failure in both scalar and list contexts.
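A short sketch of the binding operator with each of the three string operations. The variable names and the sample string are made up for illustration.

```perl
use strict;
use warnings;

my $text = "System Verilog";

# m// bound to $text: true if the pattern matches.
my $has_system = ($text =~ /System/) ? "yes" : "no";

# s/// bound to a copy: replaces the first match and
# returns the number of substitutions made.
my $joined = $text;
my $count  = ($joined =~ s/ //);          # $joined is now "SystemVerilog"

# tr/// bound to a copy: returns the number of characters translated.
my $upper   = $text;
my $changed = ($upper =~ tr/a-z/A-Z/);    # $upper is now "SYSTEM VERILOG"

print "$has_system $joined ($count) $upper ($changed)\n";
```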
Typical: +, -, *, /, %, ++, --, +=, -=, *=, /=, ||, &&, ! etc.
Not typical: ** for exponentiation
String Operators
Concatenation: . (similar to strcat in C)
    $first_name = "Larry";
    $last_name = "Wall";
    $full_name = $first_name . " " . $last_name;
    $language = "Perl";
    if ($language == "Perl") ...    # Wrong!
    if ($language eq "Perl") ...    # Correct
Use eq / ne rather than == / != for strings.
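Why == is wrong for strings can be seen by comparing two different words. In this sketch (the second language name is just an example) == reports them as equal, because both strings are converted to the number 0 before the comparison.

```perl
use strict;
use warnings;
no warnings 'numeric';   # silence the "isn't numeric" warning to show the pitfall

my $language = "Perl";

# == converts both operands to numbers; non-numeric strings become 0.
my $numeric_equal = ($language == "Python") ? 1 : 0;   # 1: both sides are 0

# eq compares the characters themselves.
my $string_equal = ($language eq "Python") ? 1 : 0;    # 0

print "== says $numeric_equal, eq says $string_equal\n";
```

With the -w switch or use warnings, Perl flags the == comparison with a warning that the argument isn't numeric.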
    Comparison                  Numeric    String
    Greater than                >          gt
    Greater than or equal to    >=         ge
    Less than                   <          lt
    Less than or equal to       <=         le
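The two families of comparison operators can give opposite answers for the same values. A minimal sketch with made-up values; the sort function and the <=> operator used here are standard Perl but are not part of the table.

```perl
use strict;
use warnings;

my @values = (10, 9, 100);

my $num_cmp = (10 > 9)       ? 1 : 0;   # 1: compared as numbers
my $str_cmp = ("10" gt "9")  ? 1 : 0;   # 0: the character "1" comes before "9"

# sort compares as strings by default; a numeric sort needs <=>.
my @by_string = sort @values;                 # (10, 100, 9)
my @by_number = sort { $a <=> $b } @values;   # (9, 10, 100)

print "@by_string\n@by_number\n";
```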
String Functions
Convert to upper case
$name = uc($name);
Convert only the first char to upper case
$name = ucfirst($name);
Convert to lower case
$name = lc($name);
Convert only the first char to lower case
$name = lcfirst($name);
Variable Interpolation
Perl looks for variables inside strings and replaces them with their value:
    $stooge = "Larry";
    print "$stooge is one of the three stooges.\n";
Character Interpolation
List of character escapes that are recognized
Common Example :
    print "Hello\n";    # prints Hello and then a newline
# a number prints 22 looks like a string, but ... # will print 40!
enclosed in braces.
The last statement in the block is terminated by the
closing brace.
The control structures in Perl use conditions to control
Bare block : Blocks can in fact appear almost anywhere that a statement can appear: such a block is sometimes called a bare block.
in a Boolean context: if it evaluates to zero or the empty string the condition is treated as false otherwise it is treated as true in accord with the rules already given.
Conditions
usually make use of the relational operators, and several simple conditions can be combined into a complex condition using the logical operators.
If-then-else statements
    if ($total > 0) {
        print "$total\n";
    } else {
        print "bad total!\n";
    }

    if ($total > 70) {
        $grade = "A";
    } elsif ($total > 50) {
        $grade = "B";
    } elsif ($total > 40) {
        $grade = "C";
    } else {
        $grade = "F";
        $total = 0;
    }
Alternatives to if-then-else
A common idiom is to use a conditional expression in place of an if-then-else construct. Thus
    if ($a < 0) {$b = 0} else {$b = 1};
can be written
    $b = ($a < 0) ? 0 : 1;
Another common idiom is, as we have seen, to use the 'or' operator between statements:
    open(IN, $ARGV[0]) or die "Can't open $ARGV[0]\n";
Statement qualifiers
A single statement (but not a block) can be followed by a conditional modifier, as in the English 'I'll come if it's fine'.
    print "OK\n" if $volts >= 1.5;
    print "Weak\n" if $volts >= 1.2 and $volts < 1.5;
    print "Replace\n" if $volts < 1.2;
Compare the following code using conditional expressions, which has the same effect:
    print(($volts >= 1.5) ? "OK\n" : (($volts >= 1.2) ? "Weak\n" : "Replace\n"));
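To try the voltage test on several values, the conditional expression can be wrapped in a subroutine. This is a sketch only; battery_state is a hypothetical name, not something from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a battery voltage using
# chained conditional expressions.
sub battery_state {
    my ($volts) = @_;
    return ($volts >= 1.5) ? "OK"
         : ($volts >= 1.2) ? "Weak"
         :                   "Replace";
}

for my $volts (1.6, 1.3, 1.0) {
    print "$volts: ", battery_state($volts), "\n";
}
```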
Repetition
Perl provides a variety of repetition mechanisms to suit all tastes, including both testing loops and 'counting' loops.
Testing loops:
    while ($a != $b) {
        if ($a > $b) { $a = $a - $b } else { $b = $b - $a }
    }
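The testing loop above repeatedly subtracts the smaller value from the larger, which computes the greatest common divisor of two positive integers. A sketch that wraps it in a subroutine; the name gcd and the sample numbers are assumptions made for illustration, and the loop only terminates for positive integers.

```perl
use strict;
use warnings;

# Greatest common divisor by repeated subtraction.
# Assumes both arguments are positive integers.
sub gcd {
    my ($x, $y) = @_;
    while ($x != $y) {
        if ($x > $y) { $x = $x - $y } else { $y = $y - $x }
    }
    return $x;
}

print gcd(48, 18), "\n";   # prints 6
```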
Repetition
while can be replaced by until to give the same effect as explicit negation of the condition. Likewise, statements (but not blocks) can use while and until as statement modifiers to improve readability:
    $a += 2 while $a < $b;
    $a += 2 until $a > $b;
Note particularly, though, that this is purely syntactic sugar, a notational convenience. Although the condition is written after the statement, it is evaluated before the statement is executed, just like any other while/until loop: if the condition is initially false the statement will never be executed (a zero trip loop).
Repetition
Perl provides the do
speaking, do is a built-in function rather than a syntactic construct. The condition attached to a do loop looks superficially the same as a statement modifier, but the semantics are that the condition is tested after execution of the block, so the block is executed at least once.
    do { ... } while $a != $b;
Repetition
The while can be replaced by
statements of the block and returns the value of the last expression evaluated in the block.
Counting loops: Counting loops use the same syntax as C:
    for ($i = 1; $i <= 10; $i++) {
        $i_square = $i*$i;
        $i_cube = $i**3;
        print "$i\t$i_square\t$i_cube\n";
    }
Repetition
There is also a foreach construct, which takes an explicit list of values for the controlled variable.
    foreach $i (1 .. 10) {
        $i_square = $i*$i;
        $i_cube = $i**3;
        print "$i\t$i_square\t$i_cube\n";
    }
And to count backwards we would write
    foreach $i (reverse 1 .. 10) {
        $i_square = $i*$i;
        $i_cube = $i**3;
        print "$i\t$i_square\t$i_cube\n";
    }
Scalar Variables
They should always be preceded with the $ symbol.
hand. There are no data types such as character or numeric. A scalar variable can store only one value: if you treat it as a character it stores a character, if you treat it as a string it stores a string, and if you treat it as a number it stores a number.
Eg $name = "betty" ;
The value betty is stored in the scalar variable $name.
betty.
    $3var = 123;    # Error, shouldn't start with a number
Variable names begin with $, followed by a letter, then by letters, digits or underscores.
    $animal = "camel";
    $answer = 42;
    print $animal;
    print "The animal is $animal\n";
    print "The square of $answer is ", $answer * $answer, "\n";
Lists
strings) or expressions, which is to be treated as a whole. It is written as a comma-separated sequence of values, e.g.
    "red", "green", "blue"
    255, 128, 66
    $a, $b, $c
    $a + $b, $a - $b, $a*$b, $a/$b
A list often appears in a script enclosed in round brackets, to satisfy precedence rules, e.g.
    ("red", "green", "blue")
It is important to appreciate that the brackets are not a required part of the list syntax.
do what is natural, 'obvious' shorthand is acceptable in lists, e.g.
    (1 .. 8)
    ("A" .. "H", "O" .. "Z")
and to save tedious typing,
    qw(the quick brown fox)
is a shorthand for
    ("the", "quick", "brown", "fox")
It can also be written
    qw/the quick brown fox/ or qw|the quick brown fox|
The 'matching brackets' rule also applies to the qw operator.
List magic
Lists are often used in connection with arrays and hashes. A list containing only variables can appear as the target of an assignment and/or as the value to be assigned. This makes it possible to write simultaneous assignments, e.g.
    ($a, $b, $c) = (1, 2, 3);
and to perform swapping or permutation without using a temporary variable, e.g.
    ($a, $b) = ($b, $a);
    ($b, $c, $a) = ($a, $b, $c);
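A short sketch of simultaneous assignment in action. The variable names are chosen for illustration; the comments trace the values after each step.

```perl
use strict;
use warnings;

my ($first, $second, $third) = (1, 2, 3);

# Swap two variables without a temporary:
# the right-hand list is built before anything is assigned.
($first, $second) = ($second, $first);            # now 2, 1, 3

# Permute three variables in one statement.
($second, $third, $first) = ($first, $second, $third);

print "$first $second $third\n";   # prints 3 2 1
```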
Arrays
An array is an ordered collection of data whose
components are identified by an ordinal index: it is usually the value of an array variable. The name of such a variable always starts with an @, e.g. @days_of_week, denoting a separate namespace and establishing a list context.
The association between arrays and lists is a close one:
an array stores a collection, and a list is a collection, so it is natural to assign a list to an array, e.g. @rainfall = (1.2, 0.4, 0.3, 0.1, 0, 0, 0);
Arrays: An array represents a list of values.
    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed = ("camel", 42, 1.23);
Arrays are zero-indexed. Here's how you get at elements in an array:
    print $animals[0];    # prints "camel"
    print $animals[1];    # prints "llama"
The special variable $#array tells you the index of the last element of an array:
    print $mixed[$#mixed];    # last element, prints 1.23
The elements we're getting from the array start with a $ because we're getting just a single value out of the array.
To get multiple values from an array:
    @animals[0,1];             # gives ("camel", "llama");
    @animals[0..2];            # gives ("camel", "llama", "owl");
    @animals[1..$#animals];    # gives all except the first element
This is called an "array slice".
You can do various useful things to lists:
    my @sorted = sort @animals;
    my @backwards = reverse @numbers;
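The element, slice and last-index notations can be checked together in one small script. This is a sketch reusing the @animals data from the slide; the result variable names are made up.

```perl
use strict;
use warnings;

my @animals = ("camel", "llama", "owl");

my $last      = $animals[$#animals];       # single element, so $ is used: "owl"
my @first_two = @animals[0, 1];            # slice, so @ is used
my @rest      = @animals[1 .. $#animals];  # all except the first element
my @backwards = reverse @animals;

print "$last\n@first_two\n@rest\n@backwards\n";
```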
There are a couple of special arrays too, such as @ARGV (the command line arguments to your script) and @_ (the arguments passed to a subroutine). These are documented in perlvar.
or a scalar context.
list context: the 'target' of an operation is a collection. scalar context: it is a single data item.
In an assignment to an array, the @ of the array name
on the left-hand side establishes a list context, but when an element of an array is being accessed the occurrence of the same name but with a leading $ establishes a scalar context.
Some items that can occur in either context modify
@foo = (101, 102 , 103); #sets all 3 values of the array foo, but $foo = (101, 102, 103); # sets $foo to 103.
sensible thing and puts brackets round it to make it a list of one element, so
    @a = "candy";
has the same effect as
    @a = ("candy");
assigned is the length of the array, $n = @foo; #Assigns the value 3 to $n.
List context can be established in other ways.
Eg: The print function establishes a list context, since it expects a list of things to print. But this can lead to unexpected results.
Eg:
    $n = @foo;
    print "array foo has $n elements\n";      # prints the length
    print "array foo has @foo elements\n";    # prints the entire contents of the array
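The effect of context on the same array can be seen side by side. A minimal sketch; the variable names other than @foo are assumptions.

```perl
use strict;
use warnings;

my @foo = (101, 102, 103);

my $count   = @foo;      # scalar context: the length, 3
my ($first) = @foo;      # list context (note the brackets): first element, 101

# String concatenation imposes scalar context on the array.
my $message = "has " . @foo . " elements";

# Interpolation inside double quotes joins the elements with spaces.
my $listed = "contains @foo";

print "$count $first\n$message\n$listed\n";
```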
Hashes
Associative arrays or content-addressable arrays. An associative array is one in which each element has two components, a key and a value, the element being 'indexed' by its key (just like a table). Such arrays are usually stored in a hash table to facilitate efficient retrieval. A particular attraction of hashes is that they are elastic: they expand to accommodate new elements as they are introduced. Names of hashes in Perl start with a % character: such a name establishes a list context. As with arrays, since each element in a hash is itself a scalar, a reference to an element uses $ as the first character, establishing a scalar context.
brackets): if the key is a single 'word', i.e. does not contain space characters, explicit quotes are not required.
    $somehash{aaa} = 123;        # The braces establish that the scalar item is being assigned to an element of a hash.
    $somehash{234} = "bbb";      # The key is a three-character string, not a number.
    $somehash{$a} = 0;           # The key is the current value of $a.
    %anotherhash = %somehash;    # The leading % establishes a list context: the target of the assignment is the hash itself, not one of its values.
List Variables
scalar variables.
They are always preceded by the @ symbol.
Eg @names = ("betty","veronica","tom");
Like in C the index starts from 0. If you want the second name you should use $names[1]. Watch the $ symbol here, because each element is a scalar variable.
Assigning the list variable to a scalar gives the length of the list variable. Eg $count = @names; here will give you the value 3, while $#names gives the index of the last element, 2.
Push,pop,shift,Unshift,reverse
These are operators operating on the list variables. Push and pop treat the list variable as a stack and
shift,Unshift,reverse
SHIFT AND UNSHIFT ACT ON THE LOWER
SUBSCRIPT.
Perl provides several built-in functions for list manipulation. Three useful ones are
shift LIST: Returns the first item of LIST, & moves the
Manipulating Lists
items in LIST at the beginning of ARRAY, moving the original contents up by the required amount.
push ARRAY, LIST: Similar to unshift, but adds the
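The list functions named above are easiest to follow on a small array. The sketch below uses made-up data; each comment shows the array contents after the statement runs.

```perl
use strict;
use warnings;

my @stack = (1, 2, 3);

push @stack, 4, 5;              # add at the end:      (1, 2, 3, 4, 5)
my $top = pop @stack;           # remove from the end: (1, 2, 3, 4), $top is 5

unshift @stack, 0;              # add at the front:      (0, 1, 2, 3, 4)
my $front = shift @stack;       # remove from the front: (1, 2, 3, 4), $front is 0

my @flipped = reverse @stack;   # new list (4, 3, 2, 1); @stack is unchanged

print "@stack | $top $front | @flipped\n";
```

push and pop work on the high-subscript end of the array, while shift and unshift work on the low-subscript end.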
Hashes,keys,values,each
Hashes are like arrays but instead of having
numbers as their index they can have any scalars as index. Hashes are preceded by a % symbol.
Eg we can have %rollnumbers = ("A",1,"B",2,"C",3);
Hashes,keys,values,each
If we want to get the rollnumber of A we have to say $rollnumbers{"A"}. This will return the value of rollnumber of A.
keys: returns a list of all the keys of the given hash.
values: returns a list of all the values in a given hash.
each: iterates over the entire hash, returning two scalar values: the first is the key and the second is the value.
Eg ($firstname, $lastname) = each(%lastname);
Here $firstname and $lastname will get a new key-value pair during each iteration.
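A sketch of keys, values and each applied to the %rollnumbers hash from the earlier slide. Hashes have no guaranteed order, so the keys and values are sorted here before printing; the other variable names are assumptions.

```perl
use strict;
use warnings;

my %rollnumbers = ("A", 1, "B", 2, "C", 3);

my $roll_a = $rollnumbers{"A"};                        # look up one element: 1

my @names = sort keys %rollnumbers;                    # ("A", "B", "C")
my @rolls = sort { $a <=> $b } values %rollnumbers;    # (1, 2, 3)

# each returns one key-value pair per call, and an empty list at the end.
my $total = 0;
while (my ($name, $roll) = each %rollnumbers) {
    $total += $roll;
}

print "$roll_a | @names | @rolls | $total\n";
```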
$somehash{aaa} = 123;
The braces establish that the scalar item is being assigned to an element of a hash.
$somehash{234} = "bbb" ;
The leading % (as in %anotherhash = %somehash;) establishes a list context: the target of the assignment is the hash itself, not one of its values.
Control Structures
If / unless statements While / until statements
For statements
Foreach statements
If / Unless
If similar to the if in C.
Eg of unless.
unless (condition) { }
Eg until.
until (some expression) { }
is met.
For is similar to C implementation.
Foreach Statement
This statement takes a list of values and assigns
them one at a time to a scalar variable, executing a block of code with each successive assignment.
Eg: foreach $var (list) { }
map
The Perl map function transforms a list into a new one, and this new list is returned by the function. There is no need to modify the input list. The map function evaluates a block or an expression for each element of an array and returns a new array with the results of the evaluation. During the evaluation, it locally assigns each element of the array to the $_ variable.
The difference between the grep and map functions: while you can use grep to select elements from an array, you use map to transform the elements of an array. So keep in mind: grep to select, map to transform.
map
('cat', 'dog', 'rabbit', 'hamster', 'rat')
and we want to create a list of the same words with a terminal 's' added to form the plural:
('cats', 'dogs', 'rabbits', 'hamsters', 'rats')
This could be done with a foreach loop:
    @s = qw/cat dog rabbit hamster rat/;
    @pl = ();
    foreach (@s) { push @pl, $_ . 's'; }
    print @pl;
However, this is such a common idiom that Perl provides an inbuilt function map to do the job:
    @pl = map $_ . 's', @s;
The general form of map:
    map expression, list;
and
    map BLOCK list;
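"grep to select, map to transform" can be shown on the same word list. A minimal sketch; the array names and the three-letter selection rule are made up for illustration.

```perl
use strict;
use warnings;

my @singular = qw/cat dog rabbit hamster rat/;

# map: transform every element (block form).
my @plural = map { $_ . 's' } @singular;

# grep: select only the elements for which the block is true.
my @short = grep { length($_) == 3 } @singular;

print "@plural\n";   # cats dogs rabbits hamsters rats
print "@short\n";    # cat dog rat
```

Neither function changes @singular; each returns a new list.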
Subroutines
Subroutines (or subs) are named blocks of code, thus:
    sub foobar { statements }
Note that the subroutine definition does not include argument specifications: it just associates the name with the block.
Subroutines are like everything else in Perl, in the sense that they return a value. The value returned is the value of the last expression evaluated in the block that forms the subroutine body, unless the return function is used to return a specific value.
Subroutines are used:
to avoid or reduce redundant code
to improve maintainability and reduce the possibility of errors
to reduce complexity by breaking complex problems into smaller, simpler pieces
to improve readability of the program
The advantage of subroutines in perl is, if
you wish to be able to call your subroutine and leave off optional arguments
reuse it. A subroutine will start with the keyword sub.
Eg: a subroutine which calculates the sum of two numbers:
    #!/usr/bin/perl -w
    $var1 = 100;
    $var2 = 200;
    $result = 0;
    $result = my_sum();
    print "$result\n";
    sub my_sum {
        $tmp = $var1 + $var2;
        return $tmp;
    }
Note: Subroutines might have parameters. When parameters are passed to a subroutine, they are stored in the @_ array. Do not confuse it with $_, which holds the current element of an array in a loop.
@ARGV is an array reserved for parameters passed to the script on the command line ($#ARGV, the index of the last argument, is -1 if no parameters are passed).
    #!/usr/bin/perl -w
    if ($#ARGV < 2) {
        print "You must have at least 3 parameters.\n";
    } else {
        print "Your parameters are: @ARGV[0..$#ARGV]\n";
    }
name of a subroutine, so this form of call can be used even if the subroutine definition occurs later in the script. If the subroutine has been defined earlier the ampersand can be omitted. A forward declaration takes the form
    sub foobar;
i.e. it is a declaration without a subroutine body.
so that the ampersand hardly ever needs to be used.
Subroutine arguments
If a subroutine expects arguments, the call takes the form
    &foobar(arg1, arg2);
or in the likely case that the subroutine is already declared, we can omit the ampersand:
    foobar(arg1, arg2);
In fact, for a pre-declared subroutine, the idiomatic form of call is
    foobar arg1, arg2
A subroutine expects to find its arguments as a single flat list of scalars in the anonymous array @_: they can be accessed in the body as $_[0], $_[1] etc.
Note that this has the effect that arguments that are variable names are called by reference. This is because the values stored in @_ are implicit references to the actual scalar parameters. Thus if you assign to $_[0] the value of the actual parameter is changed.
A common idiom in a subroutine which is expecting a variable number of arguments is to structure the body so as to process each argument in turn:
    foreach $arg (@_) { ... }
argument at a time. For convenience, shift can be used without an argument in a subroutine body, and Perl will assume that you mean the argument array @_.
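The difference between modifying $_[0] and working on a copy can be sketched as follows. All three subroutine names are hypothetical; the last one uses shift without an argument to peel values off @_.

```perl
use strict;
use warnings;

# Assigning to $_[0] changes the caller's variable.
sub double_in_place {
    $_[0] *= 2;
    return;
}

# Copying the argument into a 'my' variable leaves the caller's variable alone.
sub double_copy {
    my ($n) = @_;
    $n *= 2;
    return $n;
}

# shift with no argument takes the next value from @_.
sub total {
    my $sum = 0;
    while (defined(my $arg = shift)) {
        $sum += $arg;
    }
    return $sum;
}

my $value = 21;
double_in_place($value);               # $value is now 42

my $other  = 10;
my $result = double_copy($other);      # $other is still 10, $result is 20

print "$value $other $result ", total(1, 2, 3, 0, 4), "\n";
```

The defined test in total keeps the loop running when an argument is 0, which a plain truth test would treat as the end of the list.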
    foobar arg1, arg2, arg3;
the effect is as if the assignment
    @_ = (arg1, arg2, arg3);
is executed just before the text of the body of foobar is inserted in the script.
The value returned by a subroutine may be a scalar or a single flat list of scalars. It is important to note that in either case, the expression that determines the return value is evaluated in the context (scalar or list) in which the subroutine was called. Thus if we write
    $x = foo($y, $z)
the return value will be evaluated in scalar context,
but if we write @x = foo($y, $z) it will be evaluated in list context. This can lead to subtle errors if the expression is such that its value depends on the context in which it is evaluated. To deal with this the function wantarray is provided: this returns true if the subroutine was called in a list context, and false if it was called in a scalar context. A typical idiom is to use a conditional expression in the return statement: return wantarray ? list_value : scalar_value;
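A small sketch of the wantarray idiom. The subroutine min_max and its two return forms are invented for illustration: it returns a two-element list in list context and a summary string in scalar context.

```perl
use strict;
use warnings;

# Hypothetical example: return (smallest, largest) in list context,
# or a "smallest..largest" string in scalar context.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return wantarray ? ($sorted[0], $sorted[-1])
                     : "$sorted[0]..$sorted[-1]";
}

my @pair = min_max(3, 9, 1);    # list context
my $text = min_max(3, 9, 1);    # scalar context

print "@pair\n$text\n";
```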
arguments on the command line. These arguments are placed in an array @ARGV. A common idiom is to use a foreach loop to process each argument in turn:
    #!/usr/bin/perl
    foreach $arg (@ARGV) {
        ...    # process each argument in turn as $arg
    }
An alternative is to peel off the arguments one at a time using the shift operator:
    #!/usr/bin/perl
    while ($arg = shift) {
        ...
    }
    sub maximum {
        if ($_[0] > $_[1]) {
            $_[0];
        } else {
            $_[1];
        }
    }
    $biggest = &maximum(37, 24);    # Now $biggest is 37
    #!/usr/bin/perl -w
    my $sum;
    $sum = &add_numbers(5, 15);
    print "The sum is $sum.\n";
    sub add_numbers {
        my $num1;
        my $num2;
        my $sum;
        $num1 = $_[0];
        $num2 = $_[1];
        $sum = $num1 + $num2;
        return ($sum);
    }
    #!/usr/bin/perl -w
    &draw_triangle(0);
    sub draw_triangle {
        my $iterations = $_[0];
        my $chars;
        $iterations++;
        if ($iterations == 11) {
            return 0;
        } else {
            for ($chars = 0; $chars < $iterations; $chars++) {
                print " * ";
            }
            print "\n";
            &draw_triangle($iterations);
        }
    }
Functions
Function declaration
Calling a function
Passing parameters
Local variables
Returning values
Function Declaration
The keyword sub introduces a function definition, so a function should start with the keyword sub. Eg: sub addnum { ... }
the end or in the beginning of the main program to improve readability and also ease in debugging.
Function Calls
$Name = &getname(); The symbol & should precede
Parameters of Functions
We can pass parameters to a function as a list. Inside the function the parameters arrive as a list denoted by @_. So if you pass only one parameter, @_ will hold just one element. If you pass two parameters, @_ will hold two elements, and the two parameters can be accessed as $_[0], $_[1], and so on.
main program are by default global so they will continue to have their values in the function also. Local variables are declared by putting 'my' while declaring the variable.
is usually the value that is returned unless there is an explicit return statement returning a particular value.
There are no pointers in Perl but
patterns. Strictly, a RE is a notation for describing the strings produced by a regular grammar: it is thus a definition of a (possibly infinite) class of strings. meta-characters with special meanings in a RE.
/(([0-9]+[ ])|([a-z]+[ ]))+/
# matches a sequence of one or more items, each of which is either a sequence of digits followed by a space or a sequence of lower-case letters followed by a space.
Non-greedy matching: A pattern including .* matches the longest string it can find. The pattern .*? can be used when the shortest match is required. (In fact, any of the quantifiers can be followed by a ? to specify the shortest match.) Shorthand: Some character classes occur so often that a shorthand notation is provided: \d matches a digit, \w matches a 'word' character (upper-case letter, lower-case letter, digit or underscore), and \s matches a 'whitespace' character (space, tab, carriage return or newline). Capitalization reverses the sense, e.g. \D matches any non-digit character.
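A small sketch of the difference between the greedy and the non-greedy quantifier. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "<b>bold</b> and <i>italic</i>";

# Greedy: .* runs on to the last > in the string.
my ($greedy) = $text =~ /<(.*)>/;

# Non-greedy: .*? stops at the first > it can.
my ($lazy) = $text =~ /<(.*?)>/;

print "greedy: $greedy\n";    # greedy: b>bold</b> and <i>italic</i
print "lazy:   $lazy\n";      # lazy:   b
```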
Anchors: We have seen the use of ^ and $ to 'anchor' the match at the start or end of the target respectively. Other anchors can be specified as \b (word boundary) and \B (not a word boundary).
Eg: if the target string contains john and johnathan as space-separated words, /\bjohn/ will match both john and johnathan, /\bjohn\b/ will only match john, while /\bjohn\B/ will only match johnathan.
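The three cases can be checked with a short script. This is only a sketch: the target string is the one described in the text, and the array names are hypothetical.

```perl
use strict;
use warnings;

my $target = "john johnathan";

# With /g in list context, a match with no capture groups returns every matched string.
my @any    = $target =~ /\bjohn/g;      # matches in both words
my @whole  = $target =~ /\bjohn\b/g;    # only the stand-alone word john
my @prefix = $target =~ /\bjohn\B/g;    # only john at the start of johnathan

print scalar(@any),    "\n";    # 2
print scalar(@whole),  "\n";    # 1
print scalar(@prefix), "\n";    # 1
```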
They define a series of partial matches that are 'remembered' for use in subsequent processing or in the regular expression itself. In a regular expression, \1, \2 etc. denote the substring that actually matched the first, second etc. sub-pattern, the numbering being determined by the sequence of opening brackets. Thus we can require that a particular sub-pattern occurs in identical form in two or more places in the target string. If we want the round brackets only to define a grouping without remembering the sub-string matches we can use the syntax (?: ... ).
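A brief sketch of both ideas: \1 requiring a repeat of what the first bracket matched, and (?: ... ) grouping without remembering. The sample strings and variable names are hypothetical.

```perl
use strict;
use warnings;

# \1 must match exactly the same text as the first bracketed sub-pattern,
# so this finds a word that is immediately repeated.
my $line = "this is is a test";
my ($repeated) = $line =~ /\b(\w+) \1\b/;
print "repeated word: $repeated\n";    # repeated word: is

# (?: ... ) groups without remembering, so the year is not numbered
# and the month becomes the first remembered sub-pattern.
my ($month, $day) = "2024-01-15" =~ /^(?:\d{4})-(\d\d)-(\d\d)$/;
print "month $month, day $day\n";      # month 01, day 15
```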
characters as a single string i.e. dot matches any character including newline.
m//x Ignore whitespace characters in the regular expression unless they occur in a character class, or are escaped with a backslash.
m//o Compile regular expression once only.
The RE engine has to construct a non-deterministic finite automaton which is then used to perform a backtracking match. This is called 'compiling' the RE and may be time consuming for a complex regular expression. The /o modifier ensures that the regular expression is compiled only once, when it is first encountered.
Substitution
Given a file in which each line starts with a four-digit line number followed by a space,
    while (<STDIN>) { s/^\d{4}[ ]//; print; }
will print the file without line numbers. The general syntax is s/pattern/subst/
The substitution operator checks for a match between the pattern and the value held in $_, and if a match is found the matching sub-string in $_ is replaced by the string subst.
that the substitution string is an expression that is to be evaluated at run-time if the pattern match is successful, to generate a new substitution string dynamically. Obviously this is only useful if the code includes references to the variables $&, $1, $2 etc. Eg: if the target string contains one or more sequences of decimal digits, the following substitution operation will treat each digit string as an integer and add 1 to it: s/\d+/$&+1/eg;
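The effect of the e and g modifiers together can be seen in this small sketch. The sample string is made up; the substitution itself is the one from the example.

```perl
use strict;
use warnings;

my $text = "item 9 costs 199 units";

# Copy $text into $bumped, then run the substitution on the copy.
# /e evaluates the replacement as an expression, /g repeats it for every match.
(my $bumped = $text) =~ s/\d+/$& + 1/eg;

print "$bumped\n";    # item 10 costs 200 units
```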
Regular Expression
Split and join
Matching & replacing
Selecting a different target
$&, $', and $`
Parenthesis as memory
Using different delimiter
Others
depending on the delimiter. The default delimiter is the space. It is usually used to get the independent fields from a record.
Eg: $linevalue = "R101 tom 89%"; $_ = $linevalue; @data = split;
$data[2] 89%. Split by default acts on the $_ variable. If split has to operate on some other scalar variable, then the syntax is:
split(/ /, $linevalue);
syntax is.
split(/<delimiter>/, $linevalue);
Special Variables
$& Stores the value which matched the pattern.
$' Stores the value which came after the pattern in the linevalue.
$` Stores the value which came before the pattern in the linevalue.
the split. It takes a list and joins up all its values into a single scalar variable using the delimiter provided.
Eg: $newlinevalue = join(" ", @data);
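A short round trip through split and join, as a sketch. The record is the one used in the split example; the ":" delimiter and the variable names other than $linevalue and @data are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my $linevalue = "R101 tom 89%";

# Split an explicitly named scalar on single spaces.
my @data = split / /, $linevalue;

# With no arguments, split works on $_ and splits on whitespace.
$_ = $linevalue;
my @from_default = split;

# join takes the delimiter first, then the list.
my $newlinevalue = join ":", @data;

print "$data[2]\n";         # 89%
print "$newlinevalue\n";    # R101:tom:89%
```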
replace it with another one you can do the same thing as what you do in unix. The command in perl is:
s/<pattern>/<replace pattern>/
act on a different source variable (Eg $newval) then you have to use:
Eg: $newval =~ s/<pattern>/<replace pattern>/;
Parenthesis As Memory
Parenthesis as memory. Eg: /fred(.)barney\1/. Here the parentheses around the dot after fred make it a memory element. The \1 indicates that the character at that point must be the same as the first memory element, which in this case is whatever character was matched at that position after fred.
is executed at the end of each normal iteration before control returns to re-test the condition
while ( ... ) { ... } continue { ... }
in a for loop specification. The for loop can be defined in terms of a while loop with a continue block as follows.
    for ($i = 1; $i < 10; $i++) { ... }
is equivalent to
    $i = 1;
    while ($i < 10) { ... } continue { $i++; }
    @array = (0..9);
    print("@array\n");
    for ($index = 0; $index < @array; $index++) {
        if ($index == 3 || $index == 5) {
            next;
        }
        $array[$index] = "*";
    }
    print("@array\n");
This program displays:
0 1 2 3 4 5 6 7 8 9
* * * 3 * 5 * * * *
block. Example:
    @array = ("A".."Z");
    for ($index = 0; $index < @array; $index++) {
        if ($array[$index] eq "T") {
            last;
        }
    }
    print("$index\n");
statement block. Example:
    {
        print("What is your name? ");
        $name = <STDIN>;
        chomp($name);
        if (! length($name)) {
            print("Msg: Zero length input. Please try again\n");
            redo;
        }
        print("Thank you, " . uc($name) . "\n");
    }
construct something similar using last in an expression:
    SELECT: {
        $red += 1, last SELECT if /red/;
        $green += 1, last SELECT if /green/;
        $blue += 1, last SELECT if /blue/;
    }
The first statement parses as ($red += 1, last SELECT) if /red/; This construct increments $red, $green or $blue according to whether the anonymous variable $_ contains the string 'red', 'green' or 'blue'. It uses last as an operator in an expression to cause exit from the labeled bare block.
context. In a list context the comma operator is a list constructor, but in a scalar context the comma operator evaluates its left hand argument, throws away the value returned, then evaluates its right-hand argument
Subroutine Syntax
Old and no prototyping Calling the subroutine &mysub();
specify the number of arguments a subroutine takes, and the type of the arguments. Eg: sub foo($);
Specifies that foo is a subroutine that expects one scalar argument
sub bar();
specifies that bar is a subroutine that expects no arguments
The prototype is there to enable compile-time checking of the number and type of arguments provided.
arguments takes place, and a mismatch results in compilation being aborted. However, type checking does not take place; rather, the $ in sub foo($); tells the compiler that if the single argument is not scalar, it should be converted into a scalar by fair means or foul.
the built-in functions when called without brackets round the arguments.
bar is a subroutine that takes no arguments and we write
$n = bar -1; if bar had been defined without a prototype, $n would be assigned the value returned by bar, since the compiler would gobble up the -1 as an argument.
the compiler knows that the statement should be parsed as $n = bar() -1; Similar considerations apply to subroutines taking one scalar argument. Suppose we define foo without a prototype, as follows sub foo { return shift } Then if we write @s = (foo $w1, $w2, $w3);
that of $w1, $w2 and $w3 have been gobbled up as arguments, but ignored by foo. But if foo is defined with a prototype ($), the compiler knows to collect only one argument, and the value of @s is a list of three items,
sub foobar($$); and write @s = (foobar $w1, $w2, $w3); we get a compile-time error, 'too many arguments supplied'.
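The parsing effects of the () and ($) prototypes described above can be tried out with a sketch like the following. The names foo and bar follow the text, but the body of bar (returning 10) and the sample values are assumptions made only so the script runs. The prototyped definitions must appear before the calls for the compiler to use them.

```perl
use strict;
use warnings;

sub bar()  { return 10 }       # prototype (): takes no arguments (body is hypothetical)
sub foo($) { return shift }    # prototype ($): takes exactly one scalar argument

# Parsed as bar() - 1, because the compiler knows bar takes no arguments.
my $n = bar -1;

# Parsed as (foo($w1), $w2, $w3), because foo collects only one argument.
my ($w1, $w2, $w3) = ('a', 'b', 'c');
my @s = (foo $w1, $w2, $w3);

print "n = $n\n";                        # n = 9
print "items in s: ", scalar(@s), "\n";  # items in s: 3
```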
sub foobar($;$) is a subroutine that expects at least one scalar argument and at most two. The compiler will complain if the call of foobar has no arguments or more than two arguments.
A prototype can specify non-scalar arguments using @, % or &.
Eg: sub foobar(@); specifies a subroutine that expects a variable number of arguments. It can be called without any arguments, and the compiler will not complain: an empty list is still a list.
sub foobar ($@); appears to specify a subroutine with two arguments, the first a scalar and the second a list. It just defines a subroutine with at least one argument. sub foobar (%); appears to specify a subroutine that expects a single argument that is a hash
REFERENCES
A reference is a scalar value that refers to an entire
references, which allow you to build lists and hashes within lists and hashes.
A reference is a scalar value and can refer to any other Perl data type. So by storing a reference as the value of an array or hash element, you can easily create lists and hashes within lists and hashes.
Syntax
There are just two ways to make a reference, and just two ways to use it once you have it. Making References: If you put a \ in front of a variable, you get a reference to that variable.
    $aref = \@array;   # $aref now holds a reference to @array
    $href = \%hash;    # $href now holds a reference to %hash
$href, you can copy it or store it just the same as any other scalar value:
    $xy = $aref;     # $xy now holds a reference to @array
    $p[3] = $href;   # $p[3] now holds a reference to %hash
    $z = $p[3];      # $z now holds a reference to %hash
RULE2: [ ITEMS ] makes a new, anonymous array, and returns a reference to that array. { ITEMS } makes a new, anonymous hash, and returns a reference to that hash.
    $aref = [ 1, "foo", undef, 13 ];   # $aref now holds a reference to an array
    $href = { APR => 4, AUG => 8 };    # $href now holds a reference to a hash
The references you get from rule 2 are the same kind of references that you get from rule 1:
    # This:
    $aref = [ 1, 2, 3 ];
    # Does the same as this:
    @array = (1, 2, 3);
    $aref = \@array;
If you write just [], you get a new, empty anonymous array. If you write just {}, you get a new, empty anonymous hash.
Using References
It's a scalar value, and we've seen that you can store it as a scalar and get it back again just like any scalar. There are just two more ways to use it: You can always use an array reference, in curly braces, in place of the name of an array. For example, @{$aref} instead of @array.
EG: Arrays:
    @a             @{$aref}            An array
    reverse @a     reverse @{$aref}    Reverse the array
    $a[3]          ${$aref}[3]         An element of the array
    $a[3] = 17;    ${$aref}[3] = 17    Assigning an element
On each line there are two expressions that do the same thing. The left-hand versions operate on the array @a. The right-hand versions operate on the array that is referred to by $aref.
Using a hash reference is exactly the same:
    %h              %{$href}              A hash
    keys %h         keys %{$href}         Get the keys from the hash
    $h{red}         ${$href}{red}         An element of the hash
    $h{red} = 17    ${$href}{red} = 17    Assigning an element
Whatever you want to do with a reference, Use Rule 1 tells you how to do it. Eg:
    foreach my $element (@array) { ... }
so replace the array name, @array, with the reference:
    foreach my $element (@{$aref}) { ... }
"How do I print out the contents of a hash when all I have is a reference?" First write the code for printing out a hash:
    foreach my $key (keys %hash) {
        print "$key => $hash{$key}\n";
    }
And then replace the hash name with the reference:
    foreach my $key (keys %{$href}) {
        print "$key => ${$href}{$key}\n";
    }
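Put together as a complete script, the replacement looks like this. It is only a sketch: the hash contents reuse the APR/AUG example, and the keys are sorted purely so that the output order is predictable.

```perl
use strict;
use warnings;

my %hash = (APR => 4, AUG => 8);
my $href = \%hash;    # all the loop below uses is the reference

my @lines;
foreach my $key (sort keys %{$href}) {
    push @lines, "$key => ${$href}{$key}";
}
print "$_\n" for @lines;
# APR => 4
# AUG => 8
```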
is the fourth element of the array. Don't confuse this with $aref[3], which is the fourth element of a totally different array. $aref and @aref are unrelated the same way that $item and @item are.
by the scalar variable $href, perhaps even one with no name. $href{red} is part of the deceptively named %href hash. Eg: First, remember that [1, 2, 3] makes an anonymous array containing (1, 2, 3), and gives you a reference to that array. Now think about @a = ( [1, 2, 3], [4, 5, 6], [7, 8, 9] );
the array containing (4, 5, 6), and because it is a reference to an array, Use Rule 2 says that we can write $a[1]->[2] to get the third element from that array. $a[1]->[2] is the 6. Similarly, $a[0]->[1] is the 2.
You can write $a[ROW]->[COLUMN] to get or set the element in any row and any column of the array.
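A minimal sketch of reading and writing elements of the two-dimensional array by row and column. The array is the one from the example; the value 70 is arbitrary.

```perl
use strict;
use warnings;

# Each element of @a is a reference to an anonymous array (one row).
my @a = ( [1, 2, 3], [4, 5, 6], [7, 8, 9] );

print $a[1]->[2], "\n";    # 6  (row 1, column 2)
print $a[0]->[1], "\n";    # 2  (row 0, column 1)

# Setting an element works the same way.
$a[2]->[0] = 70;
print "@{ $a[2] }\n";      # 70 8 9
```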
References
The concept of the pointer is common in many languages, allowing indirect access to a piece of data - the value of a pointer variable tells us where we will find the data object.
    int x;     /* integer */
    int *px;   /* pointer to integer */
    px = &x;   /* "address of x" */
This kind of pointer is better called a reference, since it
memory.
Symbolic reference: which is a variable whose value is
context that a scalar is valid. The only operation that can be carried out on a reference is dereferencing, i.e. locating the referent. A Perl variable is an association of an identifier in the symbol table with a reference to a value.
takes place to obtain the value: If it occurs on the left-hand side of an assignment, the reference is used to locate the value to be updated. An explicit reference to a value can be created: this may be a value that is associated with an identifier in the symbol table, or it may be an 'anonymous' value with no associated variable: in this case the only way to get to the value is via the reference.
Creating references
References to variables and subroutines:
The backslash operator creates a reference to a named variable or subroutine. Eg:
    $ref2foo = \$foo;
    $ref2args = \@ARGV;
    $ref2sub = \&initialize;
Note: By creating the reference we have two independent references to the same value: one associated with the name in the symbol table, and another which is the value of the reference variable.
References to references can be created to any depth,
Eg: $ref2ref2foo = \\$foo;
enclosing a list in square brackets. This array composer creates an anonymous array and returns a reference to it, thus Eg: $ref2array = [255, 127, 0];
$ref2array2 = [255, 127, 0, [255, 255, 255]]; Here we have a reference to an array with four elements, three integers and a reference to an array. A reference to an anonymous hash can be created using curly brackets (braces): $ref2hash = {winter => 'cold', summer => 'hot'}; A reference to an anonymous subroutine is created with the anonymous subroutine composer: $ref2sub = sub { ... };
This looks like the definition of a subroutine without a name, but it is in fact an expression, hence the semicolon after the closing brace.
a scalar value that represents all the data structures assigned to the same variable name. Typeglob variables are preceded with the character *. A typeglob variable can represent variables of the types scalar, array, hash and subroutine. Eg: The typeglob variable *tgvar can represent $tgvar or @tgvar or %tgvar or &tgvar. These can be used as an ordinary data type. We can perform operations like: assign the value to a variable, store it in an array, pass it as a parameter to a subroutine.
ARRAY, HASH, etc., and associated values which are references to the scalar, array, hash etc. variables that share the name associated with the typeglob. We can write *foo{SCALAR} instead of \$foo if we want to write obscure code. This equivalence is put to practical effect in the capability to create selective aliases. *bar = *foo; makes $bar the same as $foo, @bar the same as @foo, etc. If we write *bar = \$foo;
remain different arrays. In a similar vein, *Pi = \3.1415926536; creates a read-only variable: an attempt to assign to $Pi causes an error. What is it a reference to? More precisely, the ref function returns an empty string if its argument is not a reference. If the argument is a reference ref returns a string specifying what the argument is a reference to - REF, SCALAR, ARRAY, HASH, CODE, or GLOB.
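The strings returned by ref can be seen with a quick sketch. The hash %kind and its keys are hypothetical names used only to collect the results.

```perl
use strict;
use warnings;

# Collect what ref reports for a non-reference and for each kind of reference.
my %kind = (
    'plain string' => ref("text"),        # not a reference: empty string
    'to scalar'    => ref(\"text"),       # SCALAR
    'to array'     => ref([1, 2, 3]),     # ARRAY
    'to hash'      => ref({ a => 1 }),    # HASH
    'to code'      => ref(sub { 1 }),     # CODE
    'to reference' => ref(\\"text"),      # REF
    'to typeglob'  => ref(\*STDOUT),      # GLOB
);

foreach my $what (sort keys %kind) {
    print "$what: '$kind{$what}'\n";
}
```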
Dereferencing
The operation of dereferencing applied to a reference delivers the referent. If $ref2foo = \$foo; then an occurrence of $$ref2foo has the same meaning as $foo. Thus
    print "$foo\n";
    print "$$ref2foo\n";
both print the value of $foo. We can do dereferencing on the left-hand side of an assignment also, thus:
potentially dangerous, since we have two independent ways of referring to the same chunk of memory.
    sub foo {
        my($x, $y) = @_;   ## parameters
        my($z);            ## local var
        return($x + $y);
    }
    ...foo(6, 7) --> 13
@_ : @_ is an array of the passed scalar values -- the parallel assignment my($x, $y) = @_ assigns the scalars to locals
Flattening : Cannot pass an array as one of the args -- it all gets "flattened" into @_
My : my() declares something local
$main::x : refer to outer "global" vars as $main::x -- needed if the "use strict 'vars';" option is on.
File Handling
Since perl is particularly well suited to working with textual data it often gets used to process files. A filehandle is a structure which perl uses to allow functions in perl to interact with a file. A filehandle is an abstract name given to a file, device, socket or pipe. A filehandle helps in getting input from and sending output to many different places.
Creating a filehandle
In order to do any work with a file you first need to create a filehandle. You create a filehandle using the open command. This can take a number of arguments:
The name of the filehandle (all in uppercase by convention)
The mode of the filehandle (read, write or append)
The path to the file
When selecting a mode for your filehandle perl uses the symbols:
< for a read-only filehandle
> for a writable filehandle
>> for an appendable filehandle
read-only filehandle, and for reading files it is usual to not specify a mode.
    # open file for writing
    open(OUT, ">", $filename) or die $!;
    # add line numbers to each line
    my $line_no = 1;
    while(<IN>) {
        print OUT "$line_no: $_";
        $line_no++;
    }
    # open file for reading
    open(my $input_fh, "<", $filename) or die $!;
    # open file for writing (a different file: opening with ">" empties the file)
    open(my $output_fh, ">", $output_filename) or die $!;
    # copy file, adding line numbers
    my $line_no = 1;
    while(<$input_fh>) {
        print $output_fh "$line_no: $_";
        $line_no++;
    }
Closing a filehandle
When you've finished with a filehandle it's a good practice to close it using the close function. This is more important for writable filehandles than read-only ones, but it never hurts. If you don't explicitly close your filehandle it will automatically be closed when your program exits. If you perform another open operation on a filehandle which is already open then the first one will automatically be closed when the second one is opened.
Special Filehandles
In addition to creating your own filehandles perl actually comes with three of them already defined for you. These are special filehandles which you can use to interact with whatever process launched your perl script. The special filehandles are:
STDOUT A writeable filehandle. Used as the default output location for print commands. Is usually redirected to the console from which you ran your perl script.
STDIN A readable filehandle. Used to pass information from the console into perl. Allows you to ask questions on the console and get an answer.
STDERR Another writable filehandle usually attached to the console but normally only used for unexpected data such as error messages.
to read from in between the angle brackets. This reads one line of data from the file and returns it. To be more precise this operator will read data from the file until it hits a certain delimiter. The default delimiter is your system's newline character ("\n"), hence you get one line of data at a time.
    my $file = "M:/tale_of_two_cities.txt";
    open (IN, $file) or die "Can't read $file: $!";
    my $first_line = <IN>;
    print $first_line;
    print "The end";
This produces the following output:
It was the best of times, it was the worst of times,
The end
of the first print statement, the second one still appears on the next line. This is because the <> operator doesn't remove the delimiter it's looking for when it reads the input filehandle. Normally you want to get rid of this delimiter, and perl has a special function called chomp for doing just this. Chomp removes the same delimiter that the <> uses, but only if it is at the end of a string.
    my $file = "M:/tale_of_two_cities.txt";
    open (IN, $file) or die "Can't read $file: $!";
    my $first_line = <IN>;
    chomp $first_line;
    print $first_line;
    print "The end";
This code produces:
It was the best of times, it was the worst of times,The end
Note: The more normal way to read a file is to put the <> operator into a while loop so that the reading continues until the end of the file is reached.
    my $file = "M:/tale_of_two_cities.txt";
    open (IN, $file) or die "Can't read $file: $!";
    my $line_count = 1;
    while (<IN>) {
        chomp;
        print "$line_count: $_\n";
        last if $line_count == 5;
        ++$line_count;
    }
Gives:
1: It was the best of times, it was the worst of times,
2: it was the age of wisdom, it was the age of foolishness,
3: it was the epoch of belief, it was the epoch of incredulity,
4: it was the season of Light, it was the season of Darkness,
5: it was the spring of hope, it was the winter of despair,
would in theory have worked to number every line of the whole file. It would also have done the right thing had there been fewer than 5 lines in the file, as the while condition would have failed so the loop would have exited.
Note that by not specifying a variable name to assign to in the loop condition statement the assignment goes to the default $_ variable. This in turn means that I don't need to pass an argument to chomp as it (like most scalar functions) operates on $_ by default.
information from the console. This allows you to produce interactive programs which can ask the user questions and receive answers. To achieve this you simply use the STDIN filehandle in the same way as you'd use any other read-only filehandle. You usually only read one line at a time from STDIN (so the input stops when the user presses return), but you can put the read into a while loop which will continue until the user enters the end-of-file symbol (Control+D on most systems, but this can vary).
    #!perl
    use warnings;
    use strict;
    print "What is your name? ";
    my $name = <STDIN>;
    chomp $name;
    print "What is your quest? ";
    my $quest = <STDIN>;
    chomp $quest;
    print "What is your favourite colour? ";
    my $colour = <STDIN>;
    chomp $colour;
    if ($colour eq "Blue... no, yellow!") {
        die "Aaaarrrrgggh!";
    }
Writing to a file
Writing to a file is really simple. The only function you need to use is print. All of the previous print statements shown so far have actually been sending data to the STDOUT filehandle. If a specific filehandle is not specified then STDOUT is the default location for any print statements.
#!perl
use warnings;
use strict;
open (OUT, '>', "M:/write_test.txt") or die "Can't open file for writing: $!";
print OUT "Sending some data\n";
print OUT <<"END_DATA";
Now I'm sending a lot of data
All in one go.
Just because I can...
END_DATA
close OUT or die "Failed to close file: $!";
file is that in addition to checking that the open function succeeded, you must also check that you don't get an error when you close the filehandle. This is because errors can occur whilst you are writing data, for example if the device you're writing to becomes full whilst you're writing.
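The write-then-read cycle, with the result of every open and close checked, might be sketched as below. The use of a temporary file from the core File::Temp module (instead of a fixed path) and all variable names are assumptions made so that the script can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a scratch file from File::Temp stands in for a real path.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh or die "Failed to close temp file: $!";

# Write two lines, checking both the open and the close.
open(my $out_fh, ">", $filename) or die "Can't open $filename for writing: $!";
print $out_fh "first line\n";
print $out_fh "second line\n";
close $out_fh or die "Failed to close $filename: $!";

# Read the lines back and number them.
open(my $in_fh, "<", $filename) or die "Can't read $filename: $!";
my @numbered;
my $line_no = 1;
while (my $line = <$in_fh>) {
    chomp $line;
    push @numbered, "$line_no: $line";
    $line_no++;
}
close $in_fh or die "Failed to close $filename: $!";

print "$_\n" for @numbered;
# 1: first line
# 2: second line
```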
within the language. Changing directory: Use the chdir function to change directory in perl. As with all file operations you must check the return value of this function to check that it succeeded.
    chdir ("M:/Temp") or die "Couldn't move to temp directory: $!";
    chdir ("M:/winword") or die "Can't move to word directory: $!";
    # Quick easy way
    my @files = <*.doc>;
    print "I have ", scalar @files, " doc files in my word directory\n";
    # Longer way to do the same thing
    my @files2 = glob("*.rtf");
    print "I have ", scalar @files2, " rtf files in my word directory\n";
above a nicer way to use it is actually to treat it a bit like a filehandle and read filenames from it in a while loop.
    chdir ("M:/winword") or die "Can't move to word directory: $!";
    while (my $file = <*.doc>) {
        print "Found file $file\n";
    }
Testing files
If you want to open a file for reading it's much nicer to specifically test if you can read the file rather than just trying to open it and trapping the error. Perl provides a series of simple file test operators to allow you to find out basic information about a file.
The file test operators are:
-e Tests if a file exists
-r Tests if a file is readable
-w Tests if a file is writable
-d Tests if a file is a directory (directories are just a special kind of file)
-f Tests if a file is a file (as opposed to a directory)
-T Tests if a file is a plain text file
return either true or false.
    chdir ("M:/winword") or die "Can't move to word directory: $!";
    while (my $file = <*>) {
        if (-d $file) {
            print "$file is a directory\n";
        }
        elsif (! -w $file) {
            print "$file is write protected\n";
        }
    }
literally, such as a picture file, a sound file, or a binary file. Text files are any files that contain records that end in end-of-line characters. Some operating systems distinguish between binary and text files. Unix and Linux do not, but Windows does. Perl can't tell the difference between binary and text files (it has a Unix heritage).
anything, so you have to use the binmode command with the filehandle to tell Perl this is to be written literally:
filehandle once, until you close the file. On some operating systems (UNIX/Linux and Macintosh) binmode is ignored as there is no distinction between binary and text files.
a list of items into a string. However, it builds a binary structure described by a template which is given as an argument, and returns this as a string. A common use is for converting between ASCII codes and the corresponding characters. Thus if @codes is a list of 10 integers between 0 and 127, pack "cccccccccc", @codes or pack "c10", @codes produces a string containing the 10 corresponding ASCII characters.
variable interpolation, we can generalize this to a list of any length. (For a list of unknown length, it is written as follows.) $count = @codes; pack "c$count", @codes;
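A small sketch of the round trip between ASCII codes and characters. The four codes are an arbitrary sample, and the "c*" template in the unpack call (meaning "as many signed chars as there are") is an addition not shown in the text.

```perl
use strict;
use warnings;

my @codes = (80, 101, 114, 108);    # ASCII codes for P, e, r, l

# Build the template from the list length, as described above.
my $count  = @codes;
my $string = pack "c$count", @codes;

# unpack reverses the process; c* takes every remaining character.
my @back = unpack "c*", $string;

print "$string\n";    # Perl
print "@back\n";      # 80 101 114 108
```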
unpack:
The unpack function reverses the process, converting a
template character   Description
c   A signed char (8-bit) value.
C   An unsigned char (octet) value.
s   A signed short (16-bit) value.
S   An unsigned short value.
l   A signed long (32-bit) value.
L   An unsigned long value.
i   A signed integer value.
I   An unsigned integer value.
http://perldoc.perl.org/functions/pack.html
Eval
It can evaluate any string at run time.
eval
it takes an arbitrary string as an operand and evaluates
the string at run-time, as a Perl script. Any variables declared local with my or local have a lifetime which ends when the eval is finished. The value returned is the value of the last expression evaluated, as with a subroutine: alternatively, return can be used to return a value from inside the eval.
In the event of a syntax error or a run-time error, eval
returns undef and places the error message in the variable $@. If there is no error, the value of $@ is an empty string.
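The behaviour described above can be sketched as follows. The three strings passed to eval are assumptions chosen for illustration: one that succeeds, one with a syntax error, and one that fails at run time.

```perl
use strict;
use warnings;

# Successful evaluation: the value of the last expression is returned.
my $ok     = eval "6 * 7";
my $ok_err = $@;                 # empty string, because nothing went wrong

# Syntax error: eval returns undef and $@ holds the message.
my $bad     = eval "1 +;";
my $bad_err = $@;

# Run-time error: same result. Single quotes stop Perl interpolating
# $zero before the string reaches eval.
my $div     = eval 'my $zero = 0; 1 / $zero';
my $div_err = $@;

print "ok = $ok\n";
print "syntax error caught: $bad_err";
print "run-time error caught: $div_err";
```

Copying $@ into an ordinary variable straight after each eval is a common precaution, since a later eval resets it.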
Eg: One use of this form of eval is to get the value of a variable whose name is to be computed, as a string, at run-time, e.g.
    $myvar = '...';
    ...
    $value = eval "\$$myvar";
Another example is a little shell, which allows you to type in a script which will be run when the end-of-file character (Ctrl-D in UNIX, Ctrl-Z in Windows) is entered:
    my @script = <STDIN>;
    chomp @script;
    eval join ' ', @script;
Closures
Instead of returning data, a Perl subroutine can
passing subroutine references around, except for a somewhat hidden feature involving anonymous subroutines and lexical ( my ) variables.
An
reference to $x, and as this occurred within the scope of the my declaration, the $x in the body of the anonymous subroutine is the local $x.
If we write
    $x = "yellow";
    &$sub("red");
then the output is 'green and red', not 'yellow and red'.
has been executed in the environment in which it was declared, not the environment in which it is executed. It carries its declaration-time environment with it: this combination of code and environment is what computer scientists call a closure.
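A minimal sketch of a closure that reproduces the 'green and red' behaviour. The subroutine name make_printer is hypothetical, and the anonymous subroutine returns its string rather than printing it so the result can be inspected.

```perl
use strict;
use warnings;

# make_printer is a hypothetical name. It returns a reference to an
# anonymous subroutine that refers to the lexical $x declared here.
sub make_printer {
    my $x = shift;
    return sub { return "$x and $_[0]" };
}

my $sub = make_printer("green");

# This is a different variable: the package variable $main::x.
# Assigning to it does not affect the $x captured by the closure.
our $x = "yellow";

my $output = $sub->("red");
print "$output\n";
```

The anonymous subroutine keeps its own copy of the declaration-time environment, so each call to make_printer yields an independent closure.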
environment can access the underlying UNIX I/O system calls by using the sysread and syswrite functions. Perl provides equivalent functions. Perl's sysread is used in exactly the same way as read to read a block of data of known length. syswrite makes it possible to write a block of data of known length.
Eg:
    $bytes_written = syswrite BINFILE, $binstring, $n;
The value returned is the number of bytes actually written.
sysread
The syntax of the sysread function is:
    sysread(Fhandle, scalar, len, offset)
It reads up to len bytes from the given filehandle (Fhandle) into the variable scalar. It returns the number of bytes read on success, returns 0 if the end of file (EOF) is reached, and returns undef on failure.
The offset parameter is optional. It specifies the position in scalar at which the data read is placed. By default, the data is placed at the beginning of scalar. An exception is raised when len is negative or when offset is outside the string boundary.
syswrite
The syntax of the syswrite function is:
    syswrite(Fhandle, scalar, len, offset)
It writes up to len bytes from the scalar variable to the given filehandle (Fhandle). It returns the number of bytes written on success and returns undef on failure. The offset parameter is optional. It specifies the position in scalar from which writing starts. If scalar is empty, the only valid offset is 0. If the offset is negative, writing starts that many bytes back from the end of the string. An exception is raised when len is negative or when offset is outside the string boundary.
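A small sketch of syswrite and sysread with the optional offset argument, assuming a scratch file name (sysio_demo.bin) and sample data chosen only for illustration.

```perl
use strict;
use warnings;

my $file      = "sysio_demo.bin";   # hypothetical scratch file
my $binstring = "ABCDEFGHIJ";

open(my $out, '>', $file) or die "Cannot open $file: $!";
binmode($out);
# Write 4 bytes, starting at offset 2 within $binstring ("CDEF").
my $bytes_written = syswrite($out, $binstring, 4, 2);
defined $bytes_written or die "syswrite failed: $!";
close($out) or die "Cannot close $file: $!";

open(my $in, '<', $file) or die "Cannot open $file: $!";
binmode($in);
my $buffer = "xx";
# Read up to 4 bytes and place them at offset 2 within $buffer.
my $bytes_read = sysread($in, $buffer, 4, 2);
defined $bytes_read or die "sysread failed: $!";
# A further read finds nothing left and returns 0 (end of file).
my $at_eof = sysread($in, my $rest, 4);
close($in);
unlink $file;

print "wrote $bytes_written, read $bytes_read, buffer is '$buffer'\n";
```

Both calls return undef on failure, which is why the results are tested with defined rather than for truth: a return value of 0 from sysread is a normal end of file, not an error.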
Packages
A package is a collection of code which lives in its own
namespace.
Syntax: package package_name;
A namespace is a named collection of unique variable names (also called a symbol table), which determines the bindings of names both at compile-time and run-time. (Run-time name lookup happens when symbolic references are de-referenced and during execution of eval.) Namespaces prevent variable name collisions between packages.
The Package Statement
The package statement switches the current naming context to a specified namespace (symbol table). If the named package does not exist, a new namespace is first created. The package stays in effect until either another package statement is invoked, or until the end of the current block or file. You can explicitly refer to variables within a package using the :: package qualifier.
Initially, code runs in the default package main.
Variables used in a package are global to that package but invisible in any other package unless a 'fully qualified name' is used.
Thus $A::x is the variable x in package A, while $B::x is the (different) variable x in package B.
and extends to the end of the innermost enclosing block, or until another package declaration is encountered. In this case the original package declaration is temporarily hidden.
Eg:
    $i = 1; print "$i\n";    # Prints "1"
    package foo;
    $i = 2; print "$i\n";    # Prints "2"
    package main;
    print "$i\n";            # Prints "1"
$PACKAGE_NAME::VARIABLE_NAME
For Example:
    $i = 1; print "$i\n";    # Prints "1"
    package foo;
    $i = 2; print "$i\n";    # Prints "2"
    package main;
    print "$i\n";            # Prints "1"
    print "$foo::i\n";       # Prints "2"
BEGIN and END Blocks
You may define any number of code blocks named BEGIN and END, which act as constructors and destructors respectively.
BEGIN { ... }
END { ... }
Every BEGIN block is executed as soon as it has been compiled, before any other statement is executed. Every END block is executed just before the perl interpreter exits. The BEGIN and END blocks are particularly useful when creating Perl modules.
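A minimal sketch of the execution order, assuming a package array @order (a name introduced only for this illustration) that records when each piece of code runs.

```perl
use strict;
use warnings;

our @order;                          # hypothetical log of execution order

push @order, "main program";         # ordinary statement, runs at run time

# Although it appears later in the file, this block runs first,
# as soon as the compiler has finished reading it.
BEGIN { push @order, "BEGIN block" }

# This block runs last, just before the interpreter exits.
END { print "END block: interpreter is about to exit\n" }

print join(", ", @order), "\n";
```

The BEGIN entry is recorded before the main program entry even though the BEGIN block is written after the push statement.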
packages contained within a single file, and are units of program reusability.
The power of Perl is immensely enhanced by the
existence of a large number of publicly available libraries and modules that provide functionality in specific application areas.
Libraries
related subroutines that can be used in some specific area/purpose, e.g. a collection of mathematical functions. A library is usually declared as a package to give it a private namespace, and is stored in a separate file with the extension .pl. Thus we might have a library of mathematical functions stored in a file Math.pl. These subroutines (a library) can be loaded into a program by placing the statement
    require "Math.pl";
at the head of the program.
If the library is declared as a package called Math, we can
access its subroutines/functions using fully qualified names, e.g.:
    $rootx = Math::sqrt($x);
Modules
Libraries have been largely superseded by modules,
which provide enhanced functionality. A module is just a package that is contained in a separate file whose name is the same as the package name, with the extension .pm. Modules are more reusable than libraries because modules follow certain specific conventions and take advantage of built-in support that makes them a powerful form of reusable software.
From the user's point of view, the great thing about a
module is that the names of the subroutines in the module are automatically imported into the namespace of the program using the package.
Eg: a module Math.pm is loaded at the start of the program by writing
    use Math;
Now the program can access the subroutines present in the module as if they were defined in the program itself. The use statement is equivalent to:
    BEGIN { require Math; Math->import(); }
A user can also import only selected subroutines of a
Perl looks for the module in the directories listed in the array @INC. This array holds a platform-specific list of directories. Before Perl 5.26 the list also included the current directory.
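As an illustration of importing only selected subroutines, the sketch below uses List::Util, a module that ships with Perl, in place of the Math module from the slides. It also prints the directories held in @INC.

```perl
use strict;
use warnings;

# Import only sum and max into the current namespace.
use List::Util qw(sum max);

my @values  = (3, 9, 4);
my $total   = sum(@values);
my $largest = max(@values);

# min was not imported, so it needs its fully qualified name.
my $smallest = List::Util::min(@values);

print "total $total, largest $largest, smallest $smallest\n";

# The directories Perl searches when loading a module.
print "$_\n" for @INC;
```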
Data structures
In C, a two-dimensional array is constructed as an array
of arrays, reflected in the syntax for accessing an element, e.g. a[0][1]. This technique does not work in Perl, since it is not possible to create a LISP-like list of lists.
However, a similar effect can be obtained by creating
an array of references to anonymous arrays. Suppose we write @colours = ([42, 128, 244], [24, 255, 0],[0, 127, 127]);
The array composer converts each comma-separated list
to an anonymous array in memory and returns a reference to it,
so that @colours is in fact an array of references to
subscript to it forces dereferencing, returning the anonymous array from which an element is selected by the second subscript. (In fact, Perl inserts a dereferencing operator -> between an adjacent ] and [.)
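A short sketch of the @colours array of references in use, showing that the two-subscript form and the explicit arrow form select the same element.

```perl
use strict;
use warnings;

# Each [ ... ] creates an anonymous array and yields a reference to it.
my @colours = ([42, 128, 244], [24, 255, 0], [0, 127, 127]);

my $implicit = $colours[1][1];       # Perl supplies the -> between ] and [
my $explicit = $colours[1]->[1];     # the same access written out in full

my @first_row = @{ $colours[0] };    # dereference a whole row
my $row_count = @colours;            # number of rows

$colours[2][0] = 10;                 # elements can be assigned the same way

print "element: $implicit, rows: $row_count, first row: @first_row\n";
```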
an array of references to arrays, we can create hashes of (references to) hashes. This is a very common data structure in Perl, since the anonymous hash provides a way of providing the capability of records (structs) in other languages.
Likewise, we can create arrays of (references to) hashes
and hashes of (references to) arrays. By combining all these possibilities, data structures of immense complexity can be created.
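A minimal sketch of the three combinations mentioned above. All names and values (alice, bob, the departments and marks) are made up for illustration.

```perl
use strict;
use warnings;

# Hash of hashes: each inner anonymous hash acts like a record (struct).
my %staff = (
    alice => { dept => 'CSE', room => 101 },
    bob   => { dept => 'ECE', room => 202 },
);
my $dept = $staff{bob}{dept};

# Hash of arrays: each value is a reference to an anonymous array.
my %marks = (
    maths   => [70, 82],
    physics => [65],
);
push @{ $marks{physics} }, 91;           # extend one of the inner arrays
my $physics_count = @{ $marks{physics} };

# Array of hashes: an ordered list of records.
my @records = (
    { name => 'alice', year => 1 },
    { name => 'bob',   year => 2 },
);
my $second_name = $records[1]{name};

print "$dept, $physics_count, $second_name\n";
```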
Eg:
we choose to make each element of the array a hash
containing three fields with keys 'L' (left neighbour), 'R' (right neighbour) and 'C' (content).
The values associated with 'L' and 'R' are references to
element hashes (or undef for non-existent neighbours); the value associated with 'C' can be any scalar, including a reference to an array, a hash or some more complex structure.
references to element hashes:
1. $head, a reference to the first element in the list,
2. $current, a reference to the element of current interest.
The content of the current element is evidently accessed by $current->{'C'}. We can move forwards along the list with
    $current = $current->{'R'};
and backwards with
    $current = $current->{'L'};
A new element is created with
    $new = {L=>undef, R=>undef, C=>...};
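A minimal sketch of such a list built from element hashes with 'L', 'R' and 'C' keys. The way new elements are appended here is an assumption made for illustration, as are the content strings.

```perl
use strict;
use warnings;

# The first element has no neighbours yet.
my $head    = { L => undef, R => undef, C => 'first' };
my $current = $head;

# Append two more elements after the current one (illustrative policy).
for my $content ('second', 'third') {
    my $new = { L => $current, R => undef, C => $content };
    $current->{R} = $new;      # link the old tail forwards to the new element
    $current = $new;           # the new element becomes the current one
}

# Walk forwards from the head, collecting the contents.
my @forward;
for (my $node = $head; defined $node; $node = $node->{R}) {
    push @forward, $node->{C};
}

# Step backwards once from the tail.
$current = $current->{L};
my $after_step_back = $current->{C};

print "forwards: @forward; after one step back: $after_step_back\n";
```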
Objects
Objects in 'real' object-oriented programming (OOP)
are encapsulations of data and methods, defined by classes. Important aspects of OOP technology are inheritance and polymorphism. Objects in Perl provide a similar functionality, but in a different way: they use the same terminology as OOP, but the words have different meanings.
(usually, but not necessarily, a hash) that is accessed by a reference, has been assigned to a class, and knows what class it belongs to. We have not defined classes yet: just remember that a Perl class is not the same thing as an OOP class. The object is said to be blessed into a class: this is done by calling the built-in function bless in a constructor.
Constructors:
There is no special syntax for constructors: a constructor is just a subroutine that returns a reference to an object.
deal with objects that belong to it. There is no special syntax for class definition.
Methods: Finally, a method is a subroutine that expects
an object reference as its first argument (or a package name for class methods). To sum up, an object is just a chunk of referenced data, but the blessing operation ties it to a package that provides the methods to manipulate that data. Thus there is no encapsulation, but there is an association between data and methods.
Constructors
generally (but not necessarily) called new. Example:
    package Animal;
    sub new {
        my $ref = {};
        bless $ref;
        return $ref;
    }
Remember that {} returns a reference to an anonymous hash. So the new constructor returns a reference to an object that is an empty hash, and knows (via the bless function) that it belongs to the package Animal.
Instances
instances of the object (strictly speaking, references to instances)
    $Dougal = new Animal;
    $Ermyntrude = new Animal;
This makes $Dougal and $Ermyntrude references to
objects that are empty hashes, and 'know' that they belong to the class Animal. Note that although the right-hand side of the assignment, new Animal, looks superficially like a simple call of a subroutine new with one argument Animal, it is in fact a special syntax for calling the subroutine new in the package Animal.
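A small sketch that puts constructor, class and methods together. The sound field and the speak and set_sound methods are hypothetical additions, and the constructor uses the two-argument form of bless and the Animal->new call syntax, which are common variations on the forms shown above rather than the slides' own code.

```perl
use strict;
use warnings;

package Animal;

# Constructor: receives the class name as its first argument.
sub new {
    my $class = shift;
    my $ref   = { sound => 'generic noise' };   # hypothetical field
    bless $ref, $class;                         # tie the hash to the class
    return $ref;
}

# Methods: each expects the object reference as its first argument.
sub speak {
    my $self = shift;
    return $self->{sound};
}

sub set_sound {
    my ($self, $sound) = @_;
    $self->{sound} = $sound;
    return $self;
}

package main;

my $Dougal     = Animal->new;
my $Ermyntrude = Animal->new;
$Ermyntrude->set_sound('moo');

print ref($Dougal), ": ", $Dougal->speak, "\n";
print ref($Ermyntrude), ": ", $Ermyntrude->speak, "\n";
```

ref returns the name of the package an object was blessed into, which is how the object 'knows' its class.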
scripts, the UNIX implementation of Perl replicated the system calls as built-in functions. However, the replication of system calls depends upon the capabilities of the host operating system. Equivalents for most of the UNIX facilities are provided by Windows NT.
The facilities that are common to the UNIX and NT
Environment variables: The current environment variables are stored in the special hash %ENV. A script can read them or change them by accessing this hash. The local operator can be used to make a temporary change to the value of an individual environment variable. Example:
executed with the new value of PATH. However, the original value of PATH is restored on exit from the block.
File system calls
There are many file system calls that are used for handling the file system. All return a value indicating success or failure, so a typical idiom is
or
Examples of the more common calls for manipulating the file system are:
chdir $x            Change directory
unlink $x           Same as rm in UNIX or delete in NT
rename($x, $y)      Same as mv in UNIX
link($x, $y)        Same as ln in UNIX. Not in NT
symlink($x, $y)     Same as ln -s in UNIX. Not in NT
mkdir($x, 0755)     Create directory and set modes
rmdir $x            Remove directory
chmod(0644, $x)     Set file permissions
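A minimal sketch of two ideas from this part: a temporary change to an environment variable with local, and checking the return value of file system calls with or die. The variable name DEMO_VAR and the directory name fs_demo_dir are hypothetical, chosen so the sketch does not disturb PATH or existing files.

```perl
use strict;
use warnings;

# Temporary change to one environment variable.
$ENV{DEMO_VAR} = "outer";
my $inside;
{
    local $ENV{DEMO_VAR} = "inner";   # in effect only inside this block
    $inside = $ENV{DEMO_VAR};
}
my $after = $ENV{DEMO_VAR};           # the original value is back

# File system calls return true on success, so failures are
# usually reported with "or die".
my $dir     = "fs_demo_dir";          # hypothetical scratch directory
my $renamed = "${dir}_renamed";

mkdir($dir, 0755)      or die "Cannot create $dir: $!";
my $created = -d $dir ? 1 : 0;
rename($dir, $renamed) or die "Cannot rename $dir: $!";
rmdir($renamed)        or die "Cannot remove $renamed: $!";
my $gone = (-e $renamed || -e $dir) ? 0 : 1;

print "inside: $inside, after: $after, created: $created, gone: $gone\n";
```

The special variable $! holds the operating system's reason for the failure, which makes the die message more useful.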
system("date");
The string given as argument is treated as a shell
command to be run by /bin/sh on UNIX, and by cmd.exe on NT, with the existing STDIN, STDOUT and STDERR. Any of these can be redirected using Bourne shell notation, e.g.
    system("date >datefile") && die "can't redirect";
Shell commands return zero to denote success and a
the files in the current directory with the .txt extension before the echo is executed.
Quoted execution
The output of a shell command can be captured using quoted execution.
Eg1: $date = `date`;
Here, the output of the date command is assigned to the variable $date.
Eg2: $date = qx/date/;
Exec
The exec function terminates the current script and executes the program named as its argument, in the same process.
Note: on success exec() never returns, so it is unnecessary to have any Perl statements following it, except possibly an or die clause:
    exec "sort $output" or die "Can't exec sort\n";
exec can be used to replace the current script with a new script:
separated words, argument processing is the same as for the system function.
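A portable sketch of system and quoted execution. Instead of date or sort, it runs the Perl interpreter itself (its path is held in the special variable $^X), an assumption made so the example behaves the same on UNIX and Windows. exec is left out because it would replace the running script.

```perl
use strict;
use warnings;

# List form of system: the arguments are passed as separate words.
# A return value of zero denotes success.
my $ok_status = system($^X, '-e', 'exit 0');

# A non-zero exit code is found in the upper byte of the status.
my $fail_status = system($^X, '-e', 'exit 3');
my $exit_code   = $fail_status >> 8;

# Quoted execution: capture what the command writes to STDOUT.
my $captured = qx{"$^X" -e "print 6*7"};

print "ok status: $ok_status, exit code: $exit_code, captured: $captured\n";
```

Because success is zero, the idiom for system is the reverse of the usual one: system(...) == 0 or die, or system(...) && die.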