
UNIT 3 ---Bottom-Up parsing Techniques

BOTTOM-UP PARSING

Bottom-up parsing is another parsing strategy in which we start with the input string and try to obtain the
start symbol of the grammar through successive reductions. If we can reduce the input string to the start
symbol, the parse is successful; otherwise it is unsuccessful. The reduction traces out the rightmost derivation
in reverse order. The rightmost derivation in reverse is a natural choice for bottom-up parsers because
rightmost sentential forms have the property that all symbols beyond the rightmost non-terminal are terminal
symbols. The most important aspect of bottom-up parsing is the process of detecting handles and using them
in reductions.

A general form of bottom-up parsing is called shift reduce parsing. Shift Reduce parsing attempts to
construct a parse tree for an input string beginning at the leaves (bottom) and working towards the root (top).
We can think of this process as one of reducing a string to the start symbol of a grammar. The shift reduce
parsing consists of shifting input symbols onto a stack until the right side of a production appears on top of
the stack. The right side may then be replaced by (reduced to) the symbol on the left side of the production
and the process is repeated.

Shift Reduce Parser


A shift-reduce parser uses the principle of bottom-up parsing: it attempts to construct a parse tree for an input
string beginning at the leaves (the bottom) and working up towards the root (the top).
A handle of a string is a substring that matches the right side of a production, and whose reduction to the
non-terminal on the left side of the production represents one step along the reverse of a rightmost
derivation. Formally, a handle of a right-sentential form γ is a production A -> β and a position of γ where
the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost
derivation of γ.

A rightmost derivation in reverse can be obtained by "handle pruning". That is, we start with a string of
terminals w that we wish to parse. If w is a sentence of the grammar at hand, then w = γn, where γn is the nth
right-sentential form of some as yet unknown rightmost derivation
S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm … ⇒rm γn-1 ⇒rm γn = w.

Viable prefixes: The set of prefixes of right-sentential forms that can appear on the stack of a shift-reduce
parser are called viable prefixes. A viable prefix of a right-sentential form is a prefix which does not extend
beyond the right end of its handle; that is, such a prefix does not contain any symbols to the
right of the handle. A viable prefix is so called because it is always possible to add terminal symbols to
the end of a viable prefix to obtain a right-sentential form.

Example : Consider the grammar:


S' -> Sc
S -> SA | A
A -> aSb | ab
Solution:
S' ⇒rm Sc
   ⇒rm SAc
   ⇒rm SaSbc
now, rightmost sentential form: SaSbc
handle of the rightmost sentential form: aSb

Prepared by T.Aruna Sri, Dept of CSE Page 1



viable prefixes of the rightmost sentential form: ε, S, Sa, SaS, SaSb


Example : Consider the following grammar:
S -> aABe
A -> Abc | b
B -> d
Solution:
S ⇒rm aABe
  ⇒rm aAde
  ⇒rm aAbcde
  ⇒rm abbcde
Therefore the handle of abbcde is b (at position 2, by A -> b)

Implementation of Shift-Reduce Parsing

Parsing by handle pruning requires two problems to be solved. The first is to locate the substring to be
reduced in a right-sentential form, and the second is to determine what production to choose in case there is
more than one production with that substring on the right side.

Implementation of a shift-reduce parser uses a stack to hold grammar symbols and an input buffer to hold
the string w to be parsed. We use $ to mark the bottom of the stack and also the right end of the input.
Initially, the stack is empty, and the string w is on the input as follows:
STACK INPUT
$ w$
The parser operates by shifting zero or more input symbols onto the stack until a handle β is on top of the
stack. The parser then reduces β to the left side of the appropriate production. The parser repeats this cycle
until it has detected an error or until the stack contains the start symbol and the input is empty:
STACK INPUT
$S $
After entering this configuration, the parser halts and announces successful completion of parsing.

There are four possible actions a shift-reduce parser can make: (1) shift, (2) reduce, (3) accept, and (4) error.

1. In a shift action, the next input symbol is shifted onto the top of the stack.
2. In a reduce action, the parser knows the right end of the handle is at the top of the stack. It must then
locate the left end of the handle within the stack and decide with what non terminal to replace the
handle.
3. In an accept action, the parser announces successful completion of parsing.
4. In an error action, the parser discovers that a syntax error has occurred and calls an error recovery
routine.
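These four moves can be seen in a tiny hand-written parser. The sketch below (an illustration in Python, not part of the original notes) hard-codes the shift/reduce decisions for the grammar S -> E, E -> E + T | T, T -> id used in the next example; automating that decision is exactly what the LR tables later in this unit provide.

```python
# Hand-written shift-reduce parser for: S -> E, E -> E + T | T, T -> id.
# The reduce decisions are hard-coded for this one grammar; an LR parser
# makes the same decisions from a table instead.

def shift_reduce_parse(tokens):
    stack = ['$']                       # $ marks the bottom of the stack
    buf = tokens + ['$']                # $ marks the right end of the input
    trace = []
    while True:
        if stack[-3:] == ['E', '+', 'T']:            # handle E + T on top
            stack[-3:] = ['E']; trace.append('reduce E->E+T')
        elif stack[-1] == 'id':                      # handle id on top
            stack[-1] = 'T'; trace.append('reduce T->id')
        elif stack[-1] == 'T':                       # handle T on top
            stack[-1] = 'E'; trace.append('reduce E->T')
        elif stack == ['$', 'E'] and buf == ['$']:
            stack[-1] = 'S'; trace.append('reduce S->E')
        elif stack == ['$', 'S'] and buf == ['$']:
            trace.append('accept'); return trace     # accept move
        elif buf[0] != '$':
            stack.append(buf.pop(0)); trace.append('shift')   # shift move
        else:
            trace.append('error'); return trace               # error move

actions = shift_reduce_parse(['id', '+', 'id', '+', 'id'])
print(actions)   # 13 moves, ending in 'accept'
```

The 13 moves printed here match the 13 rows of the trace table in the next example.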

Example : Consider the grammar:


Production No Production
(1) S -> E
(2) E -> E + T
(3) E -> T
(4) T -> id
and the string id + id + id. Perform shift-reduce parsing using the given string.


Solution:
Stack      Input              Operation
$          id1 + id2 + id3 $  shift
$id1       + id2 + id3 $      reduce by (4)
$T         + id2 + id3 $      reduce by (3)
$E         + id2 + id3 $      shift
$E+        id2 + id3 $        shift
$E+id2     + id3 $            reduce by (4)
$E+T       + id3 $            reduce by (2)
$E         + id3 $            shift
$E+        id3 $              shift
$E+id3     $                  reduce by (4)
$E+T       $                  reduce by (2)
$E         $                  reduce by (1)
$S         $                  accept

Example : Consider the grammar:


E -> E+E | E*E | (E) | id and the string id1 + id2 * id3
Solution:
Stack Input Operation

1. $ id1+id2*id3$ shift
2. $id1 +id2*id3$ reduce by E -> id
3. $E +id2*id3$ shift
4. $E+ id2*id3$ shift
5. $E+id2 *id3$ reduce by E ->id
6. $E+E *id3$ shift
7. $E+E* id3$ shift
8. $E+E*id3 $ reduce by E-> id
9. $E+E*E $ reduce by E -> E*E
10. $E+E $ reduce by E->E+E
11. $E $ accept

Conflicts during Shift-Reduce Parsing


Shift-reduce parsing cannot be used for all context-free grammars. For some context-free grammars,
a shift-reduce parser can reach a configuration in which the parser, even knowing the entire stack contents and the
next input symbol, cannot decide whether to shift or to reduce (a shift/reduce conflict), or cannot decide which
of several reductions to make (a reduce/reduce conflict).

Example : Consider the following grammar where the productions are numbered as shown below:
E -> E + T {PRINT '1'}
E -> T {PRINT '2'}
T -> T * F {PRINT '3'}
T -> F {PRINT '4'}
F -> ( E ) {PRINT '5'}
F -> id {PRINT '6'}


If the shift-reduce parser writes the production number immediately after performing the corresponding
reduction, what string will be printed if the parser input is id + id * id?
Solution:
Stack     Input           Operation
$         id + id * id $  Shift
$id       + id * id $     Reduce by F -> id, print '6'
$F        + id * id $     Reduce by T -> F, print '4'
$T        + id * id $     Reduce by E -> T, print '2'
$E        + id * id $     Shift
$E+       id * id $       Shift
$E+id     * id $          Reduce by F -> id, print '6'
$E+F      * id $          Reduce by T -> F, print '4'
$E+T      * id $          Shift
$E+T*     id $            Shift
$E+T*id   $               Reduce by F -> id, print '6'
$E+T*F    $               Reduce by T -> T * F, print '3'
$E+T      $               Reduce by E -> E + T, print '1'
$E        $               Accept
Therefore the final string obtained is ‘64264631’.
Example : A shift-reduce parser carries out the actions specified within braces immediately after reducing
with the corresponding rule of the following grammar:
S -> xxW {PRINT '1'}
S -> y {PRINT '2'}
W -> Sz {PRINT '3'}
What string will be printed if the parser input is xxxxyzz?
Solution:
Input Applicable Production Transformed Input Output
x - - -
xx - - -
xxx - - -
xxxx - - -
xxxxy S->y xxxxS 2
xxxxS - - 2


xxxxSz W->Sz xxxxW 2 3


xxxxW - - 2 3
xxxxW S->xxW xxS 2 3 1
xxS - - 2 3 1
xxSz W ->Sz xxW 2 3 1 3
xxW - - 2 3 1 3
xxW S-> xxW S 2 3 1 3 1
Therefore the final output string is ‘23131’

Operator-Precedence Parsing
For a small but important class of grammars, we can easily construct efficient shift-reduce parsers by hand.
These grammars have the property that no production right side is the empty string ε or has two adjacent
non-terminals (variables). A grammar with this property is called an operator grammar.
That is, in an operator grammar, no sentential form can contain two consecutive non-terminals. Equivalently, a
grammar is an operator grammar if it contains no production of the form S -> αABβ, where A and B are
non-terminals and α, β are arbitrary strings. Thus two non-terminals cannot occur side by side on the right-hand
side of a production, but must be separated by at least one terminal symbol.

Example : S -> SPS | (S) | -S | id


P -> + | - | * | / | ↑
As the grammar contains two non-terminals side by side on the right-hand side of the
S-production S -> SPS, it is not an operator grammar.

To obtain an operator grammar, replace P by the right-side strings of the P-productions. The resulting
S-productions S -> S+S | S-S | S*S | S/S | S↑S | (S) | -S | id form an operator grammar.

In operator-precedence parsing, we define three disjoint precedence relations, <∙, = and ·>, between certain
pairs of terminals. These precedence relations guide the selection of handles and have the following
meanings:

Relation   Meaning
a <· b     a "yields precedence to" b
a = b      a "has the same precedence as" b
a ·> b     a "takes precedence over" b

An operator precedence grammar is an ε-free operator grammar in which at most one of the relations <·, =,
·> holds between any pair of terminals.
Using Operator-Precedence relations
The intention of the precedence relations is to delimit the handle of a right sentential form, with <· marking
the left end, = appearing in the interior of the handle, and ·> marking the right end.
The handle can be found by the following process.


- Scan the string from the left end until the first ·> is encountered.
- Then scan backwards (to the left) over any ='s until a <· is encountered.
- The handle contains everything to the left of the first ·> and to the right of the <· encountered in
step 2, including any intervening or surrounding non-terminals.
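The three steps can be sketched directly. This is an illustrative Python fragment: `rel` holds a handful of assumed precedence relations for id, +, and * (a fuller matrix for an arithmetic grammar is derived later in this section), and the sentential form is given as its terminal string bracketed by $.

```python
# Locating a handle from precedence relations, following the three steps
# above. s is the terminal string bracketed by '$'; rel[(a, b)] is one of
# '<', '=', '>' for consecutive terminals a, b.

def find_handle(s, rel):
    # step 1: scan left to right until the first '>' relation
    for j in range(len(s) - 1):
        if rel[(s[j], s[j + 1])] == '>':
            # step 2: scan backwards over any '=' until a '<' is found
            i = j
            while rel[(s[i - 1], s[i])] != '<':
                i -= 1
            # step 3: the handle is everything between the '<' and the '>'
            return s[i:j + 1]
    return None          # no '>' yet: keep shifting

# A few assumed relations for id, +, * (derived properly later on):
rel = {('$', 'id'): '<', ('id', '+'): '>', ('+', 'id'): '<',
       ('id', '*'): '>', ('*', 'id'): '<', ('+', '*'): '<',
       ('*', '+'): '>', ('id', '$'): '>', ('+', '$'): '>',
       ('*', '$'): '>', ('$', '+'): '<', ('$', '*'): '<'}

print(find_handle(['$', 'id', '+', 'id', '*', 'id', '$'], rel))
```

For id + id * id the leftmost id is returned as the handle, matching the relation + <· id ·> * read off the table later in this section.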

Illustrating precedence relations:


If there exists a production S -> αabβ, where α, β are arbitrary strings, then the symbols a and b occur
in the same handle and reduce together.

α a b β

Fig : a = b
If there exist productions S -> αAbβ and A -> γaδ, then γaδ must be reduced to A before b can take part in
a reduction to S.

α A b β

γ a δ

Fig : a ·> b
If there exist productions S -> αaBβ and B -> γbδ, then b must be reduced prior to a.
S

α a B β

γ b δ

Fig : a <· b


Example : Consider the grammar with the sentential form: E + id * id.
E -> E+T | E-T | T
T -> T*F | T/F | F
F -> F↑P | P
P -> (E) | id
Sentential form: E + id * id

E
├── E
├── +
└── T
    ├── T ── F ── P ── id
    ├── *
    └── F ── P ── id

Fig : Parse tree for E + id * id
From above tree we can observe that
+ <· id id will be reduced before +
id ·> * id will be reduced before *
* <· id id will be reduced before *
+ <· * * will be reduced before +
Example : Consider the following grammar:
S -> S + S | S - S | S * S | S / S | S ↑ S | (S) | - S | a
Assuming
1. ↑ is of highest precedence and right associative.
2. * and / are of next highest precedence and left associative.
3. + and - are of lowest precedence and left associative.

Construct the operator precedence matrix.


Operator precedence relations from associativity and precedence –
1. If operator Q1 has higher precedence than operator Q2, make Q1 ·> Q2 and Q2 <· Q1.
Expression      Handle
S + S * S + S   S * S    therefore, * ·> + and + <· *


2. If Q1 and Q2 are operators of equal precedence, then make Q1 ·> Q2 and Q2 ·> Q1 if the operators
are left associative, or make Q1 <· Q2 and Q2 <· Q1 if they are right associative.
Expression   Handle
S - S + S    S - S       therefore, + and - are left associative:
                         + ·> +, + ·> -, - ·> +, - ·> -
S ↑ S ↑ S    last S ↑ S  therefore, ↑ is right associative: ↑ <· ↑

3. Make Q <· id, id ·> Q, Q <· (, ( <· Q, ) ·> Q, Q ·> ), Q ·> $, and $ <· Q for all operators Q. Also let:
( = )     $ <· (     $ <· id
( <· (    id ·> $    ) ·> $
( <· id   id ·> )    ) ·> )
$ serves as both the left and right end marker.
These rules ensure that both id and (S) will be reduced to S.
Now trace the following input strings for the above grammar using the operator precedence relations:
i. id * ( id + id )
ii. id * ( id ↑ id ) - id / id
      +     -     *     /     ↑     a     (     )     $
+     ·>    ·>    <·    <·    <·    <·    <·    ·>    ·>
-     ·>    ·>    <·    <·    <·    <·    <·    ·>    ·>
*     ·>    ·>    ·>    ·>    <·    <·    <·    ·>    ·>
/     ·>    ·>    ·>    ·>    <·    <·    <·    ·>    ·>
↑     ·>    ·>    ·>    ·>    <·    <·    <·    ·>    ·>
a     ·>    ·>    ·>    ·>    ·>    Err   Err   ·>    ·>
(     <·    <·    <·    <·    <·    <·    <·    =     Err
)     ·>    ·>    ·>    ·>    ·>    Err   Err   ·>    ·>
$     <·    <·    <·    <·    <·    <·    <·    Err   Err

Operator Precedence Relations (Precedence Matrix)


Therefore, the string with the precedence relations inserted is : a * (a + a)
$ <· a ·> * <· ( <· a ·> + <· a·> ) ·> $

Parsing of id * ( id + id ):
Step  Sentential form                Handle   Reduction
1.    $<·a·>*<·(<·a·>+<·a·>)·>$      a        X1 -> a
2.    $X1<·*<·(<·a·>+<·a·>)·>$       a        X2 -> a
3.    $X1<·*<·(X2<·+<·a·>)·>$        a        X3 -> a
4.    $X1<·*<·(X2<·+X3·>)·>$         X2+X3    X4 -> X2+X3
5.    $X1<·*<·(X4=)·>$               (X4)     X5 -> (X4)
6.    $X1<·*X5·>$                    X1*X5    X6 -> X1*X5
7.    $X6$
The reductions trace out the parse tree below; each node is labelled both with the grammar's
non-terminal E and with the Xi introduced at the corresponding reduction step.

E (X6)
├── E (X1) ── a
├── *
└── E (X5)
    ├── (
    ├── E (X4)
    │   ├── E (X2) ── a
    │   ├── +
    │   └── E (X3) ── a
    └── )

Fig : Parse trees for the sentence: a * ( a + a )


A grammar is called a simple operator precedence grammar if the following conditions are satisfied:
1. For any pair of symbols, at most one of the relations =, <· and ·> must hold.
2. Empty rules are not allowed.
3. No two productions can have the same right hand side.
Precedence functions
Precedence functions are used to reduce the size of the precedence matrix. Compilers using operator-
precedence parsers usually do not store the table of precedence relations among the terminals. Suppose we have n
terminals in the grammar G; the size of the precedence matrix is then n*n. To reduce the table size, the precedence
table is encoded by two precedence functions h and k that map terminal symbols to integers. Using these
functions the table size can be reduced to 2*n. We attempt to select h and k so that, for terminal symbols a
and b,
1. h(a) < k(b) whenever a <· b,
2. h(a) = k(b) whenever a = b, and
3. h(a) > k(b) whenever a ·> b.
Thus the precedence relation between a and b can be determined by a numerical comparison between h(a)
and k(b). Error entries in the precedence matrix are obscured, since one of 1, 2, 3 holds no matter what h(a)
and k(b) are. The loss of error detection capability is not considered serious enough to prevent the use of
precedence functions where possible; errors can still be caught when a reduction is called for and no handle
can be found.


Example : Consider the following precedence matrix

h k A + * $
a E ·> ·> ·>
+ <· ·> <· ·>
* <· ·> ·> ·>
$ <· <· <· E

Construct a precedence function.


Solution:
Draw a node h_a for each value h(a) and a node k_a for each value k(a). For every pair with a <· b, add an
edge from k_b to h_a; for every pair with a ·> b, add an edge from h_a to k_b; nodes related by = are merged.
The value of each function is the length of the longest path leaving its node.

Fig : Directed Graph

No cycle exists in the directed graph, so precedence functions exist.
1. h($) = k($) = 0, since these nodes have no outgoing edges.
2. The longest path from node k_+ has length 1 (the path k_+ -> h_$), so k(+) = 1.
3. From node k_a the longest path is k_a -> h_* -> k_* -> h_+ -> k_+ -> h_$,
   so k(a) = 5.

a + * $
h 4 2 4 0
k 5 1 3 0
Fig : Resulting Precedence Functions
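The resulting functions can be checked mechanically against the matrix: for every non-error entry, the numerical comparison of h and k must reproduce the relation. A short Python check (an illustration, with the values taken from the table above):

```python
# Verifying the precedence functions against the matrix: for every
# non-error entry, comparing h(a) with k(b) must reproduce the relation.

h = {'a': 4, '+': 2, '*': 4, '$': 0}
k = {'a': 5, '+': 1, '*': 3, '$': 0}

matrix = {   # the non-error entries of the precedence matrix above
    ('a', '+'): '>', ('a', '*'): '>', ('a', '$'): '>',
    ('+', 'a'): '<', ('+', '+'): '>', ('+', '*'): '<', ('+', '$'): '>',
    ('*', 'a'): '<', ('*', '+'): '>', ('*', '*'): '>', ('*', '$'): '>',
    ('$', 'a'): '<', ('$', '+'): '<', ('$', '*'): '<',
}
for (x, y), want in matrix.items():
    got = '<' if h[x] < k[y] else ('>' if h[x] > k[y] else '=')
    assert got == want, (x, y, got, want)
print("all relations reproduced")
```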

LR Parsers
This is an efficient, bottom-up parsing technique that can be used to parse a large class of context-free
grammars. The technique is called LR(k) parsing; the "L" stands for left-to-right scanning of the input, the "R"
for constructing a rightmost derivation in reverse, and the k for the number of input symbols of lookahead
that are used in making parsing decisions. When (k) is omitted, k is assumed to be 1.


The LR Parsing algorithm


The schematic model of an LR parser is shown in the figure below. It consists of an input tape, an
output, a stack, a parsing program, and a parsing table that has two parts (Action and Goto). The LR parsing
program is the same for all LR parsers; only the parsing table changes from one parser to another.

INPUT:  a1 … ai … an $
STACK:  s0 X1 s1 … Xm-1 sm-1 Xm sm   (sm on top)
The LR parsing program consults the Action and Goto parts of the table and produces OUTPUT.

Fig : Model of LR Parsing

The parsing program reads characters from an input buffer one at a time. The program uses a stack to
store a string of the form s0X1s1X2s2…Xmsm, where sm is on top. Each Xi is a grammar symbol and each si is
a symbol called a state. Each state symbol summarizes the information contained in the stack below it, and
the combination of the state symbol on the top of the stack and the current input symbol are used to index
the parsing table and determine the shift-reduce parsing decision.
The parsing table consists of two parts, a parsing action function Action and a Goto function Goto.
The program driving the LR parser behaves as follows. It determines sm, the state currently on top of the
stack, and ai, the current input symbol. It then consults action[sm, ai], the parsing table entry for state sm and
input ai, which can have one of four values:
1. shift s, where s is a state
2. reduce by a grammar production A -> β,
3. accept, and
4. error.

The function goto takes a state and a grammar symbol as arguments and produces a state. A
configuration of an LR parser is a pair whose first component is the stack contents and whose second
component is the unexpended input:
(s0X1s1X2s2…Xmsm, aiai+1…an$)
This configuration represents the right-sentential form
X1X2…Xmaiai+1…an
in essentially the same way as a shift-reduce parser would; only the presence of states on the stack is new.
The next move of the parser is determined by reading ai, the current input symbol, and sm, the state
on top of the stack, and then consulting the parsing action table entry action[sm, ai]. The configurations
resulting after each of the four types of move are as follows:
1. If action[sm, ai] = shift s, the parser executes a shift move, entering the configuration
(s0X1s1X2s2…Xmsm ai s, ai+1…an$)
2. If action[sm, ai] = reduce A -> β, then the parser executes a reduce move, entering
the configuration
(s0X1s1X2s2…Xm-r sm-r A s, aiai+1…an$)
where s = goto[sm-r, A] and r is the length of β, the right side of the production.
3. If action[sm, ai] = accept, parsing is completed.

4. If action[sm, ai] = error, the parser has discovered an error and calls an error recovery routine.
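These four moves form the whole driver; everything grammar-specific lives in the tables. The Python sketch below is an illustration: its ACTION/GOTO tables are the SLR tables for the grammar S -> CC, C -> aC | b worked out later in this unit (state numbers follow that example), and the stack holds states only, since each state encodes the symbol that led to it.

```python
# The LR driver loop: shift, reduce, accept, error. Tables are the SLR
# tables for S -> CC, C -> aC | b (state numbers as in a later example).

ACTION = {
    (0, 'a'): ('s', 3), (0, 'b'): ('s', 4),
    (1, '$'): ('acc',),
    (2, 'a'): ('s', 3), (2, 'b'): ('s', 4),
    (3, 'a'): ('s', 3), (3, 'b'): ('s', 4),
    (4, 'a'): ('r', 'C', 1), (4, 'b'): ('r', 'C', 1), (4, '$'): ('r', 'C', 1),  # C -> b
    (5, '$'): ('r', 'S', 2),                                                    # S -> CC
    (6, 'a'): ('r', 'C', 2), (6, 'b'): ('r', 'C', 2), (6, '$'): ('r', 'C', 2),  # C -> aC
}
GOTO = {(0, 'S'): 1, (0, 'C'): 2, (2, 'C'): 5, (3, 'C'): 6}

def lr_parse(tokens):
    stack = [0]                       # s0; the grammar symbols Xi are implicit
    buf = tokens + ['$']
    while True:
        act = ACTION.get((stack[-1], buf[0]))
        if act is None:               # move 4: error
            return 'error'
        if act[0] == 's':             # move 1: shift, pushing the next state
            stack.append(act[1])
            buf.pop(0)
        elif act[0] == 'r':           # move 2: reduce A -> beta
            _, lhs, r = act           # pop r = |beta| states, then consult goto
            del stack[len(stack) - r:]
            stack.append(GOTO[(stack[-1], lhs)])
        else:                         # move 3: accept
            return 'accept'

print(lr_parse(list('aabb')))   # accept
```

The string aabb is in the language (C derives a*b, and S derives CC), so the driver ends in the accept move.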

Example : Consider the following augmented grammar :


S’ -> Sc
S -> SA | A
A -> aSb | ab
(a) Find the collection of sets of LR(0) items
(b) Construct GOTO graph

Sol :
As the grammar is already augmented, we can start the construction of LR(0) items directly with the
augmented production S' -> .Sc.

(a) LR(0) items:

I0: S' -> .Sc
    S -> .SA
    S -> .A
    A -> .aSb
    A -> .ab

I1 = goto(I0, S):
    S' -> S.c
    S -> S.A
    A -> .aSb
    A -> .ab

I2 = goto(I0, A):
    S -> A.

I3 = goto(I0, a):
    A -> a.Sb
    A -> a.b
    S -> .SA
    S -> .A
    A -> .aSb
    A -> .ab

I4 = goto(I1, c):
    S' -> Sc.

I5 = goto(I1, A):
    S -> SA.

I6 = goto(I3, S):
    A -> aS.b
    S -> S.A
    A -> .aSb
    A -> .ab

I7 = goto(I3, b):
    A -> ab.

I8 = goto(I6, b):
    A -> aSb.


(b) The GOTO graph for the given grammar is constructed as shown below; each Ii is the item set derived above.

I0 --S--> I1    I0 --A--> I2    I0 --a--> I3
I1 --c--> I4    I1 --A--> I5
I3 --S--> I6    I3 --A--> I2    I3 --a--> I3    I3 --b--> I7
I6 --A--> I5    I6 --a--> I3    I6 --b--> I8

Fig : GOTO Graph

Example : Construct the NFA of LR(0) items of the following grammar:


S' -> S
S -> (S)S | ε
Sol: A detailed explanation of this example is left to the reader as an exercise. The NFA of LR(0) items is shown in the figure below.


In the NFA of LR(0) items, each item is a state; a transition on a grammar symbol moves the dot over that
symbol, and an ε-transition leads from any item with the dot before a non-terminal to every initial item for
that non-terminal:

S' -> .S    --S-->  S' -> S.
S' -> .S    --ε-->  S -> .(S)S   and   S -> .
S -> .(S)S  --(-->  S -> (.S)S
S -> (.S)S  --ε-->  S -> .(S)S   and   S -> .
S -> (.S)S  --S-->  S -> (S.)S
S -> (S.)S  --)-->  S -> (S).S
S -> (S).S  --ε-->  S -> .(S)S   and   S -> .
S -> (S).S  --S-->  S -> (S)S.

Fig : NFA of LR(0) Items

LR Grammars
A grammar for which we can construct a parsing table is said to be an LR grammar. An LR parser does not
have to scan the entire stack to know when the handle appears on top. Rather, the state symbol on top of the
stack contains all the information it needs.

There is a significant difference between LL and LR grammars. For a grammar to be LR(k), we must be able
to recognize the occurrence of the right side of a production, having seen all of what is derived from the
right side with k input symbols of lookahead. This requirement is far less stringent than that for LL(k)
grammars where we must be able to recognize the use of a production seeing only the first k symbols of
what its right side derives.

Construction of SLR Parsing Table


“Simple LR”, or SLR for short, is the weakest of the three methods in terms of the number of grammars for which it
succeeds, but it is the easiest to implement. The parsing table constructed by this method is referred to as an
SLR table, and an LR parser using an SLR parsing table is called an SLR parser. A grammar for which an SLR
parser can be constructed is said to be an SLR grammar. An LR(0) item of a grammar G is a production of G
with a dot at some position of the right side.

If G is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new start symbol
S' and production S' -> S. The purpose of this new starting production is to indicate to the parser when it
should stop parsing and announce acceptance of the input. That is, acceptance occurs when and only when
the parser is about to reduce by
S' -> S.


The Closure Operation


If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A -> α.Bβ is in closure(I) and B -> γ is a production, then add the item B -> .γ to closure(I), if it is not
already there. We apply this rule until no more new items can be added to closure(I).
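The two rules can be sketched directly. In this illustrative Python fragment an item is a tuple (lhs, rhs, dot); the grammar is the augmented S -> CC, C -> aC | b grammar used in a later example.

```python
# closure() for LR(0) items: rule 1 seeds the set with I itself; rule 2
# repeatedly adds B -> .gamma for every non-terminal B right after a dot.

GRAMMAR = {
    "S'": [('S',)],
    'S':  [('C', 'C')],
    'C':  [('a', 'C'), ('b',)],
}

def closure(items):
    items = set(items)                 # rule 1: every item of I is included
    work = list(items)
    while work:                        # rule 2, applied until no item is new
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:   # dot before non-terminal B
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)           # B -> .gamma
                if item not in items:
                    items.add(item)
                    work.append(item)
    return frozenset(items)

I0 = closure({("S'", ('S',), 0)})
print(len(I0))   # 4 items: S'->.S, S->.CC, C->.aC, C->.b
```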

The GOTO Operation


goto(I, X) is defined to be the closure of the set of all items [A -> αX.β] such that
[A -> α.Xβ] is in I. If I is the set of items that are valid for some viable prefix γ, then goto(I, X) is the set of
items that are valid for the viable prefix γX.

The sets-of-items construction


The algorithm for the canonical collection of sets of LR(0) items for an augmented grammar G is as
follows:
procedure items (G)
begin
C := {closure({[S' -> .S]})};
repeat
for each set of items I in C and each grammar symbol X
such that goto(I, X) is not empty and not in C do
add goto(I, X) to C
until no more sets of items can be added to C
end

Algorithm for Constructing an SLR parsing table


1. Construct C = {I0, I1, …, In}, the collection of sets of LR(0) items for G'.
2. State i of the parser is constructed from Ii. The parsing actions for state i are determined as follows:
a) If [A -> α.aβ] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to “shift j”. Here a must be a
terminal.
b) If [A -> α.] is in Ii, then set action[i, a] to “reduce A -> α” for all a in FOLLOW(A); here A
may not be S'.
c) If [S' -> S.] is in Ii, then set action[i, $] to “accept”.
If any conflicting actions are generated by the above rules, we say the grammar is not SLR(1). The
algorithm fails to produce a parser in this case.
3. The goto transitions for state i are constructed for all non-terminals A using the rule: If goto(Ii, A) =
Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items containing
[S' -> .S].

Example : Consider the following grammar:


S -> CC
C -> aC | b
(a) Find the canonical sets of LR(0) items
(b) Construct SLR parsing table
Sol :
(a) The augmented grammar for the above grammar can be written as follows:
S' -> S
S -> CC
C -> aC
C -> b

LR(0) items:
I0: S' -> .S
    S -> .CC
    C -> .aC
    C -> .b

The GOTO graph for the given grammar is drawn below:

I0: S' -> .S, S -> .CC, C -> .aC, C -> .b
I1 = goto(I0, S): S' -> S.
I2 = goto(I0, C): S -> C.C, C -> .aC, C -> .b
I3 = goto(I0, a): C -> a.C, C -> .aC, C -> .b
I4 = goto(I0, b): C -> b.
I5 = goto(I2, C): S -> CC.
I6 = goto(I3, C): C -> aC.
In addition, goto(I2, a) = goto(I3, a) = I3 and goto(I2, b) = goto(I3, b) = I4.

Fig : GOTO Graph

(b) Below is the SLR parsing table for the given grammar (FOLLOW(S) = {$}, FOLLOW(C) = {a, b, $}):

        ACTION                                      GOTO
State   a             b             $               S    C
0       Shift 3       Shift 4                       1    2
1                                   Accept
2       Shift 3       Shift 4                            5
3       Shift 3       Shift 4                            6
4       Reduce C->b   Reduce C->b   Reduce C->b
5                                   Reduce S->CC
6       Reduce C->aC  Reduce C->aC  Reduce C->aC

Constructing Canonical LR Parsing Tables


This is the most general technique for constructing an LR parsing table from a grammar.
Algorithm: Construction of the sets of LR(1) items.
Input: An augmented grammar G'.
Output: The sets of LR(1) items that are the sets of items valid for one or more viable prefixes of G'.
Steps:


The procedures closure and goto and the main routine items for constructing the sets of items are given
below.

function closure(I);
{
repeat
for each item [A -> α.Bβ, a] in I,
each production B -> γ in G,
and each terminal b in FIRST(βa)
such that [B -> .γ, b] is not in I do
add [B -> .γ, b] to I;
until no more items can be added to I;
return I
}

function goto(I, X);


{
let J be the set of items [A -> αX.β, a] such that
[A -> α.Xβ, a] is in I;
return closure(J)
}

procedure items(G');

{
C := {closure({[S' -> .S, $]})};
repeat
for each set of items I in C and each grammar symbol X
such that goto(I, X) is not empty and not in C do
add goto(I, X) to C
until no more sets of items can be added to C
}
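The LR(1) closure can be sketched like this. In this illustrative Python fragment an item is (lhs, rhs, dot, lookahead), and FIRST is computed naively, which suffices for the S -> BB, B -> cB | d grammar of the next example (it has no ε-productions).

```python
# LR(1) closure: for [A -> alpha.B beta, a], add [B -> .gamma, b] for
# every b in FIRST(beta a). Grammar: S -> BB, B -> cB | d (augmented).

GRAMMAR = {"S'": [('S',)], 'S': [('B', 'B')], 'B': [('c', 'B'), ('d',)]}

def first(seq):
    # FIRST of a string; no symbol here derives the empty string, so
    # only the first symbol of seq matters (naive, grammar-specific)
    sym = seq[0]
    if sym not in GRAMMAR:
        return {sym}                       # terminal (or $)
    return {t for rhs in GRAMMAR[sym] for t in first(rhs)}

def closure(items):
    items, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                for b in first(rhs[dot + 1:] + (la,)):   # FIRST(beta a)
                    it = (rhs[dot], prod, 0, b)
                    if it not in items:
                        items.add(it)
                        work.append(it)
    return frozenset(items)

I0 = closure({("S'", ('S',), 0, '$')})
print(len(I0))   # 6 items, matching I0 of the next example
```

Note how the B-items in I0 receive lookaheads {c, d} rather than {$}: the lookahead comes from FIRST of what follows B, exactly as in the item sets listed in the next example.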

Algorithm for Constructing an canonical LR parsing table


1. Construct C = {I0, I1, …, In}, the collection of sets of LR(1) items for the augmented grammar G'.
2. State i of the parser is constructed from Ii. The parsing actions for state i are determined as follows:
a. If [A -> α.aβ, b] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to “shift j”. Here a must be a
terminal.
b. If [A -> α., a] is in Ii and A ≠ S', then set action[i, a] to “reduce A -> α”.
c. If [S' -> S., $] is in Ii, then set action[i, $] to “accept”.
If any conflicting actions are generated by the above rules, we say the grammar is not LR(1). The algorithm
fails to produce a parser in this case.
3. The goto transitions for state i are constructed for all non-terminals A using the rule: If goto(Ii, A) =
Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made ”error”.
5. The initial state of the parser is the one constructed from the set of items containing
[S' -> .S, $].


Example : Consider the following grammar


S -> BB
B -> cB | d
(a) Construct augmented grammar for G
(b) Construct the non-empty sets of LR(1) items for the given G
(c) Construct canonical parsing table of G
Sol :
(a) Augmented Grammar for grammar G is as follows:
S’ -> S
S -> BB
B -> cB
B -> d

(b) The LR(1) items for the grammar G are derived as shown below :
I0 :
S’ -> .S , {$}
S -> .BB , {$}
B -> .cB , {c,d}
B -> .d , {c,d}

I1 : goto(I0, S)
S’ -> S. , {$}

I2 : goto(I0,B)
S -> B.B , {$}
B-> .cB , {$}
B -> .d , {$}

I3 : goto(I0,c)
B-> c.B , {c,d}
B -> .cB , {c,d}
B -> .d , {c,d}

I4 : goto(I0,d)
B-> d. , {c,d}

I5 : goto(I2,B)
S -> BB. , {$}

I6 : goto(I2,c)
B -> c.B , {$}
B -> .cB , {$}
B -> .d , {$}

I7 : goto(I2,d)
B -> d. {$}

I8 : goto(I3,B)
B -> cB. , {c,d}


I9 : goto(I6,B)
B-> cB. , {$}
Below is the GOTO graph constructed for the sets of LR(1) items:

I0 --S--> I1    I0 --B--> I2    I0 --c--> I3    I0 --d--> I4
I2 --B--> I5    I2 --c--> I6    I2 --d--> I7
I3 --B--> I8    I3 --c--> I3    I3 --d--> I4
I6 --B--> I9    I6 --c--> I6    I6 --d--> I7

Fig : GOTO Graph


(c) The canonical parsing table for the grammar G is given below:

        Action                                      GOTO
State   c             d             $               S    B
0       Shift 3       Shift 4                       1    2
1                                   Accept
2       Shift 6       Shift 7                            5
3       Shift 3       Shift 4                            8
4       Reduce B->d   Reduce B->d
5                                   Reduce S->BB
6       Shift 6       Shift 7                            9
7                                   Reduce B->d
8       Reduce B->cB  Reduce B->cB
9                                   Reduce B->cB

Canonical LR(1) Parsing Table

Example : Consider the following grammar, in which S -> A serves as the augmenting production:



S -> A
A -> BA | ε
B -> aB | b
(i) Construct a CLR parsing table.
(ii) Find the action of LR(1) parser on input : aabb
Sol: Productions can be written as
(1) A -> ε
(2) A -> BA
(3) B -> b
(4) B -> aB

(i) The canonical parsing table for the given grammar is given in the table below:

        Action                                            GOTO
State   a             b             $                     A    B
0       Shift 3       Shift 4       Reduce A -> ε         1    2
1                                   Accept
2       Shift 3       Shift 4       Reduce A -> ε         5    2
3       Shift 3       Shift 4                                  6
4       Reduce B->b   Reduce B->b   Reduce B->b
5                                   Reduce A->BA
6       Reduce B->aB  Reduce B->aB  Reduce B->aB

Action of LR(1) parser on input aabb

Stack Remaining Input Operation


0 aabb$ Initial
0a3 abb$ Shift
0a3a3 bb$ Shift
0a3a3b4 b$ Shift
0a3a3B6 b$ Reduce by B -> b
0a3B6 b$ Reduce by B -> aB
0B2 b$ Reduce by B -> aB
0B2b4 $ Shift
0B2B2 $ Reduce by B -> b
0B2B2A5 $ Reduce by A -> ε
0B2A5 $ Reduce by A -> BA
0A1 $ Reduce by A -> BA
--- $ Reduce by S -> A and
Accept

Constructing LALR Parsing Tables



Algorithm. An easy, but space-consuming LALR table construction.


Input. An augmented grammar G.
Output. The LALR parsing table functions action and goto for G.
Method.
1. Construct C = {I0, I1, …, In}, the collection of sets of LR(1) items.
2. For each core present among the sets of LR(1) items, find all sets having that core, and replace these
sets by their union.
3. Let C' = {J0, J1, …, Jm} be the resulting sets of LR(1) items. The parsing actions for state i are
constructed from Ji in the same manner as in the previous algorithm. If there is a parsing action conflict,
the algorithm fails to produce a parser, and the grammar is said not to be LALR(1).
4. The goto table is constructed as follows. If J is the union of one or more sets of LR(1) items, that is, J
= I1 ∪ I2 ∪ … ∪ Ik, then the cores of goto(I1, X), goto(I2, X), …, goto(Ik, X) are the same, since I1, I2,
…, Ik all have the same core. Let K be the union of all sets of items having the same core as goto(I1,
X); then goto(J, X) = K.
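Step 2 (merging sets with a common core) can be sketched as below. This is an illustrative Python fragment: the LR(1) sets are abbreviated versions of the I3/I6, I4/I7, and I8/I9 pairs from the example that follows, with closure items omitted.

```python
from collections import defaultdict

# Merging LR(1) item sets that share a core. An item is
# (lhs, rhs, dot, lookahead); the core drops the lookahead component.
# Sets abbreviated from the S -> BB, B -> cB | d example (closure omitted).
lr1_sets = {
    'I3': {('B', ('c', 'B'), 1, 'c'), ('B', ('c', 'B'), 1, 'd')},
    'I6': {('B', ('c', 'B'), 1, '$')},
    'I4': {('B', ('d',), 1, 'c'), ('B', ('d',), 1, 'd')},
    'I7': {('B', ('d',), 1, '$')},
    'I8': {('B', ('c', 'B'), 2, 'c'), ('B', ('c', 'B'), 2, 'd')},
    'I9': {('B', ('c', 'B'), 2, '$')},
}

def core(item_set):
    return frozenset((l, r, d) for (l, r, d, _) in item_set)

merged = defaultdict(set)
for items in lr1_sets.values():
    merged[core(items)] |= items       # union the sets sharing a core

print(len(lr1_sets), '->', len(merged))   # 6 -> 3
```

The six sets collapse into three, corresponding to the merged states I36, I47, and I89 of the next example.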

Example : Consider the following grammar


S -> BB
B -> cB | d
Construct LALR parsing table
Sol: Augmented grammar can be written as
S’ -> S
S -> BB
B -> cB
B -> d
Identify the common cores among the LR(1) sets of items of this grammar (constructed in the previous
example) and merge them. From the GOTO graph we can observe that I3 and I6, I4 and I7, and I8 and I9 are
pairs of item sets with common cores: their first components are identical and they differ only in the second
(lookahead) component.

1. The sets of items I3 and I6 are replaced by their union (merging the lookaheads), giving the new set I36:

I36: B -> c.B , {c, d, $}
     B -> .cB , {c, d, $}
     B -> .d  , {c, d, $}

2. Similarly, the sets of items I4 and I7 are replaced by their union, giving the new set I47:

I47: B -> d. , {c, d, $}

Finally, the common-core sets of items I8 and I9 are replaced by their union:

I89: B -> cB. , {c, d, $}

The LALR parsing table for the given grammar can be constructed as shown below:


            Action                                          GOTO
State       c               d               $               S       B
0           shift 36        shift 47                        1       2
1                                           accept
2           shift 36        shift 47                                5
36          shift 36        shift 47                                89
47          reduce B->d     reduce B->d     reduce B->d
5                                           reduce S->BB
89          reduce B->cB    reduce B->cB    reduce B->cB

YACC Programming Specifications:


A YACC specification consists of a mandatory rules section, and optional sections for definitions and user
subroutines.

The declarations section for definitions, if present, must be the first section in the YACC program. The
mandatory rules section follows the definitions; if there are no definitions, then the rules section is first. In
both cases, the rules section must start with the delimiter %%. If there is a subroutines section, it follows
the rules section and is separated from the rules by another %% delimiter. If there is no second %%
delimiter, the rules section continues to the end of the file.

When all sections are present, a specification file has the format:

declarations
%%
rules
%%
subroutines

The example that follows is a complete yacc specification. The sample program generates a parser which
takes input in the form:
month day , year
This input is converted to output in the form:
day month year
In the example, the declarations section defines a data structure used to hold the values associated with
tokens, and declares all the token names used in the rules section. The rules section contains one rule and an
action associated with it. The subroutines section defines a function that is called in the action.

%union
{
char *text;
int ival;
}
%token <ival> t_DAY
%token <text> t_MONTH

%token <ival> t_YEAR
%%
date : t_MONTH t_DAY ',' t_YEAR
{ print_date ($2,$1,$4); };
%%
void print_date(d,m,y)
char *m;
int d,y;
{
printf("%d %s %d\n",d,m,y);
}

The parser uses a lexical analyzer that can return the tokens t_DAY, t_MONTH, and t_YEAR, and also can
associate each token with some value.

Practice:

1) Explain operator-precedence parsing with an example.


2) For the following grammar

S → A#
A → bB
B → cC
B → cCc
C → dA
A → a
(i) Generate the sets of LR(1) items.
(ii) Is the grammar SLR(1)?
(iii) Is the grammar LR(1)?
If not, why not?
3) What is LR(1) parsing?
Construct canonical LR parse table for the following grammar
S → Aa | bAc | bBa
A→d
B → d.
4) Consider the following augmented grammar :
S -> E
E -> E + T | T
T -> a | (E)
Construct the SLR(1) parse table
5) Construct LR(0) parser for the following grammar :
S -> cA | ccB
A -> cA | a
B -> ccB | b



Code No: 07A5EC20
Set No. 4
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
III B.Tech. I-Sem. I Mid-Term Examinations, September - 2009
AUTOMATA AND COMPILER DESIGN
Objective Exam
A

1. Which of the following is an attribute of Symbol table [ ]


a)Type b)Name c)Scope d)All the above

2. Lexical analyzer is also called as [ ]


a)Scanner b)Parser c)Type Checker d)None

3. Which of the following is the most powerful parser? [ ]


a)SLR b)LALR c)Canonical LR d)Operator-precedence

4. A language L is defined as [ ]
(a)set of symbols over a given Σ (b)set of strings of symbols over a given Σ
(c)set of alphabets (d)all of the above

5. In operator precedence parsing precedence relations are defined [ ]


(a) for all pair of non-terminals (b)for all pair of terminals
(c)both terminals and non terminals (d)only for a certain pair of terminals

6. The grammar E → E+E / E*E / a is [ ]


(a) ambiguous (b) unambiguous
(c) depends on the given sentence (d) only for a certain pair of terminals

7. CFG can be recognized by a ________ [ ]


(a) Finite Automata (b)Pushdown Automata
(c)Linear bounded Automata (d)Turing machine

8. A bottom-up parser generates _________ [ ]


(a) left most derivation (b) right most derivation
(c) right most derivation in reverse (d)left most derivation in reverse

9. synthesized attribute can easily be simulated by an _________ [ ]


(a)LL grammar (b)ambiguous grammar
(c)LR grammar (d)none of the above

10. What is the RE for the language, set of strings with at least one 1, one 2 and one 3?
(a)1+2+3 (b)11*22*33* (c)1*2*3* (d)both a&b [ ]



II. Fill in the blanks:

11. In ___________ situations inherited attribute is a natural choice

12. ______________are the parameters for mkleaf() function.

13. Syntax analyzer is also called as _________.

14. A __________ is a graphical representation for a derivation.

15. A syntax directed definition is a generalization of a __________ grammar.

16. E → E+T/T , eliminate left recursion _________________________

17. Regular expression for an identifier is _______________________

18. A parse tree showing the values of attributes at each node is called as _____________

19. Brute force method is one of the _____________ Technique.

20. YACC means __________________________

-oOo-



Code No: 07A5EC20
Set No. 1
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
III B.Tech. I Sem., II Mid-Term Examinations, November- 2009
AUTOMATA AND COMPILER DESIGN
Objective Exam

Name: ______________________________ Hall Ticket No.


A
Answer All Questions. All Questions Carry Equal Marks. Time: 20 Min. Marks: 20.

I. Choose the correct alternative:

1.A sound type system eliminates _____,when the target program runs. [ ]
a)Type errors b)runtime errors c)compile type errors d)none

2. Implicit type conversion is __________. [ ]


a)Done automatically by the compiler. b) Done automatically by the interpreter
c) Done automatically by the OS d) Done automatically by the language

3.____________ determines the type of a language construct from the way it is used. [ ]
a)Type synthesis b)Type inference c)Type reference d)None

4.A record is a data structure with ____________ [ ]


a)named fields b)fields c)named records d)records

5. Code Motion moves _____________. [ ]


a)Code outside a loop b)Code inside a loop c)Code top of a loop d)Code bottom of a loop.

6. Machine dependent optimization is influenced by the _________ [ ]


a)Source machine b)Target machine c)Compiler d)Interpreter

7.An occurrence of an expression E is called a common sub expression if E is ___. [ ]


a)Previously computed and the values of E have not changed
b) Previously computed and the values of E will change
c) After computed and the values of E have not changed
d) After computed and the values of E will change

8. The code generator, produces the target program from the transformed ______. [ ]
a)High level code b)Low level code c) Intermediate code d)All the above

9. The live variable analysis is done by __________. [ ]


a)ud-chains b)du-chains c)computation of in and out d)All the above

10.Which is not an example for function-preserving transformation? [ ]


a)Copy propagation b)Flow of control c)Constant folding d)dead-code elimination

Cont..[2]


II. Fill in the Blanks:

11. Implicit type conversions also called ____________

12. The runtime representation of an object program in the logical address space consists of ___________.

13. The static data objects are created at ___________________.

14. The activations of procedures during the running of an entire program by a tree called _________.

15. The substitution of values for names whose values are constant is known as ______.

16. Activation records are sometimes called ______________.

17. If a transformation of a program is performed by looking only at the statements in a basic block, it is
called ________________.

18. The code improvement phase consists of _____________ followed by the application of transformation.

19. The replacement of an expensive operation by a cheaper one is called ________.

20. The relative address for a field name is relative to the __________ for that record.

-oOo-



Code No: 07A5EC20
Set No. 2
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
III B.Tech. I Sem., II Mid-Term Examinations, November- 2009
AUTOMATA AND COMPILER DESIGN
Objective Exam

Name: ______________________________ Hall Ticket No.


A
Answer All Questions. All Questions Carry Equal Marks. Time: 20 Min. Marks: 20.

I. Choose the correct alternative:

1. A record is a data structure with ____________ [ ]


a)named fields b)fields c)named records d)records

2. Code Motion moves _____________. [ ]


a)Code outside a loop b)Code inside a loop c)Code top of a loop d)Code bottom of a loop.

3. Machine dependent optimization is influenced by the _________ [ ]


a)Source machine b)Target machine c)Compiler d)Interpreter

4. An occurrence of an expression E is called a common sub expression if E is ___. [ ]


a)Previously computed and the values of E have not changed
b) Previously computed and the values of E will change
c) After computed and the values of E have not changed
d) After computed and the values of E will change

5. The code generator, produces the target program from the transformed ______. [ ]
a)High level code b)Low level code c) Intermediate code d)All the above

6. The live variable analysis is done by __________. [ ]


a)ud-chains b)du-chains c)computation of in and out d)All the above

7. Which is not an example for function-preserving transformation? [ ]


a)Copy propagation b)Flow of control c)Constant folding d)dead-code elimination

8. A sound type system eliminates _____,when the target program runs. [ ]


a)Type errors b)runtime errors c)compile type errors d)none

9. Implicit type conversion is __________. [ ]


a)Done automatically by the compiler. b) Done automatically by the interpreter
c) Done automatically by the OS d) Done automatically by the language

10.____________ determines the type of a language construct from the way it is used. [ ]
a)Type synthesis b)Type inference c)Type reference d)None

Cont..[2]


II. Fill in the Blanks:

11. The activations of procedures during the running of an entire program by a tree called _________.

12. The substitution of values for names whose values are constant is known as ______.

13. Activation records are sometimes called ______________.

14. If a transformation of a program is performed by looking only at the statements in a basic block, it is
called ________________.

15. The code improvement phase consists of _____________ followed by the application of transformation.

16. The replacement of an expensive operation by a cheaper one is called ________.

17. The relative address for a field name is relative to the __________ for that record.

18. Implicit type conversions also called ____________

19. The runtime representation of an object program in the logical address space consists of ___________.

20. The static data objects are created at ___________________.

-oOo-



Code No: 07A5EC20
Set No. 3
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
III B.Tech. I Sem., II Mid-Term Examinations, November- 2009
AUTOMATA AND COMPILER DESIGN
Objective Exam

Name: ______________________________ Hall Ticket No.


A
Answer All Questions. All Questions Carry Equal Marks. Time: 20 Min. Marks: 20.

I. Choose the correct alternative:

1. Machine dependent optimization is influenced by the _________ [ ]


a)Source machine b)Target machine c)Compiler d)Interpreter

2. An occurrence of an expression E is called a common sub expression if E is ___. [ ]


a)Previously computed and the values of E have not changed
b) Previously computed and the values of E will change
c) After computed and the values of E have not changed
d) After computed and the values of E will change

3. The code generator, produces the target program from the transformed ______. [ ]
a)High level code b)Low level code c) Intermediate code d)All the above

4. The live variable analysis is done by __________. [ ]


a)ud-chains b)du-chains c)computation of in and out d)All the above

5. Which is not an example for function-preserving transformation? [ ]


a)Copy propagation b)Flow of control c)Constant folding d)dead-code elimination

6. A sound type system eliminates _____,when the target program runs. [ ]


a)Type errors b)runtime errors c)compile type errors d)none

7. Implicit type conversion is __________. [ ]


a)Done automatically by the compiler. b) Done automatically by the interpreter
c) Done automatically by the OS d) Done automatically by the language

8.____________ determines the type of a language construct from the way it is used. [ ]
a)Type synthesis b)Type inference c)Type reference d)None

9. A record is a data structure with ____________ [ ]


a)named fields b)fields c)named records d)records

10. Code Motion moves _____________. [ ]


a)Code outside a loop b)Code inside a loop c)Code top of a loop d)Code bottom of a loop.

Cont..[2]



Code No: 07A5EC20
Set No. 4
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
III B.Tech. I Sem., II Mid-Term Examinations, November- 2009
AUTOMATA AND COMPILER DESIGN
Objective Exam
I. Choose the correct alternative:

1. The code generator, produces the target program from the transformed ______. [ ]
a)High level code b)Low level code c) Intermediate code d)All the above

2. The live variable analysis is done by __________. [ ]
a)ud-chains b)du-chains c)computation of in and out d)All the above

3. Which is not an example for function-preserving transformation? [ ]
a)Copy propagation b)Flow of control c)Constant folding d)dead-code elimination

4. A sound type system eliminates _____,when the target program runs. [ ]


a)Type errors b)runtime errors c)compile type errors d)none

5. Implicit type conversion is __________. [ ]


a)Done automatically by the compiler. b) Done automatically by the interpreter
c) Done automatically by the OS d) Done automatically by the language

6.____________ determines the type of a language construct from the way it is used. [ ]
a)Type synthesis b)Type inference c)Type reference d)None

7. A record is a data structure with ____________ [ ]


a)named fields b)fields c)named records d)records

8. Code Motion moves _____________. [ ]


a)Code outside a loop b)Code inside a loop c)Code top of a loop d)Code bottom of a loop.

9. Machine dependent optimization is influenced by the _________ [ ]


a)Source machine b)Target machine c)Compiler d)Interpreter

10. An occurrence of an expression E is called a common sub expression if E is ___. [ ]


a)Previously computed and the values of E have not changed
b) Previously computed and the values of E will change
c) After computed and the values of E have not changed
d) After computed and the values of E will change

Cont..[2]


II. Fill in the Blanks:

11. The code improvement phase consists of _____________ followed by the application of transformation.

12. The replacement of an expensive operation by a cheaper one is called ________.

13. The relative address for a field name is relative to the __________ for that record.

14. Implicit type conversions also called ____________

15. The runtime representation of an object program in the logical address space consists of ___________.

16. The static data objects are created at ___________________.

17. The activations of procedures during the running of an entire program by a tree called _________.

18. The substitution of values for names whose values are constant is known as ______.

19. Activation records are sometimes called ______________.

20. If a transformation of a program is performed by looking only at the statements in a basic block, it is
called ________________.

-oOo-


Code No: 45116 R07 Set No - 1


III B.Tech I Semester Regular Examinations,Nov/Dec 2009
AUTOMATA AND COMPILER DESIGN
Common to Information Technology, Computer Science And Systems
Engineering
Time: 3 hours Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks

1. (a) Consider the following declaration grammar & write the translation scheme
for identifying the type of the identifier:
P → D ; E
D → D ; D / id : T
T → char / int / ↑T / array [num] of T
Find the type of each entry.
(b) Consider following grammar:
E num.num/literal/num/E%E/E+E/ E''E / *E / E[E]
Construct semantic rules to find type of expression. [8+8]

2. (a) Describe in English the sets denoted by the following regular expressions:
i. [00 + 11 + (01 + 10)(00 + 11)*(01 + 10)]*

ii. 10+(0+11)0*1
(b) Prove following identities for regular expressions r, s & t. Here r=s means
L(r)=L(s)
i. (r*s*)*=(r+s)*
ii. (r+s)+t=r+(s+t)

3. (a) Give an algorithm to compute reaching definitions interprocedurally.


(b) Give an algorithm for eliminating global common subexpression. [8+8]

4. (a) Explain handle pruning process. Give examples.


(b) Explain error recovery in LR parsing. [8+8]

5. (a) Generate code for the following C statements. Assume all the variables are
static and three registers are available:
i. x=a+b*c
ii. x=a/(b+c)-d*(e+f)
(b) Generate code for the following C statements. Assume all the variables are
automatic and three registers are available:
i. x=a+b*c
ii. x=a/(b+c)-d*(e+f) . [8+8]

6. (a) What is an activation tree? Consider the following activation tree:




(b) Distinguish between static scoping and dynamic scoping. [8+8]

7. (a) Distinguish between synthesized & inherited attributes.


(b) Write a short note on abstract syntax tree. [8+8]

8. (a) What are the merits & demerits of recursive descent parsing.
(b) Explain predictive parsing in detail. [8+8]



Code No: 45116 R07 Set No - 2


III B.Tech I Semester Regular Examinations,Nov/Dec 2009
AUTOMATA AND COMPILER DESIGN
Common to Information Technology, Computer Science And Systems
Engineering
Time: 3 hours Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks

1. (a) Write an algorithm for elimination of induction variable.


(b) Write a C program to compute sum of digits of a number and convert it into
three address code. And generate flow graph. [8+8]

2. (a) Discuss lexical scoping with nested procedures and without nested procedures.
(b) Describe the method to obtain faster access to nonlocals. [8+8]

3. Prove or disprove following for regular expression r, s & t:

(a) (rs+r)*s=r(sr+r)*
(b) (r+s)*=r*+s*
(c) s(rs+s)*r=rr*s(rr*s*)
(d) r(s+t)=rs+rt [16]

4. (a) Find the precedence functions for following grammar:


E → E + E / E * E / (E) / id
(b) Explain error recovery in LR parsing. [8+8]

5. (a) Generate code for following c program:


main()
{
    int i;
    int a[10];
    while (i <= 10)
        a[i] = 0;
}


(b) Explain the register allocation by graph coloring. [8+8]

6. (a) What is the main purpose of semantic analysis?


(b) Write about three address code? Give examples. [8+8]

7. (a) Write a short note on type equivalence.


(b) Write a short note on type checking. [8+8]

8. (a) Explain recursive descent parsing in detail.


(b) State the rules to compute FIRST(X) & FOLLOW(X). [8+8]


Code No: 45116 R07 Set No - 3


III B.Tech I Semester Regular Examinations,Nov/Dec 2009
AUTOMATA AND COMPILER DESIGN
Common to Information Technology, Computer Science And Systems
Engineering
Time: 3 hours Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks

1. (a) What is left recursion? Remove left recursion from following grammar:
S → Aa / b
A → Ac / Sd / ε
(b) Check for LL(1) for following grammar:
prog → begin d semi X end
X → d semi X / sY
Y → semi s Y / ε. [6+10]

2. Construct SLR parsing table for following grammar:


E → E + T / T
T → T * F / F
F → (E) / id. [16]

3. (a) Consider following grammar & identify the type of subexpression. Use type
error as a type expression in error condition.
E literal/num/id/EmodE/E[E]/ E
(b) Write about type checking. Consider following C declarations:
typedef struct {
int a, b;
} CELL, *PCELL;
CELL foo[100];
PCELL bar(x,y)
int x;
CELL y;
Write type expressions for the types of foo and bar. [8+8]

4. (a) Discuss lexical scoping with nested procedures and without nested procedures.
(b) Consider the following code:
prog copyint()
var a:int
proc unsafe(var x:int)
begin x=2; a=0 end
begin
a=1
unsafe(a);

writeln(a);
end
Find the output if call by value, call by reference and call by value result are
used. [8+8]

5. Generate code for the following C statements. Assume all the variables are static
and three registers are available:

(a) x=a+b*c
(b) x=a/(b+c)-d*(e+f)
(c) a[i][j]=b[i][k]*c[k][j]
(d) a[i]+=b[j] [16]

6. Write short notes on following terms:

(a) dominators.
(b) natural loops.
(c) inner loops.
(d) preheaders. [16]

7. (a) Define regular expression. Give examples.


(b) State & explain the properties of regular sets. [4+12]

8. (a) What is dependency graph? What is its significance?


(b) Translate the expression (a+b)*(c+d)+(a+b+c) into:
i. Quadruples.
ii. Triples.
iii. Indirect triples.
iv. Syntax tree. [8+8]



Code No: 45116 R07 Set No - 4


III B.Tech I Semester Regular Examinations,Nov/Dec 2009
AUTOMATA AND COMPILER DESIGN
Common to Information Technology, Computer Science And Systems
Engineering
Time: 3 hours Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks

1. (a) Give an algorithm to compute.


i. Available expressions.
ii. Live variables for the language with pointers.
(b) Prove that "depth of a reducible flow graph is never less than the number of
times interval analysis must be performed to produce a single node." [8+8]

2. (a) Construct LALR parsing table for the following grammar:


S → Aa / bAc / dc / bda
A → d
Show the moves of this parser on input bda.
(b) Consider following grammar:
S → 1S0 / 0S1 / 10
Is this grammar SLR(1) or not? [10+6]

3. (a) Construct FA equivalent to following regular expression: (1+01+001)*(ε+0+00)


(b) What are the applications of FA? Explain in detail. [8+8]

4. (a) Consider following grammar & identify the type of subexpression. Use type
error as a type expression in error condition.
E literal/num/id/EmodE/E[E]/ E

(b) What is structural equivalence? Write about structural equivalence of type


expressions? [8+8]

5. (a) Compare and contrast various storage allocation strategies.


(b) Consider following pseudo program and find the result if the arguments are
passed by call-by- value, call by reference & call by value result.
begin int a
proc p(b); int b
begin b=b+1; print(b,a) end
a=1
p(a)
print(a)
end. [8+8]

6. (a) Compare and contrast the quadruples, triples & indirect triples.

(b) What is the significance of syntax- directed definition. [8+8]

7. (a) Give the applications of DAG.


(b) Generate code for the following C statements:
i. x=++f(a)
ii.p++=q++. [8+8]

8. (a) Write the algorithm for predictive parsing.


(b) Explain error recovery in predictive parsing. [8+8]

