Compiler Construction
Parsing
Outline
Top-down vs. bottom-up
Top-down parsing
Recursive-descent parsing
LL(1) parsing
LL(1) parsing algorithm
First and follow sets
Constructing LL(1) parsing table
Error recovery
Bottom-up parsing
Shift-reduce parsers
LR(0) parsing
LR(0) items
Finite automata of items
LR(0) parsing algorithm
LR(0) grammar
SLR(1) parsing
SLR(1) parsing algorithm
SLR(1) grammar
Parsing conflict
Introduction
Parsing is a process that constructs a
syntactic structure (i.e. parse tree) from
the stream of tokens.
We already learned how to describe the
syntactic structure of a language using
(context-free) grammar.
So, a parser only needs to do this?
Stream of tokens + Context-free grammar → Parser → Parse tree
Top-down parsing (leftmost derivation)
E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id
Bottom-up parsing (rightmost derivation, built in reverse)
E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ id + id * id
Top-down Parsing
What does a parser need to decide?
Which
How to guess?
What is the guess based on?
What
Reserved
What
If
Top-down Parsing
Why is it difficult?
Cannot
Next token: if
Structure to be built: St
St → MatchedSt | UnmatchedSt
UnmatchedSt
MatchedSt
Production
Next
par
token: id
parList |
parList
Recursive-Descent
Write one procedure for each set of
productions with the same nonterminal
in the LHS
Each procedure recognizes a structure
described by a nonterminal.
A procedure calls other procedures if it
needs to recognize other structures.
A procedure calls match procedure if it
needs to recognize a terminal.
Recursive-Descent: Example
E → E O F | F
O → + | -
F → ( E ) | id
Match procedure
procedure match(expTok)
{
    if (token == expTok)
    then getToken
    else error
}
The token is not consumed until
getToken is executed.
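The rules above (one procedure per nonterminal, plus a match procedure for terminals) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the grammar is taken to be E → E O F | F, O → + | -, F → ( E ) | id, the left-recursive rule for E is rewritten in EBNF as E → F { O F }, and the names `Parser`, `ParseError`, and `parse` are hypothetical.

```python
class ParseError(Exception):
    """Raised when the token stream does not fit the grammar."""


class Parser:
    def __init__(self, tokens):
        # "$" is appended as an end-of-input marker (an assumption of this sketch).
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def token(self):
        # The current token; it is not consumed until get_token is called.
        return self.tokens[self.pos]

    def get_token(self):
        self.pos += 1

    def match(self, expected):
        # Recognize one terminal, as in the match procedure on the slide.
        if self.token == expected:
            self.get_token()
        else:
            raise ParseError(f"expected {expected!r}, found {self.token!r}")

    def parse_E(self):
        # E -> F { O F }   (EBNF form of the left-recursive rule E -> E O F | F)
        node = self.parse_F()
        while self.token in ("+", "-"):
            op = self.parse_O()
            right = self.parse_F()
            node = (op, node, right)  # left-associative tree
        return node

    def parse_O(self):
        # O -> + | -   (the second alternative is assumed)
        op = self.token
        if op not in ("+", "-"):
            raise ParseError(f"expected an operator, found {op!r}")
        self.match(op)
        return op

    def parse_F(self):
        # F -> ( E ) | id
        if self.token == "(":
            self.match("(")
            node = self.parse_E()
            self.match(")")
            return node
        self.match("id")
        return "id"


def parse(tokens):
    parser = Parser(tokens)
    tree = parser.parse_E()
    parser.match("$")  # all input must be consumed
    return tree
```

For example, `parse(["id", "+", "(", "id", "-", "id", ")"])` builds a nested tuple for the expression, while an incomplete input such as `["("]` raises `ParseError`.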
Problems in Recursive-Descent
Difficult to convert grammars into EBNF
Cannot decide which production to use
at each point
Cannot decide when to use an ε-production A → ε
LL(1) Parsing
LL(1)
Read
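Example: LL(1) parsing
Grammar
E → T X
X → A T X | ε
A → + | -
T → F N
N → M F N | ε
M → *
F → ( E ) | n
Input: ( n + ( n ) ) * n $
(Figure: contents of the parsing stack and the parse tree built step by step; the parse ends with "Finished".)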
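A table-driven LL(1) parser for this example could look roughly like the sketch below. The grammar is assumed to be E → T X, X → A T X | ε, A → + | -, T → F N, N → M F N | ε, M → *, F → ( E ) | n (the second alternative of A is not clearly legible on the slide and is taken to be -). The table is filled in by hand from that grammar, and the names `TABLE`, `NONTERMINALS`, and `ll1_parse` are hypothetical.

```python
NONTERMINALS = {"E", "X", "A", "T", "N", "M", "F"}

# (nonterminal on top of stack, next token) -> right-hand side to push.
# An empty list stands for an ε-production.
TABLE = {
    ("E", "("): ["T", "X"], ("E", "n"): ["T", "X"],
    ("X", "+"): ["A", "T", "X"], ("X", "-"): ["A", "T", "X"],
    ("X", ")"): [], ("X", "$"): [],
    ("A", "+"): ["+"], ("A", "-"): ["-"],
    ("T", "("): ["F", "N"], ("T", "n"): ["F", "N"],
    ("N", "*"): ["M", "F", "N"],
    ("N", "+"): [], ("N", "-"): [], ("N", ")"): [], ("N", "$"): [],
    ("M", "*"): ["*"],
    ("F", "("): ["(", "E", ")"], ("F", "n"): ["n"],
}


def ll1_parse(tokens):
    """Return the productions used, in the order of a leftmost derivation."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]  # end marker at the bottom, start symbol on top
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        tok = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, tok))
            if rhs is None:
                raise SyntaxError(f"no table entry for ({top}, {tok})")
            used.append((top, tuple(rhs)))
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == tok:
            pos += 1  # terminal (or $) matched: consume the token
        else:
            raise SyntaxError(f"expected {top!r}, found {tok!r}")
    return used
```

Calling `ll1_parse("( n + ( n ) ) * n".split())` runs the parse of the input string from the example; an input such as `["n", "+"]` raises `SyntaxError` because no table entry exists for T on `$`.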
X ⇒* tY
or
X ⇒* ε and S ⇒* WNtY
(Figure: parse trees over N, X, Q, Y, and token t.)
First Set
Let X be ε or be in V or T.
First(X) is the set of terminals that can appear
first in a sentential form derived from X.
If
If A is a terminal or ε, then First(A) = {A}.
If A is a nonterminal, then for each rule A → X1 X2 ... Xn, First(A) contains First(X1) - {ε}.
If also for some i < n, First(X1), First(X2), ..., and First(Xi) contain ε, then First(A) contains First(Xi+1) - {ε}.
If First(X1), First(X2), ..., and First(Xn) contain ε, then First(A) also contains ε.
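The rules for First sets can be applied repeatedly until nothing changes. The following is a minimal sketch of that iteration. The grammar encoding (a dict from nonterminal to a list of right-hand sides, with an empty list for ε), the function name `first_sets`, and the example grammar for exp, exp', addop, term, term', mulop, and factor are assumptions made for illustration.

```python
EPS = "ε"


def first_sets(grammar):
    """grammar: dict nonterminal -> list of right-hand sides (lists of symbols).
    An empty right-hand side denotes an ε-production."""
    first = {a: set() for a in grammar}

    def first_of(symbol):
        # A terminal is its own First set.
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:  # repeat until no First set grows any more
        changed = False
        for a, rules in grammar.items():
            for rhs in rules:
                before = len(first[a])
                all_nullable = True
                for x in rhs:
                    fx = first_of(x)
                    first[a] |= fx - {EPS}
                    if EPS not in fx:
                        all_nullable = False
                        break  # later symbols cannot contribute
                if all_nullable:
                    first[a].add(EPS)  # every Xi can derive ε (or rhs is empty)
                if len(first[a]) != before:
                    changed = True
    return first


# Assumed example grammar (an expression grammar without left recursion).
GRAMMAR = {
    "exp": [["term", "exp'"]],
    "exp'": [["addop", "term", "exp'"], []],
    "addop": [["+"], ["-"]],
    "term": [["factor", "term'"]],
    "term'": [["mulop", "factor", "term'"], []],
    "mulop": [["*"]],
    "factor": [["(", "exp", ")"], ["num"]],
}

FIRST = first_sets(GRAMMAR)
```

With this grammar, `FIRST["exp"]` is the set containing `(` and `num`, and `FIRST["exp'"]` contains `+`, `-`, and ε.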
First (example, sets found so far)
exp
exp′
addop     + -
term      ( num
term′
mulop     *
factor    ( num
Follow Set
Let $ denote the end of input tokens
If A is the start symbol, then $ is in
Follow(A).
If there is a rule B → X A Y, then First(Y)
- {ε} is in Follow(A).
If there is a production B → X A Y and ε is
in First(Y), then Follow(A) contains
Follow(B).
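The three Follow rules can likewise be iterated to a fixed point. The sketch below assumes the First sets are already known and supplies them by hand for an assumed example grammar over exp, exp', addop, term, term', mulop, and factor. The helper `first_of_string` and the function `follow_sets` are hypothetical names.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a sequence of symbols, given First sets of nonterminals."""
    result = set()
    for x in symbols:
        fx = first.get(x, {x})  # a terminal is its own First set
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)  # every symbol can vanish (or the sequence is empty)
    return result


def follow_sets(grammar, first, start):
    follow = {a: set() for a in grammar}
    follow[start].add("$")  # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for b, rules in grammar.items():
            for rhs in rules:
                for i, a in enumerate(rhs):
                    if a not in grammar:
                        continue  # Follow is defined for nonterminals only
                    fy = first_of_string(rhs[i + 1:], first)
                    new = fy - {EPS}  # rule 2: First(Y) - {ε}
                    if EPS in fy:
                        new |= follow[b]  # rule 3: Y can vanish
                    if not new <= follow[a]:
                        follow[a] |= new
                        changed = True
    return follow


# Assumed example grammar and its First sets (entered by hand).
GRAMMAR = {
    "exp": [["term", "exp'"]],
    "exp'": [["addop", "term", "exp'"], []],
    "addop": [["+"], ["-"]],
    "term": [["factor", "term'"]],
    "term'": [["mulop", "factor", "term'"], []],
    "mulop": [["*"]],
    "factor": [["(", "exp", ")"], ["num"]],
}
FIRST = {
    "exp": {"(", "num"}, "exp'": {"+", "-", EPS}, "addop": {"+", "-"},
    "term": {"(", "num"}, "term'": {"*", EPS}, "mulop": {"*"},
    "factor": {"(", "num"},
}

FOLLOW = follow_sets(GRAMMAR, FIRST, "exp")
```

Note that when A is the last symbol of a rule, Y is empty and rule 3 applies, so Follow(A) receives everything in Follow(B).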
If A is the start symbol, then $ is in Follow(A).
If there is a rule A → Y X Z, then First(Z) - {ε} is in Follow(X).
If there is a production B → X A Y and ε is in First(Y), then Follow(A) contains Follow(B).
Nonterminal   First      Follow
exp           ( num      $ )
exp′          + - ε      $ )
addop         + -        ( num
term          ( num      + - ) $
term′         * ε        + - ) $
mulop         *          ( num
factor        ( num      * + - ) $
(Table: LL(1) parsing table for these nonterminals over the input tokens, with production numbers as entries.)
LL(1) Grammar
A grammar is an LL(1) grammar if its
LL(1) parsing table has at most one
production in each table entry.
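The definition can be checked mechanically once First and Follow sets are available: put each production A → α under every token in First(α), and under every token in Follow(A) when α can derive ε, then look for entries with more than one production. The sketch below does this for an assumed expression grammar with hand-entered First and Follow sets; `build_table`, `is_ll1`, and `first_of_string` are hypothetical names.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a sequence of symbols, given First sets of nonterminals."""
    result = set()
    for x in symbols:
        fx = first.get(x, {x})  # a terminal is its own First set
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result


def build_table(grammar, first, follow):
    """Map (nonterminal, token) to the list of productions placed there."""
    table = {}
    for a, rules in grammar.items():
        for rhs in rules:
            fs = first_of_string(rhs, first)
            lookaheads = fs - {EPS}
            if EPS in fs:
                lookaheads |= follow[a]  # rhs can vanish: use Follow(A)
            for t in lookaheads:
                table.setdefault((a, t), []).append(rhs)
    return table


def is_ll1(table):
    # LL(1): at most one production in each table entry.
    return all(len(productions) == 1 for productions in table.values())


# Assumed example grammar with hand-entered First and Follow sets.
GRAMMAR = {
    "exp": [["term", "exp'"]],
    "exp'": [["addop", "term", "exp'"], []],
    "addop": [["+"], ["-"]],
    "term": [["factor", "term'"]],
    "term'": [["mulop", "factor", "term'"], []],
    "mulop": [["*"]],
    "factor": [["(", "exp", ")"], ["num"]],
}
FIRST = {
    "exp": {"(", "num"}, "exp'": {"+", "-", EPS}, "addop": {"+", "-"},
    "term": {"(", "num"}, "term'": {"*", EPS}, "mulop": {"*"},
    "factor": {"(", "num"},
}
FOLLOW = {
    "exp": {"$", ")"}, "exp'": {"$", ")"}, "addop": {"(", "num"},
    "term": {"+", "-", ")", "$"}, "term'": {"+", "-", ")", "$"},
    "mulop": {"(", "num"}, "factor": {"*", "+", "-", ")", "$"},
}

TABLE = build_table(GRAMMAR, FIRST, FOLLOW)

# A grammar with a common left factor, A -> x y | x z, for contrast.
BAD_TABLE = build_table({"A": [["x", "y"], ["x", "z"]]}, {"A": {"x"}}, {"A": {"$"}})
```

Here `is_ll1(TABLE)` holds, while `BAD_TABLE` has two productions in the entry for A and x, so that grammar is not LL(1).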
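(Table: parsing table with columns ( ) + - * num $ and production numbers as entries. Entries such as 1,2 and 3,4 hold two productions, so this grammar is not LL(1).)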
Left Recursion
Immediate left recursion
A → A X | Y    (A = Y X*)
is rewritten as A → Y A′, A′ → X A′ | ε
A → A X1 | A X2 | ... | A Xn | Y1 | Y2 | ... | Ym
is rewritten as A → Y1 A′ | Y2 A′ | ... | Ym A′, A′ → X1 A′ | X2 A′ | ... | Xn A′ | ε
General left recursion
A ⇒ X ⇒* A Y
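Removing immediate left recursion is a purely mechanical rewrite, which the following sketch illustrates. It handles only the immediate case (not general left recursion), represents each right-hand side as a list of symbols with an empty list for ε, and names the new nonterminal by appending a prime. The function name is hypothetical.

```python
def remove_immediate_left_recursion(a, rules):
    """Rewrite A -> A X1 | ... | A Xn | Y1 | ... | Ym as
    A -> Y1 A' | ... | Ym A'  and  A' -> X1 A' | ... | Xn A' | ε.

    rules: list of right-hand sides (lists of symbols) for nonterminal a.
    Returns a dict from nonterminal to its new list of right-hand sides.
    """
    recursive = [rhs[1:] for rhs in rules if rhs and rhs[0] == a]  # the Xi
    others = [rhs for rhs in rules if not rhs or rhs[0] != a]      # the Yj
    if not recursive:
        return {a: rules}  # nothing to do
    new = a + "'"  # assumed not to clash with an existing nonterminal
    return {
        a: [y + [new] for y in others],
        new: [x + [new] for x in recursive] + [[]],  # [] is the ε-production
    }
```

For example, applying it to `exp` with the rules `exp addop term` and `term` yields `exp → term exp'` and `exp' → addop term exp' | ε`.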
Good News!!!!
Never
Left Factoring
A common left factor makes a grammar non-LL(1)
Given A → X Y | X Z. Both A → X Y and A → X Z can be chosen when A is on top of the stack and a token in First(X) is the next token.
A → X Y | X Z
can be left-factored as
A → X A′ and A′ → Y | Z
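Left factoring can be sketched as a single pass that groups the alternatives of a nonterminal by their first symbol and pulls out the longest common prefix of each group. This is a simplified illustration, not a complete algorithm: it performs one pass only, and the names `common_prefix` and `left_factor` and the priming scheme for new nonterminals are assumptions.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol lists."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(a, rules):
    """One pass of left factoring for nonterminal a.

    rules: list of right-hand sides (lists of symbols).
    Returns a dict from nonterminal to its list of right-hand sides.
    """
    groups = {}
    for rhs in rules:
        key = rhs[0] if rhs else None  # group alternatives by first symbol
        groups.setdefault(key, []).append(rhs)

    result = {a: []}
    suffix = "'"
    for key, alternatives in groups.items():
        if key is None or len(alternatives) == 1:
            result[a].extend(alternatives)  # nothing to factor
            continue
        prefix = common_prefix(alternatives)
        new = a + suffix  # assumed not to clash with an existing name
        suffix += "'"
        result[a].append(prefix + [new])
        # What remains after the prefix; [] would be an ε-production.
        result[new] = [rhs[len(prefix):] for rhs in alternatives]
    return result
```

Applied to A → X Y | X Z, the sketch returns A → X A' and A' → Y | Z, matching the rewrite shown above.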
Bottom-up Parsing
Use an explicit stack to perform a parse
Simulate a rightmost derivation (R) in reverse while reading
the input from left (L) to right, thus called LR parsing
More powerful than top-down parsing
Left
Two actions
Shift:
Example of Shift-reduce Parsing
Grammar
S′ → S
S → ( S ) S | ε
The right sentential forms, read from top to bottom, are the reverse of a rightmost derivation; the parsing actions read the input from left to right.
Step  Right sentential form  Stack      Input     Action
1     ( ( ) )                $          ( ( ) ) $ shift
2     ( ( ) )                $ (        ( ) ) $   shift
3     ( ( ) )                $ ( (      ) ) $     reduce S → ε
4     ( ( S ) )              $ ( ( S    ) ) $     shift
5     ( ( S ) )              $ ( ( S )  ) $       reduce S → ε
6     ( ( S ) S )            $ ( ( S ) S  ) $     reduce S → ( S ) S
7     ( S )                  $ ( S      ) $       shift
8     ( S )                  $ ( S )    $         reduce S → ε
9     ( S ) S                $ ( S ) S  $         reduce S → ( S ) S
10    S                      $ S        $         accept
In this trace, the stack contents at each step are a viable prefix, and the symbols on top of the stack that a reduce action replaces are the handle.
Terminologies
Right sentential form
sentential form in a rightmost derivation
Examples: ( S ) S, ( ( S ) S )
Viable prefix
sequence of symbols on the parsing stack
Examples: ( S ) S, ( S ), ( S, (
( ( S ) S, ( ( S ), ( ( S, ( (, (
Handle
Examples: ( S ) S . with S → ε
( ( S ) S . ) with S → ( S ) S
LR(0) item
production with distinguished position in its RHS
Examples: S → ( S ) S .
S → ( S ) . S
S → ( S . ) S
S → ( . S ) S
S → . ( S ) S
Shift-reduce parsers
There are two possible actions: shift and reduce
LR(0) parsing
Keep track of what is left to be done in
the parsing process by using finite
automata of items
An item A → w . B y means:
LR(0) items
LR(0) item
Initial Item
Complete Item
Closure Item of x
Kernel Item
Grammar
S′ → S
S → ( S ) S
S → ε
Items:
S′ → .S
S′ → S.
S → .(S)S
S → (.S)S
S → (S.)S
S → (S).S
S → (S)S.
S → .
(Figure: finite automaton whose states are these items.)
(Figure: finite automaton of LR(0) items for the grammar S′ → S, S → ( S ) S | ε, with each state holding a set of items.)
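Grammar: A′ → A, A → ( A ) | a
State 0: A′ → .A, A → .(A), A → .a
State 1: A′ → A.
State 2: A → a.
State 3: A → (.A), A → .(A), A → .a
State 4: A → (A.)
State 5: A → (A).
Transitions
0 --A--> 1
0 --a--> 2
0 --(--> 3
3 --(--> 3
3 --a--> 2
3 --A--> 4
4 --)--> 5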
Stack         Input     Action
$0            ((a))$    shift
$0(3          (a))$     shift
$0(3(3        a))$      shift
$0(3(3a2      ))$       reduce
$0(3(3A4      ))$       shift
$0(3(3A4)5    )$        reduce
$0(3A4        )$        shift
$0(3A4)5      $         reduce
$0A1          $         accept
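The trace above can be reproduced with a small LR(0) driver. The sketch below hard-codes the automaton for the grammar A′ → A, A → ( A ) | a with states 0 to 5 as numbered in the trace. The table names `TRANS` and `REDUCE`, the function `lr0_parse`, and the choice to keep only state numbers on the stack are assumptions made for this illustration.

```python
# (state, symbol) -> next state, for terminals (shift) and nonterminals (goto).
TRANS = {
    (0, "("): 3, (0, "a"): 2, (0, "A"): 1,
    (3, "("): 3, (3, "a"): 2, (3, "A"): 4,
    (4, ")"): 5,
}
# States holding a complete item: state -> (left-hand side, length of RHS).
REDUCE = {2: ("A", 1), 5: ("A", 3)}  # A -> a.   and   A -> (A).
ACCEPT = 1                            # state holding A' -> A.


def lr0_parse(tokens):
    """Return the list of actions taken; raise SyntaxError on bad input."""
    tokens = list(tokens) + ["$"]
    stack = [0]  # stack of state numbers; grammar symbols are implied
    pos = 0
    actions = []
    while True:
        state = stack[-1]
        if state == ACCEPT:
            if tokens[pos] != "$":
                raise SyntaxError(f"unexpected {tokens[pos]!r} after a complete parse")
            actions.append("accept")
            return actions
        if state in REDUCE:
            # LR(0): reduce without looking at the next token.
            lhs, length = REDUCE[state]
            del stack[-length:]                       # pop the handle
            stack.append(TRANS[(stack[-1], lhs)])     # goto on the LHS
            actions.append("reduce")
        else:
            nxt = TRANS.get((state, tokens[pos]))
            if nxt is None:
                raise SyntaxError(f"unexpected {tokens[pos]!r} in state {state}")
            stack.append(nxt)
            pos += 1
            actions.append("shift")
```

Running `lr0_parse("((a))")` goes through the same sequence of shift, reduce, and accept actions as the table, since a Python string iterates character by character.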
Non-LR(0) Grammar
Conflict
Shift-reduce conflict
A state contains a complete item A → x. and a shift item A → x.B y
Reduce-reduce conflict
A state contains more than one complete item.
A grammar is an LR(0) grammar if there is no conflict in the grammar.
Example: S′ → S, S → ( S ) S | ε
State 0: S′ → .S, S → .(S)S, S → .
State 1: S′ → S.
State 2: S → (.S)S, S → .(S)S, S → .
State 3: S → (S.)S
State 4: S → (S).S, S → .(S)S, S → .
State 5: S → (S)S.
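Whether a grammar is LR(0) can be checked by building the sets of items and inspecting each state. The sketch below does this for S′ → S, S → ( S ) S | ε using the usual closure and goto construction. An item is encoded as a pair (production index, dot position), a shift item is taken to be one with a terminal right after the dot, and all function names are hypothetical. The states come out in discovery order, so their numbers need not match the figure.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("(", "S", ")", "S")),
    ("S", ()),  # S -> ε
]
NONTERMINALS = {"S'", "S"}


def closure(items):
    """Add X -> .w for every nonterminal X that appears right after a dot."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)


def goto(items, symbol):
    """Move the dot over symbol in every item that allows it."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_states():
    states = [closure({(0, 0)})]
    transitions = {}
    for state in states:  # the list grows while it is being traversed
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


def conflicts(state):
    """Kinds of LR(0) conflict present in one state."""
    complete = [(p, d) for p, d in state if d == len(GRAMMAR[p][1])]
    shifts = [(p, d) for p, d in state
              if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] not in NONTERMINALS]
    kinds = []
    if complete and shifts:
        kinds.append("shift-reduce")
    if len(complete) > 1:
        kinds.append("reduce-reduce")
    return kinds


STATES, TRANSITIONS = build_states()
CONFLICT_STATES = [i for i, s in enumerate(STATES) if conflicts(s)]
```

For this grammar the construction yields six states, and every state that contains both S → . and S → .(S)S reports a shift-reduce conflict, so the grammar is not LR(0).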
SLR(1) parsing
Simple LR with 1 lookahead symbol
Examine the next token before deciding
to shift or reduce
If
SLR(1) grammar
Conflict
Shift-reduce
A
conflict
Reduce-reduce
A
conflict
Grammar: A′ → A, A → ( A ) | a
State 0: A′ → .A, A → .(A), A → .a
State 1: A′ → A.
State 2: A → a.
State 3: A → (.A), A → .(A), A → .a
State 4: A → (A.)
State 5: A → (A).
Grammar: S′ → S, S → ( S ) S | ε
(Figure: DFA of LR(0) items for this grammar, states 0 to 5.)
Shift-reduce conflict
Prefer
In
Reduce-reduce conflict
Error in design
Dangling Else
Grammar: S′ → S, S → I | other, I → if S | if S else S
(Figure: DFA of LR(0) items for this grammar, states 0 to 7, and its parsing table with columns if, else, other, $, S, I; entries are shift (S), reduce (R), and accept actions.)