
Chapter 6: Semantic Analysis

(Static) Semantic Analyzer


==> Semantic Structure
- What is the program supposed to do?
- Semantic analysis can be done during the syntax analysis phase or the final code generation phase.
- Typical static semantic features include declarations and type checking.
- Information (attributes) gathered can be either added to the tree as annotations or entered into the symbol table.

Output of the semantic analyzer: annotated AST

Two Categories of Semantic Analysis

1. The analysis of a program to meet the definition of the programming language.

2. The analysis of a program to enhance the efficiency of execution of the translated program.

Semantic Analysis Process

Formally, the process includes:
- a description of the analyses to perform
- an implementation of the analysis (a translation of the description), which may use appropriate algorithms.

Description of Semantic Analysis


1. Identify attributes (properties) of language (syntactic) entities.

2. Write attribute equations (or semantic rules) that express how the computation of such attributes is related to the grammar rules of the language.

Such a set of attributes and equations is called an attribute grammar.

Syntax-directed semantics

The semantic content of a program is closely related to its syntax.

All modern languages have this property.

Attributes
An attribute is any property of a programming
language construct.
- Typical examples of attributes are:
the data type of a variable, the value of an
expression, the location of a variable in memory,
the object code of a procedure, the number of
significant digits in a number.
- An attribute corresponds to the name of a field of a structure.

Attribute Grammars

In syntax-directed semantics, attributes are associated with grammar symbols of the language. That is, if X is a grammar symbol and a is an attribute associated with X, then we write X.a for the value of a associated with X.

For each grammar rule X0 -> X1 X2 ... Xn, the values of the attributes Xi.aj of each grammar symbol Xi are related to the values of the attributes of the other grammar symbols in the rule.

That is, each relationship is specified by an attribute equation or semantic rule of the form:
Xi.aj = fij(X0.a1, ..., X0.ak, X1.a1, ..., X1.ak, ..., Xn.a1, ..., Xn.ak)

An attribute grammar for the attributes a1, ..., ak is the collection of all such attribute equations (semantic rules), for all the grammar rules of the language.

number.val must be computed prior to factor.val.

Attribute grammars may involve several interdependent attributes.

based-num -> num basechar
basechar -> o | d
num -> num digit | digit
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

e.g. 345o (valid octal number)
128d (valid decimal number)
128o (x) (invalid: 8 is not an octal digit)
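The based-number grammar can be evaluated with two interdependent attributes: a base that flows down from basechar and a val that is built up from the digits. The following is a minimal sketch in Python, not the slides' own implementation; the function names digit_val, num_val and based_num_val are hypothetical, and the parse is done by simple string slicing rather than by a real parser.

```python
def digit_val(ch, base):
    """digit.val, checked against the inherited attribute base."""
    v = int(ch)
    if v >= base:
        raise ValueError(f"digit {ch} is not valid in base {base}")
    return v


def num_val(digits, base):
    """num -> num digit | digit, with num.val = num1.val * base + digit.val."""
    val = 0
    for ch in digits:
        val = val * base + digit_val(ch, base)
    return val


def based_num_val(text):
    """based-num -> num basechar: basechar fixes base, which is passed down to num."""
    digits, basechar = text[:-1], text[-1]
    base = {"o": 8, "d": 10}[basechar]   # basechar -> o | d
    return num_val(digits, base)
```

With these definitions, based_num_val("345o") yields 229 and based_num_val("128o") raises an error, because the digit 8 is rejected once base 8 has been passed down.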

Attribute grammars may be defined for different purposes.

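term1.tree = mkOpNode(*, term2.tree, factor.tree)
factor.tree = mkNumNode(number.lexval)

[Figure: syntax tree for (34 - 3) * 42. The root is *, its left child is - with leaves 34 and 3, and its right child is the leaf 42.]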
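The two tree-building rules can be tried out directly. This is a small Python sketch under the assumption that a tree node is either a number leaf or an operator node with two children; mk_num_node and mk_op_node mirror mkNumNode and mkOpNode, while the evaluate helper and the class names are additions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class NumNode:
    value: int


@dataclass
class OpNode:
    op: str
    left: object
    right: object


def mk_num_node(lexval):
    """factor.tree = mkNumNode(number.lexval)"""
    return NumNode(lexval)


def mk_op_node(op, left, right):
    """term1.tree = mkOpNode(*, term2.tree, factor.tree)"""
    return OpNode(op, left, right)


def evaluate(node):
    """Postorder walk: evaluate both children, then apply the operator."""
    if isinstance(node, NumNode):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    return {"+": left + right, "-": left - right, "*": left * right}[node.op]


# Tree for (34 - 3) * 42, built bottom-up as the semantic rules would build it.
tree = mk_op_node("*", mk_op_node("-", mk_num_node(34), mk_num_node(3)), mk_num_node(42))
```

Because the tree attribute of a node depends only on the tree attributes of its children, the nodes can be created in the same order in which a bottom-up parser performs its reductions.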

Algorithms for attribute computation


Dependency graph and evaluation order

Attribute grammar for simple C-like variable declarations


Grammar Rule                   Semantic Rules
decl -> type var-list          var-list.dtype = type.dtype
type -> int                    type.dtype = integer
type -> float                  type.dtype = real
var-list1 -> id , var-list2    id.dtype = var-list1.dtype
                               var-list2.dtype = var-list1.dtype
var-list -> id                 id.dtype = var-list.dtype

Parse tree for the string float x , y

decl
  type (dtype = real)
    float
  var-list (dtype = real)
    id (x) (dtype = real)
    ,
    var-list (dtype = real)
      id (y) (dtype = real)

[Figure: the same parse tree for the string float x , y, with its dependency graph drawn in; a "trivial dependency" label appears next to the type node.]
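The way dtype travels from the type node down through the var-list nodes to each id can be sketched in a few lines of Python. This is an illustrative sketch only: decl_dtypes is a hypothetical helper, the input is a pre-split token list, and the recursive inner function plays the role of the var-list rules.

```python
def decl_dtypes(tokens):
    """decl -> type var-list: compute dtype for every id in the declaration."""
    type_kw, *rest = tokens
    dtype = {"int": "integer", "float": "real"}[type_kw]   # type.dtype
    table = {}

    def var_list(names, inherited):
        # id.dtype = var-list.dtype
        table[names[0]] = inherited
        if len(names) > 1:
            # var-list2.dtype = var-list1.dtype
            var_list(names[1:], inherited)

    # var-list.dtype = type.dtype
    var_list([t for t in rest if t != ","], dtype)
    return table
```

For the string float x , y this returns real for both x and y, matching the annotated parse tree. The value is passed from parent to child and between siblings, never from child to parent.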


base is computed in preorder and val in postorder.

Synthesized Attributes
An attribute a is synthesized if, given a grammar rule
A -> X1 X2 ... Xn, the only associated attribute equation
with an a on the left-hand side is of the form:
A.a = f(X1.a1, ..., X1.ak, ..., Xn.a1, ..., Xn.ak)
e.g., E1 -> E2 + E3
{ E1.val = E2.val + E3.val; }
where E.val represents the attribute (the numerical value obtained) for E.
An attribute grammar in which all the attributes are synthesized is called an S-attributed grammar.

S-attributed grammar example:

term1.tree = mkOpNode(*, term2.tree, factor.tree)
factor.tree = mkNumNode(number.lexval)

[Figure: syntax tree for (34 - 3) * 42, built bottom-up from the leaves 34, 3 and 42.]

Inherited Attributes

An attribute that is not synthesized is called an inherited attribute.

Preorder / Preorder & Inorder traversal

Computation of Attributes During Parsing

L-attributed grammars

Computing Synthesized Attributes During LR Parsing

-- LALR(1) parsers are primarily suited to handling synthesized attributes.
-- Two stacks are required: a value stack and a parsing stack.

E : E '+' E { $$ = $1 + $3; } /* Yacc specification */

[Figure: the parsing stack and the value stack, side by side.]

Translation (Attribute Computation)

A translation scheme is merely a context-free grammar in which a program fragment called a semantic action is associated with each production.

e.g. A -> XYZ { semantic action }

In a bottom-up parser, the semantic action is taken when XYZ is reduced to A. In a top-down parser, the action is taken when A, X, Y, or Z is expanded, whichever is appropriate.

Semantic Action

In addition to those stated before, the semantic action may also involve:
1. the computation of values for variables belonging to the compiler.
2. the generation of intermediate code.
3. the printing of an error diagnostic.
4. the placement of some values in the symbol table.

Bottom-up Translation of S-attributed Grammars
- A bottom-up parser uses a stack to hold information about subtrees that have been parsed. We can use extra fields in the parser stack to hold the values of synthesized attributes.
e.g. A -> XYZ { A.a = f(X.x, Y.y, Z.z) }
- Before reduction: the value of the attribute Z.z is in val[top], Y.y is in val[top-1], and X.x is in val[top-2].
- After reduction: top is decremented by 2, and A.a is put in val[top].
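The stack discipline described above can be simulated without a real LR parser. The sketch below is written in Python under simplifying assumptions: reduce_rule and parse_sum are hypothetical names, only the value stack is modelled, and a reduction by E -> E + E is triggered as soon as the pattern E + E sits on top of the stack (left-associative addition).

```python
def reduce_rule(val, rhs_len, action):
    """Pop the rhs_len topmost attribute values and push the value of the left-hand side."""
    args = val[len(val) - rhs_len:]      # val[top-rhs_len+1] .. val[top]
    del val[len(val) - rhs_len:]
    val.append(action(*args))            # for rhs_len = 3, top ends up 2 lower


def parse_sum(tokens):
    """Shift tokens and reduce E -> E + E with { $$ = $1 + $3 } on the value stack."""
    val = []
    for tok in tokens:
        # shift: numbers carry a value, the '+' token carries none
        val.append(tok if isinstance(tok, int) else None)
        if len(val) >= 3 and val[-2] is None and isinstance(val[-1], int):
            reduce_rule(val, 3, lambda e1, _plus, e3: e1 + e3)
    return val
```

For the token list [2, "+", 3, "+", 4] the value stack goes through [2, None, 3], [5], [5, None, 4] and ends as [9]. Each reduction reads val[top-2] and val[top] and leaves the result in the new val[top].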

For Special Conditions: Hook

stmt -> IF cond THEN stmt ELSE stmt
==>
stmt -> IF cond THEN { action to emit appropriate conditional jump } stmt
        ELSE { action to emit appropriate unconditional jump } stmt
or
hook1 -> { action to emit appropriate conditional jump }
hook2 -> { action to emit appropriate unconditional jump }
stmt -> IF cond THEN hook1 stmt ELSE hook2 stmt

Symbol Table

A symbol table consists of records that associate attributes with various programmer-declared objects. The main attribute is the object's name (a string of characters, e.g. an identifier).
Semantic actions put information into the symbol table or take attributes out of the symbol table.

What kinds of objects?

1. variables
2. components of a composite structure (i.e., field names of a structure)
3. labels
4. procedure and function names
5. parameters of procedures and functions
6. files

What attributes (attributes of objects)?

1. name
2. type (e.g. int, float, array, struct, pointer to struct, etc.)
3. location (for variables) and entry point
4. value (for named constants)
5. initial value (for variables)
6. flag showing whether it has been accessed

How is an attribute represented?

1. Name. Two strategies:
(a) use a field of n characters in the symbol table record to store the first (up to) n characters of the identifier.
(b) use an auxiliary string table, and store a pointer (e.g. 5) to the first character of the identifier and the length (e.g. 4) of the identifier in the "name" field of the symbol table record.

Scheme (a) is simpler and faster, and requires less programming.

Scheme (b) allows arbitrarily long names and saves space, but requires more programming. Often such a string table is already kept for string literals and can be reused with little extra programming.

2. Type. Since types can be arbitrarily complex, they are best represented by a pointer to a linked data structure that reflects the structure of the type. (The type is mainly used to check that the semantics are correct and to compute offsets.)

Def. The static scope of an occurrence of an identifier is that portion of the (source) program in which other occurrences of the same identifier represent the same object.

e.g. in Pascal, the scope is the procedure or function in which the identifier was declared, minus all sub-procedures and sub-functions in which it is redeclared.

program P;
  procedure Q;
    var x: real;
    procedure R;
      var x: integer;
    begin
      x := ...        { refers to the integer x of R }
    end;
  begin
    x := x + 1        { refers to the real x of Q }
  end;
begin
end.

The scope of the x declared in Q is the whole of Q minus R.

Symbol table mechanism?

1. What should be done when translating a declaration?
2. What should be done when an identifier is referenced?
3. What should be done at scope entry?
4. What should be done at scope exit?

Multiscope symbol table

- A descriptor is a record that describes an object.
- Its fields are called attributes.
- Its key contains an identifier together with a context (a context is a block or a declaration, represented by a lexical number).

e.g.

float x;          /* another (outer) context */
struct y {
    int x;        /* inner context */
    int z;
    .
}

Each context is associated with a number (#); an identifier such as x is paired with that # to look it up in the symbol table.
- When we enter a context (at compile time, not run time), we give it a new # and push it onto the context stack (it becomes the current context).
- When we exit a context, we pop that # from the stack.
- When resolving a reference to a simple identifier, say x, we pair it with the current context and look it up in the symbol table. If it is not there, we try the next context on the stack, and so on, until it is found or we run out of contexts.
- When declaring an object, we allocate a descriptor for it, put the current context into its context field and the identifier into its identifier field, and then fill in the other attributes as appropriate. A record declaration additionally gets an internal context for its fields.

e.g.

float x;          /* context #2 */
struct y {
    int x;        /* context #3 */
    int z;
    .
}

y has #2 in its context field and #3 in its internal-context field. Look up y using context #2. Look up its field x with context #3.
- When resolving a reference to a qualified identifier (e.g. student.grad), we look up the struct as before. Upon finding it, we get its internal-context attribute (which is a #) and look up the field with that context to find the descriptor for that object.
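The four operations (scope entry, scope exit, declaration, reference) fit into a small class. The following Python sketch assumes that descriptors are dictionaries keyed by the pair (identifier, context #); the class name SymbolTable, its method names and the attribute name internal_context are illustrative choices, and the context numbers it hands out start at 1 rather than matching the #2 and #3 of the example.

```python
class SymbolTable:
    def __init__(self):
        self.table = {}            # (identifier, context #) -> descriptor
        self.next_context = 1
        self.stack = []            # context stack; the top is the current context
        self.enter_context()       # outermost context

    def enter_context(self):
        number = self.next_context
        self.next_context += 1
        self.stack.append(number)
        return number

    def exit_context(self):
        return self.stack.pop()

    def declare(self, name, **attrs):
        desc = {"name": name, "context": self.stack[-1], **attrs}
        self.table[(name, self.stack[-1])] = desc
        return desc

    def lookup(self, name):
        """Try the current context first, then each enclosing context."""
        for ctx in reversed(self.stack):
            if (name, ctx) in self.table:
                return self.table[(name, ctx)]
        return None

    def lookup_field(self, struct_name, field):
        """Qualified identifier: use the struct's internal context as the key."""
        struct = self.lookup(struct_name)
        if struct is None:
            return None
        return self.table.get((field, struct["internal_context"]))


st = SymbolTable()
st.declare("x", type="float")
y = st.declare("y", type="struct")
y["internal_context"] = st.enter_context()   # context for the fields of y
st.declare("x", type="int")
st.declare("z", type="int")
inner_x_type = st.lookup("x")["type"]        # resolved inside the struct
st.exit_context()
```

After the struct has been closed, a plain reference to x finds the float, while the qualified reference y.x still reaches the int field through the internal context stored in the descriptor of y.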

Consider the following basic programming-language constructs for generating intermediate code:
1. declarations
2. arithmetic assignment operations
3. Boolean expressions
4. flow-of-control statements (if-statement, while)
5. array references
6. procedure calls
7. switch statements
8. structure-type references

Semantic Actions for different language constructs
1. Declarations
e.g. int x, y, z;
float w, z, s;

Suggested grammar:
(Note: This is a very simple grammar mainly
used for explanation.)
P -> M D ;
M -> /* empty string */
D -> D , id
| int id
| float id

[Figure: parse tree for int x , y ; in which D -> int id is reduced first and D -> D , id second.]

(Syntax-directed) Translation

P -> M D ;     { /* do nothing */ }

M ->           { if offset was not initialized then offset = 0; }

D -> int id    { enter(id.name, int, offset);
                 /* a function that enters type int and the given offset
                    into the symbol-table entry for id.name */
                 D.type = int;
                 offset = offset + 4;   /* bytes, width of int */
                 D.offset = offset; }

D -> float id  { enter(id.name, float, offset);
                 D.type = float;
                 offset = offset + 8;   /* bytes, width of float */
                 D.offset = offset; }

D -> D(1) , id { enter(id.name, D(1).type, D(1).offset);
                 D.type = D(1).type;
                 if D(1).type == int then D.offset = D(1).offset + 4;
                 else if D(1).type == float then D.offset = D(1).offset + 8;
                 offset = D.offset; }

Note: We can construct a data structure to store the information (attributes) of D (i.e., D.type and D.offset).
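The effect of these actions on a single declaration can be reproduced with a short Python sketch. It assumes the widths used in the scheme (4 bytes for int, 8 for float); translate_decl is a hypothetical name, and the loop stands in for the sequence of reductions D -> type id followed by D -> D(1) , id.

```python
WIDTH = {"int": 4, "float": 8}   # widths in bytes, as used in the scheme above


def translate_decl(type_name, ids):
    """Return the symbol-table entries and the final offset for one declaration."""
    symtab = {}
    offset = 0                                 # M -> empty: offset = 0
    for name in ids:
        symtab[name] = (type_name, offset)     # enter(id.name, type, offset)
        offset += WIDTH[type_name]             # D.offset = offset after this id
    return symtab, offset
```

For int x, y, z this records x at offset 0, y at 4 and z at 8, and leaves offset at 12. Every id is entered at the moment it is reduced, because the type is already known from D(1).type.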

Avoided grammar:
D -> int namelist ; | float namelist ;
namelist -> id , namelist | id

[Figure: parse tree for int x ; in which id is reduced to namelist first.]

Why?
When the 'id' is reduced to namelist, we cannot yet know the type of 'id' (int or float?). Therefore, it is troublesome to enter the type information into the corresponding field of the 'id' in the symbol table. Hence, we must use a special coding technique (e.g. a linked list keeping the ids' names, as pointers into the symbol table) to achieve this. (In other words, we need backpatching to chain the data type.)

Acceptable grammar:
D -> int intlist ; | float floatlist ;
intlist -> id , intlist | id
floatlist -> id , floatlist | id

Advantage: The above-mentioned problem will not happen. That is, when 'id' is reduced, we can identify the type of id. (If id is reduced to intlist, then id is of type int.)
Defect: too many productions => too many states => bad performance.

How to handle the following declaration?

x, y, z : float

Two approaches:
(I) decl -> id_list ':' type
id_list -> id_list ',' id | id
type -> int | float
(II) decl -> id ':' type | id ',' decl
type -> int | float

Which one is better for LR parsing? Why?

Suggested grammar for the following declaration:

var x, y, z : real; u, v, t : integer;

declarations : VAR decl_list
             | /* empty (no declaration is permitted) */
             ;
decl_list    : declaration ';'
             | declaration ';' decl_list
             ;
declaration  : ID ':' type
             | ID ',' declaration
             ;
type         : REAL
             | INTEGER
             ;

Try to construct a parse tree for the following declaration and see how to parse it:

var x: real; y: integer;

declarations
  VAR (var)
  decl_list
    declaration
      ID (x)
      :
      type (real)
    ;
    decl_list
      declaration
        ID (y)
        :
        type (integer)
      ;

The following grammar for declarations is difficult for attribute gathering.

declaration : id_list ':' type ;
id_list     : ID
            | id_list ',' ID
type        : REAL
            | INTEGER

e.g.,
[Figure: parse tree of a declaration with three IDs of type REAL. The IDs are reduced to id_list one after another before the type is seen.]

Intermediate Code Generation

Three Address Code <-> (Two Address Code => Triples)

Quadruples (a collective data structure; each unit has 4 fields):

Operator   Arg1   Arg2   Result

Each row of the table is one unit. Example operator entries: =+, =-, =*, =/, =%, []=, =[], ...

Note: The entries of the Operator column are integers that represent individual operators. The entries of Arg1 (operand 1), Arg2 (operand 2) and Result are indexes (pointers) into the symbol table.

Kinds of three-address code:

1. A = B op(1) C (op(1) is a binary arithmetic or logical operation)
2. A = op(2) B (op(2) is a unary operation, e.g. minus, negation, shift operators, conversion operator, identity operator)
3. goto L (unconditional jump; execute the L-th three-address code)
4. if A relop B goto L (relop denotes a relational operator, e.g. <, ==, >, >=, !=, etc.)
5. param A and call P, n (used to implement a procedure call)
6. A = B[i]
7. A[i] = B
8. A = &B
9. A = *B
10. *A = B

In quadruples:

Kind   Operator    Arg1   Arg2   Result
1.     op(1)       B      C      A
2.     op(2)       B             A
3.     goto                      L
4.     relopgoto   A      B      L
5.     param       A
5.     call        P      n
6.     =[]         B      i      A
7.     []=         B      i      A
8.     =&          B             A
9.     =*          B             A
10.    *=          B             A

Example: the generated three-address code for two assignments.

D = A + B * C          D = A * B + C
T1 = B * C             T1 = A * B
T2 = A + T1            T2 = T1 + C
D = T2                 D = T2

Quadruples for D = A * B + C:

Operator   Arg1   Arg2   Result
=*         A      B      T1
=+         T1     C      T2
=          T2            D

* T1 and T2 are compiler-generated temporary variables, and they are also saved in the symbol table.

Actually, in an implementation the quadruples look like this:

Operator   Arg1   Arg2   Result
8          6      7      9
15         9      8      11
3          11            10

Symbol table:

index   identifier   attributes
0       twa
1       K
..      ..
6       A
7       B
8       C
9       T1           /* compiler-generated temporary variable */
10      D
11      T2           /* compiler-generated temporary variable */
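The step from symbolic quadruples to the integer form can be written as a lookup in two tables. This Python sketch takes the operator codes (3, 8, 15) and the symbol-table indexes (6 to 11) from the example above; the dictionary names and the encode function are hypothetical, and only the entries that the example actually shows are included.

```python
OPCODE = {"=": 3, "=*": 8, "=+": 15}                                 # codes from the example
SYMTAB = {"A": 6, "B": 7, "C": 8, "T1": 9, "D": 10, "T2": 11}        # name -> index


def encode(quads):
    """Replace operator names by their codes and operand names by symbol-table indexes."""
    return [
        (OPCODE[op], SYMTAB[arg1], SYMTAB.get(arg2), SYMTAB[result])
        for op, arg1, arg2, result in quads
    ]


# D = A * B + C in symbolic form; None marks an unused Arg2 field.
symbolic = [
    ("=*", "A", "B", "T1"),
    ("=+", "T1", "C", "T2"),
    ("=", "T2", None, "D"),
]
encoded = encode(symbolic)
```

The encoded list reproduces the three integer rows of the table, which is one way to read the numeric form back into T1 = A * B, T2 = T1 + C and D = T2.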

2. Arithmetic Statements

A -> id = E
E -> E(1) + E(2)
E -> E(1) - E(2)
E -> E(1) * E(2)
E -> E(1) / E(2)
E -> - E(1)
E -> (E(1))
E -> id

[Figure: parse tree for x = a + b, which is translated into the following code.]
T1 = a + b
x = T1

A -> id = E
{ GEN(id.place = E.place); }
/* GEN(argument) - a function that saves its argument into the quadruples. E is implemented as a data structure with one field, E.place, which holds the symbol-table index of the name that holds the value of E. */

E -> E(1) + E(2)
{ T = NEWTEMP();
  /* NEWTEMP() - a function that generates a temporary variable T, saves T into the symbol table, and returns its symbol-table index. */
  E.place = T;   /* T's index in the symbol table is assigned to E.place */
  GEN(E.place = E(1).place + E(2).place); }   /* e.g. T = a + b */

E -> E(1) * E(2)
{ T = NEWTEMP();
  E.place = T;
  GEN(E.place = E(1).place * E(2).place); }

E -> - E(1)
{ T = NEWTEMP();
  E.place = T;
  GEN(E.place = -E(1).place); }

E -> (E(1))
{ E.place = E(1).place; }

E -> id
{ E.place = id.place; }
/* The index of id is stored in E's field 'place'. In an implementation, id.place refers to the index of id in the symbol table. */

Enhanced version for E -> E(1) op E(2)

/* In this version, E is implemented as an array of structs holding E's attributes; the array is indexed like the value stack. */
{ T = NEWTEMP();
  E.place = T;
  if E(1).type == int and E(2).type == int then
  { GEN(T = E(1).place intop E(2).place);
    E.type = int;
  }
  else if E(1).type == float and E(2).type == float then
  { GEN(T = E(1).place floatop E(2).place);
    E.type = float;
  }
  else if E(1).type == int and E(2).type == float then
  { U = NEWTEMP();
    GEN(U = inttofloat E(1).place);
    GEN(T = U floatop E(2).place);
    E.type = float;
  }
  else /* E(1).type == float and E(2).type == int */
  { U = NEWTEMP();
    GEN(U = inttofloat E(2).place);
    GEN(T = E(1).place floatop U);
    E.type = float;
  }
}
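The type-checking version of the rule can be exercised with a small generator. This is a Python sketch and not the slides' implementation: the class Gen, the function binop and the textual form of the emitted code (for example "int+" and "float*" for intop and floatop) are assumptions made for illustration, and an expression attribute is modelled as a (place, type) pair.

```python
class Gen:
    """Collects three-address code and hands out temporaries T1, T2, ..."""

    def __init__(self):
        self.code = []
        self.ntemp = 0

    def newtemp(self):
        self.ntemp += 1
        return f"T{self.ntemp}"

    def gen(self, text):
        self.code.append(text)


def binop(g, op, e1, e2):
    """E -> E(1) op E(2); returns (E.place, E.type)."""
    (p1, t1), (p2, t2) = e1, e2
    t = g.newtemp()
    if t1 == "int" and t2 == "int":
        g.gen(f"{t} = {p1} int{op} {p2}")
        return (t, "int")
    if t1 == "int":                      # int op float: convert the left operand
        u = g.newtemp()
        g.gen(f"{u} = inttofloat {p1}")
        p1 = u
    elif t2 == "int":                    # float op int: convert the right operand
        u = g.newtemp()
        g.gen(f"{u} = inttofloat {p2}")
        p2 = u
    g.gen(f"{t} = {p1} float{op} {p2}")
    return (t, "float")


# x = a + b * c with a, b of type int and c of type float
g = Gen()
product = binop(g, "*", ("b", "int"), ("c", "float"))
total = binop(g, "+", ("a", "int"), product)
g.gen(f"x = {total[0]}")
```

The two mixed-type branches differ only in which operand is converted, so the sketch folds them into one final float operation. The four cases of the original rule are still distinguishable by which conversion, if any, is emitted.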

3. Boolean Expression
M -> /* empty string */
E -> E or M E
| E and M E
| not E
| ( E )
| id
| id relop id

An example:
if p < q || r < s && t < u
    x = y + z;
k = m - n;
For the above boolean expression, the corresponding contents of the quadruples are:

Quadruples (counter starts at 100) for

if p < q || r < s && t < u
    x = y + z;
k = m - n;

Location   Three-Address Code
100        if p < q goto 106
101        goto 102
102        if r < s goto 104
103        goto 108
104        if t < u goto 106
105        goto 108          /* 105 is on the S.next list */
106        t1 = y + z
107        x = t1
108        t2 = m - n
109        k = t2
...        ...

[Figure: parse tree of the condition, E -> E or M E, whose right operand is E -> E and M E, over three id < id leaves; the reductions are numbered 1 to 7.]

NEXTQUAD: an integer variable that holds the index (location) of the next available entry of the quadruples.
E.true: an attribute of E that holds a set of indexes (locations) of quadruples; each indexed quadruple holds a jump that is taken when the boolean expression is true.
E.false: an attribute of E that holds a set of indexes of quadruples; each indexed quadruple holds a jump that is taken when the boolean expression is false.
GEN(x): a function that translates x (a kind of three-address code) into quadruple representation.

So, we need to construct a data structure for E that includes two fields, each of which can hold an unlimited number of integers. Meanwhile, we need an array of these E structures to store the attributes of several Es that are in use at the same time.

1. M -> { M.quad = NEXTQUAD; }
/* M.quad is a data structure associated with M */

2. E -> E(1) or M E(2)
{
BACKPATCH(E(1).false, M.quad);
E.true = MERGE(E(1).true, E(2).true);
E.false = E(2).false;
}
/* BACKPATCH(p, i): a function that makes each of the quadruples on the list pointed to by p take quadruple i as its target (i.e., goto i). */
/* MERGE(a, b): a function that takes the lists pointed to by a and b, concatenates them into one list, and returns a pointer to the concatenated list. */

3. E -> E(1) and M E(2)
{
BACKPATCH(E(1).true, M.quad);
E.true = E(2).true;
E.false = MERGE(E(1).false, E(2).false);
}

4. E -> not E(1)
{ E.true = E(1).false; E.false = E(1).true; }

5. E -> ( E(1) )
{ E.true = E(1).true; E.false = E(1).false; }

6. E -> id
{
E.true = MAKELIST(NEXTQUAD);
E.false = MAKELIST(NEXTQUAD + 1);
GEN(if id.place goto _);
GEN(goto _);
}
/* MAKELIST(i): a function that creates a list containing i, an index into the array of quadruples, and returns a pointer to the list it has made. */
/* GEN(x): a function that translates x (a kind of three-address code) into quadruple representation. */

7. E -> id(1) relop id(2)
{
E.true = MAKELIST(NEXTQUAD);
E.false = MAKELIST(NEXTQUAD + 1);
GEN(if id(1).place relop id(2).place goto _);
GEN(goto _);
}
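[Figure: if NEXTQUAD = 20 before the reduction, quadruple 20 becomes "if id(1).place relop id(2).place goto _" and quadruple 21 becomes "goto _"; E.true = {20}, E.false = {21}, and NEXTQUAD becomes 22.]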
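The rules for or, and and relop can be checked against the example if p < q || r < s && t < u. The following Python sketch simulates the semantic actions by hand in reduction order; the class Quads and the helper names relop, e_or and e_and are hypothetical, lists of locations stand in for the pointer-based lists, and an unfilled target is written as "goto _" as in the rules.

```python
class Quads:
    def __init__(self, start=100):
        self.start = start
        self.code = []                       # code[i] is the quadruple at start + i

    @property
    def nextquad(self):                      # NEXTQUAD
        return self.start + len(self.code)

    def gen(self, text):                     # GEN
        self.code.append(text)

    def backpatch(self, locations, target):  # BACKPATCH
        for loc in locations:
            i = loc - self.start
            self.code[i] = self.code[i].replace("goto _", f"goto {target}")


def relop(q, a, op, b):
    """E -> id(1) relop id(2)"""
    e = {"true": [q.nextquad], "false": [q.nextquad + 1]}
    q.gen(f"if {a} {op} {b} goto _")
    q.gen("goto _")
    return e


def e_or(q, e1, m_quad, e2):
    """E -> E(1) or M E(2)"""
    q.backpatch(e1["false"], m_quad)
    return {"true": e1["true"] + e2["true"], "false": e2["false"]}


def e_and(q, e1, m_quad, e2):
    """E -> E(1) and M E(2)"""
    q.backpatch(e1["true"], m_quad)
    return {"true": e2["true"], "false": e1["false"] + e2["false"]}


q = Quads()
e1 = relop(q, "p", "<", "q")
m1 = q.nextquad                              # M.quad before r < s
e2 = relop(q, "r", "<", "s")
m2 = q.nextquad                              # M.quad before t < u
e3 = relop(q, "t", "<", "u")
e = e_or(q, e1, m1, e_and(q, e2, m2, e3))

q.backpatch(e["true"], q.nextquad)           # if E then M S(1): true exits enter the body
q.gen("t1 = y + z")
q.gen("x = t1")
q.backpatch(e["false"], q.nextquad)          # false exits go to the statement after the if
```

Running the actions in this order fills locations 100 to 107 exactly as in the table of quadruples for this example: the true list {100, 104} is patched to 106 and the false list {103, 105} to 108.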

4. Flow-of-Control Statements
A. Conditional Statements
S -> if E then S else S
| if E then S
| A
| begin L end
L -> S
| L ; S
/* A denotes a general assignment statement
L denotes a statement list
S denotes a statement
*/

1. S -> if E then M(1) S(1) N else M(2) S(2)
{
BACKPATCH(E.true, M(1).quad);
BACKPATCH(E.false, M(2).quad);
S.next = MERGE(S(1).next, N.next, S(2).next);
}
/* S.next is a pointer to a list of all conditional and unconditional jumps (gotos) to the quadruple following the statement S in execution order. */

2. S -> if E then M S(1)
{
BACKPATCH(E.true, M.quad);
S.next = MERGE(E.false, S(1).next);
}

3. M -> { M.quad = NEXTQUAD; }

4. N ->
{
N.next = MAKELIST(NEXTQUAD);
GEN(goto _);
}
[Figure: if NEXTQUAD = 20, quadruple 20 becomes "goto _" and N.next = {20}.]
5. S -> A
{ S.next = MAKELIST(); }
/* initialize S.next to an empty list */

6. L -> S { L.next = S.next; }

7. L -> L(1) ; M S
{
BACKPATCH(L(1).next, M.quad);   /* resolves all quadruples with an unresolved conditional or unconditional goto _ */
L.next = S.next;
}

8. S -> begin L end { S.next = L.next; }

B. Iterative Statement
S -> while E do S
9. S -> while M(1) E do M(2) S(1)
{
BACKPATCH (E.true, M(2).quad);
BACKPATCH (S(1).next, M(1).quad);
S.next = E.false;
GEN (goto M(1).quad);
}


An example:
while (A<B) do if (C<D) then X = Y + Z;

Index   Three-Address Code
100     if (A<B) goto 102
101     goto __    // will be resolved (filled with 107) later
102     if (C<D) goto 104
103     goto 100
104     T = Y + Z
105     X = T
106     goto 100
107     ..

[Figure: parse tree of the statement, with E marking the condition A<B and the reductions numbered 1 to 4.]
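The while example can be replayed by applying rules 2, 5 and 9 by hand. This Python sketch keeps the quadruples in a plain list; the helper names nextquad, gen, backpatch and cond are hypothetical stand-ins for NEXTQUAD, GEN, BACKPATCH and the relop rule, and the final backpatch to 107 is what an enclosing statement list would do once the next statement starts.

```python
START = 100
code = []                                    # code[i] is the quadruple at START + i


def nextquad():
    return START + len(code)


def gen(text):
    code.append(text)


def backpatch(locations, target):
    for loc in locations:
        code[loc - START] = code[loc - START].replace("goto _", f"goto {target}")


def cond(a, op, b):
    """E -> id relop id: returns (E.true, E.false)."""
    lists = ([nextquad()], [nextquad() + 1])
    gen(f"if {a} {op} {b} goto _")
    gen("goto _")
    return lists


# while (A<B) do if (C<D) then X = Y + Z
m1 = nextquad()                              # M(1).quad = 100
e_true, e_false = cond("A", "<", "B")        # quadruples 100, 101
m2 = nextquad()                              # M(2).quad = 102
c_true, c_false = cond("C", "<", "D")        # quadruples 102, 103
m = nextquad()                               # M.quad = 104
gen("T = Y + Z")
gen("X = T")
a_next = []                                  # rule 5: S -> A has an empty next list

backpatch(c_true, m)                         # rule 2: if E then M S(1)
s1_next = c_false + a_next

backpatch(e_true, m2)                        # rule 9: while M(1) E do M(2) S(1)
backpatch(s1_next, m1)
s_next = e_false
gen(f"goto {m1}")

backpatch(s_next, nextquad())                # enclosing list: exit of the loop is 107
```

The false exit of the inner if (quadruple 103) is sent back to the loop test at 100, because the statement following the if inside the loop body is the next iteration.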

5. Array References
Addressing Array Elements
one-dimension: A[low..high]
two-dimension: A[low1..high1, low2..high2]
n-dimension: A[low1..high1, low2..high2, ... , lown..highn]
Let: base = address of the beginning of A,
w = width of an array element, and
ni = the number of array elements in the i-th dimension /* row major */
(e.g. n1 = high1 - low1 + 1; n2 = high2 - low2 + 1; n3 = high3 - low3 + 1; ...)

A[i] has address:
base + (i - low) * w = i * w + (base - low * w),
where base - low * w is a compile-time invariant.

A[i1, i2] has address (row-major):
base + ((i1 - low1) * n2 + (i2 - low2)) * w
= (i1 * n2 + i2) * w + base - (low1 * n2 + low2) * w,
where base - (low1 * n2 + low2) * w is a compile-time invariant.

A[i1, i2, i3] has address:
base + ((i1 - low1) * n2 * n3 + (i2 - low2) * n3 + (i3 - low3)) * w
= base + (((i1 - low1) * n2 + (i2 - low2)) * n3 + (i3 - low3)) * w
= ((i1 * n2 + i2) * n3 + i3) * w + base - ((low1 * n2 + low2) * n3 + low3) * w,
where base - ((low1 * n2 + low2) * n3 + low3) * w is a compile-time invariant.

In general, A[i1, i2, ..., ik] has address:
((...((i1 * n2 + i2) * n3 + i3) * n4 + ...) * nk + ik) * w
+ base - ((...((low1 * n2 + low2) * n3 + low3) ...) * nk + lowk) * w,
where base - ((...((low1 * n2 + low2) * n3 + low3) ...) * nk + lowk) * w is a compile-time invariant.

Therefore, we can compute as follows:
e1 = i1
e2 = e1 * n2 + i2
e3 = e2 * n3 + i3
...
em = e(m-1) * nm + im
...
ek = e(k-1) * nk + ik

The address of A[i1, i2, ..., ik] is: ek * w + the compile-time invariant.
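The recurrence can be checked numerically. The sketch below, in Python, assumes that the bounds are given as a list of (low, high) pairs and the subscripts as a list of integers; the function name address and its parameter names are illustrative. The same recurrence is applied to the lower bounds to obtain the compile-time invariant.

```python
def address(base, w, bounds, index):
    """Row-major address of A[index] for an array with the given (low, high) bounds."""
    n = [high - low + 1 for low, high in bounds]   # n[m] = elements in dimension m+1
    e = index[0]                                   # e1 = i1
    c = bounds[0][0]                               # same recurrence on the lows
    for m in range(1, len(index)):
        e = e * n[m] + index[m]                    # em = e(m-1) * nm + im
        c = c * n[m] + bounds[m][0]
    invariant = base - c * w                       # known at compile time
    return e * w + invariant                       # ek * w + invariant
```

For A[1..3, 1..4] with base 1000 and w = 4, the element A[2, 3] lies 6 elements past the start, so the function returns 1024. Only e has to be computed at run time; the invariant depends on the declaration alone.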

Translation Scheme for Addressing Array Elements

Assume:
(1) for each id there exists id.place, which holds its name,
(2) there is a function limit() where limit(array_name, m) = nm, i.e., the number of elements of array array_name in the m-th dimension,
(3) we can find the width of an array element from the name of the array (i.e. from the symbol table).
// Please read Section 8.3.2 (Array References) of the textbook.

(1) S -> L = E
{ if L.offset == null then
      GEN(L.place = E.place)
  else
      GEN(L.place[L.offset] = E.place);
}

(2) E -> E1 + E2
{ E.place = newtemp();   // generate a temporary variable and save its symbol-table index
  GEN(E.place = E1.place + E2.place);
}

(3) E -> (E(1)) { E.place = E(1).place }

(4) E -> L
{ if L.offset == null then
      E.place = L.place
  else
  {   E.place = newtemp();
      GEN(E.place = L.place[L.offset]); }
}

(5) L -> Elist ]
{ L.place = Elist.array_name;
  L.offset = newtemp();
  GEN(L.offset = w * Elist.place); }
/* w is known from the declaration of the array */

(6) L -> id { L.place = id.place; L.offset = null }

(7) Elist -> Elist(1) , E
{ T = newtemp(); m = Elist(1).ndimen + 1;
  GEN(T = Elist(1).place * limit(Elist(1).array_name, m));
  GEN(T = T + E.place);
  Elist.array_name = Elist(1).array_name;
  Elist.place = T;
  Elist.ndimen = m; }
/* note: em = e(m-1) * nm + im, where Elist.place = em, Elist(1).place = e(m-1), limit(Elist(1).array_name, m) = nm, and E.place = im */

(8) Elist -> id [ E
{ Elist.place = E.place; Elist.ndimen = 1;
  Elist.array_name = id.place; }
// the compile-time invariant is associated with id.place

6. Procedure calls
1. call -> id ( args )
2. args -> args , E
3. args -> E

1. call -> id ( args )
{ for each item p on QUEUE do
      GEN(param p);
  GEN(call id.place, length of QUEUE); }
/* QUEUE is a data structure for saving the indexes of the symbol table entries containing the names of the arguments. The length of QUEUE is the number of elements in QUEUE. */

2. args -> args , E
{ append E.place to the end of QUEUE; }

3. args -> E
{ initialize QUEUE to contain only E.place; }
/* Originally, QUEUE is empty and, after the reduction of E to args, QUEUE contains a single pointer to the symbol table location for the name that denotes the value of E. */
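The three actions amount to collecting the argument places in order and emitting them just before the call. A minimal Python sketch follows, assuming the argument expressions have already been translated and only their places are passed in; translate_call is a hypothetical name and the emitted text format is illustrative.

```python
def translate_call(proc, arg_places):
    """Emit param/call code for proc(arg1, ..., argn) from the places of the arguments."""
    queue = []
    for i, place in enumerate(arg_places):
        if i == 0:
            queue = [place]          # args -> E: QUEUE holds only E.place
        else:
            queue.append(place)      # args -> args , E: append E.place
    code = [f"param {p}" for p in queue]          # call -> id ( args )
    code.append(f"call {proc}, {len(queue)}")
    return code
```

For a call f(T1, b, T2) this produces three param lines followed by call f, 3. Keeping the places in a queue delays the param instructions until all argument expressions have been evaluated.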

7. Structure Declarations (Read Sec. 8.3.3)

type -> struct { fieldlist }   /* Note: symbols in bold face are terminals */
| ptr
| char
| int
| float
| double

e.g.
struct { int x;        // offset 0
         float y;      // offset 2
         char k[10];   // offset 6
} m;
m.width = 16 bytes

fieldlist -> fieldlist field ;
| field ;
field -> type id               e.g. int x
| field [ integer ]            e.g. int x [10] or int x [10] [20] [30]
/* integer is a token denoting any string of digits */

field -> type id
{ field.width = type.width;
  field.name = id.name;
  W_enter(id.name, type.width); }
| field(1) [ integer ]
{ field.width = field(1).width * integer.val;
  field.name = field(1).name;
  D_enter(field(1).name, integer.val); }

fieldlist -> field ;
{ O_enter(field.name, 0); fieldlist.width = field.width; }
| fieldlist(1) field ;
{ fieldlist.width = fieldlist(1).width + field.width;
  O_enter(field.name, fieldlist(1).width); }

type -> struct '{' fieldlist '}' { type.width = fieldlist.width; }

type -> char { type.width = 1; }   /* Assume characters take one byte. */

type -> ptr { type.width = 4; }    /* Assume pointers take four bytes. */

type -> int { type.width = 2; }    /* Assume integers take two bytes. */

.......

Definitions of functions used

D_enter(name, size) increases the number of dimensions for name by one and enters the last dimension as size in the symbol table entry for name.
W_enter(name, width) enters width as the width of each element of name. If name is not an array, then its width is the number of locations taken by data of name's type.
O_enter(name, offset) makes offset the number for which field name name stands. This information, also, is recorded in the symbol table entry for name.
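The width and offset rules can be tested on the struct example. This Python sketch is an illustration only: layout is a hypothetical function, a field is given as a (type, name, dimensions) triple, and the float width of 4 bytes is an assumption inferred from the offsets in the example (y at 2, k at 6), since the rule for float is elided in the scheme.

```python
# char, int and ptr widths come from the rules above; float = 4 is inferred (assumption).
WIDTH = {"char": 1, "int": 2, "ptr": 4, "float": 4}


def layout(fields):
    """Return per-field symbol-table data and the total width of the struct."""
    table, struct_width = {}, 0
    for type_name, name, dims in fields:
        field_width = WIDTH[type_name]            # field -> type id
        for size in dims:                         # field -> field(1) [ integer ]
            field_width *= size
        table[name] = {
            "offset": struct_width,               # O_enter(name, fieldlist(1).width)
            "width": WIDTH[type_name],            # W_enter(name, type.width)
            "dims": list(dims),                   # D_enter(name, integer.val), once per dimension
        }
        struct_width += field_width               # fieldlist.width
    return table, struct_width


table, width = layout([("int", "x", []), ("float", "y", []), ("char", "k", [10])])
```

Under these assumptions the sketch reproduces the figures of the example: x at offset 0, y at offset 2, k at offset 6, and a total width of 16 bytes.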

8. Switch Statement
Syntax:

switch E
{
case V1: S1;
case V2: S2;
.............
.............
case Vn-1: Sn-1;
default: Sn;
}

When translated into three-address code:

100   code to evaluate E into T   /* T is a temporary variable */
101   if T != V1 goto 104
102   code for S1
103   goto 113
104   if T != V2 goto 107
105   code for S2
106   goto 113
107   ...
108   ...
109   if T != Vn-1 goto 112
110   code for Sn-1
111   goto 113
112   code for Sn
113   .

Based on the given translation example, you can easily infer how to generate the three-address code for a switch statement.
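One way to generate this layout is sketched below in Python. It is not the slides' method: instead of backpatching, it uses two passes, first computing the location of every test and of the exit, then emitting the code. The function name translate_switch and its parameters are hypothetical, and each case body is passed in as a list of already generated three-address instructions.

```python
def translate_switch(eval_code, cases, default_code, start=100):
    """cases is a list of (value, body) pairs; returns {location: instruction}."""
    # Pass 1: locations of each test, of the default body, and of the exit.
    loc = start + len(eval_code)
    test_locs = []
    for _value, body in cases:
        test_locs.append(loc)
        loc += 1 + len(body) + 1             # test, body, goto exit
    default_loc = loc
    exit_loc = default_loc + len(default_code)

    # Pass 2: emit the code; a failed test jumps to the next test (or the default).
    code = list(eval_code)
    next_tests = test_locs[1:] + [default_loc]
    for (value, body), next_test in zip(cases, next_tests):
        code.append(f"if T != {value} goto {next_test}")
        code.extend(body)
        code.append(f"goto {exit_loc}")
    code.extend(default_code)
    return {start + i: text for i, text in enumerate(code)}


listing = translate_switch(
    ["T = E"],
    [("V1", ["code for S1"]), ("V2", ["code for S2"])],
    ["code for S3"],
)
```

With two cases and one-instruction bodies, the tests land at 101 and 104, the default body at 107 and the exit at 108, which follows the same pattern as the listing above.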
