Compiler Construction Chapter 6

Chapter 6: Semantic Analysis
(Static) Semantic Analyzer

==> Semantic Structure
- What is the program supposed to do?
- Semantic analysis can be done during the syntax analysis
phase or deferred to the final code generation phase.
- Typical static semantic features include declarations and
type checking.
- The information (attributes) gathered can either be added to
the tree as annotations or entered into the symbol table.
- Output of the semantic analyzer: an annotated AST.
Two Categories of Semantic Analysis
1. The analysis of a program to meet the definition of the
programming language.
2. The analysis of a program to enhance the efficiency of
execution of the translated program.
Semantic Analysis Process
Formally, the process includes:
- a description of the analyses to perform
- an implementation of the analysis (a translation of
the description) that may use appropriate
algorithms.
Description of Semantic Analysis
1. Identify attributes (properties) of language (syntactic)
entities.
2. Write attribute equations (or semantic rules) that express
how the computation of such attributes is related to the
grammar rules of the language.
Such a set of attributes and equations is called an
attribute grammar.
Syntax-directed semantics
The semantic content of a program is closely related to its
syntax.
All modern languages have this property.
Attributes
An attribute is any property of a programming
language construct.
- Typical examples of attributes are:
the data type of a variable, the value of an
expression, the location of a variable in memory,
the object code of a procedure, the number of
significant digits in a number.
- An attribute corresponds to the name of a field of a
structure.
Attribute Grammars
In syntax-directed semantics, attributes are associated with
grammar symbols of the language. That is, if X is a grammar
symbol and a is an attribute associated with X, then we write
X.a for the value of a associated with X.
For each grammar rule X0 -> X1 X2 ... Xn, the values of the
attributes Xi.aj of each grammar symbol Xi are related to the
values of the attributes of the other grammar symbols in the rule.
That is, each relationship is specified by an attribute equation
or semantic rule of the form:
Xi.aj = fij (X0.a1, ..., X0.ak, X1.a1, ..., X1.ak, ..., Xn.a1, ..., Xn.ak)
An attribute grammar for the attributes a1, ..., ak is the
collection of all such attribute equations (semantic rules), for
all the grammar rules of the language.
Note: number.val must be computed prior to factor.val.
Attribute grammars may involve several interdependent attributes.
based-num -> num basechar
basechar -> o | d
num -> num digit | digit
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
e.g. 345o (valid, octal)
128d (valid, decimal)
128o (invalid: 8 is not an octal digit)
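The based-num grammar can be evaluated with two interdependent attributes. The sketch below is an illustration, not the slide's own attribute grammar (its semantic rules did not survive in the text): it assumes that basechar yields a base (8 for o, 10 for d), that the base is passed down to num, and that val is accumulated left to right. The function name eval_based_num and the "error" marker are hypothetical.

```python
def eval_based_num(text):
    """Evaluate a based number such as '345o' or '128d'.

    Assumed attributes: base (passed down from basechar) and val
    (built up from the digits). Returns the string "error" when a
    digit is not legal in the given base.
    """
    digits, basechar = text[:-1], text[-1]
    if basechar not in ("o", "d") or not digits:
        raise ValueError(f"not a based-num: {text!r}")
    base = 8 if basechar == "o" else 10      # basechar.base
    val = 0
    for ch in digits:                        # num -> num digit | digit
        if ch not in "0123456789" or int(ch) >= base:
            return "error"                   # e.g. digit 8 in an octal number
        val = val * base + int(ch)           # num1.val = num2.val * base + digit.val
    return val


print(eval_based_num("345o"))   # 3*64 + 4*8 + 5
print(eval_based_num("128d"))
print(eval_based_num("128o"))
```

The value depends on the base, and the base is only known after the last character has been seen, which is why the two attributes cannot be computed in a single left-to-right pass over the digits without first obtaining the base.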
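Attribute grammars may be defined for different purposes.
e.g. building a syntax tree for (34 - 3) * 42:
term1.tree = mkOpNode(*, term2.tree, factor.tree)
factor.tree = mkNumNode(number.lexval)
[Figure: syntax tree for (34 - 3) * 42, with * at the root and the number nodes 34, 3 and 42 as leaves]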
Algorithms for attribute computation
Dependency graph and evaluation order
Attribute grammar for simple C-like variable declarations

Grammar Rule                    Semantic Rules
decl -> type var-list           var-list.dtype = type.dtype
type -> int                     type.dtype = integer
type -> float                   type.dtype = real
var-list1 -> id , var-list2     id.dtype = var-list1.dtype
                                var-list2.dtype = var-list1.dtype
var-list -> id                  id.dtype = var-list.dtype
Parse tree for the string float x , y (attribute values in parentheses):

decl
  type (dtype = real)
    float
  var-list (dtype = real)
    id (x) (dtype = real)
    ,
    var-list (dtype = real)
      id (y) (dtype = real)

[Figure: the same parse tree with dependency arrows; one arrow is labelled "trivial dependency"]
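The dtype rules above can be traced with a short sketch. It assumes the declaration has already been split into tokens; the function name decl_dtypes is hypothetical. The dtype value is obtained once from the type keyword and then handed down the var-list chain, which is what makes it an inherited attribute.

```python
def decl_dtypes(tokens):
    """Map each declared identifier to its dtype.

    tokens is a flat list such as ['float', 'x', ',', 'y'].
    """
    # type -> int | float : type.dtype = integer | real
    type_dtype = {"int": "integer", "float": "real"}[tokens[0]]

    def var_list(toks, dtype):
        # dtype arrives from the parent node (inherited attribute)
        table = {toks[0]: dtype}                      # id.dtype = var-list.dtype
        if len(toks) > 1:
            assert toks[1] == ","
            table.update(var_list(toks[2:], dtype))   # var-list2.dtype = var-list1.dtype
        return table

    # decl -> type var-list : var-list.dtype = type.dtype
    return var_list(tokens[1:], type_dtype)


print(decl_dtypes(["float", "x", ",", "y"]))
```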
Note (based-num example): base is computed in preorder and val in
postorder.
Synthesized Attributes
An attribute a is synthesized if, given a grammar rule
A -> X1 X2 ... Xn, the only associated attribute equation
with an a on the left-hand side is of the form:
A.a = f (X1.a1, ..., X1.ak, ..., Xn.a1, ..., Xn.ak)
e.g., E1 -> E2 + E3
{ E1.val = E2.val + E3.val; }
where E.val represents the attribute (the numerical value
obtained) for E.
An attribute grammar in which all the attributes are
synthesized is called an S-attributed grammar.
[Figure: an S-attributed grammar]
[Figure: syntax tree for (34 - 3) * 42, built with
term1.tree = mkOpNode(*, term2.tree, factor.tree) and
factor.tree = mkNumNode(number.lexval)]
Inherited Attributes
An attribute that is not synthesized is called an inherited
attribute.
Evaluation order: preorder, or combined preorder and inorder,
traversal.
Computation of Attributes During Parsing
L-attributed grammars
Computing Synthesized Attributes During LR Parsing
-- LALR(1) parsers are primarily suited to handling
synthesized attributes.
-- Two stacks are required: a value stack and a parsing stack.
E : E + E { $$ = $1 + $3; } // Yacc specification
[Figure: contents of the parsing stack and the value stack during a parse]
Translation (Attribute Computation)
A translation scheme is merely a context-free grammar in
which a program fragment called a semantic action is
associated with each production.
e.g. A -> XYZ { }
In a bottom-up parser the semantic action is taken when XYZ
is reduced to A. In a top-down parser the action is taken when
A, X, Y, or Z is expanded, whichever is appropriate.
Semantic Action
In addition to those stated before, the semantic action may
also involve:
1. the computation of values for variables
belonging to the compiler.
2. the generation of intermediate code.
3. the printing of an error diagnostic.
4. the placement of some values in the symbol
table.
Bottom-up Translation of S-attributed Grammars
- A bottom-up parser uses a stack to hold information about
subtrees that have been parsed. We can use extra fields in
the parser stack to hold the values of synthesized attributes.
e.g. A -> XYZ {A.a = f (X.x, Y.y, Z.z)}
- Before reduction: the value of the attribute Z.z is in val[top],
Y.y is in val[top-1], and X.x is in val[top-2].
- After reduction: top is decremented by 2, and A.a is put in
val[top].
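The stack discipline above can be sketched as follows. This is a minimal illustration that models only the value stack with a Python list; the helper name reduce_on_value_stack and the sample function f are assumptions, not part of any parser generator.

```python
def reduce_on_value_stack(val, rhs_len, f):
    """Replace the top rhs_len attribute values by f applied to them.

    For A -> X Y Z the values X.x, Y.y, Z.z sit at val[top-2],
    val[top-1], val[top]; afterwards A.a sits at the new top.
    """
    args = val[len(val) - rhs_len:]      # X.x, Y.y, Z.z in left-to-right order
    del val[len(val) - rhs_len:]         # pop the right-hand side
    val.append(f(*args))                 # push A.a
    return val


# Bottom of stack is on the left: an unrelated value 10, then X.x=2, Y.y=3, Z.z=4.
val = [10, 2, 3, 4]
reduce_on_value_stack(val, 3, lambda x, y, z: x + y * z)
print(val)   # the top index dropped from 3 to 1, i.e. it was decremented by 2
```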
For Special Conditions: Hook
stmt -> IF cond THEN stmt ELSE stmt
==>
stmt -> IF cond THEN { action to emit appropriate conditional jump }
        stmt ELSE { action to emit appropriate unconditional jump } stmt
Or
hook1 -> { action to emit appropriate conditional jump }
hook2 -> { action to emit appropriate unconditional jump }
stmt -> IF cond THEN hook1 stmt ELSE hook2 stmt
Symbol Table
A symbol table consists of the records that associate
attributes with various programmer-declared objects. The
main attribute is the object's name (a string of characters,
e.g. an identifier).
Semantic actions put information into the symbol table or
retrieve attributes from it.
What kind of objects?
1. variables
2. components of a composite structure (i.e., field names of a
structure)
3. labels
4. procedure and function names
5. parameters of procedures and functions
6. files
What attributes (attributes of objects)?
1. name
2. type (e.g. int, float, array, struct, a pointer to struct, etc.)
3. location for variables and entry point for procedures
4. value for named constants
5. initial value for variables
6. flag showing whether it has been accessed
How is an attribute represented?
1. Name strategy:
(a) Use a field of n characters in the symbol table record to
store the first (up to) n characters of the identifier.
(b) Use an auxiliary string table, and store a pointer (e.g. 5)
to the first character of the identifier and the length (e.g. 4)
of the identifier in the "name" field of the symbol table record.
Scheme (a) is simpler and faster, and requires less programming.
Scheme (b) allows arbitrarily long names and saves space, but
requires more programming. Often such a string table is already
kept for string literals and can be reused with little extra
programming.
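Scheme (b) can be sketched in a few lines. This is an illustrative layout only; the class name StringTable and its methods are hypothetical. The "name" field of a symbol table record would hold the (start, length) pair returned by add.

```python
class StringTable:
    """Auxiliary string table: all names share one character pool."""

    def __init__(self):
        self.chars = []

    def add(self, name):
        """Append name to the pool and return (start, length)."""
        start = len(self.chars)
        self.chars.extend(name)
        return (start, len(name))

    def get(self, ref):
        """Recover the name from a (start, length) pair."""
        start, length = ref
        return "".join(self.chars[start:start + length])


pool = StringTable()
ref_a = pool.add("alpha")   # occupies positions 0..4
ref_b = pool.add("rate")    # starts at position 5 and has length 4
print(ref_b, pool.get(ref_b))
```

With these two sample names the second reference is the pair (5, 4), matching the pointer and length used as the example in the text.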
2. Type: since a type can be arbitrarily complex, it is best
represented by a pointer to a linked data structure that
reflects the structure of the type. (The type is mainly used to
check semantic correctness and to compute offsets.)
Def. The static scope of an occurrence of an identifier is that
portion of the (source) program in which other occurrences of
the same identifier represent the same object.
e.g. in Pascal, it is the procedure or function in which the
identifier was declared, minus all sub-procedures and
sub-functions in which it is redeclared.
program P;
  procedure Q;
    var x: real;
    procedure R;
      var x: integer;
    begin
      x := ...
    end;
  begin
    x := x + 1
  end;
begin
end.
The scope of the real variable x is the whole of procedure Q minus
procedure R, where x is redeclared as an integer.
Symbol table mechanism?
1. What should be done when translating a declaration?
2. What should be done when translating a reference to an
identifier?
3. What should be done at scope entry?
4. What should be done at scope exit?
Multiscope symbol table
- A descriptor is a record that describes an object.
- Its fields are called attributes.
- Its key consists of an identifier together with a context
(a context is a block or a declaration, represented by a lexical
number).
e.g.
float x;            /* outer context */
struct y {
    int x;          /* inner context */
    int z;
    .
}
Each context is associated with a number (#); an identifier such
as x is paired with that # when it is looked up in the symbol table.
- When we enter a context (at compile time, not run time) we give
it a new # and push that # onto the context stack (it becomes the
current context).
- When we exit a context we pop its # from the stack.
- When resolving a reference to a simple identifier, say x, we pair
it with the current context and look it up in the symbol table; if
it is not there we try the next context on the stack, and so on,
until it is found or we run out of contexts.
- When declaring an object we allocate a descriptor for it, put the
current context into its context field and the identifier into its
identifier field, and then fill in the other attributes as
appropriate. In the case of a record declaration, e.g.
float x;            /* context #2 */
struct y {
    int x;          /* context #3 */
    int z;
    .
}
y has #2 in its context field and #3 in its internal-context field;
the field x has #3 in its context field. Look up y using context #2.
Look up its field x with context #3.
- When resolving a reference to a qualified identifier (e.g.
student.grad), we look up the struct as before; upon finding it,
we get its attribute called internal-context (which is a #)
and look up the field with that context to find the descriptor
for the field.
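The context-stack mechanism can be sketched as follows. This is one possible layout under stated assumptions: the table is a dictionary keyed by (identifier, context #), and the class and method names (ScopedSymbolTable, enter_context, lookup_field, and so on) are hypothetical.

```python
class ScopedSymbolTable:
    """Symbol table keyed by (identifier, context number)."""

    def __init__(self):
        self.table = {}
        self.next_context = 1
        self.stack = []
        self.enter_context()              # context #1: outermost level

    def enter_context(self):
        """Give the new context a fresh # and make it current."""
        number = self.next_context
        self.next_context += 1
        self.stack.append(number)
        return number

    def exit_context(self):
        return self.stack.pop()

    def declare(self, name, **attrs):
        """Allocate a descriptor in the current context."""
        desc = dict(attrs, name=name, context=self.stack[-1])
        self.table[(name, self.stack[-1])] = desc
        return desc

    def lookup(self, name):
        """Try the current context first, then the enclosing ones."""
        for ctx in reversed(self.stack):
            desc = self.table.get((name, ctx))
            if desc is not None:
                return desc
        return None

    def lookup_field(self, struct_name, field):
        """Resolve a qualified identifier such as y.x."""
        struct = self.lookup(struct_name)
        if struct is None:
            return None
        return self.table.get((field, struct["internal_context"]))


st = ScopedSymbolTable()
st.enter_context()                              # context #2
st.declare("x", type="float")
y = st.declare("y", type="struct")
y["internal_context"] = st.enter_context()      # context #3: the struct body
st.declare("x", type="int")
st.declare("z", type="int")
st.exit_context()

print(st.lookup("x")["type"])             # the float x of context #2
print(st.lookup_field("y", "x")["type"])  # the int field x of context #3
```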
Consider the following basic programming-language constructs
for generating intermediate code:
1. declarations
2. arithmetic assignment operations
3. Boolean expressions
4. flow-of-control statements (if-statement, while-statement)
5. array references
6. procedure calls
7. switch statements
8. structure-type references
Semantic Actions for different language constructs
1. Declarations
e.g. int x, y, z;
float w, z, s;
Suggested grammar:
(Note: This is a very simple grammar mainly used for
explanation.)
P -> M D ;
M -> /* empty string */
D -> D , id
| int id
| float id
[Figure: parse tree for int x , y ; with the reductions numbered in order]
(Syntax-directed) Translation
P -> M D ; { /* do nothing */ }
M -> /* empty */
{ if offset was not initialized then offset = 0; }
D -> int id
{ enter (id.name, int, offset);
/* enter() records type int and the given offset in the
symbol table entry for id.name */
D.type = int;
offset = offset + 4; /* bytes, width of int */
D.offset = offset; }
D -> float id
{ enter (id.name, float, offset);
D.type = float;
offset = offset + 8; /* bytes, width of float */
D.offset = offset; }
D -> D(1) , id
{ enter (id.name, D(1).type, D(1).offset);
D.type = D(1).type;
if D(1).type == int then D.offset = D(1).offset + 4;
else if D(1).type == float then D.offset = D(1).offset + 8;
offset = D.offset; }
Note: We can construct a data structure to store the information
(attributes) of D (i.e., D.type and D.offset).
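The effect of these actions on the symbol table can be sketched as follows. The sketch assumes a pre-tokenized declaration and uses the widths from the rules above (4 bytes for int, 8 for float); the name translate_decl and the dictionary used as a symbol table are hypothetical.

```python
WIDTH = {"int": 4, "float": 8}   # widths taken from the rules above


def translate_decl(tokens):
    """Enter each declared name with its type and offset.

    tokens is a flat list such as ['int', 'x', ',', 'y'].
    Returns (symbol table, next free offset).
    """
    symtab = {}
    offset = 0                            # M -> empty : offset = 0
    dtype = tokens[0]                     # D -> int id | float id
    for name in tokens[1::2]:             # the ids sit at positions 1, 3, 5, ...
        symtab[name] = (dtype, offset)    # enter(id.name, type, offset)
        offset += WIDTH[dtype]            # advance by the width of the type
    return symtab, offset


print(translate_decl(["int", "x", ",", "y", ",", "z"]))
print(translate_decl(["float", "w", ",", "s"]))
```

Because the type keyword is reduced first in this grammar, the type is already known when each id is seen, so every name can be entered immediately.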
Avoided grammar:
D -> int namelist ; | float namelist ;
namelist -> id , namelist | id
[Figure: parse tree for int x ;]
Why?
When the 'id' is reduced to namelist, we cannot know the type of
'id' (int or float?) immediately. Therefore, it is troublesome to
enter the type information into the corresponding field of the
'id' in the symbol table. Hence, we must use a special coding
technique (e.g. a linked list keeping the ids' names, i.e. pointers
to the symbol table) to achieve this purpose. (In other words, we
need backpatching to chain the data type.)
Acceptable grammar:
D -> int intlist ; | float floatlist ;
intlist -> id , intlist | id
floatlist -> id , floatlist | id
Advantage: The above-mentioned problem will not happen. That is,
when 'id' is reduced, we can identify the type of id. (If id is
reduced to intlist, then id is of int type.)
Defect: too many productions => too many states => bad
performance.
How to handle the following declaration?
x, y, z : float
Two approaches:
(I) decl -> id_list ':' type
id_list -> id_list ',' id | id
type -> int | float
(II) decl -> id ':' type | id ',' decl
type -> int | float
[Figure: parse trees for both grammars with the reductions numbered in order]
Which one is better for LR parsing? Why?
Suggested grammar for the following declaration:
var x, y, z : real; u, v, t : integer;
declarations : VAR decl_list
             | /* empty (having no declaration is permitted) */
             ;
decl_list    : declaration ';'
             | declaration ';' decl_list
             ;
declaration  : ID ':' type
             | ID ',' declaration
             ;
type         : REAL
             | INTEGER
             ;
Try to construct a parse tree for the following declaration and
see how to parse it:
var x: real; y: integer;

declarations
  VAR (var)
  decl_list
    declaration
      ID (x)
      :
      type (real)
    ;
    decl_list
      declaration
        ID (y)
        :
        type (integer)
      ;
The following grammar for declarations is difficult for attribute
gathering.
declaration : id_list ':' type ;
id_list     : ID
            | id_list ',' ID
            ;
type        : REAL
            | INTEGER
            ;
[Figure: parse tree in which the IDs are reduced to id_list before the type (REAL) is reduced; reductions numbered in order]
Intermediate Code Generation
Three-Address Code <-> (Two-Address Code => Triples)
Quadruples (a collective data structure; each unit has 4 fields)
Operator | Arg1 | Arg2 | Result      <- a unit
Sample operators: =+  =*  =/  =%  []=  =[]  ...
Note: The entries of the operator column are integers that
represent individual operators. The entries of Arg1 (operand 1),
Arg2 (operand 2) and Result are indexes (pointers) into the symbol
table.
Kinds of three-address codes:
1. A = B op(1) C (op(1) is a binary arithmetic or logical operation)
2. A = op(2) B (op(2) is a unary operation, e.g. minus, negation,
shift operators, conversion operator, identity operator)
3. goto L (unconditional jump: execute the L-th three-address code)
4. if A relop B goto L (relop denotes a relational operator, e.g. <,
==, >, >=, !=, etc.)
5. param A and call P, n (used to implement a procedure call)
6. A = B[i]
7. A[i] = B
8. A = &B
9. A = *B
10. *A = B
In Quadruples:
Kind   Operator    Arg1   Arg2   Result
1.     op(1)       B      C      A
2.     op(2)       B             A
3.     goto                      L
4.     relopgoto   A      B      L
5.     param       A
5.     call        P      n
6.     =[]         B      i      A
7.     []=         B      i      A
8.     =&          B             A
9.     =*          B             A
10.    *=          B             A
Example:
D = A + B * C            D = A * B + C
The generated three-address code is:
T1 = B * C               T1 = A * B
T2 = A + T1              T2 = T1 + C
D = T2                   D = T2
Quadruples for D = A * B + C:
Operator   Arg1   Arg2   Result
=*         A      B      T1
=+         T1     C      T2
=          T2            D
* T1 and T2 are compiler-generated temporary variables, and they
are also saved in the symbol table.
Actually, in an implementation the quadruples look like this:
Operator   Arg1   Arg2   Result
8          6      7      9
15         9      8      11
3          11            10
where the symbol table is:
index   identifier   attributes
0       twa
1       K
..      ..
6       A
7       B
8       C
9       T1   /* compiler-generated temporary variable */
10      D
11      T2   /* compiler-generated temporary variable */
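The index-based representation can be sketched as follows. It is a simplified illustration: operands are replaced by symbol table indexes as in the table above, but operators are kept as strings instead of integer codes, and the names Quad, index_of and emit are hypothetical.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")

symtab = []   # position in this list = symbol table index


def index_of(name):
    """Return the symbol table index of name, entering it if needed."""
    if name is None:
        return None
    if name not in symtab:
        symtab.append(name)
    return symtab.index(name)


def emit(quads, op, arg1, arg2, result):
    """Append one quadruple whose operand fields are symbol table indexes."""
    quads.append(Quad(op, index_of(arg1), index_of(arg2), index_of(result)))


# D = A * B + C
quads = []
emit(quads, "=*", "A", "B", "T1")
emit(quads, "=+", "T1", "C", "T2")
emit(quads, "=", "T2", None, "D")

for q in quads:
    print(q)
print(symtab)
```

The indexes differ from those in the slide's table because this toy symbol table starts empty; only the scheme is the same.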
2. Arithmetic Statements
A -> id = E
E -> E(1) + E(2)
E -> E(1) - E(2)
E -> E(1) * E(2)
E -> E(1) / E(2)
E -> - E(1)
E -> (E(1))
E -> id
e.g. x = a + b is translated into:
T1 = a + b
x = T1
[Figure: parse tree for x = a + b with the reductions numbered in order]
A -> id = E
{ GEN (id.place = E.place); }
/* GEN (argument): a function that saves its argument into the
quadruples. E is implemented as a data structure with one field,
E.place, which holds the symbol table index of the name that
holds the value of E. */
E -> E(1) + E(2)
{ T = NEWTEMP();
/* NEWTEMP(): a function that generates a temporary variable T,
saves T in the symbol table, and returns its symbol table
index. */
E.place = T;
/* T's index in the symbol table is assigned to E.place */
GEN(E.place = E(1).place + E(2).place); }
/* e.g. T = a + b */
E -> E(1) * E(2)
{ T = NEWTEMP();
E.place = T;
GEN(E.place = E(1).place * E(2).place); }
E -> - E(1)
{ T = NEWTEMP();
E.place = T;
GEN(E.place = -E(1).place); }
E -> (E(1))
{ E.place = E(1).place; }
E -> id
{ E.place = id.place; }
/* The index of id is stored in E's field 'place'. In an
implementation, id.place refers to the index of id in the
symbol table. */
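These actions can be tried out with a small sketch. It assumes the expression has already been parsed into nested tuples and, for readability, uses names instead of symbol table indexes for E.place; make_translator, newtemp and gen are illustrative stand-ins for the NEWTEMP and GEN of the rules.

```python
import itertools


def make_translator():
    """Return an assign(target, expr) function that emits three-address code."""
    code = []
    counter = itertools.count(1)

    def newtemp():
        return f"T{next(counter)}"

    def gen(line):
        code.append(line)

    def expr(node):
        """Translate an expression node and return its E.place."""
        kind = node[0]
        if kind == "id":                      # E -> id
            return node[1]
        if kind == "neg":                     # E -> - E(1)
            operand = expr(node[1])
            t = newtemp()
            gen(f"{t} = -{operand}")
            return t
        # binary operator: E -> E(1) op E(2)
        left, right = expr(node[1]), expr(node[2])
        t = newtemp()
        gen(f"{t} = {left} {kind} {right}")
        return t

    def assign(target, node):                 # A -> id = E
        gen(f"{target} = {expr(node)}")
        return code

    return assign


code = make_translator()("x", ("+", ("id", "a"), ("id", "b")))
print(code)
```

Operands are translated before the temporary is allocated, mirroring the bottom-up order in which the reductions happen.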
Enhanced version for E -> E(1) op E(2)
/* In this version E is implemented as an array of structs that
hold E's attributes; the array index corresponds to the position
on the value stack. */
{ T = NEWTEMP();
if E(1).type == int and E(2).type == int then
{ GEN (T = E(1).place intop E(2).place);
E.type = int;
}
else if E(1).type == float and E(2).type == float then
{ GEN (T = E(1).place floatop E(2).place);
E.type = float;
}
else if E(1).type == int and E(2).type == float then
{ U = NEWTEMP();
GEN (U = inttofloat E(1).place);
GEN (T = U floatop E(2).place);
E.type = float;
}
else /* E(1).type == float and E(2).type == int */
{ U = NEWTEMP();
GEN (U = inttofloat E(2).place);
GEN (T = E(1).place floatop U);
E.type = float;
}
}
3. Boolean Expressions
M -> /* empty */
E -> E or M E
| E and M E
| not E
| ( E )
| id
| id relop id
An example
if p < q || r < s && t < u
    x = y + z;
k = m - n;
For the above Boolean expression the corresponding contents of the
quadruples are (the quadruple counter starts at 100):
Location   Three-Address Code
100        if p < q goto 106
101        goto 102
102        if r < s goto 104
103        goto 108
104        if t < u goto 106
105        goto 108   /* S.next = 105 */
106        t1 = y + z
107        x = t1
108        t2 = m - n
109        k = t2
...        .........
[Figure: parse tree of the condition (E -> E or M E, E -> E and M E, id < id) with the reductions numbered in order]
NEXTQUAD: an integer variable that holds the index (location) of
the next available entry of the quadruples.
E.true: an attribute of E that holds a set of indexes (locations)
of quadruples; each indexed quadruple holds a jump that is taken
when the Boolean expression is true.
E.false: an attribute of E that holds a set of indexes of
quadruples; each indexed quadruple holds a jump that is taken
when the Boolean expression is false.
GEN(x): a function that translates x (a kind of three-address
code) into quadruple representation.
So, we need to construct a data structure for E that includes two
fields, each of which can hold an unlimited number of integers.
Meanwhile, we need an array of these structures to store the
attributes of the several Es that are in use at the same time.
1. M -> /* empty */ { M.quad = NEXTQUAD; }
/* M.quad is a data structure associated with M */
2. E -> E(1) or M E(2)
{
BACKPATCH (E(1).false, M.quad);
E.true = MERGE (E(1).true, E(2).true);
E.false = E(2).false;
}
/* BACKPATCH (p, i): a function that makes each of the quadruples
whose index is on the list pointed to by p take quadruple i as
its target (i.e., goto i). */
/* MERGE (a, b): a function that takes the lists pointed to by a
and b, concatenates them into one list, and returns a pointer to
the concatenated list. */
3. E -> E(1) and M E(2)
{
BACKPATCH (E(1).true, M.quad);
E.true = E(2).true;
E.false = MERGE (E(1).false, E(2).false);
}
4. E -> not E(1)
{ E.true = E(1).false; E.false = E(1).true; }
5. E -> ( E(1) )
{ E.true = E(1).true; E.false = E(1).false; }
6. E -> id
{
E.true = MAKELIST (NEXTQUAD);
E.false = MAKELIST (NEXTQUAD + 1);
GEN (if id.place goto _ );
GEN (goto _);
}
/* MAKELIST (i): a function that creates a list containing i, an
index into the array of quadruples, and returns a pointer to the
list it has made. */
7. E -> id(1) relop id(2)
{
E.true = MAKELIST (NEXTQUAD);
E.false = MAKELIST (NEXTQUAD + 1);
GEN (if id(1).place relop id(2).place goto _ );
GEN (goto _);
}
/* e.g. with NEXTQUAD = 20 before the reduction:
20: if id(1).place relop id(2).place goto _    E.true = {20}
21: goto _                                     E.false = {21}
and NEXTQUAD = 22 afterwards. */
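The backpatching rules can be exercised on the earlier example condition p < q || r < s && t < u. The sketch below is a simplified model under stated assumptions: quadruples are [text, target] pairs numbered from 100, the reductions are invoked by hand in the order an LR parser would perform them, and the helper names (relop, or_, and_) are hypothetical.

```python
BASE = 100
quads = []                       # each entry: [text, target]; target None = "goto _"


def nextquad():
    return BASE + len(quads)


def gen(text):
    quads.append([text, None])


def makelist(i):
    return [i]


def merge(*lists):
    return [i for lst in lists for i in lst]


def backpatch(lst, target):
    """Fill target into every quadruple whose index is on lst."""
    for i in lst:
        quads[i - BASE][1] = target


def relop(a, op, b):             # E -> id relop id ; returns (E.true, E.false)
    true, false = makelist(nextquad()), makelist(nextquad() + 1)
    gen(f"if {a} {op} {b} goto")
    gen("goto")
    return true, false


def or_(e1, m_quad, e2):         # E -> E(1) or M E(2)
    backpatch(e1[1], m_quad)
    return merge(e1[0], e2[0]), e2[1]


def and_(e1, m_quad, e2):        # E -> E(1) and M E(2)
    backpatch(e1[0], m_quad)
    return e2[0], merge(e1[1], e2[1])


e1 = relop("p", "<", "q")
m1 = nextquad()                  # M.quad after "or"
e2 = relop("r", "<", "s")
m2 = nextquad()                  # M.quad after "and"
e3 = relop("t", "<", "u")
e_true, e_false = or_(e1, m1, and_(e2, m2, e3))

# The enclosing statement later resolves the remaining lists:
backpatch(e_true, 106)           # start of x = y + z
backpatch(e_false, 108)          # start of k = m - n

for i, (text, target) in enumerate(quads):
    print(BASE + i, text, target)
```

After the two final calls the targets agree with locations 100 to 105 of the example table.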
4. Flow-of-Control Statements
A. Conditional Statements
S -> if E then S else S
| if E then S
| A
| begin L end
L -> S
| L ; S
/* A denotes a general assignment statement
L denotes a statement list
S denotes a statement
*/
1. S -> if E then M(1) S(1) N else M(2) S(2)
{
BACKPATCH (E.true, M(1).quad);
BACKPATCH (E.false, M(2).quad);
S.next = MERGE (S(1).next, N.next, S(2).next);
}
/* S.next is a pointer to a list of all conditional and
unconditional jumps (gotos) to the quadruple following the
statement S in execution order. */
2. S -> if E then M S(1)
{
BACKPATCH (E.true, M.quad);
S.next = MERGE (E.false, S(1).next);
}
3. M -> /* empty */ { M.quad = NEXTQUAD; }
4. N -> /* empty */
{
N.next = MAKELIST (NEXTQUAD);
GEN (goto _);
}
/* e.g. if NEXTQUAD = 20, quadruple 20 becomes "goto _" and
N.next = {20} */
5. S -> A
{ S.next = MAKELIST ( ); }
/* initialize S.next to an empty list */
6. L -> S { L.next = S.next; }
7. L -> L(1) ; M S
{
BACKPATCH (L(1).next, M.quad);
/* resolves all quadruples of L(1) that hold an unresolved
conditional or unconditional goto _ */
L.next = S.next;
}
8. S -> begin L end { S.next = L.next; }
B. Iterative Statement
S -> while E do S
9. S -> while M(1) E do M(2) S(1)
{
BACKPATCH (E.true, M(2).quad);
BACKPATCH (S(1).next, M(1).quad);
S.next = E.false;
GEN (goto M(1).quad);
}
An example:
while (A<B) do if (C<D) then X = Y + Z;
Index   Three-Address Code
..      ..
100     if (A<B) goto 102
101     goto __   // will be resolved (filled with 107) later
102     if (C<D) goto 104
103     goto 100
104     T = Y + Z
105     X = T
106     goto 100
107     ..
[Figure: parse tree of the statement, with the inner statement if (C<D) then X = Y + Z; and the reductions numbered in order]
5. Array References
Addressing Array Elements
one-dimension: A[low..high]
two-dimension: A[low1..high1, low2..high2]
n-dimension: A[low1..high1, low2..high2, ... , lown..highn]
Let: base = address of the beginning of A,
w = width of an array element, and
ni = the number of array elements in the i-th dimension
/* row major */
(e.g. n1 = high1 - low1 + 1; n2 = high2 - low2 + 1;
n3 = high3 - low3 + 1; ...)
A[i] has address:
base + (i - low) * w = i * w + (base - low * w),
where base - low * w is a compile-time invariant.
A[i1, i2] has address (row-major):
base + ((i1 - low1) * n2 + i2 - low2) * w
= (i1 * n2 + i2) * w + base - (low1 * n2 + low2) * w,
where base - (low1 * n2 + low2) * w is a compile-time invariant.
A[i1, i2, i3] has address:
base + ((i1 - low1) * n2 * n3 + (i2 - low2) * n3 + (i3 - low3)) * w
= base + (((i1 - low1) * n2 + (i2 - low2)) * n3 + (i3 - low3)) * w
= ((i1 * n2 + i2) * n3 + i3) * w + base - ((low1 * n2 + low2) * n3 + low3) * w,
where base - ((low1 * n2 + low2) * n3 + low3) * w is a compile-time
invariant.
In general, A[i1, i2, ... , ik] has address:
((..((i1 * n2 + i2) * n3 + i3) ... ) * nk + ik) * w
+ base - ((..((low1 * n2 + low2) * n3 + low3) ... ) * nk + lowk) * w,
where base - ((..((low1 * n2 + low2) * n3 + low3) ... ) * nk + lowk) * w
is a compile-time invariant.
Therefore, we can compute as follows:
e1 = i1
e2 = e1 * n2 + i2
e3 = e2 * n3 + i3
...
em = e(m-1) * nm + im
...
ek = e(k-1) * nk + ik
The address of A[i1, i2, ... , ik] is: ek * w + compile-time invariant.
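The recurrence can be checked numerically with a short sketch. The function name element_address and the sample array are assumptions made for illustration; the same recurrence is applied to the lower bounds to obtain the compile-time invariant.

```python
def element_address(base, w, bounds, index):
    """Row-major address of A[i1, ..., ik].

    bounds is [(low1, high1), ..., (lowk, highk)] and index is (i1, ..., ik).
    """
    n = [high - low + 1 for low, high in bounds]   # n1, ..., nk
    e = index[0]                                   # e1 = i1
    c = bounds[0][0]                               # same recurrence on the lows
    for m in range(1, len(index)):
        e = e * n[m] + index[m]                    # em = e(m-1) * nm + im
        c = c * n[m] + bounds[m][0]
    invariant = base - c * w                       # known at compile time
    return e * w + invariant


# A[1..3, 1..4] with base 1000 and 4-byte elements: A[2, 3] is the 7th element.
print(element_address(1000, 4, [(1, 3), (1, 4)], (2, 3)))
```

Only the part e * w depends on the run-time subscripts; everything folded into invariant can be computed once when the declaration is processed.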
Translation Scheme for Addressing Array Elements
Assume: (1) for each id there exists id.place, which holds its
name,
(2) there is a function limit( ) where limit(array_name, m) = nm,
i.e., the number of elements of array array_name in the m-th
dimension,
(3) we can find the width of an array element from the name of
the array (i.e. from the symbol table).
// Please read Section 8.3.2 (Array References) of the textbook.
(1) S -> L = E
{ if L.offset = null then
GEN (L.place = E.place)
else
GEN (L.place[L.offset] = E.place);
}
(2) E -> E1 + E2
{ E.place = newtemp(); // generate a temporary variable and
// save its symbol table index
GEN (E.place = E1.place + E2.place);
}
(3) E -> (E(1)) { E.place = E(1).place }
(4) E -> L
{ if L.offset = null then
E.place = L.place
else
{ E.place = newtemp();
GEN (E.place = L.place[L.offset]); } }
(5) L -> Elist ]
{ L.place = Elist.array_name;
L.offset = newtemp();
GEN (L.offset = w * Elist.place); }
/* w is known from the declaration of the array */
(6) L -> id { L.place = id.place; L.offset = null }
(7) Elist -> Elist(1) , E
{ T = newtemp(); m = Elist(1).ndimen + 1;
GEN (T = Elist(1).place * limit(Elist(1).array_name, m));
GEN (T = T + E.place);
Elist.array_name = Elist(1).array_name;
Elist.place = T;
Elist.ndimen = m; }
/* note em = e(m-1) * nm + im, where Elist.place = em,
Elist(1).place = e(m-1), limit(Elist(1).array_name, m) = nm, and
E.place = im */
(8) Elist -> id [ E
{ Elist.place = E.place; Elist.ndimen = 1;
Elist.array_name = id.place; }
// (the compile-time invariant is associated with id.place)
6. Procedure Calls
Grammar:
1. call -> id ( args )
2. args -> args , E
3. args -> E
Semantic actions:
1. call -> id ( args )
{ for each item p on QUEUE do
GEN (param p);
GEN (call id.place, length of QUEUE); }
/* QUEUE is a data structure for saving the indexes of the symbol
table entries containing the names of the arguments. The length
of QUEUE is the number of elements in QUEUE. */
2. args -> args , E
{ append E.place to the end of QUEUE; }
3. args -> E
{ initialize QUEUE to contain only E.place; }
/* Originally, QUEUE is empty and, after the reduction of E to
args, QUEUE contains a single pointer to the symbol table
location for the name that denotes the value of E. */
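The queue-based scheme can be sketched as follows. It assumes the argument expressions have already been translated, so that only their places (names here, rather than symbol table indexes) remain; translate_call is a hypothetical helper.

```python
def translate_call(proc, arg_places):
    """Emit param and call instructions for proc(arg1, ..., argn)."""
    code = []
    queue = []
    for place in arg_places:
        # args -> E        : QUEUE initially holds only the first E.place
        # args -> args , E : later places are appended at the end
        queue.append(place)
    for p in queue:                        # call -> id ( args )
        code.append(f"param {p}")
    code.append(f"call {proc}, {len(queue)}")
    return code


print(translate_call("f", ["T1", "b"]))
```

Collecting the places first keeps all param instructions together immediately before the call, instead of interleaving them with the code that evaluates the arguments.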
7. Structure Declarations (Read Sec. 8.3.3)
type -> struct { fieldlist } /* Note: symbols in bold face are
terminals */
| ptr
| char
| int
| float
| double
fieldlist -> fieldlist field ;
| field ;
field -> type id
| field [ integer ] /* integer: a token denoting any string of digits */
e.g.
struct { int x;      // offset 0
         float y;    // offset 2
         char k[10]; // offset 6
       } m;
m.width = 16 bytes
A field is, e.g., int x, or int x [10], or int x [10] [20] [30].
Semantic rules:
field -> type id
{ field.width = type.width;
field.name = id.name;
W_enter(id.name, type.width); }
| field(1) [ integer ]
{ field.width = field(1).width * integer.val;
field.name = field(1).name;
D_enter(field(1).name, integer.val); }
fieldlist -> field ;
{ O_enter (field.name, 0); fieldlist.width = field.width; }
| fieldlist(1) field ;
{ fieldlist.width = fieldlist(1).width + field.width;
O_enter (field.name, fieldlist(1).width); }
type -> struct '{' fieldlist '}' { type.width = fieldlist.width; }
type -> char { type.width = 1; } /* Assume characters take one byte. */
type -> ptr { type.width = 4; } /* Assume pointers take four bytes. */
type -> int { type.width = 2; } /* Assume integers take two bytes. */
.......
Definitions of functions used
D_enter(name, size) increases the number of dimensions for
name by one and enters the last dimension as size in the
symbol table entry for name.
W_enter(name, width) enters width as the width of each
element of name. If name is not an array, then its width is
the number of locations taken by data of name's type.
O_enter(name, offset) makes offset the number for which
field name name stands. This information, also, is recorded
in the symbol table entry for name.
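The width and offset rules can be checked against the example struct with a short sketch. The widths for char, ptr and int come from the rules above; the 4-byte float is an assumption inferred from the offsets in the example, and layout_struct with its tuple format is hypothetical.

```python
# char, ptr and int widths are from the rules; float = 4 is inferred
# from the example offsets (y at 2, k at 6).
TYPE_WIDTH = {"char": 1, "int": 2, "float": 4, "ptr": 4}


def layout_struct(fields):
    """Compute field offsets and the total width of a struct.

    fields is a list of (type, name, dims), e.g. ("char", "k", [10]).
    """
    offsets = {}
    total = 0                              # fieldlist.width so far
    for ftype, name, dims in fields:
        width = TYPE_WIDTH[ftype]          # field -> type id
        for d in dims:                     # field -> field [ integer ]
            width *= d
        offsets[name] = total              # O_enter(field.name, fieldlist(1).width)
        total += width                     # fieldlist.width = fieldlist(1).width + field.width
    return offsets, total


print(layout_struct([("int", "x", []), ("float", "y", []), ("char", "k", [10])]))
```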
8. Switch Statement
Syntax:
switch E
{
case V1: S1;
case V2: S2;
.............
.............
case Vn-1: Sn-1;
default: Sn;
}
When translated into three-address code:
100   code to evaluate E into T   /* T is a temporary variable */
101   if T != V1 goto 104
102   code for S1
103   goto 113
104   if T != V2 goto 107
105   code for S2
106   goto 113
107   ...
108   ...
109   if T != Vn-1 goto 112
110   code for Sn-1
111   goto 113
112   code for Sn
113   .
Based on the given translation example, you can infer how to
generate the three-address code for switch statements.