Professional Documents
Culture Documents
Code Generation
SS 2009
Hw/Sw Codesign
Christian Plessl
Translation Process
skeletal source program
preprocessor
source program
compiler
assembler program
assembler
relocatable machine code
linker / loader
library
Compiler Phases
source program
lexical analysis
analysis
syntactic analysis
semantic analysis
symbol table
intermediate code generat.
error handling
code optimization
code generation
synthesis
target program
HW/SW Codesign Ch 3 2009-05-11
Analysis
Lexical analysis
scanning of the source program and splitting into symbols
regular expressions: recognition by finite automata
Syntactic analysis
parsing symbol sequences and construction of sentences
sentences are described by a context-free grammar
A Identifier := E
E E + E | E * E | Identifier | Number
Semantic analysis
make sure the program parts "reasonably" fit together,
eg. type casts
Synthesis
Generation of intermediate code
machine independent simplified retargeting
should be easy to generate
should be easily translatable into target program
Optimization
GP processors: fast code, fast translation
specialized processors: fast code, short code (low memory
requirements), low power consumption
both intermediate and target code can be optimized
Code generation
Example (1)
source program
lexical analysis
assignment symbol
id1 :=
id2 +
identifiers
operators
id3
*
60
number
Example (2)
syntactic analysis
semantic analysis
:=
id1
:=
id1
+
id2
id2
*
id3
60
*
id3
IntToReal
60
Example (3)
intermediate code generat.
tmp1 := IntToReal(60)
tmp2 := id3*tmp1
tmp3 := id2+tmp2
id1 := tmp3
code optimization
tmp1 := id3 * 60.0
id1 := id2 + tmp1
code generation
MOVF
MULF
MOVF
ADDF
MOVF
id3, R2
#60.0, R2
id2, R1
R2, R1
R1, id1
10
syntax tree
:=
+
*
*
-
*
e
d
11
assignment instructions
x := y op z
x := op y
x := y
goto L
if x relop y goto L
x := y[i]
x[i] := y
subroutines
x := &y
y := *x
*x := y
param x
call p,n
return y
12
:=
t1 := c - d
t2 := e * t1
t3 := b * t1
t4 := t2 + t3
a := t4
t4
a
t3
t2
*
t1
b
e
c
13
14
t1
t2
t3
t4
if
:=
:=
:=
:=
t4
c - d
e * t1
b * t1
t2 + t3
< 10 goto L
15
16
i := 0
t2 := 0
t2 := t2 + i
i := i + 1
if i < 10 goto L
x := t2
i < 10
i >= 10
17
18
Example (1)
3 address code
C program
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
prod := 0
i := 0
t1 := 4 * i
t2 := a[t1]
t3 := 4 * i
t4 := b[t3]
t5 := t2 * t4
t6 := prod + t5
prod := t6
t7 := i + 1
i := t7
if i < 20 goto (3)
19
Example (2)
basic blocks
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
prod := 0
B1
i := 0
t1 := 4 * i
B2
t2 := a[t1]
t3 := 4 * i
t4 := b[t3]
t5 := t2 * t4
t6 := prod + t5
prod := t6
t7 := i + 1
i := t7
if i < 20 goto (3)
B1
B2
20
Example (3)
basic block B2
DAG for B2
t6, prod
+
t1 := 4 * i
t2 := a[t1]
t3 := 4 * i
t4 := b[t3]
t5 := t2 * t4
t6 := prod + t5
prod := t6
t7 := i + 1
i := t7
if i < 20 goto (3)
prod0
t5
t2
t4
[]
[]
<
t1, t3
*
b
t7, i
20
a
4
i0
1
21
22
23
Code Generation
Requirements
correct code
efficient code
efficient generation (?)
scheduling:
instruction sequencing
24
Register Binding
Goal: efficient use of registers
instructions with register operands are generally shorter and faster
than instructions with memory operands (CISC)
minimize number of LOAD/STORE instructions (RISC)
25
Instruction Selection
Code pattern for each 3 address instruction
x := y + z
u := x - w
MOV
ADD
MOV
MOV
SUB
MOV
y, R0
z, R0
R0, x
x, R0
w, R0
R0, u
Problems
26
Scheduling (1)
Optimal instruction sequence
minimal number of instructions for the given number of registers
NP-complete problem
t4
t1
t3
t2
t3
t1
t4
:=
:=
:=
:=
c + d
e - t2
a + b
t1 - t3
t1
t2
t3
t4
:=
:=
:=
:=
a + b
c + d
e - t2
t1 - t3
t2
b e
+
c
27
Scheduling (2)
with 2 registers:
t2
t3
t1
t4
:=
:=
:=
:=
MOV
ADD
MOV
SUB
MOV
ADD
SUB
MOV
c + d
e - t2
a + b
t1 - t3
c, R0
d, R0
e, R1
R0, R1
a, R0
b, R0
R1, R0
R0, t4
t1
t2
t3
t4
:=
:=
:=
:=
MOV
ADD
MOV
ADD
MOV
MOV
SUB
MOV
SUB
MOV
a + b
c + d
e - t2
t1 - t3
a, R0
b, R0
c, R1
d, R1
R0, t1
e, R0
R1, R0
t1, R1
R0, R1
R1, t4
28
Machine Model
n registers R0 Rn-1
Byte addressable, word = 4 byte
Instruction format
op source, destination
eg. MOV R0, a
ADD R1, R0
each instruction has cost 1
addresses of operands follow the instructions (in the code)
29
Addressing Modes
30
31
32
33
Register Allocation
Global register allocation
reserve a number of registers
for global variables
for loop variables
for variables in basic blocks
34
35
- An edge between the nodes vi and vj indicates that the life times
of vi and vj overlap. Then, vi and vj cannot share a register.
vi
vj
36
t1
t2
t3
t4
:=
:=
:=
:=
t3
t4
t1
t2
*
*
+
+
10
20
5
t3
t1
t4
t2
t1
t3
t2
t3
t4
1
37
t1
t1, t3 in R0
t2, t4 in R1
t1
t4
t2
t3
t4
t2
t3
MUL
MUL
ADD
ADD
#10, R0
#20, R1
#5, R0
R0, R1
38
39
remove d
40
b
c
remove a
e
remove b
c
remove c
remove e
{}
41
b
c
sequence of removed
nodes:
d, a, b, c, e
a
d
HW/SW Codesign Ch 3 2009-05-11
42
2 cost units at the end of the basic block, if a has been defined in the
basic block and is active afterward
no save instruction MOV R0, a
43
44
active(a, B1) = 1
used(a, B2) = 1
used(a, B3) = 1
a := b + c B1
d := d - b
e := a + f
acdef
acde
f := a - d B2
cdef
cdef
acdf
b := d + f B3
e := a - c
a:
b:
c:
d:
e:
f :
bcdef
b := d + c B4
bcdef
(e)
cost savings
(be)
4
6
3
6
4
4
for 3 registers:
b R0, d R1, a R2
HW/SW Codesign Ch 3 2009-05-11
45
46
*
2
*
7
6 +
a
reversed order:
1 2 3 4 5 6 7
47
48
Examples
ADD R0, R1
ADD *R0, R1
SUB a, R0
R1 := R1 + R0
R1 := R1 + ind R0
R0 := R0 - a
49
op
T1
T2
50
C[0]
optimal cost for computing n, if the result is
stored in memory
51
Example (1)
DAG:
machine model:
instruction set
Ri
Ri
Ri
Ri
Mj
:=
:=
:=
:=
:=
/
e
Mj
Ri op Rj
Ri op Mj
Rj
Ri
52
Example (2)
+
-
a
C=(0, 1, 1)
C=(0, 1, 1)
C=(0, 1, 1)
d
C=(0, 1, 1)
e
C=(0, 1, 1)
53
Example (3)
with one register: Ri := Ri - M
C=(3, 2, 2)
= C[2]
C=(0, 1, 1)
b
C=(0, 1, 1)
or Ri := Ri - Rj
54
Example (4)
with one register: Ri := Ri * M
C=(5, 5, 4)
*
C=(3, 2, 2)
C=(0, 1, 1)
d
C=(0, 1, 1)
e
C=(0, 1, 1)
55
Example (5)
R0:= R0 + R1
C=(8, 8, 7)
R0:= R0 - b
C=(3, 2, 2)
R1:= R1 * R0
C=(5, 5, 4)
R0:= a
a
C=(0, 1, 1)
R0:= R0 / e
C=(3, 2, 2)
R1:= c
C=(0, 1, 1)
C=(0, 1, 1)
R0:= d
d
C=(0, 1, 1)
/
e
C=(0, 1, 1)
56
57
Code Optimization
Transformations on the intermediate code and on the
target code
Peephole optimization
small window (peephole) is moved over the code
several passes, because an optimization can generate new optimization
opportunities
Local optimization
transformations inside basic blocks
Global optimization
transformations across several basic blocks
58
Optimizations (1)
Deletion of unnecessary instructions
(1)
(2)
MOV R0, a
MOV a, R0
(1)
MOV R0, a
Algebraic simplifications
x := y + 0*(z**4/(y-1));
x := x * 1;
x := x + 0;
x := y;
delete
59
Optimizations (2)
Control flow optimizations
(1)
(2) L1
goto L1
.
goto L2
(1)
(2) L1
goto L2
.
goto L2
if L1 is not reachable:
delete (2) (dead code elimination)
Operator reductions
x := y*8;
x := y << 3;
x := y**2;
x := y * y;
HW/SW Codesign Ch 3 2009-05-11
60
Local Optimizations
Common subexpression elimination
(1)
(2)
(3)
(4)
a
b
c
d
:=
:=
:=
:=
b
a
b
a
+
+
-
c
d
c
d
(1)
(2)
(3)
(4)
a
b
c
d
:=
:=
:=
:=
b + c
a - d
b + c
b
Variable renaming
t := b + c
u := b + c
Instruction interchange
t1 := b + c
t2 := x + y
t2 := x + y
t1 := b + c
HW/SW Codesign Ch 3 2009-05-11
61
Copy propagation
(1)
(2)
(3)
(4)
x := t1
a[t2] := t3
a[t4] := x
goto L
(1)
(2)
(3)
(4)
x := t1
a[t2] := t3
a[t4] := t1
goto L
62
(1)
(2)
(3)
(4)
j := n
j := j - 1
t4 := 4 * j
t5 := a[t4]
if t5 > v goto (1)
(1)
(2)
(3)
(4)
j := n
t4 := 4 * j
j := j - 1
t4 := t4 - 4
t5 := a[t4]
if t5 > v goto (1)
63