
Introduction to Compilers

Compilers and Interpreters

Compilation
A compiler is a program that reads a
program written in one language (the
source language) and translates it into an
equivalent program in another language
(the target language).
Oversimplified view:

Source Program -> Compiler -> Target Program
Compiler -> Error messages
Input -> Target Program -> Output

Interpreter
Instead of producing a target program
as a translation, an interpreter
performs the operations implied by the
source program.
An interpreter might build a tree and
carry out the operations at the nodes
as it walks the tree.
At the root it would discover it had an
assignment to perform.
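
The tree walk described above can be sketched in a few lines of Python. This is only an illustration, not a complete interpreter: the node classes (Num, Var, BinOp, Assign), the evaluate function, and the sample statement position := initial + rate * 60 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Num:
    value: float


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


@dataclass
class Assign:
    target: str
    expr: "Node"


Node = Union[Num, Var, BinOp, Assign]


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Walk the tree and carry out the operation found at each node."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, BinOp):
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        if node.op == "+":
            return left + right
        if node.op == "*":
            return left * right
        raise ValueError(f"unknown operator {node.op!r}")
    if isinstance(node, Assign):
        # Root of the sample tree: evaluate the right side, then store it.
        env[node.target] = evaluate(node.expr, env)
        return env[node.target]
    raise TypeError(f"unknown node {node!r}")


# Hypothetical statement: position := initial + rate * 60
tree = Assign("position", BinOp("+", Var("initial"), BinOp("*", Var("rate"), Num(60))))
env = {"initial": 10.0, "rate": 2.0}
result = evaluate(tree, env)
```

With the sample values, the walk starts at the assignment node, evaluates the multiplication and the addition below it, and stores the result under "position" in env.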

Compilers and Interpreters (contd)

Interpretation
Performing the operations implied by the
source program
Oversimplified view:

Source Program + Input -> Interpreter -> Output
Interpreter -> Error messages

Compilers and Interpreters (contd)

Compiler: a program that translates an
executable program in one language into
an executable program in another
language

Interpreter: a program that reads an
executable program and produces the
results of running that program

The Analysis-Synthesis Model of Compilation

There are two parts to compilation:


Analysis
Breaks up source program into pieces and
imposes a grammatical structure
Creates intermediate representation of
source program
Determines the operations and records them in
a tree structure called a syntax tree
Known as front end of compiler

The Analysis-Synthesis Model of Compilation (contd)
Synthesis
Constructs target program from intermediate
representation
Takes the tree structure and translates the
operations into the target program
Known as back end of compiler

A language-processing system

Skeletal Source Program
-> Preprocessor
-> Source Program
-> Compiler
-> Target Assembly Program
-> Assembler
-> Relocatable Object Code
-> Linker (also takes Libraries and Relocatable Object Files)
-> Absolute Machine Code

The context of a Compiler

A source program may be divided into
modules stored in separate files. The
task of collecting the source program
is entrusted to a distinct program,
called a preprocessor. It also expands
shorthands called macros.
The compiler creates assembly code
that is translated by an assembler into
machine code and then linked
together with some library routines
into the code that actually runs on the
machine.

Analysis

In compiling, analysis has three phases:
Linear analysis: stream of characters
read from left-to-right and grouped into
tokens; known as lexical analysis or
scanning
Hierarchical analysis: tokens grouped
hierarchically with collective meaning;
known as parsing or syntax analysis
Semantic analysis: check if the program
components fit together meaningfully

Phases of a compiler


Lexical Analysis
The first phase of a compiler is called
lexical analysis or scanning. It reads the
stream of characters and groups the
characters into meaningful sequences
called lexemes. For each lexeme, the
lexical analyzer produces as output a
token of the form
<token name, attribute value>
So, this phase performs linear analysis
on the source program.
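
A minimal scanner along these lines might look as follows in Python. The token classes (id, num, assign, op) and the sample input are assumptions made for this sketch. As a simplification, the attribute value here is the lexeme itself, whereas a real compiler would commonly store a pointer to a symbol-table entry for identifiers.

```python
import re
from typing import Iterator, Tuple

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("id", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("num", r"\d+"),
    ("assign", r":="),
    ("op", r"[+*]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Tuple[str, str]]:
    """Read characters left to right and yield (token name, attribute value)."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = match.end()
        if match.lastgroup != "skip":  # whitespace separates lexemes but is not a token
            yield (match.lastgroup, match.group())


tokens = list(tokenize("position := initial + rate * 60"))
```

For the sample input, tokens holds seven pairs, starting with ("id", "position") and ("assign", ":=") and ending with ("num", "60").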


Lexical analysis (contd)

Characters grouped into tokens.


Syntax analysis (Parsing)

It performs hierarchical analysis on the
source program.
The result is represented by a syntax tree, in
which each interior node represents an operation
and its children represent the operands.


Syntax analysis (contd)


Grouping tokens into grammatical phrases
Character groups recorded in symbol table
Represented by a parse tree


Syntax analysis (contd)


Hierarchical structure usually
expressed by recursive rules
Rules for definition of expression:
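
The rules themselves are not reproduced in this text. As an illustration only, the sketch below assumes a common textbook grammar (expr -> term ('+' term)*, term -> factor ('*' factor)*, factor -> identifier | number | '(' expr ')') and shows how such recursive rules map onto a recursive-descent parser. The Parser class and the tuple-based tree shape are hypothetical.

```python
from typing import List, Tuple, Union

# A tree is either a leaf (identifier or number lexeme) or (operator, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ""

    def advance(self) -> str:
        token = self.peek()
        self.pos += 1
        return token

    def expr(self) -> Tree:
        # expr -> term ('+' term)*
        node = self.term()
        while self.peek() == "+":
            self.advance()
            node = ("+", node, self.term())
        return node

    def term(self) -> Tree:
        # term -> factor ('*' factor)*
        node = self.factor()
        while self.peek() == "*":
            self.advance()
            node = ("*", node, self.factor())
        return node

    def factor(self) -> Tree:
        # factor -> identifier | number | '(' expr ')'
        token = self.advance()
        if token == "(":
            node = self.expr()  # the rule refers back to expr: this is the recursion
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token.isalnum():
            return token
        raise SyntaxError(f"unexpected token {token!r}")


def parse(tokens: List[str]) -> Tree:
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek():
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


tree = parse(["initial", "+", "rate", "*", "60"])
```

Because term is called from expr, the multiplication ends up deeper in the tree than the addition, which is how the grammar encodes operator precedence.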


Semantic analysis
Checks source program for semantic
errors
Gathers type information for
subsequent code generation (type
checking)
Identifies operator and operands of
expressions and statements
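
A type check over an expression tree could be sketched as below. The type names ("int", "real", "string"), the widening rule for mixed int/real arithmetic, and the type_of helper are assumptions chosen for the example, not rules taken from any particular language.

```python
from typing import Dict, Tuple, Union

# Leaves are identifiers (str) or numeric literals; interior nodes are (op, left, right).
Tree = Union[str, int, float, Tuple[str, "Tree", "Tree"]]


def type_of(node: Tree, declared: Dict[str, str]) -> str:
    """Return the type of an expression tree, or raise TypeError on a semantic error."""
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "real"
    if isinstance(node, str):
        if node not in declared:
            raise TypeError(f"undeclared identifier {node!r}")
        return declared[node]
    op, left, right = node
    left_type = type_of(left, declared)
    right_type = type_of(right, declared)
    if "string" in (left_type, right_type):
        raise TypeError(f"operator {op!r} cannot be applied to a string")
    # Assumed rule: mixed int/real arithmetic is widened to real.
    return "real" if "real" in (left_type, right_type) else "int"


declared = {"initial": "real", "rate": "real", "name": "string"}
result = type_of(("+", "initial", ("*", "rate", 60)), declared)
```

The type computed for each node is the kind of information a later code generator would use, for example to decide where an integer operand must be converted to real.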


Intermediate code generation


Program representation for an
abstract machine
Should have two properties

Easy to produce
Easy to translate into target program

Three-address code is a commonly
used form similar to assembly
language
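
To make the idea concrete, the following sketch lowers an expression tree to three-address code, where each instruction has at most one operator on its right-hand side. The tuple tree shape, the temporary names t1, t2, ... and the to_three_address function are assumptions made for the example.

```python
from itertools import count
from typing import List, Tuple, Union

# A tree is either a leaf (identifier or number lexeme) or (operator, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def to_three_address(target: str, tree: Tree) -> List[str]:
    """Emit one instruction per interior node, then the final assignment."""
    code: List[str] = []
    temps = count(1)

    def emit(node: Tree) -> str:
        if isinstance(node, str):
            return node  # leaves are used directly as operands
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(temps)}"  # fresh temporary holds this node's value
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result_name = emit(tree)
    code.append(f"{target} = {result_name}")
    return code


code = to_three_address("position", ("+", "initial", ("*", "rate", "60")))
```

For the sample tree this yields three instructions: t1 = rate * 60, then t2 = initial + t1, then position = t2.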

Code optimization and generation

Code Optimization
Improve intermediate code by
producing code that runs faster

Code Generation
Generate target code, which is machine
code or assembly code
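
One simple improvement of this kind is constant folding: computing at compile time any operation whose operands are already known constants. The sketch below is a minimal version under stated assumptions: instructions are (dest, left, operator, right) tuples, only + and * on non-negative integer literals are folded, and every destination is a compiler temporary used only inside the list. The name fold_constants is hypothetical.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, str, str, str]  # (dest, left operand, operator, right operand)


def fold_constants(code: List[Instr]) -> List[Instr]:
    known: Dict[str, str] = {}  # temporaries whose value is a known constant
    result: List[Instr] = []
    for dest, left, op, right in code:
        # Substitute constants discovered by earlier folds.
        left = known.get(left, left)
        right = known.get(right, right)
        if left.isdigit() and right.isdigit() and op in ("+", "*"):
            value = int(left) + int(right) if op == "+" else int(left) * int(right)
            known[dest] = str(value)  # drop the instruction, remember its value
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
            result.append((dest, left, op, right))
    return result


code = [
    ("t1", "60", "*", "60"),
    ("t2", "rate", "*", "t1"),
    ("t3", "initial", "+", "t2"),
]
optimized = fold_constants(code)
```

Here the first instruction disappears and its value, 3600, is substituted into the second, so the program that runs performs one multiplication fewer.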


Symbol-Table Management
The symbol table is a data structure with a
record for each identifier and its
attributes
Attributes include storage allocation,
type, scope, etc.
All the compiler phases can insert into and
modify the symbol table
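
A symbol table with these attributes could be sketched as a stack of dictionaries, one per scope. The Entry fields, the byte offsets, and the method names below are assumptions for illustration; real compilers differ in how they lay out storage and scopes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    name: str
    type: str
    scope_level: int
    offset: int  # storage allocated for the identifier, as a byte offset


class SymbolTable:
    def __init__(self) -> None:
        self.scopes: List[Dict[str, Entry]] = [{}]  # index 0 is the global scope
        self.next_offset = 0

    def enter_scope(self) -> None:
        self.scopes.append({})

    def exit_scope(self) -> None:
        self.scopes.pop()

    def insert(self, name: str, type_name: str, size: int) -> Entry:
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        entry = Entry(name, type_name, len(self.scopes) - 1, self.next_offset)
        self.next_offset += size
        scope[name] = entry
        return entry

    def lookup(self, name: str) -> Optional[Entry]:
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.insert("rate", "real", 8)
table.enter_scope()
inner = table.insert("rate", "int", 4)  # shadows the outer declaration
inner_type = table.lookup("rate").type
table.exit_scope()
outer_type = table.lookup("rate").type
```

Looking up "rate" inside the nested scope finds the inner declaration, and after exit_scope the outer one is visible again.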


The Structure of a Compiler (8)


Scanner [Lexical Analyzer]
-> Tokens
-> Parser [Syntax Analyzer]
-> Parse tree
-> Semantic Process [Semantic analyzer]
-> Abstract Syntax Tree w/ Attributes
-> Code Generator [Intermediate Code Generator]
-> Non-optimized Intermediate Code
-> Code Optimizer
-> Optimized Intermediate Code
-> Code Generator
-> Target machine code


The Grouping of Phases

Compiler front and back ends:


Front end:
Analysis steps + Intermediate code generation
Depends primarily on the source language
Machine independent

Back end:
Code optimization and generation
Independent of source language
Machine dependent


The Grouping of Phases (contd)

Compiler passes:
A collection of phases is done only once (single
pass) or multiple times (multi pass)
Single pass: reading input, processing, and producing
output by one large compiler program; usually runs faster
Multi pass: compiler split into smaller programs, each
making a pass over the source; can perform better code
optimization
