
Subject Code:
Theory of Computation and Compiler Design
L,T,P,J,C: 3,0,0,4,4

Objectives
Provide the required theoretical foundation for computational models and compiler design
Discuss Turing machines as an abstract computational model
Study compiler algorithms, which focus more on low-level system aspects

Expected Outcome
On successful completion of the course, the student should be able to:
1. Design computational models for formal languages
2. Design scanners and parsers using top-down as well as bottom-up paradigms
3. Design symbol tables and use them for type checking and other
semantic checks
4. Implement a language translator
5. Use tools such as lex and YACC to automate parts of the
implementation process.
Columns: Module, Topics, L Hrs, SLO

Module 1
Topics: Introduction To Languages and Grammars - Overview of a computational model - Languages and grammars - Alphabets - Strings - Operations on languages - Introduction to Compilers - Analysis of the Source Program - Phases of a Compiler

Topics: Regular Expressions and Finite Automata - Finite automata - DFA - NFA - Equivalence of NFA and DFA (With Proof) - Regular expressions - Conversion between RE and FA (With Proof) - Lexical Analysis - Recognition of Tokens - Designing a Lexical Analyzer using finite automata
L Hrs: 9

Topics: Myhill-Nerode Theorem - Minimization of FA - Decision properties of regular languages - Pumping lemma for Regular languages (With Proof)
L Hrs: 4

Topics: CFG, PDAs and Turing Machines - CFG - Chomsky Normal Forms - NPDA - DPDA - Membership algorithm for CFG - Syntax Analysis - Top-Down Parsing - Bottom-Up Parsing - Operator-Precedence Parsing - LR Parsers
L Hrs: 12

9,6

5, 9

1, 6

Topics: Turing Machines - Recursive and recursively enumerable languages - Linear bounded automata - Chomsky's hierarchy - Halting problem
L Hrs: 5

Topics: Intermediate Code Generation - Intermediate Languages - Declarations - Assignment Statements - Boolean Expressions - Case Statements - Backpatching - Procedure Calls.
L Hrs: 4

Topics: Code Optimization - Basic Blocks and Flow Graphs - The DAG Representation of Basic Blocks - The Principal Sources of Optimization - Optimization of Basic Blocks - Loops in Flow Graphs - Peephole Optimization - Introduction to Global Data-Flow Analysis
L Hrs: 4

Topics: Code Generation - Issues in the Design of a Code Generator - The Target Machine - Run-Time Storage Management - Next-Use Information - Register Allocation and Assignment - A Simple Code Generator - Generating Code from DAG
L Hrs: 3

6, 9

11

18

Topics: Recent Trends
L Hrs: 1

Project [60 Non Contact hrs]
# Generally a team project [3 to 4 members]
# Concepts studied in CSE1001/CSE1002/CSE1003 should have been used
# Down-to-earth application and innovative idea should have been attempted
# Report in digital format, with all drawings made using a software package, to be
submitted. [Ex. 1. Design of a traffic light system using sequential circuits
OR 2. Design of digital clock]
# Assessment on a continuous basis with a minimum of 3 reviews.
The following is a sample project that may be given to students, to be
implemented using any programming language:
Define a small language similar to Stanford's COOL (Classroom
Object-Oriented Language). Each project will ultimately result in a
working compiler phase that can interface with the other phases. Students
have the option of doing the projects in any programming language, and
they may also integrate some of the tools already available.

1. Develop a lexical analyzer. Tools such as lex, flex for C++; jlex
for Java may be used.
Input - Set of tokens
Output - Recognition of tokens in the specified language as valid or invalid
2. Design and develop a parser (variations may be given). Tools such as
YACC, bison for C++ and CUP for Java may be used; packages for
manipulating trees may also be used to achieve the task.
Input - Text with symbols
Output - Abstract Syntax Tree
3. Implement a checker for the static semantics of a language; refer to the
typing rules, identifier scoping rules, and other restrictions of the
specified language.
4. Code generator
Input - AST constructed and static analysis performed
Output - MIPS assembly code
Text/Reference book exercises may also be given as project.

9, 18

Text Books
1. Introduction to Automata Theory, Languages, and Computation (3rd Edition), John E.
Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, Pearson Education, 2013.
2. Principles of Compiler Design, Alfred V. Aho and Jeffrey D. Ullman, Addison Wesley, 2006.
Reference Books
1. Introduction to Languages and the Theory of Computation, John Martin, McGraw-Hill
Higher Education, 2010.
2. Modern Compiler Implementation in Java, 2nd ed., Andrew W. Appel, Cambridge University
Press, 2012.
Theory of Computation and Compiler Design
Knowledge Areas that contain topics and learning outcomes covered in the course
Knowledge Area: Total Hours of Coverage
CS: AL (Algorithms and Complexity) / CE: CAO: 17
CS: PL (Programming Languages) / CE: CAO: 19
CS: DS (Discrete Structures) / CE: DSC: 9

Body of Knowledge coverage


[List the Knowledge Units covered in whole or in part in the course. If in part, please indicate
which topics and/or learning outcomes are covered. For those not covered, you might want to
indicate whether they are covered in another course or not covered in your curriculum at all.
This section will likely be the most time-consuming to complete, but is the most valuable for
educators planning to adopt the CS2013 guidelines.]

Columns: KA, Knowledge Unit, Topics Covered, Hours

KA: CS: AL / CE: ALG
Knowledge Unit: Basic Automata, Computability and Complexity
Topics Covered: Introduction to languages and grammars - Chomsky's hierarchy - Finite automata - DFA - NFA - Equivalence of NFA and DFA - Regular expressions - Conversion between RE and FA - Minimization of FA
Hours: 8

KA: CS: AL / CE: ALG
Knowledge Unit: Advanced Automata Theory and Computability
Topics Covered: CFG - Normal Forms - CNF and GNF - PDA - DPDA - NPDA - Turing Machines - Recursive and recursively enumerable languages
Hours: 9

KA: CS: PL / CE: PRF
Knowledge Unit: Language Translation and Execution
Topics Covered: Introduction to Compilers - Analysis of the Source Program - Phases of a Compiler - Lexical Analysis - The Role of the Lexical Analyzer - Specification of Tokens - Recognition of Tokens - Finite Automata - From a Regular Expression to an NFA - Design of a Lexical Analyzer
Hours: 4

KA: CS: PL / CE: PRF
Knowledge Unit: Syntax Analysis
Topics Covered: Top-Down Parsing - Bottom-Up Parsing - Operator-Precedence Parsing - LR Parsers - Using Ambiguous Grammars
Hours: 6

KA: CS: PL / CE: PRF
Knowledge Unit: Code Generation
Topics Covered: Code Generation - Issues in the Design of a Code Generator - The Target Machine - Run-Time Storage Management - Next-Use Information - A Simple Code Generator
Hours: 3

KA: CS: PL / CE: PRF
Knowledge Unit: Advanced Programming Constructs
Topics Covered: Register Allocation and Assignment - Generating Code from DAGs - Dynamic Programming Code
Hours: 2

KA: CS: PL / CE: PRF
Knowledge Unit: Language Pragmatics
Topics Covered: Intermediate Languages - Declarations - Assignment Statements - Boolean Expressions - Case Statements - Backpatching - Procedure Calls.
Hours: 4

KA: CS: DS / CE: DSC
Knowledge Unit: Proof Techniques
Topics Covered: Decision properties of FAs - Pumping lemma for Regular languages - All Theorems and their proofs
Hours: 6

KA: CS: DS / CE: DSC
Knowledge Unit: Graphs and Trees
Topics Covered: Code Optimization - Basic Blocks and Flow Graphs - The DAG Representation of Basic Blocks - The Principal Sources of Optimization - Optimization of Basic Blocks - Loops in Flow Graphs - Peephole Optimization - Introduction to Global Data-Flow Analysis
Hours: 3

Total Hours: 45

Where does the course fit in the curriculum?


[In what year do students commonly take the course? Is it compulsory? Does it have prerequisites, required following courses? How many students take it?]

This course is a core subject, suitable from the 4th semester onwards.
Knowledge of any one programming language is essential.
What is covered in the course?
[A short description, and/or a concise list of topics - possibly from your course syllabus.(This is
likely to be your longest answer)]
The course gives an overview of the different kinds of computational problems to be solved. The
abstract computational models, namely finite automata, pushdown automata and Turing
machines, are taught to the students. Students are expected to design abstract models for given
problems and to understand the limitations of such models. The course also explains how a
high-level language program is translated into machine code that the target machine can
execute. It gives an overall idea of the phases involved in this translation, and students learn to
apply the abstract machine models to particular tasks in the compilation process. The compiler
phases of lexical analysis, syntax analysis, code generation and code optimization are dealt with
in detail, and an overview of the other phases of compilation is given. Students are expected to
apply the acquired knowledge to design a language translator.
Part I: Abstract Models of Computation
This part of the course introduces languages and grammars and develops the three abstract
computational models (finite automata, pushdown automata and Turing machines) used to
generate or accept these languages.
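As an illustration of the simplest of these models, the following is a minimal sketch of a DFA simulator in Python. The function name `dfa_accepts`, the transition-table layout and the example language (binary strings with an even number of 0s) are assumptions chosen for this example; they are not taken from the prescribed textbooks.

```python
def dfa_accepts(transitions, start, accepting, text):
    """Run a DFA over `text` and report whether it ends in an accepting state.

    `transitions` maps (state, symbol) pairs to the next state. A missing
    entry is treated as a move to a dead state, so the input is rejected.
    """
    state = start
    for symbol in text:
        key = (state, symbol)
        if key not in transitions:
            return False
        state = transitions[key]
    return state in accepting


# Hypothetical example: binary strings containing an even number of 0s.
EVEN_ZEROS = {
    ("even", "0"): "odd",
    ("even", "1"): "even",
    ("odd", "0"): "even",
    ("odd", "1"): "odd",
}

print(dfa_accepts(EVEN_ZEROS, "even", {"even"}, "1001"))  # True
print(dfa_accepts(EVEN_ZEROS, "even", {"even"}, "10"))    # False
```

The same table-driven loop is what a scanner generated from regular expressions typically executes, which is one reason the course treats automata and lexical analysis together.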
Part II: Lexical and Syntax Analysis
This part of the course deals with the algorithms and computational models that take a
high-level language program as input and check it for correct syntax.
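A small sketch of how lexical and syntax analysis fit together is given below, assuming a toy grammar for arithmetic expressions (expr -> term (('+' | '-') term)*, term -> factor (('*' | '/') factor)*, factor -> NUMBER | '(' expr ')'). The names `tokenize` and `Parser`, and the tuple-based syntax tree, are illustrative choices rather than anything specified by the course. The parser is a hand-written top-down (recursive-descent) parser, not a YACC-generated one.

```python
import re

# A number, or any single non-space character (validated below).
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\S))")


def tokenize(source):
    """Lexical analysis: turn the source text into (kind, value) tokens."""
    tokens = []
    for number, op in TOKEN_RE.findall(source):
        if number:
            tokens.append(("NUM", int(number)))
        elif op in "+-*/()":
            tokens.append((op, op))
        else:
            raise SyntaxError(f"unexpected character {op!r}")
    return tokens


class Parser:
    """Syntax analysis: one method per nonterminal of the toy grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()[0]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()[0]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind = self.peek()
        if kind == "NUM":
            return self.advance()[1]
        if kind == "(":
            self.advance()
            node = self.expr()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return node
        raise SyntaxError(f"unexpected token {kind!r}")


print(Parser(tokenize("1 + 2 * 3")).parse())  # ('+', 1, ('*', 2, 3))
```

Splitting the grammar into `expr`, `term` and `factor` is what encodes operator precedence here; a bottom-up parser would obtain the same tree from precedence declarations or from the LR table instead.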
Part III: Code Generation and Optimization
The algorithms involved in code generation and code optimization are explained to students in
this part of the course.
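To make these two ideas concrete, here is a minimal sketch, assuming a syntax tree stored as nested tuples `(op, left, right)` with integer literals and variable names at the leaves. The helper names `fold` and `gen` are hypothetical. `fold` performs constant folding, one simple optimization; `gen` emits three-address code, a common intermediate form. Division is left out to keep the sketch short.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Constant folding: evaluate operators whose operands are both literals."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)


def gen(node, code):
    """Append three-address instructions to `code`; return the result's name."""
    if not isinstance(node, tuple):
        return str(node)
    op, left, right = node
    a = gen(left, code)
    b = gen(right, code)
    temp = f"t{len(code) + 1}"  # one fresh temporary per emitted instruction
    code.append(f"{temp} = {a} {op} {b}")
    return temp


tree = ("+", "x", ("*", 2, 3))

plain = []
gen(tree, plain)
print(plain)      # ['t1 = 2 * 3', 't2 = x + t1']

optimized = []
gen(fold(tree), optimized)
print(optimized)  # ['t1 = x + 6']
```

Running the optimization before code generation removes an instruction and a temporary, which is the kind of saving that the basic-block optimizations in this part of the course aim for on a larger scale.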
What is the format of the course?
[Is it face to face, online or blended? How many contact hours? Does it have lectures, lab
sessions, discussion classes?]
This course is designed with 150 minutes of in-classroom sessions per week, 30 minutes of
video/reading instructional material per week, as well as 200 minutes of non-contact time spent
on implementing the course-related project. Generally, this course should combine
lectures, in-class discussion, guest lectures, mandatory off-class reading material, and quizzes.

How are students assessed?


[What type, and number, of assignments are students expected to do? (papers, problem sets,
programming projects, etc.). How long do you expect students to spend on completing assessed
work?]

Students are assessed on a combination of group activities, classroom discussion,
assignments, projects, and continuous and final assessment tests.
A minimum of six assignments shall be given to students in addition to the project. The
assignments may be given in the earlier stage of the course, before the students start the
project.
Students can earn additional weightage based on a certificate of completion of a related
MOOC course.

Session wise plan

Columns: Class Hour, Lab Hour, Topic Covered, Levels of mastery, Reference Book, Remarks

Class Hour: 3
Topic Covered: Introduction To Languages, Grammars and Compilers - Overview of a computational model - Languages and grammars - Alphabets - Strings - Operations on languages - Analysis of the Source Program - Phases of a Compiler
Levels of mastery: Familiarity
Reference Book: T1, T2
Remarks: Several applications of automata theory, such as natural language processing and bioinformatics, may be quoted, and compiler design shall be introduced as an application of automata theory that is to be dealt with in detail

Topic Covered: Regular Expressions and Finite Automata - Finite automata - DFA - NFA
Levels of mastery: Familiarity
Reference Book: T1, R1
Topic Covered: Design of DFA and NFA - Equivalence of NFA and DFA (With Proof)
Levels of mastery: Usage
Reference Book: T1, R1
Remarks: Assignment1 with exercise problems in text/reference book is to be given

Topic Covered: Regular expressions - Conversion between RE and FA (With Proof)
Levels of mastery: Usage
Reference Book: T1, R1
Remarks: Assignment2 with exercise problems in text/reference book is to be given

Topic Covered: Lexical Analysis - Recognition of Tokens - Designing a Lexical Analyzer using finite automata
Levels of mastery: Familiarity
Reference Book: T2, R2

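Topic Covered: Myhill-Nerode Theorem - Minimization of FA
Levels of mastery: Familiarity
Reference Book: T1, R1

Topic Covered: Decision properties of regular languages - Pumping lemma for Regular languages (With Proof)
Levels of mastery: Usage
Reference Book: T1, R1

Topic Covered: CFG, PDAs and Turing Machines - CFG - Chomsky Normal Forms - NPDA - DPDA
Levels of mastery: Familiarity
Reference Book: T1, R1

Topic Covered: Membership algorithm for CFG
Levels of mastery: Usage
Reference Book: T1, R1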

Topic Covered: Syntax Analysis - Top-Down Parsing - Bottom-Up Parsing - Operator-Precedence Parsing
Levels of mastery: Familiarity
Reference Book: T2, R2

Topic Covered: LR Parsers
Levels of mastery: Familiarity
Reference Book: T2, R2

Topic Covered: Turing Machines - Recursive and recursively enumerable languages
Levels of mastery: Usage
Reference Book: T1, R1

Topic Covered: Linear bounded automata - Chomsky's hierarchy - Halting problem
Levels of mastery: Usage
Reference Book: T1, R1

Remarks: Assignment3 with exercise problems in text/reference book is to be given
Remarks: Assignment4 with exercise problems in text/reference book is to be given
Remarks: Assignment5 with exercise problems in text/reference book is to be given
Remarks: Assignment6 with exercise problems in text/reference book is to be given

45 Hours (3 Credit hours, 15 Weeks schedule)

Topic Covered: Intermediate Code Generation - Intermediate Languages - Declarations - Assignment Statements - Boolean Expressions - Case Statements - Backpatching - Procedure Calls.
Levels of mastery: Familiarity
Reference Book: T2, R2

Topic Covered: Code Optimization - Basic Blocks and Flow Graphs - The DAG Representation of Basic Blocks - The Principal Sources of Optimization - Optimization of Basic Blocks - Loops in Flow Graphs - Peephole Optimization - Introduction to Global Data-Flow Analysis
Levels of mastery: Familiarity
Reference Book: T2, R2

Topic Covered: Code Generation - Issues in the Design of a Code Generator - The Target Machine - Run-Time Storage Management - Next-Use Information - Register Allocation and Assignment - A Simple Code Generator - Generating Code from DAG
Levels of mastery: Familiarity
Reference Book: T2, R2

Topic Covered: Recent Trends - Just-in-time compilation with adaptive optimization for dynamic languages - Parallelizing Compilers
Levels of mastery: Familiarity
Reference Book: T2, R2
Remarks: These topics can be dealt with in a flipped-classroom format. Video lectures may be newly prepared or taken from the web and further discussed in the class
