Compilers Design
Dr. Ayman Hamarsheh
Lecture 1
Programming languages are notations for describing computations to people and to machines. All the software running on all the computers was written in some programming language. Before a program can be run, it must first be translated into a form in which it can be executed by a computer. The software systems that do this translation are called compilers.
A compiler is a program that can read a program in one language (the source language) and translate it into an equivalent program in another language (the target language). An important role of the compiler is to report any errors in the source program that it detects during the translation process. If the target program is an executable machine-language program, it can then be called by the user to process inputs and produce outputs.
An interpreter is another common kind of language processor. Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user. The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs. An interpreter, however, can usually give better error diagnostics than a compiler, because it executes the source program statement by statement.
The main advantages of compilers: they produce programs that run quickly, and they can spot syntax errors while the program is being compiled (i.e., you are informed of any grammatical errors before you try to run the program).
The main advantages of interpreters: there is no lengthy "compile time" (i.e., you do not have to wait for the program to compile between writing it and running it), and interpreted programs tend to be more "portable", which means that they will run on a greater variety of machines.
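The difference between the two kinds of language processor can be sketched with a toy example. The nested-tuple "source language" and the names `interpret`, `translate`, and `compile_program` below are assumptions made for illustration only; Python itself stands in for the target machine.

```python
# Toy source program: (x + 2) * y, written as a nested tuple (hypothetical format).
SOURCE = ("*", ("+", "x", 2), "y")

def interpret(node, env):
    """Interpreter: execute the source program directly on the inputs."""
    if isinstance(node, tuple):
        op, left, right = node
        a, b = interpret(left, env), interpret(right, env)
        return a + b if op == "+" else a * b
    if isinstance(node, str):      # variable reference
        return env[node]
    return node                    # numeric literal

def translate(node):
    """Compiler, step 1: translate the source tree into target-language text."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({translate(left)} {op} {translate(right)})"
    return str(node)

def compile_program(node, params):
    """Compiler, step 2: build a target program that can be called repeatedly."""
    target_text = f"lambda {', '.join(params)}: {translate(node)}"
    return eval(target_text)       # Python plays the role of the machine here

target = compile_program(SOURCE, ["x", "y"])
print(interpret(SOURCE, {"x": 1, "y": 4}))   # walks the source tree on every run
print(target(1, 4))                          # runs the already translated program
```

Both calls compute the same value; the interpreter re-examines the source tree on each run, while the compiled function pays the translation cost once.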
In addition to a compiler, several other programs may be required to create an executable target program. A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor. The preprocessor may also expand shorthands, called macros, into source language statements.
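Macro expansion can be illustrated with a very small sketch. The function `preprocess` and the macro names are hypothetical; a real preprocessor also handles file inclusion, macro parameters, and nested expansion, none of which is attempted here.

```python
import re

def preprocess(source, macros):
    """Replace each whole-word macro name with its expansion text (single pass)."""
    if not macros:
        return source
    names = "|".join(re.escape(name) for name in macros)
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda match: macros[match.group(1)], source)

# Hypothetical shorthands and a two-line source program that uses them.
macros = {"MAX_LEN": "100", "SQUARE_OF_TWO": "(2 * 2)"}
source = "buffer = [0] * MAX_LEN\narea = SQUARE_OF_TWO"

print(preprocess(source, macros))
```

The word-boundary anchors keep a longer identifier such as `MAX_LENGTH` from being rewritten by accident.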
The compiler may produce an assembly language program as its output, because assembly language is easier to produce as output and is easier to debug. The assembly language is then processed by a program called an assembler that produces relocatable machine code as its output.
Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine. The linker resolves external memory addresses, where the code in one file may refer to a location in another file. The loader then puts together all of the executable object files into memory for execution.
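The linker's job of resolving external addresses can be sketched as a two-pass procedure. The object-file layout used here (a dictionary with `defines` and `code` entries) and the instruction names are invented for illustration and do not correspond to any real object format.

```python
def link(object_files):
    """Concatenate object files and replace symbolic operands with addresses."""
    # Pass 1: lay the files out one after another and record where each
    # defined symbol ends up in the combined image.
    addresses, base = {}, 0
    for obj in object_files:
        for name, offset in obj["defines"].items():
            addresses[name] = base + offset
        base += len(obj["code"])
    # Pass 2: copy the code, patching every symbolic reference.
    image = []
    for obj in object_files:
        for opcode, operand in obj["code"]:
            if isinstance(operand, str):
                if operand not in addresses:
                    raise KeyError(f"undefined symbol: {operand}")
                operand = addresses[operand]
            image.append((opcode, operand))
    return image

# Two hypothetical relocatable files: main refers to a location in the library.
main_o = {"defines": {"main": 0},
          "code": [("CALL", "square"), ("HALT", None)]}
lib_o = {"defines": {"square": 0},
         "code": [("MUL", None), ("RET", None)]}

print(link([main_o, lib_o]))
```

Linking `main_o` alone raises an error for the unresolved symbol, which mirrors the "undefined reference" diagnostics that linkers report.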
[Figure: Source Program -> Translators (Compilers, Interpreters) -> Target Program]
[Figure: Source code -> Front End -> Intermediate Representation -> Back End -> Machine code, with Errors reported]
The analysis part (the front end) collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part (the back end). During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
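As a rough illustration of these two data structures, the sketch below borrows Python's own front end through the standard `ast` module. The two-line source program and the fields stored in each symbol-table entry are assumptions chosen for the example; a real compiler records much more (types, scopes, storage locations).

```python
import ast

source = "rate = 60\nposition = initial + rate * 2\n"

tree = ast.parse(source)   # hierarchical structure built during analysis

# Symbol table: each name is mapped to the first line where it appears and
# to whether the program assigns to it anywhere.
symbol_table = {}
for node in ast.walk(tree):
    if isinstance(node, ast.Name):
        entry = symbol_table.setdefault(
            node.id, {"first_line": node.lineno, "assigned": False})
        entry["first_line"] = min(entry["first_line"], node.lineno)
        if isinstance(node.ctx, ast.Store):
            entry["assigned"] = True

# The tree for the second statement's right-hand side: '+' at the root,
# with the '*' operation nested beneath it.
print(ast.dump(tree.body[1].value))
for name, info in sorted(symbol_table.items()):
    print(name, info)
```

The entry for `initial` shows a name that is used but never assigned, the kind of fact later phases can check.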
Definition of Syntax
In computer science, the syntax of a programming language is the set of rules that define the combinations of symbols that are considered to be correctly structured programs in that language. The syntax of a language defines its surface form. Text-based programming languages are based on sequences of characters. Visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical).
The syntax of a programming language describes the proper form of its programs. The syntax of textual programming languages is usually defined using a combination of regular expressions (for lexical structure) and Backus-Naur Form (for grammatical structure) to inductively specify syntactic categories (nonterminals) and terminal symbols.
The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program. The syntax of most programming languages can be specified using a Type-2 grammar in the Chomsky hierarchy, i.e., a context-free grammar.
Semantics
In general, the input to the semantic stage of analysis may be viewed as being a set of possible parses of the sentence, and information about the possible word meanings.
Lecture 4
In-depth Study of Syntactic Specifications
Syntactic
The syntactic analysis of source code usually entails the transformation of the linear sequence of tokens into a hierarchical syntax tree (abstract syntax trees are one convenient form of syntax tree).
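The transformation from a linear token sequence to a tree can be sketched with a small recursive-descent parser. The grammar in the comment, the token kinds, and the tuple-based tree shape are all choices made for this example, not definitions taken from the lecture.

```python
import re

# Assumed grammar for the sketch:
#   expr   -> term   (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | IDENTIFIER | '(' expr ')'
TOKEN = re.compile(
    r"(?P<num>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<op>[-+*/()])|(?P<ws>\s+)|(?P<bad>.)")

def tokenize(text):
    """Scanner: produce the linear sequence of (kind, value) tokens."""
    tokens = []
    for match in TOKEN.finditer(text):
        kind = match.lastgroup
        if kind == "ws":
            continue
        if kind == "bad":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens

def parse(tokens):
    """Parser: build a nested-tuple syntax tree from the token sequence."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, value = advance()
        if kind == "num":
            return int(value)
        if kind == "id":
            return value
        if value == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "end":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree

print(parse(tokenize("a + 2 * (b - 1)")))
```

The nesting of the resulting tuples reflects operator precedence: the multiplication sits below the addition, and the parenthesized subtraction sits below the multiplication.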
Syntax definition
The syntax of a language describes the form of a valid program, but does not provide any information about the meaning of the program or the results of executing that program. The meaning given to a combination of symbols is handled by semantics. Not all syntactically correct programs are semantically correct.
Using natural language as an example, it may not be possible to assign a meaning to a grammatically correct sentence, or the sentence may be false: "John is a married bachelor." is grammatically well-formed but has no generally accepted meaning.
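A programming-language counterpart of such a sentence can be shown with Python itself, used here only as a convenient example language. The first program below is well-formed but has no meaning under Python's rules for `+`; the second is not even well-formed. Note that Python detects this particular semantic error at run time, whereas a statically typed language would typically reject it during compilation.

```python
import ast

# Syntactically correct, semantically incorrect: adding a number to a string.
program = "total = 1 + 'one'"

tree = ast.parse(program)              # the syntax check succeeds
try:
    exec(compile(tree, "<example>", "exec"), {})
    outcome = "ran"
except TypeError as error:
    outcome = f"semantic error: {error}"
print(outcome)

# Syntactically incorrect: rejected before any meaning is considered.
try:
    ast.parse("total = 1 +")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False
print("second program parses:", syntax_ok)
```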
No ambiguity is allowed in programming languages, in either form (syntax) or meaning (semantics). Distinction between syntax and semantics: many programming languages have features that mean the same thing (shared semantics) but are expressed differently (different syntax). Identifying which is which eases the learning curve.
Syntax Specification
Formalism: set of production rules
Microsyntax rules: concatenation, alternation (choice among finite alternatives), Kleene closure
- The set of strings produced by these three rules is a regular set or regular language
- The rules are specified by regular expressions; they generate the regular language
- Strings in the regular language are recognized by scanners
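The three microsyntax rules can be tried out with Python's `re` module. The two patterns and the helper `recognizes` are illustrative assumptions; Python's regular expressions also offer features beyond regular languages, but only concatenation, alternation, and Kleene closure (plus character-class shorthand for alternation) are used here.

```python
import re

# Concatenation: one leading character followed by the rest of the name.
# Alternation: [A-Za-z_] is shorthand for A|B|...|z|_ .
# Kleene closure: * repeats the preceding expression zero or more times.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# The same three operations written out explicitly: (0|1)(0|1)*
BINARY = re.compile(r"(0|1)(0|1)*")

def recognizes(pattern, text):
    """Scanner-style check: is the whole string in the regular language?"""
    return pattern.fullmatch(text) is not None

for text in ["count_1", "1count", "0101", "012"]:
    print(text, recognizes(IDENTIFIER, text), recognizes(BINARY, text))
```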