
Context-Free Grammar

Logic and Discrete Structure

6/28/2011 Sadaqatullah Noonari BS-2


Introduction:
We communicate through languages. A language is built from sentences, words, single strings and the empty string, all of which are formed by combining elements of a single set called the alphabet of the language. These sentences, words and strings are in turn combined according to rules that the language follows. These rules are called the grammar of the language. A formal language is a language that strictly follows its grammar. Let's take the example of the English language, which follows its grammar strictly when used formally:
i. A sentence in English is made up of a noun phrase followed by a verb phrase.
ii. A noun phrase is made up of an article followed by a noun, OR an article followed by an adjective followed by a noun.
iii. A verb phrase is made up of a verb, or a verb followed by an adverb.
iv. The articles are a, an and the.
v. The adjectives are large, hungry, quick, etc.
vi. The nouns are DVD, book, rabbit, mathematician, etc.
vii. The verbs are grow, jump, write, break, etc.
viii. The adverbs are quickly, neatly, loudly, etc.
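The eight rules above can be checked mechanically. The following is a minimal sketch, not a full English parser: the function name is_sentence is hypothetical, the word lists contain only the words named in the rules (the "etc." entries are left out), and, like the rules themselves, it ignores agreement between subject and verb.

```python
# Word lists taken from rules iv to viii; the "etc." entries are omitted.
ARTICLES = {"a", "an", "the"}
ADJECTIVES = {"large", "hungry", "quick"}
NOUNS = {"dvd", "book", "rabbit", "mathematician"}
VERBS = {"grow", "jump", "write", "break"}
ADVERBS = {"quickly", "neatly", "loudly"}


def is_sentence(text):
    """Return True if text is a noun phrase followed by a verb phrase."""
    words = text.lower().split()
    pos = 0

    # Noun phrase: article, optional adjective, noun (rule ii).
    if pos < len(words) and words[pos] in ARTICLES:
        pos += 1
    else:
        return False
    if pos < len(words) and words[pos] in ADJECTIVES:
        pos += 1
    if pos < len(words) and words[pos] in NOUNS:
        pos += 1
    else:
        return False

    # Verb phrase: verb, optional adverb (rule iii).
    if pos < len(words) and words[pos] in VERBS:
        pos += 1
    else:
        return False
    if pos < len(words) and words[pos] in ADVERBS:
        pos += 1

    # Rule i: nothing may follow the verb phrase.
    return pos == len(words)


print(is_sentence("the hungry rabbit jump quickly"))
print(is_sentence("rabbit the jump"))
```

The first call follows the pattern article, adjective, noun, verb, adverb and is accepted; the second starts with a noun instead of an article and is rejected.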

Computers also use languages. These are formal languages, so they strictly follow the rules of their grammars. The kind of grammar used by computers is called a phrase structure grammar. The term was introduced by Noam Chomsky for grammars defined by phrase structure rules. Some authors reserve the term for the more restricted grammars in the Chomsky hierarchy, such as context-sensitive or context-free grammars. (Kenneth H Rosen)


Definition
A context-free grammar G is defined by the 4-tuple G = (V, Σ, R, S), where:
1. V is a finite set; each element v ∈ V is called a non-terminal character or a variable. Each variable represents a different type of phrase or clause in the sentence. Variables are also sometimes called syntactic categories. Each variable defines a sub-language of the language defined by G.
2. Σ is a finite set of terminals, disjoint from V, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar G.
3. R is a finite relation from V to (V ∪ Σ)*, where the asterisk represents the Kleene star operation. The members of R are called the (rewrite) rules or productions of the grammar.
4. S is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of V.
(http://en.wikipedia.org/wiki/Context-free_grammar#Examples)
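The four components of the definition map naturally onto a small data structure. The sketch below is one possible encoding, not a standard API: the names Grammar and is_well_formed are hypothetical, each rule is stored as a pair (head, body), and an empty body stands for the empty string.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    variables: frozenset   # V: the non-terminals
    terminals: frozenset   # Sigma: the alphabet
    rules: tuple           # R: pairs (head, body), body is a tuple of symbols
    start: str             # S: the start variable


def is_well_formed(g):
    """Check the side conditions stated in the 4-tuple definition."""
    symbols = g.variables | g.terminals
    return (
        g.variables.isdisjoint(g.terminals)            # Sigma disjoint from V
        and g.start in g.variables                     # S is an element of V
        and all(
            head in g.variables                        # rules go from V ...
            and all(sym in symbols for sym in body)    # ... to (V u Sigma)*
            for head, body in g.rules
        )
    )


# The grammar S -> aSb | (empty string), written as a 4-tuple.
anbn = Grammar(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    rules=(("S", ("a", "S", "b")), ("S", ())),
    start="S",
)
print(is_well_formed(anbn))
```

Each rule has exactly one variable on its left-hand side, which is what makes the grammar context-free: a variable can be rewritten regardless of the symbols around it.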

Explanation
A grammar is a formal system that describes how any legal text can be derived from a distinguished symbol called the axiom, or sentence symbol, of a language. It consists of a set of productions, each of which states that a sequence of symbols can replace a given symbol. The grammar serves as the data for the following algorithm, which derives a legal text:
1. Let text be a single occurrence of the axiom.
2. If no production states that a symbol currently in text can be replaced by some sequence of symbols, then stop.
3. Rewrite text by replacing one of its symbols with a sequence according to some production.
4. Go to step (2).
When this algorithm terminates, text is a legal text in the language. The phrase structure of that text is the hierarchy of sequences used in its derivation. (Eli)
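The four-step algorithm can be sketched directly in code. This is an illustrative version under stated assumptions: the function name derive, the random choice of symbol and production, and the max_steps safety limit are not part of the description above, which leaves the choice open and does not guarantee termination.

```python
import random


def derive(rules, axiom, rng, max_steps=1000):
    """Derive one legal text by repeatedly rewriting symbols.

    rules maps each replaceable symbol to a list of replacement sequences.
    """
    text = [axiom]                                    # step 1
    for _ in range(max_steps):
        # Positions of symbols that some production can replace.
        positions = [i for i, sym in enumerate(text) if sym in rules]
        if not positions:                             # step 2: stop
            return "".join(text)
        i = rng.choice(positions)                     # step 3: rewrite
        text[i:i + 1] = rng.choice(rules[text[i]])
        # step 4: the loop returns to step 2
    raise RuntimeError("derivation did not terminate within max_steps")


# Productions S -> aSb and S -> (empty sequence).
RULES = {"S": [["a", "S", "b"], []]}
print(repr(derive(RULES, "S", random.Random(0))))
```

Every text returned for these productions consists of some number of a's followed by the same number of b's; which one is produced depends on the random choices.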


Construction of Context-Free Grammar


A context-free grammar (CFG) is a set of recursive productions used to generate patterns of strings. A context-free grammar is based upon the following four components:
1. A set of terminal symbols, which are the characters of the alphabet that appear in the strings generated by the grammar.
2. A set of non-terminal symbols, which are placeholders for the patterns of terminal symbols that can be generated from them.
3. A set of productions, which are the rules for rewriting a non-terminal symbol in a string with other terminal or non-terminal symbols.
4. A special non-terminal start symbol, which appears in the initial string generated by the grammar.
To generate a string of terminal symbols from a context-free grammar, the following mechanism is applied:
1. Begin with a string that consists of the start symbol.
2. Apply one of the productions with the start symbol on the left-hand side, replacing the start symbol with the right-hand side of the production.
3. Repeat: select a non-terminal symbol in the string and replace it with the right-hand side of one of its productions, until all non-terminals have been replaced by terminal symbols.
(www.cs.rochester.edu)

Symbolic representation
Here is a simple context-free grammar (CFG):
S -> AB
S -> ASB
A -> a
B -> b
Here -> is called the rewrite arrow, and the four expressions listed above are called context-free rules. Besides the arrow, two kinds of symbols are used in this grammar: terminal and non-terminal symbols. In the grammar above, the terminal symbols are a and b, whereas the non-terminal symbols are A, S and B. Non-terminal symbols may occur to the left or to the right of the rewrite arrow, such as S in the example above, but terminal symbols occur only to the right of it. Every context-free grammar has a special start or sentence symbol, which is commonly denoted by S. Some context-free grammars also use the special symbol epsilon (ε) to denote the null (empty) string. The empty string can only occur on the right side of the rewrite arrow; e.g. the rule S -> ε can be added to the list given above, but not ε -> A. The interpretation of a context-free rule is simple: one can replace an occurrence of the symbol on the left side of the rule by the symbols on the right side. (http://en.wikipedia.org/wiki/Context-free_grammar#Examples)
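To see which strings the four rules S -> AB, S -> ASB, A -> a, B -> b produce, one can apply the replacement rule exhaustively. The sketch below is an assumed illustration (the function name generate and the breadth-first strategy are choices made here, not taken from the source); it always rewrites the leftmost non-terminal and discards forms longer than a given bound.

```python
from collections import deque

# The four context-free rules, grouped by left-hand side.
RULES = {
    "S": ["AB", "ASB"],
    "A": ["a"],
    "B": ["b"],
}


def generate(start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    found = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Index of the leftmost non-terminal, or None if only terminals remain.
        idx = next((i for i, c in enumerate(form) if c in RULES), None)
        if idx is None:
            found.add(form)
            continue
        for body in RULES[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            # No rule here shortens a form, so overlong forms can be pruned.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=len)


print(generate("S", 6))  # ['ab', 'aabb', 'aaabbb']
```

The output suggests that this grammar generates the strings consisting of n a's followed by n b's, for n at least 1.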

Some CFG Examples:


Example 1
A simple context-free grammar is
S -> aSb | ε
where | is used to separate different options for the same non-terminal and ε stands for the empty string. This grammar generates the language { a^n b^n : n ≥ 0 }, which is not regular.
Example 2
Here is a context-free grammar for syntactically correct infix algebraic expressions in the variables x, y and z:
S -> T + S | T - S | T
T -> U * T | U / T | U
U -> (S) | x | y | z
This grammar can, for example, generate the string "( x + y ) * x - z * y / ( x + x )".

Example 3
A context-free grammar for the language consisting of all strings over {a,b} which contain a different number of a's than b's is
S -> U | V
U -> TaU | TaT
V -> TbV | TbT
T -> aTbT | bTaT | ε
Here, T can generate all strings with the same number of a's as b's, U generates all strings with more a's than b's and V generates all strings with fewer a's than b's.
Example 4
A regular grammar:
S -> a
S -> aS
S -> bS
The terminals here are a and b, while the only non-terminal is S. The language described is all nonempty strings of a's and b's that end in a. This grammar is regular: no rule has more than one non-terminal in its right-hand side, and each of these non-terminals is at the same end of the right-hand side. Every regular grammar corresponds directly to a nondeterministic finite automaton, so we know that this is a regular language. It is common to list all right-hand sides for the same left-hand side on the same line, using | to separate them. Hence the grammar above can be described more tersely as follows:
S -> a | aS | bS
(http://neohumanism.org)
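The correspondence between the regular grammar of Example 4 and a nondeterministic finite automaton can be made concrete. In the sketch below, which is an illustration and not part of the cited source, the rules aS and bS become transitions from state S back to S, and the rule a becomes a transition from S to an accepting state; the names ACCEPT, accepts and agrees_with_regex are hypothetical. The result is compared against the regular expression [ab]*a, which describes "nonempty strings of a's and b's that end in a".

```python
import re
from itertools import product

ACCEPT = "ACCEPT"

# NFA transitions read off the rules S -> a | aS | bS.
TRANSITIONS = {
    ("S", "a"): {"S", ACCEPT},   # from S -> aS and S -> a
    ("S", "b"): {"S"},           # from S -> bS
}


def accepts(word):
    """Simulate the NFA by tracking the set of reachable states."""
    states = {"S"}
    for ch in word:
        states = set().union(*(TRANSITIONS.get((q, ch), set()) for q in states))
    return ACCEPT in states


def agrees_with_regex(max_len):
    """Compare the NFA with the regex [ab]*a on all words up to max_len."""
    pattern = re.compile(r"[ab]*a")
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if accepts(word) != bool(pattern.fullmatch(word)):
                return False
    return True


print(accepts("abba"), accepts("ab"), agrees_with_regex(6))
```

An exhaustive comparison on short words is only a sanity check, not a proof, but it matches the description of the language given in Example 4.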

Applications of CFG
Grammars are used to describe programming languages. Most importantly, there is a mechanical way of turning the description of a language as a context-free grammar (CFG) into a parser, the component of the compiler that discovers the structure of the source program and represents that structure as a tree. As another example, the Document Type Definition (DTD) feature of XML (Extensible Markup Language) is essentially a context-free grammar that describes the allowable tags and the ways in which these tags may be nested. For example, one could mark a sequence of characters that is intended to be interpreted as a phone number by enclosing it between <PHONE> and </PHONE>. Grammar inference also holds great potential, for example in facilitating Domain-Specific Language (DSL) development for experts who are not well versed in language design. [Mernik et al., 05] (Gur Saran Adhar)
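As an illustration of turning a grammar into a parser, the sketch below hand-writes a recursive-descent parser for the infix expression grammar of Example 2, taken here as S -> T + S | T - S | T, T -> U * T | U / T | U, U -> (S) | x | y | z. It is one simple technique, not the mechanical parser generation that compilers typically use, and the function names and the nested-tuple tree format are assumptions made for this example.

```python
def parse(text):
    """Parse an infix expression and return its structure as nested tuples."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected {tok!r} at position {pos}")
        pos += 1
        return tok

    def parse_s():                       # S -> T + S | T - S | T
        left = parse_t()
        if peek() in ("+", "-"):
            op = take()
            return (op, left, parse_s())
        return left

    def parse_t():                       # T -> U * T | U / T | U
        left = parse_u()
        if peek() in ("*", "/"):
            op = take()
            return (op, left, parse_t())
        return left

    def parse_u():                       # U -> ( S ) | x | y | z
        if peek() == "(":
            take("(")
            inner = parse_s()
            take(")")
            return inner
        tok = take()
        if tok not in ("x", "y", "z"):
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

    tree = parse_s()
    if peek() is not None:
        raise SyntaxError(f"trailing input at position {pos}")
    return tree


print(parse("x + y * z"))  # ('+', 'x', ('*', 'y', 'z'))
```

Each non-terminal becomes one function, and each function mirrors the alternatives of its rule. Because the grammar is right-recursive, an input such as "x - y - z" is grouped as x - (y - z); a grammar intended for evaluating arithmetic would usually be arranged to group to the left instead.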


Bibliography
Eli, h.-p. (n.d.).
Gur Saran Adhar, h. (n.d.). Applications of CFG.
http://en.wikipedia.org/wiki/Context-free_grammar#Examples. (n.d.).
http://neohumanism.org, h. (n.d.).
Kenneth H Rosen, 6. E. (n.d.). Discrete Mathematics and Its Applications.
www.cs.rochester.edu, h. (n.d.).
