
Context Free Pumping Lemma
Zeph Grunschlag

Agenda
Context Free Pumping

Motivation
Theorem
Proof

Proving non-Context Freeness

Examples on slides
Examples on blackboard

Pumping FAs

[Figure: DFA with transitions labeled 0 and 1]

Strings of length 3 or more in the DFA above can be pumped, because such strings correspond to paths of length at least 3, which visit at least 4 vertices. Hence the pigeonhole principle guarantees that some vertex is visited twice, which gives a pumpable cycle.
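The pigeonhole argument can be tried out on a small machine. The sketch below uses a hypothetical 3-state DFA (accepting strings that end in 00); it is not the DFA from the slide, which is not reproduced here, and the names `DELTA`, `accepts`, and `split_at_repeat` are chosen only for illustration.

```python
# Hypothetical 3-state DFA over {0, 1}; NOT the DFA drawn on the slide.
DELTA = {
    ("a", "0"): "b", ("a", "1"): "a",
    ("b", "0"): "c", ("b", "1"): "a",
    ("c", "0"): "c", ("c", "1"): "a",
}
START = "a"
ACCEPT = {"c"}


def accepts(s):
    """Run the DFA on s and report whether it ends in an accept state."""
    state = START
    for ch in s:
        state = DELTA[(state, ch)]
    return state in ACCEPT


def split_at_repeat(s):
    """Return (x, y, z) with s = xyz, where y traces a cycle in the DFA."""
    seen = {START: 0}  # state -> position at which it was first visited
    state = START
    for pos, ch in enumerate(s, start=1):
        state = DELTA[(state, ch)]
        if state in seen:
            first = seen[state]
            return s[:first], s[first:pos], s[pos:]
        seen[state] = pos
    return None  # no state repeated (only possible for short strings)


s = "100"
x, y, z = split_at_repeat(s)
for i in range(4):
    pumped = x + y * i + z
    print(i, pumped, accepts(pumped))
```

With 3 states, every string of length 3 visits 4 vertices, so `split_at_repeat` always finds a cycle for such strings.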

Pumping PDAs
[Figure: PDA diagram with stack symbol X]

However, the regular pumping lemma fails in this example.
Q: Give an example of a pattern that cannot be pumped.

Pumping PDAs
[Figure: PDA diagram with stack symbol X]

A: $(^n\,)^n$ can't be pumped in the first half alone.
However, we could pump two substrings at once. I.e., we can add k left parens provided we also add k right parens. In other words, we can tandem pump.

Tandem Pumping
DEF: A string s in L is said to be tandem pumpable if we can break s up into
s = uvxyz
such that for all $i \ge 0$ we have
$uv^i x y^i z \in L$
with at least one of v, y nonempty.
Q1: Is 00111 tandem pumpable in $0^*111$?
Q2: Is 00100 tandem pumpable in $\{0^n 1 0^n\}$?
Q3: Is 00100100 tandem pumpable in $\{0^n 1 0^n 1 0^n\}$?

Tandem Pumping
A1: Yes. Any pumpable string is automatically tandem pumpable by letting $y = \varepsilon$. In our case, let $u = \varepsilon$, v = 00, $x = y = \varepsilon$, z = 111. Then $uv^i x y^i z = (00)^i 111$ is indeed in $0^*111$.
A2: Yes. Let $u = \varepsilon$, v = 00, x = 1, y = 00 and $z = \varepsilon$. Then $uv^i x y^i z = (00)^i 1 (00)^i$ is indeed in $\{0^n 1 0^n\}$.
A3: NO! Tandem pumping 00100100 either leads to too many 1s, or increases at most two of the 0-streaks without the ability to increase the remaining 0-streak.
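The three answers can be checked by brute force over all splits s = uvxyz. This is a minimal sketch: the helper names are made up for illustration, and only i = 0..4 is tested, so a returned split is a candidate witness, while a `None` result (as for Q3) rules out every split.

```python
import re


def decompositions(s):
    """Yield every split s = u v x y z with v or y nonempty."""
    n = len(s)
    for a in range(n + 1):
        for b in range(a, n + 1):
            for c in range(b, n + 1):
                for d in range(c, n + 1):
                    if b > a or d > c:
                        yield s[:a], s[a:b], s[b:c], s[c:d], s[d:]


def pump(parts, i):
    """Build u v^i x y^i z from a split."""
    u, v, x, y, z = parts
    return u + v * i + x + y * i + z


def find_tandem_pump(s, in_language, max_i=4):
    """Return a split whose pumped strings stay in the language for i = 0..max_i."""
    for parts in decompositions(s):
        if all(in_language(pump(parts, i)) for i in range(max_i + 1)):
            return parts
    return None


def in_q1(w):  # 0*111
    return re.fullmatch(r"0*111", w) is not None


def in_q2(w):  # {0^n 1 0^n}
    m = re.fullmatch(r"(0*)1(0*)", w)
    return m is not None and m.group(1) == m.group(2)


def in_q3(w):  # {0^n 1 0^n 1 0^n}
    m = re.fullmatch(r"(0*)1(0*)1(0*)", w)
    return m is not None and m.group(1) == m.group(2) == m.group(3)


print(find_tandem_pump("00111", in_q1) is not None)
print(find_tandem_pump("00100", in_q2) is not None)
print(find_tandem_pump("00100100", in_q3))  # None: no split survives
```

The split from A2, `("", "00", "1", "00", "")`, is one of the candidates the search accepts for Q2.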

Tandem Pumping
In general, since pumping automatically implies tandem pumping, all (infinite) regular languages are tandem pumpable. It turns out that all (infinite) context free languages are as well. But Q3 can be generalized to show that $\{0^n 1 0^n 1 0^n\}$ does not admit tandem pumping of strings which are past a certain length. This will end up proving that $\{0^n 1 0^n 1 0^n\}$ is not context free:

Context Free Pumping Lemma
THM: Given a context free language L, there is a number p (the tandem-pumping number) such that any string in L of length $\ge p$ is tandem-pumpable within a substring of length $\le p$. In other words, for all $s \in L$ with $|s| \ge p$ we can write:

s = uvxyz
$|vy| \ge 1$ (some pumpable stuff is non-empty)
$|vxy| \le p$ (pumping happens inside a portion of length at most p)
$uv^i x y^i z \in L$ for all $i \ge 0$ (tandem-pump v and y)

CFPL Intuition
[Figure: PDA computation split into segments u, v, x, y, z, with stack strings $s_1 s_2 \dots s_k$ and $t_1 t_2 \dots t_k$]

Intuitively, s = uvxyz is found as follows. Only finitely many stack changes are possible at cycles of length n (the number of states) in the graph. Thus if s is long enough, there will have to be some states q, r such that the same string is pushed at q as is popped at r, and such that the path from q to r starts and ends with the same stack configuration. With these assumptions, we can then pump up v and y in tandem, as v pushes the same stuff that y pops off.

CFPL - Proof
The previous intuition can actually be formalized and used to prove the Context Free Pumping Lemma. However, this is quite painful compared to the very simple grammar-theoretic proof:
Proof of CFPL: We may assume that the grammar for the language is in CNF (Chomsky normal form). This is not an essential assumption but it makes the proof a little easier.
Consider a derivation tree in which some occurring variable node has itself as an ancestor:

CFPL Proof
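[Figure: derivation tree with root S and yield "a chuga and a choo for you"; the variable A appears twice on one branch]

We could replace the last appearance of A by its first appearance. I.e., in the tree replace
$A \Rightarrow^*$ and a
by
$A \Rightarrow^*$ chuga and a choo
to get the following: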

CFPL Proof
[Figure: derivation tree with yield "a chuga chuga and a choo choo for you"]

And again:

CFPL Proof
[Figure: derivation tree with yield "a chuga chuga chuga and a choo choo choo for you"]

CFPL Proof
[Figure: the original derivation tree, with yield "a chuga and a choo for you"]

Or we could replace
$A \Rightarrow^*$ chuga and a choo
by
$A \Rightarrow^*$ and a
This is called tandem-pumping down:

CFPL Proof
[Figure: derivation tree with yield "a and a for you"]

CFPL Proof
In our particular case, we were able to create any string of the form
a (chuga)$^i$ and a (choo)$^i$ for you
In general, any branch down the derivation tree with a repeated variable gives rise to strings of the form $uv^i x y^i z$, all of which are in L. The end of the proof is just a counting argument to see when a repeated variable is guaranteed to occur.
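The way a repeated variable produces the pumped strings can be mimicked with a tiny recursive expansion. The grammar below (S → a A for you, A → chuga A choo | and a) is an assumption made for illustration: it is consistent with the yields on the slides but is not taken from them, and it is not in CNF.

```python
def expand_A(depth):
    """Yield of A when the recursive rule A -> chuga A choo is applied `depth` times."""
    if depth == 0:
        return ["and", "a"]  # A -> and a
    return ["chuga"] + expand_A(depth - 1) + ["choo"]


def expand_S(depth):
    """Yield of S under the assumed rule S -> a A for you."""
    return ["a"] + expand_A(depth) + ["for", "you"]


for i in range(4):
    print(i, " ".join(expand_S(i)))
```

Depth 1 gives the original string, depth 0 is the pumped-down string, and each extra level of recursion adds one "chuga" and one "choo" in tandem.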

CFPL Proof
Q: If n is the number of variables in
the grammar, what tree-height
guarantees that a variable is
repeated?
(Recall: the height of the trivial tree consisting of just a root is 0.)

CFPL Proof
A: If n is the number of variables in the
grammar, any subtree of height h =
n+1 will have a repeated variable.
This is because the bottom row of a
derivation tree is composed of
terminals, so height n+1 (= n+2
levels) guarantees n+1 levels of
variables, at least on one branch from
the root. Pigeonhole principle
guarantees that some variable will be
encountered twice!

CFPL Proof
Q: If the grammar is in CNF, what
kind of tree is any derivation tree?

CFPL Proof
A: A binary tree!
Q: What is the maximum number of
leaves that a binary tree of height n
may have?

CFPL Proof
A: $2^n$
Q: What is the maximum number of leaves that a CNF derivation tree of height n+1 may have?

CFPL Proof
A: Still $2^n$! This is because the only way to get a terminal is through a rule of the form $A \to a$, so there is no branching at the final level.
Q: What string length will guarantee a derivation tree of height n+1?

CFPL Proof
A: $2^n$. This is because no tree with height < n+1 could generate this many leaves, or terminal letters.
This leads to setting the tandem-pumping number to be $p = 2^n$.
The rest of the theorem follows from the above considerations. The only fact that we need to verify is that the pumping can happen within a substring of length at most p. This just follows from finding a repeated variable in the last n+2 levels of the tree.
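The counting argument can be collected in one place. The chain below is a summary sketch, assuming a CNF grammar with n variables, p = 2^n, and h denoting the height of a derivation tree for s; the last line, about |vy|, is a standard step that the slides leave implicit.

```latex
% n = number of variables, p = 2^n, h = height of a derivation tree for s
\begin{align*}
\#\text{leaves} &\le 2^{h-1}
  && \text{(CNF tree of height } h \ge 1\text{)}\\
|s| \ge 2^{n} > 2^{n-1} &\implies h \ge n+1
  && \text{(height} \le n \text{ gives at most } 2^{n-1} \text{ leaves)}\\
h \ge n+1 &\implies \text{some variable } A \text{ repeats on a branch}
  && \text{(} n+1 \text{ variable nodes, } n \text{ variables)}\\
\text{upper } A \text{ within the last } n+2 \text{ levels} &\implies |vxy| \le 2^{n} = p
  && \text{(its subtree has height} \le n+1\text{)}\\
\text{upper } A \to BC &\implies |vy| \ge 1
  && \text{(the child not containing the lower } A \text{ yields a nonempty string)}
\end{align*}
```

Here vxy is the yield of the subtree rooted at the upper occurrence of A, and x is the yield of the subtree rooted at the lower occurrence.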

Proving Non-Context Freeness
Standard method for applying the pumping lemma. Only no. 3 changes from example to example:
1. Suppose that the language is context free.
2. Then it would have a pumping no. p.
3. Find a string s in the language, of length at least p, which isn't tandem-pumpable within a substring of length at most p.
4. 2 and 3 contradict, so 1 must have been false and the language is not context free.

Proving Non-Context Freeness
Example 1
$L = \{1^n 0^n 1^n 0^n \mid n \text{ is non-negative}\}$

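Proving Non-Context Freeness
Example 1
The hard part is number 3! Try
$s = 1^p 0^p 1^p 0^p$
There are three cases for where the sliding window vxy could be.

[Figure: the string 11001100 with three window positions: I (first half), II (middle), III (second half)]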

Proving Non-Context Freeness
Example 1
Case I. Pumping up (or down) would change the number of 0s and/or 1s in the first half of the string without affecting the second half. This would violate the language definition.

Proving Non-Context Freeness
Example 1
Cases II and III. The same argument works as in Case I. (Case III would cause the second half to change without affecting the first half. Case II would cause the middle to change without either the first $1^p$ or the last $0^p$ changing.) This completes the proof of no. 3.
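For small p, the case analysis can be confirmed by exhaustive search over every window position. This is only a sanity check under stated assumptions, not a replacement for the proof: the helper names are illustrative, and the search covers p = 1..4 and i = 0..3.

```python
import re


def in_L(w):
    """Membership in {1^n 0^n 1^n 0^n | n >= 0}."""
    m = re.fullmatch(r"(1*)(0*)(1*)(0*)", w)
    if m is None:
        return False
    a, b, c, d = (len(g) for g in m.groups())
    return a == b == c == d


def windows(s, p):
    """Yield splits s = u v x y z with |vy| >= 1 and |vxy| <= p."""
    n = len(s)
    for a in range(n + 1):
        for d in range(a + 1, min(a + p, n) + 1):
            for b in range(a, d + 1):
                for c in range(b, d + 1):
                    if b > a or d > c:
                        yield s[:a], s[a:b], s[b:c], s[c:d], s[d:]


def survives(parts, in_language, max_i=3):
    """True if u v^i x y^i z stays in the language for i = 0..max_i."""
    u, v, x, y, z = parts
    return all(in_language(u + v * i + x + y * i + z) for i in range(max_i + 1))


survivors = {}
for p in range(1, 5):
    s = "1" * p + "0" * p + "1" * p + "0" * p
    survivors[p] = [parts for parts in windows(s, p) if survives(parts, in_L)]
    print(p, len(survivors[p]))  # no split survives for these p
```

Every split already fails at i = 0 or i = 2, which matches the argument that the window can touch at most two adjacent blocks.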


Proving Non-Context Freeness
Example 2
ADD = { x=y+z | x, y, and z are bitstrings which satisfy the equation }

Proving Non-Context Freeness
Example 2
The hard part is number 3! Define s by:
$1^{p+1}=1^p+10^p$
There are two cases for where the substring vxy could be. (Sliding p-window approach)

[Figure: the string s with the two window positions I and II marked]

Proving Non-Context Freeness
Example 2
Case I. v must occur to the left of = while y must occur to the right, as otherwise pumping would give too many ='s, or would affect one side of the equation and not the other. Let k be the length of v and l be the length of y (both at least 1). Pumping down results in the supposed equation $1^{p+1-k}=1^{p-l}+10^p$. This is impossible because the RHS is then greater than the LHS: the LHS is at most $2^p - 1$, while the RHS is at least $2^p$.

Proving Non-Context Freeness
Example 2
Case II. Pumping must occur to the right of =. The RHS is affected without affecting the LHS. This is impossible since we want the equation to hold in binary.
This finishes the proof that ADD is not context free.
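The arithmetic behind Case I is easy to check numerically. The sketch below assumes that x, y, and z must be nonempty bitstrings (the slides do not say either way) and uses illustrative helper names. It confirms that s is in ADD and that pumping down with k, l at least 1 and k + l + 1 at most p never yields a true equation, for p up to 9.

```python
def in_add(w):
    """Membership in ADD: w is x=y+z with nonempty bitstrings and x == y + z in binary."""
    if w.count("=") != 1 or w.count("+") != 1:
        return False
    lhs, rhs = w.split("=")
    if "+" not in rhs:
        return False
    y, z = rhs.split("+")
    for part in (lhs, y, z):
        # Assumption: empty operands are rejected.
        if part == "" or set(part) - {"0", "1"}:
            return False
    return int(lhs, 2) == int(y, 2) + int(z, 2)


def s_for(p):
    """The string 1^(p+1) = 1^p + 1 0^p used in the proof."""
    return "1" * (p + 1) + "=" + "1" * p + "+" + "1" + "0" * p


checked = 0
failures = []
for p in range(1, 10):
    assert in_add(s_for(p))
    # Case I: v is k ones left of "=", y is l ones inside 1^p, and |vxy| <= p.
    for k in range(1, p):
        for l in range(1, p - k):
            pumped_down = "1" * (p + 1 - k) + "=" + "1" * (p - l) + "+" + "1" + "0" * p
            checked += 1
            if in_add(pumped_down):
                failures.append(pumped_down)

print(checked, failures)
```

For example, with p = 3 and k = l = 1 the pumped-down string is 111=11+1000, that is 7 = 3 + 8, which is false.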


Blackboard Exercises
$\{1^n \mid n \text{ is prime}\}$
$\{0^n 1^n 0^n 1^n\}$
{int x; x = 3; | x is an alphabetic string}

Therefore, the claim that Java is context free is a lie. (If x = 3 occurred, x must have been declared somewhere in the past!)

UNIX regex (not regexp): (a*)b\1b\1

The \n construct refers to the nth parenthesized sub-expression.
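The backreference pattern can be tried directly. The sketch below uses Python's `re` module, which also supports `\1`, as a stand-in for the UNIX tools mentioned above. The pattern matches exactly the strings of the form $a^n b a^n b a^n$, the same shape as the language in Q3.

```python
import re

# \1 must repeat exactly the text captured by the first parenthesized group.
PATTERN = re.compile(r"(a*)b\1b\1")


def matches(w):
    """True if the whole string has the form a^n b a^n b a^n."""
    return PATTERN.fullmatch(w) is not None


for w in ["bb", "abaabaa", "aabaabaa", "aabaaba"]:
    print(repr(w), matches(w))
```

Because the three runs of a's must have equal length, such "regexes" describe languages that are not even context free, let alone regular.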
