SANDEEP SINGH
(III B.TECH I.T)
Brief Introduction
Basic Notations
Naive Algorithm
Knuth – Morris – Pratt Algorithm
Boyer – Moore Algorithm
if (t[j] == p[k]) {
    j++; k++;
} else {
    // Back up over matched characters (assumes k starts at 1,
    // so k - 1 characters have matched so far).
    int backup = k - 1;
    j = j - backup; k = k - backup;
    // Slide pattern forward, start over.
    j++;
    i = j;
}
// Continue loop.
return match;
Comparisons = 3
a b b a b a b a a
a b a a

Comparisons = 4
a b b a b a b a a
  a b a a

Comparisons = 5
a b b a b a b a a
    a b a a

Comparisons = 9
a b b a b a b a a
      a b a a

Comparisons = 10
a b b a b a b a a
        a b a a

Comparisons = 14
a b b a b a b a a
          a b a a
Found !
Preprocessing Time = 0.
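The walkthrough above can be reproduced with a short sketch. This is a minimal illustration, not the deck's own code: the function name `naive_search` and the returned comparison counter are assumptions added here so the count can be checked against the slides.

```python
def naive_search(text, pattern):
    """Return (index of first match or -1, number of character comparisons)."""
    n, m = len(text), len(pattern)
    comparisons = 0
    for i in range(n - m + 1):        # i: current alignment of pattern in text
        k = 0
        while k < m:
            comparisons += 1
            if text[i + k] != pattern[k]:
                break                 # mismatch: slide pattern one place right
            k += 1
        if k == m:                    # every pattern character matched
            return i, comparisons
    return -1, comparisons


print(naive_search("abbababaa", "abaa"))  # (5, 14)
```

On the slide example the pattern is found at position 5 after 14 comparisons, the same total the frames reach.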
0 1 2 3 4 5 6 7 8
A B C _ A B C D A
A B C D A B D
0 1 2 3 4 5 6
FAIL !
( "_" marks a space in T )
In the fourth step, T[3] is a space
and P[3] = 'D', a mismatch.
(M=4,i=0)
0 1 2 3 4 5 6 7 8
A B C _ A B C D A
        A B C D A
        0 1 2 3 4

 8  9 10 11 12 13 14 15 16
 A  B  _  A  B  C  D  A  B
 A  B  D
 4  5  6
FAIL !
We quickly obtain a nearly complete
match "ABCDAB" when, at P[6] (T[10]),
we again have a discrepancy.
(M=8,i=2)
 6  7  8  9 10 11 12 13 14
 C  D  A  B  _  A  B  C  D
       A  B  C  D  A  B  D
       0  1  2  3  4  5  6
FAIL !
(M=11,i=0)
11 12 13 14 15 16 17 18 19
 A  B  C  D  A  B  C  D  A
 A  B  C  D  A  B  D
 0  1  2  3  4  5  6
FAIL !
(M=15,i=2)
14 15 16 17 18 19 20 21 22
 D  A  B  C  D  A  B  D  E
    A  B  C  D  A  B  D
    0  1  2  3  4  5  6
MATCH !
The pattern must be preprocessed in
order to analyze its structure.
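One common way to do this preprocessing is a failure table that records, for each prefix of the pattern, the length of its longest proper prefix that is also a suffix. The sketch below is an illustration under that assumption. The names `failure_table` and `kmp_search` are hypothetical, and the bookkeeping differs slightly from the (M, i) notation used in the slides.

```python
def failure_table(pattern):
    """fail[q] = length of the longest proper prefix of pattern[:q+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for q in range(1, len(pattern)):
        while k > 0 and pattern[q] != pattern[k]:
            k = fail[k - 1]           # fall back to a shorter border
        if pattern[q] == pattern[k]:
            k += 1
        fail[q] = k
    return fail


def kmp_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = 0                             # number of pattern characters matched
    for j, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]           # reuse what is known; never re-read text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return j - k + 1
    return -1


print(failure_table("ABCDABD"))                          # [0, 0, 0, 0, 1, 2, 0]
print(kmp_search("ABC ABCDAB ABCDABCDABDE", "ABCDABD"))  # 15
```

The text index `j` only moves forward, which is why the search phase is linear in the length of the text.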
if (t[j] == p[k]) {
    j--; k--;
} else {
    // Slide P forward.
}
0 1 2 3 4 5 6 7 8
i f _ y o u _ w i
m u s t
0 1 2 3
FAIL ! ( T[3] = 'y' vs P[3] = 't' )

0 1 2 3 4 5 6 7 8
i f _ y o u _ w i
        m u s t
        0 1 2 3
FAIL ! ( T[7] = 'w' vs P[3] = 't' )
 7  8  9 10 11 12 13 14 15
 w  i  s  h  _  t  o  _  u
    m  u  s  t
    0  1  2  3
FAIL ! ( T[11] is a space, P[3] = 't' )

 7  8  9 10 11 12 13 14 15
 w  i  s  h  _  t  o  _  u
                m  u  s  t
                0  1  2  3
FAIL ! ( T[15] = 'u' vs P[3] = 't' )
( Line up u’s )
14 15 16 17 18 19 20 21 22
 _  u  n  d  e  r  s  t  a
 m  u  s  t
 0  1  2  3
FAIL ! ( T[17] = 'd' vs P[3] = 't' )

14 15 16 17 18 19 20 21 22
 _  u  n  d  e  r  s  t  a
             m  u  s  t
             0  1  2  3
FAIL ! ( 't' and 's' match; T[19] = 'r' vs P[1] = 'u' )
19 20 21 22 23 24 25 26 27
 r  s  t  a  n  d  _  o  t
    m  u  s  t
    0  1  2  3
FAIL ! ( T[23] = 'n' vs P[3] = 't' )

19 20 21 22 23 24 25 26 27
 r  s  t  a  n  d  _  o  t
                m  u  s  t
                0  1  2  3
FAIL ! ( 't' matches; T[26] = 'o' vs P[2] = 's' )
26 27 28 29 30 31 32 33 34
 o  t  h  e  r  s  _  y  o
    m  u  s  t
    0  1  2  3
FAIL ! ( T[30] = 'r' vs P[3] = 't' )

26 27 28 29 30 31 32 33 34
 o  t  h  e  r  s  _  y  o
                m  u  s  t
                0  1  2  3
FAIL ! ( T[34] = 'o' vs P[3] = 't' )
34 35 36 37 38 39 40 41 42
 o  u  _  m  u  s  t
    m  u  s  t
    0  1  2  3
FAIL ! ( T[38] = 'u' vs P[3] = 't' )

( Line up u’s )
34 35 36 37 38 39 40 41 42
 o  u  _  m  u  s  t
          m  u  s  t
          0  1  2  3
MATCH !
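The slides above are consistent with a right-to-left scan that slides the pattern using only the bad-character rule (for example, "line up u's"). The following is a minimal sketch of that rule alone, not a full Boyer-Moore implementation. The function name `bm_bad_character` and the returned list of alignments are assumptions for illustration, and the full text string is reconstructed from the slide fragments.

```python
def bm_bad_character(text, pattern):
    """Return (match index or -1, list of alignments tried),
    shifting with the bad-character rule only."""
    n, m = len(text), len(pattern)
    last = {ch: idx for idx, ch in enumerate(pattern)}  # rightmost index of each char
    tried = []
    i = 0
    while i <= n - m:
        tried.append(i)
        k = m - 1
        while k >= 0 and text[i + k] == pattern[k]:
            k -= 1                     # scan the pattern right to left
        if k < 0:
            return i, tried            # whole pattern matched
        # Line up the mismatched text character with its rightmost
        # occurrence in the pattern; always move at least one place.
        i += max(1, k - last.get(text[i + k], -1))
    return -1, tried


idx, tried = bm_bad_character("if you wish to understand others you must", "must")
print(idx)    # 37
print(tried)  # [0, 4, 8, 12, 14, 18, 20, 24, 27, 31, 35, 37]
```

Each entry in `tried` corresponds to one FAIL or MATCH alignment in the walkthrough: eleven failed alignments, then the match at position 37.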
. . . . . d a t s
(T) FAIL !
t s a n d c a t s
(P)
The letters in T to the right of the current
position are ‘ats’, the same letters that
form the suffix of P that was just
scanned.
If we know that P does not have another
instance of ‘ats’ (and no prefix of P matches
the end of ‘ats’), then we can slide P all
the way past the ‘ats’ in T.
d a t s . . . . .
(T) Fail !
( Previous Occurrence of ‘ats’ in P )
b a t s a n d c a
(P)
Shift one place to the right of ‘d’, as usual
d a t s . . . . .
(T)
b a t s a n d c a t s
(P)
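The suffix-based shift illustrated above can be computed by brute force. This is a sketch for illustration only: the function name `good_suffix_shift` is hypothetical, it implements the simple ("weak") form of the rule, and real implementations precompute a table in linear time instead of rechecking every shift. The pattern "dogsandcats" is an invented example of a pattern with no earlier occurrence of the matched suffix.

```python
def good_suffix_shift(pattern, k):
    """Smallest shift s >= 1 such that the pattern, moved s places right,
    still agrees with the already-matched suffix pattern[k+1:].
    k is the index in the pattern where the mismatch occurred."""
    m = len(pattern)
    for s in range(1, m + 1):
        # After shifting by s, pattern[j - s] sits under the text character
        # that matched pattern[j]; positions with j - s < 0 fall off the
        # left end of the pattern and impose no constraint.
        if all(j - s < 0 or pattern[j - s] == pattern[j]
               for j in range(k + 1, m)):
            return s
    return m


# 'ats' matched, mismatch at index 7 ('c'): earlier 'ats' at index 1.
print(good_suffix_shift("batsandcats", 7))  # 7
# No earlier 'ats' and no matching prefix: slide all the way past.
print(good_suffix_shift("dogsandcats", 7))  # 11
```

In a full Boyer-Moore search, the pattern is typically moved by the larger of this shift and the bad-character shift.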
The Boyer-Moore searching algorithm
performs O(n) comparisons in the worst
case when P does not occur in T; when it
does occur, the basic algorithm can need O(nm).
Use BM for:
Strings of average length (m ≥ 5).
For binary strings, BM does not do quite as
well.