You are on page 1of 10

DATA STRUCTURE SEMINAR REPORT ON

Submitted by : Raj S J
Class : MCA – II
Roll number : 07
SEARCHING

Let DATA be a collection of data elements in memory and


suppose a specific ITEM of information is given. Searching refers to the
operation of finding the location LOC of ITEM in DATA or printing some
message like ITEM does not appear there.
The search is said to be successful if ITEM does appear in DATA and
unsuccessful otherwise.
There are many different searching algorithms. The algorithm
that one chooses generally depends on the way information in DATA is
arranged. Two well-known and most used algorithms in searching are
1. LINEAR SEARCH
2. BINARY SEARCH.
The complexity of searching algorithms is measured in terms of number f(n)
of comparisons required to find ITEM in data where DATA contains n
elements.

LINEAR SEARCH
Suppose DATA is a linear array with n elements. Given no other
information about DATA, The most intuitive way to search for a given
ITEM in DATA is to compare ITEM with each element of DATA one by
one. That is, first we test whether DATA[1] = ITEM, and then we test
whether DATA[2] = ITEM, and so on. This method, which traverses DATA
sequentially to locate ITEM, is called linear search or sequential search.

Let us see an example below were DATA is an array with five names and
ITEM is the info whose location LOC we are searching in DATA. In this
case array DATA has 5 elements. We are searching for ITEM = “ KAREN “
Here the ITEM “ KAREN “ is checked with first element of DATA i.e. “
MARY “. Since first element of DATA is not equal to ITEM, now 2nd
element of DATA “ JANE” is checked and so on until either the array is
completely traversed or the LOC of ITEM is found. Here on the fifth
comparison the LOC of ITEM is found to be 5 in DATA.

In the following form of algorithm to simplify the matter, we first assign


ITEM to DATA[N+1].The position following the last element of data. Then
the outcome
LOC = N+1
Where LOC denotes the location where ITEM first occurs in data, signifies
search is unsuccessful. The purpose of initial assignment is to avoid
repeatedly testing whether or not we have reached the end of array DATA.
This way, the search must eventually succeed.

LINEAR( DATA, N, ITEM, LOC )


Here DATA is a linear array with n elements, and ITEM is a given item of
information. This algorithm finds the location of ITEM in DATA, or sets
LOC:=0 if search is unsuccessful.
1. [Insert ITEM at the end of DATA] Set DATA[N+1]:=ITEM
2. [Initialize counter.] Set LOC: = 1.
3. [Search for ITEM]
Repeat while DATA[LOC] ≠ ITEM:
Set LOC: =LOC+1
[End of loop]
4. [Successful?] IF LOC=N + 1, the set LOC: = 0
5.Exit
Observe that step 1 guaranty that the loop in step 3 must terminate. Without
step 1 the repeat statement in step 3 must be replaced by the following
statement, which involves two comparisons,

Repeat while LOC ≤ N and DATA[LOC] ≠ ITEM:

On the other hand in order to use step 1, one must guarantee that there is an
unused memory location at the end of the array DATA; otherwise one must
use the linear search algorithm given below

LINEAR(DATA, N, K, ITEM, LOC)


A linear array DATA with N elements and a specific ITEM of information is
given. This algorithm finds the location LOC of ITEM in the array of DATA
or sets LOC=0.

1. [Initialize] Set K: =1 and LOC: =0


2. Repeat Steps 3 and 4 while LOC = 0 and K ≤ N.
3. If ITEM = DATA[K], then set LOC:=K
4. Set K: = K + 1 [Increment counter]
[End of step 2 loop]
5. [Successful?]
If LOC = 0, then:
Write: ITEM is not in the array DATA
Else
Write: LOC is the location of ITEM in DATA
[End of if structure]
6.Exit
Complexity Of linear search algorithm

Worst case:

Clearly the worst case occurs when ITEM is the last element in the array
DATA or is not there at all. In either situation we have

C(n) = n

Accordingly C(n) = n is the worst case complexity of the linear search


algorithm.

Average case:

Here we assume that ITEM does appear in DATA, and that is equally likely
to occur at any position in the array. Accordingly the number of comparisons
can be any of the number 1,2,3……….n and each number occurs with the
probability p = 1/n. Then

C (n) = 1.(1/n) + 2. .(1/n) + 3 .(1/n) ………….+n.(1/n)

= ( 1 + 2 + 3 ……………….+ n) .(1/n)

= n.(n+1)/2.(1/n)

= (n+1)/2
Binary Search

Suppose the array DATA is sorted in increasing numerical order


or alphabetically. In that case there is an efficient
Searching algorithm called Binary search.
The Binary search algorithm applied to our data works as follows. During
each stage of our algorithm, our search for ITEM is reduced to a segment of
elements of DATA.

DATA[BEG],DATA[BEG+1],DATA[BEG+2]….DATA[END]

The variables BEG and END denote respectively the beginning and end
locations of the segment under consideration. The algorithm compares
ITEM with the middle element DATA [MID] of the segment, where MID is
obtained by

MID=INT ((BEG+END)/2).. ….[INT stands for integer value]

If DATA[MID] = ITEM then the search is successful and we set


LOC:=MID. Otherwise a new segment of DATA is obtained as follows

a) If ITEM < DATA[MID] , then ITEM can appear only in the left half
of the segment:

DATA[BEG],DATA[BEG+1]……………..DATA[MID-1]

So we reset END : = MID – 1 and begin searching again.

b) If ITEM > DATA[MID] , then ITEM can appear only in the right half
of the segment:

DATA[MID+1],DATA[MID+2] ……………..DATA[END]

So we reset BEG : = MID + 1 and begin searching again.

Initially we begin with the entire array DATA ie we begin with


BEG = 1 and END = n, or more.
Generally BEG = UB and END = LB
If ITEM is not in DATA then we get

END < BEG

This condition signals that the search is unsuccessful.

Example

Let DATA be the following sorted 13-element array:

DATA: 11 22 30 33 40 44 55 60 66 77 80 88 99

We apply the binary search to DATA for different values of ITEM.

(a) Suppose ITEM = 40. The search for ITEM in the array DATA is
pictured where the values of the DATA[BEG] and DATA[END] in
each stage of the algorithm is indicated by circles and the value of
DATA[MID] by a square.

(1) Initially BEG = 1 and END = 13


Hence MID = INT(1+13)/2 = 7 so DATA[MID] = 55

(2) Since 40 < 55 , END has its value changed END = MID-1=6
HENCE MID = INT((1+6)/2) = 3 so DATA{MID] = 30

Since 40 < 30 BEG has its value changed by


BEG = MID+1 = 4 Hence
MID = INT[(4+6)/2]=5 and so DATA[MID]=40

We found ITEM in location LOC = MID =5


(b) Suppose ITEM = 85.

(1) Again initially BEG=1,END=13 ,MID=7


and DAT [MID] = 55

(2) Since 85>55 BEG has its value changed by BEG = MID + 1= 8.
Hence

MID = INT[(11+13)/2]=10 and so DATA[MID] = 77

(3) Since 85 > 77 , BEG has its value changed by BEG = MID+1 = 11
Hence

MID = INT[(11+13)/2]=12 and so DATA[MID] =88

(4) Since 85 < 88, END has its value changed by END = MID – 1=11.
Hence

MID = INT[(11+11)/2]=11 and so DATA[MID]=80

Observe that here BEG = MID = END = 11

Since 85 >80, BEG has its value changed BEG = MID + 1 = 12 . But
now BEG > END. Hence ITEM does not belong to DATA.
BINARY( DATA, LB, UB, ITEM, LOC)
Here DATA is a sorted array with lower bound Lb and upper bound UB
and ITEM is a given item of information. The variables BEG, END and
MID denote respectively the beginning, end and middle locations of a
segment of elements of DATA. This algorithm finds the location LOC of
ITEM in DATA or sets LOC : = NULL

1. [Initialize segment variable]


Set BEG : = LB , END : = UB and MID = INT((BEG + END)/2)
3. Repeat steps 3 and steps 4 while BEG ≤ END and
DATA[MID] ≠ ITEM
3. If ITEM < DATA[MID] then
Set END : = MID – 1
Else
Set BEG : = MID + 1
[End of If structure ]

4. Set MID:=INT((BEG+ END)/2).


[End of step 2 loop ]

5. If DATA[MID] = ITEM, Then :


Set LOC: = MID
Else
Set LOC: =NULL
[End of If structure]
6.Exit

Complexity of Binary search

In Binary search each comparison reduces the sample size in half. Hence
we require at most f(n) comparisons to locate ITEM where

F(n) = [ log2n ] + 1 ie the running time of worst case is approximately


equal to log2n.Running time for average case is approximately equal to
the running time for worst case.
COMPARISON BETWEEN LINEAR AND BINARY SEARCH

LINEAR SEARCH BINARY SEARCH

Very easy to Implement Comparatively complex to


implement

Time taken to find item Comparatively much faster.


increases when n is large It requires only 20
I.e. algorithm efficiency comparisons with an initial
decreases as number of list of 1000000, elements
elements n becomes large

Complexity is n Complexity is Log 2 n

Can work on any array or list Need the array to be sorted


This is the major
disadvantage as it is costly
to keep array of data sorted
if there is frequent insertion
or deletion

Ideally, the choice between Linear and Binary Search should be taken
based on the number of search operations that are likely to be performed
over the List. As the number increases Binary search becomes more
efficient than linear search.

You might also like