
I was reading about Access and how it optimizes for search.

It uses pretty much the same technique that I had in mind initially. Here it is:

Think of a reference book (Morrison and Boyd). Every such book has an index at the end which
contains keywords and associated topics.

For example, the keyword 'haloalkane' appears in many places across the chapters. So in the index there is the
following entry:

haloalkane : Haloalkane Pg 200, Reaction Mechanism SN2 Pg 250, ... etc.

*The keywords in the index are sorted alphabetically, so it's easier to find a keyword and look up the possible
topics in which it's mentioned.*

Now in your database, when we search for a query string, we first divide it into different word
combinations. Let all these word combinations (let's call them keywords-to-search) be contained in an
array k[].
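
Here is a minimal sketch of how k[] might be built. It assumes "word combinations" means single words plus contiguous two-word phrases; the function name and that choice are illustrative, not how the real tokenizer works.

def query_to_keywords(query):
    # split the query string into lowercase words
    words = query.lower().split()
    keywords = list(words)
    # also treat contiguous two-word phrases as candidate keywords
    for i in range(len(words) - 1):
        keywords.append(words[i] + " " + words[i + 1])
    return keywords

k = query_to_keywords("haloalkane reaction mechanism")
# k == ['haloalkane', 'reaction', 'mechanism',
#       'haloalkane reaction', 'reaction mechanism']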

Now in the database, we have all the different object types listed, and corresponding to each we have all
its matched keywords (let's call them keywords-matched).
So for every element in k[], we go through each object type and its corresponding keywords-matched and
check whether k[i] == keywords-matched; a rough sketch of this scan is given after the structure below.

Current master DB structure:

ObjectType1 : Keyword-Matched11, Keyword-Matched12, Keyword-Matched13
ObjectType2 : Keyword-Matched21, Keyword-Matched22, Keyword-Matched23
ObjectType3 : Keyword-Matched31, Keyword-Matched32, Keyword-Matched33
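
A rough sketch of that scan, assuming the master DB can be read as a map from each object type to its list of keywords-matched. The object types, keywords, and names here are made up purely for illustration.

# made-up stand-in for the current master DB structure
master_db = {
    "ObjectType1": ["apple", "banana", "cherry"],
    "ObjectType2": ["banana", "date"],
    "ObjectType3": ["cherry", "elderberry"],
}

def search_current(k, db):
    results = set()
    for keyword in k:                               # every element of k[]
        for object_type, matched in db.items():     # every object type
            if keyword in matched:                  # the k[i] == keywords-matched check
                results.add(object_type)
    return results

print(search_current(["banana", "cherry"], master_db))
# {'ObjectType1', 'ObjectType2', 'ObjectType3'}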


If the process I have described above holds true (because I was not really sure that this is how things work
currently), I propose the following change:

We create another database with the following structure:

Keyword-Matched-1 : ObjectType11, ObjectType12, ObjectType13
Keyword-Matched-2 : ObjectType21, ObjectType22, ObjectType23
Keyword-Matched-3 : ObjectType31, ObjectType32, ObjectType33

where the Keyword-Matched-i are sorted alphabetically.

So now when you search for a string and obtain k[], for every element in k[] you search this new database,
as in the sketch below.
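
A sketch of the proposed keyword-first database, built from the same made-up master_db as in the earlier sketch: keywords kept in alphabetical order and located with a binary search. All names are again illustrative.

import bisect

master_db = {
    "ObjectType1": ["apple", "banana", "cherry"],
    "ObjectType2": ["banana", "date"],
    "ObjectType3": ["cherry", "elderberry"],
}

def build_inverted(db):
    # creation step: done once per database update
    inverted = {}
    for object_type, matched in db.items():
        for keyword in matched:
            inverted.setdefault(keyword, []).append(object_type)
    return sorted(inverted), inverted        # keywords in alphabetical order

def search_proposed(k, sorted_keywords, inverted):
    results = set()
    for keyword in k:
        i = bisect.bisect_left(sorted_keywords, keyword)   # log2 binary search
        if i < len(sorted_keywords) and sorted_keywords[i] == keyword:
            results.update(inverted[keyword])
    return results

sorted_keywords, inverted = build_inverted(master_db)
print(search_proposed(["banana", "cherry"], sorted_keywords, inverted))
# {'ObjectType1', 'ObjectType2', 'ObjectType3'}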

Tradeoffs:
In the current structure, database creation takes very few resources, but searching takes a lot of
resources.
Technically, the worst-case time for a result from a search query is: Number_of_object_types *
Average(Keywords-Matched per object type) comparisons for each element of k[].

In the proposed database, creation is resource-intensive, but searching is very fast.
The worst-case time for a search query would be about log(No. of Keywords-Matched) comparisons to locate
each element of k[], plus the time to read off the object types listed against that keyword, where log is base 2.
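
For illustration, with made-up numbers: say there are 1,000 object types with 50 keywords-matched each, i.e.
roughly 50,000 keyword entries in total. The current structure can need up to 1,000 * 50 = 50,000 comparisons
to resolve one element of k[], while the proposed structure needs about log2(50,000) ≈ 16 comparisons to
locate the keyword, plus reading off the few object types stored against it.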

But, as is the case for any database, creation and updates are far less frequent than searches.

Think about it...
