
Prioritizing Web Links Based on Web Usage and Content Data

Kamika Chaudhary
Department of Computer Science & Engineering
Krishna Institute of Engineering & Technology
Ghaziabad-201206, India
kamika.agrohi@gmail.com

Santosh Kumar Gupta
Department of Computer Science & Engineering
Krishna Institute of Engineering & Technology
Ghaziabad-201206, India
santoshg25@gmail.com

Abstract- The Web has grown enormously and is still growing
rapidly day by day. With this huge amount of information on the
web, it has become difficult for search engines to retrieve the
required and relevant information efficiently. Web mining
techniques, using different approaches, have contributed a lot to
providing relevant information in response to user queries. This
paper introduces a new method for prioritizing web pages based on
web usage and web content data. The proposed method uses a
genetic algorithm to provide good-quality web pages as the result
of a user query. Prioritization of web pages falls in the category
of NP-complete problems, and a genetic algorithm is used to deal
with this. The method includes parameters from both web usage
mining and web content mining. Experimental results show that the
proposed approach performed better than the existing approach.
Keywords- Genetic algorithm; web usage mining; web content mining;
common entry and exit points

I. INTRODUCTION

The World Wide Web has brought revolutionary changes in the
popularity of the Internet. It has grown into a huge, global
information space. The information on the web is distributed in
nature and is growing at an exponential rate. Getting the desired
information without wandering through the pages of a website has
become an irksome job. Different types of methods are required to
organize and manage the information so that it can be used
efficiently for business purposes. Web mining techniques are
needed to explore such a gigantic information base. Web mining is
the process of uncovering user-desired information from web
documents by applying data mining techniques. It aims to develop
new methods for effective retrieval of potentially useful
information. A large amount of the information on the web is
redundant, resulting in multiple pages carrying similar content.
The data present on websites is also heterogeneous.
Based on the type of data present in web documents, web mining is
divided into three classes: web content mining, web structure
mining and web usage mining. Web content mining searches for
information in the structured, semi-structured or unstructured
content of the web. Web pages contain a number of links that
connect and organize the information; web structure mining
utilizes these hyperlink structures for information retrieval.
Web usage mining discovers the usage patterns of visitors by
mining log files. It works by preprocessing the initial log data
to remove redundancy, then detecting patterns, and finally
analyzing these patterns in order to find out user behavior.
Several optimization techniques have been used to find the most
useful pages of a web site using web usage and web content mining.
The proposed approach uses a natural optimization technique called
the genetic algorithm to explore the search space using both
content and usage mining. The inspiration behind the genetic
algorithm is the process of natural selection and genetic dynamics
[5]. The genetic algorithm has its roots in Darwin's theory of
survival of the fittest; it is a search algorithm based upon the
process of natural selection and population genetics [19]. The
proposed approach applies a genetic algorithm to the data
collected by integrating web usage mining and web content mining
in order to find the pages of a web site that are of utmost
importance to the user. Our approach is compared with the approach
in [20], hereafter referred to as EA, and the results are found to
be better.

978-1-4799-2900-9/14/$31.00 ©2014 IEEE

Section II presents the literature review. Section III introduces
the genetic algorithm and the proposed approach. Section IV
presents the proposed algorithm. Implementation details of the
proposed approach are given with an example in Section V. Sections
VI and VII describe the experimentation and the conclusion,
respectively.

II. RELATED WORK

Web usage mining is the most crucial field of web mining. A lot of
research has been done in this area, which shows the importance of
web usage mining to search engines. Speed and precision are the
most desirable characteristics of search engines. Evolutionary
algorithms, and more specifically genetic algorithms, play a vital
role in achieving these characteristics. These algorithms also
play an important role in the mining of web usage data. In [1] the
authors discuss the use of a genetic algorithm for mining
information from the web. They found that the query results
provided by search engines suffered from poor information and
irrelevant pages. They provide a genetic strategy for search
engines and treat web search as a standard optimization problem.
The efficiency of a search engine can be improved through web
usage mining by using the MASEL (matrix analysis on search engine
log) algorithm proposed in [2]. The relationship among user, query
and resource is the central idea of this algorithm. MASEL
considers a resource to be good if it is accessed by many good
users. Improving search engine retrieval performance is addressed
in [3]. The authors propose a genetic programming based framework
for discovering ranking functions, which improves retrieval
performance by prioritizing web pages in decreasing order of
relevance. The results are compared with, and found to be better
than, other existing ranking functions for information retrieval.
In [4] grammar-based genetic programming is used as a data mining
optimization technique in an e-learning system. A group of useful
education prediction (EP) rules is developed and provided to
courseware authors to improve adaptive systems for web-based
education (ASWE).
The genetic algorithm is an algorithm based on the theory of
natural selection and is used for solving optimization problems.
It is an adaptive heuristic search algorithm based on the concept
of survival of the fittest. Selection, crossover, mutation and
acceptance are the main steps used for finding the solution to a
problem. A fitness function is used for measuring the goodness of
any solution, and mutation helps the population escape from local
optima [5]. A probabilistic web user model based on a genetic
algorithm for improving web site structure is proposed in [6].
Adjacency matrices are used for representing the genetic
population, and ranking acts as a parameter for fitness scaling. A
random binary vector is created by using scattered crossover. The
results show an improvement over another method.
Web usage mining works on the data collected from client-server
interaction. It utilizes secondary data present in web server
logs, browser logs, proxy server logs, registration data, user
profiles, cookies or any other source for mining interesting
patterns. It mainly consists of three phases: data preprocessing,
pattern discovery and pattern analysis [7]. Pattern discovery is
performed in order to draw useful patterns from the preprocessed
data [8]. A system called Web Sift has been designed to perform
usage mining. It utilizes data from the web server log to perform
the mining task. This data suffers from real-world challenges; a
framework dealing with these challenges is discussed in [9]. A
number of soft computing techniques have been used for retrieving
information, for example in the field of web mining [10]. A soft
computing technique called the self organizing map (SOM) is
applied to preprocessed data in web usage mining in order to find
visitors' navigation behavior [11]. This behavior is used for
discovering useful knowledge from secondary data [12, 21]. An
optimization technique called the ant colony clustering algorithm
(ACLUSTER) has been proposed for detecting useful trends, with
linear genetic programming used for analyzing user trends. The
ACLUSTER algorithm is applied to the preprocessed and cleaned
data; the number of objects in an area and their similarity serve
as independent thresholds for forming clusters of web usage
patterns. It is important to improve the structure of a web site
from time to time, as sites outgrow their design through the
accumulation of links and pages. In [13] websites are reorganized
using a 0-1 programming approach. This method is based on the
co-occurrence frequencies between web pages, which are obtained
from user access patterns. In order to reduce search depth and
information overload for users, two constraints are used: the
number of outward links from each page and the length of the
shortest path from the home page to each page.
Web personalization is a way of providing a service that helps a
web visitor retrieve the information of his/her interest. This is
achieved by predicting the next page the user will access. An
accurate recommendation system for predicting the next page access
using web usage mining is discussed in [14]. Pairwise nearest
neighbor clustering is used for identifying similar access
patterns. The method provides good prediction accuracy and
minimizes state space complexity. A two-step strategy to improve
retrieval effectiveness for personalizing the web is presented in
[15]. In the first step, the user's queries are categorized
automatically by the system based on his/her search history; these
categories are then used for performing the web search.
An intelligent miner (i-miner) framework has been used for
analyzing user trends [16]. A hybrid evolutionary approach called
FCM is used for forming clusters that separate users with similar
interests, and the Takagi-Sugeno fuzzy inference system is used
for analyzing the trends. Another approach for exploring
navigational patterns is to discover the relationships existing
among users and web objects. A system based on probabilistic
latent semantic analysis (PLSA) [17] has been developed for
automatically characterizing user preferences and interests.
Probabilistic inference is used for performing the analysis tasks.
The authors in [18] have proposed a workflow that shows how usage
data can be extracted and processed for a real-world tourism web
site.
III. THE PROPOSED APPROACH

The Genetic Algorithm (GA) is a natural optimization and adaptive
heuristic search technique whose basic idea depends upon the
process of natural evolution. The mechanism of evolution is
parallel in nature and has been used for solving several
computational problems [19]. GA is used for solving
general-purpose optimization problems [5].
In a computational problem, the genetic algorithm begins by
selecting an initial population in the form of chromosomes and
then applying a fitness function, which minimizes the cost, to the
selected chromosomes. Two parent chromosomes with greater fitness
are then selected, and crossover and mutation are performed on
them. Crossover between the two parent strings produces the
offspring strings. Mutation is another operator, applied after
crossover, which changes the genetic material and forms the
offspring. The process is repeated until the best solution among
the current population is retrieved. On the basis of Darwinism,
the offspring that survives the most is chosen to be the fittest
[20].
In the web usage mining problem, a chromosome is represented by a
collection of web pages. GA is used in this approach to find the
web pages that are of utmost importance to the user. A unique
number is assigned to each web page.

International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT)


These pages are indexed by the IDs assigned to them, so the
chromosomal representation looks as shown in Fig. 1.
Chromosome1 = {set of web links} = {P1, P2, P3, P4, P5, P6, P7, P8, P9, P10}
where P1, P2, ... represent web links.

43 | 48 | 10 | 13 | 37 | 38 | 44 | 36 | 14 | 49

Fig. 1. Representation of web links in chromosomal form

A. Chromosome Representation

The chromosomes are used for representing the initial population.
Each chromosome represents a candidate solution. To represent a
web page, a unique numeric ID is assigned to each unique URL taken
from the web server log. For further processing, this unique ID is
used instead of the URL of the pages visited by the user.

TABLE I. UNIQUE ID ASSIGNED TO WEB PAGE URLS

Unique Id | URL
          | 192.168.30.95:51854
          | 192.168.30.15:45682
          | 192.168.1.5:32773
          | 192.168.1.5:32773
          | 192.168.30.128:60339

B. Fitness Function

The fitness function is an objective function used for selecting
the best individual among all individuals. It is used for
quantifying the optimality of a solution. It measures the goodness
of a solution by assigning ranks to solutions [21]. Various
parameters are required for calculating the fitness of a solution,
as presented below.

1) Access frequency: Access frequency measures the number of times
a particular page is visited, irrespective of user ID. In web
usage mining, the usefulness of a particular page can be measured
by calculating its access frequency: the higher the access
frequency, the more useful the page is likely to be. Table II
shows the URLs and their related access frequency.

TABLE II. URLS AND THEIR RELATED ACCESS COUNT

URL            | Access Count
192.168.30.95  | 33
192.168.30.15  | 12
192.168.1.5    |
192.168.1.5    | 24
192.168.30.127 | 19

2) Number of unique visitors: This factor shows the importance of
a web page on the basis of the unique visitors who visited the
page. A URL is more popular among users if it is visited by a
larger number of distinct visitors. Table III shows the unique
visitors.

TABLE III. UNIQUE VISITORS AND THEIR CORRESPONDING USER ID

Unique Id | URL            | Number of Unique Users
          | 192.168.30.95  | 23
          | 192.168.30.15  | 15
          | 192.168.1.5    | 35
          | 192.168.1.5    | 24
          | 192.168.30.127 | 19

3) Time duration: The amount of time spent on a page shows the
relevance of the page for the user. If a user spends more time on
a particular page, that page is considered to be useful for the
user. Table IV shows the duration for particular URLs.

TABLE IV. AMOUNT OF TIME USER STAYED ON THE PAGE WITH RESPECT TO URL

Unique Id | URL            | Duration (seconds)
          | 192.168.30.95  | 45
          | 192.168.30.15  | 217
          | 192.168.1.5    | 84
          | 192.168.1.5    | 24
          | 192.168.30.127 |

4) Number of bytes received: The quantity of data downloaded by
the user from a web page shows that the page has content that is
relevant for the user. The number of bytes received by the user is
recorded in the web server log entry. From this entry we can
deduce whether a page is important or not.

TABLE V. NUMBER OF BYTES RECEIVED BY USER

Unique Id | Amount of bytes received
          | 270
          | 2254
          | 1059
          | 124
          | 1609

5) Common entry and exit points: A visitor begins his search by
clicking on a link that forwards him to a page of the website.
This page is considered the entry point of the user. The exit
point signifies the destination of the visitor. It tells what
visitors are looking for in the website.

6) Number of advertisements: The importance of a web page can also
be recognized by analyzing the number of advertisements present on
the page. If a page contains a larger number of advertisements,
that page is thought to be visited by a larger number of visitors.
Advertisements are placed on the pages that have a higher
frequency of visits by users, so they signify the importance of
the page.
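The usage-derived parameters above (unique IDs, access frequency, unique visitors, time duration and bytes received) could be aggregated from log records roughly as in the following minimal Python sketch. The `LogEntry` layout, the function names and the sample records are hypothetical and chosen only for illustration; they are not the paper's log format or implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One simplified web-log record (hypothetical layout for illustration)."""
    url: str
    user: str
    seconds: int     # time the user stayed on the page
    bytes_sent: int  # bytes received by the user


def assign_ids(entries):
    """Map each distinct URL to an integer ID in order of first appearance."""
    ids = {}
    for entry in entries:
        ids.setdefault(entry.url, len(ids) + 1)
    return ids


def usage_parameters(entries):
    """Aggregate access count, unique users, duration and bytes per URL."""
    stats = defaultdict(lambda: {"access": 0, "users": set(), "seconds": 0, "bytes": 0})
    for entry in entries:
        s = stats[entry.url]
        s["access"] += 1
        s["users"].add(entry.user)
        s["seconds"] += entry.seconds
        s["bytes"] += entry.bytes_sent
    return {
        url: {"access": s["access"], "unique_users": len(s["users"]),
              "seconds": s["seconds"], "bytes": s["bytes"]}
        for url, s in stats.items()
    }


# Made-up sample records, not data from the paper.
log = [
    LogEntry("/home", "u1", 30, 500),
    LogEntry("/home", "u2", 15, 500),
    LogEntry("/news", "u1", 60, 1200),
    LogEntry("/home", "u1", 5, 500),
]
ids = assign_ids(log)
params = usage_parameters(log)
print(ids)
print(params["/home"])
```

Keeping the set of users per URL makes the distinction between access frequency (every visit counts) and unique visitors (each user counts once) explicit.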

1. Access frequency of each page
2. Number of unique users
3. The amount of time the user stayed on the page
4. Number of bytes received
5. Common entry and exit points
6. Number of advertisements

$\mathrm{Cost}_{\text{Access frequency}}(AF) = \sum_{i=1}^{n} AF_i$
where n = number of entries in the web log and AF is the number of
times a page is accessed by visitors.

$\mathrm{Cost}_{\text{Unique user}}(UNQ) = \sum_{i=1}^{n} UNQ_i$
where n = number of entries in the web log and UNQ is the number
of different users who visited a URL.

$\mathrm{Cost}_{\text{Duration}}(DUR) = \sum_{i=1}^{n} DUR_i$
where n = number of entries in the web log and DUR is the amount
of time the user stayed on a web page.

$\mathrm{Cost}_{\text{Bytes received}}(BR) = \sum_{i=1}^{n} BR_i$
where n = number of entries in the web log and BR is the amount of
data the user fetched from a web page.

$\mathrm{Cost}_{\text{Common entry exit point}}(EP) = \sum_{i=1}^{n} EP_i$
where n = number of entries and EP shows the pages at the
beginning and end of a user access session.

$\mathrm{Cost}_{\text{Number of advertisements}}(AD) = \sum_{i=1}^{n} AD_i$
where n = number of entries and AD signifies the number of
advertisements present on a web page.

Cost Function
$C(x) = C_1 \sum_{i=1}^{n} AF_i + C_2 \sum_{i=1}^{n} UNQ_i + C_3 \sum_{i=1}^{n} DUR_i + C_4 \sum_{i=1}^{n} BR_i + C_5 \sum_{i=1}^{n} EP_i + C_6 \sum_{i=1}^{n} AD_i$
where C1, C2, C3, C4, C5 and C6 are constants used for adjusting
the values of the different parameters.

Fig. 2. Cost function and its parameters

An example of calculating the cost from the various parameters is
shown below:
F(x) = C1.CostAccessFrequency + C2.CostDuration +
C3.CostUniqueUser + C4.CostBytesReceived +
C5.CostCommonPoints + C6.CostAdvertisements
C1, C2, C3, C4, C5 and C6 are constants whose function is to
normalize the values of the parameters. In this example C2 weights
the duration term and C3 weights the unique-user term.
C1 = 2.4, C2 = 0.05, C3 = 0.6, C4 = 0.003, C5 = 0.6, C6 = 1.5
In this example the values are taken from the tables above:
CostAccessFrequency = 33 (Table II)
CostDuration = 217 (Table IV)
CostUniqueUser = 35 (Table III)
CostBytesReceived = 2254 (Table V)
CostCommonPoints = 20
CostAdvertisements = 30

C(x) = 2.4*33 + 0.05*217 + 0.6*35 + 0.003*2254 + 0.6*20 + 1.5*30
     = 79.2 + 10.85 + 21 + 6.762 + 12 + 45 = 174.812

C. Selection

Selection is the process of choosing the fitter chromosomes from
the population. The main objective of selection is to give
importance to good solutions and to ignore bad ones. In our
approach we use binary tournament selection, which picks two
individuals at random from the population.
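The weighted cost of the worked example and the binary tournament selection described in this section can be sketched in Python as follows. The dictionary keys, function names and the `higher_is_better` flag are assumptions made for illustration. The text does not state unambiguously whether a higher or a lower cost marks the fitter chromosome, so the direction is left as a parameter.

```python
import random

# Weights C1..C6 as paired with the parameters in the worked example.
WEIGHTS = {
    "access_frequency": 2.4,
    "duration": 0.05,
    "unique_users": 0.6,
    "bytes_received": 0.003,
    "common_points": 0.6,
    "advertisements": 1.5,
}


def cost(params, weights=WEIGHTS):
    """Weighted sum of the six parameters."""
    return sum(weights[name] * params[name] for name in weights)


def binary_tournament(population, costs, rng, higher_is_better=True):
    """Pick two distinct individuals at random and return the better one.

    higher_is_better is an assumption of this sketch, not taken from the paper.
    """
    i, j = rng.sample(range(len(population)), 2)
    if higher_is_better:
        winner = i if costs[i] >= costs[j] else j
    else:
        winner = i if costs[i] <= costs[j] else j
    return population[winner]


# Parameter values from the worked example.
example = {
    "access_frequency": 33,
    "duration": 217,
    "unique_users": 35,
    "bytes_received": 2254,
    "common_points": 20,
    "advertisements": 30,
}
print(round(cost(example), 3))  # 174.812
```

Passing the random generator in explicitly (for example `random.Random(0)`) keeps a run reproducible, which helps when comparing runs at different crossover rates.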

D. Crossover

Crossover is the method that exchanges the genetic material of
both parents to produce new offspring. The main function of
crossover is to recombine two strings to get a new, better string.
Various types of crossover exist; among them, cyclic crossover is
used in the proposed work.
[Figure: Parent 1 and Parent 2 chromosomes of page IDs and the
resulting Offspring 1 and Offspring 2 after cyclic crossover]

Fig. 3. Process of crossover
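The cyclic crossover named above can be illustrated with the standard cycle crossover (CX) operator for permutations. The sketch below is a textbook formulation in Python, not the authors' code, and it assumes both parents contain exactly the same distinct page IDs. The parent values in the usage lines are illustrative, not those of Fig. 3.

```python
def cycle_crossover(parent1, parent2):
    """Cycle crossover (CX) for two permutations of the same items.

    Positions are split into cycles; cycles are copied alternately from
    parent1 and parent2, so every gene keeps a position it held in a parent.
    """
    size = len(parent1)
    if len(set(parent1)) != size or sorted(parent1) != sorted(parent2):
        raise ValueError("parents must be permutations of the same distinct IDs")
    position_in_p1 = {gene: idx for idx, gene in enumerate(parent1)}
    child1, child2 = [None] * size, [None] * size
    visited = [False] * size
    take_from_first = True
    for start in range(size):
        if visited[start]:
            continue
        idx = start
        while not visited[idx]:  # walk one full cycle
            visited[idx] = True
            if take_from_first:
                child1[idx], child2[idx] = parent1[idx], parent2[idx]
            else:
                child1[idx], child2[idx] = parent2[idx], parent1[idx]
            idx = position_in_p1[parent2[idx]]
        take_from_first = not take_from_first  # alternate source per cycle
    return child1, child2


p1 = [1, 2, 3, 4, 5, 6, 7, 8]
p2 = [8, 5, 2, 1, 3, 6, 4, 7]
print(cycle_crossover(p1, p2))
# ([1, 5, 2, 4, 3, 6, 7, 8], [8, 2, 3, 1, 5, 6, 4, 7])
```

Because each offspring is again a permutation of the same IDs, no page is duplicated or lost, which is why this operator suits chromosomes made of unique page IDs.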

E. Mutation

Mutation is the third operator of GA. It maintains diversity in
the population by altering some bits present in the chromosome. It
randomly distributes genetic information and reduces the chance
that the algorithm gets trapped in local optima [20]. There are
many types of mutation operator: flip bit, boundary, uniform,
non-uniform and Gaussian. Mutation exploits the search space more
thoroughly and results in a better solution.
[Figure: a chromosome of page IDs before and after flip bit
mutation]

Fig. 4. Process of flip bit mutation
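The text names flip bit mutation but does not spell out how it acts on a chromosome of page IDs. As a stand-in, the Python sketch below uses swap mutation, a common choice for ID-list chromosomes because it keeps every ID valid and unique. The function name, the rate handling and the sample chromosome are assumptions for illustration only.

```python
import random


def swap_mutation(chromosome, rate, rng):
    """With probability `rate`, exchange two randomly chosen positions.

    Returns a new list; the input chromosome is left unchanged.
    """
    mutant = list(chromosome)
    if len(mutant) >= 2 and rng.random() < rate:
        i, j = rng.sample(range(len(mutant)), 2)
        mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant


rng = random.Random(42)
original = [43, 48, 10, 13, 37]  # illustrative page IDs
print(swap_mutation(original, rate=1.0, rng=rng))
```

A low mutation rate keeps most offspring intact while still introducing the occasional reordering that helps the search leave a local optimum.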


IV. PROPOSED ALGORITHM

The proposed GA based algorithm (PGA) applies a fitness function
to a randomly selected initial population to produce a set of web
links that are of higher priority (TopLink-P) than the other
existing links. The fitness function includes a number of
parameters from both the content and the usage pattern of web
links. PGA starts by randomly selecting an initial population and
then applies the crossover and mutation operators to the
population for several generations until the population converges
and the result is produced. The whole process is represented in
the form of steps in Fig. 5.

V. AN EXAMPLE

An example depicting the procedure of the proposed GA based
algorithm (PGA) is shown in Fig. 6. PGA is implemented in the Java
programming language, and the program is run for 50 generations
with an initial population of 10 chromosomes. Each chromosome
includes a set of 5 pages; the genetic operators are applied and
the cost of each chromosome is calculated. The program runs until
the last generation, by which point the cost has converged. In our
experiment the population converges at cost 509. Fig. 6 shows the
chromosomes with their fitness cost at generations 1, 2, 3 and 50
with a crossover rate of 75%.
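A run with the settings described above (10 chromosomes, 50 generations, crossover rate 75%, 5 pages per result) might be sketched in Python as follows. Several details are assumptions of this sketch rather than the paper's implementation: each chromosome is an ordering of all candidate page IDs and only its first 5 pages are scored, the per-page costs in `PAGE_COST` are made up, the mutation rate is not given in the text, swap mutation stands in for the mutation operator, and a higher cost is treated as better.

```python
import random

# Settings from the text; MUTATION_RATE and PAGE_COST are made-up assumptions.
POP_SIZE, GENERATIONS, CROSSOVER_RATE, MUTATION_RATE, TOP_K = 10, 50, 0.75, 0.1, 5
PAGE_COST = {10: 40.0, 11: 25.0, 12: 60.0, 13: 15.0, 14: 80.0,
             15: 120.0, 16: 110.0, 17: 100.0, 18: 30.0, 19: 55.0}


def chromosome_cost(chromosome):
    """Cost of a chromosome: sum of the costs of its first TOP_K pages."""
    return sum(PAGE_COST[page] for page in chromosome[:TOP_K])


def tournament(population, rng):
    """Binary tournament: the higher-cost of two random individuals wins."""
    a, b = rng.sample(population, 2)
    return a if chromosome_cost(a) >= chromosome_cost(b) else b


def cycle_crossover(p1, p2):
    """Cycle crossover for two orderings of the same page IDs."""
    pos = {page: i for i, page in enumerate(p1)}
    c1, c2 = [None] * len(p1), [None] * len(p1)
    seen, from_first = [False] * len(p1), True
    for start in range(len(p1)):
        if seen[start]:
            continue
        idx = start
        while not seen[idx]:
            seen[idx] = True
            c1[idx], c2[idx] = (p1[idx], p2[idx]) if from_first else (p2[idx], p1[idx])
            idx = pos[p2[idx]]
        from_first = not from_first
    return c1, c2


def swap_mutation(chromosome, rng):
    """Swap two random positions with probability MUTATION_RATE."""
    mutant = list(chromosome)
    if rng.random() < MUTATION_RATE:
        i, j = rng.sample(range(len(mutant)), 2)
        mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant


def run_ga(seed=0):
    """Run the loop and return the best TOP_K pages seen and their cost."""
    rng = random.Random(seed)
    pages = list(PAGE_COST)
    population = [rng.sample(pages, len(pages)) for _ in range(POP_SIZE)]
    best = max(population, key=chromosome_cost)
    for _ in range(GENERATIONS):
        offspring = []
        while len(offspring) < POP_SIZE:
            a, b = tournament(population, rng), tournament(population, rng)
            if rng.random() < CROSSOVER_RATE:
                a, b = cycle_crossover(a, b)
            offspring += [swap_mutation(a, rng), swap_mutation(b, rng)]
        population = offspring[:POP_SIZE]
        best = max(population + [best], key=chromosome_cost)
    return best[:TOP_K], chromosome_cost(best)


top_links, cost = run_ga()
print(top_links, cost)
```

The sketch remembers the best chromosome seen across generations, so the returned cost never falls below that of the best initial chromosome; whether the original program does the same is not stated.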

Input:
    Initial population size, PopSize
    Number of generations, N
    Crossover rate, CR
    Mutation rate, MR
Output: Set of top priority web links, TopLink-P
Cost function:
    Access frequency of each page ($AF_i$)
    The amount of time the user stayed on the page ($DUR_i$)
    Number of unique users ($UNQ_i$)
    Number of bytes received ($BR_i$)
    Common entry and exit points ($EP_i$)
    Number of advertisements ($AD_i$)
    $\mathrm{Cost}(C) = C_1 \sum_{i=1}^{n} AF_i + C_2 \sum_{i=1}^{n} UNQ_i + C_3 \sum_{i=1}^{n} DUR_i + C_4 \sum_{i=1}^{n} BR_i + C_5 \sum_{i=1}^{n} EP_i + C_6 \sum_{i=1}^{n} AD_i$
Method:
1. Generate initial population set of randomly selected web links,
   WebLinks[PopSize]
2. Evaluate each Top-P web link in the set of Top-P web links,
   WebLinks[PopSize], using the cost function
3. While Generation <= N Do
   a) Perform binary tournament selection, CrossoverWebLinks[PopSize]
   b) Apply cyclic crossover among CrossoverWebLinks[PopSize]
      (WLinkParent1, WLinkParent2) = RandomlyChoose(CrossoverWebLinks[PopSize])
      (WLinkOffspring1, WLinkOffspring2) = CyclicCrossover(WLinkParent1, WLinkParent2)
   c) Copy (WLinkOffspring1, WLinkOffspring2) to NewWebLinks
      NewWebLinks[] = (WLinkOffspring1, WLinkOffspring2)
   d) Perform mutation with mutation rate, MR
   e) Copy NewWebLinks to the initial set of Top-P WebLinks
      WebLinks[] = NewWebLinks[]
   End While
4. TopLink-P WebLinks = LowCostWebLink(WebLinks[])
5. Return TopLink-P

Fig. 5. Proposed GA based Algorithm (PGA)

[Fig. 6: chromosomes CR1 to CR10 and their fitness cost at the
first (Step 1), second (Step 2), third (Step 3) and fiftieth
(Step n) generations; costs only]

Chromosome | Generation 1 | Generation 2 | Generation 3 | Generation 50
CR1        | 215          | 509          | 509          | 509
CR2        | 310          | 509          | 509          | 509
CR3        | 215          | 351          | 310          | 509
CR4        | 509          | 335          | 509          | 509
CR5        | 335          | 215          | 509          | 509
CR6        | 215          | 215          | 478          | 509
CR7        | 215          | 335          | 509          | 509
CR8        | 509          | 310          | 509          | 509
CR9        | 351          | 215          | 233          | 509
CR10       | 310          | 310          | 509          | 509

VI. EXPERIMENTATION AND RESULTS

The results produced by implementing the PGA in a programming
language are shown in the form of graphs. In our program we have
included parameters from both the content of the web and the usage
pattern of the web pages. The comparison of the results of PGA and
the existing approach EA is shown in Fig. 7. We ran the program
for 50 generations with different crossover rates ranging from 50%
to 75%. The cost of the web links varies with the crossover rate.
We also studied the quality of the web pages for different numbers
of generations: we increased the number of generations up to 400,
kept the crossover rate constant at 75%, and compared the values
of the fitness score. The findings show that the cost varies on
moving from one generation to the next. We also studied the effect
of the crossover rate on the cost. The experimental results show
that in most cases the cost of the TopLink-P web pages is better
for the proposed approach than for the existing approach.


Fig. 7. PGA vs EA at crossover rates 50%, 60%, 70%, 75%

VII. CONCLUSION

As the amount of information present on the Internet has grown to
a gigantic size, it has become necessary to increase the
efficiency of search engines. Web mining aims in this direction.
It helps in mining information on the basis of the content,
structure and usage of web pages. The proposed GA based approach
combines information from both the content and the usage of a web
page in order to provide the required and relevant pages to the
user. We calculated the cost of the web pages until the value
converged in order to get the most optimized result. This cost is
used as the parameter for determining the relevance of the
TopLink-P web pages. We have presented the experimental results in
graphical form. These results show the superiority of the proposed
approach over the existing approach.

REFERENCES

[1] F. Picarougne, N. Monmarche, A. Oliver and G. Venturini,
"GeniMiner: Web Mining with a Genetic-Based Algorithm," ICWI,
pp. 263-270, 2002.
[2] D. Zhang and Y. Dong, "A novel web usage mining approach for
search engines," Computer Networks, vol. 39(3), pp. 303-310, 2002.
[3] W. Fan, M. Gordon and P. Pathak, "Genetic programming-based
discovery of ranking functions for effective web search," Journal
of Management Information Systems, vol. 21(4), pp. 37-56, 2005.
[4] C. Romero, S. Ventura, C. Hervas and P. Gonzalez, "Rule
Discovery in web-based educational systems using Grammar-Based
Genetic Programming," Data Mining XI: Data Mining, Text Mining and
Their Business Applications, pp. 205-214, 2005.
[5] R. C. Chakraborty, "Fundamentals of Genetic Algorithms,"
Artificial Intelligence, 2010.
[6] E. Andaur, S. Rios, P. Roman and J. Velasquez, "Best Web Site
Structure for Users Based on a Genetic Algorithm Approach,"
University of Chile, 2010.
[7] J. Srivastava, R. Cooley, M. Deshpande and P. N. Tan, "Web
Usage Mining: Discovery and Applications of Usage Patterns from
Web Data," ACM SIGKDD Explorations Newsletter, 1(2), pp. 12-23,
2000.
[8] R. L. Haupt, "Practical Genetic Algorithms," John Wiley & Sons
Inc., Chapter 1-7, pp. 1-251, 2004.
[9] J. Srivastava, R. Cooley, M. Deshpande and P. N. Tan, "Web
usage mining: Discovery and applications of usage patterns from
web data," ACM SIGKDD Explorations Newsletter, 1(2), pp. 12-23,
2000.
[10] S. P. Nina, M. Rahman, K. I. Bhuiyan and K. Ahmed, "Pattern
discovery of web usage mining," in Computer Technology and
Development, ICCTD 09 International Conference on, vol. 1,
pp. 499-503, IEEE, 2009.
[11] O. Nasraoui, M. Soliman, E. Saka, A. Badia and R. Germain, "A
web usage mining framework for mining evolving user profiles in
dynamic web sites," Knowledge and Data Engineering, IEEE
Transactions, vol. 20(2), pp. 202-215, 2008.
[12] S. K. Pal, V. Talwar and P. Mitra, "Web mining in soft
computing framework: Relevance, state of the art and future
directions," Neural Networks, IEEE Transactions, vol. 13(5),
pp. 1163-1177, 2002.
[13] K. Etminani, A. R. Delui, N. R. Yanehsari and M. Rouhani,
"Web usage mining: Discovery of the users' navigational patterns
using SOM," IEEE First International Conference in Networked
Digital Technologies, NDT'09, pp. 224-249, 2009.
[14] A. Abraham and V. Ramos, "Web usage mining using artificial
ant colony clustering and linear genetic programming," in
Evolutionary Computation, CEC'03, IEEE, vol. 2, pp. 1384-1391,
2003.
[15] C. C. Lin, "Optimal Web site reorganization considering
information overload and search depth," European Journal of
Operational Research, 173(3), pp. 839-848, 2006.
[16] X. Jin, Y. Zhou and B. Mobasher, "Web usage mining based on
probabilistic latent semantic analysis," in Proceedings of the
tenth ACM SIGKDD international conference on Knowledge discovery
and data mining, pp. 197-205, 2004.
[17] A. Pitman, M. Zanker, M. Fuchs and M. Lexhagen, "Web usage
mining in tourism-a query term analysis and clustering approach,"
Information and Communication Technologies in Tourism,
pp. 393-403, 2010.
[18] M. Mitchell, "An Introduction to Genetic Algorithms," MIT
Press, Chapter 1-6, pp. 1-203, 1998.
[19] T. V. Mathew, "Genetic Algorithm," Indian Institute of
Technology Bombay, Mumbai, pp. 1-15, 2012.
[20] A. R. Simpson, G. C. Dandy and L. J. Murphy, "Genetic
algorithms compared to other techniques for pipe optimization,"
Journal of Water Resources Planning and Management, 120(4),
pp. 423-443, 1994.
[21] A. K. Mishra, M. K. Mishra, V. Chaturvedi, S. K. Gupta and
J. Singh, "Web usage mining using self organized maps,"
International Journal of Advanced Research in Computer Science and
Software Engineering, vol. 3(6), pp. 532-539, 2013.