
Int. J. Advanced Networking and Applications
Volume: 09 Issue: 05 Pages: 3571-3579 (2018) ISSN: 0975-0290

Optimizing Ontology Mapping Using Genetic Algorithms (OOMGA)
Aarti Singh
Department of Computer Science, Guru Nanak Girls College, Yamuna Nagar, Haryana, India
Email: singh2208@gmail.com
-------------------------------------------------------------------ABSTRACT----------------------------------------------------------
Ontologies play a vital role in knowledge representation in artificially intelligent systems. With the emergence and
acceptance of the semantic web and the associated services offered to users, more and more ontologies have been
developed by various stakeholders. Different ontologies need to be mapped for various systems to
communicate with each other. Ontology mapping is an open research issue in web semantics. An exact mapping
of ontologies is rarely achievable, so it is an optimization problem. This work presents an optimized ontology
mapping mechanism which deploys a genetic algorithm.

Keywords: Genetic Algorithm, Ontology, Ontology Alignment, Ontology Mapping, Optimized Ontology Mapping.
---------------------------------------------------------------------------------------------------------------------------------------------
Date of Submission: Feb 12, 2018    Date of Acceptance: March 07, 2018
---------------------------------------------------------------------------------------------------------------------------------------------
1. Introduction

The semantic web emphasizes incorporating meaning with the information displayed on the web. Ontologies are the backbone of knowledge exchange in the semantic web, where an ontology is the taxonomy for a domain, representing concepts, objects, attributes and their relationships with each other. An ontology represents a shared conceptualization (Gruber, 1995) of a domain for use in semantics-driven applications on the present Internet, where shared conceptualization refers to the commonly accepted understanding of the conceptual model of the domain under consideration. Ontologies (Singh et al., 2011) find applicability in system engineering, the semantic web, artificially intelligent systems, and information extraction and aggregation, to name a few areas. Ontologies (Singh et al., 2010) aim to capture knowledge in a generic and formal way so that it may be reused and shared across applications and by groups of people.

However, with the wide acceptance of internet-based applications, more and more ontologies are being developed by various stakeholders for different purposes, making their interoperability difficult. Further, considering the size of the internet, its users and the variety of applications in use, it is difficult to force users to work with a single ontology for a domain. Yet for different applications to communicate with each other and exchange knowledge, it becomes essential for ontologies to be interoperable. The semantic web community considers this an important issue, and many efforts have been made in this direction under the names of ontology alignment or ontology mapping. Some researchers have tried to focus on optimizing ontology mapping; however, the reason for optimizing ontology mapping and the scenarios requiring it are not clearly stated. Before moving further, some basic terms related to ontologies need to be made clear. The next subsection throws light on some such terms.

1.1 Technical Preliminaries
This section briefly explains some terms which need to be understood clearly for understanding this work:

a. Ontology Mapping: Ontology mapping refers to the method of translating concepts of one ontology into concepts defined in some other ontology. Ontology mapping usually involves some loss of information; however, it doesn't lead to inconsistencies. Ontology alignment and articulation are used synonymously for ontology mapping. These are defined as:
• Ontology Alignment: refers to establishing a set of binary relations between the vocabularies of two ontologies.
• Ontology Articulation: involves the generation of rules through which fusion or merging of ontologies can be carried out. Conditions of ontology alignment are referred to as articulation (Chitra & Aghila, 2014).

b. Similarity Measure: Similarity is a numeric measure of the degree to which two objects are alike. Similarity measures focus on providing a concrete basis for finding similarity between two entities belonging to separate ontologies. Two objects must have similar characteristics to be comparable. The formal definition of similarity between two objects x and y, as given by Ehrig and Sure (2004), states:
• sim(x, y) ∈ [0..1]
• sim(x, y) = 1 → x = y: two objects are identical.
• sim(x, y) = 0: two objects are different and have no common characteristics.
• sim(x, x) = 1: similarity is reflexive.

• sim(x, y) = sim(y, x): similarity is symmetric.

Many text similarity measures exist in the literature. Broadly, similarity measures may be classified as (Lee et al., 2008):
1) Manual similarity measurement by agreement among experts: This is the accepted gold standard for similarity measurement, against which most derived metrics have been evaluated, using a peer-review standard to assess their performance. However, this approach is infeasible due to lack of scalability.
2) Information-content based similarity measurement: It involves computing the frequency with which a term appears with another in a given piece of information. This approach takes a statistical view of information for computing the closeness of two terms.

The second category above receives most attention because mathematical formulas are available to justify decisions concretely. In this category, vector space model (VSM) measures are widely accepted; they consider a text as a vector of terms, each joined with some frequency. VSMs perform well on tasks that involve measuring the similarity of meaning between words, phrases, and documents (Turney & Pantel, 2010). Methods in this category include the Dice coefficient, overlap coefficient, Jaccard similarity and cosine similarity; however, the cosine similarity measure outperforms the others (Thada & Jaglan, 2013). This work makes use of cosine similarity and the Jaccard coefficient. Thus both these measures are defined below:

c. Cosine Similarity: This is the most popular technique to measure the similarity of two frequency vectors. These vectors may be simple or weighted. It can handle both binary and non-binary vectors. Let a and b be two frequency vectors having n elements each:
a = <a1, a2, a3, ..., an>
b = <b1, b2, b3, ..., bn>
then the cosine of the angle θ between these two vectors is calculated as:

$\cos(a, b) = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$ (1)

The cosine similarity value may range over [-1, 1]; it will be -1 when the vectors point in opposite directions and +1 if the vectors point in the same direction (more details may be found in Turney & Pantel, 2010).

d. Jaccard Index or Jaccard coefficient: is useful to measure similarity between two objects having binary attributes. It measures the similarity between two sample sets and is defined as the size of the intersection of the two sets divided by the size of the union of the two sets. The Jaccard coefficient J (Renjith & Chandrika, 2013) can be computed as:

$J = \frac{T_{11}}{T_{01} + T_{10} + T_{11}}$ (2)

where $T_{11}$ refers to terms common to both objects, $T_{10}$ refers to terms unique to one object and $T_{01}$ refers to terms unique to the second object. A Jaccard index of value 1 indicates that the data objects are completely similar, whereas value 0 indicates they are completely dissimilar.

After understanding the basic terms, the reason for optimized ontology mapping needs to be understood. Since ontologies are designed by different sources, there is a lack of consistency in the taxonomies they use, even if they are designed for the same domain. Two different ontologies designed for the same domain may refer to the same concept by different names or to different concepts by the same name, or they may focus on different attributes of the concepts. Now, when a concept c1 from ontology O1 has to be mapped to some concept c2 in ontology O2, first c1 has to be searched for in O2 using some similarity measure. There are two possibilities: a match may be found, or no match may be found. If a match is found in the form of a concept synonymous with c1, the mapping is established; otherwise some relationship needs to be established between the concepts of O1 and O2 in order to ensure mapping. Ontology extension and intension relationships (Singh et al., 2011) are used for this purpose. If ontology mapping relies only on a similarity measure, there are chances that no similarity between two concepts is found and the mapping can't be established. This leads to wasted search time. Ontology mapping is an optimization problem since an exact matching of concepts is not essential even in homogeneous domain ontologies, let alone in heterogeneous domain ontologies.

However, another aspect can be to match one ontology with many possible ontologies existing in the same domain and to find the closest possible matching ontology. Thus the optimized ontology mapping process may be defined as “mapping one ontology with n other ontologies existing in a domain, to find closest possible matching ontology, when no exactly matching ontology otherwise exists”. Optimization techniques focus on finding a satisfying solution (an optimal one) in cases where no solution otherwise exists [24]. Figure 1 illustrates ontology mapping as an optimization problem.

Figure 1. Ontology Mapping as Optimization Problem
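The two measures defined in Section 1.1 can be illustrated with a minimal Python sketch. The function names `cosine_similarity` and `jaccard`, and the conventions chosen for zero vectors and empty sets, are assumptions made for this illustration and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length frequency vectors (equation 1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_a * norm_b)


def jaccard(s, t):
    """Size of the intersection divided by size of the union (equation 2)."""
    s, t = set(s), set(t)
    union = s | t
    if not union:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(s & t) / len(union)


# Term-frequency vectors over the vocabulary ["person", "human"]:
# the two strings share no term, so their cosine similarity is 0.
person = [1, 0]
human = [0, 1]
print(cosine_similarity(person, human))
print(jaccard({"person"}, {"human"}))
```

Both functions treat terms purely as symbols, which is why synonymous strings score 0 under either measure unless synonym information is added separately.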

Ontology mapping involves searching for the concepts of one ontology in another. The size of these taxonomies can be quite large, leading to increased time and space complexity of the search process. Thus, heuristic search techniques need to be employed to reduce the number of alternatives to be explored in the search space. Heuristic search techniques make use of a fitness function to decide the next alternative to be explored among the many available. This is usually implemented by assigning weights to the various alternatives, i.e. the candidates in a search space. However, manual assignment of these weights is neither practically feasible nor desirable in web-based applications. A still better mechanism for searching ontologies and automating the computation of the fitness function is the use of machine learning techniques such as genetic algorithms.

Consequently, the main aim of the current work is to present a genetic algorithm based optimized ontology mapping technique.

The rest of the paper is structured as follows: Section 2 provides a brief overview of genetic algorithms and their working. Section 3 presents a survey of relevant literature in ontology alignment, ontology similarity parameters and genetic algorithms. Section 4 introduces the proposed mechanism; experimental analysis is illustrated in section 5. Finally, section 6 concludes this work.

2. Genetics for Ontology Mapping: An Overview
The Genetic Algorithm (GA) (Man et al., 1996) is based on evolutionary theory and follows the principle of ‘survival-of-the-fittest’. It was presented by J. H. Holland in the 1970s and has since proved to be a significant instrument for scientific and engineering applications (Malhotra et al., 2011). GA works on natural processes of evolution such as reproduction, mutation, recombination and selection to provide solutions to complex and conflicting problems. Due to the availability of cheap and high-speed computational components, GA has emerged as an appealing solution for a wide range of complex, time-consuming tasks such as information retrieval (Thada & Jaglan, 2013), ontology mapping (Wang et al., 2006) and text mining.

GA starts with an initial population, where a population is a set of possible solutions for a problem. Each member of the population is termed a chromosome and represents a string of genes, where a gene represents a bit pattern. The goal is to obtain a set of most suitable chromosomes, or the single most suitable chromosome, after some iterations of GA. The suitability of a chromosome for a particular problem is measured using a fitness function (Renjith & Chandrika, 2013). A population obtained after some iterations is called a generation.

The effectiveness of the next generation is enhanced by applying reproduction, crossover and mutation operations. The purpose of these operations is to mix or recombine the genes of parents to produce their offspring in the next generations. Here reproduction refers to selecting the fittest chromosomes based on their fitness values. Crossover refers to exchanging genes between two individual chromosomes of a population to produce new offspring. Mutation deals with randomly changing genes in a chromosome. It is of two types: point mutation and chromosomal mutation. In point mutation only a single gene is altered in a chromosome, whereas in chromosomal mutation a few genes are altered completely.

Thus the process of GA for problem solving may be summarized as follows:
1) Obtain a set of initial population
2) Iterative execution of:
(i) Evaluation
(ii) Selection
(iii) Reproduction
(iv) Crossover
(v) Mutation
3) Convergence to a solution

The next section presents a literature review in the relevant domains.

3. Literature Survey
This section explores existing literature on ontology similarity measures and mapping mechanisms, and the various methods available for ontology mapping optimization.

Man et al. (1996), in [7], introduced GA as a complete entity in which knowledge can be integrated to develop a framework for a design tool. The authors highlighted that genetic algorithms may be used as an optimization tool.

Maedche and Staab (2001), in [1], considered ontologies as semiotic sign systems that are used to communicate meaning. They proposed a methodology to measure the extent to which two ontologies overlap and fit with each other at various semiotic levels. However, evaluation of the proposed method with real-world data is left as part of future work.

Wiesman and Roos (2004), in [4], introduced an agent-based, domain-independent method for ontology mapping based on learning relationships between ontologies. However, mappings between different representations of the same concept can't be handled properly. The authors emphasized that context-dependent ontology mapping is an NP-hard problem. Further, an extension of this method to learn a mapping between groups of interrelated concepts has been left as part of future research.

Euzenat (2004), in [5], compared ontology alignment methods on common tests. The main purpose of this evaluation was to help designers and developers of such methods improve them further, and to help users evaluate the suitability of proposed methods for their applications.

A semi-automatic ontology mapping tool called GLUE was deployed by Doan et al. (2004) in [9]. This tool makes use of a multi-strategy learning approach. It uses the Naïve Bayes learning technique, which applies well to long textual elements but is less effective with short, numeric elements.

Wang et al. (2006), in [8], developed a genetic algorithm based optimization procedure for the ontology matching problem, treating it as a feature-matching process. A global similarity measure between two ontologies, based on feature sets, is taken as the fitness function.

Martinez-Gil et al. (2008), in [2], presented the Genetics for Ontology Alignments (GOAL) approach to compute the optimal ontology alignment function for a given ontology input set. However, a multi-objective strategy that avoids unwanted deviations in precision and recall values is left as part of future study. Further, the authors emphasized that there should be a technique which, given the specifications of an ontology matching problem, may compute the optimum alignment function, so that the ontology alignment problem may be solved accurately and without human intervention. This would lead to real interoperability in the semantic web.

Lin and Sandkuhl (2008), in [14], provided a review on exploiting Wordnet for ontology mapping. The authors emphasized that synonyms can help solve naming conflicts [4] among various ontologies during mapping, and that Wordnet thesauri can help improve the similarity measures used in ontology mapping.

A design structure for the development [12] of ontological databases in general was proposed by Singh et al. (2010) in [11]. This work elaborated the minute details to be considered while designing ontology databases to make knowledge interchange language independent.

Malhotra et al. (2011), in [6], discussed the concept and design procedure of genetic algorithms as an optimization tool. They applied GA to process control in an induction motor drive, speed control of a gas turbine, etc., and optimized the control parameters for them.

Singh et al. (2011), in [10], proposed an agent-based ontology mapping mechanism for mapping in homogeneous as well as heterogeneous domains, in order to facilitate interoperability between multi-agent systems developed by different stakeholders for different purposes. This mechanism makes use of ontology extension and intension concepts. However, this work doesn't consider optimization during ontology mapping.

Hartung et al. (2013), in [3], presented the Generic Ontology Matching and Mapping Management (GOMMA) framework, which uses n-gram matching to compute the similarity of concept names and synonyms. This work outlined the use of a Graphical Processing Unit (GPU) for highly parallel string matching. GPU-based execution of algorithms like n-gram matching requires some effort to overcome the CPU limitations but boosts performance. However, the effect of different kinds of GPU hardware on GPU-based similarity computations has been left as part of future research.

Singh and Anand (2013), in [13], developed an agent-based mechanism for automatic construction of domain ontologies. The authors used mappings between already existing ontologies to construct a new ontology, thus reducing the time and effort required in this process. A comparison and summary of the various existing techniques is given in Table 1.

Table 1. Comparison of Existing Approaches

| S. No. | Name of mechanism | Author Name | Technique used | Style of mapping | Results | Limitations |
|---|---|---|---|---|---|---|
| 1 | Lexicon based ontology comparison | Maedche and Staab (2001) in [1] | Semiotics view of ontology is considered | Syntactic and semantic comparison level used. Composite matching technique | Much more experience is needed to use ontology similarity measures. | Ontologies are compared as sign systems. Lexicon, reference functions and semantic cotopy are used for this purpose. Optimization is not considered. |
| 2 | Wiesman and Roos approach | Wiesman and Roos (2004) in [4] | Agent based ontology mapping mechanism | Automatic, joint attention technique used | Ontology mapping is based on labels and independent of domain knowledge | Ontology mapping is of concern, optimization is not addressed |
| 3 | GLUE [9] | Doan et al. (2004) in [9] | Joint probability distribution of concepts, multi-strategy learning method | Semi-automatic | 3-18% accuracy in matching | Naïve Bayes learning technique used, works well with long textual terms, not effective for short numeric terms |
| 4 | GAOM | Wang et al. (2006) in [8] | Feature-matching process, global similarity measure is used | Genetic algorithm used, automatic mapping | Not mentioned | Only structural properties of ontologies are considered. Semantics has been ignored. |
| 5 | Genetics for Ontology Alignment (GOAL) | Martinez-Gil et al. (2008) in [2] | Genetic algorithm | Single goal-driven search, automatic mapping method | Precision and recall are better than GAOM | Single-strategy ontology mapping. Ontology mapping optimization is not considered. |
| 6 | IAM3I | Singh et al. (2011) in [10] | Multi-agent system based ontology mapping mechanism | Automatic, ontology extension and intension concepts used | Homogeneous and heterogeneous ontologies can be mapped | Optimization in ontology mapping not considered |
| 7 | Generic Ontology Matching and Mapping Management (GOMMA) | Hartung et al. (2013) in [3] | n-gram string comparison | Semi-automatic | GPU based mechanism of optimization, suffers from memory constraints. | Memory must be pre-allocated on target device. Works only for integer values. |

From the above table it can be concluded that, although many efforts have been made towards ontology mapping, optimization of ontology mapping is still an open research issue. It is clear that genetic algorithms may be used for problems having large search spaces. Some researchers have already applied this technique to ontology mapping; however, there is still scope for a mechanism which incorporates semantic knowledge in the optimization process. Therefore, the motivation of the current work is to develop an approach for optimizing ontology mapping using genetic algorithms, as introduced in the next section.

4. The Proposed Optimizing Ontology Mapping Using Genetic Algorithms (OOMGA) Approach

This work presents the Optimized Ontology Mapping using Genetic Algorithm (OOMGA) mechanism for optimal ontology mapping. This mechanism takes into consideration synonymous concepts existing in the compared ontologies along with the usual method of term-frequency based mapping. The reason for deploying GA among all machine learning techniques is that GA specializes in searching very high dimensional search spaces, as this problem is.

This work focuses on finding the optimal matching ontology for a source ontology from a large number of existing ontologies. Consider a source ontology SO1 consisting of n concepts and k target ontologies available for mapping, each consisting of m concepts. The total number of comparisons required to choose the best match is then given by the following equation:
Optimal_matching(SO1) = f(n × k × m) (3)

In order to solve this problem using GA, both the fitness function (FT) and the evaluation function need to be decided. The ontology taxonomies (hierarchies) (OH) act as input to the formation of the chromosomes of the sample space, where a chromosome is a collection of i genes.

For formulating genes, OH is traversed from the root node to a leaf node in depth-first order. One such traversal produces one gene, and traversal of the complete OH produces i genes {g1, g2, g3, ..., gi}. Thus the source ontology hierarchy OHs can be represented as a chromosome Cs where
Cs = {g1s, g2s, g3s, ..., gis} (4)

Ontology mapping involves comparison of Cs(OHs) with {C1(OH1), C2(OH2), ..., Ck(OHk)} as shown below in figure 2:

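[Figure 2: the genes g1s, g2s, g3s, ... of the source chromosome s are compared with the genes g11 ... g1m, g21 ... g2m, g31 ... g3m, g41 ... g4m, ..., gk1, gk2, gk3, gk4, ... of the target chromosomes 1, 2, 3, 4, ..., k.]

Figure 2. Process of Ontology Mapping using GA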


Subsequent to these comparisons between the two genes, it is required to compute their similarity, which is a vector-space category problem. The vector space model, also known as the term-vector model, represents a textual document as a vector of terms or words. Here the similarity of a query in the vector space of a document may be calculated using cosine similarity (Turney and Pantel, 2010) (also known as the normalized correlation coefficient) or the Jaccard coefficient, as discussed in section 1.1.

For textual vectors, cosine similarity lies between 0 and 1. However, cosine similarity doesn't consider the magnitude or semantics of terms. Rather, it only focuses on the syntactic similarity of two vectors, which is not sufficient for optimizing ontology mapping. While comparing two ontologies, similar terms may be expressed using different strings; for example, Person and Human are synonyms but their cosine similarity would be 0. However, if we consider the contextual similarity of these terms, they are similar. Consequently, contextual similarity should also be considered in order to provide an optimal mapping between ontologies. Therefore, the Jaccard coefficient can be used, as it provides the magnitude of difference between two genes, as follows:

$J(g_{1s}, g_{11}) = \frac{|g_{1s} \cap g_{11}|}{|g_{1s} \cup g_{11}|}$ (5)

The Jaccard coefficient (J) between two genes will be 1 or close to 1 if they are identical or nearly identical; it will be 0 for genes that have no terms in common.

The fitness function of the proposed framework is defined as:

$fitness\_fun = cos\_sim(g_{1s}, g_{11}) + J(g_{1s}, g_{11})$ (6)

If there is no semantic similarity between two genes or two ontologies, then the Jaccard coefficient (J) will be 0 or close to zero. The similarity will then depend mainly on the Cosine_similarity of the genes.

4.1 Example for mapping between two educational ontologies
To clarify the above stated concept, consider the two example ontologies shown in figures 3 and 4. Both ontologies are from the education domain; one represents part of a university ontology and the other illustrates part of a school ontology.

Employees
    Officer: Registrar, Deputy Registrar
    Faculty: Asstt. Professor, Assoc. Professor, Professor
    Non-Teaching: Clerk, Store-Keeper, Technical Asstt.

Figure 3. Part of University Ontology

Staff
    Officer: Principal, Secretary
    Faculty: Teacher, Lecturer, Sr. Lecturer
    Non-Teaching: Clerk, Peon, Supporting staff

Figure 4. Part of School Ontology
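The gene formulation described in Section 4 (one gene per root-to-leaf traversal in depth-first order) can be sketched in Python using the university ontology of Figure 3. The dictionary layout, the helper name `root_to_leaf_genes`, and the grouping of leaf concepts under Officer, Faculty and Non-Teaching (read off the figure layout) are assumptions made for this illustration.

```python
def root_to_leaf_genes(tree, root):
    """Depth-first traversal; every root-to-leaf path becomes one gene."""
    genes = []

    def visit(node, path):
        path = path + [node]
        children = tree.get(node, [])
        if not children:  # leaf reached: the path so far is one gene
            genes.append(path)
        for child in children:
            visit(child, path)

    visit(root, [])
    return genes


# Hypothetical encoding of Figure 3 as a parent -> children mapping.
university = {
    "Employees": ["Officer", "Faculty", "Non-Teaching"],
    "Officer": ["Registrar", "Deputy Registrar"],
    "Faculty": ["Asstt. Professor", "Assoc. Professor", "Professor"],
    "Non-Teaching": ["Clerk", "Store-Keeper", "Technical Asstt."],
}

# The chromosome of the source ontology is the list of its genes.
chromosome = root_to_leaf_genes(university, "Employees")
for gene in chromosome:
    print(gene)
```

Under this encoding the chromosome has one gene per leaf concept, and the path Employees, Faculty, Asstt. Professor corresponds to the gene g1s used in the worked example of this section.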

To find a mapping between these two ontologies, first the concepts need to be checked for their synonyms. For this, every unique concept of both ontologies is assigned a unique numeric value and stored in a linear array termed the Unique Identification Array (UIA), shown in table 2, where the serial no. of a concept in the array is the numeric value associated with it.

Now, all these terms are checked for their synonyms in the thesaurus-based Wordnet dataset, in order to include the contextual similarity of different terms in the two ontologies under consideration. These synonyms are saved as a row in a two-dimensional matrix called the Synonym Set Matrix, shown in Table 3. Each row in table 3 corresponds to a concept of the UIA (table 2), where the concept no. field indicates the position of the concept in the UIA. For example, the first row of table 3 contains the synonyms for concept no. 1 in the UIA (table 2).

Table 2. Unique Identification Array (UIA)

| Serial No. | Concept name |
|---|---|
| 1 | Employee |
| 2 | Officer |
| 3 | Faculty |
| 4 | Non-teaching |
| 5 | Registrar |
| 6 | Deputy Registrar |
| 7 | Asstt. Professor |
| 8 | Assoc. Professor |
| 9 | Professor |
| 10 | Clerk |
| 11 | Store Keeper |
| 12 | Technical Asstt. |
| 13 | Staff |
| 14 | Principal |
| 15 | Secretary |
| 16 | Teacher |
| 17 | Lecturer |
| 18 | Sr. Lecturer |
| 19 | Peon |
| 20 | Supporting Staff |
| 21 | Worker |

Table 3. Synonym Set Matrix

| Concept No. | Synonyms |
|---|---|
| 1 | Staff, worker |
| 2 | CEO, OSD |
| 3 | Teacher, Lecturer, staff |
| 4 | Person not in teaching |
| 5 | - - - |
| 6 | - - - |
| 7 | Lecturer |

Now, every concept of the source and target ontologies has a synonym set associated with it. These synonyms are represented as numeric values using the UIA table. For example, the concept employee has the synonym set {staff, worker}, which can be represented as {13,21} using the positional values of staff and worker from table 2; similarly, the term faculty has the synonym set {13,16,17}.

For generalization, consider comparing two genes for similarity, i.e. checking whether g1s matches g11, where:
g1s = {employee, faculty, asstt. prof.} = {1,3,7} (7)
g11 = {staff, faculty, lecturer} = {13,3,17} (8)

Before comparing, g1s is looked up in the synonym set matrix (table 3) and its synonymous set, termed syn_set, is generated by replacing each term with each of its synonyms, one at a time. For example, syn_set for g1s is given below:
syn_set(g1s) = {{13,21},{13,16,17},{17}}
Using this, g1s can be rewritten in expanded form as shown below.
g1s = {{1,3,7},{13,3,7},{21,3,7},{1,13,7},{1,16,7},{1,17,7},{1,3,17}} (9)
g11 = {13,3,17} (10)

Compared with the original equations (7) and (8), where only one term matched exactly, the new equations (9) and (10) match two of the three terms (13 and 3) at the second subset in equation (9), based on the contextual similarity of these terms. Now the J-coefficient for g11 and each subset is computed, and the maximum among all calculated values is taken as the J-coefficient of the original pair (g1s, g11). To obtain more relevant matches and fewer false negatives, the fitness function is then computed. This similarity calculation mechanism is better than cosine similarity alone, as it incorporates the contextual similarity of terms in various ontologies.

4.2 Work Flow of OOMGA
Figure 5 illustrates the work flow in OOMGA. For optimized ontology mapping, the concepts of the source ontology are first converted into genes. All unique terms of these genes are entered into the UIA and assigned unique integer values. Further, synonyms of all unique terms are obtained from Wordnet and inserted into the synonym set matrix. Afterwards, the genes are converted into numeric sets. Then the synonymous set (syn_set) is generated for the source gene and used for computing the Jaccard coefficient with the target gene. In this process, the J value for contextually similar genes becomes close to one. The cosine similarity of the source and target genes is also computed.

[Figure 5: flowchart with the following steps in order: (1) ontology preprocessing and tokenization; (2) unique numeric identification assignment; (3) synonym set matrix population using Wordnet; (4) generate synonymous subsets of source gene; (5) compare synonymous subsets with target gene and compute J; (6) compute cosine similarity for source and target gene; (7) compute fitness function for source and target gene and store in fitness matrix; (8) compare fitness value with evaluation criteria and generate new population; (9) perform crossover and mutation with some probability.]

Figure 5. Work flow in OOMGA

Table 4. Fitness Matrix

| Gene No. | Fitness value | | |
|---|---|---|---|
| g1s | Fitness_val(g1s, g11) | Fitness_val(g1s, g12) | - - |
| g2s | Fitness_val(g2s, g11) | Fitness_val(g2s, g12) | - - |
| g3s | Fitness_val(g3s, g11) | Fitness_val(g3s, g12) | - - |
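The last two steps of the work flow (threshold-based selection, then crossover and mutation with some probability) can be sketched generically in Python. This is not the OOMGA implementation, which the paper leaves as future work: toy bit-string chromosomes and a toy fitness (the count of 1-bits) stand in for ontology genes and equation (6), and all function names, the threshold and the probabilities are assumptions made for this illustration.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: swap the tails of two equal-length chromosomes."""
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def point_mutation(chromosome, alphabet, rng):
    """Point mutation: replace one randomly chosen gene with a value from alphabet."""
    mutated = list(chromosome)
    mutated[rng.randrange(len(mutated))] = rng.choice(alphabet)
    return mutated


def next_generation(population, fitness, threshold, alphabet, p_cross, p_mut, rng):
    """Select chromosomes meeting the threshold, then recombine and mutate them."""
    # Fall back to the whole population if nothing meets the threshold.
    survivors = [c for c in population if fitness(c) >= threshold] or list(population)
    children = []
    while len(children) < len(population):
        a, b = rng.choice(survivors), rng.choice(survivors)
        if rng.random() < p_cross:
            a, b = crossover(a, b, rng)
        for child in (a, b):
            if rng.random() < p_mut:
                child = point_mutation(child, alphabet, rng)
            children.append(list(child))
    return children[:len(population)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [[rng.randint(0, 1) for _ in range(6)] for _ in range(8)]
new_population = next_generation(population, sum, 3, [0, 1], 0.8, 0.1, rng)
print(new_population)
```

In an actual experiment the toy fitness would be replaced by the fitness values stored in the fitness matrix, and the threshold and probabilities would be chosen at experiment time.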

Then, the fitness function between two genes is computed using equation (6) and stored in the fitness matrix shown in Table 4. The purpose of Table 4 is to keep a record of fitness function values when a source gene is compared with different target genes. Based on a threshold value, genes are selected for the next generation, and then mutation and crossover operations are applied with some probability (to be decided at the time of experiment) to generate the next generation.

This process is repeated on all ontologies under consideration for mapping, and the best matching ontologies are considered the optimal matching pair.

4. Conclusions and Future Work
This work presented an optimized ontology mapping technique deploying a genetic algorithm. GA specializes in searching very high dimensional search spaces, so it is a promising technique for optimized ontology mapping. Further, the proposed technique deploys a similarity calculation mechanism that is better than cosine similarity alone, as it incorporates the contextual similarity of terms in various ontologies during mapping optimization. However, the proposed mechanism is still in the process of implementation. Future work involves its implementation and comparison with existing techniques.

References:
1. Maedche, A., & Staab, S. (2001). Comparing ontologies-similarity measures and a comparison study (p. 16). AIFB.
2. Martinez-Gil, J., Alba, E., & Aldana-Montes, J. F. (2008, October). Optimizing ontology alignments by using genetic algorithms. In Proceedings of the workshop on nature based reasoning for the semantic Web. Karlsruhe, Germany.

3. Hartung, M., Kolb, L., Groß, A., & Rahm, E. (2013, January). Optimizing Similarity Computations for Ontology Matching-Experiences from GOMMA. In Data Integration in the Life Sciences (pp. 81-89). Springer Berlin Heidelberg.
4. Wiesman, F., & Roos, N. (2004, July). Domain independent learning of ontology mappings. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 2 (pp. 846-853). IEEE Computer Society.
5. Euzenat, J. (2004). 'Evaluating Ontology Alignment Methods'. Published in proceedings of Dagstuhl Seminar on Semantic Interoperability and Integration, September 2004, Wadern, Germany.
6. Malhotra, R., Singh, N., and Singh, Y. (2011). 'Genetic Algorithms: Concepts, Design for Optimization of Process Controllers'. Published by Canadian Center of Science and Education in International Journal of Computer and Information Science, Vol. 4, No. 2, March 2011, pp. 39-54.
7. Man, K. F., Tang, K. S., and Kwong, S. (1996). Genetic Algorithms: Concepts and Applications. IEEE Transactions on Industrial Electronics, 43(5), 519-534, October 1996.
8. Wang, J., Ding, Z., & Jiang, C. (2006, December). GAOM: Genetic algorithm based ontology matching. In Services Computing, 2006. APSCC'06. IEEE Asia-Pacific Conference on (pp. 617-620). IEEE.
9. Doan, A., Madhavan, J., Domingos, P., & Halevy, A. (2004). Ontology matching: A machine learning approach. In Handbook on ontologies (pp. 385-403). Springer Berlin Heidelberg. (GLUE Approach)
10. Singh, A., Juneja, D., & Sharma, A. K. (2011). Design of an Intelligent and Adaptive Mapping Mechanism for Multiagent Interface. In High Performance Architecture and Grid Computing (pp. 373-384). Springer Berlin Heidelberg.
11. Singh, A., Juneja, D., & Sharma, A. K. (2010). General Design Structure of Ontological Databases in Semantic Web. International Journal of Engineering Science and Technology, 2(5), 1227-1232.
12. Singh, A., and Anand, P. (2013). State of Art in Ontology Development Tools. International Journal of Advances in Computer Science & Technology, 2(7), 96-101, July 2013.
13. Singh, A., and Anand, P. (2013). Automatic Domain Ontology Construction Mechanism. Proceedings of 2013 IEEE International Conference on Recent Advances in Intelligent Computing Systems (RAICS), Trivandrum, Kerala, India, December 19-21, 2013 (pp. 304-309).
14. Lin, F., & Sandkuhl, K. (2008). A survey of exploiting wordnet in ontology matching. In Artificial Intelligence in Theory and Practice II (pp. 341-350). Springer US.
15. Ehrig, M., & Sure, Y. (2004). Ontology Mapping-an integrated approach. In the Semantic Web: Research and Applications (pp. 76-91). Springer Berlin Heidelberg.
16. Lee, W. N., Shah, N., Sundlass, K., & Musen, M. (2008). Comparison of ontology-based semantic-similarity measures. In AMIA annual symposium proceedings (Vol. 2008, p. 384). American Medical Informatics Association.
17. Gruber, T. R. (1995). Toward principles for the design of ontologies used for knowledge sharing. International journal of human-computer studies, 43(5), 907-928.
18. Gao, Y., & Gao, W. (2012). Ontology similarity measure and ontology mapping via learning optimization similarity function. International Journal of Machine Learning and Computing, 2(2), 107-112.
19. Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of artificial intelligence research, 37(1), 141-188.
20. Chitra, S., & Aghila, G. (2014). A survey on tools and algorithms of ontology operations. Research Journal of Engineering Sciences, 3(5), 12-25, May 2014.
21. Jing, L., Zhou, L., Ng, M. K., & Huang, J. Z. (2006, April). Ontology-based distance measure for text clustering. In Proceedings of the Text Mining Workshop, SIAM International Conference on Data Mining (Vol. 23).
22. Thada, V., & Jaglan, V. (2013). Comparison of Jaccard, Dice, Cosine similarity coefficient to find best fitness value for web retrieved documents using genetic algorithm. International journal of Innovations in Engineering and Technology (IJIET), 2(4), 202-205, August 2013.
23. Renjith, S., & Chandrika, A. (2013). Fitness function in genetic algorithm based information filtering- a survey. International Journal of Computer Science and Mobile Computing, 80-86, December 2013.
24. http://www.merriam-webster.com/dictionary/optimization
