Swoogle Semantic 201183583

Swoogle: A Semantic Web Search and Metadata Engine
Li Ding, Tim Finin, Anupam Joshi, Rong Pan, Pavan Reddivari, Vishal Doshi, R. Scott Cost, Joel Sachs, Yun Peng
Department of Computer Science and Electronic Engineering, University of Maryland Baltimore County, Baltimore MD 21250, USA
Presented by
Adhitya Bhawiyuga (201183583)
Content
Introduction
Semantic Web Document
Swoogle Architecture
Finding Semantic Web Document
Semantic Web Document Metadata
Ranking
Indexing and Retrieval
Current Status
Conclusion and Future Work
Swoogle : Introduction
Are you familiar with this?
Introduction : What is Swoogle
Swoogle is a search engine
Crawler-based indexing and retrieval system
Intended for the Semantic Web
Extracts metadata for each document
Computes relations between documents
Introduction : Related Work
Ontology-based annotation systems, e.g. SHOE, Ontobroker, WebKB, QuizRDF: based on annotations rather than on entire documents
Ontology repositories, e.g. DAML Ontology Library, SemWebCentral: do not automatically discover semantic web documents
Semantic web browsers, e.g. Ontaria: focus only on storing RDF rather than on metadata
Swoogle : Semantic Web Document
Semantic Web Document : SWD
a document in a semantic web language that is online and accessible to web users and software agents...
SWD : Classification
SWD is divided into:

Semantic Web Ontology (SWO): a significant proportion of its statements define new terms (e.g. classes, properties)
Semantic Web Database (SWDB): does not define or extend a significant number of terms; it mainly describes individuals
In Swoogle, an SWD is classified as SWO or SWDB using a threshold
SWD : Classification Example
SWO

<rdfs:Class rdf:about="http://xmlns.com/foaf/0.1/LabelProperty" vs:term_status="unstable">
  <rdfs:label>Label Property</rdfs:label>
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
  <rdfs:isDefinedBy rdf:resource="http://xmlns.com/foaf/0.1/"/>
</rdfs:Class>

SWDB

<foaf:Person rdf:about="http://umbc.edu/~finin/finin.rdf#Tim Finin">
  <owl:sameAs rdf:resource="http://ebiquity.umbc.edu/person/foaf/Tim/Finin/foaf.rdf"/>
  <foaf:name>Tim Finin</foaf:name>
  <foaf:firstName>Tim</foaf:firstName>
  <foaf:mbox_sha1sum>9da08e2b4dc670d9254ab4a4b4d61637fed3b18f</foaf:mbox_sha1sum>
  <foaf:mbox_sha1sum>49953f47b9c33484a753eaf14102af56c0148d37</foaf:mbox_sha1sum>
</foaf:Person>
Swoogle : Architecture
Swoogle : Architecture (1)

[Figure: Swoogle architecture. SWD discovery (web crawler, candidate URLs, SWD reader) feeds metadata creation (SWD cache, SWD metadata), which feeds data analysis (IR analyzer, SWD analyzer) and the interface (web server, web service, agent service).]
Swoogle : Architecture (2)
SWD discovery: discovers potential SWDs throughout the web
Metadata creation: caches a snapshot of each SWD and generates objective metadata about it
Data analysis: builds analytical reports based on the cached SWDs and the created metadata
Interface: provides data services to the Semantic Web community
Swoogle : Finding SWD
Finding SWD
Google-based crawler
Uses the Google web service to find candidate URLs
Focused crawler
Starts from a URL given by the user
Verifies and discovers SWDs based on their relations, e.g. imports
[Figure: the web crawlers operating over the web]
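The link-following discovery described on this slide can be sketched as a breadth-first traversal. This is only an illustration under stated assumptions, not Swoogle's actual crawler: the names `discover_swds`, `fetch_links`, and `is_swd` are hypothetical, and the caller supplies both the link extractor and the SWD check.

```python
from collections import deque


def discover_swds(seed_urls, fetch_links, is_swd, max_documents=100):
    """Breadth-first discovery of SWDs starting from seed URLs.

    fetch_links(url) -> iterable of URLs referenced by the document (hypothetical hook)
    is_swd(url)      -> True if the document is a semantic web document (hypothetical hook)
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    found = []
    while queue and len(found) < max_documents:
        url = queue.popleft()
        if not is_swd(url):
            continue  # only verified SWDs are kept and expanded
        found.append(url)
        for linked in fetch_links(url):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return found
```

Only verified SWDs are expanded, so documents reachable solely through non-SWD pages are not discovered in this sketch.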
Swoogle : SWD Metadata
SWD Metadata : About
Basic metadata: syntactic and semantic features of an SWD
Relations: relations between SWDs
Analytical results: describe SWD ranking
Basic Metadata (1)
Language features: properties describing syntactic and semantic features, e.g. encoding (RDF/XML), language (OWL, DAML), OWL species (OWL DL, OWL Lite)
RDF statistics: properties summarizing the node distribution; they count the classes (rdfs:Class), properties (rdf:Property) and individuals in an SWD, from which the ontology ratio is obtained
Ontology annotation: properties describing an SWD as an ontology. Swoogle records the properties of owl:Ontology instances, e.g. label, comment, versionInfo
Basic Metadata : Determining Ontology Ratio
ontology-ratio:

\[ R(foo) = \frac{|C(foo)| + |P(foo)|}{|C(foo)| + |P(foo)| + |I(foo)|} \]

where |C(foo)| is the number of classes, |P(foo)| the number of properties, and |I(foo)| the number of individuals in the SWD foo
If ontology-ratio = 1: pure SWO
If ontology-ratio = 0: pure SWDB
If 0 < ontology-ratio < 1: the class is decided by a threshold
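As a minimal sketch of the ontology ratio and the threshold rule, assuming the class, property, and individual counts are already available: the function names are hypothetical, and the threshold is left as a required parameter because the slides do not state its value.

```python
def ontology_ratio(num_classes: int, num_properties: int, num_individuals: int) -> float:
    """R = (|C| + |P|) / (|C| + |P| + |I|)."""
    defined_terms = num_classes + num_properties
    total = defined_terms + num_individuals
    if total == 0:
        raise ValueError("document has no classes, properties, or individuals")
    return defined_terms / total


def classify_swd(ratio: float, threshold: float) -> str:
    """Label an SWD as SWO or SWDB; the threshold value is chosen by the caller."""
    return "SWO" if ratio >= threshold else "SWDB"
```

For example, a document with 3 classes, 1 property, and 4 individuals has a ratio of 4 / 8 = 0.5, so its label depends on where the threshold is set.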
Relations Metadata (1)
Swoogle captures the following SWD relations:
TM/IN: captures term references between two SWDs
IM: captures the ontology import relation, e.g. owl:imports, daml:imports
EX: captures the ontology extension relation, e.g. rdfs:subClassOf, rdfs:subPropertyOf
PV: shows that an ontology is a prior version of another, e.g. owl:priorVersion
Relations Metadata (2)
Swoogle captures the following SWD relations:
CPV: shows that an ontology is a prior version of, and compatible with, another, e.g. owl:DeprecatedProperty, owl:DeprecatedClass
IPV: shows that an ontology is a prior version of, and incompatible with, another, e.g. owl:incompatibleWith
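The property-based relations listed on these two slides can be sketched as a lookup from predicate to relation code. This is an illustration under assumptions, not Swoogle's implementation: triples are plain tuples with prefixed predicate names, a document is identified by a URI with its fragment stripped, and TM/IN and CPV are left out because the slides do not tie them to a single property. The names `RELATION_BY_PROPERTY` and `document_relations` are hypothetical.

```python
from urllib.parse import urldefrag

# Predicate -> relation code, restricted to the properties named on the slides.
RELATION_BY_PROPERTY = {
    "owl:imports": "IM",
    "daml:imports": "IM",
    "rdfs:subClassOf": "EX",
    "rdfs:subPropertyOf": "EX",
    "owl:priorVersion": "PV",
    "owl:incompatibleWith": "IPV",
}


def document_relations(triples):
    """Return the set of (source_doc, relation_code, target_doc) found in the triples."""
    relations = set()
    for subject, predicate, obj in triples:
        code = RELATION_BY_PROPERTY.get(predicate)
        if code is None:
            continue  # predicate does not express a document-level relation here
        source = urldefrag(subject).url  # drop "#fragment" to get the document URI
        target = urldefrag(obj).url
        if source != target:  # relations inside one document are ignored
            relations.add((source, code, target))
    return relations
```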
Swoogle : Ranking SWD
Ranking SWD : Google Page Rank Concept
Google introduced the PageRank concept to evaluate the relative importance of web documents, expressed as a probability
The probability is calculated from the probability of accessing the document directly and the probability of reaching it by following one of the links pointing to it
Ranking SWD : Swoogle Page Rank Concept (1)
Google's PageRank uses a uniform probability: every link between web documents is treated the same way
SWDs can link to each other in different ways, e.g. imports, uses-term, extends
Different link types should be treated differently (given different weights)
Therefore, Swoogle uses the rational random surfing model
Ranking SWD : Rational Random Surfing Model

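Sum of the weights of all links from x to a:

\[ f(x, a) = \sum_{l \in links(x, a)} weight(l) \]

Raw page rank of a, where (1 - d) is the direct-access probability:

\[ rawPR(a) = (1 - d) + d \sum_{x \in L(a)} rawPR(x) \frac{f(x, a)}{f(x)} \]

Sum over all outlinks of x:

\[ f(x) = \sum_{a \in T(x)} f(x, a) \]

Here d is the damping factor, L(a) is the set of SWDs that link to a, and T(x) is the set of SWDs that x links to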
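The rawPR formula can be evaluated iteratively. The sketch below is one possible way to do it, not Swoogle's code: the fixed iteration count, the default damping factor of 0.85, and the function name `raw_page_rank` are assumptions, and the per-link-type weights are supplied by the caller and must be positive.

```python
def raw_page_rank(links, weights, damping=0.85, iterations=50):
    """Iteratively evaluate rawPR over weighted links.

    links   : iterable of (source, target, link_type) tuples
    weights : dict mapping link_type -> positive weight (caller's choice)
    """
    pair_weight = {}  # f(x, a): summed weight of all links from x to a
    nodes = set()
    for source, target, link_type in links:
        nodes.update((source, target))
        key = (source, target)
        pair_weight[key] = pair_weight.get(key, 0.0) + weights[link_type]

    out_weight = {}  # f(x): summed weight of all outlinks of x
    for (source, _target), weight in pair_weight.items():
        out_weight[source] = out_weight.get(source, 0.0) + weight

    rank = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new_rank = {node: 1.0 - damping for node in nodes}  # direct-access term
        for (source, target), weight in pair_weight.items():
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank
```

With illustrative weights such as `{"imports": 4.0, "uses-term": 1.0}`, a document reached through an imports link receives a larger share of the source's rank than one reached through a uses-term link.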
Swoogle : Indexing and Retrieval
Information Retrieval
Uses traditional information retrieval methods
Works well both with pure SWD documents and with text documents containing embedded markup

Pure SWD document:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:si="http://www.w3schools.com/rdf/">
  <rdf:Description rdf:about="http://www.w3schools.com">
    <si:title>W3Schools</si:title>
    <si:author>Jan Egil Refsnes</si:author>
  </rdf:Description>
</rdf:RDF>

Text with embedded markup:

Here I describe the rdf:Description syntax:
<rdf:Description rdf:about="http://www.w3schools.com">
  <si:title>W3Schools</si:title>
  <si:author>Jan Egil Refsnes</si:author>
</rdf:Description>
Traditional Information Retrieval

N-gram based matching
Given a word, slide an n-character window over it
Match the resulting samples against URIrefs
Find the matched samples, each with a probability

Word-based matching
Reduce the RDF to triples
Extract the URIs from the SWD
Match them against the given word
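Both matching styles can be sketched in a few lines. This is an illustration under assumptions rather than Swoogle's retrieval code: matching is done against the local name of each URIref (the part after the last `#` or `/`), and a Jaccard overlap of n-gram sets stands in for the matching probability mentioned on the slide. All function names are hypothetical.

```python
import re


def local_name(uriref: str) -> str:
    """Return the part of a URIref after the last '#' or '/'."""
    return re.split(r"[#/]", uriref.rstrip("#/"))[-1]


def char_ngrams(text: str, n: int = 3) -> set:
    """Slide an n-character window over the lowercased text."""
    text = text.lower()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_score(word: str, uriref: str, n: int = 3) -> float:
    """Jaccard overlap between the n-grams of the word and of the URIref's local name."""
    query = char_ngrams(word, n)
    target = char_ngrams(local_name(uriref), n)
    if not query or not target:
        return 0.0
    return len(query & target) / len(query | target)


def word_match(word: str, urirefs) -> list:
    """Word-based matching: keep URIrefs whose local name equals the word, ignoring case."""
    return [u for u in urirefs if local_name(u).lower() == word.lower()]
```

N-gram matching tolerates small spelling differences (for instance "persons" still overlaps heavily with a URIref ending in "Person"), while word-based matching only returns exact local-name matches.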
Indexing
After retrieval, each SWD is indexed according to the value given by the page ranking formula
Rank  URL                                                     Value
1     http://www.w3.org/1999/02/22-rdf-syntax-ns              2845.97
2     http://www.w3.org/2000/01/rdf-schema                    2814.21
3     http://www.daml.org/2001/03/daml+oil                    311.65
4     http://www.w3.org/2002/07/owl                           192.18
5     http://www.w3.org/2000/10/rdftests/rdfcore/testSchema   59.82
Current Status
Conclusion and Future Work
Powerful search and indexing systems are needed by Semantic Web developers and researchers to help them find and analyze SWDs.
Current web search engines such as Google and AlltheWeb do not work well with SWDs, as they are designed to work with natural languages.
Swoogle runs multiple crawlers to discover SWDs through meta-search and link-following.
Thank you
