
A Short History of Ontology: It's Not Just a Matter of Philosophy Anymore


By Charles Roe / June 7, 2012

Humans like to classify things; we have spent our intellectual history taking apart the universe and
creating (or some would say discovering) the underlying structure of all things from chains of
galaxies to quarks. We love structure; it allows us to put knowledge into more easily understood,
more easily disseminated boxes that allow for clearer systems of formal classification.
The Ancient Greeks are often acknowledged as establishing the basis of Western formal thought
structures and systems of critical analysis known collectively as philosophy. The term is generally
credited to the great Ionian mathematician, scientist, and religious mystic Pythagoras, who lived circa
570 BCE. Parmenides, circa 500 BCE, is given credit for the first discussions on the ontological
categorization of existence (though the dates are not entirely agreed upon). Etymologically, the
term "ontology," like most philosophical terminology, comes from Greek and means essentially "the
study or theory of being" or "that which is." Yet, historically, the first known written use of the word
comes from the Latin ontologia in the early 17th century.
The somewhat vague terms "often acknowledged," "generally credited to," "given credit for," and "first
known written use" were purposely employed in the previous paragraph to make a point: the ancient
history of philosophy, just like the etymology of words and the current uses of such terminology, is and
always will be debated by those who care about such debates. The term "ontology" is an apt example
of such a word. We know the accepted history of its use and its etymology, we know how it has been
commonly used throughout history, and we can study its changes with the advent of AI and computer
science in the mid-1970s. Yet how many people today actually agree on what an ontology is,
relative to its modern classificatory sense? How does it differ from a vocabulary? A taxonomy? What
are some of the ontologies used today in Data Management?
The Prevailing Trend

The modern history of ontology really begins with Artificial Intelligence (AI) research from the
1970s and 1980s. According to Tom Gruber, a pioneer in AI exploration and semantic web
technologies, AI researchers borrowed the term "ontology" from philosophy as an apt system for the
ordering of knowledge systems that they required:
"In philosophy, one can talk about an ontology as a theory of the nature of existence (e.g. Aristotle's
ontology offers primitive categories, such as substance and quality, which were presumed to account
for All That Is). In computer and information science, ontology is a technical term denoting an
artifact that is designed for a purpose, which is to enable the modeling of knowledge about some
domain, real or imagined."
Mr. Gruber wrote two famous papers in the 1990s that cemented the use of the word ontology within
the contemporary sphere of computer science:

Toward Principles for the Design of Ontologies Used for Knowledge Sharing (1993)

A Translation Approach to Portable Ontology Specifications (1993)


The first one set the stage for the use of the term "ontologies" as "a way of specifying content-specific
agreements for the sharing and reuse of knowledge among software entities." The second defined an
ontology as "an explicit specification of a conceptualization," while a conceptualization in Mr.
Gruber's terms is "an abstract, simplified view of the world that we wish to represent for some
purpose. Every knowledge base, knowledge-based system, or knowledge-level agent is committed to
some conceptualization, explicitly or implicitly."
Thus, an ontology as defined and used within modern computer science (and now many other fields)
is, in simple terms, a system for the formal organization of information. This relates to the ancient
philosophy on the nature of existence in that both systems classify being/that which exists whether
they are subjects/objects in a domain, conceptual models for automated reasoning, or categories of
individual identity.
Some other helpful definitions include:

Lars Marius Garshol, Ontopia: "[t]he core meaning within computer science is a model for
describing the world that consists of a set of types, properties, and relationship types. Exactly
what is provided around this varies, but this is the essentials of an ontology. There is also
generally an expectation that there be a close resemblance between the real world and the features
of the model in an ontology."

Nicole Washington & Suzanna Lewis, Nature Education: "An ontology is a logic-based
organizational structure for knowledge. Ontologies speed genetic discovery by allowing
researchers to quickly find and compare data from multiple sources."

Roberto Navigli and Paola Velardi: "The goal of a domain ontology is to reduce (or
eliminate) the conceptual and terminological confusion among the members of a virtual
community of users (for example, tourist operators, commercial enterprises, medical practitioners)
who need to share electronic documents and information of various kinds."
Taxonomy versus Ontology?
The scope of this article cannot cover all of the extensive research done on delineating the
differences between modern taxonomies and ontologies; books have been written on the subject, and
there is much disagreement and debate within the Data Management industry itself over their particular
differences and uses, let alone in other industries. But some demarcation is helpful.
In his article "Ontology and Taxonomy," Steve Hoberman used his own expertise along with many
quotes from specialists in the field to gain more clarity on the differences between the two terms.
Some of the highlights from that article will aid in a better understanding:

Gordon Everest: "The synonym for ontology would be model (of something in data), and
the synonym for taxonomy would be tree."

Robert Ruffin: "The taxonomy of a tiger is that it is a subtype of cat (classification), but an
ontological description may be that the tiger has a relationship to Asia, the continent on which it
lives."

"A taxonomy is an ontology in the form of a hierarchy," and "Whereas ontologies can have
any type of relationship between categories, in a taxonomy there can only be hierarchies."
Christine Connors, the Principal at TriviumRLG LLC, offers a further differentiation:
"Efforts are underway to transform semantic systems into more than just known item or
NLP-derived labeling to systems capable of contextual understanding. Ontologies are the means by which
much of this effort will be accomplished in the short term. An ontology is more advanced than a
taxonomy as it can contain self-defined relationships beyond that of parent-child. It can also be used
to infer data and reason over information."
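To make Ruffin's tiger example and Connors's point about inference more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the term names, the relation names is_a and lives_in, and both helper functions are hypothetical, and none of them comes from the quoted authors or from a real ontology language.

```python
# Hypothetical toy data: all names and relations are illustrative only.

# A taxonomy: every term points to exactly one parent (a strict hierarchy).
taxonomy = {
    "tiger": "cat",
    "cat": "mammal",
    "mammal": "animal",
}

# An ontology sketch: typed relationships stored as
# (subject, relation, object) triples, not limited to parent-child links.
ontology = {
    ("tiger", "is_a", "cat"),
    ("cat", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("tiger", "lives_in", "Asia"),
    ("Asia", "is_a", "continent"),
}


def superclasses(term, triples):
    """Follow is_a links transitively to infer every superclass of a term."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def related(term, relation, triples):
    """Return the objects linked to a term by the given relation."""
    return {obj for subject, rel, obj in triples
            if subject == term and rel == relation}


print(superclasses("tiger", ontology))
print(related("tiger", "lives_in", ontology))
```

Calling superclasses("tiger", ontology) yields cat, mammal, and animal even though only the first link was stated directly for the tiger, which is a very small instance of inferring data. Calling related("tiger", "lives_in", ontology) returns Asia, a fact that the parent-only taxonomy dictionary has no place to record.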
Thus, both taxonomies and ontologies are in essence vocabularies that offer a structured means of
classification. Whereas taxonomies exist within a strictly hierarchical scheme and work well for the
classification of such elements as Reference Data, Master Data Management, and distributed
computing systems, ontologies expand the relationship possibilities to levels that taxonomies do not.
An appropriate (but simplified) example would be a Knowledge Base (KB). In a simple structural-taxonomic system, each file in the KB is assigned to one node (though possibly more); each child
has a single relation to one parent in a strict hierarchy, similar to the directory structure of a PC hard
drive. In an ontological KB structure, there are multiple parents tied to multiple children in a poly-hierarchy with highly structured and formal constraints. A taxonomy is really a tree or directory
structure, while an ontology could be the forest (or entire KB); yet the forest is far more formal
about the semantic structuring of classes, attributes, relations, objects, rules, and restrictions than the
tree is, due to the increased complexity of the varying relationships.
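The Knowledge Base comparison can be sketched in a few lines of Python. This is a simplified illustration, not a model of any real KB product: the node names and both helper functions are hypothetical, and the formal rules and restrictions a full ontology would carry are deliberately left out.

```python
# Hypothetical KB categories; all names are illustrative only.

# Strict hierarchy: each node has a single parent, like folders on a disk.
tree_parent = {
    "Printers": "Hardware",
    "Hardware": "Knowledge Base",
    "Drivers": "Software",
    "Software": "Knowledge Base",
}

# Poly-hierarchy: a node may sit under several parents at once.
poly_parents = {
    "Printer Drivers": ["Printers", "Drivers"],
    "Printers": ["Hardware"],
    "Drivers": ["Software"],
    "Hardware": ["Knowledge Base"],
    "Software": ["Knowledge Base"],
}


def path_to_root(node, parent_of):
    """In a tree there is exactly one path from a node up to the root."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def ancestors(node, parents_of):
    """In a poly-hierarchy a node can be reached through several branches."""
    seen = set()
    stack = list(parents_of.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents_of.get(current, []))
    return seen


print(path_to_root("Printers", tree_parent))
print(sorted(ancestors("Printer Drivers", poly_parents)))
```

In the tree, an article filed under Printers can only be found along the Hardware branch. In the poly-hierarchy, the Printer Drivers node is reachable from both the Hardware and the Software side, which is the kind of extra relationship a strict directory structure cannot express.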
Some Examples of Ontologies

There are literally thousands of existing ontologies in the world today in virtually every industry
from software engineering to medical research, e-commerce to banking, linguistic processing to
document publishing and so forth. Even in the Data Management industry alone there are too many
to easily discuss. Thus, to distill the topic down to give some well-defined examples, only a few
mentioned in a recent DATAVERSITY webinar will be noted:

Dublin Core Metadata Initiative (DCMI): First conceived in 1994 during the 2nd
International World Wide Web Conference, DCMI was created to provide core metadata
vocabularies in support of interoperable solutions for discovering and managing resources.
Grounded in the Dublin Core Metadata Element Set, DCMI works to promote open
consensus building in the development and maintenance of metadata vocabularies, worldwide
participation in the project, neutrality in the adoption and use of the standards,
and a comprehensive cross-disciplinary focus that breaks down information silos so that all data is
shared data.

Good Relations Ontology: Started in 2008, Good Relations is a simple but powerful e-commerce ontology, a vocabulary for publishing all of the details of your products and services
in a way friendly to search engines, mobile applications, and browser extensions. It seeks to
streamline the e-commerce process and is the only OWL DL ontology that both Yahoo! and
Google support. It comes with a Creative Commons Attribution 3.0 license, so it is Open Source.

Web Ontology Language (OWL): OWL was created to facilitate greater machine
interpretability of Web content than that supported by XML, RDF, and RDF Schema (RDF-S) by
providing additional vocabulary along with a formal semantics, especially for the Semantic Web.
A W3C (World Wide Web Consortium) standard, OWL is intended to aid in the structuring of the
Semantic Web through the adoption of common systems for processing Web content.
For more examples, or to do further searches on ontologies see:

Sindice.com: A Semantic Web Index

Umbel.org: Umbel is a Vocabulary and Reference Concept Ontology

Bioontology.org: A good search site for many different biomedical ontologies

Cyc.com: Founded in 1994, Cycorp works to standardize, develop, commercialize and do
more research into AI.

Open Directory Project: Lists a number of published ontologies

Wikipedia's listing on Ontology (information science) also has a long list of many known
published ontologies at the bottom of the page.
Conclusion: Unstructured Data and Data Ontologies
The initial statements from Tom Gruber way back in 1993 during the bygone days of Web
development still ring true today:
"Knowledge-based systems and services are expensive to build, test, and maintain. A software
engineering methodology based on formal specifications of shared resources, reusable components,
and standard services is needed. We believe that specifications of shared vocabulary can play an
important role in such a methodology."
The rapid growth of Unstructured Data over the past few years has spawned an explosion of Big
Data products meant to aid in the structuring of such massive amounts of data created from blogs,
social networking, video and a host of other unstructured elements. Enterprises worldwide are
hurrying to collect, analyze, and translate into appreciable information petabytes and even exabytes
of data so they can achieve a competitive edge in the marketplace. Yet, as the data volumes expand
to ever greater quantities, as non-relational data systems continue to enter the market, as the
complexity of systems, platforms, products, and codes continues to increase, new common
solutions are needed. Data pandemonium is rampant in the world today and the standardization of
data systems through the adoption of common ontologies is still a hope of many. The dream of Tim
Berners-Lee and the Semantic Web or Web 3.0 (call it what you will) is happening with the adoption
and further work with OWL, RDF, XML and others. But the train is starting to run out of control.
Can data professionals contain the Big Data beast through Tom Gruber's "shared vocabulary"? Or do
we already have such a system in place? Or, rather than a single unifying system, are the many
systems enough, so that the Lernaean Hydra of Ancient Greek mythology, with its many heads, has already been
slain? Such questions are best left to further discussion. Parmenides certainly had no idea what his
philosophical musings would lead to some 2,500 years after his death.
