Achieving Semantic Interoperability & Integration Using RDF and OWL
W3C
Editor's Draft– 01/23/2006
This version:
http://www.w3.org/TR/2004/whatever
Latest version:
Previous versions
This is the first public Working Draft
Editors:
Mike Uschold, The Boeing Company
Christopher Menzel, The Boeing Company
Copyright ©2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Semantic interoperability means
enabling different agents, services, and applications to exchange information,
data and knowledge, on and off the Web. To enable semantic
interoperability, agents, services, and applications need to share the same
mutually understood vocabulary or to create correspondences or mappings between
their different vocabularies. One of the design goals of RDF and OWL is
to provide the means to specify such mappings. This note provides guidance on
how OWL and RDF can be used to enable semantic interoperability.
We briefly characterize what we
mean by semantic interoperability, and what the challenges are. We describe
some RDF and OWL constructs that are designed to support semantic interoperability
and illustrate them with examples. We highlight their strengths and
limitations. The main strengths are the ability to import, share and reuse public ontologies (in whole or part) and the ability to express logical equivalence and
other relationships between concepts, properties and individuals in different
ontologies. One main weakness is the lack of support for procedural functions
(e.g. arithmetic, string manipulation) that are needed for mapping between many
real-world ontologies.
This is a nearly complete first
draft of this note.
The outline and structure of the document are stable.
The major change from the last version is that there is a narrative for the
different OWL mapping constructs. The major content remaining to be added is
the code examples.
[ANTICIPATED:] This document is
the First Public Working Draft. We encourage public comments. Please send
comments to public-swbp-wg@w3.org [archive] and
start the subject line of the message with "comment:"
Open
issues, todo items:
Publication as a Working Draft
does not imply endorsement by the W3C Membership. This is a draft document and
may be updated, replaced or obsoleted by other documents at any time. It is
inappropriate to cite this document as other than work in progress.
Semantic Web languages, such as
RDF and OWL, facilitate interoperability in significant ways. They provide the
social structure and technical framework to reuse existing ontologies; they
provide formal mechanisms to express logical equivalences and other formal
relationships between classes and properties in different ontologies. The goal
of this note is to give users and application developers tools and guidelines
to exploit OWL to achieve semantic interoperability. Ultimately, it is up to
the users to reuse ontologies correctly and to identify and specify the logical
relationships between terms in different ontologies.
Extending a single ontology: A
knowledge engineer would like to start with one ontology and extend it so that
it meets the needs of a particular application. This can be viewed as a “base case” of integration as the
engineer is integrating the conceptual specification of her requirements with
the existing ontology.
Integrate existing OWL ontologies:
A knowledge engineer has
distinct OWL ontologies to integrate, resulting
in one larger ontology. She largely understands the intuitive meanings
of the vocabulary of the existing ontologies, either because the existing
ontologies are well documented and axiomatized, or simply because she is
familiar with the domain specific vocabulary of each. Her main task is to
make the mappings between the existing ontologies explicit and to
generate one integrated ontology.
We will refer to this as semantic integration.
Application interoperability
requiring data translation: A software developer needs to have data sent from a source application
(where it is expressed using the vocabulary of one ontology) to a target
application that requires the data to be expressed using another ontology. The
data needs to be converted to use the vocabulary of the target application.
This is similar to the previous use case, in that we assume that there is a
good understanding of the meaning of both ontologies, and that the main task is
to specify the mappings. The
primary differences from the previous use case are that there is one target
ontology receiving data and there is no requirement to create one new
integrated ontology.
Share existing ontologies: A knowledge engineer has a stock of legacy ontologies written in different
representation languages such as Classic, Ontolingua, KIF, FLogic or other KR
languages. She wants to be able to share ontologies on the web. This knowledge engineer may find that
sharing ontologies is facilitated if mappings to terms in ontologies
written in the other languages are made explicit. The best approach is to consult [OWL Semantics] for
a precise account of the semantics of OWL; see also the CL-based axiomatic semantics
[Hayes], [Menzel].
Note that for the second and third use
cases (ontology integration and data translation), there is a need to map terminology from one ontology to another.
A major focus of this note will be how to specify these mappings in OWL. The
mechanics of creating the mappings are independent of the use case for which the
mappings are needed.
NOTE this
document currently focuses on semantic integration as opposed to the
broader notion of semantic interoperability. Here, semantic
integration is taken to consist chiefly in the identification and explication
of logical connections between classes, properties, and individuals across
ontologies. Semantic interoperability (we (or I – CM) suggest)
encompasses semantic integration proper as well as, e.g., interoperability of
services and tools made possible and driven by semantic integration.
MFU: you
don't say here what interoperability is that integration is not. Do we need to
say something in the document?
DLM: why don’t we just say that this
document focuses on a first step toward semantic interoperability as
facilitated by semantic integration that is accomplished using a set of formal
statements of logical relationships between terms. Some have referred to this as articulation, or bridging,
axioms.
MFU: the data
translation use case is very definitely semantic interoperability, not
integration, so I think we are doing both. Hence, I added ‘integration’ to the
title of the note.
The terms ‘semantic
interoperability’ and ‘semantic integration’ are often used loosely and
somewhat interchangeably. The core idea for both is the existence of and desire
to bridge a semantic gap between different systems or applications that use
different vocabularies. The different vocabularies reflect different underlying
conceptual models or ontologies that the systems are based on. Thus, semantic
interoperability and semantic integration both entail the use of semantic
mappings from terms in one ontology to terms in another ontology. The main
difference is an architectural one.
Semantic interoperability usually means that the original systems and
ontologies remain intact. Semantic integration usually means that there is some
merging of the ontologies or applications. For this note, we will use the term
‘semantic I&I’ to refer to both semantic interoperability and semantic
integration. We will use either
term on its own when we specifically wish to refer to just one.
We view semantic I&I as an
effort that focuses on enabling different agents, services, and applications to
exchange information, data, and knowledge such that the intended meaning of the information/data/knowledge is
preserved. For simplicity, we will refer to agents, services, and
applications collectively as agents. Great strides have been made in recent
decades to improve interoperability at physical and syntactic levels. Streams of data were successfully
transmitted between systems; however, there was no meaning associated with the
data. This situation is analogous
to successful delivery of an encrypted message, appearing to the recipient in
an unfamiliar script: mere scratchings on the page. However, as Verizon CTO Michael Brodie notes,
"it's the semantics, not the plumbing" that is required for
interoperability. It is
insufficient, that is, just to have a robust physical infrastructure for
transmitting data between systems, as the very same data can mean very
different things in different systems, depending on the system (as well as the
context). To a supplier, "delivery date" typically means the date the
product is shipped; to the buyer, the same phrase typically means the date
the product is received.
Similarly, a data value such as “32” may be an integer age (measured in years),
a temperature (measured in degrees Fahrenheit or Celsius), or a number of
employees (measured in FTE, Full Time Equivalents). Without a way of indicating its
intended meaning, the raw data received by a target system is potentially so
misleading and open to misinterpretation that it can be viewed as essentially
useless. Systems are semantically
interoperable when agents within them are able to exchange information,
rather than mere data.
In this section, we give some
general guidelines and principles for achieving semantic interoperability &
integration among Web applications.
The Semantic Web and the Semantic Web languages provide both the social
structure and the technical means to facilitate semantic I&I. The most fundamental contribution that
the Semantic Web brings to semantic I&I is a set of recommended
standardized languages with well-defined
syntax and semantics. One impediment to semantic I&I has been the use of a
wide variety of knowledge representation frameworks to represent
information. Exacerbating the
problem is the fact that many of these languages lacked any explicit
specification of their syntax — their basic vocabularies and their
grammars — and their intended semantics. In limited circumstances, the need for such specifications
may not arise. For example, within a small organization new users can pick up
the structure of the syntax and its intended interpretation from experienced
users in the organization. However, in the context of the Semantic Web, this model is
inadequate since agents wishing to share information must first share a common
understanding of the content of the representations in terms of which that
information is expressed.
The Semantic Web activity in the
W3C made a significant advance on the language heterogeneity problem through
the introduction of formal recommendations for several standard
XML-based ontology languages, notably, RDF, RDF Schema (RDFS) and OWL.
The syntax and semantics of these languages are open, well-defined
standards. This gives rise to our first principle:
Principle 1: To facilitate semantic I&I, create new ontologies in OWL.
To the extent that recommended standard ontology languages come into broader
usage for building ontologies on the WWW, the problems of syntactically
different or semantically ill-defined representation languages are minimized.
Therefore, whenever possible, use OWL when building new ontologies. This
will help ensure semantic I&I, at least at the language level.
Unfortunately, of course, many existing ontologies are not written in OWL; this
gives rise to the second principle.
Principle 2: To facilitate semantic I&I, translate existing ontologies
into OWL.
If existing ontologies are intended to be shared, translate them into OWL and
make the OWL version available in addition to the original version.
Unfortunately, there is no general, automated method for translating an
ontology written in another KR language into OWL. The difficulty of the task
varies significantly from language to language.
First, some
ontologies are written in languages that are more expressive than OWL (such as
full first order logic) and thus it is possible that some details in the
original ontology will not be translatable into OWL. Some systems such as Ontolingua and Chimaera that use KIF –
a first order logic language – as their internal representation language
handle this issue in their export capabilities by translating as much as
possible into OWL and noting what they could not translate. One emerging option is to write
ontologies completely in OWL where its expressiveness allows, supplying additional
information in a more expressive language, such as KIF, only where
required.
Second, some
ontologies may be written in languages that embody paradigms more or less
similar to those embodied in OWL.
For example ontologies written in a frame-style language like OKBC,
LOOM, or CLASSIC may translate more easily and naturally into OWL than
ontologies written in a full, unrestricted first-order language like KIF. OWL can be viewed as a descendant of
frame-style and description logic systems and provides natural support for
modeling things such as classes and binary relations. If ontologies are best conceptually modeled with things such
as ternary (or higher arity) relations, translations may be less natural.
The first two principles focus on representing ontologies using the same
language, OWL. This is a great help, but there are many issues of semantic
heterogeneity that arise even when the same language is universally used.
The semantics for OWL does two very important things for us. First, it fixes
the meanings of the reserved vocabulary. For example, it says exactly
what is meant by declaring a relationship to be transitive, or that an
individual is a member of a class. Second, OWL specifies how the meanings
of complex expressions using various syntactic constructs as specified by the
grammar of the language depend systematically on the meanings of their simpler
parts. For example, if you know the precise meaning of the Animal and
Plant classes, and you know from the OWL language specification the meaning of the
owl:unionOf construct, then you know exactly what the union of Animal and Plant means.
The semantics of OWL, however, are silent on our knowledge engineer's
domain specific vocabulary. Hence, if this ontology is to be shared with or
reused by someone who does not share the ontology creator's understanding of
the domain specific vocabulary, the meanings of terms in that
vocabulary must be captured somehow. And this is really where the semantics
gets into the Semantic Web. This gives rise to our next principle:
Principle 3: To facilitate semantic I&I, reduce ambiguity by
expressing more meaning.
In the context of the Semantic Web, ontology creation takes on an entirely new
and exciting guise. Currently, web page content is expressed largely in terms
of unstructured text and graphics where most or all of the meaning is implicit.
Acquiring the intended meaning of the content relies on the human’s
understanding of the words and context. OWL's constructs open up the
possibility of explicitly declaring the meaning of the content.
OWL is used to put the semantics in the Semantic Web.
An ontology builder must take care to specify the meaning of the terms in the
ontology. This ensures that others who may wish to reuse your ontology can
glean the intended meaning of the vocabulary in your ontology. If an ontology
only contains a name for a class and nothing more, such as the class named
“Animal”, then for a start, only English speaking people will have an idea what
the class might mean. All we know from the semantics of OWL is
that “Animal” denotes a set of individuals. If the ontology also
states, say, that the class Animal is a subclass of PhysicalObject, then this
expresses more meaning, and further reduces ambiguity. If the
ontology further specifies that Animal is a subclass of other classes (e.g. the
class of all moving things), that adds yet more meaning, further reducing
ambiguity. In addition to specifying superclasses, the ontology builder
can also indicate which relations have the class Animal as their domain or
range. Further, the ontology
builder can specify properties of those relations, such as whether the relations are
transitive.
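The kinds of statements listed above might be written as follows. This is only a sketch: the base URI, the class MovingThing, and the properties eats and ancestorOf are hypothetical names chosen to illustrate superclasses, domain and range, and transitivity.

    <?xml version="1.0"?>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
        xmlns:owl="http://www.w3.org/2002/07/owl#"
        xml:base="http://example.org/bio">

      <owl:Class rdf:ID="PhysicalObject"/>
      <owl:Class rdf:ID="MovingThing"/>

      <!-- Each superclass adds meaning to the bare name "Animal" -->
      <owl:Class rdf:ID="Animal">
        <rdfs:subClassOf rdf:resource="#PhysicalObject"/>
        <rdfs:subClassOf rdf:resource="#MovingThing"/>
      </owl:Class>

      <!-- A relation that has Animal as its domain (hypothetical) -->
      <owl:ObjectProperty rdf:ID="eats">
        <rdfs:domain rdf:resource="#Animal"/>
      </owl:ObjectProperty>

      <!-- A transitive relation with Animal as both domain and range (hypothetical) -->
      <owl:TransitiveProperty rdf:ID="ancestorOf">
        <rdfs:domain rdf:resource="#Animal"/>
        <rdfs:range rdf:resource="#Animal"/>
      </owl:TransitiveProperty>
    </rdf:RDF>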
Expressing more meaning in an OWL ontology amounts to 1) using a variety of OWL
constructs to capture different aspects of the meaning of a given term and 2)
using a given OWL construct as often as necessary to say as much as possible
about that aspect of the term's meaning.
[DLM comment: we should be careful not to encourage people to model a
lot of information that is not expected to be useful. For example, many people do not like using CYC’s upper level
because it is viewed as having too many upper level concepts that “clutter” an
application when they are all inherited.
We probably want to draw a fine line here.]
MFU:
True, and here ‘useful’ means more than just what a given ontology is intended
for. When building an ontology for a specific purpose, you definitely would not
want to add lots of axioms that would likely never get used in any application.
However, it is also true that to facilitate semantic I&I, one can argue
that to a large extent, the more meaning the better – the more meaning,
the easier it will be for humans, or automated mapping assistants to understand
the meaning of the terms, which is needed to map them. It is a tradeoff, to be
sure. There is an analogy with db schema development. Often, DB constraints
that are true, and could be expressed, and which would help one understand the
meaning of the data, are SPECIFICALLY NOT put in, due to possible performance
impact. The tradeoff is understandability vs. performance. The same might apply
in ontology development.
Each thing stated about the terms in an ontology is represented as a formal
statement using the grammar of RDF along with the OWL reserved vocabulary. Each
statement serves to characterize the logical characteristics of, and logical
connections among, the classes, properties, and individuals named in a domain specific
vocabulary. Although a typical user of an ontology building environment may not
be aware of it, these formal statements are all axioms in a formal logic.
In summary, one increases sharing and reuse by reducing ambiguity, which in
turn, is achieved by adding more meaning to one's ontology. One adds more
meaning by making use of OWL's reserved vocabulary to axiomatize the classes,
properties, and individuals in an ontology.
Principle 4: To facilitate semantic I&I, reuse terms from existing
ontologies.
The richer content and
standardized representations afforded by OWL, together with the connectivity
provided by the web's basic infrastructure, open up the second exciting aspect of ontology creation on the
Semantic Web, namely, the possibility of genuine, robust reuse. Prior
to the semantic web, reuse consisted of little more than incorporating a term
from someone else's ontology into one's own ontology, with the intention that
it means the same thing across ontologies. But if the term is not defined or
axiomatized by means of the sort of rich representational constructs that OWL
provides, there is no way to be certain, or even mildly confident, of its
meaning and hence no way to ensure commonality of meaning when the term is
reused. OWL makes such representations possible. Moreover, the infrastructure
of the web (notably, the mechanism of Uniform Resource Locators),
together with XML namespaces, facilitates reuse by making it possible to import
content directly from remote ontologies, thereby reducing work and eliminating
the possibility of transcription errors.
Reusing existing terms that have a well-defined meaning is an important step,
but it is not nearly enough. Sometimes, there are good reasons to represent the same
concepts using different terms, or different representational constructs. This
gives rise to our next principle.
Principle 5: To facilitate
semantic I&I, use OWL mapping constructs to relate terms from different
ontologies.
When a term from one ontology is reused in the context of building a new
ontology or extending an existing ontology, it comes "tagged" with
its original namespace. This mechanism enables a single core term to be
used in both ontologies, each having a different meaning (this mechanism also preserves provenance information). In the event that the meanings are the same, or
closely related, it is important to specify the semantic connection between them.
This is done by using various mapping constructs in OWL.
MFU: expand this section to
include various use cases for when an ontology to ontology mapping might be
needed or useful. Distinguish interoperability from integration use cases,
perhaps. Then, we might want to shorten what is below, and just have a brief section
about the kinds of mappings. This would be elaborated on in much more detail in
the main ‘narrative’ below. Here, you would probably skip OWL syntax. I grayed
out the text below, and used it as a starting point for the narrative. We need
to come back to here and replace the gray text with a suitable summary of the key
points. It might just be a few sentences, or something a bit more elaborate.
Given this is the main meat of the note, it bears repeating for emphasis.
The simplest mapping constructs
declare exact logical equivalence; these are: owl:sameAs, owl:equivalentClass, and owl:equivalentProperty.
For example, we might specify that OntA:car is an equivalentClass of OntB:auto,
that OntA:canRunSlowly is an equivalentProperty of OntB:canJog, or that
OntA:venus is the sameAs OntA:morning_star. These three core mapping constructs
can also be used in conjunction with arbitrary combinations of a variety of
other OWL constructs (such as class forming operators) to create complex
logical mappings between terms from different ontologies. For example, an
ontology designer might say that OntA:lifeform and the union of OntB:plant,
OntB:animal and OntB:fungus are equivalent classes. [NOTE: put the right
OWL syntax here. Mike: actually, you should have English as well as the OWL
code.]
Additionally, logical connections
may be specified that go beyond simple equivalence statements. If an ontology designer were not
interested in fungus, s/he might say that the union of OntB:plant and OntB:animal
is a subclass of OntA:lifeform. As a more complex example, an ontology
designer may want to state that their car class is defined as a
subclass of the OntB:auto class along with being a subclass of a new
restriction class (that may restrict the range of a particular property). In this way, OWL constructs may be used to
express formal logical relationships between either existing terms across
multiple ontologies or formal relationships between existing ontology terms and
a new ontology term.
In summary, OWL mapping constructs can be used to
specify the logical connections between reused classes, properties, and
individuals and those in one's ontology by means of the OWL vocabulary.
In the next section of this
document, we illustrate the use of the ontology mapping constructs for
facilitating semantic interoperability & integration by means of a series
of examples.
The simplest and most explicit mapping
constructs in OWL declare exact logical equivalence. For example, we might
specify that the class OntA:car is equivalent to the class OntB:auto, or that the
property OntA:canRunSlowly is equivalent to the property OntB:canJog. We can
also specify that the individual OntA:venus is the same individual as OntA:morning_star. These are examples of three core
mapping constructs in OWL that allow us to state that 1) two classes are
equivalent, 2) two properties are equivalent, and 3) two individuals are
identical. The OWL constructs for this are owl:equivalentClass, owl:equivalentProperty
and owl:sameAs, respectively.
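The three equivalence statements above might be written as follows, here placed in a separate mapping ontology that imports both sources. This is a sketch under stated assumptions: the namespace URIs behind the OntA and OntB entities and the mapping ontology URI are hypothetical, and the kind of property (object or datatype) is deliberately left unspecified.

    <?xml version="1.0"?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY OntA "http://example.org/ontA#">
      <!ENTITY OntB "http://example.org/ontB#">
    ]>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:owl="http://www.w3.org/2002/07/owl#">

      <!-- Hypothetical mapping ontology that reuses both source ontologies -->
      <owl:Ontology rdf:about="http://example.org/mapAB">
        <owl:imports rdf:resource="http://example.org/ontA"/>
        <owl:imports rdf:resource="http://example.org/ontB"/>
      </owl:Ontology>

      <!-- 1) Two classes are equivalent -->
      <owl:Class rdf:about="&OntA;car">
        <owl:equivalentClass rdf:resource="&OntB;auto"/>
      </owl:Class>

      <!-- 2) Two properties are equivalent -->
      <rdf:Description rdf:about="&OntA;canRunSlowly">
        <owl:equivalentProperty rdf:resource="&OntB;canJog"/>
      </rdf:Description>

      <!-- 3) Two names denote the same individual -->
      <rdf:Description rdf:about="&OntA;venus">
        <owl:sameAs rdf:resource="&OntA;morning_star"/>
      </rdf:Description>
    </rdf:RDF>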
Often, there is no exact
equivalence between two classes or two properties, but there may still be an
important subClass or subProperty relationship between them that is useful to
capture. For example, one might wish to create a mapping stating that the class
OntA:Primate is a subclass of the class OntB:Mammal, or that the property
OntA:brotherOf is a subproperty of OntB:siblingOf.
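A sketch of these two mappings in RDF/XML follows. The namespace URIs are hypothetical, and we assume that brotherOf and siblingOf relate individuals to individuals, so they are declared as object properties.

    <?xml version="1.0"?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY OntA "http://example.org/ontA#">
      <!ENTITY OntB "http://example.org/ontB#">
    ]>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
        xmlns:owl="http://www.w3.org/2002/07/owl#">

      <!-- Every OntA primate is an OntB mammal, but not necessarily the reverse -->
      <owl:Class rdf:about="&OntA;Primate">
        <rdfs:subClassOf rdf:resource="&OntB;Mammal"/>
      </owl:Class>

      <!-- Whenever x is brotherOf y, x is also siblingOf y -->
      <owl:ObjectProperty rdf:about="&OntA;brotherOf">
        <rdfs:subPropertyOf rdf:resource="&OntB;siblingOf"/>
      </owl:ObjectProperty>
    </rdf:RDF>

Note that the subclass and subproperty constructs come from the RDF Schema vocabulary (rdfs:subClassOf, rdfs:subPropertyOf) rather than from the owl namespace.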
The equivalence and subclass
relationships state fundamental similarities between classes. One can also make statements that
clarify some of their differences. For example, one might declare that
OntA:plant is disjoint with the class OntB:fungus. You can also say that
OntA:JohnASmith is different from OntB:JohnASmith. These kinds of statements are not as obviously useful as a
basis for translating data from one ontology to another; however, they specify
mapping information in the sense that they relate the meaning of classes and
individuals in two different ontologies.
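The two difference statements above might be sketched as follows, using owl:disjointWith for classes and owl:differentFrom for individuals. As before, the namespace URIs are assumptions made for illustration.

    <?xml version="1.0"?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY OntA "http://example.org/ontA#">
      <!ENTITY OntB "http://example.org/ontB#">
    ]>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:owl="http://www.w3.org/2002/07/owl#">

      <!-- No individual can be both an OntA plant and an OntB fungus -->
      <owl:Class rdf:about="&OntA;plant">
        <owl:disjointWith rdf:resource="&OntB;fungus"/>
      </owl:Class>

      <!-- The two John A. Smiths are distinct people despite the shared name -->
      <rdf:Description rdf:about="&OntA;JohnASmith">
        <owl:differentFrom rdf:resource="&OntB;JohnASmith"/>
      </rdf:Description>
    </rdf:RDF>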
The core mapping constructs are
owl:equivalentClass, owl:equivalentProperty,
owl:sameAs, rdfs:subClassOf
and rdfs:subPropertyOf, because they explicitly
relate classes to classes, properties to properties or individuals to
individuals.
In the case of properties and
individuals, equivalentProperty, subProperty, sameAs, and differentFrom pretty
much exhaust what can be specified in a mapping between two different
ontologies. In the case of classes, there is much more. In
addition to being used on their own to
express relations between given classes, as in the above examples,
the core class mapping constructs may also be used in conjunction with other
OWL constructs to create more complex logical mappings between class
terms from different ontologies. For example, an ontology
engineer might say that the class
OntA:lifeform in Ontology A is equivalent to the union of the three classes OntB:plant, OntB:animal
and OntB:fungus in Ontology B. In the same way,
one
can also use OWL restrictions to define
classes that enable one to express even more subtle class
mappings. For example, we can state that the class OntA:bicycle is
a subclass of the class of all OntB:landVehicle(s) which have exactly
two wheels; more exactly:
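<!-- Assumes the entities OntA, OntB and xsd are declared in the document's DOCTYPE -->
<owl:Class rdf:about="&OntA;bicycle">
  <!-- Every bicycle is a land vehicle -->
  <rdfs:subClassOf rdf:resource="&OntB;landVehicle"/>
  <!-- Every bicycle has exactly two values for OntB:hasWheels -->
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&OntB;hasWheels"/>
      <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:cardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>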
That is, the
cardinality of the OntB:hasWheels property is exactly 2.
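The lifeform mapping mentioned earlier in this section can be sketched in the same style, combining owl:equivalentClass with an anonymous class built by owl:unionOf. The namespace URIs bound to the OntA and OntB entities are hypothetical.

    <?xml version="1.0"?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY OntA "http://example.org/ontA#">
      <!ENTITY OntB "http://example.org/ontB#">
    ]>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:owl="http://www.w3.org/2002/07/owl#">

      <!-- OntA:lifeform has exactly the members of plant, animal and fungus taken together -->
      <owl:Class rdf:about="&OntA;lifeform">
        <owl:equivalentClass>
          <owl:Class>
            <owl:unionOf rdf:parseType="Collection">
              <owl:Class rdf:about="&OntB;plant"/>
              <owl:Class rdf:about="&OntB;animal"/>
              <owl:Class rdf:about="&OntB;fungus"/>
            </owl:unionOf>
          </owl:Class>
        </owl:equivalentClass>
      </owl:Class>
    </rdf:RDF>

Replacing owl:equivalentClass with rdfs:subClassOf, and reversing the direction of the statement, would give the weaker mapping in which the union is only contained in OntA:lifeform.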
In general, any legal combination
of OWL class formation operators can be used to form arbitrarily complex class
expressions to state class equivalence, disjointness or subclass
relationships. More typically, one
of the class expressions will be just a class name, like ‘lifeform’ in the
above example. However, it is also allowed to have arbitrarily complex
expressions for both classes that are declared to be equivalent, disjoint or
subclasses. For example, one could state that the class formed by the intersection
of the class C1 with the union of classes C2 and C3 is a subclass of (or
equivalent to, or disjoint with) the class formed by taking the complement of
the intersection of classes C2 and C4. We leave it as an exercise for the
reader to come up with real-world examples where such expressions might be useful.[1] You can also state that any class
expression is empty or non-empty. You can state that a set of classes forms a
partition. Mapping statements are only limited by your imagination and the set
of legal OWL constructs for specifying classes.
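As a sketch of the abstract example above, the subclass variant with complex expressions on both sides might be written as follows. The text does not say which ontologies C1 to C4 belong to, so the single namespace bound to the ex entity is an assumption made for illustration.

    <?xml version="1.0"?>
    <!DOCTYPE rdf:RDF [
      <!ENTITY ex "http://example.org/ont#">
    ]>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
        xmlns:owl="http://www.w3.org/2002/07/owl#">

      <!-- Left-hand side: C1 intersected with (C2 union C3) -->
      <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:about="&ex;C1"/>
          <owl:Class>
            <owl:unionOf rdf:parseType="Collection">
              <owl:Class rdf:about="&ex;C2"/>
              <owl:Class rdf:about="&ex;C3"/>
            </owl:unionOf>
          </owl:Class>
        </owl:intersectionOf>

        <!-- Right-hand side: the complement of (C2 intersected with C4) -->
        <rdfs:subClassOf>
          <owl:Class>
            <owl:complementOf>
              <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                  <owl:Class rdf:about="&ex;C2"/>
                  <owl:Class rdf:about="&ex;C4"/>
                </owl:intersectionOf>
              </owl:Class>
            </owl:complementOf>
          </owl:Class>
        </rdfs:subClassOf>
      </owl:Class>
    </rdf:RDF>

Using owl:equivalentClass or owl:disjointWith in place of rdfs:subClassOf would give the other two variants mentioned in the text.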
Note that although, in principle, any OWL constructs that can be used to describe or refer to properties can be used in conjunction with equivalentProperty and subProperty to create mappings between properties in different ontologies, in practice there are no OWL constructs for this. Similarly, while any OWL constructs that can be used to refer to individuals may, in principle, be used with sameAs or differentFrom to create mappings between individuals in two different ontologies, OWL does not have any additional constructs. TBD: some examples on complex expressions using property-forming and indiv-forming opera