W3C Working Draft – 04/18/2007
This version:
http://www.w3.org/TR/2007/whatever
Latest version:
http://www.w3.org/TR/whatever
Previous versions
Editors: Mike Uschold & Chris Menzel
Contributors: Natasha Noy
Copyright ©2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Semantic interoperability means enabling different agents, services, and applications to exchange information, data and knowledge, on and off the Web. To enable semantic interoperability, agents, services, and applications need either to share the same mutually understood vocabulary or to create correspondences (i.e. mappings) between their different vocabularies. One of the design goals of RDFS and OWL is to provide the means to specify such mappings. This note provides guidance on how RDFS and OWL can be used to enable semantic interoperability.
We briefly characterize what we mean by semantic interoperability, and what the challenges are. We describe some RDFS and OWL constructs that are designed to support semantic interoperability and illustrate them with examples. We highlight their strengths and limitations. The main strengths are the ability to express equivalence and other logical relationships between concepts, properties and individuals in different ontologies. One main weakness is the lack of support for procedural functions (e.g. arithmetic, string manipulation) that are needed for mapping between many real-world ontologies.
This is a complete draft of this note.
Open issues, todo items:
This document is the First Public Working
Draft. We encourage public comments. Please send comments to public-swbp-wg@w3.org
[archive]
and start the subject line of the message with "comment:"
Publication as a Working Draft does not imply endorsement by the W3C
Membership. This is a draft document and may be updated, replaced or
obsoleted by other documents at any time. It is inappropriate to cite
this document as other than work in progress.
Semantic Web languages such as RDFS and OWL facilitate interoperability in significant ways. They provide the social structure and technical framework to reuse existing ontologies. They also provide formal mechanisms to express logical equivalences and other formal relationships between classes and properties in different ontologies. The goal of this note is to provide tools and guidelines for users and application developers who wish to exploit RDFS and OWL to achieve semantic interoperability. Ultimately, it is up to the users to ensure that they reuse existing ontologies correctly and to identify and specify the logical relationships between concepts in different ontologies.
A major focus of this note will be how to specify cross-ontology mappings in OWL. The mechanics of creating the mappings are independent from what use case the mappings are needed for.
We view semantic interoperability and integration as enabling different agents, services, and applications to exchange information, data, and knowledge such that the intended meaning of the information/data/knowledge is preserved. For simplicity, we will refer to agents, services, and applications collectively as agents. Great strides have been made in recent decades to improve interoperability at physical and syntactic levels. Streams of data could be transmitted successfully between systems, but no meaning was associated with the data. This situation is analogous to the successful delivery of an encrypted message that appears to the recipient in an unfamiliar script -- mere scratchings on the page. However, as Verizon CTO Michael Brodie notes, "it's the semantics, not the plumbing" [that is required for interoperability]. [Note: Invited Talk, Second International Conference on the Semantic Web, October 2003, Sanibel Island, FL, USA]
Having a robust physical infrastructure for transmitting data between systems is necessary but not sufficient. The very same data can mean very different things in different systems, depending on the context. To a supplier, "delivery date" typically means the date the product is shipped; to the buyer, the same phrase typically means the date the product is received. Similarly, a data value such as “32” may be an integer age (measured in years), a temperature (measured in degrees Fahrenheit or Celsius), or an allocated amount of human resource (measured in FTE – Full Time Equivalents). Without a way of indicating its intended meaning, the raw data received by a target system is open to so many possible [mis-]interpretations that it is essentially useless. Systems are semantically interoperable when agents within them are able to exchange information, rather than mere data.
The terms ‘semantic interoperability’ and ‘semantic integration’ are often used loosely and somewhat interchangeably. The core idea for both is the existence of a semantic gap between different systems or applications that use different vocabularies -- a gap that needs to be bridged. The different vocabularies reflect different underlying conceptual models or ontologies that the systems are based on.
Thus, semantic interoperability and semantic integration both entail the use of semantic mappings from concepts in one ontology to concepts in another ontology. The main difference is an architectural one. For example, in use case 3 data is exchanged between two applications by mapping between the source and target application ontologies; the original ontologies and applications are left intact. This is semantic interoperability. Use case 2 illustrates semantic integration, which usually involves some merging of ontologies.
For this note, we will use the term ‘semantic
I&I’ to refer to both semantic interoperability and
semantic integration, as most of the issues dealt with in this document
are common to both. We will use either term on its own when we
specifically wish to refer to just one.
In this section, we give some general guidelines and principles for achieving semantic interoperability & integration among Web applications. The Semantic Web and the Semantic Web languages provide both the social structure and the technical means to facilitate semantic I&I. The most fundamental contribution that the Semantic Web brings to semantic I&I is a set of recommended standardized languages with well defined syntax and semantics.
One impediment to semantic I&I has been the use of a wide variety of knowledge representation frameworks to represent information (use case 4). Exacerbating the problem is the fact that many of these languages lacked any explicit specification of their syntax — their basic vocabularies and their grammars — and their intended semantics. In limited circumstances, the need for such specifications may not arise. For example, within a small organization new users can pick up the structure of the syntax and its intended interpretation from experienced users in the organization. However, in the context of the semantic web, this model is inadequate since agents wishing to share information must first share a common understanding of the content of the representations in terms of which that information is expressed.
The Semantic Web activity in the W3C made a significant advance on the language heterogeneity problem through the introduction of formal recommendations for several standard XML-based ontology languages, notably, RDF, RDF Schema (RDFS) and OWL. The syntax and semantics of these languages are open, well-defined standards. This gives rise to our first principle:
Principle 1: To facilitate semantic I&I, create new ontologies in OWL or RDFS.
Principle 2: To facilitate semantic I&I, translate existing ontologies into OWL or RDFS.
If existing ontologies are intended to be shared, translate them into OWL or RDFS and make the OWL or RDFS version available in addition to the original version. Unfortunately, there is no general, automated method for translating an ontology written in another KR language into OWL or RDFS. The difficulty of the task varies significantly from language to language.
First, some ontologies are written in languages that are more expressive than OWL (such as full first order logic) and thus it is possible that some details in the original ontology will not be translatable into OWL. Some systems such as Ontolingua and Chimaera that use KIF – a first order logic language – as their internal representation language handle this issue in their export capabilities by translating as much as possible into OWL and noting what they could not translate. One emerging option is to adopt a strategy used by UML and OCL (object constraint language). The latter is basically first order logic, and is used to fill in expressive gaps in UML. In the present context, this would mean that ontologies should be expressed completely in OWL if possible, with additional information in a more expressive language, such as KIF (or one of its later incarnations: Common Logic & IKL) or rule languages, such as SWRL, as required.
Second, some ontologies may be written in languages using somewhat different paradigms than those used in OWL or RDFS. OWL can be viewed as a descendant of frame-style and description logic systems; it provides natural support for modeling things such as classes and binary relations. Thus, ontologies written in a frame-style language like OKBC, LOOM, or CLASSIC may translate more easily and naturally into OWL than ontologies written in a full, unrestricted first-order language like KIF. Protégé, for example, extended its frame-based core to represent ontologies in OWL. If ontologies are best conceptually modeled with things such as ternary (or higher arity) relations, translations may be less natural. For some ideas on how to do this, see the note: Defining N-ary Relations on the Semantic Web from the W3C Semantic Web Best Practices and Deployment Working Group.
The first two principles focus on representing ontologies using standard languages such as OWL and RDFS. This is important, but does not go far enough; there are many issues of semantic heterogeneity that arise even when the same language is universally used.
The semantics for OWL and RDFS do two very important things for us. First, they fix the meanings of the reserved vocabulary. For example, they say exactly what is meant by declaring that an individual is a member of a class, or that a relationship is transitive. Second, the OWL semantics specify how the meanings of complex expressions depend systematically on the meanings of their simpler parts. Such expressions are built up using various syntactic constructs as specified by the grammar of the language. For example, if you know the meaning of the Animal and Plant classes, and you know the meaning of the OWL construct owl:unionOf, you can know exactly what the following expression means: owl:unionOf(Animal Plant). In plain English, it means that an individual is a member of the new class exactly when it is an instance of either or both of the classes Animal and Plant. Formally, this expression is an anonymous class for which the class extension contains those individuals that occur in the class extensions of either Animal or Plant.
Principle 3: To facilitate semantic I&I, reduce ambiguity by expressing more meaning.
In the context of the Semantic Web, ontology creation takes on an entirely new and exciting guise. Currently, web page content is expressed largely in terms of unstructured text and graphics where most or all of the meaning is implicit. Acquiring the intended meaning of the content relies on the human’s understanding of the words and context. OWL's constructs open up the possibility of explicitly declaring the meaning of the content. OWL is used to put the semantics in the Semantic Web.
The idea is that an ontology will be the semantic foundation for web page content. An ontology builder must take care to specify the meaning of the terms in the ontology. There are two reasons for this. First, it ensures that others who may wish to reuse your ontology can glean the intended meaning of the vocabulary in your ontology. If an ontology only contains a name for a class and nothing more, such as the class named “Animal”, then all we know from the semantics of OWL is that “Animal” denotes a set of individuals. Virtually all the intended semantics are implicit in the natural language meaning of the word. If the reader doesn't know English, then the class might as well be named “*&^tf$#”.
The second reason for carefully specifying the meaning of terms in an ontology is so applications can interpret the content appropriately. If you create the relationship tallerThan in your ontology, and your web page contains data indicating that Tom is tallerThan Sue, who is in turn tallerThan Barb, you want the application to be smart enough to figure out that Tom is tallerThan Barb. If all you say about your relationship is to give it a name, that won't happen. However, if you specify that the relationship is transitive, then the application could make the correct inference.
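As a minimal sketch, the tallerThan example might be written in Turtle (the notation used for the mapping examples later in this note) as follows. The ex: prefix, its namespace URI and the individual names are hypothetical and chosen only for illustration.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex:  <http://example.org/height#> .   # hypothetical namespace

# Declaring the property transitive is what licenses the inference.
ex:tallerThan a owl:ObjectProperty , owl:TransitiveProperty .

# Data as it might be published on the web page.
ex:Tom ex:tallerThan ex:Sue .
ex:Sue ex:tallerThan ex:Barb .

# An OWL reasoner may now infer:  ex:Tom ex:tallerThan ex:Barb .
```

Without the owl:TransitiveProperty declaration, the two data statements remain unrelated facts as far as an application is concerned.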
There are various ways to add meaning and reduce ambiguity using RDFS and OWL constructs. Here are a few of many possible ways
The richer content and standardized representations afforded by OWL, together with the connectivity provided by the web's basic infrastructure, open up the second exciting aspect of ontology creation on the semantic web, namely, the possibility of genuine, robust reuse. Prior to the semantic web, reuse consisted of little more than incorporating a term from someone else's ontology into one's own ontology, with the intention that it means the same thing across ontologies. But if the term is not defined or axiomatized using the sort of rich representational constructs that OWL provides, there is no way to be certain, or even mildly confident, of its meaning and hence no way to ensure commonality of meaning when the term is reused. OWL makes such representations possible. Moreover, the infrastructure of the web (notably, the mechanism of Uniform Resource Locators), together with XML namespaces, facilitates reuse by making it possible to import content directly from remote ontologies, thereby reducing work and eliminating the possibility of transcription errors.
Reusing existing concepts that have a well-defined meaning is an important step, but not nearly enough. Sometimes, there are good reasons to represent the same concepts using different terms, or different representational constructs. This gives rise to our next principle.
Principle 6: To facilitate semantic I&I, use OWL mapping constructs to relate concepts from different ontologies.
The vision of the Semantic Web is for the meaning of web content to be accessible to both humans and machines. Using OWL as described above allows content to be semantically based. However, there will never be one single global ontology that everyone uses. Instead, ontologies of all shapes and sizes are and will continue to be developed independently.
Independent development means that the ontologies will not be the same, even though they may cover the same subject matter. The term 'car' might be used in one ontology, and 'automobile' in another, to mean the same thing. A third ontology might just have a class called 'vehicle' and not include details of different kinds of vehicles.
In the next section of this document, we illustrate the use of the ontology mapping constructs for facilitating semantic interoperability & integration using a series of examples.
The simplest kind of correlation to specify between two
concepts in OWL
is logical equivalence. To express equivalence between classes,
properties and individuals, we use the following three OWL constructs,
respectively: owl:equivalentClass,
owl:equivalentProperty,
and owl:sameAs.
Equivalence between two class definitions means that the two classes have the same extensions, but they are not necessarily the same concepts. A famous, somewhat comical example of this, originating with the philosopher Aristotle, is the pair of classes "Human" and "UnpluckedFeatherlessBiped". The meaning is different, but they have the same extension.
Similarly, equivalence between two property definitions means that they both have the same set of pairs of related individuals in their extension, but they might not be exactly the same property. An actual example of this took place at a large US university. When information systems were first being implemented at the university, every student was to be assigned a unique student ID. The university also maintained every student's Social Security Number. Independently, it was decided that a student's ID number would simply be his or her Social Security Number. Nonetheless, to allow for the possibility that an independent system of student IDs might be implemented, both the property studentID and the property socSecNum remained in the system. If the system were driven by an ontology, then these two properties would be declared to be equivalent. They don't mean the same thing, but they have the same set of related individuals.
Equivalence between two individuals means that two URI references actually refer to the same thing: the individuals have the same "identity".
For the examples below, assume that we have two different ontologies, Ontology A and Ontology B, and that we are establishing relationships between concepts in these ontologies. Concepts from Ontology A are indicated by the prefix ontA and those from Ontology B are prefixed with ontB.
For example, we might specify that the class ontA:car
is equivalent to the class ontB:auto as
follows:
ontA:car
owl:equivalentClass ontB:auto.
We can state that the property ontA:canRunSlowly
is equivalent to the property ontB:canJog as
follows:
ontA:canRunSlowly
owl:equivalentProperty ontB:canJog.
We can specify that the individual ontA:venus
is the same individual as ontB:morning_star as
follows:
ontA:venus
owl:sameAs ontB:morning_star.
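The fragments above leave the prefix declarations implicit. A self-contained sketch of the same three statements in Turtle is shown here; the two namespace URIs for Ontology A and Ontology B are hypothetical placeholders, since this note does not give real ones.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ontA: <http://example.org/ontA#> .   # hypothetical URI for Ontology A
@prefix ontB: <http://example.org/ontB#> .   # hypothetical URI for Ontology B

# Class, property and individual equivalence, respectively.
ontA:car          owl:equivalentClass    ontB:auto .
ontA:canRunSlowly owl:equivalentProperty ontB:canJog .
ontA:venus        owl:sameAs             ontB:morning_star .
```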
Often, there is no exact equivalence between two classes or two properties, but there may still be an important subclass or subproperty relationship between them that is useful to capture. For example, one can create mappings to state that the class ontA:Primate is a subclass of the class ontB:Mammal, and that the property ontA:brotherOf is a subproperty of ontB:siblingOf, as follows:
ontA:Primate
rdfs:subClassOf ontB:Mammal.
ontA:brotherOf
rdfs:subPropertyOf ontB:siblingOf.
The equivalence and subclass relationships state fundamental
similarities between classes. One can also specify correlations among
concepts in different ontologies that highlight their differences. Two
OWL
constructs are provided for this: owl:disjointWith
and owl:differentFrom.
For example, one might declare that ontA:Plant
is disjoint with the class ontB:Fungus :
ontA:Plant
owl:disjointWith ontB:Fungus .
You can also say that ontA:JohnASmith
is different from ontB:JohnASmith:
ontA:JohnASmith
owl:differentFrom ontB:JohnASmith .
These kinds of statements are very useful for use cases 1, 2 and 4 because they provide mapping information that relates the meaning of classes and individuals in two different ontologies. These difference mappings are less useful for translating data from one ontology to another (use case 3).
In addition to being used on their own to express equivalence, similarities or differences between concepts in an ontology, these constructs can also be used to relate concepts to complex expressions using any valid OWL construct.
For example, there are three constructs for correlating
classes: owl:equivalentClass,
rdfs:subClassOf
and owl:disjointWith.
In the above examples the classes used as arguments to these
constructs are simple expressions. However, they may be
arbitrarily complex class expressions. For example, an ontological
engineer might say that
the class ontA:Lifeform in Ontology A is
equivalent to the union of the three classes ontB:Plant,
ontB:Animal and ontB:Fungus
in Ontology B :
ontA:Lifeform
    owl:equivalentClass
        [ a owl:Class ;
          owl:unionOf ( ontB:Plant ontB:Animal ontB:Fungus )
        ] .
The unionOf construct may be used in conjunction with the other OWL constructs for operating on sets, intersectionOf and complementOf, to form arbitrarily complex expressions.
Next we consider OWL Restriction, another important class formation operation that can be used on its own, or in conjunction with the above set operation constructs, to express complex mappings. For example, we can state that the class ontA:Bicycle is a subclass of the class of all ontB:LandVehicle(s) that have exactly two wheels:
ontA:Bicycle
    rdfs:subClassOf
        [ a owl:Class ;
          owl:intersectionOf
              ( ontB:LandVehicle
                [ a owl:Restriction ;
                  owl:cardinality 2 ;
                  owl:onProperty ontB:hasWheels
                ] )
        ] .
In general, any legal combination of OWL class formation operators can be used to form arbitrarily complex class expressions to state class equivalence, disjointness or subclass relationships. More typically, one of the class expressions will be just a class name, like ontA:Lifeform in the above example. However, it is also allowed to have arbitrarily complex expressions for both classes that are declared to be equivalent, disjoint or subclasses. For example, one could state that the class formed by the intersection of the class C1 with the union of classes C2 and C3 is a subclass of (or equivalent to, or disjoint with) the class formed by taking the complement of the intersection of classes C2 and C4.
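The C1 to C4 example can be sketched in Turtle as shown here, using the subclass variant. The ex: namespace is hypothetical, and C1 through C4 stand for arbitrary classes; replacing rdfs:subClassOf with owl:equivalentClass or owl:disjointWith gives the other two variants.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/classes#> .   # hypothetical namespace

# (C1 AND (C2 OR C3))  is a subclass of  NOT (C2 AND C4)
[ a owl:Class ;
  owl:intersectionOf (
    ex:C1
    [ a owl:Class ; owl:unionOf ( ex:C2 ex:C3 ) ]
  )
] rdfs:subClassOf
[ a owl:Class ;
  owl:complementOf
    [ a owl:Class ; owl:intersectionOf ( ex:C2 ex:C4 ) ]
] .
```

Both sides of the statement are anonymous classes, which is why neither is given a name of its own.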
We leave it as an exercise for the reader to come up with real-world examples where such expressions might be useful. You can also state that any class expression is empty or non-empty. You can state that a set of classes forms a partition. Mapping statements are only limited by your imagination and the set of legal OWL constructs for specifying classes.
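As one possible illustration, the following Turtle sketch states that the three Ontology B classes from the earlier ontA:Lifeform example form a partition of ontA:Lifeform, and that a particular class expression is empty. The namespace URIs are hypothetical, and the disjointness statements are an assumption made for this sketch rather than something either ontology is known to assert.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ontA: <http://example.org/ontA#> .   # hypothetical URI for Ontology A
@prefix ontB: <http://example.org/ontB#> .   # hypothetical URI for Ontology B

# Covering: the three classes together make up ontA:Lifeform.
ontA:Lifeform owl:equivalentClass
    [ a owl:Class ;
      owl:unionOf ( ontB:Plant ontB:Animal ontB:Fungus )
    ] .

# Pairwise disjointness: no individual belongs to two of the parts.
ontB:Plant  owl:disjointWith ontB:Animal , ontB:Fungus .
ontB:Animal owl:disjointWith ontB:Fungus .

# Emptiness: a class expression with no members is equivalent to owl:Nothing.
[ a owl:Class ;
  owl:intersectionOf ( ontB:Plant ontB:Animal )
] owl:equivalentClass owl:Nothing .
```

The covering axiom and the pairwise disjointness axioms together express the partition; the last statement follows from the disjointness axioms and is included only to show how emptiness can be stated directly.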
Any valid OWL construct for building a class expression may be used as an argument to any of the OWL mapping constructs that relate classes. As we have seen, there is a rich set of class formation constructs in OWL.
In principle, it is also true that:
Suppose a pet shop decides to semantically enable its web site for on-line ordering. They have heard all the buzz about the semantic web and how they are supposed to go out and use other people's ontologies. They search for ontologies using Swoogle, or using Google restricted to .owl files. The search terms are 'male' and 'female' because the sex of a pet is an important factor for prospective customers.
The search retrieves a tiny ontology, which we will call ontology A. It has two basic classes: ontA:Gender and ontA:Humans. There are two instances of ontA:Gender: ontA:male and ontA:female. There is a single property called ontA:hasGender, whose domain is ontA:Humans and whose range is ontA:Gender. The property is functional, meaning that no instance of ontA:Humans can have more than one value for the ontA:hasGender property. There are two classes defined as restrictions on the ontA:hasGender property: ontA:Women and ontA:Men (not to be confused with hallelujah) are defined to be the ontA:Humans whose ontA:hasGender property has the value ontA:female and ontA:male, respectively.
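Read literally, the description of ontology A might correspond to something like the following Turtle. This is a sketch reconstructed from the prose only, not the ontology's actual source: the namespace URI is hypothetical, and the choice of owl:equivalentClass with owl:hasValue for the two defined classes is an assumption about how "defined to be" was encoded.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ontA: <http://example.org/ontA#> .   # hypothetical URI for Ontology A

# The two basic classes and the two instances of ontA:Gender.
ontA:Gender a owl:Class .
ontA:Humans a owl:Class .
ontA:male   a ontA:Gender .
ontA:female a ontA:Gender .

# A functional property from humans to genders.
ontA:hasGender a owl:ObjectProperty , owl:FunctionalProperty ;
    rdfs:domain ontA:Humans ;
    rdfs:range  ontA:Gender .

# Women: the humans whose hasGender value is ontA:female.
ontA:Women owl:equivalentClass
    [ a owl:Class ;
      owl:intersectionOf (
        ontA:Humans
        [ a owl:Restriction ;
          owl:onProperty ontA:hasGender ;
          owl:hasValue   ontA:female ]
      )
    ] .

# Men: the humans whose hasGender value is ontA:male.
ontA:Men owl:equivalentClass
    [ a owl:Class ;
      owl:intersectionOf (
        ontA:Humans
        [ a owl:Restriction ;
          owl:onProperty ontA:hasGender ;
          owl:hasValue   ontA:male ]
      )
    ] .
```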
The search also reveals a slightly different ontology covering much the same subject matter, which we will call ontology B. It is also a rather tiny ontology. It has three basic classes: ontB:Sex, ontB:Animal and ontB:Human (which is a subclass of ontB:Animal). There are two instances of ontB:Sex: ontB:M and ontB:F. There is a single property called ontB:hasGender, whose domain is ontB:Animal and whose range is ontB:Sex; it is also an instance of owl:FunctionalProperty (meaning that no instance of ontB:Animal can have more than one value for the ontB:hasGender property).
Below we list both ontologies in full. To make this mapping exercise more realistic, we do not document the meaning of the terms. These are still the early days of the wild wild semantic web, and there are ontologies of all shapes and sizes and levels of quality. The challenge is how to make sense of it all.
What is the pet shop owner to do? He likes ontology A because in addition to selling animals, he also likes to keep marketing information on male vs. female humans who are customers, and ontology B only talks about male and female animals. He likes ontology B because he needs to track the sex of the pets he sells, and ontology A only talks about the sex of humans. The pet shop owner needs some of each ontology, so this is an (albeit contrived) illustration of use case 2 -- ontology integration.
The first task is to determine the semantic correlations among the terms and concepts in the two ontologies. First we consider the classes in ontology A. There are two candidates for simple owl:equivalentClass mappings that may account for merely a different choice of terms: Human vs. Humans, and Gender vs Sex. Before we declare such equivalences, we must first try to verify that the intended and actual meanings really are the same.
In a perfect world, all ontologies would accurately and unambiguously document the meaning of each concept. Documentation comes in many forms. It may be in the ontology itself as comments, or it may be in separate documents that describe the ontology, its purpose, use cases, design, etc.
So, on the face of it they do not mean exactly the same thing. This is normal. However, even though the axioms don't suggest exact equivalence, their intended meanings could still be identical. As noted above, there is a tradeoff between adding more and more axioms to reduce ambiguity, vs. having just the axioms you need to achieve a specific purpose. Thus, one can usually think of many true axioms that could be added to a given ontology, but which are intentionally left out. This may be good for the ontology’s original purpose, but can make it harder for unanticipated users to resolve any ambiguities, which in turn hinders reuse.
Of course, things may also be left out due to carelessness, not careful design. There is no easy way to tell the difference. One has to use judgment. In this case, declaring the equivalence between ontA:Humans and ontB:Human is reasonable. We do this in OWL as follows:
ontB:Human
owl:equivalentClass ontA:Humans .
Let's check the axioms.
All we know about the class ontA:Gender is that it:
We see that there is a class called ontB:Female that is defined to be any animal whose ontB:hasGender property is equal to ontB:F. This is an important clue. It tells us that in all likelihood ontA:Gender and ontB:Sex are being used in the same way, and are good candidates for being declared as equivalent classes. We do this as follows: