Achieving Semantic Interoperability and Integration Using RDFS and OWL

W3C Working Draft – 04/18/2007

This version:
http://www.w3.org/TR/2007/whatever
Latest version:
http://www.w3.org/TR/whatever

Previous versions

 
Editors:  Mike Uschold & Chris Menzel  

Contributors: Natasha Noy 

Copyright ©2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.

Abstract

Semantic interoperability means enabling different agents, services, and applications to exchange information, data and knowledge, on and off the Web. To enable semantic interoperability, agents, services, and applications need either to share the same mutually understood vocabulary or to create correspondences (i.e. mappings) between their different vocabularies. One of the design goals of RDFS and OWL is to provide the means to specify such mappings. This note provides guidance on how RDFS and OWL can be used to enable semantic interoperability.

We briefly characterize what we mean by semantic interoperability, and what the challenges are. We describe some RDFS and OWL constructs that are designed to support semantic interoperability and illustrate them with examples. We highlight their strengths and limitations. The main strengths are the ability to express equivalence and other logical  relationships between concepts, properties and individuals in different ontologies. One main weakness is the lack of support for procedural functions (e.g. arithmetic, string manipulation) that are needed for mapping between many real-world ontologies.

Status of this Document

This is a complete draft of this note.

Open issues, todo items:

This document is the First Public Working Draft. We encourage public comments. Please send comments to public-swbp-wg@w3.org [archive] and start the subject line of the message with "comment:"
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

 


Introduction

Semantic Web languages, such as RDFS and OWL, facilitate interoperability in significant ways. They provide the social structure and technical framework to reuse existing ontologies. They also provide formal mechanisms to express logical equivalences and other formal relationships between classes and properties in different ontologies. The goal of this note is to provide tools and guidelines for users and application developers who wish to exploit RDFS and OWL to achieve semantic interoperability. Ultimately, it is up to the users to ensure that they reuse existing ontologies correctly and to identify and specify the logical relationships between concepts in different ontologies.

Use Cases 

  1. Extending a single ontology: A knowledge engineer would like to reuse an existing ontology and to extend it so that it meets the needs of a particular application. This use case may be viewed as a kind of base, or degenerate, case of integration, where the existing ontology is being integrated with a knowledge engineer's conceptual specification of his requirements, which already includes the existing ontology.
  2. Integrate existing OWL ontologies: A knowledge engineer has distinct OWL ontologies to integrate, resulting in one larger ontology. The existing ontologies are well documented and axiomatized, and the knowledge engineer has some familiarity with the domain specific vocabulary of each ontology. She therefore knows how to relate the meaning of a term in one ontology to one or more terms in the other ontology. Her main task is to use these meaning correspondences to generate one integrated ontology. We will refer to this as semantic integration.
  3. Application interoperability requiring data translation: A software developer needs to have data sent from a source application to a target application, where each application's data is expressed using the vocabulary from a separate ontology. The data from one application needs to be converted to use the vocabulary of the other application. As in the previous use case, we assume that there is a good understanding of the meaning of both ontologies. The main task is to specify the mappings. The primary differences from the previous use case are 1) that there is one target ontology receiving data and 2) there is no requirement to create a new integrated ontology. Note that the mapping is directional: two mappings will be needed to achieve two-way translation.
We mention the next use case for completeness, because it is very important. However, because it goes beyond the scope of OWL, we do not consider it further in this document.
  4. Share existing ontologies: A knowledge engineer has a stock of legacy ontologies written in different representation languages such as Classic, Ontolingua, IKL, FLogic or other KR languages. She wants to be able to share ontologies on the web. This knowledge engineer may find that sharing ontologies is facilitated if mappings to concepts in the ontologies written in the other languages are made explicit.

A major focus of this note will be how to specify cross-ontology mappings in OWL. The mechanics of creating the mappings are independent from what use case the mappings are needed for.

What do we mean by Semantic Interoperability and Integration?

We view semantic interoperability and integration as enabling different agents, services, and applications to exchange information, data, and knowledge such that the intended meaning of the information/data/knowledge is preserved. For simplicity, we will refer to agents, services, and applications collectively as agents. Great strides have been made in recent decades to improve interoperability at physical and syntactic levels. Streams of data were successfully transmitted between systems, but there was no meaning associated with the data. This situation is analogous to successful delivery of an encrypted message, appearing to the recipient in an unfamiliar script -- mere scratchings on the page. However, as Verizon CTO Michael Brodie notes, "it's the semantics, not the plumbing", [that is required for interoperability]. [Note: Invited Talk, Second International Conference on the Semantic Web, October 2003, Sanibel Island, FL, USA]

Having a robust physical infrastructure for transmitting data between systems is necessary but not sufficient. The very same data can mean very different things in different systems, depending on the context. To a supplier, "delivery date" typically means the date the product is shipped; to the buyer, the same phrase typically means the date the product is received. Similarly, a data value such as “32” may be an integer age (measured in years), a temperature (measured in degrees Fahrenheit or Celsius), or an allocated amount of human resource (measured in FTE – Full Time Equivalents). Without a way of indicating its intended meaning, the raw data received by a target system is open to so many possible [mis-]interpretations that it is essentially useless. Systems are semantically interoperable when agents within them are able to exchange information, rather than mere data.

The terms ‘semantic interoperability’ and ‘semantic integration’ are often used loosely and somewhat interchangeably. The core idea for both is the existence of a semantic gap between different systems or applications that use different vocabularies -- a gap that needs to be bridged. The different vocabularies reflect different underlying conceptual models or ontologies that the systems are based on. 

Thus, semantic interoperability and semantic integration both entail the use of semantic mappings from concepts in one ontology to concepts in another ontology. The main difference is an architectural one. For example, in use case 3 data is exchanged between two applications by mapping between the source and target application ontologies; the original ontologies and applications are left intact. This is semantic interoperability. Use case 2 illustrates semantic integration, which usually involves some merging of ontologies.

For this note, we will use the term ‘semantic I&I’ to refer to both semantic interoperability and semantic integration, as most of the issues dealt with in this document are common to both. We will use either term on its own when we specifically wish to refer to just one.

Semantic I&I and the Semantic Web: Some Basic Guidelines

In this section, we give some general guidelines and principles for achieving semantic interoperability & integration among Web applications.  The Semantic Web and the Semantic Web languages provide both the social structure and the technical means to facilitate semantic I&I.  The most fundamental contribution that the Semantic Web brings to semantic I&I is a set of recommended standardized languages with well defined syntax and semantics.  

One impediment to semantic I&I has been the use of a wide variety of knowledge representation frameworks to represent information (use case 4).  Exacerbating the problem is the fact that many of these languages lacked any explicit specification of their syntax — their basic vocabularies and their grammars — and their intended semantics.  In limited circumstances, the need for such specifications may not arise. For example, within a small organization new users can pick up the structure of the syntax and its intended interpretation from experienced users in the organization. However, in the context of the semantic web, this model is inadequate since agents wishing to share information must first share a common understanding of the content of the representations in terms of which that information is expressed.

The Semantic Web activity in the W3C made a significant advance on the language heterogeneity problem through the introduction of formal  recommendations for several standard XML-based ontology languages, notably, RDF, RDF Schema (RDFS) and OWL. The  syntax and semantics of these languages are open, well-defined standards.  This gives rise to our first principle:

Principle 1: To facilitate semantic I&I, create new ontologies in OWL or RDFS.
To the extent that recommended standard ontology languages come into broader usage for building ontologies on the WWW, the problems of syntactically different or semantically ill-defined representation languages are minimized. Therefore, whenever possible, use OWL or RDFS when building new ontologies. This will help ensure semantic I&I, at least at the language level. Unfortunately, of course, many existing ontologies are not written in OWL and RDFS, and this gives rise to the second principle.

Principle 2: To facilitate semantic I&I, translate existing ontologies into OWL or RDFS.
If existing ontologies are intended to be shared, translate them into OWL or RDFS and make the OWL or RDFS version available in addition to the original version. Unfortunately,  there is no general, automated method for translating an ontology written in another KR language into OWL or RDFS. The difficulty of the task  varies significantly from language to language.

First, some ontologies are written in languages that are more expressive than OWL (such as full first order logic) and thus it is possible that some details in the original ontology will not be translatable into OWL.  Some systems such as Ontolingua and Chimaera that use KIF – a first order logic language – as their internal representation language handle this issue in their export capabilities by translating as much as possible into OWL and noting what they could not translate.  One emerging option is to adopt a strategy used by UML and OCL (object constraint language). The latter is basically first order logic, and is used to fill in expressive gaps in UML. In the present context, this would mean that ontologies should be expressed completely in OWL if  possible, with additional information in a more expressive language, such as KIF (or one of its later incarnations: Common Logic & IKL) or rule languages, such as SWRL,  as required.   

Second, some ontologies may be written in languages using somewhat different paradigms than those used in OWL or RDFS.  OWL can be viewed as a descendant of frame-style and description logic systems; it provides natural support for modeling things such as classes and binary relations.  Thus, ontologies written in a frame-style language like OKBC, LOOM, or CLASSIC may translate more easily and naturally into OWL than ontologies written in a full, unrestricted first-order language like KIF.  Protege, for example, extended its frame-based core to represent ontologies in OWL.  If ontologies are best conceptually modeled with things such as ternary (or higher arity) relations, translations may be less natural.  For some ideas on how to do this, see the  note: Defining N-ary Relations on the Semantic Web from the  W3C  Semantic Web Best Practice and Deployment Working Group.

The first two principles focus on representing ontologies using the standard languages such as OWL and RDFS. This is important, but does not go far enough; there are many issues of semantic heterogeneity that arise even when the same language is universally used. 

The semantics for OWL and RDFS do two very important things for us. First, they fix the meanings of the reserved vocabulary. For example, they say exactly what is meant by declaring that an individual is a member of a class, or that a relationship is transitive. Second, the OWL semantics specify how the meanings of complex expressions depend systematically on the meanings of their simpler parts. Such expressions are built up using various syntactic constructs as specified by the grammar of the language.

For example, if you know the precise meaning of the Animal and Plant classes, and you know the meaning of the OWL construct owl:unionOf you can know exactly what the following expression means: owl:unionOf(Animal Plant).
In plain English, it means that an individual is a member of the new class exactly when it is an instance of either or both of the classes Animal and Plant. Formally, this expression is an anonymous class for which the class extension contains those individuals that occur in the class extensions of either Animal or Plant.
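
In the Turtle syntax used for the examples later in this note, the same expression might be written roughly as follows. This is only a sketch: the ex: prefix, its namespace URI and the class name ex:AnimalOrPlant are hypothetical and introduced purely for illustration.

@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix ex:      <http://example.org/lifeforms#> .

# The bracketed expression is the anonymous union class;
# ex:AnimalOrPlant (a hypothetical name) is declared equivalent to it.
ex:AnimalOrPlant
      a       owl:Class ;
      owl:equivalentClass
              [ a       owl:Class ;
                owl:unionOf ( ex:Animal ex:Plant )
              ] .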

The semantics of OWL, however, are silent on our knowledge engineer's domain specific vocabulary. Hence, if this ontology is to be shared with or reused by someone who does not share the ontology creator's understanding of the domain specific vocabulary, the meanings of terms in that vocabulary must be captured somehow. And this is really where the semantics gets into the Semantic Web.  This gives rise to our next principle:

Principle 3: To facilitate semantic I&I, reduce ambiguity by expressing more meaning.

In the context of the Semantic Web, ontology creation takes on an entirely new and exciting guise. Currently, web page content is expressed largely in terms of unstructured text and graphics where most or all of the meaning is implicit. Acquiring the intended meaning of the content relies on the human’s understanding of the words and context. OWL's constructs open up the possibility of explicitly declaring the meaning  of the content.  OWL is used to put the semantics in the Semantic Web.

The idea is that an ontology will be the semantic foundation for web page content. An ontology builder must take care to specify the meaning of the terms in the ontology. There are two reasons for this. First, it ensures that others who may wish to reuse your ontology can glean the intended meaning of the vocabulary in your ontology. If an ontology only contains a name for a class and nothing more, such as the class named “Animal”, then all we know from the semantics of OWL is that “Animal” denotes a set of individuals. Virtually all the intended semantics are implicit in the natural language meaning of the word. If the reader doesn't know English, then the class might as well be named “*&^tf$#”.

The second reason for carefully specifying the meaning of terms in an ontology is so applications can interpret the content appropriately.  If you create the relationship tallerThan in your ontology, and your web page contains data indicating that Tom is tallerThan Sue, who is in turn tallerThan Barb, you want the application to be smart enough to figure out that Tom is tallerThan Barb.  If all you say about your relationship is to give it a name, that won't happen.  However, if you specify that the relationship is transitive, then the application could make the correct inference.   
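
A minimal sketch of this example in Turtle is given below. The ex: prefix and its namespace URI are assumptions made for illustration; the property and individual names follow the text above.

@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix ex:      <http://example.org/people#> .

# Declaring the property transitive is what licenses the inference.
ex:tallerThan
      a       owl:ObjectProperty , owl:TransitiveProperty .

# The data stated on the web page.
ex:Tom
      ex:tallerThan ex:Sue .
ex:Sue
      ex:tallerThan ex:Barb .

# An OWL reasoner can then conclude, although it is not stated:
#   ex:Tom ex:tallerThan ex:Barb .

Without the owl:TransitiveProperty declaration, the two stated facts remain unconnected as far as the application is concerned.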

There are various ways to add meaning and reduce ambiguity using RDFS and OWL constructs. Here are a few of many possible ways

Meaning about concepts in an ontology is represented as formal statements using the RDFS and OWL reserved vocabulary. Each statement serves to characterize the logical characteristics of, and logical connections among, the classes, properties, and individuals named in a domain specific vocabulary. Although a typical user of an ontology building environment may not be aware of it, these formal statements are all axioms in a formal logic.

In summary, you can increase sharing and reuse by adding more meaning to reduce ambiguity. You can add more meaning by making use of the  reserved vocabulary in RDFS and OWL to axiomatize the classes, properties, and individuals in an ontology.

Note that while less ambiguity is generally a good thing, this does not mean that one should add arbitrary amounts of detail to the ontology for its own sake. Designing and engineering ontologies should be guided by requirements (just like designing anything else). If you don't need to model things at the level of a person or plant's DNA, then don't. In fact, adding more constraints on the meaning of classes and properties may actually reduce reusability. For example, making the class Plant disjoint with MovingThing prevents the reuse of this class to describe plants that move (such as carnivorous plants). In another example, if the format for a date field is not specified, it might be easier to map to an ontology where a corresponding date field has range xsd:date than if the date were defined as a String.
Thus, it may be useful to separate the part of your ontology intended for reuse from its more application-specific aspects. This gives rise to our next principle:

 Principle 4: To facilitate semantic I&I, develop ontologies in small reusable inter-related modules.


To address the trade-off of having an ontology that is useful for your application versus general enough to be reusable, break up your ontology into modules and then use owl:imports or direct reference to concepts from other modules by URIs to link them together. Thus, you can have one module that defines the hierarchy of plants and animals, and another one where you specify that in your application plants don't move (i.e., the class Plant is disjoint from MovingThing).
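
The two modules just described might look roughly like the following pair of Turtle documents. All URIs and the life: and app: prefixes are hypothetical, chosen only to illustrate the layout.

# Document 1: a reusable module defining the hierarchy (hypothetical URI)
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix life:    <http://example.org/lifeforms#> .

<http://example.org/lifeforms>
      a       owl:Ontology .

life:Lifeform
      a       owl:Class .

life:Plant
      a       owl:Class ;
      rdfs:subClassOf life:Lifeform .

life:Animal
      a       owl:Class ;
      rdfs:subClassOf life:Lifeform .

# Document 2: an application-specific module that imports the first
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix life:    <http://example.org/lifeforms#> .
@prefix app:     <http://example.org/myapp#> .

<http://example.org/myapp>
      a       owl:Ontology ;
      owl:imports <http://example.org/lifeforms> .

app:MovingThing
      a       owl:Class .

# The application-specific constraint lives only in this module.
life:Plant
      owl:disjointWith app:MovingThing .

Someone who needs to describe plants that move can then reuse the first module on its own, without inheriting the disjointness constraint.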

Principle 5: To facilitate semantic I&I, reuse concepts from existing ontologies.

The richer content and standardized representations afforded by OWL, together with the connectivity provided by the web's basic infrastructure, open up the second exciting aspect of ontology creation on the semantic web, namely, the possibility of genuine, robust reuse. Prior to the semantic web, reuse consisted of little more than incorporating a term from someone else's ontology into one's own ontology, with the intention that it means the same thing across ontologies. But if the term is not defined or axiomatized using the sort of rich representational constructs that OWL provides, there is no way to be certain, or even mildly confident, of its meaning and hence no way to ensure commonality of meaning when the term is reused. OWL makes such representations possible. Moreover, the infrastructure of the web (notably, the mechanism of Uniform Resource Locators), together with XML namespaces, facilitates reuse by making it possible to import content directly from remote ontologies, thereby reducing work and eliminating the possibility of transcription errors.

Reusing existing concepts that have a well-defined meaning is an important step, but not nearly enough. Sometimes, there are good reasons to represent the same concepts using different terms, or different representational constructs. This gives rise to our next principle.

Principle 6: To facilitate semantic I&I, use OWL mapping constructs to relate concepts from different ontologies.

The vision of the Semantic Web is for the meaning of web content to be accessible to both humans and machines. Using OWL as described above allows content to be semantically-based. However, there will never be one single global ontology that everyone uses. Instead, ontologies of all shapes and sizes are and will continue to be developed independently.

Independent development means that the ontologies will not be the same, even though they may cover the same subject matter. The term 'car' might be used in one ontology, and 'automobile' in another, to mean the same thing. A third ontology might just have a class called 'vehicle' and not include details of different kinds of vehicles.  

If we are going to leverage the good conceptual work of others who have built ontologies, then we are going to have to 1) have a way to keep track of which concepts came from which ontology and what they mean in their originating context, and 2) be able to specify semantic correlations between concepts in different ontologies. The latter is called mapping.  It is used in each of use cases 2-4 above.

When a term from one ontology is reused in the context of building a new ontology or extending an existing ontology, it comes "tagged" with its original namespace. This lets us keep track of which term came from where; it also enables a single core term to be used in both ontologies, each having a different meaning.  In the event that the meanings are the same, or closely related, it is important to specify the semantic correlations between them. This maximizes reuse and avoids redundancy, which in turn, improves maintainability.  These correlations are created by using various mapping constructs in OWL. 
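
As a small illustration of namespace tagging, the sketch below shows the same local name used in two ontologies with different meanings. The zoo: and cars: prefixes, their URIs and the class names are hypothetical.

@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix zoo:     <http://example.org/zoo#> .
@prefix cars:    <http://example.org/cars#> .

# Same local name, but two distinct URIs and therefore two distinct terms.
zoo:Jaguar
      a       owl:Class .      # intended meaning: a kind of animal

cars:Jaguar
      a       owl:Class .      # intended meaning: a make of car

Because each term expands to a different full URI, the two can coexist in one ontology without clashing. Had the two terms meant the same thing, a mapping statement relating them would be the appropriate next step.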

In the next section of this document, we illustrate the use of the ontology mapping constructs for facilitating semantic interoperability & integration using a series of examples.

OWL-Based Mappings: Overview

Logical Equivalence Mappings

The simplest kind of correlation to specify between two concepts in OWL is logical equivalence. To express equivalence between classes, properties and individuals, we use the following three OWL constructs, respectively: owl:equivalentClass, owl:equivalentProperty, and owl:sameAs. Equivalence between two class definitions means that the two classes have the same extensions, but they are not necessarily the same concepts. A famous, somewhat comical example of this, originating with the philosopher Aristotle, is the pair of classes "Human" and "UnpluckedFeatherlessBiped". The meaning is different, but they have the same extension.

Similarly, equivalence between two property definitions means that they both have the same set of pairs of related individuals in their extension, but they might not be exactly the same property. An actual example of this took place at a large US university. When information systems were first being implemented at the university, every student was to be assigned a unique student ID. Additionally, the university also maintained every student's Social Security Number. Independently it was decided that a student's ID number would simply be his or her Social Security Number. Nonetheless, to allow for the possibility that an independent system of student IDs might be implemented, both the property studentID and the property socSecNum remained in the system. If the system were driven by an ontology, then these two properties would be declared to be equivalent. They don't mean the same thing, but they have the same set of related individuals.

Equivalence between two individuals means that two URI references actually refer to the same thing: the individuals have the same "identity".

For the examples below, assume that we have two different ontologies, Ontology A and Ontology B and we are establishing relationships between concepts in these ontologies. Concepts from the Ontology A are indicated by the prefix ontA and those from the Ontology B are prefixed with ontB.

For example, we might specify that the class ontA:car is equivalent to the class ontB:auto as follows:

ontA:car
owl:equivalentClass ontB:auto.

We can state that the property ontA:canRunSlowly is equivalent to the property ontB:canJog as follows:

ontA:canRunSlowly
owl:equivalentProperty ontB:canJog.

We can specify that the individual ontA:venus is the same individual as ontB:morning_star as follows:

ontA:venus
owl:sameAs ontB:morning_star.

Non-Equivalent Similarity Mappings

Often, there is no exact equivalence between two classes or two properties, but there may still be an important subclass or subproperty relationship between them that is useful to capture. For example, one can create mappings to state that the class ontA:Primate is a subclass of the class ontB:Mammal, and that the property ontA:brotherOf is a subproperty of ontB:siblingOf as follows:

ontA:Primate
rdfs:subClassOf ontB:Mammal.
ontA:brotherOf
rdfs:subPropertyOf ontB:siblingOf.

Non-Equivalent Difference Mappings

The equivalence and subclass relationships state fundamental similarities between classes. One can also specify correlations among concepts in different ontologies that highlight their differences. Two OWL constructs are provided for this: owl:disjointWith and owl:differentFrom.

For example, one might declare that ontA:Plant is disjoint with the class ontB:Fungus :

ontA:Plant
owl:disjointWith ontB:Fungus .

You can also say that ontA:JohnASmith is different from ontB:JohnASmith:

ontA:JohnASmith
owl:differentFrom ontB:JohnASmith .

These kinds of statements are very useful for use cases 1, 2 and 4 because they provide mapping information that relates the meaning of classes and individuals in two different ontologies. These difference mappings are less useful for translating data from one ontology to another (use case 3).

Complex Mapping Statements using Other OWL Constructs

In addition to being used on their own to express equivalence, similarities or differences between  concepts in an ontology, these constructs can also be used to relate  concepts to complex expressions using any valid OWL construct.

For example, there are three constructs for correlating classes: owl:equivalentClass, rdfs:subClassOf and owl:disjointWith.  In the above examples the classes used as arguments to these constructs are simple expressions. However, they may be arbitrarily complex class expressions. For example, an ontological engineer might say that the class ontA:Lifeform in Ontology A is equivalent to the union of the three classes ontB:Plant, ontB:Animal and ontB:Fungus in Ontology B :

ontA:Lifeform
      owl:equivalentClass
              [ a       owl:Class ;
                owl:unionOf ( ontB:Plant ontB:Animal ontB:Fungus )
              ] .

The unionOf construct may be used in conjunction with other OWL constructs for operating on sets: intersectionOf and complementOf to form arbitrarily complex expressions.  

Next we consider OWL Restriction, another important class formation operation that can be used on its own, or in conjunction with the above set operation constructs, to express complex mappings. For example, we can state that the class ontA:Bicycle is a subclass of the class of all instances of ontB:LandVehicle that have exactly two wheels:

ontA:Bicycle
      rdfs:subClassOf
              [ a       owl:Class ;
                owl:intersectionOf
                        ( ontB:LandVehicle
                          [ a       owl:Restriction ;
                            owl:cardinality 2 ;
                            owl:onProperty ontB:hasWheels
                          ]
                        )
              ] .

In general, any legal combination of OWL class formation operators can be used to form arbitrarily complex class expressions to state class equivalence, disjointness or subclass relationships. More typically, one of the class expressions will be just a class name, like ontA:Lifeform in the above example. However, it is also allowed to have arbitrarily complex expressions for both classes that are declared to be equivalent, disjoint or subclasses. For example, one could state that the class formed by the intersection of the class C1 with the union of classes C2 and C3 is a subclass of (or equivalent to, or disjoint with) the class formed by taking the complement of the intersection of classes C2 and C4.
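
The abstract statement about C1 through C4 could be sketched in Turtle as follows, here using the subclass variant. The ex: prefix and its namespace URI are placeholders assumed for illustration.

@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:      <http://example.org/abstract#> .

# Subject: the intersection of C1 with the union of C2 and C3
[ a       owl:Class ;
  owl:intersectionOf
          ( ex:C1
            [ a       owl:Class ;
              owl:unionOf ( ex:C2 ex:C3 )
            ]
          )
]
      rdfs:subClassOf
              # Object: the complement of the intersection of C2 and C4
              [ a       owl:Class ;
                owl:complementOf
                        [ a       owl:Class ;
                          owl:intersectionOf ( ex:C2 ex:C4 )
                        ]
              ] .

Replacing rdfs:subClassOf with owl:equivalentClass or owl:disjointWith gives the other two variants mentioned in the text.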

We leave it as an exercise for the reader to come up with real-world examples where such expressions might be useful. You can also state that any class expression is empty or non-empty. You can state that a set of classes forms a partition. Mapping statements are only limited by your imagination and the set of legal OWL constructs for specifying classes.
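
As a rough sketch of two of these statements, the Turtle below asserts that a class expression is empty and that a set of classes partitions another class. It reuses class names from the earlier examples and assumes the ontA: and ontB: prefixes are declared; the particular combinations chosen are for illustration only.

@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

# Emptiness: nothing is both an ontA:Plant and an ontB:Animal.
[ a       owl:Class ;
  owl:intersectionOf ( ontA:Plant ontB:Animal )
]
      rdfs:subClassOf owl:Nothing .

# Partition: ontA:Lifeform is exactly covered by three classes ...
ontA:Lifeform
      owl:equivalentClass
              [ a       owl:Class ;
                owl:unionOf ( ontB:Plant ontB:Animal ontB:Fungus )
              ] .

# ... and those three classes are pairwise disjoint.
ontB:Plant
      owl:disjointWith ontB:Animal , ontB:Fungus .
ontB:Animal
      owl:disjointWith ontB:Fungus .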

Summary: Mapping in OWL

There are three kinds of correlations. One specifies logical equivalence. Another specifies  similarity between two concepts that is not logical equivalence. The third specifies explicit differences.  There are seven OWL constructs explicitly designed for specifying correlations.

There are three OWL constructs for specifying logical equivalence:
  1. owl:equivalentClass,
  2. owl:equivalentProperty, and
  3. owl:sameAs.
There are two constructs, inherited by OWL from RDFS, for specifying non-equivalent mappings among similar concepts:
  1. rdfs:subClassOf
  2. rdfs:subPropertyOf
There are two OWL constructs for specifying explicit differences:
  1. owl:disjointWith
  2. owl:differentFrom

Viewed from another perspective, there are three constructs that relate classes to classes:
  1. owl:equivalentClass,
  2. rdfs:subClassOf, and
  3. owl:disjointWith.
There are two constructs that relate properties to properties:
  1. rdfs:subPropertyOf
  2. owl:equivalentProperty
There are two constructs that relate individuals to individuals:
  1. owl:sameAs
  2. owl:differentFrom

These core mapping constructs are used in conjunction with other OWL constructs to express formal logical relationships between concepts in different ontologies for the purpose of mapping.

Any valid OWL construct for building class expressions may be used as an argument to any of the OWL mapping constructs that relate classes. As we have seen, there is a rich set of class formation constructs in OWL.

In principle, it is also true that: 

However, at this time there are no OWL constructs that form expressions for individuals or for properties. So in the case of properties and individuals, owl:equivalentProperty, rdfs:subPropertyOf, owl:sameAs, and owl:differentFrom pretty much exhaust what can be specified in a mapping between two different ontologies.

Depending on the use case, mappings can be used to specify the logical connections either between existing concepts across multiple ontologies or between existing ontology concepts and a new ontology concept. Mappings can be used to state clear similarities, or clear differences. Mappings can involve any combination of classes, properties and individuals from two or more ontologies.

An Example: Gender, Pet Shops and the Semantic Web

Suppose a pet shop decides to semantically enable its web site for on-line ordering. They have heard all the buzz about the semantic web and how they are supposed to go out and use other people's ontologies. They search for ontologies using Swoogle, or using Google restricted to .owl files. The search terms are 'male' and 'female' because the sex of a pet is an important factor for prospective customers.

The search retrieves a tiny ontology which we will call ontology A. It has two basic classes: ontA:Gender and ontA:Humans. There are two instances of ontA:Gender: ontA:male and ontA:female. There is a single property called ontA:hasGender, whose domain is ontA:Humans and range is ontA:Gender. The property is functional, meaning that no instance of ontA:Humans can have more than one value for the ontA:hasGender property. There are two classes defined as restrictions on the ontA:hasGender property: ontA:Women and ontA:Men (not to be confused with hallelujah) are defined to be the ontA:Humans whose ontA:hasGender property has the value ontA:female and ontA:male, respectively.

The search also reveals a slightly different ontology covering much the same subject matter, which we will call ontology B. It is also a rather tiny ontology. It has three basic classes: ontB:Sex, ontB:Animal and ontB:Human (which is a subclass of ontB:Animal). There are two instances of ontB:Sex: ontB:M and ontB:F. There is a single property called ontB:hasGender, whose domain is ontB:Animal and whose range is ontB:Sex; unlike ontA:hasGender, it is not declared to be an instance of owl:FunctionalProperty, so an instance of ontB:Animal may have more than one value for it. Two further classes, ontB:Female and ontB:Male, are defined as restrictions on ontB:hasGender with the values ontB:F and ontB:M, respectively.

Below we list both ontologies in full. To make this mapping exercise more realistic, we do not document the meaning of the terms. These are still the early days of the wild wild semantic web, and there are ontologies of all shapes and sizes and levels of quality. The challenge is how to make sense of it all.

Ontology A
# Base: http://stupidURL/PetShopOntologyA.owl#
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix ontA:       <http://stupidURL/PetShopOntologyA.owl#> .

<http://stupidURL/PetShopOntologyA.owl>
      a       owl:Ontology .

ontA:Gender
      a       owl:Class .

ontA:male
      a       ontA:Gender .

ontA:female
      a       ontA:Gender .


ontA:Humans
      a       owl:Class .


ontA:hasGender
      a       owl:ObjectProperty , owl:FunctionalProperty ;
      rdfs:domain ontA:Humans ;
      rdfs:range  ontA:Gender .


ontA:Women
      a       owl:Class ;
      owl:equivalentClass
              [ a       owl:Restriction ;
                owl:hasValue   ontA:female ;
                owl:onProperty ontA:hasGender
              ] .

ontA:Men
      a       owl:Class ;
      owl:equivalentClass
              [ a       owl:Restriction ;
                owl:hasValue   ontA:male ;
                owl:onProperty ontA:hasGender
              ] .

NB: the URI has 'stupidURL' in its name to make it clear that it will never be found at any web site. It looks like a URL because the standards require it.

Ontology B
# Base: http://stupidURL/PetShopOntologyB.owl#
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix ontB:       <http://stupidURL/PetShopOntologyB.owl#> .

<http://stupidURL/PetShopOntologyB.owl>
      a       owl:Ontology .

ontB:Sex
      a       owl:Class .

ontB:M
      a       ontB:Sex .
ontB:F
      a       ontB:Sex .

ontB:Animal
      a       owl:Class .

ontB:Human
      a       owl:Class ;
      rdfs:subClassOf ontB:Animal .

ontB:hasGender
      a       owl:ObjectProperty ;
      rdfs:domain ontB:Animal ;
      rdfs:range  ontB:Sex .

ontB:Female
      a       owl:Class ;
      owl:equivalentClass
              [ a       owl:Restriction ;
                owl:hasValue   ontB:F ;
                owl:onProperty ontB:hasGender
              ] .

ontB:Male
      a       owl:Class ;
      owl:equivalentClass
              [ a       owl:Restriction ;
                owl:hasValue   ontB:M ;
                owl:onProperty ontB:hasGender
              ] .

Use Case 2: Integrating the Ontologies

What is the pet shop owner to do?  He likes ontology A because in addition to selling animals, he also likes to keep marketing information on male vs. female humans who are customers, and ontology B only talks about male and female animals. He likes ontology B because he needs to track the sex of the pets he sells, and ontology A only talks about the sex of humans.  The pet shop owner needs some of each ontology, so this is an (albeit contrived) illustration of use case 2 -- ontology integration.  

The first task is to determine the semantic correlations among the terms and concepts in the two ontologies. First we consider the classes in ontology A. There are two candidates for simple owl:equivalentClass mappings that may amount to nothing more than a different choice of terms: Human vs. Humans, and Gender vs. Sex. Before we declare such equivalences, we must first try to verify that the intended and actual meanings really are the same.

How do we do that? The actual meaning is found in the formal statements that define the ontology, i.e. the axioms. The actual meaning should of course also shed light on the intended meaning. Additional information on the latter can be found in documentation for concepts in an ontology.   So we must check both the documentation and the axioms.  

In a perfect world, all ontologies would accurately and unambiguously document the meaning of each concept. Documentation comes in many forms. It may be in the ontology itself as comments, or it may be in separate documents that describe the ontology, its purpose, use cases, design, etc.
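
To make concrete what documentation inside the ontology itself might look like, here is a minimal sketch using rdfs:label and rdfs:comment. Neither ontology A nor ontology B contains such annotations; the wording of the label and the comment is invented for illustration and is only one guess at what the authors might have intended.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ontA: <http://stupidURL/PetShopOntologyA.owl#> .

# Hypothetical annotations; ontology A as listed above has none.
ontA:Gender
      a       owl:Class ;
      rdfs:label   "Gender"@en ;
      rdfs:comment "The biological sex of a human: male or female."@en .
```

Had a comment like this been present, the question of whether ontA:Gender matches ontB:Sex would be much easier to settle.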

In the real world, documentation is often missing, incomplete, or inaccurate.  Nevertheless, always check for documentation to look for meaning clues. Use cases are a particularly valuable source of insight into understanding an ontology.  Next we examine the relevant axioms relating to humans.

All we know about the class ontA:Humans is that it is:

  1. a subclass of Thing (which tells us nothing useful) and
  2. the domain of a property called ontA:hasGender.  

All we know about the class ontB:Human is that it is:

  1. a subclass of ontB:Animal

So, on the face of it they do not mean exactly the same thing. This is normal. However, even though the axioms don't suggest exact equivalence, their intended meanings could still be identical. As noted above, there is a tradeoff between adding more and more axioms to reduce ambiguity and having just the axioms you need to achieve a specific purpose. Thus, one can usually think of many true axioms that could be added to a given ontology, but which are intentionally left out. This may be good for the ontology’s original purpose, but can make it harder for unanticipated users to resolve any ambiguities, which in turn hinders reuse.

Of course, things may also be left out due to carelessness rather than careful design. There is no easy way to tell the difference; one has to use judgment. In this case, declaring the equivalence between ontA:Humans and ontB:Human is reasonable. We do this in OWL as follows:

ontB:Human
      owl:equivalentClass ontA:Humans .

The other candidates for a simple class equivalence mapping are  ontA:Gender and ontB:Sex.  In English, these words can have quite different meanings.  Strictly speaking, the term ‘gender’ is a grammatical category, and ‘sex’, well gosh, let's just say it is a biological term referring to one of various reproduction strategies that have evolved. However, both terms are frequently used to denote whether a given lifeform is male or female. So they might be equivalent. 

Let's check the axioms. We are looking for all the axioms that directly mention ontA:Gender or ontB:Sex. If you are comfortable with looking at raw .owl files, you can just search for the appropriate string to find the axioms. Otherwise, you will have to rely on ontology editor tools to make it easy for you to find all uses of a given term.

All we know about the class ontA:Gender is that it:

  1. is a subclass of Thing (which tells us nothing useful) and
  2. is the range of a property called ontA:hasGender and
  3. has two instances: ontA:male and ontA:female.

All we know about the class ontB:Sex is that it:

  1. is a subclass of Thing (which tells us nothing useful) and
  2. is the range of a property called ontB:hasGender and
  3. has two instances: ontB:M and ontB:F.

This looks like a pretty good match. We have made the above determinations of actual meaning by examining the axioms that explicitly mention the terms in question. This may not be enough. For example, the property ontB:hasGender is directly mentioned in an axiom involving ontB:Sex. To get a more complete picture of the meaning of the term ontB:Sex, we look for axioms that directly mention ontB:hasGender as well.

We see that there is a class called ontB:Female that is defined to be any animal whose ontB:hasGender property has the value ontB:F. This is an important clue. It tells us that in all likelihood ontA:Gender and ontB:Sex are being used in the same way, and are good candidates for being declared as equivalent classes. We do this as follows:

ontA:Gender
      owl:equivalentClass ontB:Sex .

We have now mapped two classes in ontology A to two classes in ontology B. There are two additional classes in ontology A, ontA:Men and ontA:Women, which are defined in a structurally identical manner by restricting the values of the ontA:hasGender property. Because the meaning of ontA:Men and ontA:Women depends heavily on the meaning of ontA:hasGender, we examine the properties ontA:hasGender and ontB:hasGender and their respective sets of possible values, {ontA:male, ontA:female} and {ontB:M, ontB:F}, before we try to semantically correlate the classes ontA:Men and ontA:Women to concepts in ontology B.

The terms ontA:male and ontA:female are used to define ontA:Men and ontA:Women using an owl:Restriction on the property ontA:hasGender. The terms ontB:M and ontB:F are used in ontology B in an exactly analogous manner to define the classes ontB:Male and ontB:Female using a restriction on the property ontB:hasGender.

Thus, it would appear to make perfect sense to use the owl:sameAs relationship to correlate the possible values of the two hasGender properties, as follows (NB: the N3 syntax uses the '=' sign to denote owl:sameAs).

ontA:male
      =       ontB:M .
ontA:female
      =       ontB:F .

However, before we can be sure, we need to check that the meanings of ontA:hasGender and ontB:hasGender are really the same. Names of concepts can be deceiving. There are two important differences in the meanings of ontA:hasGender and ontB:hasGender:

  1. the domain of ontA:hasGender is ontA:Humans, whereas the domain of ontB:hasGender is ontB:Animal
  2. ontA:hasGender is functional and ontB:hasGender is not.

In deciding on a semantic correlation, one must decide whether these differences are  important and intentional. Whereas ontB:M and ontB:F apply to ontB:Animal, ontA:male and ontA:female only apply to ontA:Humans (and thus ontB:Human because they are equivalent classes). A biologist may tell us that there are subtle differences in the meaning of sex for different kinds of animals and other lifeforms.  This might be important if the ontology was for academic biology, or for a biotech company. However we are just talking about a pet shop, and all we care about is tracking male and female pets, and male and female customers.  Therefore it is going to suit our purposes to stick with our owl:sameAs declarations listed above.

We are left with a decision about how to relate the two different hasGender properties.  The difference could just be an oversight on the part of the ontology developers.  Or ontB:hasGender could be intentionally not functional  because some animals do have both male and female sexual organs.  The property becomes functional when applied to humans.  

Note that our job here is to find an ontology that works for us (use case 2) rather than to perform data translation (use case 3). This means that we just need to find classes and properties that meet our needs; we do not need to specify a full mapping between all the constructs in both ontologies.

We need to have a way to track the gender of both animals and humans.  If the pet shop only sells animals that must be either male or female, but not both, then we only need one hasGender property, and it can be functional.  If you just want to get on with your on-line store, and you know that you will never ever sell multi-gender animals in the future then this is the most expedient thing to do.  However, this may not be usable by other pet shop owners who do sell pets with both male and female gender.

Suppose you are motivated by the possibility of others using your ontology; for example, you may be representing an international committee to standardize on pet shop metadata. In that case, you need to consider the needs of a broader range of pet shops' requirements and make choices accordingly. Your ontology would need to accommodate pet shops that sold multi-gender animals.

Let's say we wish for the ontology to be more widely used, or perhaps we anticipate expanding into a wider range of pets, so we do not want to restrict our growth. We therefore need to accommodate animals that necessarily have just one gender, and those that may have more than one.

We can achieve this by specifying that ontA:hasGender is an rdfs:subPropertyOf ontB:hasGender. This means that any two individuals that are in the ontA:hasGender relationship are also in the ontB:hasGender relationship. For example, if any individual ontA:hasGender ontA:male, then that same individual also ontB:hasGender ontA:male (which is declared to be the same as ontB:M). Animals in general may have more than one value for gender, but humans can have only one. We can specify the mapping between the two hasGender properties as follows:

ontA:hasGender rdfs:subPropertyOf ontB:hasGender .
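
The effect of this mapping can be sketched with a single sample fact. The data: namespace and the individual data:customer1 are hypothetical, and owl:sameAs is written out in full rather than with the N3 '=' shorthand. The last two triples are what a reasoner would be expected to infer; they are shown only to make the result visible and would not normally be asserted by hand.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ontA: <http://stupidURL/PetShopOntologyA.owl#> .
@prefix ontB: <http://stupidURL/PetShopOntologyB.owl#> .
# Hypothetical namespace for sample data.
@prefix data: <http://stupidURL/PetShopData#> .

# Mappings stated in the text.
ontA:hasGender rdfs:subPropertyOf ontB:hasGender .
ontA:male      owl:sameAs         ontB:M .

# Asserted sample fact about a hypothetical individual.
data:customer1 ontA:hasGender ontA:male .

# Expected inferences, written out for illustration only.
data:customer1 ontB:hasGender ontA:male .   # by rdfs:subPropertyOf
data:customer1 ontB:hasGender ontB:M .      # by owl:sameAs
```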

To avoid confusion due to the names being the same, we could rename them, e.g. animalHasGender and humanHasGender. However, if we wish to leave the original ontologies intact, we would instead create two new properties with these names and declare them equivalent to the original properties. To do it this way, we would make the following declarations. This is the first example of creating a new term, so we use a different namespace prefix: pet.

pet:humanHasGender  owl:equivalentProperty ontA:hasGender .
pet:animalHasGender owl:equivalentProperty ontB:hasGender .


If we wished to make changes to the properties, then we could create new properties based on the existing ones, make the local changes, and declare them to be subproperties of the original properties. You would do this if, for example,  you wanted the animalHasGender property to also be functional. It would be bad practice, and perhaps even a logical inconsistency to declare two properties to be equivalent, if only one was functional. We leave how to do this as an exercise to the reader.  
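
One possible answer to this exercise is sketched below; it is an illustration under stated assumptions, not the definitive solution. The URI bound to the pet: prefix is a placeholder, since the text does not specify one, and the property name pet:animalHasSingleGender is invented here. The new property is functional and specializes ontB:hasGender, which itself is left unchanged.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ontB: <http://stupidURL/PetShopOntologyB.owl#> .
# Placeholder URI for the pet: prefix (an assumption of this sketch).
@prefix pet:  <http://stupidURL/PetShopOntology.owl#> .

# A new functional property declared as a subproperty of the original.
pet:animalHasSingleGender
      a       owl:ObjectProperty , owl:FunctionalProperty ;
      rdfs:subPropertyOf ontB:hasGender ;
      rdfs:domain ontB:Animal ;
      rdfs:range  ontB:Sex .
```

This follows the same pattern as the mapping of ontA:hasGender, which is also a functional subproperty of the non-functional ontB:hasGender.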

What about extreme cases where humans are born with organs of both sexes? Always go back to the ontology requirements. We are selling pets; we are not setting up a database for human sexual anomalies. We don't care about these extreme cases. We do care if one human or one pet is both male and female, because that will mess up our system.

We are now ready to address the classes ontA:Men and ontA:Women.  We are satisfied that the hasGender properties in both ontologies have the same core semantic meaning, and we have satisfactorily addressed their differences. Intuitively, we can see that the set of all men is the intersection of the set of all humans and the set of all male animals. We can say this (and the counterpart for women and females) in OWL as follows:

ontA:Men
      owl:equivalentClass
      [ a       owl:Class ;
        owl:intersectionOf (ontB:Human ontB:Male) ] .

ontA:Women
      owl:equivalentClass
      [ a       owl:Class ;
        owl:intersectionOf (ontB:Human ontB:Female) ] .


Note that ontA:Men and ontA:Women should be subclasses of ontB:Male and ontB:Female, respectively. Given the semantic mappings shown above, an inference engine will conclude this.
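
Written out explicitly, the subclass relationships that follow from the two intersection mappings would look like this. These triples are expected entailments, shown for illustration; they do not need to be added to the mapping by hand.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ontA: <http://stupidURL/PetShopOntologyA.owl#> .
@prefix ontB: <http://stupidURL/PetShopOntologyB.owl#> .

# A class equivalent to an intersection is a subclass of each operand.
ontA:Men    rdfs:subClassOf ontB:Male ,   ontB:Human .
ontA:Women  rdfs:subClassOf ontB:Female , ontB:Human .
```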

Use Case 3: Data Integration

We have considered the pet shop example as being about ontology integration. However, there might also be a need for data translation. Each of the ontologies might have data sets that go with it. If we ask a query about all the female customers that purchased male animal pets in a given year, there must be a way for the system to know that an instance of ontB:Human is also an instance of ontA:Humans. This happens for free, in the way we have set up our mappings.
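
As a hedged illustration of why this comes for free, consider a single record from a hypothetical data set that uses ontology B. The data: namespace and the individual data:alice are invented for this sketch. The entailments listed assume the mappings stated earlier in this section and a reasoner that supports owl:equivalentClass, owl:hasValue and owl:intersectionOf.

```turtle
@prefix ontA: <http://stupidURL/PetShopOntologyA.owl#> .
@prefix ontB: <http://stupidURL/PetShopOntologyB.owl#> .
# Hypothetical namespace for sample data.
@prefix data: <http://stupidURL/PetShopData#> .

# Asserted in a data set that uses ontology B.
data:alice
      a       ontB:Human ;
      ontB:hasGender ontB:F .

# Expected inferences, written out for illustration only.
data:alice a ontB:Female .              # hasValue definition of ontB:Female
data:alice a ontA:Humans .              # ontB:Human equivalentClass ontA:Humans
data:alice a ontA:Women .               # ontA:Women = ontB:Human and ontB:Female
data:alice ontA:hasGender ontA:female . # hasValue definition of ontA:Women
```

A query phrased entirely in ontology A terms, such as one asking for all instances of ontA:Women, could then find data:alice even though she was recorded using ontology B.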

In the data integration use case, we would need to define mappings in both directions. For example, we would have to try to define ontB:Female and ontB:Male using terms from ontology A. This is left as an exercise for the reader.

Summary: Gender, Pet Shops and the Semantic Web

In summary, to discover the meaning of a term in an ontology:
  1. consider the natural language meaning of the term used to name the concept
  2. consult the documentation
  3. examine the axioms
  4. consult the original ontology developers (if possible)
  5. add a heavy dose of educated guesswork
  6. stir

In deciding how or whether to use existing concepts, or to create new ones, or variations on existing ones, always keep your requirements in mind. It is also important to examine the axioms carefully. If our pet shop owner adopted ontology A, failing to notice that gender applied only to humans and not to animals, then the application would break when orders were placed: error messages would occur and orders would be lost. Why? Because people will want to order animals that have a specified gender.

Summary and Conclusions

Our aim is that the reader is now equipped with the raw materials to start on their own project relating to semantic interoperability.  Putting it all together for a realistic example involves hard work and creativity. Most of the effort often boils down to figuring out the intended semantics  of what is in someone else's ontology.  Hence, we close by reiterating our third principle: To facilitate semantic I&I, reduce ambiguity by expressing more meaning.  However, do not add explicit meaning unless it is relevant. This makes it much easier to reuse and correlate to other ontologies.


Good luck!

Acknowledgments

Chris Welty and Deborah McGuinness participated in early discussions about the structure and content of this note and gave helpful comments on earlier drafts.  The pet shop example came from an email discussion in the cuo-wg mail list. Yongchun Gao posed the problem, and Pat Hayes outlined an OWL solution that was the basis for the example in this note.