Achieving Semantic Interoperability & Integration Using RDF and OWL

 

W3C Editor's Draft – 01/23/2006

 

This version:

http://www.w3.org/TR/2004/whatever

Latest version:

http://www.w3.org/TR/whatever

Previous versions:

This is the first public Working Draft

Editors:

Mike Uschold, The Boeing Company

Christopher Menzel, The Boeing Company


Abstract

Semantic interoperability means enabling different agents, services, and applications to exchange information, data and knowledge, on and off the Web. To enable semantic interoperability, agents, services, and applications need either to share the same mutually understood vocabulary or to create correspondences or mappings between their different vocabularies. One of the design goals of RDF and OWL is to provide the means to specify such mappings. This note provides guidance on how OWL and RDF can be used to enable semantic interoperability.

We briefly characterize what we mean by semantic interoperability, and what the challenges are. We describe some RDF and OWL constructs that are designed to support semantic interoperability and illustrate them with examples. We highlight their strengths and limitations. The main strengths are the ability to import, share and reuse public ontologies (in whole or part) and the ability to express logical equivalence and other relationships between concepts, properties and individuals in different ontologies. One main weakness is the lack of support for procedural functions (e.g. arithmetic, string manipulation) that are needed for mapping between many real-world ontologies.

Status of this Document

This is a nearly complete first draft of this note.
The outline and structure of the document are stable.
The major change from the last version is that there is a narrative for the different OWL mapping constructs. The major content remaining to be added is the code examples.

[ANTICIPATED:] This document is the First Public Working Draft. We encourage public comments. Please send comments to public-swbp-wg@w3.org [archive] and start the subject line of the message with "comment:"

Open issues, todo items:

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.


Introduction

Semantic Web languages, such as RDF and OWL, facilitate interoperability in significant ways. They provide the social structure and technical framework to reuse existing ontologies; they provide formal mechanisms to express logical equivalences and other formal relationships between classes and properties in different ontologies. The goal of this note is to give users and application developers tools and guidelines to exploit OWL to achieve semantic interoperability. Ultimately, it is up to the users to reuse ontologies correctly and to identify and specify the logical relationships between terms in different ontologies.

Use Cases 

1. Extending a single ontology: A knowledge engineer would like to start with one ontology and extend it so that it meets the needs of a particular application. This can be viewed as a “base case” of integration, as the engineer is integrating the conceptual specification of her requirements with the existing ontology.

2. Integrate existing OWL ontologies: A knowledge engineer has distinct OWL ontologies to integrate, resulting in one larger ontology. She largely understands the intuitive meanings of the vocabulary of the existing ontologies, either because the existing ontologies are well documented and axiomatized, or simply because she is familiar with the domain specific vocabulary of each. Her main task is to make the mappings between the existing ontologies explicit and to generate one integrated ontology. We will refer to this as semantic integration.

3. Application interoperability requiring data translation: A software developer needs to have data sent from a source application (where it is expressed using the vocabulary from one ontology) to a target application that requires the data to be expressed using another ontology. The data needs to be converted to use the vocabulary of the target application. This is similar to the previous use case, in that we assume that there is a good understanding of the meaning of both ontologies, and that the main task is to specify the mappings. The primary differences from the previous use case are that there is one target ontology receiving data and there is no requirement to create one new integrated ontology.

4. Share existing ontologies: A knowledge engineer has a stock of legacy ontologies written in different representation languages such as Classic, Ontolingua, KIF, FLogic or other KR languages. She wants to be able to share ontologies on the web. This knowledge engineer may find that sharing ontologies is facilitated if mappings to terms in the ontologies written in the other languages are made explicit. The best approach is to consult [OWL Semantics] for a precise account of the semantics of OWL; see also the CL-based axiomatic semantics [Hayes], [Menzel].

Note that for use cases 2 and 3, there is a need to map terminology from one ontology to another. A major focus of this note will be how to specify these mappings in OWL. The mechanics of creating the mappings are independent of the use case for which the mappings are needed.

 

What do we mean by Semantic Interoperability & Integration?

NOTE: This document currently focuses on semantic integration as opposed to the broader notion of semantic interoperability. Here, semantic integration is taken to consist chiefly in the identification and explication of logical connections between classes, properties, and individuals across ontologies. Semantic interoperability (we (or I – CM) suggest) encompasses semantic integration proper as well as, e.g., interoperability of services and tools made possible and driven by semantic integration.

MFU: you don't say here what interoperability is that integration is not. Do we need to say something in the document?

DLM:  why don’t we just say that this document focuses on a first step toward semantic interoperability as facilitated by semantic integration that is accomplished using a set of formal statements of logical relationships between terms.  Some have referred to this as articulation, or bridging, axioms.

MFU: the data translation use case is very definitely semantic interoperability, not integration, so I think we are doing both. Hence, I added ‘integration’ to the title of the note.

The terms ‘semantic interoperability’ and ‘semantic integration’ are often used loosely and somewhat interchangeably. The core idea for both is the existence of, and desire to bridge, a semantic gap between different systems or applications that use different vocabularies. The different vocabularies reflect the different underlying conceptual models or ontologies that the systems are based on. Thus, semantic interoperability and semantic integration both entail the use of semantic mappings between terms in one ontology and terms in another ontology. The main difference is an architectural one. Semantic interoperability usually means that the original systems and ontologies remain intact. Semantic integration usually means that there is some merging of the ontologies or applications. For this note, we will use the term ‘semantic I&I’ to refer to both semantic interoperability and semantic integration. We will use either term on its own when we specifically wish to refer to just one.

We view semantic I&I as an effort that focuses on enabling different agents, services, and applications to exchange information, data, and knowledge such that the intended meaning of the information/data/knowledge is preserved. For simplicity, we will refer to agents, services, and applications collectively as agents. Great strides have been made in recent decades to improve interoperability at the physical and syntactic levels. Streams of data were successfully transmitted between systems; however, there was no meaning associated with the data. This situation is analogous to the successful delivery of an encrypted message that appears to the recipient in an unfamiliar script: mere scratchings on the page. As Verizon CTO Michael Brodie notes, "it's the semantics, not the plumbing" that is required for interoperability. That is, it is insufficient just to have a robust physical infrastructure for transmitting data between systems, as the very same data can mean very different things depending on the system (as well as the context). To a supplier, "delivery date" typically means the date the product is shipped; to the buyer, the same phrase typically means the date the product is received. Similarly, a data value such as “32” may be an age (measured in years), a temperature (measured in degrees Fahrenheit or Celsius), or a number of employees (measured in FTE – Full Time Equivalents). Without a way of indicating its intended meaning, the raw data received by a target system is potentially so misleading and open to misinterpretation that it can be viewed as essentially useless. Systems are semantically interoperable when agents within them are able to exchange information, rather than mere data.

Semantic I&I and the Semantic Web: Some Basic Guidelines

In this section, we give some general guidelines and principles for achieving semantic interoperability & integration among Web applications. The Semantic Web and the Semantic Web languages provide both the social structure and the technical means to facilitate semantic I&I. The most fundamental contribution that the Semantic Web brings to semantic I&I is a set of recommended standardized languages with well-defined syntax and semantics. One impediment to semantic I&I has been the use of a wide variety of knowledge representation frameworks to represent information. Exacerbating the problem is the fact that many of these languages lacked any explicit specification of their syntax (their basic vocabularies and their grammars) and their intended semantics. In limited circumstances, the need for such specifications may not arise. For example, within a small organization new users can pick up the structure of the syntax and its intended interpretation from experienced users in the organization. However, in the context of the Semantic Web, this model is inadequate, since agents wishing to share information must first share a common understanding of the content of the representations in terms of which that information is expressed.

The Semantic Web activity in the W3C made a significant advance on the language heterogeneity problem through the introduction of formal  recommendations for several standard XML-based ontology languages, notably, RDF, RDF Schema (RDFS) and OWL. The  syntax and semantics of these languages are open, well-defined standards.  This gives rise to our first principle:

Principle 1: To facilitate semantic I&I, create new ontologies in OWL.
To the extent that recommended standard ontology languages come into broader usage for building ontologies on the WWW, the problems of syntactically different or semantically ill-defined representation languages are minimized. Therefore, whenever possible, use OWL when building new ontologies. This will help ensure semantic I&I, at least at the language level. Unfortunately, of course, many existing ontologies are not written in OWL; this gives rise to the second principle.

Principle 2: To facilitate semantic I&I, translate existing ontologies into OWL.

If existing ontologies are intended to be shared, translate them into OWL and make the OWL version available in addition to the original version. Unfortunately, there is no general, automated method for translating an ontology written in another KR language into OWL. The difficulty of the task varies significantly from language to language.

 

First, some ontologies are written in languages that are more expressive than OWL (such as full first-order logic), and thus it is possible that some details in the original ontology will not be translatable into OWL. Some systems, such as Ontolingua and Chimaera, that use KIF (a first-order logic language) as their internal representation language handle this issue in their export capabilities by translating as much as possible into OWL and noting what they could not translate. One emerging option is to produce ontologies completely in OWL where expressively possible, with additional information in a more expressive language, such as KIF, if required.

 

Second, some ontologies may be written in languages that embody paradigms more or less similar to those embodied in OWL. For example, ontologies written in a frame-style language like OKBC, LOOM, or CLASSIC may translate more easily and naturally into OWL than ontologies written in a full, unrestricted first-order language like KIF. OWL can be viewed as a descendant of frame-style and description logic systems and provides natural support for modeling things such as classes and binary relations. If ontologies are best conceptually modeled with things such as ternary (or higher arity) relations, translations may be less natural.

The first two principles focus on representing ontologies using the same language, OWL. This is a great help, but there are many issues of semantic heterogeneity that arise even when the same language is universally used. 

The semantics for OWL does two very important things for us. First, it fixes the meanings of the reserved vocabulary. For example, it says exactly what is meant by declaring a relationship to be transitive, or that an individual is a member of a class. Second, OWL specifies how the meanings of complex expressions, built using the various syntactic constructs specified by the grammar of the language, depend systematically on the meanings of their simpler parts. For example, if you know the precise meaning of the Animal and Plant classes, and you know from the OWL language specification the meaning of the reserved term owl:unionOf, then you know exactly what the union of Animal and Plant means.
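To make the compositional point concrete, the following is a minimal sketch of the union of Animal and Plant written in RDF/XML. The namespace http://example.org/life# and the class name AnimalOrPlant are hypothetical, introduced only for illustration; only owl:unionOf and the surrounding RDF/XML syntax come from the OWL vocabulary.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespace, used only for this illustration -->
  <!ENTITY ex "http://example.org/life#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- AnimalOrPlant (hypothetical name) contains exactly those
       individuals that are members of Animal or of Plant -->
  <owl:Class rdf:about="&ex;AnimalOrPlant">
    <owl:unionOf rdf:parseType="Collection">
      <owl:Class rdf:about="&ex;Animal"/>
      <owl:Class rdf:about="&ex;Plant"/>
    </owl:unionOf>
  </owl:Class>

</rdf:RDF>
```

The meaning of the new class is fixed entirely by the meaning of owl:unionOf together with the meanings of its two operands.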

The semantics of OWL, however, are silent on our knowledge engineer's domain specific vocabulary. Hence, if this ontology is to be shared with or reused by someone who does not share the ontology creator's understanding of the domain specific vocabulary, the meanings of terms in that vocabulary must be captured somehow. And this is really where the semantics gets into the Semantic Web.  This gives rise to our next principle:

Principle 3: To facilitate semantic I&I, reduce ambiguity by expressing more meaning.

In the context of the Semantic Web, ontology creation takes on an entirely new and exciting guise. Currently, web page content is expressed largely in terms of unstructured text and graphics where most or all of the meaning is implicit. Acquiring the intended meaning of the content relies on the human’s understanding of the words and context. OWL's constructs open up the possibility of explicitly declaring the meaning  of the content.  OWL is used to put the semantics in the Semantic Web. 

An ontology builder must take care to specify the meaning of the terms in the ontology. This ensures that others who may wish to reuse the ontology can glean the intended meaning of its vocabulary. If an ontology contains only a name for a class and nothing more, such as the class named “Animal”, then for a start, only English-speaking people will have an idea what the class might mean. All we know from the semantics of OWL is that “Animal” denotes a set of individuals. If the ontology also states, say, that the class Animal is a subclass of PhysicalObject, then this expresses more meaning and further reduces ambiguity. If the ontology further specifies that Animal is a subclass of other classes (e.g. the class of all moving things), that adds yet more meaning, further reducing ambiguity. In addition to specifying superclasses, the ontology builder can also indicate which relations have the class Animal in their domain or range. Further, the ontology builder can specify properties of those relations, such as whether the relations are transitive.

Expressing more meaning in an OWL ontology amounts to 1) using a variety of OWL constructs to capture different aspects of the meaning of a given term and 2) using a given OWL construct as often as necessary to say as much as possible about that aspect of the term's meaning. 
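As an illustration of adding meaning step by step, the sketch below axiomatizes the Animal class along the lines just described. The namespace http://example.org/bio#, the class name MovingThing and the property name hasAncestor are hypothetical choices made for this example; they are not taken from any published ontology.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespace, used only for this illustration -->
  <!ENTITY ex "http://example.org/bio#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <owl:Class rdf:about="&ex;PhysicalObject"/>
  <owl:Class rdf:about="&ex;MovingThing"/>

  <!-- Each superclass statement narrows what Animal can mean -->
  <owl:Class rdf:about="&ex;Animal">
    <rdfs:subClassOf rdf:resource="&ex;PhysicalObject"/>
    <rdfs:subClassOf rdf:resource="&ex;MovingThing"/>
  </owl:Class>

  <!-- A relation (hypothetical) with Animal as its domain and range,
       additionally declared to be transitive -->
  <owl:ObjectProperty rdf:about="&ex;hasAncestor">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#TransitiveProperty"/>
    <rdfs:domain rdf:resource="&ex;Animal"/>
    <rdfs:range rdf:resource="&ex;Animal"/>
  </owl:ObjectProperty>

</rdf:RDF>
```

Every statement in this fragment is an axiom that a reader, or a mapping tool, can use to rule out unintended interpretations of the term Animal.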

 

[DLM – comment – we should be careful not to encourage people to model a lot of information that is not expected to be useful.  For example, many people do not like using CYC’s upper level because it is viewed as having too many upper level concepts that “clutter” an application when they are all inherited.  We probably want to draw a fine line here.]

 

MFU: True, and here ‘useful’ means more than just what a given ontology is intended for. When building an ontology for a specific purpose, you definitely would not want to add lots of axioms that would likely never get used in any application. However, it is also true that to facilitate semantic I&I, one can argue that to a large extent, the more meaning the better – the more meaning, the easier it will be for humans or automated mapping assistants to understand the meaning of the terms, which is needed to map them. It is a tradeoff, to be sure. There is an analogy with DB schema development. Often, DB constraints that are true, and could be expressed, and which would help one understand the meaning of the data, are SPECIFICALLY NOT put in, due to possible performance impact. The tradeoff is understandability vs. performance. The same might apply in ontology development.

Each thing stated about the terms in an ontology is represented as a formal statement using the grammar of RDF along with the OWL reserved vocabulary. Each statement serves to characterize the logical characteristics of, and logical connections among, the classes, properties, and individuals named in a domain specific vocabulary. Although a typical user of an ontology building environment may not be aware of it, these formal statements are all axioms in a formal logic.

In summary, one increases sharing and reuse by reducing ambiguity, which in turn, is achieved by adding more meaning to one's ontology. One adds more meaning by making use of OWL's reserved vocabulary to axiomatize the classes, properties, and individuals in an ontology.

Principle 4: To facilitate semantic I&I, reuse terms from existing ontologies.
The richer content and standardized representations afforded by OWL, together with the connectivity provided by the web's basic infrastructure, open up the second exciting aspect of ontology creation on the Semantic Web, namely, the possibility of genuine, robust reuse. Prior to the Semantic Web, reuse consisted of little more than incorporating a term from someone else's ontology into one's own ontology, with the intention that it mean the same thing across ontologies. But if the term is not defined or axiomatized by means of the sort of rich representational constructs that OWL provides, there is no way to be certain, or even mildly confident, of its meaning and hence no way to ensure commonality of meaning when the term is reused. OWL makes such representations possible. Moreover, the infrastructure of the web (notably, the mechanism of Uniform Resource Locators), together with XML namespaces, facilitates reuse by making it possible to import content directly from remote ontologies, thereby reducing work and eliminating the possibility of transcription errors.
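The reuse mechanism described above can be sketched as follows. The ontology locations http://example.org/ontA and http://example.org/ontB, and the names sportsCar and myCar, are hypothetical and stand in for whatever ontologies are actually being reused; owl:imports and the namespace declarations are standard.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical location of the ontology being reused -->
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:ontB="http://example.org/ontB#"
    xml:base="http://example.org/ontA">

  <!-- Import all statements of the remote ontology by reference -->
  <owl:Ontology rdf:about="">
    <owl:imports rdf:resource="http://example.org/ontB"/>
  </owl:Ontology>

  <!-- A new local class defined in terms of a reused class -->
  <owl:Class rdf:ID="sportsCar">
    <rdfs:subClassOf rdf:resource="&OntB;auto"/>
  </owl:Class>

  <!-- A local individual typed directly with the reused class,
       referred to through the ontB XML namespace -->
  <ontB:auto rdf:ID="myCar"/>

</rdf:RDF>
```

Because the reused term is referred to by its URI rather than copied, its axioms remain those of the original ontology and no transcription step is involved.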

Reusing existing terms that have a well-defined meaning is an important step, but it is not nearly enough. Sometimes, there are good reasons to represent the same concepts using different terms or different representational constructs. This gives rise to our next principle.

Principle 5: To facilitate semantic I&I, use OWL mapping constructs to relate terms from different ontologies.

When a term from one ontology is reused in the context of building a new ontology or extending an existing ontology, it comes "tagged" with its original namespace. This mechanism enables a single core term to be used in both ontologies, each having a different meaning (this mechanism also preserves provenance information). In the event that the meanings are the same, or closely related, it is important to specify the semantic connection between them. This is done by using various mapping constructs in OWL.

MFU: expand this section to include various use cases for when an ontology-to-ontology mapping might be needed or useful. Distinguish interoperability from integration use cases, perhaps. Then, we might want to shorten what is below, and just have a brief section about the kinds of mappings. This would be elaborated on in much more detail in the main ‘narrative’ below. Here, you would probably skip OWL syntax. I grayed out the text below, and used it as a starting point for the narrative. We need to come back to here and replace the gray text with a suitable summary of the key points. It might just be a few sentences, or something a bit more elaborate. Given this is the main meat of the note, it bears repeating for emphasis.

The simplest mapping constructs declare exact logical equivalence; these are owl:sameAs, owl:equivalentClass, and owl:equivalentProperty. For example, we might specify that OntA:car is an equivalentClass of OntB:auto, that OntA:canRunSlowly is an equivalentProperty of OntB:canJog, or that OntA:venus is the sameAs OntA:morning_star. These three core mapping constructs can also be used in conjunction with arbitrary combinations of a variety of other OWL constructs (such as class-forming operators) to create complex logical mappings between terms from different ontologies. For example, an ontology designer might say that OntA:lifeform is equivalent (owl:equivalentClass) to the union (owl:unionOf) of OntB:plant, OntB:animal and OntB:fungus.
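The lifeform mapping can be sketched in RDF/XML as follows. The entity declarations bind OntA and OntB to hypothetical example.org namespaces; in a real mapping they would be the namespaces of the two ontologies being related.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespaces for the two ontologies -->
  <!ENTITY OntA "http://example.org/ontA#">
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- OntA:lifeform has exactly the same members as the union of
       OntB:plant, OntB:animal and OntB:fungus -->
  <owl:Class rdf:about="&OntA;lifeform">
    <owl:equivalentClass>
      <owl:Class>
        <owl:unionOf rdf:parseType="Collection">
          <owl:Class rdf:about="&OntB;plant"/>
          <owl:Class rdf:about="&OntB;animal"/>
          <owl:Class rdf:about="&OntB;fungus"/>
        </owl:unionOf>
      </owl:Class>
    </owl:equivalentClass>
  </owl:Class>

</rdf:RDF>
```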

Additionally, logical connections may be specified that go beyond simple equivalence statements. If an ontology designer were not interested in fungus, he or she might say that the union of OntB:plant and OntB:animal is a subclass (rdfs:subClassOf) of OntA:lifeform. As a more complex example, an ontology designer may want to state that their car class is a subclass of the OntB:auto class and also a subclass of a new restriction class (which may restrict the range of a particular property). In this way, OWL constructs may be used to express formal logical relationships either between existing terms across multiple ontologies or between existing ontology terms and a new ontology term.
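Both of these weaker-than-equivalence mappings can be sketched in RDF/XML. As before, the example.org namespaces are hypothetical, and so are the property OntA:hasFuel and the class OntA:Gasoline, which are invented here only to give the restriction something to restrict.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespaces for the two ontologies -->
  <!ENTITY OntA "http://example.org/ontA#">
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- The union of OntB:plant and OntB:animal is a subclass of
       OntA:lifeform; nothing is claimed about other lifeforms -->
  <owl:Class>
    <owl:unionOf rdf:parseType="Collection">
      <owl:Class rdf:about="&OntB;plant"/>
      <owl:Class rdf:about="&OntB;animal"/>
    </owl:unionOf>
    <rdfs:subClassOf rdf:resource="&OntA;lifeform"/>
  </owl:Class>

  <!-- OntA:car is a kind of OntB:auto whose hasFuel values
       (hypothetical property) all belong to OntA:Gasoline -->
  <owl:Class rdf:about="&OntA;car">
    <rdfs:subClassOf rdf:resource="&OntB;auto"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="&OntA;hasFuel"/>
        <owl:allValuesFrom rdf:resource="&OntA;Gasoline"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>

</rdf:RDF>
```

Here owl:allValuesFrom is one way to restrict the range of a property locally to a class; other restriction types could be used in the same position.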

In summary, OWL mapping constructs can be used to specify the logical connections between reused classes, properties, and individuals and those in one's ontology by means of the OWL vocabulary.

In the next section of this document, we illustrate the use of the ontology mapping constructs for facilitating semantic interoperability & integration by means of a series of examples.

OWL-Based Mappings: Overview

The simplest and most explicit mapping constructs in OWL declare exact logical equivalence. For example, we might specify that the class OntA:car is equivalent to the class OntB:auto, or that the property OntA:canRunSlowly is equivalent to the property OntB:canJog. We can also specify that the individual OntA:venus is the same individual as OntA:morning_star. These are examples of three core mapping constructs in OWL that allow us to state that 1) two classes are equivalent, 2) two properties are equivalent, and 3) two individuals are identical. The OWL constructs for this are owl:equivalentClass, owl:equivalentProperty and owl:sameAs, respectively.
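The three equivalence statements can be sketched as follows. The example.org namespaces are hypothetical. The property and the individuals are written with rdf:Description because we assume that their types (object or datatype property, class membership) are already declared in the source ontologies.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespaces for the two ontologies -->
  <!ENTITY OntA "http://example.org/ontA#">
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- 1) Two classes have exactly the same members -->
  <owl:Class rdf:about="&OntA;car">
    <owl:equivalentClass rdf:resource="&OntB;auto"/>
  </owl:Class>

  <!-- 2) Two properties relate exactly the same pairs -->
  <rdf:Description rdf:about="&OntA;canRunSlowly">
    <owl:equivalentProperty rdf:resource="&OntB;canJog"/>
  </rdf:Description>

  <!-- 3) Two names denote one and the same individual
       (both names are in OntA, as in the text above) -->
  <rdf:Description rdf:about="&OntA;venus">
    <owl:sameAs rdf:resource="&OntA;morning_star"/>
  </rdf:Description>

</rdf:RDF>
```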

Often, there is no exact equivalence between two classes or two properties, but there may still be an important subclass or subproperty relationship between them that is useful to capture. For example, one might wish to create a mapping stating that the class OntA:Primate is a subclass of the class OntB:Mammal, or that the property OntA:brotherOf is a subproperty of OntB:siblingOf.
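A sketch of these two mappings in RDF/XML, again with hypothetical example.org namespaces. We assume that brotherOf and siblingOf are object properties, since they relate individuals to individuals.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespaces for the two ontologies -->
  <!ENTITY OntA "http://example.org/ontA#">
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- Every OntA:Primate is also an OntB:Mammal -->
  <owl:Class rdf:about="&OntA;Primate">
    <rdfs:subClassOf rdf:resource="&OntB;Mammal"/>
  </owl:Class>

  <!-- Whenever x is OntA:brotherOf y, x is also OntB:siblingOf y -->
  <owl:ObjectProperty rdf:about="&OntA;brotherOf">
    <rdfs:subPropertyOf rdf:resource="&OntB;siblingOf"/>
  </owl:ObjectProperty>

</rdf:RDF>
```

Note that these mappings are directional: data about primates and brothers can be carried over to Ontology B's vocabulary, but not the other way round.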

The equivalence and subclass relationships state fundamental similarities between classes. One can also make statements that clarify some of their differences. For example, one might declare that the class OntA:plant is disjoint with the class OntB:fungus. One can also say that OntA:JohnASmith is different from OntB:JohnASmith. These kinds of statements are not as obviously useful as a basis for translating data from one ontology to another; however, they specify mapping information in the sense that they relate the meaning of classes and individuals in two different ontologies.
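The two difference statements can be sketched in the same way, using owl:disjointWith and owl:differentFrom. The example.org namespaces are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespaces for the two ontologies -->
  <!ENTITY OntA "http://example.org/ontA#">
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- No individual is both an OntA:plant and an OntB:fungus -->
  <owl:Class rdf:about="&OntA;plant">
    <owl:disjointWith rdf:resource="&OntB;fungus"/>
  </owl:Class>

  <!-- The two JohnASmith names denote distinct individuals,
       despite having the same local name -->
  <rdf:Description rdf:about="&OntA;JohnASmith">
    <owl:differentFrom rdf:resource="&OntB;JohnASmith"/>
  </rdf:Description>

</rdf:RDF>
```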

The core OWL mapping constructs are owl:equivalentClass, owl:equivalentProperty, owl:sameAs, rdfs:subClassOf and rdfs:subPropertyOf (the last two being inherited from RDF Schema), because they explicitly relate classes to classes, properties to properties or individuals to individuals.

In the case of properties and individuals, equivalentProperty, subPropertyOf, sameAs, and differentFrom pretty much exhaust what can be specified in a mapping between two different ontologies. In the case of classes, there is much more. In addition to being used on their own simply to express relations between given classes, as in the above examples, the core class mapping constructs may also be used in conjunction with other OWL constructs to create more complex logical mappings between class terms from different ontologies. For example, an ontology engineer might say that the class OntA:lifeform in Ontology A is equivalent to the union of the three classes OntB:plant, OntB:animal and OntB:fungus in Ontology B. In the same way, one can also use OWL restrictions to define classes that enable one to express even more subtle class mappings. For example, we can state that the class OntA:bicycle is a subclass of the class of all OntB:landVehicle(s) which have exactly two wheels; more exactly:

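<owl:Class rdf:about="&OntA;bicycle">
  <!-- every bicycle is a land vehicle ... -->
  <rdfs:subClassOf rdf:resource="&OntB;landVehicle"/>
  <!-- ... that has exactly two values for the hasWheels property -->
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="&OntB;hasWheels"/>
      <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:cardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>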


The restriction requires that the cardinality of the OntB:hasWheels property be exactly 2.

In general, any legal combination of OWL class formation operators can be used to form arbitrarily complex class expressions to state class equivalence, disjointness or subclass relationships. More typically, one of the class expressions will be just a class name, like ‘lifeform’ in the above example. However, both of the classes that are declared to be equivalent, disjoint or in a subclass relationship may also be arbitrarily complex expressions. For example, one could state that the class formed by the intersection of the class C1 with the union of classes C2 and C3 is a subclass of (or equivalent to, or disjoint with) the class formed by taking the complement of the intersection of classes C2 and C4. We leave it as an exercise for the reader to come up with real-world examples where such expressions might be useful.[1] You can also state that any class expression is empty or non-empty. You can state that a set of classes forms a partition. Mapping statements are limited only by your imagination and the set of legal OWL constructs for specifying classes.
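The abstract C1 to C4 example can be sketched as a subclass axiom between two anonymous class expressions. The assignment of C1 and C2 to Ontology A and of C3 and C4 to Ontology B is an assumption made for this illustration, as are the example.org namespaces; the text does not say which ontology each class comes from.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!-- hypothetical namespaces for the two ontologies -->
  <!ENTITY OntA "http://example.org/ontA#">
  <!ENTITY OntB "http://example.org/ontB#">
]>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <!-- Left-hand side: C1 intersected with (C2 union C3) -->
  <owl:Class>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="&OntA;C1"/>
      <owl:Class>
        <owl:unionOf rdf:parseType="Collection">
          <owl:Class rdf:about="&OntA;C2"/>
          <owl:Class rdf:about="&OntB;C3"/>
        </owl:unionOf>
      </owl:Class>
    </owl:intersectionOf>
    <!-- Right-hand side: complement of (C2 intersected with C4) -->
    <rdfs:subClassOf>
      <owl:Class>
        <owl:complementOf>
          <owl:Class>
            <owl:intersectionOf rdf:parseType="Collection">
              <owl:Class rdf:about="&OntA;C2"/>
              <owl:Class rdf:about="&OntB;C4"/>
            </owl:intersectionOf>
          </owl:Class>
        </owl:complementOf>
      </owl:Class>
    </rdfs:subClassOf>
  </owl:Class>

</rdf:RDF>
```

Replacing rdfs:subClassOf with owl:equivalentClass or owl:disjointWith yields the other two variants mentioned in the text.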

Note that although, in principle, any OWL constructs that can be used to describe or refer to properties can be used in conjunction with equivalentProperty and subPropertyOf to create mappings between properties in different ontologies, in practice there aren’t any OWL constructs for this. Similarly, while in principle any OWL constructs that can be used to refer to individuals may be used with sameAs or differentFrom to create mappings between individuals in two different ontologies, OWL does not have any additional constructs. TBD: some examples on complex expressions using property-forming and indiv-forming opera