Cross-industry semantic interoperability, part three: The role of a top-level ontology
July 26, 2017
This multi-part series addresses the need for a single semantic data model supporting the Internet of Things (IoT) and the digital transformation of buildings, businesses, and consumers. Such a model must be simple and extensible to enable plug-and-play interoperability and universal adoption across industries.
Part two identified consortia and their approaches to application-layer interoperability. In part three, we discuss the role of a top-level ontology in solving the metadata challenge, and how elements of alternative approaches can improve scalability.
This is intended to be a living series that incorporates relevant emerging concepts and reader comments over time. The community’s participation is encouraged.
Navigate to other parts of the series here:
- Cross-industry semantic interoperability: Glossary
- Cross-industry semantic interoperability, part one
- Cross-industry semantic interoperability, part two: Application-layer standards and open-source initiatives
- Cross-industry semantic interoperability, part four: The intersection of business and device ontologies
- Cross-industry semantic interoperability, part five: Towards a common data format and API
“There are two words for everything.” – E.V. Lucas
What is an ontology?
Ontologies, as parts of science, have many faces. Originally, ontology was the part of philosophy about “being,” or the universal system of knowledge that describes objects, phenomena, and regularities of the world.
In recent years, the development of ontologies has been moving from the realm of artificial intelligence (AI) to the Semantic Web. The ontologies on the web range from categorizing general web content (such as schema.org) to categorizations of products for sale and their features (such as on amazon.com).
As a facilitator of semantic interoperability, an ontology provides a standardized classification of the concepts associated with metadata identifiers for a particular domain (e.g., healthcare). While incorporating characteristics of a taxonomy and thesaurus, an ontology uses strict semantic relationships among terms and attributes with the goal of knowledge representation in machine-readable form (Figure 15).[7]
[Figure 15 | Semantic levels]
The methodologies used to develop an ontology are critical to facilitating scalability and must consider all relevant applications. The applications this article series considers include the business and IoT use cases of five inter-related industries – Homes & Buildings, Energy, Retail, Healthcare, and Transportation & Logistics.
While syntactic languages (such as OWL, RDF, and RDFS) can be used to construct ontologies, part three of the series will focus on methodologies agnostic to any specific modeling language.
A controlled vocabulary for consistency
A controlled vocabulary is a carefully selected collection of words and phrases (i.e., terms) that are given well-defined meaning consistent across contexts. A vocabulary can be used to maintain consistency in ontology development, which defines the contextual relationships behind the terms.
All terms in a vocabulary controlled by a registration authority should have an unambiguous, non-redundant definition. If multiple terms can be used to mean the same thing, one of the terms should be identified as the preferred term in the controlled vocabulary, with the other terms listed as synonyms or aliases (as shown in Figure 16 and the Glossary of Terms).
Controlled vocabularies should provide National Language Support for global applications. Standard vocabularies representing terms within domains of knowledge are already available freely from various organizations (e.g., lov.okfn.org).
[Figure 16 | Controlled terms with aliases and translations]
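The preferred-term and alias rule described above can be sketched in a few lines of Python. This is only an illustration: the Term and ControlledVocabulary names, their fields, and the sample entry are assumptions made for this example and are not taken from any consortium's specification or from Figure 16.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A preferred term with its definition, aliases, and translations."""
    preferred: str
    definition: str
    aliases: set[str] = field(default_factory=set)
    translations: dict[str, str] = field(default_factory=dict)  # language code -> label


class ControlledVocabulary:
    """Hypothetical registry that resolves any alias to one preferred term."""

    def __init__(self) -> None:
        self._terms: dict[str, Term] = {}
        self._index: dict[str, str] = {}  # lowercased label -> preferred term

    def register(self, term: Term) -> None:
        labels = {term.preferred, *term.aliases}
        # Check every label first so a rejected term leaves the registry unchanged.
        for label in labels:
            owner = self._index.get(label.lower())
            if owner is not None and owner != term.preferred:
                raise ValueError(f"'{label}' is already registered under '{owner}'")
        for label in labels:
            self._index[label.lower()] = term.preferred
        self._terms[term.preferred] = term

    def resolve(self, label: str) -> Term:
        """Return the preferred term for a label, whether preferred or alias."""
        return self._terms[self._index[label.lower()]]


# Sample entry, invented for illustration.
vocab = ControlledVocabulary()
vocab.register(Term(
    preferred="Device",
    definition="A physical unit of equipment.",
    aliases={"Equipment", "Appliance"},
    translations={"de": "Gerät"},
))

print(vocab.resolve("equipment").preferred)  # Device
```

Rejecting a label that already belongs to another term is one simple way to keep definitions non-redundant; a real registration authority would likely apply a review process rather than a hard error.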
An object class for every thing
An ontology can provide a standardized classification of domain concepts through a collection of classes. Each class (concept) can represent a category of like things or objects which can be uniquely identified. A class is defined to reflect the attributes, restrictions, and relationships unique to its objects (instances). A class can represent physical objects such as sensors and persons as well as information objects such as business transactions [ISO 11179]. An ontology, together with a set of individual instances of classes, constitutes a knowledge base.[8]
A hierarchy for classes
Like a taxonomy, an ontology can define its classes within a hierarchical structure, which can be as deep as needed (Figure 17). A class (such as Sensor or Actuator) can be a subclass (type) of another class (Device).
[Figure 17 | Hierarchical structure of an ontology.]
All subclasses inherit the attributes of their parent class. For example, if Power Status is an attribute of the Device class, all Sensor and Actuator objects will have this attribute.
An attribute is attached at the most general class applicable to all of its objects, including subclasses. Since all classes are types of objects, the class hierarchy has one root class, Object, which comprises attributes, such as Identifier (an O-DEF classification property), that are inherited by all objects (see Figure 19).
While this methodology parallels object-oriented programming, it represents a metadata abstraction from programming. The metadata representing the ontology can be maintained in a repository (ISO 11179) completely abstracted from any programming environment.
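As a rough sketch of what such metadata might look like, the Python below stores classes as plain records and resolves inherited attributes by walking up the parent chain. The ClassRecord name and its methods are assumptions made for this example; they are not defined by ISO 11179 or by any of the consortia. Only the class and attribute names come from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClassRecord:
    """Metadata describing an ontology class (not a Python class of that name)."""
    name: str
    parent: Optional["ClassRecord"] = None
    attributes: list[str] = field(default_factory=list)

    def all_attributes(self) -> list[str]:
        """Return inherited attributes first, then those declared on this class."""
        inherited = self.parent.all_attributes() if self.parent is not None else []
        return inherited + self.attributes

    def is_a(self, ancestor: "ClassRecord") -> bool:
        """True if this class is the given class or one of its subclasses."""
        node = self
        while node is not None:
            if node is ancestor:
                return True
            node = node.parent
        return False


# Each attribute is attached at the most general class it applies to.
root = ClassRecord("Object", attributes=["Identifier", "Name"])
device = ClassRecord("Device", parent=root, attributes=["Power Status"])
sensor = ClassRecord("Sensor", parent=device)
actuator = ClassRecord("Actuator", parent=device)

print(sensor.all_attributes())  # ['Identifier', 'Name', 'Power Status']
```

Because the hierarchy is held as data rather than as Python class definitions, it could be loaded from a repository and extended without changing any program code.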
A top-level for cross-industry interop
Top-level object classes (e.g., the O-DEF core index) can facilitate the exchange of data and interoperability across different domains (e.g., buildings, retail, healthcare) since they ensure that foundational terms are used in a unified and semantically compatible manner.
The semantic data models of the consortia identified in this series include top-level classes that support their targeted industries and use cases (Figure 18).
[Figure 18 | Consortia top-level object classes.]
While terminology may differ, the consortia share many foundational concepts (classes). A “blending” of these concepts can form a top-level ontology capable of supporting industry-specific use cases and cross-industry interoperability (Figure 19).
[Figure 19 | Blended, cross-industry top-level class hierarchy.]
The Name and Description attributes of the root Object class can describe these top-level classes, and are included in the Glossary of Terms.
While Person and Organization are considered top-level classes by some consortia models (O-DEF, schema.org), they are both subclasses of a Party concept within business models (GS1 EDI, ARTS ODM). A Party class includes attributes common to both persons and organizations, and enables one class to be associated with business transactions and other relationships.[9] A Party is capable of legal ownership and can be related to an Owner Party attribute of the root Object class. A Party instance can own both tangible (vehicle) and intangible (sales order) objects.
Although not explicitly defined within these consortia, a top-level Relationship class is included in this blended approach so that many-to-many relationships can be modeled independently of any specific ontology language.
A class for an information model
An information model is itself a knowledge domain, so it can have its own ontology, which in turn can model a multi-level ontology. An Information Model top-level object class (O-DEF Information-Set) can be used to contain subclasses that define an information model (Figure 20). These include:
- Data Type
- Measurement Unit
- Attribute
- Vocabulary Term
- Role
[Figure 20 | An information model class hierarchy.]
An ontology for data types
Ontologies for data types and measurement units (such as QUDT.org) can provide foundational semantic interoperability in business and technology.
A Data Type class can be modeled as a subclass of Information Model. All data based on digital electronics is represented at the lowest level as bits (each either 0 or 1), and a Bits attribute of the Data Type class can be inherited by all subclasses. Number and String are atomic data types (direct subclasses of the Data Type class), as their values cannot be broken down into smaller parts. Integer and Float “primitives” can be defined as subclasses of a Number class (as with schema.org). Atomic and primitive data types have been defined by standards organizations (e.g., ISO/IEC 11404, W3C XML Schema), but inconsistencies among them are challenging to manage.
Additional data types (like Quantity) having unique attributes can be derived from primitive data types and defined as their subclasses. However, the use of specific primitive and derived data types has varied among programming languages and consortia data services, limiting semantic interoperability (Figure 21).
[Figure 21 | Consortia data types]
A data type for a term
A Term data type (similar to Haystack’s “Marker”) can be used by an attribute (similar to a Haystack tag) to classify an object separately or in conjunction with an object class hierarchy.
When utilized with a Controlled Vocabulary, the value of a Term attribute can represent a Term object. For example, in Figure 19 the Name attribute of the root Object class is assigned to the Term data type. The value of the Name attribute for the root Object class is related to the “Object” Term in a Controlled Vocabulary (Figure 16).
The concept of Term can also be modeled as a subclass of Information Model (Figure 20).
A data type for a relation
A Relation data type (similar to Haystack’s “Ref”) can be assigned to an attribute to denote a relationship with an object of the same or different class. For example, the Class attribute of the root Object class is assigned to the Relation data type (Figure 19). The “Within Class” attribute of the Attribute class is also assigned to the Relation data type (Figure 20). In this case, the relation represents containment of Attribute objects within a Class object.
An attribute assigned to the Relation data type should be restricted to objects of a single class, which should be the most specific subclass that properly reflects the relationship.
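One way to picture this restriction is a Relation attribute that checks the class of the object it points to. The sketch below uses the Owner Party attribute and the Party, Person, and Device classes mentioned earlier in this article; the ClassRecord, Instance, and RelationAttribute names, the check method, and the sample identifiers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassRecord:
    """Metadata for an ontology class and its parent."""
    name: str
    parent: Optional["ClassRecord"] = None

    def is_a(self, ancestor: "ClassRecord") -> bool:
        node = self
        while node is not None:
            if node is ancestor:
                return True
            node = node.parent
        return False


@dataclass(frozen=True)
class Instance:
    """An object identified within a class."""
    identifier: str
    cls: ClassRecord


@dataclass(frozen=True)
class RelationAttribute:
    """An attribute whose value must reference an object of one target class."""
    name: str
    target: ClassRecord

    def check(self, value: Instance) -> Instance:
        if not value.cls.is_a(self.target):
            raise TypeError(
                f"{self.name} expects a {self.target.name}, got a {value.cls.name}"
            )
        return value


obj = ClassRecord("Object")
party = ClassRecord("Party", obj)
person = ClassRecord("Person", party)
device = ClassRecord("Device", obj)

# Owner Party is restricted to Party, not to the broader Object class.
owner_party = RelationAttribute("Owner Party", target=party)
alice = Instance("person-001", person)
thermostat = Instance("device-042", device)

owner_party.check(alice)         # accepted: Person is a subclass of Party
# owner_party.check(thermostat)  # would raise TypeError: a Device is not a Party
```

Targeting Party rather than Object is what makes the check meaningful: the narrower the target class, the more invalid references are caught.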
Quantity data types for measurements
Business and technology depend on measured numbers, most of which have units. A Data Type ontology can define a measured Quantity (schema.org QuantitativeValue) data type as a subclass of a Number data type. Data types can also be defined for each type or “dimension” of measurement, as instances (objects) of the Quantity data type. For example, a Temperature data type (UN/CEFACT Temp-MeasureType in Figure 21) can be defined as an instance of the Quantity subclass.
Modeling a Monetary (Currency) amount as just another measurement type allows processes, including value conversions, to be normalized across all measurement types. A mechanism (similar to xe.com) can be used to retrieve dynamic value changes of a Conversion Factor (currency exchange rate) associated with a Monetary unit.
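The sketch below illustrates the idea of one conversion routine shared by physical and monetary measurements. It is an assumption-laden example: the Unit record, the convert function, and the use of a callable for the Conversion Factor are choices made here for illustration, and the exchange rates are invented numbers standing in for a live rate service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Unit:
    symbol: str
    quantity: str                 # measurement type, e.g. "Length" or "Monetary"
    factor: Callable[[], float]   # returns the current factor to the base unit


def convert(value: float, source: Unit, target: Unit) -> float:
    """Convert between two units of the same measurement type."""
    if source.quantity != target.quantity:
        raise ValueError(f"cannot convert {source.quantity} to {target.quantity}")
    return value * source.factor() / target.factor()


# Invented exchange rates relative to USD; a real system would query a rate service.
rates = {"USD": 1.0, "EUR": 1.25}

metre = Unit("m", "Length", lambda: 1.0)
kilometre = Unit("km", "Length", lambda: 1000.0)
usd = Unit("USD", "Monetary", lambda: rates["USD"])
eur = Unit("EUR", "Monetary", lambda: rates["EUR"])

print(convert(2.5, kilometre, metre))  # 2500.0
print(convert(100, eur, usd))          # 125.0

rates["EUR"] = 1.5                     # the rate changes; no unit is redefined
print(convert(100, eur, usd))          # 150.0
```

Looking the factor up at conversion time is what lets a currency behave like any other unit while its rate keeps changing.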
A class for measurement units
The most widely used system of units is the International System of Units, or SI. ISO 80000-1 further defines quantities and units of the SI and the International System of Quantities (ISQ).
A Unit class can be modeled as a subclass of Information Model. Figure 22 shows attributes (Identifier, Name, Class) of the Object class inherited by each object in the dataset. The figure also includes the attributes of the class (Unit) identified by the object’s Class attribute.
[Figure 22 | Example instances of a Unit class with Object and Unit attributes]
A Unit identifier (such as °F) paired with a quantity value in a dataset (such as Haystack tagged data) can resolve to a Quantity data type (such as Temperature) within the identified Unit object. Attributes of the Unit object can also support a unit conversion process (Figure 23).
[Figure 23 | Temperature value conversion using Unit instance with conversion attributes]
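A minimal sketch of such a conversion follows, assuming each Unit object carries a multiplicative factor and an additive offset relative to a base unit (kelvin here). The factor and offset attribute names, the Unit record, and the convert function are assumptions for this example; the actual conversion attributes shown in Figure 23 may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    identifier: str
    name: str
    quantity: str   # Quantity data type the unit resolves to
    factor: float   # hypothetical conversion attribute
    offset: float   # hypothetical conversion attribute

    def to_base(self, value: float) -> float:
        return (value + self.offset) * self.factor

    def from_base(self, value: float) -> float:
        return value / self.factor - self.offset


units = {u.identifier: u for u in (
    Unit("K", "Kelvin", "Temperature", 1.0, 0.0),
    Unit("°C", "Degree Celsius", "Temperature", 1.0, 273.15),
    Unit("°F", "Degree Fahrenheit", "Temperature", 5 / 9, 459.67),
)}


def convert(value: float, source_id: str, target_id: str) -> float:
    """Convert a value by passing through the base unit of the quantity."""
    source, target = units[source_id], units[target_id]
    if source.quantity != target.quantity:
        raise ValueError("units measure different quantities")
    return target.from_base(source.to_base(value))


print(units["°F"].quantity)                 # Temperature
print(round(convert(72.0, "°F", "°C"), 2))  # 22.22
```

Routing every conversion through a base unit means each Unit object needs only its own two attributes, rather than a factor for every other unit.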
Roles for an object
The concept of a role (such as within O-DEF) describes a function that can be performed by an object in a particular context. A Role class can be modeled as a subclass of Information Model and can include instances that apply to different object classes (Figure 24).
[Figure 24 | Example instances of a Role class with Object and Role attributes]
An instance within the Relationship class can assign an instance of a Role to an object. An object can have more than one role. For example, an instance of a Person can have Employee, Parent, and Passenger roles. An instance of a type of Device can be a Sensor and a Communicator. Many devices are intended to assume the same roles as Persons. Thus, a Role can be assigned across object classes.
Some Roles (Customer) have a corresponding Reverse Role (Vendor). When a Customer role is assigned to a Party (modeled in ARTS ODM), a corresponding Vendor role is assigned to another Party to form a trade relationship.
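Role assignment and reverse roles could be represented along the following lines. The role names come from the text above, but the Role, RoleAssignment, and RoleRegistry names, the counterpart field, and the sample identifiers are hypothetical; this is a sketch of the idea, not the O-DEF or ARTS ODM model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    name: str
    reverse: Optional[str] = None  # name of the corresponding Reverse Role, if any


@dataclass(frozen=True)
class RoleAssignment:
    """Stands in for an instance of the Relationship class."""
    object_id: str
    role: str
    counterpart_id: Optional[str] = None


class RoleRegistry:
    def __init__(self, roles: list[Role]) -> None:
        self._roles = {r.name: r for r in roles}
        self.assignments: list[RoleAssignment] = []

    def assign(self, object_id: str, role_name: str,
               counterpart_id: Optional[str] = None) -> None:
        role = self._roles[role_name]
        self.assignments.append(RoleAssignment(object_id, role.name, counterpart_id))
        # A role with a reverse also gives the counterpart the reverse role.
        if role.reverse is not None and counterpart_id is not None:
            self.assignments.append(
                RoleAssignment(counterpart_id, role.reverse, object_id)
            )

    def roles_of(self, object_id: str) -> list[str]:
        return sorted(a.role for a in self.assignments if a.object_id == object_id)


registry = RoleRegistry([
    Role("Employee"), Role("Parent"), Role("Passenger"),
    Role("Sensor"), Role("Communicator"),
    Role("Customer", reverse="Vendor"), Role("Vendor", reverse="Customer"),
])

for role_name in ("Employee", "Parent", "Passenger"):
    registry.assign("person-001", role_name)
registry.assign("device-042", "Sensor")
registry.assign("device-042", "Communicator")
registry.assign("party-a", "Customer", counterpart_id="party-b")

print(registry.roles_of("person-001"))  # ['Employee', 'Parent', 'Passenger']
print(registry.roles_of("party-b"))     # ['Vendor']
```

Keeping assignments separate from the objects themselves is what allows the same Role to apply to a Person, a Device, or a Party without changing any of those classes.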
Part four considers the intersection of business and device ontologies. Part five discusses OWL, RDF, and RDFS as approaches to metadata management and syntactical interoperability.
For term definitions, see the Glossary.
References:
7. Harpring, Patricia. “Introduction to Controlled Vocabularies.” J. Paul Getty Trust, 2010.
8. Noy, Natalya F. and McGuinness, Deborah L. “Ontology Development 101: A Guide to Creating Your First Ontology.” protege.stanford.edu, 2001.
9. Hay, David C. Data Model Patterns: Conventions of Thought. Dorset House Publishers, Inc., New York, 1996.