Information Model Development Recommendation: Data Element Repository

Revision History
Revision 0.4
Add aggregation and collection semantics of associations; add description of association inheritance
Revision 0.5 12/17/03
Change to docbook format

Abstract

This document describes the components of the data element repository (DER). The DER is the foundation of the information interoperability infrastructure. The DER contains the metadata that describes the universe of 'atomic' components of information interoperability. The metadata in the DER consists of both semantic and administrative metadata. The semantic metadata describes the meaning expected of an information interchange contract. The administrative metadata describes the registration and tracking attributes of application metadata that is needed to enable global interoperability. The DER metadata consists of the following meta-entities: entities, properties, datatypes, associations, and constraints. Supporting meta-entities include association ends and classification schemes.


Table of Contents

1. Introduction
2. The data element repository is the source of the ISE information models
3. Entities, properties, datatypes, associations and data elements
3.1. Entities
3.2. Properties
3.3. Datatype
3.4. Associations
3.5. Data elements
4. Aggregation and collection semantics of associations
5. The meta-metadata model
6. Sample DER metadata

1. Introduction

This development guide documents the process and conventions for the formulation and maintenance of the work products and models associated with the Data Element Repository.

2. The data element repository is the source of the ISE information models

ISE standards-based information interoperability builds upon the foundation of a data element repository. The models themselves are arranged in a layered approach as illustrated in Figure 1. The fundamental module of information interoperability is the context schema. A context schema defines the precise set of data elements that are required for a particular interchange. Each information interchange (or information interoperability transaction) is based upon and scoped by an information use case. The context schema completely and unambiguously defines the contract for an information interoperability transaction. The context schema comprises the minimum set of constraints that are required to accomplish the information interchange. The context schema is a conceptual model, which is materialized for a particular interchange by an instance of a representation information model. This relationship is analogous to the relationship of a resource to its representations in the WWW REST architecture.

Figure 1. Relationship of DER, context schemas and representations

The Data Element Repository defines the universe of data elements and associations that are available for the composition of context schemas. Each information use case can be implemented by means of a finite set of data elements, selected from the Data Element Repository.

The purpose of the Data Element Repository is to enable global information interoperability by supporting the sharing of information both across application domains and across organizational boundaries. Today, information interoperability is hampered not only by representational incompatibility but also by the sheer number of logical data redundancies that exist in a proliferation of point-to-point solutions. In order to reduce the number of redundancies, the DER needs to provide mechanisms by which standardized data elements can be managed, published and located. The mechanisms include globally unique names for data elements (and their constituent parts) and the capturing of more complete metadata. Even at the data element level, it is unrealistic to expect that redundancies can be totally eliminated. The DER therefore also manages synonyms, homographs and other lexical and semantic relationships among its terms. Finally, for data to be shareable, both consumers and publishers need to have a common understanding of its meaning. The DER incorporates precise natural language definitions.

3. Entities, properties, datatypes, associations and data elements

The DER is a repository that manages metadata to support information interoperability in a universe of discourse. The metadata in the DER consists of both semantic and administrative metadata. The semantic metadata describes the meaning expected of an information interchange contract. The administrative metadata describes the registration and tracking attributes of application metadata that is needed to enable global interoperability. The DER metadata consists of the following meta-entities: entities, properties, datatypes, associations, and constraints. Supporting meta-entities include association ends, traces, and classification schemes.

3.1. Entities

An entity is “a set of ideas, abstractions, or things in the real world that can be identified with explicit boundaries and meaning and whose properties and behaviors follow some rules.” [ISO-11179, 1-14] In the context of information, an entity is “a description of [a set of] objects that share the same attributes, …, relationships and semantics.” [UML1.5, 2-26]. There is a distinction between a set of entities (entity type) and a particular entity (entity instance). As with the concept of objects, there is some imprecision in the term; it is not always obvious which aspect, classification or instantiation, is more important to a particular usage. It is usually possible to determine the connotation from the context; otherwise, the term should be qualified as entity type or entity instance.

An entity embodies the abstraction mechanisms of classification and instantiation; an entity is the unit of instantiation. The first defining characteristic of an entity (instance) is that it is “a thing that can be distinctly identified.” [ER 2.2]. The second defining characteristic is that it can be classified as a distinct type. Classification is an essential characteristic because it designates the specific properties that are applicable to an instance. Without such a designation, it would not be possible to instantiate the entity. The data content of an entity (instance) encompasses one value for each property in its full entity description. There is no concept of an “abstract” DER entity; the notion of partial implementation has no meaning in the context of the DER, which essentially provides a laundry list of information components.

Although not explicit in the meta-entity xde:Entity, an entity contains a set of properties. An entity may participate in associations. The DER explicitly manages a set of entities, properties (with their datatypes), and associations. Each particular context schema designates a set of data elements and associations. The DER is comprised of the set of data elements that can correctly be inferred from the entities, properties, associations and the meta-relationships among them.

Table 1. Meta-properties of entity

xdp:identifier: The language independent globally unique identifier of an entity.
xdp:definition: Statement that expresses the essential nature of an entity and permits its differentiation from all other entities.
xdp:version: Identification of an issue of an entity specification in a series of evolving entity specifications within a Registration Authority.
xdp:synonyms: List of single word or multiword designations that differ from the given name but represent the same entity. Each is followed by a designation or description of the application environment or discipline in which a name and/or synonymous name is applied or originates from.
xdp:classificationScheme: A reference to (a) class(es) for the arrangement or division of objects into groups based on characteristics which the objects have in common, e.g., origin, composition, structure, application, function, etc.
xdp:keyWords: One or more significant words used for retrieval of entities.
xdp:responsibleOrganization: The organization or unit within an organization that is responsible for the contents of the mandatory attributes by which the entity is specified.
xdp:registrationStatus: A designation of the position in the registration life-cycle of an entity.
xdp:submittingOrganization: The organization or unit within an organization that has submitted the entity for addition, change or cancellation/withdrawal in the data element repository.

3.2. Properties

A property is “a peculiarity common to all members of an [entity type]”. [ISO11179 1.6].

In an entity instance, a property is materialized as a value without identity. A property is similar in some ways to an association, except that a property value is always contained within its entity instance. A property is final in that it imposes no further structure on the instance beyond its containment. The relationship of an instance to its property values is always one-way. Properties are unordered within an entity and accessed only by name.

Historically, a property name is local, scoped to the entity that contains it. However, in order to support global interoperability and re-use, property names in the DER shall be globally unique and persistent. A property in the DER may be used in more than one entity (type). This applies not only to types related by generalization/specialization but also to unrelated types.

A property is a dependent concept. The owning entity determines the context of the property. The definition of the property itself must be confined to the meaning of the property that is irrespective of context.

Table 2. Meta-properties of property

xdp:identifier: The language independent globally unique identifier of a property.
xdp:datatype: The xdp:identifier of the datatype that defines the value space for this property.
xdp:definition: Statement that expresses the essential nature of a property and permits its differentiation from all other properties.
xdp:version: Identification of an issue of a property specification in a series of evolving property specifications within a Registration Authority.
xdp:synonyms: List of single word or multiword designations that differ from the given name but represent the same property. Each is followed by a designation or description of the application environment or discipline in which a name and/or synonymous name is applied or originates from.
xdp:classificationScheme: A reference to (a) class(es) for the arrangement or division of objects into groups based on characteristics which the objects have in common, e.g., origin, composition, structure, application, function, etc.
xdp:keyWords: One or more significant words used for retrieval of properties.
xdp:responsibleOrganization: The organization or unit within an organization that is responsible for the contents of the mandatory attributes by which the property is specified.
xdp:registrationStatus: A designation of the position in the registration life-cycle of a property.
xdp:submittingOrganization: The organization or unit within an organization that has submitted the property for addition, change or cancellation/withdrawal in the data element repository.

3.2.1. Meta-relationships for properties: Property inheritance

There are meta-relationships that apply to properties. The property is a key constituent of the data element, and the permutations of properties with respect to the entities to which they apply constrain the inheritance choices that can be made, both in the context schemas and in subsequent representations. A property S is said to be conformant with property T if S can be substituted in any instance in which T is expected. There are two conformance meta-relationships among properties. The first is the sub-property relationship. A property may be a sub-property of some other property. A property S is a sub-property of T if any instance of S can always be substituted for an instance of T, but not vice versa. The sub-property meta-relationship is asymmetrical. If S is a sub-property of T, then T cannot be a sub-property of S. If S is a sub-property of T, the datatype of S must be conformant with the datatype of T. A sub-property need not be limited to a single super-property. If a property is a sub-property of more than one property, then its datatype must be conformant with the intersection of the datatypes of all its super-properties.

The second meta-relationship is the equivalent-property relationship. If property S is an equivalent-property-of T, then S is conformant with T and T is conformant with S. If S and T are equivalent properties, then they must have equivalent datatypes.

Among all possible properties some sets of properties are meant to convey the same ‘peculiarity’ among the entities in which they appear. A set of conformant properties constitutes such a set. Equivalent properties clearly refer to the same property; they simply provide the opportunity to accommodate alternative names for that property. In a weaker sense, a sub-property is conformant with each of its super-properties. The conformance of properties will play a prominent role in supporting various flavors of multiple inheritance in the context schemas as well as their representations.
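The conformance rules above can be illustrated with a small sketch. It is not part of the DER specification: the class name PropertyRegistry, its methods, and the property identifiers are hypothetical, and datatype conformance is deliberately left out. The sketch assumes that conformance is the reflexive, transitive closure of the sub-property and equivalent-property relationships.

```python
class PropertyRegistry:
    """Toy registry of property conformance meta-relationships (illustrative only)."""

    def __init__(self):
        self._supers = {}       # property id -> ids of its direct super-properties
        self._equivalents = {}  # property id -> ids of its equivalent properties

    def add_sub_property(self, sub, sup):
        # The relationship is asymmetrical: reject it if sup already conforms to sub.
        if self.conforms(sup, sub):
            raise ValueError(f"{sup} already conforms to {sub}")
        self._supers.setdefault(sub, set()).add(sup)

    def add_equivalent(self, a, b):
        # Equivalence is recorded in both directions.
        self._equivalents.setdefault(a, set()).add(b)
        self._equivalents.setdefault(b, set()).add(a)

    def conforms(self, s, t):
        """Return True if s can be substituted wherever t is expected."""
        seen, stack = set(), [s]
        while stack:
            current = stack.pop()
            if current == t:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self._supers.get(current, ()))
            stack.extend(self._equivalents.get(current, ()))
        return False


# Hypothetical property identifiers, used only for illustration.
registry = PropertyRegistry()
registry.add_sub_property("OuterDiameter", "Diameter")
registry.add_equivalent("Diameter", "Diametre")
```

With these registrations, OuterDiameter conforms to both Diameter and Diametre, while Diameter does not conform to OuterDiameter. A fuller implementation would also have to check the datatype rules described above.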

3.3. Datatype

A datatype (also known as a simple type) is a named, reusable value domain with a set of permissible values. As with entity, the two defining abstractions of a simple datatype are classification and instantiation. A simple datatype must be associated with a type because that is what defines the domain of possible values (value space). A simple datatype is materialized as a pure value without identity; that is what distinguishes it from an entity type.

Some sources note that a characteristic of simple datatypes is that they are unstructured and atomic. This is not, strictly speaking, a defining characteristic, because it is possible that the simple type could be parsed into smaller semantic units. This occurs in the case of a list of other simple types or a composite simple datatype (see below). It only makes sense to characterize a datatype as unstructured and atomic if the nuances of this characterization are well understood. A datatype is unstructured in the sense that it is not further divided by the structural components within its data model. For example, a text string in XML is a simple datatype because it contains no markup and, thus, contains no further structure beyond its being a sequence of characters. The simple datatype may not always be atomic semantically, but it is always atomic with respect to its data model. For example, the value of an XML attribute is, conceptually, a single unit of information, even though it may be a whitespace-separated list of values.

Although the simple datatype defines a set of permissible values, a distinction must be made between that value space and the lexical space of the datatype. It is customary that a datatype entails more than one lexical representation for a single value in its value space. For example, the values of boolean may be represented by true/false, 1/0, yes/no, etc. The trend in XML Schema has been to restrict this representational permissiveness by designating a canonical lexical representation for each value. If there is a one to many mapping from the value space to the lexical space, the definition of the simple datatype shall designate not only the value space but also its precise relationship to the lexical space.
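The distinction between value space and lexical space can be sketched as follows. The mapping below simply reuses the boolean forms mentioned in the text; it is an illustrative assumption, not the definition of any standardized datatype, and the function names are hypothetical. The first lexical form listed for each value is treated as the canonical one.

```python
# Lexical forms accepted for each value; the first form is treated as canonical.
BOOLEAN_LEXICAL_FORMS = {
    True: ("true", "1", "yes"),
    False: ("false", "0", "no"),
}

# Reverse lookup: lexical form -> value in the value space.
_LEXICAL_TO_VALUE = {
    form: value
    for value, forms in BOOLEAN_LEXICAL_FORMS.items()
    for form in forms
}


def parse_boolean(lexical):
    """Map a lexical representation to its value, ignoring surrounding whitespace."""
    try:
        return _LEXICAL_TO_VALUE[lexical.strip()]
    except KeyError:
        raise ValueError(f"not a boolean lexical form: {lexical!r}") from None


def canonical_boolean(value):
    """Return the canonical lexical representation of a boolean value."""
    return BOOLEAN_LEXICAL_FORMS[bool(value)][0]
```

Several lexical forms map to one value, and each value maps back to exactly one canonical form, which is the relationship a datatype definition in the DER is expected to state explicitly.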

Within the data model, a collection of simple types can itself be defined by a simple type. With XML the solution is that such a type consists of a whitespace-separated list. There are obvious implementation issues as well as semantic issues with such a type. These issues are addressed below.

Another form of datatype is the composite simple type. Sometimes there is an advantage to composing several semantic elements into a single datatype, for example, when making a list of the composite values; or a datatype such as xs:date, in which day, month, and year have separate meaning but are typically managed together. Another (very prevalent) example is a quantity that must be associated with a unit of measure.

There are three alternative ways that a composite simple type can be modeled in the DER: a) as a simple datatype with concomitant interpretation/parsing rules, b) as an entity type (which comes with an identifier), or c) as a separate property for each semantic atom. The DER model can accommodate any of these alternatives because a complete representation mapping from each alternative to the others can be defined (described later).
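As a rough illustration of the three alternatives, consider a quantity with a unit of measure. Everything in this sketch is an assumption made for illustration: the lexical form "12.5 mm", the Quantity class, and the property-name suffixes are hypothetical and are not defined by the DER.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Alternative (b): the composite modeled as a structure of its own."""
    value: float
    unit: str


def parse_quantity(lexical):
    """Alternative (a): one simple value plus a parsing rule, e.g. '12.5 mm'."""
    value_text, unit = lexical.split()  # raises ValueError unless exactly two tokens
    return Quantity(float(value_text), unit)


def to_properties(quantity, prefix):
    """Alternative (c): one separate property per semantic atom."""
    return {f"{prefix}Value": quantity.value, f"{prefix}Unit": quantity.unit}
```

Because each form can be converted to the others without loss, the choice among them can be deferred to the representation mapping. Note that the sketch omits the identifier that a real entity type in alternative (b) would carry.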

Guidelines for defining simple types in the DER

  • If the composite datatype is widely accepted and/or standardized (e.g., xs:date), the composite type SHOULD be used in the DER. Otherwise, an entity with an association MAY be used. (The mapping to the alternatives can be accomplished in the representation model.)

  • Simple types shall be defined and managed as XML Schema simple types with additional metadata that defines management attributes and, if needed, collection metadata.

Table 3. Meta-properties of datatype

xdp:identifier: The language independent globally unique identifier of a simple datatype.
xdp:definition: Statement that expresses the essential nature of a datatype and permits its differentiation from all other datatypes.
xdp:version: Identification of an issue of a datatype specification in a series of evolving datatype specifications within a Registration Authority.
xdp:synonyms: List of single word or multiword designations that differ from the given name but represent the same datatype. Each is followed by a designation or description of the application environment or discipline in which a name and/or synonymous name is applied or originates from.
xdp:classificationScheme: A reference to (a) class(es) for the arrangement or division of objects into groups based on characteristics which the objects have in common, e.g., origin, composition, structure, application, function, etc.
xdp:keyWords: One or more significant words used for retrieval of datatypes.
xdp:responsibleOrganization: The organization or unit within an organization that is responsible for the contents of the mandatory attributes by which the datatype is specified.
xdp:registrationStatus: A designation of the position in the registration life-cycle of a datatype.
xdp:submittingOrganization: The organization or unit within an organization that has submitted the datatype for addition, change or cancellation/withdrawal in the data element repository.
xdp:schemaLocation: List (whitespace separated) of URLs that may contain the XML schema that defines the datatype.
xdp:cardinality: If present, indicates that the datatype is a collection type. For each dimension of the collection, defines the xdp:monadType, xs:minOccurs, and xs:maxOccurs.

3.4. Associations

An association defines a semantic relationship between entity types. An instance of an association is a set of tuples relating instances of entities. Each tuple value may appear at most once.

Association is a first-class (independent) concept, just as is entity. An association is similar to a relationship, as described in the entity-relationship (E-R) model. An association differs from a relationship in that associations are relationships that may be instantiated in the infoset.

Some legacy data models do not treat association as a first-class concept. In some cases it is not possible to distinguish a property from an association. Associations have not always been explicitly designated in DED’s. Association is a first class concept in the UML metamodel.

An association must have a globally unique persistent name (just as an entity does). The association name defines the association itself. However, an association can be instantiated in either or both directions without loss of information. An association is to a link as an entity type is to an entity instance. An association and its links are similar to an entity and its instances in that a classification abstraction is instantiated, but the association/link differs since a variety of instantiation modes are possible. This makes the definition somewhat more involved. An association is bi-directional but not symmetrical; more names (beyond the association name) are needed to define the association fully.

The association name itself is globally unique, but each end of the association also needs a name (which shall be local to the association). The name of each association end may be meaningful (eg corresponding to a role name), but it does not need to be. Its main purpose is to distinguish each end. Each end also has metadata properties that apply only to that end. These properties specify the set of conditions that must be satisfied for the association to be valid.

An association shares some of the characteristics of a property, but it has more facets. An association implies a navigation path between entities that is more demanding than the simple containment relationship of an instance to its property values.

Different data models and object models have adopted different approaches regarding association metadata. The UML model supports n-ary associations; the OMG MOF model supports only binary associations, citing the rationale that the simpler form is sufficient for metamodeling. (The major difference between the MOF model and the UML metamodel is that the former is primarily a data model, while the latter is primarily an object model.)

The DER metadata model adopts the simpler approach: allowing only binary associations. As with the MOF, the goal is simplicity of the metadata model, keeping the number of basic concepts small. Binary associations can be used as building blocks to construct n-ary associations if needed. Moreover, binary associations are by far the most prevalent form of association found in information modeling.

A second form of more sophisticated association is the association entity. This concept is defined both in the E-R model (relationship entity) and in UML (association class). This construct is used when an association has properties in its own right, distinct from the properties of the associated entities. An association entity has all the characteristics of an entity, including identity. It may participate in other associations. (In UML the association class is multiply inherited from Association and Class.) Following the goal of simplicity, the DER model does not recognize the association entity as a primitive construct. An association entity should be modeled as an entity with an association for each relationship.

Figure 2. Association entity

The representation of an association entity is illustrated in Figure 2, “Association entity”. The association entity is treated like any other entity in the DER. Its associations correspond to the legs of a UML AssociationClass. The only problem is that there are more names affiliated with the association entity than with an association class. Some of the names in the association entity are generated synthetically in order to align the semantics of the concept. The AssociationClass has a class name and one (role) name for each association end. The association entity has a corresponding entity name and one (role) name for each remote association end. However, each remote association end is attached to the association entity by means of an association, which itself has a name as well as a (role) name for the attached end. The role name of the attached entity is always set to ex_assoc_entity, which signifies that the entity is an association entity. The association name is the concatenation of the association entity name and the remote role name.
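The synthetic naming rule can be sketched as follows. The function name, the dictionary keys, and the example entity names are hypothetical. The text does not say whether a separator is used in the concatenation, so the sketch assumes plain concatenation with none.

```python
ATTACHED_ROLE = "ex_assoc_entity"  # fixed role name of the attached (association entity) end


def association_entity_legs(entity_name, remote_roles):
    """Derive one binary association per leg of an association entity.

    remote_roles maps each remote role name to the entity type at that end.
    """
    return [
        {
            # Association name: association entity name + remote role name.
            "association": entity_name + role,
            "attached_entity": entity_name,
            "attached_role": ATTACHED_ROLE,
            "remote_entity": remote_type,
            "remote_role": role,
        }
        for role, remote_type in remote_roles.items()
    ]


# Hypothetical example: an Employment association entity between Company and Person.
legs = association_entity_legs(
    "Employment", {"employer": "Company", "employee": "Person"}
)
```

Under these assumptions the two legs are named Employmentemployer and Employmentemployee, and both carry ex_assoc_entity as the role name of the attached end.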

The same approach should be used to model n-ary associations, which more often than not do have their own properties and identity. The definition of the n-ary association is illustrated in Figure 3, “n-ary association”. In UML an n-ary association is represented by a diamond that connects n (where n >= 3) association ends. There is one association name and n role names. In the DER the n-ary association is modeled as an entity (i.e., an association entity). Each leg of the n-ary association is mapped to an association. Again there are more names than are absolutely needed. The standard association entity mapping of names is followed. Each leg maps to a remote role name in the association entity.

Figure 3. n-ary association

An association end has properties that control the validity of a particular link. Some of those properties define the aggregation semantics of the association end. The aggregation semantics apply to part-of associations, indicating which end is the composite and which the part. The part end may be further qualified with respect to existence dependence, shareability, and the potential for containment. The aggregation semantics are part of the DE model and are described in detail later in this part.

The metadata for an association end designates the types of entities that may appear at that end. In object systems, an association may refer to multiple types via polymorphism. Since entity inheritance is not captured in the DER metadata model, the association end must explicitly enumerate the types that may appear at each association end. The types may be related by some generalization classification, or they may be completely unrelated.
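A minimal sketch of a binary association whose ends enumerate their permitted entity types is shown below. The class names, field names, and the example identifiers are assumptions made for illustration; aggregation semantics and the other end properties are omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AssociationEnd:
    name: str                # local to the association, e.g. a role name
    entity_types: frozenset  # explicitly enumerated entity types allowed at this end


@dataclass
class Association:
    name: str                # globally unique, persistent name
    end1: AssociationEnd
    end2: AssociationEnd
    links: set = field(default_factory=set)  # instance of the association: a set of tuples

    def add_link(self, instance1, instance2):
        """Add a link; each instance is an (entity_type, instance_id) pair."""
        type1, _ = instance1
        type2, _ = instance2
        if type1 not in self.end1.entity_types or type2 not in self.end2.entity_types:
            raise ValueError("entity type not permitted at this association end")
        # Set semantics guarantee that each tuple appears at most once.
        self.links.add((instance1, instance2))


# Hypothetical association between a pipe and the fittings it feeds.
connects = Association(
    "urn:example:assoc/connects",
    AssociationEnd("source", frozenset({"Pipe"})),
    AssociationEnd("target", frozenset({"Valve", "Pump"})),
)
connects.add_link(("Pipe", "p1"), ("Valve", "v1"))
connects.add_link(("Pipe", "p1"), ("Valve", "v1"))  # duplicate tuple, ignored by the set
```

The target end lists Valve and Pump explicitly, reflecting the rule that the DER does not rely on entity inheritance to determine which types may appear at an end.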

3.4.1. Meta-relationships for associations: Association inheritance

There are meta-relationships that apply to associations. The inheritance relationship among associations parallels the inheritance relationship among properties. Eventually, both properties and associations become features in the representation schema, and they face the same issues with respect to multiple inheritance and redeclaration. An association S is said to be conformant with association T if S can be substituted in any instance in which T is expected. There are two conformance meta-relationships among associations. The first is the sub-association relationship. An association may be a sub-association of some other association. An association S is a sub-association of T if any instance of S can always be substituted for an instance of T, but not vice versa. The sub-association meta-relationship is asymmetrical. If S is a sub-association of T, then T cannot be a sub-association of S. If S is a sub-association of T, the set of instances defined by the entity types of each association end of S must be a subset of the instances defined by the entity types of the corresponding association end of T. As with properties, the inheritance of associations must ensure substitutability. The substitutability requirement holds even when a sub-association has more than one parent association. In this case the association ends of the sub-association must define a set of entity instances that is a subset of the intersection of the entity instances defined by the corresponding parent association ends.

The second meta-relationship is the equivalent-association relationship. If association S is an equivalent-association of T, then S is conformant with T and T is conformant with S. If S and T are equivalent associations, then all corresponding association ends must define the equivalent entity instance sets.

Among all possible associations some sets of associations are meant to convey the same relationship among the entities to which they refer. A set of conformant associations constitutes such a set. Equivalent associations clearly refer to the same association; they simply provide the opportunity to accommodate alternative names for that association. In a weaker sense, a sub-association is conformant with each of its super-associations. The conformance of associations will play a prominent role in supporting various flavors of multiple inheritance in the context schemas as well as their representations. Further details of association inheritance are described in Part 3.
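The subset requirement for sub-associations, including the case of several parent associations, can be approximated with set operations. This sketch is an assumption-laden simplification: it compares the enumerated entity types at each end rather than the instance sets themselves, which is only equivalent when distinct entity types denote disjoint sets of instances. The function name and the entity type names are hypothetical.

```python
def is_valid_sub_association(sub_ends, parent_ends_list):
    """Check each end of a sub-association against all of its parents.

    sub_ends is a pair of entity-type sets, one per association end.
    parent_ends_list holds one such pair per parent association, with
    ends listed in corresponding order.
    """
    if not parent_ends_list:
        return True  # no parents, nothing to violate
    for position, sub_types in enumerate(sub_ends):
        # Types allowed at this end: intersection over all parent associations.
        allowed = set.intersection(
            *(set(parent[position]) for parent in parent_ends_list)
        )
        if not set(sub_types) <= allowed:
            return False
    return True


# Hypothetical parent associations, each given as (end 1 types, end 2 types).
parents = [
    ({"Pipe", "Duct"}, {"GateValve", "BallValve"}),
    ({"Pipe"}, {"GateValve", "Pump"}),
]
```

With these parents, a sub-association with ends ({"Pipe"}, {"GateValve"}) satisfies the rule, while one with ends ({"Duct"}, {"GateValve"}) does not, because Duct is not permitted by the second parent.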

3.5. Data elements

The primary purpose of the DER is to manage associations and data elements, the key elements of the information interoperability contract. A data element (DE) is “a unit of data for which the definition, identification, representation and permissible values are specified…” [ISO11179-1]

The importance of the data element stems from the fact that it is the irreducible unit of information interoperability. “A data element then is a single unit of data that in a certain context is considered indivisible.” [ISO11179-B-1]. The context in which a data element is irreducible is the formation of a context schema, the contract between the parties for a particular information interchange. The data element is a complete, unambiguous contractual element; it is the re-usable building block from which any number of contracts can be assembled.

A data element is a unit of information, that is, data plus context. Structurally, the data element is an aggregation of an entity type and a property type (together with its datatype, which designates the representation). Both properties and datatypes, as they are represented in the DER, are independent of context. The association of a property/datatype with an entity establishes the semantic context.

Note

Some data modelers use re-named datatypes to signify contextual meaning. For example, the type string may be re-defined as a type called Identifier in order to indicate that a property with this type is to be used for identification. In the DER datatypes should be independent of implied semantics. Re-named simple types that do not define a new value space SHOULD be avoided. (The result of such a practice is the proliferation of redundant datatypes.) Instead, the data element itself, the combination of entity name and property name, should be used to establish semantic context.

The data element is uniquely and persistently identified by the identifiers for its entity and its property, each of which is itself a globally unique persistent identifier.

Data element concept

Some methodologies introduce the notion of a data element concept, which is an aggregation of an entity and property that can then be associated with different datatypes. This approach introduces aspects of representation into the DER model. After a property has been associated with an entity, there remain two open questions that must be settled before a data element instance can be created: the value space for the data element and the representation for the data element instance. There is one and only one value space associated with a particular data element. This has been a basic tenet in a number of methodologies and data models (such as E-R, UML, domain key normal form, strong static typing in programming languages). This applies even in the case of redeclared properties because the redeclared property appears in a subtype entity and, thus, in a separate data element.

For a single value domain, there may be a number of permissible representations, including variants in the type’s lexical space, alternative representations for composite datatypes, mappings to synthetic entities, and representations in alternative media. The first three representations are handled in the representation layer of the II architecture (described later). An alternative media representation should be handled at the presentation layer.

To introduce the notion of data element concept into the DE model would be a violation of the layering that is needed for information interoperability. The data element SHALL be concerned exclusively with the data model; representations SHALL be handled in the representation layer.

Naming convention

To facilitate the re-use of data elements, a naming convention is typically applied for combining the name components of a data element.

In the DER the naming convention for data elements is the concatenation of the expanded name of each component in the following order: entity, property. Component names are separated by the “#” character. The expanded name of an xs:QName is the concatenation of an open curly brace, the namespace URI, a close curly brace, and the local name.

Example 1. data element name

{urn:ise:de/entity}Pipe#{urn:ise:de/property}Diameter
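
The naming convention can be sketched in a few lines of Python. This is only an illustration: the function names expanded_name and data_element_name are hypothetical and are not part of the DER.

```python
def expanded_name(namespace_uri: str, local_name: str) -> str:
    """Return the expanded name of a QName: '{' + namespace URI + '}' + local name."""
    return "{" + namespace_uri + "}" + local_name


def data_element_name(entity: tuple, prop: tuple) -> str:
    """Concatenate the expanded entity and property names, separated by '#'.

    Each argument is a (namespace URI, local name) pair.
    """
    return expanded_name(*entity) + "#" + expanded_name(*prop)


name = data_element_name(("urn:ise:de/entity", "Pipe"),
                         ("urn:ise:de/property", "Diameter"))
print(name)  # {urn:ise:de/entity}Pipe#{urn:ise:de/property}Diameter
```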

Table 4. Meta-properties of data element

Meta-property                 Description
xdp:entity                    The first component of the language independent unique identifier of a data element.
xdp:property                  The second component of the language independent unique identifier of a data element.
xdp:name                      Single or multi word designation assigned to a data element.
xdp:definition                Statement that expresses the essential nature of a data element and permits its differentiation from all other data elements.
xdp:version                   Identification of an issue of a data element specification in a series of evolving data element specifications within a Registration Authority.
xdp:synonyms                  List of single word or multiword designations that differ from the given name but represent the same data element. Each is followed by a designation or description of the application environment or discipline in which the name and/or synonymous name is applied or from which it originates.
xdp:classificationScheme      A reference to (a) class(es) for the arrangement or division of objects into groups based on characteristics which the objects have in common, e.g., origin, composition, structure, application, function, etc.
xdp:keyWords                  One or more significant words used for retrieval of data elements.
xdp:responsibleOrganization   The organization or unit within an organization that is responsible for the contents of the mandatory attributes by which the data element is specified.
xdp:registrationStatus        A designation of the position in the registration life-cycle of a data element.
xdp:submittingOrganization    The organization or unit within an organization that has submitted the data element for addition, change or cancellation/withdrawal in the data element repository.
xdp:typeScope                 If true, designates that the data element has type scope, i.e., one instance for the entity type. If false or unset, designates that the data element has instance scope.

4. Aggregation and collection semantics of associations

Aggregation semantics

Aggregation, the part-of relationship, is supported in the DER information model. Aggregation is represented in a number of data models, though it is typically omitted from OO programming languages. It is recognized in UML as a qualifier for associations. A UML association has one of three flavors of aggregation: none, aggregate, or composite. If an association end’s aggregation attribute is set to ‘none’ (or is unset), then no aggregation semantics are implied. Otherwise, the association exhibits aggregation semantics. An aggregation association is asymmetrical: one end, the ‘whole’ end, is the owning end, and the other end, the ‘part’ end, is the owned end.

If an association end is set to ‘composite’, then strong aggregation is indicated. The association end without the designation is the whole end (the aggregate) and, consequently, the designated end, the part end, cannot be an aggregate. The part end is strongly owned by the whole end and cannot participate in any other composite. In fact, an existence dependence relationship is indicated on the part end. If the whole end is deleted, the part end will be undefined. In a local data management system, the deletion of a whole end can, in practice, result in the deletion of the part end. In the distributed universe of information interoperability, this is no longer practicable. In fact, it is not apparent from inspection of a part end whether or not the ‘whole’ end’s entity instance exists. This can be determined only by attempting to access the ‘whole’ end’s entity instance.

Finally, if an association end’s aggregation attribute is set to ‘aggregate’, then weak aggregation is indicated. The association is flagged as embodying a part-of relationship, and the part end cannot be flagged as an aggregate. Beyond that no further semantics are implied, so the designation carries very little weight.
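
The rules above can be illustrated with a small Python sketch. It is one possible reading of the text, not a normative implementation: the class and function names are hypothetical, and the check that only one end may carry a designation is an interpretation of the statement that the part end cannot itself be an aggregate.

```python
from dataclasses import dataclass
from enum import Enum


class Aggregation(Enum):
    NONE = "none"
    AGGREGATE = "aggregate"   # weak aggregation
    COMPOSITE = "composite"   # strong aggregation


@dataclass
class AssociationEnd:
    name: str
    aggregation: Aggregation = Aggregation.NONE  # unset is treated as 'none'


def part_and_whole(end_a, end_b):
    """Return (part_end, whole_end), or None if no aggregation semantics apply.

    Following the text, the end that carries the designation is the part end
    and the end without the designation is the whole end.
    """
    designated = [e for e in (end_a, end_b) if e.aggregation is not Aggregation.NONE]
    if not designated:
        return None
    if len(designated) == 2:
        # Assumed rule: the part end cannot itself be an aggregate.
        raise ValueError("only one end of an association may be designated")
    part = designated[0]
    whole = end_b if part is end_a else end_a
    return part, whole


def check_single_composite_owner(links):
    """Verify that no part instance is owned by more than one whole.

    links is an iterable of (part_id, whole_id) pairs that instantiate
    composite associations.
    """
    owner = {}
    for part_id, whole_id in links:
        if owner.setdefault(part_id, whole_id) != whole_id:
            raise ValueError(f"{part_id!r} is already owned by {owner[part_id]!r}")
```

Note that such a check can only cover the links visible in one infoset; as the text points out, in a distributed setting the existence of the whole can be confirmed only by attempting to access it.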

In XML, aggregation has gained renewed prominence. Containment (composite) semantics are an integral part of the XML data model. Nevertheless, even in XML, aggregation semantics are most apparent only in their effect on the encoding of the instance data. Aggregation semantics need be indicated only for associations, single-valued as well as multi-valued. The aggregation semantics for properties are always known implicitly to be ‘composite.’

At the schema level (M1), the only visible effects of aggregation semantics attach to a ‘composite’ association. The designation of an association as ‘composite’ indicates that a hierarchical representation is suitable for all instances and that containment can be applied to all links that instantiate the composite association. The composite association is directed and asymmetrical, and it is always possible to provide a synthetic local identifier for the children of the association. With weak or no aggregation, there is no strict guidance at the schema level. Weak aggregation gives, at most, a hint regarding possible representations, but nothing more. Containment may still be possible, but the decision now is based on the nature of the instance data. This situation was described as a single reference in an early version of the SOAP encoding. The single reference representation says that if an instance appears only once in an infoset, it may be encoded as a contained instance. It is not shared in this particular infoset but could be shared in others. In the context schema, the only values for aggregation semantics are composite or none.

Collection semantics

Collection semantics are needed to complete the definition of multi-valued attributes and associations. It is not possible to determine whether two infosets are equal unless the collection semantics for each multi-valued property and association are known. For example, if the two infosets contain a multi-valued association, an equality test cannot be confidently performed until it is known whether the collections are supposed to be ordered or not. Collection semantics are always associated with one “underlying” type or select of types. Heterogeneous collections are defined by means of substitutability or select choice. Simple valued and entity valued collections are never intermingled in a single representation.

Collection semantics are described primarily in terms of monads. There are three types of monad: bag, set, and sequence. At the implementation level, all collections can be represented by sequences. However, the monad categories distinguish between ordered and unordered collections. Sets and bags are unordered collections, without and with duplicates respectively. The monad categorization treats all ordered collections the same (as a sequence of values or instances). However, ordered collections can be further distinguished by two secondary characteristics: 1) whether holes are allowed and 2) whether duplicates are allowed. Of the eight combinations of the characteristics unordered/ordered, duplicates, and holes, six are applicable; the two combinations of an unordered collection with holes are not.

Table 5. Monad types

               ordered   duplicates   holes
set            N         N            N
bag            N         Y            N
list           Y         Y            N
array          Y         Y            Y
array-unique   Y         N            Y
list-unique    Y         N            N

The following combinations are not applicable:
               N         Y            Y
               N         N            Y
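
The following Python sketch encodes Table 5 as a lookup and shows why an equality test depends on the collection semantics. The names COLLECTION_KINDS, collection_kind, and collections_equal are hypothetical; the sketch also assumes that member values are hashable and that a hole in an ordered collection would be represented by None.

```python
from collections import Counter

# (ordered, duplicates, holes) -> collection kind, as listed in Table 5
COLLECTION_KINDS = {
    (False, False, False): "set",
    (False, True, False): "bag",
    (True, True, False): "list",
    (True, True, True): "array",
    (True, False, True): "array-unique",
    (True, False, False): "list-unique",
}


def collection_kind(ordered, duplicates, holes):
    """Return the collection kind for a combination of characteristics."""
    try:
        return COLLECTION_KINDS[(ordered, duplicates, holes)]
    except KeyError:
        raise ValueError("combination not applicable: unordered collection with holes") from None


def collections_equal(a, b, kind):
    """Compare two collections of values under the given collection semantics."""
    if kind == "set":
        return set(a) == set(b)          # order and multiplicity are ignored
    if kind == "bag":
        return Counter(a) == Counter(b)  # order is ignored, multiplicity counts
    return list(a) == list(b)            # ordered kinds: position matters


print(collections_equal([1, 2, 3], [3, 2, 1], "set"))   # True
print(collections_equal([1, 2, 3], [3, 2, 1], "list"))  # False
```

The same two value sequences compare equal as sets and unequal as lists, which is the ambiguity the collection semantics are meant to remove.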

Multi-dimensional vs nested collections

If the dimensionality of a collection is greater than one, it is necessary to define the positions of cells with respect to the dimensions. For example, the position of a cell in a two-dimensional matrix is described by a pair of indices, such as (2, 3). There are two styles of collections in entity-based information: shareable and non-shareable collections. For multi-dimensional collections, the non-shareable collection is the simpler format. It is the default case that results whenever a multi-valued attribute or association is declared without any further qualification. This implies that the collection structure itself is not shared by any other association (even though its underlying instances may be shared). A shareable collection corresponds to a nested collection. This implies that the collection structure itself may be shared by other collections. No new meta-entity constructs are needed in order to define a nested collection. A nested collection is defined as an entity, with a single association ex:contents to the collected items. The ex:contents association is taken to be a multi-dimensional collection. The underlying type of a collection may be a nested collection. This allows nesting of collections to any depth.

Cardinality needs to be defined at each dimension of the collection. The attributes minOccurs and maxOccurs are used exactly as in XML Schema. The maxOccurs value must be greater than or equal to the minOccurs value in each cardinality element. The xtc:cardinality element defines the collection semantics of an association end, of a collection-valued simple type definition, or of a nested collection entity. One xtc:cardinality element appears for each dimension of the collection; the order of the xtc:cardinality elements matches the order of the collection dimensions.
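
As a minimal sketch of the cardinality rules, the following Python code checks a list of per-dimension cardinalities and tests whether a given collection shape fits them. The function names and the representation of each xtc:cardinality element as a (minOccurs, maxOccurs) pair are assumptions made for illustration; maxOccurs may be the string "unbounded", as in XML Schema.

```python
def check_cardinalities(dimensions):
    """Validate a list of (minOccurs, maxOccurs) pairs, one per dimension.

    The list order is taken to match the order of the collection dimensions.
    """
    for index, (min_occurs, max_occurs) in enumerate(dimensions, start=1):
        if min_occurs < 0:
            raise ValueError(f"dimension {index}: minOccurs must not be negative")
        if max_occurs != "unbounded" and max_occurs < min_occurs:
            raise ValueError(
                f"dimension {index}: maxOccurs {max_occurs} is less than minOccurs {min_occurs}")


def shape_allowed(shape, dimensions):
    """Return True if each extent of shape lies within its dimension's cardinality."""
    if len(shape) != len(dimensions):
        return False
    for extent, (min_occurs, max_occurs) in zip(shape, dimensions):
        if extent < min_occurs:
            return False
        if max_occurs != "unbounded" and extent > max_occurs:
            return False
    return True


# A two-dimensional collection: any number of rows, one to three columns.
dims = [(0, "unbounded"), (1, 3)]
check_cardinalities(dims)
print(shape_allowed((2, 3), dims))  # True
print(shape_allowed((2, 4), dims))  # False
```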

5. The meta-metadata model

The DER metadata is itself data and can be defined and managed using the same tools used for instance data. It has become traditional to document and formalize metamodels in the same modeling framework as the modeled instance data. Examples of this practice include UML and the OMG Meta Object Facility (MOF); XML and XML Schema; and even the Java programming language classes and interfaces and the Java Class class. The OMG MOF is a general-purpose meta-modeling framework. Its design reflects a layered approach:

Table 6. UML/MOF Layers

Layer   UML/MOF         Information interop
M0      instance        data
M1      UML             DER/context schema/representation schema
M2      UML metamodel   meta-metadata
M3      MOF model       not defined

Layer M0 corresponds to the instances in an object system or to instance data in an information interchange. Layer M1 comprises the UML model of the classes and other artifacts of the deployed system (system metamodel). For information interoperability, this is the DER model and/or context schema and/or representation schema that defines the data to be interchanged. Layer M2 is the UML metamodel; this is the MOF definition of the metaclasses that comprise the UML. For information interoperability, this is the meta-metadata model. Layer M3 is the MOF definition of the metamodeling facility itself. This layer makes it possible to use the MOF to define other modeling languages besides the UML object modeling language. There is currently no analog to the MOF model for information interchange.

Even though it is a customary design goal to use the same framework at levels M1, M2, M3, this is not always accomplished in practice. UML/MOF is a case in point. The MOF is similar to UML, but in a number of ways it is a simplification. For instance, the MOF supports only binary associations, while the UML supports n-ary associations. The argument is that meta-metamodeling is simpler than object modeling; and, thus, binary associations satisfy all metamodeling requirements.

However, this is just a symptom of a more fundamental difference: UML is an object modeling methodology (including a large number of work products devoted to the modeling of behaviors). The MOF is primarily an information modeling methodology. The metamodel for UML is an information model, not an object model. One consequence is that fundamental differences between the MOF model and the UML metamodel are unavoidable.

The DER modeling framework deals expressly with metadata models. Its own meta-metamodel is also a metadata model. As a result it is more natural to use the same framework for both layers. The DER modeling framework is already scoped to one particular domain, information representation. No requirement has yet arisen to have a generic metadata modeling capability (as it did for the MOF). The M3 layer is not applicable for information interoperability.

6. Sample DER metadata

This section contains a sample of the metadata elements that would appear in a DER. The sample file is listed below.


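<!-- xder: namespace URI for DER elements -->
<!-- xde:  namespace URI for metadata entities -->
<!-- xdp:  namespace URI for metadata properties -->
<!-- xdea: namespace URI for metadata associations (assumed URI; not specified in this document) -->
<xder:DER
   xmlns:xder="urn:ise:ded"
   xmlns:xde="urn:ise:ded/dataElement"
   xmlns:xdp="urn:ise:ded/property"
   xmlns:xdea="urn:ise:ded/association"
>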

<xder:Entities>

   <xde:Entity xdp:identifier="ise:StructuralSystem" xdp:definition="A Structural 
   system is a type of system (or functional group). A Structural_system provides 
   information and capabilities common to all types of  Structural_system objects."
   xdp:version="1" xdp:submittingOrganization="urn:ise:der" 
   xdp:registrationStatus="approved">
      <xdea:narrowerThan>
         <xde:Entity eref="ise:System"/>
      </xdea:narrowerThan>
   </xde:Entity>

</xder:Entities>

<xder:Properties>

  <xde:Property xdp:identifier="isep:identifier" xdp:definition=" This is the 
  identification of any Enterprise Data Object (EDO)."   xdp:version="1" 
  xdp:submittingOrganization="urn:ise:der" xdp:registrationStatus="approved" 
  xdp:datatype="xs:token">
     <xdea:subPropertyOf>
        <xde:Property eref="isep:name"/>
     </xdea:subPropertyOf>
  </xde:Property>

  <xde:Property xdp:identifier="isep:identifier" xdp:definition=" This is the 
  identification of any Enterprise Data Object (EDO)."   xdp:version="1" 
  xdp:submittingOrganization="urn:ise:der" xdp:registrationStatus="approved" 
  xdp:datatype="xs:string"/>

</xder:Properties>

<xder:Datatypes>

  <xde:Datatype xdp:identifier="xs:string" xdp:definition=" XML schema primitive 
  type for string."   xdp:version="1" 
  xdp:submittingOrganization="www.w3.org" xdp:registrationStatus="approved"
  schemaLocation="http://www.w3.org/2001/XMLSchema" />

</xder:Datatypes>


<xder:DataElements>
      
 <xde:DataElement xdp:entity="ise:StructuralSystem" xdp:property="isep:identifier" 
 xdp:definition="Globally unique persistent identifier for a Structural System."/>

</xder:DataElements>


<xder:Associations>

 <xde:Association xdp:identifier="isea:StructuralSubSystem" xdp:definition=" The 
 association of a StructuralSystem and its immediate subsystems."   xdp:version="1" 
 xdp:submittingOrganization="www.w3.org" xdp:registrationStatus="approved">
     <xde:AssociationEnd xdp:name="system" xdp:type="from" 
     xdp:minOccurs="0"   xdp:maxOccurs="unbounded" xdp:monad="set" 
     xdp:aggregation="composite">
        <xdp:entities>
               <xdp:Entity_ref eref="ise:StructuralSystem"/>
        </xdp:entities>
     </xde:AssociationEnd>

     <xde:AssociationEnd xdp:entity="ise:StructuralSystem" 
        xdp:name="sub_system"   xdp:type="to" xdp:minOccurs="1" 
        xdp:maxOccurs="1" xdp:monad="set">
        <xdp:entities>
               <xdp:Entity_ref eref="ise:StructuralSystem"/>
        </xdp:entities>
     </xde:AssociationEnd>

 </xde:Association>

</xder:Associations>


</xder:DER>
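
As an illustration of how such metadata might be consumed, the following Python sketch reads a reduced, well-formed fragment of the sample with the standard xml.etree.ElementTree module and lists the data elements it contains. The embedded fragment and the function name list_data_elements are assumptions made for this sketch; the namespace URIs are the ones declared in the sample.

```python
import xml.etree.ElementTree as ET

XDE = "urn:ise:ded/dataElement"  # namespace URI for metadata entities
XDP = "urn:ise:ded/property"     # namespace URI for metadata properties

# Reduced fragment of the sample DER, kept well-formed for parsing.
SAMPLE = """\
<xder:DER xmlns:xder="urn:ise:ded"
          xmlns:xde="urn:ise:ded/dataElement"
          xmlns:xdp="urn:ise:ded/property">
  <xder:Entities>
    <xde:Entity xdp:identifier="ise:StructuralSystem" xdp:version="1"
                xdp:registrationStatus="approved"/>
  </xder:Entities>
  <xder:DataElements>
    <xde:DataElement xdp:entity="ise:StructuralSystem"
                     xdp:property="isep:identifier"/>
  </xder:DataElements>
</xder:DER>
"""


def list_data_elements(der_xml):
    """Return (entity, property) identifier pairs for every data element."""
    root = ET.fromstring(der_xml)
    pairs = []
    # ElementTree addresses namespaced names as '{namespace URI}local name'.
    for element in root.iter(f"{{{XDE}}}DataElement"):
        pairs.append((element.get(f"{{{XDP}}}entity"),
                      element.get(f"{{{XDP}}}property")))
    return pairs


print(list_data_elements(SAMPLE))
# [('ise:StructuralSystem', 'isep:identifier')]
```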