IEEE SCC32 - Standards Coordinating Committee on Intelligent Transportation Systems (ITS)

An Annotated List of Reviewed ITS Documents

A. ITS SYSTEM AND SUBSYSTEM-RELATED DOCUMENTS

1. ITS Architecture documents (softcopies):

1.1 Executive Summary [EXECSUM.DOC]

Note: A preliminary review of this document was conducted for general background purposes only. The preliminary review was deemed sufficient.

1.2 National Architecture Vision [NARVSN.DOC]

Note: A preliminary review of this document was conducted for general background purposes only. The preliminary review was deemed sufficient.

The ITS architecture is based on 29 user services, with 19 subsystems in four major areas. It includes a logical architecture, a physical architecture, and a mapping. Subsystem interconnections are modeled via data flows.

1.3 Logical Architecture [LAVOL1.DOC, La_v1c.doc, La_v2.doc, La_v3.doc]

This is an important document. Although it deals with DD/MST contents, it provides a perspective on the nature of ITS data and interchange among ITS subsystems. It is also a potential source for examples for the guidelines document. La_v3.doc contains the (full) ITS-level data dictionary.

The ITS-level DD deals with supporting subsystem-to-subsystem interfaces. This global DD serves partly for control, partly as a design specification, and partly for information or communication purposes only. Its form varies from informal (text-based glossaries) to formal (CASE models). The CASE models were produced using Cadre's TEAMWORK product, which allows export of CDIF files that can then be imported into RDBMSs such as MS Access. The overall structure is: Transaction; Data Flow; Data Element. However, the contents are not always limited to data elements. They address three views: Control, Process and Data. [Note: That implies constraints/rules, process models and data/information models.] There are 300 processes in ITS, with 80 DFDs and one global DD. There are 2500 'entries' in the DD; these define data elements that are interchanged between two or more of the processes. [Note: These are not being reviewed in detail as part of this effort, as they represent DD content, not structure. However, cursory reviews have been performed in conjunction with reviews of the ITS subsystem documents noted below, to ensure that the varying nature of ITS data is taken into account when defining the meta structures for the ITS DD standard.]
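The Transaction; Data Flow; Data Element structure described above can be pictured as a simple containment hierarchy. The following is a minimal sketch only: the class names mirror the three levels, but the fields and the sample entries (track_vehicle, vehicle_location_report, and so on) are invented for illustration and are not taken from the ITS logical architecture.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElement:
    # Lowest level: a single named item in the dictionary.
    name: str
    description: str = ""


@dataclass
class DataFlow:
    # Middle level: a named flow between processes, carrying data elements.
    name: str
    elements: List[DataElement] = field(default_factory=list)


@dataclass
class Transaction:
    # Top level: a grouping of one or more data flows.
    name: str
    flows: List[DataFlow] = field(default_factory=list)

    def element_names(self) -> List[str]:
        """Flatten the hierarchy into the names of all data elements carried."""
        return [e.name for f in self.flows for e in f.elements]


# Hypothetical entries, invented for illustration only.
flow = DataFlow(
    "vehicle_location_report",
    [DataElement("vehicle_id"), DataElement("latitude"), DataElement("longitude")],
)
txn = Transaction("track_vehicle", [flow])
print(txn.element_names())
```

A structure like this captures only the Data view; the Control and Process views mentioned above would need constraints and process models that this sketch does not attempt.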

1.4 Standards Development Plan (SDP) [primary document is SDP.DOC].

This document provides a good background on the areas for standardization within ITS and the purpose and benefits of doing standardization.

"3. A review of the Logical and Physical Architecture data dictionaries identify elements that span multiple physical interfaces and services. These elements are candidates for foundational standards."

"The approach that has garnered the most attention and analysis to date is to review each of the subsystem interfaces defined by the Physical Architecture as potential candidates for standardization."

"To provide separable sets of data elements for use by standards organizations, a set of standards packages has been developed. Eleven packages have been defined that cover most near-term applications...These packages select specific interfaces from the architecture that have common data elements. For example, map databases and location information are used in multiple places in the architecture. One would expect that they would all conform to the same standard. A standards package focuses on these data elements wherever they appear in the architecture."

Standards Requirements Packages [see 1.5 below] have been produced as part of the architecture as input to the standardization process. The SRPs contain data dictionary element (DDE) definitions and sizes. These are described as follows:

"Abstracts that describe the composite and primitive logical architecture data elements. This is a brief description of each data item and its use. For a more in depth examination of a data flow and the functions that use it, it is necessary to refer to the logical architecture documentation."

Several relevant standards (existing and in-work) are noted on pages 31-63, including the ITS subsystems to which they apply. Page 32 notes several data dictionary (content)-related standards projects.

Towards the end of the document is a table of standards projects, noting the umbrella ITS DD and MST as being of high priority and early need.

1.5 Standard Requirements Package(s) (SRPs) [SRP1 to 11.DOC, plus two other documents]

These documents are basically a repackaging of material extracted from the ITS architecture and arranged into logical groupings for standardization of short-term, high priority areas of the ITS. From the perspective of the DD and MST standards projects, little or no new information is added beyond the architecture documents (primarily the logical architecture, which is of most interest to the DD and MST projects). Data dictionary element (DDE) definitions are included in the SRPs. These obviously deal with DD and MST content issues, and can complement the logical architecture in terms of providing a sense of the nature of the DD/MST data elements and the nature of subsystem interchange, and perhaps as a source for examples for the guidelines.

1.6 Traceability Tables from Logical to Physical Architecture [TRACE.DOC].

Note: This is not really relevant, as physical architecture is not an issue here, and I did not download any documents related directly to the physical architecture.

2. Commercial Vehicle Information Systems and Networks (CVISN) Data Dictionary preliminary, 29 February 1996 - (1 hardcopy). Note: This is the one completed ITS subsystem data dictionary project.

Typical EAR (per clause 1.4.3 on section 1 page 10), supplemented with data flows (and therefore some level of processes/functions). The data flows have attributes, too. Note: Does the standard for ITS DDs need to allow mapping of CS data concepts (within an IS) to particular application systems/applications, not just processes (the processes are CS and are currently the subject of the data flows)? The relationships in the data model have the full suite of cardinalities. Emphasis is on the attributes (i.e., data elements), which they define as consisting of "a definition and the characteristics of the data." There are also codes representing types of valid contents (i.e., 'system tables' of types as values for a data element such as "Accident Parameters", which would include a one or two character code for the accident type). The tables are stated as having attributes, too. Applied naming standards to data elements. Keep basic data administration meta data such as source, format, synonyms, etc. The data element description is a text block. Used Popkin Software's CASE tool (System Architect). Didn't address composites, only atomic data elements. Note in the DD, using asterisks, whether a data element is a direct/exact use of X12 or D20, versus a specialization/modification/extension (but don't define the nature of the specialization, etc. if it is not an exact usage).

The DD contents are voluminous. There are a lot of synonyms, as well as 'repackaging' of the same basic information into slightly different data structures. There are some confusing entries. For example: CV_border_clearance_data is stated as having the purpose of conveying identification numbers for the carrier, vehicle, driver and manifest to the roadside. However, the listing of data elements/attributes for this data structure does not contain this information.

Since it is based on application systems and reports and the like, one can speculate that it reflects more of the flavor of COBOL copylib layouts than a true logical, or even integrated physical, database structure.

Data structures and data flows have a listing of data elements separated by commas or pluses at the beginning, followed by an alphabetical listing of the data elements as attributes on the data structures/data flows. I assume the sequence of the comma-delimited list is significant: do these map directly to standard message sets (at least for the data flows)?

The data models contain entities with attributes just like the data structures/data flows only without the separate delimited versus alpha listings (just a sequential listing with one data element per line). These appear to be 'chunkier', more generic entities versus the packages of more specific data implied by the data structures and data flows.

3. Advanced Public Transportation Systems (APTS) documents:

3.1 Advanced Public Transportation Systems (APTS): Evaluation Guidelines (softcopy)

Note: This document is not particularly relevant. Deals mostly with surveying and test specification, but some useful background on APTS, and the discussion of APTS data gives a flavor of the nature of the data and provides a potential source of examples from the APTS domain.

3.2 Federal Transit Administration (FTA) - National Transit Geographical Information Systems (GIS) Manual (softcopy), includes Spatial Data Features Definitions as precursor to an APTS data dictionary

This document is a good source for understanding the nature of spatial data that must be able to be described (at the meta level) and managed in various ITS DDs (although this particular application is for transit use specifically). It is based on the FGDC standard.

"Metadata describe the content, quality, contacts, conditions and other characteristics of a data set. Many information specialist say that in the computer age, data is not power, "metadata" is. More than any other standard, the metadata provides a "roadmap" to information in a data set. The metadata provides information on the organization of, maintenance of and investment in data, data catalogs, access paths, and data transfer. A metadata document on geospatial data helps people who use geospatial data find the data they need and determine how best to use that data. To this end, the Federal Geographic Data Committee (FGDC) developed a Content Standard for Digital Geospatial Metadata to facilitate access to data inventoried in the National Geospatial Data Clearinghouse. This standard provides a format to catalog information about geospatial data sets. Appendix D contains an adaptation of the Content Standard for Digital Transit Geospatial Metadata reflecting transit domain considerations. The appendix describes and explains elements of the FGDC Metadata standard, and augments the FGDC version with guidelines for transit features, themes and databases. Metadata documents created for each data set in the NTG will advance appropriate use of the data, and those created for data sets submitted to the NTG will improve the data's integration with the NTG."

It deals mostly with the schemas required to represent and interchange spatial/GIS data (i.e., it is primarily content-related as far as the DD/MST goes), but also deals to some degree with the meta structures used for those schemas.

"Referential Integrity refers to the accuracy, validity or correctness of the data in meeting the constraints and rules defined by the internal structure and content of the data. These rules apply to multiple levels of data and data base management systems. Tabular data are constrained by data domain and type rules, and data base management system rules. The relational database model must be normalized, and referential integrity constraints ensure that a unique primary key exists for each table and no foreign key is unmatched to a primary key. General integrity constraints associated with geographical data relate to data quality, such as ensuring fundamental relationships in the graph structure. For example, the sum of the degrees of the graph vertices equals twice the number of edges, where the vertex is a node and the edge is the line between two nodes."
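The two kinds of integrity rule quoted above (no unmatched foreign keys in tabular data, and the degree-sum relationship in a graph structure) can both be expressed as small checks. The sketch below is illustrative only and assumes the node degrees are stored separately from the edge list, since otherwise the degree-sum rule holds trivially. The function names and the sample stop and link identifiers are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple


def has_unique_primary_key(keys: List[Hashable]) -> bool:
    """True if no primary key value is repeated within the table."""
    return len(keys) == len(set(keys))


def unmatched_foreign_keys(
    parent_keys: List[Hashable], child_fks: List[Hashable]
) -> Set[Hashable]:
    """Return foreign key values that match no primary key in the parent table."""
    return set(child_fks) - set(parent_keys)


def degree_sum_consistent(
    node_degrees: Dict[str, int], edges: List[Tuple[str, str]]
) -> bool:
    """Check that the stored vertex degrees sum to twice the number of edges."""
    return sum(node_degrees.values()) == 2 * len(edges)


# Hypothetical data, invented for illustration only.
stop_ids = ["S1", "S2", "S3"]
route_stop_refs = ["S1", "S3", "S9"]          # "S9" has no matching stop
links = [("A", "B"), ("B", "C"), ("C", "A")]  # a triangle of three links
stored_degrees = {"A": 2, "B": 2, "C": 2}

print(has_unique_primary_key(stop_ids))
print(unmatched_foreign_keys(stop_ids, route_stop_refs))
print(degree_sum_consistent(stored_degrees, links))
```

Whether rules of this kind belong in the data dictionary itself, or only in the systems that manage the data, is the sort of meta-structure question this review is concerned with.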

Requirements to support transaction specifications are also discussed.

3.3 APTS Map Database User Requirements Specification, includes Spatial Features and Feature Types (hardcopy)

This document gives a good flavor for the disparity of data to be described and managed in a data dictionary even just within public transit. There is financial data like fare collection, and geospatial data like "where in the world is the bus stop I'm looking for." The geospatial data raises some questions about what level(s) of this they expect to put in (various) ITS data dictionaries, and whether that affects the meta-data structures and relations they need. They talk for example (on page 5) about a Geographic Layer with Spatial Object that has examples: Point, Piece, Polygon, Path and Plane; a Transit Concept Layer that has Spatial Feature with examples: Access Point, Segment, Transit Route; and a Transit Application Layer that has Included Term: Bus Stop, Fixed Route, Bicycle Path. They address Functions, as well as Features (which are entities or objects in everybody else's terminology) and Attributes, and the mapping between Functions and Features+Attributes.

3.4 Final Draft White Paper on The National ITS System Architecture: Data Dictionary for Transit Use (hardcopy).

This white paper was written by Sandia, and essentially states that the overall ITS data dictionary portion of the ITS architecture did not take into account their early work on transit-related data, and does not meet the 'requirements' for APTS. They are very implementation-oriented, and appear to want common definitions and objects that they can easily reuse during actual physical system development and implementation, particularly to support real-time messaging for PT fare collection, locationing, etc. They complain that the ITS DD (which is part of the ITS Logical Architecture) addresses only portions of the 'technical definition' of data [elements] (such as size, data rate, etc.) and nothing in the area of how data are physically transferred (media, protocol, frequency bands, software standards).

It is apparent from looking at the logical-to-physical comparison for transit-related data elements that there are major differences. One difference they point out is genericity versus specialization: they want common, general data elements at the physical level, whereas the logical architecture contains all sorts of specializations of data elements to reflect their specific use in various data flows (which correspond to detailed transactions to support particular services). There are also naming-convention problems (see page 14 for an example), and there are cases where at least some of the logical data elements that are mapped to a single, common physical data element don't appear to be a good (or at least intuitive) match at all.

4. Traffic Management Data Dictionary (TMDD)

Note: The first internal draft of the actual TMDD is due in January, but may not be available in 'public form' until March. In lieu of this material, the following TMDD-related documents in existence at this time were reviewed:

4.1 Strategic Plan for a Data Dictionary for Automated Traffic Management Systems (ATMS) - Some relevant excerpts from the Strategic Plan:

"One of the most important aspects when designing and implementing systems is the uniform understanding of the terminology being used. A data dictionary provides a unique identification and description of the data elements used in the transmission and communication of messages between computer systems. For each data element there normally is a description, size estimate, and listing and description of its critical attributes. Some dictionaries will include other features such as description of origin, timing requirements, and valid entries for the data element.

Perhaps surprisingly, there is some inconsistency as to the name of an individual data dictionary entry and at what information level it exists. At various times, these entries are referred to as data flows, messages, message sets, tables, data elements, and/or objects. It is commonly understood that a data dictionary holds data elements, but what appears to not be always understood is at what level these elements are being defined. This confusion has been recognized by others and Allan Kirson is preparing an excellent paper defining terminology and hierarchical structure for this area of message exchange. For this report, a data dictionary entry will be called a data element and will normally represent the smallest variable data unit definable. That is, a data element will normally consist at the level in a message where it cannot be further decomposed or subdivided.

The TMDD does not necessarily seek to identify data terms used only internally within a proprietary system or its database. Also, it is important to emphasize a data dictionary in itself does not seek to determine a database design, file structure or any method of internal storage in a system.

Listing of Data Element Description for Traffic Records Systems - ANSI D20.1

* Name

* Short Name

* Abbreviation

* Definition

* Sources

* Uses

* Type of Data Element (Basic or Composite)

* Type of Representation (Name, Abbreviation, Code, Numeric Value)

* Type of Character(s) (Numeric, Alphanumeric, Alphabetic, Special)

* Length (Fixed or Variable and Number of Characters)

* Synonyms

* Other Characteristics

* Source of Data Representation

* Description of Data Items (Name of Item, Abbreviation, Code, Definition)

The information in Table 1 is an example of the structure of a data dictionary. It again demonstrates that a data dictionary can generally be described as having the following features:

* A definition of the term similar to a glossary.

* A listing of the various attributes or properties of the particular data element.

* A definition of how attributes are coded and the length or size property of the coded data element."

Note: This is basic EAR, with the attributes being largely extensional and most of the semantics, including relationships and typing, buried in the text-based definition.
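To make the shape of such an entry concrete, the ANSI D20.1 description fields listed in the excerpt can be sketched as a record type. This is a rough illustration, not the D20.1 format itself: the field names are paraphrases of the list items, the Python types are assumptions, and the sample "Accident Type" entry and its codes are invented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class ElementType(Enum):
    BASIC = "Basic"
    COMPOSITE = "Composite"


@dataclass
class DataItem:
    # One permitted value of a coded data element.
    name: str
    abbreviation: str
    code: str
    definition: str


@dataclass
class ElementDescription:
    name: str
    short_name: str
    abbreviation: str
    definition: str          # free text; most of the semantics end up here
    element_type: ElementType
    representation: str      # Name, Abbreviation, Code, or Numeric Value
    character_type: str      # Numeric, Alphanumeric, Alphabetic, or Special
    fixed_length: bool
    length: int              # number of characters
    sources: List[str] = field(default_factory=list)
    uses: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)
    other_characteristics: str = ""
    representation_source: str = ""
    items: List[DataItem] = field(default_factory=list)

    def valid_codes(self) -> Set[str]:
        """The set of codes enumerated for this element."""
        return {item.code for item in self.items}


# Hypothetical entry, invented for illustration only.
entry = ElementDescription(
    name="Accident Type", short_name="AccType", abbreviation="ACTY",
    definition="Kind of accident, as classified by the reporting officer.",
    element_type=ElementType.BASIC, representation="Code",
    character_type="Numeric", fixed_length=True, length=2,
    items=[DataItem("Collision", "COL", "01", "Impact with another vehicle"),
           DataItem("Rollover", "ROL", "02", "Vehicle overturned")],
)
print(sorted(entry.valid_codes()))
```

Note that nothing in the record relates one element to another or types it beyond character class and length, which is the limitation the note above points out.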

"Focusing on purer forms of TMDD's finds they exist primarily for specific traffic management systems which have been designed and developed by major traffic systems firms. Specific examples were examined in visits to PB Farradyne, Loral AeroSys and JHK & Associates. These individual data dictionaries were developed to support the data flow with their system database. They generally are organized in a hierarchy consisting of a set of tables for some function, with the tables made up of individual definitions. In database terminology this structured definition is called a schema. In this structure, one set of tables could, for example, be grouped under "Device Modules" with individual tables prepared for detector stations, ramp control, camera control, etc. To get a sense of size, the PB Farradyne Mist System contains over 130 individual tables with each table containing from as few as two defined attributes to as many as 20. An example of the table for LINK DEF is shown in Figure 2 and is one of the longer tables with 19 entries defining specific attributes. In addition to the exact attribute name and its description (definition), each attribute is described by its data type which is based on a structured format chosen by the system designer. Examination of the data dictionary or schema for other systems shows a similar structure consisting of tables for specific message sets. For example, the schema for the Loral AeroSys system database also includes a table for LINK but is significantly larger as it contains 32 attributes which are defined. Inspection will show, however, part of the increase is due to the system designers including environmental information in their table. (Figure 3). This comparison of how similar terms are handled by different system designers provides insight into the specific work that must be performed during the TMDD development. 
First, a common set of data elements must be identified; second, the specific definition of each data element must be established; and then, third, a table or list of the necessary attributes must be established. This demonstrates that consolidation and coordination of terminology and even system structure will be a considerable part of the TMDD development activity."

Note: This brings up the need to clearly allow for ES, IS and CS in the DD, but to stress that physical implementation (IS) should not be the primary issue except in cases of very static, hardwired interchange among homogeneous systems. For performance, this will likely need to be the case, for example, for real-time messaging. But you don't get information interchange unless the CS (read, the semantics) are interchanged between the systems, too.

"The ITS data dictionary developed in the NA program is substantial and consists of approximately 2500 entries. A rough estimate is that approximately 600 of these entries are related to ATMS."

Note: The messaging format used is based on the OSI seven-layer architecture, and uses ASN.1, as noted below:

"The STMF incorporates a structure based on a standardized procedure for structuring message formats of various communications protocols known as Abstract Syntax Notation One (ASN.1). Basically, it defines data terms which it calls 'Objects' in five fields as follows:

Object name - A textual name and an identifier for the object type.

Syntax - The abstract syntax of the object, i.e. how it is built.

Definition - A textual description of the meaning of the object type.

Access - The object can read-only, read-write, write-only, or not accessible.

Status - Support is either mandatory, optional or obsolete."

Note: Again the semantics are essentially only a text-string within a message.
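The five-field object description quoted above can be sketched as follows. This is only an illustration of the field layout: the enumeration values follow the excerpt, while the class names, the helper method and the sample detectorVolume object are assumptions and are not drawn from the STMF.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"
    WRITE_ONLY = "write-only"
    NOT_ACCESSIBLE = "not-accessible"


class Status(Enum):
    MANDATORY = "mandatory"
    OPTIONAL = "optional"
    OBSOLETE = "obsolete"


@dataclass(frozen=True)
class ObjectDefinition:
    name: str        # textual name / identifier for the object type
    syntax: str      # abstract syntax, held here as ASN.1 source text
    definition: str  # free-text meaning; not machine-interpretable
    access: Access
    status: Status

    def is_writable(self) -> bool:
        """True if a remote system may set this object's value."""
        return self.access in (Access.READ_WRITE, Access.WRITE_ONLY)


# Hypothetical object, invented for illustration only.
obj = ObjectDefinition(
    name="detectorVolume",
    syntax="INTEGER (0..255)",
    definition="Vehicles counted by the detector in the last period.",
    access=Access.READ_ONLY,
    status=Status.MANDATORY,
)
print(obj.is_writable())
```

Only the syntax, access and status fields can be acted on by software; the definition field remains a text string, as the note above observes.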

4.2 Summary document (softcopy) - Brief background.

4.3 The TMDD Prototype: An Analysis of ISO 11179 for the Development of ITS Standard Data Dictionaries (dated January 27, 1997)

Note: Refer also to the analysis of the IS 11179 standard documented in the companion notes on existing and emerging standards.

5. Data for Decision Requirements for Transportation Systems (softcopy)

The need for intermodal data analysis indicated in this report is a general driver for providing a DD standard that would enable information interchange. That interchange needn't be limited to intra-ITS systems; it could also extend between those systems and the systems of related external organizations such as DoE, EPA, DoD, DoA and the Census Bureau.

"What is lacking is a systemwide framework and capacity to integrate and compare data on a more consistent basis over time to track system performance and determine where the transportation system is headed."

It was noted that more 'demand-oriented data' is needed to indicate areas of weakness and where to focus investments.

"An effective data support system has two essential components. First, the data should be organized in a framework keyed to the broad subject areas of interest. Second, analytic capability is critical to ensure that the data are translated into information that is useful for policy analysis. The latter is particularly important for understanding qualitative changes that are not readily measured or, if they do appear in time series data, are reflected too late for policy makers to take action."

"The time and cost of collecting and integrating data, as well as the need for systematic and reliable monitoring over time, work against constant modification of data bases. Thus, NTPMS is best structured not by issues, which tend to be transient, but by major attributes of the transportation system, which fall into four broad categories - supply, demand, performance, and impacts."

"At a minimum, a brief description of the data items, their sources, and methods of collection should be provided. A summary of key trends and changes in trends would also be appropriate, as would a discussion of the quality and limits of the data."

"The biggest gap in DOT's multimodal data programs is in flow data. Flow data refer to information on passenger and freight volumes from origin to final destination by trip purpose, distance, mode, and passenger and freight characteristics."

"The Statement of National Transportation Policy identified safety as the top departmental priority (DOT 1990, 7), yet the data to monitor the safety and security of the system across all transportation modes are inadequate."

Note: This is due to different levels of details, different measures, lack of correlation with volume (e.g., flow) statistics, etc. Things like performance versus condition reporting are also made difficult because of a lack of correlation and because of differing underlying assumptions used in statistical models. Are these assumptions ever part of a DD, or are they strictly part of stand-alone statistical analysis and simulation systems? If data warehousing is a technology to be used in the future, ITS DDs will need to have such information.

It was noted that a lot of the data envisioned as being collected by in-vehicle systems and related traffic control systems is intended for systems management. The DoT would also like to have access to this information for broader statistical analysis purposes. Use of data collected by the private sector is also desired.

Appendices provide a listing of existing databases, data programs and reports within the agency.

6. NATO Industrial Advisory Group, Subgroup 52 (NIAG/SG52), Allied Naval Engineering Publication (ANEP)-51, NATO Naval Combat System Information Catalogue, Volume 0 - Introduction and Volume 4 - Message Construction Standard

At the level of Volume 0, there appear to be many similarities in objectives and overall structuring of approaches between ANEP and the ITS DD standard and guidelines and MST standard. For example, the notions of 'formally and unambiguously' defining data structures and data elements and the [generic] messages they get packaged into are found in both projects. ASN.1 is the expression mechanism, with MS Access serving at least as an interim analytical tool, if not a limited data dictionary or registry [Note: Vol. 0 talks about standardization of data elements and messages, particularly looking towards genericity to reduce unnecessary differences and hence reduce ambiguity]. There are also some obvious differences in terms of the breadth and general nature of the data being interchanged; ANEP data might well be fairly analogous to the emergency and traffic management data within the ITS, but doesn't deal with commercial/business data elements, financial data feeds, etc. that are more complex and don't map as directly to the low-level basic data types of ASN.1 (e.g., Integer, Real, String, etc.).

Based on reviewing Volume 0, the most relevant detailed document to review appeared to be Volume 4: Message Construction Standard. This was stated as providing a "formal and unambiguous syntax for the definition of messages and data structures (components of messages)". Some of the basic data elements are also stated as being 'formally defined'. However, Volume 4 itself makes it clear that 'formally defined' is really only 'formally syntactically defined'. The basic data types are really only the eight most primitive ASN.1 data types (i.e., they don't really even make use of the full set of ASN.1 basic types). Additionally, while ASN.1 may be formal and unambiguous in its syntax (since the syntax and at least aspects of the associated grammar are defined in specifications that are well-controlled standards documents), that does not in any way mean that the [semantic] definitions one specifies using ASN.1 are necessarily formal or unambiguous. That would be presuming the nature of content based on the nature of its form. "C" is a formal and unambiguously defined programming language, but that does not mean someone cannot write bad or completely nonsensical programs in it, and while they may compile and execute, they don't produce any meaningful output. Saying, for example, that Message 37, Incident, is a set with members Incident Type (VisibleString), Date (VisibleString -- YYYYMMDD), ShipNumber (Integer), NatureOfIncident (VisibleString) and Report (Sequence of Events (VisibleString)) doesn't tell a human user explicitly what an incident is, what a ship is, what's an acceptable description of the nature of an incident, or what constitutes an event versus a part of an event or a description of an event.
A message of this nature can be parsed and read into a receiving application, but unless that application already 'knows' what those data elements mean and has the same expected meanings for them as the sending application, there is no assurance of nonambiguity. Given that, appropriate questions to ask would include: do these standards at least include a set of procedures for defining data elements in a consistent manner using, for example, structured English; do they have a set of real 'building blocks'? In this case, the answers seem to be 'no', and that is apparently sufficient in this case, as the messages are simple enough and the data structures and data elements are at a low enough level that a direct mapping to ASN.1 basic data types provides recipients of the message all that is necessary (in this case, syntax) to act upon the message or otherwise make use of it in the receiving application's context.
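The gap between syntactic and semantic definition can be illustrated with a small checker for the Message 37 structure described above. This is a sketch under stated assumptions: it uses a plain Python dictionary rather than a real ASN.1 encoding, the function names are hypothetical, and treating the YYYYMMDD comment as an eight-digit pattern is an added assumption, since an ASN.1 comment constrains nothing by itself.

```python
import re
from typing import Any, Dict

FIELDS = {"IncidentType", "Date", "ShipNumber", "NatureOfIncident", "Report"}


def is_visible_string(value: Any) -> bool:
    # ASN.1 VisibleString: printable ASCII characters and space (0x20..0x7E).
    return isinstance(value, str) and all(0x20 <= ord(c) <= 0x7E for c in value)


def is_syntactically_valid(msg: Dict[str, Any]) -> bool:
    """Check field presence and basic types only; meaning is never examined."""
    return (
        set(msg) == FIELDS
        and is_visible_string(msg["IncidentType"])
        and is_visible_string(msg["Date"])
        and re.fullmatch(r"[0-9]{8}", msg["Date"]) is not None
        and isinstance(msg["ShipNumber"], int)
        and not isinstance(msg["ShipNumber"], bool)
        and is_visible_string(msg["NatureOfIncident"])
        and isinstance(msg["Report"], list)
        and all(is_visible_string(event) for event in msg["Report"])
    )


# A message that satisfies every syntactic rule yet conveys nothing usable.
nonsense = {
    "IncidentType": "purple",
    "Date": "99999999",
    "ShipNumber": -42,
    "NatureOfIncident": "???",
    "Report": ["", "asdf"],
}
print(is_syntactically_valid(nonsense))
```

The checker accepts the nonsense message because every rule it can apply is about form. Rejecting it would require shared definitions of what an incident, a ship or an event is, which is exactly what the syntax alone does not supply.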

