The following are my notes from the 4 February 1997 working group meeting held in
D.C. I am distributing them in this forum to serve as memory-joggers as you
prepare inputs and comments for the next meeting. They are NOT intended to
serve as official meeting minutes; those will be distributed by Charlie
Herget, IEEE ITS DD & MST Project Manager.
One of the initial tasks for which we are supporting IEEE as consultants
involves framing the major technical (architectural, process, etc.) issues
and approaches, as drivers for the development of the ITS DD standard and
guidelines. I recorded these notes to support that process. Additionally, I
see the notes as stimulating discussions in this electronic forum and
aiding in the development of inputs to the next WG meeting. If you have
comments, or feel that any point in the notes does not accurately reflect
the discussions at the meeting, please let me know and I will revise the
notes accordingly.
Tony Sarris
Ontek Corporation
IEEE ITS DD & MST Contractor
tony@ontek.com
Introduction - Valerie Shuman
IEEE ITS DD & MST Working Group Chair
IEEE ITS DD and MST Objectives, Preliminary Findings and Topics
for Discussion - Tony Sarris and Burt Parker
See copies of presentation handout.
EPA Data Registry- Bruce Bargmeyer and Joe Leahy
Currently 5000 legacy data elements. Not much work has been done to date on
analyzing and reconciling those data elements, so they are not really data
concepts and the registry currently does not contain much of a conceptual
schema. There are ca. 100 data concepts such as Date, Longitude, etc.
Definitions are given only in natural language. Most of the meta data is of
a data administrative nature.
Mike Schagrin (DoT) emphasized DoT's interest in promoting harmonization
and reuse of data concepts. He wondered why reuse appeared to be noted as
'optional' on one of the slides. It was explained that the word 'optional'
referred not to the concept of reuse in general but rather to a specific
bullet point which stated that one objective for the ITS DD standard is to
enable any ITS subsystem data dictionary to "describe locally the nature
and structure of ITS subsystem data and processes and promote reuse". This
would be in addition to the reuse that would come inherently from the
description in any ITS data dictionary of common, globally-used data
concepts. This slide was based on the output of the kick-off workshop last
November, which noted a consensus position that information interchange
among ITS subsystems (via messages) was the primary objective of the DD
project, but that the various subsystem DDs might also want to use the same
approaches locally to enable easier harmonization across subsystems and
promote additional reuse.
There was considerable discussion of the issues surrounding the notion of a
'central registry'. Many of the attendees seemed at least willing to
consider the pros and cons of the idea. The most skeptical was the public
transit (APTS) subsystem representative. Later discussions indicated that
the primary concern of the APTS group was related to authority/control
issues; there was a strong desire on the part of APTS to retain control
over those data concepts for which they are the producer/owner or in which
they are at least a large stakeholder. Other subsystem groups expressed
concern about this, too, but not to as great a degree. There was also some
concern about the size of the effort if all ITS data elements/data concepts
have to be put into a registry.
1. Central Registry [Note: the word 'central' was found to embed many
different notions, such as physical implementation and control/authority,
and was dropped to avoid confusion at this point]
2. Common vs. specialized data: Common data include general concepts like
Vehicle, Customer and Link, and core data elements like Location, Color and
Identifier. Specialized data are related specifically to some subsystem
application context.
3. Global vs. local usage of data: Global data is used by two or
more subsystems (although one subsystem may be the primary or sole
originator of the actual data). Local data is used only within a single
subsystem/application context, or at least within a small number of
applications.
4. Authority (data ownership and stewardship): This deals with who has
responsibility for the data. It may be exclusively the originating party,
or other subsystems/applications may also have a role if they are important
stakeholders in the data (for example, if the data originates elsewhere,
but is used heavily in some other subsystem).
5. Distribution of physical aspects of a registry and related data
(database) management in a distributed DBMS environment
6. Enforced (mandatory) standardization
- format (syntax), including possible naming conventions
- meaning (semantics)
7. Nature of usage: What is the data used for, e.g., statistical analysis,
reporting, control, business/financial accounting or other related
processing?
Central Registry [Note: the word 'central' was found to embed many
different notions, such as physical implementation and control/authority,
and was dropped to avoid confusion at this point]
Q: Does it need to be a single, physical artifact?
A: No, it is possibly a 'central' logical rather than physical registry, but
this brings up the distributed data mgmt. issues noted above.
1. What is this? [Is it an actual DD with all the meta data, perhaps
replicated, or an index into the actual DDs (perhaps with some meta data
directly residing here)?]
2. Where is it?
3. Who operates it?
4. Who pays for it?
5. Who decides the content and how (i.e., on what basis)?
6. What tool(s) might be used to support its creation and on-going
use?
In any case, this registry must follow the ITS DD standard.
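One of the options raised under question 1, a registry that acts as an index
into the actual subsystem DDs, might look roughly like the sketch below. This
is only an illustration of the idea, not a proposed design; the class name
RegistryIndex, its methods and the sample entries are all hypothetical.

```python
from collections import defaultdict


class RegistryIndex:
    """Hypothetical index: records which subsystem DDs hold a data concept.

    The meta data itself stays in the subsystem DDs; the index only
    points to them.
    """

    def __init__(self):
        self._holders = defaultdict(set)

    def register(self, concept, subsystem_dd):
        # Normalize case so 'Vehicle' and 'vehicle' share one index entry.
        self._holders[concept.lower()].add(subsystem_dd)

    def holders(self, concept):
        # Sorted list of subsystem DDs describing the concept (empty if none).
        return sorted(self._holders.get(concept.lower(), set()))


# Invented sample entries, for illustration only.
index = RegistryIndex()
index.register("Vehicle", "TMDD")
index.register("Vehicle", "APTS")
index.register("Link", "TMDD")
```

With such an index, a subsystem that wants to add a data concept could first
call index.holders(...) to see whether it is already described elsewhere.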
1. Common data and global usage: "I probably never really had control, but
if I thought I did, I give it up now." Things like geospatial meta data for
which there might be standards even outside the ITS domain. This category
could also include very general ITS domain concepts like Vehicle
and Link.
2. Specialized and global usage: "I probably have control currently, but I
recognize my data is used by other subsystems, so I may be willing to give
up some control, in some cases at least." Examples might be emergency data
that gets used within the ATIS and APTS subsystems.
3. Specialized and local usage: "I have control, but no one else cares
normally anyway." The one notable exception is to promote reuse. When a
subsystem wants to add a new data concept, they should be able to check to
see if it, or something similar or related to it, is already in use in some
other subsystem. What might be local and specialized at one point in time,
might become global and common at some other time.
See also Bob Barrett's graphic.
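The three categories above combine the common/specialized and global/local
distinctions. A minimal sketch of that combination follows, assuming 'global'
means use by two or more subsystems as stated in the issues list. The function
names are hypothetical, and the common-but-local case is marked as not covered
because it was not discussed as a category.

```python
def is_global(using_subsystems):
    """Global data is used by two or more subsystems."""
    return len(set(using_subsystems)) >= 2


def authority_category(is_common, using_subsystems):
    """Map a data concept onto the three categories discussed above."""
    if is_global(using_subsystems):
        return "common/global" if is_common else "specialized/global"
    if is_common:
        # Common data with local-only usage was not discussed as a category.
        return "not covered"
    return "specialized/local"
```

Because usage can change over time, the category would be recomputed from the
current list of using subsystems rather than stored permanently.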
Note: Of the TMDD, APTS and ATIS subsystems, no one felt like they had the
necessary supporting data to give 'guesstimates' of the percentages for each
of these categories (although in the case of ATIS, Bob noted that they
produce very little data of their own; they basically repackage other data
into 'messages' that are useful to their domain's service/functional
needs).
VOTES [see Minutes for the official voting results]
1. Do we need a shared data dictionary: Yes.
2. How should that be physically realized (not at detailed level, but in
the sense of a general architectural approach)?
A. Interim: Do we want a single physical registry and should its structure
and guidelines for use (the latter being its 'process' per Andy) be based on
IS 11179? Yes. The purpose is just to get some experience attempting this
sort of thing. It may help with the identification of requirements and the
testing of approaches. It may also help provide a better understanding of
the nature of ITS meta data and inter-relationships among that meta data.
Stratification/classification of the meta data was also noted as an
important requirement, and this effort may provide some insight into needs
in that area.
a. What tool should be used to support this? The goal is to be 'tool
neutral' in the sense of not appearing to favor a single vendor's tool, but
at the same time to make the existing meta data as visible and accessible
as possible, as quickly as possible, to help in this early analysis phase.
The analytical support tool should deal with: data structures, data
classification and the process of data analysis and registration. It was
noted that classification is not really part of the scope of the DD
standard or guidelines, although the DD meta structures should support
classification schemas, including various kinds of them (see also Paula
Okunieff's APTS presentation).
There was discussion of both the EPA Registry and MS Access. Bruce
Eisenhart and the Nat. Arch. team noted that they are using MS Access, as
are TMDD and APTS, and they have funding to support some work in this area.
Andy Schoka noted a strong preference for the EPA Registry over MS Access.
The group noted that both tools could support the application of 11179, and
both are based on COTS relational DBMS technology. MS Access is cheaper, as
the EPA Registry is built on an Oracle server and runs PowerBuilder as its
GUI.
B. Longer-Run: Need more time to decide. Burt Parker noted that this
affects the guidelines of use, although it was felt that the same meta
structures should/could be used either way, so the DD standard itself would
likely be the same in either case. Burt agreed to draft a white-paper
discussing the issues and options related to this topic area. Questions
related to this include:
- Does it include all data? Some inclusion may be by reference; e.g.,
spatial meta data might be handled by a listing of the data element and a
pointer to FGDC, rather than incorporating/replicating all the FGDC
material into this registry directly. What about all the local DD meta
data: could it be handled by a pointer, too, or should it be replicated?
This has an impact on the guidelines due to distribution and associated
procedural/administration issues as noted above.
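The difference between inclusion by reference and inclusion by replication can
be pictured with a small sketch. The RegistryEntry type, its field names and
the two sample entries are assumptions made for illustration; nothing here
reflects an agreed structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegistryEntry:
    name: str
    # Replicated meta data held directly in the registry, if any.
    definition: Optional[str] = None
    # Name of the external source that holds the meta data, if any.
    pointer: Optional[str] = None

    def is_by_reference(self):
        # By reference: listed here, but defined somewhere else.
        return self.definition is None and self.pointer is not None


# Invented sample entries: one listed by pointer, one replicated.
entries = [
    RegistryEntry("Longitude", pointer="FGDC"),
    RegistryEntry("Vehicle", definition="(placeholder definition text)"),
]
by_reference = [e.name for e in entries if e.is_by_reference()]
```

The same entry structure serves both cases, which is consistent with the view
that the DD standard could stay the same while the guidelines differ.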
Current DD effort encompasses about 500 data elements. They are using MS
Access. They have looked at IS 11179 and have noted for interim work a
subset of meta attributes that they feel are most applicable to their
current effort (see separate TMDD document).
The TMDD 'national' DD is a real DD with meta level attributes and
definitions (both based on IS 11179), whereas there are many specific
TM-related application DBMSs. Many or most of these claim to have their own
[albeit limited] data dictionaries. For example, if the application DBMS is
Oracle RDBMS-based, the Oracle schema serves as a limited data
dictionary.
Synonyms in the TMDD case are often not really referring to the exact same
data concepts; often several apparent synonyms are all slight variations on
the same [implicit] general data concept.
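One way to picture this: rather than treating apparent synonyms as
interchangeable names, each element could be recorded as a variation on an
explicit general concept. The sketch below assumes that approach; the element
names, the concept name and the variation notes are all invented.

```python
from collections import defaultdict

# element name -> (general concept, note on how the element varies from it)
# All names and notes are invented for illustration.
apparent_synonyms = {
    "link_speed": ("Speed", "average over a link"),
    "spot_speed": ("Speed", "measured at a single detector"),
    "vehicle_speed": ("Speed", "reported by one vehicle"),
}


def group_by_concept(elements):
    """Group element names under the general concept they are variations of."""
    groups = defaultdict(list)
    for name, (concept, _variation) in elements.items():
        groups[concept].append(name)
    return {concept: sorted(names) for concept, names in groups.items()}


grouped = group_by_concept(apparent_synonyms)
```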
Some past or current efforts are addressing the Conceptual Schema: ISO
TC204 APTS component, the APTS NTGIS aspect, and the APTS portion of the
Nat. Arch.
Some past or current efforts are addressing the Logical [External] Schema:
TCIP classification work and the APTS DD work. These address the
application/user view.
Some past or current efforts are addressing the Physical [Internal] Schema:
SAE J1587, Datax EDIFACT, TCIP, NTCI/Geospatial transfer. These address the
transfer, storage and/or implementation view.
Then there are specific applications, which deal with the instances of the
data: AVL, CAD, scheduling, etc.
There are two approaches to interchange:
- Translation, i.e., interchange which is based on a direct mapping.
Although specific criteria to identify what data is best suited to this
approach have not been developed yet, this would largely be common data. The
criteria could be stated in general as follows: fairly static; important
(i.e., where you can't afford any question of ambiguity and you are willing
to enforce standardization to ensure consistency; in other words, by
agreeing to deliberately reduce variability).
- Reconciliation, i.e., interchange which requires some reconciliation of
differences in the meaning/semantics.
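The contrast between the two approaches can be sketched as follows.
Translation is shown as a fixed lookup table; reconciliation is shown as a
rule that has to account for a difference in meaning, here assumed to be a
difference in units. The code values, function names and the choice of speed
as an example are all hypothetical.

```python
# Translation: a fixed, agreed mapping between two sets of codes.
COLOR_CODES = {"RD": "red", "BL": "blue"}  # invented sample values


def translate(code):
    # A direct lookup; an unmapped code raises KeyError rather than guessing.
    return COLOR_CODES[code]


def reconcile_speed(value, unit):
    """Reconciliation: resolve an assumed difference in units so the
    receiver always gets kilometres per hour."""
    if unit == "km/h":
        return value
    if unit == "mph":
        return value * 1.609344  # international mile in kilometres
    raise ValueError(f"no reconciliation rule for unit {unit!r}")
```

The lookup table only works if both parties have agreed on the codes in
advance, which matches the point that translation suits fairly static data
where standardization is enforced.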
They use multiple classification schemes for APTS data concepts and they
therefore need the ITS DD meta structures to be able to support this
requirement.
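As a rough illustration of this requirement, a meta structure could let each
data concept carry one category per classification scheme. The scheme names,
categories and concepts below are invented, and the functions are hypothetical
helpers rather than anything presented at the meeting.

```python
from collections import defaultdict

# concept -> {scheme name -> category}; all values are invented samples.
classifications = defaultdict(dict)


def classify(concept, scheme, category):
    # A concept may be classified independently under any number of schemes.
    classifications[concept][scheme] = category


def concepts_in(scheme, category):
    # Sorted names of concepts filed under `category` in the given scheme.
    return sorted(
        concept
        for concept, schemes in classifications.items()
        if schemes.get(scheme) == category
    )


classify("Stop", "by-function", "scheduling")
classify("Stop", "by-geography", "point")
classify("Route", "by-function", "scheduling")
```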
ATIS may be unique among ITS subsystems in that few, if any, data
elements/data concepts are specialized to, and created and used locally
for, ATIS only. ATIS basically involves application of other subsystems'
data elements/data concepts (from their DDs) through repackaging of that
data into ATIS-specific messages. The exception might be algorithms for
statistical calculations against other subsystems' data (per Bob and Ray
Starsman). In a broad sense this is part of the packaging of other data
into ATIS messages, but the algorithms for the calculations are ATIS data
concepts that may need to be kept in the ATIS DD.
Refer also to Bob's graphic showing overlap of TMDD, APTS and
ATIS data.
There are 20 core messages in the national architecture, supported at the
detailed level by some 2500 data elements contained in 80 major
data flows.
Nat. Arch. and CVISN vs. TMC, TME, ATIS, APTS, EMERG, TMDD
- The National Architecture DD, and the CVISN subsystem DD (the one
subsystem DD which has been completed already), as well as some other more
general data sources (DSRC and NTCIP Object Defs) are being used as sources
for more specific DDs that are being developed or proposed for development.
For example, a number of specific standard EDI DDs are being produced
relating to commercial vehicles and associated electronic commerce. This is
at the level of DD content (i.e., data elements/data concepts in the DDs),
not structure.
The next meeting will still be oriented to gathering requirements. It will
also begin to address MST issues. ASN.1 and related matters will be part of
discussions in this topic area (Tom Kurihara and Tony Sarris are to supply
some input for this discussion).
Andy Schoka (with input from Paul Hawes, Tom Kurihara and Allan Kirson and
anyone else who wants to provide input) is the focal point for terminology
and TC204 liaison issues.
Bruce Eisenhart (with input from Paula Okunieff and Tony Sarris) is the
focal point for classification/taxonomic inputs.