Unified Framework for Ontology
Folks,
I made a proposal to the ONTAC working group for
a unified framework (with the acronym UF instead
of UFO) that could serve as a basis for cooperation
among various ontology projects.
Unlike SUO, which proposed to start with an upper
level, I believe that the experience over the past
five years has demonstrated that the upper level
is the most controversial and the most difficult
level to address if we are ever to have any hope
of consensus.
Following are some notes I sent to the ONTAC WG.
The first note is the most recent one. After that
are excerpts from earlier notes -- with deletions
to avoid redundancy.
John Sowa
_________________________________________________
Dear Matthew and Barry,
For any of us who have been involved with SUO,
this discussion of ISO 15926 and BFO looks like
deja vu all over again.
Anybody who has invested many person years of effort
in developing and using an ontology has no intention
of abandoning it in favor of any other that may be
proposed as a standard.
Cyc has been in development for over 21 years, and
it has a large number of users. Furthermore, they
have contributed OpenCyc as a large subset that has
attracted many more users. There are also projects
like SUMO and others, which also have invested many
person years of effort.
The developers and the users of these projects are
not going to abandon their projects and switch to
anything new except under extreme duress. Yet these
same people have an enormous amount of valuable
experience. Without their active participation,
any new project would be very much impoverished,
and it would just appear to be a YACO (Yet Another
Competing Ontology).
Yet there is one very bright light on the horizon,
which nobody has considered a competitor: WordNet.
The developers of Cyc, SUMO, and many others have
ignored the WordNet upper levels, but they have
made the effort of demonstrating that the categories
of their ontologies can be aligned with the synsets
of WordNet.
That is why I proposed a unified framework that would
be very similar to a cleaned up WordNet: very few
axioms, no upper level of any kind, and most importantly
*neutral* with respect to any and every major ontology
that has been under development. By neutral with respect
to any ontology X, I mean the following:
1. Little or no adjustment required. Some alignment,
similar to the work done to align Cyc and SUMO with
WordNet, may be necessary.
2. But after that alignment has been performed, any subset
of UF could be imported into X without causing any
inconsistency with any of the axioms already in X.
3. Points #1 and #2 imply that the axioms of UF cannot
be very detailed. In particular, it cannot have much,
if any, of an upper level. It will be and must be
*impossible* to do detailed reasoning about any
major topic using only the UF axioms by themselves.
In short, UF would be very rich in categories, but very
poor in axioms, other than saying A is a subtype of B, an
instance of B, or a part of B (and perhaps a few others).
The axiom that type-subtype is transitive would be in UF,
but there would be no commitment to saying that part-whole
would or would not be transitive.
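A minimal Python sketch of such a store may make the point concrete. It is
only an illustration: the class name, the method names, and the sample terms
are all hypothetical, and the rule that instances inherit along subtype links
is an assumption of the sketch. Subtype queries follow links transitively;
part-whole queries report only what was asserted.

```python
class Framework:
    """A toy store for the three kinds of links; all names are hypothetical."""

    def __init__(self):
        self.supertypes = {}  # type -> set of directly asserted supertypes
        self.types_of = {}    # individual -> set of directly asserted types
        self.wholes = {}      # part -> set of directly asserted wholes

    def add_subtype(self, sub, sup):
        self.supertypes.setdefault(sub, set()).add(sup)

    def add_instance(self, individual, typ):
        self.types_of.setdefault(individual, set()).add(typ)

    def add_part(self, part, whole):
        self.wholes.setdefault(part, set()).add(whole)

    def is_subtype(self, sub, sup):
        """Transitive (and reflexive): walk subtype links upward from sub."""
        stack, seen = [sub], set()
        while stack:
            t = stack.pop()
            if t == sup:
                return True
            if t not in seen:
                seen.add(t)
                stack.extend(self.supertypes.get(t, ()))
        return False

    def is_instance(self, individual, typ):
        """Assumption: an instance of a type is an instance of its supertypes."""
        return any(self.is_subtype(t, typ)
                   for t in self.types_of.get(individual, ()))

    def is_part(self, part, whole):
        """Only asserted links are reported; no transitivity is assumed."""
        return whole in self.wholes.get(part, ())


uf = Framework()
uf.add_subtype("Poodle", "Dog")
uf.add_subtype("Dog", "Animal")
uf.add_instance("Fido", "Poodle")
uf.add_part("Finger", "Hand")
uf.add_part("Hand", "Arm")

print(uf.is_subtype("Poodle", "Animal"))  # True, by transitivity
print(uf.is_instance("Fido", "Animal"))   # True, under the stated assumption
print(uf.is_part("Finger", "Arm"))        # False: never asserted, never inferred
```

A system that wants transitive part-whole would add that axiom on its own
side, in its own module, without changing the shared store.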
Barry suggested that BFO be used as a basis for UF.
That may be OK, provided that none of the developers
of any of the other major projects feel threatened.
In other words, any axioms in BFO that conflict with
Cyc, SUMO, ISO 15926 or any other major project must
be omitted from UF.
That requirement does not imply that anybody's pet
axioms will be thrown away. BFO would still have all
the axioms it started with, and it would have the
advantage of already being totally aligned with UF.
Anybody who wanted to use the BFO axioms to do any
reasoning would not use UF alone. Instead, they
would use BFO+UF. Similarly, people could use Cyc+UF,
SUMO+UF, or ISO15926+UF without making their previous
applications obsolete.
That is neutrality: all existing ontologies would be
on an equal footing, and nobody who has an application
that uses any of them would have to make any major
adjustments (other than some renaming, if necessary)
in order to add any or all of UF to their system.
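One way to picture the neutrality test is as a mechanical check run before
any import. The Python sketch below is an illustration under stated
assumptions, not a description of how any existing system works: it assumes
that X records subtype pairs and disjointness declarations, and it treats a
type that ends up under two disjoint types as a conflict. The function names
and sample types are hypothetical.

```python
from itertools import product


def subtype_closure(pairs):
    """Return the transitive closure of a set of (subtype, supertype) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def conflicts_on_import(x_subtypes, x_disjoint, uf_subset):
    """List (type, A, B) where the type would fall under disjoint A and B.

    x_subtypes: subtype pairs already in ontology X
    x_disjoint: pairs of types that X declares disjoint
    uf_subset:  subtype pairs proposed for import from UF
    """
    closure = subtype_closure(set(x_subtypes) | set(uf_subset))
    supers = {}
    for sub, sup in closure:
        # Each type counts as its own supertype for this check.
        supers.setdefault(sub, {sub}).add(sup)
    found = []
    for t, ups in supers.items():
        for a, b in x_disjoint:
            if a in ups and b in ups:
                found.append((t, a, b))
    return sorted(found)


# Hypothetical ontology X with one disjointness axiom.
x_subtypes = {("Dog", "Animal"), ("Robot", "Artifact")}
x_disjoint = {("Animal", "Artifact")}

harmless = {("Poodle", "Dog")}
troublesome = {("RobotDog", "Dog"), ("RobotDog", "Robot")}

print(conflicts_on_import(x_subtypes, x_disjoint, harmless))
print(conflicts_on_import(x_subtypes, x_disjoint, troublesome))
```

Under this reading, a statement belongs in UF only if the check comes back
empty for every major ontology; anything that fails it goes into an optional
module instead.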
I also want to emphasize that developing UF would be
a first step, and it would not preclude other projects
that would develop much more detailed modules or
microtheories for reasoning at any or all levels of
the ontology from top to bottom. But those projects
should be kept separate and optional, not required for
the use of UF by systems that have different axioms.
John Sowa
_________________________________________________
Dear Matthew and Barry,
Let me start with Matthew's remark, which gets to the
heart of my proposal:
MW> The level above whether you are 3D or 4D is about
> foundations. What set theory do we use, or do we use
> Category theory or Type Theory. And what about number
> theory (we do want numbers don't we?) The reality that
> I see is that you are no more certain or free of
> controversy at whatever level you operate at.
For the past five years on the SUO list, the most heated
arguments have been about foundations. Therefore, I
propose that we eliminate *all* foundations. The type
Set and the type Number would indeed be in the unified
framework (which I'll refer to as UF), but neither
Peano's axioms nor the axioms of any version of set
theory will be in UF.
The point is that UF is primarily intended as a framework
for communication among potentially (or actually) incompatible
systems. The major inconsistencies arise at the level of
axioms, which none of these systems would accept from one
another. But they can usually accept lower-level facts
without creating any conflict.
Therefore, UF should be very rich in types, but very, very
poor in axioms. Any serious inferencing (which may be
logic based, statistical, computational, or whatever) will
require much more, but every system that adds more does
so in ways that are incompatible with some other system.
Any axiom that causes a conflict with any major system shall
be deleted from UF, but there may also be a large number of
microtheories (as in Cyc) or modules (as in the following
remark) which could be very rich in axioms expressed in
very rich versions of logic.
BS> BFO has two modules, one 3-D (defined for representing
> continuants), one 4-D (designed for representing processes),
> together with relations between them. Users are welcome
> to use either both modules together, or just one of them,
> according to preference.
That's an excellent principle. Any axiom that is deleted
from UF will not go away, but it will be available in
modules or microtheories that could be used as needed
by various systems. In effect, the topmost levels of most
ontologies are the most controversial. Therefore, UF
should have a highly impoverished top level.
John Sowa
_________________________________________________
Cory,
I'd like to emphasize that my proposal is completely
compatible with a very precise specification.
> I would like to follow up on the precision line
> of thought. As one who generally likes precision
> (or at least more than most), I have to admit to
> not having achieved it to the extent that would be
> required for some of the goals attributed to Ontologies.
What I am recommending is that the general-purpose
framework avoid *detail*. Everything that is represented
would be precise, but it would omit a lot of detail.
> In general, compatibility between systems is achieved
> with interaction, negotiation and sometimes mutual adaptation.
> Expecting that a set of specifications is going to be
> sufficiently precise, detailed, sufficient and contextual
> to allow for autonomous adaptation is hard to accept.
I agree. And for *some* applications all the detail is
essential. I don't recommend that it be thrown away.
What I do recommend is that the framework contain only
a precise definition of the minimal assumptions that are
commonly accepted for the various categories. That policy
permits anyone to add as much detail as they like. But
it would also allow anyone with any specialized need to
add idiosyncratic axioms that nobody else would agree to.
All of the perspectives would agree on the minimal common
content, but they could diverge on the details.
Instead of saying it is vague or imprecise, it would be more
accurate to say that I am recommending an *underspecified*
general ontology, which could be specialized in different
ways for different purposes.
> For example, if a set of interfaces were mapped to something
> like wordnet, it is easy to see how tools would help the
> human make connections between interfaces by matching
> the concepts. If it were correct 75% of the time - that
> would be a big win!
What I am suggesting is something like a corrected WordNet --
i.e., with a more accurate and consistent treatment of the
type-subtype, type-instance, and whole-part relations.
It would be 100% correct *all the time* for what it says,
but it would not make any commitment on any controversial
issue.
> ... as well as the capability to "ground" domain concepts
> in ONE OR MORE "hubs", like wordnet or even Cyc.
Instead of calling it a "hub", I would call it a framework.
It would provide a placeholder for all the terminology or
vocabulary that anyone might want to add to it. It could
in fact become as big (in terms of number of entries) as
the OED together with the union of all the specialized
vocabularies anyone would like to add.
But I want to emphasize that this framework would be much
less detailed in its axiomatization than Cyc. It would
be more like a precisely defined and corrected WordNet
extended with many additional vocabularies.
As a precise, but underspecified framework, it would be equally
suitable for a 3-dimensional view of space with a separate time
dimension or for a 4-dimensional view of space-time. It would
be equally suitable for an Aristotelian view of substance or
a Whiteheadian view of process. It could be combined with a
commonsense ontology or with any of the latest results of
modern physics. You could use it with situation calculus or
with pi calculus, as you prefer. It would be completely
neutral with regard to any of these options.
Many of us had been participating in the SUO (Standard Upper
Ontology) project for over five years, and we never made any
progress in resolving these controversies. If we insist on
producing a hub, or an upper level, or whatever you want to
call it, that takes one position or the other on any of these
controversies, we'll go on for another five years without
reaching a consensus.
Therefore, I recommend that we exclude from the framework any
axiom or assumption that is in any way controversial. It would
be very precise for what it says -- much more so than WordNet --
but underspecified. As a framework, it would provide placeholders
for adding whatever specialized microtheories anyone would care
to add. Those microtheories could be added at any level from
top to bottom or in the middle.
With such a framework, we could make progress. Without it, we
could end up with another five years of wrangling over subtle
principles, theories, and distinctions of physics, philosophy,
linguistics, and semiotics, as we have seen with the SUO project.
John Sowa
_________________________________________________
Nicolas, Cory, Gary, Jim, Robert, David, Pat, et al.,
Nicolas raised an issue that gets closer to the heart
of the matter:
NFR> ... it should be also very clear that the information
> flow ontology for the communication between A and B is
> fundamentally dependent on knowledge of what A and B
> do with that information.
Indeed, there's communication, and there's "what A and B do".
The first depends critically on some language, vocabulary,
and speech acts. The second gets into "doing", which could
mean several things: a logic-based inference, a procedural
computation, a process that triggers physical events, or
just storage for future reference. And both *speech acts*
and *doing* imply some agent who has some purpose -- usually
task oriented -- for saying or doing.
Of all these things, the vocabulary is the most obvious, and
it's the one thing that tends to be the most stable, even if
or perhaps even because the words may have more than one sense.
WordNet makes provision for multiple senses, but many ontology
projects concentrate on a fixed set of senses or types, each
defined by precisely specified axioms.
In science, precision is good because it makes theories easier
to test -- or, as Popper said, *falsify*. But a synonym for
"easily falsified" is "fragile", as we have unfortunately
learned with many of our computer systems. In communication,
some vagueness is often good, because it makes a statement
easier to verify, not falsify. As Peirce said, "It is easy
to be certain, one has only to be sufficiently vague."
Gary cited one of Doug Lenat's examples:
GBC> "If it’s raining, carry an umbrella." The following are
> assumed in this summary rule...
Then he listed ten assumptions implicit in the rule, such as
"the performer is sane" or "their actions permit them a free
hand (e.g., not wheelbarrowing)". Gary (and Lenat) emphasize
that the number of such variations is open ended.
These problems with Cyc arise in every branch of science,
engineering, or business. The number of possible, but
unlikely exceptions to any rule is so large that the
probability that at least one of them will occur is very
high. That's called Murphy's Law.
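The arithmetic behind that claim is easy to check. The Python sketch below
assumes, purely for illustration, that the exceptions are independent of one
another; the function name and the sample probabilities are made up.

```python
import math


def prob_at_least_one(probabilities):
    """P(at least one exception occurs), assuming independent exceptions."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# 1000 possible exceptions, each with a one-in-a-thousand chance.
print(prob_at_least_one([0.001] * 1000))  # about 0.63

# 5000 such exceptions make a failure close to certain.
print(prob_at_least_one([0.001] * 5000))  # about 0.993
```

Each exception is individually negligible, yet a rule with an open-ended
list of them will fail somewhere almost every time.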
Cory discussed the "meta concepts common across architectural
languages and notations" such as "UML, E-R models, OWL,
Collaboration Modeling, Services Interfaces, Information
Models, FEA-RMO, etc."
CC> The approach is to normalize and unify the concepts
> expressed in these various languages into a controlled but
> open set of concepts, this is the "semantic core". These
> concepts may be introduced from any of the architectural
> languages -- our job is to try and "slice and dice" the
> concepts so that they fit together (where possible) and are
> non-redundant (where possible). We can then describe the
> mapping and/or transformation of various tools and
> representation into this common form.
This classification is very different from the ontologies of
Cyc, SUMO, Dolce, or BFO. Instead of analyzing the content
or subject matter, it addresses the metalevel and analyzes
the kinds of tasks that are performed on that content. This
is orthogonal to the classification of content, but it may be
very important for the applications that use the ontologies.
Although I still believe that further research in ontology is
important, I have little faith in the _Field of Dreams_ slogan:
"If you build it, they will come." Cyc has been built, and the
customers have not come. The major question is what strategies
for designing and deploying ontologies might be more successful.
Following are some points to consider:
1. Standardized vocabularies, terminologies, and nomenclatures
were developed long before computers became available, and
their value has been abundantly demonstrated, even without
formal axioms associated with any of the terms.
2. Many such terminologies have logical errors that must be
corrected. For example, three major links between terms
must be clearly distinguished: type-subtype, type-instance,
and whole-part. Some classifications lump all three under
the heading broader-narrower, but that leads to serious
confusion.
3. Other relationships should also be represented, such
as locationOf, containerOf, attributeOf, and various
relations of geography, kinship, and politics.
4. When two or more terms in the vocabulary have the same
supertype, the differentiae that distinguish them should
be explicitly stated, but very detailed axioms can often
be more of a hindrance than a help.
5. More detailed axioms from science, engineering, law,
philosophy, sociology, etc., are likely to be far too
specialized and theory dependent. They are not only
unnecessary, but highly undesirable in a general-purpose
ontology.
For example, a general ontology should be neutral with
respect to 3-D or 4-D models of space-time, situation
calculus vs. pi calculus, or continuant-based vs.
process-oriented ontologies.
6. The logic required for the general ontology should be
very simple. Aristotle's syllogisms, which are a subset
of description logics, are sufficient for the definitions
discussed in points #2, #3, and #4 above. More complex
logics should be limited to more specialized microtheories
for particular applications, not for the general ontology.
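Points #4 and #6 can be sketched in a few lines of Python. This is a rough
illustration, not a proposed format: the TypeDef record, the function names,
and the sample vocabulary are all hypothetical. Each term gets a supertype
(genus) and a set of differentiae; inheritance along the genus chain is the
syllogism Barbara, and a simple check flags sibling terms whose differentiae
were never stated.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Set, Tuple


class TypeDef(NamedTuple):
    genus: Optional[str]          # the supertype, or None at the top
    differentiae: FrozenSet[str]  # what sets this type apart within its genus


def inherited_differentiae(defs: Dict[str, TypeDef], name: str) -> Set[str]:
    """Collect differentiae up the genus chain.

    Barbara: all M are P, all S are M, therefore all S are P.
    """
    collected: Set[str] = set()
    seen: Set[str] = set()
    current: Optional[str] = name
    while current is not None and current in defs and current not in seen:
        seen.add(current)
        collected |= defs[current].differentiae
        current = defs[current].genus
    return collected


def undifferentiated_siblings(defs: Dict[str, TypeDef]) -> List[Tuple[str, str]]:
    """Pairs of types with the same supertype and no stated difference."""
    pairs = []
    for a, b in combinations(sorted(defs), 2):
        same_genus = defs[a].genus is not None and defs[a].genus == defs[b].genus
        if same_genus and defs[a].differentiae == defs[b].differentiae:
            pairs.append((a, b))
    return pairs


# Illustrative vocabulary only; the attributes are placeholders.
defs = {
    "Vehicle": TypeDef(None, frozenset({"transports"})),
    "Car": TypeDef("Vehicle", frozenset({"four wheels"})),
    "Bicycle": TypeDef("Vehicle", frozenset({"two wheels"})),
    "Truck": TypeDef("Vehicle", frozenset({"four wheels"})),
}

print(sorted(inherited_differentiae(defs, "Car")))
print(undifferentiated_siblings(defs))  # Car and Truck still need a differentia
```

Nothing here goes beyond set membership and chains of supertypes, which is
the level of logic the general ontology would need.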
This outline suggests a major reduction in the complexity of
the logic and the removal of highly controversial issues about
the nature of space-time, processes, objects, etc. Those issues may
be extremely important for many purposes, but the fact that
they are controversial means that they should be relegated
to specialized microtheories, not the fundamental framework.
John Sowa