Re: SUO: linguistic concepts
The dialog between Scott Farrar and David Whitten is a brief but
excellent survey of an enormous number of issues that I and several
other people have been trying to get the SUO gang to take seriously.
I tried to make that point in Chapter 7 of my 1984 book, which
had the title "Epilog: Limits of Conceptualization".
I reiterated that point under the title "Knowledge Soup", which I
used for a series of lectures dating from 1987. It is also the
title of Chapter 6 of my recent KR book.
As an introduction to some of the linguistic issues and their
relationship to knowledge representation, I suggest my essay
on "Concepts in the Lexicon", which contains material extracted
from several published papers:
http://www.jfsowa.com/ontology/lexicon.htm
For a longer essay on the philosophical issues involved, I recommend
my paper on "Signs, Processes, and Language Games", which has the
subtitle "Foundations for Ontology":
http://www.jfsowa.com/pubs/signproc.htm
This is a rather long article. If you want to read the recommendations
for how I believe an ontology should be organized, go straight to the
concluding Section 7, which summarizes how multiple languages with multiple
conceptualizations can be related to multiple theories about the world:
http://www.jfsowa.com/pubs/signproc.htm
And finally, I will be giving a keynote speech at the Knowledge
Technologies 2002 Conference in Seattle on March 13th, with the title
"Negotiation Instead of Legislation". The theme of that talk (abstract
below) is that legislated ontologies, such as SUMO or Cyc, are obsolete.
The way that people interoperate is by negotiation, and the talk shows
how negotiation can be implemented on current systems. The abstract
is at the end of this note, and I'll send a further note to this list
when I post the slides on my web site. Following is the web site
for KT 2002:
http://www.knowledgetechnologies.net/2002/about.htm
Following the abstract is a copy of the dialog between SF and DW.
John Sowa
______________________________________________________________________
Negotiation Instead of Legislation
John F. Sowa
Abstract. For years, the Holy Grail of IT has been a magical solution
to the problem of making incompatible systems interoperable. The most
common approach is to legislate some new kind of language, framework,
schema, vocabulary, terminology, nomenclature, ontology, or metadata.
Whatever it is called, the legislators promise that it will somehow
convert the knowledge cacophony of the World Wide Web into a knowledge
symphony.
Yet for any given task, people manage to work together without
reorganizing the totality of all the knowledge soup in their heads.
Instead of legislation, they use negotiation to make the minimal
adjustments needed to get the job done. To make negotiation possible
among computer systems, several processes must be accomplished:
defining the task to be done, mapping the task-related concepts to the
available structures of each system, and making adjustments only when
necessary. This talk discusses the mechanisms of negotiation, analyzes
their implications for system design, and shows how they can enable
legacy systems to interoperate in dynamically changing environments.
________________________________________________________________________
Below are some comments regarding the status of some linguistic concepts
within the ontology:
SF: Scott Farrar
DW: David Whitten
SF:
In my discussion with Ian, several issues surfaced:
First there is the status of the concept &%Language. We suggest
expanding its status somewhat:
In addition to including &%Language as a subclass of
&%LinguisticExpression, we think &%Language should include a
set-theoretic notion. Many linguists argue that a language is a
potentially infinite set of what SUMO calls &%LinguisticExpressions.
Therefore Ian suggested that SUMO add:
(disjoint Language FiniteSet)
DW:
I certainly understand why you would want to do this. However, if you
have a finite vocabulary and a grammar that cannot realize an infinite
number of linguistic expressions, it would be false. I would like to
caution you to be careful not to overload the SUMO term with more than
one English word-sense. I believe Jon Awbrey is correct that there are
many ways words can be used, and SUMO needs to have a way to represent
them, but I am of the opinion that modern reasoning systems don't deal
with ambiguity well at all, and we would do well to keep our definitions
and terms usable by machines and systems currently available.
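To make the disagreement concrete, here is a minimal sketch of what
Ian's axiom commits SUMO to, assuming the usual reading of disjoint
(two classes with no instance in common). ToyLanguage is a hypothetical
term used only for illustration; it is not part of SUMO.

;; What (disjoint Language FiniteSet) entails for every instance:
(=>
  (instance ?L Language)
  (not (instance ?L FiniteSet)))

;; DW's counterexample (hypothetical): a finite vocabulary with a
;; grammar that cannot realize infinitely many expressions.
(instance ToyLanguage Language)
(instance ToyLanguage FiniteSet)

Taken together, the rule and the two assertions are inconsistent, which
is the case DW is warning about.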
SF:
Yes, I agree that we are talking about more than one concept here. Now
the question is just how many? The notion of language as an unbounded
set is pervasive within the linguistics community. But, as much as
possible, we're trying to remain theory-neutral, at least while we're
constructing our linguistics sub-ontology.
SF:
Also, if we construe language as a set of &%LinguisticExpressions, then
is it correct to say (subclass Language LinguisticExpression)? I think
that if we adopt the set-theoretic notion of &%Language, then we need
some way of disallowing Language as a member of the set that is itself
&%Language. Perhaps this touches on Russell's Paradox?
DW:
What are the elements of the class &%Language?
SF: I think we should allow any &%LinguisticExpression to be an element.
&%Words, &%Phrases, &%Morphemes, etc.
DW:
Is &%Language the mode or method of expressing a concept in a linguistic
fashion? (meaning 1)
Or contrariwise, is &%Language the set of utterances, each of which is
the linguistic formulation of a particular concept? (meaning 2)
I would hope that you aren't advocating that the term &%Language should
be used for both of these, as they are distinct, separable, and, from
the SUO perspective, do not support the same inferences and conclusions.
SF:
No, I'm not advocating this. They are definitely distinct concepts.
SF:
Second, &%Language should be linked ontologically to the concepts
of &%SocialInteraction and/or &%Communication. Let me express my ideas
in English:
"Language is a means for communication."
And/or "Language is used for social interaction."
DW:
This is close to meaning (1) above. Again, let me caution you that
simply because English uses the same word, you should not assume it is
the same concept. Since you are a doctoral student, I'm sure this has
been drilled into your consciousness multiple times, so I suppose my
caution is actually being expressed to remind the other folks on the
mailing list who haven't thought as deeply about this very focused area
of existence as you have.
SF:
Third, I suggest that we give some thought to specifying the various
forms of &%Language and &%LinguisticExpressions: perhaps a 'form' slot
with the possible values 'written', 'spoken', or 'signed'.
DW:
Perhaps Adam can clarify, but the idea of 'slots' isn't a KIF concept,
and the word has been used in so many ways in knowledge representation
that I am afraid I must anticipate Jon Awbrey and say this third point
is so fraught with ambiguity that I have no idea how to crystallize it.
Perhaps you should look into extra predicates that make the distinctions
you are interested in? Or perhaps differing subclasses?
SF:
Right. I'm new to the SUO-KIF format. Specialized predicates take the
place of 'slots', I think. Does this accord with your thinking? Can you
give me an example of how we might specify this?
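As a minimal sketch of the two options DW mentions (an extra predicate,
or differing subclasses), and not a definitive SUMO proposal: the names
LinguisticForm, Written, Spoken, Signed, hasForm, WrittenExpression,
SpokenExpression, and SignedExpression below are hypothetical and were
invented for this illustration. Attribute, BinaryPredicate, and domain
are assumed to be available as in SUMO.

;; Option 1: an extra predicate in place of a 'form' slot.
;; All names except Attribute, BinaryPredicate, domain, and
;; LinguisticExpression are hypothetical.
(subclass LinguisticForm Attribute)
(instance Written LinguisticForm)
(instance Spoken LinguisticForm)
(instance Signed LinguisticForm)
(instance hasForm BinaryPredicate)
(domain hasForm 1 LinguisticExpression)
(domain hasForm 2 LinguisticForm)

;; Option 2: differing subclasses (hypothetical names).
(subclass WrittenExpression LinguisticExpression)
(subclass SpokenExpression LinguisticExpression)
(subclass SignedExpression LinguisticExpression)

Option 1 keeps the class hierarchy small and lets one expression be
related to more than one form. Option 2 lets axioms be stated directly
about, say, all spoken expressions.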
SF:
> Fourth, I suggest the following subclasses of &%Language:
>
> (subclass HumanLanguage Language)
> (documentation HumanLanguage "This is the subclass of &%Language used
> and interpreted by a &%Human. Instances include all extinct and extant
> human languages.")
>
> (instance Spanish HumanLanguage)
DW:
Spanish as the name of a particular mode of expression is an instance of
HumanLanguage. (sub-meaning 1)
Spanish as the set of utterances understandable by a Spanish language
speaker is a subclass of HumanLanguage. (sub-meaning 2)
Since sub-meaning 1 and sub-meaning 2 are not the same, which do
you mean?
SF:
I think sub-meaning 2 is most appropriate.
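The two readings DW distinguishes lead to different SUO-KIF statements.
The following is only a sketch: SpanishUtterance1 is a hypothetical
individual introduced for illustration, and the two readings are
alternatives, not axioms to be asserted together.

;; Sub-meaning 1: Spanish is one individual, a particular mode of
;; expression.
(instance Spanish HumanLanguage)

;; Sub-meaning 2: Spanish is a class whose instances are utterances.
(subclass Spanish HumanLanguage)
(instance SpanishUtterance1 Spanish)   ; hypothetical utterance

Under sub-meaning 2, the instances of HumanLanguage are themselves
utterances, so the earlier proposal (instance Spanish HumanLanguage)
would presumably have to be replaced rather than kept alongside the
subclass statement.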
SF:
> (subclass ArtificialLanguage Language)
> (documentation ArtificialLanguage "This is the subclass of &%Language
> designed by humans but intended for and interpreted by a &%Machine.
> This class is disjoint from the subclass of languages that are
> constructed to be used by humans, e.g. Esperanto.")
>
DW:
I would have used the term ConstructedLanguage or ArtificialLanguage for
Esperanto. Since you are bringing interpretation and intent into your
documentation, don't be surprised if Jon picks up this ball and runs
with it. I'll let him try to take it in for a touchdown.
I probably would include some idea of a FormalLanguage as a superclass
too, since there are many things in common there.
SF:
I just threw the following out for feedback.
> ;;I'm less sure about the following instance examples,
> ;;any suggestions?
> (instance Java ArtificialLanguage)
> (instance SQL ArtificialLanguage)
>
>
> (subclass AnimalLanguage Language)
> (documentation AnimalLanguage "This is the subclass of &%Language used
> and interpreted by an &%Animal that is not &%Human.")
>
> (instance DolphinLanguage AnimalLanguage)
> ;;Obviously, the status of AnimalLanguage as "Language" is
> ;;controversial for many linguists
>
DW:
The idea that animals have concepts in the way that humans do is
controversial in society at large, not just among linguists.
It should be possible to have an ontology that takes a consistent
approach to controversial topics, though. How will a system that reasons
using a controversial ontology communicate that fact to other systems
that may not support such reasoning methods and, separately, may not
agree that certain facts should be considered true?