
SUO: RE: Re: An article on the pitfalls of metadata




Tom Johnston wrote:
> Later, I find things that never get any color description in the
> corpus.  So electricity, hunches, ideas, and many other objects are
> found not to have color descriptions.  That means there is a set of
> things that have the property color, and another set of things that
> are not known to have color.
> 
> Next, we want the algorithm to create an activity for itself to
> specifically find out more about color by using WordNet to cluster
> the set of all objects mentioned in the corpus.  The activity is just
> a heuristic hunch which may have little or no benefit, yet it might
> be valuable - the algorithm doesn't know yet.  So the activity it
> assigned to itself goes into the "things to do when there is nothing
> else to do" queue, arranged by the estimated cost and the estimated
> benefit of completing each activity.
> 
> TJ: "estimated benefit". Now how do we determine that?

Suppose the activity is to ask the operator whether all <list of types>
have color. The benefit might be calculated as the number of individuals 
in the database (from the corpus) that might or might not have a color.
The cost might be one attention point because the activity would use
up one attention span of the operator.  

But of course defining cost and benefit is often an art rather than
a clear quantification.  Which measures work and which don't would
be the operational definition of how valued one measure is over
another.  With time and iterations, measures can be developed that
provide some gain, and others can be deleted that don't seem to help.
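
A minimal sketch of such a "when there is nothing else to do" queue
follows.  The class name IdleQueue, the benefit/cost ratio used as the
score, and the example numbers are all assumptions made for
illustration; the scoring rule in particular is just one of many
measures that could be tried and later kept or discarded.

```python
import heapq


class IdleQueue:
    """Hypothetical queue of self-assigned activities, best score first."""

    def __init__(self, beam_width):
        self.beam_width = beam_width  # assumed cap on queued activities
        self._heap = []               # entries: (-score, insertion order, name)
        self._count = 0

    def add(self, name, benefit, cost):
        # Assumed measure: estimated benefit per unit of estimated cost.
        score = benefit / cost
        heapq.heappush(self._heap, (-score, self._count, name))
        self._count += 1
        if len(self._heap) > self.beam_width:
            # Discard the lowest-scoring activity (the largest tuple).
            self._heap.remove(max(self._heap))
            heapq.heapify(self._heap)

    def pop_best(self):
        neg_score, _, name = heapq.heappop(self._heap)
        return name, -neg_score

    def __len__(self):
        return len(self._heap)


queue = IdleQueue(beam_width=2)
# Benefit: individuals whose color status is unknown; cost: one attention point.
queue.add("ask operator whether these types have color", benefit=120, cost=1)
queue.add("cluster corpus nouns with WordNet", benefit=300, cost=50)
queue.add("recheck old hunches", benefit=5, cost=10)
best_name, best_score = queue.pop_best()
```

With these made-up numbers the operator question comes out first, and
the weakest activity is dropped once the queue exceeds its limit.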


> Now suppose the algorithm has been cooking for a while, and there are
> lots of tasks on the activity queue.  We can choose to limit the
> number of tasks (a beam search) and throw away those with lower cost
> benefit estimates.  This keeps us from chasing every possible thing to
> do about the corpus under study.
> 
> As individuals aggregate properties, we can figure out which classes
> of individuals could have which properties.  Given the instantaneous
> state of knowledge about the set of individuals, those sets P[i] which
> have a proper subset of all the properties of a different set C[j] of
> individuals are necessarily more abstract (or more generalized)
> concept sets than the C[j] sets.
> 
> Of course, WordNet gives us evidence that some nouns are more abstract
> than others because one synset is defined using the words from other
> synsets.  Finding a noncyclical path from most general to least
> general is the way to map out higher from lower concepts.  Same with
> verbs; most English verbs with one syllable are used much more often
> than others with lots of syllables.  The most common are terminal
> verbs which can be modeled with activity graphs, while the less common
> can be organized in terms of their WordNet definitions.  Same with
> adjectives, adverbs, phrases, sentences, paragraphs and so on.
> 
> TJ: WordNet, I gather, is a formalization of a good dictionary. So
> when the dictionary is eventually revised, we get a semantic
> earthquake, right? Or is WordNet just a way to translate the formal
> ontology into something more like natural language statements?


WordNet is a product of the Princeton University Cognitive Science
Laboratory.  You can download it and/or read about it at
http://www.cogsci.princeton.edu/~wn/wn2.0.shtml

WordNet has been iterating for at least ten years, and gets
its funding from the government.  Princeton makes it freely
available, and the databases have grown very large and complex.

On the site above, there are several good papers describing
the way it's organized; you would probably find them interesting
and informative.



> Using a widely agreed upon set of lexical terminals like those in
> WordNet helps make the algorithm's results understandable to human
> users, whether teachers, students or other consumers of the lattice.
> 
> For references, I suggest Tom Mitchell's book "Machine Learning",
> which is a classic CS text.  There is a master's thesis by David
> Chapman (MIT, mid 80's) which describes an algorithm he called
> "Tweak".  This is the first paper to treat planning as an instance of
> temporal logic and its primary product - the modal truth criterion.
> And of course, there's lots of work (google search) on data mining
> algorithms, rough sets, etc.
> 
> My own work was strongly influenced by all those sources, along with
> years of studying the software development process (back in cold
> war military days), automated design systems, Lisp programming, and
> the modern Delphi IDE.  I wrote a commercial
> product that you can visit at:
> http://www.efficacyfx.com
> where there's a very practical implementation of process improvement
> concepts using a factory floor environment.  I was amazed at how much
> real data can help when building a work flow concept set.
> 
> 
> > Here's an example of what puzzles me about how we could
> > automatically generate higher level ontological categories.
> >
> > One heuristic that comes to mind is that whenever two or more
> > entities share identical attributes (assume everything is
> > well-defined), we can create a supertype and move the common
> > attributes up to it.
> 
> Yes, with the caveat that "identical" is sometimes elusive.  That's
> why we should start with very basic stuff, like integers and strings,
> and then use range-domain analytic techniques (a la IFF) to identify
> what attributes have the same values.
> 
> 
> > In this way, an algorithm could generate, for any set of tables, a
> > next-level-up set of supertype tables.
> 
> Yes, with the WordNet structure serving as a suggestive source of
> names to give the supertype tables.  Also, joined tables have
> attribute names that are "roles" which suggest some functionality of
> the attributes.  Again, WordNet could possibly help in giving familiar
> names to the common roles of the supertype attributes.
> 
> 
> > Iterating the process, we would end up with a multi-level ontology
> > with a single entity as the highest entry (the
> > ousia/"thing"/substance entity).
> 
> Yes, though that's somewhat artificial, but I haven't come up with a
> better way.  Also, remember that when the same (or similar) supertypes
> are made from distinct subtypes, they may be instances of the same
> supertype, but with some distinction among the subtype groups.  So
> the various types of oranges and the various types of apples have
> supertypes that eventually should merge into "fruit", along with
> others.
> 
> 
> > In an OO environment, the existence of one or more common methods
> > would probably suffice, even if there were no common attributes.
> 
> The reason I like IDEF0 so much for this modeling purpose is that
> people without OO backgrounds can still think in terms of objects
> and activities, controls, resources, consumables, products, and
> so on.  OO methods are activities, OO properties are object roles,
> and relations among objects are properties of the next higher
> objects.
> 
> OO programs start top down, but build to a library of previously
> defined objects.  The ultimate terminals in Delphi are TObject,
> TClass, and Program.  Everything else is built up from them by
> deriving new programs from previously defined objects.  That is
> basically how the algorithm should function: as a programmer in
> a very highly automated IDE that asks questions, understands the
> answers, and uses the knowledge it gains to pose more questions
> and answers.  Sort of like us.
> 
> TJ: yes, and Smalltalk has an object model which all Smalltalk
> programmers use. The fact that countless programs have been written in
> Smalltalk, using that model, I take as evidence for my contention that
> an upper level ontology can be quite stable because we will learn to
> use it in a way that MAKES it stable. A stable upper ontology makes
> automated ontology generators a less urgent concern, since we would
> then not be revising higher ontological categories very often at all.


I use the Delphi component library, which is also quite stable,
but I consider that the lower level and terminal component base
rather than the higher level.  The one (empty) high level
component in Delphi is the "program <YourNameHere> begin ... end."
component which I use to derive new applications from.  

So we seem to have different intuitions of what is high level
and what is low level.  That's why I like having an algorithm
look at all the properties of a component set, and assign the
highest level to the one with the fewest properties, the lowest
level to those with the most unique properties, and so on.  It
settles the Up versus Down issue mechanically instead of 
intuitively.  
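
As a rough sketch of that mechanical ordering: the component names and
property sets below are invented for illustration, and counting
properties is only one possible reading of "fewest properties".  The
second function applies the proper-subset test mentioned earlier in the
thread, which yields a partial order rather than a single ranking.

```python
def rank_by_properties(components):
    """Order component names from highest level (fewest properties) to lowest."""
    return sorted(components, key=lambda name: (len(components[name]), name))


def more_abstract_pairs(components):
    """Return pairs (a, b) where a's properties are a proper subset of b's."""
    return sorted(
        (a, b)
        for a in components
        for b in components
        if components[a] < components[b]  # '<' on sets means proper subset
    )


# Hypothetical component set; the property names are assumptions.
components = {
    "thing": {"name"},
    "hunch": {"name", "confidence"},
    "fruit": {"name", "color", "taste"},
    "apple": {"name", "color", "taste", "crispness"},
}

levels = rank_by_properties(components)
pairs = more_abstract_pairs(components)
```

Here "thing" ranks highest and "apple" lowest, while "hunch" and
"fruit" are not comparable under the subset test even though the count
places one above the other.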



> My doubts about how far automated ontology generators can take us are
> NOT based on a detailed knowledge of them. They are based on more
> general considerations, viz: if we could formulate a "logic of
> discovery", as philosophers of science have called it, express it as a
> software engine, and give it the right starting points, then the
> machines are about to become truly intelligent. One of their first
> tasks might be to crank out the TOE (theory of everything). The next
> big breaks in solid state physics, neuropharmacology, structural
> engineering, supply chain logistics management and insurance claims
> processing would follow.
> 
> I don't think we're close to that. Now the notion of an automatic
> ontology generator, it seems to me, is the notion of a software engine
> which implements a "logic of discovery", i.e. true intellectual
> creativity, in one specific area, that of categorizing things. I'm not
> saying that such software could not produce anything. As indicated
> above, I see clearly how it could produce an overwhelming number of
> higher level categories from any robust set of base level entities.
> Only some of those higher level categories are "right", i.e. are
> intuitively natural, i.e. seem natural extensions or clarifications of
> the ontologies we all started acquiring before we were a year old.
> What I do not see is how an automated process can separate the wheat
> from the chaff.
> 
> So I need to get that example for you, which I'll work on the next
> couple of days.

Good.  Let me know when you have it ready.  


> > This seems based on the principle that a supertype (I'm talking in
> > terms of relational data models, now) of a set of entities exists
> > whenever that set shares one or more attributes in common. This
> > makes for tidier databases and, in an OO environment, simpler code.
> 
> Yes.  I very much agree in concept and even in interpretation.
> 
> TJ: "whenever that set shares one or more attributes in common". This
> rule for generating supertypes will generate database models
> (ontologies) so ugly that no one would ever implement them.
> (Demonstration to follow in a couple of days.)
> 
> > If I had time (or if you request it), I would (or will) develop an
> > example in which this heuristic generates a plethora of supertype
> > entities, with many starting entities defined as subtypes of many of
> > those supertype entities -- a "spaghetti" structure, in other words.
> > OK, here's one quick example.
> 
> Good!
> 
> > Consider a Parts table in a manufacturing database, and the many
> > tables which contain a foreign key back to Parts (the Inventory and
> > Bill of Materials tables, for example). Should we define a supertype
> > for Inventory and Bill of Materials on the basis of this one common
> > attribute?
> 
> No, we should define a domain called PartType.  That domain would be
> the foreign key values.  So there could be one table (or view) of
> all part type values.  Normally, that's just one attribute of many
> tables, but collecting it into one view is useful for later
> distinguishing among the different subtypes of PartTypes.  Groups of
> PartType values can then be partitioned into those that are painted,
> carved, sanded, cut, upholstered, shipped, purchased, and so on.
> 
> TJ: that domain already exists. It is the dynamic (rather than static)
> domain consisting of all the primary key values of the referenced
> Parts table. What I should have said is that if you accept my
> supertype generation principle above, then on what grounds would you
> NOT "define a supertype ..... of this one common attribute"?
> 
> 
> > If so, its name would be something like "Things Related to Parts".
> > Every transaction table, in any database, will have a
> > date-time-entered attribute. Should all transaction tables then be
> > subtypes of a "Transaction Date-Time" supertype table? If not, how
> > would an algorithm using this heuristic weed out such fluff? Or are
> > there other algorithms that won't generate fluff?
> 
> 
> I prefer the concept of an "Event", as it's used in formal logic texts.
> For example, "Knowledge in Action" by Raymond Reiter is a good source
> on thoughts and ideas on how to treat events.
> 
> TJ: yes. I have always thought of a transaction as the record of an
> event that affects the state of one or more temporally-enduring things
> recorded in a database.
> 
> Also, I don't like time stamps on every transaction, but only on those
> that are formally considered events of some type.  A good example
> of that is in my web site at
> http://www.efficacyfx.com/manufact.htm
> where there is a study of the parts and their activities in a factory
> floor environment.
> 
> TJ: I'll check it out. I hope you enjoyed the manufacturing example I
> developed a couple of weeks ago.
> 
> 
> In brief, if a bunch of actions are related, let each one be part
> of a single "event".  Then that event has a start time-date stamp
> and a stop time-date stamp in the database.  Whenever an object
> (employee, assembly, raw material, tool, activity) is related to
> another object of the event, that is marked in the event table and
> kept as history for further analysis.
> 
> Eventually, history starts acting like the past.  At that point, the
> history table can be analyzed for repeating terms.  When a repeating
> term is found, it can be hypothesized to repeat again the next time
> the same situation is identified.
> 
> But to distinguish one situation from another, you have to group the
> occurrences by their various constituents and distinguish the various
> subevents.  So that analysis leads to a lattice of concepts and
> subconcepts itself.  The recursive application of this approach is
> what eventually leads to the complete lattice.
> 
> 
> > In this structure, many of the generated entities would seem,
> > intuitively, to be wrong, to not correspond to a "natural kind" in
> > the real world. Since the human user is part of the information
> > system, along with the codebase and the database (a point I have
> > emphasized in several articles of mine, and which I think John Sowa
> > might approvingly interpret as a bit of the semiotic perspective on
> > my part), it is important for our entities to be intuitive, to seem
> > to represent "natural kinds"; for otherwise, we will use the
> > database incorrectly, populating it with category mistakes and often
> > not being sure how to frame a query to get what we want from it.
> 
> Yes again.  That's why a dictionary like WordNet should be integrated
> into the development tool so that meaningful terms can be suggested.
> When new terms are used, people have different reactions.  These
> "events" can be kept and analyzed also.  By trying the various words
> in a synset, the algorithm can eventually select which ones to use in
> which situation.
> 
> TJ: sounds like you think an automated process can generate natural
> kinds, and then only needs something like WordNet to find a nice name
> for them. We're back to the main point I'm most skeptical about.
> 
> 
> > So: how can an algorithm generate natural kinds?
> >
> > Tom
> 
> I hope the (nonsuccinct) description above is a good starting point
> for further discussion.  It's a tough nut, but I think we can crack it.
> 
> The history and results of the Cyc effort, as seen through OpenCyc,
> indicate to me that it isn't the database of facts and axioms that
> makes up an intelligent system.  The lessons from expert systems
> techniques make me believe that it also isn't the algorithm
> (deduction, neural nets, fuzzy logic, proof trees, grammars,
> languages, take your pick) that makes a system intelligent.
> 
> The genetics projects indicate that we only have about 21,000 genes.
> Social science indicates that nurture is formative.  Twin studies
> indicate that the two go hand in hand.  Neither hardware (algorithms,
> genes, ...) nor software (database, axioms, ...) is enough.
> 
> The only believable answer is the long one; we have to develop a
> transparent learning algorithm and a database of facts that are
> situated in a real environment along with a bunch of humans
> interacting with it.  That's the way people become intelligent
> starting from a fertilized egg.  Every step of the process is needed,
> and any one defective part ruins the product.
> 
> TJ: I agree. But I doubt we can generate useful ontologies via an
> algorithm anytime soon, because the ability to do so seems to me part
> and parcel of the ability of a machine to be intelligent and
> conceptually creative. Nonetheless (another caveat emptor), my doubts
> are based on more general philosophical considerations, and are based
> on very little detailed knowledge of the hard work that computer
> scientists have done and are doing in this field.
> 
> More to come.
> 
> Tom
> 
> JMHO,
> Rich
> 
> 
> 
> > Tom Johnston wrote:
> >
> > <snip/>
> >
> > > My first question would be: which one has successfully
> > > incorporated the largest and most diverse set of lowest level
> > > (i.e. working database level) ontologies? Which ones can most
> > > completely rely on the data model itself to fully express the
> > > semantics up and down the entire ontology, without "patching
> > > things up" with ad hoc program code?
> > > (Sorry, I don't know how to translate this point, expressed in my
> > > preferred language, into the language of axiomatized formal
> > > systems.)
> > >
> > > Whichever one it is, that's the one we should go with. Let's work
> > > to add more lowest level ontologies to it. In the process, we may
> > > sometimes make a good case for revisions a couple of levels higher
> > > up. We may on rare occasions make a good case for revisions much
> > > higher up. Some of those revisions will not force structural
> > > changes elsewhere in the web of this ontology, e.g. adding a
> > > creation-date-timestamp to the top-level entry/table/class. Other
> > > revisions will force structural changes, and such changes can be
> > > painfully expensive. But the further up we go, the less frequent
> > > the revisions will be. Once again, this is just Quine's sphere of
> > > language, his (or Peirce's?) holism.
> > >
> > > Tom
> > <snip/>
> >
> > Tom, why not use the process you described above as the initial
> > statement of an algorithm to automate the merger of lower level
> > data models?
> >
> > Observations about the actual databases stored with two data
> > models might be analyzed to come up with a higher level model
> > that incorporates both.  Since the top level model is empty,
> > when two data models merge to no common elements, the two are
> > clearly independent nodes on the lattice.  Some of the data
> > mining techniques can be applied to this approach.
> >
> > I don't think it's necessary, or even useful, to develop the
> > lattice manually since it will be necessarily a dynamic lattice
> > that changes with time.  So it's not the initial lattice that
> > we should spend effort on, it's the method (algorithm, process)
> > for building the lattice and refining it through observations.
> >
> > JMHO,
> > Rich
> >
> >
> 
>