
SUO: RE: RE: Re: Missing Ingredients




Tom Johnston wrote:

> 1.  I'm afraid what I meant was this: if WordNet has definitions for, say,
> 10,000 words that name types of which there can be tokens (classes of which
> there can be members, if you prefer), then we need 10,000 tables.

You could do it that way.  But I think a more elegant way
would be to have one table for all individuals, and one row for
each one in the stored discussion.  Then there might also be
10,000 views, each identifying the individuals that belong
to each of the 10,000 types.  Properly constructed as sets
of indexes into the main table, these would be 10,000 sets
of integers, some of which refer to overlapping sets of
individuals.  

So 'birds' would be the subset of individuals that are
'red robin on the front lawn', 'parakeet', 'parrot' and so
forth.  
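
A minimal sketch of the single-table approach, assuming SQLite through
Python's sqlite3 module.  The table, view, and column names (individual,
membership, bird, parrot) are hypothetical, and only two of the 10,000
views are shown.  Each view yields a set of integer ids into the main
table, and the sets may overlap.

```python
import sqlite3


def build_db():
    """Create one table of individuals plus one view per type (names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE individual (
            id INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        -- Which individual belongs to which type; one individual may have many types.
        CREATE TABLE membership (
            individual_id INTEGER NOT NULL REFERENCES individual(id),
            type_name TEXT NOT NULL,
            PRIMARY KEY (individual_id, type_name)
        );
        -- One view per type; each is just a set of ids into the main table.
        CREATE VIEW bird AS
            SELECT individual_id AS id FROM membership WHERE type_name = 'bird';
        CREATE VIEW parrot AS
            SELECT individual_id AS id FROM membership WHERE type_name = 'parrot';
    """)
    return conn


def add_individual(conn, description, type_names):
    """Insert one row for an individual and record every type it belongs to."""
    cur = conn.execute(
        "INSERT INTO individual (description) VALUES (?)", (description,)
    )
    ident = cur.lastrowid
    conn.executemany(
        "INSERT INTO membership (individual_id, type_name) VALUES (?, ?)",
        [(ident, t) for t in type_names],
    )
    return ident


def view_ids(conn, view_name):
    """Return the set of integer ids exposed by a type view."""
    known = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'view'")
    }
    if view_name not in known:
        raise ValueError(f"no such type view: {view_name}")
    return {row[0] for row in conn.execute(f'SELECT id FROM "{view_name}"')}
```

With a robin typed as 'bird' and a parakeet typed as both 'bird' and
'parrot', view_ids(conn, "bird") contains both ids while
view_ids(conn, "parrot") contains only the parakeet's, so the two sets
overlap without any row being duplicated.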

This approach should make anaphoric references somewhat
straightforward to process.  

'There was a red robin on my front lawn when I woke up this
morning.  The bird had a pretty song and cheered my starting
the day.'

The first sentence is declarative, and would cause a row in
the object table to be created to represent 'the red robin
on my front lawn'.  The second sentence would extract all
known birds, of which 'the red robin on my front lawn' is
the only instance, and which is therefore designated by 'the bird'
in the second sentence.  
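
The resolution step can be sketched roughly as follows.  This is only an
illustration under simple assumptions: the Discourse class and its
methods are hypothetical, types are assigned by hand rather than looked
up in WordNet, and a reference is resolved only when exactly one known
individual has the requested type.

```python
from dataclasses import dataclass, field


@dataclass
class Discourse:
    """Hypothetical store of the individuals mentioned so far in a conversation."""

    # Each entry is (description, frozenset of type names).
    individuals: list = field(default_factory=list)

    def declare(self, description, types):
        """Add a row for a newly mentioned individual and return its index."""
        self.individuals.append((description, frozenset(types)))
        return len(self.individuals) - 1

    def candidates(self, type_name):
        """Return the indexes of all known individuals of the given type."""
        return [
            i
            for i, (_, types) in enumerate(self.individuals)
            if type_name in types
        ]

    def resolve(self, type_name):
        """Resolve 'the <type_name>' if exactly one individual qualifies, else None."""
        found = self.candidates(type_name)
        if len(found) == 1:
            return self.individuals[found[0]][0]
        return None  # unknown or ambiguous reference
```

After declaring 'the red robin on my front lawn' with types robin and
bird, resolve("bird") returns that description.  If a second bird were
declared later, the same call would return None, and more context would
be needed to pick one.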



> 2. I'm thinking of non-instantiated OO classes, tables which have instances
> only when one of their leaf-node subtype tables also has instances.

Yes.  The table processing routines could be implemented before
the conversation starts, and could lie dormant until an instance
is declared, inferred, or suspected.  


> 3. I think I would. As indicated many times before, I think business
> applications require a lot of additional semantics over and above ordinary
> language dictionaries. The best definition of "customer", in any English
> dictionary, is no more than a starting point for a list of set membership
> criteria for a Customer table.

Yes, the vagueness of natural language words we use every day
is very apparent.  


> 4. Don't agree. Programmers use "scope" in the narrow sense you indicate. In
> my context, I use it to mean "one of the tables the query will search, in
> constructing its result set". Perfectly acceptable use of "scope", I think.

Then we agree to disagree.  I think the programmer's definition
of scope originated in trying to model linguistically faithful
representations of small snatches of conversation.  The scope
of a procedure normally includes several variables, rather
than a single one.  The "scope" of a conversation includes
several objects, as in the 'red robin on the front lawn', 
which involves a lawn, a robin, and an observer.  

As in many conversations, there are scopes within scopes, so
that each context scallops its own subject material, which
may include references to other conversations we've had in
the past.  


> 5. I've seen type hierarchies built, in databases, in which non leaf-node
> tables are directly instantiated. e.g. Involved Party is subtyped as
> Supplier, Customer or Competitor. Each has attributes shared by all the
> others; these go on the Involved Party table. Each has attributes not shared
> with any others. Now consider Partners, companies that form temporary
> alliances with our company. For now, we have no distinct attributes for
> them.
> 
> So we express this by adding "partner" as an element in the domain of
> involved-party-type. But since the attributes in Involved-Party are all we
> need for partners, we do NOT add a Partner subtype table. Now Involved-Party
> is a supertype which is directly instantiated, not just through its
> subtypes.

I still prefer the single table of individuals; if there is a
bird mentioned in the conversation, it might not be clear at first
which bird, where the bird is, what it's doing, and so on.  As more
sentences are added, more information is declared about the bird.

In your 'partner' example, more database information added later
might require more information about the involved parties, for example,
which involved parties are legally liable for completing an agreed
project?  Partnerships are shared liabilities and shared assets,
therefore it may become necessary to distinguish partners from
other involved-party instances.  
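
One way this later refinement could look, sketched with Python's sqlite3
module.  The involved_party and partner tables, the party_type and
liability_share columns, and the sample company names are all
hypothetical.  Partners start out as plain involved-party rows marked
only by a type value, and a subtype table is added later, once
partner-specific information is needed, without touching the existing
rows.

```python
import sqlite3


def make_parties():
    """Create the supertype table with partners stored as ordinary rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE involved_party (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            party_type TEXT NOT NULL
        )
    """)
    conn.executemany(
        "INSERT INTO involved_party (name, party_type) VALUES (?, ?)",
        [
            ("Acme Supply", "supplier"),
            ("Beta Corp", "partner"),
            ("Gamma LLC", "partner"),
        ],
    )
    return conn


def add_partner_subtype(conn):
    """Later step: give partners their own table for partner-only attributes."""
    conn.execute("""
        CREATE TABLE partner (
            party_id INTEGER PRIMARY KEY REFERENCES involved_party(id),
            liability_share REAL  -- unknown (NULL) until declared
        )
    """)
    # Every existing row typed 'partner' gets a subtype row; nothing is moved.
    conn.execute("""
        INSERT INTO partner (party_id)
        SELECT id FROM involved_party WHERE party_type = 'partner'
    """)


def partner_names(conn):
    """List the names of parties that now have a partner subtype row."""
    rows = conn.execute("""
        SELECT p.name
        FROM involved_party AS p
        JOIN partner AS s ON s.party_id = p.id
        ORDER BY p.name
    """)
    return [row[0] for row in rows]
```

Because the individuals stay in the one supertype table, adding the
subtype is an additive change: earlier queries against involved_party
keep working, and liability details can be filled in as they are
declared.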



> ..... Skipping over, for constraints of time, much interesting stuff below
> ..... yes, I'm 30 minutes from Stone Mountain, 20 minutes from GA Tech.
> Where the KKK used to burn crosses. Also where I've hiked with friends.

I hadn't heard about the KKK doing that, but they've been
around for 100 years, so I suppose it's not surprising.

Give my regards to Georgia!

Rich




> -----Original Message-----
> From: Richard Cooper [mailto:rich@valutech.com]
> Sent: Wednesday, October 22, 2003 1:47 PM
> To: Tom Johnston; Jon Awbrey; SUO
> Subject: RE: RE: Re: Missing Ingredients
> 
> 
> Tom Johnston wrote:
> > Rich:
> >
> > 1. To use WordNet to implement what I have in mind would
> > require database
> > tables for every leaf-node concept in WordNet.
> 
> When you download WordNet 2.0, there are 19 tables, properly
> normalized, but expressed in text according to their documentation.
> I've developed Delphi code to extract the 19 tables into true
> SQL tables.  Then I defined a few views that seem useful, but
> I've only just started, so it will grow quickly.  I have yet
> to extract hypernymy, synonymy, or any other -onymies, but
> the relationships are encoded in the detail relations for
> each part of speech.
> 
> 
> > 2. It would also require a database schema for every concept
> > in WordNet,
> > leaf-node or not.
> 
> Or rather, it requires a schema for 'concept' as encoded
> in WordNet tables.  There are lots of ways to skin that cat.
> 
> 
> > 3. Then SQL queries could reference any of these schemas,
> > specifying result
> > sets.
> 
> Yes, and the queries could implement arbitrary predications
> so long as they can be expressed in WordNet terms.  Unless
> you want to build on top of that structure to make more
> application-oriented distinctions that aren't part of the
> commonly communicated concepts.
> 
> 
> > 4. All tables registered under the referenced nodes would be
> > in-scope; all
> > others not.
> 
> If you're using 'scope' in the same way I use 'context', it's
> not a property of entire tables, but of selected rows and
> columns in the query.  Scope in the programming sense is
> appropriate to that level of reference.
> 
> 
> > 5. The semantics of the result set would be as clear as the
> > definitions of
> > the referenced nodes, and not any clearer.
> 
> Yes; there's no free lunch.
> 
> 
> > Theoretical issues abound, of course. And certainly many more
> > than I am
> > aware of. For example:
> >
> > re 1: dictionaries are notoriously not hierarchies, and are
> > notoriously
> > circular. So perhaps we would need a hierarchical distillate
> > of WordNet, not
> > WordNet itself.
> 
> I haven't gotten far enough yet to extract hypernymy relationships,
> but I think somewhere in the WordNet documentation they claim
> that theirs is noncircular, but they offer no proof and they
> apparently don't test that assertion.  So we would have to do it.
> 
> 
> > re 2: are all non-leaf nodes instantiated only through their leaf-node
> > subtypes?
> 
> A 'bird' is defined with 5 senses.
> 
> A 'parrot' is a kind of 'bird' in one of its two noun senses, and a
> 'copycat' in another, and has a verb sense 'to repeat mindlessly', so
> there is a combination of ways in which the word is used.
> 
> If you want a table of birds, every row is a bird.  If you
> have a table of parrots, every row is still a bird, but the
> distinction between parrots and nonparrots has to be encoded
> in a descriptor such as 'BirdType', which might include Robins,
> Parrots, Parakeets, etc.  Since a parakeet is a kind of Parrot,
> you could be encoding a partially hierarchical concept in a
> discrete enumerator.  Just like in the real world.
> 
> Then you can specify queries that retrieve all parrots, robins,
> parakeets, or whatever is needed for the business distinctions.
> 
> My view is that SUO could provide an ontology of concepts
> and relationships that come from the WordNet database, and
> with a lot of work, encode many of the subtler concepts into
> the ontology.  The whole thing could be done in SQL, and would
> provide a starting point for more specific application-oriented
> ontologies that could be grown from the SUO one.
> 
> Instead of calling it the Standard Upper Ontology, we could
> call it the Standard Useful Ontology!
> 
> 
> 
> > re 3 & 4: so SQL queries, at least those referencing more
> > than one database,
> > would be redirected to reference the registration hierarchy, and
> > supplemented to optionally specify an include-database and/or
> > exclude-database list for all databases whose tables are
> > registered under
> > the referenced node. There's a fair chunk of software
> > development work here,
> > and it's hard, internals-type middleware work, not traditional
> > applications
> > development work.
> 
> Yes.  But government agencies (NIST, NIH, NASA, DoD, DOC, DARPA,
> ...) have funded infrastructure work like that before.  So with
> a good starting concept, and some useful results, perhaps some kind
> government agency would fund extension of the work until there
> is a truly useful infrastructure to support database interchange.
> After all, there is a lot of government data that could be used
> to make a better, more capable government service.
> 
> 
> > re 5: my experience has been that the relevant semantics of business
> > database tables nearly always requires distinctions not part of the
> > definition of the ordinary language terms used to label
> > tables (or columns).
> 
> In very large projects, there is a documentation standard
> that defines the tables and columns in English, as well as
> the ER diagrams, UML, and other notations that help define
> the data more meticulously.  The English description is
> what operators get in their 'help' screens.  If a meticulously
> documented SUO is made part of the public infrastructure,
> all kinds of useful tools could extract that information to
> translate user requirements into databases more automatically
> than at present.  That approach would help remedy the seat of
> the pants way of building databases in the future.
> 
> 
> > So even if all this could be done starting with WordNet, we
> > still wouldn't
> > have anything a CFO would pull out his checkbook for.
> 
> A government project manager might though.  Start with the
> government, but aim at the commercial world as the endpoint.
> That's the way CS&EE research has often gone from ideas to
> useful objects.  When the CFO can buy bigger better databases
> for less money, she'll pull out the ATM card and sign the
> receipt.
> 
> 
> 
> > Finally, your comments about sets not being mutually exclusive, and the
> > messiness of databases, are well taken. My emails in which I discussed
> > Wittgensteinian family resemblances, a couple of months ago, were attempts
> > to talk about this messiness. But I'd rather stop talking philosophy,
> > translating back and forth between Peirce, Wittgenstein, Quine and even the
> > dreaded Rorty. I'd rather stop talking about axiomatizing everything we do.
> > I'd rather start building a house on a local sandbar instead of endlessly
> > prospecting for granite bedrock
> 
> That's why I'm putting my spare time into the WordNet project.  I like
> to see something useful come out of my work.  It used to be enough
> to publish papers, but after a while I realized how few people
> read the really deep papers, and how useful the glossier magazines
> and web sites are to most of us.
> 
> 
> >(to use the least favorable metaphor I can
> > think of for what I am recommending). (After all, Stone Mountain, the
> > largest outcrop of granite in the world, is only a few miles from me and my
> > local creek!
> 
> Tom, are you near Stone Mountain?  Coincidentally, I went to
> Georgia Tech in Atlanta many long years ago.  I climbed Stone
> Mountain several times during those years.
> 
> 
> >i.e. various metaphors about firm foundations
> > are creating
> > "analysis paralysis", IMHO.)
> >
> > But so what? Is there an idea here that might lead somewhere?
> > Or have I just
> > rediscovered, in less refined language, some of the more obvious
> > implications of knowledge soup, KIF, a lattice of theories, IFF, C. S.
> > Peirce, twenty years of work in deductive databases, or two ISO documents
> > recently mentioned by Matthew West? I certainly don't know.
> 
> 
> At the end of "Candide", Voltaire has the 40 year old man
> turn from concepts to real world work.  He seems to think
> there is a time to stop analyzing and start producing also.
> 
> Rich
> 
> 
> 
> > -----Original Message-----
> > From: Richard Cooper [mailto:rich@valutech.com]
> > Sent: Wednesday, October 22, 2003 12:22 PM
> > To: Tom Johnston; Jon Awbrey; SUO
> > Subject: RE: RE: Re: Missing Ingredients
> >
> >
> > Tom Johnston wrote:
> > > That's interesting. I've been thinking of a do-it-yourself,
> > > start-from-scratch approach.
> > >
> > > One question: are the entries in WordNet sophisticated enough
> > > to make the
> > > kind of distinctions I've been providing examples of,
> > > distinctions where
> > > tables in different databases but with the same names (the
> > > Customer table,
> > > the Shipments table, etc.) nonetheless have significantly
> > > different set
> > > membership criteria?
> >
> > WordNet provides class structure, so the concept of a
> > customer is returned from their standard search as below:
> >
> > "
> > The noun customer has 1 sense (first 1 from tagged texts)
> >
> > 1. (25) customer, client -- (someone who pays for goods or services)
> > "
> >
> > The distinctions you mentioned were specific predicates that
> > distinguish the class of customers into those who pay upon
> > purchase, those who pay upon shipment, and those who pay
> > a bill when it's due.  Those kinds of distinctions are deeper
> > than "customer", but still based on predications using the
> > same English words you used in describing the subsets.
> >
> > But you could take that arbitrarily deep.  For example,
> > some customers could pay upon purchase for some items, pay
> > upon shipment for others, and pay by mail for yet others.
> > So the sets aren't necessarily mutually exclusive.  Since
> > you've dealt with databases, you know how things can get
> > muddled up by unanticipated real world conditions.
> >
> > Given the columns of your customer table, you could make
> > all kinds of distinctions, such as customers from Chicago,
> > customers over 65, and so on.  All of these distinctions
> > have to be communicated to your users, and your users
> > work in natural language, so you must also.
> >
> > So I think WordNet provides a good starting point, and it's
> > an authoritative reference work that lots of people use,
> > with ongoing funding and good prospects for continued
> > refinement.  And it's free.  So it makes a very good starting
> > point.
> >
> > JMHO,
> > Rich
> >
> >
> > > From a business perspective, that's where the rubber
> > > really meets the road. Clearing up stuff like that is what will get
> > > corporate checkbooks out. Formalizing ordinary language semantics will not.
> > >
> > > Thanks.
> > >
> > > Tom
> > >
> > > -----Original Message-----
> > > From: owner-standard-upper-ontology@majordomo.ieee.org
> > > [mailto:owner-standard-upper-ontology@majordomo.ieee.org] On Behalf Of
> > > Richard Cooper
> > > Sent: Tuesday, October 21, 2003 5:22 PM
> > > To: Jon Awbrey; SUO
> > > Subject: SUO: RE: Re: Missing Ingredients
> > >
> > >
> > >
> > > Jon Awbrey wrote:
> > > <snip>
> > > > TJ: 1.1.  Our goal, I take it, is to increase the semantic
> > > > interoperability
> > > >           of databases.  This means, I take it, (although I
> > > > have found no
> > > >           description of any such thing on the SUO website)
> > > > is to create
> > > >           a registration framework for real world databases.
> > > >
> > > > Tom,
> > > >
> > > > There's about 20 years worth of research on "deductive databases"
> > > > that I can remember just since the first standard textbooks began
> > > > to appear.  But you said bottoms-up, and I'm all for that, well,
> > > > let me check -- yes, it's an odd-numbered day where I am, so OK.
> > > >
> > > > Let us try to approach the question
> > > > of "semantic inter-operability" (SIO)
> > > > by way of the following sub-questions:
> > > >
> > > > 1.  What is the "meaning" of a "set of sentences" (SOS)?
> > > >
> > > > 2.  What is the "meaning" of a "table of tuples" (TOT)?
> > > >
> > > > 3.  How shall we compare the "meanings" of these two?
> > > >
> > > > I will give you and me both time to think and then get back to you.
> > > >
> > > > Jon Awbrey
> > >
> > > This set of three questions is the most important triple we're
> > > dealing with in all SUO work.  Getting clear answers to how
> > > meaning is represented, communicated, stored, compared and
> > > organized would be a successful result.
> > >
> > > We have predefined the answer to be an ontology.  Then we refined
> > > that concept to include the lattice of ontologies, plus the IFF
> > > framework, but I still get the feeling there's a lot of stuff left
> > > out.
> > >
> > > So I agree with Tom that the focus should be refined further
> > > to incorporate real world database concepts, and I add one more
> > > suggestion; that we should be working with natural language
> > > words and sentences to impose the type structure, or class
> > > structure, and property lists, of common everyday concepts like
> > > address, customer, person, ..., fill in your favorite concepts.
> > >
> > > Finally, since we haven't been able to agree on more enhanced
> > > ontologies than WordNet, perhaps we should start the bottom-up
> > > process by extracting exactly the ontology that WordNet provides.
> > > This could be one of the bottom-level concept sets, along with
> > > others that may appear in the lattice as we continue.
> > >
> > > Rich
> > >
> > >
> >
> >
> 
>