RE: SUO: CG representations for WordNet
Rich,
At 03:43 PM 1/14/2003 -0800, Richard Cooper wrote:
>Adam Pease wrote
> > Rich,
> > Richard Cooper wrote:
> > >Thanks Adam, I've downloaded ..-Top.txt and am looking at it now.
> > >In its first line, it mentions "Aligning the SUMO with WordNet"
> > >(Niles, forthcoming). Has this paper been released yet? It might
> > >clarify some of the things that look a little strange, such as
> > >"commutativefunction", which is not in WordNet 1.6 browser's
> > >repertoire. I would like to understand better how this merger
> > >fits with the WordNet synsets.
> >
> > We haven't managed to publish the paper yet, but I've just now
> > posted it on the ontology page and called it a Teknowledge
> > "tech report".
>
>Thanks, I'm looking at it now. Presently, I have the WN 1.7 definition
>files for the four POS in separate relations, and I see from the paper
>that
>
>"As of June 2002, the ontology contains 965 terms and 3742 assertions"
>
>so there must be a number of WN synsets (109,000 in v. 1.7) that don't
>have representatives in SUMO at present. Is this a correct deduction,
>or has a tremendous amount of growth happened in the last six months?
SUMO is more general than WordNet, so many WordNet synsets map to a far
more general SUMO term. For example, there's no formal SUMO term and
definition for the notion of "antelope", so the WordNet synset that
includes that English word maps to the formal SUMO term HoofedMammal.
> > >When I say I'm looking for a CG database, what I emphasize is
> > >the case relationships required by the verb synsets. From
> > >Steven Pinker's book, it seems that verb cases offer a good
> > >handle for parsing English, but I haven't found a machine
> > >readable source for the case sets.
> >
> > Verbs are very important. The CaseRole relations in SUMO may
> > be what you
> > want, combined with the Process types. Most verbs in WordNet
> > have been
> > mapped to Process types in SUMO.
>
>Yes, if Schank's approach of coalescing lots of verbs into a
>few more rigidly typed forms is correct, maybe that's what happened
>to those other WN verbs.
Yes.
> From "Merged text.txt", I got
>
>(instance agent CaseRole)
>(domain agent 1 Process)
>(domain agent 2 Agent)
>(documentation agent "(&%agent ?PROCESS ?AGENT) means that ?AGENT is
>an active determinant, either animate or inanimate, of the &%Process
>?PROCESS, with or without voluntary intention. For example, water is
>the &%agent of erosion in the following proposition: the water
>eroded the coastline. For another example, Eve is an &%agent in the
>following proposition: Eve bit an apple.")
>
>which I guess means that agent is a CaseRole with two arguments, but
>how Process maps into specific verbs, and when a Process is too complex
>to map into just one or two verbs or a phrase, is not clear. Do you
>have any material that shows how the WN verbs (and their case roles)
>map into SUMO Processes and CaseRoles?
If you look at WordNetMappings-verbs.txt you'll see in the first line after
the comment header that the synset including the word "breathe" maps to the
formal SUMO term Breathing. That's a rare near-equivalent match. The
second synset is more typical: the sense of "choke" glossed as "breathe
with great difficulty" is also mapped to Breathing.
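
A minimal sketch of reading one such mapping line, under stated assumptions. The leading fields follow the standard WordNet data-file layout (offset, lexicographer file number, part of speech, hexadecimal word count, then word/lex_id pairs, then a gloss after "|"). The trailing "&%Term" annotation and the meaning of its one-character suffix are assumptions made for illustration, and the two sample lines are simplified, made-up stand-ins rather than text copied from WordNetMappings-verbs.txt.

```python
import re

# ASSUMPTION: each mapping line carries an annotation "&%" + SUMO term + one
# relation character. The suffix meanings below are illustrative guesses.
RELATIONS = {"=": "equivalent", "+": "subsuming"}
ANNOTATION = re.compile(r"&%([A-Za-z][A-Za-z0-9_-]*)([=+])")


def parse_mapping(line):
    """Return (words, sumo_term, relation) for one mapping line, or None."""
    match = ANNOTATION.search(line)
    if match is None:
        return None
    # Everything before "|" is the WordNet data part; the gloss follows it.
    fields = line.split("|", 1)[0].split()
    # Field 3 is the word count in hexadecimal; words sit at every second
    # position after it, each followed by its lex_id.
    word_count = int(fields[3], 16)
    words = [fields[4 + 2 * i] for i in range(word_count)]
    return words, match.group(1), RELATIONS[match.group(2)]


# Hypothetical, simplified sample lines (offsets and pointer counts made up).
SAMPLE_BREATHE = ("00000001 29 v 01 breathe 0 000 "
                  "| draw air into, and expel out of, the lungs &%Breathing=")
SAMPLE_CHOKE = ("00000002 29 v 01 choke 0 000 "
                "| breathe with great difficulty &%Breathing+")

if __name__ == "__main__":
    print(parse_mapping(SAMPLE_BREATHE))
    print(parse_mapping(SAMPLE_CHOKE))
```

The real files should be checked before relying on the suffix characters or on the position of the annotation.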
Making proper use of the CaseRole(s) as targets of natural language
translation is not something that we have corpora to support, yet, although
we expect to have a paper describing an approach soon. As a simple example,
"John kicks the cart."
could be translated to
(exists (?EV ?OBJ)
  (and
    (instance ?EV Impelling)
    (instance ?OBJ Device)
    (instance John-1 Human)
    (agent ?EV John-1)
    (patient ?EV ?OBJ)))
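
As a rough illustration of how such a formula could be assembled once a verb has been resolved to a Process type and its arguments to CaseRoles, here is a small sketch. The function names kif_event and balanced are hypothetical, and the choice of Impelling, Human and Device is taken from the example above rather than derived by any parser.

```python
def kif_event(process, agent, agent_type, patient_type):
    """Render a KIF existential for an event with an agent and a patient."""
    clauses = [
        f"(instance ?EV {process})",
        f"(instance ?OBJ {patient_type})",
        f"(instance {agent} {agent_type})",
        f"(agent ?EV {agent})",
        "(patient ?EV ?OBJ)",
    ]
    body = "\n".join("    " + clause for clause in clauses)
    # The final "))" closes the (and ...) and the (exists ...) forms.
    return f"(exists (?EV ?OBJ)\n  (and\n{body}))"


def balanced(text):
    """Check that parentheses in a KIF string are properly nested."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


if __name__ == "__main__":
    formula = kif_event("Impelling", "John-1", "Human", "Device")
    print(formula)
    print(balanced(formula))
```

The balance check is only a sanity test on the generated text; it says nothing about whether the formula is well typed against the SUMO domain declarations.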
>Perhaps I should just be patient and study it longer, but it seems to
>me that a detailed relationship between WN's 109,000 senses and class
>structures should be very useful for SUMO wrestlers who want to use
>both capabilities in one place.
Indeed, if I've understood you, that's exactly what we have done in the
mappings.
> >There are exceptions for
> > stative verbs
> > which aren't processes, as well as various verbs which have
> > very vague
> > meanings in English.
>
>Where are these kinds of exceptions documented? Especially
>useful would be something machine processable. If the SUMO
>Process is much more formally controllable, is there a way to
>automate the translation of many WN verb synsets into SUMO
>Process specs?
They are not well documented, except that verbs which didn't have a good
mapping have been given the default mapping to the SUMO term Process. It
would be beneficial to have a comprehensive analysis of this, but we're not
a big operation here! We'd love to have other folks look at our mappings
and do this sort of analysis. I think there's lots of good research that
could be done on the foundation we've created.
I'm not sure what you mean by "automate". We had to do the mappings by
hand, but now that they're done it's easy to automate the lookup of a SUMO
term from an English word. That function is handled in the "English term:"
window in the SUMO browser.
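
A minimal sketch of that kind of lookup, assuming the hand-made mappings have already been loaded as (word, SUMO term) pairs. The three pairs below are just the examples mentioned in this thread, and build_index and sumo_terms are hypothetical names, not functions of the SUMO browser.

```python
from collections import defaultdict

# Pairs taken from examples in this thread; a real index would be built
# from the mapping files.
PAIRS = [
    ("breathe", "Breathing"),
    ("choke", "Breathing"),
    ("antelope", "HoofedMammal"),
]


def build_index(pairs):
    """Map each lower-cased English word to the set of SUMO terms it reaches."""
    index = defaultdict(set)
    for word, term in pairs:
        index[word.lower()].add(term)
    return index


def sumo_terms(index, word):
    """Return the SUMO terms for a word, sorted; empty list if unmapped."""
    return sorted(index.get(word.strip().lower(), set()))


if __name__ == "__main__":
    index = build_index(PAIRS)
    for word in ("breathe", "Choke", "antelope", "unicorn"):
        print(word, sumo_terms(index, word))
```

A word with several senses would simply collect several terms in its set, which is why the lookup returns a list rather than a single term.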
> > >SUMO seems to be somewhat more mathematically oriented than
> > >NLP oriented - or am I misjudging it in that way? Does
> > >SUMO contain, in an extractable way, the case relationships
> > >for WordNet verbs?
> >
> > SUMO is a formal ontology in mathematical logic, but thanks
> > to the WordNet
> > mappings, we think it can be used for NLP applications. Take
> > a look at the
> > CaseRole(s) and see what you think. I'd be happy to talk more.
> >
> > Adam
>
>These are my initial thoughts, but I'll study the material some
>more and see if there is some way to get a hold of it.
Great! Let me know if I can be of further help.
Adam
>Thanks yet again,
>Rich
>
>
>
> > >Thanks for your help again!
> > >
> > >Rich
> > >
> > >
> > >Adam Pease wrote:
> > > > Richard,
> > > > Since KIF and CGs are equivalent, our Suggested Upper
> > > > Merged Ontology
> > > > (SUMO) could be expressed in CG. We've mapped all 100,000
> > > > WordNet synsets
> > > > to SUMO. Both the ontology and the mappings are free. The
> > > > mappings are
> > > > released under the GNU license. See our main page at
> > > >
><http://ontology.teknowledge.com>
>or
> > go directly to
> > >
> >
> <http://ontology.teknowledge.com/cgi-bin/cvsweb.cgi/SUO/>.
>
> >
> > > The SUMO is
> > > listed as "Merge.txt" and the WordNet mappings are in WordNet
> > > file format
> > > and labeled as WordNetMappings-Top.txt,
> > > WordNetMappings-adjectives.txt,
> > > WordNetMappings-adverbs.txt and WordNetMappings-verbs.txt
> > >
> > > Adam
> > >
> > > At 10:29 AM 1/14/2003 -0800, Richard Cooper wrote:
> > >
> > > > From reading "Task-Oriented Semantic Interpretation" at
> > > ><http://www.jfsowa.com/pubs/tosi.htm>
> > >I find that CG types are one-to-one with word senses, and
> > >each can have one or more canonical CGs. If word senses
> > >(types) are in a generalization lattice, does that mean that
> > >every node in the lattice has one or more CGs? How can I
> > >get hold of the actual CG structures?
> > >
> > >Using WordNet, I've been looking at the word senses and
> > >template phrases that are defined for each word. Is there
> > >a way to translate the WordNet entries into CGs? Or better
> > >yet, is there a database of CGs that corresponds to the
> > >WordNet entries?
> > >
> > >Having such a database resource should help NLP developers
> > >work with CGs and WordNet at the same time. Maybe this could
> > >even be related to the IFF concepts. Of course, the axioms
> > >from WordNet would be sparse, but that's another story for
> > >a future step.
> > >
> > >Comments appreciated,
> > >
> > >Rich