RE: SUO: RE: RE: Re: Missing Ingredients
Murray Altheim wrote:
> Richard Cooper wrote:
> > Tom Johnston wrote:
> >
> >>1. I'm afraid what I meant was this: if WordNet has
> >>definitions for, say,
> >>10,000 words that name types of which there can be tokens
> >>(classes of which
> >>there can be members, if you prefer), then we need 10,000 tables.
> >
> > You could do it that way. But I think a more elegant way
> > would be to have one table for all individuals, and one row for
> > each one in the stored discussion. Then there might also be
> > 10,000 views, each identifying the individuals that belong
> > to each of the 10,000 types. Properly constructed as sets
> > of indexes into the main table, these would be 10,000 sets
> > of integers, some of which refer to overlapping sets of
> > individuals.
> >
> > So 'birds' would be the subset of individuals that are
> > 'red robin on the front lawn', 'parakeet', 'parrot' and so
> > forth.
> >
> > This approach should make anaphoric references somewhat
> > straightforward to process.
> >
> > 'There was a red robin on my front lawn when I woke up this
> > morning. The bird had a pretty song and cheered my starting
> > the day.'
> >
> > The first sentence is declarative, and would cause a row in
> > the object table to be created to represent 'the red robin
> > on my front lawn'. The second sentence would extract all
> > known birds, of which 'the red robin on my front lawn' is
> > the only instance, and therefore is designated by 'the bird'
> > in the second sentence.
> [...]
>
> Are we talking about humans or computers making these inferences?
I'm talking about requirements for a program to make these
kinds of inferences.
> I hate to harp on about this, but you seem to be ignoring the fact
> that this doesn't happen so easily.
Please keep 'harping'! You're helping me straighten out my
thoughts, and giving me new insights.
> For example, can you demonstrate
> that in the sentence above, "the bird" refers to the specific red
> robin on the front lawn? Yes, you can. But can a computer?
The computer would look up every word in the sentences and
retrieve the synset numbers. For 'robin', one of those synsets
would be the right one: a kind of bird. The only bird described
in the first sentence is 'a red robin on my front lawn'.
In the second sentence, the computer finds the word 'bird',
which is a class name for birds of every kind. One reasonable
conclusion, available from the list of all possible
interpretations, is that the 'bird' in sentence 2 is the same
as the 'robin' in sentence 1.
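Here is a minimal sketch of that lookup in Python, only to make the
idea concrete. It assumes a toy hypernym table in place of real WordNet
synsets, and the names HYPERNYMS, declare and resolve are hypothetical,
not part of any WordNet interface. It keeps one table of individuals and
one set of row indexes per type, as in the table-and-views design quoted
above.

```python
# Toy stand-in for WordNet: each noun maps to its chain of more general types.
# Hypothetical data; a real system would retrieve synsets and hypernyms from WordNet.
HYPERNYMS = {
    "robin": ["bird", "animal"],
    "lawn": ["field", "land"],
}

individuals = []  # the single table: one row per individual mentioned so far
views = {}        # type name -> set of row indexes into `individuals`


def declare(description, head_noun):
    """Add a row for a newly mentioned individual and index it under each of its types."""
    row = len(individuals)
    individuals.append(description)
    for type_name in [head_noun] + HYPERNYMS.get(head_noun, []):
        views.setdefault(type_name, set()).add(row)
    return row


def resolve(type_name):
    """List every known individual that 'the <type_name>' could designate."""
    return [individuals[i] for i in sorted(views.get(type_name, set()))]


# Sentence 1 is declarative, so it creates rows in the table.
declare("a red robin on my front lawn", "robin")
declare("my front lawn", "lawn")

# Sentence 2 says 'the bird': extract all known birds.
candidates = resolve("bird")
print(candidates)
```

With only one bird in the table, resolve("bird") returns a single
candidate. With several birds in the context it would return all of
them, and something else would have to choose among them.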
Yes, there are many possible interpretations. Yes, the bird
in sentence 2 could be different from the robin in sentence 1.
But most people would interpret this simple example in the way
I gave. The challenge is to order the computer's list of
choices by what's most probable. To do that, the
computer would first have to look at a lot of human-written
text, annotated as to meaning, and organized as to probable
interpretations.
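As a rough illustration of ordering the choices by probability, here is
a small Python sketch. The interpretation labels and the annotated
examples are invented for illustration, and rank_interpretations is a
hypothetical helper. It simply uses relative frequency in the annotated
material as the estimate of what is most probable, which is one simple
option and not the only one.

```python
from collections import Counter

# Hypothetical annotations: for each definite noun phrase in some annotated text,
# the interpretation a human annotator chose. Labels and counts are invented.
ANNOTATED_CHOICES = [
    "same_as_prior_mention", "same_as_prior_mention", "same_as_prior_mention",
    "new_individual", "generic_class",
    "same_as_prior_mention", "new_individual",
]


def rank_interpretations(candidates, annotated_choices):
    """Order candidate interpretations from most to least probable."""
    counts = Counter(annotated_choices)
    total = sum(counts[c] for c in candidates)
    if total == 0:
        # No evidence for any candidate: nothing to rank by, keep the given order.
        return [(c, 0.0) for c in candidates]
    scored = [(c, counts[c] / total) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_interpretations(
    ["new_individual", "same_as_prior_mention", "generic_class"],
    ANNOTATED_CHOICES,
)
for label, probability in ranked:
    print(f"{label}: {probability:.2f}")
```

The first entry of the ranked list is the reading the program would
prefer. The others stay available in case later text rules it out.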
I'm not saying this is easy; it isn't. That's why it's never
been done to the level most people would consider full-scale
natural language processing. But every year, the technology
gets better at this task. It's just a question of time before
we arrive at something most people will accept as adequate
for limited uses.
> There's
> no magic here -- everything has to happen via inferences that are
> sound and reasonable (and I mean that in both senses), and the
> leap from "red robin" to "the bird" is nigh impossible for a
> computer to make without error.
It's nearly impossible even for people to make without error. I
don't expect the computer to be better than people any time in
my lifetime, but I do think improvements are coming, albeit
slowly.
> Things that seem straightforward
> to us in anecdotal examples are by no means so in actual usage.
> Just knowing what a demonstrative pronoun refers to is very tricky
> business.
Yes, there are lots of examples of how pronouns can be used in
controlled texts. So list the possible interpretations. Then
look at the individuals mentioned in the context, and choose the
most probable one using the best historical material available.
> [and yes I'm being picky, because I think to some degree we need
> to be very careful in how we describe these systems, to avoid
> the "magic" creeping in.]
>
> Murray
Agreed.
Rich