CE RE: MRE Starter grammar
Adam,
. Further comments interspersed below, prefaced "GH2>>> ".
Cheers Graham Horn
National Data Standards Unit
Australian Institute of Health and Welfare
================================================
Phone: +61.2.6244.1094
Fax: +61.2.6244.1199
Email: Graham.Horn@aihw.gov.au <mailto:graham.horn@aihw.gov.au>
-----Original Message-----
From: Adam Pease [mailto:apease@ks.teknowledge.com]
Sent: Thursday, June 07, 2001 2:31 PM
To: Horn, Graham; suo-ce@ieee.org
Subject: RE: MRE Starter grammar
Graham,
Comments below:
At 08:10 PM 6/4/2001 +1000, Horn, Graham wrote:
>Adam,
> . I hope you got a more comprehensible structure of what I
>sent than what's below. If that's what you got, I can try reformatting it
>another way to make the structural hierarchy more evident.
>
> . I am calling it MRE, because I believe that title is more
>precise. I am happy for people to correct me if I am in error, or to provide
>an even more precise and succinct label, but I believe it is more than just
>controlled. Also, one participant earlier felt "controlled" implied users
>were being restricted in what they wanted to say, rather than just how they
>can express it.
I think we can achieve the latter.
> . Further comments interspersed below, prefaced "GH> ".
>
>-----Original Message-----
>From: Adam Pease [mailto:apease@ks.teknowledge.com]
>Sent: Saturday, June 02, 2001 2:11 AM
>To: Horn, Graham; suo-ce@ieee.org
>Subject: RE: CE Starter grammar
>
>Graham,
> I have two concerns
>1. It is often not possible to determine the purpose of an utterance
>from its surface structure, and in isolation. I think it will be important
>to keep the distinctions among syntax, semantics and pragmatics.
>
>GH> While I agree natural language doesn't avoid ambiguities, I suggest
>we manage the various types of ambiguities as we encounter them. I think we
>will be able to eliminate most of them by developing a few syntactic rules
>forbidding certain constructions. IIRC, ACE has just such rules.
>
>GH> I agree we should keep the distinctions among syntax, semantics and
>pragmatics. I don't quite know what structure you are referring to here.
>
>GH> Actually, I suspect I will end up wanting more structural
>distinctions than just the ones you mention. That's because I suspect some
>of the ones I will want will implicitly reflect structures required to be
>explicitly stated in KIF. Naturally we will have to explicitly declare the
>meaning of any such implicit structures.
>
>2. "process" is likewise an ontological issue rather than a syntactic
>one.
>
>GH> I don't mind if the label I dubbed "process" is changed, so long as
>the structure and functionality is precise. I think the structure I am
>referring to is really "process word group", which is different to
>"process". Nevertheless, it is also different to "verb". Perhaps you can
>suggest better labels. I mainly care about getting the structure right, tho'
>I believe the labels should not be ambiguous or confusing, and would prefer
>they not be clumsy.
I'm not taking issue with the label names. I agree that different labels
for "process" are possible. I'm addressing a different issue. Labeling a
particular verb as a "process" verb, even in a highly restricted grammar,
may not be possible without doing a "first pass" on a sentence to determine
which words conform to which parts of speech.
GH2>>> I was just trying to separate the concept of what a verb or verb
group does from the actual word, since sometimes the verb function is
represented by several words.
GH2>>> Should I call it "verb group", instead of "process"?
I'm not a computational linguist by trade, but there is a fairly standard
process for language interpretation that is followed in both restricted
grammar interpretation and true natural language interpretation. It
consists of at least two phases. The first marks the parts of speech. The
second interprets the parse tree to generate the associated semantics of the
sentence. There may be other approaches that I'm not aware of, but this is
the approach I'm assuming.
What I objected to was your providing a grammar - which is a map for the
allowable parse trees - that included elements that I don't believe can be
reasonably determined during the parsing phase. Or, if they were, it would
result in an overly complex and brittle parser. By separating things into
at least these two phases, we get a more tractable problem. Even in the
simplest possible programs that you find in Prolog textbooks, this two-phase
approach is employed.
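
A minimal sketch of the two-phase approach described above, in Python.
Everything named here (LEXICON, tag, parse, label_roles) is a hypothetical
illustration rather than part of any proposal in this thread. The sketch
assumes a toy lexicon and only the pattern Det N V [Det] N, which is looser
than the strawman grammar in that it lets the object omit its determiner.
The point is only that a label such as "process" is attached after the
syntactic parse, not during it.

```python
# Hypothetical two-phase sketch. LEXICON and all function names are
# illustrative assumptions, not taken from ACE or from this thread.

LEXICON = {
    "the": "Det", "a": "Det", "an": "Det", "all": "Det", "every": "Det",
    "teachers": "N", "cats": "N",
    "have": "V",
}


def tag(sentence):
    """Phase 1: mark each word with its part of speech."""
    words = sentence.rstrip(".").lower().split()
    tagged = []
    for word in words:
        if word not in LEXICON:
            raise ValueError(f"unknown word: {word!r}")
        tagged.append((word, LEXICON[word]))
    return tagged


def parse(tagged):
    """Phase 2a: build a parse tree for the pattern Det N V [Det] N."""
    words = [w for w, _ in tagged]
    tags = [t for _, t in tagged]
    if tags == ["Det", "N", "V", "N"]:
        obj = ("SNP", None, words[3])          # object without determiner
    elif tags == ["Det", "N", "V", "Det", "N"]:
        obj = ("SNP", words[3], words[4])
    else:
        raise ValueError(f"sentence does not fit the grammar: {tags}")
    return ("sentence", ("SNP", words[0], words[1]), ("VP", words[2], obj))


def label_roles(tree):
    """Phase 2b: attach role labels by walking the finished parse tree."""
    _, subject, verb_phrase = tree
    _, verb, obj = verb_phrase
    return {"subject": subject[2], "process": verb, "object": obj[2]}


if __name__ == "__main__":
    tree = parse(tag("The teachers have cats."))
    print(tree)
    print(label_roles(tree))
```

Here the grammar used by parse knows only about Det, N and V. The role
labels are computed from the tree afterwards, so changing how "process" is
defined would not require touching the parser.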
GH2>>> I guess I am not appreciating why the parser becomes so "brittle".
That said, certainly there are many common forms of expression that I would
forbid on the grounds of ambiguity, or else, like ACE, allow only a single
interpretation.
As a concrete example, and one that conforms to the ACE grammar, take the
following two sentences:
The teachers have cats.
All teachers have cats.
and a reasonable translation of each to logic:

(exists (?X ?Y)
  (and
    (instance-of ?X (GroupFn Teacher))
    (instance-of ?Y (GroupFn Cat))
    (possesses ?X ?Y)))

(=>
  (instance-of ?X Teacher)
  (exists (?Y)
    (and
      (instance-of ?Y Cat)
      (possesses ?X ?Y))))
In the first case "teachers" is a "real" plural noun: it refers to a group
or set of teachers. In the second case "teachers" refers to all teachers. So
the same token, "teachers", has a different meaning depending upon context.
Even ACE does not eliminate ambiguities of this sort, because it doesn't
need to: they're quite tractable. They do, however, very nearly require
multiple passes if the grammar is not to be complicated unnecessarily.
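
The contrast can be sketched in Python as an interpretation step that runs
after parsing and chooses a logical form from the determiner. The names
CLASSES, PREDICATES and to_kif are hypothetical, the two small tables are
assumptions made for the example, and the emitted strings are only one
plausible rendering of the group reading and the universal reading.

```python
# Hypothetical sketch: the same noun token gets a different logical
# treatment depending on the determiner that precedes it.

CLASSES = {"teachers": "Teacher", "cats": "Cat"}    # assumed noun -> class
PREDICATES = {"have": "possesses"}                  # assumed verb -> relation


def to_kif(det, subject, verb, obj):
    """Return a KIF-style string for the sentence 'det subject verb obj'."""
    s = CLASSES[subject]
    o = CLASSES[obj]
    p = PREDICATES[verb]
    if det == "the":
        # Group reading: some group of teachers possesses some group of cats.
        return (
            f"(exists (?X ?Y) (and (instance-of ?X (GroupFn {s})) "
            f"(instance-of ?Y (GroupFn {o})) ({p} ?X ?Y)))"
        )
    if det in ("all", "every"):
        # Universal reading: each individual teacher possesses some cat.
        return (
            f"(=> (instance-of ?X {s}) "
            f"(exists (?Y) (and (instance-of ?Y {o}) ({p} ?X ?Y))))"
        )
    raise ValueError(f"no interpretation rule for determiner: {det!r}")


if __name__ == "__main__":
    print(to_kif("the", "teachers", "have", "cats"))
    print(to_kif("all", "teachers", "have", "cats"))
```

Both calls receive the identical token "teachers". Only the determiner,
which is already available once the parts of speech are marked, decides
which formula is produced.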
GH2>>> I'm afraid I'm not appreciating the ways in which <In the second
case "teachers", has a different meaning depending upon context>. Can you
explain or exemplify?
GH2>>> Certainly I have no intention of making something unparsable.
Unfortunately I don't have enough experience with parsers to understand
your particular perspective/concern. Once I do understand it I can have a go
at avoiding it.
>I think these labels are fine so long as we acknowledge that they could only
>be added either explicitly by the source of the utterance, or at a stage of
>processing following syntactic parsing.
>Adam
>At 04:04 PM 6/1/2001 +1000, Horn, Graham wrote:
> >Folks,
> > . May I suggest each line of a diagrammatic figure be started
> >with a dot, so as to minimise format obliteration by e-mail & firewall
> >systems. I also suggest Courier font for them, to utilise the constant
> >character width of that font. Please note the number of leading dots in the
> >hierarchical indented structures below, in case the indenting is stripped
> >out.
> >
> > . I also feel we will need a symbology that indicates
> >optionality of elements and structures.
> >
> > . I had a go at some simple augmentations to this, but there
> >are many aspects, of course, to consider. I feel we will rapidly need to
> >expand the paradigm.
> >
> > . May I suggest we classify sentences, such as:
> >* purpose:
> >* . statement
> >* . query
> >* . command
> >* complexity:
> >* . simple - only one independent clause
> >* . compound - more than one independent clause, and no dependent clauses
> >* . complex - also has one or more dependent clauses
> >
> > . I also suggest we break them down into subject, process, and
> >possibly indirect object and/or direct object.
> >
> > . Doubtless I sound like an old school master, but I strongly
> >suspect we will rapidly require most of the basic traditional grammar
> >elements. Hence I suggest the following enlarged and hierarchical list:
> >
> >* LP = label phrase, of type:
> >* U = subject,
> >* I = indirect object, and
> >* B = direct object; comprising:
> >* . L = label
> >* .. N = noun
> >* .. PN = proper noun
> >* .. R = pronoun
> >o ... QR = query pronoun [who|what|which]
> >o ... (there are other types of pronoun)
> >* .. Q = qualifier
> >o ... J = adjective
> >- .... DJ = determining adjective [the|a|an|all|every]
> >o ... JP = adjectival phrase (E + NP, eg: "with the tag")
> >o ... JC = adjectival clause (form of simple sentence, but not
> >semantically complete, and introduced by a subordinating conjunction, eg:
> >"which he wore")
> >* P = process
> >* . PP = process phrase
> >* .. V = verb (finite)
> >* .. AV = auxiliary verb [
> >* .. PV = participle verb [
> >* .. M = modifier
> >o ... A = adverb
> >o ... QA = query adverb [how|what|where]
> >o ... AP = adverbial phrase (E + NP, eg: "along the wide road")
> >o ... AC = adverbial clause (form of simple sentence, but not
> >semantically complete, and introduced by a subordinating conjunction, eg:
> >"when they arrived")
> >* E = preposition [like|as|...|etc.]
> >* C = coordinating conjunction [and|but|or|...|etc.]
> >* S = subordinating conjunction [which|how|what|...|etc.]
> >
> > . I'm afraid I can't see any way of avoiding this level of
> >complexity. Fortunately the paradigm is well proven over thousands of years.
> >
> >
> > . For our purposes, we get a large task very early on, but we
> >can take it incrementally by analysing and specifying each type of sentence,
> >starting from the simplest, one at a time.
> >
> > . I am using square brackets to indicate optionality below. I
> >have also not required optionality to be indicated below the level at which
> >it occurs.
> >
> > . By the way, I am not wedded to the above symbols - they were
> >just drawn off the top of my head. We will probably have to assign roles to
> >various characters, so as to make the meaning unambiguous.
> >
> > . Now let's have a go at the simplest generic structure.
> >.          simple statement
> >.           /     |    \    \
> >.          U      P    [I]   [B]
> >.         /       |      \   /
> >.       LP       PP       LP
> >.      /  \     /  \     /  \
> >.   [Q]    L   P    M [Q]    L
> >
> > . Suggestions, comments, corrections, criticisms welcome.
> >
> >-----Original Message-----
> >From: Adam Pease [mailto:apease@ks.teknowledge.com]
> >Sent: Friday, June 01, 2001 12:15 PM
> >To: suo-ce@ieee.org <mailto:suo-ce@ieee.org>
> >Subject: CE Starter grammar
> >
> >
> >Folks,
> >I'll suggest the following strawman as a start on a restricted English
> >grammar.
> >            sentence
> >            /      \
> >        SNP|Q       VP
> >        /   \      /  \
> >     Det   N|PN   V   NP|SNP
> >                      /  |  \
> >                   Det   M   N|PN
> >
> >where:
> >SNP = simple noun phrase
> >NP = noun phrase
> >Det = determiner [the|a|an|all|every]
> >N = noun
> >V = verb
> >VP = verb phrase
> >PN = proper noun
> >M = modifier
> >Q = query word [who|what|where]
> >
> >
> >Adam
>
>Adam Pease
>Teknowledge
>(650) 424-0500 x571