SUO: RE: CG: RE: Re: Enhancing Data Interoperability with Ontologies...
John F. Sowa wrote:
> Rich,
>
> That depends on what you mean by polysemy:
>
> > do any of these controlled English
> > languages include forms of polysemy?
>
> If by polysemy, you mean the kind of polymorphism
> that is supported by many programming languages,
> then the answer is yes. They could allow a word
> to be specialized to a subtype by context.
>
> But the phenomenon in natural languages to which
> the term "polysemy" is applied allows much wider
> ranging kinds of variations than are supported
> by most polymorphic languages.
The goal of constructing an English-like language
should be to orient it toward the way people actually
use English in written form. Programming languages
are for precisely stating algorithms, precisely
defining data types, and keeping all structures
relevant to one object type in one place. So the
requirement for precision is all-important in a
programming language.
In a controlled English, the same need for
precision applies, but if that is all that is
provided, writings in the language appear heavily
stilted, like bad English written by poor authors.
> So when you ask a question like that, you should
> supply specific examples of the kind of polysemy
> you are talking about.
>
> In other words, what is it that you want to say?
> And what kind of polysemy do you want? Be specific.
> State some particular sentences that you want to
> express where polysemy is essential to what
> you want to say.
>
> John
Using your paragraphs under "3 Eliminating Ambiguities"
in your article at http://www.jfsowa.com/clce/specs.htm
I will take individual examples.
"Multiple word senses. The most difficult ambiguity
to resolve in natural languages is the correct choice
of word sense in any particular context. CLCE avoids
that ambiguity by brute force: the correct sense for
any reserved word is uniquely determined by the syntax,
and the sense of any user-defined word is limited to
a single relation that is declared for that word."
Brute force is too strong a tool to apply to writings
in a human-oriented controlled English. Programming
languages do use the signature of a function to identify
which overloading is meant, and this capability is very
useful. But in a CLCE, this should be provided for
user-defined words as well as for reserved words. Even
programming languages provide user-defined overloadings
as well as built-in ones, and a CLCE should too.
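As a rough sketch of what signature-based overloading of a user-defined word might look like, here is a small Python illustration. The lexicon, the word "run", its two senses, and the type names are all made up for this example; nothing here comes from the CLCE specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One declared sense of a word: a relation name plus argument types."""
    relation: str
    arg_types: tuple


# Hypothetical user-defined lexicon: one word, several declared senses.
LEXICON = {
    "run": [
        Sense("operate", ("person", "machine")),
        Sense("execute", ("computer", "program")),
    ],
}


def resolve(word, arg_types):
    """Pick the sense whose signature matches the argument types,
    much as a compiler picks an overloading. Raise ValueError when
    no sense, or more than one sense, matches."""
    wanted = tuple(arg_types)
    matches = [s for s in LEXICON.get(word, []) if s.arg_types == wanted]
    if len(matches) != 1:
        raise ValueError(f"{word!r} has {len(matches)} senses for {wanted}")
    return matches[0]
```

With this lexicon, resolve("run", ("person", "machine")) selects the "operate" sense, while a combination that no declared sense accepts is rejected rather than guessed at.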
"Attachment of prepositional phrases. In English,
prepositional phrases can be attached to nouns, verbs,
adjectives, or adverbs. In CLCE, the only preposition
that can be attached to a noun is of, and other
prepositions can only be attached to verbs."
A preposition, adjective, or adverb should still be
overloadable within its phrase, so that a writer can
say "red vegetables" and use that phrase to mean the class
of those vegetables that are red, as opposed to vegetables
of other colors. "Of" is not enough; it just indicates
a property of an object, as in "color of vegetable". All
the common prepositions have meaning because of the
way we perceive the world, and all prepositions should
be able to participate in phrasal overloadings.
Of course, there is also a need for selectional
restrictions, so that a user can't write "color of ideas",
as in Chomsky's example of a syntactically correct
but meaningless sentence, "colorless green ideas
sleep furiously." Built-in selectional restrictions
should be part of the default definition package, like
the "system" package in many programming languages.
User-defined selectional restrictions should also
be provided, so that authors can tailor packages
toward specific domains and give meaning to
the jargon experts use in their writings about these
domains. So an equivalent of a "use" statement in CLCE
should allow later authors to incorporate the
common domain packages into their writings.
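To make the package idea concrete, here is a minimal Python sketch of selectional restrictions split into a built-in "system" package and a domain package pulled in by a "use"-style call. The type hierarchy, the package contents, and the function names are assumptions invented for illustration; they are not part of CLCE.

```python
# Hypothetical type hierarchy: child -> parent.
IS_A = {
    "vegetable": "physical_object",
    "physical_object": "thing",
    "idea": "abstraction",
    "abstraction": "thing",
}

# Built-in restrictions, playing the role of a default "system" package:
# each property names the type its owner must belong to.
SYSTEM_PACKAGE = {"color": "physical_object"}

# Hypothetical domain package supplied by a later author.
COOKING_PACKAGE = {"ripeness": "vegetable"}


def is_a(kind, ancestor):
    """True if kind equals ancestor or descends from it in IS_A."""
    while kind is not None:
        if kind == ancestor:
            return True
        kind = IS_A.get(kind)
    return False


def use(*packages):
    """Merge packages on top of the system package, in the order given,
    like a 'use' statement; later packages override earlier ones."""
    merged = dict(SYSTEM_PACKAGE)
    for package in packages:
        merged.update(package)
    return merged


def allowed(prop, noun, restrictions):
    """Check the phrase '<prop> of <noun>' against the active restrictions."""
    required = restrictions.get(prop)
    return required is not None and is_a(noun, required)
```

Under these assumptions, "color of vegetable" passes and "color of ideas" is rejected, while "ripeness of vegetable" is only accepted once the cooking package has been brought in with use(COOKING_PACKAGE).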
"Scope of quantifiers. Unlike English, which allows
universal and existential quantifiers to be intermixed
in various contexts, CLCE limits universal quantifiers
to two positions: the subject of a clause or a prefix
that is placed in front of a clause; any CLCE quantifiers
that occur elsewhere must be existential. This
restriction enables the scope of all quantifiers to be
determined from the syntax."
This seems a little strict. Making human users meet
syntactic constraints that aren't absolutely necessary
would seem to defeat the purpose of a controlled English.
It might be wise to let up on this, using specific
selectional restrictions to enforce ambiguity resolution.
But your intent - to separate syntactic from semantic
ambiguity resolution - makes sense.
"Referential noun phrases. In natural languages, the
referent of a pronoun or other referential noun phrase
may require implicit background knowledge. In CLCE,
pronouns are replaced by explicit variables, and the
variables may only be omitted under narrowly specified
conditions."
I don't find this too difficult to accept, but there
are probably domains - e.g. drama and psychology -
where "I", "you", "we", "us", "they" and "them" should
be supported. If the only users of a CLCE are objective
engineers and scientists, then it wouldn't be much
of a problem, but in AI, where realistic creatures are
part of the objective, personalities are needed that
can model the use of human-oriented pronouns in
human-oriented ways. Not allowing ANY form of anaphora
seems strict, but I see your point in having to organize
the language so that each and every ambiguity can be
resolved in some methodical way.
"Noun modifiers. Unlike an adjective such as blue,
which expresses an attribute, the word stuffed, as
in stuffed bear, makes a drastic change in the meaning
of the noun. Such modifiers must be combined with
the noun as a predefined term linked by underscores,
such as stuffed_bear or hard_disk_drive."
I strongly disagree. There is enough structure in
a controlled language to define sensible phrasal
restrictions without resorting to long single words,
which lose their readability and interpretability
this way.
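One way this could work, sketched in Python under my own assumptions: declare multiword phrases in a lexicon and group them by greedy longest match, so the writer types ordinary spaced words. The lexicon entries, the category labels, and the function name are hypothetical.

```python
# Hypothetical phrasal lexicon: word sequences declared as single terms.
PHRASES = {
    ("stuffed", "bear"): "toy",
    ("hard", "disk", "drive"): "storage_device",
}
MAX_LEN = max(len(p) for p in PHRASES)


def group_phrases(words):
    """Greedy longest-match grouping of declared phrases in a word list.
    Words that start no declared phrase pass through as one-word tuples."""
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate first; a single word always succeeds.
        for n in range(min(MAX_LEN, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if n == 1 or chunk in PHRASES:
                out.append(chunk)
                i += n
                break
    return out
```

Here "stuffed bear" is treated as one term with its own meaning, yet "bear" on its own is still an ordinary noun, and no underscores appear in the text the author writes.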
"Deeply nested sentences. In formal notations,
parentheses or precedence rules determine the grouping
of phrases, but the grouping in English is often
context dependent. In CLCE, parentheses are required
for deeply nested sentences and for lists of more than
two elements."
This seems to be a reasonable restriction.
"Features beyond FOL. Some of the most difficult problems,
which are still open research issues in linguistics,
involve plural noun phrases, verb tenses, modality, and
an open-ended number of context-dependent questions - all
of which are ruled out by the restrictions on CLCE syntax
and semantics."
In the example above about representing drama, law and
psychology, modality is very important. There should be
built-in functions that can make this less restrictive.
Possibly "statement" or "belief" or other built-in types
can be used to refer to what you like to call "thirdness"
concepts. Modal logic is a powerful capability, and should
be provided as part of the language. Modern planning
algorithms are based on modal logic operators, and those
could certainly be provided as built-ins.
This was a quick, informal series of thoughts I had based
on my experience with ROSIE, a stilted CLCE I used in
the eighties and found somewhat revolting. Better study
than this fast series of thoughts is necessary to fully
organize effective ways to handle ambiguity and polysemy,
but this is a rough outline.
HTH,
Rich