
Obligatory and Optional Features



I received a question about a common problem that
arises in translations between English and languages
that have an obligatory marker for perfective verbs
(such as Chinese and most Slavic languages).

This issue, which is important for machine translation,
also illustrates some points that are fundamental to
every project in ontology or knowledge representation:

  1. The number of describable aspects of any situation
     is infinite, and every statement in any language,
     natural or artificial, is an abstraction from the
     subject matter for some particular purpose.  There
     is always an infinite amount of information left
     over, which the speaker (or writer) chose to ignore
     in order to focus on some smaller, more manageable
     subset that the speaker considers relevant.

  2. Every knowledge representation language (including
     every natural or artificial language) is designed
     to facilitate the expression of certain kinds of
     information at the expense of other kinds.

  3. For example, predicate calculus focuses attention
     on quantifiers, Boolean operators, true-false
     predication, and the coreference links expressed
     by variables.  Those features can also be expressed
     in every natural language, but they might not be
     the most relevant for any particular problem.

  4. Linguists talk about obligatory distinctions, which
     must be expressed in a particular linguistic form,
     as opposed to optional distinctions, which may be
     stated or omitted at the speaker's option.

  5. In predicate calculus, quantification, predication,
     and coreference are obligatory, and many other
     features, such as purpose, focus, time, place,
     speaker, listener, etc., are optional.  Although
     predicate calculus could express those features,
     somebody must first define appropriate predicates
     (often called an ontology) that could be used to
     make assertions about those features.

  6. The obligatory features expressed in natural languages
     are more similar to one another than they are to the
     very limited number of obligatory features in
     predicate calculus.  However, different languages
     have evolved over the centuries in ways that make
     different features obligatory or optional.
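
As a hedged illustration of point 5, the sentence "Yesterday Bob
ate dinner" might be written with an event variable as shown
below.  The predicate names Eat, Agent, Dinner, Theme, and During
are hypothetical choices made only for this sketch; they are
exactly the kind of ontology that somebody must define first.

```latex
\exists e\, \exists d\, \bigl(
    \mathrm{Eat}(e) \land \mathrm{Agent}(e, \mathit{Bob})
    \land \mathrm{Dinner}(d) \land \mathrm{Theme}(e, d)
    \land \mathrm{During}(e, \mathit{Yesterday}) \bigr)
```

The quantifiers, the predications, and the coreference through
the variables e and d cannot be left out.  The During conjunct
is optional:  dropping it still leaves a well-formed formula.
Stating completion would require yet another predicate, such as
a hypothetical Completed(e).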

The following discussion of tense and aspect illustrates
some design features that differ from one natural language
to another.  Similar differences occur between different
knowledge representation languages and ontologies.  It is
important for knowledge engineers to be sensitive to those
differences and to recognize the choices that may simplify
the expression of one kind of feature at the expense of
making others more obscure, harder to express, or harder
for the reader or listener to recognize.

I often talk about this issue under the title "knowledge soup",
and this example is just one more illustration.

John Sowa
________________________________________________________

> Say, if you have a habit of swimming every day, and
> "Yesterday you swam", do you mean that, in Simple Past,
> the habit is not continuous today?  Does Simple Past
> imply the habit finishes on "yesterday"?  But Simple Past
> cannot indicate a finish.

> If you eat dinner as a routine, and "Yesterday you ate dinner",
> does Simple Past mean that the routine is finished and will
> not continue today? But the routine of eating dinner must
> continue today, I am sure. It follows that what happened
> "yesterday" is not necessarily finished. Then, how can you
> say "Yesterday you ate dinner" is finished? This is the
> difficulty.

Linguists distinguish multiple kinds of information that
may be associated with verbs, and tense is just one of them.
One book that has a good summary of the issues across many
different languages is

    _The Evolution of Grammar:  Tense, Aspect, and Modality in
    the Languages of the World_ by Joan Bybee, Revere Perkins,
    and William Pagliuca, University of Chicago Press, 1994.

It is usually possible to express the same kinds of information
in any language, but some languages make certain features
obligatory, while other languages leave them optional.
The optional features are frequently omitted, and stating them
may require a complicated expression for what might be said
in a single syllable in some other language.

The distinction of completed vs. continuing action is called
_perfective_ vs. _imperfective_ aspect.  In some languages
such as Chinese and the Slavic languages, the distinction is
obligatory.  But in English, the distinction is optional,
and the speaker may leave it unspoken if it is not important
for the sentence (or if it can be inferred from context).

  1. In Slavic languages, such as Russian, Polish, and Bulgarian,
     both tense and aspect are indicated explicitly in every verb.
     Those languages have explicit endings on the verbs to mark
     past tense and auxiliary verbs (similar to English "shall"
     or "will") to mark future.  In addition, every verb is
     marked as either a perfective verb (which expresses a
     completed or finished action or state) or an imperfective
     verb (which indicates an action or state whose completion
     is indefinite or continuing).  There are also some verbs
     that indicate habitual or repetitive actions.

  2. Chinese has a much simpler grammar, in which there are no
     tense endings on verbs.  Unlike English and the Slavic
     languages, the time may be omitted in Chinese if it can
     be inferred from context.  If the time is significant,
     it must be mentioned by some explicit word or phrase that
     indicates when the action occurred.  However, Mandarin
     Chinese does have a short word "le", which is used to
     mark an action that has finished as opposed to an action
     that may be continuing.

This illustrates a typical cross-linguistic issue:  Russian
is closer to English in marking tense, but it is closer to
Chinese in marking aspect.  Consider some examples:

  1. "Yesterday Bob swam in Lake Michigan."

  2. "Yesterday Bob swam across the English Channel."

  3. "Yesterday Bob ate chicken."

  4. "Yesterday Bob ate dinner with friends."

In all four sentences, English and Russian would use some
form of past tense marker, even though the word "yesterday"
makes that marker redundant.  In Chinese, there is no explicit
tense marker, and the word for "yesterday" would be sufficient
to indicate the time.
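
The comparison so far can be sketched as a small table of
obligatory features.  This is only a minimal sketch in Python:
the table holds just the two features and three languages
discussed in this note, and the names OBLIGATORY and
features_to_infer are made up for the illustration.

```python
# Obligatory verbal features, as described in this note.
# The table and its names are illustrative assumptions,
# not a standard linguistic resource.
OBLIGATORY = {
    "English": {"tense"},
    "Russian": {"tense", "aspect"},
    "Chinese": {"aspect"},
}


def features_to_infer(source, target):
    """Return the features that the target language must mark
    but that the source language may leave unstated."""
    return OBLIGATORY[target] - OBLIGATORY[source]


if __name__ == "__main__":
    for src, tgt in [("English", "Russian"),
                     ("English", "Chinese"),
                     ("Chinese", "English")]:
        print(src, "->", tgt, ":", sorted(features_to_infer(src, tgt)))
```

For English to Russian or English to Chinese, the sketch reports
that aspect must be inferred by the translator, which is the
difficulty raised in the question.  For Chinese to English, the
missing feature is tense.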

In English, none of the verbs explicitly mark completion of
the action, but context is often sufficient for translation.
The action in sentence (4) was almost certainly finished because
the word "dinner" indicates a complete activity.  In unusual cases,
the speaker would have mentioned the circumstances with
another clause beginning with the word "but":

    ... but a hurricane demolished the restaurant before
    they finished.

    ... but they lingered so long that dinner merged with
    breakfast.

(By the way, this example also illustrates the difference
between "and" and "but".  Both of those conjunctions are
translated to logic with the same symbol, but the word
"but" indicates that some typical default information is
contradicted by the subsequent clause.)

Sentence (2) is not as certain, because it describes a much more
difficult effort, which many people have attempted without success.
But if that sentence were expressed without any qualification,
one might assume that the effort was successful.

Sentences (1) and (3) are not as clear.  If no other information
were available from context, one could assume from (1) that Bob
stopped swimming before nightfall, since swimming is a strenuous
activity.  However, it is not clear whether Bob completed some
goal, such as swimming for a prespecified time or distance.
The translator should use some indefinite form that implies
that the action finished yesterday, but leaves open the
question of whether some goal was achieved or whether it
was part of Bob's daily routine.

Sentence (3) raises another issue:  English uses articles
to distinguish "chicken", "a chicken", and "the chicken".
Although Chinese and Russian do not have articles, the
choice of article in English may affect the choice of
perfective vs. imperfective marker in other languages.

The sentence "Bob ate a chicken" would imply that Bob had
eaten an entire chicken, and the translation should use
the perfective marker.  The phrase "the chicken" would
imply that Bob had eaten some specific amount of chicken,
but it could either mean an entire predesignated chicken
or just some entire portion that was put on his plate.
In either case, the perfective marker would be used.
The bare word "chicken" would indicate some amount of meat
from a chicken, but with no information about completion.
A translation with an imperfective marker would be
appropriate.
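
The three cases above can be written as a small rule.  This is a
rough heuristic sketch in Python, not a real translation
algorithm:  the function name aspect_for_eaten_object is
hypothetical, and the rule covers only the object of "Bob ate ..."
as discussed here.  A real translator would also need context.

```python
def aspect_for_eaten_object(noun_phrase):
    """Suggest an aspect marker for the verb in 'Bob ate <noun_phrase>'.

    Covers only the three cases discussed in the text:
      "a chicken"   -> perfective   (an entire chicken)
      "the chicken" -> perfective   (some specific, complete amount)
      "chicken"     -> imperfective (no information about completion)
    """
    words = noun_phrase.lower().split()
    # An article signals a bounded quantity, hence a completed action.
    if words and words[0] in ("a", "an", "the"):
        return "perfective"
    # A bare noun leaves completion open.
    return "imperfective"


if __name__ == "__main__":
    for np in ("a chicken", "the chicken", "chicken"):
        print(np, "->", aspect_for_eaten_object(np))
```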

The issue of articles illustrates another point:  Bulgarian
is very close to Russian in most respects, but unlike
Russian it does have a definite marker for noun phrases.
There has been a lot of research on how articles interact
with the perfective/imperfective distinction in verbs.
Following is a comparison of Chinese and English:


http://www-uilots.let.uu.nl/conferences/Perspectives_on_Aspect/Proceedings/soh.pdf
Perfective Aspect and Accomplishment Situations in Mandarin Chinese

All these examples illustrate an important point about
natural languages:  ambiguity and imprecision are *not*
characteristic of natural languages; on the contrary,
NLs are capable of extremely great precision.  The major
problem with translating a natural language into predicate
calculus (or into any other NL) is in trying to capture
the extreme amount of detail that can be expressed in the
source language -- or rather of deciding how much of it
must be preserved and how much of it can be omitted or
slightly modified without affecting the primary purpose
of the original.

I found the above example of Mandarin Chinese by using
a search engine over the WWW.  I also came across an
article that includes a good list of aspectual features,
which some languages make obligatory and others leave
optional (see below).

John Sowa
_____________________________________________________________

Source:  http://www.wordiq.com/definition/Grammatical_aspect

     * Habitual: 'I walk home from work.' (every day)
       'I would/used to walk home from work.' (past habit)
     * Perfect: 'I have/had gone to the cinema.'
     * Imperfect: 'I went to the cinema.'
     * Imperfective: 'I'm going home.' (the action is in progress)
     * Perfective: 'I went home.' (the action is finished)
     * Progressive: 'I am eating.'
     * Prospective: 'I am about to eat.'
     * Inceptive: 'I am beginning to eat.'
     * Continuative: 'I am continuing to eat.'
     * Terminative: 'I am finishing my meal.'
     * Inchoative: 'My nose is turning red.' (from the cold)
     * Cessative: 'I am quitting smoking.'
     * Pausative: 'I stopped working for a while.'
     * Resumptive: 'I resumed sleeping.'
     * Punctual: 'I slept.'
     * Durative: 'I slept for an hour.'
     * Delimitative: 'I slept for a while.'
     * Protractive: 'The argument went on and on.'
     * Iterative: 'I read the same books again and again.'
     * Frequentative: 'I go to school a lot.'
     * Experiential: 'I have gone to school many times.'
     * Intentional: 'I listened carefully.'
     * Accidental: 'I knocked over the chair.'
     * Generic: 'Mangos grow on trees.'
     * Intensive: 'It glared.'
     * Moderative: 'It shined.'
     * Attenuative: 'It glimmered.'