Thread Links Date Links
Thread Prev Thread Next Thread Index Date Prev Date Next Date Index

SUO: Propositions (was Peirce's MS 514)




Pat and Jon,

JS>>> I used the word "statement" to avoid getting into details
>> > about the distinction between a "statement" and a "proposition".
>> > But as I have said in my KR book (Ch. 5) and many email notes,
>> > I prefer to define a proposition as an equivalence class of
>> > statements in multiple languages.  With such a definition,
>> > you can talk about entailment in a form that factors out as
>> > much of the notation as you like.  (The larger the equivalence
>> > class, the smaller the dependence on any particular notation.)

JS>>> This kind of factoring is especially important for the SUO,
>> > since we must deal with multiple languages, including CGs,
>> > DAML, OIL, SQL, RDF, UML, and versions of controlled natural
>> > languages, such as ACE or Aristotelian syllogisms.
 
PH>> Interesting idea, and I agree important.  But to give it flesh
>> you need to say what the equivalence relation is, relative to
>> which the equivalence classes are defined;  and that in turn
>> seems to require some way to compare the semantic content
>> of expressions from different languages.  If they are
>> all interpreted in the same structures one could see
>> how this would go -- equivalent expressions denote
>> the same things in all interpretations -- but the
>> more general construction eludes me.

In Ch 5, I defined the notion of a "meaning-preserving
translation", either within a single language or among
multiple languages.  The simplest example is identity,
but I define a more interesting class of transformations:

   http://www.bestweb.net/~sowa/logic/meaning.htm

That file has the nicely formatted version.  The plain text
version is attached below.

JA>This is exactly the sort of problem that
>Category Theory was invented to solve:

I agree, and I was thinking of category theory, but I didn't
want to overburden the KR book with too much formalism that I
didn't intend to apply in detail.  However, category theory
is so general that the really interesting questions are often
in the details of what kinds of transformations are appropriate
for any given application.

With that caveat, I agree that there are a lot of interesting
possibilities for using category theory.  Joe Goguen & Co. have
been using "institution theory" for related purposes:

   http://citeseer.nj.nec.com/384650.html

  "Abstract: Institutions formalize the intuitive notion
  of logical system, including both syntax and semantics."

Following is the plain text version of the HTML file cited above.

John Sowa

---------------------------------------------------------------

                   Meaning-Preserving Translations

                           by John F. Sowa

Informally, different statements in different languages can mean "the same
thing." Formally, that "thing," called a proposition, represents abstract,
language-independent, semantic content. As an abstraction, a proposition
has no physical embodiment that can be written or spoken. Only its
statements in particular languages can be expressed as strings of symbols.

According to Peirce (1905), "The meaning of a proposition is itself a
proposition. Indeed, it is no other than the very proposition of which it
is the meaning: it is a translation of it." Mathematically, Peirce's
informal statement may be formalized by defining a proposition as an
equivalence class of sentences that can be translated from one to another
while preserving meaning. Some further criteria are necessary to specify
what kinds of translations are considered to "preserve meaning." Formally,
a meaning-preserving translation f from a language L1 to a language L2 may
be defined as a function that satisfies the following constraints:

   * Invertible. The translation function f must have an inverse function g

     that maps sentences from L2 back to L1. For any sentence s in L1, f(s)

     is a sentence in L2, and g(f(s)) is a sentence in L1. All three
     sentences, s, f(s), and g(f(s)) are said to express the proposition p.


   * Proof preserving. When a sentence s in L1 is translated to f(s) in
     L2 and back again to g(f(s)) in L1, the result might not be identical
     to s. But according to the rules of inference of language L1, each
     one must be provable from the other: s|-g(f(s)), and g(f(s))|-s.
     Similarly, f(s) and f(g(f(s))) must be provable from each other by
     the rules of inference of language L2.

   * Vocabulary preserving. When s is translated from L1 to L2 and back to
     g(f(s)), the logical symbols like ∀ and the syntactic markers
     like commas and parentheses might be replaced by some equivalent.
     However, the same content words or symbols that represent categories,
     relations, and individuals in the ontology must appear in both
     sentences s and g(f(s)). This criterion could be relaxed to allow
     terms to be replaced by synonyms or definitions, but arbitrary content

     words or predicates must not be added or deleted by the translations.

   * Structure preserving. When s and g(f(s)) are mapped to Peirce Normal
     Form (with negation ~, conjunction &, and the existential quantifier
     ∃ as the only logical operators), they must contain exactly
     the same number of negations and existential quantifiers, nested in
     semantically equivalent patterns.

These four criteria ensure that the sentences s and g(f(s)) are highly
similar, if not identical. If s is the sentence "Every farmer who owns a
donkey beats it", then the sentence g(f(s)) might be "If a farmer x owns
a donkey y, then x beats y". Those sentences use different logical and
syntactical symbols, but they are provably equivalent, they have the same
content words, and they have the same structure when expressed with only
&, ~, and ∃.

Attempts to apply formal definitions to natural languages are fraught
with pitfalls, exceptions, and controversies. To avoid such problems, the
definition of meaning-preserving translation may be restricted to formal
languages, like CGs and KIF. The sample sentence in Figure 5.12 could be
defined as part of a formal language, controlled English, which happens
to contain many sentences that look like English. Yet even for formal
languages, the four criteria require further explanation and justification:


   * Invertible. The functions f and g are not exact inverses, since
     g(f(s)) might not be identical to s. To ensure that f is defined for
     all sentences in L1, the language L2 must be at least as expressive
     as L1. If L2 is more expressive than L1, then the inverse g might be
     undefined for some sentences in L2. In that case, the language L2
     would express a superset of the propositions of L1.

   * Proof preserving. Preserving provability is necessary for meaning
     preservation, but it is not sufficient. It is a weak condition that
     allows all tautologies to be considered equivalent, even though the
     proof of equivalence might take an exponential amount of time.
     Informally, the test to determine whether two sentences "mean the
     same" should be "obvious." Formally, it should be computable by an
     efficient algorithm -- one whose time is linearly or polynomially
     proportional to the length of the sentence.

   * Vocabulary preserving. Two sentences that mean the same should talk
     about the same things. The sentence Every cat is a cat is provably
     equivalent to Every dog is a dog, even though one is about cats and
     the other is about dogs. Even worse, both of them are provably
     equivalent to a sentence about nonexistent things, such as Every
     unicorn is a unicorn. An admissible translation could make some
     changes to the syntactic or logical symbols, as in the sentence If
     something is a cat, then it is a cat. It might replace the word cat
     with domestic feline, but it should not replace the word cat with dog
     or unicorn.

   * Structure preserving. Of all the logical operators, conjunction &
     is the simplest and least controversial, while negation ~ introduces
     serious logical and philosophical problems. Intuitionists, for
     example, deny that ~~p is identical to p. For relevance logic,
     Anderson and Belnap (1975) disallowed the disjunctive syllogism, which

     is based on ∨ and ~, because it can introduce extraneous
     information into a proof. Computationally, ~~p and p have different
     effects on the binding of values to variables in Prolog, SQL, and many

     expert systems. The constraints on quantifiers and negations help
     ensure that formulas in the same equivalence class have the same
     properties of decidability and computational complexity.

These conditions impose strong constraints on translations that are said to

preserve meaning. They ensure that the content words or predicates remain
identical or synonymous, they preserve the logical structure, and they
prevent irrelevant content from being inserted.

Examples of Meaning-preserving Translations.  To illustrate the issues,
consider meaning-preserving translations between two different notations
for first-order logic. Let L1 be predicate calculus with Peano's symbols
&, ∨, ~, ->, ∃, and ∀, and let L2 be predicate
calculus with Peirce's symbols +, x, -, -<, &Sigma;, and &Pi;.
Then for any formulas or subformulas p and q in L1, let f produce the
following translations in L2:

   * Conjunction. p&q => pxq.

   * Disjunction. p&or;q => -(-px-q).

   * Negation. ~p => -p.

   * Implication. p->q => -(px-q).

   * Existential quantifier. (&exist;x)p => &Sigma;x p.

   * Universal quantifier. (&forall;x)p => -&Sigma;x -p.

The sentences generated by f use only the operators x, -, and
&Sigma;, but the inverse g is defined for all operators in L2:

   * Conjunction. pxq => p&q.

   * Disjunction. p+q => p&or;q.

   * Negation. -p => ~p.

   * Implication. p-<q => p->q.

   * Existential quantifier. &Sigma;x p => (&exist;x)p.

   * Universal quantifier. &Pi;x p => (&forall;x)p.

The functions f and g meet the criteria for meaning-preserving
translations: they are invertible, proof preserving, vocabulary preserving,

and structure preserving. Furthermore, the proof of equivalence can be
done in linear time by showing that two sentences s and t in L1 map to
the same form with the symbols &, ~, and &exist;.

The functions f and g in the previous example show that it is possible to
find functions that meet the four criteria. They don't map any sentences
to the same equivalence class unless they can be said to "preserve
meaning" in a very strict sense, but they leave many closely related
sentences in different classes: permutations such as p&q and q&p;
duplications such as p, p&p, and p&p&p; and formulas with renamed
variables such as (&exist;x)P(x) and (&exist;y)P(y). To include more
such sentences in the same equivalence classes, a series of functions
f1, f2, ..., can be defined, all of which have the same inverse g:

  1. Sorting. The function f1 makes the same symbol replacements as f,
     but it also sorts conjunctions in alphabetical order. As a result,
     p&q and q&p in L1 would both be mapped to pxq in L2, which would be
     mapped by g back to p&q. Therefore, f1 groups permutations in the
     same equivalence class. Since a list of N terms can be sorted in
     time proportional to NlogN, the function f1 takes just slightly
     longer than linear time.

  2. Renaming variables. The function f2 is like f1, but it also renames
     the variables to a standard sequence, such as x1 ,x2 , ... . For
     very long sentences with dozens of variables of the same type, the
     complexity of f2 could increase exponentially. A typed logic can
     help reduce the number of options, since the new variable names
     could be assigned in the same alphabetical order as their type
     labels. For the kinds of sentences used in human communications,
     most variables have different types, and the computation time for
     f2 would be nearly linear.

  3. Deleting duplicates. After f1 and f2 sort conjunctions and rename
     variables, the function f3 would eliminate duplicates by deleting
     any conjunct that is identical to the previous one. The deletions
     could be performed in linear time.

For the kinds of sentences that people speak and understand, the total
computation time of all three functions would be nearly linear. Although
it is possible to construct sentences whose computation time would
increase exponentially, those sentences would be hopelessly unintelligible
to humans. What is unnatural for humans would be inefficient for computers.


This series of functions shows how large numbers of closely related
sentences can be reduced to a single canonical form. If two sentences
express the same proposition, their canonical forms, which can usually
be calculated efficiently, would be the same. The function f2 has the
effect of reducing sentences to Peirce Normal Form (PNF) -- the result
of translating a sentence from predicate calculus to an existential graph
and back again. As an example, consider the following sentence, which
Leibniz called the Praeclarum Theorema (splendid theorem):

   * ((p -> r) & (q -> s)) -> ((p & q) -> (r & s)).

This formula may be read "If p implies r and q implies s, then p and q
imply r and s." When translated to L2 by f3 and back to L1 by g, it has
the following Peirce Normal Form:

   * ~((~(p & ~r) & ~(q & ~s)) & ~(~(p & q) & ~(r & s)) ).

This form is not as readable as the original, but it serves as the
canonical representative of an equivalence class that contains 864
different, but highly similar sentences. The function f3, which deletes
duplicates, can reduce an infinite number of sentences to the same form.
Such transformations can factor out the differences caused by the choice
of symbols or syntax.

To account for synonyms and definitions, another function f4 could be used
to replace terms by their defining lambda expressions. If recursions are
allowed, the replacements and expansions would be equivalent in computing
power to a Turing machine; they could take exponential amounts of time or
even be undecidable. Therefore, f4 should only expand definitions without
recursions, direct or indirect. Since the definitions may introduce
permutations, duplications, and renamed variables, f4 should expand the
definitions before performing the reductions computed by f3. Without
recursion, the expansions would take at most polynomial time.

Meaning in Natural Languages.  When functions like the fi series are
extended to natural languages, they become deeply involved with the
problems of syntax, semantics, and pragmatics. In his early work on
transformational grammar, Noam Chomsky (1957) hoped to define
transformations as meaning-preserving functions. But the transformations
that moved phrases and subphrases had the effect of changing the scope of
quantifiers and the binding of pronouns to their antecedents:

   * Every cat chased some mouse.
     => Some mouse was chased by every cat.

   * We do your laundry by hand; we don't tear it by machine.
     => We don't tear your laundry by machine; we do it by hand.

To account for the implications of such transformations, Chomsky (1982)
developed his theory of government and binding, which replaced all
transformations by a single operator called move-&alpha; and a set of
constraints on where the phrase &alpha; could be moved. In his most recent
minimalist theory, Chomsky (1995) eliminated movement altogether and
formulated the principles of grammar as a set of logical constraints. With
that theory, both language generation and interpretation become
constraint-satisfaction problems of the kind discussed in Section 4.6. The
common thread running through these theories is Chomsky's search for a
syntax-based characterization of the meaning-preserving translations.

AI-based computational linguistics has also involved a search for
meaning-preserving translations, but with more emphasis on semantics and
pragmatics than on syntax. Roger Schank (1975), for example, developed his
conceptual dependency theory as a canonical representation of meaning with
an ontology of eleven primitive action types. Although Schank was strongly
opposed to formalization of any kind, his method of reducing a sentence to
canonical form could be viewed as a version of function f4. In his later
work (Schank & Abelson 1977; Schank 1982), he went beyond the sentence to
higher-level structures called scripts, memory organization packets (MOPs),

and thematic organization packets (TOPs). These structures, which have been

implemented in framelike and graphlike versions of EC logic, address
meaning at the level of paragraphs and stories. Stuart Shapiro and his
colleagues have implemented versions of propositional semantic networks,
which support similar structures in a form that maps more directly to logic

(Shapiro 1979; Shapiro & Maida 1982; Shapiro & Rappaport 1992). Shapiro's
propositional nodes serve the same purpose as Peirce's ovals and McCarthy's

contexts.

Besides the structural forms of syntax and logic, the meaning-preserving
translations for natural languages must account for the subtle interactions

of many thousands of words. The next two sentences, for example, were
adapted from a news report on finance:

   * The latest economic indicators eased concerns that inflation is
     increasing.

   * The latest economic indicators heightened concerns that inflation is
     increasing.

The first sentence implies that inflation is not increasing, but the second

one implies that it is. The negation, which is critical for understanding
the sentences, does not appear explicitly. Instead, it comes from an
implicit negation in the meaning of the noun concern: if some agent x has a

concern about y, then x hopes that some bad event does not happen to y. The

concern is eased when the bad event is less likely to occur, and the
concern is heightened when the bad event is more likely to occur. In the
normal use of language, people understand such sentences and their
implications. For a computer to understand them, it would require detailed
definitions of the words, background knowledge that rising inflation is bad

for the economy, and the reasoning ability to combine such information.
Doug Lenat and his group in the Cyc project have been working since 1984 on

the task of encoding and reasoning with the millions of rules and facts
needed for such understanding.