
SUO: Monosemy, Semantics, and Natural Language




In an earlier note, I mentioned that I was
assuming a single word sense for each CLCE term.
Jon Awbrey correctly pointed out that such an
approach does not come to grips with the reality
of how concepts are used.  I certainly agree.

Then today, I received a copy of a Seybold report
that made some unfounded claims about the potential
impact of the semantic web "on data, computing,
people, publishing, government and manufacturing."
I wrote the following response to the person who
sent it to me.

Since Seybold charges money for their reports, I
cannot distribute or point to the full text, but
my comments below should give some idea of what
the authors claim.

John Sowa
___________________________________________________

Thanks for sending the report.  I hate to say nasty
things about people who say nice things about my work,
but I can't endorse that report.  Following are some
comments.

The first point is that the authors fail to recognize
that the ambiguities of natural language are intimately
related to semantics and that changing the notation
does nothing to solve them.  The following point is
not only wrong, but totally wrongheaded:

    A key theme is that language-based approaches suffer
    from ambiguities inherent in natural language.
    We contrast language-based knowledge representation
    with semantic-form declarative knowledge and conclude
    that semantics trumps linguistics.

As I have said repeatedly, SYNTAX IS NOT THE PROBLEM.
Therefore, no change to syntax can ever solve the problem.

I certainly agree that there are ambiguities in NLs.
The simple, trivial ambiguities are syntactic.  They
are easy to deal with.  The difficult problems are all
in the semantics.  And nothing that the authors say
in that report comes to grips with the real semantic
problems or even acknowledges their existence.

The following sentence indicates the authors' starry-eyed
innocence:

    As the semantic model becomes richer, it more
    completely specifies not only the formal class-subclass
    relationships, but also relationships between concepts,
    and the descriptive logic and conditional assertions
    that are used to perform inference.

That is the view that Wittgenstein adopted from Bertrand
Russell in 1913, and which he elaborated in his first book,
the _Tractatus Logico-Philosophicus_.  After writing that
book, W. thought that he had solved all the problems of
philosophy, and he retired to an Austrian village to teach
elementary school.  That's where he discovered that kids
(and grown-ups) don't think that way.  As Shakespeare said,

    There are more things in heaven and earth, Horatio,
    Than are dreamt of in your philosophy.

For the rest of his life, Wittgenstein analyzed, recanted,
and clarified the hopelessness of the superficial approach
that he and Russell had developed in their early work.
W's analysis did not depend on whether concepts are
expressed in natural language or in logic.  Every problem
and pitfall that W. analyzed in his later philosophy
applies just as much to the declarative languages of
logic or the semantic web as it does to natural language.

Hope springs eternal within the programmer's breast:

    Also, it is possible to reuse ontologies, in whole or
    in part, that have already been developed.

I heard that same statement made about programs back
in the 1960s:  once a program had been written, it
could be used again by anybody else who had the same
problem, and the collection of solved problems would
grow indefinitely.

The key goal mentioned below has nothing to do with
the kind of language that is being used:

    A key goal of language-based knowledge representation
    is to eliminate the ambiguity of describing things
    with labels and natural language, leading to improved
    search and easier integration of content and processes.
    But, as we have seen, this goal is difficult to achieve
    when we use language to describe what we mean. Natural
    language use is inherently ambiguous. Many words have
    multiple meanings. There is no way to guarantee that
    two occurrences of the same word have the same meaning.

That makes it sound as if the word "cat" in English is
"inherently ambiguous", but if we write a unique identifier
"cat" in a pure, unsullied declarative language, it will
magically acquire a unique meaning that nobody (or no
computer) could possibly confuse with anything else.

The solution that the authors propose is the same one
that Frege, Russell, and the logical positivists were
hoping to achieve with logic -- and they failed miserably:

    Ultimately, the only way to ensure precise meanings
    is to move away from natural language toward
    pure semantic codes and relationships; that is, use
    unique identifiers to identify concepts. (We may draw
    an analogy here with the UPC [universal product code]
    identifier that has no significance other than that
    it is unique.) Do not use labels or names of things.
    Rather, determine meaning by the sum of all the
    relationships the concept has.

A UPC has a unique meaning because it has been legislated
to have a unique meaning.  Once you have legislated the
meaning, its meaning is unique in any context, linguistic
or nonlinguistic (such as swiping a light pen over it).
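
To make that point concrete, here is a minimal Python sketch.
Everything in it is hypothetical: the two vocabularies, their
definitions, and the helper names same_meaning and legislate are
invented for illustration and do not come from the Seybold report.
It shows only that sharing the identifier "cat" guarantees nothing
until a definition is imposed on both parties by agreement.

```python
# Hypothetical vocabularies built independently; both use the identifier "cat".
ontology_a = {"cat": {"is_a": "animal", "legs": 4}}
ontology_b = {"cat": {"is_a": "unix_command", "args": "files"}}


def same_meaning(term, left, right):
    """Two systems agree on a term only if their stipulated definitions match."""
    return term in left and term in right and left[term] == right[term]


def legislate(term, definition, *ontologies):
    """Impose one agreed definition on every participating vocabulary."""
    for ontology in ontologies:
        ontology[term] = dict(definition)


# Same identifier, different meanings: the notation alone settles nothing.
before = same_meaning("cat", ontology_a, ontology_b)

# Only after a definition is stipulated for both parties do they agree.
legislate("cat", {"is_a": "animal", "legs": 4}, ontology_a, ontology_b)
after = same_meaning("cat", ontology_a, ontology_b)

print(before, after)
```

The agreement comes from the legislate step, not from the spelling
of the identifier, and it lasts only as long as both parties keep
honoring it.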

People have been legislating meanings for words and concepts
in natural languages since the time of Socrates.  It is done
all the time for concepts in mathematics and science --
and those concepts are expressed by "labels" or "names" in
natural language sentences.  That has been done repeatedly
and successfully since the time of Euclid.

But even in mathematics, where the most precise NL usage
can be found, it is very rare for any two mathematicians
(or even any single mathematician) to use the same term
in exactly the same way in two different publications.
Therefore, it is common for mathematical publications
to have an opening section (or an appendix) that states
the definitions and axioms that are assumed.

Note the lessons to be learned:

  1. The only concepts that are ever precisely defined
     are legislated concepts -- ones whose meanings are
     stipulated or agreed by convention.

  2. Those agreements can be formalized and used in
     natural languages just as well as in any artificial
     language.  That fact has been demonstrated
     repeatedly since the time of Socrates and Euclid.

  3. But establishing agreements that hold beyond a
     single context, such as one Platonic dialog,
     a single publication in mathematics, a single
     computer program, or a single database system,
     is extremely difficult and extremely rare.

  4. In computer systems, the only legislated concepts
     that are repeatedly used in a fixed sense are ones
     that are embodied in programming code that is very
     hard to change, such as the kernel of an operating
     system, the compiler of a programming language, or
     a library of programs that are fundamental to the OS
     or the compiler.  Even then, those meanings change
     with every release or patch to the OS or the compiler.

  5. Outside of mathematics and computers, the most common
     attempts to legislate meaning are in the legal system.
     The US Constitution is the world's first and most
     successful attempt to legislate a complete system
     of government and the concepts used to describe it.
     But that success depends completely on the ongoing
     efforts to interpret, extend, and clarify those
     concepts by the three branches of government
     (judicial, legislative, and executive).  With such
     a mechanism of constant reinterpretation, the system
     has survived and flourished for over two centuries.
     Without it, the Constitution would have been a
     useless piece of paper.

Note that in every one of these examples -- in mathematics,
computer systems, and government -- the mechanism of
enforcement, interpretation, and reinterpretation of the
semantics is the *ONLY* guarantee that the concepts are
used with a common meaning.  The use of a natural language
or an artificial language does not make the slightest
difference in determining whether a concept's meaning
shifts or stays constant.

Summary:  The Seybold article is a restatement of a hope
that Frege and Russell proposed a century ago.  Wittgenstein
and the logical positivists tried to achieve it with logic,
and they failed miserably.  The semantic webbers have zero
chance of achieving it by replacing natural language with
any other kind of language or notation.

Bottom line:  You can't solve a problem by ignoring it.

For more about semantics and the failure of logical
positivism, I recommend the following:

    http://www.jfsowa.com/pubs/signproc.htm
    Signs, Processes, and Language Games

John Sowa