Re: SUO: Axiomatizing lower level WordNet terms
John,
I agree with your suggestions, and I'd like to add a few comments.
John Bateman wrote:
> The problem with a lot of linguistics in the 60-80s was that it was
> done in armchairs, dredging examples from the minds of the armchair
> linguist. Many now see this is a hopeless endeavor, since the mind is
> not very good at forgetting about what it is trying to prove and
> producing less biased more representative data.
As I'm sure you know, there have been many fads (and fallacies) in
linguistics (and other fields) during the 20th century (not to mention
many more of both during earlier centuries).
When behaviorism was at its peak during the first half of the 20th
century, there was a very strong emphasis on data analysis, especially
by statistical methods. One goal was the search for "grammar discovery
procedures", which would be fed a large corpus of linguistic data, from
which some systematic method would generate a grammar. Although that
research produced some interesting results, there were two serious
weaknesses:
1. The behaviorists emphasized trivial stimulus-response models,
which were inadequate to represent the most interesting features
of languages.
2. The computer and storage facilities of the time were inadequate
to process the large corpora that would be needed.
Chomsky reacted to those limitations in two ways:
1. He rejected the simplistic mechanisms, and proposed much richer
representations -- the Chomsky hierarchy of formal grammars,
transformational grammars for natural languages, and the idea of
autonomous syntax.
2. He adopted introspection (or the intuition of a native speaker)
as a way of judging grammaticality.
In some respects, the Chomskyan revolution was healthy: it enlarged
the range of representations that linguists used, and it encouraged
them to think of richer examples than they were likely to find in the
small corpora that were then being examined.
In other respects, Chomsky put linguistics in a straitjacket that
was almost as bad as the S-R model of the behaviorists: his notion
of autonomous syntax set up an artificial barrier between syntax and
semantics, and his emphasis on armchair methods caused linguists to
avoid looking at corpora, which were becoming much larger and easier
to process with the advent of large-scale computers.
On balance, Chomsky's influence during the 1950s and 60s was largely
beneficial. But when Chomsky returned to linguistics after his venture
into anti-Vietnam-War politics, he stifled some promising directions
that his former students were pursuing. The major consequence is that
the most important innovations of the 1970s and 80s were made outside
the Chomskyan paradigm -- largely by logicians and computational
linguists, who often collaborated on semantic efforts.
And I agree that the emphasis on processing large corpora during the
late 80s and 90s was a healthy direction, since it enabled new paradigms
to be tested on large volumes of data with enormously more powerful
computer systems.
This brief summary of linguistic history should be considered when
we look at current (and possible future) directions in ontology:
> But that just emphasizes my conviction that the formalization of
> particular ontologies for particular domains of discourse (e.g.,
> linguistics) should be left to experts and not be trampled on in
> a monolithic SUMO.
I strongly agree with this conclusion, and I suggest the following
lessons that can be drawn from the linguistic history (and also the
history of many other branches of science):
1. Fads and fallacies tend to come and go, and the best way to avoid
getting trapped by some temporarily popular fad is to ensure that
all voices are given a fair chance to be heard.
2. But not all voices are equally worthy of being listened to.
Therefore, there must be means of testing various proposals
in order to select the most promising ones, while still giving
the less popular ones a chance to continue plodding along in case
the tortoise might someday catch up when the hare stumbles.
3. Sitting in an armchair is often a good way to dream up a new
technique, but all methods, no matter how they were discovered,
should be tested against hard data -- such as large corpora,
as in the case of linguistics, or serious applications that
people or business organizations are willing to pay to use.
These are the major reasons why I am opposed to large monolithic
ontologies that are dreamed up by people sitting in an armchair
(or at a computer keyboard) and are being proposed as standards
that boast an imprimatur of the IEEE, ISO, or other prestigious
organization.
Bottom line: Armchair philosophies (or ontologies) are fine, as long
as they have to compete on an equal footing with others. The way to
ensure that they do is to standardize a mechanism, such as IFF, that
can accommodate all approaches and let the users and developers pick
and choose whatever is best for any application.
John Sowa