
SUO: RE: Question about CLCE




John F. Sowa wrote 
> I received an offline question about my proposed
> Common Logic Controlled English (CLCE):
> 
>  > Have you had some experience with people using
>  > CLCE to provide machine parsable knowledge?
>  >
>  > How are the inter-coder reliability issues?
> 
> Following is my reply.  And for people who have
> not looked at CLCE, following are the specs:
> 
>    http://www.jfsowa.com/clce/specs.htm
> 
> John Sowa.
> ____________________________________________________
> 
> As I have said many times, syntax is not the problem.
> The corollary is that no syntax of any kind can be,
> by itself, the solution.
> 
> I grant that many people have built successful systems
> using a particular syntax for knowledge representation.
> But I claim that the value of such systems does not
> lie in the syntax of any particular notation, but
> in the methodology it supports.  I would go further
> and claim that the value of any of the notations
> that people have used successfully (such as the
> UML family, for example) is not in their syntax,
> but in the associated methodology.
> 
> The next point I would make is that methodology,
> by itself, is not the goal, but an approach for
> achieving the real goal, which is to analyze
> and understand some problem and to develop an
> effective solution.  Each methodology embodies
> a way of analyzing a particular class of
> problems and developing solutions for them.
> 
> Description logics, for example, have been
> successfully used for an important class of
> problems.  But those problems could be solved
> equally well in any syntax that was combined
> with the same methodology and toolset.
> 
> My reason for developing CLCE is not to propose
> yet another syntax, but to eliminate the syntactic
> arguments by saying, in effect, "Why bother?"
> 
> Since the real contribution is not in the syntax
> but in the methodology, let's dispense with
> the syntactic issues right at the beginning.
> Let people use their own native language
> as the notation, if they like, or give them
> graphics tools as supplementary visual aids.
> 
> To answer your question:  No, I have not used
> CLCE for any particular problem.  But many
> people have used versions of controlled NLs
> for database design, expert systems design,
> and pseudo-code for program design.  I am just
> designing CLCE as a common syntax that can be
> subsetted, as needed, for any such purpose.
> 
> People could use the DL subset of CLCE, the
> SQL query subset, the FOL subset, or the
> imperative subset as needed.  Or they could
> supplement it with any graphics aids they like.
> 
> My main goal in developing CLCE is to get rid of
> monstrosities such as OWL, the Object Constraint
> Language of UML, and the multitude of English-like
> wannabes such as COBOL or SQL.  As a replacement,
> I suggest two kinds of languages:
> 
>   1. As the inner language for computer processing,
>      I recommend logic, in whichever subset is
>      appropriate for any particular problem.
> 
>   2. As the outer language for human consumption,
>      I recommend controlled NLs together with
>      graphic supplements whenever they are helpful.
> 
> The people who design and implement the tools need
> to be familiar with both the inner and outer languages.
> But the people who use the tools can do everything
> in controlled NLs supplemented with graphics.
> 
> As a result, the controlled NL serves as documentation
> that is readable by both computers and humans.
> There can be no discrepancy between the documentation
> and the implementation, since the documentation is
> automatically compiled to the implementation.
> 
> Given this approach, R & D can be diverted from syntax
> to the important questions of semantics and methodology:
> What knowledge is required to solve particular problems,
> how can it be acquired, and how can it be used and reused?
> 
> John Sowa


While I like the basic idea of a controlled language,
requiring monosemy is too strict a constraint for most
people.  I write from experience with a controlled English
along the lines of CLCE, called ROSIE, back in the
late eighties.

It shouldn't be too hard to dress up a language like CLCE
to accept more conversational forms of expression, so long
as the message stays clear.  People use different words for
good reasons.  Your vocabulary is somewhat different from
mine even when we write about nearly identical topics.

Using your lattice of concepts as a guide to research,
I googled for "paraphrase" and "lattice" and came up with
"Learning to Paraphrase: An Unsupervised Approach Using
Multiple-Sequence Alignment" by Barzilay and Lee of Cornell:
http://www.cs.cornell.edu/home/llee/papers/statpar.pdf

This paper describes experimental results on acquiring a
lattice of sentences that describe the same news story as
reported by different sources.  By aligning the paraphrase
sentences and organizing them into a lattice, the authors
were able to form templates that capture common ways of
talking about a few specific kinds of news events.

The paper describes how they identified "backbone" words
shared across the paraphrases and turned the regions of
variation between them into slots, so that each sentence
becomes an English-like template with simple variables.
Each template can then be filled in to automatically
generate paraphrases.
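
Very roughly, and only to make the idea concrete, here is a
made-up sketch in Python of the template half.  The sentences,
the function names, and the assumption that the paraphrases
already line up word for word are all mine; the actual paper
gets the alignment from multiple-sequence alignment over a
news corpus.

# Hypothetical sketch, not code from the paper: positions where
# the aligned paraphrases all agree form the backbone; positions
# where they differ become slots.
aligned = [
    ["rebels", "attacked", "a", "convoy", "near", "Kabul", "on", "Monday"],
    ["rebels", "ambushed", "a", "convoy", "near", "Herat", "on", "Friday"],
    ["rebels", "attacked", "a", "convoy", "near", "Kandahar", "on", "Sunday"],
]

def build_template(rows):
    """Shared words stay as backbone; varying positions become slots (None)."""
    template = []
    for column in zip(*rows):
        template.append(column[0] if len(set(column)) == 1 else None)
    return template

def fill(template, values):
    """Generate a paraphrase by filling the slots from left to right."""
    values = iter(values)
    return " ".join(w if w is not None else next(values) for w in template)

tpl = build_template(aligned)
# tpl == ["rebels", None, "a", "convoy", "near", None, "on", None]
print(fill(tpl, ["shelled", "Mazar", "Tuesday"]))
# -> rebels shelled a convoy near Mazar on Tuesday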

It appears to me that the same technique might work well
in reverse: after parsing a sentence and identifying some
of the -onymies (especially hyponymy/hypernymy relationships)
among the kinds of objects that occupy slots in the sample
sentences, it should be feasible to parse a wider variety
of English input sentences and relate them to their
underlying semantic content.
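
The reverse direction, again only as a made-up sketch: the
backbone words anchor the match, a toy hypernym table stands
in for the -onymy lookups, and the slot bindings that come
out are a crude stand-in for the underlying semantic content.
None of the names below come from the paper.

# Hypothetical sketch: match an input sentence against a stored
# template; backbone words must match exactly, and a slot accepts
# a word only if the toy hypernym table says it is the right kind.
template = ["rebels", ("ATTACK_VERB",), "a", "convoy",
            "near", ("PLACE",), "on", ("DAY",)]

hypernyms = {          # stand-in for WordNet-style hyponym/hypernym links
    "attacked": "ATTACK_VERB", "ambushed": "ATTACK_VERB",
    "shelled": "ATTACK_VERB",
    "Kabul": "PLACE", "Herat": "PLACE",
    "Monday": "DAY", "Tuesday": "DAY",
}

def match(sentence, template):
    """Return slot bindings if the sentence fits the template, else None."""
    words = sentence.split()
    if len(words) != len(template):
        return None
    bindings = {}
    for word, part in zip(words, template):
        if isinstance(part, tuple):              # slot with an expected type
            if hypernyms.get(word) != part[0]:
                return None
            bindings[part[0]] = word
        elif word != part:                       # backbone word must match
            return None
    return bindings

print(match("rebels shelled a convoy near Herat on Tuesday", template))
# -> {'ATTACK_VERB': 'shelled', 'PLACE': 'Herat', 'DAY': 'Tuesday'}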

The amount of work required to build a large and accurate
lattice of sentences, and other lattices of phrase types,
would be substantial.  But it might produce more workable
NLP interfaces than a purely syntactic approach.

What do you think, John?  Is this a defensible approach?

Thanks,
Rich