SUO KIF
Adam,
Here is an initial crack at some comments on your SUO KIF
(http://ltsc.ieee.org/suo/suo-kif.html) along with some suggested
changes.
Comments, reactions, suggestions, fulsome praise, bitter criticism,
etc. are all welcome.
-chris
--
Christopher Menzel # web: philebus.tamu.edu/~cmenzel
Philosophy, Texas A&M University # net: chris.menzel@tamu.edu
College Station, TX 77843-4237 # vox: (979) 845-8764
*****
1 Scope
...
The following categorical features are essential to the design of KIF.
1. The language has declarative semantics...
2. The language is logically comprehensive...
3. The language provides for the representation of knowledge about
knowledge. This allows the user to make knowledge representation
decisions explicit and permits the user to introduce new knowledge
representation constructs without changing the language.
This last point should go if "metaknowledge" and other metatheoretic
apparatus (e.g., the (defXXXX ...) operators) are going to be moved
into a module.
4 Syntax
General point: several of the non-terminals of the unannotated BNF
found in Appendix A (e.g., "constant") do not show up in this chapter
on the left side of a production rule. Everything in the appendix
should be in this chapter with some annotation. Also, BNF
non-terminals are typically surrounded by angle brackets. (A niggling
point.)
4.1 Introduction
As with many computer-oriented languages, the syntax of KIF is most
easily described in three layers. First, there are the basic
characters of the language. These characters can be combined to form
*lexemes*. Finally, the lexemes of the language can be combined
to form grammatically legal expressions. Although this layering is
not strictly essential to the specification of KIF, it simplifies the
description of the syntax by dealing with white space at the lexeme
level and eliminating that detail from the expression level.
I don't believe the last sentence in this paragraph adds anything.
It is especially uninformative for readers who have not had a lot of
exposure to formal languages.
...
4.2 Characters
The alphabet of KIF consists of 7 bit blocks of data. In this
document, we refer to KIF data blocks via their usual ASCII encodings
as characters (as given in ISO 646:1983).
This seems to me to get things intuitively backwards. If anything,
shouldn't the elements of the basic alphabet just be the basic ASCII
characters themselves? And isn't it more natural to say that *they*
are encoded by 7 bit data blocks, rather than the reverse? Maybe it's
just a cultural thing here, but it seems more natural to me to build a
language out of characters rather than binary data blocks. This is
not a huge issue, obviously.
...
4.3 Lexemes
The process of converting characters into lexemes is called
*lexical analysis*. The input to this process is a stream of
characters, and the output is a stream of *lexemes*.
The function of a lexical analyzer is cyclic. It reads characters from
the input string until it encounters a character that cannot be
combined with previous characters to form a legal lexeme. When this
happens, it outputs the lexeme corresponding to the previously read
characters. It then starts the process over again with the new
character. Whitespace causes a break in the lexical analysis process
but otherwise is discarded.
These paragraphs strike me as way off the point, and as generally
uninformative for the average reader. The more intuitive point for
users, I think, is simply that lexemes are analogous to the words of a
natural language, insofar as they are the basic syntactic units that
can be ascribed meanings (and will be ascribed meanings in the section
on semantics) and are also the building blocks of more complex
linguistic items, notably, sentences.
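For readers who do want the mechanics, the read-until-break cycle
described in the quoted paragraphs can be sketched as follows. This is
only an illustration: the function name "lex" and the rule that
parentheses and whitespace are the only break characters are
assumptions, not the lexeme grammar of the SUO KIF draft.

```python
def lex(text):
    """Split text into lexemes, discarding whitespace.

    Assumption: "(" and ")" are lexemes on their own, and any other run
    of non-whitespace characters forms a single word-like lexeme.
    """
    lexemes = []
    current = ""
    for ch in text:
        if ch.isspace() or ch in "()":
            # ch cannot extend the lexeme read so far: output that lexeme.
            if current:
                lexemes.append(current)
                current = ""
            # A parenthesis is itself a lexeme; whitespace is dropped.
            if ch in "()":
                lexemes.append(ch)
        else:
            current += ch
    if current:
        lexemes.append(current)
    return lexemes


print(lex("(baz (foo bar))"))
```

The output is a flat list of word-like units, which is the sense in
which lexemes play the role of the words of a natural language.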
...
Semantically, there are four categories of constants in KIF --
object constants, function constants, relation constants, and
logical constants. Object constants are used to denote individual
objects. Function constants denote functions on those
objects. Relation constants denote relations. Logical constants
express conditions about the world and are either true or false.
KIF is unusual among logical languages in that there is no syntactic
distinction among these four types of constants; any constant can be
used where any other constant can be used. The differences between
these categories of constants are entirely semantic.
There seems to be some conflation of syntax and semantics here. I
think the following captures the matter a little more clearly: In the
semantics of KIF, there are four categories of entity: objects,
functions, relations, and truth values. By contrast, in the syntax of
KIF, rather than four corresponding classes of lexemes (as is more
often the case in formal languages), there is a single class,
<constant>, whose instances can denote any entity in any of the four
semantic categories.
Having said that, I think that this design choice is problematic. It
at least calls for some careful semantic footwork. Suppose I have
<constant>s: "foo", "bar", and "baz". And suppose, in an interpretation
of KIF, I assign *objects* to "foo" and "bar" as their
denotations, V(foo) and V(bar), and that I assign a property (i.e., a
set) V(baz) to "baz". Now, by KIF's syntax "(foo bar)" is a sentence,
the fact that V(foo) is an object notwithstanding. How to evaluate
it? Well, by the usual semantics for atomic formulas, "(foo bar)" is
true (in the given interpretation) iff V(bar) is a member of V(foo).
Well, since V(foo) is an object, and hence has no members, I guess
"(foo bar)" just turns out false. Fine.
But by KIF's syntax, "(foo bar)" is *also* a <funterm>. Hence, e.g.,
"(baz (foo bar))" is a <relsent>. So how do we evaluate that? By the
usual semantics for <funterm>s, the semantic value of "(foo bar)", qua
<funterm>, is the value of V(foo) applied to V(bar). But, by
assumption, V(foo) is an object, not a function, and hence it makes no
sense to apply V(foo) to V(bar).
However, perhaps there is a way around this as well. For KIF's
semantics does call for the existence of a distinguished object,
BOTTOM, which serves as the value of a partial function when applied
to an object not in its domain. Hence, we could treat objects as
funny sorts of partial functions -- ones that are undefined
everywhere. Hence, the value of "(foo bar)" when "foo" is interpreted
as an object will simply be the distinguished "null" object BOTTOM.
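A minimal sketch of this workaround, under stated assumptions: terms
are modeled as Python strings (constants) or tuples (<funterm>s), the
interpretation V is a dictionary, and the names "eval_term" and
"apply_value" are hypothetical. None of this is taken from the KIF
draft itself.

```python
BOTTOM = "BOTTOM"  # stand-in for the distinguished null object


def apply_value(value, args):
    """Apply a denotation to arguments.

    A denotation that is not a function is treated as a partial
    function that is undefined everywhere, so the result is BOTTOM.
    """
    if callable(value):
        return value(*args)
    return BOTTOM


def eval_term(term, V):
    """Evaluate a term: a string is a constant, a tuple is a <funterm>."""
    if isinstance(term, str):
        return V[term]
    head, *args = term
    return apply_value(eval_term(head, V), [eval_term(a, V) for a in args])


# Hypothetical interpretation in which "foo" and "bar" denote objects.
V = {"foo": "object-1", "bar": "object-2"}
print(eval_term(("foo", "bar"), V))
```

With this V, the value of "(foo bar)" is BOTTOM, exactly because
V(foo) is an object rather than a function.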
But null objects like BOTTOM are troublesome. To work intuitively, it
seems to me that it needs to be stipulated that BOTTOM is not in the
interpretation of any predicate, for the whole idea of BOTTOM is just
to serve as a semantic object in cases where a function takes an
*inappropriate* argument, one to which it is *meaningless* to apply
the function. On the suggested approach to "(foo bar)", where V(foo)
is an object, "foo" should be meaningless on *every* argument.
So, again, BOTTOM should not be in the interpretation of any
predicate. (Think of a more intuitive instance; should "(Happy (Burak
Arafat))" ever turn out true when we have to think of "(Burak Arafat)"
as application of the "function" Burak to the object Arafat?)
But now we have problems. First, "=" is a predicate; its extension is
the set of all pairs (e,e) for e in the universe of discourse --
except, if our reasoning above was correct, (BOTTOM,BOTTOM). But now
this means that not all identity sentences turn out true in every
interpretation. In particular, it will not be the case, on our
given interpretation, that (= (foo bar) (foo bar)).
But perhaps we can make identity a special case and allow
(BOTTOM,BOTTOM) into its extension; or something of the sort. There's
still another problem: not all instances of the axiom schema of
universal instantiation are valid. For instance, in our given
interpretation once again,
(=> (forall (?x) (A ?x))
    (A (foo bar)))
is false, since the interpretation of "(foo bar)", qua <funterm>, is
BOTTOM, and BOTTOM is not in the interpretation of the predicate A, no
matter what A is. Hence, to fix this, we would need, at the least, a
new distinguished constant "bottom" to refer to BOTTOM, and we would
have to revise UI to:
(=> (forall (?x) F)
    (=> (not (= t bottom))
        (F[?x/t])))
where t is a <term>, F is a <sentence> (typically one containing "?x"
free), and F[?x/t] is the result of replacing all free occurrences
of "?x" in F with t.
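The two failures can be checked mechanically in a toy interpretation.
The sketch below rests on assumptions that go beyond the text: the
universe has two ordinary objects "o1" and "o2", quantifiers range
over those ordinary objects only (not over BOTTOM), A is true of every
ordinary object, and BOTTOM is excluded from every extension as
stipulated above. All names are hypothetical.

```python
BOTTOM = "BOTTOM"  # stand-in for the distinguished null object

# Assumption: quantifiers range over the ordinary objects only.
UNIVERSE = {"o1", "o2"}
V = {"foo": "o1", "bar": "o2"}  # "foo" and "bar" both denote objects
A = set(UNIVERSE)               # extension of A; BOTTOM is excluded


def value_of_foo_bar():
    """Value of the <funterm> (foo bar): V(foo) is an object, so BOTTOM."""
    f = V["foo"]
    return f(V["bar"]) if callable(f) else BOTTOM


t = value_of_foo_bar()

# Identity without (BOTTOM, BOTTOM) in its extension.
identity = {(e, e) for e in UNIVERSE}
self_identity = (t, t) in identity            # (= (foo bar) (foo bar))

antecedent = all(x in A for x in UNIVERSE)    # (forall (?x) (A ?x))
consequent = t in A                           # (A (foo bar))
plain_ui = (not antecedent) or consequent
# Revised UI: the inner conditional is vacuous when t denotes BOTTOM.
guarded_ui = (not antecedent) or (t == BOTTOM) or consequent

print(self_identity, plain_ui, guarded_ui)
```

In this interpretation the self-identity sentence and the plain
instance of UI both come out false, while the guarded instance comes
out true.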
Doable, but messy; not to mention unlovely. But ALL of this
unbecoming mess can be avoided simply by imposing a bit more order on
KIF's laissez faire syntax; in particular, simply by introducing
strict, disjoint classes of constants, function symbols, and
predicates.
Now, the perception might be that this imposes too much order on KIF;
it legislates ahead of time which strings of characters are to fall in
what categories. But this brings us to another problem with KIF's
current definition. Currently (I mentioned this once before) there is
only the one KIF language. I think this is a big mistake, as it
forces every KIF user to choke down the whole bloated enchilada --
every constant, every sentence, every funterm that you can make out of
any sequence of characters. This is not good. People should be
allowed to construct elegant (or not) *specialized* languages with
only the apparatus they need or, at least, want. Often people don't
need function terms. Current KIF foists denumerably many off on
them. People typically only want as many constants as there are
concepts and salient objects in their domains of interest. Moreover, I
think people do want to be able to regiment their languages so that
constants that denote concepts can only play the role of predicates in
<relsent>s, and those that denote objects can only play the role of
arguments to predicates and function symbols. I just don't think people
will want to have to deal with KIF's monstrous semantic ambiguity,
manageable though it may be.
In sum, then, my suggestion is that, once one has defined KIF
characters and lexemes in sections 4.2 and 4.3, we define in section
4.4 the notion of a KIF *language*, of which there will be uncountably
many. (Proof left to reader :-) Such a language will have denumerably
many variables, a countable number (hence, possibly zero) of
*individual constants*, a countable number of *predicates*, and a
countable number of *function symbols*, where these items would
constitute pairwise disjoint sets of character sequences. I would be
happy to work out the details if this suggestion is accepted.
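To give a rough idea of what such a definition buys, here is a sketch
of a KIF language as three pairwise disjoint vocabularies, with
well-formedness checks that depend on them. The class and method names
are hypothetical, arity is ignored, and variables are assumed to be
the words beginning with "?"; the real details would still have to be
worked out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KifLanguage:
    """A hypothetical KIF language: three pairwise disjoint vocabularies."""
    individual_constants: frozenset
    predicates: frozenset
    function_symbols: frozenset

    def __post_init__(self):
        groups = [self.individual_constants, self.predicates,
                  self.function_symbols]
        for i, first in enumerate(groups):
            for second in groups[i + 1:]:
                overlap = first & second
                if overlap:
                    raise ValueError(
                        f"symbols in two categories: {sorted(overlap)}")

    def is_term(self, expr):
        """A term is a variable, an individual constant, or a <funterm>."""
        if isinstance(expr, str):
            return expr.startswith("?") or expr in self.individual_constants
        return (len(expr) > 1 and expr[0] in self.function_symbols
                and all(self.is_term(arg) for arg in expr[1:]))

    def is_atomic_sentence(self, expr):
        """An atomic sentence is a predicate applied to terms."""
        return (isinstance(expr, tuple) and len(expr) > 1
                and expr[0] in self.predicates
                and all(self.is_term(arg) for arg in expr[1:]))


# A language with two individual constants, one predicate, no functions.
lang = KifLanguage(frozenset({"foo", "bar"}), frozenset({"baz"}), frozenset())
print(lang.is_atomic_sentence(("baz", "bar")))
print(lang.is_atomic_sentence(("foo", "bar")))
```

In this language "(baz bar)" is a sentence, while "(foo bar)" is
neither a sentence nor a term, so the questions about BOTTOM raised
above never come up.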
5 Basics
The title here should probably be "Basic Semantics".
5.1 Introduction
The basis for the semantics of KIF is a conceptualization of the
world in terms of objects and relations among those objects....
...KIF is conceptually grounded in that every universe of discourse
is required to include certain basic objects.
The following basic objects must occur in every universe of
discourse.
* All numbers, real and complex.
* All ASCII characters.
* All finite strings of ASCII characters.
* Words. Yes, words are themselves objects in the universe of
discourse, along with the things they represent.
* All finite lists of objects in the universe of discourse.
* bottom -- a distinguished object that occurs as the value of a
partial function when that function is applied to arguments for
which the function makes no sense.
If numbers are to be included in modules, then the first of these
should go, i.e., they should not be considered required semantic
objects in every interpretation of KIF. Similarly, ASCII characters,
strings of characters, and words are mentioned to serve as the
denotations for KIF's self-referential mechanisms, which have been
eliminated in the current SUO KIF. Hence they should all go into the
semantics of one or more modules. List terms have also been removed
to go into a module, so finite lists of objects should not be required
semantic objects either. BOTTOM might still be necessary if we allow
partial functions, but I'd suggest we try to avoid them due to the
trouble that BOTTOM causes, at least in the KIF core. There are other
tricks we can use for getting the effect of partial functions. So
basically all there is left to say is that an interpretation of a KIF
language has a nonempty universe of discourse. Period. And isn't
that just the way it should be if KIF's core is to be basic FOL?
The rest of chapter 5 and chapter 6 seem ok. Chapter 7 (Numbers)
should go if the various number theories are going to reside in
modules. Chapter 12 should be chapter 8.
The structural ontology of Appendix B does not fit the suggested
KIF core, as it involves notions of subclass and instance -- but there
are no classes in the current core. This would have to be part of an
extension -- one that should probably be worked up ASAP.