Re: Many-sorted (or typed) logic
Pat,
Many thanks for the tutorial. It was very helpful for me. Your
proposal for #3 sounds very sensible, as does the notion of moving KIF
closer to having some of the facilities, semantics and syntax of CycL.
Adam
At 05:57 PM 7/24/2000 -0700, pat hayes wrote:
>I wonder if I might try to pick this issue apart a little, as I think both
>(all?) sides are in a sense right but in a different sense wrong. As Fritz
>mentioned, this issue has been debated before in the ontology community;
>it has also been debated in other forums, including the theorem-proving
>community about 25 years ago and more recently even at such academic
>heights as the Foundations of mathematics (FOM) list. Moreover it is
>potentially important for the choice of an ontology language, so for all
>these reasons I think it is important to keep the issues straight.
>
>Following advice from Matthew West, I will try to explain them so that
>everyone can see what the issues are; I apologize in advance to those to
>whom such 'tutorial' explanations might seem condescending.
>
>What is this about?
>
>The issue can be phrased as whether or not to allow higher-order
>quantifiers in the ontological language, and how to decide what they mean
>if we do allow them. Any ontological language worth a damn will allow
>quantification over 'things' (sometimes 'entities' or some other word is
>used), and this is often called 'first-order' quantification. What makes
>it first-order is that there is a clear distinction, in this kind of logic
>(first-order logic, or FOL), between the things that quantifiers range
>over and the assertions that are made about them in the logic.
>Higher-order quantifiers treat properties of things as another kind of
>'thing', which makes everything more complicated.
>
>A brief summary of logical semantics.
>
>The basic idea of 'model theory' as it is often called (a better term
>would be 'interpretation theory') is to give a mathematical specification
>of what would count as a way to interpret the logic. The logical
>expressions themselves say something, but what do they say it about? An
>'interpretation' is a way to answer that question. An interpretation
>defines just enough structure to give a meaningful truthvalue to every
>expression of the logical language. There are interpretation rules which
>often seem so obvious as to be hardly worth saying, such as that the
>meaning of '(and p q)' is true in an interpretation just when the meaning
>of 'p' is true and the meaning of 'q' is also true in that interpretation,
>but it gets interesting when one gets to the quantifiers. The way that
>first-order quantification is understood is that the quantifier ranges
>over some set of 'things' called 'the universe' of the interpretation. In
>fact, the choice of a universe (it usually has to be nonempty) is the
>first thing to be done in specifying a first-order interpretation, and
>then each name is interpreted to be the name of a member of the universe,
>each predicate is given an interpretation which is a subset of the
>universe, each binary relation is a subset of pairs of elements of the
>universe, and so on. A statement beginning with the universal quantifier
>is true just when the body of the statement (without the quantifier) is
>true for every value of the bound variable. Notice that this might come
>out different if we had chosen a different universe: the meaning of the
>quantifier depends on the entire universe, not just on some part of it.
>
>(The title line refers to many-sorted logics. This is just a way of saying
>that some of the predicates of the language are given a special status as
>a 'sort', and some extra syntactic machinery is provided to restrict
>quantifiers to one of these sorts, and maybe some special inference
>schemes are also added for reasoning efficiently about sort heirarchies.
>But all the sorts are subsets of the same universe.)
>
>An interpretation specifies an unique value for every expression of the
>logic; and indeed it can be characterised as the 'smallest' mathematical
>structure which would assign a truthvalue to every expression in the
>language in this way: it is a kind of minimal mathematical specification
>of a possible universe. It is one way the world could be so as to make all
>the sentences perfectly clear in their meaning. These 'ways the world
>might be' are not themselves part of the logic, and are not to be thought
>of as in any physical or inferential or mechanical way 'connected' to the
>logic. They are not required to be computable or constructible (though
>some of them may be) and they are not constrained to be 'made of' any
>particular kind or category of 'stuff'. They are a purely mathematical
>description of the idea of a 'possible meaning' for the logic.
>
>It is important to see that this is only ONE way the world could be. It
>doesnt make sense to ask what is THE interpretation of an expression in a
>logical language. In general one cannot point to a single interpretation
>and say anything very useful about the logic from that single
>interpretation, since the logic may well have different interpretations
>which have different properties. What one can do is relate inference in
>the logic - the mechanical business of drawing conclusions and creating
>proofs - to what is true in *all* interpretations of the logic. Sometimes
>a logic is "complete" in the sense that a sentence is provable exactly
>when it is true on all possible interpretations (according to the rules of
>interpretation given). This is a very nice property of a logic and its
>semantics, since it shows that they are in a precise kind of
>correspondence with each other. If a logic is not 'complete' in this way
>(note, the name doesnt mean that it is 'completely defined' or that
>everything intuitively true is provable in it) then it means that the
>semantics - the way the language is being interpreted - and the proof
>machinery - which in our terms means how the computer is able to use the
>logic - are out of step with each other: the semantics makes claims about
>the logic which the machinery cannot guarantee.
>
>(Notice that the very same language could be given different rules of
>interpretation. This isnt often done for FOL but it could be, and it is
>often done for other logics. Completeness only makes sense with respect to
>a set of interpretation rules, so a logic may be complete with reference
>to one semantics but incomplete with repect to another.)
>
>A natural question to ask is, can one write enough axioms into an ontology
>to be able to guarantee that the only interpretations which satisfy those
>axioms are just the ones one has in mind? And the answer is: almost
>certainly, NO. Almost all interesting ontologies have 'nonstandard'
>interpretations, ie interpretations which satisfy the logical rules but
>which don't correspond to the intuitions one had in mind when writing the
>axioms. However, it is important to not let this get one down, since it
>is in fact a source of power rather than a kind of defeat. The existence
>of nonstandard interpretations doesnt amount to a claim that there is
>something wrong with the axioms, or to say that the axioms in some sense
>are infected by these unintended interpretations, or that one is somehow
>'forced' into considering them. In foundations of mathematics work
>nonstandard models are often considered an embarrassment, since
>mathematicians aspire to complete theories of Platonic realities, but for
>ontology writing it seems to be what one would expect: it's another way of
>saying that ontologies must be open-ended. There is always something more
>you could say about your intended topic: the ontological theory is never
>'finished' in the sense that it says everything that could logically be
>said about a topic in enough detail to guarantee that nobody will ever be
>able to say anything more about it. ('Nonstandard' interpretations can
>often be very interesting, in any case, even in mathematics. For example
>there is an entire field concerned with 'nonstandard' interpretations of
>arithmetic in which infinitesimals are allowed.) But the key point is,
>whether one likes it or not, nonstandard interpretations are perfectly
>good interpretations, and so there is no way to legislate them out of
>existence or claim that they are somehow forbidden.
>
>Now, a final point about interpretations. I said above that proofs could
>be related to what is true in *all* interpretations. This 'all' assumes
>that we have given an exact characterisation of what counts as an
>interpretation, which is relatively easy for 'first-order' quantification;
>but when we turn to 'higher-order' quantification, the story gets more
>complicated, since there will be several different ways to interpret a
>higher-order quantifier, and these correspond to different kinds of
>interpretation, and so it is no longer quite so clear what it means to say
>'all interpretations'. This is where the Chris/John/Fritz exchange becomes
>tricky.
>
>Higher-order logical notation
>
>Mathematicians decided about 85 years ago that they needed a way to talk
>about properties and relations as well as about things, and practicing
>ontologists came to a similar conclusion about 75 years later. First-order
>logic has symbols for properties, of course, but it doesnt allow one to
>say something about 'all properties': it only has names for properties,
>not quantifiers and variables over them. Second-order logic (SOL) allows
>such quantifiers (and third-order logic allows properties of properties,
>and fourth-order... you get the idea. 'Higher-order logic' (HOL) is
>sometimes taken to mean any of these logics and sometimes to mean their
>union where one has all properties of properties of properties of..... up
>as far as you want. It might seem like a big difference, but in practice
>once you have second order you pretty much get all the rest, so I won't
>bother to be exact about the distinctions.)
>
>Now it is not hard to tweak the syntax of the logical language to allow
>quantifiers over properties (and relations: again I won't bother with the
>details). You just allow variables to be put in the predicate position of
>an expression, ie you allow things like (?P a ?x) as well as (P ?x ?y),
>and then you allow them to be quantified. Nothing to it: CYCL does it
>without even batting an eyelid. (KIF carefully excludes this, which IMHO
>is a major problem for using KIF in ontology specification. I agree with
>Fritz and I think with John here). It gets a bit trickier however to
>specify inference rules and interpretations.
>
>Inferences using higher-order notation.
>
>It is almost as easy to adapt the usual FOL inference rules/machinery to
>handle these property quantifiers. In fact one hardly needs to adapt it:
>it works on them just like it works on other variables. The usual
>unification algorithm, for example, used by first-order reasoners, works
>just fine with the variables in the first position of an expression. (As
>far as unification is concerned, the expression is just a tree with
>constant names or variables at its tips. The usual first-order restriction
>to only allow constants in the relation position has to be added as an
>explicit restriction.) So what, one might well ask, is all the fuss about?
>
>A hint of the problem can be seen by asking whether there ought to be any
>new inference rules which we can now use which we couldnt use before. For
>example, suppose we have asserted (I'll do this in 'illegal KIF')
>(not(exists ((?P))(and (?P a) (?P b)) )
>and something else asserts (FOO a) and (BAZ b). Now, is this a
>contradiction? The slightly-modified first-order reasoner isnt going to
>see any contradiction here. But does this mean there is no contradiction,
>or that first-order reasoning isnt up to the new task? Well, it depends on
>what that "exists ((?P))" is supposed to mean. What that says is,
>intuitively, there exists a property which is true of both a and b, ie a
>and b 'share a property'. Well, do they? One person (call him F.) might
>say: no, or at least not unless FOO is the same as BAZ. But someone else
>(call him P.) might say: Oh yes they do. Consider the property 'FOO OR
>BAZ", ie the property which is true of something just when either FOO or
>BAZ is true of it (compare "No dogs or Irish need apply"). If FOO and BAZ
>are properties, so is this. And this disjunction property is true of both
>a and of b, so it is a counterxample to the axiom, so there is a
>contradiction here. Imagine how the resulting debate might go:
>
>F: That 'property' of yours isn't a real property.
>P: Sure it is. I can write a sentence in the logical language which
>defines it:
>(defproperty FUBAZ(?x) (or (FOO ?x)(BAZ ?x)))
>F: But you didn't write that definition, did you?
>P: But that is irrelevant: just because I didnt bother to define it
>explicitly, doesnt mean it doesnt *exist*. When I quantify over the
>universe, I don't quantify only over the things that have been defined
>explicitly. Even first-order quantification wouldnt work properly with
>that interpretation.
>F: But if you allow property quantifiers to range over all properties
>which could possibly be defined, they become useless, since there are all
>kinds of those 'properties'. In fact there is, potentially, such a
>property for every expression in the logic.
>P: Yes, indeed, there is; and that is why it is so hard to give a complete
>set of inference rules for higher-order quantifiers. But whether you like
>it or not, that is the universe of properties over which your quantifiers
>are supposed to be interpreted, so you had better face up to that fact.
>F: Hey, I can make my quantifiers range over any set I want them to range
>over! The question is, who is to be master, that is all.
>R: (A stranger who has been listening nearby): Actually the situation is
>even worse than either of you realize. Second-order quantifiers are true
>only over the set of ALL properties, whether they can be defined or not,
>so there might be more properties than there are expressions in the
>language, even.
>F and P , together: Listen, things are hard enough already. Go back to
>wherever you came from and leave us alone.
>
>Second order universes
>
>As one can see, the issue turns on what universe the higher-order
>quantifiers are supposed to range over. If we were giving an
>interpretation for a higher-order logic, we would have to choose a basic
>universe for the ordinary quantifiers to range over, but what would be the
>proper universe for the the second-order quantifers? Here are a few
>possible answers:
>
>1. The set of all properties, ie all subsets of the first-order universe
>(often called the 'power set' of the universe). (R's answer.)
>2. The set of all definable properties, ie those which could be defined
>using an expression of the language. This is a subset of the power set.
>(P's answer)
>3. The set of all named properties, ie the properties which have been
>assigned to predicate names by the interpretation. This is an even smaller
>subset of the power set. (F's answer.)
>
>Of these answers 1. seems best: it is simple, clear, intuitive and has the
>merit that it is completely determined by the first-order universe itself,
>so that the choice of a 'basic' universe determines the whole universe for
>all the other quantifiers (third-order property quantifiers range over
>subsets of subsets of the universe, etc. on up.). This is the 'classical'
>semantics for higher-order logic, and it is what most mathematicians mean
>when they contrast higher-order with first-order logics. Unfortunately for
>us it is utterly impractical, as one can prove that NO logical system of
>rules can possibly be complete with respect to this semantics. (The reason
>is quite interesting: basically because the power set of a countable
>universe is uncountable. This means for example that there are uncountably
>many properties of the integers; see R's interjection. Maybe we don't
>really want to be quantifying over such a large set after all...) What
>this means is that no matter how loudly someone might claim that this was
>the semantics they had in mind for their logic, the logic itself - and any
>implementation of it - will not in fact accurately reflect this
>semantics. It can only be wishful thinking. There is absolutely no way
>around this, its just a mathematical fact of life. This is the sense in
>which Chris is right in emphasising that SOL (or HOL) cannot be
>transcribed into FOL or indeed into any first-order theory.
>
>(An aside. Fritz's remark that we just have to abandon completeness is a
>misunderstanding about the nature of completeness. If we 'abandon
>completeness' in this sense then we are just claiming the right to assign
>arbitrary semantic interpetations to our logics as a kind of ritualization
>of fantasy. I can claim that the symbol "urgh" has the same semantics as
>the US Constitution, but what does that claim amount to? How could it be
>justified? It wouldnt be enough for me to just say that this was my
>"intended interpretation" and expect my machinery to work accordingly.
>Completeness is the only guarantee that our semantic claims about our
>logics, and the actual inferential performance of those logics, have any
>meaningful relationship to each other. 'Abandoning completeness' allows
>one to bask in a kind of glow of satisfaction about what ones axioms seem
>to be saying, but it doesnt help with the awkward fact that the
>'nonstandard' models which one has conveniently refused to allow into ones
>preferred semantic account are, like it or not, the sand in the machinery
>of ones inference system.)
>
>(Just as a point of clarification: Ive been careful to distinguish between
>a logic, and a language or notation. One can take a given notation and
>define a variety of different inference rules for it, each giving a
>different logic. In the FOO OR BAZ example, there are (at least) two
>different logics being considered, but they both involve the same
>language. The moral is that one cannot tell a logic from its syntax: one
>needs to know something more about it. This 'something more' might be
>some inference rules or it might be a system of interpretation, or both.
>If both and they correspond to each other, we have completeness. If only a
>semantics with no rules, we have an aspiration. If only rules but no
>semantics, we have confusion.)
>
>Returning to the above answers, the imaginary debate between F. and P. is
>really about the difference between 2. and 3. Here there is something
>useful to be said. Answer number 2 - P's position in the debate - gives
>what is often called the Henkin semantics for HOL (named for its inventor)
>. This has several nice properties. There is a complete set of inference
>rules, and HOL with this semantics can be transcribed into a (rather
>clumsy, but never mind for now) ontology in FOL. (It is a many-sorted
>logic with a sort for each arity of each level, eg there is a separate
>sort for third-order predicates with seven arguments, etc., and a lot of
>axioms about how various combinations of predicates are connected with
>each other.) This is the sense which Chris dismisses as being just a
>notational variant on FOL, as indeed in a robust mathematical sense it is.
>Nevertheless, it is a distinctly different logic to mechanize, one with
>its own special inference rules which go beyond the usual first-order
>inference machinery (for example, it would be able to find the FOO OR BAZ
>contradiction all by itself) and which are very tricky to implement.
>(Another way of saying the same thing is that the FOL ontology which would
>'transcribe' a Henkin-interpreted HOL is so clumsy as to be unusable in
>practice.) For example, the usual unification algorithm notoriously does
>not work in this version of HOL (it doesnt even work for SOL). I know of
>no current implementation of higher-order logic in this sense, though if
>there was one that would be a wonderful thing. (To implement the logic
>directly one would need rules of lambda-conversion to be incorporated into
>the unification algorithm.)
>
>Answer 3 makes a higher-order quantification amount to even less than the
>second answer, and as that is equivalent to FOL, one might expect that
>with this answer, a higher-order quantification is actually LESS
>expressive than a first-order quantifier, and indeed that is the case.
>What this amounts to is that higher-order quantifiers are equivalent to
>quantifications over the names of properties rather than the properties
>themselves. When one writes
>(forall ((?P))...?P....) one really means, on this interpretation,
>something like
>(forall ((?n)) (implies (PropertyName ?n) ....(propertynamedby ?n)....)
>and any property that doesnt have an explicit name is just not in the
>universe of properties.
>
>Now, this last way of interpreting second-order quantifiers is SO limited
>in semantic expressiveness that the mathematicians of 85 years ago wouldnt
>have taken it seriously for a minute. Nevertheless, I would urge, this is
>in fact the most useful way to interpret higher-order quantifiers for
>ontological use. It means that higher-order quantifiers are severely
>restricted in their scope, but still it allows one to write properties of
>properties of ... etc., to be able to draw conclusions about named
>properties, to define classes of properties, etc., and use all that
>expressive licence whichFritz and other happy CYCLists have been taking
>for granted for years; and still have the resulting inferences interpreted
>by a standard efficient first-order reasoner.
>
>In terms of the 'semantic debates' to which Chris and John have been
>referring, it is so weak as to be hardly worth mentioning, but for
>practical usage in writing ontologies it is of immense and liberating
>value. It costs nothing in inference complexity and does not complicate
>the proof theory in any harmful way, and it fits smoothly into the usual
>inference machinery. Its very weakness is an odd source of strength, in
>that it renders harmless a kind of use/mention casualness which one sees
>often in CYC, where a predicate on a property can refer either to the
>property or to the name of the property. In conventional interpretations
>this is a potential disaster, but with this weak notion of higher-order
>quantifier it is harmless. It also allows the user to control the universe
>of quantification by providing new definitions, for example.
>
>Why everyone except me is wrong
>
>So, to return to the adhominemism for a moment. Chris is the mathematician
>here, and he is correct (contra John) that HOL with the Henkin semantics
>is no more expressive than FOL, and that it is incorrect to say that SOL
>(still less HOL) is *equivalent* to a sorted first-order
>theory. Nevertheless John is right to emphasise that even with the Henkin
>semantics, SOL and HOL are distinct languages with their own distinct
>proof styles and are in many ways superior to their transcription into
>first-order language. Unfortunately these strengths do not translate into
>mechanizability, as far as I know. Fritz is, I guess, entitled to claim
>that he will ignore completeness and hence claim that his logic fits a
>semantic theory which it in fact doesn't fit, but then he is also entitled
>to claim that black is white. The truth of the matter is that the logic
>which Fritz likes so much is actually FOL inference applied directly to
>HOL syntax, an odd blend from the traditional point of view but one which
>(Fritz is right to emphasise) is of considerable utility for representing
>ontologies, and which is readily mechanizable using well-known techniques.
>(But the debates would be a lot shorter and less noisy if John and Fritz
>would stop making philosophical claims which are overstated, controversial
>and unnecessary to their main points and irrelevant to the main topic at
>hand. So there.)
>
>-----
>
>I plan to expand on this in a later message, giving a more thorough
>description of the proposed language. One might call it a CYClified
>version of KIF. (This logic lies somewhere in between answers 2 and 3, by
>the way, which is where CYCL may also lie.)
>
>Pat Hayes
>---------------------------------------------------------------------
>IHMC (850)434 8903 home
>40 South Alcaniz St. (850)202 4416 office
>Pensacola, FL 32501 (850)202 4440 fax
>phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes
>
-----------------
Adam Pease
Teknowledge
(650) 424-0500 x571
- Follow-Ups:
- SUO KIF
- From: Chris Menzel <cmenzel@tamu.edu>