Re: SUO: RE: Architectures for Intelligent Systems
[foreword: I am answering below a message from Matthew West which,
while still apparently distributed, disappeared in the twilight zone.
I recalled its content in: http://suo.ieee.org/email/msg08324.html
Silly bugs struck again. OTOH, it feels good to know we are still
in the REAL world where there is no "silver bullet" for software
problems.
I foresee that *ontology* bugs will be even more disastrous
than software bugs, barring drastic changes in methodology,
because any bit of knowledge in an ontology has a far more
widespread reach than a single line of source code, however badly
crippled it might be. So, gentlemen, it's UP TO YOU...]
Dear Matthew,
I will only comment on the points for which I have some answers.
> > - A more "classical" ontology of basic concepts, that is, concepts
> > for non structured things like 'color', 'temperature', 'right'
> > and 'left' etc... Anything that could NOT be explained, say, to
> > an alien, because it needs monstration. Yet, provision would be
> > made for inclusion of "opaque" new concepts, like when explaining
> > the word 'red' to a blind man, he can learn to use it properly
> > while still having *no clue* of what it is.
>
> MW: The size of this is potentially huge, it might even be infinite.
Probably not. The OED contains only 300,000 words. Assuming technical words
are not included, apply a factor of 3, 5, 10, whatever; then scale down by
the ratio of "unstructured things" to the total (I don't know how small it
is, but it is likely tiny); then split by domain: biology, chemistry (yes,
overlapping, but nevertheless), sociology, whatever. The result may be huge
(not so in my opinion) but is still manageable.
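
To make the arithmetic concrete, here is a rough sketch of that estimate.
Only the 300,000 figure and the factors 3, 5 and 10 come from the text
above; the ratio of basic concepts and the number of domains are made-up
placeholders, and the even split ignores the overlap between domains.

```python
# Back-of-envelope estimate of the number of "basic" (unstructured) concepts.
# BASIC_RATIO and DOMAINS are hypothetical placeholders, not measured values.
OED_WORDS = 300_000
TECHNICAL_FACTORS = (3, 5, 10)
BASIC_RATIO = 0.01   # assumed share of words naming unstructured things
DOMAINS = 20         # assumed number of application domains

def basic_concepts_per_domain(factor, ratio=BASIC_RATIO, domains=DOMAINS):
    """Scale the vocabulary up, keep the basic share, split evenly by domain."""
    total_words = OED_WORDS * factor
    basic = total_words * ratio
    return basic / domains

estimates = {f: basic_concepts_per_domain(f) for f in TECHNICAL_FACTORS}
for factor, per_domain in estimates.items():
    print(f"factor {factor:>2}: roughly {per_domain:,.0f} basic concepts per domain")
```

Whatever placeholder values one prefers, the result stays finite and of a
size that can be handled domain by domain.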
> > This part is even more "delicate" to set up than the logical framework
> > because I would not trust "hand-crafting" that part,
>
> MW: On the other hand I know of no automated process I would trust
> to identify concepts reliably, not least because no one has been able
> to describe such a process to me satisfactorily. This is the holy grail
> that Jon is after, and it is worth seeking,
Please cool down! We are not looking for an understanding of Shakespeare,
Aristotle or the like, as it seems Jon is after.
> but it has not yet been found as far as I know. So I am left with hand
> crafting, until I can at least understand the hand crafting process.
No. For my part, I would place more trust in a purely syntactic process
that identifies the leaf nodes in a tree of concepts set up according to
some parsing + semantic rules (however crude).
The parsing and semantic rules would obviously have to be adjusted
until one is satisfied with the output, but please, NO HAND TUNING of
the output!!!
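
As a toy illustration only (the corpus, the single "X is a Y" parsing rule
and the function names are all assumptions of mine, not a worked-out
method), such a process could look like this: the rule builds the tree, the
leaves fall out mechanically, and any tuning is done on the rule, never on
the output.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus; a real one would be raw text from the domain.
CORPUS = """
A sparrow is a bird. A penguin is a bird. A bird is an animal.
A trout is a fish. A fish is an animal.
"""

# Crude parsing rule (an assumption): "a/an X is a/an Y" makes Y the parent of X.
IS_A = re.compile(r"\ban?\s+(\w+)\s+is\s+an?\s+(\w+)", re.IGNORECASE)

def build_tree(text):
    """Return a parent -> set of children mapping extracted by the rule."""
    children = defaultdict(set)
    for child, parent in IS_A.findall(text):
        children[parent.lower()].add(child.lower())
    return children

def leaf_nodes(children):
    """Leaves are concepts that appear as a child but never as a parent."""
    all_children = set().union(*children.values()) if children else set()
    return sorted(all_children - set(children))

tree = build_tree(CORPUS)
leaves = leaf_nodes(tree)
print(leaves)
```

If the leaves look wrong, one changes IS_A (or adds rules) and reruns
everything from scratch.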
> > no more than I
> > trust the SUO effort.
>
> MW: Again, do you mean SUMO here? Please be careful.
No need to be more careful than the "true believers" in SUO Scope & Purpose.
> > I envision some kind of analysis over a corpus of somehow "raw" text
> > pertaining to a given domain to locate the terminal leafs in the
> > concepts structure.
>
> MW: Well this is fine, and what is usually done, but it doesn't have to
> be automated.
Oh yes, it does!
The reasons are reliability, cost, and maintainability (redoing from scratch
whenever changes in the state of the art require it).
> > This means that this "basic concepts" part is likely to be split
> > between many application domains.
>
> MW: ???
See above, you don't need all biology concepts when talking about
chemistry and vice-versa even if the domains overlap.
> > It will be "modular" but with a mostly flat structure not very much
> > lattice-like,
>
> MW: I don't think you should be trying to determine the structure of an
> ontology, but discover it.
Of course, but I just mean that the lattice of "basic" concepts and
attributes will not be very "hairy" nor of much "height".
> > Specialised domain ontologies would be
> > much less worrisome and MUCH MORE USEFUL FOR REAL APPLICATIONS!
>
> MW: I'm not convinced about this. I find it interesting to note that
> a core set of concepts (e.g. classification, specialisation, composition,
> connection) are involved in almost any domain, and generally dominate
> the description of that domain.
These are likely to receive a pretty "formal" definition and that's
why they are shared by so many domains. They are akin to the axioms of
a math theory. With probably only a few exceptions (for instance,
zoology classification versus chemistry classification, maybe...)
they can be put in a shared common "top level" domain.
> > From such a basis, ontologies would be free to use the
> > formalism that suits them most among the ones mentioned above
> > and even to invent any new one provided they give a formal
> > description of its syntax and semantics.
>
> MW: Therein lies a can of worms.
Not necessarily. Developers will be constrained by their *own*
ability to describe the syntax and semantics with the core
formalism. That will probably keep them from chasing overly
"metaphysical" (and practically useless...) distinctions.
Once done, it will not be a problem for the broker to deal with
its own formalism. Any discrepancies with the "intended" semantics
will be the responsibility of the writers. CODE IS LAW!
> > Then from the "basic" concepts of one or more domains they will
> > build structured concepts of their own that could be matched by
> > the broker to any other ontology developed along the same rules.
> > There will be some rules about the way to qualify the "constructed"
> > concepts such as to satisfy the FCA like requirements for the broker.
>
> MW: Would you care to state what the rules are?
The only rule, or nearly so, is that from the particular formalism and the
content actually entered for each word, it should be possible to
automatically reformat the whole ontology graph (internally and
for the broker's purposes only) so as to provide a set of concepts
(unary "typing" predicates) and attributes (binary "properties",
predicates tying a "part" or "quality" to an object).
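
A minimal sketch of what that reformatting could produce, assuming the
source ontology can at least be dumped as (subject, relation, value)
triples. The triple layout, the "is_a" convention and the sample data are
hypothetical; any real formalism would need its own adapter.

```python
from collections import defaultdict

# Hypothetical ontology content in a generic (subject, relation, value) form.
TRIPLES = [
    ("car1", "is_a", "car"),
    ("car1", "power_source", "engine"),
    ("bike1", "is_a", "bicycle"),
    ("bike1", "power_source", "rider"),
]

def to_formal_context(triples):
    """Flatten triples into an object -> set of predicates table.

    Unary "typing" predicates become ("type", concept). Binary properties
    become (relation, concept), the second argument being read as "some
    instance of that concept" (existentially quantified).
    """
    context = defaultdict(set)
    for subject, relation, value in triples:
        if relation == "is_a":
            context[subject].add(("type", value))
        else:
            context[subject].add((relation, value))
    return dict(context)

context = to_formal_context(TRIPLES)
for obj, predicates in sorted(context.items()):
    print(obj, sorted(predicates))
```

The table is internal to the broker; the source ontology keeps whatever
shape its authors gave it.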
> > Existing ontologies already built "outside" of these rules will possibly
> > have to be reshaped (still within their own formalism) and maybe even
> > *supplemented* to satisfy the rules. This should be a mechanised process
> > which will depend mostly on the base formalism and in some cases, perhaps,
> > on the particular domain or the specific ontology within the domain.
>
> MW: I.e. they have to be integrated. Again I do not see something
> well defined enough here that it can be "mechanised".
The answer is above. However, as I have not checked yet, it may be
that some existing formalisms are "intrinsically" unamenable to
the above capability.
> > About consistency, I already said in a working document which has
> > only been forwarded to Jay Halcomb (sigh...) and a few others:
> >
> > > I don't expect the basic "negotiation" capabilities to ever fall
> > > into this [Russell's paradox].
> > > The *content* of what is negotiated might, but that's not
> > > the *core ontology*'s business.
> > >
> > > It's up to you (the "receiving" ontology) to assess the trustworthiness
> > > of your remote correspondent ontology to avoid it screwing the consistency
> > > of your knowledge.
> > > Domain dependency again, plus "out of ontology" criterion of safety,
> > > there will be *ontology viruses*, isn't that great!
>
> MW: Again, where is the well defined process that can be automated.
Sketched below in the previous message.
This amounts to downloading a "theory" from a third party and "trusting"
it enough to make inferences by its rules on your own data.
The consistency checks have been described and I believe they are quite
stringent, but you never know, malicious third parties may try to
defeat them and succeed.
> > Of course, there will have to be disambiguation of homonyms from
> > different domains and even different contexts within the same domain
> > etc, etc... but this is a development effort *not* an
> > architectural one.
> > Because as I already required, there should *not* be direct mapping
> > from lexemes to concepts (nodes in the FCA lattice).
>
> MW: And how are you going to automate that?
See below about contextual disambiguation.
> > > Take a really simple example. Suppose you have one ontology that
> > > does mereology based on a spatio-temporal ontology. How do you know
> > > (=find out) that you can't mix that with another ontology that uses
> > > continuants as its underlying paradigm of persistence?
> >
> > I don't know about "mereology based on a spatio-temporal
> > ontology", so
> > I cannot fully elaborate on this, you would be kind to give me a more
> > "mundane" example exhibiting the same problem.
>
> MW: The example is quite mundane. Mereology is the study of whole/part.
> A spatio-temporal ontology is one that sees individuals as spatio-temporal
> extents (4d worms if you like) rather than as continuants, that are wholly
> present at each point in time and persistent through time. Classical
> mereology is sufficient for a spatio-temporal ontology, but it is not for
> a continuant based one (at least as soon as time is taken into account).
> If you want to understand how different these are I suggest reading Peter
> Simons' "Parts".
Ah! Yes, the famous 4D vs 3D debate, I will have a look.
Is there anything similar and as nasty as that outside
spatio-temporal ontologies?
But anyway, I see no reason for this to impact my proposal if it can
be made "basic" enough. It will be up to the proponents of either 3D
or 4D to state the semantics they want within the core formalism and
then, maybe, it will be found that with the help of fully formalised
specs of each variant the correspondences between the two will be
more understandable and even automatically cross-translatable.
> MW: As I mentioned above, composition (whole/part) is one of those basic
> inescapable subjects, so we need to have some answers here.
See my responses to Tim King in: http://suo.ieee.org/email/msg08309.html
> > If the primitives are properly described it should always be
> > *theoretically* possible for the broker to figure out the "mapping"
> > between the two sets of primitives, but that may put a real
> > strain on
> > the inference engine and may not be "practically" feasible, too bad!
>
> MW: Again I see no process I can automate.
Until I know more about the 4D/3D problem I can illustrate this
with the trivial example of the representation of complex numbers
in programming languages.
You may choose an imaginary/real representation or a polar one
but that does not mean that you cannot provide a conversion
between languages using different conventions and, even, primitives
to alter only the imaginary or real value in an environment where
the representation is actually a polar one.
Primitives of this kind will just bring a performance penalty.
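
For what it is worth, the complex number case can be sketched as follows.
The PolarComplex class and its method names are purely illustrative; only
cmath.polar and cmath.rect are standard library calls.

```python
import cmath

class PolarComplex:
    """Complex number stored as (modulus, phase); a hypothetical illustration."""

    def __init__(self, modulus, phase):
        self.modulus = modulus
        self.phase = phase

    @classmethod
    def from_rect(cls, z):
        """Convert from Python's built-in real/imaginary representation."""
        return cls(*cmath.polar(z))

    def to_rect(self):
        """Convert back to the real/imaginary representation."""
        return cmath.rect(self.modulus, self.phase)

    def with_real(self, new_real):
        """Change only the real part: convert, modify, convert back.

        The round trip through the other representation is where the
        extra cost comes from.
        """
        z = self.to_rect()
        return PolarComplex.from_rect(complex(new_real, z.imag))

p = PolarComplex.from_rect(3 + 4j)
q = p.with_real(0.0)
print(p.modulus, q.to_rect())
```

Here the conversion was written by hand; the claim is only that, with fully
formalised semantics on both sides, nothing prevents such a conversion
from existing.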
Within the broker, given that the semantics are *totally* formalised, it
should be possible to devise that kind of conversion automatically
thru the inference engine, but as I know just as well as you do, this
will really tax its capabilities for non trivial cases.
A meta-remark: You seem quite "scared" by unknown territories and
not willing to advance until you are fairly sure that *all* problems
will be solvable. But unless you actually try, you never know if the
*practical* cases will fall or not within the feasibility range.
There are many, many algorithms whose worst case is truly intractable
yet which are of great value in almost all everyday uses.
There is always the possibility of a hardship, where the very case you need
will not work, but that's life, isn't it?
> > The main point about the "interoperable" ontology structure is that
> > concepts have to be defined *only* by their attributes/properties
> > such as to allow the computation of "proper" concepts by FCA closure
> > upon the extension and comprehension.
> >
> > Of course this has to end somewhere, this why "basic" concepts are
> > to be provided and agreed upon in the domain specific ontologies.
> > These basic concepts are rather attributes/properties than concepts,
> > that is, they are *elements* of the comprehension set of the FCA, not
> > subsets (for uniformity they also can be viewed as singletons).
>
> MW: This approach imposes a whole philosophy I do not necessarily
> subscribe to.
Right! There are three solutions:
- Your "philosophy" can be mapped back and forth to the broker "philosophy"
and your description of it is precise enough to allow it to do that!
- Your problem domain is still amenable to the broker "philosophy" and,
if you really need your work done, you convert to the broker "philosophy".
- You stay out of the game.
> > Non basic attributes/properties have the form of binary relation
> > where one argument (say the first) is an instance of the concept
> > being defined and the other argument is an existentially quantified
> > occurrence of another concept.
> >
> > A car has a 'power source' which is an 'engine'.
> >
> > Thus, the relation itself must be "conceptualised" with respect to
> > it's "function", "role" or "place" within the defined concept.
> > It is not enough to say "a car 'has' an engine".
> >
> > This is a REQUIREMENT for the broker logic to be able to work.
> >
> > ******************************************************************
> > * However, that does NOT mean that the source ontology has to *
> > * be formatted this way. Only that it's content plus the formal *
> > * description of it's structure should allow the source broker *
> > * to present the data in such a format to the querying broker. *
> > ******************************************************************
> >
> > Assume some query/response interaction is going on between us and a
> > remote party which has his own ontology (compatible thru the broker
> > but different from ours) and that the syntax of the exchange has
> > already been agreed upon.
> >
> > It may happen that within a query or response a word appears for which
> > the "meaning" is not yet known. Then we have to ask for
> > "explanations",
> > that is we have to engage in a "sub-dialog" in order to figure out
> > what can be the "translated" meaning in our local ontology.
> >
> > We have to ask for the "attributes" that define this new word.
>
> MW: By now you are being quite prescriptive about how ontologies should
> look.
NOT prescriptive at all about the "look", only about the information content.
See responses above about "the rules".
> Which at one level is fine, but at another level, I doubt if many
> current ontologies conform, so they would all have to be rewritten,
Probably not so bad. As I explained in my response to Tim King, you can
always replace a missing piece of information with a structurally equivalent
but uninformative one that will allow the broker logic to proceed,
just making the matching of concepts less reliable.
> in which case they might as well be incorporated into a single integrated
> ontology, which your rules are pushing towards anyway.
NO, NO, NO!
The fact that adhering to a common formalism indeed allows you
to do that SHOULD NOT BE AN EXCUSE for doing it.
There are other reasons to keep all ontologies decoupled.
Local variants of the same ontology will have their uses, and yet,
they *will still be able to share* knowledge!
Just as an example, assume that two sites use the same ontology but
at different revision levels, for historical reasons etc... (you
should know about that kind of "reasons" just as well as I do!)
The difference of revision level is NO MORE than any other kind
of difference in both the domains (old vs new) and the contexts
(rev1.word vs rev2.word), so it is *manageable* by the broker
logic without any special considerations!
> > Suppose first, that being answered with the list of attributes that
> > define this new word we happen to "know" all of them, that is, we
> > already have a translated meaning in our local ontology for each one.
> > Then we check if this set of translated attributes in our ontology
> > constitute a proper concept closed under |><|.
>
> MW: How would this happen (that we - the system - "know" them all)?
> Again it seems to me that you are stating the conditions that mean we
> must have integrated the ontologies before they can interoperate. After
> they are integrated, of course they can interoperate.
Two concepts (humanly intended to be the same) from two different
ontologies will just have to have *enough* common and "sensible"
attributes to make that work.
OTOH if, as Jon Awbrey mentioned in one of his messages, you define
a human being as just an "apterous biped" you are in trouble!
> > But, adding a new "thing" instance may actually change drastically the
> > structure of the local concepts lattice and may require that we make
> > choices about possible renamings and redefinitions of
> > existing concepts.
>
> MW: Yes, but this is the integration process, and I do not see how you
> automate it. You can only automate interoperation after the integration
> process has been undertaken.
NO. Please read carefully the following passage from the previous message.
I did not then elaborate about contextual disambiguation but I will below.
> > This is the 'penguin' case that I first mentioned in msg08062.
> > I will try to specify more formally what happens in such a case.
> >
> > Suppose we (the receiving ontology A) know about birds but not penguins.
> > When we first "hear" about birds from the source ontology B and ask for
> > the definition (assuming we agree with the definition of all attributes
> > involved, if not this is a case for recursion) we notice that B.birds
> > curiously match our A.bird concept except that they don't seem to fly!
> >
> > How is it that we are still able to figure out that B.birds are likely
> > the same that our A.birds?
> > Because when we take the |><| closure of the B.bird thru our FCA
> > lattice it maps back to our A.bird!
> >
> > So we know that something is strange about B.birds even before they
> > tell us about penguins, but yet these must be birds.
> > Then we create an instance of an hypothetical B.bird in the X set
> > of our FCA lattice in order to make the B.bird concept properly
> > closed and bind the B.bird name to this new concept.
> > If we need a local name for it, just call it a "bird of B".
> >
> > If B then tell us about 'penguins' we will not be "surprised" and
> > will not need to change anything.
> >
> > But, suppose B now talks about how birds fly (assuming that flight
> > is an agreed upon concept), then this means that B is using his
> > word 'bird' ambiguously to mean either any bird or only "flying birds"
> > and that B *does* have a concept exactly matching our A.bird concept
> > but that he did not care to name it.
> >
> > This is a case for *contextual disambiguation*, NOT a
> > conflict of concepts!!!
>
> MW: Yes, but how do you automate this process, except when both "bird"
> concepts are defined in terms of the same higher level concepts.
Yes, they must have feathers, beak and any other zoological or behavioral
attributes (laying eggs, flying) that may define them but NOT ALL attributes
need to be present on each side for a match to occur.
This is the case explained above where a "bird of B" does not necessarily
fly, just check what the |><| closure does.
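
To see what the closure does on a toy case, here is a sketch that assumes
|><| is the usual FCA closure (take all objects having the given
attributes, then all attributes those objects share). The objects and
attribute names are invented for the example.

```python
# Receiving ontology A as a formal context: object -> set of attributes.
# Objects and attribute names are hypothetical.
A_CONTEXT = {
    "sparrow": {"feathers", "beak", "lays_eggs", "flies"},
    "eagle":   {"feathers", "beak", "lays_eggs", "flies"},
    "bat":     {"fur", "flies"},
    "trout":   {"scales", "lays_eggs"},
}

def extent(attributes, context):
    """All objects that have every attribute in the given set."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}

def intent(objects, context):
    """All attributes shared by every object in the given set."""
    if not objects:
        # By convention the empty set of objects shares every attribute.
        return set().union(*context.values())
    return set.intersection(*(context[obj] for obj in objects))

def closure(attributes, context):
    """Smallest closed attribute set (concept intent) containing the input."""
    return intent(extent(attributes, context), context)

A_BIRD = closure({"feathers"}, A_CONTEXT)
B_BIRD = {"feathers", "beak", "lays_eggs"}   # as told by B: no 'flies'
print(closure(B_BIRD, A_CONTEXT) == A_BIRD)

# Adding a hypothetical instance makes B's attribute set closed in A's lattice.
extended = dict(A_CONTEXT, bird_of_B=set(B_BIRD))
print(closure(B_BIRD, extended) == B_BIRD)
```

The first check is the "maps back to our A.bird" step; the second is the
creation of the hypothetical "bird of B" instance.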
The case for disambiguation arises because B is using his word 'bird' in
an ambiguous manner, to mean either any bird or only "flying birds".
We already know that a "bird of B" is also a bird for us except that B
seems never to have seen a bird flying; therefore when B talks of birds
we (usually) map the word to our "bird of B" concept, which lacks the
flying property.
But when a "bird of B" concept appears in a statement where, for the
statement to make sense (I know, I know, I must define how the broker
can figure out what makes sense, but I will *not* try to explain this
here *nor* in any further messages), the said bird must fly, we first
ask B if, by any chance, he has a name for flying birds, by issuing
some "reverse" query, sending all the attributes of the "bird of B"
comprehension *plus* the 'flying' property.
B will then answer with the full list of attributes for the |><| closure
of this set in *his* FCA lattice and a name, if and only if he has one.
Because it might well be that B knows about flying birds and has some more
attributes that he knows are always present for flying birds, but did
*not* care to give a name to this concept.
If the returned list of attributes does not include all the attributes
of the "bird of B" comprehension plus the 'flying' property, this is
an error condition on B's side. Either his concepts are corrupted,
or his statement about flying birds is inconsistent, or we screwed up
ourselves somewhere (there might be bugs...). In any case the statement
should be rejected.
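
The exchange and its sanity check can be sketched as below. Everything
here is hypothetical (B's context, the function names, the absence of any
real wire protocol); it only shows the shape of the query and of the
rejection test.

```python
# B's formal context, known only to B; objects and attributes are hypothetical.
B_CONTEXT = {
    "sparrow": {"feathers", "beak", "lays_eggs", "flies", "hollow_bones"},
    "penguin": {"feathers", "beak", "lays_eggs", "swims"},
}
B_NAMES = {}   # closed attribute set -> name; B never named "flying birds"

def b_answer(attributes):
    """B's side: closure of the attributes in B's lattice, plus a name or None."""
    matching = [attrs for attrs in B_CONTEXT.values() if attributes <= attrs]
    if matching:
        closed = set.intersection(*matching)
    else:
        closed = set().union(*B_CONTEXT.values())
    return closed, B_NAMES.get(frozenset(closed))

def reverse_query(bird_of_b, extra, ask=b_answer):
    """A's side: send the known attributes plus one property, validate the reply."""
    sent = set(bird_of_b) | {extra}
    closed, name = ask(sent)
    if not sent <= closed:
        # The reply must contain everything we sent, otherwise reject.
        raise ValueError("inconsistent answer from B: statement rejected")
    return closed, name

attrs, name = reverse_query({"feathers", "beak", "lays_eggs"}, "flies")
print(sorted(attrs), name)
```

In this toy run B returns one extra attribute he always associates with
flying birds and no name, which is exactly the "did not care to name it"
situation.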
We then know that when B talks about a "bird of B" in a context which
implies flying, B also assumes that this bird has all the attributes he
told us for flying birds. If there is also a name we may choose or not to
replace the ambiguous occurrence of "bird of B" by "flying bird of B".
This depends on the *local* application and will have to be specified
to the broker by "option setting" callbacks.
But we are going much too far into implementation details here. Since
you asked, I am trying to reassure you, but this is NOT the way to
conduct an architectural design.
One should be able to make design decisions WITHOUT consideration for
such points and yet have a *reasonable* confidence that those decisions
will not have to be reconsidered when detailing further specific points.
There is never any GUARANTEE for that, design is an ART!!!
Back to the point. In any case, all *occurrences* of "bird of B" in
a context which implies flying will have to be mapped to the B concept
for flying birds that he told us about, irrespective of the replacement
of the concept name at those same occurrences.
There may be other conventions to resolve contextual ambiguities in
statements from B but they must be passed as integrity constraints
*thru* the protocol just like any other statement and will not impact
the broker logic at the lower levels.
> > Please notice that all this does NOT rely on the spelling of the
> > word 'bird' on either side, so we indeed succeeded in *translating*
> > the word 'bird' between the two knowledge domains!
> >
> > The matching of attributes is going to be more tricky, because it
> > implies recognising the *equivalence* of two relations which is
> > formally undecidable in a general setting.
> >
> > But, as usual, practical REAL cases will be solved and a hint about
> > how this can happen and even be real fast (not grinding thru the
> > theorem prover for hours) is given in msg08252.
> >
> > The trick is that, with each assertion there must be a proof of the
> > consistency of the assertion stated of course within the formalism
> > of the originating ontology.
> > The proof itself is subject to "translation" under the formalism of
> > the receiving ontology, but once this is done, THIS IS A READY MADE
> > PROOF of the consistency of the assertion within the receiving
> > ontology which has only to be checked with no search involved.
> >
> > There CERTAINLY are some difficulties remaining to recognise
> > equivalence of relations used to define attributes/properties and
> > written by different people under different assumptions, because
> > this involves having a confluent rewriting system which will give
> > the same "canonical" representation to equivalent relations, and
> > this is undecidable in a general setting.
> >
> > But AGAIN (and AGAIN!) how many of
> > the REAL cases will be hindered by this?
> >
> > I hope this gives you a "feeling" about a method for integrating
> > independently developed ontologies through some automated process.
>
> MW: I do not believe you have described the automatic integration of
> independently developed ontologies, but only the interoperation of
> ontologies that are already implicitly integrated through the use of
> common classes (your attributes and properties).
Neither will you learn a foreign language without a minimal
dictionary or a live interaction which will allow you to grasp
some elementary meanings thru monstration.
The key is BOOTSTRAPPING the translation process.
Only a VERY small number of attributes and properties will have to be
agreed upon, all the others will be constructed from these.
What's more, if you have no hope of any success, why are you watching SUO?
> > Solving all the details will probably require *hundreds* of pages
> > of specifications, but the road is THERE!
>
> MW: I think you need to investigate what it takes to develop the set
> of classes (attributes and properties) that you need to do this. I
> have been working on this area for some 15 years,
HAND-CRAFTING!!!
No one will ever get anywhere without giving that up!
> and as yet I have seen no limit to the number of classes you need.
> Never mind that the classes and what they mean are different depending
> on the paradigm you use.
Interesting!
Doesn't this mean that trying to build a "good enough" ontology
is even more futile than what I propose?
Only the "paradigm" can be a real problem and even that is not certain.
There will be choices to be made AGAINST the "philosophy" and
"metaphysics" of some people, but that only means they will not
like the basic concepts offered. That is not really worrying
if they still CAN rebuild the concepts they want from the
"common" concepts, just like you may choose either the metric
system or avoirdupois but you have to settle for one to make
sense of a number on a blueprint, and stick to the convention.
And, once more, DON'T AIM AT NATURAL LANGUAGES, for now and
probably quite a long while.
> MW: I should perhaps contribute something positive. I have defined
> what I think it takes to achieve integration in an ISO Technical
> Specification, ISO TS 18876 - Integration of Industrial Data for
> Exchange Access and Sharing (IIDEAS). You can find the Committee
> Drafts for Parts 1&2 at http://www.iso18876.org/.
Thanks, I downloaded it and will have a look.
Cheers.
-- Jean-Luc Delatre
--------------------------------------------------------------
"The algorithm to do that is extremely nasty. You might want
to mug someone with it." -- M. Devine, Computer Science 340
--------------------------------------------------------------
http://perso.club-internet.fr/jld/ -- GSM: +33 6 11 24 06 29