SUO: should a standard specify...
All,
I face some practical problems that I hope you can help me with.
The broad topic has to do with how to access the guts of ontologies
based on the ontologies themselves and perhaps also their specifications.
Those in a hurry can skip to the SUMMARY at the end.
GOAL:
I am trying to extract suitable sections of OpenCyc, SUMO, and perhaps ISO15926
to serve as plausible example ontologies for demonstrating the IFF.
PROBLEM:
Faced with hundreds of megs of terms, definitions, comments, relations,
axioms, and so forth, how can I comb through these ontologies to
do the following?
(1) find out what are the ontological primitives (e.g., terms, relations,
axioms) used by each ontology and whether these primitives are compatible
(e.g., do the ontologies mean the same thing by "individual", "constant",
"collection", "relation", ... ?);
(2) carve out from each of the ontologies subontologies that deal
with some particular domain, say, naive spatial reasoning;
(3) get an idea of possible correspondences among the different
subontologies, to determine whether the ontologies stand a chance of
being able to interoperate with each other.
FUNDAMENTAL ISSUE:
How to get the relevant ontological primitives and assertions from the
ontology itself?
(It is interesting to note that even if I were working with much
smaller ontologies, the fundamental issue would still remain.)
Now you may say, "Well, that's your problem because of what you've
chosen to do, and it's not really a problem that members of this group need
to worry about." But I think it is, and I will try to explain why.
Many members of the group would seem to agree that the SUO should have some
kind of framework within which different ontologies can "interoperate."
(SUMO itself is modular; and the IFF group feels that the IFF and SUMO
represent complementary approaches to achieving interoperability.)
For the sake of argument, let's suppose that there is indeed widespread
agreement among members on that score.
It seems that interoperability always comes at some price, such as
agreement on certain conventions. From recent postings to the list, one
price people seem ready to pay is that of agreeing on a particular
logical language.
That agreement might be paraphrased roughly as follows: "once different
ontologies can represent entities, relations, axioms, and semantics in
the same language (something like KIF/CL ??), then we can begin to
provide a framework that allows different ontologies to interoperate."
(I'm leaving "interoperate" as an undefined term, because I think its
definition is not necessary for my argument.)
What I want to suggest is that the assumption of a common language for
expressing different ontologies may not be realistic, at least not at the
present time. (See, for example, the discussion below of "thing" in SUMO,
Cyc, and ISO15926-2.)
What I want further to suggest is that maybe what is needed is not so much
an agreed-upon common language, but an agreed-upon method to identify in
a given ontology what are the primitive entities, relations, axioms, and
definitions, and how these primitives can be used to extract
domain-specific subontologies. Perhaps such an agreed-upon method should be
part of the ontology's specification. Such a method would certainly help in
the examples below.
That is, I am wondering how the "relevant primitive guts" of different
ontologies can be identified, and I am looking to find some method
that will allow humans and machines to use the ontology itself
to identify these guts and then use them to extract useful subontologies.
You may suggest that a common ontology language would help establish such
a method. I agree that it might, but I have two points: (1) it won't
do so automatically, and therefore (2) some attention should be given
in the standard to how certain fundamental primitives of ontologies can
be designated in the ontologies themselves, as well as how they can be made
available to programs that want to use them to extract particular
subontologies.
To put it metaphorically, how can two ontologies, who happen to find
themselves on a street corner checking each other out, assess each other's
goods and whether or not they might be able to have a meaningful
encounter together? The mere fact that they speak the same language may
help, but something more is needed I think. It is this something more
that I am suggesting bears consideration.
Now, for a concrete example of the problem mentioned above.
I want to find those components of the different ontologies that bear
on, say, naive spatial relations. But even before I try to extract those
components, I'll probably need to understand something very general: how do
Cyc, SUMO, and ISO15926-2 treat the notion of "thing" or "entity"?
So, I looked in each of these ontologies for "thing" and "entity".
What follows are some results, which don't pretend to be exhaustive.
Cyc has (at http://www.cyc.com/cyc-2-1/vocab/fundamental-vocab.html#Thing):
------------------------------------------------------------------------------
#$Thing is the universal set: the collection of everything! Every Cyc constant
in the Knowledge Base is a member of this collection; in the prefix notation
of the language CycL, we express that fact as (#$isa CONST #$Thing). Thus,
too, every collection in the Knowledge Base is a subset of the collection
#$Thing; in CycL, we express that fact as (#$genls COL #$Thing).
...
isa: #$Collection
some subsets: #$Path-Generic #$Intangible #$Individual #$SimpleSegmentOfPath
#$Path-Simple #$MathematicalOrComputationalThing #$IntangibleIndividual
#$Product #$TemporalThing #$SpatialThing #$Situation #$EdgeOnObject
#$FlowPath #$ComputationalObject #$Microtheory (plus 1488 more public
subsets, 13568 unpublished subsets)
------------------------------------------------------------------------------
SUMO has:
------------------------------------------------------------------------------
(documentation Entity "The universal class of individuals. This is the root
node of the ontology.")
;; Everything is an entity (due to Robert E. Kent).
(forall (?THING) (instance ?THING Entity))
------------------------------------------------------------------------------
The ISO15926-2 draft, in clause 5.2.1.1 (with underscores added by me
fore and aft to indicate bolding in the document), has:
------------------------------------------------------------------------------
A _thing_ is anything that is or can be thought about or perceived,
including material and non-material objects, ideas, and actions. Every
_thing_ is either an _individual_, a _class_, or a _relation_.
[a NOTE is omitted here]
EXPRESS specification:
---------------------
*)
ENTITY thing
ABSTRACT SUPERTYPE OF (ONEOF (class, individual, relation));
id : STRING;
UNIQUE
UR1 : id;
END_ENTITY;
(*
------------------------------------------------------------------------------
How are people or programs to deal with such variability?
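One naive starting point is to record what each excerpt above says
explicitly and compare names. The Python sketch below does only that. The
names ROOT_TERM, LISTED_KINDS, normalize, and shared_kinds are made up for
illustration; the data is limited to the terms quoted above (SUMO's excerpt
lists no immediate kinds, so its list is empty). A name match says nothing
about whether the two ontologies mean the same thing by the name.

```python
from typing import Dict, List, Set

# Root term of each ontology, as quoted in the excerpts above.
ROOT_TERM: Dict[str, str] = {
    "Cyc": "#$Thing",
    "SUMO": "Entity",
    "ISO15926-2": "thing",
}

# Kinds that each excerpt explicitly lists under its root term.
LISTED_KINDS: Dict[str, List[str]] = {
    "Cyc": [
        "#$Path-Generic", "#$Intangible", "#$Individual",
        "#$SimpleSegmentOfPath", "#$Path-Simple",
        "#$MathematicalOrComputationalThing", "#$IntangibleIndividual",
        "#$Product", "#$TemporalThing", "#$SpatialThing", "#$Situation",
        "#$EdgeOnObject", "#$FlowPath", "#$ComputationalObject",
        "#$Microtheory",
    ],
    "SUMO": [],  # the quoted excerpt lists none
    "ISO15926-2": ["class", "individual", "relation"],
}


def normalize(name: str) -> str:
    """Drop the CycL '#$' prefix and lowercase, for purely lexical matching."""
    if name.startswith("#$"):
        name = name[2:]
    return name.lower()


def shared_kinds(a: str, b: str) -> Set[str]:
    """Names listed under the root of both ontologies, after normalization."""
    kinds_a = {normalize(k) for k in LISTED_KINDS[a]}
    kinds_b = {normalize(k) for k in LISTED_KINDS[b]}
    return kinds_a & kinds_b


print(shared_kinds("Cyc", "ISO15926-2"))
```

On this data the only lexical overlap between Cyc and ISO15926-2 is
"individual", and whether #$Individual and _individual_ denote the same
notion is exactly what such a comparison cannot settle.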
My particular problem, and I think it is representative of a problem to be
faced by many people and machines in the near future, is:
How can I get a grip on what these different ontologies have to say
about something I am interested in - in this case, "thing" or "entity"?
I'd like somehow to have a way to investigate possible ontologies for their
usefulness or for their potential to match my needs, so that I could determine
whether it is feasible to try to work with several ontologies together.
In trying to extract suitable subontologies of SUMO, Cyc, and ISO15926,
I began by looking for relevant "possible correspondences" to some
domain of interest -- that is, what were the basic ontological terms,
definitions, and assertions that each ontology used to talk about "thing"
and what are the correspondences between the ways each ontology does this?
(In a sense, perhaps, what I'm looking for is a sort of generic ontology API
that would allow me to gather the relevant ontological primitives for a
particular domain. Ideally, such an API would allow programs to automatically
assess different ontologies for compatibility, based on only a small set of
seed primitives.)
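To make the idea concrete, here is a minimal Python sketch of one operation
such an API might offer: given a term, return every statement that mentions
it explicitly, with no inference. It assumes the ontology is available as
KIF-style parenthesized text, like the SUMO excerpt above. The function
names (tokenize, parse, mentions, explicit_statements) are hypothetical and
belong to no existing tool, and the "Dog"/"Animal" statement is invented
just to show that unrelated statements are filtered out.

```python
from typing import Iterator, List, Union

SExpr = Union[str, List["SExpr"]]


def tokenize(text: str) -> Iterator[str]:
    """Yield parentheses, quoted strings, and atoms; skip ';' comments."""
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == ";":                      # comment runs to end of line
            while i < n and text[i] != "\n":
                i += 1
        elif ch in "()":
            yield ch
            i += 1
        elif ch == '"':                      # quoted string is one token
            j = text.index('"', i + 1)
            yield text[i:j + 1]
            i = j + 1
        else:                                # atom: read up to a delimiter
            j = i
            while j < n and not text[j].isspace() and text[j] not in '();"':
                j += 1
            yield text[i:j]
            i = j


def parse(text: str) -> List[SExpr]:
    """Parse all top-level statements into nested lists (balanced input)."""
    stack: List[List[SExpr]] = [[]]
    for tok in tokenize(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    return stack[0]


def mentions(expr: SExpr, term: str) -> bool:
    """True if the term occurs as an atom anywhere inside the expression."""
    if isinstance(expr, str):
        return expr == term
    return any(mentions(sub, term) for sub in expr)


def explicit_statements(text: str, term: str) -> List[SExpr]:
    """Top-level statements that mention the term; no inference is done."""
    return [stmt for stmt in parse(text) if mentions(stmt, term)]


SUMO_FRAGMENT = '''
(documentation Entity "The universal class of individuals. This is the root
node of the ontology.")
;; Everything is an entity (due to Robert E. Kent).
(forall (?THING) (instance ?THING Entity))
(subclass Dog Animal)  ; hypothetical statement, not quoted from SUMO
'''

for stmt in explicit_statements(SUMO_FRAGMENT, "Entity"):
    print(stmt)
```

This handles only one syntax. The same request against Cyc or an EXPRESS
schema would need a different reader, which is the gap a specified access
method would be meant to close.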
As far as the SUO effort is concerned, I suggest that we consider whether
the identification of and access to certain ontological primitives, as well
as the means to use those primitives for extracting domain-specific
subontologies, should be part of the standard.
Just a few specific questions follow.
Should an ontology specification specify
1. whether/how an ontology makes available to users methods to map input terms
(given by a human or a machine) to terms used by the ontology?
In the above examples, I just intuited my way to look for "thing" or "entity",
and I figured I would find one or the other of them. But in general, we don't
want to rely on human intuition or luck.
2. whether/how an ontology can present in one package all the definitions,
relations, and axioms established for a particular term? I don't mean
_all_ the possible inferences that include a particular term, just the
definitions, relations, and axioms that are explicitly listed in the ontology.
I think that being able to access just what is explicitly stated is valuable
for determining whether two different ontologies could possibly interoperate.
Feel free to enlighten me here, though.
3. whether/how the semantics of the ontology can/should be made available
from, say, lists of constant symbols, function symbols, relation symbols,
along with denotational mappings and satisfaction relations?
For instance, (following the second edition of Mathematical Logic by
Ebbinghaus, Flum, and Thomas) suppose R is a binary relation symbol.
Then a formula like "for all x, Rxx" is "just a string of symbols to
which no meaning is attached." Depending on the domain for x and on the
interpretation of R as a particular relation on the domain, that string
of symbols will mean different things. Without the denotational mappings
and satisfaction relations, what good will it do to know certain axioms?
So, should the specification of an ontology say how the model theory
used by the ontology is made available to interested users?
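A small worked illustration of the point in question 3, as a Python sketch:
the same string "for all x, Rxx" comes out true under one interpretation of
R and false under another, over the same domain. The domain {1, 2, 3} and
the two relations are chosen here purely for illustration, and the function
name is made up.

```python
from typing import Iterable, Set, Tuple


def holds_forall_x_Rxx(domain: Iterable[int], R: Set[Tuple[int, int]]) -> bool:
    """Evaluate 'for all x, Rxx' in the structure (domain, R)."""
    return all((x, x) in R for x in domain)


domain = {1, 2, 3}

# Interpretation 1: R is "less than or equal to" on the domain.
leq = {(x, y) for x in domain for y in domain if x <= y}

# Interpretation 2: R is "strictly less than" on the domain.
lt = {(x, y) for x in domain for y in domain if x < y}

print(holds_forall_x_Rxx(domain, leq))  # True: every x satisfies x <= x
print(holds_forall_x_Rxx(domain, lt))   # False: no x satisfies x < x
```

An ontology that publishes only the axiom, without saying which
interpretations are intended, leaves a reader unable to tell these two
cases apart.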
------------------------------------------------------------------------------
SUMMARY
* I see some difficult practical issues of how a human or a machine
can determine whether (certain subdomains of particular) ontologies
stand a chance of being able to interoperate with (certain subdomains of)
other ontologies.
* The main difficulties seem to be: (1) how to identify the relevant
ontological primitives (e.g., terms, definitions, axioms) used by an ontology;
and (2) how to use these primitives to extract a particular domain-related
portion of the ontology.
* Being able to identify, extract, and manipulate these primitives _seems_
key to being able to assess whether two ontologies can interoperate
(in virtually any plausible sense of that word). But it may indeed not be
appropriate to suppose that one could determine whether two ontologies
can interoperate based on accessing the ontological primitives and
assertions used by the ontologies to talk about a particular domain.
To those who hold this view, can you please elaborate?
* Currently, as shown by the examples of SUMO, Cyc, and ISO15926, different
ontologies have different ways of representing these primitives, and in many
cases it is difficult to determine from the ontology itself how these
primitives are identified and can be made available. Currently, these
determinations must be done manually (I think) and in an ad hoc way.
Perhaps the SUO standard can address how to make such
determinations programmatically.
* It is possible that a single, agreed-upon ontology language will go a long
way toward identifying and making available these ontological
primitives and toward the subsequent extraction of domain-specific subontologies.
But it seems that some specific mechanism that is different from a
common language, and which belongs perhaps to the ontology's specification,
needs to be provided. Such a mechanism needs to be, well, mechanical -- capable
of being done by a machine with limited sets of seeded input.
* What that mechanism is, what the ontological primitives are, and how
this mechanism can be used in practice to identify and make available these
primitives and to allow certain subontologies to be extracted -- I suggest
that all this should perhaps be investigated in the context of our
standards work.
* If these issues have already been addressed and solutions for them
have been found, please let me know.
-----------------------------------------------------------------------------
Any and all comments appreciated.
Thanks,
Jim