SUO: RE: OpenCyc Motion Open for Discussions
Hi y'all,
I'm glad that the opencyc motion came out and that it has been seconded.
At this stage, I think I will vote for starting work on the document.
However, I would like to qualify this: in my opinion, the document
released is borderline in its capacity as a starter document. It is a
1000-page text file which is on the one hand very hard to read and on
the other hand cluttered with extraneous things. Nonetheless, I
understand John Deolivera's rationale and agree that cleaning it up
could be a task for a working group here.
With this in mind, I would like to suggest a few things. They may help if
we go forward, and I hope they may help people evaluate the task at hand
and form their own opinion.
Cyc uses microtheories, and one of the reasons this file is so hard to
read is that the microtheoretic partitions are absent from the file. I
wonder how far, if this is going to be worked out as a standard, we will
have to endorse and reuse the microtheoretical apparatus. Actually, I
think the first task should be to break this file down into
microtheories, then to abstract all the material potentially useful for
a SUO from these microtheories.
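To make the first task concrete, here is a minimal sketch in Python of what "breaking the file down into microtheories" could look like. It assumes that a microtheory label can somehow be recovered for each assertion, which the released file does not provide directly. The microtheory names and formulas below are hypothetical placeholders, not taken from the OpenCyc release.

```python
from collections import defaultdict


def partition_by_microtheory(assertions):
    """Group (microtheory, formula) pairs into a dict keyed by microtheory.

    `assertions` is any iterable of 2-tuples. How the microtheory label
    is recovered for each assertion is an open question and is assumed
    to be solved upstream of this function.
    """
    partitions = defaultdict(list)
    for mt, formula in assertions:
        partitions[mt].append(formula)
    return dict(partitions)


# Hypothetical sample data, for illustration only.
sample = [
    ("ExampleUpperMt", "(isa ExampleThing ExampleCollection)"),
    ("ExampleLexicalMt", "(exampleDenotation ExampleWord ExampleThing)"),
    ("ExampleUpperMt", "(genls ExampleCollection ExampleThing)"),
]

partitions = partition_by_microtheory(sample)
for mt, formulas in partitions.items():
    print(mt, len(formulas))
```

Once the assertions are grouped this way, each partition can be reviewed on its own and either kept as SUO material or set aside.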
I note that OpenCyc is not an upper-level ontology, although it contains
one. So, next question: do we really want to abstract the UO stuff only,
or do we want more? Of course, this raises the question of the
boundaries of a UO. In my opinion, all the stuff like 'user agreement',
geographical data, chemical, biological, computer ontologies should be
sorted out and considered as Standard _Domain_ Ontologies. I think this
material is a very good standard ontology (a very good starting point at
the very least) but it does not all pertain to an Upper-Level one.
Moreover, there is an overwhelming amount of implementation-specific
material in this file. I include here all the meta-knowledge predicates
(I emphasize this does not mean all meta predicates, of course),
the canonicalizer, 'EL' stuff, et al.
In addition, I would like to see this file stripped of the Natural
Language processing stuff (in particular the lexical Mts, plus a bunch
of other stuff such as 'RelativeClauseMt' and things like that).
Also, I think the WordNet mappings should be removed. I don't mean that
these mappings are uninteresting, on the contrary. But I think that we
should have a separate mapping repository articulating OpenCyc, SUMO,
WordNet, and DOLCE for instance (if anybody wonders, we are cooking our
little ontological soup in Leipzig too).
Suggested roadmap for working with this document:
1) Isolate structural vocabulary.
1') Legislate on the use of Mts (I'd support their use).
2) Remove Metatag and implementation material (CoreConstant,
*Canonicalizer*, and so on).
3) Define scope of UO. Isolate material.
4) Define domain ontologies. Isolate material.
5) Lay out the structure of modules needed (potentially one of OpenCyc's
most valuable features from the structural standpoint).
6) Isolate or Remove:
-NL stuff
-External mappings
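Steps 2 and 6 of the roadmap could be started mechanically before any human review. The following is a rough sketch in Python, assuming that constants can be triaged by name alone, which will certainly miss cases. The patterns for 'CoreConstant', '*Canonicalizer*' and 'RelativeClauseMt' come from the names mentioned in this message; the '*Lexical*Mt' and '*WordNet*' patterns, the bucket names, and the sample constant names are my own assumptions.

```python
from fnmatch import fnmatchcase

# Ordered triage rules: the first bucket whose pattern matches wins.
# The patterns are illustrative guesses, not an official naming scheme.
RULES = [
    ("implementation", ["CoreConstant", "*Canonicalizer*"]),
    ("natural-language", ["RelativeClauseMt", "*Lexical*Mt"]),
    ("external-mapping", ["*WordNet*"]),
]


def triage(name):
    """Return the bucket for a constant name, or 'needs-review'."""
    for bucket, patterns in RULES:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return bucket
    # Anything unmatched still needs a human decision (UO vs. domain).
    return "needs-review"


# Hypothetical constant names, for illustration only.
for name in ["CoreConstant", "ExampleLexicalMt", "ExampleCollection"]:
    print(name, "->", triage(name))
```

Whatever lands in 'needs-review' is the material that steps 3 and 4 would then have to sort into UO and domain ontologies by hand.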
Those who know me at Cycorp also know that I am not sympathetic to
mixing implementation and NL stuff with the material
of the ontology. I understand why this is convenient in Cyc and I
understand the trade-off in that context. I want this group (SUO) to
understand that in building an upper-level ontology, these things have
to be clearly and neatly separated. I think this is doable. I will be
happy to contribute to this task. I estimate the workload for cleaning
up this file at between 1 and 2 man-weeks.
To finish, the language should not be an issue at this stage. All
preliminary work done on the OpenCyc ontology would probably be easier
if we just did not worry about it; CycL would be fine here. When we reach a
level at which we are happy with the putative standard, we could choose
any language to specify the ontology.
Don't know if I should apologize for such cursory comments.
Best regards,
Pierre
http://ifomis.de