
ONT RE: Ontology case study




Dear Adam,

Some responses below.


Matthew West
Principal Consultant
Shell Information Technology International Limited
Shell Centre, London SE1 7NA, United Kingdom

Tel: +44 20 7934 4490 Other Tel: +44 7796 336538
Email: matthew.r.west@is.shell.com
Internet: http://www.shell.com


> -----Original Message-----
> From: Adam Pease [mailto:apease@ks.teknowledge.com]
> Sent: 28 May 2002 19:04
> To: West, Matthew R SITI-ITPSIE; ontology@ieee.org
> Subject: RE: Ontology case study
> 
> 
> Matthew,
>    Since we're addressing generalities rather than a change proposal for a
> starter document, I'd prefer to continue this discussion on the
> ontology@ieee.org list.  Comments below:
> 
> At 05:29 PM 5/26/2002 +0200, West, Matthew R SITI-ITPSIE wrote:
> >Dear Adam,
> >
> >Your case study tramples all over my heartland, so see some
> >comments below.
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: Adam Pease [mailto:apease@ks.teknowledge.com]
> > > Sent: 23 May 2002 00:17
> > > To: SUO
> > > Subject: SUO: Ontology case study
> > >
> > >
> > >
> > > Folks,
> > >    Bill Andersen and I were speaking a moment ago and he
> > > pointed out that I
> > > hadn't related an application of ontology that we did a while
> > > ago.  My hope
> > > is that this case should point out one simple, concrete
> > > application of an
> > > ontology, in a deployed business environment.
> > >    During the dot-com boom we worked with an Internet-based
> > > real-estate
> > > company to solve a database integration task.  I was
> > > surprised to find out
> > > as we started the project that what consumers know as the
> > > Multiple Listing
> > > Service database that realtors use is actually a collection
> > > of some 30,000
> > > locally-developed databases that record information about
> > > homes for sale in
> > > a particular geographic area.  All the databases cover the same basic
> > > information, but all the table names, field names, and symbols may be
> > > different from one database to the next.
> > >    We created an ontology of real estate in first order
> > > logic, using an
> > > upper ontology.
> >
> >MW: An ontology is really overkill for this.
> 
> I think this is simply a question of personal preference and
> experience.  My understanding is that you're not familiar with logic so
> it's not surprising that it seems imposing.

MW: I am much more familiar than I was when I arrived here, so
my principal concern is utility and economy.

> I'm not familiar with the EXPRESS language so it would seem like overkill
> to me to use EXPRESS syntax for describing something in a much smaller
> and more familiar (to me) syntax of SUO-KIF.

MW: Well this is compact vs expressive. I generally prefer compact
myself, so this is not the reason. Rather economy, and using the
tools of the community you are working in. I haven't tried to
suggest that people here should use EXPRESS or some other data
modelling formalism. Neither would I suggest other communities
used KIF, unless their current tools couldn't support the business
requirements, and KIF obviously did.

> 
> >  Though I should perhaps make
> >clear some distinctions I use related to terms like ontology.
> >
> >  - Taxonomy, a dictionary of standard terms and their natural language
> >    definitions.
> >
> >  - Thesaurus, a taxonomy with subtype/supertype relations.
> >
> >  - Data Model, a structure of types and relations against which instances
> >    of the types and relations can be stored. Definitions of the types and
> >    relations are in natural language. Some constraints are defined, usually
> >    in terms of the cardinality of relationships.
> >
> >  - Ontology, types, relations, and possibly instances of those together
> >    with axioms, usually in First Order Logic, that define the rules that
> >    apply to members of the types and relations.
> >
> >Data Models are what are usually developed/used to design a database. The
> >axioms of an ontology are generally of little use/relevance either in the
> >design of the database,
> 
> Again, a question of perspective.  If one doesn't read logic, the axioms
> would certainly be of little use.  If one does read logic, and has a
> theorem prover available, the axioms could be used to test the consistency
> of the schema.

MW: Not many people - particularly business users - read logic. They
have a tough enough time with data models, and we usually present those
in a somewhat simplified form to check that we understand what they
need.

> 
> >or in the population of it by instances.
> >
> >So in general my question for this sort of case would be: Why would you
> >develop an ontology (more work) when a data model will do?
> 
> Because if I have a schema like
> 
> Table: Foo
> col 1: parent name   col 2: parent SSN   col 3: child name   col 4: child SSN
> Mary M. Smith        555-11-1234         Robert J. Smith     555-11-4444
> Robert J. Smith      555-11-4444         Mary M. Smith       555-11-1234
> 
> Table comments:
> parent: "The parent of a child specified in the child columns."
> 
> No automated process can read the English and determine that the database
> is bad.  If you have instead a formal ontology that defines
> 
> (=>
>    (parent ?X ?Y)
>    (not
>      (parent ?Y ?X)))
> 
> as part of the formal definition of the parent relation, then an automated
> process could catch the problem.  I'm sure you could find some existing
> database constraint language that could handle this particular constraint,
> but not all the constraints that are easily specified in logic, unless of
> course the database constraint language is a particular syntax for logic
> (which many language designers seem to be moving towards, see OCL in the
> UML for example).
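
To make the point about automated checking concrete, here is a minimal
Python sketch of what enforcing that single axiom against the "Foo" rows
might look like. It is only an illustration under stated assumptions: the
row layout and the function name asymmetry_violations are hypothetical,
and a real deployment would presumably hand the axioms to a theorem prover
or constraint engine rather than hard-code each rule like this.

```python
# Hypothetical rows mirroring the "Foo" table quoted above:
# (parent name, parent SSN, child name, child SSN)
ROWS = [
    ("Mary M. Smith", "555-11-1234", "Robert J. Smith", "555-11-4444"),
    ("Robert J. Smith", "555-11-4444", "Mary M. Smith", "555-11-1234"),
]


def asymmetry_violations(rows):
    """Return SSN pairs asserted as parent/child in both directions.

    Encodes only the single axiom (parent ?X ?Y) => (not (parent ?Y ?X)).
    """
    parent_of = {(p_ssn, c_ssn) for _, p_ssn, _, c_ssn in rows}
    # A pair violates the axiom when its reverse is also asserted.
    # Sorting each pair means a mutual assertion is reported only once.
    return sorted(
        {tuple(sorted(pair)) for pair in parent_of if pair[::-1] in parent_of}
    )


violations = asymmetry_violations(ROWS)
print(violations)
```

The check keys on SSN rather than name, on the assumption that SSNs
identify people uniquely while names may not.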

MW: The utility of this is determined by the frequency with which
people put this kind of bad data into the database. This kind of
error is very rare, simply because the people putting the data in
know it is wrong. Now typos are very common, but I don't see how
axioms can help with the misspelling of names.

> 
> >Of course ISO15926-2 is a data model developed specifically to solve this
> >sort of problem, with ISO18876 an architecture and methodology for it.
> >
> > > We then "compiled" the ontology by hand into
> > > a relational
> > > database.
> >
> >MW: Most data modelling tools can compile automatically to a relational
> >database, so more expense here.
> 
> True, but that's only because these modeling tools are built on technology
> that's been around a bit longer, so it's not surprising there would be
> software implemented for it.  That doesn't mean there is some technical
> obstacle.  Another issue is that existing DB languages are less expressive
> than logic.  If the data modeling tool is just an ER diagram, it maps 
> trivially into a DB schema.
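
As a rough illustration of the "maps trivially" remark, the following
Python sketch turns a toy entity description into CREATE TABLE statements
and loads them into an in-memory SQLite database. Everything here is an
assumption made for the example: the ENTITIES structure, the table and
column names, and the to_ddl helper are hypothetical and are not taken from
any of the tools discussed in this thread.

```python
import sqlite3

# Hypothetical entity descriptions: table name -> list of (column, SQL type).
ENTITIES = {
    "listing": [
        ("id", "INTEGER PRIMARY KEY"),
        ("street_address", "TEXT"),
        ("list_price", "INTEGER"),
    ],
    "agent": [
        ("id", "INTEGER PRIMARY KEY"),
        ("name", "TEXT"),
    ],
}


def to_ddl(entities):
    """Emit one CREATE TABLE statement per entity."""
    statements = []
    for table, columns in entities.items():
        cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
        statements.append(f"CREATE TABLE {table} ({cols});")
    return statements


ddl = to_ddl(ENTITIES)

# Load the generated schema into a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
for stmt in ddl:
    conn.execute(stmt)
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
print(tables)
```

Note that nothing beyond column types survives the translation, which is
the expressiveness gap being argued about here.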

MW: As one design option yes (but certainly not the only one). Yes DB
languages are less expressive than logic, but my point is that that has
no value unless/until you need that additional expressiveness, and then
you generally want it in a different context, e.g. checking data quality,
rather than doing database design.
> 
> > > We then wrote scripts, again by hand, to map the
> > > contents of
> > > each MLS database into our common database.
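
A hand-written mapping script of the kind described above might, in its
simplest form, look something like the Python sketch below. This is not
the code used on the project; the source names (mls_a, mls_b), the field
names, and the to_common helper are all hypothetical, and the sketch only
covers renaming fields, not translating symbols or values.

```python
# Hypothetical per-source mappings from local field names to a common schema.
FIELD_MAPS = {
    "mls_a": {"addr": "street_address", "px": "list_price", "br": "bedrooms"},
    "mls_b": {
        "StreetAddr": "street_address",
        "AskingPrice": "list_price",
        "Beds": "bedrooms",
    },
}


def to_common(source, record):
    """Rename a source record's fields to the common schema.

    Fields with no entry in the source's mapping are dropped.
    """
    mapping = FIELD_MAPS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


a = to_common("mls_a", {"addr": "1 Main St", "px": 250000, "br": 3})
b = to_common(
    "mls_b",
    {"StreetAddr": "1 Main St", "AskingPrice": 250000, "Beds": 3, "Agent": "x"},
)
print(a == b)
```

Each new source database then costs one more mapping table rather than a
change to the common schema, which matches the experience reported here.
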
> >
> >MW: There are sophisticated tools to support this kind of application.
> >There are two basic types, ETL (Extract Transform and Load) and EAI
> >(Enterprise Application Integration).
> >
> >ETL tools specialise in the batch migration of usually transaction data
> >between different databases with different structures. Often the data
> >is being brought into a Management Information System (MIS) data warehouse.
> >
> >EAI tools specialise in real time integration of data where systems need
> >to cooperate, usually this involves the synchronisation and restructuring
> >of reference data (organisations, products, addresses, ...).
> >
> > > The ontology was
> > > robust and
> > > comprehensive enough such that after the first couple
> > > databases, and we
> > > eventually mapped several dozen of them, there were no
> > > changes to the ontology.
> >
> >MW: When integrating (or merging) a number of systems there is a lot more
> >than just the data structure that needs to be integrated. In the case of
> >your real-estate (we call them estate agents) application, that includes
> >things like getting the names of cities, states etc in a standard form
> >and correct spelling, not to mention the companies involved.
> 
> I agree.
> 
> >There are tools that
> >will help with these data quality issues.
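
As one small example of this kind of data-quality step, the Python sketch
below normalises free-text city names against a canonical list using
approximate string matching from the standard library's difflib module.
The canonical list, the 0.8 cutoff, and the normalize_city helper are
assumptions chosen for illustration; commercial data-quality tools do
considerably more than this.

```python
import difflib

# Hypothetical canonical spellings that the common database would accept.
CANONICAL_CITIES = ["San Francisco", "Palo Alto", "Mountain View"]


def normalize_city(raw, cutoff=0.8):
    """Return the closest canonical city name, or None if nothing is close."""
    by_lower = {city.lower(): city for city in CANONICAL_CITIES}
    matches = difflib.get_close_matches(
        raw.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


print(normalize_city("San Fransisco"))
print(normalize_city("  palo alto "))
print(normalize_city("Springfield"))
```

Names that match nothing come back as None so that they can be routed to
a person for review instead of being silently corrected.
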
> >
> > >    The company's web site went live for several months, using our
> > > ontology-based database, and then they exhausted their
> > > funding and went out
> > > of business.
> > >    Of course this is just an anecdote.  I doubt it's going to change
> > > anyone's mind who already has a strong opinion about the
> > > usefulness of a
> > > single ontology for supporting integration.  But at the very
> > > least, it's a
> > > concrete instance of the productive use of a particular
> > > ontology on one
> > > commercial integration task.
> >
> >MW: Whilst I accept that you can use an ontology for this kind of thing,
> >it is generally more than you need. So it is interesting that I am here,
> >even though I wouldn't use a formal ontology for this sort of application,
> >and I don't see much future for formal ontologies here, except where, as
> >in your case, it happens to be the approach that those involved are
> >already familiar with.
> 
> And likewise the opposite in your case.  This does beg the question though
> if you believe ontologies are useless, why are you here?

MW: I do not believe ontologies are useless, only not the most
economic tool for doing database design.

MW: I'm here because I think we are going to get to applications in
the next 10 years that will really need the full flexibility of FOL,
but they aren't shouting at me yet (killer apps that is).

MW: Interfacing benefits from FOL, but here you are dealing with
existing ontologies/data models.
> 
> > > I've provided references in
> > > previous messages
> > > to other government research projects where we've used ontologies for
> > > integration as well.
> > >    One impact of an ontology that I find undeniable is that
> > > at the very
> > > least, a formal ontology can serve as a more precise set of
> > > comments about
> > > the meaning of database tables and fields than informal English.
> >
> >MW: I disagree. The disadvantage in this application of a formal definition
> >is that only those who understand the formalism can understand it.
> 
> That's true for any language including EXPRESS, EPISTLE etc.  How is this
> an argument against logic and ontology?

MW: EXPRESS is a language that is computer processable, and is processed
to provide a database. You do not automatically process the KIF in
your ontology to produce a database design; it is read by humans, so you
should use something that they understand. It is the information you
convey, rather than the information you contain, that matters.
> 
> >  I think
> >the only time that stating axioms formally is sensible (in a commercial
> >application) is when they are going to be acted on by some application
> >(usually of course some sort of inference engine).
> 
> That's a reasonable statement to make about expression of information in
> logic, but it doesn't address the at least anecdotally demonstrated
> advantage of ontology in supporting modeling tasks, as opposed to
> generating a model from scratch.

MW: Well if I had an ontology, I certainly wouldn't ignore it (as long
as I could make sense of it) but I would translate it into a data model
as my first step if what I was doing was database design.
> 
> > >    This also points out some fruitful areas of research for tool
> > > builders.  It would be nice to compile the ontology to an SQL
> > > DB schema.  I
> > > believe Bill's company is working on that.
> >
> >MW: This might be useful, but personally I think that merging data models,
> >ontologies, and database definition, would be even better. We have
> >also taken approaches that involve a fixed database structure whatever
> >the data model/ontology.
> >
> > > Another extremely
> > > valuable tool
> > > would be one that aids in doing the mapping from disparate
> > > databases to a
> > > common database.
> >
> >MW: Already done (though at the data model rather than ontology level)
> >see above. Actually, I do think there is some room here, since most of
> >the tools have grown up empirically, rather than based on FOL explicitly
> >or knowingly.
> >
> > > I'm doubtful about the prospects for doing that
> > > completely automatically, but hopefully tools could be
> > > developed that
> > > would help.
> >
> >MW: Indeed. None of the tools I mention are automatic, they manage the
> >definition and execution of the interfaces. I too very much doubt
> >if full automation is possible.
> >
> > >    I hope this is helpful.
> >
> >MW: Yes, we should be talking more about possible applications, and
> >what it takes to support them.
> >
> >My personal view is that full formal ontologies only come into their
> >own when there is inference-based work to do. I think there are two
> >potential areas for this:
> >
> >1. Data Mining, by which I mean discovering facts that are hidden in
> >the data you have, but are not easily apparent.
> >
> >2. Automation, by which I mean providing the wherewithal for applications
> >to take decisions without (or with reduced) human intervention.
> > >
> > > Adam
> > >
> > >
> > >
> > > Adam Pease
> > > Teknowledge
> > > (650) 424-0500 x571
> > >
> 
> Adam Pease
> Teknowledge
> (650) 424-0500 x571
>