RE: ONT RE: Ontology case study
Matthew,
At 06:18 PM 6/6/2002 +0200, West, Matthew R SITI-ITPSIE wrote:
>Dear Adam,
>
>See comments below.
>
>
>Matthew West
>Principal Consultant
>Shell Information Technology International Limited
>Shell Centre, London SE1 7NA, United Kingdom
>
>Tel: +44 20 7934 4490 Other Tel: +44 7796 336538
>Email: matthew.r.west@is.shell.com
>Internet: http://www.shell.com
>
>
> > -----Original Message-----
> > From: Adam Pease [mailto:apease@ks.teknowledge.com]
> > Sent: 06 June 2002 16:38
> > To: West, Matthew R SITI-ITPSIE; ontology@ieee.org
> > Subject: RE: ONT RE: Ontology case study
> >
> >
> > Matthew,
> >
> > At 09:44 AM 6/6/2002 +0200, West, Matthew R SITI-ITPSIE wrote:
> > >Dear Adam,
> > >
> > >See comments below.
> > >
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Adam Pease [mailto:apease@ks.teknowledge.com]
> > > > Sent: 05 June 2002 16:26
> > > > To: West, Matthew R SITI-ITPSIE; ontology@ieee.org
> > > > Subject: RE: ONT RE: Ontology case study
> > > >
> > > >
> > > > Matthew,
> > > >
> > > > At 11:11 AM 6/5/2002 +0200, West, Matthew R SITI-ITPSIE wrote:
> > > > [snip]
> > > >
> > > > > > > > What an ontology can do is provide a set of integrity
> > > > > > > > constraints that can be automatically processed to
> > > > > > > > determine in part if that merger was specified correctly.
> > > > > > >
> > > > > > >MW: Please explain how you think that might work.
> > > > > >
> > > > > > Imagine that a database technician reverses the fields that
> > > > > > should be mapped from a client database into a common DB.
>
>MW: Below you say that your case is someone mistyping some data,
>but I interpret the section above as someone using, say, some SQL to
>move some data from one database to another, and in the SQL two fields
>have become transposed. It would never occur to me to have a database
>of any size retyped.
I'm not addressing retyping data. I'm addressing data which is typed in
for the first time, such as when a clerk takes information from a patient
upon check-in at an emergency room.
> > > > > > He flips birth date and hire date (for an employee database
> > > > > > for a multinational corporation, for example). As each record
> > > > > > is pulled into the common DB it's checked against the constraint
> > > > > >
> > > > > > (=>
> > > > > > (and
> > > > > > (instance ?X Employee)
> > > > > > (birthTime ?X ?BTIME)
> > > > > > (deathTime ?X ?DTIME))
> > > > > > (greaterThan ?DTIME ?BTIME))
> > > > >
> > > > >MW: Well that is fine of course (not wrong). However, in this case
> > > > >either all are going to be right, or wrong. So my strategy would be
> > > > >to eyeball the resultant data and do the same test, rather than code
> > > > >and check the constraint. Just more efficient in this sort of case.
> > > > >You still need to know that a birth date is before an employment date
> > > > >(but I do don't I?).
> > > > >
> > > > >MW: Don't get me wrong. I still think there is a killer app somewhere
> > > > >in database meets ontology, but I know I don't want to get shot down
> > > > >in flames suggesting something that can be done as well in some
> > > > >existing way.
> > > > > >
> > > >
> > > > I'm puzzled by this perspective. On the one hand, you've been making
> > > > an efficiency claim that one can't have too many sorts of constraint
> > > > checks on a DB because it's too inefficient. On the other hand you
> > > > offer "eyeball the resultant data" as a solution, which I would think
> > > > would be impractical if the data set is so large that automated
> > > > constraint checking is an issue.
> > >
> > >MW: You only need to eyeball a sample of the data. When you see
> > >that 2 or 3 rows have the data transposed, and you know a systematic
> > >process has been applied, you really don't need to check the rest
> > >of the data set.
> >
> > But that's not the situation my example has been addressing.
>
>MW: See my comment above.
>
> > There's no
> > systematic process, just an individual data entry error.
>
>MW: This case is not dealt with in database design, and so not
>included in data models (actually EXPRESS can do this sort of
>constraint, but this feature is not widely used).
>
> > Note that there may be thousands of such entirely random errors
> > created by tired data entry clerks.
>
>MW: Indeed, but your axiom will only pick up a small percentage
>of them.
That particular axiom, of course, which is why there is benefit in having
an ontology with both general and domain-specific rules, which together can
catch a larger percentage.
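
To make that concrete, here is a minimal sketch in Python. It is not an
ontology or a theorem prover, only an illustration of how adding a
domain-specific rule next to a general one catches more errors. The record
fields, the sample data and the rule names are all my assumptions, made up
for this example.

```python
from datetime import date

# Hypothetical employee records; field names and values are illustrative only.
records = [
    {"id": 1, "birth": date(1960, 3, 14), "hire": date(1985, 9, 1), "death": None},
    # Birth and hire dates transposed by a data entry error.
    {"id": 2, "birth": date(1990, 6, 1), "hire": date(1965, 2, 20), "death": None},
    # Death recorded before birth.
    {"id": 3, "birth": date(1950, 1, 1), "hire": date(1975, 5, 5), "death": date(1949, 12, 31)},
]

# Each rule is (name, predicate); the predicate is True for a consistent record.
rules = [
    # General rule, in the spirit of the birthTime/deathTime axiom.
    ("birth precedes death",
     lambda r: r["death"] is None or r["birth"] < r["death"]),
    # Domain-specific rule for an employee database.
    ("birth precedes hire",
     lambda r: r["birth"] < r["hire"]),
]

def violations(records, rules):
    """Return (record id, rule name) for every rule that a record fails."""
    return [(r["id"], name) for r in records for name, ok in rules if not ok(r)]

print(violations(records, rules))
```

With only the first rule, record 2 (the transposed birth and hire dates)
passes unnoticed; it is the second rule that flags it.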
> >
> > > >
> > > > Note also that having the data entry clerk check this constraint
> > > > is not a practical solution. Typos happen, humans make errors.
> > >
> > >MW: This is a different case. Adding checks on data entry is
> > >quite common. But it is not considered part of the static database
> > >design (that the data model supports). It might also be a check
> > >you run on the whole database periodically.
> >
> > Eureka! And that periodic database check would be one way to use a
> > formal ontology: doing data cleaning, checking for data entry typos
> > against constraints like an individual's birth must come before his
> > or her death.
>
>MW: Quite. But as noted above, this constraint would only pick up
>a small percentage of typos, and there are tools available to do
>this kind of check,
What tools would those be?
> and you can also set database triggers to perform
>the check on data entry.
Of course, and in the case of using an ontology, one would set a "trigger"
to check the data using an ontology and theorem prover. The alternatives
are either to implement the trigger using procedural (imperative) code, or
not to implement it at all.
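
For comparison, here is a rough sketch of the hand-coded trigger route,
using SQLite through Python's sqlite3 module. The table, column and trigger
names are assumptions for illustration, and the rule is hard-wired into the
trigger rather than derived from an ontology.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    id         INTEGER PRIMARY KEY,
    birth_date TEXT NOT NULL,  -- ISO 8601 strings compare correctly as text
    hire_date  TEXT NOT NULL
);

-- Hypothetical hand-written check: reject rows hired on or before birth.
CREATE TRIGGER employee_dates_check
BEFORE INSERT ON employee
WHEN NEW.hire_date <= NEW.birth_date
BEGIN
    SELECT RAISE(ABORT, 'hire_date must be after birth_date');
END;
""")

def try_insert(birth_date, hire_date):
    """Insert one row; return True if accepted, False if the trigger rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO employee (birth_date, hire_date) VALUES (?, ?)",
                (birth_date, hire_date),
            )
        return True
    except sqlite3.DatabaseError:
        return False

print(try_insert("1960-03-14", "1985-09-01"))  # consistent row
print(try_insert("1990-06-01", "1965-02-20"))  # transposed dates
```

Each such rule has to be written and maintained by hand per table, which is
the cost I am contrasting with stating the constraint once in an ontology.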
Adam
> >
> > Adam
> >
> > > >
> > > > Adam
> > > >
> > > >
> > > >
> > > > Adam Pease
> > > > Teknowledge
> > > > (650) 424-0500 x571
> > > >
> >
> >
Adam Pease
Teknowledge
(650) 424-0500 x571