
SUO: The story I promised






This happened about 10 years ago so my memory is a bit blurry, but I will
do the best I can to remember the story.

First, an apology.  The example I posted that led to my mentioning this
story was based on the Guarino-Welty axioms dealing with rigidity, etc.
When I finally got around to writing this, I discovered that other
principles were at work as well.  So if this doesn't match the previous
post well, please forgive me.  Here goes:

---

Ok, I used to work for the Government and we had this database that kept
track of information for an automated analysis system we developed.  The
system consisted of an array of programs (about 10) most of which hit the
database.

We had this table in this database that looked like this:

  <id, name, location>

Each row was to represent an organization *and* a facility, since it was
believed that there was a 1-1 correspondence between a particular kind of
organization and a particular kind of facility.  This was a choice made by
software engineers working on the project with the blessing of the customer
domain experts.  I (and others) argued strongly against this decision.  If
I had known then what I know now I could have made a better case, but at the
end of the day the efficiency-obsessed engineers and operationally blinded
domain experts won out.  [Note: This example will also double as an argument
against "conceptualism" and in favor of the formal ontological approach
favored by Guarino, Welty, and others, including Ontology Works.]

Placing things in database rows like this led to several consequences in
this case:

  1.  The ontology is blurred - here we should have (at least) three
  categories: organizations, locations, and facilities.  This was known to
  the experts but not incorporated into the database design.

  2.  Time wasn't modeled, so there was no way to say that an organization
  changes location - in other words location (for organizations and
  facilities) becomes an essential property - and wrong!

  3.  Organizations weren't modeled properly - what would happen to an
  organization if it changed type (role, really) and would no longer be of
  the right type to occupy this table?  What would happen to the associated
  facility?  Would it go away?

  4.  The semantics are blurred.  Does each row represent a single entity
  or two?

So one day, the customers came to us and said "We have an organization that
appears to use different facilities at different times.  Can you fix the
database to reflect this?".  We said sure, but it would take time.  Well,
the schema changes were simple to make - the table was broken into two (one
for organizations and one for facilities) and a third added to record
changing occupation of facilities over time.  This was the easy part...
The hard part was changing all of the code (SQL/C++/Lisp) that was based on
the faulty assumption(s).  That took about 2 person-years of recoding and
cost roughly $500,000.
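
For readers who want to see the shape of that schema change, here is a
minimal sketch using Python's built-in sqlite3 module.  The table names,
column names, and sample rows are all made up for illustration; the real
schema is not reproduced here, and the original system was SQL/C++/Lisp,
not Python.

```python
import sqlite3

# Hypothetical names throughout: the post only says the single table was
# split into organizations, facilities, and a third table recording which
# organization occupied which facility over time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organization (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE facility (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        location TEXT NOT NULL      -- location belongs to the facility
    );
    CREATE TABLE occupation (
        organization_id INTEGER NOT NULL REFERENCES organization(id),
        facility_id     INTEGER NOT NULL REFERENCES facility(id),
        start_date      TEXT NOT NULL,  -- ISO dates sort correctly as text
        end_date        TEXT,           -- NULL means "still occupied"
        PRIMARY KEY (organization_id, facility_id, start_date)
    );
""")

# Invented sample data: one organization that moves between two facilities.
conn.execute("INSERT INTO organization VALUES (1, 'Org A')")
conn.executemany(
    "INSERT INTO facility VALUES (?, ?, ?)",
    [(1, "Site North", "north"), (2, "Site South", "south")],
)
conn.executemany(
    "INSERT INTO occupation VALUES (?, ?, ?, ?)",
    [(1, 1, "1990-01-01", "1990-06-30"), (1, 2, "1990-07-01", None)],
)


def facilities_on(conn, organization_id, day):
    """Return the names of facilities the organization occupied on `day`."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM occupation AS o
        JOIN facility AS f ON f.id = o.facility_id
        WHERE o.organization_id = ?
          AND o.start_date <= ?
          AND (o.end_date IS NULL OR o.end_date >= ?)
        ORDER BY f.name
        """,
        (organization_id, day, day),
    )
    return [row[0] for row in rows]
```

With the one-row-per-organization-and-facility table, the question "where
was this organization on a given date?" could not even be asked.  In the
sketch, facilities_on(conn, 1, "1990-03-15") and
facilities_on(conn, 1, "1991-01-01") give different answers for the same
organization.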

I mentioned earlier that if we had available a methodology such as the one
proposed by Guarino & Welty, we could have avoided such a mistake.  How?

  1.  Model organizations properly.  This would mean that 'organization'
  becomes a rigid property - it has the modal property of being true
  whenever the entity in question exists simpliciter.  The particular kind
  of organization in question would have been made what G&W call a
  "Material Role" - basically a contingent property - subsumed by
  'organization'.  The specific kind of organization and 'organization' in
  general could not be equated in this database - this would be prevented
  by a constraint on subsumption between non-rigid and rigid properties.

  2.  Model facilities properly.  Organizations do not have locations
  (except in some legal or social sense or in some sense parasitic on
  physical assets or members thereof) but facilities do.

  3.  Organizations also have socially-derived identity conditions.
  Facilities (structures) do not - at best (if you like the idea of
  superposition and material constitution) they have a structural identity
  above the sum of their parts.  Another reason they should not have been
  folded into the same table (and thus identified).
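
The subsumption constraint mentioned in point 1 can be sketched in a few
lines of Python.  This is only an illustration under simplifying
assumptions: rigidity is reduced to a single boolean flag, and the class
and property names (Property, Taxonomy, special_organization) are
hypothetical rather than taken from G&W or from any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str
    rigid: bool  # simplification: True = rigid, False = non-rigid (a role)


class Taxonomy:
    """Records subsumption links and rejects the ones the constraint forbids."""

    def __init__(self):
        self.subsumes = set()  # pairs of (parent name, child name)

    @staticmethod
    def _check(parent, child):
        # A non-rigid property may not subsume a rigid one.
        if child.rigid and not parent.rigid:
            raise ValueError(
                f"non-rigid '{parent.name}' cannot subsume rigid '{child.name}'"
            )

    def add_subsumption(self, parent, child):
        self._check(parent, child)
        self.subsumes.add((parent.name, child.name))

    def equate(self, a, b):
        # Equating two properties means each subsumes the other, so both
        # directions must pass the check before anything is recorded.
        self._check(a, b)
        self._check(b, a)
        self.subsumes.add((a.name, b.name))
        self.subsumes.add((b.name, a.name))


organization = Property("organization", rigid=True)
special_organization = Property("special_organization", rigid=False)

taxonomy = Taxonomy()
# Allowed: the role is subsumed by the rigid property 'organization'.
taxonomy.add_subsumption(organization, special_organization)
```

Calling taxonomy.equate(organization, special_organization) raises
ValueError, which is the point: the design that treated the particular
kind of organization as if it were 'organization' itself would have been
flagged before any code was written against it.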

Well, that's about it.  As you can see there's no rocket science here - only
commonsense fixes to errors made by ordinary people who are too busy with
their jobs to worry about things that come back to bite them later.  But
this is just the issue, especially when models become large and unwieldy,
where no human can do all the needed checking of design assumptions.

Admittedly it's not the cleanest explanation I could have made but it's
about all I can manage right now.  Comments?

 .bill