SUO: RE: An article on the pitfalls of metadata
Dear Rich,
Well, the article makes me realise that I am involved
in one of these initiatives in EPISTLE where we have
a data model and some 50,000 items of reference data.
So some comments from the coal face.
Our context is the handover of design information
between engineering contractors and owner operators
for large capital systems like oil rigs, typically
worth $2 billion or more.
I will comment on each of the seven reasons for
failure:
1. People Lie.
We are in a contractual situation, and also in a relatively
small world where establishing trust is essential to
do business at all. There is therefore not much long
term benefit in lying.
2. People are lazy
We are banking on this. The objective is that using
the reference data will be the easiest (and cheapest)
way to get the job done.
3. People are stupid.
Unfortunately "You can't stop dumb people doing dumb
things" and "We're all dumb at least some of the time".
So you need to check and recheck what you have been
given. It is just a necessary part of the process. However,
at least this can now be largely automated, and it's
surprising how much the quality goes up when people
know what they are doing is going to be checked.
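As a rough illustration of the kind of automated check I mean, here is a
minimal sketch in Python. The class names, property names and record
layout are made up for the example; they are assumptions, not taken from
the EPISTLE reference data or data model.

```python
# Hypothetical reference data: class name -> property names defined for it.
# These entries are invented for illustration only.
REFERENCE_CLASSES = {
    "centrifugal pump": {"rated flow", "rated head"},
    "washing machine": {"drum capacity"},
}


def check_record(record):
    """Return a list of problems found in one handed-over record."""
    problems = []
    cls = record.get("class")
    if cls not in REFERENCE_CLASSES:
        # Nothing else can be checked without a known class.
        problems.append(f"unknown class: {cls!r}")
        return problems
    allowed = REFERENCE_CLASSES[cls]
    for prop in record.get("properties", {}):
        if prop not in allowed:
            problems.append(f"property {prop!r} not defined for {cls!r}")
    return problems


def check_handover(records):
    """Map each failing record's tag to its list of problems."""
    report = {}
    for record in records:
        problems = check_record(record)
        if problems:
            report[record.get("tag", "<no tag>")] = problems
    return report
```

Running something like check_handover over every delivery, and sending the
report back to whoever supplied it, is the sort of routine recheck I have
in mind.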
4. Mission Impossible - know thyself
Yes, people are unreliable, but our experience has been
that even using the reference data badly is better than
not using it at all.
5. Schemas aren't neutral
Certainly any single hierarchy is not neutral, but you
don't have to have just a single hierarchy. Our data
model is not neutral either. It is explicitly a 4D
paradigm, and you have to see the world through those
glasses. However, it is neutral between, say, washing
machines and pumps. It also allows the addition of
new things that you did not think of in the first place.
6. Metrics Influence Results
I'm not sure that this is really relevant in our context.
7. There's more than one way to describe something
The world of engineering is relatively objective. We do
not have too much problem with people describing lengths
in metres or yards, because we know how to relate one to the
other. On the other hand, at a higher level this is an
issue, so we have made a choice for a 4D paradigm.
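The easy case, metres versus yards, can be sketched in a few lines of
Python. The table and function names are hypothetical, chosen for this
example only; the one hard fact is that the international yard is defined
as exactly 0.9144 metres.

```python
# Length of each unit expressed in metres.
# The international yard is exactly 0.9144 m by definition.
METRES_PER_UNIT = {
    "metre": 1.0,
    "yard": 0.9144,
}


def convert_length(value, from_unit, to_unit):
    """Convert a length between any two units listed in METRES_PER_UNIT."""
    return value * METRES_PER_UNIT[from_unit] / METRES_PER_UNIT[to_unit]
```

Two descriptions of the same length can be related mechanically like this.
The higher level differences cannot, which is why a choice of paradigm is
needed there.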
Well, it looks like most of these issues are real, but
not necessarily insurmountable.
Matthew West
Principal Consultant
Shell Information Technology International Limited
Shell Centre, London SE1 7NA, United Kingdom
Tel: +44 20 7934 4490 Other Tel: +44 7796 336538
Email: matthew.west@shell.com
Internet: http://www.shell.com
> -----Original Message-----
> From: Richard Cooper [mailto:rich@valutech.com]
> Sent: 22 August 2003 17:50
> To: sowa@bestweb.net
> Cc: SUO; cg@cs.uah.edu
> Subject: SUO: An article on the pitfalls of metadata
>
>
>
> There was an article on the web that was discussed
> in the Semantic Web list a while back. It might
> be useful to stimulate some more discussion on the
> ontology design issues we've been grappling with.
>
> It's titled: "Metacrap: Putting the torch to seven
> straw-men of the meta-utopia", and it's posted at:
>
> http://www.well.com/~doctorow/metacrap.htm
>
> The basic idea is that ontology designers can't
> foresee the human self interest that must inevitably
> foil their categorization of any subject area. At
> the end, he agrees that metadata is a useful thing,
> but stresses that it won't come from letting people
> fill in ontology forms because they will fudge the
> data into their own self interest.
>
> It seems to me this discussion could enlighten some
> of these CG and SUO issues. Even a Physics ontology,
> Math, Biology, Algorithms, every kind of ontology
> has an unrealistic bias according to the author.
>
> Anybody have any thoughts on this?
>
> Rich
>
>