Re: ER schemas and ontologies

fritz@rodin.wustl.edu (Fritz Lehmann)
Date: Sat, 24 Sep 94 21:04:16 CDT
From: fritz@rodin.wustl.edu (Fritz Lehmann)
Message-id: <9409250204.AA20685@rodin.wustl.edu>
To: szabo@netcom.com
Subject: Re: ER schemas and ontologies
Cc: srkb@cs.umbc.edu
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk
     Nick Szabo wrote on the edi-new@tegsun.harvard.edu list:
---begin quote---
From an information theoretic point of view, it often
makes much more sense for businesses to specify and evolve
their own semantics, rather than using semantics imposed from above.
That's why, for example, industry specific jargon develops.
That's why, for example, free-form microcomputer software
such as spreadsheets and word processors are so popular -- they
allow expression in a near infinity of ways completely unimagined
by their creators, and thus _unconstrained by_ their creators.
The people creating "semantic standards" necessarily lack most of
both the intuitive knowledge and the formal knowledge (laws, contractual
clauses, managerial procedures, technological details, etc., etc.,
etc.) of of how business relationships are conducted in the wide
variety industries that are intended to actually use these semantic
structures.  These "semantic standards" promise to confuse and
constrain their customers in a multitude of ridiculous ways.
---end quote----

     I think this is an important and valid warning to those
of us who aspire to ontology-based business systems.  The
cost, however, of a lack of ontological foundation is a near-
complete lack of interoperability.  Each time one system must
communicate with another, a human being must spend time
determining whether a structure in one system refers to
the same kind of thing (in the real world) as a structure in
the other system.  Usually there is no nice 1-1 mapping.  A
lot of work is normally required to break up records, rearrange
them, rename fields, drop irrelevant information,
and construct the record form for the target system.  The
really annoying thing is that doing it once is not enough.
The systems are subject to frequent nontrivial revision and
new systems are added often (by "systems" I mean new data sources
as well as new EDI partners).  The "integration" job has to
be done over and over again.  This is big money -- it's a
large part of the 70-80% of all corporate software costs
that are devoted to "maintenance".
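
     To make the integration chore concrete, here is a minimal
sketch in Python of one such hand-written mapping.  Every field
name and record layout in it is invented for illustration and
belongs to no real system or EDI standard.

```python
# Hypothetical record from "system A"; all field names are invented.
source_record = {
    "cust_nm": "Acme Sporting Goods, St. Louis",
    "qty": 100,
    "item": "BASEBALL",
    "internal_memo": "rush order",  # means nothing to the target system
}


def map_a_to_b(rec):
    """Hand-written mapping from system A's layout to system B's layout."""
    # Break up one combined field into two target fields.
    name, city = [part.strip() for part in rec["cust_nm"].rsplit(",", 1)]
    # Rename and rearrange the fields; drop what the target does not use.
    return {
        "product_code": rec["item"],
        "quantity_ordered": rec["qty"],
        "buyer_name": name,
        "buyer_city": city,
    }


target_record = map_a_to_b(source_record)
```

     Each new partner, and each revision of either layout, calls
for another function like map_a_to_b, written by someone who
first had to work out what every field means.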

     Ontology-based definition is not "imposed semantics"
in quite the way Szabo suggests, since only the lower levels
are pre-defined.  The whole point of our "negotiated-ontology"
approach is that it allows two systems to define new
concepts and relations and transactions needed for their 
interaction.  The structure of these upper, defined, levels is 
not imposed.  Also, anything which is completely agreed-
upon and recognized by both parties can be added as an undefined
conceptual primitive.  If partners are dealing in baseballs
they may just say BASEBALL is a primitive since they both
know what one is, what they weigh, how they're packed, etc.
and the order "send 100 BASEBALLs" may need no further
information about baseballs.
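
     As a rough illustration only (this is not our actual
mechanism, and every name in it is an assumption made up for
the example), the split between agreed primitives and
partner-defined concepts might be sketched in Python like this:

```python
# Concepts both partners already agree on, accepted without definition.
PRIMITIVES = {"BASEBALL", "QUANTITY", "SEND"}

# Concepts the two partners define themselves, built from known concepts.
definitions = {}


def define(name, parts):
    """Add a negotiated concept, provided every part is already known."""
    known = PRIMITIVES | set(definitions)
    unknown = [p for p in parts if p not in known]
    if unknown:
        raise ValueError(f"{name} uses unknown concepts: {unknown}")
    definitions[name] = tuple(parts)


def expand(name):
    """Reduce a concept to the set of primitives it is built from."""
    if name in PRIMITIVES:
        return {name}
    result = set()
    for part in definitions[name]:
        result |= expand(part)
    return result


# The upper levels are chosen by the partners, not imposed.
define("ORDER_LINE", ["QUANTITY", "BASEBALL"])
define("SHIP_REQUEST", ["SEND", "ORDER_LINE"])
```

     Here expand("SHIP_REQUEST") bottoms out in SEND, QUANTITY
and BASEBALL, and BASEBALL itself is never defined because
both sides already know what one is.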

     Of course those who "lack knowledge" of a particular area
should not be deciding what its semantics are.  Expertise and
wisdom are needed, including the wisdom not to pre-define that
which should be left open.

     Szabo said: "it often makes much more sense
 for businesses to specify and evolve
 their own semantics".   Yes, but specify in terms of what?
Usually what is specified is the _forms_ structure, which
is processed by the computer, and a bunch of English field
names which, for the computer, are wholly unrelated tokens.
That's why a person (at great expense) has to go in and 
figure out what everything means before any two systems
can be integrated.  And all the wisdom brought to bear
on this task is lost for next time around.  There will
always be the need for human analysis, but it should be
selective, dealing with challenging issues, rather than
expensive and tedious re-inventing of the wheel each time.
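
     One way to picture keeping that analysis for next time:
if each system records, once, which shared concept each of its
field names denotes, the field-to-field mapping can be derived
rather than rediscovered.  The Python sketch below assumes a
simple one-concept-per-field case; the field and concept names
are hypothetical.

```python
# Each system declares once which shared concept each field name denotes.
# All field names and concept names are invented for illustration.
system_a = {"qty": "QUANTITY", "item": "PRODUCT", "cust_nm": "BUYER"}
system_b = {
    "quantity_ordered": "QUANTITY",
    "product_code": "PRODUCT",
    "buyer": "BUYER",
    "ship_via": "CARRIER",
}


def derive_field_map(source, target):
    """Pair each source field with the target field for the same concept."""
    by_concept = {concept: field for field, concept in target.items()}
    return {f: by_concept[c] for f, c in source.items() if c in by_concept}


def translate(record, field_map):
    """Rename the fields of a record, dropping any that have no counterpart."""
    return {field_map[k]: v for k, v in record.items() if k in field_map}


a_to_b = derive_field_map(system_a, system_b)
order = translate({"qty": 100, "item": "BASEBALL", "cust_nm": "Acme"}, a_to_b)
```

     Cases with no 1-1 correspondence would still need a person,
which is the selective kind of analysis meant above.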

    Szabo also said: "These "semantic standards" promise to confuse and
constrain their customers in a multitude of ridiculous ways."
They would threaten to do just that if the _higher_levels_ were
imposed, just as in X12 or EDIFACT, where you are told that a segment
has this and only this information in it.  What we favor imposing
is a common low level language "below" the data element level,
with much more freedom to create new transactions, segments, etc.
(automatically) than is expected in "old-EDI".  Also, constraining
and confusing do not always go hand in hand; it is often the
unconstrained (typically, unexplained) system that is the most
confusing in the long run.

                          Yours truly,   Fritz Lehmann
GRANDAI Software, 4282 Sandburg Way, Irvine, CA 92715, U.S.A.
Tel:(714)-733-0566  Fax:(714)-733-0506  fritz@rodin.wustl.edu
=============================================================