EDI with real semantics

Date: Thu, 28 Jul 94 01:24:03 CDT
From: fritz@rodin.wustl.edu (Fritz Lehmann)
Message-id: <9407280624.AA00765@rodin.wustl.edu>
To: edi-new@tegsun.harvard.edu
Subject: EDI with real semantics
Cc: cg@cs.umn.edu, chuck@ontek.com, srkb@cs.umbc.edu
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk

     The "edi-new" list is a good development.  I feel that it
could be even a bit more up-to-date on the possibilities of "new
EDI", though, based on the archived messages I've seen.

     The general Electronic Data Interchange (EDI) problem is that
the current EDI participants, for practical reasons, have stayed
within a pretty narrow idea of EDI limited essentially to printed
standards for data formats, and programs which (fairly mundanely)
implement them.  EDI transactions, segments and fields are now
just electronic versions of printed forms containing data.  What
is happening in the outside world, though, is a transcending of
mere data bases and data transactions, towards "knowledge bases"
and "intelligent agent transactions".

     There is a movement going on worldwide for what is called
"knowledge sharing" -- including the ARPA Knowledge Sharing Effort
in the USA and similar efforts in Europe.  Unlike current EDI,
this aims to transmit the _meaning_ of data along with the data.
This requires at least some (machine-negotiated) sharing of an
"ontology" or a concept-system for describing things and events in
the real world.  Concepts like "person", "order", "time",
"destination" and "payment" have formal conceptual definitions.

     A well-known attempt at this in the Artificial Intelligence
community is the CYC project led by Lenat and Guha at MCC in
Austin, Tex.  This is one of several different concept-systems; no
one concept-system serves all purposes, but they must have some
common ground in order for disparate systems to communicate.  A
similar system is part of the proposed ANSI IRDS Conceptual
Schema, in which databases describe their subjects, and
themselves, in "metadata" that are logically and even
philosophically based on structures of basic conceptual
primitives.  Another is the related enterprise-integration system
of Ontek, Inc. in Laguna Hills, Cal.  The designated languages for
these systems are Conceptual Graphs, the Knowledge Interchange
Format (KIF), and CYC's Epistemological Language, all of which are
machine-usable enhancements of symbolic logic -- "position-
independent" -- and are nearly inter-translatable.  When these and
other knowledge-based systems are better harmonized (soon, I
believe), information on any subject can flow from one system into
another and be "understood" by the receiving machine to the extent
that common ontological ground can be established automatically.
Since these are logic-based systems using semantic primitives,
they have completely extensible semantics and they won't become
obsolete due to an outdated specialization level.

     Establishing the ontological "common ground" automatically
(between, say, a purchasing agent and a selling agent) by machine-
negotiation is a natural further development beyond the proposed
"ICSDEF message" (Interchange Structure Definition) currently
being urged by Ken Steel at U. of Melbourne as the key to "open
EDI".  Before any binding transactions take place, two programs
define for each other their respective capabilities and needs.  In
the case of mismatches, they try to agree on a maximal common
concept-system and on the right set of message segments and
fields.  The actual "standard" used for subsequent transactions is
defined in this initial negotiation.  This is related to the
"KQML" language for interagent communication, also part of the
ARPA Knowledge Sharing Effort.  There's no need to render current
standards obsolete -- in fact the "generic" ontologies and
existing international EDI standards (if properly defined
conceptually) can serve as a _starting_point_ for the two programs
to tailor the standard needed for their specific relationship.
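
     As a minimal sketch of that negotiation step, assume each
concept definition can be reduced to a set of primitive features
(a strong simplification; definitions in KIF or Conceptual Graphs
are far richer).  Two programs could then find their common ground
by comparing definitions.  Every concept name and feature below is
hypothetical.

```python
# Hypothetical sketch: every concept name and definition here is invented
# for illustration and comes from no real EDI standard or ontology.

def negotiate(ours, theirs):
    """Return (shared, mismatched) concept names for two agents.

    Each argument maps a concept name to a frozenset of primitive
    features standing in for its formal definition.
    """
    shared, mismatched = set(), set()
    # Only concepts that both sides name can be compared at all.
    for name in ours.keys() & theirs.keys():
        if ours[name] == theirs[name]:
            shared.add(name)        # same name, same definition
        else:
            mismatched.add(name)    # same name, different meaning
    return shared, mismatched


buyer = {
    "order": frozenset({"event", "request", "goods"}),
    "payment": frozenset({"event", "transfer", "money"}),
    "destination": frozenset({"place"}),
}
seller = {
    "order": frozenset({"event", "request", "goods"}),
    "payment": frozenset({"event", "transfer", "money", "currency"}),
    "invoice": frozenset({"document", "claim", "money"}),
}

shared, mismatched = negotiate(buyer, seller)
print(sorted(shared))      # ['order']
print(sorted(mismatched))  # ['payment']
```

     Here "order" can be used at once, while "payment" is a
mismatch the two programs would have to resolve by exchanging
further definitions before any binding transaction.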

     In EDI now, human beings in two different organizations must
negotiate to agree upon the specific protocol for Purchase Order
applications, etc., and programmers then create translators to and
from the local databases.  The two communicating computer systems
do not "know" what is in the segments and fields -- they can only
check for syntactic correctness at best.  The only meaning is in
English language remarks and definitions in standards documents,
and in the fact that human beings are checking the printed
versions of the transactions.  (Ken Steel has reported that 70% of
EDI documents are actually paper mailed or FAXed human-to-human!)
In future EDI, each system will have a conceptual definition of
what it _means_ to be, say, a "confirmation", and it will be able
to detect automatically whether a confirmation in a particular
transaction makes any sense.
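
     To illustrate, here is a rough sketch of such a check.  It
assumes, purely for the example, that a "confirmation" makes sense
only if it refers to an order actually placed and does not confirm
more units than were ordered.  The class and function names are
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    quantity: int


@dataclass(frozen=True)
class Confirmation:
    order_id: str
    quantity: int


def confirmation_problems(conf, open_orders):
    """List reasons a confirmation fails to make sense (empty if none).

    The two rules checked here are assumptions made for this sketch,
    not rules taken from any EDI standard.
    """
    order = open_orders.get(conf.order_id)
    if order is None:
        return ["confirms an order that was never placed"]
    problems = []
    if conf.quantity > order.quantity:
        problems.append("confirms more units than were ordered")
    return problems


open_orders = {"PO-1": Order("PO-1", 10)}
print(confirmation_problems(Confirmation("PO-1", 10), open_orders))  # []
print(confirmation_problems(Confirmation("PO-9", 5), open_orders))
```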

     At present the EDI standards have a notion of the price paid
for a product, but no notion of what that product is, nor its
proper specification parameters.  There is just a human-readable
description, an arbitrary designated part-number, or something
similar.  These too are becoming obsolescent.  The PDES/STEP
standards for manufacturers are now beginning to emerge
(complicated standards for product descriptions and
specifications; they now cover machined shapes, CAD
specifications, assemblies, electronic components and some other
areas).  These will eventually be extended to almost all areas of
commerce.  These are also being linked into the "knowledge level"
and defined conceptually, so that, coupled to intelligent EDI, a
computer will "understand" automatically that a Purchase Order to
ship four Mack Trucks in one Post Office Mailsack is ill-advised.
Similarly, U.S. Government procurement will have built-in machine
safeguards against certain unsound, nonsensical or illegal
transactions.
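
     The Mack Truck case can be sketched as a simple physical
constraint, assuming the product and container definitions both
carry a volume.  The figures below are rough guesses for
illustration only and are not drawn from PDES/STEP or any real
product data.

```python
# Illustrative figures only: rough guesses, not real product data.
ITEM_VOLUME_M3 = {"mack_truck": 60.0, "letter": 0.0002}
CONTAINER_CAPACITY_M3 = {"mailsack": 0.1, "flatbed_railcar": 150.0}


def shipment_fits(item, count, container):
    """True if `count` items fit in one container, judged by volume alone."""
    needed = ITEM_VOLUME_M3[item] * count
    return needed <= CONTAINER_CAPACITY_M3[container]


print(shipment_fits("mack_truck", 4, "mailsack"))  # False
print(shipment_fits("letter", 200, "mailsack"))    # True
```

     A real system would reason over shape, weight and handling
rules as well; volume alone is merely the simplest constraint that
catches this particular order.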

     Nitin Borwankar of Sybase (nitin@sybase.com) called for the
first step: the use of metadata for describing the contents of EDI
fields:
     QUOTE:
"The *meaning* of each of these transactions/segments/fields etc.
is defined *out-of-band* in the X.12 standards documents which are needed
to decode an encoded EDI message.
I propose that the meaning of each transaction/segment/field etc.
be included in the new-EDI message, in a structured way, which is human
readable.  Thus the meaning of the interchange is included *in-band*
in the message itself."
     END QUOTE

     I agree, but the key thing is to ensure that this metadata
itself is _not_ limited to a human-readable set of English
descriptions dependent solely on human understanding.  All of the
transactions/segments/fields should in addition have _conceptual_
definitions in a knowledge representation language as described
above.
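
     A sketch of what this might look like: each field definition
travels in-band with both a human-readable text and a link to a
concept the receiving program can test.  The message layout, the
"is_a" key and the primitive names are all invented for this
example; they are not X.12, KIF or Conceptual Graph syntax.

```python
# Hypothetical message layout; the field names, the "is_a" vocabulary
# and the primitives are invented and are not X.12 or KIF syntax.
message = {
    "definitions": {
        "unit_price": {
            "text": "Price charged for one unit of the product",
            "is_a": "money_amount",
        },
        "ship_to": {
            "text": "Place where the goods are to be delivered",
            "is_a": "delivery_point",
        },
    },
    "data": {"unit_price": "12.50", "ship_to": "Dock 4"},
}

# Concepts this particular receiver already has definitions for.
KNOWN_PRIMITIVES = {"money_amount", "time_point"}


def ungrounded_fields(msg, primitives):
    """Return data fields whose concept the receiver cannot interpret."""
    missing = []
    for field in msg["data"]:
        definition = msg["definitions"].get(field)
        if definition is None or definition["is_a"] not in primitives:
            missing.append(field)
    return missing


print(ungrounded_fields(message, KNOWN_PRIMITIVES))  # ['ship_to']
```

     The English text serves the human reader; the "is_a" link is
what lets the receiving program notice, without human help, that
it has no definition for "delivery_point".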

     Any specialized EDI convention for an industry segment
(common application) will have a combination of general concepts
like "money" and "delivery point" etc., as well as concepts
specific to that industry.  Often the terms of art in the industry
are well-agreed-upon (by consensus or by standards) and the
specific conceptual definitions will be mainly uncontroversial.

     The "generic" or universal ontologies may allow novel
intersector transactions (i.e. between different industries).
Concepts of automobile distribution may not be known to a currency
trading system, and vice-versa, but if the two need to communicate
it should be possible for them to exchange sequences of long and
painstaking definitions (based on generic ontology) until a valid
transaction between them becomes feasible.

     Among other reasons for having a formal "ontology" is so that
other kinds of systems within an enterprise can communicate
intelligently with the EDI program.  This includes Management
Information, CAD data, accounting systems, manufacturing, etc.
The same "widget" ordered by the Army could appear first in the
EDI RFQ transaction, then in the Purchase Order, then in the CAD
design system, then in a Numerical Control program for machining,
then in a CIM shop-floor program, then in Inventory, then in
Accounting, then in a Shipping router, then in a final EDI
confirmation, then in the Army's logistics system, etc.  The same
entity, the widget, has a different (and incompatible)
representation in each program now, so intercommunication is
nearly impossible.  These programs do have different purposes and
need different information, but they can be integrated at the
knowledge level, and EDI should fit into this integration scheme.
This is a general example of current "enterprise integration"
theory.

     Similarly, the Medical EDI establishment can tie their work
into the extensive conceptual standardization taking place at the
National Library of Medicine.  If a particular general medical
syndrome is authorized for reimbursement, the taxonomies and
semantic network in the UMLS Metathesaurus could be exploited to
help determine if more specific disease reports in a transaction
fall within the reimbursable general category.  There are similar
concept systems for treatments, procedures, and medication.  (I
don't know the details of current medical EDI.)
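
     The subsumption test can be sketched as a walk up an "is-a"
hierarchy.  The tiny taxonomy below is hand-made for illustration
and is not taken from the UMLS Metathesaurus, whose actual
structure is much larger and is not a simple tree.

```python
# Hand-made toy taxonomy (child -> parent); not UMLS data.
PARENT = {
    "viral pneumonia": "pneumonia",
    "bacterial pneumonia": "pneumonia",
    "pneumonia": "respiratory infection",
    "respiratory infection": "infection",
    "wrist fracture": "fracture",
}


def falls_under(specific, general):
    """True if `specific` equals `general` or descends from it."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT.get(node)  # None once the top of a chain is reached
    return False


reimbursable = "respiratory infection"
print(falls_under("viral pneumonia", reimbursable))  # True
print(falls_under("wrist fracture", reimbursable))   # False
```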

     Even the area of legal regulations is being studied and
formalized at the conceptual level for computers, although this
work is still at the early research stage.  If and when such
systems become practical, logical formulations of the original
regulatory language can be used as a formal constraint on EDI
transactions.  A certain RFQ could trigger: "Sorry, this violates
the regulation on minority participation in Regulation CFR
XXX:xxx;x(x)iiiv because ..."  The fact that EDI transactions have
the effect of changing ownership and generating legal obligations
means that much of the semantics will ultimately depend on
business law.

     New tokens or "tags" can also be defined "on the fly"
conceptually to adjust to specialized business needs, as Nick
Szabo has suggested.  A tag then will really mean exactly what
you say it means.  A problem with this is that a user would need
to understand the intended use at a deep level and be able to
express the conceptual definition correctly -- a skill not likely
to be very universal.

     These things are already happening elsewhere -- links are
already forming between different systems at the knowledge level.
I would be very interested to see what the "new-EDI" thinkers (and
the EDI Establishment) have to say about this kind of advanced
programme for EDI.  I myself don't think it's really a question of
"whether" it will be done for EDI -- just "when".

     The "edi-new" mailing list seems well suited to be a forum to
discuss what in my view will really turn out to be "EKI" where the
K is for Knowledge.  (The difference from other Knowledge Based
Systems is that, in EDI, real ownership and legal obligations
actually change.)  How long this takes to develop at the "thinkers
and innovators" level, and how long after that it takes to spread
to the practical world of everyday EDI transactions, are hard
things to predict.

                          Yours truly,   Fritz Lehmann

GRANDAI Software, 4282 Sandburg Way, Irvine, CA 92715, U.S.A.
Tel:(714)-733-0566  Fax:(714)-733-0506  fritz@rodin.wustl.edu
=============================================================