Schema-to-schema mapping; STEP

fritz@rodin.wustl.edu (Fritz Lehmann)
Date: Fri, 29 Jul 94 13:39:41 CDT
From: fritz@rodin.wustl.edu (Fritz Lehmann)
Message-id: <9407291839.AA17758@rodin.wustl.edu>
To: cg@cs.umn.edu, chuck@ontek.com, pdoudna@aol.com, srkb@cs.umbc.edu
Subject: Schema-to-schema mapping; STEP
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk

     This is a copy of another pro-knowledge-level
exhortation I sent, this time to the database
newsgroups, in connection with the PDES/STEP
standard for describing manufactured products.
---------------------------------------------
comp.databases.theory #2621
From: fritz@rodin.wustl.edu (Fritz Lehmann)
Newsgroups: de.comp.databases,comp.databases.theory,comp.databases
[2] Re: Schema transformation (LONG)
Date: Fri Jul 29 12:51:16 CDT 1994
Organization: Center for Optimization and Semantic Control, Washington University
Lines: 334
Keywords: Schema mapping, STEP/EXPRESS

     Rafael Ortiz (ortiz@cellar.rz.uni-ulm.de) wrote regarding a
schema-mapping from a proprietary Daimler-Benz format to STEP:
-------begin quote-------
[. . .]
One of the project steps will be to implement a pre- and postprocessor that
does the schema-schema mapping between the proprietary data format and the
STEP-Standard. A formal schema-to-schema mapping language has to be designed
for it.

We are looking for people who are or were
working in the area of schema-to-schema
mapping methodologies in order to exchange experience.

Does anybody have experience with the implementation of pre- and postprocessors
eventually based on a formal schema-to-schema mapping file ? How was the
identification problem solved ?
[. . .]
----------end quote------

     You have raised a very big and very interesting
subject.  There are machine-assistants for schema integration
designed by Shamkant Navathe and his colleagues.

     My own view is that the kind of integration that you
propose may be a practical necessity for you now, but
that such approaches are basically short-sighted and
obsolescent.  The mapping effort which you make now will
apply only to your current project, and the effort will have
to be repeated each time you face another integration
task.  So long as specifications and standards leave out
the _meaning_ of the contents of data fields, they will
never be integrated automatically with different systems.
It is possible to use knowledge representation languages
to describe the objects, relationships and events described
in databases and in languages such as EXPRESS.  I know of two
efforts to give such computer-usable semantics to EXPRESS:
Alex Bejan of IBM and Michel Wermelinger, Lisbon, Portugal,
are translating EXPRESS to and from the "Conceptual Graphs"
language developed by John Sowa.  Also, Jeffrey van Baalen
is similarly working on a translator between EXPRESS and the
KIF language of the US ARPA Knowledge Sharing Effort.  These
languages will then enable a _conceptual_ definition of the
meaning of a STEP field, not simply dependent on an informal
natural language description (in the standards documentation)
unintelligible to a computer.

     The "meaning" to which I refer is a formal logical
description in terms of basic "ontological primitives"
at least partly shared by both systems to be integrated.

     Your example of a person with an address was:
SCHEMA a_version_1;

ENTITY person;
    place: address;
END_ENTITY;

ENTITY address;
    street: STRING;
    city: STRING;
END_ENTITY;

END_SCHEMA;



SCHEMA a_version_2;

ENTITY person;
    street: STRING;
    city: STRING;
END_ENTITY;

END_SCHEMA;

     These are in reality mere form descriptions.  Ask yourself:
How do I know that the person's "street" and "city" in version2
should be taken from "address" in version1?  It is because of
the real conceptual meaning, not the data formats.
The true analysis is somewhat painstaking and long.  A PERSON has
a CUSTOMARY LIVING PLACE to receive MAIL, called the ADDRESS.
The ADDRESS is a DESCRIPTION of this PLACE sufficient for
MAIL DELIVERY.  People in CITIES live at HOUSES located on
STREETS.  The STREET and the CITY names are sufficient to
determine the ADDRESS, so together they identify where a
PERSON LIVES.  Version1 and version2 both need that information.
Version2 assumes that the fact that "street" and "city" make
up an address is obvious, and the field "address" is
omitted.  Version1 explicitly includes it and shows the
dependencies.  Neither version1 nor version2 contains any
other functional relation between a PERSON and a STREET
or CITY, so this is assumed to be ADDRESS information.
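
     As a rough illustration of the argument above, here is a
minimal Python sketch in which the field correspondence is
derived from shared concept labels rather than from field names
or record layout.  The concept labels (ADDRESS.STREET_NAME,
ADDRESS.CITY_NAME), the helper functions and the sample values
are all hypothetical, invented for this example; they are not
part of STEP, EXPRESS, Conceptual Graphs or KIF, and a real
background ontology would be far richer than a flat label.

```python
# Hypothetical annotations: each field path in a schema is tagged with a
# concept from a shared background ontology. The labels are invented for
# illustration only.
VERSION_1 = {
    ("person", "place", "street"): "ADDRESS.STREET_NAME",
    ("person", "place", "city"): "ADDRESS.CITY_NAME",
}
VERSION_2 = {
    ("person", "street"): "ADDRESS.STREET_NAME",
    ("person", "city"): "ADDRESS.CITY_NAME",
}


def derive_mapping(source, target):
    """Pair source and target field paths that denote the same concept."""
    by_concept = {concept: path for path, concept in target.items()}
    return {
        path: by_concept[concept]
        for path, concept in source.items()
        if concept in by_concept
    }


def get_path(record, path):
    """Follow a tuple of keys down into a nested dict."""
    for key in path:
        record = record[key]
    return record


def set_path(record, path, value):
    """Store a value at a nested key path, creating dicts as needed."""
    for key in path[:-1]:
        record = record.setdefault(key, {})
    record[path[-1]] = value


def convert(record, mapping):
    """Copy each mapped value from a source record into a new target record."""
    result = {}
    for src, dst in mapping.items():
        set_path(result, dst, get_path(record, src))
    return result


# Sample data (made up) in the version1 layout.
v1_record = {"person": {"place": {"street": "Main Street", "city": "Ulm"}}}

v1_to_v2 = derive_mapping(VERSION_1, VERSION_2)
v2_record = convert(v1_record, v1_to_v2)
print(v2_record)  # {'person': {'street': 'Main Street', 'city': 'Ulm'}}

# The same annotations yield the reverse mapping without extra work.
v2_to_v1 = derive_mapping(VERSION_2, VERSION_1)
```

     The point of the sketch is only that, once both schemas are
tied to the same concepts, the mapping in either direction falls
out of the annotations and need not be written by hand for each
pair of schemas.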

     All this makes sense only given a "background ontology"
of real-world knowledge about people, houses, etc.  In
your engineering field, the background knowledge would include
basic physical and spatial information, as
well as the kind of information in the STEP field
descriptions in the standards, but formalized conceptually.

     I just wrote a polemic on this subject in connection
with EDI (Electronic Data Interchange), which is also a mere
"form based" standard.  It gives further information on
this knowledge-based approach, so I attach it below.

     I realize that you need a quicker answer to your
problem than is possible with a knowledge-based
approach, but your message provides a good example
to illustrate the need for it.  If anyone else who
reads this has knowledge or interest in this
(especially in connection with PDES/STEP), please
let me know.

     Also, I would like to hear from you
if others have answers for your schema-mapping question.

                          Yours truly,   Fritz Lehmann
GRANDAI Software, 4282 Sandburg Way, Irvine, CA 92715, U.S.A.
Tel:(714)-733-0566  Fax:(714)-733-0506  fritz@rodin.wustl.edu
=============================================================
POLEMIC ATTACHMENT FOLLOWS:

[POLEMIC ON EDI OMITTED, SINCE I SENT IT OUT ALREADY]
END