Re Ref.s for combining ontologies/thesauri/KRep

Fritz Lehmann <flehmann@orion.oac.uci.edu>
Date: Fri, 13 Jan 1995 02:54:27 -0800
From: Fritz Lehmann <flehmann@orion.oac.uci.edu>
Message-id: <199501131054.AA06291@orion.oac.uci.edu>
To: cg@cs.umn.edu, srkb@cs.umbc.edu
Subject: Re Ref.s for combining ontologies/thesauri/KRep
Cc: 73173.1330@compuserve.com, 73173.1333@compuserve.com, chein@lirmm.fr,
        mugnier@lirmm.fr
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk

Dear Marie-Laure,

     I'm particularly interested in the problem
you mentioned, of combining the type lattices of
multiple "ontologies" which all apply to the same
group of events.  The whole issue of combining
ontologies has been addressed only in pieces, as far
as I know.  It is in essence the same problem that
is sometimes called "Semantic Database Integration".
I had a paper on this called "The EGG/YOLK
Reliability Hierarchy" in the Proceedings of
CIKM-94 (Adam & Yesha, Ed.s, ACM Press, New York, 1994).
At least two other articles in that collection
dealt with this issue, by Sheth et al. and by Johannesen
(the latter used Conceptual Graphs and Wille's
Concept lattices).

     Another important work on this is the article
of Kevin Knight & Steve Luk in AAAI-94 - integrating multiple
"word-ontologies" _automatically_, like Wordnet, Ontos, Penman,
Longmans, etc. to build the Pangloss Ontology Base.
I think this is an important project. It's called
"Building a Large-Scale Knowledge Base for Machine Translation".
There are people who are reconstructing thesauri like
Roget's on more logical lines (basing structure more on
the conceptual level than on linguistic peccadilloes).

     Also there has been some work on integrating different
library classification systems.  I know one by M. Dienes,
"Structural Differences in Classification Systems ...",
in "Universal Classification II", J. M. Perrault & I.
Dahlberg, Eds, Indeks-Verlag, Frankfurt, 1983.

     A major, specialized effort to combine taxonomies
and vocabularies is the UMLS Metathesaurus effort run
from the US National Library of Medicine.  They combine
numerous medical systems, including some French ones.
See "The Unified Medical Language System" by Lindberg,
Humphreys & McCray, Methods of Information in Medicine,
v. 32, p281-91, 1993.

     I would like to learn of ANY OTHER "taxonomy combining"
work in the world.  My current view is that (almost) all
ontologies, taxonomies and thesauri can and should be
combined.  This does not require genuine equivalence of
concepts in different systems, only "substantial overlap"
(as we defined formally in EGG/YOLK theory).

     I myself am experimenting right now with integrating a number
of ontologies and thesauri in a very narrow field: mailing
addresses.  Each source-taxonomy gives "tags" for types.
The tags establish cross-links between systems.  Who else
is doing this kind of thing?  (I'll post this to the srkb and
cg lists hoping for an answer.)
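
     A minimal sketch of how such tags might yield cross-links, assuming
each merged type simply lists (source-tag, local-category) pairs.  The
source tags are taken from the list in this message; the merged type names,
the "placeholder-..." category names and the function cross_links are
invented for illustration and are not actual entries from those standards
or from my files.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical merged types, each tagged with (source system, local category).
# The local category names are placeholders, not real X12/EDIFACT/STEP terms.
MERGED = {
    "postal-code": [
        ("X12", "placeholder-postal"),
        ("EDIFACT", "placeholder-postcode"),
    ],
    "country": [
        ("EDIFACT", "placeholder-country"),
        ("X12", "placeholder-nation"),
        ("STEP.94", "placeholder-country"),
    ],
}


def cross_links(merged):
    """Map each pair of source systems to the categories linked by a shared type."""
    links = defaultdict(list)
    for merged_type, tagged in merged.items():
        # Sorting keeps each pair of systems in one canonical order.
        for (sys_a, name_a), (sys_b, name_b) in combinations(sorted(tagged), 2):
            if sys_a != sys_b:
                links[(sys_a, sys_b)].append((merged_type, name_a, name_b))
    return dict(links)


links = cross_links(MERGED)
for pair, entries in sorted(links.items()):
    print(pair, entries)
```

Two categories are linked here only because they carry tags on the same
merged type; nothing in the sketch claims they are equivalent, which is in
keeping with requiring only "substantial overlap".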

     Here are the source-systems I've used so far:
----------------------------
Tag References:
[Roget.4] Roget's International Thesaurus, 4th Ed., R. L. Chapman, Ed.,
Crowell/Harper & Row, New York, 1979.
[AODT.93] The Alcohol and Other Drug Thesaurus, 1st Ed., Nat. Institute on
Alcohol Abuse and Alcoholism, US Dept. of HHS, 1993.
[EDIFACT] UN/EDIFACT: United Nations rules for Electronic Data Interchange for
Administration, Commerce and Transport, 1994 Draft, ITU, Switzerland, 1994.
[FIPS-PUB 10-3] U.S. FIPS Publication 10-3, 1994.
[KRes.Carroll.94 CARCNC00]  Knowledge Research (Ballard/Naurocki) encoding
of address format in Carroll's Guide to Government, 1994.
[ONTOLINGUA-BIBLIO.94] Bibliography Ontology in Ontolingua/KIF, Tom Gruber,
Stanford Knowledge Research Laboratory, 199?
[STEP.94] ISO 10303, 1994 STEP Draft, Standard for Product Data Representation
and Exchange, ISO, 1994.
[WordMenu.92] Random House Word Menu, Steven Glazier, Random House, New York,
1992.
[X12]  ANSI X12 Standard for Electronic Data Interchange, Release 003041, Feb.
1994.
--------------------------
     I'd like to add other systems too -- like CYC, Pangloss, Ontos, UMLS,
Dionysius, Dewey, COLON, Holotheme, Wordtree, CONCEPT, TOTO, etc.  In
fact, if anybody reading this has a database format or ontology/taxonomy
that is relevant to NAMES, TITLES, OCCUPATIONS, ADDRESSES, ROUTING,
GEOGRAPHIC-LOCATIONS, MAIL etc., please send me your categories!

     At the Int. Conf. on Ordinal and Symbolic Data Analysis in
Paris-INRIA, June 20-23, Rudolf Wille is presenting his and my work on
"Triadic Concept Analysis" which builds a new and complicated
mathematical structure called a "trilattice" automatically, based
on relating three sets: objects, attributes, and "modes"
where the "modes" may be different ontologies which assign
attributes to objects differently.
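
     To make the three-set idea concrete, here is a brute-force sketch for
very small data.  It assumes the incidence is a set of (object, attribute,
mode) triples and takes a triadic concept to be a triple of sets (A1, A2, A3)
whose full product lies in the incidence and which cannot be enlarged in any
component.  The sample objects, attributes and modes, and the function name
triadic_concepts, are invented for illustration; the sketch enumerates the
concepts only and does not build the trilattice ordering.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of items, as tuples, smallest first."""
    items = sorted(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def triadic_concepts(objects, attributes, modes, incidence):
    """Enumerate maximal boxes A1 x A2 x A3 contained in the incidence."""
    objects, attributes = frozenset(objects), frozenset(attributes)

    def box_ok(a1, a2, a3):
        return all((g, m, b) in incidence for g in a1 for m in a2 for b in a3)

    concepts = set()
    for a1 in map(frozenset, powerset(objects)):
        for a2 in map(frozenset, powerset(attributes)):
            # Largest mode set compatible with a1 and a2.
            a3 = frozenset(b for b in modes if box_ok(a1, a2, [b]))
            # Reject the triple if a single object or attribute can be added.
            grow_objects = any(box_ok([g], a2, a3) for g in objects - a1)
            grow_attrs = any(box_ok(a1, [m], a3) for m in attributes - a2)
            if not grow_objects and not grow_attrs:
                concepts.add((a1, a2, a3))
    return concepts


# Hypothetical data: two ontologies ("ontoA", "ontoB") acting as modes that
# assign attributes to the same two address records differently.
OBJECTS = {"addr1", "addr2"}
ATTRIBUTES = {"has-zip", "has-street"}
MODES = {"ontoA", "ontoB"}
INCIDENCE = {
    ("addr1", "has-zip", "ontoA"),
    ("addr1", "has-zip", "ontoB"),
    ("addr1", "has-street", "ontoA"),
    ("addr2", "has-zip", "ontoA"),
    ("addr2", "has-zip", "ontoB"),
}

concepts = triadic_concepts(OBJECTS, ATTRIBUTES, MODES, INCIDENCE)
for a1, a2, a3 in sorted(concepts, key=lambda c: [sorted(x) for x in c]):
    print(sorted(a1), sorted(a2), sorted(a3))
```

The enumeration is exponential in the number of objects and attributes, so
it is only meant to show what a triadic concept is on toy data.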

    On the homomorphisms issue: I'm still very interested,
but I haven't had time to deal with this lately.  I'm
convinced that when the poset of homomorphisms of finite
relational structures is understood, this will be the key
to "fret-factoring" and fast inference.  (I would think that you
and Michel would have some disapproval of the paper which Gerard
Ellis and I wrote about this in ICCS-95.)

    I hope the above references are useful and answer your
questions.
                          Yours truly,   Fritz Lehmann
GRANDAI Software, 4282 Sandburg Way, Irvine, CA 92715, U.S.A.
Tel:(714)-733-0566  Fax:(714)-733-0506  fritz@rodin.wustl.edu
=============================================================