Message-id: <2846367990-6147902@KSL-Mac-69>
Date: Tue, 13 Mar 90  17:46:30 PST
From: Tom Gruber <Gruber@sumex-aim.stanford.edu>
To: shared-kr@sumex-aim.stanford.edu
Subject: notes from Knowledge Representation Standards Workshop

Below are notes taken from the discussions of the Shared KBs Working
Group of the First Knowledge Representation Standards Workshop, held
March 5-7, 1990, at Santa Barbara.  The working group was co-chaired by
Mark Fox, Tom Gruber, and Marty Tenenbaum.  The working group's results
on the interlingua, which are very relevant to the KIF effort, are not
discussed in this message.

The Generic, Reusable Knowledge Bases working group was divided into
three subgroups, according to three "models of KB sharing":
  1.  TASK-SPECIFIC ONTOLOGIES (e.g., Penman, diagnostic "shells")
  2.  "GENERIC" ONTOLOGIES (e.g., packaged AI representations
of time, causality, resource constraints, space, etc.)
  3.  SHARED PHYSICAL MODELS (typically of physical devices or
processes, used by several "agents" as a common substrate or framework
or index for their task-specific knowledge)

Each subgroup was charged with answering 13 questions.  Their answers
are summarized below and sometimes merged.


1.  WHAT'S BEING SHARED?

DOMAIN MODELS - descriptions of the objects, processes, relationships,
etc.  of some aspects of the world.  Often centered around physical
systems like engineered artifacts or biological systems.  Intended to be
task-independent.

TASK MODELS - descriptions of the inputs and outputs of performance
tasks, assumptions of problem-solving methods, etc.  Usually at an
abstract level of description.

SOFTWARE MODELS - descriptions of shared inference mechanisms or
"knowledge services" available
  * information like that in data dictionaries (about how software
interacts with terms in the domain and task models)
  * declarative descriptions of the algorithms and procedures, if
possible
  * descriptions of performance characteristics of the services or
inference mechanisms

MODELS OF THE SHARED ONTOLOGIES
  * assumptions for appropriate use of terms, beyond what can be said in
fine-grained constraints
  * dependencies on other ontologies (e.g., ontology A needs Fred's time
ontology)
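
To make the last two items concrete, here is a minimal sketch (in
Python, purely for illustration) of a descriptor recording an
ontology's usage assumptions and its dependencies on other ontologies.
All names below (OntologyDescriptor, "freds-time-ontology", etc.) are
hypothetical, not anything proposed at the workshop.

    from dataclasses import dataclass, field

    @dataclass
    class OntologyDescriptor:
        """Hypothetical record describing a shared ontology: assumptions
        for appropriate use of its terms, and the other ontologies it
        depends on."""
        name: str
        usage_assumptions: list = field(default_factory=list)
        depends_on: list = field(default_factory=list)

    # "ontology A needs Fred's time ontology"
    ontology_a = OntologyDescriptor(
        name="A",
        usage_assumptions=["terms describe steady-state behavior only"],
        depends_on=["freds-time-ontology"],
    )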


2. WHO DESIGNS THE ONTOLOGY AND WHY?

  * for task-specific ontologies, tightly-knit collaborations
  * for shared physical models, parallel groups, then reconciliation,
initially bottom-up, then work toward the middle
  * for generic ontologies, parallel groups post into a clearinghouse, where
contributions can be publicly reviewed.  Initial contributions are labelled
"red", then given "green" status by community acceptance.  Conflicts are
adjudicated by a standards committee, with appeals to another authority.

For all types of sharing,
  * high value in getting there first (early)
  * utility in designing the ontology in the context of a larger ontology
    - for cumulative work in representations
    - for comparison with other, similar ontologies


3. WHO'S EXTENDING AND MAINTAINING THE ONTOLOGY AND HOW?

  * for task-specific ontologies, domain experts and knowledge engineers.
Therefore need strongly-enforced constraints and ontology-specific
knowledge acquisition tools.
  * for generic ontologies, need good design tools.
  * for shared physical models, knowledge engineers build core model
and others (usually also KEs) build task-specific "services" based
on the core model.  The core model needs to be updated throughout
lifecycle of the modeled system, probably by knowledge engineers.
Knowledge acquisition tools might help domain experts do some of
the maintenance.


4. WHAT LEVEL OF SHARING/COMMITMENT?

  * task-specific shares both ontology and (task-specific) inference
mechanism.  The role of the ontology is to parameterize and organize
the domain knowledge needed by the inference mechanism.
  * generic ontologies are just ontologies.  They should not need to deliver
an inference mechanism, though one may want to for certain classes of
inference, such as spatial intersection or propagating time intervals.
  * shared physical models share at the ontology level primarily; each
"agent" is a consumer of the shared model, but doesn't directly add
to it.  Agents may offer "knowledge services" which are tied to
inference mechanisms, but don't require sharing of the inference
mechanisms themselves, just the query and reply language (a sketch
follows this list).
  * all models of sharing require complete agreement at the syntax
level (the interlingua) and agreement on the vocabulary of the core
model or that used by the inference mechanisms.
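
The query-and-reply point can be sketched as follows (Python used only
for illustration; the interface and all names are hypothetical).  A
consumer of a shared physical model sees only the query and reply
language, never the inference mechanism behind the service.

    from abc import ABC, abstractmethod

    class KnowledgeService(ABC):
        """An agent-provided service over the shared physical model."""

        @abstractmethod
        def ask(self, query: str) -> str:
            """Take a query in the shared interlingua, return a reply in
            the same language."""

    class DiagnosisService(KnowledgeService):
        def ask(self, query: str) -> str:
            # Internally this may use any inference mechanism at all;
            # that choice is hidden from the other agents.
            return "(no-answer)"  # placeholder reply

    # A consumer needs only the shared vocabulary, not the implementation:
    reply = DiagnosisService().ask("(connected-to pump-1 valve-3)")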


5. TOOLS AND TECHNIQUES

  * editors and browsers, copy & edit, query tools
  * constraint checkers (a sketch follows this list)
  * term classifiers
  * object instantiation mechanisms
  * version control and accountability
  * notification mechanisms
  * well-managed email
  * network patch/update distribution system
  * design knowledge capture (e.g., hypertext with knowledge-based indexing)
  * epistemology-level to heuristic-level compilers
  * multimedia domain data with tools such as
     - automatic thesaurus generation from a domain text corpus
     - Information Lens filters
  * visualization tools (better than browsers)
  * clearinghouse and library with citation index
  * test suites for continuous validation
  * NL explanation tools
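
As an illustration of the constraint-checker item above, here is a
minimal sketch in Python; the function and data are hypothetical, and a
real checker would also test type restrictions, cardinalities, and
other fine-grained constraints.

    def check_constraints(assertions, vocabulary):
        """Flag assertions that use terms not declared in the shared
        ontology's vocabulary."""
        violations = []
        for subject, relation, value in assertions:
            for term in (subject, relation, value):
                if term not in vocabulary:
                    violations.append((term, (subject, relation, value)))
        return violations

    # "pomp-1" is not in the vocabulary, so it is reported.
    vocab = {"pump-1", "valve-3", "connected-to"}
    print(check_constraints([("pomp-1", "connected-to", "valve-3")], vocab))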


6. RESEARCH ISSUES

  * representation of multiple models, microtheories
    - varying level of abstraction and granularity
    - how and why to switch among them
  * declarative models of inference mechanisms
  * modeling of assumptions, applicability conditions
  * evaluation of quality and sharability
  * visualization of knowledge (beyond graphers)
  * knowledge communication and knowledge-based collaboration
  * formalization of test suites (what questions to ask of a shared
ontology)
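
The last item, formalizing test suites, might start from something as
simple as the sketch below: queries to ask of a shared ontology paired
with the answers it is expected to sanction.  The time-ontology queries
and names are invented for illustration.

    # Each entry pairs a query (in the shared interlingua) with the
    # answer the ontology should sanction.
    TIME_ONTOLOGY_TESTS = [
        ("(before interval-1 interval-2)", True),
        ("(before interval-2 interval-1)", False),
    ]

    def run_test_suite(ask, tests):
        """Return the test cases for which the ontology's ask function
        gives an unexpected answer."""
        return [(q, want) for q, want in tests if ask(q) != want]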


7. SPECIFIC EXAMPLE ONTOLOGIES TO WORK ON

  * not constrained by the sharing problem per se
  * try to integrate existing domain modeling work
  * PRODUCT LIFECYCLE MODELING seems good for all three types of
sharing
    - for generic, requires basics of time/activity/process, 
space/physical structure, etc.
    - for task-specific, supports lots of automation tasks such as
requirements analysis, design checking, scheduling, diagnosis, repair,...
    - for shared physical models, it is paradigmatic: all the engineering
activities are based on a model of the engineered artifact and process


8. WHAT DOMAINS?

  * mainly independent of the sharing issues, but should provide
    - lots of modeling problems (interesting but solvable)
    - high value for sharing knowledge (for people and programs)
    - lots of existing modeling work already done in the domain
  * narrower than Cyc, more domain depth
  * something in engineering, problems of institutional management of
resources
  * Human Exploration Project at NASA, lunar habitat, is a good candidate


9. WHAT ARE THE ASSUMPTIONS AND APPLICABILITY CONDITIONS FOR SHARING
AND USING THE KBS?

  * this information should be part of the shared KB
  * task-specific ontologies require a well-understood task and method
  * generic ontologies require consensus among theoreticians and practitioners
that the representation is a good idea (no armchair proposals)
  * shared physical models assume that the shared part -- the physical
structure or process described in the core model -- can be described in
a fairly task-independent fashion


10. EVALUATION CRITERIA

  * Process: knowledge marketplace with color "warning label"
scheme, review committees using test suites and user feedback
  * this is a research area: how to evaluate representational adequacy?
  * could use productivity measures
  * interoperability measures
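
A minimal sketch of the "warning label" process, assuming a
hypothetical clearinghouse record (the field names are invented for
illustration):

    # Contributions start "red" and are promoted to "green" once review
    # committees, test suites, and user feedback support acceptance.
    entry = {
        "ontology": "freds-time-ontology",
        "status": "red",
        "test_suites_passed": [],
        "user_feedback": [],
    }

    def promote(entry):
        """Mark an entry as accepted by the community."""
        entry["status"] = "green"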


11. REQUIREMENTS ON OTHER WORKING GROUPS 
  * the interlingua must be sufficient to capture fancy aspects of the
generic representations of time, etc.  This may mean defaults and
other extensions.  Not much specific advice on this as of yet.
  * need a functional interface to the interlingua in order to share
inference mechanisms.
  * may need a query language, possibly a trivial extension of the
interlingua, to support multiple programs sharing a physical model on
a network (groups 1 and 2); a sketch follows this list.
  * need concurrency and version management from group 2.
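
One way such a "trivial extension of the interlingua" might look,
purely as a sketch (the operator names and functions are hypothetical):
queries and updates are just interlingua sentences wrapped in an
operator, so the query language adds almost nothing beyond the
interlingua itself.

    def ask(sentence):
        """Wrap an interlingua sentence as a query to a remote agent."""
        return "(ask " + sentence + ")"

    def tell(sentence):
        """Wrap an interlingua sentence as an update to a remote agent."""
        return "(tell " + sentence + ")"

    message = ask("(connected-to pump-1 valve-3)")
    # => "(ask (connected-to pump-1 valve-3))", sent over the network to
    #    the agent maintaining the shared physical model.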


12. HOW TO ORGANIZE THE COMMUNITY TO PROCEED?

  * Organize groups already working on ontologies and domain
modeling - encourage leverage off currently funded projects
  * establish clearinghouse for candidate ontologies, inference
mechanisms, and tools
     - let a dozen flowers bloom (today)
     - seed development of crucial ontologies
  * develop test suites (see notes on evaluation)
  * develop a "meta ontology" for describing, comparing ontologies
  * fund effort to "merge" and "reconcile" candidate ontologies
  * look at learning from and integrating what has been done in other
domains, such as PIDES and EDIF (a negative example)


13.  WHAT'S THE PAYOFF AND WHERE'S THE LEVERAGE
OF SHARING KBS?

  * integrate ongoing modeling work in common domain area
  * improved productivity of KBS development
  * multiple applications / one KB
  * allows public sharing of technology that contains proprietary
knowledge
  * enables knowledge-compilation
  * support lifecycle product description
     - for KBS developers, domain engineers, maintainers
     - for concurrent, collaborative activities (esp. in engineering)
  * ontologies as index for databases (e.g., federally-supported databanks)
  * ontologies as integration language for KBS software
  * ontologies as substrate for institutional memory and
KB-supported communication
  * "reward structure" for contributing:
    - advantage for being early to market (positive returns theory)
    - citation credit in a library (a la Bob Kahn's plan)
    - academic motivation: sharing ontology as publication
    - value in "buy versus make" decisions, both for commercial KBS
development and academic collaboration