Large-Scale Repositories
of Highly Expressive Reusable Knowledge

A Research Project
conducted by the

Richard Fikes
Co-Principal Investigator
Co-Scientific Director, Knowledge Systems Laboratory
Phone: (415) 725-3860
Fax: (415) 725-5850
E-mail: fikes@ksl.stanford.edu

Daphne Koller
Co-Principal Investigator
Robotics Laboratory
Phone: (415) 723-6598
Fax: (415) 725-1449
E-mail: koller@cs.stanford.edu

Abstract

We are developing technology that will support collaborative construction and effective use of distributed large-scale repositories of highly expressive reusable ontologies[1]. Our primary objectives are to develop a distributed server architecture for ontology construction and use, representation formalisms that remove key barriers to expressing essential knowledge in and about ontologies, ontology construction tools, and tools for obtaining, from large-scale ontology repositories, domain models for use in applications. We are building on the results of the DARPA Knowledge Sharing Effort, specifically by using the Knowledge Interchange Format (KIF) as a core representation language and the Ontolingua system as a core ontology development environment.

To enable distributed ontology repositories and services, we will develop a distributed server architecture for ontology construction and use. The architecture is based on ontology servers that provide access, via a network API, to the contents of ontologies and to information derivable from those contents by a general-purpose reasoner. Ontology servers will be analogous to database servers and will provide services including configuration management, support for distributed ontologies with components resident on remote servers, and automatic caching of derived results.
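The caching of derived results described above can be illustrated with a minimal in-process sketch. Everything in it is an assumption made for illustration: the class name `OntologyServer`, its methods, the version counter standing in for configuration management, and the choice of transitive superclass closure as the "derived" information. It is not the project's network API, and a real server would use a general-purpose reasoner rather than this single hard-coded inference.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OntologyServer:
    """Toy stand-in for an ontology server (hypothetical, not the project's API)."""

    # Asserted content: class name -> set of direct superclasses.
    direct_superclasses: dict[str, set[str]] = field(default_factory=dict)
    # Bumped on every change; a crude stand-in for configuration management.
    version: int = 0
    # Derived results: class name -> (version when computed, all superclasses).
    _cache: dict[str, tuple[int, frozenset[str]]] = field(default_factory=dict)
    # Counts how often the derivation actually ran, so caching is observable.
    derivations: int = 0

    def assert_subclass(self, sub: str, sup: str) -> None:
        """Record that `sub` is a direct subclass of `sup`."""
        self.direct_superclasses.setdefault(sub, set()).add(sup)
        self.direct_superclasses.setdefault(sup, set())
        self.version += 1  # any change makes older cached results stale

    def superclasses(self, cls: str) -> frozenset[str]:
        """Return all direct and indirect superclasses of `cls`."""
        cached = self._cache.get(cls)
        if cached is not None and cached[0] == self.version:
            return cached[1]  # cache hit: contents unchanged since derivation

        # Cache miss: derive the transitive closure by graph traversal.
        self.derivations += 1
        seen: set[str] = set()
        stack = list(self.direct_superclasses.get(cls, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.direct_superclasses.get(current, ()))
        result = frozenset(seen)
        self._cache[cls] = (self.version, result)
        return result


server = OntologyServer()
server.assert_subclass("Truck", "Vehicle")
server.assert_subclass("Vehicle", "PhysicalObject")
truck_supers = server.superclasses("Truck")  # derived, then cached
truck_supers_again = server.superclasses("Truck")  # served from the cache
```

Invalidating the whole cache on any change is the simplest correct policy; a server holding large ontologies would presumably need finer-grained dependency tracking, which this sketch does not attempt.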

There are significant gaps in the expressive power of current knowledge representation languages. These gaps prevent ontologies from including knowledge about domains that is essential for many high-priority applications, and knowledge about ontologies themselves that is essential for effective ontology use and reuse. We are closing some of the more important of those gaps by developing new representation formalisms, integrating existing formalisms, and incorporating the results into the tools and servers developed in the project. One of our main efforts concerns the representation of uncertain knowledge within an ontology. This work aims to integrate the two most prominent paradigms in knowledge representation: Bayesian networks and first-order logic. The representation language resulting from this work will enable ontologies to contain richly textured descriptions that include uncertainty, are structured into multiple views and abstractions, and are expressed in a generic representation formalism optimized for reuse. In addition, a computer-interpretable ontology description language will enable annotation of ontologies with assumptions made, approximations made, topics covered, example uses, competency, relationships to other ontologies, etc.

We are addressing key difficulties in building large-scale ontologies by developing ontology construction tools for specifying the overall structure of an ontology during the early stages of development, supporting teams of collaborating developers, testing and debugging ontologies, merging ontologies, and automatically acquiring probabilistic domain models from data.

We are also developing retrieval, extraction, composition, and translation tools that will enable users to obtain, from large-scale ontology repositories, domain models that satisfy a set of application-specific requirements regarding content, level of abstraction, view, underlying assumptions, representation language, usability by problem-solving methods, etc. In particular, we will support the extraction from a probabilistic ontology of Bayesian networks whose scope and level of abstraction are tailored to the current situation.
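One simple way to tailor the scope of a Bayesian network to a situation is sketched below. It is not the extraction method proposed in this project; it is a standard pruning step, swapped in for illustration, which relies on the fact that nodes that are not ancestors of any query or evidence variable cannot affect the posterior over the query. The function name `relevant_subnetwork`, the parent-list encoding of the network, and the example variables are all assumptions.

```python
def relevant_subnetwork(parents, query, evidence):
    """Keep only query/evidence nodes and their ancestors.

    `parents` maps each node name to a list of its parent node names.
    Returns a parent map restricted to the retained nodes.
    """
    keep = set()
    stack = list(set(query) | set(evidence))
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            # Walk upward: a retained node's parents are also relevant.
            stack.extend(parents.get(node, ()))
    # Every parent of a retained node is itself retained, so the
    # restricted map is still a well-formed network structure.
    return {node: list(parents.get(node, ())) for node in keep}


# Hypothetical network structure, for illustration only.
network = {
    "Rain": [],
    "Sprinkler": ["Rain"],
    "WetGrass": ["Rain", "Sprinkler"],
    "SlipperyRoad": ["Rain"],
    "Accident": ["SlipperyRoad"],
}

# Querying WetGrass with no evidence drops SlipperyRoad and Accident.
scoped = relevant_subnetwork(network, query=["WetGrass"], evidence=[])
```

This sketch addresses scope only. Choosing a level of abstraction, as the paragraph above also calls for, would require additional information about how variables at different granularities relate, which is not modeled here.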

The technology developed in this project is intended to be integrated into a complete knowledge base development environment by a team headed by Science Applications International Corporation (SAIC). In addition to participating in the SAIC integration team, KSL will integrate the technology developed in this project into a mutually supporting, interoperable suite of prototype tools and servers. We will make that suite available for use by the HPKB community via network APIs and HTML-based user interfaces. In addition, we continue to maintain the DARPA Knowledge Sharing Effort's on-line ontology library and will use that library as a testbed for demonstration and evaluation of the prototypes.

[1] We consider an ontology to be a domain theory that specifies a domain-specific vocabulary of classes, properties, predicates, functions, and entities, and a set of relationships that necessarily hold among those vocabulary items.
