More about contexts

Message-id: <199209300208.AA01752@venera.isi.edu>
Date: Tue, 29 Sep 92 22:03:11 EDT
From: sowa@watson.ibm.com
To: SRKB@ISI.EDU, INTERLINGUA@ISI.EDU, CG@CS.UMN.EDU
Subject: More about contexts

During the past few months, Pat Hayes and I have been carrying on an
open email debate about contexts, their formalization in logic, and
their desirability in KIF and related languages.  In my work on the
IRDS conceptual schema, I have been using contexts as a mechanism for
representing encapsulation in object-oriented programming languages.
Partly as a result of that work, I have become even more firmly
convinced that contexts are a valuable mechanism that should be
seriously considered for the knowledge sharing effort.

As Pat and I have agreed in past email notes, the notion of "context"
in the literature has been rather confused, with many different ideas
bundled up in a single term.  There are three distinct concepts that
must be distinguished:

 1. A packaging mechanism for enclosing a collection of formulas and
    allowing them to be named and referenced as a single unit.

 2. The contents of that package, which could be called anything from
    "quoted formula" to "microtheory".

 3. The permissible operations on the formulas in the package.  These
    operations could be defined by a set of axioms in a larger package
    that encloses the one under discussion.

Whenever I use the word "context", I mean only the packaging mechanism.
In my Conceptual Structures book, I defined a context to be a concept
box that contains a collection of nested conceptual graphs in its
referent field.  In effect, the box behaves like a Lisp QUOTE, and the
relations attached to the box determine how the nested graphs are used.

According to this view, a context is a very lightweight construct with
little more semantics than a Lisp QUOTE plus an assignment.  The power
of the construct comes from what you do inside and outside the package.
Inside a context box, you can insert any number of formulas that define
an arbitrarily rich theory; outside the box, you can state any number
of other axioms (actually, metalanguage axioms) that make assertions
about the context and how the formulas inside it can interact.  In effect,
the formulas outside the box define the rules of inference for the theory
stated by the formulas inside the box.

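As a purely illustrative sketch (all names and fields here are my own,
hypothetical choices, not part of any proposal), the packaging mechanism
can be modeled as little more than a named container of quoted formulas,
with everything interesting attached to it from outside:

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """A lightweight package of quoted formulas (hypothetical sketch).

    The formulas are held as uninterpreted strings, in keeping with the
    Lisp-QUOTE analogy; axioms and relations attached from outside the
    box determine how they are used.
    """
    name: str
    formulas: list = field(default_factory=list)   # quoted, not asserted
    nested: list = field(default_factory=list)     # enclosed Context boxes
    relations: dict = field(default_factory=dict)  # attachments from outside

# A theory T whose rules of inference are stated in another box named S5:
t = Context("T", formulas=["(p -> q)", "p"])
t.relations["inference-rules"] = "S5"
```

The point of the sketch is that the context itself carries almost no
semantics; the power comes from what is said about it at the metalevel.
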
Besides making statements about the formulas inside a box, you must be
able to attach relations to the box itself to make the following kinds
of assertions:

 1. "The conjunction of the formulas in this box is false."

 2. "The formulas in this box describe a possible state of affairs."

 3. "Mary believes the propositions stated by the formulas in this box."

 4. "The formulas in this box interact according to the axioms contained
    in the box named S5."

 5. "The formulas in this box were true during the time interval [t1,t2]."

 6. "Any formula placed in this box must be in Horn-clause form."

The ability to say things like this gives you an immense amount of power.
If used in an undisciplined way, that power could get you into trouble by
letting you say things that are inefficient, undecidable, or inconsistent.
But it also gives you the power to state and enforce whatever discipline
you like.  For example, you could package a theory T in a context C and
say that the rules of inference for T are in another package S4 or S5.

If we adopt such a mechanism as a basis for knowledge sharing, there
are several issues that must be addressed:

 1. Can you develop a coherent model theory for such a system of nested
    contexts?  How do you ensure that the system is consistent and the
    reasoning mechanisms sound?

 2. Since the context mechanism itself has so little semantics, the real
    crux of the issue lies in what you do with the contexts.  How can you
    keep people from getting into trouble by using contexts in weird and
    wonderful, but totally undisciplined ways?

 3. If you want to share knowledge between different systems, it is easy
    to map from a more restricted theory to a more general one.  But in
    general, it may be impossible to map backwards from a general one
    to a more restricted one.  How can you ensure that translation is
    possible?  Or at least determine when it is possible or impossible?

To answer the first question, construct a quasi-first-order model
consisting of a union of conventional models with primitive objects
and metamodels whose objects correspond to the linguistic elements
that talk about primitive objects.  Build it up recursively:

 a) For any context containing only first-order formulas, a model for
    the universe of discourse (UoD) can be constructed in the usual way
    from a set S of basic objects and relations R over them.

 b) For any context C whose formulas talk about a universe of discourse
    U, construct an extended UoD for C, which consists of the union
    of U and the language L in which the formulas in C are stated.  The
    language L is defined in the usual way as the set of all grammatical
    statements derivable by some suitable grammar.

 c) For any context C that contains nested contexts C1,...,Cn, let the
    UoD of C be the union of the extended UoD's of C1 through Cn plus
    the set of contexts {C1,...,Cn} themselves.  Then the formulas in C
    can refer to the nested contexts, the languages of those contexts,
    the primitive objects, or the way that any of these things relate
    to one another.

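Steps (a) through (c) can be sketched recursively.  In this toy version
(all names hypothetical) the infinite language L of a context is
approximated by the finitely many formulas actually stated in it:

```python
def extended_uod(context, primitives):
    """Recursive sketch of steps (a)-(c).

    (a) start from the set of primitive objects;
    (b) add the sentences of the context's language (here, just the
        formulas actually stated, standing in for the full language L);
    (c) add each nested context as an object in its own right, together
        with that context's extended universe of discourse.
    """
    uod = set(primitives)                    # (a) basic objects
    uod |= set(context["formulas"])          # (b) linguistic elements
    for c in context["nested"]:              # (c) nested contexts
        uod.add(c["name"])
        uod |= extended_uod(c, primitives)
    return uod

inner = {"name": "C1", "formulas": ["p(a)"], "nested": []}
outer = {"name": "C", "formulas": ["true-in(C1, p(a))"], "nested": [inner]}
```

Each level adds only the sentences and context objects of the levels
below it, so a countable primitive set yields a countable union.
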
Note that the language of each context has a first-order structure.
The universe of discourse of each context becomes larger and larger as
you go outward, but if the primitive UoD is countable, all the extended
UoD's and their unions remain countable.  This structure gives you the
power to state any definitions or make any metalinguistic statements
you like, but none of the quantifiers in any of the metalevels range
over uncountable sets, as in the usual higher-order logics.  That means
that at any level, you need nothing more powerful than first-order proof
procedures.

If you prefer, you can restrict your languages to a more efficient subset,
such as Horn clauses or whatever.  That is effectively what people do when
they write a Prolog interpreter in Prolog, for example.  In that case,
Prolog becomes the metalanguage that talks about the nested language,
which happens to have exactly the same syntax.  Prolog programmers have
found this to be a very useful technique, and the context mechanism is
a generalization of that approach.

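The Prolog-in-Prolog analogy can be shown in miniature: below, a
metalevel program treats Horn clauses of an object language as data and
draws their consequences (a propositional sketch, not a full
interpreter):

```python
def forward_chain(facts, rules):
    """Propositional Horn-clause consequence: repeatedly fire any rule
    (body, head) whose body atoms are all established, until no rule
    adds anything new."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in facts and set(body) <= facts:
                facts.add(head)
                changed = True
    return facts
```
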
Re consistency:  There are two distinct issues:  consistency of the
framework itself vs. consistency of whatever formulas anyone might choose
to insert in any context.  To show consistency of the framework, it is
sufficient to construct one instance (example below).  Consistency of
what people put in the contexts, however, is the responsibility of the
user.  Any practical programming language has to be powerful enough to
let people get hung up in a loop, and any practical reasoning system has
to be powerful enough to let people say things that are inconsistent.
The responsibility of the framework designer is to ensure that the
framework itself is consistent and that people who want to enforce
appropriate disciplines are given mechanisms for doing so.

Re discipline:  Building a knowledge base is something that a domain
expert should be able to do.  But designing a metalanguage with rules
of inference that support disciplined knowledge engineering techniques
is a sophisticated task that would probably be left to professional
system designers.  The context mechanism provides a clean way of
separating the two tasks.  Following are some metalinguistic frameworks
that system designers might implement for the users:

 1. File system:  The simplest use of contexts is for packaging formulas,
    naming the packages, and retrieving them upon request.  The only
    metarule that would affect the reasoning methods would be one that
    says that any formula could be moved in or out of any package at
    any time.  This rule has the effect of making the context structure
    transparent to the theorem prover; i.e. all the formulas interact as
    though they were in a single global space.  Since this simple example
    is consistent, it shows that the framework of nested contexts by
    itself does not lead to inconsistencies (although people are free
    to put inconsistent combinations of formulas in the contexts, if
    they choose to do so).

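The file-system metarule — that any formula may move freely between
packages — amounts to flattening the nesting before reasoning.  A
sketch, with a hypothetical dictionary representation of contexts:

```python
def flatten(context):
    """Collect every formula from a context and all contexts nested in
    it, so the theorem prover sees one global space (the transparency
    metarule of the file-system discipline)."""
    formulas = set(context["formulas"])
    for c in context["nested"]:
        formulas |= flatten(c)
    return formulas

kb = {"name": "top", "formulas": ["p"],
      "nested": [{"name": "inner", "formulas": ["q", "r"], "nested": []}]}
```
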
 2. Modality:  If you divide a knowledge base K into two contexts, a set
    of laws L and a set of contingent facts F, you can define necessity
    as provability from L and possibility as consistency with L.  These
    definitions satisfy the usual axioms for the modal operators.  They
    also give you a convenient way of formalizing conventional database
    systems:  the context F consists of all the data stored in the
    database, and the context L consists of the integrity constraints
    on that data.

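A propositional toy version of the laws/facts split: here the laws are
Horn rules, laws with empty bodies act as axioms, and integrity
constraints are encoded as rules that conclude a designated atom FALSE
(all of these conventions are assumptions of the sketch, not part of
the definitions above):

```python
def closure(atoms, laws):
    """Everything provable from `atoms` by the Horn rules in `laws`."""
    atoms = set(atoms)
    changed = True
    while changed:
        changed = False
        for body, head in laws:
            if head not in atoms and set(body) <= atoms:
                atoms.add(head)
                changed = True
    return atoms

def necessary(p, laws):
    # necessity: provable from the laws alone
    return p in closure(set(), laws)

def possible(p, laws):
    # possibility: consistent with the laws (no proof of FALSE)
    return "FALSE" not in closure({p}, laws)

LAWS = [([], "tautology"),            # a law with an empty body: an axiom
        (["flying_pig"], "FALSE")]    # an integrity constraint
```
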
 3. Limiting expressive power:  The various systems that limit expressive
    power for the purpose of improving efficiency can be organized in a
    hierarchy.  Let A and B be names of contexts that define the rules of
    inference for two different systems.  Then you can define A < B to
    mean that every rule of inference in A is a rule or derived rule in B
    and that every statement form that is supported by A is also supported
    by B.  For example, B could be classical FOL, and A could be some
    terminological language or Horn-clause language.  Then when two
    knowledge bases share knowledge, they would have to specify which
    rules they were operating under.  You could always export from a
    system that operates under the rules A to a system that operates
    under B.  You could also allow limited export from B to A, provided
    that any message m sent from B to A was first checked to see whether
    m used only the syntactic forms permissible by the rules of A.

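The export check from B down to A is purely syntactic.  For the case
where A is a Horn-clause system, a clause — represented here as a set of
literals, with '~' marking negation, a convention assumed for the
sketch — is admissible iff it has at most one positive literal:

```python
def is_horn(clause):
    """True iff the clause has at most one positive literal."""
    positives = [lit for lit in clause if not lit.startswith("~")]
    return len(positives) <= 1

def export_to_A(message):
    """Pass a message from system B to the more restricted system A
    only if every clause satisfies A's syntactic test; else refuse."""
    if all(is_horn(c) for c in message):
        return message
    raise ValueError("message uses forms not permitted by the rules of A")
```
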
 4. Hybrid systems:  Many systems have collections of knowledge of
    different kinds.  A deductive database system, for example, might
    have three different collections of knowledge:

    a) A very large conventional database where no fact requires anything
       more complex than ground-level atoms.  Let the rules in context GL
       define the subset of logic that is equivalent to the database.

    b) A set of integrity constraints expressed in full FOL.  Note:  It
       is possible to check any FOL constraint against a set of
       ground-level atoms in at most polynomial time.  Such checks can be
       stated in SQL and executed by a standard query processor.

    c) A deductive component that uses Prolog-like Horn clauses
       (context HC) to draw inferences from the database in GL form.

    Since the expressive powers of these three versions of logic are
    ordered GL < HC < FOL, anything expressible in GL could be exported
    to a knowledge base using HC for inference or to a knowledge base
    using FOL for integrity checks.  After a deduction is performed by HC,
    any conclusion that uses only the ground-level GL form could be sent
    back to the database, even though the more general HC or FOL forms
    could not be stated in GL.  Checking whether a conclusion can be
    exported in a more restricted form is a purely syntactic test that
    can be done in linear time.

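The linear-time test for sending HC conclusions back to the GL database
can be sketched as a single scan for ground atoms.  The Prolog-style
convention that variable names begin with an uppercase letter is an
assumption of the sketch:

```python
def is_ground_atom(formula):
    """A formula may return to the GL database only if it is a single
    atom with no variables and no connectives (a syntactic test that
    is linear in the length of the formula)."""
    if "->" in formula or "&" in formula:
        return False
    tokens = formula.replace("(", " ").replace(")", " ").replace(",", " ").split()
    return all(not t[0].isupper() for t in tokens)

def export_to_gl(conclusions):
    """One linear scan: keep just the conclusions in GL form."""
    return [f for f in conclusions if is_ground_atom(f)]
```
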
For knowledge sharing, any systems that communicate would have to agree
on a standard context that contains the metalanguage that defines the
subset of logic that they are using for communication.  That subset could
be identical to the subset that they use for reasoning, or it might be
a more restricted subset.

In summary, the context mechanism provides an enormous amount of
power for a small investment.  Furthermore, it also provides a way of
enforcing any kind of discipline needed to restrict that power.  And
most importantly, it can simulate many different kinds of protocols
within a common framework:  conventional database systems, deductive
databases, terminological systems, object-oriented systems, and many
others.

John Sowa