KIF counterproposal

Matthew L. Ginsberg <ginsberg@sunburn.stanford.edu>


Date: Wed, 19 Sep 90 16:46:09 -0700
From: Matthew L. Ginsberg <ginsberg@sunburn.stanford.edu>
Message-id: <9009192346.AA07486@Sunburn.Stanford.EDU>
To: interlingua@venera.isi.edu
Subject: KIF counterproposal


Some of the recent suggestions seem to be pretty similar to what I've
been proposing, so I'm attaching to this message a LaTeX file for an
interlingua that is hopefully both simpler and less controversial than
the current one while retaining the ability to customize that people
like Bobrow are interested in.

Please let me know what you think ...

						Matt Ginsberg

\documentstyle[12pt]{article}
\setlength{\topmargin}{0pt}
\setlength{\oddsidemargin}{0pt}
\setlength{\evensidemargin}{0pt}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{8.5in}
\setlength{\headheight}{0pt}

\newcommand{\dblspace}{\baselineskip=19.3pt}
\newcommand{\nodblspace}{\baselineskip=\normalbaselineskip}

%here are some mathematical symbols
\def\emptyset{{\hbox{\rm \O}}}
\def\set#1{\{#1\}}
\newcommand{\type}[1]{\mbox{\tt #1}}
\def\R{\rlap{\rm I}\,{\rm R}}
\def\powset{{\cal P}}
\def\glb{{\mbox{\rm glb}}}\def\lub{{\mbox{\rm lub}}}

%date stuff
\def\now{\ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \space\number\year}

\newcommand{\nm}{non\-mon\-o\-ton\-ic}
\newcommand{\Nm}{Non\-mon\-o\-ton\-ic}
\newcommand{\aelogic}{auto\-epis\-temic logic}
\newcommand{\Aelogic}{Auto\-epis\-temic logic}
\newcommand{\citemrs}{\cite{Genesereth:mrs,Russell:MRSguide}}
\newcommand{\abt}{American Bitechnologies}
\newcommand{\mvl}{{\sc mvl}}\newcommand{\Mvl}{{\sc Mvl}}

\newcommand{\prolog}{{\sc prolog}}\newcommand{\Prolog}{{\sc Prolog}}
\newcommand{\atms}{{\sc atms}}\newcommand{\Atms}{{\sc Atms}}

\def\draftline{}
\def\authors{Matthew L. Ginsberg}

\def\group{\nodblspace Computer Science Department\\
Stanford University\\
Stanford, California 94305}

\def\abstitle#1#2{\begin{center}
{\Large\bf #2}\\
\vspace{1in}
{\large \authors}\\
\vspace{.5in}
\group
\end{center}
\vspace{1.5in}}

\newcommand{\atitle}[2]
{\def\tphrase{#2}\begin{titlepage}\draftline\abstitle{#1}{#2}
\renewenvironment{abstract}
{\nodblspace
\begin{center}
\large\bf Abstract
\end{center}
\medskip}{}}

\begin{document}

\newcommand{\kif}{{\sc kif}}
\newcommand{\Kif}{{\sc Kif}}

\atitle{}{A KIF Proposal}

\begin{abstract}  A {\em knowledge interchange format}, or \kif, is
a language for the automatic exchange of knowledge.  This proposal
describes a possible \kif\ that is committed to a declarative
semantics in a way that ensures that recipients of \kif\ documents
will be able to use the information they contain.  The language
currently makes no attempt to predict the result of ongoing research
efforts in the knowledge representation community, but instead
includes a general extension mechanism that can be used to
incorporate such results as they are accepted by the AI community as
a whole.

\end{abstract}
\end{titlepage}

\section{Introduction}

Imagine that the users of two large knowledge bases wish to exchange
the information their knowledge bases contain.  Assuming that these
knowledge bases have been developed using different knowledge
representation languages, they currently have no means by which to do
so other than by translating their knowledge from one format to
another.  If a third user wants to share the knowledge, another
translation process must be undertaken.  

A {\em knowledge interchange format}, or \kif, is a formal language
by which these sorts of translations can be effected.  Knowledge
engineers wishing to use \kif\ need only provide translators to and
from a single language (the interchange format) in order to have
access to the knowledge bases developed by other researchers who have
written similar \kif\ interfaces.

On the face of it, the development of a \kif\ is a laudable goal.  A
moment's reflection, however, reveals that there is a serious problem
with any interchange language, since the state of KR (knowledge
representation) research is such that virtually all existing KR
systems make commitments that are not reflected in the declarative
knowledge they contain.  A system may use certainty factors, or
probabilities, or a particular form of \nm\ reasoning, or frames, or
inheritance reasoning with exceptions, or any of a wide variety of KR
techniques for which no declarative semantics is currently available.
In most of these cases, the semantics of the system are given, at
least in part, by the procedures that manipulate the knowledge.  The
recipient of a \kif\ database will presumably be manipulating this
knowledge using his own procedures, and will therefore be unable to
share in the procedural component of the knowledge in the original
system.

There are two approaches that might be taken to this problem.  On the
one hand, we might try to design a \kif\ that anticipated all such
procedural information by including specific facilities to describe
programs and their functioning.  The other approach would be to
accept procedural information as currently outside the scope of
information that can be legitimately exchanged, and resign oneself to
the ensuing inaccuracies in exchanged information.

This document reflects a strong commitment to the second of these
approaches, because such a commitment is dictated by the needs of the
knowledge engineering community.  There are two reasons:
 \begin{enumerate}
 \item There are virtually no KR systems that can make use of
information regarding the procedures used to analyze the knowledge.
The results of translating into this sort of a language would be
essentially useless to the recipients.
 \item The construction of a declarative language capable of
describing procedural commitments is currently only a research
venture.  There is no accepted declarative description of procedural
information, let alone a description that splits the procedural from
the nonprocedural information so that existing systems can make some
use of the result.  Although research in this area is ongoing, its
inclusion into a \kif\ intended for widespread use is premature.
 \end{enumerate}

If the \kif\ cannot contain a declarative description of procedural
information, what {\em should\/} it contain?  The commitments made
in this document are to the following:
 \begin{enumerate}
 \item The basic \kif\ language should be one accepted by the entire
KR community.  This is to ensure that any recipient of a \kif\
database can do something useful with the knowledge he receives.
 \item It should be possible to extend the language as the KR
community comes to accept more things as ``standard.''  In addition,
if some subgroup comes to share specific understandings, they should
be able to include these understandings in their \kif\ databases,
although this inclusion should be transparent to any users that do
not share them.
 \end{enumerate}

The way in which we propose to do this is as follows:
 \begin{enumerate}
 \item The basic \kif\ language will be first-order predicate
calculus, with specific notational conventions to handle variables,
quantification, and so on.  We expect that all existing KR systems
can write translators into and out of this language.  This basic
language is described in Section \ref{s.basic} of this document.
 \item All \kif\ documents will include information indicating the KR
commitments made by the underlying system.  Any recipient who can
support all of the commitments made by the sender is assured that no
information has been lost in the transmission of the system;
recipients who cannot support all of the commitments may receive
approximate information in some sense.  See Section \ref{s.comments}.
 \item The additional information passed by \kif\ documents making
specific KR assumptions is clearly separated from the declarative
information that is also being passed.  This is to ensure that a
recipient not sharing these commitments retains partial access to the
knowledge being sent.  The syntax used for the additional information
is described in Section \ref{s.comments}.
 \end{enumerate}

The next section of this document begins to describe the syntax of
\kif; most of the interesting details are in Sections \ref{s.basic}
and \ref{s.comments}.

\section{Overall syntax}

A \kif\ document is a sequence of {\sc ascii} characters that can be
processed by the {\sc lisp} reader; we will concern ourselves with
the results returned by the reader instead of the character stream
itself, viewing a \kif\ document as a stream of s-expressions.  The
following s-expressions have a special meaning in \kif:
 \begin{enumerate}
 \item Any keyword (i.e., any top-level s-expression that is an atom
beginning with a \type{:}) {\em other than\/} \type{:label},
\type{:cancel} and \type{:nonsemantic} is the name of a KR commitment
made by the system.  The next s-expression should be used to convey
additional information as appropriate (examples are in Section
\ref{s.comments}).  These keyword/s-expression pairs will typically
contain system-dependent information concerning subsequent sentences,
and are ``sticky'' in that they apply to {\em all\/} subsequent
sentences.
 \item \type{:label} should be followed by an atom that labels the
last keyword encountered.  \type{:cancel} should also be followed by
an atom and indicates that the keyword information so labelled should
not be applied to the following sentences.  Once again, examples are
in Section \ref{s.comments}.
 \item \type{:nonsemantic} indicates that the most recent keyword
includes information without semantic content (see Section
\ref{s.def} for an example).  This is an indication that a system
that doesn't understand the keyword is still safe in using the
information.
 \end{enumerate}

All other s-expressions are interpreted as basic \kif\ sentences with
a declarative semantics; these are described in the next section.

\section{Basic KIF}
\label{s.basic}

The legal sentences in basic \kif\ are precisely those of first-order
logic, expressed in prefix notation.  In order to make this precise,
we need to specify the fashion in which variables are distinguished from
constants, and specify alphanumeric replacements for symbols such as
$\neg$, $\forall$ and $\exists$.

\subsection{Variables}

Any atom beginning with the character \type{?} is a variable.  \Kif\
also supports the use of {\em sequence\/} variables, which are
variables that can match arbitrary sequences of atoms instead of just
single ones; there is substantial evidence from the \prolog\
community that such variables are useful in the construction of
declarative databases.

Unlike \prolog, \kif\ does not assume either that sequence variables
appear only at the end of argument lists, or that only one sequence
variable appears in any particular expression; this makes the problem
of unifying two \kif\ expressions more difficult than standard
unification.  We therefore propose to make available on request a
fast unifier for expressions that may include multiple sequence
variables.  This unifier is written in Common Lisp using the
almost-linear algorithm that recently appeared in the {\em Journal of
AI} and only incurs significant computational overhead when
attempting to unify expressions that actually {\em do\/} contain
multiple sequence variables.

Any variable whose second character is \type{*} is a sequence
variable; all other variables are not.  Thus \type{?*} and \type{?*x}
are typical sequence variables; \type{?} and \type{?x} are typical
nonsequence variables.  The character \type{*} was chosen because of
the obvious analogy with {\sc bnf} expressions; we have not used a
distinct initial character for sequence variables because it is
important that all variables have a uniform and easily identified
representation.
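
As a sketch of how sequence variables behave, consider the two
patterns below.  The predicate names \type{list-of} and \type{between}
are hypothetical and chosen only for illustration, and we assume here
that a sequence variable may match the empty sequence, a point that
the text above does not settle.

\begin{verbatim}
     (list-of ?x ?*rest)          ; one sequence variable, in final position
     (between ?*left ?x ?*right)  ; two sequence variables
\end{verbatim}

Under that assumption, the second pattern matches
\type{(between a b c)} in three ways: \type{?x} may be bound to
\type{a} (with \type{?*left} empty and \type{?*right} matching
\type{b c}), to \type{b} (with \type{?*left} matching \type{a} and
\type{?*right} matching \type{c}), or to \type{c} (with \type{?*left}
matching \type{a b} and \type{?*right} empty).  This multiplicity of
matches is what makes unification with several sequence variables
harder than the standard case.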

\subsection{Connectives}

The replacement of connectives with alphanumeric expressions is
summarized by the following table of standard logical sentences and
their \kif\ equivalents:
 \begin{center}
\begin{tabular}{l|l}
\multicolumn{1}{c|}{\bf Standard} & \multicolumn{1}{c}{\bf KIF} \\
\hline
$\neg p$ & \type{(not p)} \\
$p_1 \wedge \dots \wedge p_n$ & \type{(and p1 ... pn)} \\
$p_1 \vee \dots \vee p_n$ & \type{(or p1 ... pn)} \\
$p \supset q$ & \type{(if p q)} \\
$p \leftrightarrow q$ & \type{(iff p q)} \\
$\forall x p(x)$ & \type{(forall ?x (p ?x))} \\
$\exists x p(x)$ & \type{(exists ?x (p ?x))}
 \end{tabular}
 \end{center}
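
As a worked example of these conventions, take the sentence
 \[\forall x\,[(\mbox{bird}(x) \wedge \neg\mbox{ostrich}(x)) \supset \mbox{flies}(x)]\]
 which is offered purely as an illustration; its predicate names are
borrowed from the examples in Section \ref{s.comments}.  Applying the
table from the outermost connective inward, it would be written in
\kif\ as

\begin{verbatim}
     (forall ?x (if (and (bird ?x) (not (ostrich ?x))) (flies ?x)))
\end{verbatim}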

\section{Semantics}
\label{s.comments}

The syntax of \kif\ is as given in the preceding two sections;
s-expressions following keywords have no restrictions (hence the
inherent flexibility of this proposal), while the syntax of other
sentences is dictated by the rules of first-order logic.

In general, the semantics of a \kif\ database is also determined by
the rules of first-order logic; the extension facility describes
situations in which first-order logic is inappropriate for some
system-dependent reason.

The form of the additional information used by a \kif\ database is
 \[\type{:extension s-exp}\]
 where \type{:extension} is the name of a particular extension to
first-order logic and \type{s-exp} is additional information that is
to be used when considering subsequent sentences.

It is not within the scope of the standardization effort to delimit
the list of possible extensions; although we are about to give a
variety of examples showing the flexibility of the scheme, it is not
even in the scope of the standardization effort to propose such
extensions.  Rather, we expect that such extensions will be developed
by the designers of the various knowledge bases; as common agreements
are reached by these designers, they will settle on specific
extensions that sanction them.

In this document, we will consider the use of our extension facility
to treat three separate semantic extensions: definitions,
probabilities and procedural attachments.  We have attached
\type{mlg} to all of the extension names in order to indicate that
they are only trial balloons that might be supported by as little as
a single researcher.  In some cases, we would hope that the community
interested in the ideas (probabilities, for example) would rapidly
settle on an agreed language that would describe their shared
semantic commitments.

\subsection{Definitions}
\label{s.def}

A \kif\ document that contained definitions might look like this:

 \begin{verbatim}
     :definition-mlg  predicate
     :nonsemantic
     :label d001

     (forall ?x (iff (bachelor ?x) (and (single ?x) (male ?x))))
     (forall ?x (iff (old-maid ?x) (and (single ?x) (female ?x) (old ?x))))

     :cancel d001
 \end{verbatim}

The \type{:definition-mlg} keyword indicates that the following
sentences are to be interpreted according to a particular set of
semantic or other rules that are agreed upon by all users of the
definition-mlg ``package.''  In this particular case, the following
atom (\type{predicate}) indicates that the rules are being used to
define predicates -- \type{bachelor} and \type{old-maid} in the
example.  The label \type{d001} is used to delimit the scope of the
definitional declaration.

Any system that recognizes the \type{:definition-mlg} commitments is
now free to interpret the above two rules as definitions, presumably
obtaining some computational benefit as a result.  A system that
doesn't recognize these commitments will still be able to make sense
of the rules, since the \type{:nonsemantic} keyword clearly indicates
that the preceding keyword doesn't affect the {\em meaning\/} of the
following information.  Other metalevel information (indicating that a certain
rule is to be used for backward-chaining, for example) can be handled
similarly.
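
As a sketch of the backward-chaining case, a sender might mark a rule
as follows.  The keyword \type{:control-mlg}, its argument
\type{backward} and the predicate \type{animal} are hypothetical names
invented for this illustration; no such package has been agreed upon.

\begin{verbatim}
     :control-mlg  backward
     :nonsemantic
     :label c001

     (forall ?x (if (bird ?x) (animal ?x)))

     :cancel c001
\end{verbatim}

Because the keyword is marked \type{:nonsemantic}, a recipient that
does not recognize it can still use the rule as an ordinary
first-order sentence and simply forgo the control advice.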

\subsection{Probabilities}

Of course, the interesting extensions are those that actually change
the semantics of the knowledge in some way.  Here is one possibility:

 \begin{verbatim}
     :probability-mlg  .75
     :label p001

     (flies Fred)

     :cancel p001
     :probability-mlg  .80
     :label p002

     (forall ?x (if (bird ?x) (flies ?x)))

     :cancel p002

     (forall ?x (if (ostrich ?x) (bird ?x)))
 \end{verbatim}

This database tells us that Fred flies with probability 0.75, that
birds fly with probability 0.8, and that ostriches are birds.  The
last sentence is interpreted normally, since all of the previous
declarations have been cancelled.  The interpretation of the
sentence ``Birds fly with probability 0.8'' is up to the conventions
of \type{probability-mlg}; this particular system might assume that
the probability actually labels the entire sentence, or might strip
off the leading quantifier and interpret the result as a conditional
probability.

Now imagine that this knowledge base is sent to a system that has no
probabilistic facility.  The recipient, upon seeing the
\type{probability-mlg} keyword, will presumably realize that the
database he is receiving uses probabilistic information.  He can
then do one of three things:
 \begin{enumerate}
 \item Accept the probabilistically labelled sentences as valid,
concluding in the example above that Fred flies and that birds fly.
His database is likely to
contain information not intended by the sender if he does this, but
doing so may well be his most attractive option.
 \item Ignore the sentences that carry probabilistic labels.  This
ensures that he
only uses knowledge that the sender thought was valid, but may reduce
the usefulness of the information received.
 \item ``Subscribe'' to \type{probability-mlg} in some sense.  This
does {\em not\/} mean that he has to include in his system a full
probabilistic reasoning facility, only that he has to make sense of
the additional information he is receiving.  Perhaps he will accept any
sentence stated with probability in excess of some threshold and ignore
the rest.  Perhaps he has a default reasoning facility that he will use
instead, and so on.
 \end{enumerate}
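
To make the first two options concrete, here is what the recipient's
database might contain in each case for the example above.  This is
only a sketch under one reading of the options (the first keeps the
labelled sentences as if they were certain, the second discards
them), and the comments are for exposition only.

\begin{verbatim}
     ; Option 1: labelled sentences accepted as if certain
     (flies Fred)
     (forall ?x (if (bird ?x) (flies ?x)))
     (forall ?x (if (ostrich ?x) (bird ?x)))

     ; Option 2: labelled sentences discarded
     (forall ?x (if (ostrich ?x) (bird ?x)))
\end{verbatim}

In the first case the recipient can conclude that every ostrich
flies, which the sender presumably did not intend; in the second he
learns nothing about Fred.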

The point, of course, is that the recipient of the system is able to
interpret the incoming information in a convenient and rational
fashion.  Note also that the labels serve to delimit the extent of
the suspect knowledge.

Many other existing declarative schemes are similar.  \Atms's, for
example, need to label sentences as assumptions or otherwise; many
nonmonotonic reasoning schemes need to label sentences as defaults.

\subsection{Procedural attachment}

Suppose that we want to evaluate the truth of the predicate
\type{subsetp} using {\sc lisp} as opposed to inferentially.  We
might write this as: 

 \begin{verbatim}
     :attach-mlg  subsetp
\end{verbatim}

Presumably, this simply means that \type{subsetp} is procedurally
attached.  The lack of any delimiters indicates that all of the
subsequent information might be affected.  Or, we might have
 \begin{verbatim}
     :attach-mlg  lisp!
\end{verbatim}
 which might indicate that any {\sc lisp} predicate without side
effects is procedurally attached in this way.

Once again, other examples are similar.  If we have a modal operator
of knowledge that has an $S5$ semantics, we might write
 \begin{verbatim}
     :modal-sak  (L S5)
 \end{verbatim}
 while if $L$ is described truth-functionally instead, we would have
 \begin{verbatim}
     :modal-mlg  (L lisp-fn)
\end{verbatim}
 where \type{lisp-fn} is the function that actually does the
computation.  If mlg and sak manage to resolve their differences,
they might settle on a single package \type{:modal}.

\section{Conclusion}

Standards are appropriate only where there is consensus.  The aim of
the interlingua that we have proposed is to standardize on what
really is standard -- the fact that existing declarative databases
are largely equivalent to collections of first-order sentences.  In
many cases, the inability to transmit declarative information between
large systems reflects more the fact that there is no common language
to do so than fundamental differences between the systems being used,
and we have attempted to provide a way around this difficulty.

What we have emphatically {\em not\/} tried to do is to get involved
in settling any fundamental differences that do exist, since these
are not yet ripe for standardization.  Instead, we have provided a
general way for users to include their nonsemantic knowledge when
transmitting declarative databases in the hope that this will
encourage the emergence of standards in these areas from the
knowledge representation and engineering community itself.

\end{document}