ANSI standards and knowledge representation
"Matthew L. Ginsberg" <ginsberg@t.uoregon.edu>
Date: Wed, 17 Aug 94 09:49:56 PDT
From: "Matthew L. Ginsberg" <ginsberg@t.uoregon.edu>
Message-id: <9408171649.AA06846@t.uoregon.edu>
To: ansi@t.uoregon.edu
Subject: ANSI standards and knowledge representation
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk
I've been asked to attend the September meeting of ANSI X3T2, which is
involved in preparing the US position on the Common Logic Foundation
and on Conceptual Schema Modelling Facilities. At the moment, the US
position is that being proposed by Fulton, Genesereth, and Sowa, and
is based on the STEP Semantic Unification Meta-Model (SUMM), Knowledge
Interchange Format (KIF), and Conceptual Graphs (CGs).
In order to ensure that I represent the community's views as opposed
to my own, I would appreciate it if you could both recirculate this
message and respond to it by answering the following four questions:
1. Do you know of any fielded commercial system that uses SUMM, KIF
or CGs?
2. Do you know of any independent research effort that uses one of
these systems? By "independent" I mean independent of the developers
of the systems themselves, so that (for example) a project using KIF
and led by a member of the KIF development team doesn't count.
3. Do you believe standardization of knowledge representation
languages is appropriate at the present time?
4. If KR languages are to be standardized, would you rather see
standardization based on first-order logic alone (with hooks to
subsequent syntactic or semantic extensions), or based on a larger
language such as KIF or CGs?
Thanks! I'm attaching a draft spec for a "first-order logic only"
specification; if you have any thoughts on that, I'd be glad to
hear and forward them.
Matt Ginsberg
I. OVERALL DESCRIPTION
1. The basic language is first-order predicate calculus, with specific
notational conventions to handle variables, quantification, and so on.
We subscribe to the syntax of first-order logic only, and not
necessarily to its semantics.
2. All documents include information indicating the KR commitments
made by the underlying system. Any recipient who can support all of
the commitments made by the sender is assured that no information has
been lost in the transmission of the system; recipients who cannot
support all of the commitments may receive approximate information in
some sense.
3. The additional information passed by documents making specific KR
assumptions is clearly separated from the declarative information that
is also being passed. This is to ensure that a recipient not sharing
these commitments retains partial access to the knowledge being sent.
The syntax used for the additional information is described in
Section II.
II. SYNTAX
A document is a sequence of ASCII characters that can be processed by
a LISP reader; we will concern ourselves with the results returned by
the reader instead of the character stream itself, viewing a document
as a stream of s-expressions. The following s-expressions have a
special meaning:
1. Any keyword (i.e., any top-level s-expression that is an atom
beginning with a :) other than :label, :cancel and :nonsemantic is the
name of a KR commitment made by the system. The next s-expression
should be used to convey additional information as appropriate
(examples follow in Section IV). These keyword/s-expression pairs
will typically contain system-dependent information concerning
subsequent sentences.
2. :label should be followed by an atom that labels the last keyword
encountered, and also indicates that this keyword is "sticky" in that
it applies to all subsequent sentences until cancelled. :cancel
should also be followed by an atom and indicates that the keyword
information so labelled should not be applied to the following
sentences.
3. :nonsemantic indicates that the most recent keyword has no semantic
content (e.g., control information). This is an indication that a
system that doesn't understand the keyword is still safe in using the
information.
All other s-expressions are interpreted as basic sentences with a
declarative semantics; these are described in the next section.
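The bookkeeping implied by these rules can be sketched as follows. This is only a minimal illustration, written in Python rather than LISP, with a toy reader that handles nothing beyond parentheses and bare atoms. The names read_all and annotate are hypothetical. The rule that an unlabelled keyword covers only the next sentence is an assumption, inferred from the probability example in Section IVb rather than stated in this draft.

```python
import re

TOKEN = re.compile(r"[()]|[^\s()]+")

def read_all(text):
    """Toy reader: atoms become str, parenthesized lists become list."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of document")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected )")
        if tok != "(":
            return tok
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            items.append(read())
        if pos >= len(tokens):
            raise ValueError("missing )")
        pos += 1  # consume the closing parenthesis
        return items

    sexps = []
    while pos < len(tokens):
        sexps.append(read())
    return sexps

def annotate(sexps):
    """Pair each sentence with the KR commitments in force when it is read."""
    sticky = {}    # label -> commitment, in force until cancelled
    pending = []   # unlabelled commitments (assumed: next sentence only)
    last = None    # most recent keyword, the target of :label/:nonsemantic
    out = []
    stream = iter(sexps)
    for item in stream:
        if not (isinstance(item, str) and item.startswith(":")):
            out.append((item, list(sticky.values()) + pending))
            pending = []
        elif item == ":cancel":
            sticky.pop(next(stream), None)
        elif item in (":label", ":nonsemantic"):
            if last is None:
                raise ValueError(item + " with no preceding keyword")
            if item == ":nonsemantic":
                last["nonsemantic"] = True
            else:
                sticky[next(stream)] = last
                pending = [c for c in pending if c is not last]
        else:
            # Any other keyword names a commitment; its argument follows.
            last = {"keyword": item, "arg": next(stream), "nonsemantic": False}
            pending.append(last)
    return out
```

A recipient could then inspect the second element of each pair and decide, sentence by sentence, whether it supports every commitment listed there.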
III. BASIC LANGUAGE
The legal sentences are precisely those of first-order logic,
expressed in prenex normal form. In order to make this precise, we
need to specify the fashion in which variables are distinguished from
constants, and to specify alphanumeric replacements for symbols such
as - (negation), A (forall) and E (exists).
IIIa. Variables
Any atom beginning with the character ? is a variable. We also
support the use of sequence variables, which are variables that can
match arbitrary sequences of atoms instead of just single ones; there
is substantial evidence from the PROLOG community that such variables
are useful in the construction of declarative databases.
Unlike PROLOG, we do not assume either that sequence variables appear
only at the end of argument lists or that only one sequence variable
appears in any particular expression. Although this makes unification
more difficult, it is possible to construct a unifier that handles
this more general problem and only incurs significant computational
overhead when attempting to unify expressions that actually do contain
multiple sequence variables.
Any variable whose second character is * is a sequence variable; all
other variables are not. Thus ?* and ?*x are sequence variables; ?
and ?x are nonsequence variables. The character * was chosen because
of the obvious analogy with BNF expressions; we have not used a
distinct initial character for sequence variables because it is
important that all variables have a uniform and easily identified
representation.
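The naming convention, and the extra search that several sequence variables introduce, can be illustrated with a short sketch. It is written in Python, with s-expressions represented as nested lists of strings. It performs only one-way matching of a pattern against a variable-free expression, which is a simpler problem than the full unification discussed above, and it lets a sequence variable absorb arbitrary subexpressions rather than atoms only. The function names are hypothetical.

```python
def is_variable(atom):
    """Any atom beginning with ? is a variable."""
    return isinstance(atom, str) and atom.startswith("?")

def is_sequence_variable(atom):
    """A variable whose second character is * is a sequence variable."""
    return is_variable(atom) and atom[1:2] == "*"

def match(pattern, term, bindings=None):
    """Yield each binding dict under which pattern equals the ground term."""
    bindings = {} if bindings is None else bindings
    if is_variable(pattern) and not is_sequence_variable(pattern):
        if pattern not in bindings:
            yield {**bindings, pattern: term}
        elif bindings[pattern] == term:
            yield bindings
    elif isinstance(pattern, list) and isinstance(term, list):
        yield from match_seq(pattern, term, bindings)
    elif pattern == term:
        yield bindings

def match_seq(patterns, terms, bindings):
    """Match a list of patterns against a list of terms, with backtracking."""
    if not patterns:
        if not terms:
            yield bindings
        return
    head, rest = patterns[0], patterns[1:]
    if is_sequence_variable(head):
        if head in bindings:
            # Already bound: the same run of terms must appear here.
            n = len(bindings[head])
            if terms[:n] == bindings[head]:
                yield from match_seq(rest, terms[n:], bindings)
        else:
            # Unbound: try every prefix length, including the empty one.
            for n in range(len(terms) + 1):
                yield from match_seq(rest, terms[n:],
                                     {**bindings, head: terms[:n]})
    elif terms:
        for b in match(head, terms[0], bindings):
            yield from match_seq(rest, terms[1:], b)
```

Matching (p ?*a b ?*c) against (p a b b c) yields two sets of bindings under this sketch, one for each b that the pattern's b can line up with. The loop over prefix lengths is entered only when a sequence variable is present, so patterns without them pay no extra cost.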
IIIb. Connectives
The replacement of connectives with alphanumeric expressions is
summarized by the following table of standard logical sentences and
their equivalents:
    -p                  (not p)
    p1 & ... & pn       (and p1 ... pn)
    p1 v ... v pn       (or p1 ... pn)
    p -> q              (if p q)
    p <-> q             (iff p q)
    Axy [p(x,y)]        (forall (?x ?y) (p ?x ?y))
    Exy [p(x,y)]        (exists (?x ?y) (p ?x ?y))
The first argument to forall and exists can be an atom if only a
single variable appears under the scope of the quantifier.
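As a check on the correspondence in the table, here is a small sketch that renders the alphanumeric form back into the conventional notation. It is in Python, with sentences as nested lists of strings. The name to_infix is hypothetical, and variable names are assumed to be single letters, as in the table, so that they can be run together after the quantifier.

```python
INFIX = {"and": " & ", "or": " v ", "if": " -> ", "iff": " <-> "}
QUANTIFIER = {"forall": "A", "exists": "E"}

def to_infix(sentence):
    """Render a prefix-form sentence in the conventional notation."""
    if isinstance(sentence, str):
        # Drop the leading ? that marks a variable.
        return sentence[1:] if sentence.startswith("?") else sentence
    op, args = sentence[0], sentence[1:]
    if op == "not":
        return "-" + _operand(args[0])
    if op in INFIX:
        return INFIX[op].join(_operand(a) for a in args)
    if op in QUANTIFIER:
        # The variable list may be a single atom, as the text allows.
        names = args[0] if isinstance(args[0], list) else [args[0]]
        prefix = QUANTIFIER[op] + "".join(n[1:] for n in names)
        return prefix + " [" + to_infix(args[1]) + "]"
    # Anything else is treated as an atomic sentence or a function term.
    return op + "(" + ",".join(to_infix(a) for a in args) + ")"

def _operand(sentence):
    """Parenthesize a nested connective so that the output stays unambiguous."""
    text = to_infix(sentence)
    nested = isinstance(sentence, list) and sentence[0] in INFIX
    return "(" + text + ")" if nested else text
```

For example, to_infix(["forall", "?x", ["if", ["bird", "?x"], ["flies", "?x"]]]) gives the string "Ax [bird(x) -> flies(x)]".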
IV. SEMANTICS
The syntax of the language is as given in the preceding two sections;
s-expressions following keywords have no restrictions (hence the
inherent flexibility of the proposal), while the syntax of other
sentences is dictated by the rules of first-order logic.
In the absence of information to the contrary, the semantics of a
database are also given by the rules of first-order logic; the
extension facility describes situations in which first-order logic is
inappropriate for some system-dependent reason.
The form of the additional information used by the database is
    :extension s-exp
where :extension is the name of a particular extension to first-order
logic and s-exp is additional information that is to be used when
considering the subsequent sentence or sentences.
As examples, we will consider the use of our extension facility to
treat four separate semantic extensions: definitions, probabilities,
procedural attachments and relevance logic. We have attached "mlg" to
the extension names in order to indicate that they are mere trial
balloons that might be supported by only a single researcher. In some
cases, we would hope that the community interested in the ideas
(probabilities, for example) would rapidly settle on an agreed
language that would describe their shared semantic commitments.
IVa. Definitions
A document that contained definitions might look like this:
    :definition-mlg predicate
    :nonsemantic
    :label d001
    (forall ?x (iff (bachelor ?x) (and (single ?x) (male ?x))))
    (forall ?x (iff (old-maid ?x)
                    (and (single ?x) (female ?x) (old ?x))))
    :cancel d001
The definition-mlg keyword indicates that the following sentences are
to be interpreted according to a particular set of rules that are
agreed upon by all users of the definition-mlg "package". In this
particular case, the following atom, predicate, indicates that the
rules are being used to define predicates -- bachelor and old-maid.
The label d001 is used to delimit the scope of the definitional
declaration.
Any system that recognizes the definition-mlg commitments is now free
to interpret the above two rules as definitions, presumably obtaining
some computational benefit as a result. A system that doesn't
recognize these commitments will still be able to make sense of the
rules, since the nonsemantic keyword indicates that the keyword
doesn't affect the meaning of the following information. Other
control information (indicating that a certain rule is to be used for
backward-chaining, for example) can be handled similarly.
IVb. Probabilities
The interesting extensions are those that actually change the
semantics of the knowledge in some way. Here is one:
    :probability-mlg .75
    (flies Fred)
    :probability-mlg .80
    (forall ?x (if (bird ?x) (flies ?x)))
    (forall ?x (if (ostrich ?x) (bird ?x)))
This database tells us that Fred flies with probability 0.75, that
birds fly with probability 0.8, and that ostriches are birds. The
interpretation of the sentence, "Birds fly with probability 0.8" is up
to the conventions of probability-mlg; this particular system might
assume that the probability actually labels the entire sentence, or
might strip off the leading quantifier and interpret the result as a
conditional probability.
Now imagine that this knowledge base is sent to a system that has no
probabilistic facility. The recipient, upon seeing the
probability-mlg keyword, will presumably realize that the database he
is receiving uses probabilistic information. He can then do one of
three things:
1. Accept the probabilistic information as valid, concluding in the
example above that Fred flies and that all birds do. The resulting
database is likely to contain information not intended by the sender,
but may nevertheless be the most attractive option available.
2. Ignore the probabilistic information, retaining only the
statement that ostriches are birds. Only knowledge thought by the
sender to be valid is used, but the overall value of the information
received may be reduced.
3. "Subscribe" to probability-mlg in some sense. This does not
imply the use of a full probabilistic reasoning facility, only that
some sense is made of the additional information being received.
Perhaps any sentence stated with probability in excess of some
threshold should be accepted, and the rest ignored. Perhaps a default
reasoning facility will be used instead, and so on.
The point, of course, is that the recipient of the system is able to
interpret the incoming information in a convenient and rational
fashion. Note, incidentally, that the first two options described
above (accept or ignore suspect sentences) can be done automatically;
the third requires some sort of human intervention.
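The three options can be made concrete with a short sketch. It is in Python and treats sentences as opaque strings; the function name receive, the policy names and the default threshold are all hypothetical. The threshold rule is only one of the possible readings of "subscribing" mentioned above, and choosing its value is where the human intervention comes in.

```python
def receive(annotated, policy, threshold=0.5):
    """Return the sentences a non-probabilistic recipient keeps.

    annotated is a list of (sentence, probability) pairs, where the
    probability is None for a sentence sent with no :probability-mlg
    annotation.
    """
    if policy not in ("accept", "ignore", "threshold"):
        raise ValueError("unknown policy: " + policy)
    kept = []
    for sentence, probability in annotated:
        if probability is None:
            kept.append(sentence)   # plain first-order sentence: always kept
        elif policy == "accept":
            kept.append(sentence)   # option 1: treat it as simply true
        elif policy == "threshold" and probability > threshold:
            kept.append(sentence)   # option 3: keep only likely sentences
        # option 2 ("ignore") drops every annotated sentence
    return kept

database = [
    ("(flies Fred)", 0.75),
    ("(forall ?x (if (bird ?x) (flies ?x)))", 0.80),
    ("(forall ?x (if (ostrich ?x) (bird ?x)))", None),
]
```

With this database, the "accept" policy keeps all three sentences, "ignore" keeps only the statement that ostriches are birds, and "threshold" with a threshold of 0.78 keeps the last two.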
Many other existing declarative schemes can be handled similarly.
ATMSs, for example, need to label sentences as assumptions or
otherwise; nonmonotonic reasoning schemes typically need to label
sentences as defaults.
IVc. Procedural attachment and relevance logic
Suppose that we want to evaluate the truth of the predicate subsetp
using LISP as opposed to inferentially. We might write this as:
    :attach-mlg subsetp
    :label a001
This presumably means that subsetp is procedurally attached. Since
the label is never cancelled, all of the subsequent information is
affected. Or, we might have
    :attach-mlg lisp!
    :label a002
indicating that any LISP predicate without side effects is
procedurally attached in this way.
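What procedural attachment might amount to on the receiving side is sketched below, with Python standing in for LISP. The table ATTACHED, the function holds and the set-based definition of subsetp are all hypothetical. Sentences are nested lists, and any predicate without an attachment is simply looked up among the stored facts.

```python
def subsetp(xs, ys):
    """Procedural test: every element of xs also occurs in ys."""
    return set(xs) <= set(ys)

# Predicates named by an :attach-mlg keyword, mapped to their procedures.
ATTACHED = {"subsetp": subsetp}

def holds(atom, facts, attached=ATTACHED):
    """Decide a ground atomic sentence by procedure or by lookup."""
    predicate, args = atom[0], atom[1:]
    if predicate in attached:
        return attached[predicate](*args)   # evaluate, do not infer
    return atom in facts                    # ordinary declarative lookup
```

Here holds(["subsetp", ["a"], ["a", "b"]], []) is decided by running the procedure, while holds(["bird", "Tweety"], facts) depends only on what the database contains.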
Once again, other examples are similar. If we intend that the
database be interpreted using Belnap's four-valued relevance logic
instead of conventional methods, we might write
    :relevance-ndb nil
    :label b001
A modal operator of knowledge with an S5 semantics might lead us to
write
    :modal-sak (L S5)
    :label s001
while if L is described truth-functionally instead, we would have
    :modal-mlg (L lisp-fn)
    :label g001
where lisp-fn is the function that actually does the computation. If
mlg and sak manage to resolve their differences, they might settle on
a single package :modal.