Points of agreement & disagreement

Message-id: <199202122212.AA29300@venera.isi.edu>
Date: Wed, 12 Feb 92 16:57:17 EST
From: sowa@watson.ibm.com
To: HAYES@SUMEX-AIM.STANFORD.EDU
Cc: INTERLINGUA@ISI.EDU, SRKB@ISI.EDU, CG@CS.UMN.EDU
Subject: Points of agreement & disagreement
Pat,

As far as the basic features of KIF are concerned, I think that we pretty
much settled that issue a couple of exchanges ago.  But we have been
disagreeing on the reasons why we agree.  To help sort out some of the
arguments, I'd like to summarize my position:

 1. KIF:  There seems to be an emerging consensus on what should
    be in the core of KIF:  a sorted first-order logic; the basic
    universal and existential quantifiers plus some generalized
    quantifiers like exactly-one and unique; some metapredicates for
    talking about types, relations, and their attributes; and an
    "apply" operator in one or more guises to allow you to quantify
    over types and relations without getting into the wild and wooly
    areas of unrestricted higher-order logic.  I'll certainly give
    you an exclusive-OR if you give me a lambda.  Other things like
    contexts, defaults, laws, modality, and the Hilbert epsilon operator
    are worth thinking about, but more work is needed before they are
    definitively admitted into the general consensus.

 2. Ontology:  Everybody seems to agree that shared ontologies are
    essential for shared KBs and that we have much more work to do
    before we can agree on what they should be.  I don't believe that
    any ontologies can be fixed and frozen for all time.  Instead, we
    might get something like libraries of ontologies.  Then two projects
    that want to share knowledge might agree "Let's start with ontology
    XP309 with the following additions, deletions, and modifications..."
    Then both projects could develop private conventions for their own
    internal use as long as they communicate via the shared terms.

 3. Conceptual analysis:  We both agree that guidelines are needed for
    knowledge engineering and that the wisdom of the ancients that lies
    buried in NL is useful.  But the discussion seems to be hung up on
    syntax.  I would suggest the term "conceptual analysis", which puts
    the emphasis on semantics.  Even when linguists are looking at
    semantics, however, they look for syntactic corroboration with
    techniques such as co-occurrence patterns.  But the use of syntactic
    evidence does not imply that they are looking only at syntax.

 4. Logical syntax:  I am bilingual in predicate calculus and conceptual
    graphs.  But only a small minority of programmers (or even AI
    researchers) are really fluent in any version of logic.  Many people
    are so put off by the syntax of predicate calculus that they invent
    those "odd syntaxes" that you dislike.  I would agree that a lot
    of those languages are very sloppily constructed, but I sympathize
    with the feeling that there must be a better way.  Furthermore, I
    will make a little bet:  If you give me one hour for a tutorial
    on Peirce's existential graphs, I will make you dissatisfied with
    predicate calculus -- not fully convinced of the need for a change
    perhaps, but at least wavering.

 5. Expressive power:  Since my ultimate goal is to develop a
    full-fledged semantic representation for all of natural language, I
    need the ultimate in expressive power, including higher-order
    indexical intensional modal temporal logic with presuppositions,
    focus, and contexts.  But I would never try to infect the basic
    core of KIF with such an unmanageable beast.  Even though I might
    need all that apparatus in the intermediate stages of analyzing
    language, the final stage that I would send off to a shared KB
    would be a rather tame sorted FOL with the indexicals & such
    resolved to simple constants and variables.

 6. Metametalanguages:  As I've said in other notes, I prefer to do
    my reasoning in a fairly conventional FOL.  In order to reconcile
    that preference with the expressive power mentioned in point #5,
    I like to follow Quine, Dunn, et al., in treating modal and
    intensional constructions as metalanguage that says how the
    embedded propositions are to be handled.  The first stage of
    handling such constructions would be to parcel out the various
    propositions into different contexts or belief spaces.  Then
    to do the theorem proving, you enter one of the contexts and
    do reasoning in FOL or even a subset of FOL.  The conclusions
    could be exported to another context or KB using nothing more
    expressive than the consensus KIF from point #1.  I would do the
    same sort of thing for nonmonotonic reasoning:  I would handle
    defaults, etc., as recommendations for things to add in a belief
    revision or theory revision stage; but the actual reasoning would
    be purely first-order.

 7. NL generation:  Analyzing unrestricted NL is a task that will provide
    long-term job security for anybody who is clever enough to get
    funding for it.  But generating simple-minded, intelligible, but
    inelegant prose from a formal KR is possible if you have a well-
    defined set of conventions.  In conceptual graphs, those conventions
    are embodied in what I call the "canonical graphs".  If you generate
    CG's directly from some other KR language, you won't necessarily
    get a canonical form, but if you define the predicates from the
    other language with suitable lambda-expressions, you can generate
    a canonical form that can then be mapped to NL or at least a
    pidgin NL.  Generating elegant, well-organized prose is still a
    research effort, but pidgin NL is probably better than the kind of
    documentation (or lack thereof) that you get from most programmers.
    (And by the way, I definitely do not assume that there is a unique
    "canonical graph" for every proposition.  All I claim is that if you
    have a canonical graph, you can generate a syntactically correct
    sentence from it.)
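
As promised in point #1, here is a rough sketch of the lambda and
"apply" machinery I have in mind.  The syntax is ad hoc, chosen only
to illustrate the idea, and is not a proposal for the official KIF
grammar:

   xor  =  (lambda (p,q) ((p v q) & ~(p & q))).

   (Er:RELATION)(apply(r,a,b) & symmetric(r)).

The first line uses lambda to define my exclusive-OR in terms of the
basic Boolean operators.  The second uses apply, together with a
metapredicate like symmetric, to quantify over relations while the
underlying logic stays first-order.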

There are certainly lots of research issues involved in points #2, #3,
#4, #5, #6, and #7.   But I think we can agree on a basic core for KIF
as in point #1 without waiting for the other research to be completed.

Now some comments on some of your comments on some of my comments on...

> If we agree with
> the Stanford philosophers that NL is essentially indexical in nature, then
> LofT, since it is the vehicle for memory, cannot be similarly constructed or
> our memories would have the same indexical quality and they would all seem to
> be about 'now'.

No.  Let me give an example of how I would handle indexicals.  Consider
the pair of sentences "I see a cat.  The cat is black."  In conceptual
graphs, I use the symbol # as a marker for indexicals.  I would translate
those two sentences into

   [PERSON: #I]<-(EXPR)<-[SEE]->(PTNT)->[CAT].

   [CAT: #]->(ATTR)->[BLACK].

Here #I is an indexical that refers to the person who is speaking
in the current context, and # by itself is the usual marker for an
anaphoric definite article.  (EXPR is my abbreviation for experiencer,
PTNT is patient, and ATTR is attribute.)  Tenses are also indexicals,
and if I wanted to show tense as well, I would draw a big box around
both graphs in the form [TIME: #now]<-(PTIM)<-[SITUATION: ...], where
(PTIM) is point in time and #now is the indexical for time.

The formula operator phi, which translates CGs to predicate calculus,
is not defined for graphs that contain #.  Therefore, I would have to
resolve the indexicals by searching for appropriate antecedents for
the # markers.  The result would be a graph like the following:

   [TIME: 12 Feb 1992, 21:25 GMT]<-(PTIM)<-[SITUATION:

      [PERSON: John]<-(EXPR)<-[SEE]->(PTNT)->[CAT: *x].

      [CAT: *x]->(ATTR)->[BLACK] ].

Now all the indexicals are gone, and we can apply phi to the result.
If we have a KIF that doesn't handle situations and times, then phi
would just translate the graphs inside the situation box into

   (Ex:cat)(Ey:see)(Ez:black)(person(John) & expr(y,John)
      & ptnt(y,x) & attr(x,z)).

What this example illustrates is that you can have a semantic
representation that is replete with indexicals, but still generate
a LofT suitable for long-term memory simply by resolving the context-
bound indexicals to context-independent referents.  But you also have
the option of storing the whole context in an unresolved form and
waiting for further information before completing the analysis.

> By simply talking about 'meaning' you are blurring this
> important distinction.
>
>> Dixon, being a linguist, uses the term "semantics" in a broad sense.
>
> I can't let that go by. He uses it in exactly the same narrow sense that you
> use it, ie as referring to the meaning of natural languages.

Actually in _Conceptual Structures_, I never used the word "meaning"
in a technical sense, and I rarely used it in even an informal sense.
I would prefer to avoid using that term altogether and just say
explicitly "truth-functional denotation" or whatever other aspect of
"meaning" we are talking about.

> I can't help noting, for example, that if we take Bolinger's dictum
> seriously, then EVERY difference Dixon has found in surface syntax in ANY
> language must somehow be mirrored in a distinction in the semantic language.

That's true.  I would expect that a fully detailed representation that
tried to capture every nuance of meaning would have to do that.  That
is part of the long-term research project.  But most of the linguistic
issues would have more impact on the ontology than on KIF.

> ... "I never owned a red ball"... might... mean something which
> would be directly rendered into English as 'all the owning experiences I
> have had have not had red-ball as an accurate description of their object',
> which doesn't directly existentially quantify over any balls.

Certainly.  But a parser that analyzes English would be easier
to construct if it could always translate "a red ball" into an
existential form and then let a later stage push the negations
inward to derive the universal quantifiers if they are preferred.
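
For example, ignoring tense and resolving the indexical "I" to a
constant such as John (my stand-in for the speaker), the parser
would first produce

   ~(Ex:ball)(red(x) & own(John,x))

and a later stage would push the negation inward to get

   (Ax:ball)(red(x) -> ~own(John,x)).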

>>  Predicate calculus is based
>> on C. S. Peirce's first effort to represent full FOL (his notation
>> of 1883).  By 1897, he had scrapped that notation in favor of his
>> existential graphs, which he called "The logic of the future".
>> I prefer Peirce's revised, improved logic to his first attempt.
>
> Transatlantic cultural piracy in full swing!! Predicate calculus is based on
> the work of Frege as adopted and enriched by Russell and Whitehead in Principia
> Mathematica. However, your point holds since Frege used a graphical notation.

A bit of history:  Frege's Begriffsschrift of 1879 was the first complete
system of FOL.  But it had a tree-like notation with only a universal
quantifier, implication, and negation.  Peirce's notation of 1883 was
a generalization of Boolean algebra.  For the existential, Peirce used
Sigma, which was a generalization of Boole's + for "or".  For the
universal, Peirce used Pi, which was a generalization of Boole's dot
for "and".  By 1885, Peirce's algebra was isomorphic to current
predicate calculus, but with different symbols for the quantifiers and
operators.  Frege's compatriot, Ernst Schroeder, knew the
Begriffsschrift, but he didn't like it.  In 1890, Schroeder adopted
Peirce's notation for his 3-volume _Vorlesungen ueber die Algebra
der Logik_.  Peano followed Schroeder, but
he changed the symbols because he wanted to mix logic with arithmetic.
Russell picked up the notation from Peano and used it for some major
work before he rediscovered Frege.  Therefore, it is true that Frege
was the first to invent a complete system of FOL, but it is also true
that logic was well established on Peirce's basis before Frege had any
influence on it.

>>  I would say that both forms are truth-functionally
>> equivalent, but that "red(x)" uses a first-order expression, while
>> "color(x,red)" uses a second-order expression.  The second-order form
>> makes it easier to quantify over colors and store them in a database.
>
> They both look first-order to me. The second is second-order only if a color is
> taken to be a property, which does not seem very plausible.

But I do take color to be a property, and I also take shape, size, and
a lot of other things to be properties.  And the mechanism that I use
for relating the type RED to the instance red of type COLOR is exactly
the same as the one for relating the type SQUARE to the instance square
of type SHAPE.
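
To spell out the contrast in the sorted notation I used earlier:
in the first form, red is a predicate applied to x,

   red(x)

while in the second form, red is an instance of type COLOR, so I
can quantify over colors directly,

   (Ey:color) color(x,y)

which says that x has some color.  The quantifier ranges over the
instances of COLOR, such as red, and a database can store those
instances as ordinary values in a field.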

> For example, a red
> can be bright, and a brightness ( in particular, that of a red ) can be
> dazzling. So you can't stop at second-order, it seems, once you start
> away from the safety of first-order...

You do need higher-order types:   I treat PROPERTY as a third-order type
with color and shape as instances.  RANK is another third-order type with
species, genus, ..., kingdom as instances.  GENUS is a second-order type
with homo, felis, canis, ... as instances.  And HOMO, FELIS, and CANIS
are first-order types.

I would also say that BRIGHT is a first-order type, and BRIGHTNESS is
a second-order type.  But I would represent the phrase "a dazzlingly
bright red shirt" with only first-order types:

   [SHIRT]->(ATTR)->[RED]->(ATTR)->[BRIGHT]->(ATTR)->[DAZZLING].

Ontologically, I would say that this graph means there exists a
shirt, which has as attribute an instance of red, which has as
attribute an instance of bright, which has as attribute an
instance of dazzling.  This representation requires me to populate
my ontology with a lot of quantifiable types derived from adjectives
and adverbs, but it doesn't require a higher-order form, at least
not for this example.
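
Pulling those examples together in one table (my own tabulation,
with a couple of extra instances added for illustration):

   Third order:   PROPERTY    instances: color, shape, ...
                  RANK        instances: species, genus, ..., kingdom
   Second order:  COLOR       instances: red, blue, ...
                  SHAPE       instances: square, ...
                  GENUS       instances: homo, felis, canis, ...
                  BRIGHTNESS  instances: bright, ...
   First order:   RED, SQUARE, HOMO, FELIS, CANIS, BRIGHT, ...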

> Why should a selection of ontological primitives for, say, an
> engineering system to help designers of electric motors, have much
> to do with categories suggested by the syntax of English?

Not syntax, but semantics.  You deleted my example about Iris Tommelein,
who used primitives derived from linguistic research as a basis for
knowledge representation in civil engineering.  The basic types used
in talking about electricity are also derived by a metaphorical extension
of terms used to describe flowing water -- current, flow, resistance,
pressure, etc.  Of course, the metaphor is only suggestive, and you
still have to test every predicted extension:  when you cut a wire, for
example, the electricity doesn't spill out on the floor.

> I have heard it argued convincingly that
> English is losing its future tense, and vernacular French has completely lost
> it: should I abandon reasoning about the future?  You see why I find syntax
> less than an ideal guide.

Yes, that's why I prefer the term "conceptual analysis".

>>  We might
>> agree that describing a house in polar coordinates or rectangular coordinates
>> makes no difference in meaning.  But just try giving your local contractor
>> a set of house plans in polar coordinates.
>
> This misses a whole lot of points. First, the ridicule comes from using the
> difference in a communication. But that is exactly my point: there is an
> essential difference between the situation of communication (over a narrow
> channel) between intelligent people, and that of being an internal vehicle for
> storage, retrieval and inference.

No.  My example was not intended to ridicule, but to emphasize the
point that the choice of predicates may not change the truth-functional
denotation of a formula, but it could make a very big difference in the
ease of knowledge engineering, clarity of communication, volume of
storage, efficiency of inference, etc.

>  But anyway, the importance of the difference here is obviously the difficulty
> of computing the relevant pieces of information in one case. (I am extending my
> house right now, so am unusually familiar with the need to know the lengths of
> things parallel to the walls.) But the differences you have been drawing our
> attention to, which are settled by the application of a single
> lambda-expression, no recursion, etc., are computationally piffling.
>
> And I think McCarthy's dictum, that we should concentrate on the basic
> structure of the representation and leave computational niceties for later, is
> still a good one.

McCarthy's dictum is a variant of the old programming adage, "Make it
run before you make it faster."  I agree with that, but the choice of
coordinate system certainly affects the clarity of the structure at
least as much as it affects performance.

> We were disagreeing about whether the
> knowledge representation language should, as a matter of doctrine, have a
> representational distinction corresponding to every surface distinction of
> English. You say, essential: I say, unnecessary and potentially misleading.

No, I didn't say that.  You are conflating multiple points from my list
at the head of this note.  I admit that I can't blame you for that,
since I hadn't given you my seven-point summary before.

>> Just because some naive
>> prospectors may have settled on iron pyrites is no reason why trained
>> prospectors should stop looking for gold.
>
> Now, I wonder where one gets the right kind of training?

You need a strong, interdisciplinary background that includes logic,
linguistics, AI, and philosophy.  Some people like me are trying to
write books and papers that bring these things together, and we are
not being helped by linguists who denounce AI researchers as hackers
or AI researchers who denounce linguists as irrelevant.

>> We agree pretty much in principle, but I would say evaluating denotations
>> in terms of models is of immense economic importance. ... So I would want to
>> keep it high on the list of goals and requirements.
>
> John, you seem to be shifting around. On the one hand, we were concerned with
> what I took to be a scientific issue to do with the representation of (models
> of) human knowledge.  Now we are concerned with the detailed pragmatics of
> engineering. Both are worthy of care, but they may not push in the same
> directions. In particular, most 'common sense' reasoning is not concerned with
> models which can be made into databases.

The multiple meanings of the word "model" have caused some semantic
drift in this discussion.  My original point was that sorted logic makes
it faster to evaluate the denotation of a formula in terms of a model.
You replied that "surely" I didn't intend for anyone to run through all
the values of a quantified variable.  I answered that database systems
routinely run through the values in answering a query.  And now you are
talking about models of scientific knowledge.

In any case, I think that we largely agree on what should go into KIF,
although we have been disagreeing on the reasons why we agree.

John