Responding to msgs #68, 69, 70, 71, 88, 91, and 92

Date: Sat, 8 May 93 14:22:47 EDT
From: sowa <sowa@turing.pacss.binghamton.edu>
Message-id: <9305081822.AA09259@turing.pacss.binghamton.edu>
To: cg@cs.umn.edu, interlingua@ISI.EDU, phayes@cs.uiuc.edu
Subject: Responding to msgs #68, 69, 70, 71, 88, 91, and 92
Cc: sowa@turing.pacss.binghamton.edu

Pat,

I haven't had time to get back to my email since Wednesday, so I have
had a number of notes accumulate in my mailbox.  Since they have various
collections of addressees, I'll try to comment on all of them to everybody,
but with numerous deletions in an attempt to conserve electrons.

> ... I could rephrase [Aronson's point] as saying that the way in which a 
> theory can describe the world must depend on what entities it - the theory - 
> hypothesises to exist. 

Fine.  This is a point on which we have no disagreement.

> ... Now, let's forget NL and think about these mental (or computer)
> representations. These, we agree(?), are what is meant by knowledge
> representations. 'Making these explicit' IS the process of designing
> Krep formalisms and writing axioms[*1] in them, right? And these are the
> things whose relations to the world are described by semantic theories.
> That is exactly what I have been (intransigently) saying and you have 
> apparently been vehemently denying. Or are you going now to say that 
> there is yet another layer between these and the world? Where will this 
> erection of mathematical barriers between beliefs and reality ever stop?

I think I see one of the underlying causes of our disagreements.
When I talk about things inside the computer (for simplicity, let's
avoid both mental models and NL for the moment), I distinguish those
things that are language-like from those that are model-like.  The
great flexibility of programming languages like LISP and Prolog allows
AI programmers to move very smoothly from language-like representations
like KIF to model-like constructions that serve as internal surrogates
for some physical systems.  Because those languages move so easily from
one mode to the other, AI programmers don't always distinguish them.

The distinction is probably clearer in database systems.  I regard 
SQL as an example of a language that is used to talk about a database,
which I regard as a model.  A relational DB, for example, has a set
of individuals (the data elements stored in its tables) and a set of
relations over those individuals (the tables themselves).  SQL, for
all its faults (which I have discussed with great disgust), has the
expressive power of first-order logic.  The SQL language can be used
in several ways:  

 1. For defining the tables and their formats.

 2. For populating them with individuals (i.e. DB updates).

 3. For asking queries, which corresponds to the process of evaluating
    the denotation of a formula in terms of the model.
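
A minimal sketch of these three uses, assuming Python's built-in sqlite3
module as the SQL engine.  The table name `employee`, its columns, and the
two individuals are hypothetical and chosen only for illustration:

```python
import sqlite3

# A throwaway in-memory database plays the role of the model.
conn = sqlite3.connect(":memory:")

# 1. Define a table and its format.
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, dept TEXT)")

# 2. Populate it with individuals (a DB update).
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("alice", "research"), ("bob", "sales")],
)

# 3. Ask a query: evaluate "x works in research" against the stored tables.
rows = conn.execute(
    "SELECT name FROM employee WHERE dept = ? ORDER BY name", ("research",)
).fetchall()
researchers = [name for (name,) in rows]
print(researchers)
```

The SQL strings are the language-like part; the tables held by `conn` are
the model-like part that the query is evaluated against.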

Please excuse me for being "pedantic" in explaining what you probably
already know in great detail.  But I need to go into the details in
order to show exactly what I mean when I say the AI languages are being
used in two very distinct ways.  They are sometimes being used like
SQL to make assertions and ask questions.  But at other times, they
are being used to construct what I call "models", which are of the
same nature as relational databases -- from an implementational point
of view.  But from a logical point of view, those "models" serve the
same purpose as Tarski's models when they are used to answer questions
(i.e. they are the structures in terms of which the system computes
the denotations of formulas).
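
To make "computing the denotation of a formula" concrete, here is a small
sketch of a truth evaluator over a finite structure.  The domain, the
relation `On`, and the nested-tuple encoding of formulas are all assumptions
made for this example, not any particular system's format:

```python
# A finite model: a domain of individuals plus named relations over it.
DOMAIN = {"a", "b", "c"}
RELATIONS = {"On": {("a", "b"), ("b", "c")}}


def holds(formula, env=None):
    """Return the truth value of `formula` in the model defined above.

    Formulas are nested tuples such as ("exists", "x", ("On", "x", "c")).
    `env` maps bound variable names to individuals in DOMAIN.
    """
    env = env or {}
    op = formula[0]
    if op == "not":
        return not holds(formula[1], env)
    if op == "and":
        return holds(formula[1], env) and holds(formula[2], env)
    if op in ("exists", "forall"):
        _, var, body = formula
        results = (holds(body, {**env, var: d}) for d in DOMAIN)
        return any(results) if op == "exists" else all(results)
    # Atomic formula: (relation_name, term, ...).  A term that is not a
    # bound variable is treated as a constant naming an individual.
    args = tuple(env.get(term, term) for term in formula[1:])
    return args in RELATIONS[op]


something_on_c = holds(("exists", "x", ("On", "x", "c")))
everything_on_something = holds(("forall", "x", ("exists", "y", ("On", "x", "y"))))
print(something_on_c, everything_on_something)
```

The quantifier cases simply enumerate the domain, which is possible only
because this model is finite and stored explicitly.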

> First, there is no reason why this should be computable! (I have referred 
> to this mistake in an earlier message, but you did not reply to the point.) 
> Model theory is not a theory of how a model of a formalism can be computed: 
> it only undertakes to specify how the truth conditions depend on the syntax. 
> When a 'model' is defined as a set D together with a set of relations, etc.,
> nothing is said or implied about whether that set could be computed.

Now I'm accusing you of long-windedly expostulating on the obvious.
I thought that we had agreed many notes ago that we weren't disagreeing
about any of Tarski's formal operations, nor about Cantor's formal
constructions.  When I used the phrase "formal and computable", I wasn't
suggesting that the two words were synonyms -- if I believed that, I 
would have used only one.  To repeat:  formal is a prerequisite for
computable, and computable is a prerequisite for being implementable
on a digital computer.  But there are formal things that are not
computable, and things that are computable in Turing's sense that
couldn't be implemented even on a computer that incorporated every
atom of the universe.

> Second, to say that the idealization must be a mathematical construction does 
> not mean that it is a construction made out of 'mathematics', where that is 
> some strange abstract kind of stuff.

No.  That's why I like to use the term "data structure".  It is a term
that everyone tuned in to these mailing lists understands, and it is
clearly distinct from the physical objects in the world outside the computer.

> And notice that I have, here, referred to the actual gas molecules. Len's 
> idealisation doesn't refer to them, but it is evidently possible to do so; so 
> why shouldn't another, less idealised, theory - for example, a theory about the 
> degree of idealisation of the first theory -  do so?

At the time that Boltzmann was formulating his "model", the atomic
hypothesis was widely accepted only in chemistry.  Ernst Mach fought
against it up to the end of the nineteenth century.  It wasn't until
Einstein's famous paper on Brownian motion that the last vestiges of
resistance to atoms disappeared.  Boltzmann committed suicide in
1906, partly because of depression caused by the widespread rejection
of his hypothesis.

My point is that a physical theory, like an engineering drawing,
starts out as a mathematical construction whose terms cannot with
any certainty be related to real world "things".  If the theory is
successful, we may come to believe in the existence of those "things",
but we have to be able to construct models from sets, data structures,
or other abstract stuff in both physics and engineering.

> We must be misunderstanding one another. Perhaps
> the basic misunderstanding has to do with 'theory'....

I consider a theory to be the deductive closure of a set of axioms.
I consider an axiom to be a proposition assumed as a hypothesis for
the purpose of exploring the implications of a theory.  The distinction
between "sentence" and "proposition" is another topic that could get us
into an endless round of notes.  I define "proposition" as an
equivalence class of sentences in one or more languages (as in an
earlier note that I sent in response to Len Schubert).  If you only
have one formal language and you choose identity to be your mapping,
then "sentence" and "proposition" are indistinguishable.  But if you
have two or more languages (e.g. CGs and KIF), then your definition
of proposition depends on the mapping you choose between the languages.
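
A small sketch of that definition, in which propositions are computed as
equivalence classes of sentences under a chosen mapping.  The three sentences
and the `TRANSLATION` table are hypothetical stand-ins for a real mapping
between KIF and CGs:

```python
from collections import defaultdict

# Hypothetical sentences in two notations (KIF-like and a linear CG-like form).
SENTENCES = [
    "(On cat mat)",
    "[Cat]->(On)->[Mat]",
    "(On dog rug)",
]

# Assumed mapping between the languages: each sentence gets a canonical key.
TRANSLATION = {
    "(On cat mat)": ("On", "cat", "mat"),
    "[Cat]->(On)->[Mat]": ("On", "cat", "mat"),
    "(On dog rug)": ("On", "dog", "rug"),
}


def propositions(sentences, mapping):
    """Group sentences into equivalence classes: same key, same proposition."""
    classes = defaultdict(set)
    for sentence in sentences:
        classes[mapping(sentence)].add(sentence)
    return {frozenset(members) for members in classes.values()}


# Identity mapping: every sentence is its own proposition.
by_identity = propositions(SENTENCES, lambda s: s)
# Translation mapping: the KIF-like and CG-like sentences collapse together.
by_translation = propositions(SENTENCES, TRANSLATION.__getitem__)
print(len(by_identity), len(by_translation))
```

Changing the mapping changes the classes, which is the sense in which the
definition of proposition depends on the mapping chosen.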

In any case, I don't believe that the concept of an agent (human or
computer) is necessary to define either "theory" or "model".

> Sowa:
>
> Krep <==> Model <--> World

No.  As I pointed out above, the languages used in AI are sometimes
used for language-like and sometimes for model-like purposes.  I would
say that KIF and CGs are both pure language-like kn. representations.
They can be used to talk about the world, but their mapping to the
world in a computer implementation is by means of a model-like scene
representation that would be used by a robot-driver:

   Krep (e.g. KIF or CGs) <--> Model <--> World

If you want to bring NL into the picture, it would be placed on the left:

  NL <--> Krep (e.g. KIF or CGs) <--> Model <--> World

> Hayes & Schubert (I resist the temptation to say 'AI'):
>
> NL <--> Krep <==> World

> where model theory is the double arrow, in each case.

I'm glad that you resisted the temptation to say 'AI', because
I am armed with a battery of quotations from Marvin Minsky to 
David Marr to refute that claim (not to mention the philosophers
like Barwise -- I loved your example where Barwise pointed to
things in his office and asked you to point out the sets).

> ... Krep is crucially related to inference. Since Krep can be 
> thought of as being a formalism for expressing content, thinking 
> is modelled by inference, and the computational properties of Krep 
> formalisms are routinely expressed in terms of what inferences are 
> permitted or facilitated. This is the Krep framework within which,
> for example, all of McCarthy's and Schubert's and my Krep work, CYC, 
> and most of the work described in the proceedings of the Krep meetings 
> is found. It is the tradition within which there have been vast arguments 
> about whether new Krep formalisms being developed were more expressive 
> than first-order logic, and within which such ideas as nonmonotonicity 
> have been developed and criticised. All of this work takes the Krep 
> formalism to be what is manipulated by the computer. And of course the 
> prime motivation for the entire knowledge-sharing effort also works 
> within this tradition, in which knowledge is expressed in a formalism 
> manipulated by machines, and correctness of intertranslation is 
> ultimately defined model-theoretically.

The only point in that paragraph with which I would quibble is the
phrase "thinking is modelled by inference".  I would say that some
human thinking is fairly well modelled by inference, but much of human
thinking is better modelled by mental manipulation of image-like or
model-like representations.  With the rest of your paragraph, I have
no disagreements whatever.

> I expect you will agree with all this, but still insist that models be 
> built only from unreal objects, perhaps referring us to your four-way split 
> again to justify this.

My four-way split is motivated primarily by my work with databases,
rather than my work with NL.  In DB theory, it is common to make a
very clear distinction between lexical object types (LOTs) and
non-lexical object types (NOLOTs).  They say explicitly that NOLOTs
like people and trees cannot be flattened out and stored on a disk;
instead, they must be represented in a database by "surrogates",
such as "tuple identifiers".  Those surrogates are very closely
related to the GENSYMs used in LISP to represent the external NOLOTs.
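
As a rough illustration of surrogates, the sketch below generates fresh
identifiers in the spirit of GENSYMs or tuple identifiers and uses them to
stand in for trees.  The helper `new_surrogate`, the table layout, and the
sample values are assumptions made for this example:

```python
import itertools

_counter = itertools.count(1)


def new_surrogate(prefix="G"):
    """Return a fresh identifier, loosely analogous to a LISP GENSYM."""
    return f"{prefix}{next(_counter)}"


# Lexical values (species names, heights) can be stored directly.  The trees
# themselves cannot, so each one is represented by a surrogate key.
oak = new_surrogate()
elm = new_surrogate()
tree_table = {
    oak: {"species": "oak", "height_m": 18},
    elm: {"species": "elm", "height_m": 12},
}

# A relation over individuals is a set of tuples of surrogates.
taller_than = {(oak, elm)}
print(oak, elm, (oak, elm) in taller_than)
```

Nothing in the stored data is a tree; the surrogate only marks the place
where an external individual would be attached.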

My claim is that the DB people have been making a distinction that
has proved to be very useful to them, and I believe would also be
very useful to AI.  I would agree that those AI researchers who have
never attached their systems to a robot manipulator have a tendency
to ignore or downplay the importance of separately thinking about
representing the language and representing the models.  But I will
further claim that those AI people who are working with robots make
a distinction between models in the machine, the real world situation,
and the kn. rep. languages.  I believe that if the Knowledge Sharing
Effort is going to link up with systems dealing with manufacturing,
database systems, and other commercial programming problems, it will
become increasingly important for them to make that distinction as well.

> ... But TMT, as you showed us very nicely a few messages ago,
> is completely agnostic about what its domains consist of. 

I cited Shoenfield's failure to mention what his "individuals"
happened to be.  But in every example in Shoenfield's book as well as
in all of Tarski's writings (at least his collected papers in the
book _Logic, Semantics, Metamathematics_ and his more popular
_Introduction to Logic_), they never used a single instance of anything
other than a number or a geometrical construction such as a point,
a line, or a sphere.  That tradition from Tarski to Shoenfield and
other mathematical logicians is what I mean by "Tarski's Model Theory".

I strongly object to your identifying (notice that I avoid the loaded
word "confusing") that body of work in mathematical logic with what
you call TMT, which permits the individuals to be identified with
NOLOTs without specifying the operational mechanisms by which that
identification is to be carried out.  Those philosophers who did want
to address that mapping between symbols and the world (e.g. Carnap,
Goodman, Quine, Montague, Barwise, and others) have all developed
highly "idiosyncratic" theories, to use your term.  The only people
who have used model theory without such an accompanying philosophy
are those who have been addressing the technical issues of the
formalism -- that includes most textbook writers, including our
friends Nilsson and Genesereth -- but did not want to get bogged
down in the numerous philosophical issues.

> I suspect that your central concern is a worry about realism....

I do believe that a clear distinction between the models of model theory
and the real world helps to clarify many philosophical issues.  But as I
pointed out above, such mundane matters as relational databases and robot
manipulators drive us to exactly the same distinctions.

> But in any case, whatever one's metaphysical position, most of your concerns 
> simply don't arise when we are considering most Krep problems....

But they do -- in databases, robot manipulators, and nonmonotonic reasoning.
I believe that a lot of the work in nonmonotonic logic is addressing the wrong
problem, and that it could be handled philosophically more cleanly
and computationally more efficiently by looking at the way that the models
relate to the world rather than by trying to change the rules of inference.
I'm not saying that the nonmon people are confused -- I'm just saying that
by not making a clear distinction between the languages and the models,
they are depriving themselves of some very useful techniques.

> ... Admit it, John: you are a constructivist!

As a mathematician, I am willing to accept some nonconstructive proofs.
But when I got religion (i.e. computer science), I put a very strong
emphasis on not just computable, but efficiently computable structures.
As I said many times in these notes, I'm willing to let people say
anything they please in CGs or KIF.  But I'd like to make the efficiently
computable stuff the path of least resistance.

> ... The ambiguity of interpretation emerges as
> the fact of there being a large number of ways of interpreting the theory
> over the domain of real colors, but does not mean that defining any such
> interpretation involves real-color theory.

I think this passage touches on our misunderstanding.  In fact, this
is one reason why I want to make the distinction between models and
reality.  If you give me a theory of colors (a collection of predicates
for colors and some axioms that relate them), I can construct a model
that I can represent on a computer without using any crayons from the
big Crayola box.  The model in the computer is made up of data structures
that I can query with the SQL language or with KIF or CGs.  But when I
want to relate that model to the real world, then I have to bring out
my Crayola box or find a physicist who specializes in optics and has some
instruments that can be hooked up to my computer representation.

When I use KIF, CGs, or SQL, I agree with you that I am talking
"about" colors.  But the link between those languages and my Crayola
box or the physicist's instruments is only indirect through the data
structures in my computer.  When I say "Green is between yellow and
blue on the spectrum", I am talking about colors, not about data
structures, Crayola boxes, or optical instruments.  But when the
truth of that statement is evaluated in my computer, it is
done either syntactically by proving a theorem or semantically by
checking the position of green, blue, and yellow in my data structures.
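
A minimal sketch of the semantic route only: the statement is checked
against positions stored in a data structure.  The list `SPECTRUM` and the
function `between` are hypothetical, and the ordering is a coarse
approximation rather than a physical measurement:

```python
# Hypothetical model: color names ordered by approximate spectral position.
SPECTRUM = ["red", "orange", "yellow", "green", "blue", "violet"]
POSITION = {color: index for index, color in enumerate(SPECTRUM)}


def between(x, a, b):
    """True if x lies strictly between a and b in the SPECTRUM ordering."""
    low, high = sorted((POSITION[a], POSITION[b]))
    return low < POSITION[x] < high


green_between = between("green", "yellow", "blue")
red_between = between("red", "yellow", "blue")
print(green_between, red_between)
```

The check consults only the stored positions; relating those positions to
crayons or to an optical instrument is a separate step outside this code.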

> Well, if it were, then we would expect to find inside robots TWO 
> 'representations': what we have been calling Krep, and data structures 
> similar to the environment, structured to be TMT models of the first 
> representations. Is this what you claim is done? 

Yes.  That is exactly what I have been claiming.

> (Something like this was done in some early robot experiments, 
> but largely because it was often expensive or impractical to run the physical 
> robots, or because -as in early Shakey planning - the robot did not yet exist, 
> so the world had to be simulated.)

It is still being done today in the latest and greatest robots.
They maintain an internal representation of the physical scene, which
is distinct from the language that is used to talk about the scene.
The language is still talking "about" the scene, but the internal
representation is very important for relating three different components
of the system:  the vision processor, the mechanical manipulator, and
the kn. representation language.

> ... [My collaborator] insists that I must
> not use the word 'reality' or 'the world', and I keep insisting that that
> is what I am talking about.

I'm happy to say that my language is about the world, and I'm happy to
let you do the same.  But I claim that my four-part distinction helps
me to formulate a theory of how language relates to (or "is about") the
world.  And I claim that such a distinction is useful for both
philosophical and computational reasons.

John