RE: pun in ontolingua KB

Tom Gruber <gruber@HPP.Stanford.EDU>
Date: Tue, 21 Jun 1994 11:38:14 -0700 (PDT)
From: Tom Gruber <gruber@HPP.Stanford.EDU>
Reply-To: gruber@HPP.Stanford.EDU
Subject: RE: pun in ontolingua KB
To: "Benjamin J. Kuipers" <kuipers@cs.utexas.edu>
Cc: Doug@SURYA.CYC-WEST.MCC.COM, ontolingua@HPP.Stanford.EDU, srkb@cs.umbc.edu
In-reply-to: Benjamin J. Kuipers's message of Tue, 21 Jun 1994 08:26:11 -0500: <199406211326.IAA14710@archimedes.cs.utexas.edu>
Message-id: <XLView.772225353.7590.gruber@hpp-ssc-1>
At  8:26 AM 6/21/94 -0500, Benjamin J. Kuipers wrote:
>Doug is right.  This is a (the?) critical issue in knowledge-sharing.
>
>Tom responds to my original comment:
>
>   > By itself, this is a minor bug in the documentation, and easily
>   > corrected, but ...
>   >
>   >   Q: Does the bug in the automatically-generated documentation reflect
>   >      a bug in the KB?
>
>   So this isn't a bug, it's a "feature" of our knowledge-free indexing trick
>   used on free text documentation.
>
>No, it *is* a bug, except that it is not an "easily fixable error".
>It's an inherent and hard-to-detect failure mode of the indexing
>strategy.

Fair enough.  If I turned off indexing of the documentation strings, then would
the bug in the documentation be gone?  That would indicate that we're giving 
semantic import to the links, and that never occurred to me.  To be a bit
more polemical, I would claim that this view says that a "documentation bug"
is indicated any time a natural language string is misinterpreted.  This is
interesting from a philosophical perspective.  And it has practical
implications as well:  perhaps we need a notion of community-debugging of
ontologies -- iteratively rewriting the specification until some large
majority of readers sees it the same way.  I'm only half kidding.
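
To make the indexing question concrete, here is a minimal sketch of a
knowledge-free indexer of the kind under discussion: it links any word in a
documentation string to a defined term with the same name, with no check on
the sense intended.  The term table, the function name, and the `enabled`
switch are hypothetical illustrations, not the actual Ontolingua code.

```python
import re

# Hypothetical term table (name -> one-line definition); not the real KB.
DEFINITIONS = {
    "length": "the number of items in a list",
    "list": "a finite sequence of items",
}

def index_doc_string(text, definitions, enabled=True):
    """Return (word, definition) links found by pure name matching."""
    if not enabled:
        # Indexing turned off: the text is unchanged, only the links vanish.
        return []
    links = []
    for word in re.findall(r"[a-z][a-z-]*", text.lower()):
        link = (word, definitions.get(word))
        if word in definitions and link not in links:
            links.append(link)
    return links

doc = "physical quantities such as the length of a beam"
links = index_doc_string(doc, DEFINITIONS)
# "length" is linked to the list-length definition, although the sentence
# is about beams: the matcher sees only the spelling, never the sense.
```

With `enabled=False` the same call returns no links at all, while the
sentence itself is untouched, which is the situation the question above
asks about.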

Let's take a closer look at the actual text in question.

"In engineering analysis, physical quantities such as the _length_ of a
beam or the velocity of a body are routinely modeled by variables in
equations with numbers as values.  While human engineers can interpret
these numbers as physical quantities by inferring dimension and units from
context, the representation of quantities as numbers leaves implicit other
relevant information about physical quantities in engineering models, such
as physical dimension and unit of measure..."

The word _length_ was linked to the definition of list length, which is clearly
not what was meant in the sentence.  Ben took the word length to mean what is 
spelled length-dimension in that ontology.  In the sentence quoted above, it is
(intended to be) referring to a scalar quantity that is the value of a function
from beams to quantities -- the "length" in question is a scalar quantity of 
dimension length-dimension.  It isn't the dimension, and it isn't the function 
from beams to length quantities.  All these distinctions are muddied by the 
natural language text, but are explicit in the formalization.  Ben's 
interpretation isn't wrong; he chose one of the three common ways in which 
words about quantities are used in engineering texts.  The fact that the words 
are ambiguous and used inconsistently in text, and that the distinctions matter
for sharing models, are the motivations for us writing the ontologies.
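
The three readings of "length" distinguished above can be sketched with
separate types.  This is only an illustration under my own assumptions: the
class names, the `beam_length` function, and the choice of "meter" as the
unit are hypothetical and are not the formalization used in the engineering
math ontologies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dimension:
    name: str

@dataclass(frozen=True)
class Quantity:
    magnitude: float
    unit: str
    dimension: Dimension

@dataclass(frozen=True)
class Beam:
    label: str
    span_in_meters: float

# Reading 1: the dimension itself (spelled length-dimension in the ontology).
LENGTH_DIMENSION = Dimension("length-dimension")

# Reading 2: the function from beams to length quantities.
def beam_length(beam: Beam) -> Quantity:
    return Quantity(beam.span_in_meters, "meter", LENGTH_DIMENSION)

# Reading 3: a scalar quantity of dimension length-dimension, i.e. the
# value of that function for one particular beam.
q = beam_length(Beam("b1", 3.0))
```

Here `q`, `beam_length`, and `LENGTH_DIMENSION` are three different kinds of
object, so a program cannot confuse them even though the English word for
all three is the same.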

>   Such mistakes in interpretation do NOT occur
>   for the formal part of the specification (e.g., the axioms and slot
>   values).
>
>Certainly mistakes can be introduced into the formal specification,
>either manually or by some flawed automatic transformation.  And
>certainly we need methods for checking for such mistakes.  I would bet
>that such checking cannot, in principle, be complete.

I agree that mistakes occur in the formal spec, but not of the sort we're
talking about here: misinterpretation of the denotation of a symbol due to
linguistic and cultural context.  In the formal expressions in that
ontology, if the term length (or length-dimension) were used, it would have
a single denotation [yes, yes, for a given Tarskian model...read on].

Nonetheless, the problem of getting agreement on meaning is by no means
solved by the use of formalism.  I would add a pile of money to Ben's bet
that there is no way to be complete in checking for errors in
specification, whether formal or informal.  Even if we could do
theorem-proving until hell freezes over, and write error-free translators,
and _generate_ natural language documentation strings from them using
error-free generators,  we are still hostage to interpretation errors for
all the terms that are primitives in the ontologies (words that are not
defined purely in terms of other words).

>So, we need (a) methods for checking as best we can, and (b) methods for
>continuing to function in spite of interpretation errors.

Absolutely.  I would conjecture that tools for (a) are worth doing and
moderately useful (ontolingua catches bugs for me all the time;
McAllester's Ontic does this sort of thing (and more) for mathematical
theories).  But for (b) we need a social process, because in the end,
ontologies are social contracts. 

tom


P.S.  For those interested in design criteria for ontologies, there is an
initial proposal in

http://ksl-web.stanford.edu/knowledge-sharing/papers/README.html#onto-design

and for their application in the engineering math ontologies,

http://ksl-web.stanford.edu/knowledge-sharing/papers/engmath.html

We also hope to be posting papers by Mark Fox and Michael Gruninger soon on the
topic of "competency questions" as a basis for validating ontologies.