Re: clarifying clarifying ontologies

hovy@isi.edu (Eduard Hovy)
Date: Tue, 8 Aug 1995 11:24:37 -0700
X-Sender: hovy@quark.isi.edu
Message-id: <v02120d80ac4cf87aac75@[128.9.208.191]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Doug Lenat <lenat@cyc.com>
From: hovy@isi.edu (Eduard Hovy)
Subject: Re: clarifying clarifying ontologies
Cc: pclark@cs.utexas.edu, cg@cs.umn.edu, srkb@cs.umbc.edu
Sender: owner-srkb@cs.umbc.edu
Precedence: bulk

>For the first time, Ed, I find nothing to disagree
>with in your message!   :-)
>
>Seriously, that last message was blank, please
>retransmit whatever it was you intended to send.
>--Doug


Gosh, total agreement would be terrible!  So here it is... 

At 4:12 PM 8/8/95, Peter Clark wrote:
>>> Taxonomies of "concepts" (or even "predicates") without axioms (or their 
>>> moral equivalent) that pin down their intended meaning are not at all 
>>> useful. [Ken Forbus]
>
>> This very strong statement might be true for qualitative physics, but it 
>> certainly isn't for NLP.  Most large NLP systems, parsers and generators, 
>> find it convenient to use taxonomies. The systems tend to need to know 
>> what general class of thing (syntactic or semantic, depending on the system)
>> a symbol belongs to, in order that it may be properly handled.  That's what 
>> a taxonomy provides. [Ed Hovy]
>
>Ed - 
>   I'm not sure if you're giving enough credit to your NLP systems. The
>fact is, they *do* have axioms pinning down the intended meaning of the 
>taxonomic symbols -- only those "axioms" are buried in the NL software which 
>uses the taxonomies, rather than explicitly listed (which is fine given their
>task). Consider: How do you know if you've put a symbol (ie. linguistic term
>such as "engine") in the wrong place in the taxonomy? Answer = the NL 
>software generates garbled/silly sentences. In other words, the 
>software has assigned the wrong (linguistic) meaning to the symbol. 
>The NL software defines *what it means*, in linguistic terms, to be a 
>two-place-relational-process (say) eg. that they have a domain and range,
>that they can be realized linguistically as <domain><relation><range>, etc.
>The symbols in the taxonomy are certainly not of the vacuous nature 
>which I think Ken Forbus was hinting at. I don't see any conflict 
>between NL work and Ken's position: NL systems use more knowledge than 
>just an isa-hierarchy too!
>

Thanks, Pete, for this comment.  Here are some of the axioms and other 
tricks commonly used for parsing and generation: 

- superconcept structure, which sometimes can be used to infer additional 
  aspects such as whether an object is decomposable (reflected in the 
  count vs mass noun distinction) 

- role-filler constraints (the Patient of an Eat is a type of Food) 

- reified relations: a relation can be used in "unreified" and "reified" 
  forms, as COLOR in: 
    (X15 / HORSE                  or     (X15 / HORSE)
         :color (Y12 / YELLOW))          (R16 / COLOR 
                                              :domain X15 
                                              :range (Y12 / YELLOW)) 
  The advantage of the reified form is that you can attach additional 
  info (say, modifiers, strengths of belief, etc.) to it.  This probably 
  counts merely as a notational variant, though. 

- many systems use PART-OF links (for example to help disambiguate the 
  possessive in "the politician's talk was full of lies" and "the 
  politician's ear was full of flies") 

- occasionally, systems use PERTAIN links (e.g., Metropolitan PERTAIN City) 
  as in our Japangloss Japanese-to-English MT system; not very common though 

- I can imagine Hebrew and Arabic NLP systems using number constraints on 
  fillers (certain nouns of paired objects like hands and eyes are marked 
  for singular, dual, and plural) 
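
As a rough illustration of three of the items above (superconcept 
structure, role-filler constraints, and reification), here is a minimal 
Python sketch.  It is not code from any of the systems mentioned: the 
concept names in ISA, the ROLE_CONSTRAINTS table, and the helper functions 
superconcepts, is_a, filler_ok, and reify are all assumptions made up for 
this example. 

```python
# Toy taxonomy, child -> parent. Every concept name here is an
# illustrative assumption, not taken from a real ontology.
ISA = {
    "APPLE": "FOOD",
    "BREAD": "FOOD",
    "FOOD": "OBJECT",
    "DUST": "SUBSTANCE",
    "SUBSTANCE": "OBJECT",
    "EAT": "PROCESS",
}

# Role-filler constraints: (process, role) -> class the filler must fall under.
ROLE_CONSTRAINTS = {("EAT", "patient"): "FOOD"}


def superconcepts(concept):
    """Return the concept followed by all of its ancestors, nearest first."""
    chain = [concept]
    while concept in ISA:
        concept = ISA[concept]
        chain.append(concept)
    return chain


def is_a(concept, ancestor):
    """True if ancestor is the concept itself or one of its superconcepts."""
    return ancestor in superconcepts(concept)


def filler_ok(process, role, filler):
    """True if no constraint applies, or the filler is under the required class."""
    required = ROLE_CONSTRAINTS.get((process, role))
    return required is None or is_a(filler, required)


def reify(instance, relation, value):
    """Turn an inline ':relation value' attachment into a separate relation
    instance with explicit domain and range, so more info can hang off it."""
    return {"type": relation.upper(), "domain": instance, "range": value}
```

With these toy tables, filler_ok("EAT", "patient", "APPLE") holds while 
filler_ok("EAT", "patient", "DUST") does not, and reify("X15", "color", 
"Y12") produces the COLOR record with :domain and :range shown in the 
reified form above. 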

These are the ones I can think of now.  They all help ensure the correct 
linguistic behavior, as Pete mentions.  Of course, all of them (and more) 
are needed to perform arbitrary inference when the core NLP system gets 
into trouble, but then, hopefully, there are Ken's and other domain-expert 
reasoners to fall back on!  

Our problem is collecting "superficial" knowledge like this at a large 
enough scale to be useful for NLP.  People are parsing dictionaries, 
performing statistical processing over large corpora of texts (to find, 
say, the correlation of Eats and types of Food vs Eats and Dust, etc.), 
and hiring students to enter features.  It's a big job.  
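
The corpus-statistics idea can be sketched as follows.  This is only one 
possible reading of "correlation": it scores verb/patient-class pairs by 
pointwise mutual information (PMI), which is a stand-in chosen for this 
example, not a method the message specifies.  The function name 
patient_pmi and the (verb, patient_class) input format are assumptions, 
and extracting such pairs from real text is the hard part that the sketch 
leaves out. 

```python
import math
from collections import Counter


def patient_pmi(pairs):
    """Score (verb, patient_class) pairs by pointwise mutual information.

    pairs: iterable of (verb, patient_class) observations, assumed to have
    been extracted from a parsed corpus by some other means.
    Returns a dict mapping each observed pair to its PMI in bits.
    """
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)
    verbs = Counter(v for v, _ in pairs)
    classes = Counter(c for _, c in pairs)
    return {
        (v, c): math.log2((k / n) / ((verbs[v] / n) * (classes[c] / n)))
        for (v, c), k in joint.items()
    }
```

A pair that co-occurs more often than chance (say, EAT with FOOD) gets a 
positive score, and a rarer-than-chance pair (EAT with DUST) a negative 
one; pairs never observed are simply absent from the result. 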

E


----------------------------------------------------------------------------
Eduard Hovy
email: hovy@isi.edu          USC Information Sciences Institute 
tel: 310-822-1511 ext 731    4676 Admiralty Way 
fax: 310-823-6714            Marina del Rey, CA 90292-6695 
project homepage: http://www.isi.edu/natural-language/nlp-at-isi.html