Skuce on Creating and Sharing Ontologies Robert Neches <neches@ISI.EDU>
Subject: Skuce on Creating and Sharing Ontologies
Date: Tue, 03 Mar 92 11:53:08 PST
From: Robert Neches <neches@ISI.EDU>
------- Forwarded Message
Date: 02 Mar 92 13:15:53 -0400
From: "doug skuce" <firstname.lastname@example.org>
To: email@example.com, interlingua
Subject: creating and sharing ontologies
How Shall We Discover Very General Ontologies and
How Shall We Ever Agree On Them?
Doug Skuce, Computer Science, Univ. of Ottawa (firstname.lastname@example.org)
This is just a quick note describing my ideas on how to attack the above
problems. (March 1, 1992)
Knowledge sharing will need agreement, assuming this is possible, on the
most general categories that ontologies can have, notions like: thing,
object, entity, property, attribute, event, process, state, situation,
collection, relation, etc., to name a few favorites. At the moment, everyone
uses these concepts and terms, but probably (a) in very different ways and
(b) without being able to tell anyone else what they mean by them. A
well-known example
would be the top levels in the CYC ontology, which I find difficult to
understand. (My review of CYC for the AI J discusses this at length. See
also my paper in the '90 Banff Workshop, coauthored with Ira Monarch.) I
believe this problem is very deep and critical to knowledge sharing, yet I
have not seen much discussion of it.
I believe that such notions can only be clarified by studying linguistic and
psychological data, first, say, for English, but ultimately seeking true
linguistic universals. At bottom, we are talking about word meanings. The
only AI ontology I know of based on linguistic research is the Penman
ontology based on Halliday's studies of English. It is also the best
documented one, with reasonable if minimal descriptions of each of some
fifty categories. (Contact Ed Hovy at ISI.) By contrast, there is no mention
in the CYC ontology of where its authors got their ideas. I would characterize
such idiosyncratic ontologies as "ad hoc", and feel we should not continue
to work in this manner.
Instead, there should be proposals, just like the knowledge sharing
proposals, that can be debated, iterated, and, hopefully, accepted by some
large community.
They should be based on linguistic data, preferably multilingual, so that,
say, Japanese speakers won't say "there's no notion of ___ in Japanese!!". I
have been working on such a proposal as a "background" task for a number of
years. The sources include "AI" ontologies like CYC's and Penman's, and
research such as:
1. George Miller's WordNet system. This has 40K English nouns and verbs
arranged in hierarchies and is on line.
2. Dixon's book, A New Approach to English Grammar, on Semantic Principles.
Clarendon Press, Oxford, 1991.
3. Rosch's work on psychological categories (approach this through George
Lakoff's book, Women, Fire, and Dangerous Things.)
By the end of the summer (92) I hope to have an initial proposal with maybe
25 categories (25 would be a big step!). But the next problem is: how to
communicate them? Existing attempts to do this in informal, unstructured
English leave a lot to be desired. At the moment, we have little choice but
to attempt to describe each in natural language, in my case, English. Now,
how should these descriptions be worded? (I would not want to call these
definitions, since they will still be pretty vague.) Should examples be used?
Should any formal notation be used? Should a restricted vocabulary (e.g. in
LDOCE) be used? How to deal with circularity (i.e. descriptions that are
mutually dependent)? In other words, we need also to agree on a "KIF" or
format for describing these very general categories. It will have to use
natural language, but I would suggest it should not be totally unstructured.
(Something along the lines of Mel'cuk's ECD for natural language
dictionaries would probably be useful.) Hence two proposals are needed, one
for the format, and then the actual set of descriptions of categories
themselves. I will try to do both, but even having an agreed format would be
a big step forward. I am using my CODE knowledge management system, a tool
that greatly facilitates the job.
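
To make the idea of a "not totally unstructured" format concrete, here is a
minimal sketch in Python. Everything in it is an assumption made for
illustration: the field names (name, description, examples, uses), the helper
find_circular, and the sample wordings are hypothetical and are not taken
from CODE, KIF, or Mel'cuk's ECD. Each category gets a small record holding
its natural-language description, and the helper reports descriptions that
are mutually dependent (the circularity problem raised above).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CategoryDescription:
    """One structured, still informal, description of a general category."""
    name: str
    description: str                                   # natural-language wording
    examples: List[str] = field(default_factory=list)  # optional illustrations
    uses: List[str] = field(default_factory=list)      # categories the wording relies on


def find_circular(entries: Dict[str, CategoryDescription]) -> List[List[str]]:
    """Return each dependency cycle found, as a list of category names."""
    cycles: List[List[str]] = []
    seen = set()  # node sets of cycles already reported

    def visit(name: str, path: List[str]) -> None:
        if name in path:
            cycle = path[path.index(name):]
            key = frozenset(cycle)
            if key not in seen:
                seen.add(key)
                cycles.append(cycle)
            return
        entry = entries.get(name)
        if entry is None:  # category mentioned but not yet described
            return
        for dep in entry.uses:
            visit(dep, path + [name])

    for name in entries:
        visit(name, [])
    return cycles


# Placeholder wordings only; these are NOT proposed descriptions.
SAMPLE = {
    "thing": CategoryDescription("thing", "Anything that exists."),
    "event": CategoryDescription(
        "event", "A change from one state to another.", uses=["state"]),
    "state": CategoryDescription(
        "state", "What holds of a thing between two events.",
        uses=["event", "thing"]),
}

if __name__ == "__main__":
    for cycle in find_circular(SAMPLE):
        print("circular:", " -> ".join(cycle))
```

A format like this does not remove circularity, but it makes mutually
dependent descriptions visible, so they can be debated rather than
discovered by accident.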
The key to finding the categories, I believe, is to identify some primitive
semantic properties or predicates that are necessary to identify the
essential nature of each category. Probably each has just one distinguishing
property, e.g. existence for things. I have in mind notions like existence
(in one or more of its senses), part-whole, grouping into collections, being
dependent on something else, natural numbers, equality, etc. Mathematics can
be built up from a few such primitive notions, so possibly more general
knowledge can also.
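
One way to picture "one distinguishing property per category" is the
following minimal Python sketch. It assumes, purely for illustration, that
categories form a hierarchy in which each category adds exactly one primitive
to those of its parent. Only the pairing of "thing" with "existence" comes
from this note; the other rows, the hierarchy itself, and the function names
are hypothetical.

```python
from collections import Counter

# Hypothetical table: category -> (parent category, the one primitive it adds).
# Only "thing" / "existence" is taken from the note; the rest is illustrative.
CATEGORIES = {
    "thing": (None, "existence"),
    "collection": ("thing", "grouping"),
    "dependent thing": ("thing", "dependence"),
}


def primitives_of(category):
    """Collect a category's primitives by walking up to the root, root first."""
    result = []
    while category is not None:
        parent, primitive = CATEGORIES[category]
        result.append(primitive)
        category = parent
    return list(reversed(result))


def shared_primitives():
    """Return primitives claimed as distinguishing by more than one category."""
    counts = Counter(primitive for _, primitive in CATEGORIES.values())
    return sorted(p for p, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(primitives_of("collection"))
    print(shared_primitives())
```

If shared_primitives() ever returns a non-empty list, two categories are
claiming the same distinguishing property, which would suggest that either
the categories or the primitives need to be revisited.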
Unfortunately, semanticists have been searching for these holy grails for
some time and have not turned up much, so it is possible that such an
enterprise is premature in 1992.
I would appreciate hearing from anyone who has similar interests. We need
more cooperation, and fewer competing ontologies.
Doug Skuce tel 613 564 5418
Dept of Computer Science fax 613 564 9486
University of Ottawa
Ottawa, Ont, Canada, K1N 6B5 email@example.com
------- End of Forwarded Message