
Reply-To: cg@cs.umn.edu
Date: Wed, 22 Sep 93 09:28:16 EDT
From: sowa <sowa@turing.pacss.binghamton.edu>
Message-id: <9309221328.AA11444@turing.pacss.binghamton.edu>
To: cg@cs.umn.edu, interlingua@isi.edu, sowa@turing.pacss.binghamton.edu
Subject: Re: Standardizing FOL
Cc: phayes@cs.uiuc.edu

Pat,

My summertime reading has strengthened my conviction that a typed or sorted FOL with metalevels is the way to go. I found the following book at a Barnes & Noble (totally uncharacteristic of them), and I bought it to give them some encouragement:

    T. Gergely and L. Ury, _First-Order Programming Theories_, Springer-Verlag, 1991.

A quote from the preface gives the general outlook:

    Many different approaches with different mathematical frameworks have been proposed as a basis for programming theory. They differ in the mathematical machinery they use.... These logics, however, are very eclectic since they use special entities to reflect a special world of programs, and also they are incomparable with one another.... This work reflects our journey from the eclectic programming logics to the simple classical first-order logic, which furthermore turned out to be more powerful than the nonclassical ones.

In the book, they develop logic and model theory for many-sorted FOL, use it to develop a theory of computability, and apply it to program schemas. Then in Part II, they consider extended dynamic logics, and in Part III, they consider temporal logic -- neither of which is expressible in the other. Then in Part IV, they develop time logic -- FOL with time as an explicit sort -- and they embed both temporal logic and dynamic logic in it.

I bought the next two books at the AAAI book exhibit (there wasn't much else but books there any more):

    K. Meinke & J. V. Tucker, eds., _Many-Sorted Logic and its Applications_, John Wiley & Sons, 1993.

This book contains a variety of papers on theoretical topics concerning many-sorted logic, proof procedures, etc. But the first chapter is a very nice 80-page tutorial that summarizes the standard results, relates sorted logic to other logics, and discusses the trade-offs.
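The central trade-off can be put in a few lines; the formulation below is my own gloss, not a quotation from the book:

```latex
% Fin (one standard SOL definition of finiteness, via Dedekind):
% every injective function on the domain is surjective.
\mathrm{Fin} \;\equiv\;
  \forall f\,\bigl[\forall x\,\forall y\,(f(x)=f(y)\rightarrow x=y)
  \;\rightarrow\; \forall y\,\exists x\,(f(x)=y)\bigr]

% lambda_n (plain FOL): there exist at least n individuals.
\lambda_n \;\equiv\;
  \exists x_1 \cdots \exists x_n \bigwedge_{1 \le i < j \le n} x_i \neq x_j

% Every finite subset of {Fin} U {lambda_n : n >= 1} is satisfiable,
% but the whole set has no model: a compactness failure for full SOL.
```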
In particular, the tutorial's author has a good treatment of second-order logic in its usual sense and the corresponding sorted version, where you introduce a special sort for functions of various kinds. The usual SOL is extremely powerful: you can, for example, define finiteness, say that the domain is uncountable, etc. But that gives you noncompactness: for each n, you could have a statement that there exist at least n individuals, while at the same time saying that the domain is finite. It also violates my favorite theorem -- the Loewenheim-Skolem theorem -- since you can say that the domain is uncountable and therefore cannot have a countable model.

The many-sorted mapping lets you say the same things, but the theory has all the nice properties: compactness, countable models, etc. You can give the definition of finiteness, if you like, but it doesn't work: there may exist infinite models for it. The reason for these differences is that true SOL lets you quantify over uncountably many functions -- of which at most countably many could ever be defined or even named. But the many-sorted version of SOL requires you to specify up front what your sorts are, and there is no way to specify an uncountable set.

The third book (which I also got at AAAI):

    Kurt Jensen, _Coloured Petri Nets_, Springer-Verlag, 1992.

This book presents an extended version of Petri nets (CPN) that has been implemented and used for some practical applications. The "colors" in CPN are arbitrary specifications that can be associated with each node and with the tokens that pass between nodes. In this book, they use ML, a purely functional programming language, to define the colors. Since ML is purely functional, it can be written in KIF or CG notation in the usual way. Its lack of global variables means that any ML code written inside a node of the Petri net will only operate on the tokens passed to it, without any possible side effects.
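To make the no-side-effects point concrete, here is a minimal sketch (in Python rather than ML, with invented names -- this is not Jensen's notation) of a CPN-style transition: a pure function from consumed tokens to produced tokens, so firing it can affect nothing but the marking it returns.

```python
from collections import Counter

# Places hold multisets of colored tokens; a transition is (input arcs,
# output places, action), where the action is a pure color expression.

def fire(marking, transition):
    """Fire a transition if enabled; return a NEW marking. The old
    marking is never mutated -- the purely functional discipline."""
    inputs, outputs, action = transition
    # Enabled only if every input place holds the required token.
    for place, token in inputs:
        if marking[place][token] < 1:
            return marking  # not enabled; marking unchanged
    new_marking = {p: Counter(c) for p, c in marking.items()}
    consumed = []
    for place, token in inputs:
        new_marking[place][token] -= 1
        consumed.append(token)
    # The action sees only the consumed tokens -- nothing global.
    for place, token in zip(outputs, action(*consumed)):
        new_marking[place][token] += 1
    return new_marking

# Example: a transition that pairs a job with a free machine.
marking = {"jobs": Counter({"j1": 1}),
           "machines": Counter({"m1": 1}),
           "busy": Counter()}
pair = ([("jobs", "j1"), ("machines", "m1")],   # input arcs
        ["busy"],                               # output places
        lambda job, mach: [(job, mach)])        # pure color expression
after = fire(marking, pair)
```

Because `fire` returns a fresh marking, the original marking is still intact afterwards -- exactly the property that makes the node code safe to embed in a net.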
The global network structure, however, gives you a very powerful and natural way to specify and visualize interactions of parallel processes. They have implemented graphic tools that let you draw CPNs, simulate them, and compile them into executable code (also in ML). And at the end of the book, they discuss several major (non-toy) applications in production use.

I like the CPN approach because the nets can be mapped directly to conceptual graphs with nodes of type EVENT and STATE, connected by relations of type Successor. The Gergely-Ury time logic can be used to axiomatize the interactions, and the CGs can be translated directly to KIF. I won't claim that this is the only or even the best way to do industrial-strength procedural programming in logic, but it is one way that is both easy to use and efficient. There may very well be many others that could be developed once KIF and CGs become a standard.

Some comments on your comments:

> I understand, but the meta-meta hierarchy you outline isnt the same
> one as the higher-order hierarchy (of functions of functions of..) that
> we were talking about.

True. But I would challenge you to find any real application that would require full HOL that cannot be done with either the many-sorted version of HOL or the metalevels of KIF and CGs.

> The context mechanism doesnt go far beyond first-order logic, by the
> way, in fact it never goes beyond first-order logic. All it amounts to,
> as far as I can see, is a kind of quotation; but that hardly extends
> the semantic base at all. (Perhaps I am missing something, however.)

True. It is purely first order. But what it does do is let you partition your knowledge base into modules. Outside the module, you can use metalanguage to talk about what's inside, axiomatize the insides, reason about them, etc. Furthermore, you can have families of contexts that are inconsistent with one another, and outside the modules you can talk about how they are related to one another.
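A toy illustration of that partitioning idea (my own sketch, not KIF syntax): sentences inside a context are quoted data rather than asserted facts, so two mutually inconsistent theories can sit in one knowledge base while the metalevel talks about the contexts themselves.

```python
# Contexts as modules (invented representation, not KIF).
# A sentence inside a context is quoted data: a tuple, not an assertion.
contexts = {
    "newtonian": [("forall", "x",
                   ("=", ("mass", "x"), "constant"))],
    "relativistic": [("forall", "x",
                      ("not", ("=", ("mass", "x"), "constant")))],
}

# Metalevel statements talk ABOUT the contexts without asserting their
# contents, so the two inconsistent theories above can coexist.
metalevel = [
    ("inconsistent-with", "newtonian", "relativistic"),
    ("approximates", "newtonian", "relativistic"),
]

def sentences_in(ctx):
    """Quotation: return the sentences of a context as inert data."""
    return contexts[ctx]

newtonian_sentences = sentences_in("newtonian")
```

Nothing here ever evaluates the quoted sentences, which is the whole point: the inconsistency between the two contexts never becomes an inconsistency of the knowledge base that contains them.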
> Maybe, but you can't define ANY kind. For example, it doesnt have the
> syntactic power to enable you to define lambda-conversion. And its not
> clear that this notion of 'define' always captures the semantics of the
> logics in question. (Although Im sure that it can be proved to do so
> in many cases.)

True. There may be cases where the logics are fully embedded, as in the treatment of temporal logic and dynamic logic by Gergely & Ury, and there may be cases where you have statements that "look the same" but have different semantics, as in the mapping from HOL to a many-sorted simulation of HOL. And yes, indeed, you can define lambda-conversion.

> Yes, but your scare quotes are quite appropriate here. Its not clear
> that this notion of 'true' is the one which a genuine semantic theory
> would provide. Suppose I have a language LP with a model theory defined
> on it, and suppose we can describe the syntax of LP and its model
> theory in L1: call this the lp-theory. Now, take a first-order model
> of the lp-theory: will this be suitably isomorphic to a model of LP?
> Will every model of LP be suitably isomorphic to a first-order model
> of lp-theory? Neither of these questions has an obvious answer.

I would say that the answer is intuitively obvious, although verifying the intuition might take some work. Let me ask you: how did you (or anyone else) define this language LP and its model theory? Standard practice in mathematical textbooks and research papers is to use some natural language as the metalanguage for defining the theory (or for defining the language that specifies the theory). That use of NL is essentially always a rather simple version of a sorted FOL, using NL words and syntax intermixed with variables and mathematical expressions. If you can specify your LP in such a way, you should be able to map it into the KIF-CG version of FOL with metalevels.

> No, you don't have them.
> I was imagining a 'context' syntax which would allow higher-order
> hierarchies to exist but with a limited range. The semantic difficulty
> with (fullblown) higher-order logic is ....

Yes, but. The quoting mechanism of KIF and CGs gives you the limited range you were asking for. Outside the context box, you can specify exactly what kinds of expressions and rules of inference you will permit inside the box. You can limit it in any way you like. Writing the specifications that define exactly the limits you want may not be easy, but it can be done -- "A simple matter of programming," as they say in AI.

> What you can make the logic talk about is one thing, but what that talk
> means is another. Until you provide a model theory, its not clear what
> these internal translations and descriptions amount to. (Now do you see
> why I make such a fuss about the importance of semantics? :-)

Yes, of course. But that is exactly why we have adopted FOL for KIF and CGs. We have a very nice, very simple, fully defined model theory. If you use our notations to define your logic inside one of our contexts, you are free to use our constructions to create your models, but then defining the model theory for your logic is your responsibility, not ours. We give you the Lego blocks and let you play with them in our context boxes. But if your construction collapses, our responsibility is limited to ensuring that your local disaster doesn't affect anybody else's contexts. KIF and CGs come with a limited warranty.

> ... any semantics, be it fuzzy or modal or whatever, thinks about
> a sentence as having some kind of property - truth, for example - which
> inference rules preserve somehow. So sets of sentences which are
> semantically interesting are always ones which are closed under the
> application of these rules. But we seem often to be dealing with sets
> of sentences which don't have this recursive-closure character. So how
> can they be characterised semantically?
> (Maybe this is not true of linear logics, and I confess to not yet
> understanding what the semantic base of linear logic could be.)

All that I am claiming is that if you can define your semantics in mathematical English, you can translate your English into CGs and use it to specify the semantics of what is inside a context box. There are a lot of interesting logics around, such as linear logic, which we haven't fully assimilated into current practice. That is a job for the next century. All that we are trying to do with KIF and CGs is to give the stamp of approval to the current state of the art and use it as a base for industrial-strength applications.

> I largely agree, defining theories is where the real work needs to be
> done. My worries were inspired by the thought that maybe some of our
> recurrent problems in doing this well are the result of not having the
> proper conceptual 'tools' (not the implementational stuff you are
> talking about, but the languages that underlie them.)

That is very possibly true. On the other hand, we can see very large application areas where the current technology is not going anywhere near the limits. Just good old-fashioned FOL is a major conceptual step for most programmers. What I have been trying to say again and again and again in all these notes is that we are not trying to develop standards for research, but standards for industrial-strength applications.

> If building your tools requires standardising my tools, then its too early
> to be standardising. But lets stop arguing about this: I agree, if we
> can save some wheel-reinventing, that wont be a bad thing. Lets just bear
> in mind that research hasnt finished in this area, and not give students
> the wrong impression.

Pat, we are not stopping you from building any kind of tools you want.
But what we want to do is to develop standards for commercial tools so that you can buy off-the-shelf parsers, theorem provers, etc., in the same kinds of stores where you can buy the latest spreadsheets, word processors, and megadeath games.

As for giving things to our students, it is a crying shame that the DB teachers can give their students SQL to play with, and the programming-language teachers can give their students all sorts of languages to use, but there are very few AI things that we can give our students for their homework exercises. The available expert-system shells are based on the technology of the 1970s, which the experience of the 1980s showed would not scale up to decent applications. I'm less worried about giving PhD students "the wrong impression" than I am about not having decent educational tools for the freshmen.

John