**Mail folder:**SRKB Mail**Next message:**John McCarthy: "Re: INTERNATIONAL STANDARD FOR LOGIC: CSMF"**Previous message:**sowa: "Re: INTERNATIONAL STANDARD FOR LOGIC: CSMF"**In-reply-to:**David McAllester: "INTERNATIONAL STANDARD FOR LOGIC: CSMF"

From: Chris Menzel <cmenzel@kbssun1.tamu.edu> To: dam@ai.mit.edu Cc: tony@ontek.com, fritz@rodin.wustl.edu, jksharp@sandia.gov, msmith@vax2.cstp.umkc.edu, sharadg@atc.boeing.com, roger@ci.deere.com, sowa@turing.pacss.binghamton.edu, msingh@bcr.cc.bellcore.com, bhacker@nara.gov, scott@ontek.com, tony@ontek.com, skperez@mcimail.com, genesereth@cs.stanford.edu, duschka@cs.stanford.edu, E.Hunt@cgsmtp.comdt.uscg.mil, cg@cs.umn.edu, interlingua@ISI.EDU, srkb@cs.umbc.edu In-reply-to: <9409092106.AA03548@spock> (dam@ai.mit.edu) Subject: Re: INTERNATIONAL STANDARD FOR LOGIC: CSMF Message-id: <94Sep10.162039cdt.9769@kbssun1.tamu.edu> Date: Sat, 10 Sep 1994 16:20:31 -0500 Sender: owner-srkb@cs.umbc.edu Precedence: bulk

David McAllester writes: I have not generally taken part in interlingua discussions but have been generally concerned by the emphasis on first order semantics. It is well known, and stated in previous messages, that you can't even define the concept of "connected to" in a graph in first order logic. This has important PRACTICAL implications. Let Arc and Con be two binary relations and Phi(Arc, Con) be a formula which claims to state that Con(x,y) is true if and only if there is a path of arcs form x to y where Arc(x,y) means there is an arc from x to y. I challenge the supporters of a purely first order system to write this definition. When most people write Phi(Arc,Con) in first order logic they get something such that forall x P(x) ^ Arc(x,y) => P(y) plus Phi(Arc,Con) does NOT imply forall x P(x) ^ Con(x,y) => P(y). It can be done in first order ZF set theory (see below). Of course, as Professor McAllester points out, it will require notions like induction that are "naturally second order" in the sense that if you want definitions that are insusceptible to unintended interpretations, you'll need full second-order model theory. However, these notions are all definable in first-order ZF in the weaker sense noted, i.e., they pick out the right things in the real world of sets. (And it is enough, I think (though I would be happy to be convinced otherwise) that they *in fact* pick out the right things (we're describing the real world, aren't we?), even though there are models in which they don't.) In particular, the notions of finitude and natural number are definable in ZF in this sense, and the principle of mathematical induction is a theorem of ZF, in the following form: (0 \in x & (n)(n \in x -> n' \in x)) -> (n)(n \in x), where "n" is a sorted variable ranging over the natural numbers, i.e., the finite ordinals as defined in ZF a la von Neumann. (Cf. the texts by, e.g., Drake, Levy, Devlin, etc.) A concept language which can not define the concept of connected in a graph would seem to be of limited utility. Agreed. [...] The problem, in practice, is mathematical induction. Many truths can only be proved if we have some form of induction principle. Induction principles are naturally second order. First order induction schemes are HARDER to compute with than higher order principles invoked with higher order matching and unification. Higher order unification may be bad, but wait till you try to use a first order induction scheme! As noted, mathematical induction in ZF is a straight object language theorem, not a schema. Induction is a schema in PA just because it is a much weaker first-order theory. It only talks about numbers, not sets (except out of the corner of its mouth via open formulas). In summary, induction principles seem to be a necessity. Higher order induction principles seem more computationally tractable than first order induction schemes. I think that the concern that Professor McAllester expresses here, though very important, is not my concern. A computational reasoning system that uses a higher order language does not know whether it is being interpreted with a "real" higher-order model theory or a general model. (Indeed, it is not exactly clear what role model theory plays in a such a system beyond guiding the designers.) However, it may well be true that a higher-order *formalism* is, as Professor McAllester notes, computationally more tractable than a first-order formalism for the purpose of automated reasoning. Important as it is, this is a wholly distinct issue. I have no doubt, and have myself argued at length, that a higher-order *language* is vastly preferable to a first-order language for the representation of the logical forms of natural language. Mikhail Zeleny, in a typically informative posting, provides several examples in support of this claim. And the success of HOL no doubt provides strong evidence that the same is true with respect to automated reasoners. *BUT it is another question entirely whether a full blown higher-order semantics is required for any or all of these purposes.* (Indeed, as I understand it, and as John Sowa seems to have confirmed, the semantics behind HOL is explicitly *not* genuinely higher-order.) It is *this* claim I have been questioning, not the virtues of higher-order *formalisms*. These are two issues that must be kept distinct in the higher-order/ first-order debate. **Re the definability of connectedness in first-order ZF** I think the following will meet Professor McAllester's challenge. (And even if there is a flaw below, there is surely a way to do it along the same lines. Given, as Mikhail points out, the well known fact that all of classical mathematics is representable in ZF, this is not purported to be a deep result, but just an illustration of how a lot of "naturally higher-order" theorems have robust first-order counterparts in ZF. I imagine the higher order proof of McAllester's theorem looks pretty much the same as the one below.) Let "i" and "n" be sorted variables ranging over the natural numbers (i.e., finite ordinals as defined in ZF). First define a path within a graph g to be simply a function f on some initial segment of the numbers such that for all i on which f is defined (save the largest, if there is one), Arc(f(i),f(i+1)). (Or we could define a path as a sequence of arcs; macht nichts.) Again, note that f, despite playing the role of a function here, is just a certain kind of set (a set of pairs) and hence is a perfectly legitimate first-order object in set theory. Now define the auxiliary predicate C such that D1: C(x,y,n) =df (Ef)(path-in(f,g) & length(f)=n & f(1)=x & f(n)=y). That is, C(x,y,n) iff there is a path of length n from x to y. Now let p be the property P above (represented, as usual, as a set); I switch to lower case to emphasize the first-order character of things. In particular, I'll write "x \in p" instead of "P(x)". (But note that we *could* retain the higher-order syntax for aesthetic or computational reasons.) Recall that we are given (*) (x \in p & Arc(x,y)) -> y \in p. Lemma: (n)(x \in p & C(x,y,n) -> y \in p). Pf: By induction on n. For n=1, given (*), D1, and the definition of a path, the result is immediate. So suppose the lemma holds for n, and suppose x \in p & C(x,y,n+1), i.e., there is a path f from x to y of length n+1. Let z be y's predecessor in f. Then there is a path f'=<f(1),...,f(n)> of length n from x to z, so by the induction hypothesis, z \in p. But by the definition of a path, we have Arc(z,y), and so by (*), y \in p, as required, and the lemma is proved. Now, define Con such that D2: Con(x,y) =df (En)C(x,y,n). By first-order logic, the Lemma is equivalent to (**) (En)(x \in p & C(x,y,n)) -> y \in p, which is equivalent to (***) (x \in p & (En)C(x,y,n)) -> y \in p, which by D2 is what we want, viz., (****) (x \in p & Con(x,y)) -> y \in p. --Chris ================================================================= Christopher Menzel | Internet -> cmenzel@tamu.edu Philosophy, Texas A&M University | Phone ----> (409) 845-8764 College Station, TX 77843-4237 | Fax ------> (409) 845-0458 =================================================================