CYC: Using Common Sense Knowledge to Overcome Brittleness and Knowledge Acquisition Bottlenecks

Doug Lenat, Mayank Prakash, & Mary Shepherd
Microelectronics & Computer Technology Corporation, 9430 Research Boulevard, Austin, Texas 78759

Abstract

MCC's CYC project is the building, over the coming decade, of a large knowledge base (or KB) of real world facts and heuristics and, as a part of the KB itself, methods for efficiently reasoning over the KB. As the title of this article suggests, our hypothesis is that the two major limitations to building large intelligent programs might be overcome by using such a system. We briefly illustrate how common sense reasoning and analogy can widen the knowledge acquisition bottleneck. The next section ("How CYC Works") illustrates how those same two abilities can solve problems of the type that stymie current expert systems. We then report how the project is being conducted currently: its strategic philosophy, its tactical methodology, and a case study of how we are currently putting that into practice. We conclude with a discussion of the project's feasibility and timetable.

The major limitations in building large software have always been (a) its brittleness when confronted by problems that were not foreseen by its builders, and (b) the amount of manpower required. The recent history of expert systems, for example, highlights how constricting the brittleness and knowledge acquisition bottlenecks are. Moreover, standard software methodology (e.g., working from a detailed "spec") has proven of little use in AI, a field which by definition tackles ill-structured problems.

How can these bottlenecks be widened? Attractive, elegant answers have included machine learning, automatic programming, and natural language understanding. But decades of work on such systems (Green et al., 1974; Lenat et al., 1983; Lenat & Brown, 1984; Schank & Abelson, 1977) have convinced us that each of these approaches has difficulty "scaling up" for want of a substantial base of real world knowledge.

Making AI Programs More Flexible

[Expert systems'] performance in their specialized domains are often very impressive. Nevertheless, hardly any of them have certain common-sense knowledge and ability possessed by any non-feeble-minded human. This lack makes them "brittle." By this is meant that they are difficult to expand beyond the scope originally contemplated by their designers, and they usually do not recognize their own limitations. Many important applications will require commonsense abilities. . . . Common-sense facts and methods are only very partially understood today, and extending this understanding is the key problem facing artificial intelligence.
-John McCarthy, 1983, p. 129.

How do people flexibly cope with unexpected situations? As our specific "expert" knowledge fails to apply, we draw on increasingly more general knowledge. This general knowledge is less powerful, so we only fall back on it reluctantly. "General knowledge" can be broken down into a few types. First, there is real world factual knowledge, the sort found in an encyclopedia. Second, there is common sense, the sort of knowledge that an encyclopedia would assume the reader knew without being told (e.g., an object can't be in two places at once).

We would like to thank MCC and our colleagues there and elsewhere for their support and useful comments on this work. Special thanks are due to Woody Bledsoe, David Bridgeland, John Seely Brown, Al Clarkson, Kim Fairchild, Ed Feigenbaum, Mike Genesereth, Ken Haase, Alan Kay, Ben Kuipers, John McCarthy, John McDermott, Tom Mitchell, Nils Nilsson, Elaine Rich, and David Wallace.
AI Magazine Volume 6 Number 4 (1985) (© AAAI)
A third, important, immense, yet nonobvious source of general knowledge is one we rely on frequently: all of the specific knowledge we have, no matter how far-flung its "field" may be from our present problem. For example, if a doctor is stymied, one approach to deciding what to do next might be to view the situation as a kind of combat against the disease, and perhaps some suggestion that is the analogue of advice from that domain might be of use ("contain the infection," "give the infection some minor chances, as the risk is worth the information learned about its behavior," and so on). Unlike the first two kinds of general knowledge, which simply get found and used directly, this type of knowledge is found and used by analogy. In other words, the totality of our knowledge can, through analogy, be brought to bear on any particular situation we face, and that, after all, is what we mean by knowledge being "general." To perform this in "real time," we employ heuristics for prioritizing which analogies to consider first, and we may also use our brain's parallelism to good effect here.

Presenting an example of this sort of synergy that doesn't appear contrived is difficult. We refer the reader to Skemp (1971), Hadamard (1945), and Poincaré (1929), to name a few, who document cases of the use of detailed analogies to aid in solving difficult problems. In notes only recently analyzed and reported (Broad, 1985), Edison describes how most of his inventions started out as analogues of earlier ones; e.g., the motion picture camera started out looking like a phonograph and gradually evolved from there. Some evidence for the reliance on metaphor may come from the reader's own introspections: try to be aware in the next few days of how pervasively, and efficaciously, you employ metaphors to solve problems, to cope with situations, and in general to make decisions.
Usually these metaphors are not very "deep"; rather, we get power from them because of the immense breadth of knowledge from which we can choose them. Lakoff and Johnson (1980) go even further, arguing persuasively that almost all of our thinking is metaphorical. The test of this idea, solving problems by analogizing to far-flung specific knowledge, will be in the performance of the CYC system, once it has a large enough accumulation of specific knowledge. The CYC project is an attempt to tap into the same sources of power by providing a comprehensive skeleton of general knowledge (to use directly) plus a growing body of specific knowledge (from which to draw analogies).

Making It Easier to Add New Knowledge

Interestingly, the large real world knowledge base necessary to open the brittleness bottleneck also provides an answer to the other bottleneck problem, knowledge acquisition (KA). Let's see why this is so. How do people learn and understand new information? In a recent essay, Marvin Minsky (1984) points out that humans rarely learn "what"; we usually learn "which." In other words, we assimilate new information by finding similar things we already know about and recording the exceptions to that "analogy." This leads to amusing mistakes by children ("Will that Volkswagen grow up to be a big car?") and by adults (e.g., cargo cults), but these are just extreme cases of the mechanism that we use all the time to assimilate new information. In other words, we deal with novelty the best we can in terms of what we already know (or, more accurately, what we believe) about the world. Another way of saying this is that the more we know, the more we can learn. That is, without starting from a large initial foundation, it's difficult to learn.
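Minsky's mechanism, assimilating a new item by finding the most similar thing already known and then recording only the exceptions, is easy to caricature in a few lines of code. The sketch below is purely illustrative: the toy frames, the overlap-counting similarity score, and every function name are invented here, not drawn from CYC or any expert-system shell. (The same idiom reappears in expert-system building as "copy&edit.")

```python
# A toy version of "find the most similar known thing, copy it,
# record the exceptions." All names and data here are invented.

def similarity(frame_a, frame_b):
    """Crude score: number of slot/value pairs the two frames share."""
    return len(set(frame_a.items()) & set(frame_b.items()))

def copy_and_edit(kb, new_name, hint_frame, edits):
    """Copy the known frame most similar to a rough description,
    then apply the user's edits (the recorded exceptions)."""
    best = max(kb, key=lambda name: similarity(kb[name], hint_frame))
    new_frame = dict(kb[best])   # copy the closest existing chunk
    new_frame.update(edits)      # edit: store only the differences
    kb[new_name] = new_frame
    return best

kb = {
    "Robin":  {"isa": "Bird", "flies": True,  "habitat": "land"},
    "Salmon": {"isa": "Fish", "flies": False, "habitat": "water"},
}
# A child-style assimilation: a Grebe is "like a Robin, but aquatic."
source = copy_and_edit(
    kb, "Grebe",
    hint_frame={"isa": "Bird", "flies": True, "habitat": "water"},
    edits={"habitat": "water"})
```

The new Grebe frame inherits everything from the closest match (Robin) except the single recorded exception, which is exactly the point: the larger the KB, the closer the best match, and the fewer exceptions need typing in.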
We have to date seen no AI system which tries to do knowledge acquisition "from strength," from an initially given large, broad body of knowledge. If the knowledge base is large (and representative) enough, then adding a new piece of knowledge ought to be doable just by pointing to, and thereby connecting, a few existing pieces.

A very weak form of this process is in use today in building expert systems, namely the simple expedient of "copy&edit": As you build up your expert system's knowledge base, and you're about to enter the next chunk (rule, frame, script, . . .), it is often faster to find a similar chunk, copy it, and then edit that copy, than it is to enter (formulate and type in) that new chunk into the system from scratch. One limitation to current uses of copy&edit is that if the growing system only knows a couple hundred things (rules, frames, etc.), then those form the total universe of potential objects from which to do the copy. As the size of the knowledge base grows, it becomes increasingly likely that one can find a match that's close enough to result in a large savings of time, energy, and consistency. (The new piece of knowledge is more likely to preserve the existing semantics if it's been copied, rather than typed in from scratch.) CYC is aiming at tapping this source of power, gladly swapping the problem of "telling the system about x" for the problem of "finding an already known x' that's similar to x."

But there are two more powerful ways in which CYC may help the copy&edit process along. The first is to employ analogy to suggest or help the user choose particular existing frames from which to copy. The second is to use analogy to do some of the editing itself, automatically. The following section shows how these two new aids to knowledge acquisition would work.

Analogical Reasoning

Suppose one has begun to explicate the "medical treat-
Representing Knowledge in CYC

CYC's representation language is frame-based and is similar to RLL (Greiner and Lenat, 1980) and KRL (Bobrow and Winograd, 1977). One of the central principles in its design is that it be a part of the CYC KB itself. That should facilitate translating the system to other "underlying languages," and should allow CYC to apply its own knowledge and skills (question-answering and analogizing) to itself. We believe this is important if it is to monitor its own runtime behavior and enforce a consistent semantics on its builders and users.

To implement such a self-describing language, each kind of slot is given a full-fledged frame describing its semantics: What is its inverse? What kind of frames can legally have this slot? What kind of values can fill it? How can its value be found if none is currently there? When should such values get cached? What relations of various sorts does it participate in with other slots?

This last category of information (relations to other slots) is very important. It means that the set of known slot names forms a part of the CYC hierarchy of concepts. So a rule or script or slot definition that asks if Grebes is an element of Supergenuses can get an affirmative response even though there is no elementOf slot explicitly recorded on the Grebes frame, because there is a biotaxonomicLevel slot sitting there and because the frame for biotaxonomicLevel says that it is a specialization of elementOf.

Almost every step of our methodology loop entails adding new, specialized kinds of slots to the system, as needed, along with the other kinds of frames that get added. Let's consider a case where new kinds of slots were added to the system. "Grebes have small wings." That sounds easy to represent. Given a Grebes frame, we could add any of the following slot/value
pairs to it:

wingSize: Small
wings: (#--- size Small)
partsInc: (Wings Small)
partsDescription: (Wings (size Small))

Unfortunately, there are actually many different things the sentence could mean. What is it, exactly, that is small? Is it:

- The length of a Grebe's wing compared to the length of a standard meterstick? (actualValue)
- What we'd expect their wings' length to be (compared to a meterstick), knowing that Grebes are aquatic birds, and the actualValue of the length of most aquatic birds' wings? (expectedValue)
- What we'd expect their wings' length to be (compared to a meterstick), knowing the length of most other parts of a Grebe? (expectedPartValue)
- The ratio of the length of a Grebe's wing to the length of most aquatic birds' wings? (relativeMagnitude)
- The ratio of the length of a Grebe's wing to the mean length of its body parts? (relPartMagnitude)
- The ratio of Grebes' wing-to-body-size to the wing-to-body-size for typical aquatic birds? (I.e., the quotient of relPartMagnitude for Grebes' wings and relPartMagnitude for AquaticBirds' wings.) (relProportionPartsDescription)

In the case of our Grebes article, the sixth meaning was intended. Our language should and does permit each of these meanings to be represented in a separate but equally efficient manner. Essentially, there are six different relations being talked about, and we translate them into six separate kinds of slots. Each of them can exist for Grebes and can have many entries; in each case, one of the entries could be (Wings (size Small)). Each of the six slots can be considered a relation over four arguments (u i m v), where u is the name of a frame (in this case, "Grebes"), i is the name of a part (in this case "Wings"), m is the name of a measuring or comparing function (in this case "size"), and v is the purported value (in this case "Small").
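The earlier claim in this sidebar, that slot names themselves form part of the concept hierarchy, can be caricatured in a few lines. The dict-based frames, the get_slot lookup, and the single-level specializationOf chain below are simplifications invented here; only the slot names come from the text.

```python
# Slots as frames: a query about the general slot elementOf succeeds
# via a value stored under its specialization biotaxonomicLevel.
# (Dict-based toy; only the slot names come from the article.)

frames = {
    # ordinary frame: the value is recorded only on the specialized slot
    "Grebes": {"biotaxonomicLevel": "Supergenuses"},
    # slot frames: each kind of slot is itself a frame in the KB
    "biotaxonomicLevel": {"specializationOf": "elementOf"},
    "elementOf": {"inverse": "elements"},
}

def get_slot(frame_name, slot):
    """Fetch `slot`, falling back to any slot on the frame whose own
    frame declares it a specialization of the slot asked about."""
    frame = frames.get(frame_name, {})
    if slot in frame:
        return frame[slot]
    for other_slot, value in frame.items():
        if frames.get(other_slot, {}).get("specializationOf") == slot:
            return value
    return None
```

Here get_slot("Grebes", "elementOf") answers "Supergenuses" even though no elementOf entry is explicitly recorded on the Grebes frame, because the frame describing biotaxonomicLevel says it specializes elementOf.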
Each of the six slots has its own definition, a function that takes the first three arguments and returns the fourth one (the value v):

actualValue: (m (i u))
expectedValue: (m (i (basicKindOf u)))
expectedPartValue: (m (apply (basicKindOf i) u))
relativeMagnitude: (Quotient (actualValue u i m) (expectedValue u i m))
relPartMagnitude: (Quotient (actualValue u i m) (expectedPartValue u i m))
relProportionPartsDescription: (Quotient (relPartMagnitude u i m) (relPartMagnitude (basicKindOf u) i m))

One of our language design principles is to view ambiguity as a sign that we haven't divided up the world enough, or properly; hence the proliferation of related slots, above. Another design principle is to view any piece of Lisp code in our KB as a transient, an intermediate value which has been computed and then "cached" from some declarative, descriptive knowledge. For most slots, this means that their defn slots are to be thought of as virtual slots, which are computed from other slots. In the simplest case, these other slots are slotCombiner and builtFrom, and CYC applies the value stored in the slotCombiner slot to the value stored in the builtFrom slot. For instance, relativeMagnitude has a slotCombiner slot filled with the value CombineByRatioing and a builtFrom slot filled with the list (actualValue expectedValue).
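Three of the six definitions above transcribe almost one-for-one into runnable form, and the slotCombiner/builtFrom mechanism can be mimicked alongside them. The sketch below is in Python rather than the KB's Lisp, and the basicKindOf entries and all measurements are numbers invented purely to exercise the arithmetic.

```python
# Sketch of actualValue, expectedValue, and relativeMagnitude, plus
# the slotCombiner/builtFrom idea. Slot names follow the text; the
# data (basicKindOf entries, wing lengths) are invented.

basicKindOf = {"Grebes": "AquaticBirds"}

# the measuring function m: size(i, u) returns part i's length on frame u
lengths = {("Wings", "Grebes"): 12.0, ("Wings", "AquaticBirds"): 20.0}

def size(i, u):
    return lengths[(i, u)]

def actualValue(u, i, m):        # (m (i u))
    return m(i, u)

def expectedValue(u, i, m):      # (m (i (basicKindOf u)))
    return m(i, basicKindOf[u])

# A virtual slot: its definition is obtained by applying the value of
# its slotCombiner slot to the slots listed in its builtFrom slot.
def CombineByRatioing(numerator_slot, denominator_slot):
    return lambda u, i, m: numerator_slot(u, i, m) / denominator_slot(u, i, m)

relativeMagnitude = CombineByRatioing(actualValue, expectedValue)
```

With these toy numbers, relativeMagnitude("Grebes", "Wings", size) is 12.0 / 20.0 = 0.6: small wings, relative to aquatic birds generally, which is the sixth-sense-style reading the sidebar was after.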