CYC REPORT
Vaughan Pratt
Stanford Univ.
April 16, 1994
Revised April 19


This is a report of my visit to Cyc-West on Friday April 15, with
corrections made in response to one round of feedback from Guha.

By way of preparation for the demo, Doug Lenat sent me his recent paper
"Enabling Agents to Work Together," which I read and responded to as
follows.


Date: Sun, 10 Apr 94 22:32:01 PDT
From: Vaughan Pratt  
Message-Id: <9404110532.AA03997@Coraki.Stanford.EDU> 
To: doug@surya.cyc-west.mcc.com
Subject: Visit
Cc: pratt

Hi, Doug. I've now read the paper "Enabling Agents to Work Together" that
you sent me, and also your August 1990 CACM paper (which was hard to find
since it was cited in your EATWT paper as appearing in July).

The papers didn't include any output from a CYC demo, so I'm not entirely
clear as to what I should be expecting to see on Friday. Could you give me
an idea of what to expect? Will you just be demonstrating CYC doing
something under its own steam, or will there be an interactive session with
CYC? If interaction, will you just be invoking some of its subroutines to
demonstrate what they do, or will you be asking CYC questions? If
questions, can CYC be queried in English (CycNL?) at all, or only in a
formal language (CycL?). In either case, can CYC answer only prearranged
questions, or can it field new questions?

If CYC can handle new questions, in what domains might it reasonably be
expected to perform well? For example does it know about counting numbers,
arithmetic, or lists, and if so, up to what level? Does it know that the
world has meridians, latitudes, and poles? What does it know about travel,
e.g. miles, gallons, and miles per gallon, or the concept of distance
between two towns? Does it know how things move, such as that people can
get places by walking? What other forms of transport does it know about?
What does it know about things in the sky (sun, clouds, etc.), or the
weather? What does it know about humans, e.g. does it know they have
height, weight, organs, etc? Does it know anything about government, such
as needing lots of votes in order to get elected? And are there other areas
like these that CYC might be expected to handle reasonably well?

Also, how much reasoning ability does CYC have, approximately? For example
if I told it that a was bigger than b and b bigger than c, could CYC tell
on its own that a was bigger than c or would it need help? Examples of CYC
solving concrete problems that demonstrate the range of its current
reasoning abilities would help here.

I'm assuming the demo will work best if my expectations are well matched to
CYC's current capabilities. If I arrive expecting too much I may go away
disappointed, but if my expectations are set too low initially we may spend
too long on things that don't do justice to CYC's full range of abilities.

Best
Vaughan

===

I did not receive a reply.

===================
VISIT TO CYC-WEST, 4/15/94

I visited Ramanathan Guha and the Cyc-West staff, Srinija Srinivasan and
Rupert Brauch, at Cyc-West's offices on April 15 at 10 am for a
demonstration of the CYC system. We began with some demos on an 8 megabyte
Symbolics workstation.


CYC DEMO 1

Consistency check of relational databases from different sources.

(i) A spreadsheet (database, relation) labeled "Activity" from one source
was displayed. It contained a record in which the IDL, a fictitious
organization in the Middle East, attacks and destroys a Palestinian village
between 0010 and 0300 on 7/4/93. CYC inferred an inconsistency with a
record in another spreadsheet labeled "Organization" from another source
that contained a record claiming that IDL is a pacifist organization.
Inconsistencies were indicated by coloring the responsible cells of the
spreadsheet red.

(ii) A spreadsheet labeled "Organization" contained a record showing that
IBM believes in capitalism. A spreadsheet labeled "Personality" contained a
record showing the ideology of an IBM employee, James A. Cannavino, to be
communist. These entries were flagged as inconsistent. Manually changing
"communist" to blank removed the inconsistency, and CYC then guessed
"capitalist" as a replacement entry. I suggested changing IBM's belief to
blank instead, which also removed the inconsistency.

The database showed another inconsistency involving Cannavino's date of
birth, listed as 1953. This inconsistency turned out to be due to the
date-last-updated of this record being 1950, before he was even born.
Changing the date-last-updated to 1954 removed the inconsistency.

Guha feedback: ``There were really two functionalities being displayed here.
The first is of course the ability to detect inconsistencies. The second
(and possibly the more important of the two) was that of integration
of information across multiple structured information sources. The three
tables involved could have been created (i.e., not just the actual filled
tables but their schemas) by 3 different people, on different machines,
never having spoken to each other, etc., and Cyc would still have
automatically "lifted" the cells' entries and noticed the cross-table
contradictions. As a trivial example of this, you may have noticed that the
Personality table had a "beliefs" column that was closely tied in with the
Organization table's heading "ideology". The information from these tables
gets mapped into a "universal schema" (Cyc) and then, after inference,
translated back into the schemas of these tables.''

Guha feedback (cont'd): ``It is worth remarking that this data -- these
spreadsheets -- were prepared for us (i.e., not by us) by our DOD customer,
as a sanitized version of classified DB's. It is also worth reiterating
that no particular example of its behavior is "the big deal." Any single
cell value it highlighted as being suspicious, or set of cell values it
highlighted as being inconsistent, etc., could trivially be caught by an
expert system rule, finer typing and constraints on the DB's original
schema, etc. The point is, rather, to note the breadth of such constraints
which might prove useful in SOME case someday (and, to a lesser extent, to
note the shallowness of the searches involving such knowledge.)''
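
The cross-table check described in this demo can be sketched roughly as
follows. This is only an illustration in Python, not CYC's mechanism: the
table contents paraphrase the demo, while the column names, the mapping
into (predicate, subject, value) triples, the function names, and the two
hard-coded constraints are all assumptions made for the example.

```python
# Hypothetical sketch: lift rows from differently-shaped tables into one
# shared vocabulary, then test cross-table constraints on the lifted facts.

# Each source table uses its own column names (assumed for illustration).
organization = [{"name": "IBM", "ideology": "capitalism"},
                {"name": "IDL", "ideology": "pacifism"}]
personality = [{"person": "Cannavino", "employer": "IBM",
                "beliefs": "communism"}]
activity = [{"actor": "IDL", "action": "attack"}]

# Pairs of ideologies treated as contradictory (an assumed constraint).
INCOMPATIBLE = {frozenset({"capitalism", "communism"})}


def lift(organization, personality, activity):
    """Translate every row into (predicate, subject, value) triples."""
    facts = set()
    for row in organization:
        facts.add(("believes", row["name"], row["ideology"]))
    for row in personality:
        facts.add(("believes", row["person"], row["beliefs"]))
        facts.add(("worksFor", row["person"], row["employer"]))
    for row in activity:
        facts.add(("performs", row["actor"], row["action"]))
    return facts


def inconsistencies(facts):
    """Return (constraint-name, subject) pairs for each violated constraint."""
    beliefs = {s: v for (p, s, v) in facts if p == "believes"}
    found = []
    for (p, s, v) in sorted(facts):
        # A pacifist organization should not be recorded as attacking.
        if p == "performs" and v == "attack" and beliefs.get(s) == "pacifism":
            found.append(("pacifist-attacks", s))
        # An employee's ideology should not contradict the employer's.
        if (p == "worksFor" and s in beliefs and v in beliefs
                and frozenset({beliefs[s], beliefs[v]}) in INCOMPATIBLE):
            found.append(("employee-employer-ideology", s))
    return found
```

Blanking the "beliefs" entry for Cannavino (setting it to the empty string)
makes the second violation disappear in this sketch, mirroring what happened
in the demo when "communist" was changed to blank.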


CYC DEMO 2

Retrieving online images by caption. The intended customer has a huge image
library it currently can only access by Boolean combinations of keywords,
synonyms, and other less-than-CYC capabilities, and is interested in
something with CYC's capabilities.

The CYC demo was done with 20 images, each described by half a dozen CYC-L
axioms. The request "Someone relaxing" yielded one image, 3 men in
beachwear holding surfboards. CYC found this image by making a connection
between relaxing and previously entered attributes of the image.

This inference was made using the following reasoning shown in a window. (I
asked if we could just email the reasoning to me, but Guha said this would
require the customer's permission, so I copied down what was in the window
more or less verbatim. RA, X, G1 abbreviate longer gensym's, "allGenls"
means "Subset", "allInstanceOf" means "memberOf.")

1. (=> (logAnd (allInstanceOf RA RecreationalActivity)
               (allInstanceOf X SentientAnimal)
               (DoneBy RA X))
       (holdsIn RA (feelsEmotion X RelaxedEmotion Positive)))
2. (=> (performedBy X Y) (doneBy X Y))
3. (allInstanceOf G1 AdultMalePerson)
4. (allGenls Vertebrate SentientAnimal)
5. (allGenls Mammal Vertebrate)
6. (allGenls Primate Mammal)
7. (allGenls Person Primate)
8. (allGenls HumanAdult Person)
9. (allGenls AdultMalePerson HumanAdult)

These axioms plus certain of the half dozen or so properties typed in for
that photo permitted the inference that G1, one of the three surfers, was
relaxing.

Guha feedback: ``Plus some existing assertions about recreational
activities. If given a picture (and corresponding caption) of someone
actually surfing, rather than "standing, holding a surfboard", there would
be both pro- and con- arguments over relaxing, and the con- argument would
either dominate or at least tie.''
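
The chain of reasoning through axioms 1-9 can be sketched in Python as
below. The subset links and the G1 assertion are transcribed from the list
above; the activity "Act1" and its two caption facts are hypothetical
stand-ins, since the actual properties typed in for the photo were not
recorded. This is a sketch of the logic only, not of how CYC performs the
inference.

```python
# Axioms 4-9: direct subset links, read as (allGenls sub super).
GENLS = {
    "Vertebrate": ["SentientAnimal"],
    "Mammal": ["Vertebrate"],
    "Primate": ["Mammal"],
    "Person": ["Primate"],
    "HumanAdult": ["Person"],
    "AdultMalePerson": ["HumanAdult"],
}

# Axiom 3, plus one ASSUMED caption fact ("Act1" is a made-up name).
INSTANCE_OF = {"G1": "AdultMalePerson", "Act1": "RecreationalActivity"}

# ASSUMED caption fact: G1 performs Act1.
PERFORMED_BY = {("Act1", "G1")}


def supersets(cls):
    """All classes reachable from cls via subset links, including cls."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(GENLS.get(c, []))
    return seen


def all_instance_of(thing, cls):
    """Membership in cls, directly or through the subset chain."""
    return cls in supersets(INSTANCE_OF[thing])


def done_by(act, agent):
    """Axiom 2: performedBy implies doneBy."""
    return (act, agent) in PERFORMED_BY


def feels_relaxed(agent):
    """Axiom 1: a sentient animal doing a recreational activity is relaxed."""
    return any(
        all_instance_of(act, "RecreationalActivity")
        and all_instance_of(agent, "SentientAnimal")
        and done_by(act, agent)
        for act, _ in PERFORMED_BY
    )
```

Here feels_relaxed("G1") succeeds only because the six subset links let G1
be recognized as a SentientAnimal, which is the role axioms 4-9 play.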

Another photo showed a girl reclining on a beach. The request "find someone
at risk for skin cancer" turned up both the 3-surfer photo and this one.
The logic used here was that reclining at the beach implies suntanning and
suntanning promotes risk of skin cancer.

Guha said that CYC supports nonmonotonicity (exceptions, e.g. "unless you
are wearing sunblock"). I asked if the image database contained any
examples of nonmonotonicity; he replied that it didn't.

Guha correction: ``We can add the assertion that she is under a beach
umbrella, and then it WON'T find her image to the skin cancer query. Then
we can tell it that the umbrella is broken, has holes in it, etc., and her
picture will be back. Then we can tell the system that it's cloudy out, and
she'll be not found again. Etc.'' (Surprising such an example isn't already
featured in the demo, given the importance attached in AI to nonmonotonic
reasoning. -v)
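
The umbrella sequence in Guha's correction illustrates nonmonotonicity:
adding a fact can withdraw a conclusion, and adding another can restore it.
A minimal Python sketch follows, with the exceptions hand-coded in a fixed
order. The fact names and the function are assumptions for illustration;
CYC's own default-reasoning machinery was not shown and is surely more
general than this.

```python
def at_risk_of_skin_cancer(facts):
    """Default conclusion with hand-ordered exceptions (illustrative only)."""
    # Default: reclining at the beach implies sun exposure.
    risk = "reclining_at_beach" in facts
    # Exception: an umbrella blocks the sun, unless the umbrella is broken.
    if "under_umbrella" in facts and "umbrella_broken" not in facts:
        risk = False
    # Exception: cloudy weather blocks the sun regardless of the umbrella.
    if "cloudy" in facts:
        risk = False
    return risk


# Replaying the sequence from the correction, one added fact at a time.
facts = {"reclining_at_beach"}
steps = [at_risk_of_skin_cancer(facts)]
for new_fact in ["under_umbrella", "umbrella_broken", "cloudy"]:
    facts.add(new_fact)
    steps.append(at_risk_of_skin_cancer(facts))
```

The list steps alternates between found and not found as each fact is
added, which is exactly the behavior a monotonic logic cannot produce.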

I tried retrieving some of the other photos in this way. This worked for
two requests, but then I asked for "A tree", and it failed to find the
picture captioned "A girl with presents in front of a Christmas tree." We
then asked for "A Christmas tree" with no more luck. Apparently CYC-NL was
translating "Christmas tree" to "trimmed Christmas tree"; Guha tested
whether this was the problem by adding the adjective "trimmed" to the
information about the photo in the image database. It still didn't find the
picture, so we left this as an unresolved mystery.

Guha feedback: ``The system we were running was the experimental system
(i.e., the one to which code changes, etc. are being made) and not the
released one. These two problems [this and the one below about whether one
can drink bread] have since been fixed.''

This 20-image database is the only demo involving CYC-NL, CYC's natural
language component. Guha said that CYC-NL correctly parses 85% of two pages
worth of USA Today sentences, and gets the right semantics as well for 70%.
I asked if we could look at these parses but they were not available, being
in Austin. I expressed a strong interest in seeing these at some point in
the future.

Guha feedback: ``We are in the process of making CycNL usable for the more
general purpose of just browsing the KB and this should be available in a
couple of weeks. That should be of more interest than looking just at a
couple pages of static already-parsed sentences.''

An example of CYC-NL translating from English to CYC-L's internal language
was provided by the caption "A girl is on a white lounge chair" for an
image not previously entered (if I understood correctly). CYC-NL's
translation of this English sentence was

(LogAnd (mtImageDepicts GirlLoungingAtBeachImageMt
                        ChaiseLounge-1-G5055-365)
        (mtImageDepicts GirlLoungingAtBeachImageMt
                        FemaleChild-1-G5054-364)
        (InstanceOf FemaleChild-1-G5054-364 FemaleChild)
        (on-2 FemaleChild-1-G5054-364 ChaiseLounge-1-G5055-365)
        (allInstanceOf ChaiseLounge-1-G5055-365 ChaiseLounge)
        (colorOfObject ChaiseLounge-1-G5055-365 WhiteColor))

That is, the girl-lounging-at-beach image depicts two particular objects,
chaise-lounge-365 and female-child-364. The object female-child-364 is an
instance of a female child and is on chaise-lounge-365. (Guha feedback: in
CYC's sense 2 of on---CYC has dozens of senses of "on".) The object
chaise-lounge-365 is an instance of a chaise-lounge (Guha explained that
without the "all" in "InstanceOf," the containing class would be required
to be the minimal containing class) and is white-colored.

Guha feedback: ``To be precise, Cyc-NL translates the input which is then
processed further to take into account the context of the utterance, i.e.,
that the statement describes what is depicted in that image. The context,
in other words, is that of telling Cyc about images. So if I say "there's a
girl..." what I really mean is "The image explicitly depicts a girl..." and
Cyc gets this.''

Other axioms for this photo included "The girl is on a beach" and "The girl
is reclining." The request "someone relaxing" found this image by inferring
from the fact that she was reclining that she was relaxing.


OTHER ASPECTS OF CYC

CYC's knowledge is expressed as axioms. Guha said CYC currently has half a
million axioms. To date these have been put in manually. The CYC project
currently employs three computer scientists who work on the CYC system and
fifteen people from other walks of life who write CYC axioms.

Guha feedback: ``The number of axioms entered by hand was until recently
well over 2 million. The new smaller number is the result of serious
compaction, generalization, cleaning up of redundancies, etc. Our staff
comprises 22 individuals at present (19 FTEs); we hope to staff up to over
30 soon, assuming that is we "stay in business" at MCC.''

[I figure that if 15 people worked 250 days a year for six years putting in
half a million axioms, this would be 22 axioms per person per day. This
rate for declarative programming is better than twice the often-used figure
of 10 lines of code per day for imperative programs.]
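
The arithmetic behind this estimate, using only the figures quoted above,
can be spelled out as a short Python calculation (the variable names are
of course just labels for the example):

```python
axioms = 500_000          # half a million axioms
people = 15               # axiom writers
days_per_year = 250       # working days per year
years = 6

person_days = people * days_per_year * years   # 22,500 person-days
rate = axioms / person_days                    # about 22.2 axioms per day
ratio_to_code = rate / 10                      # versus 10 lines of code per day
```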

Guha feedback: ``This analysis is not very accurate for a few reasons: our
staff size has gone up and down, but for the first 5 years in particular we
had a much smaller staff. Also, the typical knowledge enterer will work on
a topic for several days, then enter several hundred axioms in a burst, in
a day.'' (So presumably the average rate is *considerably* better than
twice. Also imperative programming is surely at least as bursty. -v)

I wanted to know what CYC knew, and asked how we could find out whether it
knew certain things. I began by asking whether CYC knew that bread is food.
Guha asked this question in the form

(evaluate-term '(#%allGenls #%Bread #%Food))

and then

(evaluate-term '(#%allGenls #%Bread #%EdibleStuff))

and CYC returned True in each case. I then asked if CYC considered bread to
be drink. Guha typed

(evaluate-term '(#%allGenls #%Bread #%Drink))

which returned NIL, but Guha said that this merely indicated no knowledge.
To get positive information one needs positive data, so Guha added

(#%MutuallyDisjointWith #%Bread #%Drink)

to CYC's axioms. CYC was unable to infer from this that bread was nondrink.
Guha wasn't sure why, and after a bit of fiddling we dropped this question.
[Guha feedback: fixed.]
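
The distinction Guha drew between "no knowledge" and a definite "no" can
be illustrated with a three-valued query. The Python sketch below is an
assumption-laden toy, not CYC's evaluate-term: it returns True when the
subset relation is derivable, False when a disjointness assertion refutes
it (assuming the classes are nonempty), and None, playing the role of NIL,
when nothing is known either way. The subset links are assumed for the
example.

```python
# Assumed subset links; the report only says both queries returned True.
GENLS = {"Bread": {"Food", "EdibleStuff"}}

# Disjointness assertions, stored as unordered pairs. Initially empty.
DISJOINT = set()


def supersets(cls):
    """All classes reachable from cls via subset links, including cls."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(GENLS.get(c, set()))
    return seen


def evaluate_genls(sub, sup):
    """True if derivable, False if refuted by disjointness, None if unknown."""
    ups = supersets(sub)
    if sup in ups:
        return True
    # If any superset of sub (or sub itself) is disjoint from sup, then a
    # nonempty sub cannot be a subset of sup.
    if any(frozenset({a, sup}) in DISJOINT for a in ups):
        return False
    return None


before = evaluate_genls("Bread", "Drink")        # nothing known yet
DISJOINT.add(frozenset({"Bread", "Drink"}))      # the added axiom
after = evaluate_genls("Bread", "Drink")         # now refuted
```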

I wanted to know if CYC knew that people needed food. To find this out,
Guha asked CYC to show all axioms having "Dying" as a consequence. CYC
found hang gliding and touching toxic substances but not starvation or
anything related to food. Lots of axioms had "Eating" in their antecedent,
but we didn't run across any bearing on the *need* for food, though we did
run across many other items about food such as 8 ounces being the typical
amount of soup that one eats.

Guha feedback: ``Cyc does know that lack of food causes hunger. And it
knows that starvation is one way to cause death. It was missing the
definition of starvation, in effect. This is exactly the sort of debugging
involved in fleshing out the Cyc KB: get the answer to a question wrong,
and see what it's missing, and add it.''

I then asked what CYC knew about the earth. CYC didn't have anything
bearing on "Earth" under that name, but after some searching for axioms
that might be relevant, Guha turned up one that mentioned PlanetEarth,
which told us the name we should have used.

Did CYC know how big the earth was? CYC knew that PlanetEarth was bigger
than PlanetVenus, but such comparisons with other planets were all CYC knew
about the size of the earth. (To my surprise, no one else in my family knew
the diameter of the earth even approximately, but all three knew that Venus
was smaller, so at least on this detail CYC seems to be an excellent
reflector of human knowledge.)

I asked what CYC knew about the sky. Guha said that CYC knew that the earth
has sky (we didn't formulate a question testing this) but doesn't know what
color the sky is. CYC knows about air that the atmosphere has air as one of
its constituents, and that air contains oxygen, CO2, gaseous water, etc.,
though not the proportions.

Guha remarked at this point, if I understood him correctly, that 5-10% of
CYC's knowledge consisted of axioms that someone typed in that held in
Austin on a particular day.

Guha feedback correcting this: ``5-10% of the knowledge is of random
specific information (such as the people who work on the project, etc.) Of
course I should hope that 99% of it is true even in Austin, where many
folks do believe it or not have common sense, and think that bread is
edible, etc.''

Apropos of CYC's reasoning capability, Guha said that only a few of CYC's
axioms are flagged as forward-chaining, e.g. male implies masculine (i.e.
if you say someone is male then CYC immediately infers that he is also
masculine rather than waiting for "masculine" to enter the arena in some
other way).
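
The difference between a forward-chaining axiom and one used only on
demand can be sketched as follows. This is a toy Python knowledge base
written for illustration; the class, its methods, the name "Fred", and the
"hasGender" rule are hypothetical. Only the male-implies-masculine example
comes from the discussion above.

```python
class KB:
    """Toy knowledge base with unary facts and one-premise rules."""

    def __init__(self):
        self.facts = set()
        self.forward_rules = []    # fire as soon as a premise is asserted
        self.backward_rules = []   # consulted only when a query is asked

    def assert_fact(self, pred, x):
        if (pred, x) in self.facts:
            return
        self.facts.add((pred, x))
        # Forward chaining: store the consequences immediately.
        for premise, conclusion in self.forward_rules:
            if premise == pred:
                self.assert_fact(conclusion, x)

    def ask(self, pred, x, _seen=None):
        if (pred, x) in self.facts:
            return True
        seen = _seen or set()
        if pred in seen:           # guard against circular rules
            return False
        seen = seen | {pred}
        # Backward chaining: look for a rule that concludes pred.
        return any(
            conclusion == pred and self.ask(premise, x, seen)
            for premise, conclusion in self.backward_rules
        )


kb = KB()
kb.forward_rules.append(("male", "masculine"))    # example from the report
kb.backward_rules.append(("male", "hasGender"))   # hypothetical rule
kb.assert_fact("male", "Fred")
```

After the assertion, ("masculine", "Fred") is already stored among the
facts, whereas ("hasGender", "Fred") is not stored but can still be
derived when asked for.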

I then wanted to know what CYC knew about cars, e.g. their number of
wheels, range in miles, maximum velocity, etc. We found axioms indicating
that the typical cost of a car was $6K to $80K, but none that contained any
answers to my questions. Guha said that this remaining information would
later on be obtained from extant databases once CYC had the ability to read
them.

Guha feedback: ``Cyc does have the ability to read them already (as
displayed in Demo 1). We do not have the relevant databases however. See
the comment above about populating the KB with specific facts; that comment
goes DOUBLE for data which is best held in a DB.''

The demo ended at 12:30, having taken two and a half hours.

COURSE

CYC is taught in the Stanford course CS 321, "Representing Large Bodies of
Knowledge," which meets once a week. As of this week the meeting place will
be Cyc-West (corner of Page Mill and Foothill). It has an enrollment this
spring estimated by Guha at 8. Lenat is coteaching it with Guha. Each of
them will meet four times with the class during the term, for which each
makes two 8-day trips to Stanford from Austin. Lenat previously cotaught
this course (then numbered CS 309B) with Pat Hayes three years ago.

Guha feedback: ``Actually, this is the fourth time this course is being
taught. The first 2 times, it was taught by Doug and myself, the third
time, it was taught by Doug, myself and Pat Hayes.''

FURTHER QUESTIONS

I had intended to ask CYC many more questions along the lines of those in
my message above to Doug of April 10. However the rate of return on the few
we had time to ask, combined with the difficulty we had in finding our way
around the half million axioms even to find the neighborhood of where we
should be looking, discouraged me from continuing. It was clear that the
bulk of my questions were going to be well beyond CYC's present grasp.

Doug talked about CYC at a faculty luncheon on April 5, and I had formed
the impression from his enthusiastic description that CYC would be able to
answer a reasonable percentage of questions at this level of general
knowledge at least. From the success Doug reported with natural language I
had also expected that CYC's knowledge could be tested in English, allowing
an operator to reword the English as needed to match CYC's command of
English.

Guha feedback: ``Don't forget, our goal in building our NL front end is to
enable trained Cyc knowledge enterers to work faster. I hope that in future
years it also extends to allow Cyc-uninitiated folks to sit down and
converse with it, but that is not a high priority for us this year.''

The demo was very helpful in calibrating me on the level I should have been
testing CYC at. First, English was available for no CYC application other
than retrieving images from among the twenty images in the prototype image
database. Second, even when the questions are phrased in CYC's retrieval
language (e.g. "Is bread food?" became (evaluate-term '(#%allGenls #%Bread
#%Food))), or by associative search of CYC's half-million axioms, our main
retrieval mode during the demo, the demo made clear that my expectations
had been set way too high.

Guha feedback: ``The good thing about that, is that now your expectations
have been set so LOW that you will be astounded at the progress we seem to
have made the next time you take a look at it.'' (I will make myself
available for this occasion when it arises. -v)

After looking at the axioms it is clear to me that merely lowering
expectations is not sufficient in order to come up with a suite of
questions that CYC is likely to be able to answer say 20% of. The
distribution of CYC's half-million axioms in "knowledge space" showed no
discernible pattern that would allow me to construct such a suite, short of
simply picking particular axioms out of CYC's database and carefully
phrasing questions around those axioms. And even then our experiences with
"Is bread drink" and the Christmas tree indicated that CYC would still have
difficulty with a sizable percentage of the questions of a suite
constructed in this way.

Guha feedback: ``Our goal (at least for the next few years) for Cyc is not
a program that can simulate a child in its input/output behaviour. The goal
is more to create a common sense substrate for information retrieval based
on content. Also, I am not sure how to respond to your complaint about not
being able to discern a pattern in what Cyc knows. Getting a good grasp of
this is one of the hardest parts of our training people on the project and
easily takes a couple of months. I should also point out that the NL part
of the project has been around only for a year or so.''

Had my initial expectations been met, I would have continued with the
following questions, which I would have thought ranged in difficulty (for
computers) from very easy to difficult but by no means impossible.

Guha feedback: ``At least some of these questions can be posed to and
answered by Cyc in its current state. If you are interested, I can try an
experiment involving this. I'd be interested in hearing from you about any
other program with which you have had more luck in getting these questions
answered.'' (What other programs exist that claim to be as comprehensive in
their general knowledge as claimed for CYC? -v)

Try these yourself or on your kids to get some idea of how difficult you
think they are for people. Then estimate how long it will be before someone
writes a computer program that can answer say 50% of questions at this
general level of difficulty.

I tried them on my kids, and to my surprise they both enjoyed the whole
test as a low-stress off-the-wall pop quiz. Since they found the questions
so easy, they kept looking for trick aspects; here one might expect a
computer to do much better than a person in finding lots of "trick"
interpretations.

When you suspect that the answer is just a guess, it is fair game to ask
"Why?", bearing in mind that each successive "Why?" may be an order of
magnitude harder than its predecessor.


QUESTIONS

1. Identity
(a) What is your name?
(b) How big are you?
(c) How old are you?
(d) What do you cost?
(e) What is your address?

2. Counting
(a) Do you understand counting?
(b) What is the smallest counting number?
(c) What is the largest counting number?
(d) How high can you count?
(e) How long would that take?
(f) What other kinds of numbers are there besides counting numbers?

3. Comparison
(a) Do you understand comparison?
(b) If a is bigger than b and b is bigger than c, is a bigger than c?
(c) If a is better than b and b is better than c, is a better than c?
(d) If a is to the left of b and b is to the left of c, is a to the
left of c?
(e) If Tom is 3 inches taller than Dick, and Dick is 2 inches taller than
Harry, how much taller is Tom than Harry?
(f) Can Tom be taller than himself?
(g) Can Tom be shorter than Dick?
(h) Can a sister be taller than her brother?
(i) Can two siblings each be taller than the other?

4. Geography
(a) Where is the north pole?
(b) What is the top of the earth called?
(c) On a map, which compass direction is usually left?
(d) Which compass direction is usually up?
(e) How far north can one go? How far west?
(f) Is the equator between the north and south poles?
(g) What shape is the earth?
(h) Is the earth hollow?
(i) What shape is the equator?
(j) What shape is the north pole?
(k) Which is wetter, land or sea?
(l) How big is the earth?
(m) How far apart are the poles?
(n) What is the circumference of the equator?
(o) If a is to the west of b and b is to the west of c, is a
necessarily to the west of c?

5. Travel
(a) If Stanford is 50 miles from Berkeley, how far is Berkeley from
Stanford?
(b) How far is Stanford from Stanford?
(c) Can Stanford be 50 miles from Berkeley, Berkeley 50 miles from
Sacramento, and Stanford 1000 miles from Sacramento?
(d) Is it possible to travel 100 miles and end up where you started?
(e) While travelling at a steady speed in a straight line in space,
can you visit the same point twice? Three times?
(f) While travelling at a steady speed around a circle, can you visit
the same point twice? Three times?

6. Action
(a) Can people run? Swim? Fly?
(b) Can fish run? Swim? Fly?
(c) Can birds run? Swim? Fly?
(d) Do people run on land or in water? How about fish? Birds?
(e) Do people swim on land or in water? How about fish? Birds?
(f) Do people fly on land or in the air? How about fish? Birds?
(g) Which of legs, fins, and wings does one use to swim, run, and fly?
(h) Which is fastest, swimming, running, or flying? Which is slowest?
(i) If the door is closed, what must you do first before walking
through it?
(j) If the door is locked, what must you do first before opening it?
(k) If the key is in your pocket, what must you do first before unlocking
the door?
(l) What should I ask next?
(m) If it takes all your energy to open the door, can you then walk
through it?

7. Transport
(a) Which of cars, ships, and planes does one use to go by sea, land,
and air?
(b) Which is fastest, cars, ships, or planes? Which is slowest?
(c) Which of cars, ships, and planes can carry cargo? Which can
carry passengers?
(d) Which was invented first, cars, ships, or planes? Which last?
(e) Which of cars, ships, and planes travel in shipping lanes, on
roads, and in air lanes?
(f) Match up cars, ships, and planes to the navy, army, and air force.
(g) Is a jeep a car, a ship, or a plane? What about a yacht? A biplane?
An aircraft carrier? A truck?
(h) What fuels are normally used by cars, ships, and planes?
(i) Can cars be propelled by the wind? Ships? Planes?
(j) How do you communicate between two ships? Two planes? Two cars?
(k) Can a plane communicate with a ship? A plane with a car?
(l) Where are cars kept when not in use? Planes? Ships?
(m) What do you steer a car with? What about a plane? A ship?
(n) Must a car always stop when driving through a small town?
(o) Can a car drive backwards? Can a plane fly backwards? Can a
ship sail backwards?

8. Sky
(a) What color is the sky? Why?
(b) What color are clouds? Why?
(c) What is the name of the period during which the sun is down?
(d) What is the name of the period during which the sun is up?
(e) Approximately when does the sun rise?
(f) Give three names for the rising of the sun.
(g) About how long does the sun stay up?
(h) Approximately when does the sun set?
(i) Give three names for the setting of the sun.
(k) About how long does the sun stay down?
(l) What are the times of rising and setting of the moon called?
(m) How long does the moon stay up?
(n) When do the stars come out? Why?
(o) Which is brightest, the sun, the moon, or the stars? Which least?
(p) Can you see the stars when the sun is up?
(q) Can you see the moon when the sun is up?
(r) Does the sun go round the earth or vice versa?
(s) Does the moon go round the earth or vice versa?
(t) Which is closer to the earth, the stars or the planets?

9. Weather
(a) Is rain wet or dry?
(b) Is hail hard or soft?
(c) What color is rain?
(d) What color is hail?
(e) What color is snow?
(f) Which is harder, rain or hail?
(g) Which is warmer, rain or snow?
(h) Can rain cause lightning?
(i) Can rain cause floods?
(j) Can lightning cause floods?

10. People
(a) Can people walk? Run? Fly? Swim? Talk? Think?
(b) Name three external human organs and three internal.
(c) How long can people go without air? Water? Food? Sex? Money? Snow?
(d) How long does the average human live?
(e) Does the average female live more than a year longer than the
average male?
(f) How tall is the average human?
(g) Is the average male height within an inch of the average female height?
(h) How heavy is the average human?
(i) Is the average male weight within 5 lbs of the average female weight?
(j) Are people made up mainly of organic or inorganic material?
(k) Do people have more or less hair than polar bears? Birds? Pigs?
(l) About how many people fit comfortably in a car? A ship? A plane?
An elevator? An escalator? A turnstile?
(m) Do people talk daily about the weather? The color of rain?
(n) Can three people listen to the same person at the same time?
(o) Can three people talk to the same person at the same time?
(p) Can a person whistle while talking?

11. Addition
(a) Do you understand addition?
(b) What is 1 plus 1?
(c) What is 2 plus zero?
(d) Is the sum of two counting numbers always a counting number?
(e) Is every counting number the sum of two other counting numbers?
(f) Is addition commutative?
(g) Does addition satisfy any other laws?
(h) If I increment x y times, do I get the same as when I increment
y x times?
(i) Can any two numbers be added together?

12. Lists
(a) How long is the list of numbers from 1 to 10?
(b) What comes between the third and fifth members of a list?
(c) Do you know what it means to concatenate two lists?
(d) What do you get when you concatenate the list of numbers from 1
to 3 with itself?
(e) Is the concatenation of two lists always a list?
(f) Is concatenation associative?
(g) Is concatenation commutative?
(h) How long is the concatenation of two three-item lists?
(i) How long is the concatenation of two n-item lists?
(j) Can a list be circular?
(k) Is a list one-dimensional or two-dimensional?
(l) Is a one-dimensional array a list?
(m) Is the concatenation of a sorted list of 1-digit numbers with a
sorted list of 2-digit numbers sorted?
(n) If the first two members of a list are both 1, and thereafter
each member is the sum of the two members immediately preceding it, what is
the third member? The fourth? The hundredth?

13. Subtraction
(a) Do you understand subtraction?
(b) What is 5 minus 3?
(c) What is -3 plus 5?
(d) If I open a bank account with $100, deposit $5000 a week later,
and withdraw $3000 another week later, how much is in the account?
(e) Do banks permit withdrawals from an empty account?
(f) If I open a bank account with $100, withdraw $3000 a week later, and
then deposit $5000 another week later, how much is in the account?
(g) If I have $x in my bank account and withdraw $y, how much is left?
(h) Can one withdraw a negative amount? What would that mean?
(i) Can I make two consecutive deposits?
(j) Can I make any number of consecutive deposits?
(k) Can I make two consecutive withdrawals?
(l) Can I make any number of consecutive withdrawals? What if they
are all negative?

14. Government
(a) What is the name of the head of a monarchy? A dictatorship?
A republic? How is each typically chosen?
(b) How long does a major election take, seconds, months, or decades?
(c) Is an election candidate who gets 90% of the votes likely to
win his or her race? How about 80%? 20%? 10%?
(d) What are the two best known ways of getting votes, from among
soliciting, borrowing, stealing, and buying?
(e) Which of these two means should be used secretly?
(f) What media are available for soliciting votes?
(g) Do voters elect or get elected?
(h) How many times may one vote for a candidate in a race?
(i) Do lobbyists vote or lobby?
(j) Do political parties often have fewer than ten people?
(k) Which are there usually more of, parties, or candidates per party?
(l) Name four ways of leaving elected office.


A REFLECTION ON CYC

This report has concentrated on the facts, avoiding speculation about the
merits or otherwise of CYC. However some email discussion with Guha prompts
the following reflection.

The impression one gets in reading and hearing about CYC from its authors
is that CYC is well along the path to having comprehensive general
knowledge. What is lacking here is a quantitative measure of how far along.
As things stand right now there exists for example no way of telling
whether adding an English front-end enhances the rate at which CYC can
acquire general knowledge, since there is no way of measuring this rate. If
one goes by mere axiom count then the recent compression from 2 million
axioms to half a million would indicate a step backwards. Presumably this
is unlikely, but in the absence of an objective measure of progress towards
comprehensive general knowledge, how can it be demonstrated concretely that
the compression had a substantially more beneficial effect than would have
been achieved merely by removing the first 1.5M axioms?