Wednesday, December 12, 2012

Adventures in hyperspace

The hypercube.  A geekly rite of passage, at least for geeks of a certain age.  The tesseract.   The four-dimensional cube.  Because what could be cooler than three dimensions?  Four dimensions!  Cue impassioned discussion over whether time is "the" fourth dimension, or such.

Cool concept, but can you visualize one, really "see" a hypercube in your mind's eye?  We can get hints, at least.  It's possible to draw or build a hypercube unfolded into ordinary three-dimensional space, just as you can unfold a regular cube flat into two-dimensional space.  Dali famously depicted such an unfolded 4-cube.  You can also depict the three-dimensional "shadow" of a 4-cube, and even -- using time as an extra dimension -- animate how that shadow would change as the 4-cube rotated in 4-space (images courtesy of Wikipedia, of course).

That's all well and good, but visualizing shadows is not the same as visualizing the real thing.  For example, imagine an L-shape of three equal-sized plain old 3D cubes.  Now another L.  Lay one of them flat and rotate the other so that it makes an upside-down L with one cube on the bottom and the other two arranged horizontally on the layer above it.  Fit the lower cube of that piece into the empty space inside the L of the first piece, so that each piece fills the notch of the other.

What shape have you made?  Depending on how natural such mental manipulation is for you and how clear my description was, you may be able to answer "A double-wide L" or something similar.  Even if such things make your head hurt, you probably had little trouble at least imagining the two individual pieces.
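
If mental rotation isn't your strong suit, you can cheat and do the exercise with coordinates.  Here's a minimal sketch in Python; the particular placement of the pieces is my own choice for illustration, not anything canonical:

```python
# Each piece is a set of (x, y, z) positions for its three unit cubes.
# The placement below is one way to arrange them per the description;
# any rotation or translation works the same way.
flat_l = {(0, 0, 0), (1, 0, 0), (0, 1, 0)}          # the L lying flat
upside_down_l = {(1, 1, 0), (1, 1, 1), (0, 1, 1)}   # one cube down, two on the layer above

assert flat_l.isdisjoint(upside_down_l)             # the pieces don't overlap

combined = flat_l | upside_down_l

# The union is a full 2x2 base with a 2x1 bar along one edge of the top
# layer -- the "double-wide L".
double_wide_l = {(x, y, 0) for x in (0, 1) for y in (0, 1)} | {(0, 1, 1), (1, 1, 1)}
assert combined == double_wide_l
print(sorted(combined))
```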

Now do the analogous thing with 4-cubes.  What would the analogue of an L-shape even be in 4-space?  How many pieces would we need? Two?  Three?  Four?  Very few people, I expect, could answer a four-dimensional version of the question above, or even coherently describe the process of fitting the pieces together.

Our brains are not abstract computing devices.  They are adapted to navigating a three-dimensional world which we perceive mainly (but not exclusively) by processing a two-dimensional visual projection of it.  Dealing with a four-dimensional structure is not a simple matter of allocating more mental space to accommodate the extra information.  It's a painstaking process of working through equations and trying to make sense of their results.

That's not to say we're totally incapable of comprehending 4-space.  We can reason about it to a certain extent.  People have even developed four-dimensional, and up to seven-dimensional (!), Rubik's Cubes using computer graphics.  It's not clear whether anyone has ever solved the seven-dimensional version, but the five-dimensional 3x3x3x3x3 cube definitely has been solved.

Even so, it's pretty clear that the solvers are not mentally rotating cube faces in four or five dimensions, but dealing with (a two-dimensional representation of) a three-dimensional collection of objects that move in prescribed, if complicated, ways.

From a mathematical point of view, on the other hand, dealing in four or five or more dimensions is just a matter of adding another variable.  Instead of (x,y) coordinates or (x,y,z), you have (w,x,y,z) or (v,w,x,y,z) coordinates and so forth.  Familiar formulas generally apply, with appropriate modifications.  For example, the distance between two points in 5-space is given by

d² = v² + w² + x² + y² + z²

if v, w, etc. are the distances in each of the dimensions.  This is just the result of applying the Pythagorean theorem repeatedly.
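
If it helps, the formula is the same no matter how many coordinates you feed it.  A minimal sketch in Python:

```python
import math

def distance(p, q):
    """Euclidean distance between two points with any (equal) number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Repeated Pythagorean theorem: sum the squared differences, then take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Works the same in 2, 3, 5 or 500 dimensions.
print(distance((0, 0), (3, 4)))                    # 5.0
print(distance((0, 0, 0, 0, 0), (1, 1, 1, 1, 1)))  # sqrt(5), about 2.236
```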

Abstractly, we can go much, much further.  There are 10-dimensional spaces, million-dimensional spaces, and so on for any finite number.  There are infinite-dimensional spaces.  There are uncountably infinite-dimensional spaces (I took a stab at explaining countability in this post).

Whatever intuition we may have in dealing with 3- or 4-space can break down completely when there are many dimensions.  For example, if you imagine a 3-dimensional landscape of hills and valleys, and a hiker who tries to get to the highest point by always going uphill when there is a chance to and never going downhill, it's easy to imagine that hiker stuck on the top of a small hill, unwilling to go back down, never reaching the high point.  If the number of dimensions is large, though, there will almost certainly be a path the hiker could take from any given point to the high point (glossing over what "high" would mean).  Finding it, of course, is another matter.
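
Here's a toy version of the stuck hiker, on a one-dimensional landscape of my own invention, just to make the greedy "always go uphill" strategy concrete:

```python
# Greedy hill-climbing on a toy landscape with a small local bump at x=2
# and the true peak at x=8.  (The landscape and step size are made up.)

def height(x):
    return -(x - 2) ** 2 if x < 5 else 20 - (x - 8) ** 2

def climb(x, step=1):
    while True:
        best = max((x - step, x, x + step), key=height)
        if best == x:          # no uphill neighbor: the hiker stops here
            return x
        x = best

print(climb(0))   # stops at 2 (height 0), the local bump
print(climb(6))   # reaches 8 (height 20), the true peak
```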

You can't even depend on things to follow a consistent trend as dimensions increase, as we can in the case of a path being more and more likely to exist as the number of dimensions increases.  A famous example is the problem of finding a differentiable structure on a sphere.

Since we can meaningfully define distance in any finite number of dimensions, it's easy to define a sphere as all points a given distance from a given center point (it's also possible to do likewise in infinite dimensions).  If you really want to know what a differentiable structure is, have fun finding out.  Suffice it to say that the concepts involved are not too hard to visualize in two or three dimensions.  Indeed, the whole field they belong to has a lot to do with making intuitive concepts like "smooth" and "connected" mathematically rigorous.   Even without knowing any of the theory (I've forgotten what little I knew years ago), it's not hard to see something odd is going on if I tell you there is:
  • exactly one way to define a differentiable structure on a 1-sphere (what most of us would call a circle)
  • likewise on a 2-sphere (what most of us would just call a sphere)
  • and the 3-sphere (what some would call the 3-dimensional surface of a hypersphere)
  • and the 5-sphere (never mind)
  • and the 6-sphere
Oh ... did I leave out the 4-sphere?  Surely there can only be one way for that one too, right?

Actually no one knows.  There is at least one.  There may be more.  There may even be an infinite number (countable or uncountable).

Fine.  Never mind that.  What happens after 6 dimensions?
  • there are 28 ways on a 7-sphere
  • 2 on an 8-sphere
  • 8 on a 9-sphere
  • 6 on a 10-sphere
  • 992 on an 11-sphere
  • exactly one on a 12-sphere
  • then 3, 2, 16256, 2, 16, 16, 523264, and 24 as we go up to 20 dimensions
See the pattern?  Neither do I, nor does anyone else as far as I know. [The pattern of small, small, small, big-and-(generally)-getting-bigger continues at least up to 64 dimensions, but the calculations become exceedingly hairy, and even the three-dimensional case required solving one of the great unsolved problems in mathematics (the Poincaré conjecture).  See here for more pointers, but be prepared to quickly be hip-deep in differential topology and such.]   In the similar question of differentiable structures on Euclidean space itself, there is essentially only one for any number of dimensions except four.  Four-dimensional Euclidean space carries uncountably many distinct differentiable structures.  So much for geometric intuition.

It's worth pondering to what extent we can really understand results like these.  Certainly not in the same way that we understand how simple machines work, or that if you try to put five marbles in four jars, at least one jar will have more than one marble in it.

Statements like "there are 992 differentiable structures on an 11-sphere" are purely formal statements, saying essentially that if you start with a given set of definitions and assumptions, there are 992 ways to solve a particular problem.  The proofs of such statements may use various structures that we can visualize,  but that's not the same as being able to visualize an 11-dimensional differentiable structure.  Even if we happen to be able to apply this result to something in our physical world, we're really just mechanically applying what the theorems say should happen in the real world.   Doing so doesn't give us a concrete understanding of an eleven-dimensional differentiable structure.  

That, we're just not cut out to do.  In fact, we most likely don't even visualize three complete dimensions.  We're fairly finely tuned to judging how big things are, how far away they are, what's behind what (including things we can't see at the moment) and what's moving in which direction and how fast, but we don't generally visualize things like the color of surfaces we can't see.  A truly three-dimensional mental model would include that, but ours doesn't.  Small wonder a hypercube is a mind-boggling structure, to say nothing of some of the oddities listed above.


Monday, November 19, 2012

If language isn't an instinct, what is it?

Steven Pinker's The Language Instinct makes the case that humans, and so far as we know only humans, have an innate ability to acquire language in the sense we generally understand it.  Further, Pinker asserts that using this ability does not require conscious effort.  A child living with a group of people will normally come to learn their language, regardless of whether the child's parents spoke that language, or what particular language it is.  This is not limited to spoken languages.  A child living among sign language users will acquire the local sign language.  There are, of course, people who are unable to learn languages, but they are the rare exceptions, just as there are people who are completely unable to see colors.

There is, on the other hand, no innate ability to speak any particular language. A child of Finnish-speaking parents will not spontaneously speak Finnish if there is no one around speaking Finnish, and the same can be said of any human language.

This is noteworthy, even if it might seem obvious, because human languages vary to an impressive degree.  Some have dozens of distinct sounds, some only a handful.  Some have rich systems of inflections, allowing a single word to take thousands of different forms.  Some (like English, and Mandarin even more so) have very little inflection.  Some have large vocabularies and some don't (though any language can easily extend its vocabulary).  The forms used to express common concepts like "is", or whether something happened yesterday, is happening now or might never happen, can be completely different in different languages.

At first glance this variation may seem completely arbitrary, but it isn't.  There are rules, even if our understanding of them is very incomplete.  There are no known languages where, say, repeating a word five times always means the opposite of repeating that word four times.  There's no reason in principle there couldn't be such a language, but there isn't one, and the probable reasons aren't hard to guess.

There's a more subtle point behind this: There is no one such thing as "communication", "signaling" or "language".  Rather, there are various frameworks for communication.  For example, "red means stop and green means go" is a system with two signals, each with a fixed meaning.  Generalizing this a bit, "define a fixed set of signs each with a fixed meaning" is a simple framework for communication.  A somewhat more complex framework would allow for defining new signs with fixed meanings -- start with "red means stop and green means go", but now add "yellow means caution".

Many animals communicate within one or the other of these frameworks.  Many, many species can recognize a specific set of signs.  Dogs and great apes, among others, can learn new signs.  Human language, though, requires a considerably more complex framework.  We pay attention not only to particular signs, but to the order in which they are communicated.  In English "dog bites man" is different from "man bites dog".  Even in languages with looser word order, order still matters.

Further, nearly all, if not all, human languages have a concept of "subordinate clause", that is, the ability to fold a sentence like "The boy is wearing a red shirt" into a sentence like "The boy who is wearing the red shirt kicked the ball."  These structures can nest deeply, apparently limited by the short-term memory of the speaker and listener and not by some inherent rule.  Thus we can understand sentences like I know you think I said he saw her tell him that.  As far as we can tell, no other animal can do this sort of thing.

This is not to say that communication in other animals is simple.  Chimpanzee gestures, for example, are quite elaborate, and we're only beginning to understand how dolphins and other cetaceans communicate.  Nonetheless, there is reasonable evidence that we're not missing anything on the order of human language.   It's possible in principle that, say, the squeaks of mice are carrying elaborate messages we haven't yet learned how to decode, but mice don't show signs of behavior that can only be explained by a sophisticated signaling system.   Similarly, studies of dolphin whistles suggest that their structure is fundamentally less complex than human language -- though dolphins are able to understand ordered sets of commands.

In short, human languages are built on a framework unique to us, and we have an innate, species-universal, automatic ability to learn and use human languages within that framework.  Thus the title The Language Instinct.   Strictly speaking The Instinct to Acquire and Use Language would be more precise, but speaking strictly generally doesn't sell as many books.


This all seems quite persuasive, especially as Pinker puts it forth, but primatologist and developmental psychologist Michael Tomasello argues otherwise in his review of Pinker, straightforwardly titled Language is not an Instinct  (Pinker's book titles seem to invite this sort of response).  Tomasello is highly respected in his fields and knows a great deal about how human and non-human minds work.  I cited him as an authority in a previous post on theories of mind, for example.  Linguistics is not his area of specialization, but he is clearly more than casually familiar with the literature of the field.

Tomasello agrees that people everywhere develop languages, and that human languages are distinct from other animal communication systems, albeit perhaps not quite so distinct as we would like to think.  However, he argues that there does not need to be any language-specific capability in our genes in order for this to be so.  Instead, the simplest explanation is that language falls out as a natural consequence of other abilities, such as the ability to reason in terms of objects, actions and predicates.

To this end, he cites Elizabeth Bates' analogy that, while humans eat mostly with their hands, this does not mean there is an innate eating-with-hands capability.  People need to eat, eating involves moving food around and our hands are our tool of choice for moving things around in general.  Just because everyone does it doesn't mean that there is a particular instinct for it.  Similarly, no other species is known to cook food, but cooking food is clearly something we learn, not something innate.  Just because only we do it doesn't mean that we have a particular instinct for it.

This is a perfectly good point about logical necessity.  If all we know is that language is universal to humans and specific to humans, we can't conclude that there is a particular instinct for it.  But Tomasello goes further to assert that, even when you dig into the full evidence regarding human language, not only is there no reason to believe that there is a particular language instinct, but language is better explained as a result of other instincts we do have.


So how would we pick between these views?  Tomasello's review becomes somewhat unhelpful here.  First, it veers into criticism of Pinker personally, and of linguists of his school of thought in general, as being unreceptive to contrary views, prone to asserting their views as "correct" and "scientific" when other supportable views exist, and overly attached to the specialized jargon of their field.  A certain amount of this seems valid.  Pinker is skilled in debate, a useful skill that can cut both ways, and this can give an air of certainty regardless of how certain things actually are.  There is also mention of the famed linguistic pioneer and polemicist Noam Chomsky, from whose school of thought Pinker comes, but Pinker's views on cognition and language are not necessarily those of Chomsky.

Second, and one would have to assume as a result of the first point, the review takes on what looks suspiciously like a strawman.   In Tomasello's view Pinker, and those claiming a "language instinct" that is more than the natural result of human cognition and the general animal ability to signal, are generally concerned with mathematical elegance, and in particular the concept of generative grammar.

Generative grammar breaks language down into sentences which are in turn composed of a noun phrase and a verb phrase, which may in turn be composed of smaller parts in an orderly pattern of nesting.  This is basically the kind of sentence diagramming you may have learned in school [when I wrote this I didn't realize that there are several ways people are taught to analyze sentences, so I wrote "you learned in school", assuming everyone had had the same experience.  But of course there are several ways, and in some schemes the results look more like dependency graphs than parse trees, which sent me down a fairly eye-opening path.  So, sorry about that, but at least I ended up learning something].

Linguistic theories along these lines generally add to this some notion of "movement rules" that allow us to convert, say, The man read the book into The book was read by the man.  Such systems are generally referred to as transformational generative grammars, to emphasize the role of the movement rules, but I'll go with Tomasello here and drop the "transformational" part.  Keep in mind, though, that if a field is "basically" built on some familiar concept, that's just the tip of the iceberg.

A generative grammar, by itself, is purely syntactic.  If you call flurb a noun and veem a verb, then "Flurbs veem." is a grammatically correct sentence (at least according to English grammar) regardless of what, if anything, flurb and veem might actually mean.  Likewise, you can transform Flurbs veem into Veeming is done by flurbs and other such forms purely by moving grammatical parts around.
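
To see just how little meaning is involved, here's a toy generative grammar in Python -- my own throwaway sketch, not anything from the linguistics literature -- that produces "flurbs veem" as happily as it produces ordinary English:

```python
import random

# A tiny toy generative grammar: a sentence is a noun phrase plus a verb
# phrase, expanded recursively.  The grammar doesn't know or care what the
# words mean -- "flurbs veem" comes out just as readily as "dogs bite men".
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["N"], ["Det", "N"]],
    "VP":  [["V"], ["V", "NP"]],
    "Det": [["the"]],
    "N":   [["flurbs"], ["dogs"], ["men"]],
    "V":   [["veem"], ["bite"]],
}

def generate(symbol="S"):
    if symbol not in GRAMMAR:            # a terminal: an actual word
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])
    return [word for part in expansion for word in generate(part)]

for _ in range(3):
    print(" ".join(generate()))          # e.g. "flurbs veem", "the dogs bite men"
```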

Tomasello questions whether the structures predicted by generative grammar even exist in all languages.  Generative grammar did happen to work well when first applied to English, but that's to be expected.  The techniques behind it, which come from Latin "grammar school" grammar by way of computing theory, were developed to analyze European languages, of which English is one.  Likewise, much of the early work in generative grammar was focused on a handful of the world's thousands of languages, though not necessarily only European ones.  There is an obvious danger in such situations that someone familiar with generative grammar will tend to find signs of it whether it is there or not.  If all you have is a hammer, the whole world looks like a nail.

From what I know of the evidence though, all known languages display structures that can be analyzed reasonably well in traditional generative grammar terms.  Tomasello asserts, for example, that Lakota (spoken by tribes in the Dakotas and thereabouts) has "no coherent verb phrase".   A linguist whom I consulted, who is familiar with the language, tells me this is simply not true.  The Jesuit Eugene Buechel was apparently also unaware of this when he wrote A Grammar of Lakota in 1939.

But perhaps we're a bit off in the weeds at this point.  What we really have here, I believe, is a set of interrelated assertions:
  • Human language is unique and universal to humans.  This is not in dispute.
  • Humans acquire language naturally, independent of the language.  Also not in dispute.
  • Human languages vary significantly.  Again, not in dispute.
  • Human language is closely related to human cognition.  This is one of Tomasello's main points, but I doubt that Pinker would dispute it, even though Tomasello seems to think so.
  • Generative grammar predicts structures that are actually seen in all known languages.  Tomasello disputes this while Pinker asserts it.  I think Pinker has the better case.
  • Generative grammar describes the actual mechanisms of human language.
That last is subtly different from the one before it.  Just because we see noun phrases and verb phrases, and the same sentence can be expressed in different forms, doesn't mean that the mind actually generates parse trees (the mathematical equivalent of sentence diagrams) or that in order to produce "The book was read by the man" the mind first produces "The man read the book" and then transforms it into passive voice.  To draw an analogy, computer animators have models that can generate realistic-looking plants and animals, but no one claims that this explains how plants and animals actually develop.

Personally, I've never been convinced that generative grammars are fundamental to language.  Attempts to write language-processing software based on this theory have ended in tears, which is not a good sign.  Generative grammar is an extremely good fit for computers.  Computer languages are in fact based on a tamer version of it, and the same concepts turn up repeatedly elsewhere in computer science.  If it were also a good fit for natural languages, natural language processing ought to be considerably further along than it is.  There have been significant advances in language processing, but they don't look particularly like pure generative grammar rendered in code.  Peter Norvig has a nice critique of this.

Be that as it may, I don't see that any of this has much bearing on the larger points:
  • Human language has features that are distinct from other human cognitive functions.
  • These features (or some of them) are instinctive.
In putting forth an alternative to generative grammar, drawn from work elsewhere in the linguistic community, Tomasello appears to agree on the second point, if not the first.  In the alternative view, humans have a number of cognitive abilities, such as the ability to form categories, to distinguish objects, actions and actors and to define a focus of attention.  There is evolutionary value in being able to communicate, and a basic constraint that communication consists of signals laid out sequentially in time (understanding that there can be multiple channels of communication, for example saying "yes" while nodding one's head).

In this view, there are only four basic ways of encoding what's in the mind into signals to be sent and received:
  • Individual symbols (words)
  • Markers on symbols (for example, prefixes and suffixes -- "grammatical morphology")
  • Ordering of symbols (syntactic distinctions like "dog bites man" vs. "man bites dog")
  • Prosody (stress and intonation)
Language, then, would be the natural result of trying to communicate thoughts under these constraints.



It's quite possible that our working within these constraints would result in something that looks a lot like generative grammar, which is another way of saying that even if language looks like it can be described by generative grammar, generative grammar may not describe what's fundamentally going on.

On the other hand, this sort of explanation smacks of Stephen Jay Gould's notion that human intelligence could be the result of our having evolved a larger brain as a side-effect of something else.  While evolution can certainly act in such roundabout ways, this pretends that intelligence isn't useful and adaptive on its own, and it glosses over the problem of just how a bigger brain is necessarily a smarter brain, as opposed to, say, a brain that can control a larger body without any sophisticated reasoning, or a brain less likely to be seriously injured by a blow to the head.

Likewise, we can't assume that our primate ancestors, having vocal cords, problem-solving ability and the need to communicate, would necessarily develop, over and over again, structurally similar ways of putting things into words.  Speaking, one could argue, is a significantly harder problem than eating with one's hands, and might require some further, specialized ability beyond sheer native intelligence.

There could well have been primates with sophisticated thoughts to express, who would have hunted more effectively and generally survived better had they been able to communicate these thoughts, but nonetheless just couldn't do it.  This would have given, say, a group of siblings that had a better way of voicing their thoughts a significant advantage, and so we're off to the races.  Along those lines, it's quite possible that some or all of the four encoding methods are both instinctive and, in at least some aspects, specific to language as opposed to other things the brain does.


Looking at the list of basic ways of encoding:

Associating words with concepts seems similar to the general problem of slotting mental objects into schemas, for example having a "move thing X from point A to point B" schema that can accept arbitrary X, A and B.  Clearly we and other animals have some form of this.

However, that doesn't seem quite the same as associating arbitrary sounds or gestures with particular meanings.  In the case of "move thing X from point A to point B", there will only be one particular X, A or B at any given time.  Humans are capable of learning hundreds of thousands of "listemes" (in Pinker's terminology), that is, sign/meaning pairs.  This seems completely separate from the ability to discern objects, or fit them into schemas.  Lots of animals can do that, but it appears that only a few can learn new associations between signs and meanings, and only humans can handle human-sized vocabularies.

Likewise, morphology -- the ability to modify symbols in arbitrary, conventional ways -- seems very much language-specific, particularly since we all seem to distinguish words and parts of words without being told how to do so.  The very idea of morphology assumes that he sings is two words, not he and sing and s.

Ordering of symbols is to some extent a function of having to transmit signals linearly and both sides having limited short-term memory.  Related concepts will tend to be nearby in time, for example.  This is not a logical necessity but a practical one.  One could devise schemes where, say, all the nouns from a group of five sentences are listed together, followed by all the verbs with some further markers linking them up, but this would never work for human communication.

But to untangle a sentence like I know you think I said he saw her tell him that, it's not enough to know that, say I next to know implies that it's me doing the knowing.  We have to make a flat sequence of words into a nested sequence of clauses, something like I know (you think (I said (he saw (her tell him that)))).  Different languages do this differently, and it can be done different ways in the same language, depending on which wording we choose.  (He saw (her tell him that)), I know (you think (I said)).
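
The nesting itself is easy enough to write down as a data structure (the representation below is my own ad hoc choice); the hard direction is recovering that structure from the flat sequence of words, which is exactly the parsing problem:

```python
# The nested-clause reading of the sentence as nested tuples.  Flattening
# the tree back into the word sequence is trivial; recovering the tree
# from the flat sequence is the hard part.
sentence = ("I", "know",
            ("you", "think",
             ("I", "said",
              ("he", "saw",
               ("her", "tell him that")))))

def flatten(clause):
    if isinstance(clause, str):
        return clause
    return " ".join(flatten(part) for part in clause)

print(flatten(sentence))  # I know you think I said he saw her tell him that
```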

Finally, prosody is probably closely tied to expressions of emotion.  Certainly SHOUTING WHEN ANGRY is related to other displays of aggression, and so forth.  Nonetheless, prosody can also be purely informational, as in distinguishing "The White House is large" from "The white house is large."  This informational use of prosody might well be specific to language use.
In each of these cases, it's a mistake to equate some widely-shared capability with a particular facility in human language.  There is more to vocabulary, morphology, syntax and prosody than simply distinguishing objects, changing the form of symbols, putting symbols in a linear sequence or speaking more or less loudly, and this, I believe, is where Tomasello's argument falls down.


Similarly, the ability to map some network of ideas (she told him that, you think I said it, etc.) into a sequence of words seems distinct from the ability to conceive such thoughts.  At the very least, there would need to be some explanation of how the mind is able to make such a mapping.  Perhaps that mechanism can be built from parts used elsewhere.  It wouldn't be the first time such a re-purposing has evolved.  Or it might be unique to language.

Most likely, then, language is not an instinct, per se, but a combination of pieces, some of which are general-purpose, some of which are particular to animal language, and some of which may well be unique to human language.  The basic pieces are innate, but the particular way they fit together to form a given language is not.

Try fitting that on a book cover.

Tuesday, October 2, 2012

What is or isn't a theory of mind? Part 2: Theories of mind and theories of mind

I said a while ago I'd return to theories of mind after reading up a bit more on primate cognition.  I have, now, and so I shall.  But first, the one that got away.

When I first got started on this thread, I ran across a paper discussing what kind of experiment might show other primates to have a theory of mind.  There were two points: First, that existing experiments (at the time) could be explained away by simple rules like "Non-dominant chimps can follow the gaze of their dominants and associate taking food while the dominant is looking with getting the snot beat out of them."  Second, that there were experiments, in principle, in which positive results could not be plausibly explained away by such rules.

I haven't been able to dig up the paper, but as I recall the gist of the improved experiments was to produce, as I mentioned before, a combinatorial explosion of possibilities, which could be explained more simply by the primate subject having a theory of mind than by a large number of ad-hoc rules.  "Explained more simply" are, of course, three magic words in science.

The scenario involved a dominant chimp, a non-dominant one, food to be stolen, and various doors and windows by which the non-dominant chimp would be able to see what the dominant saw, with or without the dominant seeing that it saw it.

Um, yeah.

For example, the two chimps are in separate rooms, with a row of compartments between them.  The compartments have doors on both sides.  The experimenter places the food in the third of five compartments.  Both doors are open, both chimps can see the food and they can see each other.  The doors are closed (the dominant's first), and the food is moved to compartment one.  The non-dominant sees its door to compartment one open and then the other door.  Does it rush for the food or hang back?  What if it sees that the other door to compartment one was already open?

If we try several different randomly-chosen scenarios with such a setup, a subject with a theory of mind should behave differently from one without, and due to the sheer number of possibilities a "mindless" explanation would have to be hopelessly contrived.

Something like that.  I probably have the exact setup garbled, but the point was to go beyond "Did I see the dominant looking at the food?" to "Where does the dominant think the food is?"  Interesting stuff.


While trying to track that paper down again, I ran across a retrospective by Josep Call and Michael Tomasello entitled Does the chimpanzee have a theory of mind? 30 years later.  I'm not familiar with Call, but Tomasello turns up again and again in research on developmental psychology and the primate mind.  He also has quite a bit to say about language and its development in humans, but that's for a different post (or several).

As with well-written papers in general, the key points of the retrospective are in the abstract:
On the 30th anniversary of Premack and Woodruff’s seminal paper asking whether chimpanzees have a theory of mind, we review recent evidence that suggests in many respects they do, whereas in other respects they might not. Specifically, there is solid evidence from several different experimental paradigms that chimpanzees understand the goals and intentions of others, as well as the perception and knowledge of others. Nevertheless, despite several seemingly valid attempts, there is currently no evidence that chimpanzees understand false beliefs. Our conclusion for the moment is, thus, that chimpanzees understand others in terms of a perception–goal psychology, as opposed to a full-fledged, human-like belief–desire psychology.
In other words, chimps appear to understand that others (both other chimps and those funny-looking furless creatures) can have goals and knowledge, but they don't appear to understand that others have beliefs based on knowledge behind those goals.  For example, if a human is reaching for something inaccessible, a chimp will reach it down for them (at least if the human has been a reliable source of goodies).  If a human flips a light switch by foot -- an unusual act for a creature lacking prehensile toes -- a chimp is likely to try the same, unless the human's hands were full at the time, suggesting that the chimp is aware that the human had to use their foot in the second case but wanted to in the first.

Naturally, any of the ten cases given in the paper could be explained by other means.  Call and Tomasello argue that the simplest explanation for the results taken together is that chimps understand intention.

Likewise, there is good evidence that chimps understand that others see and know things.  For example, they are more likely to gesture when someone (again, normal or furless) is looking, and they will take close account of who is looking where when trying to steal food.

On the other hand, chimps fail several tests that young humans pass in largely similar form.  For example, if there are two boxes that may contain food, and a dominant and a non-dominant subject both see food placed in one of them, the subject will not try to take food from that box.  Makes sense.  The subject knows the dominant knows where the food is.

If they both see the food moved to a second box, the subject will still leave it alone.  If the dominant doesn't see the food moved but the subject does, the subject ought to know that the dominant thinks the food is still in the first box and that it's safe to go for the second (the experiment is set up so that the dominant can effectively only guard one box).

However, it doesn't go for the box with the food in it.  This and other experiments suggest that the subject doesn't know that the dominant doesn't know what it knows.

Um, yeah.

In other words, the subject appears to assume that, because it knows the food is in the second box, so does the dominant.  "Because" might be a bit strong here ... try again: Chimps understand that others may have their own intentions different from theirs, and that others can know things, but not that others can have knowledge different from theirs.


Call and Tomasello conclude:
It is time for humans to quit thinking that their nearest primate relatives only read and react to overt behavior.   Obviously, chimpanzees’ social understanding begins with the observation of others’ behavior, as it does for humans, but it does not end there. Even if chimpanzees do not understand false beliefs, they clearly do not just perceive the surface behavior of others and learn mindless behavioral rules as a result. All of the evidence reviewed here suggests that chimpanzees understand both the goals and intentions of others as well as the perception and knowledge of others. 
[...]
In a broad construal of the phrase ‘theory of mind’, then, the answer to Premack and Woodruff’s pregnant question of 30 years ago is a definite yes, chimpanzees do have a theory of mind. But chimpanzees probably do not understand others in terms of a fully human-like belief–desire psychology in which they appreciate that others have mental representations of the world that drive their actions even when those do not correspond to reality [I'd argue for "their own perception of reality" here].  And so in a more narrow definition of theory of mind as an understanding of false beliefs, the answer to Premack  and Woodruff’s question might be no, they do not.
I suppose this might sound wishy-washy (chimps have a theory of mind, except no, they don't), but for my money it's insightful, not just concerning the minds of chimps, but the notion that there can be more than one kind of theory of mind.

Saturday, September 22, 2012

Cathexis

Whazzat?

I don't believe I had ever seen the word before I read it in an article today in reference to the recent Dutch elections.  It's a psychological term, coined as a translation of Freud's German Libidobesetzung, meaning "the concentration of mental energy on one particular person, idea, or object."  In the context at hand, the author argued, a small group of media and political trend-setters had decided that one candidate had clearly won a multi-party debate, and so ran with the story, giving the as-yet-undecided public something to latch on to.  This in turn led to that candidate gaining ground in the polls, and so forth.

Cathexis, then, is a nice, short, pseudo-classical word for what we also call the bandwagon effect or perhaps groupthink, or at least that's how the author used it in this context.

After looking the word up, I thought, "Wow, what a nice, short, pseudo-classical word for something we see all the time.  Maybe I should start using it."

And then my mind flashed an image of a salon somewhere, some concentration of intellectuals feeding off each other to produce a more-than-the-sum-of-the-parts ferment of ideas.  And one hallmark of such a situation, as with any tight-knit circle, is that people will tend to come to use the same expressions and shorthands, the same obscure references and little-known (or completely invented) terms that just seemed to hit the spot.  In extreme cases, this can tend to obscure, or even supplant, the actual ferment of ideas.  Are they developing great thoughts together, or just throwing around the same terms of art?

Terms like cathexis.

So if I were to start using "cathexis" as though it were something anyone might be expected to know, and my vast army of readers did likewise (OK, I made up the part about the vast army), would I be fomenting cathexis myself?  Does it matter if I actually have something to say?

Dunno.



Other words I looked up to make sure I was using them properly: salon (in the sense I used it), foment.

One bit of the dictionary definition I left off was "(esp. to an unhealthy degree)".  It's not clear to me whether the author of the article had that shade of meaning in mind, or not.


Thursday, July 12, 2012

Is English acquiring separable suffixes (or has it always had them)?

"Did you set up the tent?  That's a nice setup you've got there."

Note that that's "set up the tent" and not "setup the tent".  I've got the style guides on my side with that one, but they'll eventually have to catch up with current usage.  These days everyone seems to write "setup the tent".

Frankly this bugs me much more than it ought to.  If you read the sentence above aloud and listen carefully, you'll notice that "set up" sounds different from "setup".  In the first case they're pronounced as two separate words, with roughly equal emphasis on each and possibly a slight pause in between, while in the second case there's just one word and definitely no pause, emphasis on set.  When I read "setup", my mind's ear hears one word with emphasis on set, and if I read it in a case where one would actually use equal emphasis, it jars.

But usage is usage, so what can you do?


When I studied German in high school, I was for some reason taken with the idea of separable prefix verbs, which are a lot like English phrasal verbs such as set up or tear down.  These verbs can take two forms.  In the infinitive (the form you'd use with will, can and such) or past participle (the form you'd use with have) or a couple of other cases they appear as one word, but in other forms the front of the verb breaks off and migrates toward the end of the sentence, with arbitrarily many words between it and the stem that's left behind.  For example:
  • Hast du das Zelt aufgestellt? (Have you set up the tent?)
  • Ich will die Neubauten einstürzen. (I want to tear down the new buildings).
but
  • Stellst du das Zelt auf? (Are you setting the tent up?)
  • Ich stürze die Neubauten ein. (I tear the new buildings down).
At the time I thought it was cool that parts of verbs could get up and wander around like that, but that's actually old hat.  In fact English allows even a bit more latitude than German, since the parts can appear together as well as with the second part at the end of the clause:
  • I tear down the new buildings.
but not
  • * Ich stürze ein die Neubauten.
What's new and different here is actually the infinitive (or past participle) form.  While we maintain the order of the parts and give them equal emphasis, German switches the order and treats the result as one word with the emphasis on the first part (AUFstellen).

In fact, German does a bit more, smashing the particle ge (past participle marker) or zu (roughly equivalent to to with a verb in English) into the result, as with aufgestellt above or aufzustellen.  It would be like saying "uptoset" instead of "to set up".  Though this looks odd, it just means that the verb gets inflected before the prefix is stuck on, and in German some of the inflection happens through prefixes while in English we only use suffixes such as -s, -ed, or -ing.


Ok, what does this have to do with my grammatical peeve?

I'm grasping at the idea that phrasal verbs like set up and tear down, which act quite similarly to German separable prefix verbs, also have two forms, namely taken apart and smashed together, but it's harder to tell because the smashed-together form (note the hyphen there and the lack of one before) doesn't look as dramatically different from the taken-apart form as it does in German.

We use the smashed-together form in English mainly when making a noun from a phrasal verb:
  • That's a nice setup. (something that's been set up)
  • That house is a teardown. (something that should be torn down)
German has a similar form.  For example, Aufstellung is a noun analogous to setup, except that it actually means list, because, well, languages are like that.

We also use a similar form when using a participle as an adjective (not sure what the technical term is for that):
  • A set-up tent occupies much more space.
  • The torn-down building had looked sad.
  • The falling-down buildings looked even sadder.
In German such forms appear as one word:
  • einstürzende Neubauten (new buildings that are collapsing)
  • drei eingestürzte Neubauten (three collapsed new buildings)
But hang on.  What's that hyphen doing there in the English sentences?  It's indicating that the parts are not completely run together, as they get roughly equal emphasis instead of emphasis on the first part, but neither are they completely separate.  We can't put words between them.  You could equally well say either of
  • Did you set up the tent?
  • Did you set the tent up?
but you wouldn't say
  • * A set tent up occupies much more space.
  • * A set nicely up tent is a joy to behold.
In short, the finicky rules about when to say "set up", "set-up" or "setup" at least have a somewhat coherent theory behind them:
  • If you hear one word with the emphasis on the first part, write one word (A nice setup; The overhang of the ledge).  Generally the phrasal verb will be acting as a noun.
  • If you hear two equally emphasized parts but you couldn't put words between the parts, use a hyphen (A nicely set-up tent).  Generally the phrasal verb will be acting as an adjective in participle form.
  • If you hear two equally emphasized parts and the parts can just as well appear separately, use two words (I set up the tent; I set the tent up.) Generally the phrasal verb will be acting as an ordinary verb.
German uses the smashed-together form in the first two cases, and in the last case uses the taken-apart form, but with the first part of the smashed-together form always going to the end of the clause.  Conversely, we can say that English has a similar construct to German separable prefix verbs (not a shock, given the close kinship of the languages), but with suffixes instead of prefixes.  In both languages the affix is applied after the verb stem has been inflected, thus falling-down and not *fall-downing.
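
If it helps, the three-way rule is simple enough to state as code.  A toy sketch in Python -- my own encoding of the rules above, and it takes the grammatical role and the already-inflected verb form as givens, which is of course the part a human has to supply:

```python
# A toy encoding of the finicky rules above.  The caller passes the
# grammatical role and an already-inflected verb part (set, torn, falling...).
def spell(verb, particle, role):
    if role == "noun":          # one word, stress on the first part: "a nice setup"
        return verb + particle
    if role == "adjective":     # hyphenated participle: "a nicely set-up tent"
        return verb + "-" + particle
    return verb + " " + particle    # ordinary verb: "set up the tent"

print(spell("set", "up", "noun"))          # setup
print(spell("set", "up", "adjective"))     # set-up
print(spell("set", "up", "verb"))          # set up
print(spell("torn", "down", "adjective"))  # torn-down
```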


This leaves me no more enlightened than before as to why people would tend to write the smashed-together form in all cases, even when the pronunciation is different and the parts could just as well appear separately.  My guess is that, because the smashed-together form sometimes appears even under the finicky rules, that's taken as "correct" and therefore used wherever the two parts appear next to each other.

I really don't know.

Still bugs me, though.

Wednesday, June 20, 2012

What is, or isn't a theory of mind? Part 1: Objects

One notion of self-awareness revolves around the idea of a theory of mind, that is, a mental model of mental models.

Strictly speaking, having a theory of mind doesn't imply self-awareness.  I could have a very elaborate concept of others' mental states without being aware of my own.  This would go beyond normal human frailty like my not being aware of why I did some particular thing or forgetting how others might be affected by my actions.  It would mean being able to make inferences like "he likes ice cream so there will probably be a brownie left" without being aware that I like things.  That seems unlikely, but neurology is a wide and varied landscape.  There may well be people with just such a condition.

This is clearly brain-stretching stuff, so let's try to ease into it.  In this post, I want to start with a smaller, related problem:  What would a theory of objects look like, and how could you tell if something has it?  What we're trying to describe here is some sort of notion that the world contains discrete objects, which have a definite location and boundaries and which we can generally interact with, for example causing them to move.  This leaves room for things that aren't discrete objects, like sunshine or happiness or time, but it does cover a lot of interesting territory.

Not every living thing seems to have such a theory of objects.  A moth flying toward a light probably doesn't have any well-developed understanding of what the light is.  Rather, it is capable of flying in a given direction, and some part of its nervous system associates "more light in that direction" with "move in that direction".  It's the stimulus of the light, not the object producing the light, that the moth is responding to.  In other words, this is a simple stimulus-response interaction.

On the other hand, a theory of objects is not some deep unique property of the human mind.  A dog chasing a frisbee clearly does not treat the frisbee as an oval blob of color.  It treats it as a discrete, moving object with a definite position and trajectory in three dimensions.

You might fool the dog for a moment by pretending to throw the frisbee but hanging onto it instead, but the dog will generally abandon the chase for the not-flying disc in short order and retarget itself on the real thing.  It can recognize discs of different shapes and sizes and react to them as things to be thrown and caught.  It's hard to imagine a creature doing such a thing without some abstract mental representation of the disc -- and of you for that matter.

Likewise a bird or a squirrel stashing food for the winter and recovering it months later must have some representation of places and, if not objects, a "food-having" attribute to apply to those places.  That they are able to pick individual nuts and take them to those places also implies some sort of capability beyond reacting to raw sense data.

(but on the other hand ... ants are able to move objects from place to place, bees are able to locate and return to flowers ... my fallback here and elsewhere is that rather than one single thing we can call "theory of object" there must be many different object-handling facilities, some more elaborate than others ... and dogs and people have more elaborate facilities than do insects).

I've been careful in the last few paragraphs to use terms like "facility" and "representation" instead of "concept" or "idea" or such.  I'm generally fine with more loaded terms, which suggest something like thought as we know it, but just there I was trying to take a particularly mechanistic view.

So what sort of experiment could we conduct to determine whether something has a theory of objects, as opposed to just reacting to particular patterns of light, sound and so forth?  We are looking for situations where an animal with a theory of objects would behave differently from one without.

One key property of objects is that they can persist even when we can't sense them.  Technically, this goes under the name of object permanence.  For example, if I set up a screen, stand to one side of it and throw a frisbee behind it, I won't be surprised if a dog heads for the other side of the screen in anticipation of the frisbee reappearing from behind it.  Surely that demonstrates that the dog has a concept of the frisbee as an object.

Well, not quite.  Maybe the dog just instinctively reacts to a moving blob of color and continues to move in that direction until something else captures its attention.  Ok then, what if the frisbee doesn't come out from behind the screen?  Perhaps I've placed a soft net behind the screen that catches the frisbee soundlessly.  If the dog soon abandons its chase and goes off to do something else, we can't tell much.  But if it immediately rushes behind the screen, that's certainly suggestive.

However ... one can continue to play devil's advocate here.  After all, the two scenes, of the frisbee emerging or staying hidden, necessarily look different.  In one case there is a moving blob of color -- causing the dog to move -- followed by another blob of moving color.  In the other, there is no second movement.  So perhaps the hard-wiring is something like "Move in the direction of a moving blob of color.  If it disappears for X amount of time, move back toward the source."  That wouldn't quite explain why the dog might go behind the screen, but with a bit more thought we can probably explain that away.

What we need in order to really put the matter to rest is a combinatorial explosion.  A combinatorial explosion occurs when a few basic pieces can produce a huge number of combinations.  For example, a single die can show any of 6 numbers, two dice can show 36 combinations, three can show 216, four can show 1296 and so forth.  As the number of dice grows, it quickly becomes impractical to keep track of all the possible combinations separately.

If something, for whatever reason, reacts to combinations of eight dice that total less than 10 one way and those that total 10 or more a different way, it's hard to argue that it's simply reacting to the 9 particular combinations (all ones, eight different ways to get a two and seven ones) that total less than ten one way and the other 1,679,607 the other way.  Rather, the simplest explanation is that it has some concept of number.  On the other hand, if we're only experimenting with a single die, and a one gets a different reaction from the other numbers, it might well be that a lone circle has some sort of special status.
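
Those counts are easy to verify by brute force, for the skeptical:

```python
from itertools import product

# Brute-force check: enumerate all 6**8 = 1,679,616 ways eight dice can land
# and count how many of them total less than ten.
total = under_ten = 0
for roll in product(range(1, 7), repeat=8):
    total += 1
    if sum(roll) < 10:
        under_ten += 1

print(total)              # 1679616
print(under_ten)          # 9 (all ones, plus the eight ways to get a single two)
print(total - under_ten)  # 1679607
```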

In the case of the frisbee and screen experiment, we might add more screens and have grad students stand behind them and randomly catch the frisbee and throw it back the other way.  If there are, say, five screens and the dog can follow the frisbee from the starting position to screen four, back to screen two and finally out the far side, and can consistently follow randomly chosen paths of similar complexity, we might as well accept the obvious:  A dog knows what a frisbee is.

Why not just accept the obvious to begin with?  Because not all obvious cases are so obvious.  When we get into borderline cases, our intuition becomes unreliable.  Different people can have completely different rock-solid intuitions and the only way to sort it out is to run an experiment that can distinguish the two cases.


This is where we are with primate psychology and theories of mind.  It's pretty clear that chimps (and here I really mean chimps and/or bonobos), for example, have much of the same cognitive machinery we do, including not only a theory of objects and some ability to plan, but also such things as an understanding of social hierarchies and kinship relations.

On the other hand, attempts to teach chimps human language have been fairly unconvincing.  It's clear that they can learn vocabulary.  This is notable, even though understanding of vocabulary is not unique to primates.  There are dogs, for example, that can reliably retrieve any of dozens of objects from a different room by name.

There has been much less success, however, with understanding sentences with non-trivial syntax, on the order of "Get me the red ball from the blue box under the table" when there is also, say, a red box with a blue ball in it on the table.  Clearly chimps have some concept of containment, and color, and spatial relationships, but that doesn't seem to carry through to their language facility such as it is.

So what facilities do we and don't we have in common?  In particular, do our primate cousins have some sort of theory of mind?

That brings us back to the original question of what constitutes a theory of mind, and the further question of what experiments could demonstrate its presence or absence.

People who work closely with chimps are generally convinced that they can form some concept of what their human companions are thinking and can adjust their behavior accordingly.  However, we humans are strongly biased toward attributing mental states to anything that behaves enough like it has them -- we're prone to assuming things (including ourselves, some might say) are smarter than they are.

Direct interaction in a naturalistic setting is valuable, and most likely better for the chimp subjects, but working with a chimp that has every appearance of understanding what you're up to doesn't necessarily rule out more simplistic explanations.  For example, if the experimenter looks toward something and the ape looks in the same direction, did it do so because it reasoned that the experimenter was intentionally looking that direction and therefore there must be something of interest there, or simply out of some instinct to follow the gaze of other creatures?

These are thornier questions, with quite a bit of research and debate accumulated around them over the past several decades.  I want to say more about them, though not necessarily in the next post.  I'm still reading up at the moment.

Tuesday, May 15, 2012

Counting

(Intermittent, indeed)

Counting.  What could be simpler?  Well ...

1. Origins and speculation

Counting is older than humanity.  Other primates can count, birds can count and elephants can count.  There are even claims that some insects can count.  How would we know?  Suppose an animal sees three tasty treats put behind a screen, and then two taken away from behind it.  An animal that makes a beeline for the remaining treat, but acts indifferent if all three have been taken away instead, can plausibly be said to have some form of counting.

There's clear survival value in being able to reckon in this way.  If I'm being chased by four wolves and I only see three at the moment, I should be wary.  If I see four run off looking interested in something else, I can probably go back to whatever it was I was doing.

How would a counting ability work?  Totally speculating here, I could imagine having some supply of "markers" in short-term memory, which could be mentally placed in some collection of notional locations.    Even a small number of markers and locations would be useful.  If I have three markers, a "visible" location and a "hidden" location, I can track the tasty treats in the first example.  As each treat is hidden, the marker-tracking system puts a marker in the hidden location.  As each is taken out, its marker goes to the visible location, or is freed up for future use if the treat is taken away in real life.  If there's still a marker in the hidden location, I can expect a treat.
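
Just to make the speculation a little more concrete, here's one made-up rendering of the marker scheme in Python.  Everything about it -- the fixed pool of three markers, the single "hidden" location -- is my own invention for illustration:

```python
# A made-up rendering of the markers-and-locations idea: a small, fixed
# pool of markers that can be parked at a notional "hidden" location.
class MarkerTracker:
    def __init__(self, markers=3):
        self.free = markers      # markers not currently tracking anything
        self.hidden = 0          # markers parked at the "hidden" location

    def treat_hidden(self):      # a treat goes behind the screen
        if self.free:
            self.free -= 1
            self.hidden += 1

    def treat_taken_away(self):  # a treat is removed for good; its marker is freed for reuse
        if self.hidden:
            self.hidden -= 1
            self.free += 1

    def expects_treat(self):
        return self.hidden > 0

tracker = MarkerTracker(markers=3)
for _ in range(3):
    tracker.treat_hidden()       # three treats go behind the screen
tracker.treat_taken_away()
tracker.treat_taken_away()       # two are taken away
print(tracker.expects_treat())   # True: one marker is still parked at "hidden"
```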

This is all handwaving over what a marker or a location might be, or how to make and dissolve the associations between markers and treats, and between the "hidden" mental location and the area behind the experimenter's screen.  Nonetheless, there are testable hypotheses to be made here.  For example, does the subject still respond appropriately if there are two separate screened areas, or seven, or twenty?  Suppose there are two kinds of objects: tasty treats and something useless.  Does the subject behave differently if there is one treat left instead of one useless object?  How many objects can the subject keep track of?

The results would put limits on how many markers, or locations, or kinds of markers there might be.  They wouldn't rule out other explanations, like some sort of purely symbolic approach, but an artful experimenter ought to be able to come up with scenarios that would make one or the other explanation more or less plausible.  I'm certain that people have done just that, but more carefully and thoroughly than what I've described.

If you're thinking this doesn't quite seem to match up with what pops into your head when you think "counting", you'd be right.  Counting as we normally think of it, "one, two, three, four, ... thirty-nine, forty, forty-one ... five thousand nine hundred and thirty-seven, five thousand nine hundred and thirty-eight ..." is a different beast.  A markers-and-locations scheme is limited by the number of markers.  Counting with a number system requires only a small set of symbols and a small set of rules.  With those, you can count until you get tired, so long as your number system tells you how to add one more.  Your supply of numbers is still finite -- no one can count from a number a million random digits long to the number after it -- but it's orders of magnitude larger.
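
As a small illustration of "a few symbols plus a rule for adding one more", here is a Python sketch that counts in ordinary decimal notation, treating numerals purely as strings of digit symbols.  The function name and the choice of decimal are mine; any base would do.

  DIGITS = "0123456789"

  def add_one(numeral):
      # Increment the rightmost digit, carrying leftward when a 9 rolls over to 0.
      digits = list(numeral)
      i = len(digits) - 1
      while i >= 0:
          if digits[i] != "9":
              digits[i] = DIGITS[DIGITS.index(digits[i]) + 1]
              return "".join(digits)
          digits[i] = "0"
          i -= 1
      return "1" + "".join(digits)   # ran out of digits: 999 becomes 1000

  print(add_one("5937"))   # 5938
  print(add_one("5999"))   # 6000

With those ten symbols and that one rule, you can keep going until you get tired.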

This more familiar notion of number is probably much rarer in the animal world.   It's much more closely akin to language, though it's at least logically possible to have a number system without having a fully-developed language.  It's also possible, and I'd think highly likely, that we and any other animals that can count digitally also have a markers-and-locations style capability.  A toddler who hasn't yet mastered counting to a hundred can still track objects behind screens.

2. The mathematicians arrive

Number theory is one of the oldest branches of mathematics.  Pythagoras studied numbers extensively, trying to uncover the mystic harmonies of the universe.  Archimedes showed how to construct ridiculously large numbers, large enough to, say, number the grains of sand on a beach, or even how many it would take to fill the universe as he knew it.  Pascal, Fermat, Euler, Gauss ... all of these and many more had interesting things to say about numbers, but for my money it was Georg Cantor who got deepest into the question of "what is number"?

Cantor helped formalize the notion of a set -- a collection of objects about which we only care if a given item is in it or not.  Cantor postulated that two sets have the same number of items if, and only if, the items in them can be matched up one-for-one.  For example, if I have the set {Groucho, Harpo, Chico} and the set {Julius, Arthur, Leonard}, I can match Groucho with Julius, Harpo with Arthur and Chico with Leonard and know that there are the same number of items in each set.  I can also do the matching five other ways, but the result is always the same.  Either the two sets match up one-to-one or they don't.
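
Here is that matching idea as a small Python sketch for finite sets, pairing members off one at a time without ever counting them.  The function name is mine, purely for illustration.

  def match_up(a, b):
      # Pair off one member from each set at a time.  The matching succeeds
      # only if both sets run out at the same moment.
      a, b = set(a), set(b)
      pairs = []
      while a and b:
          pairs.append((a.pop(), b.pop()))
      return pairs if not a and not b else None

  stage_names = {"Groucho", "Harpo", "Chico"}
  given_names = {"Julius", "Arthur", "Leonard"}
  print(match_up(stage_names, given_names))   # some one-to-one matching of the three
  print(match_up(stage_names, {"Zeppo"}))     # None -- the sets don't match up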

Working from this basis, a number is a collection (formally, the equivalence class) of all sets that can be matched up with each other.  For example, the number three would include {Groucho, Harpo, Chico}, {Julius, Arthur, Leonard}, {1, 2, 3}, {42, 17, 23}, {uno, due, tre}, {zebra, desk, Louis XIV} and any other set that can be matched up one-to-one with any (and thus, all) of those.

The numbers that can be described this way include zero (described by the empty set, {}), one, two, and so on ad infinitum.   If you can (at least in principle) list all the objects in a set, or in other words, if the set is finite, then there is a natural number representing all sets of that size.  These numbers do not include negative numbers, fractions, irrational numbers, imaginary numbers, quaternions or other such.  Those need more machinery, but they can all be related to the basic set of natural numbers {0, 1, 2, 3 ...}.

One of Cantor's great discoveries, and one of the most important discoveries in all of mathematics, is that there are numbers, in this sense, beyond the natural numbers that denote the size of finite sets.  For example, how many natural numbers are there?  There are more than, say, three, because 0, 1, 2 and 3 are all natural numbers.  There are more than 497.  There are more than a billion and more than a googolplex.  No matter how high you count, you can, in principle, always count one more.

If this sounds quite a bit like the second kind of number system above, where you can always count one more, that's because it is.  In a bit we'll see a mathematical description even more closely aligned with it.

If you have any finite set, you can always add another object that's not already in it.  Put yet another way, there is no finite set that you can match up one-to-one with the set of all natural numbers.  The set of natural numbers is infinite.

But Cantor's definition of a number doesn't say numbers have to be finite, so that's fine.  There was quite a bit of discussion about this, but at the end of the day Cantor's rules for reasoning about numbers work just as well for the infinite.  In fact they present a far richer picture than just the single ∞ that we often take to represent "infinity" (this symbol does have perfectly rigorous math behind it, but it has more to do with the mathematical study called analysis than with the set theory we're talking about here).

The collection of all sets that match up one-to-one with the natural numbers is itself a number, which Cantor called aleph-null (ℵ0).  Are there actually other sets that can be matched up one-to-one with the naturals?  Yes.  For example, the set of points in an infinite square grid.  Start at any particular point and make a spiral outward.  Match the starting point up with zero, the next point with one, and so forth.  There is a natural number for every point in the spiral, and any point you care to pick on the grid, the spiral will eventually reach.
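
Here is the spiral as a Python generator, purely to show that every grid point really does get a natural number eventually.  The particular walking order (east, north, west, south) is an arbitrary choice of mine.

  from itertools import count

  def spiral():
      # Walk outward from (0, 0) in an ever-widening spiral.
      x, y = 0, 0
      yield (x, y)
      directions = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # east, north, west, south
      turn = 0
      for length in count(1):
          for _ in range(2):             # each leg length is used for two turns
              dx, dy = directions[turn % 4]
              for _ in range(length):
                  x, y = x + dx, y + dy
                  yield (x, y)
              turn += 1

  points = spiral()
  for n in range(10):
      print(n, "<->", next(points))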

A bit surprisingly, the natural numbers can also be matched up one-to-one with what might appear to be smaller sets of natural numbers.  For example, you can match {0, 1, 2, 3, ...} up with the even numbers {0, 2, 4, 6 ...} by matching every natural number with twice that number: 0 with 0, 1 with 2, 2 with 4, 3 with 6 and so on.  There is an even number for every natural and vice versa.  This isn't a paradox or contradiction.  Infinite sets just behave a little differently.
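
In code, the matching is almost embarrassingly short:

  # Pair each natural number n with the even number 2*n.
  for n in range(6):
      print(n, "<->", 2 * n)    # 0<->0, 1<->2, 2<->4, 3<->6, ...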

How many sets of natural numbers are there?  Could you match them up with the natural numbers themselves?  Cantor's (second) proof that you can't, the diagonal argument, is a masterpiece, one of those things that sounds like magic or trickery when you first hear it (the first proof is pretty slick as well).  If you can grasp it, and you very likely can, you can probably grasp a large chunk of modern mathematics.  It goes like this:

Try to give each set a natural number.  If there are as many as there are natural numbers, then you can do just that.  For example, here's one partial attempt:
  0. {} (the empty set)
  1. {1, 2, 3, 4}
  2. {0, 1, 2, 3, 4, 5, ...} (all of the natural numbers)
  3. {0, 2, 4, 6, ...} (the even numbers)
  4. ...
Suppose I come up with a scheme for listing sets of natural numbers so that I can tell you the set for any natural number you like.  Zero gets the empty set.  One, two and three get the sets listed above.  42 gets the set of all primes.  Googolplex plus five gets the Fibonacci numbers ... whatever.  Could such a scheme get them all?

Look at each row in turn, and form a new set thus:  If the row number isn't in the set on its own row, add that number to the set you're building, otherwise don't.  In the case above, the result starts off {0, 3, ...}: zero is not in the empty set, so add it.  One is in the set {1, 2, 3, 4}, so skip it.  Two is in the set of all natural numbers, so skip it as well.   Three is not in the set of even numbers, so add it.

We can continue down the line.  42 isn't prime, so it's in the set.  Googolplex plus five is (almost certainly) not a Fibonacci number, so it's (almost certainly) in the set, and so forth.  Is the set we just made already on the list somewhere?  No.  It's not in row 0, because it contains a number not in the set on that row.  It's not in row 1, because 1 is in that set but not in ours.  And so forth.

No matter how you try to number the sets of natural numbers with the natural numbers themselves, there will always be at least one you missed -- a massive understatement, but one is all it takes -- so it can't be done.
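
For what it's worth, here is the diagonal construction in miniature, applied to the first four rows of the attempted listing above.  Representing each set by a membership test is my own choice for the sketch.

  listing = [
      lambda n: False,                 # row 0: the empty set
      lambda n: n in {1, 2, 3, 4},     # row 1
      lambda n: True,                  # row 2: all of the natural numbers
      lambda n: n % 2 == 0,            # row 3: the even numbers
  ]

  # Include row number r in the new set exactly when r is NOT in the set on row r.
  diagonal = {r for r, contains in enumerate(listing) if not contains(r)}
  print(diagonal)    # {0, 3} -- differs from each row at that row's own number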

Since you can't match the natural numbers up with the sets of natural numbers, there must be an infinite number besides aleph-null, namely the number of sets of natural numbers -- that is, the collection of all sets that can be matched up one-to-one with the collection of sets of natural numbers.  This includes, among other things, all the numbers between 0 and 1, if you include infinite expansions like 0.11111... = 1/9 or pi - 3 = .14159265... (see the bottom of this section for a rough proof).

Call this number aleph-one.  Aleph-one is a bigger infinity than aleph-null.  Cantor conjectured, but could not prove, that there are no infinities between aleph-null and aleph-one as defined here, or in other words that aleph-one really should be called aleph-one and not aleph-something-else.  We now know that you can pick either way and get meaningful, though perhaps strange, results.

In any case, one of the most important facts about a set is often whether it contains
  • a finite number of elements
  • exactly as many elements as there are natural numbers
  • more
In the first two cases, the set is countable (in the second case, countably infinite).  Otherwise it is uncountable.  Some interesting results depend on this distinction.  For example, there are infinitely many possible computer programs, and infinitely many numbers that a computer program could produce as output if it ran forever -- including irrational numbers like the square root of two, and numbers like pi that aren't the solution to any algebraic equation -- but there are still only countably many of them.  There are uncountably many numbers between zero and one, on the other hand: so many more that if you pick a real number at random (leaving aside exactly what that means), the odds are exactly zero that there's a computer program that could calculate it, even if it had forever to do it.
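
To see why there are only countably many programs: a program is a finite text, and finite texts can be listed shortest first, alphabetically within each length.  Here is a sketch of that listing in Python, over a tiny two-letter alphabet to keep the output readable.

  from itertools import count, product

  def all_texts(alphabet="ab"):
      # Every finite string over the alphabet shows up exactly once.
      for length in count(0):
          for letters in product(alphabet, repeat=length):
              yield "".join(letters)

  texts = all_texts()
  for n in range(8):
      print(n, "<->", repr(next(texts)))   # '', 'a', 'b', 'aa', 'ab', 'ba', ...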

The basic approach behind Cantor's diagonal argument can also be used to prove a number of other significant results that don't directly involve countability.  These include Gödel's incompleteness theorem, which says, roughly, that any consistent formal system powerful enough to describe arithmetic contains true statements it cannot prove, and Church and Turing's result that no computer program can solve the halting problem, that is, decide whether a given computer program will eventually stop running.


Cantor was further able to show that there were infinitely many of these infinite cardinal numbers.  There is an aleph-two bigger than aleph-one, and so forth.  For any aleph you can always find one bigger, by applying the same diagonal proof that found aleph-one in the first place.
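
The step from a set to the collection of all its subsets (its power set) is the jump the diagonal proof shows always lands on something strictly bigger.  Here is the finite warm-up in Python: a set of n things has 2 to the n subsets, which is always more than n.

  from itertools import combinations

  def power_set(items):
      # All subsets: choose 0 items, then 1, then 2, and so on.
      items = list(items)
      return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

  s = {"Groucho", "Harpo", "Chico"}
  print(len(s), "elements,", len(power_set(s)), "subsets")   # 3 elements, 8 subsets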

Hmm .... "always one bigger" ... that sounds familiar.

Matching things up in order to count them makes sense, but this idea of a number as a collection of sets seems somehow distant from the simple notion of counting one, two, three.  Suppose we dispense with sets and start with that?

Start with something called "zero".  This being math, it doesn't matter particularly what it is.  It's really just a symbol.  Let's use 0.  Likewise, take something to represent "one more than".  One common symbol is S, for "successor".  So we have
  • 0
  • One more than zero, or S0
  • One more than one more than zero, or SS0
  • and so forth
To translate this to something you're more used to, just count the S symbols, and to represent a particular number (again, a whole, non-negative number, zero included), just stick that many S symbols before the 0 symbol.  Numbers considered this way are called ordinals, as opposed to the cardinals described above.

You can tell that one number is bigger than another if you can make the first number by tacking some number of S symbols onto the front of the symbol for the second number.  Since SSSSS0 is three S symbols tacked onto SS0, we know that SSSSS0 is bigger than SS0, that is, five is bigger than two, as one would hope.
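
In Python, numbers in this notation can be kept as plain strings of S symbols ending in 0.  The helper names are mine.

  def successor(numeral):
      # "One more than": put another S in front.
      return "S" + numeral

  def is_bigger(a, b):
      # a is bigger than b if a is b with extra S symbols tacked on the front.
      return len(a) > len(b) and a.endswith(b)

  two = "SS0"
  five = successor(successor(successor(two)))   # SSSSS0
  print(five)                  # SSSSS0
  print(is_bigger(five, two))  # True -- five is bigger than two, as one would hope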

How many of these are there?  One for every natural number, clearly (just count the number of S symbols).

It would be interesting, at least to a mathematician, to capture some notion of infinity here.  If there are infinite cardinals, why not infinite ordinals?  Infinity would be a number that you can never reach by counting from zero, so let's try to pin that down.  The numbers we've described so far are finite, so let's just postulate that there's a number, call it ω, that's bigger than any of them.  Hey, it's math.  You can do that sort of thing.  You basically just add a rule that says "for any number like those we've described so far, ω is bigger".  Then you hope the new rule doesn't lead to any contradictions.

We'd like this ω to act like any other number, apart from being bigger than any of the naturals, so there should be a number one more than it.  That is, Sω is a perfectly good number in this scheme.  And so are SSω, SSSω, and so forth.  We could also call those ω+1, ω+2, ω+3 and so forth (the order matters: ordinal addition turns out not to be commutative, and 1+ω is just ω again).  Put another way, we can add any natural number to ω and get a bigger ordinal.  Counting continues past infinity almost as though infinity never happened.  We can continue this counting as far as we like, and we can say whether one such number is bigger than another just as we did before.

So suppose there's a number bigger than ω with any number of S symbols in front of it.  We may as well call that ω + ω, or 2 times ω, or just 2ω (the usual convention writes this ω·2, since ordinal multiplication isn't commutative either, but 2ω will do here).  Likewise, there is a 3ω bigger than 2ω + 1, 2ω + 2 or 2ω plus any natural, and a 4ω, and a 5ω, and so on.  And a number bigger than any of those?  Why, that would be ω times ω, or ω².

And likewise there is an ω³ and an ω⁴ and so forth (along with numbers like 17 + 42ω + ω⁴), and working from that you can define ω^ω, that is, ω raised to the ω-th power.  By now you can probably tell that we can keep this up forever.  You end up with something like the infinite cardinals, but more finely sliced.  Ah, math.
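
Ordinals below ω^ω can even be put in a computer, since each one is a finite expression.  Here is a rough sketch: represent c0 + c1·ω + c2·ω² + ... by its list of coefficients [c0, c1, c2, ...] and compare by looking at the highest powers first.  The representation and the function name are my own choices.

  def compare(a, b):
      # Return -1, 0 or 1 as the first ordinal is smaller than, equal to,
      # or bigger than the second.  Higher powers of ω dominate.
      n = max(len(a), len(b))
      a = a + [0] * (n - len(a))
      b = b + [0] * (n - len(b))
      for ca, cb in zip(reversed(a), reversed(b)):
          if ca != cb:
              return -1 if ca < cb else 1
      return 0

  omega_plus_three = [3, 1]       # ω + 3
  two_omega        = [0, 2]       # ω + ω, i.e. 2ω
  omega_squared    = [0, 0, 1]    # ω²
  print(compare(omega_plus_three, two_omega))   # -1: ω + 3 comes before 2ω
  print(compare(two_omega, omega_squared))      # -1: which comes before ω²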


[To see that aleph-one is the number of real numbers between 0 and 1:  Every such number can be represented by a binary expansion: 0.1 = 1/2, 0.01 = 1/4, 0.011 = 1/4 + 1/8 = 3/8, 0.01010101... = 1/4 + 1/16 + ... = 1/3, etc.  The corresponding set for a binary expansion has 0 in it if the first bit after the binary point is 1, 1 in it if the second bit is 1, and so forth.  For finite expansions like 0.01, the bits are all zero past a certain point, but that's fine.  It just means that the corresponding set only has finitely many numbers in it.  Likewise, there is a binary expansion for each set -- put a one in the (n+1)th place when n is in the set (and only then).  QED.  You do have to be a bit careful because, for example, 0.01 = 0.00111111..., and in general every finite binary expansion can be changed to an infinite one by replacing the last 1 with 01111..., but there are only countably many finite expansions so this doesn't really matter]
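
For finite sets, the correspondence in the note is easy to compute both ways.  Here is a sketch; the function names are mine, and it handles only the finite case.

  from fractions import Fraction

  def set_to_number(s):
      # n is in the set exactly when the (n+1)th bit after the binary point is 1.
      return sum(Fraction(1, 2 ** (n + 1)) for n in s)

  def number_to_set(f, bits=32):
      # Read off the first `bits` binary digits and collect the positions of the 1s.
      s = set()
      for n in range(bits):
          f *= 2
          if f >= 1:
              s.add(n)
              f -= 1
      return s

  print(set_to_number({1, 3}))              # 5/16, i.e. binary 0.0101
  print(number_to_set(Fraction(5, 16)))     # {1, 3}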

3. There were bound to be lawyers involved at some point

Suppose the limit for grand theft is $5,000.  I steal an old car worth $5,000.  Clearly I've committed one count of grand theft.  What if I steal a new car worth $15,000?  I've still committed one count.  What if I steal three old cars worth $5,000?  Keeping in mind that I'm not a lawyer, that ought to be three counts of grand theft.  For example, if I steal one car, serve my time, steal another, serve my time again and steal another, at which point I'm going to be in for a good long time in most states, I've committed three crimes.

What if I snag a $4,000 trailer with three $5,000 cars on it?  Is that one grand theft of $19,000 or one petty theft and three grand thefts?  That would depend on whether we're counting acts of theft (one) or objects stolen (one petty, three grand), and that in turn depends on the particular law.  Not being a lawyer, I'm not sure which way this tends to go, but I'd guess one count of grand theft.

This sort of question comes up all the time in copyright cases.  Was that ten acts of pirating one song or one act of pirating ten songs?  What about pirating the same song ten times?  Still not a lawyer, but I believe the law in this case treats each copyrighted work as a separate count, so pirating ten songs would be ten counts, while pirating one song ten times would be one count (but maybe a more severe one than pirating it once).

The question here is not how to count, as in how to count the even numbers versus the prime numbers, but what to count.


4.  How many rivers run through it?

Pittsburgh lies at the confluence of the Allegheny and Monongahela rivers, which meet to form the Ohio river.  Thus the name Three Rivers Stadium (and several other Three Rivers designations).

In 1858, William Larimer staked a claim on a bluff overlooking the confluence of Cherry Creek and the South Platte (which had long been a seasonal camp for the Arapaho and Cheyenne), and not too long afterward named the place after the governor of Kansas Territory, James W. Denver (who had by that time already resigned his post, unbeknownst to Larimer).

How many rivers run through Pittsburgh?  How many run through Denver?  One could reasonably say that the South Platte River runs through Denver, joined by Cherry Creek, but going by the names, no river runs through Pittsburgh.  Rather, two rivers end at the Point and another begins.

A person traveling by boat from Oil City, Pennsylvania (where Oil Creek flows into the Allegheny), through Pittsburgh to Cairo, Illinois (where the Ohio meets the Mississippi) could be forgiven, though, for thinking that there was a river running through Pittsburgh, particularly since the Allegheny is larger than the Monongahela which joins it.  A hydrologist would likely agree, and go so far as to say the same river flows through New Orleans to the Gulf of Mexico, the Ohio being larger than the Mississippi at Cairo.

The hydrological definition, that when two streams meet the larger of the two is considered to continue and the smaller to end at the confluence, gives a nice, consistent definition, assuming that the size of a stream is well-defined and no two are exactly equal.  Since this definition can be at odds with local names, as at Pittsburgh and Cairo, some might be tempted to say the local names are "wrong", but that may not cut much ice with the locals.

How many rivers run through Los Angeles?  That would depend on whether you call the Los Angeles River a river.  For most of the year along most of its course it's a trickle of water in the middle of a wide concrete culvert.   Near Dodger Stadium it's joined by Arroyo Seco (Spanish for "dry stream") which, as the name implies, lacks even the trickle of water much of the time.  Nonetheless, when the water does flow it flows in significant volume along an identifiable course, so we call it a river.

The Colorado River used to flow into the Gulf of California by way of a lush delta, but for most of the past few decades it has run dry well before then.  Can we say a river runs through that region still?  At this point, maybe not.  If it ran dry some years but flowed for at least part of most years,  probably, but I doubt there's any clear dividing line between mostly dry river and ex-river, at least outside of the technical literature.

Finally, how many rivers run through the Mississippi Delta, that is, the alluvial plain between the Mississippi and Yazoo rivers in western Mississippi, as opposed to the Mississippi River Delta in southern Louisiana?  It's kind of hard to tell.  Streams cut this way and that, split, rejoin, sometimes appear to go nowhere.  Is this a lot of small streams criss-crossing a plain, or a few larger streams with a lot of unusually large islands?  Again, at some point it comes down to arbitrary definitions.

5. What counts?

At least for practical purposes, the world is continuous.  And yet, for various reasons, we comprehend it as being made up mostly of discrete objects.  Counting is fundamentally discrete.  Digital, in fact.  You can't count, or at least not very far, without dividing the world into objects.

Not every creature appears to do this.  Microbes get by on pure, unconscious reaction.  This part of the cell wall is in contact with some nutritious chemical, so those parts of the cell wall contract and the cell moves towards supper.

Only with a mind capable of parsing the world into objects can we begin to talk about counting as a useful activity.  Once there are objects, a markers-and-locations facility is not so remote.  It doesn't have to arise, but it's useful enough and plausibly a small enough step that it's not surprising that it does arise.  The full-blown symbolic version of counting, using a few basic names and combining rules, is a whole different matter.

What counts?  In the markers-and-locations sense, quite a few things count, including us.  More will surely be discovered.  In the symbols-and-rules view, it may be just us.