Monday, November 19, 2012

If language isn't an instinct, what is it?

Steven Pinker's The Language Instinct makes the case that humans, and so far as we know only humans, have an innate ability to acquire language in the sense we generally understand it.  Further, Pinker asserts that using this ability does not require conscious effort.  A child living with a group of people will normally come to learn their language, regardless of whether the child's parents spoke that language, or what particular language it is.  This is not limited to spoken languages.  A child living among sign language users will acquire the local sign language.  There are, of course, people who are unable to learn languages, but they are the rare exceptions, just as there are people who are completely unable to see colors.

There is, on the other hand, no innate ability to speak any particular language. A child of Finnish-speaking parents will not spontaneously speak Finnish if there is no one around speaking Finnish, and the same can be said of any human language.

This is noteworthy, even if it might seem obvious, because human languages vary to an impressive degree.  Some have dozens of distinct sounds, some only a handful.  Some have rich systems of inflection, allowing a single word to take thousands of different forms.  Some (like English, and Mandarin even more so) have very little inflection.  Some have large vocabularies and some don't (though any language can easily extend its vocabulary).  The forms used to express common concepts like "is", or whether something happened yesterday, is happening now or might never happen, can be completely different from language to language.

At first glance this variation may seem completely arbitrary, but it isn't.  There are rules, even if our understanding of them is very incomplete.  There are no known languages where, say, repeating a word five times always means the opposite of repeating that word four times.  There's no reason in principle there couldn't be such a language, but none exists, and the probable reasons aren't hard to guess.

There's a more subtle point behind this: There is no single thing that is "communication", "signaling" or "language".  Rather, there are various frameworks for communication.  For example, "red means stop and green means go" is a system with two signals, each with a fixed meaning.  Generalizing this a bit, "define a fixed set of signs each with a fixed meaning" is a simple framework for communication.  A somewhat more complex framework would allow for defining new signs with fixed meanings -- start with "red means stop and green means go", but now add "yellow means caution".
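
To make the two frameworks concrete, here's a minimal sketch in Python (the names are invented for illustration): the first framework is nothing but a fixed lookup table, and the second adds the ability to learn new entries.

```python
# Framework one: a closed system of fixed signs with fixed meanings.
FIXED_SIGNALS = {"red": "stop", "green": "go"}

# Framework two: signs still have fixed meanings, but new ones can be added.
class ExtensibleSignals:
    def __init__(self, initial):
        self.signs = dict(initial)

    def learn(self, sign, meaning):
        self.signs[sign] = meaning

    def interpret(self, sign):
        return self.signs[sign]

system = ExtensibleSignals(FIXED_SIGNALS)
system.learn("yellow", "caution")    # extending the system with a new sign
print(system.interpret("yellow"))    # -> caution
```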

Many animals communicate within one or the other of these frameworks.  Many, many species can recognize a specific set of signs.  Dogs and great apes, among others, can learn new signs.  Human language, though, requires a considerably more complex framework.  We pay attention not only to particular signs, but to the order in which they are communicated.  In English "dog bites man" is different from "man bites dog".  Even in languages with looser word order, order still matters.

Further, nearly all, if not all, human languages have a concept of "subordinate clause", that is, the ability to fold a sentence like "The boy is wearing a red shirt" into a sentence like "The boy who is wearing the red shirt kicked the ball."  These structures can nest deeply, apparently limited by the short-term memory of the speaker and listener and not by some inherent rule.  Thus we can understand sentences like I know you think I said he saw her tell him that.  As far as we can tell, no other animal can do this sort of thing.

This is not to say that communication in other animals is simple.  Chimpanzee gestures, for example, are quite elaborate, and we're only beginning to understand how dolphins and other cetaceans communicate.  Nonetheless, there is reasonable evidence that we're not missing anything on the order of human language.  It's possible in principle that, say, the squeaks of mice are carrying elaborate messages we haven't yet learned how to decode, but mice don't show signs of behavior that can only be explained by a sophisticated signaling system.  Similarly, studies of dolphin whistles suggest that their structure is fundamentally less complex than human language -- though dolphins are able to understand sequences of commands in which order matters.

In short, human languages are built on a framework unique to us, and we have an innate, species-universal, automatic ability to learn and use human languages within that framework.  Thus the title The Language Instinct.   Strictly speaking The Instinct to Acquire and Use Language would be more precise, but speaking strictly generally doesn't sell as many books.


This all seems quite persuasive, especially as Pinker puts it forth, but primatologist and developmental psychologist Michael Tomasello argues otherwise in his review of Pinker, straightforwardly titled Language is not an Instinct  (Pinker's book titles seem to invite this sort of response).  Tomasello is highly respected in his fields and knows a great deal about how human and non-human minds work.  I cited him as an authority in a previous post on theories of mind, for example.  Linguistics is not his area of specialization, but he is clearly more than casually familiar with the literature of the field.

Tomasello agrees that people everywhere develop languages, and that human languages are distinct from other animal communication systems, albeit perhaps not quite so distinct as we would like to think.  However, he argues that there does not need to be any language-specific capability in our genes in order for this to be so.  Instead, the simplest explanation is that language falls out as a natural consequence of other abilities, such as the ability to reason in terms of objects, actions and predicates.

To this end, he cites Elizabeth Bates' analogy that, while humans eat mostly with their hands, this does not mean there is an innate eating-with-hands capability.  People need to eat, eating involves moving food around and our hands are our tool of choice for moving things around in general.  Just because everyone does it doesn't mean that there is a particular instinct for it.  Similarly, no other species is known to cook food, but cooking food is clearly something we learn, not something innate.  Just because only we do it doesn't mean that we have a particular instinct for it.

This is a perfectly good point about logical necessity.  If all we know is that language is universal to humans and specific to humans, we can't conclude that there is a particular instinct for it.  But Tomasello goes further to assert that, even when you dig into the full evidence regarding human language, not only is there no reason to believe that there is a particular language instinct, but language is better explained as a result of other instincts we do have.


So how would we pick between these views?  Tomasello's review becomes somewhat unhelpful here.  First, it veers into criticism of Pinker personally, and of linguists of his school of thought in general, as being unreceptive to contrary views, prone to asserting their views as "correct" and "scientific" when other supportable views exist, and overly attached to the specialized jargon of the field.  A certain amount of this seems valid.  Pinker is skilled in debate, a useful skill that can cut both ways, and this can give an air of certainty regardless of how certain things actually are.  There is also mention of the famed linguistic pioneer and polemicist Noam Chomsky, but Pinker's views on cognition and language are not necessarily those of Chomsky.

Second, and presumably as a result of the first point, the review takes on what looks suspiciously like a strawman.  In Tomasello's view, Pinker and others who claim a "language instinct" that is more than the natural result of human cognition and the general animal ability to signal are chiefly concerned with mathematical elegance, and in particular with the concept of generative grammar.

Generative grammar breaks language down into sentences, which are in turn composed of a noun phrase and a verb phrase, which may in turn be composed of smaller parts in an orderly pattern of nesting.  This is basically the kind of sentence diagramming you may have learned in school [when I wrote this I didn't realize that there are several ways people are taught to analyze sentences, so I wrote "you learned in school", assuming everyone had had the same experience.  But of course not everyone has.  In some schemes the results look more like dependency graphs than parse trees, which sent me down a fairly eye-opening path.  So, sorry about that, but at least I ended up learning something].

Linguistic theories along these lines generally add to this some notion of "movement rules" that allow us to convert, say, The man read the book into The book was read by the man.  Such systems are generally referred to as transformational generative grammars, to emphasize the role of the movement rules, but I'll go with Tomasello here and drop the "transformational" part.  Keep in mind, though, that when a field is "basically" built on some familiar concept, the familiar concept is just the tip of the iceberg.
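
For a crude feel of what a movement rule does (and this is nothing like the machinery of a real transformational grammar, which operates on full parse trees), here's a toy Python sketch that rearranges the constituents of a simple active clause into passive voice:

```python
def passivize(clause):
    """Toy movement rule: active (subject, verb, object) -> passive voice.

    The verb is assumed to double as its own past participle (as "read"
    does), since irregular morphology isn't the point here.
    """
    subject, verb, obj = clause
    return (obj, "was", verb, "by", subject)

active = ("the man", "read", "the book")
print(" ".join(passivize(active)))   # -> the book was read by the man
```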

A generative grammar, by itself, is purely syntactic.  If you call flurb a noun and veem a verb, then "Flurbs veem." is a grammatically correct sentence (at least according to English grammar) regardless of what, if anything, flurb and veem might actually mean.  Likewise, you can transform Flurbs veem into Veeming is done by flurbs and other such forms purely by moving grammatical parts around.
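
Here's how shallow a purely syntactic check can be, as a sketch; the one-rule grammar and the lexicon are invented for illustration.  "Flurbs veem" passes on form alone, with no meanings anywhere in sight.

```python
# The tiniest possible NP-VP grammar: a sentence is a noun followed by a verb.
LEXICON = {"flurbs": "N", "veem": "V", "dogs": "N", "bark": "V"}

def is_sentence(words):
    tags = [LEXICON.get(word) for word in words]
    return tags == ["N", "V"]

print(is_sentence("flurbs veem".split()))   # True: grammatical nonsense
print(is_sentence("veem flurbs".split()))   # False: order matters
```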

Tomasello questions whether the structures predicted by generative grammar even exist in all languages.  Generative grammar did happen to work well when first applied to English, but that's to be expected.  The techniques behind it, which come from Latin "grammar school" grammar by way of computing theory, were developed to analyze European languages, of which English is one.  Likewise, much of the early work in generative grammar was focused on a handful of the world's thousands of languages, though not necessarily only European ones.  There is an obvious danger in such situations that someone familiar with generative grammar will tend to find signs of it whether it is there or not.  If all you have is a hammer, the whole world looks like a nail.

From what I know of the evidence, though, all known languages display structures that can be analyzed reasonably well in traditional generative grammar terms.  Tomasello asserts, for example, that Lakota (spoken by tribes in the Dakotas and thereabouts) has "no coherent verb phrase".  A linguist whom I consulted, who is familiar with the language, tells me this is simply not true.  The Jesuit Eugene Buechel was apparently also unaware of this when he wrote A Grammar of Lakota in 1939.

But perhaps we're a bit off in the weeds at this point.  What we really have here, I believe, is a set of interrelated assertions:
  • Human language is unique and universal to humans.  This is not in dispute.
  • Humans acquire language naturally, independent of the language.  Also not in dispute.
  • Human languages vary significantly.  Again, not in dispute.
  • Human language is closely related to human cognition.  This is one of Tomasello's main points, but I doubt that Pinker would dispute it, even though Tomasello seems to think so.
  • Generative grammar predicts structures that are actually seen in all known languages.  Tomasello disputes this while Pinker asserts it.  I think Pinker has the better case.
  • Generative grammar describes the actual mechanisms of human language.
That last is subtly different from the one before it.  Just because we see noun phrases and verb phrases, and the same sentence can be expressed in different forms, doesn't mean that the mind actually generates parse trees (the mathematical equivalent of sentence diagrams), or that in order to produce "The book was read by the man" the mind first produces "The man read the book" and then transforms it into passive voice.  To draw an analogy, computer animators have models that can generate realistic-looking plants and animals, but no one is claiming that this explains how real plants and animals develop.
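
For concreteness, here's what such a parse tree might look like written down as a plain data structure -- a sketch only, using the usual S / NP / VP labels:

```python
# The parse tree for "the man read the book", as nested tuples.
tree = ("S",
        ("NP", ("Det", "the"), ("N", "man")),
        ("VP", ("V", "read"),
               ("NP", ("Det", "the"), ("N", "book"))))

def leaves(node):
    """Read the words back off the tree, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]

print(" ".join(leaves(tree)))   # -> the man read the book
```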

Personally, I've never been convinced that generative grammars are fundamental to language.  Attempts to write language-processing software based on this theory have ended in tears, which is not a good sign.  Generative grammar is an extremely good fit for computers.  Computer languages are in fact based on a tamer version of it, and the same concepts turn up repeatedly elsewhere in computer science.  If it were also a good fit for natural languages, natural language processing ought to be considerably further along than it is.  There have been significant advances in language processing, but they don't look particularly like pure generative grammar rendered in code.  Peter Norvig has a nice critique of this.
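
The "tamer version" is the context-free grammar familiar from compiler textbooks: unambiguous, rigid and a perfect fit for machines.  As a small sketch, here's a recursive-descent parser for a two-rule toy grammar (invented for illustration):

```python
# Grammar:  Expr -> Term ('+' Term)*
#           Term -> NUMBER
# Each grammar rule becomes one function -- the standard recursive-descent trick.

def parse_expr(tokens, pos=0):
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        value += right
    return value, pos

def parse_term(tokens, pos):
    return int(tokens[pos]), pos + 1

value, _ = parse_expr("1 + 2 + 3".split())
print(value)   # -> 6
```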

Be that as it may, I don't see that any of this has much bearing on the larger points:
  • Human language has features that are distinct from other human cognitive functions.
  • These features (or some of them) are instinctive.
In putting forth an alternative to generative grammar, drawn from work elsewhere in the linguistic community, Tomasello appears to agree on the second point, if not the first.  In the alternative view, humans have a number of cognitive abilities, such as the ability to form categories, to distinguish objects, actions and actors, and to define a focus of attention.  There is evolutionary value in being able to communicate, and a basic constraint that communication consists of signals laid out sequentially in time (understanding that there can be multiple channels of communication, for example saying "yes" while nodding one's head).

In this view, there are only four basic ways of encoding what's in the mind into signals to be sent and received:
  • Individual symbols (words)
  • Markers on symbols (for example, prefixes and suffixes -- "grammatical morphology")
  • Ordering of symbols (syntactic distinctions like "dog bites man" vs. "man bites dog")
  • Prosody (stress and intonation)
Language, then, would be the natural result of trying to communicate thoughts under these constraints.
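
As a toy illustration of these four channels working together (every convention here is invented), consider:

```python
def encode(agent, verb, patient, past=False, stress=None):
    """Push a simple thought through all four encoding channels."""
    verb = verb + ("ed" if past else "s")         # marker on a symbol (morphology)
    words = [agent, verb, patient]                # order carries who-did-what (syntax)
    if stress is not None:
        words[stress] = words[stress].upper()     # capitals standing in for prosody
    return " ".join(words)                        # the symbols themselves (words)

print(encode("dog", "bite", "man"))               # dog bites man
print(encode("man", "bite", "dog"))               # man bites dog: order matters
print(encode("dog", "kick", "man", past=True))    # dog kicked man
print(encode("dog", "bite", "man", stress=0))     # DOG bites man
```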



It's quite possible that our working within these constraints would result in something that looks a lot like generative grammar.  To put it another way: even if language looks like it can be described by generative grammar, generative grammar may not describe what's fundamentally going on.

On the other hand, this sort of explanation smacks of Stephen Jay Gould's notion that human intelligence could be the result of our having evolved a larger brain as a side-effect of something else.  While evolution can certainly act in such roundabout ways, this pretends that intelligence isn't useful and adaptive on its own, and it glosses over the problem of just how a bigger brain is necessarily a smarter brain, as opposed to, say, a brain that can control a larger body without any sophisticated reasoning, or a brain less likely to be seriously injured by a blow to the head.

Likewise, we can't assume that our primate ancestors, having vocal cords, problem-solving ability and the need to communicate, would necessarily develop, over and over again, structurally similar ways of putting things into words.  Speaking, one could argue, is a significantly harder problem than eating with one's hands, and might require some further, specialized ability beyond sheer native intelligence.

There could well have been primates with sophisticated thoughts to express, who would have hunted more effectively and generally survived better had they been able to communicate these thoughts, but nonetheless just couldn't do it.  This would have given, say, a group of siblings that had a better way of voicing their thoughts a significant advantage, and so we're off to the races.  Along those lines, it's quite possible that some or all of the four encoding methods are both instinctive and, in at least some aspects, specific to language as opposed to other things the brain does.


Looking at the list of basic ways of encoding:

Associating words with concepts seems similar to the general problem of slotting mental objects into schemas, for example having a "move thing X from point A to point B" schema that can accept arbitrary X, A and B.  Clearly we and other animals have some form of this.

However, that doesn't seem quite the same as associating arbitrary sounds or gestures with particular meanings.  In the case of "move thing X from point A to point B", there will only be one particular X, A or B at any given time.  Humans are capable of learning hundreds of thousands of "listemes" (in Pinker's terminology), that is, sign/meaning pairs.  This seems completely separate from the ability to discern objects, or fit them into schemas.  Lots of animals can do that, but it appears that only a few can learn new associations between signs and meanings, and only humans can handle human-sized vocabularies.

Likewise, morphology -- the ability to modify symbols in arbitrary, conventional ways -- seems very much language-specific, particularly since we all seem to distinguish words and parts of words without being told how to do so.  The very idea of morphology assumes that he sings is two words, not three: he, sing and s.

Ordering of symbols is to some extent a function of having to transmit signals linearly and both sides having limited short-term memory.  Related concepts will tend to be nearby in time, for example.  This is not a logical necessity but a practical one.  One could devise schemes where, say, all the nouns from a group of five sentences are listed together, followed by all the verbs with some further markers linking them up, but this would never work for human communication.

But to untangle a sentence like I know you think I said he saw her tell him that, it's not enough to know that, say, I next to know implies that it's me doing the knowing.  We have to make a flat sequence of words into a nested sequence of clauses, something like I know (you think (I said (he saw (her tell him that)))).  Different languages do this differently, and it can be done different ways in the same language, depending on which wording we choose: (He saw (her tell him that)), I know (you think (I said)).
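
To see how mechanical that unfolding can be, here's a minimal sketch; the list of clause-taking verbs is hand-picked for this one sentence ("tell" is left out so its clause stays flat), and real parsing is far messier:

```python
CLAUSE_VERBS = {"know", "think", "said", "saw"}

def bracket(words):
    """Open a new embedded clause after each clause-taking verb."""
    if not words:
        return ""
    head, rest = words[0], words[1:]
    if head in CLAUSE_VERBS and rest:
        return head + " (" + bracket(rest) + ")"
    return (head + " " + bracket(rest)).strip()

print(bracket("I know you think I said he saw her tell him that".split()))
# -> I know (you think (I said (he saw (her tell him that))))
```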

Finally, prosody is probably closely tied to expressions of emotion.  Certainly SHOUTING WHEN ANGRY is related to other displays of aggression, and so forth.  Nonetheless, prosody can also be purely informational, as in distinguishing "The White House is large" from "The white house is large."  This informational use of prosody might well be specific to language use.

In each of these cases, it's a mistake to equate some widely-shared capability with a particular facility in human language.  There is more to vocabulary, morphology, syntax and prosody than simply distinguishing objects, changing the form of symbols, putting symbols in a linear sequence or speaking more or less loudly, and this, I believe, is where Tomasello's argument falls down.


Similarly, the ability to map some network of ideas (she told him that, you think I said it, etc.) into a sequence of words seems distinct from the ability to conceive such thoughts.  At the very least, there would need to be some explanation of how the mind is able to make such a mapping.  Perhaps that mechanism can be built from parts used elsewhere.  It wouldn't be the first time such a re-purposing has evolved.  Or it might be unique to language.

Most likely, then, language is not an instinct, per se, but a combination of pieces, some of which are general-purpose, some of which are particular to animal language, and some of which may well be unique to human language.  The basic pieces are innate, but the particular way they fit together to form a given language is not.

Try fitting that on a book cover.