Thursday, November 6, 2014

Language family trees, new and old

For the most part, change in language is gradual, but there are exceptions.  People do develop new languages quickly, and not only with constructed languages like Esperanto or Klingon.  It can happen naturally in a couple of ways.

When populations with different languages are brought into close contact, it's common for people to work out a "pidgin", a sort of half-language with vocabulary taken here and there from either parent language, but with minimal syntax and grammar.  Or at least that's been a prominent theory.   There's been some revisiting of just how simple pidgins really are.

One way or another, though, the children who grow up hearing the pidgin end up speaking a "creole", which is a full-fledged language with its own grammar and syntax, and distinct from either of the parent languages.  Confusingly enough, these creoles are often referred to as pidgins.  Tok Pisin is a classic example.  It's been around long enough to develop its own dialects.

There are also sign languages that can be traced back to a small community that, as far as anyone can tell, invented its own language spontaneously because its members needed to communicate and (obviously) couldn't use any of the spoken languages around them.  These are interesting in that they say something about our innate ability to use language, even without being exposed to one.

Unlike pidgins and creoles, which hybridize existing languages, these sign languages are constructed from scratch and, to be clear, they have the same kinds of complex features that spoken languages have, including the use of arbitrary symbols for abstract concepts.  Like any other true sign language, these are not simple gestures and pantomime, and like other languages, sign or not, they too can develop variants and dialects over time.

There are probably other examples along similar lines.  People have been using languages for quite some time, and there have been quite a lot of people over that time.

Nonetheless, the vast majority of the world's thousands of languages trace their lineage back over thousands of years of gradual change, likely all the way back to the exodus from Africa some 50,000 years ago, and beyond.  Why should we think this?

The main reason is that most of them bear a family resemblance to other languages.  That is, they share features in common with those other languages, and those features indicate a branching pattern of languages diverging repeatedly from earlier forms (but also intermixing and hybridizing, so we don't have a pure "tree structure").  In some cases we can directly trace the development of a language family, for example the Romance languages of Europe (French, Italian, Portuguese, Romansch, Romanian, Spanish and several others, all derived from Vulgar Latin).

By careful comparison of the various features of similar languages, and by clues in the structures of the languages themselves, we can form a fairly precise model of how the various languages developed from a common ancestor over time, in a way that matches up well with written records where they're available, but does not require them.
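Just to make "family resemblance" a little more concrete, here's a toy illustration, and only that: real comparative linguistics rests on regular sound correspondences, not raw string similarity.  The sketch below (word choices and method are mine, invented for illustration) measures edit distance between words for 'night', with Japanese included as a presumably unrelated control:

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance: minimum number of
    # single-character insertions, deletions, and substitutions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete from a
                           cur[j - 1] + 1,           # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

# Words for 'night' (crude illustration only).  The related words land
# within a few edits of Italian "notte"; Japanese "yoru" does not.
words = {"Italian": "notte", "Spanish": "noche",
         "French": "nuit", "German": "nacht", "Japanese": "yoru"}
for lang, w in words.items():
    print(lang, levenshtein("notte", w))
```

Shared retained features cluster related languages together even under a measure this crude; the real method refines this by insisting that the differences themselves be systematic.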

If we know for a fact that some language families developed over time from a common ancestor, and most of the others show the same kind of resemblances to each other as members of well-documented families, the simplest explanation is that what looks like a family resemblance is a family resemblance.

This is not to say that we can assert with certainty that, say, Bantu, Swiss-German, Aleut and Japanese share a common ancestor.  Language changes quickly enough that the evidence becomes indistinct as we look at larger and larger families of languages.  The most thorough and successful reconstruction of an ancestral language, Proto-Indo-European, takes us back only five or six thousand years.

To go back further takes careful statistical analysis, and this is not without its problems and controversies.  However, there is no strong evidence that the languages listed above aren't related.  They all seem to follow the same general plan, albeit with markedly different details.  Again, the simplest assumption, until we know better, is that they all trace back to a small number of ancient languages, just as human ancestry traces back to a common ancestor (note that as always, "common ancestor" doesn't mean "first").

Monday, October 13, 2014

There's extinct, and then there's extinct

As a little follow-up to the previous post:

Both Manx and Old English have been extinct (I'll explain the careful phrasing in a bit).  We know exactly when Manx went extinct: with the death of Ned Maddrell on December 27th, 1974.   We can't say when Old English went extinct, and not only because it would have happened hundreds of years ago.

"When did Old English go extinct?" is the same sort of question as "When did Middle English arise?", though it's not the exact same question.  If we were to define some set of speakers as "the first speakers of Middle English" and some other as "the last speakers of Old English", then there was, pretty much necessarily, a period of time where both were alive.  Middle English was alive, by such a reckoning, but Old English wasn't dead yet.

In fact, it's plausible that some number of people could have been said to speak both, depending perhaps on the company and occasion.  As I argued before, what's really going on here is that there's no clear line to be drawn in cases of continuous change.

On the other hand, when the number of speakers of a language dwindles, it's reasonable to speak of the language going extinct: when the last speaker speaks no more.

The situation may become somewhat better-defined if we consider features of languages.  It's plausible that there was a last time that someone used urum for "our" with a plural noun (in the appropriate grammatical case, etc.), or a first time (probably some time earlier and/or in some other place) that someone said oure in the like situation.

This at least clarifies some of the difficulties of the exercise.  First, features do not arise or disappear everywhere at once.  At some point, there would have been some people who said urum or oure, as the case may be, and some who didn't.  At some later point there were fewer who said urum and more who said oure, and eventually, no one was saying urum anymore.  And even that may be an oversimplification.
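The urum-to-oure story is the classic S-curve of a variant spreading through a speech community.  A toy logistic model shows the shape; the starting share and growth rate below are invented numbers, not data about these particular words:

```python
# Toy model of one form ("oure") displacing another ("urum").
# The 0.05 starting share and 0.5 growth rate are invented, chosen
# only to illustrate the S-curve shape of variant replacement.
share_oure = 0.05                 # a few early adopters
history = [share_oure]
for generation in range(25):
    # Growth is driven by contact between users of the two forms,
    # so it is fastest when both are common (logistic growth).
    share_oure += 0.5 * share_oure * (1 - share_oure)
    history.append(share_oure)

print(f"after 25 generations, {share_oure:.1%} say 'oure'")
```

The curve starts slowly, accelerates through the middle generations, and flattens out as the old form dwindles, which is roughly the trajectory described above.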

Second, what set of features we call "Old English" and what set we call "Middle English" is largely arbitrary.  More realistically, as time went on, there were more people speaking English with what we would now regard as Middle English features and fewer speaking with what we would regard as Old English features.

Except that we still retain some features of Old English, for example, the words is, on, to and he from Ælfric of Eynsham in the previous post.  When we try to draw a line between Old and Middle English, we're really looking for features unique to each.  We could plausibly say that when there are, say, no longer any people who speak with more of Old English's unique features than Middle English's, Middle English has taken over.  When no one is using that particular set of features at all, we could say that Old English is extinct ... but then we should be careful not to choose any features for that set that do survive.

If this has a sort of unsatisfying feel to it, it's because the whole exercise is of limited use.  At the bottom of all this careful definition is a distinction without a difference.  The real story is that features come and go, with enough differences eventually accumulating that we start to feel that two different sets of features denote different languages.


Now, why did I say "have been extinct" above?  Manx is, it turns out, no longer dead.  There aren't many speakers, and perhaps none who speak exclusively Manx, or few that could be said to speak Manx as their first language, but people are speaking Manx again, with pride.  This is thanks not only to modern Manx speakers, but in part to recordings of Ned Maddrell and others, and to the efforts of the linguists who made those recordings and otherwise worked to preserve a record of the language.

Wednesday, October 1, 2014

Zeno and the history of English

Kids these days.  They don't talk like we did when I was a kid.  I'd give some examples, but they're probably already out of date.  You know what I mean, though.  It's one of the constants of life, the next generation doing things a bit differently from the last.  O tempora o mores.

When trying to make sense of the world's languages, dialects, accents, jargons and such, it's natural to look at differences among speakers, or among groups of speakers.  French speakers speak a different language from Italians.  New Yorkers, for the most part, speak differently from people in Clinch County, Georgia, or County Derry, Ireland.  It's easy, though, to neglect differences in a particular language over time, even though they can be just as significant.

Changes over time are in some ways similar to differences among contemporaries.  If someone were to start speaking, say, the English of Beowulf to a group of modern English speakers, they would likely hear it as just another foreign language.  At that level of remove, there's little chance that a modern speaker would think "That person sounds like they're trying to sound like someone from pre-Norman times, but they're definitely speaking English."

There is, however, one salient property of changes over time: Unlike differences between languages at any given time, they have to be small.  Even if a parent and child speak differently, they still need to understand each other.  Granted, this is generally more important to the older generation.  Kids throughout history have always been quick to invent their own vocabularies and otherwise make themselves more easily understood by each other than by their elders.  Even so, from a linguistic point of view they are speaking essentially the same language.

It takes quite a while for that gradual change to amount to what we would call a new language.  For example, here's a bit of a poem (They flee from me) that I studied in college.  I've modernized the spelling, but otherwise the words are the same:
I have seen them gentle, tame, and meek,
That now are wild and do not remember
That sometime they put themself in danger
To take bread at my hand
This was written by Thomas Wyatt in 1535, almost 30 years before Shakespeare was born, but that wouldn't necessarily be your first guess, would it?

As we go back further, change becomes more apparent, but again gradually.  Here's a bit of Chaucer (The Former Age), again with modern spelling, from the late 1300s:
No man yet knew the furrows of his land,
No man the fire out of the flint yet found,
Unkorven and ungrobbed lay the vine;
No man yet in the mortar spices ground
OK, a couple of unfamiliar words (unkorven may be translated as "unpruned" and ungrobbed as "uncultivated"), but then what do bodkin and fardels mean in Hamlet's soliloquy, written about 200 years later?  The word order is a bit weird, but just how do you parse the first lines of The Star-Spangled Banner, written about 400 years later?  Maybe that's just poetry for you.

I'm cheating a bit here by focusing on the written word, because there's ample reason to believe that English was pronounced considerably differently in the 1300s.  But then, contemporary speakers, depending on their dialect, will pronounce the same written words differently as well.  In the case of Chaucer, I'm also cherry-picking a bit.  Some passages of Chaucer sound considerably less familiar, and some contemporaries of Chaucer in other parts of England would be less familiar still.

Continuing our journey back in time, here's a passage from The Peterborough Chronicle, from its annal for 1140:
Tha was England suythe todeled: some helden mid the king.  Some helden mid the empress.  For tha the king was in prison, tha wenden the earls to rich men that he never more should come out, sahtleden mid the empress, brought her into Oxford, iauen her the burgh.
This still seems something like English, but it's getting harder to decide what to clean up as just a matter of spelling, and what's just different.  For example, with is now mid, in line with the other Germanic languages.  Verbs have the -en ending when the subject is plural (as they do in Chaucer as well, though not in the passage I chose).  I could just as well have papered that over and written held instead of helden and, stretching a bit more, went instead of wenden.

That funny word iauen is really just gave: the letters u and v were not distinguished until later, and quite likely the i at the beginning is like the y in yellow, corresponding to an initial g in other dialects, so we get gaven, or plain gave when you lop off the -en.  That tha seems a bit like then and a bit like when, not unlike modern dialects that use what where others would use that.

It's worth noting that this was written with the Norman conquest of 1066 still in living memory, though only just.  Given that, I'm actually a bit surprised there aren't more French borrowings.  The only ones that stick out are prison and empress.

Go back not too much further, before the conquest, and we have something not too much different from the Peterborough Chronicle, but different enough from modern English that trying to smooth over the differences is a lost cause.  Here is Ælfric of Eynsham discussing the reflection in grammar of Christian theology.  First a straight transliteration (but with modern punctuation):
Oft ys seo halige þrinnys geswutelod on þisre bec, swa swa ys on þam worde þe God cwæð: 'Uton wyrcean mannan to ure anlicnisse'.  Mid þam þe he cwæð 'Uton wyrcean' is seo þrinnys gebicnod; mid þam þe he cwæð 'to ure anlicnisse' ys seo soðe annis geswutelod: he ne cwæð na menfealdlice, 'to urum anlicnissum', ac anfealdlice, 'to ure anlicnisse'.
Here's an attempt to smooth over the spelling differences and such, taking a few more liberties than in the last example:
Oft is the holy threeness geswutelod on this book, so so as the words that God quoth, 'Uton work man to our (an)likeness'.  With þam þe he quoth 'Uton work' is the threeness gebicnod; with þam þe he quoth 'to our (an)likeness' is the sooth oneness geswutelod: he ne quoth na manifoldly (many-fold-ly) 'to our (an)likenesses', but one-fold-ly, 'to our (an)likeness'.
(That's sooth, as in soothsayer, meaning 'true', not the modern verb soothe).  And finally, glossing the words that don't seem directly related to modern ones, and updating a few that are,
Often is the holy trinity revealed in this book, just as in the words that God said, "Let us make man to our likeness."  With that he said 'Let us make' is the trinity indicated.  With that he said 'to our likeness' is the true unity revealed: he did not say, in the plural, 'to our likenesses', but, in the singular, 'to our likeness'.
Even with the words glossed, the word order is a little funky, that "with that" is only approximate and doesn't parse easily, and the ne ... ne ... construct, still used by Chaucer and with traces here and there in Shakespeare, is gone now, the closest remnant being neither ... nor.  The forms of words have changed significantly (halige vs. holy, annis vs. oneness etc.).  There is more and different inflection in the original, both in nouns and pronouns (ure anlicnisse vs. urum anlicnissum, worde as the plural of word) and verbs (ge- ... -od for the past participle instead of just -ed today).

All in all, this is more a translation than a modernization.  Only after a third round of adjustments would Ælfric's English really look like modern English.

Even though the Peterborough Chronicle is in an East Midland dialect that is not a direct ancestor of Chaucer's, and Ælfric spoke a Wessex (West-Saxon) dialect which is, and the writers are writing in considerably different forms, it's not hard to see that these are examples of the same language, but a language which is, overall, changing gradually over time.

Thus even though we can point at plenty of texts and say "This is Middle English", or "This is Old English", there is no time in history we can point at and say "Middle English started here" or "Modern English started here".  As far as anyone at the time was concerned, they were speaking English.  No "Old" or "Middle" or "Modern" about it.

Even if you did pick some set of speakers and say, "the parents spoke Middle English, but the children spoke Modern English", you'd be hard pressed to tell the difference between the two.   There would be much more difference between our Modern English and theirs than between the parents and the children.

Gradual does not mean imperceptible, however.  We can trace the origins of newer words like blog or semiconductor with a fair degree of certainty.  We can remember words and constructions that people don't say much any more (at least in most places), like To whom did she give the book? instead of Who did she give the book to?  Going further back to old written sources, we can get a reasonable idea of when people stopped using -en as a regular plural marker for nouns, though of course that didn't happen all at once.  We can see individual changes happening; it's just that there are so many parts to a language that it takes quite a few of them -- many more than typically happen in a lifetime -- before we decide to call the result a new language.

Drawing a precise line between an older form of a language and a later form is largely arbitrary, like trying to pick a boundary between green and blue.  You can do it, but wherever you pick, there will be a reasonable argument for picking some slightly different boundary.  Even so, the language has changed and continues to.  You can quite rightly say "no one saw Modern English arise from Middle English," and yet it did.

This is all reminiscent of Zeno's arrow paradox: If an object in a definite place is at rest, and at any given moment a flying arrow is in a definite place, then, it would seem, a flying arrow must always be at rest, so how can it move at all?  Untangling this sort of thing rigorously is the wellspring of the mathematical field of analysis, but for most purposes it's enough to know that the arrow moves anyway, somehow unaware that it's not supposed to be able to do that.

Tuesday, June 17, 2014

Reading University and Mr. Turing's test

The BBC reports that a chatbot called Eugene Goostman has passed the Turing test, marking the first time in history that this has happened.  Or at least, that's what you'd gather from the headline (granted, it doesn't give the name of the chatbot).  The BBC, not having taken complete leave of its senses, explains that this is the claim of the team at the University of Reading that ran the test, and then goes on to cast a little well-deserved doubt on the idea that anything historic is going on.


So what's a Turing test?

Sixty-four years ago, Alan Turing published Computing Machinery and Intelligence, in which he posed a simple question: Can machines think?  He immediately dismissed the question as essentially meaningless and proposed an alternative: Can a machine be built which could fool a person into thinking that it (the machine) was a person?

In Turing's setup, which he called "the imitation game", there would be a judge who would communicate with two players, each claiming to be a person.  The judge and players would communicate via a "teleprinter" or other intermediary, so that it would not be possible to point at one of the players and say "That one's a machine, duh".  Turing goes into quite a bit of detail on points like this that we would take for granted now.  Your favorite instant messaging system is good enough for the task.  On the internet, nobody knows you're a dog.

Later in the paper Turing makes a pretty audacious claim, considering it was made in 1950 and the supercomputers of the time had somewhere on the order of 16K of memory.  In case you've forgotten how to count that low, that's 1/64 of a megabyte, or about a millionth of the capacity of a low-end smartphone.  Turing's prediction:
I believe that in about fifty years' time [that is, by around the year 2000] it will be possible to programme computers, with a storage capacity of about 10⁹ [bits], to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.
10⁹ bits is about 128 megabytes, not an unusual amount of RAM for a computer in the 2000s and a remarkably good prediction for someone writing in 1950.  Keep in mind that Turing wrote this well before Moore formulated Moore's law, itself a good source of misinterpretations.
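For anyone who wants to check the arithmetic, the only assumptions are 8 bits to the byte and the decimal-versus-binary megabyte distinction:

```python
bits = 10**9                       # Turing's predicted storage capacity
bytes_total = bits // 8            # 125,000,000 bytes
mb_decimal = bytes_total / 10**6   # decimal megabytes
mib_binary = bytes_total / 2**20   # binary megabytes (MiB)
print(f"{mb_decimal:.0f} MB decimal, {mib_binary:.0f} MiB binary")
# -> 125 MB decimal, 119 MiB binary -- "about 128 megabytes" either way
```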

Turing was a brilliant scientist.  He helped lay the groundwork for what we now call computer science, played a key role in pwning the German Enigma machine during World War II, and thought deeply about the question of intelligence and how it related to computing machinery.  However, he got this particular prediction spectacularly wrong.

It didn't take fifty years to beat the imitation game.  It took more like fifteen.

In the mid 1960s, Joseph Weizenbaum of MIT wrote ELIZA, which purported to be a Rogerian psychotherapist.  You can play with a version of it here.  To be clear, this program wasn't actually trying to do psychotherapy.  It was more like a parody of a "nondirective" therapist whose goal is to stay out of the way and let the patient do all the meaningful talking.  Was it able to fool anyone?  Yes indeed.  So much so that it inspired Weizenbaum, after seeing people confide their deepest secrets to the program, to write a book about the limitations of computers and artificial intelligence.

ELIZA neatly dodges the difficulties that Turing was trying to present to the developer by making the human do all the thinking.  Say "The kids at school don't like me" and ELIZA won't respond with "I know what you mean.  At my school there was this bully named ..." and give you a chance to probe for things only an actual human who had been to school would know.  It will respond with something like "Why do you think the kids at school don't like you?"  It's a perfectly reasonable response, but it reveals absolutely zilch about what the machine knows about the world.

That's fortunate, because the machine knows absolutely zilch about the world.  It's just taking what you type in, doing some simple pattern matching, and spitting back something based, in a fairly simple way, on whatever patterns it found.  This works great for a while, but you don't have to wander very far to see the man behind the curtain.  Answer "Because I am." to one of its "Why are you ...?" questions, and it is liable to answer "Do you enjoy being?", because it saw "I am X" and tried to respond "Do you enjoy being X?"  Except there is no X in this case.
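The mechanism is easy to sketch.  The following toy responder, my own few-line illustration of the technique (nothing like the scale of the real ELIZA script), reproduces both the plausible response and the empty-X failure mode described above:

```python
import re

# A minimal ELIZA-style responder: a few regexes paired with canned
# templates that echo the captured fragment back at the user.
RULES = [
    (re.compile(r"(.+) don't like me", re.IGNORECASE),
     "Why do you think {0} don't like you?"),
    (re.compile(r"I am\s*(.*)", re.IGNORECASE),
     "Do you enjoy being {0}?"),
]

def respond(line):
    for pattern, template in RULES:
        match = pattern.search(line)
        if match:
            # Pure string surgery -- no understanding involved.
            return template.format(match.group(1).strip(" .!?"))
    return "Please go on."

print(respond("The kids at school don't like me"))
# -> Why do you think The kids at school don't like you?
print(respond("Because I am."))
# -> Do you enjoy being ?   (the empty-X failure mode)
```

All the apparent engagement comes from the human's own words being echoed back; the program has nothing resembling a model of kids, school, or being.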

The Eugene Goostman chatbot likewise dodges the difficult questions, but as far as I can tell it does it by acting flat-out batty.  Its website says as much, advertising itself as "The weirdest creature in the world".  When I first saw the Reading story on my phone, there were transcripts included.  These are somehow missing from the version I've linked to, but there is a snippet of a screenshot:
  • Judge: What comes first to mind when you hear the word "toddler"?
  • Goostman: And second?
  • Judge: What comes to mind when you hear the word "grown-up"?
  • Goostman: Please repeat the word to me 5 times
Sure, if you're told that you're chatting with an eccentric 13-year-old boy with English as a second language, you could take pretty much any bizarre response and say "meh ... sure, that sounds like something a 13-year-old eccentric non-native speaker might say ... close enough."  But so what?

The transcripts I saw on my phone were of a similar nature.  Apparently the Goostman website had run the chatbot online for a while, and you can find transcripts from people's interactions with it on the web.  The online version was soon taken down, perhaps from the sheer volume of traffic or, a cynic might say, because the game was up.

This is not the first time people have mistaken a computer for a human behaving outside the norm.  Not long after ELIZA, in 1972, psychiatrist Kenneth Colby, then at Stanford, developed PARRY (this was evidently still before mixed-case text had become widespread).  Unlike ELIZA, PARRY wasn't basically trolling.  It was a serious attempt to mimic a paranoid schizophrenic and so, if I understand correctly, to learn something about the mind of a person in such a state.

Colby had a group of experienced psychiatrists interview both PARRY and actual paranoid schizophrenics.  He then gave the transcripts to a separate group of 33 experienced psychiatrists.  They identified the real schizophrenics with 48% accuracy -- basically random chance and far below Turing's 70%.  That is, PARRY could fool the psychiatrists about 50% of the time, while Turing only expected 30%.
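The "basically random chance" claim survives a quick significance check.  Under the simplifying assumption (mine, not the study's actual design, which had each psychiatrist rating multiple transcripts) that 48% amounts to roughly 16 correct binary calls out of 33, the chance of doing no better than that by coin-flipping is:

```python
from math import comb

n, correct = 33, 16   # ~48% of 33 judgments, treated as one call each

# Probability of getting at most 16 of 33 right by pure guessing,
# summing the exact binomial distribution with p = 1/2.
p_at_most = sum(comb(n, k) for k in range(correct + 1)) / 2**n
print(f"P(at most {correct}/{n} by chance) = {p_at_most:.3f}")
# -> P(at most 16/33 by chance) = 0.500
```

In other words, 16-of-33 is exactly the lower half of the coin-flip distribution: the psychiatrists' performance is indistinguishable from guessing.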

This was from transcripts they got to read over, not from a quick five-minute exchange.  For my money this is a stronger test than Turing's original, and PARRY passed it with flying colors.  Over forty years ago.  Eugene Goostman fooled 33% of the judges (one suspects that the number of judges was a small multiple of three) in five-minute interviews by spouting malarkey.  Not even carefully constructed paranoia, just random balderdash.  Historic?  Give. Me. A. Break.

By the way, if you're thinking "ELIZA is pretending to be a psychotherapist, PARRY is pretending to be a person with mental issues ... hmm ..." ... it's been done.


Thing is, Turing's test just isn't very good.  In attempting to control for factors like appearance and tone of voice, it limits the communication to language, and printed language at that.  In doing so, it essentially assumes that facility in language is the same as intelligence.

But this is simply false.  A highly intelligent person can become aphasic, and there are cases in the literature of people who can speak highly complex sentences with a rich vocabulary, but show no other signs of above-average intelligence.  And, as we've seen, it's been feasible for decades to write a computer program that does a passable imitation of human language without understanding anything at all.  I believe there are also documented cases of humans failing Turing tests, but that's a different issue.

It turns out that we humans have a natural tendency to attribute at least some level of intelligence to anything that looks remotely purposeful.  For example, there is an experiment in which people watch two dots on a screen.  I don't recall the exact details, but I think the following gets the gist:

One dot approaches the other and stops.  It then backs off and approaches again, faster.  The first dot is now touching the second, and both move slowly in the direction the first dot had been going.  Ask a person for a description, and they'll likely say that the first dot was trying to get past the second and finally tried pushing it out of the way.

Throw in language and the urge to attribute intelligence is nearly overwhelming.  "OK", one finds oneself thinking, "it's maybe not completely grammatical, and it doesn't make much sense, but that's got to be because the person talking is a bit ... off, not because they're not intelligent at all.  They can talk, for goodness' sake."

Whether something passes the Turing test in practice comes down more to a judge's ability to set aside intuition and look for artifacts of pattern-matching approaches, like the "Do you enjoy being?" example above.

This assumption that language facility was a good proxy for intelligence ran through a lot of early AI, leading to an emphasis on discrete symbol-smashing.  You have to start somewhere, it's clear that understanding language has a lot in common with other signs of intelligence, and a lot of useful work came out of efforts to develop good symbol-smashing tools, but to some extent this is more like looking for your lost car keys where the light is brightest.  Computers are good at smashing symbols, or more generally, dealing with discrete structures, which would include words and sentences.  That's basically their job.

It's now looking like probability and continuous math have more to do with how our minds actually work.  Being able to communicate in the (more-or-less) discrete medium of language came along relatively late in the evolutionary game, long after other aspects of intelligence, and language itself doesn't behave the way we assumed it did fifty years ago.  Science marches on.

There's another problem with the Turing test, something that looks like a strength at first:  It's free-form.  The judge is allowed to ask any questions that seem appropriate.  There is no checklist of abilities to test for.  If the respondent claims to have trainspotting as a hobby, there's no requirement to find out if they know anything about trains, or their schedules, or the sound of a locomotive or the smell of overheating brakes.

More generally, there is no requirement to test for, say, understanding of metaphor, or the ability to learn a new concept or glark the meaning of a word from context.  There is no requirement to determine if the respondent understands the basic properties of objects, space and time.  And so forth.

To be sure, there's an obvious objection to imposing requirements like this.  It would lead to "teaching to the test".  Contestants trying to pass such a variation of the Turing test would naturally try to build systems that would be able to pass the particular requirements.

But that could well be a good thing.  It's surely better than seeing people grab headlines by writing a bot that spouts gibberish.  As long as the requirements are phrased abstractly we can still leave it up to the judges' ingenuity to decide exactly what metaphor to try or what specific questions to ask about space, time and objects.  At the end of the test we can expect these requirements to be covered, or invalidate the judge's result if they aren't, which we can't with a free-form test.

The particular list I gave doesn't necessarily cover everything we might want to associate with intelligence, but a system that can understand metaphors, space and time, and can learn new concepts, can reasonably be said to be "thinking" in a meaningful sense of the word.

Setting explicit requirements would also allow for variant tests that would accept forms of intelligence that were significantly different from ours.  For example, one very important part of being human is knowing what it's like to have a human body.   Being embodied as we are plays a large role in our cognition.  However, it's perfectly possible for something to be intelligent and, for example, not experience tastes and smells (indeed, some number of people have no such experience).

It seems reasonable to instruct the judges "We know this might be a machine.  Don't ask it what things taste like."  In the original Turing test, if the program came up with some plausible explanation for lacking taste and smell, a natural follow-up might be "What's it like not to be able to taste and smell?"  It's not clear that a machine would need to have a good answer to that in order to be intelligent.  If it didn't, the judge might have a good reason to think it was a machine even if it did in fact have some meaningful form of intelligence.  Either way the line of questioning is not helpful as a way of testing for intelligence.  In other words, distinguishing human from machine is not quite the same as distinguishing intelligent from unintelligent.

Hiding behind all this is one more shaky assumption: Something is either intelligent or it isn't.  Even though Turing properly speaks of probabilities of guessing correctly, there is still the assumption that a machine is either successfully imitating a human or it isn't.  Suppose, though, that a machine is really good at some area of knowledge and the judges happen to ask about that area 31% of the time.  That machine would pass the Turing test (in the popular but not-quite-accurate sense), but what does that mean?  Is it 31% intelligent?


I wouldn't lay much of this at Turing's feet.  He was doing pioneering work in a world that, at least as far as computing and our understanding of human cognition are concerned, was starkly different from the one we live in, and yet he managed to hit on themes and concepts that are still very much alive today.  Nor would I blame the general public for taking a claim of a historic breakthrough at face value.

But the claim itself?  Coming from a respected university?  Granted, they seem mostly hyped about the quality of their test and the notion that nothing else so far has passed a "true" Turing test.  But this seems disingenuous.  What we have here is, maybe, a more methodologically faithful version of Turing's test, which was passed by a mindless chatterbot.  The only real AI result here is that a Turing-style imitation-based test can be beaten by clearly unintelligent software.

This is not a new result.

[The Wikipedia article on Eugene Goostman makes a really good point that I never caught: Turing predicted a 30% success rate.  He didn't define that as some sort of threshold for intelligence.  Thus, fooling 30% of the judges doesn't mean that something "passes the Turing test and is therefore intelligent".  It's just confirming Turing's prediction about how well machines would be able to win the imitation game.]

Saturday, June 7, 2014

Words and their senses

What does "out" mean?  Well, it's the opposite of "in", whatever that may mean.  Except, not in all cases.  Sure, "I went out the door" means pretty much the opposite of "I went in the door", but "I've got an in at the State Department" means I know someone there and I might be able to use that connection to advantage, while "I have an out at the State Department" means ... maybe it means I have an escape route from my current predicament via the State Department, maybe because of the in I have there?

Just off the top of my head, "out" might mean any of a dozen things:

  1. adj Unconscious ("He's out like a light")
  2. adj No longer holding a position ("She's out as CEO of Frobco")
  3. adj Retired as a batter according to the rules of baseball ("He's out at third")
  4. n An instance of retiring a batter according to the rules of baseball ("That's the third out")
  5. adj Openly known to have a particular sexual orientation ("She's an out bisexual")
  6. v To reveal someone to have a particular sexual orientation ("He outed her as a bisexual")
  7. v To reveal someone to have a clandestine role or identity ("The administration inadvertently outed a CIA agent")
  8. adj Not a member of a particular social group ("They always make me feel so out")
  9. n A potential means of escape from a situation ("In tense negotiations, it's always good to have an out")
  10. adj Not in tune ("I think your A string is out")
  11. adv Deliberately in a different key and/or time signature from the rest of an ensemble ("Vernon likes to play really out when he solos")
  12. adj Not at home ("Sorry, Mr. Smith is out")
and so forth.  It's easy to come up with more.  I've by no means hit even all the usual senses.  My laptop's dictionary gives nine adverbial senses, nine adjectival senses, one prepositional, three nouns and two verbs.  I've heard of lists of over a hundred senses of the word.  It depends on how finely you want to slice it.  I'll come back to that.

Clearly some of these senses are more closely related than others.  The baseball senses 3 and 4 are clearly related, with the noun sense derived from the adjective.  Senses 5, 6 and 7 form another cluster: the verb sense has got to be derived from the adjective sense, and that in turn must be related to phrases like "the secret is out".  On the other hand, it's hard to see how senses 3 and 4 have much to do with senses 5 - 7.

Nonetheless, the use of "out" (or any other word) is not completely arbitrary.  There is a well-developed linguistic theory behind this (well, probably several, but I happen to like this particular one).  At the core, a word has a small number of well-defined senses.  In the case of "out", there is a boundary enclosing something.  What's enclosed is in.  On the other side is out.  Thus when you go out a door, you are going from the space you were in, into another space (which may also be enclosed -- you can go out of one room and into another).  You can draw a circle on the ground and say someone or something is in the circle or out of it, and so forth.

From these basic senses, we build new senses metaphorically.  We can imagine a boundary between the members of a social group and anyone out of the group.  We can talk about entering or leaving a state of being ("Moishe led us out of slavery"), and so forth.

Crucially, these metaphors are not just literary fancy.  They are directly meaningful and productive (in the linguistic sense that we can spontaneously create new instances of the metaphor).  If a social group has a boundary, we don't just say someone is in the group or out of it.  We can welcome someone in.  We can kick someone out.  We can expand the group.  We can designate an inner circle, and so forth.  If we're feeling more creative, we can say something that sounds more "metaphorical" in the usual sense ("The boundaries of the group were porous.  People floated in and out with the tides.")

In the case of "out" we can start with the basic boundary-oriented sense and build, well, outwards:
  • A container can have something in it, and you can take things out of it
  • If the container holds a fluid, you can pour it out of the container
  • or it can run out (or flow out, or seep out, or pour out)
  • When there's no longer anything in the container, you have run out of whatever was in it (or you're just out of it)
  • Any resource can be considered to be a fluid, so you can run out of time (or say time is running out), run out of money, or energy, or whatever.
  • The resource need not be physical.  You can run out of patience.
  • A burning lamp or candle consumes its oil or wax.  When there's none left -- it has run out -- the lamp or candle has burned out.
  • Likewise, any flame can burn out as it runs out of fuel, for example a rocket can burn out
  • Someone exhausted is burned out
  • An electric light performs the same role as a lamp or candle.  When you shut off the current to it, it is out.
  • A person's consciousness is likened to a light or flame, and an unconscious person is out (sense 1 above).  You can also drift in and out of consciousness, but that's considering a state of being as something bounded -- a separate metaphor.
Coming back toward the core senses
  • A social group or position is a bounded area, metaphorically.
  • As noted above, one can be in or out of a group or position, or leave it (senses 2 and 8; also, being in a job or out of it, leaving a job, etc.)
  • Information can be regarded as a substance (you can share it, hide it, have a lot or a little of it, etc.).  You can't see through an opaque container, but once the information is out of the container (out in the open), others can know it (senses 5-7 ... the secret is out)
  • Obviously, you can be in or out of your home (sense 12)
  • We can speak of some quantity being within or outside of given limits, a special case of a state of being as a bounded area.
  • A note too far from its correct pitch is out of tune (sense 10)
  • A musician diverging from the key and/or time signature of a piece is clearly crossing those limits (sense 11)
  • A difficult situation limits one's potential actions, like being in a confined space.  It's good to have a way out of such a space (sense 9)
That leaves baseball.  The baseball usage derives from the game of cricket, where a batsman stands in his ground, guarding a wicket.  When the wicket falls (for any of a number of reasons I won't even try to enumerate), the batsman is out of his ground, or just out (well, you can also be out of your ground but not out, for example if you're running between the wickets, but clearly out is related to in/out of one's ground).

So there you have it.  By a series of metaphors and analogies, we can connect seemingly unrelated senses of a word, like senses 1, 4 and 7 above, back to basic, familiar and physically-based concepts.

Of course, if you try hard enough, you can connect anything to anything.  I remember learning the following chain of reasoning as a kid, explaining why fire engines were red, starting with a very basic premise:
  • 1+1 = 2
  • 2+2 = 4
  • 4+4 = 8
  • 8+4 = 12
  • There are 12 inches in a ruler (at least in the US)
  • Queen Mary was a ruler
  • The Queen Mary was a ship
  • Ships sail in the ocean
  • The ocean is full of fish
  • Fish have fins
  • The Finns fought the Russians
  • Russians are red (this was back in the Cold War days)
  • Fire engines are always rushin'
  • And that's why fire engines are red
If only fire engines were still red.

How do we know that this whole sense-extension exercise isn't just another chain of silly reasoning?  A metaphor doesn't just relate anything to anything else.  There are two basic rules:
  • One thing will be more concrete than the other.  E.g., holding a substance in a container is more concrete than holding information in one's mind.
  • The metaphor connects the two things coherently, in multiple places.  You can bring things you know about the more concrete thing to the more abstract thing, at least until the metaphor breaks down.  In an example from a previous post "The stop sign was a fire engine" is not a coherent metaphor, even if both are red.  On the other hand, if you have information in your head, you can give it out.  You can have a lot or a little of it.  You can have information crammed into your head until it's about to explode, and so forth.
Even if the explanation above isn't airtight, I hope I've at least presented a plausible case that the senses in which we use words can be explained reasonably well by this sort of metaphoric extension, plus some other rules covering, for example, how we convert adjectival senses like sense 3 to noun senses like sense 4.  Real linguists have, of course, studied this in much more detail.


How many senses does a word have?  Some, like copernicium or Pinophyta, probably only have one, but even highly technical terms like algebraic can have multiple, related senses, following the same core-with-extensions pattern as anything else.  Some very basic words, like the prepositions or head and go have large numbers even by the relatively conservative standards of dictionaries, particularly when you count idioms like head out or go out.  Most words we commonly use have at least a few widely-used senses, and likely several more specialized senses, particularly if you include slang -- which you always should, if you're really trying to understand a language.

But how finely should we really slice?  Senses 6 and 7 above are very nearly the same thing.  Historically, as far as I'm aware, sense 6 precedes sense 7, but is this because outing someone's orientation is a different thing from outing someone's clandestine role, or because the sense of "revealing something that had been secret" was at first only applied to orientation?

I would lean toward the latter, if only for practical reasons.  Otherwise we would have to consider each application of a word its own sense.  Being out of the position of CEO would be a different sense from being out of the position of vice president, or janitor.

All right, then, but then couldn't we argue that all uses of "out" are just different applications of the basic idea of being outside a boundary?  It seems clear that there's something in common at the core of the various senses above, but saying that they're all just uses of the same basic word leaves something out [ahem].

This is especially apparent if you're trying to paraphrase or translate a word.  In senses 6 or 7, we can use basically the same paraphrase: He revealed her to be bisexual; The administration inadvertently revealed the identity of a CIA agent, and sense 5, apart from using a different part of speech, more or less works.  She's a revealed bisexual sounds a bit funny, but it at least makes sense, as opposed to He's revealed like a light for sense 1, or He's revealed at third for sense 3.  On the other hand The secret is revealed works fine for The secret is out.  The paraphrase test isn't foolproof, but it seems like a good starting point.

From the point of view of metaphors, a sense of a word corresponds to a particular metaphor.  Senses 5-7 and The secret is out use the metaphor of information being a substance in an opaque container.  Sense 1 uses a series of metaphors:  consciousness is a light, lights can be out because lamps and candles have fuel and fuel can run out.  If different paraphrases or translations are different ways of expressing the metaphor, then the paraphrase test makes sense.  In short, if you're applying the same metaphor in a new context, you're using the word in the same sense, and if you're not, you're not.



Sunday, May 11, 2014

Pan proiiciens

Humans have several unique qualities (and many more not-so-unique), but one that may not come readily to mind is that we throw things, and we throw them very well.  No other animal we know of could come remotely close to doing this, or this, or this.  Even an average human's throwing abilities are far beyond anything else in the natural world.  If I said that I'd seen someone throw a ball 50 meters, no one would think twice.  If I said that I'd seen a horse throw a ball 50 meters, the most likely response would probably be "No, you didn't", maybe followed by "How??"

If you think about it, it's actually somewhat surprising that humans would be unique when it comes to throwing things.  It's a very useful skill.  If you can bring down a small mammal with a well-aimed rock, you'll eat much better than if you have to run it down.  Even if you're a very good runner-downer, it still takes way less energy to throw a rock.  Throwing a rock can also get a high-hanging fruit out of a tree (though not easily).  Throwing is a good way to get something you're carrying up onto a high ledge that you can then get to with all your limbs free, and so forth.  You'd think something would have stumbled on it before our ancestors did.

On the other hand, a lot of body plans just aren't that well suited to throwing.  Birds have one set of limbs serving as landing gear and for locomotion when not airborne, and the other given over to wings.  That leaves the head and beak as all-purpose picker-uppers, severely limiting throwing potential.  Birds can throw things by picking them up and then heaving the head and letting go, but not very far or forcefully.

Fish and other sea creatures live in a viscous medium where throwing is not particularly useful.  Most mammals have all four limbs specialized for walking, running, jumping and so forth, leaving them in much the same spot as birds.  Lizards are also pretty well tied to the ground or other surface they're traveling on -- particularly the limbless ones, to say nothing of actual snakes (snakes are closely related to lizards, but there are also some proper lizards that lack limbs).

There's an obvious common thread here: To throw effectively you need a free limb, one not specialized to supporting your weight.  You need hands, as opposed to front feet.  You don't need to be fully bipedal, though that clearly ought to help, but you do need to be able to grab something, stand up on your hind limbs and let fly.

There aren't many animals in that category, but several primates are, by virtue of having hands (and feet, and tails) adapted to grabbing and swinging from tree branches instead of always walking on all fours.  Putting all this together, perhaps it's not so surprising that throwing would have only evolved recently, in a branch of the primate family tree, itself relatively recent.

It is a principle of evolution that behavior tends to change first, and then anatomy follows.  First certain fish started coming up out of the water, whether to escape from other fish who couldn't or to move from a shallow, drying-up pond to a deeper one, or for whatever reason. Later came Tiktaalik and its kin with pectoral fin bones (and other features) better adapted to life out of water.

Once the new niche was established on land, there was plenty of selection pressure to reshape the body for living on land, and eventually lose the older adaptations for water living entirely.  More strictly speaking, the new behavior of coming up on land meant that fish with more land-adapted bodies could have better survival chances.

In the case of primates throwing, it's not that other primates don't throw.  Some species of monkeys are well-known to throw excrement at others of their species, and chimps will throw things as a threat.  It doesn't seem to matter much to them what they throw, but it will generally be branches and occasionally rocks.  Chimps don't show signs of throwing with the same purposes we do, but it's significant that some sort of throwing behavior is established in our near relatives.  It's therefore plausible, though not certain, that it was also established in our common ancestors.

While chimps are known to hunt, and are known to throw things, and are known to be reasonably intelligent, they are not known to throw for purposes of hunting.  This is not completely shocking, as they tend to hunt small monkeys in trees when they hunt.  The strategy is to go in as a group, block off escape routes and send one member of the hunting party in for the kill.  Rock throwing would probably not help much in such a situation.  You'd have to encumber a hand carrying a rock up into the trees, and then you'd only have one rock, a bunch of branches in the way, and a monkey that would not be well-inclined to staying still in the path of a hurtling rock.

But imagine a tribe of ancestral chimps a couple of million years ago venturing out of the forest.  These wouldn't be exactly like today's chimps, of course, but today's chimps appear to resemble these ancestors much more closely than we do, so we can consider them ancestral chimps here.  In 1993, William Calvin laid out a hypothesis that a good way for such creatures to get food would be to stake out a water hole and go after the herds that gathered there to drink -- as several other predators do.

Naturally, herds of gazelles and such are adapted to dealing with predators around water holes.  The main strategy is to stampede away, which works well, except that any animal that trips or falls is likely to be trampled by the herd and left behind as an easy meal.  This actually works reasonably well for the herd as a whole (from an individual's view, if you have this behavior you're more likely to be one of the survivors than if you don't), and it works for the hunters as well.  It just doesn't work particularly well for the particular animal left behind.

Now suppose you hit upon a way to make a herd stampede, and at the same time make one of its members stumble and likely be trampled.  That would work even better for you, though again not so well for the unlucky victim.  Calvin's idea is that there's an easy way to do this, that those ancestral chimps could have done with the mental and physical abilities they had: Throw a reasonably-sized rock into the herd.  If it hits one member, that member will stumble, the herd will startle and, with a bit of luck, trample the stumbler.   This doesn't have to happen every time, just enough to make it noticeably easier to get food from the herds gathered at the water hole.

At that point, we're off to the races.  Any change, whether in behavior or anatomy,  that improves throwing force or accuracy, will make for better hunting and better-fed primates.  Richard Young cites paleontological evidence to support a claim that exactly this happened during the past couple million years -- the body, and the hand in particular, became better and better suited to throwing (and to clubbing, another area where we excel).  Doing these well requires a number of changes from the long-fingered, short-thumbed tree-branch-hooking hands of the rest of the chimp family.

Calvin, for his part, speculates that the famous Acheulean hand axe was actually an improved weapon to throw into a herd, as it would be more likely to cut into the prey's hide and cause it to instinctively collapse, making it more likely to be trampled.  As always, this doesn't imply that the hunters were thinking it through in that much detail.  It's enough that throwing a sharp rock works better than throwing one that isn't, and that some in the population had an innate proclivity for chipping away at rocks and so making them sharper.

It's been argued repeatedly that, if we weren't the ones doing the classifying, or weren't so subject to a certain prideful feeling of distinctness from the rest of nature,  we would be classified in the same genus as chimps and bonobos.  Either they would be Homo troglodytes and Homo paniscus, or we would be Pan sapiens.  Under the latter scheme, those savanna-dwelling rock throwers might best be called Pan proiiciens -- throwing Pan, the name Pan being taken from the Greek god of the forest, used to designate the forest-dwelling chimps and bonobos.  Except these particular Pan would be leaving the forest.

Behavior drives anatomical change, but anatomy limits behavior.  Just as only some fish had the right anatomy to even try moving on land, only some animals had the right kind of body to try throwing things.  Primates happened to be close enough, again likely because their hands had already adapted for something else besides walking, something that allowed for grabbing and flinging.  This is a common pattern in evolution.  It used to be called pre-adaptation, but that gives the impression that, for example, primate hands evolved for grabbing tree branches so that they could later evolve for throwing and clubbing.  That's not how it works, so the ungainly but unbiased term exaptation is preferred.

There is another interesting point that Calvin makes.  It's not enough to have the anatomy.  You need to be able to control it.  It takes around a tenth of a second for our bodies to carry out a conscious command.  To throw a projectile accurately [as we do now, as opposed to lobbing rocks into a herd -- D.H. Sep 2015], you need to time your movements within about a hundredth of a second.  Once you've decided to throw something, it's far, far too late to make adjustments as you go along.  You have to have the whole program ready to go ahead of time, adjusted for where your target is -- or will be when the projectile reaches it.  Calvin speculates that this sort of plan-ahead was re-purposed into our control structures for things like language.

I'm not sure I quite buy this.  I doubt that throwing is the only thing in evolutionary history that requires this sort of plan-ahead.  Surely when a hawk dives for a mouse or a cheetah jumps for a gazelle it is doing the same sort of thing under similar constraints.  Nonetheless, it's an interesting idea, and plausible in a general sense, that the original behavior change of throwing things would have brought on a host of other changes, in both behavior and anatomy, that led to wholly new behaviors like speech and large-scale planning (for lack of a better term for the type of planning that we do and other animals don't).

Tuesday, April 29, 2014

This is not your uncle Benoit's Mandelbrot set

I've just thrown away a previous draft of this post, in which I tried to make some overarching point about the interaction between technology and art, and tie that in to the current generation of 3D fractal art.  That didn't work out so well, so maybe I'll just give a few impressions.  All in my personal opinion, of course.  I don't pretend to be an art connoisseur.


Early fractal art worked mainly as eye candy.  A typical Mandelzoom was an assault of wild shapes and saturated colors, looking vaguely organic, or perhaps suggesting a paisley print, or maybe something you might see when you rubbed your eyes.  Striking images.  Trippy, beautiful, even.
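All that visual riot comes from a remarkably small amount of computation.  Here's a minimal sketch of the standard escape-time iteration behind a Mandelzoom (the function name and parameters are mine, not from any particular renderer):

```python
def escape_time(c, max_iter=100):
    """Iterate z -> z*z + c from z = 0 and report how many steps it
    takes for |z| to exceed 2.  Points that never escape within
    max_iter steps are treated as members of the Mandelbrot set."""
    z = 0
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter

# A renderer evaluates this over a grid of complex values c and maps
# each escape count to a color; points that survive all max_iter steps
# are conventionally drawn black.
```

The wild shapes and saturated colors are entirely in how the artist maps escape counts to a palette and chooses where in the plane to point the window, which is exactly the human element described below.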

As with any algorithmic art, there is a vital human element that gives them the claim to be called "art" -- whether good, or bad, whether your cup of tea or not.  Someone took the time to search the infinity of the Mandelbrot set for an image to their liking, to frame and color it, and to present it.  While not strictly fractal art, Electric sheep is an interesting example of this human-algorithm interplay, on two levels.  On the one hand, someone (Scott Draves) came up with the idea and got it going.  On the other hand, the actual curation is done collectively by thousands of people on the web.  Very cool.  I've got it as live wallpaper on my phone as I write.

But ... it's all sort of ... detached.  It's cool, without a doubt, but it's also cool in the sense of "aloof".  I get the impression of a mind at work, but an emotionless, alien mind.  Maybe I'm biased by knowing too much of the mechanics behind it all.  I'm sure others have warmer impressions, or more easily see familiar objects like trees or flowers, but to me, even when the images suggest some sort of life form, they look like something other.  Which, I'm sure, is exactly the point to a lot of folks.



In 2009, after many not-quite-successful experiments by many people, Daniel White hit upon the Mandelbulb, a way to generalize the process behind the Mandelbrot set to three dimensions in a way that people generally agreed was as cool as the original 2D set.  In 2010, Tom Lowe came up with a different way of generalizing to higher dimensions, called the Mandelbox, which was also deemed worthy.  Strictly speaking, both the Mandelbulb and Mandelbox are actually families of shapes with infinitely many possible parameter settings, some cooler than others, of which the artist is free to pick the coolest, but they're generally referred to as single entities.
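For the curious, the commonly cited version of White's generalization (as refined with Paul Nylander) replaces complex squaring with a spherical-coordinate analogue: raise the point's radius to a power n and multiply both of its angles by n, with n = 8 the popular choice.  A rough sketch, with my own function names and the usual conventions:

```python
import math

def triplex_pow(v, n):
    """'Triplex' power: raise the radius to n and multiply both
    spherical angles by n (one common convention; others exist)."""
    x, y, z = v
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return (0.0, 0.0, 0.0)
    theta = math.acos(z / r)   # polar angle
    phi = math.atan2(y, x)     # azimuthal angle
    rn = r ** n
    return (rn * math.sin(n * theta) * math.cos(n * phi),
            rn * math.sin(n * theta) * math.sin(n * phi),
            rn * math.cos(n * theta))

def bulb_escape(c, n=8, max_iter=50):
    """Escape-time test for the degree-n Mandelbulb, by direct
    analogy with the 2D iteration z -> z*z + c."""
    v = (0.0, 0.0, 0.0)
    for k in range(max_iter):
        if math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2) > 2.0:
            return k
        p = triplex_pow(v, n)
        v = (p[0] + c[0], p[1] + c[1], p[2] + c[2])
    return max_iter
```

Varying n (and the angle conventions) gives the infinite family of shapes mentioned above; n = 8 is simply the setting most people agreed was cool.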

Being truly three-dimensional (in that they're embedded in three-dimensional space), these sets offer possibilities well beyond those of two-dimensional fractals.  While some 2D fractals may suggest depth, these have it.  A video rendering is truly a trip through an alien space.  Technically, any fractal (or at least any non-linear fractal, where the structure doesn't simply repeat as you zoom in) has infinitely elaborate detail, but these 3D fractals somehow seem to have more infinitely elaborate detail.

They can also seem more organic, particularly if you play with the formula a bit.  Note the images at the bottom.  They look like they grew, rather than having been produced by a purely mechanical process.  Even the more mechanical-looking images seem more made than found.  In a normal Mandelzoom, it feels like a computer (or math itself) provided the image.  Many of the 3D structures give the strong impression someone made them and the algorithmic process managed to stumble upon them -- see the intricacy of the design, the attention to detail, the exotic esthetic -- or even that they somehow made themselves.

If anything, the feeling of alien-ness is even stronger than for a 2D mathscape.  Even the most organic-looking forms, though they may seem like they ought to be part of some coral reef or fungal growth, seem about as foreign as they could look and still seem familiar.  No puppy dogs or Old Master still lifes here.

The artist has a fair bit of latitude in choosing the color palette, location, scale, viewing angle, lighting and so forth, but the underlying shape is still the underlying shape.  Still, there's only so much you can do with a landscape so fundamentally bizarre, either to make it more familiar or more otherworldly [many 3D Mandelpictures have some sort of height map or similar manipulation that can impose a chosen shape on the underlying contours of the fractal.  Old-fashioned texture mapping and other rendering techniques can be brought to bear as well, along with compositing and other image manipulation techniques.  The artistic effect is to make the fractal more a technique than an end in itself, which seems overall like a good development -- D.H.]


What is it about these forms, both 2D and 3D, that resonates so strangely?  What does it say about our brains that we should see these images the way we do?

It's not shocking that fractal forms should remind us of living things or other natural objects.  Fractal geometry was invented in part to describe the wide variety of natural shapes that didn't fit into the regular categories of classical geometry.

The name "fractal" itself comes from the notion of a fractional dimension, which is ubiquitous in nature once you look for it.  There are several definitions of dimensionality that can take fractional values, but the one usually used with fractals is the Hausdorff dimension.  By that standard, the coastline of Great Britain, for example, has been measured to have a dimension of 1.25, the coastline of Norway 1.52.  Galaxy clustering comes in around 2, cauliflower around 2.3, and the surface of the human brain around 2.8.  Interestingly, the Mandelbrot set itself has a Hausdorff dimension of exactly 2 -- as does its boundary.
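For finite data like a coastline, figures like these are estimated rather than derived, typically by box counting: count how many grid cells of size ε the shape touches, and fit the slope of log N(ε) against log(1/ε).  Here's a sketch of that method, applied to a Koch curve I generate in place (so every number here is my own construction); the fitted slope should land near the curve's known dimension of log 4 / log 3 ≈ 1.26:

```python
import math

def koch_points(depth=6):
    """Endpoints of the segments of a Koch curve built on [0, 1]."""
    segs = [((0.0, 0.0), (1.0, 0.0))]
    for _ in range(depth):
        new_segs = []
        for (x1, y1), (x2, y2) in segs:
            dx, dy = (x2 - x1) / 3.0, (y2 - y1) / 3.0
            a = (x1 + dx, y1 + dy)             # 1/3 point
            b = (x1 + 2 * dx, y1 + 2 * dy)     # 2/3 point
            # apex: the middle third rotated 60 degrees outward
            p = (a[0] + dx * 0.5 - dy * math.sqrt(3) / 2,
                 a[1] + dx * math.sqrt(3) / 2 + dy * 0.5)
            new_segs += [((x1, y1), a), (a, p), (p, b), (b, (x2, y2))]
        segs = new_segs
    return {s[0] for s in segs} | {segs[-1][1]}

def box_count(points, eps):
    """Number of eps-sized grid cells the point set touches."""
    return len({(math.floor(x / eps), math.floor(y / eps))
                for x, y in points})

def box_dimension(points, ks=range(3, 8)):
    """Least-squares slope of log N(eps) vs log(1/eps), eps = 2**-k."""
    xs = [k * math.log(2) for k in ks]
    ys = [math.log(box_count(points, 2.0 ** -k)) for k in ks]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
```

The same counting, done over maps at different scales, is how the coastline figures above are arrived at: the more wiggle a coastline packs in at every scale, the more boxes it touches as ε shrinks, and the higher the slope.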

As odd as a fractional dimension may sound, it makes a certain kind of sense.  A coastline certainly isn't 2-dimensional, but neither is it quite a straight line or smooth curve.  Because we see these kinds of shapes all the time -- trees, clouds, veins in leaves, mountains, piles of pebbles, coral, broccoli ... it's not a great leap to think we would respond to something artificial with the same general property.  And on the other hand, because they're not exactly like anything we naturally encounter, it's natural to think of them as alien.

In one sense, the feeling that some of the 3D forms are organisms or artifacts is a property of the sets themselves, but it lies equally in the eye (or the perceptual machinery behind it) of the beholder, and of the artist selecting and rendering the scene.  It would be interesting to show a bunch of people random images from the Mandlethingies and ask "Does that look natural, artificial, or other?"

Then follow up with "If this looks artificial to you, who do you think made it?"

And why?

Sunday, March 9, 2014

Real imitation Old Modern English

Just in time for the NFL playoffs, GEICO ran an ad based on the gag that there is, in fact, an "oldest trick in the book".  The trick?  Two sort of Medieval/Renaissance-looking guys are sitting in some sort of library/scriptorium-looking place.  Quoth the elder, pointing: "Looketh over there!", and when the younger does, "Haha!  Madeth thou look!"

OK, it's a GEICO commercial, not King Lear.  We don't expect linguistic accuracy.   We expect them to get the gag over, which they do just fine, by pasting -eth onto every verb in sight.  Old-looking costumes, old-sounding language ... got it.  Now back to the game in progress.  As the voice-over says "So endeth the trick."

Ironically, that last line is actually proper Elizabethan English.

Before I get back to the whatever larger point I may have, and largely because I just can't help myself, here's a brief rundown of the history of English.  If you're ever having too much fun at a party and would like to recast yourself as a fun-puncturing pedant and so be left in peace, take careful note.

English is generally divided into three forms
  • Old English, spoken from somewhere around the 400s to somewhere around the 1100s.  It's basically a Germanic dialect, and if you weren't familiar with it you likely wouldn't think of it as English at all.  It sounds more like German or Icelandic, and its grammar, with gender and case distinctions and other inflections all over the place, is closer to that of classical Latin than to English as we know it.  The best-known Old English work, Beowulf, doesn't even take place in England, but in what is now Scandinavia.  It looks like this, and starts off "Hwæt! Wē Gār‐Dena in geār‐dagum þēod‐cyninga þrym gefrūnon, hū þā æðelingas ellen fremedon" (except without the modern niceties like punctuation). This, of course, means, in Seamus Heaney's fairly loose translation, "So.  The Spear-Danes in days gone by, and the kings who ruled them, had courage and greatness."  (to be fair, English of that period can also look a bit more familiar, like this)
  • Middle English, spoken from somewhere around the 1100s to the 1400s.  It looks a lot more like English as we know it (to be fair, it can also look like this), but still has some grammatical differences.  For example, I sleep, They sleep and I have slept become I slepe, They slepen and I have yslept.  More noticeably, Middle English was pronounced very differently from Modern English, not having been through the Great Vowel Shift.  Before then, English vowels had approximately the same values as in many other European languages, so the phrases above would be pronounced more like Ee slaypeh, Thay slaypen and Ee hahv eeslept.  Note that the e at the end of words, silent today, wasn't silent then (or at least not always).  Middle English uses -eth as a verb ending where we would use -s, so she singeth instead of she sings, and it uses thou with -est as a verb ending, so thou comest instead of you come (see this post for a lot more on thou and you).   The best-known work of Middle English, Chaucer's Canterbury Tales, looks like this and starts off "Whan that Aprille with his shoures soote / the droghte of Marche hath perced to the roote ...".  If you heard it recited, you might not catch much, but in print you might well believe it meant "When April, with its sweet showers, has pierced March's drought to the root ..."
  • Modern English, which includes the Early Modern English of Shakespeare's Elizabethan times, spoken from the 1400s or so to the present.  Naturally the language has changed a bit over the past few centuries, but older forms such as Elizabethan English seem like our English with a few odd changes, not like a different language.  Apart from vocabulary, the most noticeable changes to our ears are probably that Early Modern English still uses thou and endings like -eth and -est.
There's a lot more to it, of course, but that seems more than enough for now.

Well, actually ... while I'm still in full-on killjoy mode: The actual rules of thou, -eth and -est are really not that hard:
  • Use thou for the subject and thee for the object (but again, see the other post for details):  Thou art tall;  I bid thee good night.
  • Put -est on the end of the verb if thou is the subject, with a couple of special cases like art for are and hast for have:  Thou knowest not what thou sayest.
  • Change -s on the end of verbs (and only verbs) to -eth: The ice man cometh; So endeth the trick.
That covereth not all the bases, of course, but it will give thee reasonably accurate "old-sounding" English without too much trouble.  For the GEICO ad, we get "Look over there!" and "Made thee look", though perhaps "Look thou yonder!" and "I made thee to look" might be more likely.  I would suspect that the first version, at least, just doesn't sound "old enough" to pass muster for the ad.
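Since the three rules above are almost mechanical, they can be sketched as a toy transformation, purely for illustration.  This is my own sketch, not anything from the ad or from any real linguistics tool; the function names and the handful of irregular forms are invented for the example, and, as noted above, real Elizabethan usage has plenty of special cases this ignores.

```python
# A toy rendering of the rules above: -est on the verb when thou
# is the subject (with art/hast as irregulars), and -s -> -eth
# for other verbs.  Word lists and names invented for this sketch.

SPECIAL_EST = {"are": "art", "have": "hast"}  # a couple of irregulars

def verb_after_thou(verb):
    """Thou knowest, thou comest; art/hast as special cases."""
    if verb in SPECIAL_EST:
        return SPECIAL_EST[verb]
    if verb.endswith("e"):          # come -> comest, not "comeest"
        return verb + "st"
    return verb + "est"

def third_person_eth(verb):
    """she sings -> she singeth; so the trick endeth."""
    if verb.endswith("es"):         # comes -> cometh
        return verb[:-2] + "eth"
    if verb.endswith("s"):          # sings -> singeth
        return verb[:-1] + "eth"
    return verb

print(verb_after_thou("know"))    # knowest
print(verb_after_thou("are"))     # art
print(third_person_eth("sings"))  # singeth
print(third_person_eth("ends"))   # endeth
```

Run on the ad's line, these rules give "endeth" for "ends", which is exactly the one thing the ad got right.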

Now that I've got that out of my system, what was my actual topic?  Ah yes.  It's not surprising or particularly interesting that modern speakers would get Elizabethan English wrong.  Why wouldn't we?  We don't speak it, and we only hear or read it on special occasions.  The more interesting question is how we get it wrong.  How is it that we come up with the particular pastiche of Elizabethan English that we do?

At a guess, since -eth and -est sound more or less alike and seem to be attached to a lot of verbs, it's not hard to boil real Elizabethan usage down to "use thou a lot and put -eth at the end of verbs" and even to "use thou a lot and put -eth on the end of a lot of random words."

That's so straightforward and obvious that it's easy to forget that there are plenty of other minor differences between our English and Elizabethan English that we don't tend to imitate.  Here and elsewhere, I'll take "our English" to mean "English as spoken by newscasters in major US outlets" or such.  Even in this day and age there is significant variation among dialects.  Please don't take it amiss if I say "We don't say ___" when you've heard people around you say just that -- but do feel free to leave a comment.

I've taken the examples below from Much Ado About Nothing partly because it's reasonably representative, but mostly because it's so much fun to read.
  • The vocabulary has changed noticeably.  For example, we no longer distinguish whither (to where) and whence (from where) from plain where.  We may still recognize words like yonder and betwixt, but we don't tend to use them.  Some words, like meet (suitable, fit) or an (if), have meanings we no longer use, and some, like baldrick or fardel, are entirely unfamiliar.
  • Elizabethan English uses word orders and constructions that sound odd to our ears.  Here's a nice illustration:
    Because I will not do them the wrong to mistrust any, I will do myself the right to trust none; and the fine is, for the which I may go the finer, I will live a bachelor.
    Every word there is a perfectly familiar contemporary English word, but the way they're strung together makes me feel a bit like I just got off a ship and haven't got my land legs yet. Your mileage may vary.
  • Cultural references have shifted, understandably.  Shakespeare's works, and other literary works of the time, are full of classical allusions that his audience would have had no trouble with, but are as opaque to us as a modern pop-culture reference would have been to Shakespeare (though I suspect he would have been a quick study). Things like "or do you play the flouting Jack, to tell us Cupid is a good hare-finder and Vulcan a rare carpenter?"
  • Re-reading the opening of Much Ado, I found that large chunks of it flow almost like today's English, without major twistiness like the speech above, but with lots of little idioms that are just a bit different: he wears his faith but as [like] the fashion of his hat ... Don Pedro is approached [here] ... I wonder [am surprised] that you will still be [are still] talking, Signior Benedick: nobody marks [notices] you ... I would [wish] I could find in my heart that I had not [didn't have] a hard heart.
What we have here, as one might expect, is several layers of changes, none of them very big by themselves, but taken together enough to produce something that's clearly still English, but a significantly different kind of English.

When we try to imitate this language, there are a couple of major constraints:  It can't be too hard to come up with, and it can't be too hard to understand, but it does have to sound more or less like the examples we've heard.  We need a superficial imitation, while a realistic one would just get in the way.  I think that explains why we don't see obsolete words, strange word orders or classical references in the usual "Olde Englissh", but we do see features like thou and -eth.  Since we're going for impression, not accuracy, it's not so important exactly how we apply those features, just that they're there.

On this theory, the little idiomatic changes like I had not or is approached might find their way into a more careful imitation, at least more easily than the other items on the list above.  I can't be bothered to find evidence for or against that idea, but hey, it seems plausible enough.