Sunday, February 14, 2016

Chaotic demonyms

I don't recall having run across the word demonym before Wikipedia, though I may well have run across it and forgotten it.  It's certainly one of those words that appears disproportionately often in Wikipedia.  (Note to self: What are some others?  Despite the title, this post on the other blog doesn't really offer any.)

So what's a demonym?

A demonym (demos = people, nym = name) is a word for someone or something from a particular place.  At least in Wikipedia, a demonym is a noun (I met a Canadian the other day).  The adjectival form is generally the same (We stopped at the Canadian border), but not always (I met a Swede at the Swedish embassy).

Demonyms are a bit of a mess.  Why is it Canadian and not Canadan (or why isn't the country Canadia)?  Why is someone from Washington a Washingtonian but someone from London a Londoner?  Why is someone from Manchester a Mancunian?  From Monaco a Monegasque?  What about Michigander and Arkansawyer, not to mention Hoosier?

Language is full of cases like this that can't be fully described by hard and fast rules, but aren't completely random, either.  For example, let me digress a bit about English regular and irregular forms.

In English, most verbs can be completely described by rules (I rest, he/she/it rests, you rest ... I/he/she/it/you/... rested, I/he/she/it/you/... have rested, I/he/she/it/you/... will rest).  One English verb, be, is highly irregular (I am, he/she/it is, you/we/they are, I/he/she/it was, you/we/they were, I/he/she/it/you/... have been, I/he/she/it/you/... will be).  This is an example of suppletion, where entirely different words fill in for different roles.  Another example is can in dialects that don't use forms like might can but instead say might be able to.

Quite a few English verbs, including many of the most common ones, are somewhat irregular.  They follow the rules in present tense (I sing, he/she/it sings ...) but use a slightly different form for the past tense (I sang) and possibly yet another for the past participle (I have sung).  These are mostly if not entirely holdovers from older Germanic ancestors of modern English, which used vowel changes to indicate tense (more on the ancestry of English here and here) [one exception is snuck, which used to be sneaked].

There are similar cases for nouns (one goose, two geese), but nouns have an extra wrinkle.  For some borrowed nouns we also borrow a plural form (one alumnus, two alumni).  In at least one case we try to borrow a plural but miss (one octopus, two octopi ... but that should probably be two octopodes, or just say two octopuses and be done with it).  When we borrow a plural we're generally not too fussy about the exact details of the source language's inflections.  We say of alumni and not alumnorum, for example.

None of this is unique to English.  Languages in general tend to have irregular forms due to ancestral holdovers, borrowing and such.

So where does that leave us with demonyms?

Just as with nouns and verbs, there are some regular cases.  I can't think of any place ending in -ia that doesn't have -ian for its demonym: Asian, Bavarian, Croatian, Dalmatian, ...  This is the "unmarked" case.  If we want to designate a place and its people in English, regardless of what the locals call the place and themselves, we tend to take some root particular to the place and use -ia for the place and -ian for the demonym.

On the other hand, there are plenty of cases of full-on suppletion: Bay Stater (Massachusetts), Hoosier (Indiana), Tar Heel (North Carolina), Scouser (Liverpool) ...

And then there's a lot of interesting territory in the middle.  I'd break it down into
  • Cases where we use some variation of -ian, but it's hard to predict which one: Canadian, Floridian ... but Manitoban, Kansan, Canberran ... and even European and Panamanian.  My guess is that most of these can be described by assuming a basic form, probably -ian, together with some rules that would change Iowaian to Iowan and Kansasian to Kansan.  That's not to say that we actually think that way, just that such rules may do a decent job of describing what's going on.  With, of course, some inevitable exceptions because ... language.
  • Cases where we use a different form instead of -ian and its relatives.
In the latter case there are still regularities.  Generally if the place is X-land, the demonym will be X-lander (Rhode Islander, Newfoundlander ...), notwithstanding the Irish, English, Poles, Marshallese and almost certainly some others.  There are also several cases, generally from western and southern Asia, where we borrow the -i suffix: Gujarati, Kuwaiti, Israeli, Pakistani ...  Those appear to be generally regular, just using a different set of rules.
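To make the "basic form plus some rules" idea concrete, here is a rough sketch in Python.  Everything in it is my own assumption for illustration: the function name guess_demonym, the particular rewrite rules and the tiny exception table are made up, and it is not a claim about how speakers actually do it.  It starts from -ian, trims the awkward results, and handles -land and a few suppletive cases separately.

```python
import re

# Hypothetical table of suppletive cases, checked before any rule applies.
EXCEPTIONS = {
    "Indiana": "Hoosier",
    "Liverpool": "Scouser",
    "Manchester": "Mancunian",
}

# Hypothetical cleanup rules applied in order to the naive "-ian" form.
REWRITES = [
    (r"asian$", "an"),   # Kansasian  -> Kansan
    (r"iaian$", "ian"),  # Bavariaian -> Bavarian
    (r"aian$", "an"),    # Iowaian    -> Iowan
]

def guess_demonym(place: str) -> str:
    """Guess a demonym from a place name using a few illustrative rules."""
    if place in EXCEPTIONS:
        return EXCEPTIONS[place]
    if place.endswith("land"):
        return place + "er"          # Newfoundland -> Newfoundlander
    form = place + "ian"             # assume -ian as the basic form
    for pattern, replacement in REWRITES:
        form = re.sub(pattern, replacement, form)
    return form

if __name__ == "__main__":
    for place in ["Bavaria", "Iowa", "Kansas", "Newfoundland", "Indiana", "Canada"]:
        print(place, "->", guess_demonym(place))
```

The rules get Bavarian, Iowan, Kansan and Newfoundlander right, but they also turn Canada into Canadan and Ireland into Irelander, which is pretty much the point: a handful of rules covers the tidy regions and then runs straight into the exceptions.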

Even accounting for cases like "use -er with -land" and "OK, so we borrowed Gujarati", there is still a region of apparent chaos.  I can see why Wyomingian would be a bit of a mouthful, but why Wyomingite in particular and not, say, Wyominger?  Why Connecticutter and not Connecticutite?  For that matter, why Vermonter and not Vermontian?

Have a look at these lists in Wikipedia (here, here and here).  Maybe you can set out some nice, crisp rules for the various -ites, -ers (besides -landers), -eses (cf. -esians) and so forth.  I couldn't, and my guess is that it can't be done in a way that doesn't end up looking highly ad hoc (or post hoc?).

I chose the word "chaos" above not just to suggest disorder, but to suggest a particular kind of disorder.  Mathematical chaos often occurs at the boundaries between otherwise regular regions.  Outside a small region, the Mandelbrot iteration always diverges.  Inside a fairly simple boundary, it always stays bounded.  Only in between those two, where it's not clear which outcome will happen, do things get messy.  Finding roots of equations with Newton's method behaves similarly.  In most cases you quickly find one of the roots but in the boundary regions you can't really tell where you end up.
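For anyone who wants to poke at the two iterations mentioned above, here is a minimal Python sketch.  The function names, the iteration limits, the tolerance and the choice of z^3 = 1 as the equation for Newton's method are all my own assumptions, picked only to keep the example small.

```python
import cmath

def mandelbrot_escape(c: complex, max_iter: int = 100):
    """Return the step at which |z| first exceeds 2, or None if it never
    does within max_iter steps of z -> z*z + c starting from z = 0."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return None

# The three cube roots of 1, the targets of Newton's method below.
ROOTS = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

def newton_root(z: complex, max_iter: int = 50, tol: float = 1e-9):
    """Run Newton's method on z**3 - 1 from the starting point z.
    Return the index in ROOTS of the root reached, or None if no root
    is reached within max_iter steps."""
    for _ in range(max_iter):
        if z == 0:
            return None  # derivative is zero, so the step is undefined
        z = z - (z**3 - 1) / (3 * z**2)
        for k, root in enumerate(ROOTS):
            if abs(z - root) < tol:
                return k
    return None

if __name__ == "__main__":
    print(mandelbrot_escape(0j))        # well inside: never escapes
    print(mandelbrot_escape(2 + 2j))    # well outside: escapes at once
    print(newton_root(2 + 0j))          # settles on the root near the start
```

Starting points well inside or well outside behave predictably.  The interesting experiment is to try points close to the boundary (for the Newton case, starting points near the negative real axis, say) and watch how small changes in the start change the answer.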

I suspect something analogous is going on here.  Most places are inhabited by -ians, but where that doesn't work and you don't have a definite alternative, what you end up with is essentially luck of the draw.

4 comments:

  1. Nice.
    The reason the irregulars are the most common words is that with common words we know what the forms are. With really uncommon or new words we will default to the regular. Thus if you want to know what the unmarked case is for names in -land, for example, make up a state or country and see what folks do with it. Are people from Skob Skobbers? Skobbians? People from Skobland are almost surely Skoblanders, but if you do the experiment, don't be surprised when you get some Skoblandians. But while no-one but two-year-olds says "comed," ("went" being another example of suppletion) nobody's sure if it's "have dived" or "have dove," or "I was woken in the night," or, what? And, of course, there are extralinguistic factors. People from Arkansas just don't want to be Arkansans.

    So the uncommon but not unknown words are the boundary regions in your fractal expansions.

    I wonder what Newton would have thought about this math?

    People who don't say "might can" just have a rule that says modals don't take modal auxiliaries.

  2. I meant to mention more explicitly that there are several -lands that are named after the people. Besides Poland being the land of the Poles, there are Thailand, Lapland (though the Sami who live there call it Sápmi), etc.

    I suppose you're right about the modals.

  3. Actually ... having a rule that modals don't take modal auxiliaries doesn't say what they do take. Why "might be able to" for "might can" but no specific form for "might should"? For that matter, does anyone say "might will" rather than "might be going to", "might would" for "might have been going to" or "might must" for "might have to?"

    Hmm ... here's an interesting list: http://www.jstor.org/stable/455778?seq=1#page_scan_tab_contents

  4. Interesting point about "might should." I hadn't thought of that.
    In line 7 of the list "might've used to" doesn't really belong, since the "have" between the modals is a non-modal auxiliary. And I still haven't seen anything on differences between the usual modals (may/might, can/could, (shall)/should, must...) and what I call pseudo-modals (gonna, wanna, hafta...). The modals proper are complicated enough. Some, but not all, seem to have not only past tenses, but subjunctives as well.

    But it's an interesting list. There are a bunch of these that I evidently didn't grow up South enough to have heard. It's good to know that there's someone out there doing the field work.
