Thursday, April 6, 2017

Big vocabulary, or just big words?

The other day I was reading an article that used a couple of words I hadn't seen in a while, say anodyne or encomium.  I more-or-less remembered what they meant, and the context made it reasonably clear, but I still ended up looking them up.  I had two feelings about this: on the one hand, did the author really have to drag those out?  Why not just use Plain English?  On the other hand, they were correctly used, and apt, so what's the big deal?  I'm sure I've thrown out a word or two here that I could have replaced with something more familiar, maybe with a little rewording.

But I'm not here to critique style.  What stuck in my head about this incident was how conspicuous an unusual word can be (and besides that, unusual words tend to stick out).  The article itself was probably a thousand words or so, maybe more, but it was those two that changed the whole reading experience.

This wasn't just because of the extra time it took to look the words up and make sure I knew what they meant.  That's a speed bump these days, reading an online article with search bar and dictionary app at the ready, maybe an extra minute, if that.  Even if I hadn't had a dictionary handy, I could have gotten the good out of the article without knowing exactly what those words meant.

The real issue lies deeper in human perception: We (and living things with recognizable brains in general) are finely tuned to notice discrepancies.  In a field of green grass it's the shape of that predator, or that prey, or that particularly tasty plant, or whatever, that stands out.  In an article of a thousand words, it's the unusual ones that stand out.

I could go on and on, but it's worth particularly noting how important this is in social environments.  We can spot an unfamiliar accent in seconds.  We can spot someone dressed differently, or with different features than we usually see, well before we're even aware that we have.  The other night I was watching a TV show with a foreign actor playing an American, and everything was just fine until they said "not" with a British "o".  It didn't ruin the whole show -- this was a single vowel, not Dick Van Dyke in Mary Poppins -- but it was noticeable enough that I still recall it out of an hour of tense drama.

(I have to say that dialect coaching has gotten a lot better over the past couple of decades.  Time was, movie stars talked like movie stars, with a kind of over-enunciated diction that didn't sound like anyone in real life, and if a character was meant to sound foreign, pretty much anything would do.  This is doubtless because in the early days of "talking pictures" the medium was still transitioning from the stage.  A theatre actor was used to projecting up to the cheap seats, and a fake accent was as good as a fake beard, since everything was a hand-painted set and there probably weren't that many people in the audience who knew what a true Elbonian accent sounded like anyway.  Today pretty much every part of that is different, and we expect realism -- Billy Bob Thornton's all-too-valid complaint about "that Southern accent that no one in the South actually speaks with" notwithstanding.)

Where was I?

I've argued before that we often seem to care most about distinctions when they matter least.  Vocabulary is largely another example of that.  Unless you're reading Finnegans Wake or something equally chewy, you're probably OK just skimming over anything you don't know and looking it up later.  Even that blowhard commentator with the two-dollar words is trying to get a point across and isn't going to let the vocabulary get completely in the way.

As a corollary to that, you don't need to know very many unusual words in order to stand out.  If you know a few dozen and use them appropriately, you'll almost certainly draw attention (if you learn a few dozen and use them inappropriately you'll also draw attention, but probably not the kind you want).  This can happen naturally if you run across a rare-ish but useful word or two in your reading from time to time and hold onto it for future use.  There's something nice about, say, cogent that is hard to reword cleanly, the distinction between terse and concise is sometimes worth making, and so on.

Contrast that with the average human vocabulary.  This is a hard thing to measure, but if you've heard something on the order of "uneducated people have a vocabulary of 2000 words while educated people know 20,000", rest assured that's complete bunk.  If we're measuring vocabulary, we have to measure "listemes", that is, things that you just have to learn by rote because you can't work them out from their parts.

This includes all kinds of things:
  • proper names of people and places
  • distinct senses of words, particularly small words like out and by, which can have quite a few, depending on how you count
  • idioms large and small, from in touch or look up (in its non-literal senses) to classics like red herring, two-dollar word and that's the way the cookie crumbles
  • cultural references, which are kind of like names and kind of like idioms
  • fine points that we don't generally think of as idioms, but are idiomatic nonetheless, like fried egg meaning a particular way of frying an egg, as distinct from scrambling an egg or -- for whatever reason -- trying to fry a whole egg in a pan without removing the shell

I'm not trying to give a full taxonomy of things-that-you-just-have-to-learn, but I hope that gives the general idea.  The main point is that there are lots and lots of these, the categories they might fall into are somewhat arbitrary, and how many you know doesn't have a great deal to do with how many literary classics you've read.

I'm not really familiar with the research on this, but my understanding is that the average person knows somewhere in the hundreds of thousands of listemes, and a large portion of them are commonly understood.  On top of those, we can add a smaller portion of jargon, slang or sesquipedalianisms.  That part, people will notice.  But it's a relatively small part.