Monday, October 30, 2023

Language off in the weeds

While out walking, I paused to look at a stand of cattails (genus Typha) growing in a streambed leading to a pond.  "That's a pretty ..." I thought to myself, but what would be the word for the area they were growing in? Marsh? Wetland? Swamp? Bog? Something else?

I've long been fascinated by this sort of distinction.  If you don't have much occasion to use them, those words may seem interchangeable, but they're not.  Technically:

  • A wetland is just what it says ... any kind of land area that's wet most or all of the time
  • A marsh is a wetland with herbaceous plants (ones without woody stems) but not trees
  • A bog is a marsh that accumulates peat
  • A swamp is a forested wetland, that is, it does have trees
Wikimedia also has a nice illustration of swamps, marshes and other types of land.  By that reckoning, I was looking at a marsh, which was also the word that came to mind at the time.

This sort of definition by properties is everywhere, especially in dictionaries, encyclopedias and other reference works.  Here, the properties are:
  • Is it land, as opposed to a body of water?
  • Is it wet most or all of the time?
  • Does it have trees?
  • Does it accumulate peat?
The first two are true for all of the words above.  For the last two, there are three possibilities: yes, no and don't care/not specified.  That makes nine possibilities in all

Trees?        Peat?         Word
Yes           Yes           peat swamp
Yes           No            ?
Yes           Don't care    swamp
No            Yes           bog/peat bog
No            No            ?
No            Don't care    marsh
Don't care    Yes           peatland
Don't care    No            ?
Don't care    Don't care    wetland
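
If you like this sort of thing, the table reads naturally as a small lookup from properties to words.  Here's a minimal sketch in Python; the names and the None entries for the gaps are just my shorthand for the table above, not anyone's official classification.

    YES, NO, DONT_CARE = "yes", "no", "don't care"

    # The table above as a lookup: (trees?, peat?) -> word, with None where
    # there doesn't seem to be a common word (more on those gaps below).
    WETLAND_WORDS = {
        (YES, YES):             "peat swamp",
        (YES, NO):              None,
        (YES, DONT_CARE):       "swamp",
        (NO, YES):              "bog/peat bog",
        (NO, NO):               None,
        (NO, DONT_CARE):        "marsh",
        (DONT_CARE, YES):       "peatland",
        (DONT_CARE, NO):        None,
        (DONT_CARE, DONT_CARE): "wetland",
    }

    print(WETLAND_WORDS[(NO, DONT_CARE)])  # marsh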

As far as I know, there's no common word for the various types of wetland if they specifically don't accumulate peat.  You could always say "peatless swamp" and so forth, but it doesn't look like anyone says this much.  Probably people don't spend that much time looking for swamps with no peat.

Leaving aside the rows with question marks, the table above gives a nice, neat picture of the various kinds of wetland and what to call them.  As usual, this nice picture is deceptive.
  • I took the definitions from Wikipedia, which aims to be a reference work.  It's exactly the kind of place where you'd expect to see this kind of definition by properties
  • The Wikipedia articles are about the wetlands themselves, not about language.  They may or may not touch on how people use the various words in practice or whether that lines up with the nice, technical definitions
  • The way the table is set up suggests that a peatland is a particular kind of wetland, but that's not quite true.  A peatland is land, wet or not, where you can find peat.  Permafrost and tundra can be peatland and often are, but they're not wetlands.  Similarly, a moor is generally grassy open land that might be boggy, if it's low-lying, but can also be hilly and dry.  Both peatlands and moors can be wetlands, but they aren't necessarily
  • Even if you take the definitions above at face value, if you have a lake in the middle of some woodlands with a swampy area and a marshy area in between the lake and the woods, there's no sharp line where the woods become swamp, or the swamp becomes marsh, or the marsh becomes lake.
The Wikipedia article for Fen sums this up nicely:
Rigidly defining types of wetlands, including fens, is difficult for a number of reasons. First, wetlands are diverse and varied ecosystems that are not easily categorized according to inflexible definitions. They are often described as a transition between terrestrial and aquatic ecosystems with characteristics of both. This makes it difficult to delineate the exact extent of a wetland. Second, terms used to describe wetland types vary greatly by region. The term bayou, for example, describes a type of wetland, but its use is generally limited to the southern United States. Third, different languages use different terms to describe types of wetlands. For instance, in Russian, there is no equivalent word for the term swamp as it is typically used in North America. The result is a large number of wetland classification systems that each define wetlands and wetland types in their own way. However, many classification systems include four broad categories that most wetlands fall into: marsh, swamp, bog, and fen.

A fen here means "a type of peat-accumulating wetland fed by mineral-rich ground or surface water."  It's that water that seems to make the difference between a bog and a fen: "Typically, this [water] input results in higher mineral concentrations and a more basic pH than found in bogs."  (Bogs, by contrast, tend to be more acidic.)  We could try to account for this in the table above by adding an Acidic? (or Basic?) column, but then we'd have 27 rows, many of them with nothing but a question mark where the word should be.

In that same paragraph, the article says "Bogs and fens, both peat-forming ecosystems, are also known as mires."  If you buy that definition, it might fit better than peatland in the "trees: don't care, peat: yes" row.

This is all part of a more general pattern: Definitions by properties are a good way to do technical definitions, but people, including technical people when they're not talking about work, don't really care about technical definitions.  For most purposes, radial categories do a better job of describing how people actually use words.  More on that in this post.

All of this is leaving out an important property of bogs and mires: you can get bogged down in a bog and mired in a mire.  Most of these words are old enough that the origins are hard to trace, but bog likely comes from a word for "soft", which more than hints at this (mire is likely related to moss).

This suggests that what we call something depends at least in part on how we experience it.  The interesting part is that defining properties like wetness, grassiness, softness and the presence of peat are also based on experience, which makes untangling the role of experience a bit tricky.


Just because we can distinguish meanings doesn't mean those distinctions are useful, but I'd say they are useful here, and in most cases where we use different words for similar things.  The distinctions are useful because we can draw larger conclusions from them.  For example:
  • It's easier to see what's on the other side of a marsh, since there are no trees in the way
  • A marsh will be sunnier than a swamp
  • There will be different kinds of animals in a swamp than a marsh
  • You can get peat from a bog.  Even today, peat is still a useful material, so it's not surprising that it's played a role in how we've used words for places that may or may not have it.
  • And it's also not surprising that people talk about peat bogs and peat swamps but generally don't specifically call out bogs and swamps without it.
Even the more general term wetland is drawing a useful distinction.  A wetland is, well, wet.  There's a good chance you could get stuck in the mud in a wetland, or even drown, not something that would happen in a desert unless there had been a downpour recently (which does happen, of course).


Let's take a completely different example: Victorian cutlery.  Upper middle-class Victorian society cared quite a bit about which fork or spoon to use when.  Much of this, of course, is about marking membership in the in-group.  If you were raised in that sort of society, you would Just Know which fork was for dinner and which for salad.  If you didn't know that, you obviously weren't raised that way and it was instantly clear that there could be any number of other things that you wouldn't know to do, or not do.  (If you ever have to bluff your way through, work from the outside in -- the salad fork will be on the outside since salad is served first -- and don't worry, something else will probably give you away anyway.)

However, there are still useful distinctions being made, and they're right there in the names.  A salad fork is a bit smaller and better suited for picking up small pieces of lettuce and such.  A dinner fork is bigger, and better for, say, holding something still while you slice it with a knife.  A soup spoon is bigger than a teaspoon so it doesn't take forever to finish your soup, a dinner knife is sharper than a butter knife, a butter knife works better for spreading butter, and so forth.

It's no different for the impressive array of specialized utensils that one might have encountered at the time (and can still find, in many cases).  A grapefruit spoon has a sharper point with a serrated edge so you can dig out pieces of grapefruit.  A honey dipper holds more honey than a plain spoon and honey flows off it more steadily (unless you have a particularly steady spoon hand) and so on.  I have an avocado slicer with a grabber that makes it much easier to get the pit out.  It's very clear (at least once you've used it) that it's an avocado slicer and not well suited for much else.  You can do perfectly well without such things, but they can also be nice to have.  

Consider one more example: The fondue fork, which has a very long, thin stem and two prongs with barbs on them.  You could call it, say, a barb-pronged longfork, and that would be nice and descriptive.  If someone asked you for a barb-pronged longfork and you had to fetch it from a drawer of unfamiliar utensils, you'd have a pretty good chance of finding it.  If someone asked for a "fondue fork" and you didn't know what that was, you'd pretty much be stuck.  The same is true for grapefruit spoons, dinner knives and so forth.  All language use depends on shared context and assumptions about it.


I think there's something general going on here, that how we experience and interact with things isn't just a factor in how we name them, but central to it.  Even abstract properties like softness or dryness are rooted in experience.  Fens and bogs have different soil characteristics, but the names are much older than the chemical theory behind pH levels.

We call it a fondue fork because it's used for putting bits of food in a fondue pot (and, just as importantly, for getting them back out).  A fondue fork has certain qualities, like the long stem and the barbs, that make it well-suited for that task, but they're not directly involved in how we name it.

Words like fen and bog are distinct because fens and bogs support distinct kinds of plant and animal life, moving through a fen is different from moving through a bog, and so forth.  A difference in pH level is a cause of this difference, but that's incidental.  There are almost certainly areas that are called fens that have bog-like pH levels or vice versa.  You could insist that such a fen (or bog) is incorrectly named, but why?


Properties do play a role in how we name things.  Swamps have trees.  Marshes don't.  A knife has a sharp edge.  A fork is split into two or more tines.  A spoon can hold a small amount of liquid.  What we don't have, though, is some definitive list of properties of things, so that someone presented with a teaspoon could definitively say: "This thing is an eating utensil.  It can hold a small amount of liquid.  That amount is less than the limit that separates teaspoons from soup spoons.  Therefore, it's a teaspoon."

In many contexts, it may look like there is such a list of properties.  With marsh and swamp, we can clearly distinguish based on a property -- trees or no trees.  Sometimes, as with red-winged blackbird or needle-nose pliers (though not with marsh and swamp), we use properties directly to build names for things.

But there are thousands of possible properties for things -- sizes, shapes, colors, material properties, temperature, where they are found, who makes them, and on and on.  Of the beyond-astronomically many possible combinations, only a tiny few describe real objects with real names.

At the very least, there has to be some way of narrowing down what properties might even possibly apply to some class of objects.  Stars are classified by properties like mass (huge) and temperature (very hot by human standards), but we don't distinguish, say, a fugue from a sonata based on whether the temperature is over 30,000 Kelvin.

It's not impossible, at least in principle, to create a decision tree or similar structure for handling this.  You could start with dividing things into material objects, like stars, and immaterial ones, like sonatas and fugues.  Within each branch of the tree, only some of the possible properties of things would apply.  After some number of branches, you should reach a point where only a few possible properties apply.  If you're categorizing wetlands, you know that the temperature classifications for stars don't apply, and neither do the various properties used to classify musical forms, but properties like "produces peat" and "has trees" do apply.
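
As a toy version of that idea, here's roughly what the first few branches might look like in code.  Everything here is invented for illustration (the 30,000-Kelvin cutoff just echoes the star example above); a real classification scheme would be much messier.

    # A toy decision tree of the sort described above.  The branch points and
    # property names are made up for illustration, not a real classification.
    def classify(thing: dict) -> str:
        if not thing.get("material"):
            # Immaterial things get classified by entirely different properties,
            # like musical form; star temperatures never come up on this branch.
            return thing.get("form", "some immaterial thing")
        if thing.get("kind") == "star":
            # Star-specific properties apply here; "has trees" does not.
            return "very hot star" if thing.get("temperature_k", 0) > 30_000 else "cooler star"
        if thing.get("kind") == "land" and thing.get("wet"):
            # Only at this point do wetland properties come into play.
            if thing.get("has_trees"):
                return "swamp"
            return "bog" if thing.get("accumulates_peat") else "marsh"
        return "something else"

    print(classify({"material": True, "kind": "land", "wet": True,
                    "has_trees": False, "accumulates_peat": True}))  # bog
    print(classify({"material": False, "form": "fugue"}))            # fugue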

In practice, though, even carefully constructed classification systems based on properties, like the Hornbostel-Sachs system for musical instruments discussed in this post, can only go so far.  Property-based systems of classification tend to emphasize particular aspects of the things being categorized, such as (in the case of Hornbostel-Sachs) how they are built and how sound is produced from them.  This often lines up reasonably well with how we use words, but I don't think properties are fundamental.

Rather, how we experience things is fundamental, or at least closer to whatever is fundamental.  Properties describe particular aspects of how we experience something, so it's not surprising that they're useful, but neither should it be surprising that they're not the whole story.

Saturday, October 28, 2023

AI, AI and AI

I have a draft post from just over a year ago continuing a thread on intelligence in general, and artificial intelligence in particular.  In fact, I have two draft AI posts at the moment.  There's also one from early 2020 pondering how the agents in OpenAI's hide-and-seek demo figure out that an agent on the opposing team is probably hiding out of sight behind a barrier.

I was pretty sure at the time of the earlier draft that they do this by applying the trained neural network not just to the last thing that happened, but to a window of what's happened recently.  In other words, they have a certain amount of short-term memory, but anything like long-term memory is encoded in the neural net itself, which isn't updated during the game.  This ought to produce effects similar to the "horizon effect" in early chess engines, where an engine that could look ahead, say, three moves and see that a line led to a blunder would instead play a move that led to the same blunder, just pushed out to the fourth move, beyond its horizon.
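
To make that concrete, here's a minimal sketch of the mechanism I have in mind.  It's a guess at the general shape, not OpenAI's actual code, and the window size is an arbitrary placeholder.

    from collections import deque

    # Sketch of the guess above: the agent's only short-term memory is a
    # fixed-size window of recent observations; anything longer-term lives in
    # the trained weights, which stay frozen during play.
    WINDOW_SIZE = 8  # arbitrary illustrative value

    class WindowedAgent:
        def __init__(self, policy):
            self.policy = policy                     # trained network, not updated here
            self.window = deque(maxlen=WINDOW_SIZE)  # recent observations only

        def act(self, observation):
            self.window.append(observation)
            # Whatever has scrolled out of the window is effectively forgotten.
            return self.policy(list(self.window))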

I'm still pretty sure that's what's going on, and I was going to put that into a post one of these days as soon as I read through enough of the source material to confirm that understanding, but I never got around to it.

Because ChatGPT-4 happened.

ChatGPT is widely agreed to have been a major game changer and, yeah ... but which games?

From a personal point of view, my musings on how AIs work and what they might be able to do are now relevant to what my employer talks about in its quarterly earnings reports, so that put a damper on things as well.  For the same reason, I'll be staying away from anything to do with the internals or how AI might play into the business side.  As usual, everything here is me speaking as a private citizen, about publicly available information.

Out in the world at large, I recall a few solid months of THIS CHANGES EVERYTHING!!!, which seems to have died down into a steady stream of "How to use AI for ..." and "How to deal with the impact of AI in your job."  I've found some of this interesting, but a lot of it exasperating, which leads me to the title.

There are at least three very distinct things "AI" could reasonably mean at the moment:

  • The general effort to apply computing technology to things that humans and other living things have historically been much better at, things like recognizing faces in pictures, translating from one language to another, walking, driving a car and so forth.
  • Large language models (LLMs) like the ones behind ChatGPT, Bard and company
  • Artificial General Intelligence (AGI), a very vague notion roughly meaning "able to do any mental task a human can do, except faster and better"
There are several posts under the AI tag here (including the previous post) poking and prodding at all three of those, but here I'm interested in the terms themselves.

To me, the first AI is the interesting part, what I might even call "real AI" if I had to call anything that.  It's been progressing steadily for decades.  LLMs are a part of that progression, but they don't have much to do with, say, recognizing faces or robots walking.  All of these applications involve neural networks with backpropagation (I'm pretty sure walking robots use neural nets), but training a neural net with trillions of words of speech won't help it recognize faces or control a robot walking across a frozen pond because ... um, words aren't faces or force vectors?

If you ask a hundred people at random what AI is, though, you probably won't hear the first answer very much.  You'll hear the last two quite a bit, and more or less interchangeably, which is a shame, because they have very little in common.

LLMs are a particular application of neural nets.  They encode interesting information about the contents of a large body of text.  That encoded information isn't limited to what we think of as the factual content of the training text, and that's a significant result.  For example, if you give an LLM an email and ask it to summarize the contents, it can do so even though it wasn't explicitly trained to summarize email, or even to summarize unfamiliar text in general.

To be clear, summarizing an email is different from summarizing part of the text that an LLM has been trained on.  You could argue that in some very broad sense the LLM is summarizing the text it's been trained on when it answers a factual question, but the email someone just sent you isn't in that training text.

Somehow, the training phase has built a model, based in part on some number of examples of summaries, of how a text being summarized relates to a summary of that text.  The trained LLM uses that to predict what could reasonably come after the text of the email and a prompt like "please summarize this", and it does a pretty good job.
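
Seen from the outside, there's nothing special about the summarizing step itself.  As a minimal sketch, with generate standing in for any trained model that continues a prompt by predicting one token after another (not any particular API):

    # Hypothetical sketch: "summarize this email" is just asking the model to
    # continue the email text plus an instruction.  'generate' is a stand-in
    # for any next-token predictor, not a real library call.
    def summarize_email(email_text: str, generate) -> str:
        prompt = email_text + "\n\nPlease summarize the email above.\n\nSummary:"
        return generate(prompt)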

That's certainly not nothing.  There may or may not be a fundamental difference between answering a factual question based on text that contains a large number of factual statements and performing a task based on text that contains examples or descriptions of that task.  In any case, an LLM summarizing an email is doing something that isn't directly related to the text it's been trained on, and that it wasn't specifically trained to do.

I'm pretty sure this is not a new result with LLMs, but seeing the phenomenon with plain English text that any English speaker can understand is certainly a striking demonstration.

There are a couple of reasons to be cautious about linking this to AGI.

First, to my knowledge, there isn't any widely-accepted definition of what AGI actually is.  From what I can tell, there's a general intuition of something like The Computer from mid 20th-century science fiction, something that you can give any question or task to and it will instantly give you the best possible answer.  

"Computer, what is the probability that this planet is inhabited?"
"Computer, devise a strategy to thwart the enemy"
"Computer, set the controls for the heart of the Sun"
"Computer, end conflict among human beings"

This may seem like an exaggeration or strawman, but at least one widely-circulated manifesto literally sets forth that "Artificial Intelligence is best thought of as a universal problem solver".

There's quite a bit of philosophy, dating back centuries and so probably much, much farther, about what statements like that might even mean, but whatever they mean, it's abundantly clear by now that an LLM is not a universal problem solver, and neither is anything else that's currently going under the name AI.

In my personal opinion, even a cursory look under the hood and kicking of the tires ought to be enough to determine that nothing like a current LLM will ever be a universal problem solver.  This is not just a matter of what kinds of responses current LLM-based chatbots give.  It's also a matter of what they are.  The neural net model underpinning this is based pretty explicitly on how biological computers like human brains work.  Human brains are clearly not universal problem solvers, so why would an LLM be?

There's an important distinction here, between "general problem solver", that is, able to take on an open-ended wide variety of problems, and "universal", able to solve any solvable problem.  Human brains are general problem-solvers, but nothing known so far, including current AIs, is universal.

This may sound like the argument that, because neural nets are built and designed by humans, they could never surpass a human's capabilities.  That's never been a valid argument and it's not the argument here.  Humans have been building machines that can exceed human capabilities for a long, long time, and computing machines that can do things that humans can't have been around for generations or centuries, depending on what you count as a computing machine and a human capability.

The point is that neural nets, or any other computing technology, have a particular set of problems that they can solve with feasible resources in a feasible amount of time.  The burden of proof is on anyone claiming that this set is "anything at all", that by building a network bigger and faster than a human brain, and giving it more data than a human brain could hope to handle, neural nets will not only be able to solve problems that a human brain can't -- which they already can -- but will be able to solve any problem.

So next time you see something talking about AI, consider which AI they're referring to.  It probably won't be the first (the overall progress of machines doing things that human brains also do).  It may well be the second (LLMs), in which case the discussion should probably be about things like how LLMs work or what a particular LLM can do in a particular context.

If it's talking about AGI, it should probably be trying to untangle what that means or giving particular reasons why some particular approach could solve some particular class of problems.

If it's just saying "AI" and things on the lines of "Now that ChatGPT can answer any question and AGI is right around the corner ...", you might look for a bit more on how those two ideas might connect.

My guess is there won't be much.