In a previous post, entitled Experiences, mechanisms, behaviors and LLMs, I discussed a couple of strawman objections to the idea that an LLM is doing anything particularly intelligent: that it's "just manipulating text" and that it's "just doing calculations".
The main argument was that "just" is doing an awful lot of work there. Yes, an LLM is "just" calculating and manipulating text, but it's not "just" doing so in the same way as an early system like ELIZA, which just turned one sentence template into another, or even a 90s-era Markov chain, which just generated text based on how often each word appeared directly after another in a sample text.
In both of those cases, we can point at particular pieces of code or data and say "those are the templates it's using", or "there's the table of probabilities" and explain directly what's going on. Since we can point at the exact calculations going on, and the data driving them, and we understand how those work, it's easy to say that the earlier systems aren't understanding text the way we do.
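To make the contrast concrete, here's a minimal sketch of that kind of Markov chain in Python, with a made-up two-sentence corpus. The entire "model" is the follows table, which you can print out and inspect directly.

```python
import random
from collections import defaultdict

# Tiny made-up corpus, standing in for a real sample text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# The whole "model": for each word, how often each other word follows it.
follows = defaultdict(lambda: defaultdict(int))
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def generate(start, length=8):
    """Generate text by repeatedly sampling a next word in proportion
    to how often it followed the current word in the corpus."""
    word, output = start, [start]
    for _ in range(length):
        candidates = follows[word]
        if not candidates:
            break
        word = random.choices(list(candidates), weights=candidates.values())[0]
        output.append(word)
    return " ".join(output)

print(dict(follows["the"]))   # {'cat': 1, 'mat': 1, 'dog': 1, 'rug': 1}
print(generate("the"))
```

If the generator produces something odd, you can trace it back to a specific count in that table; there's nothing else going on.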
We can't do that with an LLM, even if an LLM generating text is doing the same general thing as a simple Markov chain. We can say "here's the code that's smashing tensors to produce output text from input text", and we understand the overall strategy, but the data feeding that strategy is far beyond our understanding. Unlike the earlier systems, there's way, way too much of it. It's structured, but that structure is much too complex to fit in a human brain, at least as a matter of conscious thought. Nonetheless, the actual behavior shows some sort of understanding of the text, without having to stretch the meaning of the word "understanding".
In the earlier post, I also said that even if an LLM encodes a lot about how words are used and in which contexts -- which it clearly does -- the LLM doesn't know the referents of those words -- it doesn't know what it means for water to be wet or what it feels like to be thirsty -- and so it doesn't understand text in the same sense we do.
This feels similar to appeals like "but a machine can't have feelings", which I generally find fairly weak, but that wasn't quite the argument I was trying to make. While cleaning up a different old post (I no longer remember which one), I ran across a reference that sharpens the picture by looking more closely at the calculations/manipulations an LLM is actually doing.
I think the first post I mentioned, on experiences and so forth, puts a pretty solid floor under what sort of understanding an LLM has of text, namely that it encodes some sort of understanding of how sentences are structured and how words (and somewhat larger units) associate with each other. Here, I hope to put a ceiling over that understanding by showing more precisely in what way LLMs don't understand the meaning of text the way that we do.
Taking these together, we can roughly say that LLMs understand the structure of text but not its meaning, though that understanding of structure is deep enough that an LLM can extract information from a large body of text that's meaningful to us.
In much of what follows, I'm making use of an article in Quanta Magazine that discusses how LLMs do embeddings, that is, how they turn a text (or other input) into a list of vectors to feed into the tensor-smashing machine. It matches up well with papers I've read and a course I've taken, and I found it well-written, so I'd recommend it even if you don't read any further here.
Despite the name, a Large Language Model doesn't process language directly. The core of an LLM drives the processing of a list of tokens. A token is a vector -- an ordered list of numbers of a given length -- that represents a piece of the actual input.
To totally make up an example, if vectors are three numbers long, and a maps to (1.2, 3.0, -7.5), list maps to (6.4, -3.2, 1.6), of maps to (27.5, 9.8, 2.0), and vectors maps to (0.7, 0.3, 6.8), then a list of vectors maps to [(1.2, 3.0, -7.5), (6.4, -3.2, 1.6), (27.5, 9.8, 2.0), (0.7, 0.3, 6.8)].
Here I'm using parentheses for vectors, which in this case always have three numbers, and square brackets for lists, which can have any length (including zero for the empty list, []). In practice, the vectors will have many more than three components. Thousands is typical. The list of vectors encoding a text will be however long the text is.
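As a sketch of what that mapping looks like in code, here's the made-up example above written out in Python (the words, vectors and numbers are invented purely for illustration; a real embedding table has thousands of components per vector and a vocabulary of tens of thousands of entries):

```python
# Toy embedding table: each word maps to a fixed-length vector.
# The numbers are made up purely for illustration.
embedding = {
    "a":       (1.2,  3.0, -7.5),
    "list":    (6.4, -3.2,  1.6),
    "of":      (27.5, 9.8,  2.0),
    "vectors": (0.7,  0.3,  6.8),
}

def embed(text):
    """Turn a text into the list of vectors the model actually processes."""
    return [embedding[word] for word in text.split()]

print(embed("a list of vectors"))
# [(1.2, 3.0, -7.5), (6.4, -3.2, 1.6), (27.5, 9.8, 2.0), (0.7, 0.3, 6.8)]
```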
The particular mapping from input to tokens is called the embedding*. The overall idea is to encode similarities along various dimensions. There are (practically) infinitely many ways to do this mapping. Over time this has evolved from a mostly-manual process, to an automated process using hand-written code, to the current state of the art, which uses machine learning techniques on large bodies of text. The first two approaches are pretty easy to understand.
An ML-produced embedding, on the other hand, is a mass of numbers created during a training phase. This mass of numbers drives a generic algorithm that turns words into large vectors. While the numbers themselves don't really lend themselves to easy analysis, people have noticed interesting patterns in the results of applying the embedding.
Because the model-building phase is looking at streams of text, it's not surprising that the embedding itself captures information about which words appear in which contexts in that text. For example, in typical training corpora, dog and cat appear much more often in contexts like my pet ___ than, say, chair does. They are also likely to occur in conjunction with terms like paw and fur, while other words won't, and so forth.
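A crude way to see how that kind of information falls out of raw text is simply to count contexts. The sketch below uses a tiny invented corpus and counts which words fill the blank in my pet ___; in anything like real training data, dog and cat would dominate the counts and chair would barely appear.

```python
from collections import Counter

# Tiny invented corpus, standing in for billions of words of training text.
corpus = [
    "my pet dog likes to chase the ball",
    "my pet cat sleeps on the chair",
    "i bought a new chair for the office",
    "my pet dog has soft fur and big paws",
]

# Count which words appear immediately after the context "my pet".
fills = Counter()
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 2):
        if words[i] == "my" and words[i + 1] == "pet":
            fills[words[i + 2]] += 1

print(fills)  # Counter({'dog': 2, 'cat': 1}) -- 'chair' never fills the slot
```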
While we don't really understand exactly how the embedding-building stage of training an LLM extracts relations like this, the article in Quanta gives the example that in one particular embedding the vector for king minus the one for man plus the one for woman is approximately equal to the one for queen (you add or subtract vectors component by component, so (1.2, 3.0, -7.5) + (6.4, -3.2, 1.6) = (7.6, -0.2, -5.9) and so on).
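Here's a sketch of that arithmetic in Python, with a handful of made-up low-dimensional vectors chosen so the analogy works out (real embeddings have thousands of learned dimensions): compute king minus man plus woman component by component, then look for the nearest remaining word by cosine similarity.

```python
import numpy as np

# Made-up low-dimensional vectors, chosen so the analogy works out;
# a real embedding has thousands of dimensions learned from text.
vectors = {
    "man":   np.array([1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 1.0, 0.0]),
    "king":  np.array([1.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
    "chair": np.array([0.2, 0.2, -1.0]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# king - man + woman, computed component by component.
target = vectors["king"] - vectors["man"] + vectors["woman"]

# Find the closest word by cosine similarity, excluding the inputs.
best = max(
    (w for w in vectors if w not in {"king", "man", "woman"}),
    key=lambda w: cosine(target, vectors[w]),
)
print(best)  # queen
```

In a real embedding the match is only approximate, which is why the answer is the nearest word rather than an exact equality.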
It's long been known that use in similar contexts correlates with similarity in meaning. But we're talking about implied similarities in meaning here, not actual meanings. You can know an analogy like cat : fur :: person : hair without knowing anything about what a cat is, or a person, or fur or hair.
That may seem odd from our perspective. A person would solve a problem like cat : fur :: person : ? by thinking about cats and people, and about what a person has that plays the role fur does for a cat, because we're embodied in the world and have experience of hair, cats, fur and so forth. Odd as it might seem to know that cat : fur :: person : hair without knowing what any of those things is, that's essentially what's going on with an LLM. It understands relations between words, based on how they appear in a mass of training text, but that's all it understands.
But what, exactly, is the difference between understanding how a word relates to other words and understanding what it means? There are schools of thought that claim there is no difference. The meaning of a word is how it relates to other words. If you believe that, then there's a strong argument that an LLM understands words the same way we do, and about as well as we do.
Personally, I don't think that's all there is to it. The words we use to express our reality are not our reality. For one thing, we can also use the same words to express completely different realities. We can use words in new ways, and the meaning of words can and does shift over time. There are experiences in our own reality that defy expression in words.
Words are something we use to convey meaning, but they aren't that meaning. Meaning ultimately comes from actual experiences in the real world. The way words relate to each other clearly captures something about what they actually mean -- quite a bit of it, by the looks of things -- but just as clearly it doesn't capture everything.
I have no trouble saying that the embeddings that current LLMs use encode something significant about how words relate to each other, and that the combination of the embedding and the LLM itself has a human-level understanding of how language works. That's not nothing. It's something that sets current LLMs apart from anything before them, and it's an interesting result. For one thing, it goes a long way toward clarifying what's understanding of the world and what's just understanding of how language works.
If an LLM is good at it, then it's something about how language works. If an LLM isn't good at it, then it's probably something about the world itself. I'll have a bit more to say about that in the next (shorter) post.
Because LLMs know about language, but not what it represents in the real world, we shouldn't be surprised that LLMs hallucinate, and we shouldn't expect them to stop hallucinating just because they're trained on larger and larger corpora of text.
The earlier post distinguished among behavior, mechanism and experience. An LLM is capable of linguistic behavior very similar to a person's.
The mechanism of an LLM may or may not be similar to ours when it comes to language processing. We may well learn rules, like how the is used in relation to nouns, in a way that's similar to how an LLM is trained. Whether that's the case or not, an LLM, by design, lacks a mechanism for tying words to anything in the real world. This probably accounts for much of the difference between what we would say and what an LLM would say.
All of this is separate from subjective experience. One could imagine a robot that builds up a store of interactions with the world, processes them into some more abstract representation and associates words with them. But even if that is more similar to what we do in terms of mechanism, it says nothing about what the robot might or might not be experiencing subjectively, even if it becomes harder to rule out the possibility that the robot is experiencing the world as we do.
* Wikipedia seems to think it's only an embedding if it's done using feature learning, but that seems overly strict. Mathematically, an embedding is a map from one domain into another that preserves some kind of structure, no matter how the map is produced.