Tuesday, September 2, 2025

Frankenstein and the Gray Goo

[Ed. note: The original version of this post referred to Mary Wollstonecraft Shelley as Mary Wollstonecraft and said that she "won" the famous Geneva horror story contest. While Mary and Percy Shelley were not married when the contest took place in the summer of 1816, Mary was already going by the name Mary Shelley. The two would marry in December of that year. The novel Frankenstein would be published anonymously in 1818 and later, in 1821, under the name Mary W. Shelley. Before she took the name Shelley, Mary went by Mary Wollstonecraft Godwin, after her mother, Mary Wollstonecraft, and her father, William Godwin. As far as I know, she never went by Mary Wollstonecraft.

There doesn't seem to be a lot available about the contest itself -- sadly, no one seems to have posted about it to their social media at the time, leaving us to rely on whatever diaries and secondhand accounts were preserved -- so it's not really meaningful to talk about winners and losers. However, it is well established that Mary Shelley had the idea for the story in a dream during the party's stay in Geneva, wrote it up as a short story around that time and later expanded it into the novel we know today -- D.H. 6 Sep 2025]

In 1986, so about 40 years ago, K. Eric Drexler's Engines of Creation was published. It was a pretty big deal at the time. I'm pretty sure I didn't read the actual book, but its themes were widely discussed and I do recall reading several articles examining it, and the concepts in it, in depth.

All of which is why it was something of a mild shock to realize I'd forgotten all about it.

Engines of Creation talks about technology that mimics a fundamental process in living things, technology so powerful that it seemed quite possible it would enter a runaway feedback loop, amplifying itself without limit, rapidly evolving beyond human control. Even if that could be prevented, the technology had such profound implications that it was bound to have a major impact on all aspects of human activity. It would make previously impossible things routine and change the lives of every person on the planet in ways we could only hope to anticipate. Best to understand it and get on board, or risk being left behind forever.

If it seems like I'm deliberately using the same kind of language that's now used to talk about AI while avoiding telling you what I'm actually talking about, well, busted.

As you may already know from the book title, I'm talking about nanotechnology. Today, the term refers to any technology that operates on a scale below the somewhat arbitrary limit of 100 nanometers, or 0.1 microns, or about a thousandth of the thickness of a human hair, or about the size of many viruses.

There have been significant developments in nanotechnology since 1986, for example in developing antimicrobial materials and stain-resistant fabrics, but what caught the public attention at the time was the idea of a universal assembler, a hypothetical nanomachine that could put individual atoms together in any desired (physically possible) configuration. Somehow.

Since a universal assembler would itself be an arrangement of atoms, it should be possible for a universal assembler to create copies of itself, and we're off to the races. So as long as the atoms it needed were available, an assembler could rearrange them into more universal assemblers, which in turn could do the same. Exponential growth being what it is, this process would soon produce tons of replicators, and so on up to any quantity you like, assuming, as Drexler says, "the bottle of chemicals [this is happening in] hadn't run dry long before".

A couple of years after the book was published, some people at IBM used a scanning tunneling microscope to spell out the letters "IBM" in xenon atoms on a substrate of nickel. How hard could it be then to build up a universal assembler atom by atom?

Drexler's book actually covered a number of topics, mostly to do with nanotechnology. As part of that, Drexler spent a couple of paragraphs discussing the idea of universal assemblers assembling more universal assemblers. Once you introduce the idea of a universal assembler, you kind of have to talk about that. He called the scenario gray goo, with the explanation that "Though masses of uncontrolled replicators need not be grey or gooey, the term 'grey goo' emphasizes that replicators able to obliterate life might be less inspiring than a single species of crabgrass."

In other words, we shouldn't assume that something dangerous would be big and spectacular. It might just as well be an amorphous grey goo made up of very tiny, but still dangerous, little machines.

As I understand it, Drexler wasn't claiming that the gray goo scenario was inevitable. Drexler himself later said that there wasn't any good reason to try to build a universal replicator, and later analyses by others suggested that the actual risk of runaway gray goo is quite small. Even so, it's not hard to see why the idea of gray goo might take off anyway.

Drexler's original scenario involved a "dry" replicator that needed a supply of simple chemicals to work with, but surely something that could assemble atoms at will could also disassemble more complex structures into raw material. This gives us the nightmare scenario of a blob of gray goo that could turn whatever was around it into more gray goo, leaving behind only whatever it didn't need to make more assemblers.

Since elements like carbon, hydrogen and oxygen can combine very flexibly into a wide variety of configurations, it's a good bet that a universal assembler would use them as raw material. Since those are also the main materials that living things like human beings are made of, there's a certain potential for conflict here. Yes, we contain other elements that might not be useful to the goo, but having a small residue of calcium, phosphorus and such make it through unscathed seems like cold comfort in the larger picture.


Unlike, say, time travel or perpetual motion, the gray goo scenario doesn't violate any known laws of physics. In fact, we know that it's possible for collections of atoms to make copies of themselves. That's life (yeah, sorry).

However, at least as far as life is concerned, we also know that the mechanisms to do this are very complex and hard to predict, much less control. It's an interesting situation, really. There's nothing going on in cell metabolism and cell division that doesn't boil down to well-studied physics and chemical reactions. We know quite a bit about many individual reactions, the structure of cells, processes like DNA replication and RNA transcription and so on. In that sense, there's no mystery.

Nonetheless, molecular biology is full of mysteries that molecular biologists have been struggling for generations to get a handle on. For example, given a DNA sequence coding a protein, it's easy to read off exactly what amino acids that protein will consist of. But a protein isn't a simple sequence of amino acids. It's a three-dimensional structure that interacts with other chemicals in the cell, including but not limited to other proteins.

Exactly how a given sequence of amino acids will fold up into a three-dimensional structure (or one of several possible structures) and how it will interact with other chemicals is still a wide-open topic for research. There have been significant advances in recent years, but it's worth noting that the most successful approach to protein folding so far isn't simulating the physics of how the atoms in the protein will interact.

The current state of the art  for protein folding is AlphaFold, a machine-learning model that in some sense is basically going "Meh ... this matches up with this, this and that in the training set, and those folded up this way, so yeah ... it'll probably be something like this." Yes, I'm being very glib here with something that won Nobel prizes (deservedly, I'd say, for whatever my opinion is worth here), but the point is that the best approach so far is to give up on understanding what's going on physically and do very sophisticated pattern matching.


All of this is to say that we only know of one workable way for collections of atoms to produce copies of themselves, and there is an incredible amount we don't know about how that actually works. Even though life is everywhere in our environment, including places that until recently hardly anyone had the imagination to think it might be, almost all of the Earth is non-living matter -- rocks, magma, ocean water, air and such. In other words, after a few billion years of collections of atoms copying themselves, we do not have a gray goo scenario.

The nightmare gray goo scenario depends on a number of assumptions:

  • That universal assemblers are even possible. A true universal assembler would be able to arrange atoms into any physically possible arrangement. Living systems can produce more living systems. They can produce all manner of interesting chemicals and profoundly transform the world around them, but a universal assembler would be able to produce any chemical structure. Living things can assemble collections of particular molecules, such as nucleotides, amino acids, carbohydrates and lipids. There's no microbe that could put a xenon atom in a particular place on a nickel substrate. In other words, the existence of life doesn't demonstrate that universal assemblers are possible, and life is the only thing we know of that can self-reproduce at scale.
  • That it would be feasible to build one. The IBM demo, impressive though it was, used a human-scale machine to put a particular kind of atom on a carefully prepared substrate. Xenon is a noble gas, meaning that it's very unreactive. Unlike, say, oxygen, a xenon atom is not going to try to bind to the substrate or whatever else is around while you're maneuvering it into place. The IBM demo arranged 35 atoms on a flat surface. This is a far cry from arranging -- thousands? millions? -- of atoms in a complex three-dimensional structure that can move.
  • That we would be able to build an autonomous programmable universal assembler. It would be one thing to have an assembler that could receive instructions from the outside world, on the order of "put this atom here" and then "put that atom there", but a true self-reproducing assembler would have to carry its instructions with it, just as the DNA in a living cell carries the instructions for reproducing the cell.
But even the terms I've been discussing this in are misleading. I've been using phrases like "arrangements of atoms", but what we're really talking about here is chemistry. If you browse the Wikipedia pages on nanotechnology, you'll see illustrations of things that look like tiny machines -- wheels, axles, levers, that sort of thing. But at the scale of individual atoms, our notions of how things move and interact break down entirely.

The six simple machines taught in school operate at a human scale. It's easy to imagine building a molecule shaped like the wheel of a pulley and a polymer to act like rope, but how exactly would you use it? You'd also need an axle for the pulley wheel, and something to hold that axle in place, and something to pull on the rope, and a way to attach something to the other end. All of this is happening at the scale of atoms, which means that everyone's electrons are interacting with everyone else's, trying to find a lower-energy configuration ... my understanding of chemistry is pretty rudimentary, but none of this looks like chemistry as I understand it.

What convinced anyone (though not necessarily Drexler) that we were anywhere near being able to bootstrap a world of universal assemblers that might eventually consume not just all life on Earth, but potentially all matter that could be consumed? I'm not being rhetorical here. I mean, literally what are the thought processes that led to this idea taking off?


For one thing, feedback loops seem to be catnip for a certain kind of brain, my own included. I spent hours and hours as a kid reading and pondering Gödel, Escher, Bach, contemplating self-reference, strange loops, the use/mention distinction and so on. One fun way to play around with these ideas is to write a Quine, a program which produces itself as its output. Every compugeek should do it from scratch at least once. A key milestone in developing a new language is to write a compiler for the language in the language itself and then use the compiler to compile itself (after compiling it with an earlier compiler written in a language that already exists). In a post on the other blog, I mentioned Doug Engelbart's NLS team using NLS to further develop NLS.
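
For the curious, here is one classic minimal Python version -- the program's output is exactly its own two lines of source. Half the fun is working one out yourself, so consider this a spoiler:

```python
s = 's = %r\nprint(s %% s)'
print(s % s)
```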

In other words, ideas like machines that can build anything, including themselves, or AI systems that can write any code, including code for better AI systems, come naturally to at least some people. I don't think you have to have any training in computing or mathematical logic to hit on ideas like this, but it helps (and on the other hand if you're not already a compugeek it could be a sign that you might enjoy learning more about computing).

The idea of a self-reinforcing feedback loop is compelling and cool, cool enough that it's very easy to get caught up in the implications and brush aside the hidden assumptions that inevitably pile up along the way.


I think there's also another factor at play.

In 1791, Luigi Galvani published his findings about animals and electricity, including the discovery that applying an electric shock to the nerves of a dead frog would cause the legs to move. Alessandro Volta developed an electric battery a few years later, partly in order to demonstrate that electricity could be created by a chemical reaction, as opposed to it being a "vital force" created specifically by living things, but the idea of Galvanism, as Volta himself called it, continued to be widely discussed.

In 1816, while on holiday in Geneva, Mary Shelley, her soon-to-be-official-husband Percy Shelley, John Polidori and Lord Byron held a contest to see who could write the best horror story (as one does). The details of the contest have faded with time, but we know that it inspired Shelley to write a short story that eventually became the novel Frankenstein, or The Modern Prometheus.

As the subtitle implies, one main theme of the story is humanity dealing with life-changing powers it little understands, in this case electricity. Just as Prometheus stole fire from the gods to bring to humanity and paid dearly for it, so Dr. Frankenstein uses electricity to bring power over life itself, and pays dearly.

Today it may seem silly to think that you could reanimate dead flesh just by shocking the bejeezus out of it, but was this really any more outlandish than thinking that if you just put the right atoms together in the right arrangement you could create something that could reproduce itself without limit? Yes, Volta had demonstrated that you could produce electricity from non-living matter, but if all animals produced electricity as well, surely there was something about electricity that was essential to animal life. If you don't know any of the details of how electricity is involved in animal life, it would be easy to think that the mere presence of electric current is all there is to it. Animal life means electricity, so electricity must mean life.

When faced with something new that touches on fundamentals like life, matter or thought, it's sensible to consider the implications. When considering something so fundamental, it's natural to see at least the potential for world-shattering changes and even to feel some measure of awe.

Just as Drexler wasn't so much predicting the advent of gray goo as trying to understand the implications of nanotechnology and its potential for escaping our control, Mary Shelley wasn't predicting armies of reanimated corpses, but discussing the implications of our ability to produce electricity and apply it to animal tissue, and the potential for that ability to outrun our ability to control it. This being the Romantic period, she wasn't alone.

These are worthwhile questions to investigate. What if we learn the secrets of bringing the dead back to life? What if we can create tiny devices to arrange matter in any form we like, including the form of those devices themselves? What if we create machines that are more intelligent than us, and those machines figure out how to make more machines that are even more intelligent?

But as we discuss the implications of a new technology, it's important not to lose track of how things would actually happen. It's fine to brush the details aside in a discussion of what's possible in principle. How can you discuss the implications of a universal assembler without assuming that universal assemblers are possible, one way or another? But when the discussion turns back to what do we do now, in the world we actually live in, the assumptions previously brushed aside have to come back into the conversation.

Friday, July 4, 2025

Update: AB and NN chess engines

When I last looked in on computer chess, it hadn't been too long since AlphaZero had made waves by beating Stockfish after spending nine hours training by playing games against itself with no outside interference. As I understand it, the configuration that Stockfish was running wasn't its strongest, but this result was still impressive: A chess engine that looked at relatively few positions but used a neural network to evaluate them (an "NN" engine) beat an engine that looked at billions using a hand-tuned human-written algorithm (an "AB" engine). Soon an open-source engine based on AlphaZero, Leela Chess Zero (LC0), was doing impressively well in tournaments.

The hallmark of NN engines was that they would play wild-looking moves that neither a human chess master nor an engine like Stockfish would have played at the time, moves which looked risky or even downright reckless, but often turned out to lead to a crushing advantage, all of this because similar-looking moves had led to good results in practice games.

At this writing, LC0 is still doing quite well in tournaments, but not quite as well as Stockfish, which consistently beats it. So AB wins, right?

Well, not quite. At the heart of an AB engine is the evaluation function, which takes a position on the board and returns a number that says how good the position is. The rest of the engine is dedicated to efficiently searching the tree of possible moves, replies to moves, replies to replies and so on typically a few dozen levels, to find the move that leads to the best possible positions against the opponent's best moves, according to the evaluation function.

There is a whole lot of software engineering behind making this as efficient as possible, including a technique called alpha-beta pruning that gave rise to the "AB" designation. The principle behind alpha-beta pruning is simple: stop looking at the continuations from a move as soon as you know the opponent has a reply that makes it worse for you than your current best alternative. Even so, my brain gets completely befuddled when I try to follow the code, probably because the rule is applied recursively for both sides, so the meaning of "better" flips each time you switch sides in the search (alpha represents the score of the player's best move so far in the search; beta represents the same thing for the opponent).
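
Here's a minimal sketch of the idea in Python, over a toy game tree rather than real chess positions (so none of this is any engine's actual code); the sign- and bound-flipping in the recursive call is exactly the part that trips me up:

```python
# A minimal negamax-style alpha-beta search over a toy game tree.
# A "node" is either a number (the evaluation of that position, from the point
# of view of the side to move there) or a list of child nodes (the positions
# reachable in one move).

def alphabeta(node, alpha, beta):
    # alpha: best score the side to move is already guaranteed elsewhere
    # beta:  best score the opponent is already guaranteed; if we can beat it,
    #        the opponent will never let the game reach this node, so stop
    if isinstance(node, (int, float)):
        return node
    best = float("-inf")
    for child in node:
        # The bounds flip and negate each ply: the opponent's alpha is our -beta.
        score = -alphabeta(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the remaining children of this node can't matter
    return best

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, float("-inf"), float("inf")))  # prints 3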

Until recently, evaluation functions had been carefully crafted to extract features from a position, like how much material each side had, which pieces had good or bad mobility, how each side's pawns were structured and so forth, and combine those using carefully-selected rules to arrive at a final evaluation.

A significant part of this is figuring out how much weight to assign to each feature in what circumstances. Essentially, this means answering questions like "Is it better to have an extra pawn, or better mobility and pawn structure?". The actual answer is "It depends. We need a rule for deciding how much weight to give each of the features we extracted." This in turn might vary depending on the particulars of the situation. Some things are more important in the middlegame, where there is more material on the board, than in the endgame, for example.
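
As a sketch of what that kind of hand-tuned weighting might look like -- every number here is made up purely for illustration, and the feature scores are assumed to have been extracted from the position elsewhere:

```python
def evaluate(material, mobility, pawn_structure, phase):
    """Toy hand-crafted evaluation; positive numbers favor the side to move.

    material, mobility, pawn_structure: feature scores extracted elsewhere,
        in rough units of pawns
    phase: 1.0 = full middlegame, 0.0 = bare endgame, based on material left
    """
    middlegame = 1.00 * material + 0.10 * mobility + 0.20 * pawn_structure
    endgame    = 1.00 * material + 0.05 * mobility + 0.40 * pawn_structure
    # Blend the two answers by game phase, so mobility counts for more early
    # and pawn structure counts for more late.
    return phase * middlegame + (1.0 - phase) * endgame

# An extra pawn, but worse mobility and pawn structure:
print(evaluate(material=1.0, mobility=-4.0, pawn_structure=-1.5, phase=0.8))
```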

One of the reasons for Stockfish's success is its well-designed test framework for evaluating new code, including new evaluation functions. Different versions of the engine, including versions with different evaluation functions, are systematically played against each other and only changes that win make it into the next version.

Extracting features and carefully tuning various parameters that determine how to combine them certainly seems like what I previously called  an "ML-friendly problem", and it didn't take too long for someone to try that out. The result was the NNUE, a neural network that takes the positions of the pieces, with special attention given to the kings, and produces a numerical evaluation. The NNUE was good enough in testing to find its way into the official release, where it remains to this day.

So NN wins, right?

Well, not quite. A pure NN engine like LC0 is applying a large and relatively quite slow neural net to a comparatively tiny number of positions. It doesn't look ahead very far. In principle, an NN engine might look at only the positions after each possible move in the current position, typically a couple dozen. In practice, they look at hundreds of thousands, which is far more than a human player could, but still far fewer than an AB engine does. The power of an NN engine comes from the weightings in its neural net, which in turn come from playing large numbers of training games.

By comparison, the NNUE is tiny. Here's a picture of its weightings for one particular release, and here's a little more technical detail. The NNUE has around a hundred thousand one-bit input parameters and four layers. A parameter file runs to a few dozen megabytes, most of which are for the input weights in the first layer. 

Just as importantly "The efficiency of NNUE is due to incremental update of the input layer outputs in make and unmake move, where only a tiny fraction of its neurons need to be considered in case of [no] king moves." This is the result of hand-optimization, not some emergent property of the neural net.

LC0's network is much larger, though still tiny compared to the ChatGPTs of the world (which don't even really know the rules of chess, as this fairly sharply-worded piece argues). 

If that's all too vague for you (it is for me), the NNUE code runs on a standard CPU and can do hundreds of millions of evaluations per second, while LC0 prefers running its network on a GPU and does tens of thousands of evaluations per second.

By looking at orders of magnitude more positions than LC0, Stockfish is in effect trusting its neural network much, much less than an NN engine does and instead relies on very deep searches to determine which move to play.

Put another way, its actual evaluation is the aggregate of billions of simplistic evaluations, which happen to use a small neural net, rather than a few hundred thousand sophisticated evaluations using a much larger neural net. More simply yet, Stockfish is looking at many, many positions quickly while an NN engine is looking at many fewer positions more carefully.

The NNUE is essentially automating the process of extracting features from a position and deciding how to combine them. There's nothing particularly mysterious going on. As far as I understand it, its evaluations are similar to those produced by the older code, though different enough to lead to better outcomes when fed into the AB algorithm.

Even in the case of NN engines, the neural net isn't doing all the work. It's still running in a framework of "look at the possible moves, look at the replies to each move, and so on, with AB pruning". That framework wasn't created by a neural net. It was coded for computers by humans decades ago, in the 1950s, to automate something human chess players already did.

That is, a naturally-evolved neural network, the human brain, developed both the concept of looking at moves and counter-moves and its realization as code. No LLM has developed code for a successful chess engine, or even come anywhere close*. This is, at least so far, a notable difference between LLMs and natural neural nets.

Within the framework that actual chess engines are built on, it turns out that a bit of neural network-based code can be helpful. Past a certain quite small size, though, adding more NN doesn't seem to help.


* To be a really fair test, the LLM would need to have been trained on a corpus that only mentioned, say, tree searches and the rules of chess, without mentioning anything like alpha-beta pruning or the idea of applying tree-searching to the problem of playing chess. I think it's a very good bet that current chatbots don't meet that standard, so if you've seen something like chess-engine code generated by an LLM, the simplest explanation is that there are similar things in the corpus it was trained on. This is to say nothing of actually producing a full chess engine that uses the tree search as its basis.



Saturday, April 5, 2025

ML-friendly problems and unexpected consequences

This started out as "one more point before I go" in the previous post, but it grew enough while I was getting that one ready to publish that it seemed like it should have its own post.


Where machine learning systems like LLMs do unexpectedly well, like in mimicking our use of language, it might not be because they've developed unanticipated special abilities. Maybe ML being good at generating convincing text says as much about the problem of generating convincing text as it does about the ML doing it.

The current generation of chatbots makes it pretty clear that producing language that's hard to distinguish from what a person would produce isn't actually that hard a problem, if you have a general pattern-matcher (and a lot of training text and computing power). In that case, the hard part, that people have spent decades trying to perfect and staggering amounts of compute power implementing, is the general pattern-matcher itself.

We tend to look at ML systems as problem solvers, and fair enough, but we can also look at current ML technology as a problem classifier. That is, you can sort problems according to whether ML is good at them. From that point of view, producing convincing text, recognizing faces, spotting tumors in radiological images, producing realistic (though still somewhat funny-looking) images and videos, spotting supernovas in astronomical images, predicting how proteins will fold, along with many other problems, are all examples of pattern-matching that a general ML-driven pattern-matcher can solve as well as, or even better than, our own naturally evolved neural networks can.

Not knowing a better term, I'll call these ML-friendly problems. In the previous post, I argued that understanding the structure of natural languages is a separate problem from understanding what meaning natural language is conveying. Pretty clearly, understanding the structure of natural languages is an ML-friendly problem. If you buy that understanding meaning is a distinct problem, I would argue that we don't know one way or another whether it's ML-friendly, partly, I would further argue, because we don't know nearly as much about what that problem involves.


From about 150 years ago into the early 20th century, logicians made a series of discoveries about what we call reasoning and developed formal systems to describe it. This came out of a school of thought, dating back to Leibniz (and as usual, much farther and wider if you look for it), holding that if we could capture rules describing how reasoning worked, we could use those rules to remove all uncertainty from any kind of thought.

Leibniz envisioned a world where, "when there are disputes among persons, we can simply say: Let us calculate, without further ado, to see who is right". That grand vision failed, of course, both because, as Gödel and others discovered, formal logic has inescapable limitations, but also because formal reasoning captures only a small portion of what our minds actually do and how we reason about the world.

Nonetheless, it succeeded in a different sense. The work of early 20th-century logicians was essential to the development of computing in the mid-20th century. For example, LISP -- for my money one of the two most influential programming languages ever, along with ALGOL -- was based directly on Church's lambda calculus. I run across and/or use Java lambda expressions on a near-daily basis. For another example, Turing's paper on the halting problem used the same proof technique of diagonalization that Gödel borrowed from Cantor to prove incompleteness, and not by accident.


Current ML technology captures another, probably larger, chunk of what naturally-evolved minds do. Just as formal logic broke open a set of problems in mathematics, ML has broken open a set of problems in computing. Just as formal logic didn't solve quite as wide a range of problems as people thought it might, ML might not solve quite the range of problems people today think it might, but just as formal logic also led to significant advances in other ways, so might ML.


Embedding and meaning

In a previous post entitled Experiences, mechanisms, behaviors and LLMs, I discussed a couple of strawman objections to the idea that an LLM isn't doing anything particularly intelligent: that it's "just manipulating text" and it's "just doing calculations".

The main argument was that "just" is doing an awful lot of work there. Yes, an LLM is "just" calculating and manipulating text, but it's not "just" doing so in the same way as an early system like ELIZA, which just turned one sentence template into another, or even a 90s-era Markov chain, which just generates text based on how often which words appeared directly after which others in a sample text.
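
For comparison, a 90s-style Markov chain generator really does fit in a few lines; this is a minimal sketch, not any particular historical program:

```python
import random
from collections import defaultdict

def build_chain(text):
    """For each word, collect every word that directly follows it in the sample."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=20):
    """Walk the chain, picking each next word with its observed frequency."""
    word, output = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)  # duplicates make this frequency-weighted
        output.append(word)
    return " ".join(output)

sample = "the cat sat on the mat and the dog sat on the cat"
print(generate(build_chain(sample), "the"))
```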

In both of those cases, we can point at particular pieces of code or data and say "those are the templates it's using", or "there's the table of probabilities" and explain directly what's going on. Since we can point at the exact calculations going on, and the data driving them, and we understand how those work, it's easy to say that the earlier systems aren't understanding text the way we do.

We can't do that with an LLM, even if an LLM generating text is doing the same general thing as a simple Markov chain. We can say "here's the code that's smashing tensors to produce output text from input text", and we understand the overall strategy, but the data feeding that strategy is far beyond our understanding. Unlike the earlier systems, there's way, way too much of it. It's structured, but that structure is much too complex to fit in a human brain, at least as a matter of conscious thought. Nonetheless, the actual behavior shows some sort of understanding of the text, without having to stretch the meaning of the word "understanding".

In the earlier post, I also said that even if an LLM encodes a lot about how words are used and in which contexts -- which it clearly does -- the LLM doesn't know the referents of those words -- it doesn't know what it means for water to be wet or what it feels like to be thirsty -- and so it doesn't understand text in the same sense we do.

This feels similar to appeals like "but a machine can't have feelings", which I generally find fairly weak, but that wasn't quite the argument I was trying to make. While cleaning up a different old post (I no longer remember which one), I ran across a reference that sharpens the picture by looking more closely at the calculations/manipulations an LLM is actually doing.

I think the first post I mentioned, on experiences etc. puts a pretty solid floor under what sort of understanding an LLM has of text, namely that it encodes some sort of understanding of how sentences are structured and how words (and somewhat larger units) associate with each other. Here, I hope to put a ceiling over that understanding by showing more precisely in what way LLMs don't understand the meaning of text in the way that we do.

Taking these together, we can roughly say that LLMs understand the structure of text but not the meaning, but the understanding of structure is deep enough that an LLM can extract information from a large body of text that's meaningful to us.

In much of what follows, I'm making use of an article in Quanta Magazine that discusses how LLMs do embeddings, that is, how they turn a text (or other input) into a list of vectors to feed into the tensor-smashing machine. It matches up well with papers I've read and a course I've taken, and I found it well-written, so I'd recommend it even if you don't read any further here.


Despite the name, a Large Language Model doesn't process language directly. The core of an LLM drives the processing of a list of tokens. A token is a vector -- an ordered list of numbers of a given length -- that represents a piece of the actual input.

To totally make up an example, suppose vectors are three numbers long. If the word a maps to (1.2, 3.0, -7.5), list maps to (6.4, -3.2, 1.6), of maps to (27.5, 9.8, 2.0),  and vectors maps to (0.7, 0.3, 6.8), then a list of vectors maps to [(1.2, 3.0, -7.5), (6.4, -3.2, 1.6), (27.5, 9.8, 2.0), (0.7, 0.3, 6.8)].

Here I'm using parentheses for vectors, which in this case always have three numbers, and square brackets for lists, which can have any length (including zero for the empty list, []). In practice, the vectors will have many more than three components. Thousands is typical. The list of vectors encoding a text will be however long the text is.
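
As code, the made-up example above is nothing more than a lookup table. A real embedding has thousands of components per vector and is learned rather than written out by hand, but the overall shape is the same:

```python
# The made-up three-component embedding from the example above.
embedding = {
    "a":       (1.2, 3.0, -7.5),
    "list":    (6.4, -3.2, 1.6),
    "of":      (27.5, 9.8, 2.0),
    "vectors": (0.7, 0.3, 6.8),
}

def embed(text):
    """Turn a whitespace-separated text into a list of vectors."""
    return [embedding[word] for word in text.split()]

print(embed("a list of vectors"))
# [(1.2, 3.0, -7.5), (6.4, -3.2, 1.6), (27.5, 9.8, 2.0), (0.7, 0.3, 6.8)]
```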

The particular mapping from input to tokens is called the embedding*.   The overall idea is to encode similarities along various dimensions. There are (practically) infinitely many ways to do this mapping. Over time this has evolved from a mostly-manual process, to an automated process using hand-written code, to the current state of the art, which uses machine learning techniques on large bodies of text. The first two approaches are pretty easy to understand.

An ML-produced embedding (that is, the procedure for turning an actual list of words into tokens), on the other hand, relies on a mass of numbers created during a training phase. This mass of numbers drives a generic algorithm that turns words into large vectors. While the numbers themselves don't really lend themselves to easy analysis, people have noticed interesting patterns in the results of applying embedding.

Because the model-building phase is looking at streams of text, it's not surprising that the embedding itself captures information about what words appear in what contexts in that text. For example in typical training corpora, dog and cat appear much more often in contexts like my pet ___ than, say, chair does. They are also likely to occur in conjunction with terms like paw and fur, while other words won't, and so forth.

While we don't really understand exactly how the embedding-building stage of training an LLM extracts relations like this, the article in Quanta gives the example that in one particular embedding the vector for king minus the one for man plus the one for woman is approximately equal to the one for queen (you add or subtract vectors component by component, so (1.2, 3.0, -7.5) + (6.4, -3.2, 1.6) = (7.6, -0.2, -5.9) and so on).
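
The arithmetic itself is as simple as it sounds. Here's a small sketch using the toy three-component vectors from above; the king/man/woman/queen effect only shows up with vectors from a real trained embedding, so this just shows the operations, not the result:

```python
import math

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def subtract(u, v):
    return tuple(a - b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """The usual measure of how close two embedding vectors point."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v)

# The component-by-component addition from the text:
print(add((1.2, 3.0, -7.5), (6.4, -3.2, 1.6)))  # approximately (7.6, -0.2, -5.9)

def analogy(a, b, c):
    """king - man + woman; with real embedding vectors, the token whose vector
    has the highest cosine similarity to the result tends to be queen."""
    return add(subtract(a, b), c)
```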

It's long been known that use in similar contexts correlates with similarity in meaning. But we're talking about implied similarities in meaning here, not actual meanings.  You can know an analogy like cat : fur :: person : hair without knowing anything about what a cat is, or a person, or fur or hair.

That may seem odd from our perspective. A person would solve a problem like cat : fur :: person : ? by thinking about cats and people, and what about a person is similar to fur for a cat, because we're embodied in the world and we have experience of hair, cats, fur and so forth. Odd as it might seem to know that cat : fur :: person : hair without knowing what any of those things is, that's essentially what's going on with an LLM. It understands relations between words, based on how they appear in a mass of training text, but that's all it understands**.


But what, exactly, is the difference between understanding how a word relates to other words and understanding what it means? There are schools of thought that claim there is no difference. The meaning of a word is how it relates to other words. If you believe that, then there's a strong argument that an LLM understands words the same way we do, and about as well as we do.

Personally, I don't think that's all there is to it. The words we use to express our reality are not our reality. For one thing, we can also use the same words to express completely different realities. We can use words in new ways, and the meaning of words can and does shift over time. There are experiences in our own reality that defy expression in words.

Words are something we use to convey meaning, but they aren't that meaning. Meaning ultimately comes from actual experiences in the real world. The way words relate to each other clearly captures something about what they actually mean -- quite a bit of it, by the looks of things -- but just as clearly it doesn't capture everything.

I have no trouble saying that the embeddings that current LLMs use encode something significant about how words relate to each other, and that the combination of the embedding and the LLM itself has a human-level understanding of how language works.

That's not nothing. It's something that sets current LLMs apart from anything before them, and it's an interesting result. For one thing, it goes a long way toward clarifying what's understanding of the world and what's just understanding of how language works and what combinations of words people actually use.

If an LLM is good at it, then it's something about how language works. If an LLM isn't good at it, then it's probably something about the world itself. I'll have a bit more to say about that in the next (shorter) post.

Because LLMs know about language, but not what it represents in the real world, we shouldn't be surprised that LLMs hallucinate, and we shouldn't expect them to stop hallucinating just because they're trained on larger and larger corpora of text.


The earlier post distinguished among behavior, mechanism and experience. An LLM is capable of linguistic behavior very similar to a person's.

The mechanism of an LLM may, or may not, be similar as far as language processing. We may well learn rules like the way that we use the in relation to nouns in a way that's similar to training an LLM. Whether that's the case or not, an LLM, by design, lacks a mechanism for tying words to anything in the real world. This probably accounts for much of the difference between what we would say and what an LLM would say.

All of this is separate from subjective experience.  One could imagine a robot that builds up a store of interactions with the world, processes them into some more abstract representation and associates words with them. But even if that is more similar to what we do in terms of mechanism, it says nothing about what the robot might or might not be experiencing subjectively, even if it becomes harder to rule out the possibility that the robot is experiencing the world as we do.


* Wikipedia seems to think it's only an embedding if it's done using feature learning, but that seems overly strict. Mathematically, an embedding is just a map from one domain into another (usually one that preserves some structure we care about), no matter how it's produced.

** Technically, it might matter what the actual numbers are, for example, an embedding that doubled every numeric value or added (1.0, 2.0, 3.0) to every token might produce different results. I'm quietly assuming that models are insensitive to this kind of change of coordinates. If you buy that, then it's relations like king - man + woman ~= queen that matter, and not the actual numeric values that king, man, woman and queen map to. Even if that's not the case, I don't think that changes the overall argument that nothing in an embedding or a model is even trying to capture anything about referents in the real world.

Thursday, March 27, 2025

Losing my marbles over entropy

In a previous post on Entropy, I offered a garbled notion of "statistical symmetry." I'm currently reading Carlo Rovelli's The Order of Time, and chapter two laid out the idea that I was grasping at concisely, clearly and -- because Rovelli is an actual physicist -- correctly.

What follows is a fairly long and rambling discussion of the same toy system as the previous post, of five marbles in a square box with 25 compartments. It does eventually circle back to the idea of symmetry, but it's really more of a brain dump of me trying to make sure I've got the concepts right. If that sounds interesting, feel free to dive in. Otherwise, you may want to skip this one.


In the earlier post, I described a box split into 25 little compartments with marbles in five of the compartments. If you start with, say, all the marbles on one row (originally I said on one diagonal, but that just made things a bit messier) and give the box a good shake, the odds that the marbles all end up in the same row that they started in are low, about one in 50,000 for this small example. So far, so good.

But this is really true for any starting configuration -- if there are twenty-five compartments in a five-by-five grid, numbered from left to right then top to bottom, and the marbles start out in, say, compartments 2, 7,  8, 20 and 24, the odds that they'll still be in those compartments after you shake the box are exactly the same, about one in 50,000.

On the one hand, it seems  like going from five marbles in a row to five marbles in whatever random positions they end up in is making the box more disordered. On the other hand, if you just look at the positions of the individual marbles, you've gone from a set of five numbers from 1 to 25 ... to a set of numbers from 1 to 25, possibly the one you started with. Nothing special has happened.

This is why the technical definition of entropy doesn't mention "disorder". The actual definition of entropy is in terms of microstates and macrostates. A microstate is a particular configuration of the individual components of a system, in this case, the positions of the marbles in the compartments. A macrostate is a collection of microstates that we consider to be equivalent in some sense.

Let's say there are two macrostates: call any microstate with all five marbles in the same row lined-up, and any other microstate scattered. In all there are 53,130 microstates (25 choose 5). Of those, five have all the marbles in a row (one for each row), and the other 53,125 don't. That is, there are five microstates in the lined-up macrostate and 53,125 in the scattered macrostate.

The entropy of a macrostate is related to the number of microstates consistent with that macrostate (for more context, see the earlier post on entropy, which I put a lot more care into). Specifically, it is the logarithm of the number of such states, multiplied by a factor called the Boltzmann constant to make the units come out right and to scale the numbers down, because in real systems the numbers are ridiculously large (though not as large as some of these numbers), and even their logarithms are quite large. Boltzmann's constant is 1.380649×10⁻²³ Joules per Kelvin.

The natural logarithm of 5 is about 1.6 and the natural logarithm of 53,125 is about 10.9. Multiplying by Boltzmann's constant doesn't change their relative size: The scattered macrostate has about 6.8 times the entropy of the lined-up macrostate.
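
If you want to check the arithmetic, a few lines of Python will do it (natural logs, with Boltzmann's constant left as a final scale factor):

```python
import math

total = math.comb(25, 5)      # every way to put 5 marbles in 25 compartments
lined_up = 5                  # one lined-up microstate per row
scattered = total - lined_up

print(total, scattered)                            # 53130 53125
print(math.log(lined_up), math.log(scattered))     # ~1.61 and ~10.88
print(math.log(scattered) / math.log(lined_up))    # ~6.8

k = 1.380649e-23                                   # Boltzmann's constant, J/K
print(k * math.log(scattered))                     # ~1.5e-22 J/K
```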

If you start with the marbles in the low-entropy lined-up macrostate and give the box a good shake, 10,625 times out of 10,626 you'll end up in the higher-entropy scattered macrostate. Five marbles in 25 compartments is a tiny system, considering that there are somewhere around 33,000,000,000,000,000,000,000 (about 3.3×10²²) molecules in a milliliter of water. In any real system, except cases like very low-temperature systems with handfuls of particles, the differences in entropy are large enough that "10,625 times out of 10,626" turns into "always" for all intents and purposes.


This distinction between microstates and macrostates gives a rigorous basis for the intuition that going from lined-up marbles to scattered-wherever marbles is a significant change, while going from one particular scattered state to another isn't.

In both cases, the marbles are going from one microstate to another, possibly but very rarely the one they started in. In the first case, the marbles go from one macrostate to another. In the second, they don't. Macrostate changes are, by definition, the ones we consider significant, in this case, between lined-up and scattered. Because of how we've defined the macrostates, the first change is significant and the second isn't.


Let's slice this a bit more finely and consider a scenario where only part of a system can change at any given time. Suppose you don't shake up the box entirely. Instead, you take out one marble and put it back in a random position, including, possibly, the one it came from. In that case, the chance of going from lined-up to scattered is 20 in 21, since out of the 21 positions the marble can end up in, only one, its original position, has the marbles all lined up, and in any case it doesn't matter which marble you choose.

What about the other way around? Of the 53,125 microstates in the scattered macrostate, only 500 have four of the five marbles in one row. For any microstate, there are 105 different ways to take one marble out and replace it: Five marbles times 21 empty places to put it, including the place it came from.

For the 500 microstates with four marbles in a row, only one of those 105 possibilities will result in all five marbles in a row: Remove the lone marble that's not in a row and put it in the only empty place in the row of four. For the other 52,625 microstates in the scattered macrostate, there's no way at all to end up with five marbles lined up by moving only one marble.

So there are 500 cases where the scattered macrostate becomes lined-up, 500*104 cases where it could but doesn't, and 52,625*105 cases where it couldn't possibly. In all, that means the odds are 11,155.25 to one against scattered becoming lined-up by removing and replacing one marble randomly.

Suppose that the marbles are lined up at some starting time, and every time the clock ticks, one marble gets removed and replaced randomly. After one clock tick, there is a 100 in 105 (that is, 20 in 21) chance that the marbles will be in the high-entropy scattered state. How about after two ticks? How about if we let the clock run indefinitely -- what portion of the time will the system spend in the lined-up macrostate?

There are tools to answer questions like this, particularly Markov chains and stochastic matrices (that's the same kind of Markov chain that can generate random text resembling an input text). I'll spare you the details, but the answer requires defining a few more macrostates, one for each way to represent the number five as a sum of whole numbers: [5], [4, 1], [3, 2], [3, 1, 1], [2, 2, 1], [2, 1, 1, 1] and [1, 1, 1, 1, 1].

The macrostate [5] comprises all microstates with five marbles in one row, the macrostate [4, 1] comprises all microstates with four marbles in one row and one in another row, the macrostate [2, 2, 1] comprises all microstates with two marbles in one row, two marbles in another row and one marble in a third one, and so forth.

Here's a summary

Macrostate     Microstates   Entropy
[5]                      5       1.6
[4,1]                  500       6.2
[3,2]                2,000       7.6
[3,1,1]              7,500       8.9
[2,2,1]             15,000       9.6
[2,1,1,1]           25,000      10.1
[1,1,1,1,1]          3,125       8.0

The Entropy column is the natural logarithm of the Microstates column, without multiplying by Boltzmann's constant. Again, this is just to give a basis for comparison. For example, [2,1,1,1] is the highest-entropy state, and [2,2,1] has six times the entropy of [5].
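
The Microstates column can be checked by brute force: enumerate all 53,130 placements, bucket each one by its row counts, and take logs. A quick sketch:

```python
import math
from collections import Counter
from itertools import combinations

# Compartments 0..24 in a 5x5 grid; a microstate is a set of 5 occupied ones.
macro_counts = Counter()
for occupied in combinations(range(25), 5):
    rows = Counter(c // 5 for c in occupied)                  # marbles per row
    macrostate = tuple(sorted(rows.values(), reverse=True))   # e.g. (2, 2, 1)
    macro_counts[macrostate] += 1

for macrostate, count in sorted(macro_counts.items(), key=lambda kv: kv[1]):
    print(macrostate, count, round(math.log(count), 1))
```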

It's straightforward, but tedious, to count the number of ways one macrostate can transition to another. For example, of the 105 transitions for [3,2], 4 end up in [4,1], 26 end up back in [3,2] (not always by putting the removed marble back where it was), 30 end up in [3, 1, 1] and 45 end up in [2, 2, 1]. Putting all this into a matrix and taking the matrix to the 10th power (enough to see where this is converging) gives

Macrostate     % time   % microstates
[5]            0.0094          0.0094
[4,1]            0.94            0.94
[3,2]             3.8             3.8
[3,1,1]            14              14
[2,2,1]            28              28
[2,1,1,1]          47              47
[1,1,1,1,1]       5.9             5.9

The second column is the result of the tedious matrix calculations. The third column is just the size of the macrostate as the portion of the total number of microstates. For example, there are 500 microstates in [4,1], which is 0.94% of the total, which is also the portion of the time that the matrix calculation says the system will spend in [4, 1]. Technically, this means the system is ergodic, which means I didn't have to bother with the matrix and counting all the different transitions.
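
Rather than redoing the matrix algebra, you can check the "% time" column by simulating the remove-and-replace process and counting where the system spends its time. A Monte Carlo sketch (the percentages will wobble a little from run to run):

```python
import random
from collections import Counter

def macrostate(occupied):
    """Sorted marbles-per-row counts, e.g. (2, 2, 1)."""
    rows = Counter(c // 5 for c in occupied)
    return tuple(sorted(rows.values(), reverse=True))

occupied = {0, 1, 2, 3, 4}          # start lined up: all five marbles in one row
time_in = Counter()
steps = 1_000_000
for _ in range(steps):
    marble = random.choice(tuple(occupied))               # take one marble out...
    occupied.remove(marble)
    empty = [c for c in range(25) if c not in occupied]   # 21 spots, incl. its old one
    occupied.add(random.choice(empty))                    # ...and put it back at random
    time_in[macrostate(occupied)] += 1

for state, count in sorted(time_in.items(), key=lambda kv: kv[1]):
    print(state, round(100 * count / steps, 2), "% of the time")
```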

Even in this toy example, the system will spend very little of its time in the low-entropy lined-up state [5], and if it ever does end up there, it won't stay there for long.


Given some basic assumptions, a system that evolves over time, transitioning from microstate to microstate, will spend the same amount of time in any given microstate (as usual, that's not quite right technically), which means that the time spent in each macrostate is proportional to its size. Higher-entropy states are larger than lower-entropy states, and because entropy is a logarithm, they're actually a lot larger.

For example, the odds of an entropy decrease of one millionth of a Joule per Kelvin are about one in e^(10¹⁷). That's a number with somewhere around 40 quadrillion digits. To a mathematician, the odds still aren't zero, but to anyone else they would be.
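
That digit count is just the exponent divided by the natural log of 10, which is a one-liner to check (the un-rounded exponent is about 7×10¹⁶, so the count comes out in the tens of quadrillions):

```python
import math

k = 1.380649e-23         # Boltzmann's constant, J/K
delta_S = 1e-6           # a one-microjoule-per-kelvin decrease in entropy
exponent = delta_S / k   # the ratio of macrostate sizes is e raised to this
print(f"{exponent:.2e}")                 # ~7.24e16, i.e. roughly 10^17
print(f"{exponent / math.log(10):.2e}")  # ~3.1e16 digits: tens of quadrillions
```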

For all but the tiniest, coldest systems, the chance of entropy decreasing even by a measurable amount is not just small, but incomprehensibly small. The only systems where the number of microstates isn't incomprehensibly huge are small collections of particles near absolute zero.

I'm pretty sure I've read about experiments where such a system can go from a higher-entropy state to a very slightly lower-entropy state and vice versa, though I haven't had any luck tracking them down. Even if no one's ever done it, such a system wouldn't violate any laws of thermodynamics, because the laws of thermodynamics are statistical (and there's also the question of definition over whether such a system is in equilibrium).

So you're saying ... there's a chance? Yes, but actually no, in any but the tiniest, coldest systems. Any decrease in entropy that could actually occur in the real world and persist long enough to be measured would be in the vicinity of 10⁻²³ Joules per Kelvin, which is much, much too small to be measured except under very special circumstances.

For example, if you have 1.43 grams of pure oxygen in a one-liter container at standard temperature and pressure, it's very unlikely that you know any of the variables involved -- the mass of the oxygen, its purity, the size of the container, the temperature or the pressure, to even one part in a billion. Detecting changes 100,000,000,000,000 times smaller than that is not going to happen.



But none of that is what got me started on this post. What got me started was that the earlier post tried to define some sort of notion of "statistical symmetry", which isn't really a thing, and what got me started on that was my coming to understand that higher-entropy states are more symmetrical. That in turn was jarring because entropy is usually taken as a synonym for disorder, and symmetry is usually taken as a synonym for order.

Part of the resolution of that paradox is that entropy is a measure of uncertainty, not disorder. The earlier post got that right, but evidently that hasn't stopped me from hammering on the point for dozens more paragraphs and a couple of tables in this one, using a slightly different marbles-in-compartments example.

The other part is that more symmetry doesn't really mean more order, at least not in the way that we usually think about it.

From a mathematical point of view, a symmetry of an object is something you can do to it that doesn't change some aspect of the object that you're interested in. For example, if something has mirror symmetry, that means that it looks the same in the mirror as it does ordinarily.

It matters where you put the mirror. The letter W looks the same if you put a mirror vertically down the middle of it -- it has one axis of symmetry. The letter X looks the same if you put the mirror vertically in the middle, but it also looks the same if you put it horizontally in the middle -- it has two axes of symmetry.

Another way to say this is that if you could draw a vertical line through the middle of the W and rotate the W out of the page around that line, and kept going for 180 degrees until the W was back in the page, but flipped over, it would still look the same. If you chose some other line, it would look different (even if you picked a different vertical line, it would end up in a different place). That is, if you do something to the W -- rotate it around the vertical line through the middle -- it ends up looking the same. The aspect you care about here is how the W looks.

To put it somewhat more rigorously: if f is the particular mapping that takes each point to its mirror image across the axis, then f takes the set of points in the W to the exact same set of points. Any point on the axis maps to itself, and any point off the axis maps to its mirror image, which is also part of the W. The map f is defined for every point on the plane and it moves all of them except for the axis. The aspect we care about, which f doesn't change, is whether a particular point is in the W.

If you look at all the things you can do to an object without changing the aspect you care about, you have a mathematical group. For a W, there are two things you can do: leave it alone and flip it over. For an X, you have four options: leave it alone, flip it around the vertical axis, flip it around the horizontal axis, or do both. Leaving an object alone is called the identity transformation, and it's always considered a symmetry, because math. An asymmetrical object has only that symmetry (its symmetry group is trivial).

In normal speech, saying something is symmetrical usually means it has the same symmetry group as a W -- half of it is a mirror image of the other half. Technically, it has bilateral symmetry. In some sense, though, an X is more symmetrical, since its symmetry group is larger, and a regular hexagon, which has 12 elements in its symmetry group, is more symmetrical yet.

A figure with 19 sides, each of which is the same lopsided squiggle, would have a symmetry group with 19 elements (rotate by 1/19 of a full circle, 2/19 ... 18/19, or don't rotate at all). That would make it more symmetrical than a regular hexagon, and quite a bit more symmetrical than a W, but if you asked people which was most symmetrical, they would probably put the 19-sided squigglegon last of the three.

Our visual system is mostly trained to recognize bilateral symmetry. Except for special situations like reflections in a pond, pretty much everything in nature with bilateral symmetry is an animal, which is pretty useful information when it comes to eating and not being eaten. We also recognize rotational symmetry, which includes flowers and some sea creatures, also useful information.

It would make sense, then, that in day to day life, "more symmetrical" generally means "closer to bilateral symmetry". If a house has an equal number of windows at the same level on either side of the front door, we think of it as symmetrical,  even though the windows may not be exactly the same, the door itself probably has a doorknob on one side or the other and so forth, so it's not quite exactly symmetrical. We'd still say it's pretty symmetrical, even though from a mathematical point of view it either has bilateral symmetry or it doesn't (and in the real world, nothing we can see is perfectly symmetrical).

That should go some way toward explaining why, along with so many other things, symmetry doesn't necessarily mean the same thing in its mathematical sense as it does ordinarily. The mathematical definition includes things that we don't necessarily think of as symmetry.

Continuing with shapes and their symmetries, you can think of each shape as a macrostate. You can  associate a microstate with each mapping (technically, in this case, any rigid transformation of the plane) that leaves the shape unchanged. The macrostate W has two microstates: one for the identity transformation, which leaves the plane unchanged, and one for the mirror transformation around the W's axis.

The X macrostate has four microstates, one for the identity, one for the flip around the vertical axis, one for the flip around the horizontal axis, and one for flipping around one axis and then the other (in this case, it doesn't matter what order you do it in). The X macrostate has a larger symmetry group, which is the same as saying it has more entropy.

In this context, a symmetry is something you can do to the microstate without changing the macrostate. A larger symmetry group -- more symmetry -- means more microstates for the same macrostate, which means more entropy, and vice-versa. They're two ways of looking at the same thing.

In the case of the marbles in a box, a symmetry is any way of switching the positions of the marbles, including not switching them around at all. Technically, this is a permutation group.

For any given microstate,  some of the possible permutations just switch the marbles around in their places (for example, switching the first two marbles in a lined-up row), and some of them will move marbles to different compartments. For a microstate of the lined-up macrostate [5], there are many fewer permutations that leave the marbles in the same macrostate (all in one row, though not necessarily the same row) than there are for [2, 1, 1, 1]. Even though five marbles in a row looks more symmetrical, since it happens to have bilateral visual symmetry, it's actually a much less symmetrical macrostate than [2, 1, 1, 1], even though most of its microstates will just look like a jumble.


In the real world, distributing marbles in boxes is really distributing energy among particles, generally a very large number of them. Real particles can be in many different states, many more than the marble/no marble states in the toy example, and different states can have the same energy, which makes the math a bit more complicated. Switching marbles around is really exchanging energy among particles, and there are all sorts of intricacies about how that happens.

Nonetheless, the same basic principles hold: Entropy is a measure of the number of microstates for a given macrostate, and a system in equilibrium will evolve toward the highest-entropy macrostate available, and stay there, simply because the probability of anything else happening is essentially zero.

And yeah, symmetry doesn't necessarily mean what you think it might.