Showing posts with label randomness. Show all posts

Sunday, September 13, 2020

Entropy and time's arrow

When contemplating the mysteries of time ... what is it, why is it how it is, why do we remember the past but not the future ... it's seldom long before the second law of thermodynamics comes up.

[Later in this post, I try to invent something called "statistical symmetry", which isn't really a thing. I eventually revisited that topic in this post, which covers what I was trying to talk about, but more accurately, or at least less inaccurately -- D. H. April 2025]

In technical terms, the second law of thermodynamics states that the entropy of a closed system increases over time.  I've previously discussed what entropy is and isn't.  The short version is that entropy is a measure of uncertainty about the internal details of a system.  This is often shorthanded as "disorder", and that's not totally wrong, but it probably leads to more confusion than understanding.  This may be in part because uncertainty and disorder are both related to the more technical concept of symmetry, which may not mean what you might expect.  At least, I found some of this surprising when I first went over it.

Consider an ice cube melting.  Is a puddle of water more disordered than an ice cube?  One would think.  In an ice cube, each atom is locked into a crystal matrix, each atom in its place.  An atom in the liquid water is bouncing around, bumping into other atoms, held in place enough to keep from flying off into the air but otherwise free to move.

But which of the two is more symmetrical?  If your answer is "the ice cube", you're not alone.  That was my reflexive answer as well, and I expect that it would be for most people.  Actually, it's the water.  Why?  Symmetry is a measure of what you can do to something and still have it look the same.  The actual mathematical definition is, of course, a bit more technical, but that'll do for now.

An irregular lump of coal looks different if you turn it one way or another, so we call it asymmetrical.  A cube looks the same if you turn it 90 degrees in any of six directions, or 180 degrees in any of three directions, so we say it has "rotational symmetry" (and "reflective symmetry" as well).  A perfect sphere looks the same no matter which way you turn it, including, but not limited to, all the ways you can turn a cube and have the cube still look the same.  The sphere is more symmetrical than the cube, which is more symmetrical than the lump of coal.  So far so good.

A mass of water molecules bouncing around in a drop of water looks the same no matter which way you turn it.  It's symmetrical the same way a sphere is.  The crystal matrix of an ice cube only looks the same if you turn it in particular ways.  That is, liquid water is more symmetrical, at the microscopic level, than frozen water.  This is the same as saying we know less about the locations and motions of the individual molecules in liquid water than those in frozen water.  More uncertainty is the same as more entropy.

Geometrical symmetry is not the only thing going on here.  Ice at -100C has lower entropy than ice at -1C, because molecules in the colder ice have less kinetic energy and a narrower distribution of possible kinetic energies (loosely, they're not vibrating as quickly within the crystal matrix and there's less uncertainty about how quickly they're vibrating).  However, if you do see an increase in geometrical symmetry, you are also seeing an increase in uncertainty, which is to say entropy. The difference between cold ice and near-melting ice can also be expressed in terms of symmetry, but a more subtle kind of symmetry.  We'll get to that.


[This is where I start to go astray. These next few paragraphs are all right, but the later post has a better explanation of what's going on. -- D.H. April 2025]

As with the previous post, I've spent more time on a sidebar than I meant to, so I'll try to get to the point by going off on another sidebar, but one more closely related to the real point.

Suppose you have a box with, say, 25 little bins in it arranged in a square grid.  There are five marbles in the box, one in each bin on the diagonal from upper left to lower right.  This arrangement has "180-degree rotational symmetry".  That is, you can rotate it 180 degrees and it will look the same.  If you rotate it 90 degrees, however, it will look clearly different.

Now put a lid on the box, give it a good shake and remove the lid.  The five marbles will have settled into some random assortment of bins (each bin can only hold one marble).  If you look closely, this random arrangement is very likely to be asymmetrical in the same way a lump of coal is: If you turn it 90 degrees, or 180, or reflect it in a mirror, the individual marbles will be in different positions than if you didn't rotate or reflect the box.
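This exact, geometric sense of "looks the same" is easy to check mechanically.  Here's a toy sketch in Python (my own illustration, not anything standard), with marbles as (row, column) pairs:

```python
# The five marbles on the diagonal of a 5x5 grid, as (row, column) pairs.
diagonal = {(i, i) for i in range(5)}

def rotate_90(cells, size=5):
    """Rotate an arrangement a quarter turn: (r, c) -> (c, size-1-r)."""
    return {(c, size - 1 - r) for r, c in cells}

# Two quarter turns make the 180-degree turn, which maps the diagonal
# arrangement onto itself...
assert rotate_90(rotate_90(diagonal)) == diagonal

# ...but a single quarter turn sends it to the other diagonal, so the
# arrangement visibly changes.
assert rotate_90(diagonal) == {(i, 4 - i) for i in range(5)}
```

A typical shaken-up arrangement fails even the 180-degree test, which is exactly the lump-of-coal kind of asymmetry described above.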

However, if you were to take a quick glimpse at the box from a distance, then have someone flip a coin and turn the box 90 degrees if the coin came up heads, then take another quick glimpse, you'd have trouble telling if the box had been turned or not.  You'd have no trouble with the marbles in their original arrangement on the diagonal.  In that sense, the random arrangement is more symmetrical than the original arrangement, just like the microscopic structure of liquid water is more symmetrical than that of ice.


[And this is where I went full LLM -- D.H. April 2025]

The magic word to make this all rigorous is "statistical".  That is, if you have a big enough grid and enough marbles, and you just measure large-scale statistical properties and look at distributions of values rather than the actual values, then an arrangement of marbles is more symmetrical if these rough measures don't change when you rotate the box (or reflect it, or shuffle the rows or columns, or whatever -- for brevity I'll stick to "rotate" here).

For example, if you count the number of marbles on each diagonal line (wrapping around so that each diagonal line has five bins), then for the original all-on-one-diagonal arrangement, there will be a sharp peak: five marbles on the main diagonal, one on each of the diagonals that cross that main diagonal, and zero on the others.  Rotate the box, and that peak moves.  For a random arrangement, the counts will all be more or less the same, both before and after you rotate the box.  A random arrangement is more symmetrical, in this statistical sense.
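Here's a sketch of that kind of statistic (again my own toy version), counting the marbles on each wrap-around diagonal:

```python
import random
from collections import Counter

def diagonal_counts(cells, size=5):
    """Marbles on each wrap-around diagonal, indexed by (c - r) mod size."""
    counts = Counter((c - r) % size for r, c in cells)
    return [counts[d] for d in range(size)]

# The all-on-one-diagonal arrangement gives a sharp peak...
diagonal = {(i, i) for i in range(5)}
print(diagonal_counts(diagonal))    # [5, 0, 0, 0, 0]

# ...while a shaken-up arrangement spreads the counts around.
random.seed(1)
bins = [(r, c) for r in range(5) for c in range(5)]
scattered = set(random.sample(bins, 5))
print(diagonal_counts(scattered))
```

Rotating the box moves the peak for the diagonal arrangement; the flatter profile of a random arrangement looks much the same either way.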

The important thing here is that there are many more symmetrical arrangements than not.  For example, there are ten wrap-around diagonals in a 5x5 grid (five in each direction) so there are ten ways to put five marbles in that kind of arrangement.  There are 53,130 total ways to put 5 marbles in 25 bins, so there are approximately 5,000 times as many more-symmetrical, that is, higher-entropy, arrangements.  Granted, some of these are still fairly unevenly distributed, for example four marbles on one diagonal and one off it, but even taking that into account, there are many more arrangements that look more or less the same if you rotate the box than there are that look significantly different.
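The counting above is easy to verify:

```python
import math

# Ways to put 5 marbles into 25 bins, at most one per bin:
total = math.comb(25, 5)
print(total)    # 53130

# Ten wrap-around diagonals (five in each direction) give ten
# all-on-one-diagonal arrangements; everything else outnumbers them
# roughly 5,000 to one.
print((total - 10) // 10)    # 5312
```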

[And now we're heading back toward reality, though I did make a couple of small edits for clarity and precision. In particular, the parts about symmetrical arrangements of marbles in boxes are OK, though again the later post covers this better. The main mistake was thinking that there's anything statistical about symmetry. Statistics comes in when we compare the sizes of macrostates. -- D. H. April 2025]

This is a toy example.  If you scale up to, say, the number of molecules in a balloon at room temperature, "many more" becomes "practically all".  Even if the box has 2500 bins in a 50x50 grid, still ridiculously small compared to the trillions of trillions of molecules in a typical system like a balloon, or a vase, or a refrigerator or whatever, the odds that all of the balls line up on a diagonal are less than one in a googol (that's ten to the hundredth power, not the search engine company).  You can imagine all the molecules in a balloon crowding into one particular region, but for practical purposes it's not going to happen, at least not by chance in a balloon at room temperature.
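The 50x50 claim checks out, assuming as above that "lined up" means filling one of the 100 wrap-around diagonals:

```python
import math

total = math.comb(2500, 50)   # ways to put 50 marbles in 2500 bins
googol = 10 ** 100

# 100 wrap-around diagonals means 100 lined-up arrangements, so the
# odds of hitting one by chance are worse than one in a googol.
assert total // 100 > googol
print(len(str(total)))        # the total has well over 100 digits
```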

If you start with the box of marbles in a not-very-symmetrical state and shake it up, you'll almost certainly end up with a more symmetrical state, simply because there are many more ways for that to happen.  Even if you only change one part of the system, say by taking out one marble and putting it back in a random empty bin adjacent to its original position, there are still more cases than not in which the new arrangement is more symmetrical than the old one.

If you continue making more random changes, whether large or small, the state of the box will get more symmetrical over time.  Strictly speaking, this is not an absolute certainty, but for anything we encounter in daily life the numbers are so big that the chances of anything else happening are essentially zero.  This will continue until the system reaches its maximum entropy, at which point large or small random changes will (essentially certainly) leave the system in a state just as symmetrical as it was before.

That's the second law -- as a closed system evolves, its entropy will essentially never decrease, and if it starts in a state of less than maximum entropy, its entropy will essentially always increase until it reaches maximum entropy.


And now to the point.

The second law gives a rigorous way to tell that time is passing.  In a classic example, if you watch a film of a vase falling off a table and shattering on the floor, you can tell instantly if the film is running forward or backward: if you see the pieces of a shattered vase assembling themselves into an intact vase, which then rises up and lands neatly on the table, you know the film is running backwards.  Thus it is said that the second law of thermodynamics gives time its direction.

As compelling as that may seem, there are a couple of problems with this view.  I didn't come up with any of these, of course, but I do find them convincing:

  • The argument is only compelling for part of the film.  In the time between the vase leaving the table and it making contact with the floor, the film looks fine either way.  You either see a vase falling, or you see it rising, presumably having been launched by some mechanism.  Either one is perfectly plausible, while the vase assembling itself from its many pieces is totally implausible.  But the lack of any obvious cue like pottery shards improbably assembling themselves doesn't stop time from passing.
  • If your recording process captured enough data, beyond just the visual image of the vase, you could in principle detect that the entropy of the contents of the room increases slightly if you run the film in one direction and decreases in the other, but that doesn't actually help because entropy can decrease locally without violating the second law.  For example, you can freeze water in a freezer or by leaving it out in the cold.  Its entropy decreases, but that's fine because entropy overall is still increasing, one way or another (for example, a refrigerator dumps heat into the surrounding environment at a higher temperature than the temperature of its contents, which are losing heat, and if you run all the numbers through, the total entropy of the refrigerator together with its environment is increasing).  If you watch a film of ice melting, there may not be any clear cues to tell you that you're not actually watching a film of ice freezing, running backward.  But time passes regardless of whether entropy is increasing or decreasing in the local environment.
  • Most importantly, though, in an example like a film running, we're only able to say "That film of a vase shattering is running backward" because we ourselves perceive time passing.  We can only say the film is running backward because it's running at all.  By "backward", we really mean "in the other direction from our perception of time".  Likewise, if we measure the entropy of a refrigerator and its contents, we can only say that entropy is increasing as time as we perceive it increases.

In other words, entropy increasing is a way that we can tell time is passing, but it's not the cause of time passing, any more than a mile marker on a road makes your car move.  In the example of the box of marbles, we can only say that the box went from a less symmetrical to more symmetrical state because we can say it was in one state before it was in the other.

If you printed a diagram of each arrangement of marbles on opposite sides of a piece of paper, you'd have two diagrams on a piece of paper.  You couldn't say one was before the other, or that time progressed from one to the other.  You can only say that if the state of the system undergoes random changes over time, then the system will get more symmetrical over time, and in particular the less symmetrical arrangement (almost certainly) won't happen after the more symmetrical one.  That is, entropy will increase.

You could even restate the second law as something like "As a system evolves over time, all state changes allowed by its current state are equally likely" and derive increasing entropy from that (strictly speaking you may have to distinguish identical-looking potential states in order to make "equally likely" work correctly -- the rigorous version of this is the ergodic hypothesis).  This in turn depends on the assumptions that systems have state, and that state changes over time.  Time is a fundamental assumption here, not a by-product.

In short, while you can use the second law to demonstrate that time is passing, you can't appeal to the second law to answer questions like "Why do we remember the past and not the future?"  It just doesn't apply.

Thursday, July 2, 2015

Do androids trip on electric acid?

Have a look at some of these images and take note of whatever adjectives come to mind.  If other people's responses are anything to go by, there's a good chance they include some or all of "surreal", "disturbing", "dreamlike", "nightmarish" or "trippy".  Particularly "trippy".

These aren't the first computer-generated images to inspire such descriptions.  Notably, fractal images have been described in psychedelic terms at least since the Mandelbrot set came to general attention, and the newer, three-dimensional varieties seem particularly evocative.  The neural-network generated images, however, are in a different league.  What's going on?

Real neural systems appear to be rife with feedback loops.  In experiments with in vitro neuron cultures -- nerve cells growing in dishes with electrodes attached here and there -- a system with no other input to deal with will amplify and filter whatever random noise there is (and there is always something) into a clear signal.  This would be a "signal" in the information theory sense of something that's highly unlikely to occur by chance, not a signal in the sense of something conveying a particular meaning.

This distinction between the two senses of "signal" is important.  Typically a signal in the information theory sense is also meaningful in some way.  That's more or less why they call it "information theory".  There are plenty of counterexamples, though.  For example:
  • tinnitus (ringing of the ears), where the auditory system fills in a frequency that the ear itself isn't able to produce
  • pareidolia, where one sees images of objects in random patterns, such as faces in clouds
  • the gambler's fallacy, where one expects a random process to remember what it has already done and compensate for it ("I've lost the last three hands.  I'm bound to get good cards now.")
and so forth.  The common thread is that part of the brain is expecting to perceive something -- a sound, a face, a balanced pattern of "good" and "bad" outcomes -- and selectively processes otherwise meaningless input to produce that perception.

In the generated images, a neural network is first trained to recognize a particular kind of image -- buildings, eyes, trees, whatever -- and the input image is adjusted bit by bit to strengthen the signal to the recognizer.  The code doing the adjustment knows nothing about what the recognizer expects.  It just tries something, and if the recognizer gives it a stronger signal as a result, it keeps the adjustment.  If you start with random noise, you end up with the kind of images you were looking for.  If you start with non-random input, you get a weird mashup of what you had and what you were looking for.
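That adjust-and-keep loop can be sketched with a stand-in "recognizer" -- here just a function scoring how close the input is to a fixed pattern, my own toy substitute for a trained network:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Stand-in "recognizer": a fixed target pattern and a score for how
# strongly an "image" (a list of numbers) resembles it.  The adjusting
# loop below treats this as a black box.
TARGET = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

def signal(image):
    return -sum((p - t) ** 2 for p, t in zip(image, TARGET))

# Start from random noise, try a small random tweak, and keep it only
# if the recognizer's signal got stronger.  The loop knows nothing
# about what the recognizer is looking for.
image = [random.random() for _ in TARGET]
for _ in range(5000):
    tweaked = list(image)
    tweaked[random.randrange(len(image))] += random.uniform(-0.1, 0.1)
    if signal(tweaked) > signal(image):
        image = tweaked

print([round(p, 1) for p in image])  # drifts toward the target pattern
```

The real image-generation work uses gradients rather than blind tweaks, but the feedback structure -- adjust the input, keep whatever strengthens the recognizer's response -- is the same.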

Our brains almost certainly have this sort of feedback loop built in.  Real input often provides noisy and ambiguous signals.  Is that a predator behind those bushes, or just a fallen branch?  Up to a point it's safer to provide a false positive ("predator" when it's really a branch) than a false negative ("branch", when it's really a predator), so if a predator-recognizer feeds "yeah, that might be a four-legged furry thing with teeth" back to the visual system in order to strengthen the signal, survival chances should be better than with a brain that doesn't do that.  A difference in survival chances is exactly what natural selection needs to do its work.

At some point, though, too many false positives mean wasting energy, and probably courting other dangers, by jumping at every shadow.  Where that point is will vary depending on all sorts of things.  In practice, there will be a sliding scale from "too complacent" to "too paranoid", with no preset "right" amount of caution.  Given that chemistry is a vital part of the nervous system's operation, it's not surprising that various chemicals could move such settings.  If the change is in a useful direction, we call such chemicals "medicine".  Otherwise we call them "drugs".

In other words -- and I'm no expert here -- it seems plausible that we call the images trippy because they are trippy, in the sense that the neural networks that produced them are hallucinating in a manner similar to an actual brain hallucinating.  Clearly, there's more going on than that, but this is an interesting result.


When testing software, it's important to look at more than just the "happy" path.  If you're testing code that divides numbers, you should see what it does when you ask it to divide by zero.  If you're testing code that handles addresses and phone numbers, you should see what it does when you give it something that's not a phone number.  Maybe you should feed it some random gibberish (noting exactly what that gibberish was, for future reference), and see what happens.
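For example (the validator here is made up for illustration, not a real library):

```python
import random
import re
import string

# A deliberately simple phone-number check, standing in for real code
# under test.
def looks_like_phone_number(text):
    return bool(re.fullmatch(r"\+?[0-9][0-9 ()-]{6,14}", text))

# The happy path...
assert looks_like_phone_number("555-867-5309")

# ...and the unhappy paths: empty input, the wrong kind of text, and
# random gibberish, with the seed recorded so a failure can be replayed.
assert not looks_like_phone_number("")
assert not looks_like_phone_number("not a phone number")

seed = 12345  # note the seed, for future reference
random.seed(seed)
gibberish = "".join(random.choice(string.printable) for _ in range(50))
looks_like_phone_number(gibberish)  # must not raise, whatever it returns
```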

Testing models of perception (or of anything else) seems similar.  It's nice if your neural network for recognizing trees can say that a picture of a tree is a picture of a tree.  It's good, maybe good enough for the task at hand, if it's also good at not calling telephone poles or corn stalks trees.  But if you're not just trying to recognize pictures, and you're actually trying to model how brains work in general, it's very interesting if your model shows the same kind of failure modes as an actual brain.  A neural network that can hallucinate convincingly might just be on to something.

Thursday, October 24, 2013

Arising by chance

Suppose you had a billion dice.  How many times would you expect to roll them before you got all sixes?  That would be six to the billionth power, or about ten to the 780 millionth, that is, a one with 780 million zeroes after it.  As big numbers go, that's bigger than astronomical, but still something you could print out, if only in tiny digits on a very big sheet of paper.  It's smaller than the monstrously big numbers I've discussed previously.  Archimedes' system could have handled it (see this post on big numbers for more details on all that).
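A quick check on that exponent:

```python
import math

# Expected rolls for a billion dice to all come up sixes at once is
# 6 ** 1_000_000_000, i.e. 10 raised to the power 1e9 * log10(6).
exponent = 1_000_000_000 * math.log10(6)
print(round(exponent))  # about 778 million -- "about ten to the 780 millionth"
```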

"Bigger than astronomical" means that there's essentially no chance that anyone will ever see a billion dice randomly come up all sixes, even if, say, we set every person alive to rolling a die over and over again, and on through the generations, even if we somehow colonized the galaxy with hordes of dice-rolling humans.

Now suppose that instead of rolling all the dice repeatedly, we just re-roll the ones that didn't come up sixes.  In that case, a bit more than 100 rolls will do.  Why?  With the first roll, about a sixth of the dice -- around 167 million -- will come up sixes.  On the second roll, around a sixth of the 833 million or so remaining, or about 139 million, will come up sixes, leaving about 694 million.  Since we're rolling random dice here, these numbers won't be exact, but because we're rolling a whole bunch of dice, they'll be pretty close, percentage-wise.  With each roll there are about 5/6 as many dice left to roll as with the roll before.

At some point, you can no longer assume that close to 1/6 of the dice will come up sixes, but after 100 rolls you should be down to about a dozen, and it won't take too long to get the rest.
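The arithmetic behind that:

```python
# Re-rolling only the non-sixes leaves about 5/6 of the previous
# non-sixes after each roll.
dice = 1_000_000_000
remaining_after_100 = dice * (5 / 6) ** 100
print(round(remaining_after_100))  # about a dozen
```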

One more game before I explain what I'm up to:  Same billion dice, but this time, after an initial roll, you pick one die at random and roll it if it's not a six.  How many times do you have to do this pick-and-roll (sorry) before you have a complete set of sixes?

At the beginning, you have about 833 million non-sixes and it will take about seven tries before you change one of them to a six.  As more and more dice get changed to sixes, it gets harder and harder to find one that isn't already there.  The last die will take about 6 billion tries -- you'll need to roll it about six times, but you'll only get a chance to roll it about one try in a billion.  All told, summing the expected number of tries for each conversion, it will take about 130 billion tries before you get all your sixes.  That's not something you could do in an afternoon.  If you could do one try every second, it would take about 4,000 years.  Not really feasible, but not unimaginable.
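My back-of-the-envelope version of that sum, using the standard harmonic-number approximation:

```python
import math

dice = 1_000_000_000
non_sixes = dice * 5 // 6   # about 833 million left after the first roll

# With m non-sixes remaining, a single try converts one of them with
# probability (m / dice) * (1 / 6), so a conversion takes 6 * dice / m
# tries on average.  Summing 6 * dice / m for m = 1 .. non_sixes gives
# 6 * dice * H(non_sixes), with the harmonic number H(n) ~ ln(n) + 0.5772.
total_tries = 6 * dice * (math.log(non_sixes) + 0.5772)
years = total_tries / (365.25 * 24 * 3600)
print(f"{total_tries:.2e} tries, about {years:.0f} years at one per second")
```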


If we want to talk about something arising by a random process, it matters, and it matters a lot, what kind of random process we're talking about.  In a purely random process, where everything is re-done from scratch at every step, most interesting results will be completely, beyond-astronomically unlikely.  But a process can proceed randomly and still produce a highly-ordered result with very high probability, as long as there is some sort of state preserved from one step to the next.
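A scaled-down simulation makes the contrast concrete (a thousand dice instead of a billion, since the no-memory game is hopeless at any scale):

```python
import random

random.seed(2013)  # fixed seed so the run is reproducible

def rolls_until_all_sixes(n_dice):
    """Re-roll only the dice that aren't sixes yet -- state is preserved."""
    remaining = n_dice
    rolls = 0
    while remaining > 0:
        remaining = sum(1 for _ in range(remaining) if random.randint(1, 6) != 6)
        rolls += 1
    return rolls

# With memory, a thousand dice settle into all sixes in a few dozen rolls.
rolls = rolls_until_all_sixes(1000)
print(rolls)

# Without memory -- re-rolling all thousand dice from scratch each time --
# the expected number of attempts is 6 ** 1000, a number with nearly
# 800 digits.
```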

For example, when sugar crystallizes out of sugar water to make rock candy, it is for all practical purposes completely random which sugar molecule sticks to which part of the growing crystal at any given point.  And yet, the crystal will grow, and grow in a highly, though not completely, predictable fashion, all without violating any laws of thermodynamics.

The end result will be something that would be completely implausible if sugar molecules behaved completely randomly, but they don't.  They behave essentially randomly when drifting around in a solution, but not when near a regular surface of other sugar molecules that's already there.  With each molecule added to the crystal, it's that much easier for the next one to find a place to attach (until enough sugar has crystallized out that the system reaches equilibrium).


Put another way, there is no single such thing as a random process.  There are infinitely many varieties of random process, some with more or less non-random state than others.  It's not meaningful to ask whether something could have arisen at random without specifying what kind of random processes we're talking about.