Showing posts with label thermodynamics. Show all posts

Thursday, March 27, 2025

Losing my marbles over entropy

In a previous post on Entropy, I offered a garbled notion of "statistical symmetry." I'm currently reading Carlo Rovelli's The Order of Time, and chapter two laid out the idea that I was grasping at concisely, clearly and -- because Rovelli is an actual physicist -- correctly.

What follows is a fairly long and rambling discussion of the same toy system as the previous post, of five marbles in a square box with 25 compartments. It does eventually circle back to the idea of symmetry, but it's really more of a brain dump of me trying to make sure I've got the concepts right. If that sounds interesting, feel free to dive in. Otherwise, you may want to skip this one.


In the earlier post, I described a box split into 25 little compartments with marbles in five of the compartments. If you start with, say, all the marbles on one row (originally I said on one diagonal, but that just made things a bit messier) and give the box a good shake, the odds that the marbles all end up in the same row that they started in are low, about one in 50,000 for this small example. So far, so good.

But this is really true for any starting configuration -- if there are twenty-five compartments in a five-by-five grid, numbered from left to right then top to bottom, and the marbles start out in, say, compartments 2, 7,  8, 20 and 24, the odds that they'll still be in those compartments after you shake the box are exactly the same, about one in 50,000.

On the one hand, it seems  like going from five marbles in a row to five marbles in whatever random positions they end up in is making the box more disordered. On the other hand, if you just look at the positions of the individual marbles, you've gone from a set of five numbers from 1 to 25 ... to a set of numbers from 1 to 25, possibly the one you started with. Nothing special has happened.

This is why the technical definition of entropy doesn't mention "disorder". The actual definition of entropy is in terms of microstates and macrostates. A microstate is a particular configuration of the individual components of a system, in this case, the positions of the marbles in the compartments. A macrostate is a collection of microstates that we consider to be equivalent in some sense.

Let's say there are two macrostates: Let's call any microstate with all five marbles in the same row lined-up, and any other microstate scattered. In all there are 53,130 microstates (25 choose 5). Of those, five have all the marbles in a row (one for each row), and the other 53,125 don't. That is, there are five microstates in the lined-up macrostate and 53,125 in the scattered macrostate.
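Those counts are easy to check by brute force. Here's a minimal Python sketch, assuming the compartments are numbered 0 to 24 from left to right and then top to bottom (the numbering and the helper name is_lined_up are my own choices for illustration):

```python
from itertools import combinations
from math import comb

ROWS, COLS, MARBLES = 5, 5, 5

def is_lined_up(cells):
    """True if every occupied compartment (numbered 0..24) is in the same row."""
    return len({cell // COLS for cell in cells}) == 1

# Every microstate is a choice of 5 distinct compartments out of 25.
total = comb(ROWS * COLS, MARBLES)
lined_up = sum(
    1 for cells in combinations(range(ROWS * COLS), MARBLES) if is_lined_up(cells)
)
scattered = total - lined_up

print(total, lined_up, scattered)  # 53130 5 53125
```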

The entropy of a macrostate is related to the number of microstates consistent with that macrostate (for more context, see the earlier post on entropy, which I put a lot more care into). Specifically, it is the natural logarithm of the number of such states, multiplied by a factor called the Boltzmann constant. The constant makes the units come out right and scales the numbers down, because in real systems the numbers are ridiculously large (though not as large as some of these numbers), and even their logarithms are quite large. Boltzmann's constant is 1.380649×10⁻²³ Joules per Kelvin.

The natural logarithm of 5 is about 1.6 and the natural logarithm of 53,125 is about 10.9. Multiplying by Boltzmann's constant doesn't change their relative size: The scattered macrostate has about 6.8 times the entropy of the lined-up macrostate.
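As a sketch of that arithmetic in Python (the function name entropy is just something I made up for the example):

```python
from math import log

K_B = 1.380649e-23  # Boltzmann constant, Joules per Kelvin

def entropy(microstates, k=K_B):
    """Boltzmann entropy of a macrostate containing the given number of microstates."""
    return k * log(microstates)

s_lined_up = entropy(5)
s_scattered = entropy(53125)

# The constant cancels in the ratio, so only the logarithms matter.
ratio = s_scattered / s_lined_up
print(round(log(5), 1), round(log(53125), 1), round(ratio, 1))  # 1.6 10.9 6.8
```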

If you start with the marbles in the low-entropy lined-up macrostate and give the box a good shake, 10,625 times out of 10,626 you'll end up in the higher-entropy scattered macrostate. Five marbles in 25 compartments is a tiny system, considering that there are somewhere around 33,000,000,000,000,000,000,000 molecules in a milliliter of water. In any real system, except cases like very low-temperature systems with handfuls of particles, the differences in entropy are large enough that "10,625 times out of 10,626" turns into "always" for all intents and purposes.
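If you'd rather shake the box than count, here's a rough simulation sketch. It assumes a "good shake" means every set of five compartments is equally likely, which is my reading of the setup; the seed and the number of trials are arbitrary.

```python
import random
from fractions import Fraction
from math import comb

def shake(rng):
    """Drop five marbles into five distinct compartments out of 25, uniformly at random."""
    return rng.sample(range(25), 5)

def is_lined_up(cells):
    """True if all five marbles share a row (compartments numbered 0..24)."""
    return len({c // 5 for c in cells}) == 1

# Exact chance of ending up scattered after a shake.
p_scattered = Fraction(comb(25, 5) - 5, comb(25, 5))
print(p_scattered)  # 10625/10626

# Simulated shakes; expect only a handful of lined-up results.
rng = random.Random(0)
trials = 100_000
hits = sum(is_lined_up(shake(rng)) for _ in range(trials))
print(hits, "lined-up results in", trials, "shakes")
```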


This distinction between microstates and macrostates gives a rigorous basis for the intuition that going from lined-up marbles to scattered-wherever marbles is a significant change, while going from one particular scattered state to another isn't.

In both cases, the marbles are going from one microstate to another, possibly but very rarely the one they started in. In the first case, the marbles go from one macrostate to another. In the second, they don't. Macrostate changes are, by definition, the ones we consider significant, in this case, between lined-up and scattered. Because of how we've defined the macrostates, the first change is significant and the second isn't.


Let's slice this a bit more finely and consider a scenario where only part of a system can change at any given time. Suppose you don't shake up the box entirely. Instead, you randomly choose one of the marbles, take it out and put it back in a random position, including, possibly, the one it came from. In that case, the chance of going from lined-up to scattered is 20 in 21, since (regardless of which marble you chose) out of the 21 positions the marble can end up in, only one, its original position, has the marbles all lined up.

What about the other way around? Of the 53,125 microstates in the scattered macrostate, only 500 have four of the five marbles in one row. For any microstate, there are 105 different ways to take one marble out and replace it: Five marbles times 21 places to put it (the 20 empty places plus the place it came from).

For the 500 microstates with four marbles in a row, only one of those 105 possibilities will result in all five marbles in a row: Remove the lone marble that's not in a row and put it in the only empty place in the row of four. For the other 52,625 microstates in the scattered macrostate, there's no way at all to end up with five marbles lined up by moving only one marble.

So there are 500 cases where the scattered macrostate becomes lined-up, 500*104 cases where it might but doesn't, and 52,625*105 cases where it couldn't possibly. In all, that means that the odds are 11,155.25 to one against scattered becoming lined-up by removing and replacing one marble randomly.
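This tally can be checked by enumeration. The sketch below assumes that every scattered microstate and every one of its 105 moves is equally likely; it counts (microstate, marble) pairs where taking out that marble leaves the other four in one row, since each such pair has exactly one winning place to put the marble.

```python
from fractions import Fraction
from itertools import combinations

MOVES_PER_MICROSTATE = 105  # 5 marbles x 21 places each (20 empty plus its own)

scattered = 0
favorable = 0
for cells in combinations(range(25), 5):
    rows = [c // 5 for c in cells]
    if len(set(rows)) == 1:
        continue  # lined-up microstate, not part of this count
    scattered += 1
    for i in range(5):
        rest = rows[:i] + rows[i + 1:]
        if len(set(rest)) == 1:
            favorable += 1  # one winning move: fill the single gap in that row

total_moves = scattered * MOVES_PER_MICROSTATE
p = Fraction(favorable, total_moves)
odds_against = (1 - p) / p

print(scattered, favorable, float(odds_against))  # 53125 500 11155.25
```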

Suppose that the marbles are lined up at some starting time, and every time the clock ticks, one marble gets removed and replaced randomly. After one clock tick, there is a 20 in 21 chance that the marbles will be in the high-entropy scattered state. How about after two ticks? How about if we let the clock run indefinitely -- what portion of the time will the system spend in the lined-up macrostate?

There are tools to answer questions like this, particularly Markov chains and stochastic matrices (that's the same Markov Chain that can generate random text that resembles an input text). I'll spare you the details, but the answer requires defining a few more macrostates, one for each way to represent the number five as the sum of whole numbers: [5], [4, 1], [3, 2], [3, 1, 1], [2, 2, 1], [2, 1, 1, 1] and [1, 1, 1, 1, 1].

The macrostate [5] comprises all microstates with five marbles in one row, the macrostate [4, 1] comprises all microstates with four marbles in one row and one in another row, the macrostate [2, 2, 1] comprises all microstates with two marbles in one row, two marbles in another row and one marble in a third one, and so forth.

Here's a summary

Macrostate      Microstates   Entropy
[5]                       5       1.6
[4,1]                   500       6.2
[3,2]                 2,000       7.6
[3,1,1]               7,500       8.9
[2,2,1]              15,000       9.6
[2,1,1,1]            25,000      10.1
[1,1,1,1,1]           3,125       8.0

The Entropy column is the natural logarithm of the Microstates column, without multiplying by Boltzmann's constant. Again, this is just to give a basis for comparison. For example [2,1,1,1] is the highest-entropy state, and [2,2,1] has about six times the entropy of [5].
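The macrostate sizes can be reproduced by classifying every microstate by how many marbles it has in each row. A minimal sketch, with compartments numbered 0 to 24 row by row (my convention for the example):

```python
from collections import Counter
from itertools import combinations
from math import log

def macrostate(cells):
    """Row-occupancy pattern, e.g. (3, 2) for three marbles in one row and two in another."""
    per_row = Counter(c // 5 for c in cells)
    return tuple(sorted(per_row.values(), reverse=True))

# Count how many of the 53,130 microstates fall into each macrostate.
sizes = Counter(macrostate(cells) for cells in combinations(range(25), 5))

for state, n in sorted(sizes.items(), reverse=True):
    print(list(state), n, round(log(n), 1))
```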

It's straightforward, but tedious, to count the number of ways one macrostate can transition to another. For example, of the 105 transitions for [3,2], 4 end up in [4,1], 26 end up back in [3,2] (not always by putting the removed marble back where it was), 30 end up in [3, 1, 1] and 45 end up in [2, 2, 1]. Putting all this into a matrix and taking the matrix to the 10th power (enough to see where this is converging) gives

Macrostate      % time   % microstates
[5]              .0094           .0094
[4,1]            .94             .94
[3,2]           3.8             3.8
[3,1,1]        14              14
[2,2,1]        28              28
[2,1,1,1]      47              47
[1,1,1,1,1]     5.9             5.9

The second column is the result of the tedious matrix calculations. The third column is just the size of the macrostate as the portion of the total number of microstates. For example, there are 500 microstates in [4,1], which is 0.94% of the total, which is also the portion of the time that the matrix calculation says the system will spend in [4, 1]. Technically, this means the system is ergodic, which means I didn't have to bother with the matrix and counting all the different transitions.
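For anyone who wants to see the tedious part done by machine, here's a sketch. It is not the calculation I actually did, just one way to set it up. It leans on one assumption: the transition counts depend only on how many marbles are in each row, so a single representative microstate per macrostate is enough. It checks exactly (using fractions) that the size-proportional distribution doesn't change from one tick to the next, then runs the chain from [5] for 200 ticks, a number I picked to be comfortably large.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def macrostate(cells):
    """Row-occupancy pattern of a microstate, largest count first."""
    per_row = Counter(c // 5 for c in cells)
    return tuple(sorted(per_row.values(), reverse=True))

# Macrostate sizes, plus one representative microstate for each macrostate.
sizes = Counter()
representative = {}
for cells in combinations(range(25), 5):
    m = macrostate(cells)
    sizes[m] += 1
    representative.setdefault(m, cells)

states = sorted(sizes, reverse=True)  # (5,), (4, 1), ..., (1, 1, 1, 1, 1)

def transition_counts(cells):
    """Where the 105 remove-and-replace moves from this microstate end up."""
    counts = Counter()
    for marble in cells:
        others = [c for c in cells if c != marble]
        for target in range(25):
            if target not in others:  # 20 empty places plus the marble's own
                counts[macrostate(others + [target])] += 1
    return counts

# Stochastic matrix: P[i][j] is the chance of going from states[i] to states[j].
P = []
for s in states:
    counts = transition_counts(representative[s])
    P.append([Fraction(counts[t], 105) for t in states])

def step(dist):
    """Advance a distribution over macrostates by one clock tick."""
    n = len(states)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Each macrostate's share of all microstates, as exact fractions.
total = sum(sizes.values())
share = [Fraction(sizes[s], total) for s in states]
is_stationary = step(share) == share
print("size-proportional distribution is stationary:", is_stationary)

# Start lined up and let the clock run (floats are fine for this part).
dist = [1.0 if s == (5,) else 0.0 for s in states]
for _ in range(200):
    dist = step(dist)

for s, p, d in zip(states, share, dist):
    print(list(s), f"{100 * float(p):.2g}%", f"{100 * d:.2g}%")
```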

Even in this toy example, the system will spend very little of its time in the low-entropy lined-up state [5], and if it ever does end up there, it won't stay there for long.


Given some basic assumptions, a system that evolves over time, transitioning from microstate to microstate, will spend the same amount of time in any given microstate (as usual, that's not quite right technically), which means that the time spent in each macrostate is proportional to its size. Higher-entropy states are larger than lower-entropy states, and because entropy is a logarithm, they're actually a lot larger.

For example, the odds of an entropy decrease of one millionth of a Joule per Kelvin are about one in e^(10^17). That's a number with somewhere around 40 quadrillion digits. To a mathematician, the odds still aren't zero, but to anyone else they would be.
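Here's the back-of-the-envelope version in Python, as a sketch. It assumes the usual Boltzmann relation, so that an entropy difference ΔS corresponds to a ratio of microstate counts of e^(ΔS/k). Without rounding, the exponent comes out a bit under 10^17, so the digit count lands in the tens of quadrillions either way.

```python
from math import e, log10

K_B = 1.380649e-23  # Boltzmann constant, Joules per Kelvin
delta_s = 1e-6      # entropy decrease, Joules per Kelvin

exponent = delta_s / K_B      # the odds are roughly one in e**exponent
digits = exponent * log10(e)  # number of decimal digits in e**exponent

print(f"{exponent:.1e}")  # 7.2e+16
print(f"{digits:.1e}")    # 3.1e+16
```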

For all but the tiniest, coldest systems, the chances of entropy decreasing even by a measurable amount are not just small, but incomprehensibly small. The only systems where the number of microstates isn't incomprehensibly huge are small collections of particles near absolute zero.

I'm pretty sure I've read about experiments where such a system can go from a higher-entropy state to a very slightly lower-entropy state and vice versa, though I haven't had any luck tracking them down. Even if no one's ever done it, such a system wouldn't violate any laws of thermodynamics, because the laws of thermodynamics are statistical (and there's also the question of definition over whether such a system is in equilibrium).

So you're saying ... there's a chance? Yes, but actually no, in any but the tiniest, coldest systems. Any decrease in entropy that could actually occur in the real world and persist long enough to be measured would be in the vicinity of 10⁻²³ Joules per Kelvin, which is much, much too small to be measured except under very special circumstances.

For example, if you have 1.43 grams of pure oxygen in a one-liter container at standard temperature and pressure, it's very unlikely that you know any of the variables involved -- the mass of the oxygen, its purity, the size of the container, the temperature or the pressure -- to even one part in a billion. Detecting changes 100,000,000,000,000 times smaller than that is not going to happen.



But none of that is what got me started on this post. What got me started was that the earlier post tried to define some sort of notion of "statistical symmetry", which isn't really a thing, and what got me started on that was my coming to understand that higher-entropy states are more symmetrical. That in turn was jarring because entropy is usually taken as a synonym for disorder, and symmetry is usually taken as a synonym for order.

Part of the resolution of that paradox is that entropy is a measure of uncertainty, not disorder. The earlier post got that right, but evidently that hasn't stopped me from hammering on the point for dozens more paragraphs and a couple of tables in this one, using a slightly different marbles-in-compartments example.

The other part is that more symmetry doesn't really mean more order, at least not in the way that we usually think about it.

From a mathematical point of view, a symmetry of an object is something you can do to it that doesn't change some aspect of the object that you're interested in. For example, if something has mirror symmetry, that means that it looks the same in the mirror as it does ordinarily.

It matters where you put the mirror. The letter W looks the same if you put a mirror vertically down the middle of it -- it has one axis of symmetry. The letter X looks the same if you put the mirror vertically in the middle, but it also looks the same if you put it horizontally in the middle -- it has two axes of symmetry.

Another way to say this is that if you could draw a vertical line through the middle of the W and rotate the W out of the page around that line, and kept going for 180 degrees until the W was back in the page, but flipped over, it would still look the same. If you chose some other line, it would look different (even if you picked a different vertical line, it would end up in a different place). That is, if you do something to the W -- rotate it around the vertical line through the middle -- it ends up looking the same. The aspect you care about here is how the W looks.

To put it somewhat more rigorously: if f is the particular mapping that takes each point to its mirror image across the axis, then f takes the set of points in the W to the exact same set of points. Any point on the axis maps to itself, and any point off the axis maps to its mirror image, which is also part of the W. The map f is defined for every point on the plane and it moves all of them except for the axis. The aspect we care about, which f doesn't change, is whether a particular point is in the W.

If you look at all the things you can do to an object without changing the aspect you care about, you have a mathematical group. For a W, there are two things you can do: leave it alone and flip it over. For an X, you have four options: leave it alone, flip it around the vertical axis, flip it around the horizontal axis, or do both. Leaving an object alone is called the identity transformation, and it's always considered a symmetry, because math. An asymmetrical object has only that symmetry (its symmetry group is trivial).

In normal speech, saying something is symmetrical usually means it has the same symmetry group as a W -- half of it is a mirror image of the other half. Technically, it has bilateral symmetry. In some sense, though, an X is more symmetrical, since its symmetry group is larger, and a hexagon, which has 12 elements in its symmetry group, is more symmetrical yet.
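To make the counting concrete, here's a small Python sketch. The shapes are crude stand-ins I made up (a handful of grid points each, centered on the origin), and it only tries the four candidate transformations mentioned above, so it can't find rotations like the ones a hexagon has.

```python
# The four candidate transformations, acting on a point (x, y).
TRANSFORMS = {
    "identity": lambda p: (p[0], p[1]),
    "flip around vertical axis": lambda p: (-p[0], p[1]),
    "flip around horizontal axis": lambda p: (p[0], -p[1]),
    "both flips": lambda p: (-p[0], -p[1]),
}

def symmetries(shape):
    """Names of the transformations that map the set of points onto itself."""
    return [name for name, f in TRANSFORMS.items() if {f(p) for p in shape} == shape]

# Hypothetical point sets standing in for the letters.
W = {(-2, 1), (-1, -1), (0, 0), (1, -1), (2, 1)}
X = {(-1, -2), (1, -2), (0, 0), (-1, 2), (1, 2)}
LOPSIDED = {(0, 0), (0, 1), (0, 2), (1, 0)}

for name, shape in [("W", W), ("X", X), ("lopsided", LOPSIDED)]:
    print(name, len(symmetries(shape)), symmetries(shape))
```

The W stand-in keeps two of the four, the X keeps all four, and the lopsided shape keeps only the identity.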

A figure with 19 sides, each of which is the same lopsided squiggle, would have a symmetry group with 19 elements (rotate by 1/19 of a full circle, 2/19 ... 18/19, and also don't rotate at all). That would make it more symmetrical than a hexagon, and quite a bit more symmetrical than a W, but if you asked people which was most symmetrical, they would probably put the 19-sided squigglegon last of the three.

Our visual system is mostly trained to recognize bilateral symmetry. Except for special situations like reflections in a pond, pretty much everything in nature with bilateral symmetry is an animal, which is pretty useful information when it comes to eating and not being eaten. We also recognize rotational symmetry, which includes flowers and some sea creatures, also useful information, but even these tend to have bilateral symmetry as well.

It would make sense, then, that in day to day life, "more symmetrical" generally means "closer to bilateral symmetry". If a house has an equal number of windows at the same level on either side of the front door, we think of it as symmetrical,  even though the windows may not be exactly the same, the door itself probably has a doorknob on one side or the other and so forth, so it's not quite exactly symmetrical. We'd still say it's pretty symmetrical, even though from a mathematical point of view it either has bilateral symmetry or it doesn't (and in the real world, nothing we can see is perfectly symmetrical).

That should go some way toward explaining why, along with so many other things, symmetry doesn't necessarily mean the same thing in its mathematical sense as it does ordinarily. The mathematical definition includes things that we don't necessarily think of as symmetry.

Continuing with shapes and their symmetries, you can think of each shape as a macrostate. You can  associate a microstate with each mapping (technically, in this case, any rigid transformation of the plane) that leaves the shape unchanged. The macrostate W has two microstates: one for the identity transformation, which leaves the plane unchanged, and one for the mirror transformation around the W's axis.

The X macrostate has four microstates, one for the identity, one for the flip around the vertical axis, one for the flip around the horizontal axis, and one for flipping around one axis and then the other (in this case, it doesn't matter what order you do it in). The X macrostate has a larger symmetry group, which is the same as saying it has more entropy.

In this context, a symmetry is something you can do to the microstate without changing the macrostate. A larger symmetry group -- more symmetry -- means more microstates for the same macrostate, which means more entropy, and vice-versa. They're two ways of looking at the same thing.

In the case of the marbles in a box, a symmetry is any way of switching the positions of the marbles, including not switching them around at all. Technically, this is a permutation group.

For any given microstate, some of the possible permutations just switch the marbles around in their places (for example, switching the first two marbles in a lined-up row), and some of them will move marbles to different compartments. For a microstate of the lined-up macrostate [5], there are many fewer permutations that leave the marbles in the same macrostate (all in one row, though not necessarily the same row) than there are for [2, 1, 1, 1]. Five marbles in a row looks more symmetrical, since it happens to have bilateral visual symmetry, but [5] is actually a much less symmetrical macrostate than [2, 1, 1, 1], even though most microstates of [2, 1, 1, 1] just look like a jumble.


In the real world, distributing marbles in boxes is really distributing energy among particles, generally a very large number of them. Real particles can be in many different states, many more than the marble/no marble states in the toy example, and different states can have the same energy, which makes the math a bit more complicated. Switching marbles around is really exchanging energy among particles, and there are all sorts of intricacies about how that happens.

Nonetheless, the same basic principles hold: Entropy is a measure of the number of microstates for a given macrostate, and an isolated system will evolve toward the highest-entropy macrostate available (equilibrium), and stay there, simply because the probability of anything else happening is essentially zero.

And also yeah, symmetry doesn't necessarily mean what you think it might.

Sunday, September 13, 2020

Entropy and time's arrow

When contemplating the mysteries of time ... what is it, why is it how it is, why do we remember the past but not the future ... it's seldom long before the second law of thermodynamics comes up.

[Later in this post, I try to invent something called "statistical symmetry", which isn't really a thing. I eventually revisited that topic in this post, which covers what I was trying to talk about, but more accurately, or at least less inaccurately -- D. H. April 2025]

In technical terms, the second law of thermodynamics states that the entropy of a closed system increases over time.  I've previously discussed what entropy is and isn't.  The short version is that entropy is a measure of uncertainty about the internal details of a system.  This is often shorthanded as "disorder", and that's not totally wrong, but it probably leads to more confusion than understanding.  This may be in part because uncertainty and disorder are both related to the more technical concept of symmetry, which may not mean what you might expect.  At least, I found some of this surprising when I first went over it.

Consider an ice cube melting.  Is a puddle of water more disordered than an ice cube?  One would think.  In an ice cube, each atom is locked into a crystal matrix, each atom in its place.  An atom in the liquid water is bouncing around, bumping into other atoms, held in place enough to keep from flying off into the air but otherwise free to move.

But which of the two is more symmetrical?  If your answer is "the ice cube", you're not alone.  That was my reflexive answer as well, and I expect that it would be for most people.  Actually, it's the water.  Why?  Symmetry is a measure of what you can do to something and still have it look the same.  The actual mathematical definition is, of course, a bit more technical, but that'll do for now.

An irregular lump of coal looks different if you turn it one way or another, so we call it asymmetrical.  A cube looks the same if you turn it 90 degrees in any of six directions, or 180 degrees in any of three directions, so we say it has "rotational symmetry" (and "reflective symmetry" as well).  A perfect sphere looks the same no matter which way you turn it, including, but not limited to, all the ways you can turn a cube and have the cube still look the same.  The sphere is more symmetrical than the cube, which is more symmetrical than the lump of coal.  So far so good.

A mass of water molecules bouncing around in a drop of water looks the same no matter which way you turn it.  It's symmetrical the same way a sphere is.  The crystal matrix of an ice cube only looks the same if you turn it in particular ways.  That is, liquid water is more symmetrical, at the microscopic level, than frozen water.  This is the same as saying we know less about the locations and motions of the individual molecules in liquid water than those in frozen water.  More uncertainty is the same as more entropy.

Geometrical symmetry is not the only thing going on here.  Ice at -100C has lower entropy than ice at -1C, because molecules in the colder ice have less kinetic energy and a narrower distribution of possible kinetic energies (loosely, they're not vibrating as quickly within the crystal matrix and there's less uncertainty about how quickly they're vibrating).  However, if you do see an increase in geometrical symmetry, you are also seeing an increase in uncertainty, which is to say entropy. The difference between cold ice and near-melting ice can also be expressed in terms of symmetry, but a more subtle kind of symmetry.  We'll get to that.


[This is where I start to go astray. These next few paragraphs are all right, but the later post has a better explanation of what's going on. -- D.H. April 2025]

As with the previous post, I've spent more time on a sidebar than I meant to, so I'll try to get to the point by going off on another sidebar, but one more closely related to the real point.

Suppose you have a box with, say, 25 little bins in it arranged in a square grid.  There are five marbles in the box, one in each bin on the diagonal from upper left to lower right.  This arrangement has "180-degree rotational symmetry".  That is, you can rotate it 180 degrees and it will look the same.  If you rotate it 90 degrees, however, it will look clearly different.

Now put a lid on the box, give it a good shake and remove the lid.  The five marbles will have settled into some random assortment of bins (each bin can only hold one marble).  If you look closely, this random arrangement is very likely to be asymmetrical in the same way a lump of coal is: If you turn it 90 degrees, or 180, or reflect it in a mirror, the individual marbles will be in different positions than if you didn't rotate or reflect the box.

However, if you were to take a quick glimpse at the box from a distance, then have someone flip a coin and turn the box 90 degrees if the coin came up heads, then take another quick glimpse, you'd have trouble telling if the box had been turned or not.  You'd have no trouble with the marbles in their original arrangement on the diagonal.  In that sense, the random arrangement is more symmetrical than the original arrangement, just like the microscopic structure of liquid water is more symmetrical than that of ice.


[And this is where I went full LLM -- D.H. April 2025]

The magic word to make this all rigorous is "statistical".  That is, if you have a big enough grid and enough marbles and you just measure large-scale statistical properties, and look at distributions of values rather than the actual values, then an arrangement of marbles is more symmetrical if these rough measures don't change when you rotate the box (or reflect it, or shuffle the rows or columns, or whatever -- for brevity I'll stick to "rotate" here).

For example, if you count the number of marbles on each diagonal line (wrapping around so that each diagonal line has five bins), then for the original all-on-one-diagonal arrangement, there will be a sharp peak: five marbles on the main diagonal, one on each of the diagonals that cross that main diagonal, and zero on the others.  Rotate the box, and that peak moves.  For a random arrangement, the counts will all be more or less the same, both before and after you rotate the box.  A random arrangement is more symmetrical, in this statistical sense.
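The diagonal counting can be sketched in a few lines of Python. The (row, column) coordinates, the two-index labeling of the wrap-around diagonals and the choice of a clockwise quarter turn are all my own conventions for the example.

```python
import random

N = 5

def diagonal_counts(cells):
    """Marbles on each wrap-around diagonal, for (row, col) cells in an N x N grid."""
    down = [0] * N  # diagonals running upper left to lower right
    up = [0] * N    # diagonals running the other way
    for r, c in cells:
        down[(c - r) % N] += 1
        up[(c + r) % N] += 1
    return down, up

def rotate90(cells):
    """Rotate the whole box a quarter turn clockwise."""
    return [(c, N - 1 - r) for r, c in cells]

main_diagonal = [(i, i) for i in range(N)]
print(diagonal_counts(main_diagonal))            # ([5, 0, 0, 0, 0], [1, 1, 1, 1, 1])
print(diagonal_counts(rotate90(main_diagonal)))  # ([1, 1, 1, 1, 1], [0, 0, 0, 0, 5])

# A shaken box: five distinct bins chosen at random (seed chosen arbitrarily).
rng = random.Random(0)
shaken = [divmod(b, N) for b in rng.sample(range(N * N), 5)]
print(diagonal_counts(shaken))
print(diagonal_counts(rotate90(shaken)))
```

For the all-on-one-diagonal arrangement the peak of five jumps from one family of diagonals to the other when the box is turned.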

The important thing here is that there are many more symmetrical arrangements than not.  For example, there are ten wrap-around diagonals in a 5x5 grid (five in each direction) so there are ten ways to put five marbles in that kind of arrangement.  There are 53,130 total ways to put 5 marbles in 25 bins, so there are approximately 5,000 times as many more-symmetrical, that is, higher-entropy, arrangements.  Granted, some of these are still fairly unevenly distributed, for example four marbles on one diagonal and one off it, but even taking that into account, there are many more arrangements that look more or less the same if you rotate the box than there are that look significantly different.

[And now we're heading back toward reality, though I did make a couple of small edits for clarity and precision. In particular, the parts about symmetrical arrangements of marbles in boxes are OK, though again the later post covers this better. The main mistake was thinking that there's anything statistical about symmetry. Statistics comes in when you compare the sizes of macrostates. -- D. H. April 2025]

This is a toy example.  If you scale up to, say, the number of molecules in a balloon at room temperature, "many more" becomes "practically all".  Even if the box has 2500 bins in a 50x50 grid, still ridiculously small compared to the trillions of trillions of molecules in a typical system like a balloon, or a vase, or a refrigerator or whatever, the odds that all of the marbles line up on a diagonal are less than one in a googol (that's ten to the hundredth power, not the search engine company). You can imagine all the molecules in a balloon crowding into one particular region, but for practical purposes it's not going to happen, at least not by chance in a balloon at room temperature.
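Python's integers are big enough to check the googol claim directly. This sketch assumes 50 marbles in the 50x50 box (one per bin along a diagonal, scaling up the five-in-25 example) and counts wrap-around diagonals in both directions.

```python
from math import comb

bins, marbles = 50 * 50, 50
arrangements = comb(bins, marbles)  # ways to put 50 marbles in 2500 bins
diagonals = 2 * 50                  # wrap-around diagonals, 50 in each direction
googol = 10 ** 100

# The odds are one in (arrangements / diagonals); compare that to a googol.
print(arrangements // diagonals > googol)  # True
```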

If you start with the box of marbles in a not-very-symmetrical state and shake it up, you'll almost certainly end up with a more symmetrical state, simply because there are many more ways for that to happen.  Even if you only change one part of the system, say by taking out one marble and putting it back in a random empty bin adjacent to its original position, there are still more cases than not in which the new arrangement is more symmetrical than the old one.

If you continue making more random changes, whether large or small, the state of the box will get more symmetrical over time.  Strictly speaking, this is not an absolute certainty, but for anything we encounter in daily life the numbers are so big that the chances of anything else happening are essentially zero.  This will continue until the system reaches its maximum entropy, at which point large or small random changes will (essentially certainly) leave the system in a state just as symmetrical as it was before.

That's the second law -- as a closed system evolves, its entropy will essentially never decrease, and if it starts in a state of less than maximum entropy, its entropy will essentially always increase until it reaches maximum entropy.


And now to the point.

The second law gives a rigorous way to tell that time is passing.  In a classic example, if you watch a film of a vase falling off a table and shattering on the floor, you can tell instantly if the film is running forward or backward: if you see the pieces of a shattered vase assembling themselves into an intact vase, which then rises up and lands neatly on the table, you know the film is running backwards.  Thus it is said that the second law of thermodynamics gives time its direction.

As compelling as that may seem, there are a couple of problems with this view.  I didn't come up with any of these, of course, but I do find them convincing:

  • The argument is only compelling for part of the film.  In the time between the vase leaving the table and it making contact with the floor, the film looks fine either way.  You either see a vase falling, or you see it rising, presumably having been launched by some mechanism.  Either one is perfectly plausible, while the vase assembling itself from its many pieces is totally implausible.  But the lack of any obvious cue like pottery shards improbably assembling themselves doesn't stop time from passing.
  • If your recording process captured enough data, beyond just the visual image of the vase, you could in principle detect that the entropy of the contents of the room increases slightly if you run the film in one direction and decreases in the other, but that doesn't actually help because entropy can decrease locally without violating the second law.  For example, you can freeze water in a freezer or by leaving it out in the cold.  Its entropy decreases, but that's fine because entropy overall is still increasing, one way or another (for example, a refrigerator dumps heat into the surrounding environment at a higher temperature than the temperature of its contents, which are losing heat, and if you run all the numbers through, the total entropy of the refrigerator together with its environment is increasing).  If you watch a film of ice melting, there may not be any clear cues to tell you that you're not actually watching a film of ice freezing, running backward.  But time passes regardless of whether entropy is increasing or decreasing in the local environment.
  • Most importantly, though, in an example like a film running, we're only able to say "That film of a vase shattering is running backward" because we ourselves perceive time passing.  We can only say the film is running backward because it's running at all.  By "backward", we really mean "in the other direction from our perception of time".  Likewise, if we measure the entropy of a refrigerator and its contents, we can only say that entropy is increasing as time as we perceive it increases.
In other words, entropy increasing is a way that we can tell time is passing, but it's not the cause of time passing, any more than a mile marker on a road makes your car move.  In the example of the box of marbles, we can only say that the box went from a less symmetrical to more symmetrical state because we can say it was in one state before it was in the other.

If you printed a diagram of each arrangement of marbles on opposite sides of a piece of paper, you'd have two diagrams on a piece of paper.  You couldn't say one was before the other, or that time progressed from one to the other.  You can only say that if the state of the system undergoes random changes over time, then the system will get more symmetrical over time, and in particular the less symmetrical arrangement (almost certainly) won't happen after the more symmetrical one.  That is, entropy will increase.

You could even restate the second law as something like "As a system evolves over time, all state changes allowed by its current state are equally likely" and derive increasing entropy from that (strictly speaking you may have to distinguish identical-looking potential states in order to make "equally likely" work correctly -- the rigorous version of this is the ergodic hypothesis).  This in turn depends on the assumptions that systems have state, and that state changes over time.  Time is a fundamental assumption here, not a by-product.

In short, while you can use the second law to demonstrate that time is passing, you can't appeal to the second law to answer questions like "Why do we remember the past and not the future?"  It just doesn't apply.

Saturday, July 22, 2017

Yep. Tron.

It was winter when I started writing this, but writing posts about physics is hard, at least if you're not a physicist.  This one was particularly hard because I had to re-learn what I thought I knew about the topic, and then realize that I'd never really understood it as well as I'd thought, then try to learn it correctly, then realize that I also needed to re-learn some of the prerequisites, which led to a whole other post ... but just for the sake of illustration, let's pretend it's still winter.

If you live near a modest-sized pond or lake, you might (depending on the weather) see it freeze over at night and thaw during the day.  Thermodynamically this can be described in terms of energy (specifically heat) and entropy.  At night, the water is giving off heat into the surrounding environment and losing entropy (while its temperature stays right at freezing).  The surrounding environment is taking on heat and gaining entropy.  The surroundings gain at least as much entropy as the pond loses, and ultimately the Earth will radiate just that bit more heat into space.  When you do all the accounting, the entropy of the universe increases by just a tiny bit, relatively speaking.

During the day, the process reverses.  The water takes on heat and gains entropy (while its temperature still stays right at freezing).  The surroundings give off heat, which ultimately came from the sun, and lose entropy.  The water gains at least as much entropy as the surroundings lose*, and again the entropy of the universe goes up by just that little, tiny bit, relatively speaking.

So what is this entropy of which we speak?  Originally entropy was defined in terms of heat and temperature.  One of the major achievements of modern physics was to reformulate entropy in a more powerful and elegant form, revealing deep and interesting connections, thereby leading to both enlightenment and confusion.  The connections were deep enough that Claude Shannon, in his founding work on information theory, defined a similar concept with the same name, leading to even more enlightenment and confusion.

The original thermodynamic definition relies on the distinction between heat and temperature.  Temperature, at least in the situations we'll be discussing here, is a measure of how energetic individual particles -- typically atoms or molecules -- are on average, independent of how many particles are involved.  Heat is a form of energy, so the amount of it depends on how many particles there are as well as on how energetic they are.

The air in an oven heated to 500K (that is, 500 Kelvin, about 227 degrees Celsius or 440 degrees Fahrenheit) and a pot full of oil at 500K are, of course, at the same temperature, but you can safely put your hand in the oven for a bit.  The oil, not so much.  Why?  Mainly because there's a lot more heat in the oil than in the air.  By definition the molecules in the oven air are just as energetic, on average, as the molecules in the oil, but there are a lot more molecules of oil, and therefore a lot more energy, which is to say heat.

At least, that's the quick explanation for purposes of illustration.  Going into the real details doesn't change the basic point: heat is different from temperature and changing the temperature of something requires transferring energy (heat) to or from it.  As in the case of the pond freezing and melting, there are also cases where you can transfer heat to or from something without changing its temperature.  This will be important in what follows.

Entropy was originally defined as part of understanding the Carnot cycle, which describes the ideal heat-driven engine (the efficiency of a real engine is usually given as a percentage of what the Carnot cycle would produce, not as a percentage of the energy it uses).  Among the principal results in classical thermodynamics is that the Carnot cycle is as good as you can get, even in principle, and that even the Carnot cycle can never be perfectly efficient.

At this point it might be helpful to read that earlier post on energy, if you haven't already.  Particularly relevant parts here are that the state of the working fluid in a heat engine, such as the steam in a steam engine, can be described with two parameters, or, equivalently, as a point in a two-dimensional diagram, and that the cycle an engine goes through can be described by a path in that two-dimensional space.

Also keep in mind the ideal gas law: In an ideal gas, the temperature of a given amount of gas is proportional to pressure times volume.  Here and in the rest of this post, "gas" means "a substance without a fixed shape or volume" and not what people call "gasoline" or "petrol".

If you've ever noticed a bicycle pump heat up as you pump up a tire, that's (more or less) why.  You're compressing air, that is, decreasing its volume, so (unless the pump is able to spill heat with perfect efficiency, which it isn't) the temperature has to go up.  For the same reason the air coming out of a can of compressed air is dangerously cold.  The air is expanding rapidly so the temperature drops sharply.
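As a rough sketch of the size of the effect, the snippet below uses the textbook relation for an ideal gas compressed or expanded with no heat exchanged, T * V^(gamma - 1) = constant.  The relation, the value gamma = 1.4 for air and the starting temperature are assumptions I'm bringing in for illustration, and a real pump or spray can will leak heat, so treat the output as an upper bound on the swing.

```python
def adiabatic_temperature(t_start_k, v_start, v_end, gamma=1.4):
    """Temperature (in Kelvin) after a reversible, heat-free volume change.

    Uses T * V**(gamma - 1) = constant for an ideal gas.  gamma = 1.4 is
    the usual textbook value for air, assumed here for illustration.
    """
    return t_start_k * (v_start / v_end) ** (gamma - 1.0)

room_temp_k = 293.0  # roughly 20 degrees Celsius, chosen for illustration
squeezed = adiabatic_temperature(room_temp_k, v_start=1.0, v_end=0.5)
expanded = adiabatic_temperature(room_temp_k, v_start=1.0, v_end=2.0)

print(f"compressed to half the volume: {squeezed:.0f} K")
print(f"expanded to twice the volume:  {expanded:.0f} K")
```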

In the Carnot cycle you first supply heat to a gas (the "working fluid", for example steam in a steam engine) while maintaining a perfectly constant temperature by expanding the container it's in.  You're heating that gas, in the sense of supplying heat, but not in the sense of raising its temperature.  Again, heat and temperature are two different things.

To continue the Carnot cycle, let the container keep expanding, but now in such a way that it neither gains nor loses heat (in technical terms, adiabatically).  In these first two steps, you're getting work out of the engine (for example, by connecting a rod to the moving part of a piston and attaching the other part of that rod to a wheel).  The gas is losing energy since it's doing work on the piston, and it's also expanding, so the temperature and pressure are both dropping, but no heat is leaving the container in the adiabatic step.

Work is force times distance, and force in this case is pressure times the area of the surface that's moving.    Since the pressure, and therefore the force, is dropping during the second step you'll need to use calculus to figure out the exact amount of work, but people know how to do that.
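Here is a minimal sketch of that calculation for the adiabatic step, done two ways: by adding up pressure times small changes in volume, and with the closed-form answer from calculus.  It assumes an ideal monatomic gas (gamma = 5/3) for which P * V^gamma stays constant, and the pressure and volumes are made-up illustrative values.

```python
def adiabatic_work_numeric(p_start, v_start, v_end, gamma=5.0 / 3.0, steps=100_000):
    """Approximate the work done by the gas as a sum of P * dV (midpoint rule)."""
    constant = p_start * v_start ** gamma  # P * V**gamma is fixed along the step
    dv = (v_end - v_start) / steps
    work = 0.0
    for i in range(steps):
        v_mid = v_start + (i + 0.5) * dv
        work += constant / v_mid ** gamma * dv
    return work

def adiabatic_work_exact(p_start, v_start, v_end, gamma=5.0 / 3.0):
    """Closed-form value of the same integral."""
    p_end = p_start * (v_start / v_end) ** gamma
    return (p_start * v_start - p_end * v_end) / (gamma - 1.0)

# Illustrative numbers: 200 kPa, expanding from 1 litre to 2 litres.
numeric = adiabatic_work_numeric(2.0e5, 1.0e-3, 2.0e-3)
exact = adiabatic_work_exact(2.0e5, 1.0e-3, 2.0e-3)
print(f"numeric: {numeric:.3f} J, exact: {exact:.3f} J")
```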

The last two steps of the cycle reverse the first two.  In step three you compress the gas, for example by changing the direction the piston is moving, while keeping the temperature the same.  This means the gas is cooling in the sense of giving off heat, but not in the sense of dropping in temperature.  Finally, in step four, compress the gas further, without letting it give off heat.  This raises the temperature.  The piston is doing work on the gas and the volume is decreasing.  In a perfect Carnot cycle the gas ends up in the same state -- same pressure, temperature and volume -- as it began and you can start it all over.

As mentioned in the previous post, you end up putting more heat in at the start than you end up getting back in the third step, and you end up getting more work out in the first two steps than you put in in the last two (because the pressure is higher in the first two steps).  Heat gets converted to work (or if you run the whole thing backwards, you end up with a refrigerator).

If you plot the Carnot cycle on a diagram of pressure versus volume, or the other two combinations of pressure, volume and temperature, you get a shape with at least two curved sides, and it's hard to tell whether you could do better.  Carnot proved that this cycle is the best you can do, in terms of how much work you can get out of a given amount of heat, by choosing two parameters that make the cycle into a rectangle.  One is temperature -- steps one and three maintain a constant temperature.

The other needs to make the other two steps straight lines.  To make this work out, the second quantity has to remain constant while the temperature is changing, and change when temperature is constant.  The solution is to define a quantity -- call it entropy -- that changes, when temperature is constant, by the amount of heat transferred, divided by that temperature (ΔS = ΔQ/T -- the deltas (Δ) say that we're relating changes in heat and entropy, not absolute quantities; Q stands for heat and S stands for entropy, because reasons).  When there's no heat transferred, entropy doesn't change.  In step one, temperature is constant and entropy increases.  In step two, temperature decreases while entropy remains constant, and so forth.
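The bookkeeping can be sketched for an ideal gas as follows.  This is my own illustration with made-up temperatures and volumes, and it leans on two standard ideal-gas facts that the post doesn't spell out: the heat transferred in a constant-temperature step is nRT ln(V_end/V_start), and the two adiabatic steps force step three to use the same volume ratio as step one.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def carnot_cycle(t_hot, t_cold, v1, v2, moles=1.0):
    """Heat, entropy and work bookkeeping for an ideal-gas Carnot cycle.

    Step 1 expands the gas from v1 to v2 at t_hot; step 3 compresses it
    by the same volume ratio at t_cold.  Steps 2 and 4 transfer no heat,
    so they leave the entropy unchanged.
    """
    log_ratio = math.log(v2 / v1)
    heat_in = moles * R * t_hot * log_ratio    # absorbed in step 1
    heat_out = moles * R * t_cold * log_ratio  # given off in step 3
    ds_step1 = heat_in / t_hot                 # entropy gained by the gas
    ds_step3 = -heat_out / t_cold              # entropy lost by the gas
    net_work = heat_in - heat_out              # energy conservation over one cycle
    return heat_in, heat_out, ds_step1, ds_step3, net_work

heat_in, heat_out, ds1, ds3, net_work = carnot_cycle(500.0, 300.0, 1.0e-3, 2.0e-3)
rectangle_area = (500.0 - 300.0) * ds1  # temperature span times entropy span

print(f"heat in {heat_in:.1f} J, heat out {heat_out:.1f} J, net work {net_work:.1f} J")
print(f"entropy change: step 1 {ds1:+.3f} J/K, step 3 {ds3:+.3f} J/K")
```

The entropy gained in step one is exactly given back in step three, and the net work equals the area of the rectangle on the temperature/entropy diagram.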

To be clear, entropy and temperature can, in general, both change at the same time.  For example, if you heat a gas at constant volume, then pressure, temperature and entropy all go up.  The Carnot cycle is a special case where only one changes at a time.
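For the constant-volume example, a small sketch under the assumption of an ideal monatomic gas (heat capacity 3/2 R per mole, which is a textbook value and not something from this post) looks like this.  The entropy formula n * Cv * ln(T_end/T_start) comes from adding up ΔQ/T as the temperature rises.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def heat_at_constant_volume(t_start, t_end, p_start, moles=1.0, cv=1.5 * R):
    """Return (final pressure, entropy change) for heating at fixed volume.

    Pressure follows the ideal gas law (P proportional to T at fixed V).
    cv = 3/2 R assumes a monatomic ideal gas.
    """
    p_end = p_start * t_end / t_start
    delta_s = moles * cv * math.log(t_end / t_start)
    return p_end, delta_s

# Illustrative numbers: one mole at 100 kPa, heated from 300 K to 600 K.
p_end, delta_s = heat_at_constant_volume(300.0, 600.0, 1.0e5)
print(f"pressure rises to {p_end:.0f} Pa, entropy rises by {delta_s:.2f} J/K")
```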

Knowing the definition of entropy, you can convert, say, a pressure/volume diagram to a temperature/entropy diagram and back.  In real systems, the temperature/entropy version won't show absolutely straight vertical and horizontal lines -- that is, there will be at least some places where both change at the same time.  The Carnot cycle is exactly the case where the lines are perfectly horizontal and vertical.

This definition of entropy in terms of heat and temperature says nothing at all about what's going on in the gas, but it's enough, along with some math I won't go into here (but which depends on the cycle being a rectangle), to prove Carnot's result: The portion of heat wasted in a Carnot cycle is the ratio of the cold temperature to the hot temperature (on an absolute temperature scale).  You can only have zero loss -- 100% efficiency -- if the cold temperature is absolute zero.  Which it won't be.
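In code, the result is a one-liner.  The function names and example temperatures below are mine, picked for illustration, and temperatures must be in Kelvin or another absolute scale.

```python
def carnot_waste_fraction(t_cold_k, t_hot_k):
    """Portion of the input heat that an ideal Carnot engine cannot convert."""
    if t_hot_k <= 0 or t_cold_k < 0 or t_cold_k > t_hot_k:
        raise ValueError("need 0 <= t_cold_k <= t_hot_k and t_hot_k > 0, in Kelvin")
    return t_cold_k / t_hot_k

def carnot_efficiency(t_cold_k, t_hot_k):
    """Best possible portion of the input heat converted to work."""
    return 1.0 - carnot_waste_fraction(t_cold_k, t_hot_k)

# A 500 K heat source exhausting to 300 K surroundings.
print(f"wasted: {carnot_waste_fraction(300.0, 500.0):.0%}")
print(f"best-case efficiency: {carnot_efficiency(300.0, 500.0):.0%}")
```

Only a cold side at absolute zero gives a waste fraction of zero.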

Any cycle that deviates from a perfect rectangle will be less efficient yet.  In real life this is inevitable.  You can come pretty close on all the steps, but not perfectly close.  In real life you don't have an ideal gas, you can't magically switch from being able to put heat into the gas to perfectly insulating it, you won't be able to transfer all the heat from your heat source to the gas, you won't be able to capture all the heat from the third step of the cycle to reuse in the first step of the next cycle, some of the energy of the moving piston will be lost to friction (that is, dissipated into the surroundings as heat) and so on.

The problem-solving that goes into minimizing inefficiencies in real engines is why engineering came to be called engineering and why the hallmark of engineering is getting usefulness out of imperfection.



There are other cases where heat is transferred at a constant temperature, and we can define entropy in the same way as for a gas.  For example, temperature doesn't change during a phase change such as melting or freezing.  As our pond melts and freezes, the temperature stays right at freezing until the pond completely freezes, at which point it can get cooler, or melts entirely, at which point it can get warmer.

If all you know is that some water is at the freezing point, you can't say how much heat it will take to raise the temperature above freezing without knowing how much of it is frozen and how much is liquid.  The concept of entropy is perfectly valid here -- it relates directly to how much of the pond is liquid -- and we can define "entropy of fusion" to account for phase transitions.
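As a hedged example, the entropy change for melting or freezing a given amount of water is the same ΔS = ΔQ/T as before, with the heat coming from the latent heat of fusion.  The figure of about 334 joules per gram is the usual textbook value, not something from this post.

```python
LATENT_HEAT_FUSION_J_PER_G = 334.0  # textbook value for water, an assumption here
FREEZING_POINT_K = 273.15

def melting_entropy_change(grams_melted):
    """Entropy gained by water when this many grams melt at the freezing point.

    Pass a negative number for water that freezes; the result is then the
    (negative) entropy change of the water itself.
    """
    heat_absorbed = grams_melted * LATENT_HEAT_FUSION_J_PER_G
    return heat_absorbed / FREEZING_POINT_K

one_kg_melting = melting_entropy_change(1000.0)
one_kg_freezing = melting_entropy_change(-1000.0)
print(f"melting 1 kg of ice:    {one_kg_melting:+.0f} J/K")
print(f"freezing 1 kg of water: {one_kg_freezing:+.0f} J/K")
```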

There are plenty of other cases that don't look quite so much like the ideal gas case but still involve changes of entropy.  Mixing two substances increases overall entropy.  Entropy is a determining factor in whether a chemical reaction will go forward or backward and in ice melting when you throw salt on it.


Before I go any further about thermodynamic entropy, let me throw in that Claude Shannon's definition of entropy in information theory is, informally, a measure of the number of distinct messages that could have been transmitted in a particular situation.  On the other blog, for example, I've ranted about bits of entropy for passwords.  This is exactly a measure of how many possible passwords there are in a given scheme for picking passwords.
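For instance, here is a minimal sketch of counting bits of entropy for a password scheme.  It assumes every symbol (or word) is picked independently and uniformly at random, and the two schemes shown are examples I made up.

```python
import math

def password_entropy_bits(choices_per_symbol, length):
    """Bits of entropy for `length` independent, uniformly random picks.

    The number of possible passwords is choices_per_symbol ** length, and
    the entropy in bits is the base-2 logarithm of that count.
    """
    return length * math.log2(choices_per_symbol)

# Hypothetical schemes: four words from a 2048-word list,
# versus eight random lowercase letters.
four_words = password_entropy_bits(2048, 4)
eight_letters = password_entropy_bits(26, 8)
print(f"four words: {four_words:.1f} bits, eight letters: {eight_letters:.1f} bits")
```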

What in the world does this have to do with transferring heat at a constant temperature?  Good question.

Just as the concept of energy underwent several shifts in understanding on the way to its current formulation, so did entropy.  The first major shift came with the development of statistical mechanics.  Here "mechanics" refers to the behavior of physical objects, and "statistical" means you've got enough of them that you're only concerned about their overall behavior.

Statistical mechanics models an ideal gas as a collection of particles bouncing around in a container.  You can think of this as a bunch of tiny balls bouncing around in a box, but there's a key difference from what you might expect from that image.  In an ideal gas, all the collisions are perfectly elastic, meaning that the energy of motion (called kinetic energy) remains the same before and after.  In a real box full of balls, the kinetic energy of the balls gets converted to heat as the balls bump into each other and push each other's molecules around, and sooner or later the balls stop bouncing.

But the whole point of the statistical view of thermodynamics is that heat is just the kinetic energy of the particles the system is made up of.  When actual bouncing balls lose energy to heat, that means that the kinetic energy of the large-scale motion of the balls themselves is getting converted into kinetic energy of the small-scale motion of the molecules the balls are made of, and of the air in the box, and of the walls of the box, and eventually the surroundings.  That is, the large scale motion we can see is getting converted into a lot of small-scale motion that we can't, which we call heat.

When two particles, say two oxygen molecules, bounce off each other, the kinetic energy of the moving particles just gets converted into kinetic energy of differently-moving particles, and that's it.  In the original formulation of statistical mechanics, there's simply no other place for that energy to go, no smaller-scale moving parts to transfer energy to (assuming there's no chemical reaction between the two -- if you prefer, put pure helium in the box).

When a particle bounces off the wall of the container, it imparts a small impulse -- an instantaneous force -- to the walls.  When a whole lot of particles continually bounce off the walls of a container, those instantaneous forces add up to (for all practical purposes) a continuous force, that is, pressure.

Temperature is the average kinetic energy of the particles and volume is, well, volume.  That gives us our basic parameters of temperature, pressure and volume.

But what is entropy, in this view?  In statistical mechanics, we're concerned about the large-scale (macroscopic) state of the system, but there are many different small-scale (microscopic) states that could give the same macroscopic picture.

Once you crank through all the math, it turns out that entropy is a measure of how many different microscopic states, which we can't measure, are consistent with the macroscopic state, which we can measure.  In fuller detail, entropy is actually proportional to the logarithm of that number -- the number of digits, more or less -- both because the raw numbers are ridiculously big, and because that way the entropy of two separate systems is the sum of the entropy of the individual systems.

The actual formula is S = k ln(W), where k is Boltzmann's constant and W is the total number of possible microstates, assuming they're all equally probable.  There's a slightly bigger formula if they're not.  Note that, unlike the original thermodynamic definition, this formula deals in absolute quantities, not changes.
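A quick sketch of why the logarithm makes entropies add: if one system has W_a possible microstates and another has W_b, the pair together has W_a times W_b, and the log of a product is the sum of the logs.  The microstate counts below are arbitrary stand-ins and far smaller than anything realistic.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

def boltzmann_entropy(microstates):
    """S = k ln(W) for W equally probable microstates."""
    return K_B * math.log(microstates)

w_a, w_b = 10 ** 20, 10 ** 30  # arbitrary illustrative counts
combined = boltzmann_entropy(w_a * w_b)
separate = boltzmann_entropy(w_a) + boltzmann_entropy(w_b)
print(f"combined: {combined:.4e} J/K, sum of parts: {separate:.4e} J/K")
```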

When ice melts, entropy increases.  Water molecules in ice are confined to fixed positions in a crystal.  We may not know the exact energy of each individual molecule, but we at least know more or less where it is, and we know that if the energy of such a molecule is too high, it will leave the crystal (if this happens on a large scale, the crystal melts).  Once it does, we know much less about its location or energy.

Even without a phase change, the same sort of reasoning applies.  As temperature -- the average energy of each particle -- increases, the range of energies each particle can have increases.  How to translate this continuous range of energies into a number we can count is a bit of a puzzle, but we can handwave around that for now.

Entropy is often called a measure of disorder, but more accurately it's a measure of uncertainty (as theoretical physicist Sabine Hossenfelder puts it: "a measure for unresolved microscopic details"), that is, how much we don't know.  That's why Shannon used the same term in information theory.  The entropy of a message measures how much we don't know about it just from knowing its size (and a couple of other macroscopic parameters).  Shannon entropy is also logarithmic, for the same reasons that thermodynamic entropy is.

The formula for Shannon entropy in the case that all possible messages are equally probable is H = k ln(M), where M is the number of messages.  I put k there to account for the logarithm usually being base 2 and because it emphasizes the similarity to the other definition.  Again, there's a slightly bigger formula if the various messages aren't all equally probable, and it too looks an awful lot like the corresponding formula for thermodynamic entropy.
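For the curious, the slightly bigger formula is H = -Σ p log(p), summed over all the possible messages.  The sketch below uses base-2 logarithms directly (equivalent to a particular choice of k) so the answer comes out in bits, and the example probabilities are made up.

```python
import math

def shannon_entropy_bits(probabilities):
    """H = -sum(p * log2(p)), skipping messages that can never occur."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = [1 / 8] * 8               # eight equally likely messages
skewed = [0.5, 0.25, 0.125, 0.125]  # four messages, one much likelier

print(shannon_entropy_bits(uniform), math.log2(8))
print(shannon_entropy_bits(skewed), math.log2(4))
```

With equal probabilities the general formula collapses to the logarithm of the number of messages.  With unequal probabilities the entropy is lower than that, because some messages are less of a surprise than others.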

The original formulation of statistical mechanics assumed that physics at the microscopic scale followed Newton's laws of motion.  One indication that statistical mechanics was on to something is that when quantum mechanics completely reformulated what physics looks like at the microscopic scale, the statistical formulation not only held up, but became more accurate with the new information available.

In our current understanding, when two oxygen molecules bounce off each other, their electron shells interact (there's more going on, but let's start there), and eventually their energy gets redistributed into a new configuration.  This can mean the molecules traveling off in new paths, but it could also mean that some of the kinetic energy gets transferred to the electrons themselves, or some of the electrons' energy gets converted into kinetic energy.

Macroscopically this all looks the same as the old model, if you have huge numbers of molecules, but in the quantum formulation we have a more precise picture of entropy.  This makes a difference in extreme situations such as extremely cold crystals.  Since energy is quantized, there is a finite (though mind-bendingly huge) number of possible quantum states a typical system can have, and we can stop handwaving about how to handle ranges of possible energy.  This all works whether you have a gas, a liquid, an ordinary solid or some weird Bose-Einstein condensate.  Entropy measures that number of possible quantum states.

Thermodynamic entropy and information theoretic entropy are measuring basically the same thing, namely the number of specific possibilities consistent with what we know in general.  In fact, the modern definition of thermodynamic entropy specifically starts with a raw number of possible states and includes a constant factor to convert from the raw number to the units (energy over temperature) of classical thermodynamics.

This makes the two notions of entropy look even more alike -- they're both based on a count of possibilities, but with different scaling factors.  Below I'll even talk, loosely, of "bits worth of thermodynamic entropy" meaning the number of bits in the binary number for the number of possible quantum states.

Nonetheless, they're not at all the same thing in practice.

Consider a molecule of DNA.  There are dozens of atoms, and hundreds of subatomic particles, in a base pair.  I really don't know how many possible states a phosphorus atom (say) could be in under typical conditions, but I'm going to guess that there are thousands of bits worth of entropy in a base pair at room temperature.  Even if each individual particle can only be in one of two possible states, you've still got hundreds of bits.

From an information-theoretic point of view, there are four possible states for a base pair, which is two bits, and because the genetic code actually includes a fair bit of redundancy in the form of different ways of coding the same amino acid and so forth, it's actually more like 10/6 of a bit, even without taking into account other sources of redundancy.

But there is a lot of redundancy in your genome, as far as we can tell, in the form of duplicated genes and stretches of DNA that might or might not do anything.  All in all, there is about a gigabyte worth of base pairs in a human genome, but the actual gene-coding information can compress down to a few megabytes.  The thermodynamic entropy of the molecule that encodes those megabytes is much, much, larger.  If each base pair represents about a thousand bits worth of thermodynamic entropy under typical conditions, then the whole strand is into the hundreds of gigabytes.
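The arithmetic behind those figures can be sketched as follows.  The count of roughly three billion base pairs is the commonly quoted size of the human genome, which I'm assuming here since the post only says "about a gigabyte", and the thousand bits per base pair is the guess made above, not a measured value.

```python
BASE_PAIRS = 3.0e9                  # commonly quoted rough figure (assumption)
INFO_BITS_PER_BASE_PAIR = 2         # four possible base pairs
GUESSED_THERMO_BITS_PER_BASE_PAIR = 1000  # the guess from the text, not a measurement

def bits_to_gigabytes(bits):
    """Convert a bit count to (decimal) gigabytes."""
    return bits / 8 / 1e9

information_gb = bits_to_gigabytes(BASE_PAIRS * INFO_BITS_PER_BASE_PAIR)
thermo_gb = bits_to_gigabytes(BASE_PAIRS * GUESSED_THERMO_BITS_PER_BASE_PAIR)

print(f"raw base-pair information: about {information_gb:.2f} GB")
print(f"guessed thermodynamic entropy: about {thermo_gb:.0f} GB worth of bits")
```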

I keep saying "under typical conditions" because thermodynamic entropy, being thermodynamic, depends on temperature.  If you have a fever, your body, including your DNA molecules in particular, has higher entropy than if you're sitting in an ice bath.  The information theoretic entropy, on the other hand, doesn't change.

But all this is dwarfed by another factor.  You have billions of cells in your body (and trillions of bacterial cells that don't have your DNA, but never mind that).  From a thermodynamic standpoint, each of those cells -- its DNA, its RNA, its proteins, lipids, water and so forth -- contributes to the overall entropy of your body.  A billion identical strands of DNA at a given temperature have the same information content as a single strand but a billion times the thermodynamic entropy.

If you want to compare bits to bits, the Shannon entropy of your DNA is inconsequential compared to the thermodynamic entropy of your body.  Even the change in the thermodynamic entropy of your body as you breathe is enormously bigger than the Shannon entropy of your DNA.

I mention all this because from time to time you'll see statements about genetics and the second law of thermodynamics.  The second law, which is very well established, states that the entropy of a closed system cannot decrease over time.  One implication of it is that heat doesn't flow from cold to hot, which is a key assumption in Carnot's proof.

Sometimes the second law is taken to mean that genomes can't get "more complex" over time, since that would violate the second law.  The usual response to this is that living cells aren't closed systems and therefore the second law doesn't apply.  That's perfectly valid.  However, I think a better answer is that this confuses two forms of entropy -- thermodynamic entropy and Shannon entropy -- which are just plain different.  In other words, thermodynamic entropy and the second law don't work that way.

From an information point of view, the entropy of a genome is just how many bits it encodes once you compress out any redundancy.  Longer genomes typically have more entropy.  From a thermodynamic point of view, at a given temperature, more of the same substance has higher entropy than less as well, but we're measuring different quantities.

A live elephant has much, much higher entropy than a live mouse, and likewise for a live human versus a live mouse.  As it happens, a mouse genome is roughly the same size as a human genome, even though there's a huge difference in thermodynamic entropy between a live human and a live mouse.  The mouse genome is slightly smaller than ours, but not a lot.  There's no reason it couldn't be larger, and certainly no thermodynamic reason.  Neither the mouse nor human genome is particularly large.  Several organisms have genomes dozens of times larger, at least in terms of raw base pairs.

From a thermodynamic point of view, it hardly matters what exact content a DNA molecule has.  There are some minor differences in thermodynamic behavior among the particular base pairs, and in some contexts it makes a slight difference what order they're arranged in, but overall the gene-copying machinery works the same whether the DNA is encoding a human digestive protein or nothing at all.  Differences in gene content are dwarfed by the thermodynamic entropy change of turning one strand of DNA and a supply of loose nucleotides into two strands, that in turn is dwarfed by everything else going on in the cell, and that in turn is dwarfed by the jump from one cell to billions.

For what it's worth, content makes even less thermodynamic difference in other forms of storage.  A RAM chip full of random numbers has essentially the same thermodynamic entropy, at a given temperature, as one containing all zeroes or all ones, even though those have drastically different Shannon entropies.  The thermodynamic entropy changes involved in writing a single bit to memory are going to equate to a lot more than one bit.

Again, this is all assuming it's valid to compare the two forms of entropy at all, based on their both being measures of uncertainty about what exact state a system is in, and again, the two are not actually comparable, even though they're similar in form.  Comparing the two is like trying to compare a football score to a basketball score on the basis that they're both counting the number of times the teams involved have scored goals.


There's a lot more to talk about here, for example the relation between symmetry and disorder (more disorder means more symmetry, which was not what I thought until I sat down to think about it), and the relationship between entropy and time (for example, as experimental physicist Richard Muller points out, local entropy decreases all the time without time appearing to flow backward), but for now I think I've hit the main points:
  • The second law of thermodynamics is just that -- a law of thermodynamics.
  • Thermodynamic entropy as currently defined and information-theoretic (Shannon) entropy are two distinct concepts, even though they're very similar in form and derivation.
  • The two are defined in different contexts and behave entirely differently, despite what we might think from them having the same name.
  • Back at the first point, the second law of thermodynamics says almost nothing about Shannon entropy, even though you can, if you like, use the same terminology in counting quantum states.
  • All this has even less to do with genetics.

* Strictly speaking, you need to take the Sun into account.  The Sun is gaining entropy over time, at a much, much higher rate than our little pond and its surroundings, and it's only an insignificantly tiny part of the universe.  But even if you had a closed system, of a pond and surroundings that were sometimes warm and sometimes cold, for whatever reason, the result would be the same: The entropy of a closed system increases over time.

Sunday, July 16, 2017

Discovering energy

If you get an electricity bill, you're aware that energy is something that can be quantified, consumed, bought and sold.   It's something real, even if you can't touch it or see it.  You probably have a reasonable idea of what energy is, even if it's not a precisely scientific definition, and an idea of some of the things you can do with energy: move things around, heat them or cool them, produce light, transmit information and so forth.

When something's as everyday-familiar as energy it's easy to forget that it wasn't always this way, but in fact the concept of energy as a measurable quantity is only a couple of centuries old, and the closely related concept of work is even newer.

Energy is now commonly defined as the ability to do work, and work as a given force acting over a given distance.  For example, lifting a (metric) ton of something one meter in the Earth's surface gravity requires exerting approximately 9800 Newtons of force over that distance, or approximately 9800 Newton-meters of work altogether.  A Joule of energy is the ability to do one Newton-meter of work, so lifting one ton one meter requires approximately 9800 Joules of energy, plus whatever is lost to inefficiency.
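As a trivial sketch of that calculation (the function name is mine, and 9.8 meters per second squared is the usual round figure for surface gravity):

```python
G_SURFACE = 9.8  # m/s^2, approximate gravitational acceleration at the surface

def lifting_work_joules(mass_kg, height_m, g=G_SURFACE):
    """Work = force times distance, with force = mass times g."""
    force_newtons = mass_kg * g
    return force_newtons * height_m

one_ton_one_meter = lifting_work_joules(1000.0, 1.0)
print(f"lifting a metric ton one meter: about {one_ton_one_meter:.0f} Joules")
```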

As always, there's quite a bit more going on if you start looking closely.  For one thing, the modern physical concept of energy is more subtle than the common definition, and for another energy "lost" to inefficiency is only "lost" in the sense that it's now in a form (heat) that can't directly do useful work.  I'll get into some, but by no means all, of that detail later in this post and probably in others as well.

I'm not going to try to give an exact history of thermodynamics or calorimetry here, but I do want to call out a few key developments in those fields.  My main aim is to trace the evolution of energy as a concept from a concrete, pragmatic working definition born out of the study of steam engines to the highly abstract concept that underpins the current theoretical understanding of the physical world.



The concept of energy as we know it dates to somewhere around the turn of the 19th century, that is, the late 1700s and early 1800s.   At that point practical steam engines had been around for several decades, though they only really took off when Watt's engine came along in 1781.  Around the same time a number of key experiments were done, heat was recognized as a form of energy and a full theory of heat, work and the relationship between the two was formulated.

What makes things hot?  This is one of those "why is the sky blue?" questions that quickly leads into deep questions that take decades to answer properly.  The short answer, of course, is "heat", but what exactly is that?  A perfectly natural answer, and one of the first to be formalized into something like what we would call a theory, is that heat is some sort of substance, albeit not one that we can see, or weigh, or any of a number of other things one might expect to do with a substance.

This straightforward answer makes sense at first blush.  If you set a cup of hot tea on a table, the tea will get cooler and the spot where it's sitting on the table will get warmer.  The air around the cup also gets warmer, though maybe not so obviously.  It's completely reasonable to say that heat is flowing from the hot teacup to its surroundings, and to this day "heat flow" is still an academic subject.

With a little more thought it seems reasonable to say that heat is somehow trapped in, say, a stick of wood, and that burning the wood releases that heat, or that the Sun is a vast reservoir of heat, some of which is flowing toward us, or any of a number of quite reasonable statements about heat considered as a substance.  This notional substance came to be known as caloric, from the Latin for heat.

As so often happens, though, this perfectly natural idea gets harder and harder to defend as you look more closely.  For example, if you carefully weigh a substance before and after burning it, as Lavoisier did in 1772, you'll find that it's actually heavier after burning.  If burning something releases the caloric in it, then does that mean that caloric has negative weight?  Or perhaps it's actually absorbing cold, and that's the real substance?

On the other hand, you can apparently create as much caloric as you want without changing the weight of anything.  In 1797 Benjamin Thompson, Count Rumford, immersed an unbored cannon in water, bored it with a specially dulled borer and observed that the water was boiling hot after about two and a half hours.  The metal bored from the cannon was not observably different from the remaining metal of the cannon, the total weight of the two together was the same as the original weight of the unbored cannon, and you could keep generating heat as long as you liked.  None of this could be easily explained in terms of heat as a substance.

Quite a while later, in the 1840s, James Joule made precise measurements of how much heat was generated by a falling weight powering a stirrer in a vat of water.  Joule determined that heating a pound of water one degree Fahrenheit requires 778.24 foot-pounds of work (e.g., letting a 778.24 pound weight fall one foot, or a 77.824 pound weight fall ten feet, etc.).  Ludwig Colding did similar research, and both Joule and Julius Robert von Mayer published the idea that heat and work can each be converted to the other.  This is not just significant theoretically.  Getting five digits of precision out of stirring water with a falling weight is pretty impressive in its own right.
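To make the arithmetic concrete, here's a small sketch in Python.  The function name is just something I made up for illustration, and the foot-pound-to-joule conversion factor is a standard modern value that I'm supplying, not something taken from Joule's work.

```python
# Standard conversion factor (an assumption supplied here, not from Joule):
# one foot-pound of work is about 1.3558 joules.
JOULES_PER_FOOT_POUND = 1.3558179483


def work_foot_pounds(weight_pounds, drop_feet):
    """Work done by a weight falling a given distance, in foot-pounds."""
    return weight_pounds * drop_feet


# The two examples from the text: same work, different weight and drop.
heavy_short = work_foot_pounds(778.24, 1.0)
light_long = work_foot_pounds(77.824, 10.0)

# The same amount of work in modern units.
work_joules = heavy_short * JOULES_PER_FOOT_POUND

print(heavy_short, light_long, work_joules)
```

Both cases come out to the same amount of work (give or take floating-point rounding), which in modern units is roughly 1055 joules.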

At this point we're well into the development of thermodynamics, which Lord Kelvin eventually defined in 1854 as "the subject of the relation of heat to forces acting between contiguous parts of bodies, and the relation of heat to electrical agency."  This is a fairly broad definition, and the specific mention of electricity is interesting, but a significant portion of thermodynamics and its development as a discipline centers around the behavior of gases, particularly steam.


In 1662, Robert Boyle published his finding that the volume of a gas, say, air in a piston, is inversely proportional to the pressure exerted on it.  It's not news, and wasn't at the time, that a gas takes up less space if you put it under pressure.  Not having a fixed volume is a defining property of a gas.  However, "inversely proportional" says more.   It says if you double the pressure on a gas, its volume shrinks by half, and so forth.  Another way to put this is that pressure multiplied by volume remains constant.

In the 1780s, Jacques Charles formulated (but didn't publish) the idea that the volume of a gas was proportional to its temperature.  In 1801 and 1802, John Dalton and Joseph Louis Gay-Lussac published experimental results showing the same effect.  There was one catch: you had to measure temperature on the right scale.  A gas at 100 degrees Fahrenheit doesn't have twice the volume of a gas at 50 degrees, nor does it if you measure in Celsius.

However, if you plot volume vs. temperature on either scale you get a straight line, and if you put the zero point of your temperature scale where that line would show zero volume -- absolute zero -- then the proportionality holds.  Absolute zero is quite cold, as one might expect.  It's around 273 degrees below zero Celsius (about 460 degrees below zero Fahrenheit).  It's also unattainable, though recent experiments in condensed matter physics have come quite close.
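Here's a rough sketch of that extrapolation in Python.  The volumes are idealized numbers I picked to behave the way an ideal gas would, not real experimental data, and the function name is my own invention.

```python
def zero_volume_temperature(t1, v1, t2, v2):
    """Extend the straight line through (t1, v1) and (t2, v2) down to V = 0.

    Returns the temperature, on whatever scale t1 and t2 use, at which
    the line says the volume would reach zero.
    """
    slope = (v2 - v1) / (t2 - t1)
    return t1 - v1 / slope


# Idealized (made-up) readings for one sample of gas at constant pressure:
# 1.000 liters at the freezing point of water, 1.366 liters at boiling.
celsius_zero = zero_volume_temperature(0.0, 1.000, 100.0, 1.366)
fahrenheit_zero = zero_volume_temperature(32.0, 1.000, 212.0, 1.366)

print(celsius_zero, fahrenheit_zero)
```

The same two readings, labeled in Celsius or in Fahrenheit, extrapolate to about -273 and about -460 respectively.  The line is the same either way.  Only the labels on the temperature axis differ.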

Put those together and you get the combined gas law: Pressure times volume is proportional to temperature.

In 1811 Amedeo Avogadro hypothesized that two samples of gas, even of different gases, at the same temperature, pressure and volume contain the same number of molecules.  This came to be known as Avogadro's Law.  The number of molecules in a typical system is quite large.  It is usually expressed in terms of Avogadro's Number, approximately six followed by twenty-three zeroes, one of the larger numbers that one sees in regular use in science.

Put that together with the combined gas law and you have the ideal gas law:
PV = nRT
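P is the pressure, V is the volume, n is the amount of gas (counted in moles, that is, the number of molecules divided by Avogadro's Number), T is the temperature measured from absolute zero and R is the gas constant that makes the numbers and units come out right.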

The really important abstraction here is state.  If you know the parameters in the ideal gas law -- the pressure, volume, temperature and how many gas particles there are -- then you know its state.  This is all you need to know, and all you can know, about that gas as far as thermodynamics is concerned.  Since the number of gas particles doesn't change in a closed system like a steam engine (or at least an idealized one), you only need to know pressure, volume and temperature to know the state.

Since the ideal gas law relates those, you really only need to keep track of two of the three.  If you measure pressure, volume and temperature once to start with, and you observe that the volume remains constant while the pressure increases by 10%, you know that the temperature (measured from absolute zero) must be 10% higher than it was.  If the volume had increased by 20% but the pressure had dropped by 10%, the temperature must now be 8% higher (1.2 * 0.9 = 1.08).  And so forth.
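As a quick sanity check, here's the same bookkeeping as a tiny Python sketch.  It assumes a fixed amount of ideal gas and temperatures measured from absolute zero, and the function name is just for illustration.

```python
def temperature_ratio(pressure_ratio, volume_ratio):
    """New temperature divided by old temperature for a fixed amount of gas.

    With n and R fixed, PV = nRT means T is proportional to P * V, so
    T2/T1 = (P2/P1) * (V2/V1).  Temperatures are on an absolute scale.
    """
    return pressure_ratio * volume_ratio


# Volume unchanged, pressure up 10%.
same_volume = temperature_ratio(1.10, 1.00)

# Volume up 20%, pressure down 10%.
bigger_volume = temperature_ratio(0.90, 1.20)

print(same_volume, bigger_volume)
```

The two results are 1.1 and 1.08 (up to floating-point rounding), matching the 10% and 8% increases above.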

You don't have to track pressure and volume particularly, or even two of {pressure, volume, temperature}.  There are other measures that will do just as well (we'll get to one of the important ones in a later post), but no matter how you define your measures you'll need two of them to account for the thermodynamic state of a gas, and (as long as they aren't essentially the same measure in disguise) two will be enough.  Technically, there are two degrees of freedom.

This means that you can trace the thermodynamic changes in a gas on a two-dimensional diagram called a phase diagram.  Let's pause for a second to take that in.  If you're studying a steam engine (or in general, a heat engine) that converts heat into work (or vice versa) you can reduce all the movements of all the machinery, all the heating and cooling, down to a path on a piece of paper.  That's a really significant simplification.


In theory, the steam in a steam engine (or generally the working fluid in a heat engine), will follow a cycle over and over, always returning to the same point in the phase diagram (that is, the same state).    In practice, the cycle won't repeat exactly, but it will still follow a path through the phase diagram that repeats the same basic pattern over and over, with minor variations.

The heat source heats the steam and the steam expands.  Expanding means exerting force against the walls of whatever container it's in, say the surface of a piston.  That is, it means doing work.  The steam is then cooled, removing heat from it, and the steam returns to its original pressure, volume and temperature.  At that point, from a thermodynamic point of view, that's all we know about the steam.  We can't know, just from taking measurements on the steam, how many times it's been heated and cooled, or anything else about its history or future.  All you know is its current thermodynamic state.

As the steam contracts back to its original volume, its surroundings are doing work on it.  The trick is to manipulate the pressure, temperature and volume in such a way that the pressure, and thus the force, is lower on the return stroke than the power stroke, and the steam does more work expanding than is done on it contracting.  Doing so, it turns out, will involve putting more heat into the heating than comes out in the cooling.  Heat goes in, work comes out.
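To put some numbers on this, here's a deliberately oversimplified sketch in Python.  It assumes the pressure stays constant during each stroke, which no real engine manages, and the pressures and volumes are made up for illustration.

```python
def constant_pressure_work(pressure_pa, v_start_m3, v_end_m3):
    """Work done BY the gas, in joules, while its volume changes at constant
    pressure.  Positive when the gas expands, negative when it is compressed.
    """
    return pressure_pa * (v_end_m3 - v_start_m3)


# Made-up values: pressures in pascals, volumes in cubic meters.
POWER_STROKE_PRESSURE = 200_000.0
RETURN_STROKE_PRESSURE = 100_000.0
SMALL_VOLUME = 0.001
LARGE_VOLUME = 0.003

# Power stroke: the gas expands against the piston at high pressure.
work_out = constant_pressure_work(POWER_STROKE_PRESSURE, SMALL_VOLUME, LARGE_VOLUME)

# Return stroke: the surroundings push the gas back at low pressure.
work_back = constant_pressure_work(RETURN_STROKE_PRESSURE, LARGE_VOLUME, SMALL_VOLUME)

net_work = work_out + work_back
print(work_out, work_back, net_work)
```

With these numbers the gas does about 400 joules of work expanding and has about 200 joules done on it contracting, for a net of about 200 joules per cycle.  If the two pressures were equal, the net would be zero.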


This leads us to one of the most important principles in science.  If you carefully measure what happens in real heat engines, and the ways you can trace through a path in a phase diagram, you find that you can convert heat to work, and work to heat, and that you will always lose some waste heat to the surroundings, but when you add it all up (in suitable units and paying careful attention to the signs of the quantities involved), the total amount of heat transferred and work done never changes.  If you put in 100 joules of heat, you won't get more than 100 joules of work out.  In fact, you'll get less.  The difference will be wasted heating the surroundings.
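The accounting can be sketched in a few lines of Python.  The figure of 30 units of work is made up, and the function is just my own illustration of the bookkeeping, not a model of any real engine.

```python
def waste_heat(heat_in, work_out):
    """Heat dumped into the surroundings, in the same units as the inputs.

    Energy is conserved, so whatever heat goes in and doesn't come out
    as work has to come out as waste heat.
    """
    if work_out > heat_in:
        raise ValueError("can't get more work out than heat put in")
    return heat_in - work_out


# Made-up example: 100 units of heat in, 30 units of work out.
heat_in = 100.0
work_out = 30.0
wasted = waste_heat(heat_in, work_out)

# Fraction of the heat that actually became work.
efficiency = work_out / heat_in

print(wasted, efficiency)
```

Here 70 units end up as waste heat, and the work and the waste heat together add back up to the 100 units that went in.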

This is significant enough when it comes to heat engines, but that's only the beginning.  Heat isn't the only thing you can convert into work and vice versa.  You can use electricity to move things, and moving things to make electricity.  Chemical reactions can produce or absorb heat or produce electrical currents, or be driven by them.   You can spin up a rotating flywheel and then, say, connect it to a generator, or to a winch.

Many fascinating experiments were done, and the picture became clearer and clearer: Heat, electricity, the motion of an object, the potential for a chemical reaction, the stretch in a spring, the potential of a mass raised to a given height, among other quantities, can all be converted to each other, and if you measure carefully, you always find the total amount to be the same.

This leads to the conclusion that all of these are different forms of the same thing -- energy -- and that this thing is conserved, that is, never created or destroyed, only converted to different forms.


As far-reaching and powerful as this concept is, there were two other important shifts to come.  One was to take conservation of energy not as an empirical result of measuring the behavior of steam engines and chemical reactions, but as a fundamental law of the universe itself, something that could be used to evaluate new theories that had no direct connection to thermodynamics.

If you have a theory of how galaxies form over millions of years, or how electrons behave in an atom, and it predicts that energy isn't conserved, you're probably not going to get far.  That doesn't mean that all the cool scientist kids will point and laugh (though a certain amount of that has been known to happen).  It means that sooner or later your theory will hit a snag you hadn't thought of and the numbers won't match up with reality*.  When this happens over and over and over, people start talking about fundamental laws.


The second major shift in the understanding of energy came with the quantum theory, that great upsetter of scientific apple carts everywhere.  At a macroscopic scale, energy still behaves something like a substance, like the caloric originally used to explain heat transfer.  In Count Rumford's cannon-boring experiment, mechanical energy is being converted into heat energy.  Heat itself is not a substance, but one could imagine that energy is, just one that can change forms and lacks many of the qualities -- color, mass, shape, and so forth -- that one often associates with a substance.

In the quantum view, though, saying that energy is conserved doesn't assume some substance or pseudo-substance that's never created or destroyed.  Saying that energy is conserved is saying that the laws describing the universe are time-symmetric, meaning that they behave the same at all times.  This is a consequence of Noether's theorem (after the mathematician Emmy Noether), one of the deepest results in mathematical physics, which relates conservation in general to symmetries in the laws describing a system.  Time symmetry implies conservation of energy.  Directional symmetry -- the laws work the same no matter which way you point your x, y and z axes -- implies conservation of angular momentum.

Both of these are very abstract.  In the quantum world you can't really speak of a particle rotating on an axis, yet you can measure something that behaves like angular momentum, and which is conserved just like the momentum of spinning things is in the macroscopic world.  Just the same, energy in the quantum world has more to do with the rates at which the mathematical functions describing particles vary over space and time, but because of how the laws are structured it's conserved and, once you follow through all the implications, energy as we experience it on our scale is as well.

This is all a long way from electricity bills and the engines that drove the industrial revolution, but the connections are all there.  Putting them together is one of the great stories in human thought.

* I suppose I can't avoid at least mentioning virtual particles here.  From an informal description, of particles being created and destroyed spontaneously, it would seem that they violate conservation of energy (considering matter as a form of energy).  They don't, though.  Exactly why they don't is beyond my depth and touches on deeper questions of just how one should interpret quantum physics, but one informal way of putting it is that virtual particles are never around for long enough to be detectable.  Heisenberg uncertainty is often mentioned as well.