Sunday, December 30, 2018

Computer chess: Dumber and smarter at the same time?

[As usual, I've added inline comments for non-trivial corrections to the original text, but this one has way more than most.  I've come up to speed a bit on the current state of computer chess, so that's reflected here.  The result is not the most readable narrative, but I'm not going to try to rewrite it --D.H. Apr 2019]

One of the long-running tags on the other blog is "dumb is smarter".  The idea is that, at least in the world of software, it's perilous to build a lot of specialized knowledge into a system.  Specialized knowledge means specialized blind spots -- any biases or incorrect assumptions in the specialization carry over to the behavior of the system itself.  In cases like spam filtering, for example, a skilled adversary can exploit these assumptions to beat the system.

I will probably follow up on the other blog at some point on how valid the examples of this that I cited really were, and to what extent recent developments in machine learning have changed the picture (spoiler alert: at least some).  Here I'd like to focus on one particular area: chess.  Since this post mentions Deep Mind, part of the Alphabet family, I should probably be clear that what follows is taken from public sources I've read.


For quite a while and up until recently, the most successful computer chess programs have relied mainly on brute force, bashing out millions and millions of possible continuations from a given position and evaluating them according to fairly simple rules [in actual tournament games, it's not unusual to see over a billion nodes evaluated for a single move].  There is a lot of sophistication in the programming, in areas like making good use of multiple processors, avoiding duplicate work, representing positions in ways that make good use of memory, generating possible moves efficiently and deciding how to allocate limited (though large) processing resources to a search tree that literally grows exponentially with depth, but from the point of view of a human chess player, there's nothing particularly deep going on, just lots and lots of calculation.

For a long time, a good human player could outplay a good computer program by avoiding tactical blunders and playing for positional advantages that could eventually be turned into a winning position that even perfect tactical play couldn't save.  If the human lost, it would generally be by missing a tactical trick that the computer was able to find by pure calculation.  In any case, the human was looking at no more than dozens of possible continuations and, for the most part, calculating a few moves ahead, while the computer was looking at vastly more positions and exploring many more of them in much greater depth than a person typically would.

The sort of positional play that could beat a computer -- having a feel for pawn structure, initiative and such -- comes reasonably naturally to people, but it's not easily expressed in terms of an evaluation function.  The evaluation function is a key component of a computer chess engine that reduces a position to a number that determines whether one position should get more attention than another.  Since there are millions of positions to evaluate, the evaluation function has to be fast.  A typical evaluation function incorporates a variety of rules of the kind you'll find in beginning chess books -- who has more material, who has more threats against the other's pieces, whose pieces have more squares (particularly central squares) to move to, are any pieces pinned or hanging, and so forth.
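To give a flavor of what "adding up numbers" looks like, here's a minimal sketch of a hand-written evaluation function (illustrative Python only -- the piece values and the little center bonus are the kind of thing a beginners' book suggests, and the board representation is made up for clarity; real engines are far more elaborate and heavily tuned):

```python
# Illustrative only: a toy evaluation function in the spirit of a
# classical chess engine.  Real engines use many more terms and tune
# the weights carefully; this just shows the "add up numbers" flavor.

PIECE_VALUES = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 0}
CENTER_SQUARES = {'d4', 'e4', 'd5', 'e5'}

def evaluate(position):
    """Return a score in centipawns; positive favors White.

    `position` is assumed to be a dict mapping squares like 'e4' to
    pieces like 'P' (White pawn) or 'p' (Black pawn) -- a made-up
    representation chosen for readability, not what real engines use.
    """
    score = 0
    for square, piece in position.items():
        value = PIECE_VALUES[piece.upper()]
        # Material: the dominant term in simple evaluations.
        score += value if piece.isupper() else -value
        # A small positional bonus for occupying the center.
        if square in CENTER_SQUARES:
            score += 10 if piece.isupper() else -10
    return score

# Tiny demo: White pawn on e4, Black pawn on d5, White knight on g1.
print(evaluate({'e4': 'P', 'd5': 'p', 'g1': 'N'}))  # 320
```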

There's quite a bit of tuning to be done here -- is it more important to have a potential threat against the opponent's queen or to control two squares in the center? -- but once the tuning parameters are set, they're set, at least until the next release.  The computer isn't making a subtle judgment based on experience.  It's adding up numbers based on positions and alignments of pieces.

It's not that no one thought of writing a more subtle evaluation function, one that would allow the program to look at fewer, but better positions.  It's just that it never seemed to work as well.  Put a program that looks at basic factors but looks at scads and scads of positions against one that tries to distill the experience of human masters and only looks at a few good moves, and the brute force approach has typically won.  The prime example here would be Stockfish, but I'm counting engines such as earlier versions of Komodo as brute force since, as I understand it, they use the same alpha/beta search technique and examine large numbers of positions.  I'm having trouble finding exact numbers on that, though.  [If you look at the stats on chess.com's computer chess tournaments, it's very easy to tell who's who.  For a 15|5 time control, for example, you either see on the order of a billion nodes evaluated per move or on the order of a hundred thousand.]
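For reference, the alpha/beta search these engines are built around is a refinement of minimax that skips branches which provably can't affect the final choice.  Here's a bare-bones sketch over a toy game tree rather than actual chess positions; everything a real engine adds (move ordering, transposition tables, quiescence search, parallel search) is about making this basic loop examine fewer positions, faster:

```python
# Minimax with alpha/beta pruning over a tiny hand-made game tree.
# Each node is either a list of child nodes or a number (a leaf score
# from the root player's point of view).

TOY_TREE = [
    [[3, 5], [6, 9]],
    [[1, 2], [0, -1]],
]

def alphabeta(node, maximizing=True, alpha=float('-inf'), beta=float('inf')):
    """The root player maximizes the score, the opponent minimizes it."""
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        best = float('-inf')
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line: prune
        return best
    else:
        best = float('inf')
        for child in node:
            best = min(best, alphabeta(child, True, alpha, beta))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

print(alphabeta(TOY_TREE))  # best score the first player can force: 5
```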

[Houdini is another interesting example.  Its evaluation function puts more weight on "positional" factors such as active pieces and space, and it tries to be highly selective in which moves it devotes the most resources to.  This is, explicitly, trying to emulate the behavior of human players.  So it's not quite correct to say that programs that try to emulate humans have done worse than ones that just bash out continuations.  Houdini has done quite well.

However, from a software point of view, these programs are all largely similar.  There is an evaluation function that's explicitly coded as rules you can read, and this is used to decide how much attention to pay to what part of the search tree.

AlphaZero, by contrast, uses machine learning (aka neural network training, aka deep neural networks) to build an evaluation function that's totally opaque when considered purely as code.  There are techniques to examine what a neural network is doing, but they're not going to reduce to rules like "a bishop is worth three times as much as a pawn".  It also uses a Monte Carlo approach to examine the search tree probabilistically, which is a somewhat different way to use the evaluation function to guide the search.  As I understand it, this is not the usual practice for other engines, though it's certainly possible to incorporate a random element into a conventional chess engine.  Komodo MC comes to mind.

In short, the narrative of "purely mechanical programs beat ones that try to emulate humans" is not really right, but AlphaZero still represents a radically different approach, and one that is in some sense structurally more similar to what things-with-brains do.  --D.H. Feb 2019]

This situation held for years.  Computer hardware got faster, chess programs got more finely tuned and their ratings improved, but human grandmasters could still beat them by exploiting their lack of strategic understanding.  Or at least that was the typical explanation:  Humans had a certain je ne sais quoi that computers could clearly never capture, not that the most successful ones were particularly trying to.  Occasionally you'd hear a stronger version: computers could never beat humans since they were programmed by humans and could never know more than their programmers, or some such, but clearly there's a hole in that logic somewhere ...


Then, in 1997, Deep Blue (an IBM project not directly related to Deep Mind) beat world champion Garry Kasparov in a rematch, Kasparov having won the first match between the two.  It wasn't just that Deep Blue won the games.  It outplayed Kasparov positionally just like human masters had been outplaying computers, but without employing anything you could point at and call strategic understanding.  It just looked at lots and lots of positions in a fairly straightforward way.

This isn't as totally surprising as it might seem.  The ultimate goal in chess is to checkmate the opponent.  In practice, that usually requires winning a material advantage, opening up lines of attack, promoting a pawn in the endgame or some other tactical consideration.  Getting a positional advantage is just setting things up to make that more likely.  Playing a simple tactical game but looking twenty plies (half-moves) ahead turns out to be quite a bit like playing a strategic game. [In fact, computer-computer chess games are almost entirely positional, since neither player is going to fall for a simple tactical trap.  That's not to say tactics don't matter.  For example I've seen any number of positions where a piece looked to be hanging, but couldn't be taken due to some tactical consideration.  What I haven't seen is a game won quickly by way of tactics.]


Human-computer matches aren't that common.  At first, the contests were too one-sided.  There was no challenge in a human grandmaster playing a computer.  Then, as computers became stronger, few grandmasters wanted to risk being the first to lose to a computer (and Kasparov is to be commended for taking up the challenge anyway).  Now, once again, the contests are too one-sided.  Human players use computers for practice and analysis, but everyone loses to them head-to-head, even with odds given (the computer plays without some combination of pieces and pawns).

At this writing, current world champion Magnus Carlsen, by several measures the strongest player ever or at the very least in the top handful, stands less than a 2% chance of beating the current strongest computer program head to head.  With the question of "will computers ever beat the best human players at chess?" firmly settled, human players can now be compared on the basis of how often they play what the computer would play.  Carlsen finds the computer move over 90% of the time, but it doesn't take many misses to lose a game against such a strong player.


And now comes AlphaZero.

The "alpha" is carried over from its predecessor, AlphaGo which, by studying games between human players of the game of go and using machine learning to construct a neural network for evaluating positions, was able to beat human champion Lee Sedol.  This was particularly significant because a typical position in go has many more possible continuations than a typical chess position, making the "bash out millions of continuations" approach impractical.  Given that computer chess had only fairly recently reached the level of top human players and go programs hadn't been particularly close, it had seemed like a good bet that humans would continue to dominate go for quite a while yet.

AlphaGo, and machine learning approaches in general, use what could be regarded as a much more human approach, not surprising since they're based on an abstraction of animal nervous systems and brains rather than the classic Turing/Von Neumann model of computing, notwithstanding that they're ultimately still using the same computing model as everyone else.  That is, they run on ordinary computer hardware, though often with a specialized "tensor processor" for handling the math.

However, the algorithm for evaluating positions is completely different.  There's still an evaluation function, but it's "run the positions of the pieces through this assemblage of weighted sums and nonlinear functions that the training stage churned out".  Unlike a conventional chess engine, there's nothing you can really point at and say "this is how it knows about pins" or "this is how it counts up material".

AlphaGo looks at many fewer positions [by a factor of around 10,000] than a conventional chess engine would, and it looks at them probabilistically, that is, it uses its evaluation function to decide how likely it is that a particular continuation is worth looking at, then randomly chooses the actual positions to look at based on that.  It's still looking at many more positions than a human would, but many fewer than a pure brute-force approach would.
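As a rough illustration of "decide probabilistically what to look at" -- and only that; the real search is a Monte Carlo tree search with visit counts, a value estimate and a learned policy, which this little sketch doesn't attempt to reproduce -- the selection step is in the spirit of:

```python
import random

# A rough illustration of probabilistic move selection.  `prior` stands
# in for a trained network's estimate of how promising each move is;
# nothing here comes from any actual engine.

def pick_continuation(legal_moves, prior):
    """Randomly pick a move to explore, weighted by the network's priors."""
    weights = [prior(move) for move in legal_moves]
    # Moves the network likes get explored often; unlikely-looking moves
    # still get the occasional look, which is one place "surprising"
    # play can come from.
    return random.choices(legal_moves, weights=weights, k=1)[0]

# Toy demo: three candidate moves with made-up priors.
moves = ['Nf3', 'd4', 'h4']
toy_prior = {'Nf3': 0.55, 'd4': 0.40, 'h4': 0.05}.get
print(pick_continuation(moves, toy_prior))
```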


The "zero" is because the training stage doesn't use any outside knowledge of the game.  Unlike AlphaGo, it doesn't look at games played by humans.  It plays against itself and accumulates knowledge of what works and what doesn't.  Very roughly speaking, AlphaZero in its training stage plays different versions of its network against each other, adjusts the parameters based on what did well and what did badly, and repeats until the results stabilize.

AlphaZero does this knowing only the rules of the game, that is, what a position looks like, what moves are possible, and how to tell if the game is won, lost, drawn (tied) or not over yet.  This approach can be applied to a wide range of games, so far go, chess and shogi (a chess-like game which originated in Japan).  In all three cases AlphaZero achieved results clearly better than the (previous) best computer players after a modest number of hours of training (though the Stockfish team makes a good case that AlphaZero had a hardware advantage and wasn't playing against the strongest configuration).  [Recent results indicate that LC0, the strongest neural net based engine, and Stockfish, the strongest conventional engine, are very evenly matched, but LC0 doesn't have the benefit of a tensor processor to speed up its evaluation --D.H. May 2019]

Notably, AlphaZero beat AlphaGo 60 games to 40.


In one sense, AlphaZero is an outstanding example of Dumb is Smarter, particularly in beating AlphaGo, which used nearly the same approach, but trained from human games.  AlphaZero's style of play has been widely described as unique.  Its go version has found opening ideas that had lain undiscovered for centuries.  Its chess version plays sacrifices (moves that give up material in hopes of a winning attack) that conventional chess engines pass up because they can't prove that they're sound.  Being unbiased by exposure to human games or a human-developed evaluation function, AlphaZero can find moves that other programs would never play, and it turns out these moves are often good enough to win, even against chess engines that never make tactical mistakes.

On the other hand, AlphaZero specifically avoids sheer brute force.  Rather than look at lots and lots of positions using a relatively simple evaluation function, it looks at many fewer, using a much more sophisticated evaluation function to sharply limit the number of positions it examines.  This is the same approach that had been tried in the past with limited success, but with two key differences:  The evaluation function is expressed as a neural network rather than a set of explicit rules, and that neural network is trained without any human input, based solely on what works in practice games against AlphaZero itself.


The Dumb is Smarter tag on Field Notes takes "dumb" to mean "no special sauce" and "smarter" to mean "gets better results".  The "smarter" part is clear.  The "dumb" part is more interesting.  There's clearly no special sauce in the training stage.  AlphaZero uses a standard machine learning approach to produce a standard neural network.

On the other hand, if you consider the program itself without knowing anything about the training stage, you have a generic engine, albeit one with a somewhat unusual randomized search algorithm, and an evaluation function that no one understands in detail.  It's all special sauce -- a set of opaque, magical parameters that somehow give the search algorithm the ability to find the right set of variations to explore.

I think it's this opaqueness that gives neural networks their particularly uncanny flavor (uncanny, etymologically, roughly means "unknowable").  The basic approach of taking some inputs and crunching some numbers on them to get an output is unequivocally dumb.  As I said above, "It's adding up numbers based on positions and alignments of pieces."  Except that those numbers are enabling it to literally make "a subtle judgment based on experience", a judgment we have no real choice but to call "smart".


Progress marches on.  At least one of the previous generation of chess engines (Komodo) has incorporated ideas from AlphaZero [Leela has open-sourced the neural network approach wholesale].  It looks like the resulting hybrid isn't dominant, at least not yet, but it does play very strong chess, and plays in a completely different, more human, style from the conventional version.  That's interesting all by itself.

Saturday, October 6, 2018

Debugging servers and bodies

For quite a while I've wanted to write up some thoughts about the nature of cause and effect.  In actually trying to do so, I realized two things: First, I don't know that much about the main streams of thought on this, which go back millennia, and second, that this was material for several posts.  Rather than step back and formulate an overarching structure for a series, or deep dive into the philosophical literature, I decided to just start somewhere and call that Part I, with the intention of coming back to the subject, probably with unrelated posts in between, from time to time.  And then I realized that there was no need to call it Part anything.  I could just introduce a new tag, cause and effect, and apply it where necessary.  So here we go ...


I press keys on the keyboard and words appear on the screen.  It seems pretty clear that one is the cause of the other.  I take a cough suppressant and my cough subsides.  Quite likely it subsided because of the medicine, but maybe I was getting better anyway.  I run an ad online and sales go up.  But they've been going up for months.  Did they go up more because of the ad?  (half of your advertising budget is wasted; the trick is to know which half).  I wear a lucky shirt and my team wins.  I may want to think the two are related, but realistically it's just a bit of fun.

I spend a lot of my time debugging, trying to figure out why something isn't working the way it was expected to, whether it's why the TV isn't working at home or why a server crashed at work.  If I'm trying to figure out what's wrong with something electronic at home, I generally just turn things off and on until everything's back in a sane state.  That's usually fine at home, but not such a good idea at work.  Even if restarting the server fixes the problem, you still want to know why, so next time you won't have to restart it.

How do you tell if one thing caused another?  Debugging is much older than computing.  For example, the practice of debugging human health, that is, medicine, developed a useful paradigm in 1884, known as Koch's postulates after Robert Koch, for determining whether a particular microorganism causes a particular disease:
  1. The microorganism must be found in abundance in all organisms suffering from the disease, but should not be found in healthy organisms.
  2. The microorganism must be isolated from a diseased organism and grown in pure culture.
  3. The cultured microorganism should cause disease when introduced into a healthy organism.
  4. The microorganism must be reisolated from the inoculated, diseased experimental host and identified as being identical to the original specific causative agent.
Koch actually abandoned the "should not be found" part of postulate 1 after discovering that some people could carry a disease without showing symptoms.

Abstracting a little bit with an eye toward applying these postulates to an ailing server, one might say
  1. The conditions you think are causing the problem should be present in affected servers but ideally not in healthy ones.  For example, the unhealthy servers are all deployed in sector A and the healthy ones aren't or, more commonly, the unhealthy ones are running one version of the code while the healthy ones are running the previous version.
  2. This is a bit tricky, but I think it boils down to: There has to be a well-defined way to induce the conditions you think are causing the problem.  For example, you can start a new instance of a server in sector A or update a healthy server to the new version you think is buggy.  It isn't always immediately obvious how to do this.  For example, if you think that the problem is some sort of bad request coming from random parts of the internet, you'll probably have to search through logs to find requests that correspond with problems.
  3. If you trigger the conditions, the problem occurs.
  4. Again doesn't apply directly, but in this context I think it means double-checking for evidence that the conditions you thought were triggered really were triggered.  When you brought up the test server and it fell over, was it actually in sector A?  When you sent the query-of-death you found in the logs, did the server that fell over actually log receiving it just before it fell over?
The kind of double-checking in postulate 4 is crucial in real debugging.  It's very common, for example, to think you restarted a server with a new setting that should cause or fix a given problem, only to find that you restarted a different instance, or accidentally restarted it with the old configuration instead of the new.  For example, as I was writing this paragraph I realized that the command I thought would send a problem request to an unwell server I'd been debugging had actually failed, explaining why I saw no evidence of the request having been handled.

There's also a distinction, in both medicine and software, between fixing the problem at hand -- curing the patient or getting the service back online -- and pinning down exactly what happened.  In my business a common course of action is to roll the production servers back to the last configuration that was known to work, then use a test setup to try to reproduce the problem without impacting the production system.  The ultimate goal is a "red test" that fails with the buggy code and then passes ("goes green") with the fix.
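As a concrete, entirely made-up illustration of the red-then-green workflow, using Python's built-in unittest: the stand-in parse_header below still has the bug, so the test is "red" as written, and it would go green once the fix (say, splitting on the colon and defaulting to an empty value) lands.

```python
import unittest

# An entirely made-up illustration of a "red test": it encodes the
# observed failure so that it fails against the buggy code and goes
# green once the fix lands.  `parse_header` is a stand-in for whatever
# code actually broke, not anything from a real service.

def parse_header(line):
    # Buggy version: assumes every header looks like "Name: value" and
    # blows up on a header with an empty value, like "X-Trace-Id:".
    name, value = line.split(': ')
    return name, value

class RegressionTest(unittest.TestCase):
    def test_empty_header_value_does_not_crash(self):
        # Red today: the buggy parse_header above raises on this input.
        # Green after the fix: it should return an empty value instead.
        self.assertEqual(parse_header('X-Trace-Id:'), ('X-Trace-Id', ''))

if __name__ == '__main__':
    unittest.main()
```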

In medicine, as I understand it, the work of isolating causes and developing vaccines and drugs similarly goes on in a laboratory environment until everyone is quite certain that the proposed treatment will be safe, and hopefully effective, in real patients.  In the mean time, doctors mostly do their best with known treatments.

While Koch's postulates are fairly famous, the kind of thing you remember from high school biology years later, they're not actually what modern medicine goes by, just like modern economists don't consult The Wealth of Nations, influential though it was.  One more modern approach can be found in Hill's criteria, a set of nine criteria for determining if a given cause is responsible for a given effect, but there are many other, more recent paradigms.

Notably, Hill's criteria and its modern cousins are not nearly so crisp as Koch's postulates.  The very name "postulate" suggests that you can obtain a rigorous proof, while "criteria" suggests something more indirect: if you don't meet the criteria, then you don't have cause and effect.  The criteria themselves are of the form "more of this suggests causality is more likely", and the end result is an idea of the probability that something is causing something else.

As in many other areas, switching from a yes/no answer to a probability solves a lot of problems, particularly the problem of gray areas where there are reasons to say either yes or no.  It does so, however, at the cost of being able to say for certain "X caused Y".  In my world, you very often can say with confidence "The tweaks we made to parsing in change number 271828 caused the server to reject these requests", but in my world we have a high degree of control of the system.  I can roll the server back to just before and after change number 271828 and run it in a test environment where I can control exactly what data it's trying to parse (or just write a "unit test" that exercises the problem code directly without spinning up a server).

In the field of medicine, and much of the scientific world, however, that's generally not the case.  If we're trying to determine whether eating carrots for years causes freckles, we can't really make people eat carrots for years and count their freckles every week for the duration.  Medicine doesn't, and shouldn't, have the same level of control over patients as I have over a server.  That is, there's less control over the possible causes.

There's typically also less certainty about the effects.  Lots of people get freckles whether or not they eat carrots, so you need more subtle statistical techniques to see if there's even a correlation between carrot eating and freckle getting.  This sort of thing is a major reason that it's generally not a good idea to pin too much on any single medical study, even if it's careful with the data and its interpretation.

Nonetheless, medicine advances.  Some of this is because research has pinned down some causes and effects to a good degree of certainty.  There's no doubt that vaccines were effective against smallpox and continue to be effective against other diseases, albeit not always perfectly.  There's no doubt that antibiotics can be effective against bacterial infections, or that some bacteria have evolved defenses against them.  There's no realistic doubt that smoking causes a number of ill effects, up to and including lung cancer and emphysema.

But medicine is useful even in the absence of certainty.  If there's an 80% chance you've got condition X and the treatment is 90% effective with a 5% chance of major side effects, and condition X is 95% fatal if left untreated, you should probably go for the treatment.  If the treatment is 5% effective with a 90% chance of major side effects, and condition X is almost never fatal, you probably don't want to.  You don't need absolute certainty to make a decision like this, or even to know the exact causes and effects involved.
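To make the arithmetic in the first scenario explicit, here's a back-of-the-envelope version, with made-up simplifications: a failed treatment is assumed to be no better or worse than no treatment, and side effects are tracked separately rather than folded into survival.

```python
# Back-of-the-envelope version of the decision above, with made-up
# simplifications noted in the lead-in.

p_condition = 0.80        # chance you actually have condition X
p_fatal_untreated = 0.95  # chance X is fatal if untreated
p_treatment_works = 0.90  # chance the treatment cures X
p_side_effects = 0.05     # chance of major side effects from treatment

survive_untreated = (p_condition * (1 - p_fatal_untreated)
                     + (1 - p_condition))

survive_treated = (p_condition * (p_treatment_works
                                  + (1 - p_treatment_works)
                                  * (1 - p_fatal_untreated))
                   + (1 - p_condition))

print(f"Survival without treatment: {survive_untreated:.0%}")  # ~24%
print(f"Survival with treatment:    {survive_treated:.0%}")    # ~92%
print(f"Chance of major side effects if treated: {p_side_effects:.0%}")
```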

Sunday, September 9, 2018

When is a space elevator not a space elevator?

Canadian space and defense company Thoth Technology has recently announced plans for a "Space Elevator".  The idea is to use pressurized Kevlar to build a tower 15 kilometers high (nearly twice as high as Mount Everest), complete with hotel and observation decks, and to launch and land space planes using a rooftop runway.  It's an ambitious project, to say the least, but I wouldn't call it a space elevator -- thus the they-said-it-I-didn't quotation marks.

The term space elevator usually refers to a structure built around a very long cable with one end attached to a counterweight somewhere beyond about 36,000 kilometers above the surface of the Earth.  The cable remains taut due to centrifugal force*.  Vehicles called climbers move up and down the structure, delivering payloads into space.

We're not even close to being able to build such a thing.  The main reason is that the cable would be under far more tension than any commercially available material can withstand (it's not theoretically impossible, just not practical with current technology).  Assuming we clear that hurdle, actually building the thing would still be a major piece of engineering.

How would a space elevator work in practice?  Since a space elevator cable is pulled taut by its counterweight, and it's rotating in sync with the Earth's rotation, the higher up you go, the faster that bit of the elevator is moving relative to the ground, since the higher you go the more distance you have to cover every 24 hours.  As a climber goes up the elevator, it will feel a slight force pushing it in the direction the cable is moving (technically a Coriolis force).  This is the force of the cable keeping the climber moving with it, ever so slightly faster for each meter you move farther out from Earth.

In order to put something into orbit it's not enough simply to lift it out of the atmosphere.  If you took a payload up a space elevator to an altitude of 200km, a typical distance for a low Earth orbit, and let it go, it would fall back to the ground.  At that height, the payload would start with a forward motion relative to the ground of around 50 km/h -- a typical speed limit for residential streets -- and lose most or all of it to wind resistance on the way down.  To go into orbit, it would need a forward motion of more like 28,000 km/h.  As Randall Munroe says, getting to space is easy.  The problem is staying there.

To go faster you have to go higher, but fortunately you also need less speed to go into orbit as you get further from the Earth -- the Moon orbits at less than 4000 km/h, for example.  By the time a climber got to about 36,000 kilometers, not much less distance than going around the world, it would finally have enough speed (about 11,000 km/h at that point) that a payload would stay in orbit if released.
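The numbers above fall out of two standard formulas: a point rotating with the Earth moves at ω·r, and a circular orbit at radius r requires a speed of √(GM/r).  Here's a small sketch that reproduces them, using standard physical constants and nothing specific to any particular elevator design:

```python
import math

# Tangential speed of a point on the elevator (rotating with the Earth)
# versus the speed needed for a circular orbit at that altitude.  The
# elevator only "launches" payloads at the altitude where the two match.

OMEGA = 2 * math.pi / 86164   # Earth's rotation rate, rad/s (sidereal day)
GM = 3.986e14                 # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6             # Earth's radius, m

def elevator_speed_kmh(altitude_m):
    """Speed of a point on the cable relative to the ground, km/h."""
    return OMEGA * altitude_m * 3.6

def orbital_speed_kmh(altitude_m):
    """Circular orbital speed at this altitude, km/h."""
    return math.sqrt(GM / (R_EARTH + altitude_m)) * 3.6

print(elevator_speed_kmh(200e3))   # ~50 km/h: the "launch" speed at 200 km
print(orbital_speed_kmh(200e3))    # ~28,000 km/h needed for low Earth orbit

geosync_alt = (GM / OMEGA**2) ** (1/3) - R_EARTH
print(geosync_alt / 1e3)           # ~35,800 km: geosynchronous altitude
print(orbital_speed_kmh(geosync_alt))  # ~11,000 km/h: orbital speed there
```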

Orbits at this distance are geosynchronous (geosync for short), meaning the orbiting object is in sync with the Earth's rotation.  If they're also at the equator, which a space elevator would have to be, they're geostationary, meaning they always stay over the same point on the equator.  Otherwise they will appear to move north and south over the course of the day, but stay at roughly the same longitude.

From the point of view of someone on the elevator at geosync, the payload would just appear to stay where it was, at least for a bit.  In real life, it would tend to drift away over time due to factors like the gravity of the Moon and the Earth's not being a perfect sphere.  For basically the same reason, an object on the cable at this point would experience zero g (again neglecting secondary effects).  Points below would experience a pull toward Earth, however slight, and points above would experience a pull away from Earth.

Where does that leave us with Thoth's structure?

A 15km tower is nowhere near high enough to function as a space elevator.  The advantage to launching from there is not the difference in speed between the ground and the top, which would amount to a moderate walking pace, but being above about 90% of the atmosphere.  That's certainly helpful, but not enough to neglect the atmosphere entirely.  To do that, you would need to be somewhere above the Kármán line, somewhat arbitrarily defined as 100km, though the Wikipedia article asserts that 160km is the lowest point at which you can complete an orbit without further propulsion.

In other words, Thoth's structure doesn't even reach into space; for the purposes of launching things into orbit, it's not even a tenth of the way there.

While Thoth's structure might still be useful for launching, thanks to bypassing much of the atmosphere, landing, on the other hand, seems a bit of a stretch.  If you're leaving low Earth orbit, you'll have to get rid of that 28,000 km/h somehow.  You could use rockets, the same as you used to get into orbit, but that means more fuel -- not just twice as much but a bunch more, because the extra fuel is just dead weight on the way up and you'll need more fuel to compensate for that.  And more fuel for that extra fuel, and so forth.
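That "more fuel for the extra fuel" spiral is exactly what the rocket equation describes: the mass ratio grows exponentially with the total velocity change required, so roughly doubling the delta-v squares the ratio.  A quick sketch, assuming a generic 4.4 km/s exhaust velocity in the neighborhood of a good chemical rocket (no particular vehicle intended):

```python
import math

# Tsiolkovsky rocket equation: mass_initial / mass_final = exp(dv / ve).
# Doubling the required velocity change squares the mass ratio, which is
# why carrying the fuel to kill 28,000 km/h on the way back costs far
# more than twice as much fuel.

VE = 4400.0        # generic chemical-rocket exhaust velocity, m/s
DV_ORBIT = 7800.0  # rough delta-v to reach low Earth orbit, m/s

def mass_ratio(delta_v):
    return math.exp(delta_v / VE)

print(mass_ratio(DV_ORBIT))      # ~6: kilograms at liftoff per kilogram in orbit
print(mass_ratio(2 * DV_ORBIT))  # ~35: if you also brake propulsively on return
```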

This is why actual reentry uses the Earth's atmosphere to turn kinetic energy (energy of motion) into heat.  If you're going to do that, you might as well go all the way to the ground, where you can have a nice big runway, safety crews and other amenities if you're a spaceplane, or at least your choice of an ocean or a large expanse of open ground to aim for if you're not.

Of course, if you have an actual space elevator, you can re-attach to the elevator at geosync and let the climber spill your kinetic energy gradually on the way down, neglecting the not-so-small matter of moving from your current orbit back to the elevator, matching speed, not just location.  But that's not what Thoth is proposing.

So the whole "space elevator" thing is marketing hype.  Is there anything left after you account for that?  Well, maybe.  It depends on the numbers.

The Really Tall Tower idea still has some value, I think.  Actually building it would turn up all kinds of interesting issues in building tall structures, from how to build a stable structure about 20 times taller than the current record holder to the logistics of supporting a crew well above the Everest death zone to even just getting materials to a construction site 15km up in the air.  At the end, though, you'd at least have a unique hotel property and, depending on how much load that inflated Kevlar can take, potentially a whole lot of residential and office space, though most of it will need to be pressurized.

Do you have a space elevator?  No.

Do you have a compelling value proposition for someone wanting to put things in orbit in a world where ground-launched rockets are pretty much the only game in town?  I really don't know enough to say, but my guess is that it would be better if the business model didn't depend on that happening.



(*) If you prefer, feel free to recast this in terms of centripetal force.  You'll get the same vectors, just with different names.

Saturday, July 28, 2018

The woods are dark and full of terrors

I wanted to "circle back" on a comment I'd made on the "Dark Forest" hypothesis, which is basically the idea that we don't see signs of alien life because everyone's hiding for fear of everyone else.  This will probably be the last I want to say about the "Fermi Paradox", at least until the next time I feel like posting about it ...

I haven't read the book Dark Forest, but my understanding is that the Dark Forest hypothesis is based on the observation that when a relatively technologically developed society on Earth has made contact with a less developed society, the results have generally not been pretty.   "Technology" in this context particularly means "military technology."  The safest assumption is that this isn't unique to our own planet and species, but a consequence of universal factors such as competition for resources.

If you're a civilization at the point of being able to explore the stars, you're probably aware of this first hand from your own history, and the next obvious observation is that you're just at the beginning of the process of exploring the stars.  Is it really prudent to assume that there's no one out there more advanced?

Now put yourself in the place of that hypothetical more advanced civilization.  They've just detected signs of intelligent life on your world.  You are now either a threat to them, or a potential conquest, or both.  Maybe you shouldn't be so eager to advertise your presence.

But you don't actually see anyone out there, so there's nothing to worry about, right?  Not so fast.  Everyone else out there is probably applying the same logic.  They might be hunkering down quietly, or they might already be on their way, quietly, in order to get the jump on you, but either way you certainly shouldn't assume that not detecting anyone is good news.

Follow this through and, assuming that intelligent life in general isn't too rare at any given point in time, you get a galaxy dotted with technological civilizations, each doing its best to avoid detection, detect everyone else and, ideally, neutralize any threats that may be out there.  Kind of like a Hunger Games scenario set in the middle of a dark forest.


This all seems disturbingly plausible, at least until you take scale into account.


There are two broad classes of scenarios:  Faster-than-light travel is possible, or it's not.  If anyone's figured out how to travel faster than light, then all bets are off.  The procedure in that case seems pretty simple: Send probes to as many star systems as you can.  Have them start off in the outer reaches, unlikely to be detected, scanning for planets, then scanning for life on those planets.  If you find anything that looks plunderable, send back word and bring in the troops.  Conquer.  Build more probes.  Repeat.

This doesn't require listening for radio waves as a sign of civilization.  Put a telescope and a camera on an asteroid with a suitable orbit and take pictures as it swings by your planet of choice.  Or whatever.  The main point is if there's anyone out there with that level of technology, our fate is sealed one way or another.


On the other hand, if the speed of light really is a hard and fast limit, then economics will play a significant role.  Traveling interstellar distances takes a huge amount of energy and not a little time (from the home planet's point of view -- less for the travelers, particularly if they manage to get near light speed).  By contrast, in the period of exploration and conquest from the late 1400s to the late 1700s it was not difficult to build a seaworthy ship and oceans could be crossed in weeks or months using available energy from the wind.  The brutal fact is that discovering and exploiting new territories on Earth at that time was economically profitable.

If your aim is to discover and exploit resources in other star systems, then you have to ask what they might have that you can't obtain on your home system using the very large amount of energy you would have to use to get to the other system.  The only sensible answer I can come up with is advanced technology, which assumes that your target is more advanced than you are, in which case you might want to rethink.


Even if your aim is just to conquer other worlds for the evulz or out of some mostly-instinctive drive, you're fighting an extremely uphill battle.  Suppose you're attacking a planet 10 light-years away.  Messages from the home planet will take 10 years to reach your expeditionary force, and any reply will take another 10 years, so they're effectively on their own.

You detected radio transmissions from your target ten years ago.  It takes you at least 10 years to reach their planet (probably quite a bit longer, but let's take the best case, for you at least).  They're at least 20 years more advanced than when the signal that led you to plot this invasion left their planet.

You've somehow managed to assemble and send a force of thousands, or tens of thousands, or a million.  You're still outnumbered by -- well, you don't really know until you get there, do you? -- but hundreds to one at the least and more likely millions to one.  You'd better have a crushing technological advantage.

I could come up with scenarios that might work.  Maybe you're able to threaten with truly devastating weapons that the locals have no way to counter.  The locals treat with you and agree to become your loyal minions.

Now what?

Unless your goal was just the accomplishment of being able to threaten another species from afar, you'll want to make some sort of physical contact.  Presumably you land your population on the planet and colonize, assuming the planet is habitable to you and the local microbes don't see you as an interesting host environment/lunch (or maybe you've mastered the art of fighting microbes, even completely unfamiliar ones).

You're now on unfamiliar territory to which you're not well adapted, outnumbered at least a hundred to one by intelligent and extremely resentful beings that would love to steal whatever technology you're using to maintain your position.  Help is twenty years away, counting from the time you send your distress call, and if you're in a position to need it, is the home planet really going to want to send another wave out?  By the time they get there, the locals will have had another twenty years to prepare since you sent your distress call, this time with access to at least some of your technology.


I'm always at least a little skeptical of the idea that other civilizations will think like we do.  Granted, it doesn't seem too unreasonable to assume that anyone who gets to the point that we would call them "technological" is capable of doing the same kind of cost/benefit analyses that we do.  On the other hand, it also seems reasonable to assume that they have the same sort of cognitive biases and blind spots that we do.

The "soft" sciences are a lot about how to model the aggregate behavior of not-completely rational individuals.  There's been some progress, but there's an awful lot we don't know even about our own species, which we have pretty good access to.  When it comes to hypothetical aliens, I don't see how we can say anything close to "surely they will do thus-and-such", even if there are practical limits on how bonkers you can be and still develop technology on a large scale.

In the context of the Dark Forest, the question is not so much how likely it is that alien species are actually a danger to us, but how likely is it that an alien species would think they were in danger from another alien species (maybe us) and act on that by actively going dark.

Our own case suggests that's not very likely.  There may be quite a few people who think that an alien invasion is a serious threat (or for that matter, that one has already happened), or who think that it's unlikely but catastrophic enough if it did happen that we should be prepared.  That doesn't seem to have stopped us from spewing radio waves into the universe anyway.  Maybe we're the fools and everyone else is smarter, but imagine the level of coordination it would take to keep the entire population of a planet from ever doing anything that would reveal their presence.  This seems like a lot to ask, even if the threat of invasion seems likely, which, if you buy the analysis above, it's probably not.

Overall, it seems unlikely that every single technological civilization out there would conclude that staying dark was worth the trouble.  At most, I think, there would be fewer detectable civilizations than there would have been otherwise.  As far as explaining why we haven't heard from anyone, I still think it's more likely that whatever civilizations there are, have been or will be out there are too far away for our present methods to detect (and may always be), and that the window of opportunity for detecting them is either long past or far in the future.

Saturday, July 21, 2018

Fermi on the Fermi paradox

One of the pleasures of life on the modern web is that if you have a question about, say, the history of the Fermi paradox, there's a good chance you can find something on it.  In this case, it didn't take long (once I thought to look) to turn up E. M. Jones's "Where is Everybody? An Account of Fermi's Question".

The article includes letters from Emil Konopinski, Edward Teller and Herbert York, who were all at lunch with Enrico Fermi at Los Alamos National Laboratory some time in the early 1950s when Fermi asked his question.  Fermi was wondering specifically about the possibility that somewhere in the galaxy some civilization had developed a viable form of interstellar travel and had gone on to explore the whole galaxy, and therefore our little blue dot out on one of the spiral arms.

Fermi and Teller threw a bunch of arguments at each other, arriving at a variety of probabilities.  Fermi eventually concluded that probably interstellar travel just wasn't worth the effort or perhaps no civilization had survived long enough to get to that stage (I'd throw in the possibility that they came by millions of years ago, decided nothing special was going on and left -- or won't come by for a few million years yet).

Along the way Fermi, very much in the spirit of "How many piano tuners are there in Chicago?", broke the problem down into a series of sub-problems such as the probability of earthlike planets, the probability of life given an earthlike planet, and so forth.  Very much something Fermi would have done (indeed, this sort of exercise goes by the name "Fermi estimation"), and very similar to what we now call the Drake equation.

In other words, Fermi and company anticipated much of the subsequent discussion on the subject over lunch more than fifty years ago and then went on to other topics (and presumably coffee).  There's been quite a bit of new data on the subject, particularly the recent discovery that there are in fact lots of planets outside our solar system, but the theoretical framework hasn't changed much at all.

What's a Fermi paradox?

So far, we haven't detected strong, unambiguous signs of extraterrestrial intelligence.  Does that mean there isn't any?

The usual line of attack for answering this question is the Drake equation [but see the next post for a bit on its origins --D.H Oct 2018], which breaks the question of "How many intelligent civilizations are there in our galaxy?" down into a series of factors that can then be estimated and combined into an overall estimate.

Let's take a simpler approach here.

The probability of detecting extraterrestrial intelligence given our efforts so far is the product of:
  • The probability it exists
  • The probability that what we've done so far would detect it, given that it exists
(For any math geeks out there, this is just the definition of conditional probability)

Various takes on the Fermi paradox (why haven't we seen anyone, if we're pretty sure they're out there?) address these two factors:
  • Maybe intelligent life is just a very rare accident.  As far as we can tell, Earth itself has lacked intelligent life for almost all of its history (one could argue it still does, so feel free to substitute "detectable" for "intelligent").
  • Maybe intelligent life is hard to detect for most of the time it's around (See this post for an argument to that effect and this one for a bit on the distinction between "intelligent" and "detectable").  A particularly interesting take on this is the "dark forest" hypothesis, that intelligent civilizations soon figure out that being detectable is dangerous and deliberately go dark, hoping never to be seen again.  I mean to take this one on in a bit, but not here.
  • One significant factor when it comes to detecting signs of anything, intelligent or otherwise: as far as we know detectability drops with the square of distance, that is, twice as far away means four times harder to detect.  Stars are far away.  Other galaxies are really far away.
  • Maybe intelligent life is apt to destroy itself soon after it develops, so it's not going to be detectable for very long and chances are we won't have been looking when they were there.  This is a popular theme in the literature.  I've talked about it here and here.
  • Maybe the timing is just wrong.  Planetary time scales are very long.  Maybe we're one of the earlier ones and life won't develop on nearby planets for another million or billion years (basically low probability of detection again, but also an invitation to be more rigorous about the role of timing).

At first blush, the logic of the Fermi paradox seems airtight: Aliens are out there.  We'd see them if they were out there.  We haven't seen them.  QED.  But we're not doing a mathematical proof here.  We're dealing in probabilities (also math, but a different kind).  We're not trying to explain a mathematically impossible result.  We're trying to determine how likely it is that our observations are compatible with life being out there.

I was going to go into a longish excursion into Bayesian inference here, but ended up realizing I'm not very adept at it (note to self: get better at Bayesian inference).  So in the spirit of keeping it at least somewhat simple, let's look at a little badly-formatted table with, granted, a bunch of symbols that might not be familiar:


                   We see life (S)    We don't see life (¬S)
Life exists (L)    P(L ∧ S)           P(L ∧ ¬S)                 P(L)
No life (¬L)       P(¬L ∧ S)          P(¬L ∧ ¬S)                P(¬L)
                   P(S)               P(¬S)                     100%

P is for probability.  P(L) is the probability that there's intelligent life out there we could hope to detect as such, at all.  P(S) is the probability that we see evidence strong enough that the scientific community (whatever we mean by that, exactly) agrees that intelligent life is out there.  The ¬ symbol means "not" and the ∧ symbol means "and".  The rows sum to the right, so
  • P(L ∧ S) + P(L ∧ ¬S) = P(L) (the probability life exists is the probability that life exists and we see it plus the probability it exists and we don't see it)
  • P(S) + P(¬S) = 100% (either we see life or we don't see it)
Likewise the columns sum downward.  Also, "and" works through conditional probability: P(L ∧ S) = P(L)×P(S | L), the probability that life exists times the probability that we see it given that it exists (L and S aren't independent here; if they were, seeing life would tell us nothing about whether it exists).  This all puts restrictions on what numbers you can fill in.  Basically you can pick any three and those determine the rest.

Suppose you think it's likely that life exists, and you think that it's likely that we'll see it if it's there.  That means you think P(L) is close to 100% and P(L ∧ S) is a little smaller but also close to 100% (see conditional probability for more details).  You get to pick one more.  It actually turns out not to matter that much, since we've already decided that life is both likely and likely to be detected.  One choice would be P(¬L ∧ S), the chance of a "false positive", that is, the chance that there's no life out there but we think we see it anyway.  Again, in this scenario we're assuming false positives should be unlikely overall, but choosing exactly how unlikely locks in the rest of the numbers.

It's probably worth calling out one point that kept coming up while I was putting this post together: The chances of finding signs of life depend on how much we've looked and how we've done it.  A lot of SETI has centered around radio waves, and in particular radio waves in a fairly narrow range of frequencies.  There are perfectly defensible reasons for this approach, but that doesn't mean that any actual ETs out there are broadcasting on those frequencies.  In any case we're only looking at a small portion of the sky at any given moment, our current radio dishes can only see a dozen or two light years out and there's a lot of radio noise from our own technological society to filter out.

I could model this as a further conditional probability, but it's probably best just to keep in mind that P(S) is the probability of having detected life after everything we've done so far, and so includes the possibility that we haven't really done much so far. 


To make all this concrete, let's take an optimistic scenario: Suppose you think there's a 90% chance that life is out there and a 95% chance we'll see it if it's out there.  If there's no chance of a false positive, then there's an 85.5% chance that we'll see signs of life and so a 14.5% chance we won't (as is presently the case, at least as far as the scientific community is concerned).  If you think there's a 50% chance of a false positive, then there's a 90.5% chance we'll see signs of life, including the 5% chance it's not out there but we see it anyway.  That means a 9.5% chance of not seeing it, whether or not it's actually there.
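For anyone who wants to check the arithmetic, here's a tiny sketch that computes the chance of not seeing life from the three numbers being chosen: P(L), the chance of seeing life given that it's there, and the chance of a false positive:

```python
# Build the little probability table from three inputs: P(L), the
# chance of seeing life given that it's there, and the chance of a
# false positive (concluding "life" is there when it's not).

def chance_of_no_detection(p_life, p_see_given_life, p_false_positive):
    p_see = (p_life * p_see_given_life
             + (1 - p_life) * p_false_positive)
    return 1 - p_see

# The optimistic scenario above: 90% chance life is out there,
# 95% chance we'd have seen it if so.
print(chance_of_no_detection(0.90, 0.95, 0.00))  # 0.145 -> 14.5%
print(chance_of_no_detection(0.90, 0.95, 0.50))  # 0.095 -> 9.5%
# The less optimistic scenario further down the post.
print(chance_of_no_detection(0.10, 0.05, 0.01))  # 0.986 -> 98.6%
```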

This doesn't seem particularly paradoxical to me.  We think life is likely.  We think we're likely to spot it.  So far we haven't.  By the assumptions above, there's about a 10% chance of that outcome.  You generally need 99.99994% certainty to publish a physics paper, that is, a 0.00006% chance of being wrong.  A 9.5% chance isn't even close to that.

Only if you're extremely optimistic and you think that it's overwhelmingly likely that detectable intelligent life is out there, and that we've done everything possible to detect it do we see a paradox in the sense that our present situation seems very unlikely.  But when I say "overwhelmingly likely" I mean really overwhelmingly likely.  For example, even if you think both are 99% likely, then there's still about a 1-2% chance of not seeing evidence of life, depending on how likely you think false positives are.  If, on the other hand, you think it's unlikely that we could detect intelligent life even if it is out there, there's nothing like a paradox at all.


My personal guess is that we tend to overestimate the second of the two bullet points at the beginning.  There are good reasons to think that life on other planets is hard to detect, and our efforts so far have been limited.  In this view,  the probability that detectably intelligent life is out there right now is fairly low, even if the chance of intelligent life being out there somewhere in the galaxy is very high and the chance of it being out there somewhere in the observable universe is near certain.

As I've argued before, there aren't a huge number of habitable planets close enough that we could hope to detect intelligent life on them, and there's a good chance that we're looking at the wrong time in the history of those planets -- either intelligent life hasn't developed yet or it has but for one reason or another it's gone dark.

Finding out that there are potentially habitable worlds in our own solar system is exciting, but probably doesn't change the picture that much.  There could well be a technological civilization in the oceans of Enceladus, but proving that based on what molecules we see puffing out of vents on the surface many kilometers above said ocean seems like a longshot.

With that in mind, let's put some concrete numbers behind a less optimistic scenario.  If there's a 10% chance of detectable intelligent life (as opposed to intelligent life we don't currently know how to detect), and there's a 5% chance we'd have detected it based on what we've done so far and a 1% chance of a false positive (that is, of the scientific community agreeing that life is out there when in fact it's not), then it's 98.6% likely we wouldn't have seen clear signs of life by now.   That seems fine.


While I'm conjecturing intermittently here, my own wild guess is that it's quite likely that some kind of detectable life is out there, something that, while we couldn't unequivocally say it was intelligent, would make enough of an impact on its home world that we could hope to say "that particular set of signatures is almost certainly due to something we would call life".  I'd also guess that it's pretty likely that in the next, say, 20 or 50 or 100 years we would have searched enough places with enough instrumentation to be pretty confident of finding something if it's there.  And it's reasonably likely that we'd get a false positive in the form of something that people would be convinced was a sign of life when in fact there wasn't one -- maybe we'd figure out our mistake in another 20 or 50 or 100 years.

Let's say life of some sort is 90% likely, there's a 95% chance of finding it in the next 100 years if it's there and a 50% chance of mistakenly finding life when it's not there, that is, a 50% chance that at some point over those 100 years we mistakenly convince ourselves we've found life and later turn out to be wrong.  Who knows?  False positives are based on the idea that there's no detectable life out there, which is another question mark.  But let's go with it.

I actually just ran those numbers a few paragraphs ago and came up with a 9.5% chance of not finding anything, even with those fairly favorable odds.

All in all, I'd say we're quite a ways from any sort of paradoxical result.


One final thought occurs to me:  The phrase "Fermi paradox" has been in the lexicon for quite a while, long enough to have taken on a meaning of its own.  Fermi himself, being one of the great physicists, was quite comfortable with uncertainty and approximation, so much so that the kind of "How many piano tuners are there in Chicago?" questions given to interview candidates are meant to be solved by "Fermi estimation".

I should go back and get Fermi's own take on the "Fermi paradox".  My guess was he wasn't too bothered by it and probably put it down to some combination of "we haven't really looked" and "maybe they're not out there".

If I find out I'll let you know.

[As noted above, I did in fact come across something --D.H Oct 2018]

Friday, July 6, 2018

Are we alone in the face of uncertainty?

I keep seeing articles on the Drake equation and the Fermi Paradox on my news feed, and since I tend to click through and read them, I keep getting more of them.  And since I find at least some of the ideas interesting, I keep blogging about them.  So there will probably be a few more posts on this topic.  Here's one.

One of the key features of the Drake equation is how little we know, even now, about most of the factors.  Along these lines, a recent (preprint) paper by Anders Sandberg, Eric Drexler and Toby Ord claims to "dissolve" the Fermi Paradox (with so many other stars out there, why haven't we heard from them?), finding "a substantial ex ante probability of there being no other intelligent life in our observable universe".

As far as I can make out, "ex ante" (from before) means something like "before we gather any further evidence by trying to look for life".  In other words, there's no particular reason to believe there should be other intelligent life in the universe, so we shouldn't be surprised that we haven't found any.

I'm not completely confident that I understand the analysis correctly, but to the extent I do, I believe it goes like this (you can probably skip the bullet points if math makes your head hurt -- honestly, some of this makes my head hurt):
  • We have very little knowledge of some of the factors in the Drake equation, particularly fl (the probability of life arising on a planet that could support it), fi (the probability of a planet with life developing intelligent life) and L (the length of time a civilization produces a detectable signal)
  • Estimates of those range over orders of magnitude.
    • Estimates for L range from 50 years to a billion or even 10 billion years.
    • The authors do some modeling and come up with a range of uncertainty of 50 orders of magnitude for fl.  That is, it might be close to 1 (that is, close to 100% certain), or it might be more like 1 in 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000.  Likewise they take fi to range over three orders of magnitude, from near 1 to 1 in 1,000.
  • Rather than assigning a single number to each term, as most authors do, it makes more sense to assign a probability distribution.  That is, instead of saying "the probability of life arising on a suitable planet is 90%", or 0.01% or whatever, assign a probability to each possible value (the actual math is a bit more subtle, but that should do for our purposes).  Maybe the most likely probability of life developing intelligence is 1 in 20, but there's a smaller possibility that it's actually 1 in 10 or 1 in 100, so take that into account with a probability distribution.
  • (bear in mind that the numbers we're looking at are themselves probabilities, so we're assigning a probability that a probability has a given value -- this is the part that makes my head hurt a bit)
  • Since we're looking at very wide ranges of values, a reasonable distribution is the "log normal" distribution -- basically, "the number of digits fits a bell curve".
  • These distributions have very long tails, meaning that if, say, 1 in a thousand is a likely value for the chance of life evolving into intelligent life, then (depending on the exact parameters) 1 in a million may be reasonably likely, 1 in a billion not too unlikely and 1 in a trillion is not out of the question.
  • The factors in the Drake equation multiply, following the rules of probability, so it's quite possible that the aggregate result is very small.
    • For example if it's reasonably likely that fl is 1 in a trillion and fi is 1 in a million, then we can't ignore the chance that the product of the two is 1 in a quintillion.
    • Numbers like that would mean it's unlikely that there's any other life among our galaxy's few hundred billion stars -- ours would just have happened to get lucky.
  • Putting it all together, they estimate that there's a significant chance that we're alone in the observable universe (there's a rough numerical sketch of this effect just below).
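To see how multiplying wide, skewed uncertainties leads to that kind of conclusion, here's a rough Monte Carlo sketch.  The particular distributions, the extra factors and the star count below are my own placeholders (and restricted to our galaxy), not the authors' actual model; the point is only that when even one factor can plausibly span 50 orders of magnitude, a large share of the combined samples end up implying there's no one else to hear from.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Placeholder distributions, not the paper's model.  Each uncertain factor is
# sampled on a log scale (log-uniform here for simplicity, rather than log-normal).
stars_in_galaxy = 2.5e11                    # rough Milky Way star count
f_planets = 10 ** rng.uniform(-1, 0, n)     # fraction of stars with a suitable planet
f_life    = 10 ** rng.uniform(-50, 0, n)    # fl: life arises (the 50-order-of-magnitude factor)
f_intel   = 10 ** rng.uniform(-3, 0, n)     # fi: intelligence evolves
f_detect  = 10 ** rng.uniform(-2, 0, n)     # a detectable signal gets produced at all

other_civs = stars_in_galaxy * f_planets * f_life * f_intel * f_detect

print("Share of samples with no other detectable civilization in the galaxy:",
      np.mean(other_civs < 1))
```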

I'm not sure how much of this I buy.

There are two levels of probability here.  The terms in the Drake equation represent what has actually happened in the universe.  An omniscient observer that knew the entire history of every planet in the universe (and exactly what was meant by "life" and "intelligent") could count the number of planets, the number that had developed life and so forth and calculate the exact values of each factor in the equation.

The probability distributions in the paper, as I understand it, represent our ignorance of these numbers.  For all we know, the portion of "habitable" planets with intelligent life is near 100%, or near 1 in a quintillion or even lower.  If that's the case, then the paper is exploring to what extent our current knowledge is compatible with there being no other life in the universe.  The conclusion is that the two are fairly compatible -- if you start with what (very little) we know about the likelihood of life and so forth, there's a decent chance that the low estimates are right, or even too optimistic, and there's no one but us.

Why?  Because very low probabilities are more plausible than we tend to think, and multiplying probabilities compounds the effect.  Again, the math is a bit subtle, but if you have a long chain of contingencies, any one of them failing breaks the whole chain, and the more shaky links there are, the better the odds that it breaks somewhere.


The conclusion -- that for all we know life might be extremely rare -- seems fine.  It's the methodology that makes me a bit queasy.

I've always found the Drake equation a bit long-winded.  Yes, the probability of intelligent life evolving on a planet is the probability of life evolving at all multiplied by the probability of life evolving into intelligent life, but does that really help?

On the one hand, it seems reasonable to separate the two.  As far as we know it took billions of years to go from one to the other, so clearly they're two different things.

But we don't really know the extent of our uncertainty about these things.  If you ask for an estimate of any quantity like this, or do your own estimate based on various factors, you'll likely* end up with something in the wide range of values people consider plausible enough to publish (I'm hoping to say more on this theme in a future post).  No one is going to say "zero ... absolutely no chance" in a published paper, so it's a matter of deriving a plausibly small number consistent with our near-complete ignorance of the real number -- no matter what that particular number represents or how many other numbers it's going to be combined with.

You could almost certainly fit the results of surveying several good-faith attempts into a log-normal distribution.  Log-normal distributions are everywhere, particularly where the plain normal distribution doesn't fit because the quantity being measured has something exponential about it -- say, you're multiplying probabilities or talking about orders of magnitude.

If the question is "what is the probability of intelligent life evolving on a habitable planet?" without any hints as to how to calculate it, that is, one not-very-well-determined number rather than two, then the published estimates, using various methodologies, should range from a small fraction to fairly close to certainty depending on the assumptions used by the particular authors.  You could then plug these into a log normal distribution and get some representation of our uncertainty about the overall question, regardless of how it's broken down.

You could just as well ask "What is the probability of any self-replicating system arising on a habitable planet?", "What is the probability of a self-replicating system evolving into cellular life?"  "What is the probability of cellular life evolving into multicellular life?" and so forth, that is, breaking the problem down into several not-very-well-determined numbers.  My strong suspicion is that the distribution for any one of those sub-parts will look a lot like the distribution for the one-question version, or the parts of the two-question version, because they're basically the same kind of guess as any answer to the overall question.  The difference is just in how many guesses your methodology requires you to make.

In particular, I seriously doubt that anyone is going to cross-check whether pulling together several estimates yields the same distribution, even approximately, as the one implied by a single overall estimate.  Rather, the more pieces you break the problem into, the more likely really small numbers become, as seen in the paper.
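Here's a quick numerical illustration of that last point.  Give every guess the same per-guess spread, whether you're answering one big question or several smaller ones, keep the medians consistent, and see how often the combined estimate comes out absurdly small.  The numbers are placeholders, not anyone's actual estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

def log10_of_combined_estimate(pieces, spread=2.0):
    """Model one overall probability as the product of `pieces` sub-estimates,
    each log-normal with the same spread (std dev of log10), with the medians
    chosen so every version agrees on an overall median of 1e-6."""
    log10_total = rng.normal(-6.0 / pieces, spread, (n, pieces)).sum(axis=1)
    return np.minimum(log10_total, 0.0)   # cap at probability 1

for k in (1, 2, 5):
    samples = log10_of_combined_estimate(k)
    print(f"{k} piece(s): share of estimates below 1e-12 = {np.mean(samples < -12):.2%}")
```

With the per-guess spread held fixed, the more pieces the estimate is broken into, the more often the product lands many orders of magnitude below the shared median -- which is the effect I'm describing above.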


I think this is consistent with the view that the paper is quantifying our uncertainty.  If the methodology for estimating the number of civilizations requires you to break your estimate into pieces, each itself with high uncertainty, you'll get an overall estimate with very high uncertainty.  The conclusion "we're likely to be alone" will lie within that extremely broad range, and may even take up a sizable chunk of it.  But again, I think this says much more about our uncertainty than about the actual answer.

I suspect that if you surveyed estimates of how likely intelligent life is using any and all methodologies*, the distribution would imply that we're not likely to be alone, even if intelligent life is very rare.  If you could find estimates of fine-grained questions like "what is the probability of multicellular life given cellular life?" you might well get a distribution that implied we're an incredibly unlikely fluke and really shouldn't be here at all.  In other words, I don't think the approach taken in the paper is likely to be robust in the face of differing methodologies.  If it's not, it's hard to draw any conclusions from it about the actual likelihood of life.

I'm not even sure, though, how feasible it would be to survey a broad sample of methodologies.  The Drake formulation dominates discussion, and that itself says something.  What estimates are available to survey depends on what methods people tend to use, and that in turn depends on what's likely to get published.  It's not like anyone somehow compiled a set of possible ways to estimate the likelihood of intelligent life and prospective authors each picked one at random.

The more I ponder this, the more I'm convinced that the paper is a statement about the Drake equation and our uncertainty in calculating the left hand side from the right.  It doesn't "dissolve" the Fermi paradox so much as demonstrate that we don't really know if there's a paradox or not.  The gist of the paradox is "If intelligent life is so likely, why haven't we heard from anyone?", but we really have no clear idea how likely intelligent life is.


* So I'm talking about probabilities of probabilities about probabilities?

Monday, June 18, 2018

Did clickbait kill the aliens?

Disclaimer: This post is on a darker topic than most.  I've tried to adjust the tone accordingly, but if anything leads you to ask "How can he possibly say that so casually?", rest assured that I don't think any of this is a casual matter.  It's just that if we're talking at the scale of civilizations and stars we have to zoom out considerably from the everyday human scale, to the point where a truly horrible cataclysm becomes just another data point.


As I've noted elsewhere, the Fermi paradox is basically "It looks likely that there's life lots of other places in the universe, so why haven't we been able to detect it -- or why haven't they made it easy by contacting us?"  Or, as Fermi put it, "Where is everybody?"

One easy answer, though something of a downer, is "They're all dead."*

This is the idea that once a species gets to a certain level of technological ability, it's likely to destroy itself.  This notion has been floated before, in the context of the Cold War: Once it became technically possible, it took shockingly little time for humanity to develop enough nuclear weapons to pose a serious threat to itself.  One disturbingly ready conclusion from that was that other civilizations hadn't contacted us because they'd already blown themselves up.

While this might conjure up images of a galaxy full of the charred, smoking cinders of once vibrant, now completely sterile planets, that's not exactly what the hypothesis requires.  Before going into that in detail, it's probably worth reiterating that most planets in the galaxy are much too far away for us to detect anything directly against the background noise, or to carry on a conversation with (assuming that the speed of light is the cosmic speed limit we think it is).  In order to explain why we haven't heard from anyone, we're really trying to explain why we haven't heard from anyone within, say, a hundred light years.  I've argued elsewhere that that narrows the problem considerably (though maybe not).


A full-scale nuclear exchange by everyone with nuclear weapons would not literally kill all life on Earth.  There are a lot of fungi and bacteria, and a lot of faraway corners like hydrothermal vents for all kinds of life to hide.  It probably wouldn't even kill all of humanity directly, but -- on top of the indescribable death and suffering from the bombing itself -- it would seriously damage the world economy and make life extremely difficult even in areas that weren't directly affected by the initial exchange.  Behind the abstraction of the "world economy" is the flow of food, medicine, energy and other essentials.

There is an extensive literature concerning just how bad things would get under various assumptions, but at some point we're just quibbling over levels of abject misery.  In no realistic case is bombing each other better for anyone involved than not bombing each other.

For our purposes here, the larger point is clear: a species that engages in a full-scale nuclear war is very unlikely to be sending out interstellar probes or operating radio beacons aimed at other stars.  It may not even be sending out much in the way of stray radio signals at all.  It might well be possible for a species in another star system to detect life in such a case without detecting signs of a technological civilization, much less communicating with it.

So how likely is a full-scale nuclear war?  We simply don't know.  So far we've managed to survive several decades of the nuclear age without one, but, as I've previously discussed, that's no time at all when it comes to estimating the likelihood of finding other civilizations.  To totally make up some numbers, suppose that, once nuclear weapons are developed, a world will go an average of a thousand years without seriously using them and then, after the catastrophe, take a couple of centuries to get back to the level of being able to communicate with the rest of the universe.

Again, who knows?  We (fortunately) have very little data to go on here.  In the big picture, though, this would mean that a planet with nuclear weaponry or something similarly dangerous would be 10-20% less likely to be detected than one without.  We also have to guess what portion of alien civilizations would be subject to this, but how likely is it, really, that someone would develop the ability to communicate with the stars without also figuring out how to do anything destructive with its technology?
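Spelling out that back-of-the-envelope figure with the same made-up numbers:

```python
# Duty-cycle arithmetic for the made-up numbers above.
years_signaling  = 1000   # average run before the weapons get seriously used
years_recovering = 200    # time spent rebuilding before signaling again

fraction_dark = years_recovering / (years_signaling + years_recovering)
print(f"Reduction in detectability: about {fraction_dark:.0%}")  # ~17%, i.e. in the 10-20% range
```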

My guess is that "able to communicate across interstellar distances" is basically the same as "apt to destroy that ability sooner or later".  This applies particularly strongly to anyone who could actually send an effective interstellar probe.  The kinetic energy of any macroscopic object traveling close to light speed is huge.  It's hard to imagine being able to harness that level of energy for propulsion without also learning how to direct it toward destruction.

For purposes of calculation, it's probably best to assume a range of scenarios.  In the worst case, a species figures out how to genuinely destroy itself, and perhaps even life on its planet, and is never heard from.  In a nearly-as-bad case, a species spends most of its time recovering from the last major disaster and never really gets to the point of being able to communicate effectively across interstellar distances, and is never heard from.  The upshot is a reduction in the amount of time a civilization might produce a detectable signal (or, in a somewhat different formulation, the average expected signal strength over time).

Our own case is, so far, not so bad, and let's hope it continues that way.  However, along with any other reasons we might not detect life like us on other planets, we can add the possibility that they're too busy killing each other to say hello.


With all that as context, let's consider a recent paper modeling the possibility that a technological civilization ends up disrupting its environment with (from our point of view here, at least) pretty much the same result as a nuclear war.  The authors build a few models, crunch through the math and present some fairly sobering conclusions.  Depending on the exact assumptions and parameters, it's possible for a (simulated) civilization to reach a stable equilibrium with its (simulated) environment, but several other outcomes are entirely plausible: a boom-and-bust that reduces the population to, say, 10% of its peak, a repeating boom/bust cycle, or even a complete collapse that leaves the environment essentially unlivable.

So what does this add to the picture?  Not much, I think.

The paper reads like a proof-of-concept of the idea of modeling an alien civilization and its environment using the same mathematical tools (dynamical systems theory) used to model anything from weather to blood chemistry to crowd behavior and cognitive development.  Fair enough.  There is plenty of well-developed math to apply here, but the math is only as good as the assumptions behind it.
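To make the dynamical-systems idea concrete, here's a toy sketch in the same general spirit -- explicitly not the authors' equations or parameters -- in which a population grows by consuming a resource that regenerates at a finite rate.  The same little model settles down or booms and busts depending on how the consumption and recovery rates compare, which is really all I want to illustrate.

```python
def run(consumption, recovery, steps=2000, dt=0.05):
    """A toy population/resource model (NOT the paper's): the population grows by
    consuming a resource, and the resource regenerates at a finite rate."""
    pop, res = 0.01, 1.0
    history = []
    for _ in range(steps):
        growth = consumption * res * pop        # growth requires resources
        deaths = 0.2 * pop                      # baseline death rate
        pop, res = (max(pop + dt * (growth - deaths), 0.0),
                    max(res + dt * (recovery * (1.0 - res) - consumption * res * pop), 0.0))
        history.append(pop)
    return history

# The same model with different parameters has qualitatively different fates:
steady = run(consumption=0.5, recovery=0.5)   # settles toward an equilibrium
bust = run(consumption=2.0, recovery=0.05)    # booms, then crashes well below its peak
print(f"steady: peak {max(steady):.2f}, final {steady[-1]:.2f}")
print(f"bust:   peak {max(bust):.2f}, final {bust[-1]:.2f}")
```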

The authors realize that the assumptions are doing the work, and take care only to make the most general, uncontroversial assumptions possible.  They don't assume anything about what kind of life is on the planet in question, or what kind of resources it uses, or what exact effect using those resources has on the planet.  Their assumptions are on the order of "there is a planet", "there is life on it", "life consumes resources" and so forth.

Relying on few assumptions means that any conclusions you do reach are very general.  On the other hand, if the assumptions support a range of conclusions, how do you pick from amongst them?  Maybe once you run through all the details, any realistic set of assumptions leads to a particular outcome -- whether stability or calamity.  Maybe most of the plausible scenarios are in a chaotic region where the slightest change in inputs can make an arbitrarily large difference in outputs.  And so forth.

As far as I can make out, the main result of the paper is that planets, civilizations and their resources can be modeled as dynamical systems.  It doesn't say what particular model is appropriate, much less make any claims about what scenarios are most likely for real civilizations on real exoplanets.  How could it?   Only recently has there been convincing evidence that exoplanets even exist.  The case that there is life on at least some of them is (in my opinion) reasonably persuasive, but circumstantial.  It's way, way too early to make any specific claims about what might or might not happen to civilizations, or even life in general, on other planets.

To be clear, the authors don't seem to be making any such claims, just to be laying some groundwork for eventually making such claims.  That doesn't make a great headline, of course.  The article I used to find the paper gives a more typical take: Climate change killed the aliens, and it will probably kill us too, new simulation suggests.

Well, no.  We're still in the process of figuring out exactly what effect global warming and the resulting climate change will have on our own planet, where we can take direct measurements and build much more accurate models than the authors of the paper put forth.  All we can do for an alien planet is lay out the general range of possibilities, as the authors have done.  Trying to draw conclusions about our own fate from our failure (so far) to detect others like us seems quite premature, whether the hypothetical cause of extinction is war or a ruined environment.



There's a familiar ring to all this.   When nuclear destruction was on everyone's mind, people saw an obvious, if depressing, answer to Fermi's question.  As I recall, papers were published and headlines written.  Now that climate-related destruction is on everyone's mind, people see an obvious, if depressing, answer to Fermi's question, with headlines to match.  It's entirely possible that fifty years from now, if civilization as we know it is still around (as I expect it will be) and we haven't heard directly from an alien civilization (as I suspect we won't), people will see a different obvious, if depressing, answer to Fermi's question.  Papers will be written about it, headlines will do what headlines do, and it will all speak more to our concerns at the time than to the objective state of any alien worlds out there.


I want to be clear here, though.  Just because headlines are overblown doesn't mean there's nothing to worry about.  Overall, nuclear weapons take up a lot less cultural real estate than they did during the height of the cold war, but they're very much still around and just as capable of wreaking widespread devastation.  Climate change was well underway during that period as well, and already recognized as a hazard, but not nearly as prominent in the public consciousness as it is today.

It's tempting to believe in an inverse relationship between the volume of headlines and the actual threat: If they're making a big deal out of it, it's probably nothing to worry about.  But that's an empirical question to be answered by measurement.  It's not a given.  Without actually taking measurements, the safest assumption is the two are unrelated, not inversely related.  That is, how breathless the headlines are is no indication one way or another as to how seriously to take the threat.

My own guess, again without actually measuring, is that there's some correlation between alarming headlines and actual danger.  People study threats and publish their findings.  By and large, and over time, there is significant signal in the noise.  If a range of people working in various disciplines say that something is cause for concern, then it most likely is -- nuclear war and climate change are real risks.  Some part of this discussion finds its way into the popular consciousness, with various shorthands and outright distortions, but if you take the time to read past the headlines and go back to original sources you can get a reasonable picture, and one that will bear at least some resemblance to the headlines.

Going back to original sources and getting the unruly details may not be as satisfying as a nice, punchy one-sentence summary, but I'd say it's worth the effort nonetheless.



(*) A similar but distinct notion is the "Dark forest" hypothesis: They're out there, but they're staying quiet so no one else kills them -- and we had best follow suit.  That's fodder for another post, though I think at least some of this post applies.

Thursday, May 31, 2018

Cookies, HTTPS and OpenID

I finally got around to looking at the various notices that have accumulated on the admin pages for this blog.  As a result:

  • This blog is supposed to display a notice regarding cookies if you access it from the EU.  I'm not sure that this notice is actually appearing when it should (I've sent feedback to try to clarify), but as far as I can tell blogspot is handling cookies for this blog just like any other.  I have not tried to explicitly change that behavior.
  • I've turned on "redirect to https".  This means that if you try to access this blog via http://, it will be automatically changed to https://.  This shouldn't make any difference.  On the one hand, https has been around for many years and all browsers I know of handle it just fine.  On the other hand, this is a public blog, so there's no sensitive private information here.  It might conceivably make a difference if you have to do some sort of login to leave comments, but I doubt it.
  • Blogger no longer supports OpenID.  I think this would only matter if I'd set up "trust these web sites" under the OpenID settings, but I didn't.
In other words, this should all be a whole lot of nothing, but I thought I'd let people know.

Wednesday, May 23, 2018

The stuff of dreams


... and then our hero woke up and it was all a dream ...

... has to rank among the most notorious pulled-out-of-thin-air deus ex machina twist endings in the book, along with "it was actually twins" and "it was actually the same person with multiple personalities".  As with all such tropes, there's nothing wrong with these plot twists per se.  The problem is generally the setup.

In a well-written "it was actually twins" twist, you have clues all along that there were actually two people -- maybe subtle shifts in behavior, or a detail of clothing that comes and goes, or the character showing up in an unexpected place that it seemed unlikely they'd be able to get to.  With a good setup, your reaction is "Oh, so that's why ..." and not "Wait ... what?  Seriously?"

The same goes for "it was all a dream".  In a good setup, there are clues that it was all a dream.  Maybe things start out ok, then something happens that doesn't quite make sense, then towards the end things get seriously weird, but more in a "wait, what's going on here, why did they do that?" kind of way, as opposed to a "wait, was that a flying elephant I just saw with a unicyclist on its back?" kind of way, though that can be made to work as well.

There's a skill to making things dreamlike, particularly if you're trying not to give the game away completely.  Dream logic doesn't just mean randomly bizarre things happening.  Dreams are bizarre in particular ways which are not particularly well understood, even though people have been talking about and interpreting dreams probably for as long as there have been talking and dreams.

A while ago I ran across a survey by Jennifer Windt and Thomas Metzinger that has quite a bit to say about dreams and the dream state, both ordinary dreams and "lucid" dreams where the rules are somewhat different.  They compare the three states of ordinary dreaming, lucid dreaming and waking consciousness to try to tease out what makes each one what it is with, I found, fair success.  I'm not going to go into a detailed analysis of that paper here, but I did want to acknowledge it, if only as a starting point.


First, though, some more mundane observations about dreams.  We tend to dream several times a night, in cycles lasting around 90 minutes.  We don't typically remember this, but a subject who is awakened while exhibiting signs of a dream state can generally recall dreaming while a subject awakened under other conditions doesn't.  The dream state is marked by particular patterns of electrical activity in the brain, near-complete relaxation of the skeletal muscles and, probably best-known, Rapid Eye Movement, or REM.  REM is not a foolproof marker, but the correlation is high.

Dreams in early sleep cycles tend to be closely related to things that happened during the waking day.  Subjects who studied a particular skill prior to going to sleep, for example, tended to have dreams about that skill.  I've personally had dreams after coding intensely that were a sort of rehash of how I'd been thinking about the code in question, not so much in the concrete sense of writing or reading particular pieces as more abstractly navigating data and control structures.

Later dreams -- those closer to when you wake up -- tend to be more emotional and less closely associated with recent memories.  Since these are more likely to be the ones you remember unless someone is waking you up as part of a sleep experiment, these are the kind of dreams we tend to think of as "dreamlike".  These are the "I was in this restaurant having dinner with such-and-such celebrity, except it didn't look like them, and I could hear my third-grade teacher yelling something, but everyone just ignored it and then a huge ocean wave came crashing in and we all had to swim for it, even though the restaurant was in the Swiss Alps" kind of dreams.

In my experience this kind of dream can often be linked back to relevant events, but in a sort of mashed-up, piecemeal, indirect way.  Maybe you heard a news story about a tidal wave yesterday and a couple of days ago some relative or old friend had mentioned something that happened to you in grade school.  Celebrities, by definition, are frequently in the news, and it was the Swiss Alps just because.  That doesn't really explain what the dream might mean, if indeed it meant anything, but it does shed some light on why those particular elements might have been present.

But why that particular assemblage of elements?  Why wasn't the third grade teacher your dinner companion?  Why did all the other diners ignore the teacher?  Why wasn't the restaurant on the beach? And so on.

My personal theory on such things is pretty unsatisfying: it just is.  Whatever part of the mind is throwing dream elements together is distinct from the parts of the mind concerned with cause and effect and pulling together coherent narratives.

To draw a very crude analogy, imagine memory as a warehouse.  From time to time things have to be shuffled around in a warehouse for various logistical reasons.  For example, if something that's been stored in the back for months now needs to be brought out, you may have to move other items around to get at it.  Those items were put there for their own reasons that may not have anything to do with the item that's being brought out.

Now suppose someone from management in a different part of the company -- say media relations -- comes in and starts observing what's going on.  A pallet of widgets gets moved from section 12D, next to the gadgets, to section 4B, next to the thingamajigs.  This goes on for a while and our curious manager may even start to notice patterns and make tentative notes on them.

Suppose upper-level management demands, for its own inscrutable reasons, a press release on the warehouse activity.  The media relations person writing the release is not able to contact the warehouse people to find out what's really going on and just has to go by the media relations manager's notes about widgets moving from next to the gadgets to next to the thingamajigs.  The resulting press release is going to try to tell a coherent story, but it's not going to make much sense.  It's almost certainly not going to say "We had to get the frobulator out of long-term storage for an upcoming project so we moved a bunch of stuff to get at it."

My guess is that something similar is going on in the brain with dreams.  In normal waking consciousness, the brain is receiving a stream of inputs from the outside world and putting them together into a coherent picture of what's going on.  There are glitches all the time for various reasons. The input we get is generally incomplete and ambiguous.  We can only pay attention to so much at a time.

In order to cope with this we constantly make unconscious assumptions based on expectations, and these vary from person to person since we all have different experiences.  The whole concept of consciousness is slippery and by no means completely understood, but for the purpose of this post consciousness (as opposed to any particular state of consciousness) means whatever weaves perception into a coherent picture of what's going on.

Despite all the difficulties in turning perception into a coherent reality, we still do pretty well.  Different people perceiving the same events can generally agree on at least the gist of what happened, so in turn we agree that there is such a thing as "objective reality" independent of the particular person observing it.  Things fall down.  The sun rises in the morning.  It rains sometimes.  People talk to each other, and so on.  Certainly there's a lot we disagree on, sometimes passionately with each person firmly believing the other just doesn't know the simple facts, but this doesn't mean there's no such thing as objective reality at all.



In the dream state, at least some of the apparatus that builds conscious experience is active, but it's almost completely isolated from the outside world (occasionally people will incorporate outside sounds or other sensory input into a dream, but this is the exception).  Instead it is being fed images from memories which, as in the warehouse analogy, are being processed according to however memory works, without regard to the outside world.  Presented with this, consciousness tries to build a narrative anyway, because that's what it does, but it's not going to make the same kind of sense as waking consciousness because it's not anchored to the objective, physical world.

If the early memory-processing is more concerned with organizing memories of recent events, early-cycle dreams will reflect this.  If later memory processing deals in larger-scale rearrangement and less recent, less clearly correlated memories, later-cycle dreams will reflect this.


As I understand it, Windt and Metzinger's analysis is broadly compatible with this description, but they bring in two other key concepts that are important to understanding the various states of consciousness: agency and phenomenal transparency.

Agency is just the power to act.  In waking consciousness we have a significant degree of agency.  In normal circumstances we can control what we do directly -- I choose to type words on a keyboard.  We can influence the actions of others to some extent, whether by force or persuasion.  We can move physical objects around, directly or indirectly.  If I push over the first domino in a chain, the others will fall.

In a normal dream the dreamer has no agency.  Things just happen.  Even things the dreamer experiences themselves as doing just happen.  You can recall "I was running through a field", but generally that's just a fact.  Even if your dream self decides to do something, as in "The water was rushing in so I started swimming", it's not the same as "I wanted to buy new curtains so I looked at a few online and then I picked these out".  Your dream self is either just doing things, or sometimes doing things as a natural reaction to something that happened.

Even that much is a bit suspect.  It wouldn't be a surprise to hear "... a huge ocean wave came crashing in and then I was walking through this city, even though it was underwater".  In some fundamental way, in a dream you're not making things happen.  They just happen.

Likewise, one of the most basic forms of agency is directing one's attention, but in a dream you don't have any choice in that, either.  Instead, attention is purely salience based, meaning, more or less, that in a dream your attention is directed where it needs to be -- if that ocean wave bursts in you're paying attention to the water -- rather than where you want it to be.

Phenomenal transparency concerns knowing what state of consciousness you're in.  Saying that dreaming is phenomenally transparent is just a technical way of saying "when you're in a dream you don't know you're dreaming".  (So why coin such a technical term for such a simple thing?  For the usual reasons.  On the one hand, repeating that whole phrase every time you want to refer to the concept -- which will be a lot if you're writing a paper on dreaming -- is cumbersome at best.  It's really convenient to have a short two-word phrase for "the quality of not knowing you're dreaming when you're dreaming".  On the other hand, defining a phrase and using it consistently makes it easier for different people to agree they're talking about the same thing.  But I digress.)

If someone is recalling a dream, they don't recall it as something that they dreamed.  They recall it as something that happened, and happened in a dream.  It "happened" just the same as something in waking consciousness "happened".  During the dream itself, it's completely real.  Only later, as we try to process the memory of a dream, do we understand it as a dream.  I've personally had a few fairly unsettling experiences of waking up still in a dreamlike state and feeling some holdover from the dream as absolutely real, before waking up completely and realizing ... it was all a dream (more on this below).  I expect this is something most people have had happen, and it's why the "it was all a dream" trope can work at all.

In some sense this seems related to agency.  When you say "I dreamed that ..." it doesn't mean that you consciously decided to have thus-and-such happen in your dream.  It means that you had a dream, and thus-and-such happened in it.

Except when  it doesn't ...

Windt and Metzinger devote quite a bit of attention to lucid dreams. While the term lucid might suggest vividness and clarity, and this can happen, lucidity generally refers to being aware that one is dreaming (phenomenal transparency breaks down).  Often, but not always, the dreamer has a degree of control (agency) over the action of the dream.  In a famous experiment, lucid dreamers were asked to make a particular motion with their eyes, something like "when you realize you're in a dream, look slowly left and then right, then up, then left and right again", something that would be clearly different from normal REM.  Since the eyes can still move during a dream, even if the rest of the body is completely relaxed, experimenters were able to observe this and confirm that the dreamers were indeed aware and able to act.

Not everybody has lucid dreams, or at least not everyone is aware of having had them.  I'm not sure I've had any lucid dreams in the "extraordinarily clear and vivid" sense, but I've definitely had experiences drifting off to sleep and working through some problem or puzzle to solve, quite consciously, but blissfully unaware that I'm actually asleep and snoring.  I've also had experiences waking up where I was able to consciously replay what had just been happening in a dream and at least to some extent explore what might happen next.  I'm generally at least somewhat aware of my surroundings in such cases, at least intermittently, so it's not clear what to call dreaming and what to call remembering a dream.

In any case, I think this all fits in reasonably well with the idea of multiple parts of the brain doing different things, or not, none of them in complete control of the others.  Memory is doing whatever memory sorting it needs to do during sleep (it's clear that there's at least something essential going on during sleep, because going without sleep for extended periods is generally very bad for one's mental health).  Some level of consciousness's narrative building is active as well, doing its best to make sense of the memories being fed to it.  Some level of self awareness that "I'm here and I can do things" may or may not be active as well, depending on the dreamer and the particular circumstances.

This is nowhere near a formal theory of dreams.  Working those out is a full-time job.  I do think it's interesting, though, to try to categorize what does and doesn't happen in dream states and compare that to normal waking consciousness.  In particular, if you can have A happen without B happening and vice versa, then in some meaningful sense A and B are produced by different mechanisms.

If we draw up a little table of what can happen with or without what else...

Can there be ...           ... without consciousness?   ... without agency?   ... without phenomenal transparency?
Consciousness                          -                      yes (1)                  yes (2)
(Conscious) Agency                     no                       -                         ?
Phenomenal transparency                no                     yes (3)                     -

(1) In ordinary dreams, but also, e.g., if paralyzed by fear
(2) In ordinary dreams
(3) In a lucid dream, if you're aware that you're dreaming but can't influence the dream

... it looks like things are pretty wide open.  I didn't mention it in the table, but agency doesn't require consciousness.  We do things all the time without knowing why, or even that, we're doing them.  However, conscious agency requires consciousness by definition.  So does phenomenal transparency -- it's consciousness of one's own state.

Other than that, everything's wide open except for one question mark: Can you have conscious agency without phenomenal transparency?  That is, can you consciously take an action without knowing whether you're awake or dreaming (or in some other mental state)?  This isn't clear from lucid dreaming, since lucid dreaming means you know you're dreaming.  It isn't clear from ordinary dreaming either, since ordinary dreams seem passive in nature.

There is a related phenomenon, though, namely false awakening, in which the dreamer seems to wake up and start doing ordinary things while actually remaining asleep.  In some cases, the dreamer becomes aware of the dream state, but in other cases the illusion of being awake lasts until the dreamer awakens for real.

All of this is just a long way of saying that our various faculties like consciousness, agency and awareness of one's state of consciousness seem to be mix and match.  The normal states are waking consciousness and ordinary dreaming, but anything in between seems possible.  In other words, while these faculties generally seem to operate either all together (waking consciousness) or with only consciousness active (ordinary dreaming), they're actually independent.  It's also worth noting that nothing in the table above distinguishes waking from dreaming.  The difference there would seem to be in whether we're processing the real world or memories of it.

This is an interesting piece of information, one which would have been considerably harder to come by if we didn't have the alternate window into consciousness provided by dreams.