Thursday, July 2, 2015

Do androids trip on electric acid?

Have a look at some of these images and take note of whatever adjectives come to mind.  If other people's responses are anything to go by, there's a good chance they include some or all of "surreal", "disturbing", "dreamlike", "nightmarish" or "trippy".  Particularly "trippy".

These aren't the first computer-generated images to inspire such descriptions.  Notably, fractal images have been described in psychedelic terms at least since the Mandelbrot set came to general attention, and the newer, three-dimensional varieties seem particularly evocative.  The neural-network generated images, however, are in a different league.  What's going on?

Real neural systems appear to be rife with feedback loops.  In experiments with in vitro neuron cultures -- nerve cells growing in dishes with electrodes attached here and there -- a system with no other input to deal with will amplify and filter whatever random noise there is (and there is always something) into a clear signal.  This would be a "signal" in the information theory sense of something that's highly unlikely to occur by chance, not a signal in the sense of something conveying a particular meaning.

This distinction between the two senses of "signal" is important.  Typically a signal in the information theory sense is also meaningful in some way.  That's more or less why they call it "information theory".  There are plenty of counterexamples, though.  For example:
  • tinnitus (ringing in the ears), where the auditory system fills in a frequency the ear itself is no longer supplying
  • pareidolia, where one sees images of objects in random patterns, such as faces in clouds
  • the gambler's fallacy, where one expects a random process to remember what it has already done and compensate for it ("I've lost the last three hands.  I'm bound to get good cards now.")
and so forth.  The common thread is that part of the brain is expecting to perceive something -- a sound, a face, a balanced pattern of "good" and "bad" outcomes -- and selectively processes otherwise meaningless input to produce that perception.

In the generated images, a neural network is first trained to recognize a particular kind of image -- buildings, eyes, trees, whatever -- and then the input image is adjusted bit by bit to strengthen the recognizer's response.  The code doing the adjustment knows nothing about what the recognizer expects.  It just tries something, and if the recognizer gives a stronger signal as a result, it keeps the adjustment.  If you start with random noise, you end up with the kind of image you were looking for.  If you start with non-random input, you get a weird mashup of what you had and what you were looking for.
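Here's a minimal sketch, in Python, of the kind of loop described above.  The "recognizer_score" function is a hypothetical stand-in for the trained network: all it needs to do is return a bigger number when the network responds more strongly to an image.

    import numpy as np

    def strengthen_signal(image, recognizer_score, steps=1000, noise_scale=0.05, rng=None):
        """Nudge an image toward whatever the recognizer responds to.

        recognizer_score stands in for the trained network: it takes an
        image array and returns a number that is larger the more strongly
        the network "sees" its target in the image.
        """
        rng = rng or np.random.default_rng()
        best = image.copy()
        best_score = recognizer_score(best)
        for _ in range(steps):
            # Try a small random tweak, knowing nothing about what the
            # recognizer is actually looking for.
            candidate = np.clip(best + rng.normal(0.0, noise_scale, size=best.shape), 0.0, 1.0)
            score = recognizer_score(candidate)
            # Keep the tweak only if the recognizer responds more strongly.
            if score > best_score:
                best, best_score = candidate, score
        return best

    # Starting from pure noise, the loop drifts toward whatever the
    # recognizer was trained on; starting from a photo, it drifts toward
    # a mashup of the photo and the target.
    start = np.random.default_rng(0).random((64, 64))
    # result = strengthen_signal(start, recognizer_score=my_tree_detector)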

Our brains almost certainly have this sort of feedback loop built in.  Real input is often noisy and ambiguous.  Is that a predator behind those bushes, or just a fallen branch?  Up to a point it's safer to produce a false positive ("predator" when it's really a branch) than a false negative ("branch" when it's really a predator), so if a predator-recognizer feeds "yeah, that might be a four-legged furry thing with teeth" back to the visual system in order to strengthen the signal, survival chances should be better than with a brain that doesn't do that.  A difference in survival chances is exactly what natural selection needs to do its work.

At some point, though, too many false positives mean wasting energy, and probably courting other dangers, by jumping at every shadow.  Where that point is will vary depending on all sorts of things.  In practice, there will be a sliding scale from "too complacent" to "too paranoid", with no preset "right" amount of caution.  Given that chemistry is a vital part of the nervous system's operation, it's not surprising that various chemicals can shift that setting.  If the shift is in a useful direction, we call such chemicals "medicine".  Otherwise we call them "drugs".
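To put rough numbers on that tradeoff -- all of them invented purely for illustration -- here's a toy expected-cost calculation in Python:

    # Toy numbers, invented purely for illustration.
    p_predator = 0.05             # how often an ambiguous rustle is actually a predator
    cost_false_alarm = 1.0        # the price of fleeing from a branch
    cost_missed_predator = 100.0  # the price of ignoring a real predator

    def expected_cost(p_flee):
        """Average cost per ambiguous rustle for an observer who flees
        with probability p_flee."""
        missed = p_predator * (1 - p_flee) * cost_missed_predator
        false_alarms = (1 - p_predator) * p_flee * cost_false_alarm
        return missed + false_alarms

    for p_flee in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p_flee, expected_cost(p_flee))

    # With these numbers the jumpiest observer wins, because a missed
    # predator costs so much more than a wasted sprint.  Make false
    # alarms expensive enough and the balance tips the other way --
    # there is no preset "right" setting, only a tradeoff.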

In other words -- and I'm no expert here -- it seems plausible that we call the images trippy because they are trippy, in the sense that the neural networks that produced them are hallucinating in much the same way an actual brain does.  Clearly, there's more going on than that, but it's an interesting result.


When testing software, it's important to look at more than just the "happy" path.  If you're testing code that divides numbers, you should see what it does when you ask it to divide by zero.  If you're testing code that handles addresses and phone numbers, you should see what it does when you give it something that's not a phone number.  Maybe you should feed it some random gibberish (noting exactly what that gibberish was, for future reference), and see what happens.
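As a sketch of what that can look like in practice, here are a few Python tests in the pytest style.  The "divide" and "parse_phone_number" functions -- and the assumption that the parser returns None for bad input -- are hypothetical stand-ins for whatever code is actually under test.

    import random
    import pytest

    # Hypothetical stand-ins for the code under test.
    from mylib import divide, parse_phone_number

    def test_divide_by_zero_fails_loudly():
        # The unhappy path: dividing by zero should raise, not return garbage.
        with pytest.raises(ZeroDivisionError):
            divide(1, 0)

    def test_parser_rejects_non_phone_numbers():
        assert parse_phone_number("not a phone number") is None

    def test_parser_survives_gibberish():
        # Seed the generator so the exact gibberish is reproducible later
        # -- "noting exactly what that gibberish was, for future reference".
        rng = random.Random(20150702)
        for _ in range(100):
            gibberish = "".join(chr(rng.randrange(32, 127)) for _ in range(40))
            # Rejecting the input is fine; crashing on it is not.
            parse_phone_number(gibberish)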

Testing models of perception (or of anything else) seems similar.  It's nice if your neural network for recognizing trees can say that a picture of a tree is a picture of a tree.  It's good, maybe good enough for the task at hand, if it's also good at not calling telephone poles or corn stalks trees.  But if you're not just trying to recognize pictures, and you're actually trying to model how brains work in general, it's very interesting if your model shows the same kinds of failure modes as an actual brain.  A neural network that can hallucinate convincingly might just be on to something.
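For the recognizing-pictures part, here's a rough sketch of what "good at trees, and good at not calling poles trees" means in numbers.  The "recognizer" function and the labeled data are assumptions for the sake of the sketch, not any particular library's API.

    def evaluate(recognizer, labeled_images):
        """Score a binary "is this a tree?" recognizer.

        labeled_images is a list of (image, is_tree) pairs and
        recognizer(image) returns True or False -- both are assumptions
        made for this sketch.
        """
        tp = fp = fn = tn = 0
        for image, is_tree in labeled_images:
            says_tree = recognizer(image)
            if says_tree and is_tree:
                tp += 1      # called a tree a tree
            elif says_tree and not is_tree:
                fp += 1      # called a telephone pole a tree
            elif is_tree:
                fn += 1      # missed a real tree
            else:
                tn += 1      # correctly ignored a corn stalk
        recall = tp / (tp + fn) if tp + fn else 0.0
        false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
        return recall, false_positive_rate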