Intermittent Conjecture: 2022

Wednesday, December 28, 2022

Pushing back on AI (and pushing back on that)

A composer I know is coming to terms with the inevitable appearance of online AIs that can compose music based on general parameters like mood, genre and length, somewhat similar to AI image generators that can create an image based on a prompt (someone I know tried "Willem Dafoe as Ronald McDonald" with ... intriguing ... results). I haven't looked at any of these in depth, but from what I have seen it looks like they can produce arbitrary amounts of passable music with very little effort, and it's natural to wonder about the implications of that.

A common reaction is that this will sooner or later put actual human composers out of business, and my natural reaction to that is "probably not". Then I started thinking about my reasons for saying that and the picture got a bit more interesting. Let me start with the hot takes, and then go on to the analysis.

This type of AI is generally built by training a neural network against a large corpus of existing music. Neural nets are now pretty good at extracting general features and patterns and extrapolating from them, which is why the AI-generated music sounds a lot like stuff you've already heard. That's good because the results sound like "real music", but it's also a limitation.
At least in its present form, using an AI still requires human intervention. In theory, you could just set some parameters and go with whatever comes out, but if you want to provide, say, the soundtrack for a movie or video game, you'll need to actually listen to what's produced and decide what music goes well with what parts, and what sounds good after what, and so forth. In other words, you'll still need to do some curation.

Along with this, I have a general opinion about the progress of AI as a whole: A few years back, there was a breakthrough. Hardware got fast enough, thanks in part to special-purpose tensor-smashing chips, and new modeling techniques were developed, so that neural network-based machine learning (ML) models could solve interesting problems that had so far resisted solution. We're now in the phase of working out the possibilities, with new applications turning up left and right.

One way to look at it is that there were a bunch of problem spaces out there that computers weren't well suited for before but are a good match for the new techniques, and we're in the process of identifying those. Because there has been so much progress in applying the new ML, and because these models are based on the way actual brains work, it's tempting to think that they can handle anything that the human brain can handle, and/or that we've created "general intelligence", but that's not necessarily the case.

My strong hunch is that before too long the limitations will become clear and the flood of new applications will slow. There may or may not be a new round of "failed promise of AI" proclamations and amnesia about how much progress has been made. Researchers will keep working away, as they always have, and at some point there will be another breakthrough and another burst of progress. Lather, rinse, repeat.

That's all well and good, but honestly those bullet-pointed arguments above aren't that great, and the more general argument doesn't even try to say where the limits are.

The bullet points amount to two arguments that go back to the beginnings of AI, if not before, to the first time someone built an automaton that looked like it was doing something human, and they have a long history of looking compelling in the short run but failing in the long run.

The first argument is basically that the automaton can only do what it was constructed or taught to do by its human creators, and therefore it cannot surpass them. But just as a human-built machine can lift more than a human, a human-built AI can do things that no human can. Chess players have known this for decades now (and I'm pretty sure chess wasn't the first such case).
The second argument assumes that there's something about human curation that can't be emulated by computers (though I was careful to say "at least in its present form"). The oldest form of this argument is that a human has a soul, or a "human spark of creativity" or something similar, while a machine doesn't, so there will always be some need for humans in the system.

The problem with that one is that when you try to pin down that human spark, it basically amounts to "whatever we can do that the machines can't ... yet", and over and over again the machines have eventually turned out to be able to do things they supposedly couldn't. Chess players used to believe that computers could only play "tactical chess" and couldn't play "positional chess", until Deep Blue demonstrated that if you can calculate deeply enough, there isn't any real difference between the two.

As much as I would like to say that computers will never be able to compose music as well as humans, it's pretty certain that they eventually will, including composing pieces of sublime emotional intensity and inventing new paradigms of composition. I don't expect that to happen very soon -- more likely there will be an extended period of computers cranking out reasonable facsimiles of popular genres -- but I do expect it to happen.

Where does that leave the composer? I think a couple of points from the chess world are worth considering:

Computer chess did not put chess masters out of business. The current human world champion would lose badly to the best computer chess player, which has been the case for decades, and we can expect it to be the case from here on out, but people still like to play chess and to watch the best human players play (watching computers play can also be fun). People will continue to like to make music and to hear music by good composers and players.
Current human chess players spend a lot of time practicing with computers, working out variations and picking up new techniques. I expect similar things will happen with music: at least some composers will get ideas from computer-generated music, or train models with music of their choosing and do creative things with the results, or do all sorts of other experiments.

There is also some relevant history from the music world:

Drum machines did not put drummers out of business. People can now produce drum beats without hiring a drummer, including beats that no human drummer could play, and beats that sound like humans playing with "feel" on real instruments, but the effect of that has been more to expand the universe of people who can make music with drum beats than to reduce the need for drummers (I'm not saying that drummers haven't lost gigs, but there is still a whole lot of live performance going on with a drummer keeping the beat).
Algorithms have been a part of composition for quite a while now. Again, this goes back to before computers, including common-practice techniques like inversion, augmentation and diminution and 20th-century serialism. An aleatoric composition arguably is an algorithm, and electronic music has made use of sequencers since early days. From this point of view, model-generated music is just one more tool in the toolbox.
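To make the "composition as algorithm" point concrete, here is a minimal sketch of inversion, augmentation and diminution as plain functions over a list of notes. The note representation (MIDI pitch numbers paired with durations in beats), the function names and the four-note motif are all assumptions made up for illustration, not anything taken from a particular piece or tool.

```python
from fractions import Fraction

# A melody is a list of (MIDI pitch, duration in beats) pairs.
# This motif is hypothetical, chosen only to have something to transform.
motif = [
    (60, Fraction(1)),
    (62, Fraction(1, 2)),
    (64, Fraction(1, 2)),
    (67, Fraction(2)),
]


def invert(melody, axis=None):
    """Mirror every pitch around an axis pitch (default: the first note)."""
    if axis is None:
        axis = melody[0][0]
    return [(2 * axis - pitch, dur) for pitch, dur in melody]


def scale_durations(melody, factor):
    """Multiply every duration by factor.

    A factor above 1 gives augmentation, a factor below 1 gives diminution.
    """
    return [(pitch, dur * factor) for pitch, dur in melody]


inverted = invert(motif)                             # rising line becomes falling
augmented = scale_durations(motif, 2)                # twice as slow
diminished = scale_durations(motif, Fraction(1, 2))  # twice as fast
```

Each transformation is a mechanical rule applied to existing material, which is the sense in which these techniques were algorithmic well before computers were involved.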

Humanity has had a complicated relationship with the machines it builds. On the one hand, people generally build machines to enable them to do something they couldn't, or relieve them of burdensome tasks. Computers are no different. On the other hand, people have always been cautious about the potential for machines to disrupt their way of life, or their livelihood (John Henry comes to mind). Both attitudes make sense. Fixating on one at the expense of the other is generally a mistake.

Personally, having watched AI develop for decades now, I don't see any significant change in that dynamic. We don't seem particularly closer to the Singularity than we ever were (and I argue in that post that's in part because the Singularity isn't really a well-defined concept). But then, given the way these things are believed to work, we may not know different until it's too late.

If it does happen, maybe someone, or something, will compose an epic piece to mark the event.

Friday, March 25, 2022

The house with the green shutters

Consider these two sentences:

I went around the house with the green shutters

In other words, the house has green shutters and I'm going around that house.

I went around the house with the green shutters to install

In other words, I have some green shutters I need to install on the house and I'm carrying them around the house.

These are considerably different meanings, and they have different structures from a grammatical point of view. In the first sentence, with the green shutters is describing the house -- it has green shutters. In the second, it is describing my going around the house -- I have the shutters with me as I go.

This second sentence might be considered a garden-path sentence, which is a sentence that you have to reinterpret midway through because the interpretation you started with stops working. Wikipedia has three well-known examples:

The old man the boats
The complex houses married and single soldiers and their families
The horse raced past the barn fell

If your first reaction to those sentences is "Wait ... what?" like mine was, they might make more sense with a little more context:

The young stay on shore. The old man the boats.
The complex was built by the Corps of Engineers. The complex houses married and single soldiers and their families.
The horse led down the path was fine. The horse raced past the barn fell.

or with a slight change in wording:

The old people man the boats
The housing complex houses married and single soldiers and their families
The horse that was raced past the barn fell

While sentences like these do come up in real life, especially in headlines or other situations where it's common to leave out words like "that" or "who" which can provide valuable clues about the structure of a sentence, they also feel a bit artificial. An editor would be well within their rights to suggest that an author rephrase any of the three, because they're hard to read, because the whole structure and meaning aren't what you think they are at first.

the old man, with the adjective old modifying the noun man, changes to the old, a noun phrase made from an adjective, as the subject of the verb man. The sentence fragment the old man becomes a complete sentence (though, granted, it's harder to leave the object off of man the boats than in a sentence like I read every day).
the complex houses, with the adjective complex modifying the noun houses, becomes the complex as the subject of the verb houses. This is actually the same pattern as the first case, except that married can keep the game going (The complex houses married elements of the Rococo and Craftsman styles). It may be worth noting that in this case, the two interpretations would generally sound distinct. As a noun phrase, the complex houses would have the main stress on houses, while as a noun phrase and a verb, it would have the main stress on complex.
the horse raced, a complete sentence with raced in the simple past, becomes the horse modified by the past participle raced.

Compared to these, I don't think either of the two "green shutters" sentences is particularly hard to understand. While the change in meaning is significant, the change in structure isn't as great as in the garden-path examples. The subject is still I. The verb is still went, modified by around the house. The only difference is in what with applies to. Every word, except possibly with, is used in the same sense in both sentences.

In technical terms, this is a syntactic ambiguity. What's uncertain is which particular words relate to which others. The meanings of the words themselves are the same either way. At the very least, with remains a preposition. In the garden-path sentences, the senses of the words, and in particular their parts of speech, change when the sentence is reinterpreted -- a lexical ambiguity, one reason to think there's something different going on in the two cases.

This sort of thing is bread and butter for linguistics and cognitive science experiments where subjects are given sentences and asked to, say, pick the picture that best matches them, with the experimenters timing the responses and looking for differences that suggest that some structures require more processing than others. In this case, I strongly suspect that the sentences I gave would take much less time for people to sort out than the garden-path sentences.

In short, while I think that there are some similarities, I also think different things are going on in the brain when dealing with the sentences I gave, as opposed to garden-path sentences.

Even without running the experiments or considering garden-path sentences, there are some clear implications just from considering sentences like the "green shutters" ones above:

On the one hand, our parsing of sentences is sequential in some strong sense. At several points, we can stop and say "This is a sentence". If you hear nothing more, you can still work out possible meanings:

I went around the house
I went around the house with the green shutters
I went around the house with the green shutters to install

On the other hand, the structure of a sentence is provisional in some sense. After hearing I went around the house with the green shutters and associating with the green shutters with house, we can then hear to install and fairly easily re-associate with with went around the house.
Semantics and context affect this process. The sentence I went around the house with the green shutters is itself ambiguous. You could read it the same way as the other sentence, meaning that I was carrying green shutters around the house, but the house with the green shutters is much more likely to refer to the house, so you probably don't. Similarly, putting a context sentence before a garden-path sentence makes it more likely that the garden-path sentence will make sense without re-reading.

(That last point runs counter to Chomsky's assertion that "[T]he notion of 'probability of a sentence' is an entirely useless one, under any known interpretation of this term".)

Assuming that there's some sort of re-structuring going on when you hear to install after I went around the house with the green shutters, it would be interesting to see how different theories of grammar handle it.

In a phrase structure grammar, the shift between the two sentences is from a structure like

I [went [around [the house [with the green shutters]]]]

(a full parse tree would have a lot more to it than this) to

I [went [around [the house]][with the green shutters [to install]]]

That is, with the green shutters goes from being a constituent of the noun phrase (the house with the green shutters) to a constituent of the verb phrase went around the house with the green shutters to install. From a phrase-structure point of view, the two possible readings of I went around the house with the green shutters are examples of a bracketing ambiguity, since there are two ways to put the brackets.

You can look at this as lifting [with the green shutters] out of [the house [with the green shutters]] and putting it back next to [around the house]. In principle, the place that [with the green shutters] is lifted out of can be as deep as you want: [I went [down the path [around the house [with the green shutters]]]] and so forth. You're still moving a chunk of the parse tree from one place to another, but as the nesting gets deeper, you have to navigate through more tree nodes to find what you're moving.
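As a rough sketch of that "lift it out and put it back" idea, the snippet below stores each bracketing as nested Python lists, removes the with-phrase from wherever it is nested, and reports how many levels down it was found. The list representation and the helper names show and detach are assumptions for illustration only. Like the bracketings above, this is far simpler than a real parse tree, and the deeper example is bracketed slightly differently from the one in the text.

```python
PP = ["with the green shutters"]

# The "shutters are on the house" reading.
low = ["I", ["went", ["around", ["the house", PP]]]]


def show(node):
    """Render a nested-list tree in bracket notation."""
    if isinstance(node, str):
        return node
    return "[" + " ".join(show(child) for child in node) + "]"


def detach(tree, target):
    """Return (copy of tree without target, depth at which target was found).

    The depth is None if target does not occur anywhere in tree.
    """
    kept, found_depth = [], None
    for child in tree:
        if found_depth is None and child == target:
            found_depth = 1  # drop the target itself
        elif found_depth is None and isinstance(child, list):
            new_child, depth = detach(child, target)
            kept.append(new_child)
            if depth is not None:
                found_depth = depth + 1
        else:
            kept.append(child)
    return kept, found_depth


# Lift the with-phrase out of the noun phrase ...
stripped, depth = detach(low, PP)

# ... and put it back as a sibling of [around [the house]], now with "to install".
high = [stripped[0], stripped[1] + [PP + [["to install"]]]]

# With one more level of nesting, the search has to go one level deeper.
deeper = ["I", ["went", ["down the path", ["around", ["the house", PP]]]]]
_, deeper_depth = detach(deeper, PP)
```

Printing show(low) and show(high) gives the two bracketings (with an extra outer pair of brackets around the whole sentence), and comparing depth with deeper_depth shows the extra navigation that deeper nesting requires.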

In a dependency grammar, the shift is pretty simple: with switches from a dependent of house to a dependent of went (if I understand correctly, with would be a syntactic dependency of house or went, but the semantic dependency is the other way around: house or went would be a semantic dependency of with -- but there's a good chance I don't understand correctly). Saying that I went around the house with the green shutters is ambiguous is saying that there are two possible places that with could attach as a dependency.
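For comparison, here is a minimal sketch of the same shift in dependency terms, with the sentence stored as a list of words plus, for each word, the index of the word it depends on. The particular head assignments are an assumption: they follow the description above, where with hangs off house or went, and dependency frameworks differ on details such as whether a preposition or its noun counts as the head. The names heads_low, heads_high and reattach are made up for the example.

```python
words = ["I", "went", "around", "the", "house", "with", "the", "green", "shutters"]

# heads_low[i] is the index of the word that words[i] depends on (None = root).
# This is the "shutters are on the house" reading: "with" depends on "house".
heads_low = [1, None, 1, 4, 2, 4, 8, 8, 5]

WENT, HOUSE, WITH = 1, 4, 5


def reattach(heads, dependent, new_head):
    """Return a copy of heads in which one word has been given a new head."""
    updated = list(heads)
    updated[dependent] = new_head
    return updated


# Hearing "to install" moves "with" from "house" to "went"; nothing else changes.
heads_high = reattach(heads_low, WITH, WENT)

# Indices of the words whose head differs between the two readings.
changed = [i for i, (a, b) in enumerate(zip(heads_low, heads_high)) if a != b]
```

Under these assumptions the whole re-analysis is a single changed entry, with no searching through nested structure, which is one way to see why the dependency account of the shift looks so simple.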

Consider one more sentence:

I went around the house with the green shutters to install the awning

After seeing the awning, the shutters are back on the house and we're back where we started (and the object of install is now awning). The fact that we can handle any of the three sentences suggests that there's something in the brain that can track both possible structures, that is, both ways of associating with, whether as a constituent or a dependency or something else, and switch back and forth between them, or in some cases even end up in a state of "Wait a sec, did you mean the shutters are on the house, or you were carrying them?"

There ought to be experiments to run in order to test this, and I wouldn't be surprised if they've already been run, but I'll leave that to the real linguists.