Artificial General Intelligence, or AGI, so the story goes, isn't here yet, but it's very close. Soon we will share the world with entities that are our intellectual superiors in every way, that have their own understanding of the world and can learn any task and execute it flawlessly, solve any problem perfectly and generally outsmart us at every turn. We don't know what the implications of this are (and it might not be a good idea to ask the AGIs), but they're certainly huge, quite likely existential.
Or at least, that's the story. For a while now, my feeling has been that narratives like this one say more about us than they do about AI technology in general or about AGI in particular.
At the center of this is the notion of AGI itself. I gave a somewhat extreme definition above, but not far, I think, from what many people think it is. OpenAI, whose mission is to produce it, has a more focused and limited definition. While the most visible formulation is that an AGI would be "generally smarter than humans", the OpenAI charter defines it as "a highly autonomous system that outperforms humans at most economically valuable work". While "economically valuable work" may not be the objective standard that it's trying to be here -- valuable to whom? by what measure? -- it's still a big step up from "generally smarter".
Google's DeepMind team (as usual, I don't really know anything you don't, and couldn't tell you anyway) lays out more detailed criteria, based on three properties: autonomy, performance and generality. A system can exhibit various levels of each of these, from zero (a desk calculator, for example, would score low across the board) to superhuman, meaning able to do better than any human. In this view there is no particular dividing line between AGI and not-AGI, but anything that scored "superhuman" on all three properties would have to be on the AGI side. The paper calls this Artificial Superintelligence (ASI), and dryly evaluates it as "not yet achieved".
There are several examples of superhuman intelligence in current AI systems. This blog's favorite running example, chess engines, can consistently thrash the world's top human players, but they're not very general (more on that in a bit). The AlphaFold system can predict how a string of amino acids will fold up into a protein better than any top scientist, but again, it's specialized to a particular task. In other words, current AIs may be superhuman, but not in a general way.
As to generality, LLMs such as ChatGPT and Bard are classified as "Emerging AGI", which is the second of six levels of generality, just above "No AI" and below Competent, Expert, Virtuoso and Superhuman. The authors do not consider LLMs, including their own, as "Competent" in generality. Competent AGI is "not yet achieved." I tend to agree.
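To make that taxonomy concrete, here's a rough sketch in Python of the scheme as described above: three properties, each rated on a scale running from "No AI" up to "Superhuman". The names and structure follow my summary, not the paper's actual tables, and the example ratings are just illustrations.

```python
from dataclasses import dataclass
from enum import IntEnum

class Level(IntEnum):
    """Levels as described above; the paper rates autonomy, performance
    and generality on scales like this."""
    NO_AI = 0
    EMERGING = 1      # where LLMs like ChatGPT and Bard are placed for generality
    COMPETENT = 2     # "not yet achieved"
    EXPERT = 3
    VIRTUOSO = 4
    SUPERHUMAN = 5

@dataclass
class Rating:
    autonomy: Level
    performance: Level
    generality: Level

    def is_asi(self) -> bool:
        # "Superhuman on all three properties" is what the paper calls
        # Artificial Superintelligence (ASI).
        return min(self.autonomy, self.performance, self.generality) == Level.SUPERHUMAN

# Illustrative rating: a chess engine is superhuman in performance,
# but scores near the bottom on generality and autonomy.
chess_engine = Rating(Level.NO_AI, Level.SUPERHUMAN, Level.NO_AI)
print(chess_engine.is_asi())  # False
```

Nothing here does any actual rating, of course; it just pins down the vocabulary the rest of this post leans on.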
So what is this "generality" we seek?
Blaise Agüera y Arcas and Peter Norvig (both at Google, but not at DeepMind, at least not at the time) argue that LLMs are, in fact, AGI. That is, flawed though they are, they're not only artificial intelligence, which is not really in dispute, but general. They can converse on a wide range of topics, perform a wide range of tasks, work in a wide range of modalities, including text, images, video, audio and robot sensors and controls, use a variety of languages, including some computer languages, and respond to instructions. If that's not generality, then what is?
On the one hand, that seems hard to argue with, but on the other hand, it's hard to escape the feeling that at the end of the day, LLMs are just producing sequences of words (or images, etc.), based on other sequences of words (or images, etc.). While it's near certain that they encode some sort of generalizations about sequences of words, they also clearly don't encode much, if anything, about what the words actually mean.
By analogy, chess engines like Stockfish make fairly simple evaluations of individual positions, at least from the point of view of a human chess player. There's nothing in Stockfish's evaluation function that says "this position would be good for developing a queenside attack supported by a knight on d5". However, by evaluating huge numbers of positions, it can nonetheless post a knight on d5 that will end up supporting a queenside attack.
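To make the point concrete, here's a toy version of that division of labor: a deliberately crude evaluation function (material count only) plus a plain minimax search. It assumes the third-party python-chess package and is nothing like Stockfish's actual code, but it shows how an evaluation that knows nothing about strategy, combined with enough look-ahead, still produces reasonable moves.

```python
import chess  # assumes the third-party python-chess package is installed

# Deliberately crude evaluation: count material, nothing about knights
# on d5 or queenside attacks.
PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def evaluate(board: chess.Board) -> int:
    """Material balance from White's point of view."""
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == chess.WHITE else -value
    return score

def search(board: chess.Board, depth: int) -> int:
    """Plain minimax; any 'strategy' emerges only from looking ahead."""
    if depth == 0 or board.is_game_over():
        return evaluate(board)
    maximizing = board.turn == chess.WHITE
    best = -10_000 if maximizing else 10_000
    for move in board.legal_moves:
        board.push(move)
        score = search(board, depth - 1)
        board.pop()
        best = max(best, score) if maximizing else min(best, score)
    return best

def best_move(board: chess.Board, depth: int = 3) -> chess.Move:
    """Pick the move whose resulting position searches best for the side to move."""
    maximizing = board.turn == chess.WHITE
    best, best_score = None, None
    for move in board.legal_moves:
        board.push(move)
        score = search(board, depth - 1)
        board.pop()
        if best_score is None or (score > best_score if maximizing else score < best_score):
            best, best_score = move, score
    return best

print(best_move(chess.Board(), depth=2))  # a legal, if unremarkable, opening move
```

Stockfish's evaluation and search are enormously more sophisticated, but the division of labor between local evaluation and deep search is the same.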
A modern chess engine doesn't try to just capture material, or follow a set of rules you might find in a book on chess strategy. It performs any number of tactical maneuvers and implements any number of strategies that humans have developed over the centuries, and some that they haven't. If that's not general, what is?
And yet, Stockfish is obviously not AGI. It's a chess engine. Within the domain of chess, it can do a wide variety of things in a wide variety of ways, things that, when a human does them, require general knowledge as well as understanding, planning and abstract thought. An AI that could form abstractions and make plans in any domain it encounters, including domains it hasn't encountered before, would have to be considered an AGI, and such an AI could most likely learn to play chess well. But that doesn't make Stockfish AGI.
I think much the same thing is going on with LLMs, though there's certainly room for disagreement. Agüera y Arcas and Norvig see multiple domains like essay writing, word-problem solving, Italian-speaking, Python-coding and so forth. I see basically a single domain of word-smashing. Just like a chess engine can turn a simple evaluation function and tons of processing power into a variety of chess-related abilities, I would claim that an LLM can turn purely formal word-smashing and tons of training text and processing power into a variety of word-related abilities.
The main lesson of LLMs seems to be that laying out coherent sequences of words in a logical order certainly looks like thinking. But even though there's clearly more going on than in an old-fashioned Markov chain, there's abundant evidence that they're not doing anything like what we consider "thinking" (I explore this a bit more in this post and in some others with the AI tag).
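For contrast, here's roughly what an "old-fashioned Markov chain" over words amounts to: choose each next word by looking only at the previous word, using counts from some training text. This is a toy illustration of the general idea, not anyone's actual system.

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Record, for each word, which words follow it in the training text."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain: dict, start: str, length: int = 20) -> str:
    """Each next word is chosen by looking only at the single previous word."""
    word, output = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

corpus = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat chased the dog around the mat")
print(generate(build_chain(corpus), "the"))
```

An LLM's choice of the next word depends on a far richer representation of the context, but the output interface is the same: one word (or token) after another.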
I think what we're really looking for in AGI is something that will make better decisions than we do, for some notion of "better". That "for some notion" qualifier isn't just boilerplate or an attempt at a math joke. People differ, pretty sharply sometimes, on what makes a decision better or worse. Different people bring different knowledge, experience and temperaments to the decision-making process, but beyond that, we're not rational beings and never will be.
Making better decisions probably does require generality in the sense of learning and creativity, but the real goal is something even more elusive: judgment. Wisdom, even. Much of the concern over AGI is, I think, about judgment.
We don't want to create something powerful with poor judgment. What constitutes good or poor judgment is at least partly subjective, but when it comes to AIs, we at least want that judgment to regard the survival of humanity as a good thing. One of the oldest nightmare scenarios, far older than computers or Science Fiction as a genre, is the idea that some all-powerful, all-wise being will judge us, find us wanting and destroy us. As I said at the top, our concerns about AGI say more about us than they do about AI.
The AI community does talk about judgment, usually under the label of alignment. Alignment is a totally separate thing from generality or even intelligence. "Is it generally intelligent?" is not just a different question, but a different kind of question, from "Does its behavior align with our interests?" In other words, "good judgment" means "good for us". I'm not going to argue against that, or at least not very enthusiastically.
Alignment is a concern when a thing can make decisions, or influence us to make decisions, in the real world. Technology to amplify human intelligence is ancient (writing, for example), as is technology to influence our decisions (think rolling dice or drawing lots for ancient examples, but also any technology, such as a spreadsheet, that we come to rely on to make decisions).
Technology that can make decisions based on an information store it can also update is less than a century old. While computing pioneers were quick to recognize that this was a very significant development, it's no surprise that we're still coming to grips with just what it means a few decades later.
Intelligence is important here not for its own sake, but because it relates to concepts like risk, judgment and alignment. To be an active threat, something has to be able to influence the real world, and it has to be able to make decisions on its own. That ability to make decisions is where intelligence comes in.
Computers have been involved in controlling things like power plants and weapons for most of the time computers have been around, but until recently control systems have only been able to implement algorithms that we directly understand. If the behavior isn't what we expect, it's because a part failed or we got the control algorithm wrong. With the advent of ML systems (not just LLMs), we now have a new potential failure mode: The control system is doing what we asked, but we don't really understand what that means.
This is actually not entirely new, either. It took a while to understand that some systems are chaotic: certain kinds of non-linear feedback can lead to unpredictable behavior even when the control system itself is simple and you know the inputs with a high degree of precision. Nonetheless, state-of-the-art ML models introduce a whole new level of opaqueness. There is now a well-developed theory of when non-linear systems go chaotic and what kinds of behavior they can exhibit; there's nothing like that for ML models.
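The classic textbook illustration is the logistic map: a one-line deterministic rule whose trajectories from two nearly identical starting points diverge completely within a few dozen steps. The sketch below is standard material, included only to illustrate the point about simple systems behaving unpredictably.

```python
def logistic(x: float, r: float = 4.0) -> float:
    """One step of the logistic map, a deliberately simple non-linear rule."""
    return r * x * (1 - x)

# Two starting points that agree to nine decimal places.
a, b = 0.2, 0.2 + 1e-9
for step in range(1, 51):
    a, b = logistic(a), logistic(b)
    if step % 10 == 0:
        print(f"step {step:2d}: a={a:.6f}  b={b:.6f}  diff={abs(a - b):.6f}")
# By step 40 or so the two trajectories bear no resemblance to each other,
# even though the rule is simple and the inputs were known very precisely.
```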
This strongly suggests that we should tread very carefully before, say, putting an ML model in charge of launching nuclear missiles, but currently, and for quite a while yet as far as I can tell, whether to do such a thing is still a human decision. If some sort of autonomous submarine triggers a nuclear war, that's ultimately a failure in human judgment for building the sub and putting nuclear missiles on it.
Well, that went darker than I was expecting. Let's go back to the topic: What would superhuman intelligence even mean? The question could mean two different things:
- How do you define superhuman intelligence? It's been over 70 years since Alan Turing asked whether machines could think, but we still don't have a good answer. We have a fairly long list of things that aren't generally intelligent, including current LLMs except perhaps in a limited sense, and we're pretty sure that the ability to learn new tasks is a key factor, but we don't have a good handle on what it really means to have such an ability.
- What are the implications of something having superhuman intelligence? This is an entirely different question, having to do with what kinds of decisions we allow an AI to make and about what sorts of things. The important factors here are risk and judgment.
These are two very different questions, but they're related.
It's natural to think of them together. In particular, when some new development comes along that may be a step toward AGI (first question), it's natural, and useful, to think of the implications (second question). But that needs to be done carefully. It's easy to follow a chain of inference along the lines of
- X is a major development in AI
- So X is a breakthrough on the way to AGI
- In fact, X may even be AGI
- So X has huge implications