I have a draft post from just over a year ago continuing a thread on intelligence in general, and artificial intelligence in particular. In fact, I have two draft AI posts at the moment. There's also one from early 2020 pondering how the agents in OpenAI's hide-and-seek demo figure out that an agent on the opposing team is probably hiding out of sight behind a barrier.
I was pretty sure at the time of the earlier draft that they do this by applying the trained neural network not just to the last thing that happened, but a window of what's happened recently. In other words, they have a certain amount of short-term memory, but anything like long-term memory is encoded in the neural net itself, which isn't updated during the game. This ought to produce effects similar to the "horizon effect" in early chess engines, where a player that could look ahead, say, three moves and see that a particular move was a blunder would play another move that led to the same blunder, but only after four moves.
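To make that concrete, here's a minimal sketch, in Python, of what I mean by applying a frozen, trained network to a window of recent observations. Everything in it (the WindowedAgent class, the WINDOW size, the policy callable) is made up for illustration; it's not OpenAI's actual code.

```python
from collections import deque

import numpy as np

WINDOW = 8  # short-term memory: only the last 8 observations are visible


class WindowedAgent:
    """Hypothetical agent: a frozen, pre-trained policy applied to a sliding
    window of recent observations. Nothing is learned during the game itself;
    any longer-term "memory" lives in the trained weights."""

    def __init__(self, policy, obs_size):
        self.policy = policy                 # trained network; weights never change here
        self.obs_size = obs_size
        self.history = deque(maxlen=WINDOW)  # anything older than WINDOW steps is forgotten

    def act(self, observation):
        self.history.append(np.asarray(observation, dtype=float))
        # Zero-pad until the window is full, then stack it into one flat input.
        padding = [np.zeros(self.obs_size)] * (WINDOW - len(self.history))
        window = np.concatenate(padding + list(self.history))
        return self.policy(window)
```

The analogue of the chess engine's horizon is the maxlen: anything that happened more than WINDOW steps ago has effectively never happened, unless it's somehow baked into the trained weights.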
I'm still pretty sure that's what's going on, and I was going to put that into a post one of these days as soon as I read through enough of the source material to confirm that understanding, but I never got around to it.
Because ChatGPT (and then GPT-4) happened.
ChatGPT is widely agreed to have been a major game changer and, yeah ... but which games?
From a personal point of view, my musings on how AIs work and what they might be able to do are now relevant to what my employer talks about in its quarterly earnings reports, so that put a damper on things as well. For the same reason, I'll be staying away from anything to do with the internals or how AI might play into the business side. As usual, everything here is me speaking as a private citizen, about publicly available information.
Out in the world at large, I recall a few solid months of THIS CHANGES EVERYTHING!!!, which seems to have died down into a steady stream of "How to use AI for ..." and "How to deal with the impact of AI in your job." I've found some of this interesting, but a lot of it exasperating, which leads me to the title.
There are at least three very distinct things "AI" could reasonably mean at the moment:
- The general effort to apply computing technology to things that humans and other living things have historically been much better at, things like recognizing faces in pictures, translating from one language to another, walking, driving a car and so forth.
- Large language models (LLMs) like the ones behind ChatGPT, Bard and company
- Artificial General Intelligence (AGI), a very vague notion roughly meaning "able to do any mental task a human can do, except faster and better"
There are several posts under the AI tag here (including the previous post) poking and prodding at all three of those, but here I'm interested in the terms themselves.
To me, the first AI is the interesting part, what I might even call "real AI" if I had to call anything that. It's been progressing steadily for decades. LLMs are a part of that progression, but they don't have much to do with, say, recognizing faces or robots walking. All of these applications involve neural networks trained with backpropagation (I'm pretty sure walking robots use neural nets), but training a neural net on trillions of words of text won't help it recognize faces or control a robot walking across a frozen pond because ... um, words aren't faces or force vectors?
If you ask a hundred people at random what AI is, though, you probably won't hear the first answer very much. You'll hear the last two quite a bit, and more or less interchangeably, which is a shame, because they have very little in common.
LLMs are a particular application of neural nets. They encode interesting information about the contents of a large body of text. That encoded information isn't limited to what we think of as the factual content of the training text, and that's a significant result. For example, if you give an LLM an email and ask it to summarize the contents, it can do so even though it wasn't explicitly trained to summarize email, or even to summarize unfamiliar text in general.
To be clear, summarizing an email is different from summarizing part of the text that an LLM has been trained on. You could argue that in some very broad sense the LLM is summarizing the text it's been trained on when it answers a factual question, but the email someone just sent you isn't in that training text.
Somehow, the training phase has built a model of how a text being summarized relates to a summary of that text, based in part on however many examples of summaries were in the training data. The trained LLM uses that model to predict what could reasonably come after the text of the email plus a prompt like "please summarize this", and it does a pretty good job.
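As a rough sketch of what that looks like in practice (a deliberately naive illustration, with a made-up generate function standing in for whatever model and API are actually behind the chat window), the "summarization feature" is really just a prompt:

```python
def summarize(email_text: str, generate) -> str:
    """Frame summarization as plain continuation prediction.

    `generate` is a stand-in for any text-completion function: it takes a
    prompt string and returns the model's most plausible continuation.
    """
    prompt = (
        "Please summarize the following email in a few sentences.\n\n"
        f"Email:\n{email_text}\n\n"
        "Summary:"
    )
    # The model has never seen this particular email. It's just predicting
    # what plausibly comes after "Summary:", given everything above it.
    return generate(prompt)
```

Nothing in that sketch names a summarization capability as such; the task is smuggled in through the prompt, and the model's training has to carry the rest.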
That ability is certainly not nothing. There may or may not be a fundamental difference between answering a factual question based on text that contains a large number of factual statements and performing a task based on text that contains examples of the task, or descriptions of the task being done. In any case, an LLM summarizing an email is doing something that isn't directly related to the text it was trained on, and that it wasn't specifically trained to do.
I'm pretty sure this is not a new result with LLMs, but seeing the phenomenon with plain English text that any English speaker can understand is certainly a striking demonstration.
There are a couple of reasons to be cautious about linking this to AGI.
First, to my knowledge, there isn't any widely accepted definition of what AGI actually is. From what I can tell, there's a general intuition of something like The Computer from mid-20th-century science fiction, something that you can give any question or task to and it will instantly give you the best possible answer.
"Computer, what is the probability that this planet is inhabited?"
"Computer, devise a strategy to thwart the enemy"
"Computer, set the controls for the heart of the Sun"
"Computer, end conflict among human beings"
This may seem like an exaggeration or strawman, but at least one widely circulated manifesto literally sets forth that "Artificial Intelligence is best thought of as a universal problem solver".
There's quite a bit of philosophy, dating back centuries and probably much, much farther, about what statements like that might even mean, but whatever they mean, it's abundantly clear by now that an LLM is not a universal problem solver, and neither is anything else that's currently going under the name AI.
Second, in my personal opinion, even a cursory look under the hood and a kick of the tires ought to be enough to determine that nothing like a current LLM will ever be a universal problem solver. This is not just a matter of what kinds of responses current LLM-based chatbots give. It's also a matter of what they are. The neural net model underpinning this is based pretty explicitly on how biological computers like human brains work. Human brains are clearly not universal problem solvers, so why would an LLM be?
There's an important distinction here, between "general problem solver", that is, able to take on an open-ended wide variety of problems, and "universal", able to solve any solvable problem. Human brains are general problem-solvers, but nothing known so far, including current AIs, is universal.
This may sound like the argument that, because neural nets are built and designed by humans, they could never surpass a human's capabilities. That's never been a valid argument and it's not the argument here. Humans have been building machines that can exceed human capabilities for a long, long time, and computing machines that can do things that humans can't have been around for generations or centuries, depending on what you count as a computing machine and a human capability.
The point is that neural nets, or any other computing technology, have a particular set of problems that they can solve with feasible resources in a feasible amount of time. The burden of proof is on anyone claiming that this set is "anything at all", that by building a network bigger and faster than a human brain, and giving it more data than a human brain could hope to handle, neural nets will not only be able to solve problems that a human brain can't -- which they already can -- but will be able to solve any problem.
So next time you see something talking about AI, consider which AI they're referring to. It probably won't be the first (the overall progress of machines doing things that human brains also do). It may well be the second (LLMs), in which case the discussion should probably be about things like how LLMs work or what a particular LLM can do in a particular context.
If it's talking about AGI, it should probably be trying to untangle what that means or giving particular reasons why some particular approach could solve some particular class of problems.
If it's just saying "AI" and things along the lines of "Now that ChatGPT can answer any question and AGI is right around the corner ...", you might look for a bit more on how those two ideas might connect.
My guess is there won't be much.