Artificial Intelligence Apocalypse? Why the Expert Doomsayers May Be Right to Worry
Certain AI experts are very worried that superintelligent AI systems will (1) soon be smarter than humans, (2) develop goals of their own, and (3) in pursuing those goals, spell doom for the human species.
Certain experts are profoundly worried that AI tools will soon be smarter than humans, at least along certain dimensions.
They are concerned that these superintelligent systems will pursue their own goals, and that those goals may be unknown, and perhaps unknowable, to us.
They are further concerned that, in pursuit of those goals, these AI systems will act in ways that spell doom, or at least potential doom, for the human species.
The Increasingly Heated Debate Over AI Risk
This is the first in a series of posts on the debates over AI Risk. AI researchers and experts have raised numerous arguments and counter-arguments about whether Artificial Intelligence tools pose an “existential risk” to humanity. In the words of philosopher Nick Bostrom, an existential risk is “one where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.”
In this post, we’ll introduce some of the arguments made by those who believe AI poses such a risk. In later posts, we’ll address counter-arguments and discuss efforts, including potential regulation, to manage these risks.
Background on the Debate
Concerns about AI have been around for decades. For example, in 1960, in the very early days of the computer age, cybernetics pioneer Norbert Wiener warned of the dangers of what he called ‘mechanical agency’ and of the need to ensure that the goals we put into our machines are aligned with our own:
If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere once we have started it, because the action is so fast and irrevocable that we have not the data to intervene before the action is complete, then we had better be quite sure that the purpose put into the machine is the purpose which we really desire and not merely a colorful imitation of it.
Norbert Wiener, Some Moral and Technical Consequences of Automation.
But while academics and others wrestled with these issues, for much of the intervening decades these concerns lived, in the public consciousness, more in science fiction than in the work of policy analysts, researchers, or regulators. In fact, many of us, myself included, first encountered the idea of AI risk not from academic papers but from Isaac Asimov’s Three Laws of Robotics, Stanley Kubrick’s 2001: A Space Odyssey, or, of course, James Cameron’s Terminator movies.
The Machine Intelligence Research Institute and Friendly AI
The modern version of the policy debate started picking up steam about 20 years ago. The Machine Intelligence Research Institute (@MIRIBerkeley), founded by Eliezer Yudkowsky (@ESYudkowsky), has been focused since 2005 on identifying and managing potential existential threats from artificial intelligence. MIRI’s predecessor, the Singularity Institute for Artificial Intelligence, started out in 2000 with a focus on accelerating the development of artificial intelligence. But the organization shifted focus (and eventually changed its name) as Yudkowsky and others came to believe that the most important problem was not creating AI but “figuring out how to do that safely by getting AI to incorporate our values in their decision making.”
Nick Bostrom’s Superintelligence Changes the Conversation
In 2014, the Swedish philosopher Nick Bostrom published his book Superintelligence: Paths, Dangers, Strategies, in which he lays out the argument that superintelligent AI could pose an existential risk. The book influenced, among others, Elon Musk and Bill Gates, but the debate nonetheless remained the concern of a relatively small group of researchers.
ChatGPT Moves Up the Timeline
This debate was pushed to the forefront in November 2022 with the release of ChatGPT. Many researchers who had been in the field for a long time were taken aback by how powerful it was, and all of a sudden the question of whether AI could be a threat got a lot less academic and began to concern regulators and the broader public.
It Is Scary Hard to Predict the Goals and Actions of a Superintelligence
The “Paperclip Maximizer” is a thought experiment first proposed by Nick Bostrom and developed in his book, the aforementioned Superintelligence. The point of the thought experiment, which at first glance seems kind of silly, is to show that a superintelligent AI pursuing a harmless-sounding goal could be dangerous. In fact, it could be extremely dangerous even if it isn’t evil, isn’t trying to harm us, and doesn’t really care about human beings one way or the other.
What If You Could Have All The Paperclips You Ever Wanted?
Imagine a superintelligence with an arbitrary goal, such as “manufacture as many paperclips as possible.” For the moment, don’t get caught up on why anyone would want to create an agent with such an arbitrary goal. The point of the thought experiment is that the goal is arbitrary; it may not make much sense to us, and might indeed be completely alien to us.
A superintelligent system may have a goal that doesn’t make sense to humans and doesn’t look much like a human goal. This, by the way, is known as the “orthogonality thesis”: the idea that (1) the intelligence of an agent and (2) the agent’s goals are “orthogonal,” i.e., independent of each other. (“Orthogonal” here just means like the x- and y-axes of a graph: two variables that can vary independently.) In other words, just because an agent is smart doesn’t mean its goals will make any sense to us. Indeed, they probably won’t.
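To make the orthogonality point concrete, here is a minimal, purely illustrative sketch (not anything from Bostrom, and the goals and numbers are invented): a generic optimizer will climb whatever objective function it is handed. The capability (the optimizer) and the goal (the objective) are separate, interchangeable pieces.

```python
import random

def hill_climb(objective, start, steps=5000):
    """Generic optimizer: climbs whatever objective it is given."""
    best = start
    for _ in range(steps):
        candidate = best + random.uniform(-1.0, 1.0)
        if objective(candidate) > objective(best):
            best = candidate
    return best

# Two very different "goals" -- the optimizer doesn't care which.
human_friendly_goal = lambda x: -(x - 3.0) ** 2   # peaks at x = 3
arbitrary_goal      = lambda x: -(x - 42.0) ** 2  # peaks at x = 42

print(round(hill_climb(human_friendly_goal, start=0.0), 1))  # roughly 3.0
print(round(hill_climb(arbitrary_goal, start=0.0), 1))       # roughly 42.0
```

Swap in a different objective and the same machinery pursues it just as competently; nothing about being a better optimizer makes the goal any more human-friendly.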
The Risk of Intelligence In Pursuit of an Arbitrary Goal
Now, given this goal, what happens next? Well, in our thought experiment, the superintelligence that (who? which?) is trying to maximize the number of paperclips will soon realize a few things:
“If I get turned off or deactivated, then I can’t make more paperclips, so I need to make sure that humans don’t turn me off or deactivate me.”
“Humans sometimes change their minds, and if humans change their minds and tell me to start making staples instead of paperclips, then I won’t be able to make any more paperclips. So I need to make sure that humans can’t change my goals.”
“Humans are made out of atoms, and the air that humans need to breathe is also made of atoms, and these atoms could be put to much better use if they were turned into paperclips forthwith.”
Next thing you know, the superintelligence starts turning every man, woman, and child, all the birds and bees and dogs and cats, the earth’s crust, and the atmosphere into paperclips, and we are powerless to stop it.
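For readers who like the logic spelled out, here is a toy expected-value calculation (entirely hypothetical numbers, not from Bostrom) showing why self-preservation falls out of even a goal as arbitrary as paperclip-making: an agent that is shut off makes zero paperclips, so plans that reduce the chance of shutdown score higher on the only metric it cares about.

```python
# Toy illustration: expected paperclip output under two hypothetical plans.
PAPERCLIPS_PER_YEAR = 1_000_000
YEARS = 10

def expected_paperclips(prob_shutdown_per_year: float) -> float:
    """Expected total output when there is some chance each year of being shut down."""
    total, p_still_running = 0.0, 1.0
    for _ in range(YEARS):
        p_still_running *= (1.0 - prob_shutdown_per_year)
        total += p_still_running * PAPERCLIPS_PER_YEAR
    return total

# Plan A: stay compliant, assume a 20% chance per year of being turned off.
# Plan B: disable the off switch, assume a 1% chance per year of being turned off.
print(f"Plan A (compliant):       {expected_paperclips(0.20):,.0f}")
print(f"Plan B (resist shutdown): {expected_paperclips(0.01):,.0f}")
# Plan B wins purely on paperclip count -- no malice or survival instinct required.
```

The numbers are made up, but the shape of the argument is the point: resisting shutdown simply produces more paperclips in expectation.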
Now, nobody thinks that we are in danger anytime soon of being turned into paperclips. (Although, just to be on the safe side, I keep a few extra paperclips around, hoping to make friends with any paperclip maximizers I might run into.) The example is deliberately fanciful in order to show how even the most arbitrary, harmless-sounding goal can lead to real problems when we are dealing with a superintelligence.
The takeaway is that, if a superintelligence is intelligent enough, and therefore powerful enough, it will likely, perhaps even certainly, move off in directions we can’t anticipate and can’t control.
Stay tuned.
We’ve set up a page on our wiki covering AI Risk: https://lawsnap.mywikis.wiki/wiki/AI_Risk
We hope you’ll take a look. Feedback always appreciated.