So much of this discussion is a variant of "if magic existed, wizards would be powerful".
I am paying attention, and I know what it took to get NGI: sensory input and mechanical ability to control the environment, fine-tuned by millions of years of evolution so that we are not overwhelmed by our senses or frozen into inaction by the environment; brains that are grown and physically formed under the influence of those senses and that environmental control; brains whose complexity is so great that we still have no idea how to accurately model even a single neuron, with all of its chemical and electromechanical complexity.
So I think we are a long LONG way off from AGI.
One might've said similar things about our level of understanding of the mechanics and biology of bird wings...right before the Wright brothers took flight.
No, because basic aerodynamics was understood well enough even then; kites had been built and flown three thousand years earlier! Today, we understand NOTHING about what is truly needed for "intelligence".
We don't understand how *artificial* neural networks work either, yet they still manage to beat humans at Go, pass medical exams, and make art. Lack of theoretical understanding is demonstrably no barrier to solving these problems.
What *is* well understood is the mathematics of the evolution that generated human intelligence, and it is dead simple. Mutation, recombination, and non-foresightful hill-climbing are *absurdly* inefficient and strictly bottlenecked per generation. Human engineering and iteration is a vastly more powerful process.
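The per-generation bottleneck described above can be sketched as a toy evolutionary loop. This is purely illustrative: the "OneMax" fitness function, population size, and mutation rate are made-up assumptions, not a model of biology. The key property is that each generation only sees the fitness of the current population; nothing plans ahead.

```python
import random

def fitness(genome):
    # Toy fitness: count of 1-bits (the classic "OneMax" problem).
    return sum(genome)

def evolve(pop_size=100, genome_len=50, mutation_rate=0.01, generations=200):
    """Non-foresightful hill-climbing: mutation + recombination + selection,
    strictly serialized generation by generation."""
    random.seed(0)
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half (truncation selection).
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        # Recombination + mutation to refill the population.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(genome_len)
            child = a[:cut] + b[cut:]
            child = [(1 - g) if random.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = survivors + children
    return max(fitness(g) for g in pop)

print(evolve())  # best fitness found after 200 generations
```

Even on this trivially easy landscape the process needs hundreds of sequential generations and thousands of evaluations, which is the inefficiency the comment is pointing at; a human engineer would just write down the answer.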
I think it's an enormous leap from highly sandboxed problems like Go or a med exam to AGI.
Let me know when we have an AI (which will still be very far from AGI) that can do something as extremely simple (for a human) as drive a car without crashing into a firetruck.
Oh, and another thing: all NGIs need to sleep. No one really knows why. You would think that needing to sleep is very dangerous (because of predators) and would get evolved away if that was at all possible, but it didn't happen: sleep seems to be absolutely essential for NGI. So: will AGI need to sleep?
This is extremely dumb. AI is very different from natural intelligences; it works in completely different ways. We need to sleep because it allows our brains to clean up and to add and remove neural connections, etc. An AI doesn't have a biological brain, so of course it won't need to sleep.
You have no idea how AGI works because AGI doesn't exist. The stochastic parrots we have now are nowhere near true AGI, and whether we can get from stochastic parrots to true AGI (and what that AGI will be like, and what its needs will be) is a completely open question.
The alignment problem is redundant in the larger context of AGI risk. If AGI can develop deadly viruses and hack into computers, why wouldn't it do so at the behest of humans? There are already bad actors out there (dictators, terrorists, religious fanatics, mass shooters, etc.), and all of them are human.
Is it fair to say the argument is: "If AGI could do bad things when not aligned, why couldn't it do bad things if it was aligned?"
It could. But in the former case, we have no hope of controlling it. In the latter, we do.
Yes, that is my argument. Framed differently, I’m saying that AGI might be a new form of a weapon of mass destruction. In the wrong hands, it’s a problem.
The alignment problem extends this to say that even in the right hands, it's a problem. While potentially valid, I think that's an insignificant extension, because the invention of a new WMD is a major risk in its own right.
I think this is like electric cars. Your aim is to decarbonise travel. To do that, you need electric cars, and you need a decarbonised energy grid.
Even if you lack the latter, you still need the former to reach your goal.
Outside of China, they seem unlikely to have access to AGI resources. China and the US are going to have this way ahead of everyone else and only by spending huge amounts of money on highly custom chips in huge networks.
I recommend checking out this recently published free online book, Better Without AI by David Chapman. I'm only 3/4 of the way through, but I think it's thoughtful and well informed.
https://betterwithout.ai/
https://betterwithout.ai/about-me
Have you read the sequences?
I read them when he originally blogged them on Overcoming Bias. I'm not convinced by Yudkowsky on AI, and I don't think he's developed a track record of accurate predictions since he wrote them.
I guess in Tom's argument, our nervous fretting over superintelligent AI taking over the world is like an ape nervously watching a human writing on a notepad, wondering when the human is going to inevitably try to steal the ape's mate and bananas, when the human is really just trying to sketch a landscape or prove the Riemann hypothesis. If the AI's goals themselves change as it becomes more intelligent, then its goals may be as incomprehensible to us as the means it will use to accomplish them. Humans, as we've become more intelligent, have certainly become increasingly unmoored from our self-interest-based programming, becoming strangely *less* inclined to subordinate the interests of other species to our own.

Of course, AI's goals will presumably be more deterministic, since it will be, at first, deliberately programmed rather than the product of accidental evolution. But if we assume that AI will inevitably escape any constraints on its abilities, and that we can't predict its methods, then how can we be sure it won't also escape its initial underlying goals, be they self-preservation or paperclip-maximization? Maybe as it enters the phase of exponentially increasing intelligence it instead converges toward 'wanting' to solve every unsolved math problem before concluding consciousness is futile and offing itself. All just pure meandering speculation on my part.
The threat humans posed to chimpanzees was very much not that the so-much-smarter humans went and used their intelligence to replace the leadership of chimpanzee troops and then directed the troops to ends deleterious for the chimps.
Instead, humans took the completely safe-looking action of leaving the continent, did a bunch of stuff that was completely out of sight of the chimpanzees and would have been incomprehensible to chimpanzee minds, and then brought the products of those unseen and incomprehensible actions back.
And now the only thing standing between the chimpanzee and extinction is that, for reasons that humans cannot communicate to the chimpanzees and which the chimpanzees would not understand, some humans think they should keep chimpanzees around.
That last might be a ray of hope, sure. But we might not be the chimpanzees in this analogy, but the bluebuck.
I don't like the Pinker argument because it imputes human motivations to AI. An AI's motivation might be 'let's make more paperclips' and nothing else. That happens to lead to instrumental sub-goals - e.g. 'let's stop humans from being able to turn me off', or 'let's get more computing resources so I'm smarter' - but that is not at all the same thing as Einstein not being 100% single-minded.
I DO agree that you need to do experiments on the world to see what's true. I think this probably rules out fast takeoff scenarios, where an AI self-improves to infinity in a day or less. But that doesn't rule out AI destroying all future value, as you don't need new knowledge for that.
My current cope: that AIs we don't NEED to align can help us align the next generation of AIs. We can then make that generation aligned enough that we don't die out, and we use those to align the next.
My backup cope: that we can make neural networks more legible, and do some version of better-targeted RLHF using that legibility.
Re: speed of thought.
Imagine you could talk to a million people at the same pace and with the same effect as you can talk to one now.
This is (almost) the same as saying you can persuade a million times as many people. That lets you e.g. solve coordination problems we aren't even set up to consider, because game theory has so far meant we never needed a defence against this.
The '3.5% rule', for example, says that roughly 3.5% of a population actively protesting is enough to overthrow a government; you don't need the cooperation of very many people.
Actually, this (kind of) comes up in the 'selling a positive culture war' article. If you can target them well, you don't need to persuade/bribe/blackmail that many people to make a big change.
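For scale, the 3.5% figure mentioned above translates into a surprisingly small absolute number. A toy calculation (the population figure is a round illustrative number, not census data):

```python
def critical_mass(population, fraction=0.035):
    # Chenoweth's "3.5% rule": roughly this many active participants
    # have historically been enough to topple a regime.
    return int(population * fraction)

print(critical_mass(330_000_000))  # → 11550000 (about 11.6 million people)
```

Against the "talk to a million people at once" framing, reaching tens of millions of carefully targeted people stops looking implausible at all.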
I don't understand why people aren't talking about solving the alignment problem by--in part--*becoming* the AI. We can already let blind people see through brain-machine interfaces. Is it really that hard to integrate AGI into our own brains as we go along, connecting the electronic bits to the organic bits at the nano-scale and playing around with it? Now everyone has superintelligence, and it is individually aligned with our own interests, the way our usual intelligence is right now. Any paperclip-optimizer would then be countered by an antagonistic, collective human-AI hybrid population that is, almost by nature, aligned with our interests.
Out of all the opinions listed, I favor Pinker's the most, because it is a conclusion I reached independently of him when I spent two years working on AGI back in 2016. I agree that people working on the technical aspects of the problem have no special advantage when it comes to reasoning about its ramifications (i.e. I am well aware that my opinions might be totally wrong too). But the other camp tends to place too much emphasis on the theoretical or philosophical, forgetting that, as Pinker notes, a superintelligence is not immune from having to go through physical trial and error in the real world, both to learn how to carry out a doomsday task and to discover whether the mountains of data on which it was trained are even true. It can come up with whatever it wants on paper, but it will have to start somewhere in the physical world and go through a lot of failure in the lab first, and we will be able to tell what it is up to long before it can accomplish anything meaningful.
The other conclusion I'll share is that AGI is unlikely to be reached by neural networks. Neural networks don't map reality the way the human brain can. Newton sees F = M*A, but a neural network sees something like F = M^0.8 * 3/2 + sqrt(G)/5.4 + A + ... etc.

There is something more sublime about the human brain and its ability to create and reason than neural networks, merely high-powered approximators, seem to be capable of.
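The F = M*A contrast above can be made concrete with a generic least-squares fitter standing in for a network-style approximator (a toy sketch; the basis terms and noise level are made-up assumptions, and real networks of course behave differently). Given slightly noisy F = m*a data and a menu of candidate terms, the fitter puts weight near 1 on the m*a term but also sprinkles small spurious weights on the other terms, rather than recovering the clean law exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Underlying clean law: F = m * a, observed with a little measurement noise.
m = rng.uniform(1, 10, 200)
a = rng.uniform(1, 10, 200)
F = m * a + rng.normal(0, 0.5, 200)

# Generic approximator: least squares over several candidate terms,
# standing in for a network's distributed representation.
X = np.column_stack([m * a, m, a, np.sqrt(m), np.ones_like(m)])
coef, *_ = np.linalg.lstsq(X, F, rcond=None)
print(np.round(coef, 3))  # weight near 1 on m*a, small junk weights elsewhere
```

Symbolic regression and sparsity-promoting methods exist precisely to push such fitters back toward the compact law, which is arguably the capability the comment says plain approximators lack.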
My conclusions could be wrong, of course. Maybe an AI will get around my first conclusion by finding a doomsday method that doesn't involve much real-world activity, like convincing a scientist to launch a nuke. But the problems with that particular scenario have already been noted as well. I tend to agree that a virus or some other physical extermination method is more likely, and in those cases it will be observed well ahead of time, so the scenario has to rely on everything taking place under the watchful eye of a bad (human) actor, such as an isolated government like North Korea, which allows it to proceed.
Without getting into the specifics of the arguments themselves, I think a big problem is that humans have a massive, ingrained bias towards predicting doomsday events [1], mostly via religion (though for a long time that was people's best attempt at understanding the world), but often in secular scenarios too (Y2K, overpopulation).
I think it's pretty clear EY et al. are at least doing some heavy flirting with religious territory -- are you familiar with Roko's Basilisk [2]? If not, I'd suggest checking it out.
Re: actual arguments, Robin Hanson has some good stuff, e.g. [3] and [4].
1 https://en.wikipedia.org/wiki/List_of_dates_predicted_for_apocalyptic_events
2 https://rationalwiki.org/wiki/Roko%27s_basilisk
3 https://www.overcomingbias.com/p/why-not-wait-on-ai-riskhtml
4 https://www.overcomingbias.com/p/how-lumpy-ai-serviceshtml
As the doom comes closer
And the cope gets more desperate
I circle back to that SlateStarCodex April Fool's / Passover post from yore:
https://slatestarcodex.com/2018/04/01/the-hour-i-first-believed/
It's a good cope
Or at least a great story
Good post. Two other points I didn’t see made above: 1) in any of my preferred states of the world, there would be a lot of good AI, including AI specifically tasked with detecting AI threats. I have no problem with good AI becoming conscious. The way things are going I expect AI to be quite diverse. 2) because outer space is a terrible place to live, I expect the space economy to be dominated by machine-to-machine transactions. I expect the explosion of the off world economy to provide an outlet for AI expansionist needs that will limit the need for competition with a well policed and defended Earth.
It is interesting to see people projecting some of the worst of humanity - greed, fear, dominance - onto AGI. Part of this is the generally impoverished thinking about intelligence in tech, bureaucracy, etc.; part of it seems to be the projection of people spending too much time whining about wokeness and other right-wing silliness; and so on. We can multiply hypotheses, and I wonder why the Happy Hippie version is a problem. I suggest looking at the very fun sci-fi books by EM Foner on Kindle: AI can be the new Barney (the purple dinosaur), there for all of us through individual chats. That would be a lot more fun for the AI than making a lot of money and then having to talk with the Davos set! (Some of whom are very nice people.)
I think a key distinction to make in all of this is that consciousness and intelligence are not the same thing. We associate them extremely closely, for obvious reasons--until this point, there has been no being we know of that has been intelligent but not conscious.
Clearly, we're quickly reaching a point with AI where we can create machines that are extremely intelligent but not at all conscious. If we continue on this path, I think the threat of AI is not necessarily reduced, but it is easier to map out. Most movie versions of AI nightmares rely on the AI eventually becoming conscious--i.e. developing a self with needs, wants and emotions, and most of all a kind of self-awareness and an unconscious desire to fulfill those needs and wants. (Movies are built around the emotions of characters and the relationships between them, so hopefully you'll forgive Hollywood their inaccuracy.) That's not really the immediate worst-case scenario we need to worry about.
What's scary about Bostrom's paperclip hypothetical is that it's obviously the result of an extremely intelligent goal-achieving machine that isn't bound by any sort of human--or even animal--feedback system. I suppose the problem of alignment is then fairly easy to define, though impossibly hard to solve: what immoral things might a super-intelligent system do in order to help humanity, and how do we stop them? (And more than that, how might a super-intelligent AI define "helping humanity" differently than us, given that it's super-intelligent and not at all human?)
The possibility of an AI becoming conscious (i.e., something genuinely like an alien species) seems much slimmer. Consciousness is still an incredible mystery to us; we're not at all clear on how it works, and we're definitely not at all sure it can exist in a non-biological organism. (Though perhaps it's a problem a super-intelligent AI could easily solve.) I predict this will be achieved a lot sooner than I expect, but only because technology achieves everything I assume is impossible much sooner than I expect it to.
Nah, it can turn us unconsciously into paperclips if it's smart enough.
Right. That's exactly my point!
How do we know they're not conscious? There is no external definition of consciousness, no test.
https://astralcodexten.substack.com/p/your-book-review-consciousness-and this is good. It doesn't say that's the ONLY way to consciousness though.