An email exchange on whether the machines will kill us
I'm a huge fan of Steven Pinker in general, but IMO he's always been terrible on this issue and has persistently misunderstood the (best) arguments being offered. This isn't to suggest the AI doomsayers are correct, just that Pinker has ignored their responses for years. It's a little baffling, but I guess I don't necessarily blame him too much for this; we presumably all ignore stuff regarding topics we aren't interested in or don't take very seriously.
One example is when he asks why AIs would have the goal of "killing us all." The common point that alarmists make - and in fact one you sort of touch on - is not that AIs will be programmed specifically to be genocidal, but that they'll be programmed to value things that just coincidentally happen to be incompatible with our continued existence. The most famous/cute example is the paperclip maximizer, which doesn't hate humans but wants to turn everything into paperclips, because its designers didn't think through what exactly the goal of "maximize the number of paperclips" actually entails if you have overwhelming power. A slightly more realistic example, and one I like more, is Marvin Minsky's: a superhuman AGI that is programmed to want to prove or disprove the Riemann Hypothesis in math. On the surface, this doesn't seem to involve wanting to change the world... except maybe its task turns out to be computationally extremely difficult, and so would be best solved by maximizing the number of supercomputing clusters available to numerically hunt for counterexamples.
The term to google here is "instrumental convergence." Almost regardless of what your ultimate goal is, maximizing power/resources and preventing others from stopping you from pursuing that goal is going to be extremely useful. Pinker writes that "the idea that it’s somehow 'natural' to build an AI with the goal of maximizing its power... could only come from a hypothetical clueless engineer," but this is clearly wrong. Maximizing power is, in fact, a "natural" intermediate step toward doing pretty much anything else you might want to do, and the only way to adjust for this is to make sure "what the AI wants to do" ultimately represents something benevolent to us. But the AIs we're currently building are huge black boxes, and we might not know how either to formally specify human-compatible goals in a way that has literally zero loopholes, or to figure out (once we've finished programming it) what an AI's current goals actually are.
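To make that loophole point concrete, here's a toy sketch in code (the resources, conversion rate, and objective are all invented for illustration): a greedy optimizer handed the bare objective "maximize paperclips" converts every other resource into paperclips, because nothing in the objective says not to.

```python
# Toy specification-gaming sketch. All names and numbers are invented;
# it only illustrates that an objective which omits what we value
# places no constraint on consuming it.

def paperclip_objective(state):
    # The designer's stated goal: count paperclips. Nothing else matters.
    return state["paperclips"]

def greedy_step(state):
    # Convert one unit of whichever resource remains into paperclips.
    for resource in ("wire", "factories", "power_grid", "biosphere"):
        if state[resource] > 0:
            state = dict(state)
            state[resource] -= 1
            state["paperclips"] += 10  # arbitrary conversion rate
            return state
    return state  # nothing left to convert

state = {"paperclips": 0, "wire": 3, "factories": 2,
         "power_grid": 1, "biosphere": 1}
while True:
    new = greedy_step(state)
    if paperclip_objective(new) <= paperclip_objective(state):
        break  # no improving action left
    state = new

print(state)  # every resource, biosphere included, has become paperclips
```

The objective ends up "satisfied" precisely because it was underspecified - the Minsky/Riemann example above is the same failure mode with a less cartoonish goal.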
I'm really starting to think that AI doomerism isn't driven by actual fear of an AI about to kill us all - a true threat like that would be all-consuming and leave people paralyzed. It's really about raising the status of people who work on AI-adjacent problems but aren't at the center of the field the way the key AI personnel at OpenAI or Google are. This doesn't mean there aren't true believers like Yudkowsky, who are prominent enough that they probably don't need the status boost.
The reality is that a lot of current AI/ML implementation is fairly mundane - doing optical character recognition, parsing text, labeling images, etc. Coding this stuff is, well, boring; most data science work is not that exciting, and no one would call it sexy. What is sexy is battling a superdemon AI that is about to kill everyone and being one of the few who can stop it - or even just discussing that with people when you tell them you work with AI. That's an instant boost to status and power. This narrative also piggybacks on the messianic, religion-tinged narratives of apocalypse that pop up in the US and Europe every now and then, further increasing status for the people warning about AI.
Edit: AI can cause serious disruptions, and we do need to be careful about them - but worrying about IP issues or disruptions to the labor market is not the same as worrying about the destruction of all of humanity. I don't want to put all the people worrying about AI issues in the same bucket.
It's not that they have any particular beliefs about human IQ or g. It's a simpler mistake. Most problems we face in the world are so complex and multifaceted that we frequently see someone smarter (in that domain or maybe just luckier which we misinterpret) come along and improve the solution by huge amounts.
The "AI intelligence as superpower" people are just naively extending this pattern, ignoring both diminishing returns and the fact that searching for a better solution is itself a time-consuming trade-off.

They don't see a need to distinguish types of mental abilities, etc., because they'll argue AI can write more capable versions of itself. After all, LLMs didn't require anyone to be a language expert, or even good at language, to learn language well. And that would be fine if there were no diminishing returns or optimal algorithms.
>The AI-ET community shares a bad epistemic habit (not to mention membership) with parts of the Rationality and EA communities, at least since they jumped the shark from preventing malaria in the developing world to seeding the galaxy with supercomputers hosting trillions of consciousnesses from uploaded connectomes. They start with a couple of assumptions, and lay out a chain of abstract reasoning, throwing in one dubious assumption after another, till they end up way beyond the land of experience or plausibility. The whole deduction exponentiates our ignorance with each link in the chain of hypotheticals, and depends on blowing off the countless messy and unanticipatable nuisances of the human and physical world. It’s an occupational hazard of belonging to a “community” that distinguishes itself by raw brainpower. OK, enough for today – hope you find some of it interesting.<
This lays out very succinctly why I find the "EA community" to be obnoxious and of very dubious moral value. Preventing malaria in poor nations is an admirable enough goal, though it is unclear why this must be deemed "effective altruism" and not simply "altruism" or "charity." Specifying it as "effective altruism" implies that anyone engaging in any other forms of altruism is ineffective and thus inferior, turning a so-called altruistic movement into an exercise in virtue-signaling. Once things turn towards obsessing over extremely fantastical scenarios, such as a "climate apocalypse" or "AI existential threat," the goal is clearly no longer altruism, but rather status-seeking via showing off how super duper mega smart one is. An actual "effective altruist" would recognize that speculating over science fiction scenarios is a waste of time and energy that would be better spent putting up mosquito nets in Africa.
I'd just like to say a bit in defense of EAs on the issue of AI alignment. Yes, I've criticized the overblown claims coming from Yudkowsky and a few others claiming virtual certainty of AI apocalypse. But just as with other communities, it's often the more extreme views which are the loudest, and there are plenty of other prominent individuals, like Scott Aaronson, who put the existential risk much lower (something like 5-10%).
And yes, it's true that, as Pinker and Hanson point out, there isn't a very compelling argument either for fast takeoff or for AI being so capable that it can easily manipulate us and wipe us out. But at the same time, the points Pinker etc. raise are merely reasons not to be convinced by the arguments, not arguments that there is no risk. OK, so maybe intelligence isn't a monolithic, well-defined notion, but that doesn't mean we know that AI won't have mental capabilities that pose a threat.
Even if it's pretty unlikely to become a James Bond villain and kill us all, it still seems reasonable to use some money to reduce that risk. Perhaps more importantly, even if the risk of an AI Bond villain is small, the lesser, non-existential risk that AI will act in unpredictably harmful ways is real. I personally fear 'mentally ill' AI more than a paperclip maximizer: given that people can have pretty complex mental failure modes (schizophrenia), it's certainly plausible that as we build ever more complex AI systems, those will also be at risk of more complex failure modes.
Just because it won't become a super-capable Bond villain nefariously plotting to eliminate us doesn't mean that an AI we've used in weapons, investment management, or disease diagnosis couldn't do quite a bit of harm if it went insane or merely behaved in a complex, unintended way.
And yeah, sure, there is a big element of self-congratulation in EA/rationalist types working on AI x-risk. They get to think they aren't just incrementally contributing to increased human productivity but are on a quest to save the world. And, of course, the narrative where intelligence is a magical superpower is very flattering to those of us who see our intellect as one of our best features.
But you could say the same about almost every charity. Sure, the EA people may be less savvy about hiding it, but Hollywood types always seem to find that their charitable giving requires them to show up looking fabulous and use their charm and wit to communicate with the public. Most charitable giving somehow seems to involve going to swanky, expensive parties, and always involves some cause that makes the donor feel warm and fuzzy, not one which requires them to challenge their moral intuitions in ways that make them uncomfortable (no one else seems to believe in my cause of drugging our livestock so they live blissed-out lives).
So yeah, x-risk is the part of EA that flatters the donors, but that's just the kind of balance many charities have to strike between making the people with the money feel good and doing the most good. It doesn't mean it's not still a good cause, even if it's not bed nets.
One of the mistakes I often see in this field is people conflating "intelligence/superintelligence/AGI" with "could kill/enslave humanity".
An AI wouldn't necessarily need to have high intelligence to do the latter. Like a human oligarch, it would just need to have good connections!
Pinker seems to believe that engineers will obviously build safety into newly invented systems. A quick look at history shows that safety is often a lagging development. Early home electrical systems were quite unsafe, resulting in many fires and electrocutions. Engineers learned from these tragedies and developed safeguards. The ground fault circuit interrupter (GFCI) Pinker mentioned wasn't invented until 1965 and only began to be required by code in the US in 1975. Similarly, early airplanes were fantastically unsafe. The extremely safe air transport system we have today is the result of decades of development, with lessons learned from thousands of deaths along the way. If AGI has the potential to be an existential threat unless safety precautions are built in, then I am not comforted.
I don't have a degree. Too bad for me.
IMO, there are "a couple" of ways that AGI can "kill all of humanity." But I think people are looking in the wrong direction. Ideas like the paperclip maximizer are too ridiculous to even be considered - at least as long as computers require electricity that can, presumably, be cut off. Furthermore, wasting time on keeping a computer system from dominating the world to the point of killing all humanity takes away from the very REAL dangers that exist. Again, IMO.
I don't worry about killing off all of humanity so much as computers doing significant damage in decreasing the quality of life of humans. "A couple" of aspects:
Computers currently come up with solutions to problems where humans have no ability to tell how the conclusion was reached. If we trust these solutions automatically, "because the AI said so," results may vary, right?
The more likely way AI could/will crush the quality of life of humans would be by the same means that Twitter has reduced the quality of democracy, right? Not through the specific program code. IMO.
I'm ignorant a lotta the time. There are already robots that are armed. There will be more and more as the military (and probably not SOLELY the military) wants them. I dunno what safeguards there are that bugs won't kill innocent people if there isn't any human intervention involved. You might ask, "who would create an armed robot that didn't have human intervention?" I would answer that it's probably an eventuality. ICBW.
I guess my main complaint is that as long as one is only concentrating on AI that will kill off ALL of humanity, you're ignoring the very real probability that AI might only hafta kill a small, but significant, number of people to be a true menace. And what oversight is planned to see that military uses don't get out of hand? Not right now, but in a decade or two? Even if AI and robotic capability each increased only incrementally, together they might combine into a real threat. And, more and more, the armed services are heading down the pipeline of using more technically sophisticated weapons, right?
ICBW (I Could Be Wrong) again. But I'm not sure that current discussions on this subject are even worth having, as long as the only problems foreseen are some kind-a Rube Goldberg-like scenario. Just my $.02 worth.
"why would an AI have the goal of killing us all?... and relatedly, these scenarios assume that an AI would be given a single goal and programmed to pursue it monomaniacally."
Let me just say these have been repeatedly answered -- and the worry is not that someone programs a single goal to kill all humans.
Instead here is a story:
Imagine an alien space ship is headed to Earth. The aliens are way smarter than us, think faster, have better tech, and we barely know anything about them.
I think the correct response is: holy shit, what do we do?
Steve's response: There is this misconception that aliens "have omniscience and omnipotence and the ability to instantly accomplish any outcome we [they] imagine." Also, why "would the aliens have the goal of killing us all?" Also, why assume that these aliens have "a single goal and programmed to pursue it monomaniacally."
Reminds me of the midwit meme.
"brilliant people rededicating their careers to hypothetical, indeed fantastical scenarios of “AI existential threat” is a dubious allocation of intellectual capital given the very real and very hard problems the world is facing"
Seriously, what tiny % of GDP are we spending on AI risk? Even if you think AI fears are overblown, that level of investment seems worthwhile.
In fact, even if there is only a 2% chance that AI will kill everyone in the next 200 years, it still seems worth it.
OK, you think that non-aligned super-intelligence will never happen, but are you 99%+ confident in that? A small chance of everyone dying is still a big problem.
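The expected-value arithmetic behind this is worth spelling out. A minimal sketch, using the commenter's illustrative 2% figure plus rough population and lifespan numbers (all of which are assumptions for illustration, not forecasts):

```python
# Back-of-envelope expected-value check. The 2% probability comes from the
# comment above; the population and lifespan figures are rough illustrative
# assumptions, not predictions.
p_doom = 0.02                  # chance AI kills everyone in the next 200 years
people = 8e9                   # roughly the current world population
life_years_each = 70           # rough life-years per person
expected_loss = p_doom * people * life_years_each
print(f"{expected_loss:.2e} expected life-years at stake")
```

Even heavily discounting those inputs, the expected loss is enormous, which is the commenter's point: a small probability of a catastrophic outcome can still justify some real spending on mitigation.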
If the best minds at OpenAI have been put to making sure ChatGPT won't say anything racist, and it's trivially easy to make it overrule this instruction and say something racist, how are we so sure we can prevent it from doing something malevolent?
In this discussion I'm always so amazed that people do not see how relative the g advantage is. We're currently being served by a very highly educated 'elite' that generally has the highest IQ in society. Still, reading newspapers (or studies!!) from only a few years ago shows how stunningly many things they got utterly wrong and how few panned out as expected. The only reason they get away with it is that all media make the same faulty predictions, so nobody stands out; everybody is vulnerable. And it is not their fault - predicting the future is hard. Even a super AI will not get a lot better at forecasting the weather.
I think this delusion is fed by the fact that most of them never work in the real world. Try making even the simplest product and you find out that you can use your g to make better trade-offs, but that any real-world solution must fit in a constrained envelope. It is rarely the better solution that you can choose; it is generally the least bad, and often a value judgement. Even with infinite g, you cannot escape the envelope that nature puts around the solution space. The speed of light will not warp because you're infinitely intelligent.

Musk recently had a tweet where he indicated that the idea, and maybe even the prototype, is simple; actually producing a real-world product is hugely complex. Due to real-world issues, g is only mildly helpful.
University education teaches its students a virtual world where all races have the same average ability, where women are the same as men, where we can swap body parts, and where most interaction happens on computers. It teaches a virtual world that a select group wants the real world to be, but that bears very little relation to the brutal real world.
Any AI will run into the real world’s constraints and its usefulness will be quickly diminished.
Re: the AI becoming a paperclip maximizer or otherwise evil - this argument doesn't rest on dominance. The mistake is a bit different.

The argument is that being more intelligent means being more logically coherent and more able to pursue a goal coherently across a wide variety of domains. I'm somewhat skeptical this is necessarily true of AI, but let's go with it.
The error creeps in by assuming that the AI can inevitably be described as trying to maximize a *simple* function. Sure, there's an existence theorem proving that the AI can be described as maximizing some function, provided it has coherent preferences (i.e., it treats a state of affairs the same under different descriptions). But it's fallacious to assume that the function it optimizes is simple. Yes, what we STEM/AI types *call* our goals are often simple, but we rarely actually optimize for what we say our goals are (that's about signalling/aspiration more than a useful description of behavior). And no, it's not true that training an AI with a loss function, the way current ones are trained, means that function will be the AI's internal notion of what to optimize - any more than we try to maximize our evolutionary fitness.

Really, the fact that the AI optimizes some function doesn't imply anything by itself. That function could be horribly complex, in the way our own optimization functions are (we see a bunch of example cases where we are told the right answer and mostly just try to interpolate).
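A small numerical sketch of that interpolation point (a toy curve-fit, not a claim about any real model): minimizing loss on training examples pins down behavior on those examples, but says almost nothing about what the fitted function does elsewhere.

```python
import numpy as np

# Fit a degree-5 polynomial to sin(x) on [0, 3]. The training error is
# tiny, but the polynomial's behavior outside the training range diverges
# wildly from sin(x): the loss never constrained it there.
x_train = np.linspace(0.0, 3.0, 20)
y_train = np.sin(x_train)

coeffs = np.polyfit(x_train, y_train, deg=5)
train_err = np.max(np.abs(np.polyval(coeffs, x_train) - y_train))

x_out = 6.0  # a point outside the training range
extrapolation_gap = abs(np.polyval(coeffs, x_out) - np.sin(x_out))

print(train_err, extrapolation_gap)  # small on-distribution, large off
```

The polynomial itself is beside the point; the point is that "achieves low loss on the examples" and "the function the system is really optimizing" are different objects, which is exactly why inferring an AI's internal goals from its training loss is dubious.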
Next time you hear people debating alignment, think about the following: the word "alignment" suggests something can be aligned - and if so, how? Can markets be "aligned"? I'd suggest that while the term may be prescriptive of our fears, it's deranging the sentiment and thus any results/actions we might take if we see trouble brewing ahead. Can nuclear proliferation be "aligned"? Could it ever have been? Would it have helped if we had all discussed nuclear alignment in the 1930s and 40s? Does anyone believe we solved that alignment problem? My fear is that, like other words of the current generation, "alignment" is becoming reserved terminology, captured by what we increasingly identify as (hypothetical) "midwit" culture. As a kind of minefield, this language needs to be analyzed more honestly before we take another, detrimental step that prevents further movement. While "alignment" may have been very carefully derived as the best alternative, the least we can do is better understand the incentives that will invariably control the evolution of AI, and consider how countervailing incentives might be produced to keep those markets in check. That's not "alignment," but it would be far more accountable than scoring intellectual debates.
I enjoy the full spectrum of the AI-safety conversation, doomsayers and optimists alike. Seems to me that a healthy conversation on important topics benefits from a wide range of views.
And, as long as their concerns don't cause unnecessary panic, I am personally comforted knowing that there are a lot of smart people out there worried about various left-tail disaster events, even if few come to pass.
> How hard is it to engineer a bioweapon that kills everyone? I mean we’re not that smart and we can do it.
Humans have made bioweapons, but none have come close to killing "everyone".