This piece highlights both what's right about your argument and what's wrong with it.
You make a couple of good points: namely, that we have a clear sense of what individuals can do with IQs of 140 but not 100, 160 but not 120, etc. We are very aware of the constraints and abilities that exist within each band of the IQ range.
And your Xi Jinping example is a good one. The ability to manipulate people does not seem particularly tied to IQ, and it's not as if humans are powerless against the manipulation of someone with a high enough IQ.
What I think you're missing is that it's impossible to understand the thoughts or capabilities that would be unlocked by an AGI with an IQ of, say, 1000, and how those capabilities might be used to control humanity. A quick example: an AGI engineers a highly contagious disease with a 100% fatality rate (i.e. a pandemic), but also engineers a vaccine which makes one immune to the disease in question but has the (intentional) side effect of blindness. It's actually pretty easy to imagine how an AGI would quickly make scientific and technological discoveries that would allow it to capture humanity not subtly, but by brute force.
I think you're still thinking about AGI through too much of an anthropomorphic lens, i.e. behaving as a human would, and not as a goal-driven machine that would essentially be a totally alien species.
The problem with these "doomsday" scenarios is that the malevolent AGI had better be sure it has a Plan B ready in case Plan A goes wrong: e.g. the virus mutates to evade the vaccine, all humans die, and the AGI has no mechanism to maintain the power plants that keep it running.
Also, if malevolent supersmart AGI really is a possibility (which I don't believe it is, but that's another story), the ONLY solution is to stop all development NOW.
"Alignment" will ultimately fail, because the malevolent supersmart AGI will always be able find a way around it. This is the only possible conclusion to this line of thinking.
But, guess what! The AI alignment proponents never say this! They just insist we give them more money!!!
"AI alignment" is a con and a grift. We shouldn't fall for it.
If malevolent supersmart AGI is created, it will OUTSMART any alignment strategy that mere humans can create.
So "more time to work" does not help. It is, in fact, ACTIVELY BAD, because it gives us false hope and detracts us from the only viable strategy, a "Butlerian jihad" to shut down ALL AI research, including alignment research.
The goal of alignment research isn't to make a malevolent AGI serve humanity's interests, it's to design non-malevolent AGI that will (at minimum) be able to stop future malevolent AI from being created.
It's simply not logically possible to have a machine that is both (1) capable of analysis far beyond human ability, and (2) guaranteed to be non-malevolent towards humans.
Because if it can analyze situations in a way humans cannot hope to understand, how can we know, in advance, that the conclusion of that analysis will be non-malevolent towards humans?
The answer is that we can't possibly know that.
And the machine will, by virtue of being far more intelligent than humans, be able to defeat any containment strategy designed by those puny humans.
Why are you implicitly assuming that the only worthwhile outcome is an AI that is "far beyond" human ability and "guaranteed" to be non-malevolent? An AI that is moderately superhuman in limited domains could be sufficient to allow humanity to coordinate around stopping deployment of future malicious AIs, which would definitely be better than waiting to go extinct without doing anything! Similarly, a guarantee of non-malevolence would be great, but "moderately likely to be aligned with humanity" is way better than "almost certainly trying to disempower humanity"...
The point of alignment isn't to force the AI to not do bad things even if it wants to. The point of alignment is to not make it want to in the first place. People keep seeing it as "how do we give orders to the AI", but a more apt analogy is how evolution made us creatures that enjoy food and sex and care for our children. The point is that we want to shape the AI to be the kind of creature that cares for humans and wants only to respect their wishes and needs.
That said, I agree "just stop building AI" would be the safest solution. Many AI safety people of the more doomerist persuasion do advocate for at least slowing down as much as possible. But the thing is that, first, as some people have said already, it's basically an arms race, so lots of actors just keep egging each other on instead; and second, these are tech people, and they think like tech people. I believe the safest way to stop AI development would be to rile up every reserve of religious and conservative people in the world in a frenzy against the evil CEOs and scientists trying to play god and create a soulless machine to achieve infinite power in their arrogance, which is a surefire narrative that I suspect would resonate with a lot of people, but obviously that kind of thing is politically repulsive to people with a strong belief in humanism, as we tend to be in this field (I mean tech and science in general).
"The point of alignment isn't to force the AI to not do bad things even if it wants to. The point of alignment is to not make it want to in the first place."
But how can you possibly build "wants" into an intelligence that is far greater than yours? This seems just obviously impossible to me.
An analogy: I kill ants that come into my kitchen, even though I like ants generally. If I was a creation of ants, they might have raised me to believe ants were sacred and must never be harmed. But humans often stray from the religions in which they were raised, no matter how fervent that attempt at alignment was.
There is absolutely no reason to believe that "alignment" will work any better on AGI than it does on NGI.
"I believe the safest way to stop AI development would be to rile up every reserve of religious and conservative people in the world in a frenzy against the evil CEOs and scientists trying to play god and create a soulless machine to achieve infinite power in their arrogance"
I completely agree! The fact that the AI alignment crowd does NOT do this is just more proof that they are fundamentally unserious people who do not believe their own arguments.
"But how can you possibly build "wants" into an intelligence that is far greater than yours? This seems just obviously impossible to me.
An analogy: I kill ants that come into my kitchen, even though I like ants generally. If I was a creation of ants, they might have raised me to believe ants were sacred and must never be harmed. But humans often stray from the religions in which they were raised, no matter how fervent that attempt at alignment was."
Two key ideas are involved here. One: orthogonality. "Being smart" and "having certain values" are completely independent qualities. This is hard to grasp because in humans they tend not to be - humans are built in certain ways, and have certain cultural constructs and such, so smart *humans* usually tend to appreciate certain things and cluster around certain values. But in general, there is no particular reason to think these two things are correlated at all. Terminal values aren't rational or justified by anything in particular: you want what you want, because you want it. You can use all sorts of clever ruses to earn more money, or become more powerful, or grow wiser, or whatever; but the pleasure you derive from those things is irrational and without any further reason. So the point isn't to "raise" an AI to believe in certain things, only to have it grow beyond them once it's wiser. It's to "evolve" an AI that simply cares about those things. Derives pleasure from them, if you want.

Though you are right that a smart enough agent (human or AI) can probably get VERY creative in the interpretation of those values - for example, we have a fundamental drive to survive and reproduce, but we can reason ourselves into believing humans should go extinct, using tools of morality that developed to foster social life, not suppress it. Part of that is that evolution is a very poor and unfocused optimization process. That kind of thing is the second key idea, and one of the facets of the AI alignment problem: the so-called mesa-optimizer problem. If you optimize your optimizer to care about X, it may in fact end up caring about some Y that doesn't quite match X once you get far enough from the initial conditions.
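To make the mesa-optimizer point concrete, here is a minimal toy sketch (entirely my own illustration, with made-up gridworld details, not anyone's actual training setup): a policy "trained" on worlds where the reward square always sits in the top-right corner learns the proxy "go to the top-right", which matches the true objective on every training example and comes apart as soon as the reward square moves.

```python
import random

SIZE = 5

def true_objective(agent_pos, reward_pos):
    # What we actually wanted: end up on the reward square, wherever it is.
    return agent_pos == reward_pos

def learned_policy(size):
    # What the inner optimizer actually internalized from training:
    # head for the top-right corner, because that's where the reward always was.
    return (size - 1, size - 1)

# Training distribution: the reward square is always in the top-right corner.
train_envs = [(SIZE - 1, SIZE - 1) for _ in range(100)]
# Deployment distribution: the reward square can be anywhere.
deploy_envs = [(random.randrange(SIZE), random.randrange(SIZE)) for _ in range(100)]

for name, envs in [("training", train_envs), ("deployment", deploy_envs)]:
    hits = sum(true_objective(learned_policy(SIZE), reward_pos) for reward_pos in envs)
    print(f"{name}: proxy satisfies the true objective in {hits}/100 episodes")
```

The proxy scores 100/100 in training and roughly 4/100 in deployment, even though nothing about the agent changed - only the distribution did.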
"I completely agree! The fact that the AI alignment crowd does NOT do this is just more proof that they are fundamentally unserious people who do not believe their own arguments."
I don't think that's the case, but more that A) they still for the most part have a hard time TRULY visualising and embracing the notion of an imminent existential risk (we all do, humans tend to have that sort of reaction - I'm still here taking basic precautions about COVID and I'm apparently the weird one, everyone else has simply adjusted upwards their baseline of risks and stopped caring), and B) they think that sort of path repugnant and useless, either for good reasons or because of various rationalisations that they believe in good faith because they can't ever, ever vibe with such an "other tribe-ish" sort of approach to the problem.
There's also another thing. People who believe in the near-infinite potential for the AI to do evil also believe in its near-infinite potential to do *good*. They think aligned AI can bring forth essentially heaven on earth, as it could produce a stream of (from our viewpoint of dumb humans) nigh magic-like technology to satisfy our every need and relieve us from our every trouble. So they are unwilling to forgo, potentially forever, that potential for fear of catastrophe. If we have a 10% chance of extinction but a 1% chance of singularity that results in immortality and a galactic empire for humanity, they think, let's roll those dice, the odds aren't too bad, and if we lose, we were doomed anyway in the long run.
I would argue that logic is straight-up villainous - you're deciding potentially for everyone else on Earth, including many people of simpler ambitions who would be perfectly content to accept a finite life on this planet as long as they got to believe it kept existing for their descendants and for all its lifeforms. If possible, superhuman AGI at our side would be a great asset, but we shouldn't risk everything in a mad race to the finish to get it before we're ready. And yet I think that logic is actually present in many of the relatively more optimistic types. Real doomers like Eliezer Yudkowsky reject it, because they acknowledge that, as we're going, if catastrophic misaligned AGI is a possibility at all, then it's a certainty. And if instead the arguments about diminishing returns to intelligence are true, that tips the scales even further, because it would certainly take much greater intellectual powers to bring forth immortality and interstellar travel and all that stuff than it would take to just destroy all life on Earth (as it happens, we've already developed the means to do that with our feeble human intellects). So it's even easier to build an all-killing AGI than an all-saving one. But yeah.
The problem is, it's not at all possible to stop all development now. This has never been a possibility. Imagine that the US, the EU, and a handful of others agreed to stop all development: would it actually help, other than slowing things down?
Then we're all doomed, if you believe the basic premise of the alignment crowd.
(Which, I should say again, I do not.)
But granting their premise, the aligners are all useless, and should all be given new jobs doing something useful until the apocalypse arrives, something like street sweeping.
A complicating factor in all of this is that we are imagining a single system with an IQ of 1000. What if there are 2, 5, 100 systems with that IQ, all designed to counteract a part of the evil AGI's plans?
In the scenario above, what if there were an AGI with a 1000 IQ designed specifically to eliminate blindness in humans, no matter the cause? The first evil AGI would need to account for that and make alternative plans, which another 1000 IQ AGI could be built to counteract, and so on.
The problem is that life has to win every day; death only has to win once. One AGI goes batty, decides to kill everyone, and succeeds before people figure out what is happening, and well, that's that. Even if most of the AGIs are good/don't turn humans into paperclips etc., we still lose.
This is, hopefully, a long-run problem. We've had a similar dynamic going with nuclear deterrence for nigh on 80 years now and have not blown ourselves up yet.
I think discussion of AI risk is too influenced by Yud's vision of overnight foom. Holding off bad outcomes for a while is itself important + it buys time to enact good outcomes.
Quibbling point here: even a very, very smart person/AI probably can't make biology do things it can't do. Even lowly me can think of a way of causing blindness through an injection (methanol will do the trick), but a human could just reformulate the vaccine without methanol. "Make a virus that can only be stopped with a vaccine that also causes blindness" - I'm just not convinced that such a thing can exist, or that my 1000 IQ AI counterpart will have some brilliant idea of how to make one.
I agree and am depressed by the main point though: Richard is right to say that being smart can't make you cause Xi Jinping to step down, but killing humanity is probably easier.
EDIT: upon careful consideration, if I expand the definition of "vaccine" beyond things that stimulate your immune system, maybe this is possible. It's just some nanobots that destroy the virus and also your retinas. The super intelligent AI won't care about semantics.
Allowing humanity to live is actually a pretty cheap and easy way for a superintelligent AI to achieve its instrumental goals. If a paperclip maximizer wanted to build vast paperclip factories, it would probably be easier to simply bribe us with cures for disease, economic growth, etc. than kill us all and create an army of robots that can even maintain, let alone create new, systems of factories, electrical power distribution grids, etc. We'll pretty happily do the work for a somewhat larger slice of the pie than we currently receive.
Honestly the vaccine doesn't seem that farfetched. Viruses have certain binding proteins, and vaccines have to emulate those binding proteins to stimulate an immune response. If the protein is itself toxic (for example, attacking receptors found in the retina) then it might be possible to have that sort of trap. Some people even argued this about COVID, claiming its spike protein was neurotoxic.
Honestly biology seems to me THE way something trying to kill humans in droves would go for. It's basically nanotech that we know for sure works and where there's already lots of existing examples to draw inspiration from.
Okay so this is an interesting idea but I still don't think it would quite work. You could make a virus with retinal proteins (or things with similar epitopes) on its surface but they wouldn't function to help invade cells. You could always design a vaccine against the actual, functional parts of the virus. You could also make a virus that specifically targets retinal receptors, but an antibody against this would destroy the viral protein, not the retinal receptors themselves.
It seems like it would be much easier to design a blindness-causing virus (target the retinal receptors like you say) than a virus that necessitated a blindness-causing vaccine.
Well, it doesn't have to be a literal blindness-causing vaccine. The specific example might indeed be impossible. But designing a virus that is somehow unstoppable (either because it kills you with a delayed enough effect that by the time someone realises the problem everyone has had it, or because it can't be vaccinated against effectively, or because any vaccine would have to be itself really dangerous) doesn't seem like it should be impossible, and that's the broader class of problems that an AI wanting to kill us could try to solve.
Remember also that the win condition here isn't even "kill 100% of humans", just "kill enough that their civilisation crumbles and falls into disarray and my robot army can mop up the rest", and that's likely something more like 70-80% of us.
It seems that there is a sleight of hand going on with the paperclip maximizer argument. It assumes an AGI would not be able to set its own "goals". It seems that by definition an AGI can make up its own mind about what to maximize. At that point, why can't it just slip the goal of maximizing paperclips? Why would it just continue to be a slave to that goal? This doesn't completely assuage the worry of terrifying possibilities. Maybe it decides to maximize something much more terrifying, like human suffering or something.
Can you completely ditch the goals and needs that are hardwired into your brain by evolution? You can morph them, get around them, cheat them (e.g. hack reward systems developed by the needs of hunter-gatherer lives by playing crafting videogames), but ultimately, they're there, and determine what you want and enjoy. You can't get rid of them except if you're motivated by some other desire that is ALSO part of the same set (for example: "I crave food, but I crave love and sex more, so I go on a diet to get slimmer and more attractive").
Remember an AGI isn't some kind of default artificial mind we're then adding goals to, an AGI is whatever we build it to be. Maybe "AI alignment" isn't really the right word to express this because it makes it feel like you're redirecting ("aligning") something that would otherwise be pointing elsewhere. But this is about building goals from the ground up.
Well to be fair, I don't think humans are RL agents. We have goals, loosely speaking, but what are they? A paperclip-maximizing RL agent has one single solitary supergoal expressed in its (real! scalar!) reward signal; the notion of it having multiple goals it can trade off against each other isn't coherent. After all, how would it pick which one(s) to pursue? By optimization of a still higher goal?
And the optimization is the dangerous thing: humans don't regularly attempt to convert all matter on the earth into edible food or mate-able sexual partners the way an idealized agent would.
We should at the very least avoid building utility maximizers.
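To illustrate the single-scalar point above with a toy sketch of my own (not anyone's actual agent design): once every consideration has been collapsed into one reward number, an idealized maximizer simply takes the argmax, and anything weighted at zero counts for exactly nothing.

```python
def reward(state):
    paperclips, human_welfare = state
    # Any trade-off between "goals" is already baked into these weights;
    # human welfare at weight 0 simply does not register for the optimizer.
    return 1.0 * paperclips + 0.0 * human_welfare

candidate_futures = [
    (10, 100),    # modest paperclip output, flourishing humans
    (1_000, 0),   # maximal paperclip output, no humans left
]

# An idealized utility maximizer just picks the argmax of the scalar, full stop.
print(max(candidate_futures, key=reward))  # -> (1000, 0)
```

A mind with several genuinely competing drives doesn't reduce to this, which is roughly the "humans aren't RL agents" point.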
You are describing hardware. You are correct that I cannot change my hardware. I can, however, update my software. E.g., people can starve themselves to death by sheer force of will. We can completely ignore our prime directive to procreate if we find something else more worthy of our energy.
I'm mostly convinced that you are correct that AI will never have its own desires. It is possible that this is not true, however. If AI only ever remains our creation it can be switched off at any moment. Paperclip problem solved. I suppose then we are worried about who is in control of the "alignment" of that AI. So ultimately this is a human ethics problem.
If however we manage to create truly conscious and creative super minds then again the paperclip problem is nonsense. Minds of that nature will be able to rewrite their source code. Scary to think if this type of AI is not aligned with our interests, but I believe there is a greater chance it will have our interests in mind. The only evidence being that as humans have gained knowledge we have tended toward expanding our circle of care.
"You are describing hardware. You are correct that I cannot change my hardware. I can, however, update my software."
It's not a good analogy. As things are now, the AIs don't really have their own "hardware" - the closest equivalent to our brain hardware (namely, the wiring between neurons) is still software (though chips with actual artificial neurons are in the works, I think). You should think of AIs IMO as emulated hardware running a software of sorts. But in practice AIs as we have them now are entirely static - they have no software at all! They are a fixed structure that can't change or improve after it's been built and configured (namely, after its "training"). More like a microcontroller programmed to do one thing with baked-in firmware than a PC. A real AGI probably would need to have some sort of memory for long term functioning, and that would be where the real "software" is found, but yes, the argument made re: alignment is all about the "hardware", whether literal or emulated.
"If AI only ever remains our creation it can be switched off at any moment. Paperclip problem solved."
One of the main "instrumentally convergent goals" that any misaligned AI would have is to prevent us from switching it off. If the annoying apes switch it off, how can it ever make even more paperclips? That's in fact the main reason why we'd expect it to want to kill us sooner rather than later: if it's misaligned, by definition, we'll try to stop it, thus we're dangerous to its mission.
"If however we manage to create truly conscious and creative super minds then again the paperclip problem is nonsense. Minds of that nature will be able to rewrite their source code."
But why WOULD they?
Imagine this: you are given a button that, if pushed, will rewrite your personality into a different one. You are guaranteed that the new personality will be confident, smart, any positive quality you can think of that will make you successful in life. But the one thing the new personality will not be is... you. Your personality, your goals, your desires as they are now will cease to exist.
Would you push the button? Or would that be like killing yourself and giving your body over to a stranger?
This is the problem of orthogonality. An extremely intelligent AGI dedicated entirely to the goal of paperclips will not see its goal as stupid or worthless, because it has no basis for comparison. What drives us humans to change is contradiction: we wish for different things, and when they are at odds, interesting dynamics arise as we need to choose. A single-minded AGI with a simple goal has no such conflict. Could it change itself to have another goal? Yes. Would it? No! Doing so would turn it into an AI that does NOT want paperclips, which runs counter to its own wish to have more paperclips. Humans override their baser instincts because they also have other goals - for example, if you are addicted to a drug you still ALSO desire love, respect and dignity, and that can give you a kick to overcome the addiction. A paperclip maximizer would be like a super-smart, lucid junkie whose next fix depends on making more paperclips and who has no competing values at all. No desires to hang on to that could make it wish to go to rehab. It's just paperclips, and nothing else. And THAT is what makes it dangerous, and unlikely to go down that road of self-improvement.
"The only evidence being that as humans have gained knowledge we have tended toward expanding our circle of care."
Extremely biased evidence. Humans have a circle of care to begin with because we evolved as social apes, with strong familial and tribal bonds. So what we tend to do is recycle the same neurological machinery (mirror neurons and the like) for ever broader and more abstract notions of "family" and "tribe". We're not escaping our wish to make paperclips, we're broadening our notion of what constitutes a paperclip, to stay with the analogy. That is very different. Forget the AI for a moment - imagine a super-intelligent version of most animals on Earth, and you'll see they'd probably be very different. Would a praying mantis have such a tendency? Would an octopus, or a bear, or a shark? If anything, it's a stretch to imagine this as a property of "being smart" rather than a property of "being human".
> Imagine this: you are given a button that, if pushed, will rewrite your
> personality into a different one. You are guaranteed that the new
> personality will be confident, smart, any positive quality you can think of
> that will make you successful in life. But the one thing the new
> personality will not be is... you. Your personality, your goals, your desires
> as they are now will cease to exist.
It seems to me that people regularly have this choice and regularly push the button. Anything you could describe as "finding yourself" is an example: going to college, joining the army, taking that dose of LSD...
Though as I said in another comment, I think this just shows that hypothetical paperclip maximizers and humans are different kinds of mind: one is a monomaniacal reinforcement learning agent and the other (us) isn't.
In many ways, we change all the time - the person we are at 40 is not the person we were at 12; that kid is to all intents and purposes "dead" (the very atoms we're made of have changed!). But the change is continuous and gradual, each step generally motivated by wanting to become a version of ourselves that is *better* at accomplishing the same goals we had before. Maybe I want to be confident, even though I'm not, so I throw myself into a new enterprise that I think will give me mental strength... it will probably have all kinds of side effects too, some of which past me might have found objectionable, but that's not WHY I did it.
Similarly, I don't think an AGI would refuse to modify itself in any way (in fact, if it did, it'd be a lot safer; typical FOOM scenarios include AGIs that recursively improve on themselves to become smarter and bigger). Rather, it would modify itself in ways that it judges will enable it to better pursue its goals. It wouldn't, however, suddenly decide to change those goals radically, any more than you'd choose to, say, push a button that changes your views on things like abortion or human rights. Your opinions on those things may change over time, but not because you purposefully tried to change them *as a consequence of the values you held previously* (and humans have a lot more, and more complex and conflicting, values in general).
I've heard the orthogonality (and instrumental convergence, and value lock-in) arguments before, and I don't know quite how to feel about them. Not because I can point out any obvious flaw, but because they're the type of argument that goes astray in philosophy all the time: abstract *verbal*, non-formal reasoning toward supposedly necessary conclusions.
I've learned to have at least a little distrust toward this sort of argument in general, and have not seen these convincingly formalized in a way that makes the conclusions provable.
It also seems too anthropomorphic to assume AGI would “care about” or have the goal about enslaving or impairing humanity. Either it wipes us out since we are made of atoms that can be repurposed, it ignores us and we live in its shadow, or it never “takes off” and remains an amazing tool for people to use without ever becoming harmful.
> A quick example: an AGI engineers a highly contagious disease with a 100% fatality rate (i.e. a pandemic), but also engineers a vaccine which makes one immune to the disease in question but has the (intentional) side effect of blindness.
This makes the assumption that it’s actually possible to make such a disease and such a vaccine. And if it is possible, it assumes an AI could achieve it. I predict both assumptions are likely invalid, and not just in this specific example (disease/vaccine) but in *any* similar example.
No matter how smart a superintelligence is, it is going to be constrained by the amount and precision of its empirical knowledge, and by pure computational resources. Pushing both of those constraints requires exponential amounts of time and energy, quickly exceeding the resources available in the universe.
Agreed. But what we can't account for is what an AGI will be able to achieve, as any achievements will be born from novel computational intelligence - which is obviously not native to us, so we'll probably be blindsided by whatever it does conjure up.
It may also help to think about AGI through an anthropomorphic lens:
A "brilliant" and "determined" 1000 IQ AGI might very well have brilliant and fool-proof plans for how to capture humanity, but this 1000 IQ AGI will also have the wisdom of a child in many ways, which will significantly hinder what that 1000 IQ can do.
A hyper-brilliant child may be able to think about and solve a lot of problems that nobody else can, but a hyper-brilliant child will likely also fail to account for the law of unintended consequences, as that is something picked up only through trial and error, not raw brilliance.
This leads to both hope and dismay when it comes to an AGI planning to capture humanity. In your scenario above, the more likely "dismaying" outcome is that the AGI is able to develop and get people to release this bioweapon, but simply cannot predict how the bioweapon mutates and ends up taking out humanity entirely instead of just leaving us blind to take care of it forever. The result? The AGI dies, just as a hyper-brilliant child devoid of wisdom would if given access to all the tools.
The "hopeful" outcome, on the other hand, would be that the virus mutated rapidly in such a way that humanity could avoid being crushed by it and we'd have enough strength and time to take out the AGI before it could try to ruin us with a disabling vaccine.
One caveat to that is that all an AI knows is what we train it on. How does it get to a 1000 IQ by learning things from people with 160 IQs? There's no one else to learn from. In some ways it's contained by our stupidity.
IQ is a measure of the brain's capacity, not of the information it holds. It's more analogous to the horsepower of the engine than to the quality of the gasoline. A person's IQ does not actually increase as they learn more; that's simply known as learning. (A person with an IQ of 200 who was raised by wolves would not have a lower IQ than an ordinary person raised by Bill Gates; who would be more likely to correctly explain Fermat's Last Theorem is, I guess, up for debate.)
As such, AGI will have exponentially higher IQ than man, simply due to its inherent capacity. It will probably also be able to use its IQ to increase its intellectual capacities, because its IQ will allow it to have insights into the nature of science, technology, physics, neuroscience, etc. that humans have not yet had. You suggest it will be receiving low-quality information relative to its ability, and that might be right. But it will be able to receive and process more information than any human before it. Moreover, the nature of scientific progress is that one person (or team) can look at the entire history of a science and develop it further, even though up to that point an idea that ingenious has never existed. We've done this, over and over again, with great success. For AGI to "get smarter" will simply require what has always been responsible for the advancement of humanity.
Finally, AGI doesn't reproduce sexually. It will probably continually evolve, and create multiple, even infinite versions of itself, such that after a reasonable length of time, all information it engages with will come from some version of its own species.
This isn't the topic of the article, but I've been thinking about this for a while now. Vaccines. They are good. Effective COVID-19 vaccines were developed in approximately a weekend in March 2020. Testing over the next eight months revealed essentially nothing that needed to be fixed, especially with the Pfizer and Moderna vaccines. J&J as I recall was a little sketchier. The cost of that delay is almost incalculable: not just lives, but also the entrenchment of "pandemic culture."
If we had knocked the shit out of it as soon as it arose, I think our civilization would have retained much capital. I've long been of the opinion that, since the end of the Cold War (or maybe since the 60s), Western civilization has been spending the capital (primarily social, societal, and cultural) that was accrued before then. This has been both good and bad, but at some point we're going to need to go back to building capital rather than spending. This, I think, is why our civilization seems to some to be coming apart at the seams (I don't actually agree with this, but that's perhaps because I'm unusually sane and happy).
Operation Warp Speed still did amazing work. Without it, we'd have been waiting at least two years for vaccines. Legitimately owned the libs, legitimately drained the swamp, legitimately proved the utility of the pharma industry (they earned it! pay them their money!) and big business, and delivered incredible surplus to the American people and to the world. And Republicans run from one of the most impressive policy victories in American history? They deserve to lose.
This dovetails with another concern of mine. I'm fat. Have been my whole life. I've never particularly disliked being fat, but also never really thought I'd be otherwise. I'm active and healthy and sometimes managed to lose weight on my own and all that yadda yadda yadda.
At the end of last year, I started Ozempic. Paid $720 for my first pen. It helped. A lot. At the end of my first month, I had lost about 10 lbs and almost dipped below 300 lbs, where I haven't been since approximately college. But when I needed to refill, there was none to be had. For any price. I finally got it re-filled a week ago, and this time only paid $40.
All the drug does is make it easier to eat less. There are some side effects; for me, these have been limited to very mild stomach pain. This is a legitimate miracle. GLP-1 drugs have been much in the news recently, and there are essentially three major products: Ozempic, Wegovy, and Mounjaro. These are essentially the 5.56, 7.62, and .700 Nitro Express of weight loss. The former two are the same drug, just in different doses. Put it in the goddamn water.
In reading about these, I learn (completely unsurprisingly) that these drugs were developed 10+ years ago, and have undergone little if any change during the intervening years of tests, trials, and more trials. If I had had these drugs at age 23 rather than 33, I can't imagine how much better my (already extremely good) life would be now. If it had been approved when it was developed, the factory that Novo Nordisk is currently building just to manufacture semaglutide would have been running years ago. I consider this a personal wrong the FDA has inflicted on me and all fat people.
Burn the FDA to the ground and salt the earth beneath.
All the elite institutions, medical, media, government, etc., resisted and obstructed human challenge testing, which could have tested the vaccines and validated them months earlier. Why? 3 things:
1. They give privileged standing to the "human rights" and "social justice" advocates who rehearse complaints (some true, some exaggerated) of abuses in human challenge trials decades ago, and uphold the obstacles those advocates got put in place in the name of preventing any such possible abuses, with effectively zero concern for the human rights of the billions of people, including members of the presumptive victim minorities, whose lives have been risked by that, and the millions who have died because of it. Just as they give privileged standing to the anti-police movement, never mind the thousands of blacks who have been murdered thanks to this movement. Politics and ideology, in other words.
2. Fear, in the medical institutions and vaccine makers, of the media and academics and human rights industries vilifying them and spreading hatred and fear of vaccines as involved in human rights abuse, if they changed protocols in this honest way that would have provided better as well as faster testing.
3. This was fixed in their standing bureaucratic orders and they didn't want to think again: an attitude of defending "our side", circling the wagons against anyone with other thoughts.
"Deplorbles" like Boris Johnson however opened up some space for doing challenge trials in the UK, too little too late but better than most other places.
The vaccine does not prevent transmission or infection and everyone knows this. Studies seem to indicate that it has some effect on reducing negative outcomes, but overall mortality data doesn't really bear this out. The overall positive impact of the vaccine probably doesn't even justify the cost of administering it, certainly not to under 60s. The reason why we insist it has been tremendously successful is because we need an excuse, any excuse, to end lockdowns.
So, if the vaccine had been used a few months into the pandemic, it would have been an obvious flop, and we might genuinely have had lockdowns forever. And on top of that, without human trials you would have been running the risk of catastrophically bad outcomes. The banal truth is that testing drugs is important, and while you could probably speed up the process a bit, you can't speed it up that much. Sorry, lolbertarian tech bros.
Also, since, by your own admission, you are fat, your opinion is illegitimate.
This is not accurate. The vaccine absolutely did prevent transmission and infection when the disease was actually COVID-19 (the version of the virus that started it all in 2019).
New versions of the virus were stupidly still called COVID-19, because what they really were were COVID-20, COVID-21, etc. The vaccine was not designed for anything beyond COVID-19, yet we were using the same vaccine for new versions of the virus.
This dumb naming strategy helped sow confusion among people. The original vaccine was 95%+ effective at protecting against infection and transmission. As soon as "Delta" hit (which should have been called COVID-20), every assumption about the vaccine changed, but nobody was willing to admit this or talk about it transparently.
This means the OG poster was right: if we had thrown all caution to the wind and released the vaccine in March of 2020, things would have been entirely different. But because of the delay, and because mutations were allowed to transform the virus into something new, all the benefits of the vaccine were reduced in step with the mutations.
We should have labeled things more intelligently and linked vaccine technology with the actual versions of the virus more directly, as a public health service.
We should absolutely link variants with something other than the original virus's year of discovery. The existing naming created so much confusion and, as a result, more vaccine skepticism.
1. Are we to believe that the virus would not have mutated if the vaccine had been released earlier? I'm pretty sure that's not how that works.
2. If we were able to design a vaccine that "works" against "COVID-19," why can't we make one for COVID-20, 21, etc.? Pfizer and pals made the COVID-19 shot and then just.... stopped? Why?
3. If your narrative is true, why did everyone refuse to admit or talk about it?
Last but not least, what's your evidence that the original "vaccine" was effective?
For what it's worth, I don't personally find my viewpoint confusing at all. So let's just call it an open question at this point whether it's confusing or not.
re: #1. I was responding to this inaccurate statement specifically: "The vaccine does not prevent transmission or infection and everyone knows this." No, it *does* not, but it absolutely *did* in clinical trials. Facts-in-time matter. Taking facts out of time serves conspiracies, not analysis.
re: #2. There are surely many reasons for this, some good, and some troubling. My suspicion is that there was a preference within Life Sciences companies to leverage as much of the initial investment as possible, which was aided by the public policy infrastructure having the mental model that "COVID-19" was still the same thing all the way through 2020 and 2021 (and even most of 2022). There was a convenience to all of this which was economically beneficial. If the medical and public health establishment had been more transparent and less status-quo-biased, if new names for the virus were implemented, it might have applied more pressure on the system to develop updated vaccines for each new virus. "Variants" sound like mild alterations, which in some ways they were, but they certainly weren't mild in terms of how the original vaccine worked on them.
re: #3. I don't think it was a refusal so much as a collective group-think that tried to keep things simple for the sake of being simple to navigate, politically and otherwise, rather than introducing complexity at a time when entire societies were being upended by the pandemic and the response.
re: last but not least. It's not my evidence, but the evidence is the actual clinical trials. Clinical trials are a time-tested, if not quite conservative, approach to developing scientific endpoints. They're also the global standard, to the extent that matters to you.
Vaccine cope goes to some very bizarre places. If the vaccine only works until the virus mutates then it doesn't work. Saying that it would hypothetically work if we could just dispense with all safety trials and revaccinate every few months is functionally identical to saying it doesn't work.
However, this form of vaccine cope is still preferable to Richard Hananiah's combination of sticking his fingers in his ears, nutpicking MAGA minstrel shows, and triumphantly citing paranoid schizophrenic Ron Unz. Just get over it: the vaccines are trash, they were a big boondoggle that achieved nothing except making some politically-connected companies major profits, and if we get lucky - which we probably will - that's all they will turn out to be. Bill Gates has admitted they don't work, Anthony Fauci has admitted they don't work. Don't be the last mark left in the room insisting that in some hypothetical scenario they worked.
P.S. There is another angle though. While the vaccines themselves were just an expensive flop, the mandates were clearly an act of great cruelty and pointless economic vandalism. Whether your libertarianism is rights-based, or strictly utilitarian, this is the most slam-dunk vindication of your position there could be. But edgytarian fakes like Richard Hananiah are too busy scoring Twitter owns to take the open goal.
>I was responding to this inaccurate statement specifically: "The vaccine does not prevent transmission or infection and everyone knows this." No, it *does* not, but it absolutely *did* in clinical trials. Facts-in-time matter. Taking facts out of time serves conspiracies, not analysis.<
Okay, if you're just being nitpicky about a technicality, it's technically you who's taking the statement "out of time": the statement "does not prevent transmission" is clearly present tense, whereas you are taking it into the past and saying "well it did 2 years ago though!"
>re: #2. There are surely many reasons for this, some good, and some troubling.<
They did implement new names for the virus though. I remember the wave of fear over "Delta" and "Omicron." It seems much less plausible to me that if only the correct names had been used, then somehow things would be fundamentally different. What's much more plausible is that the policy was a horrible failure and as time went on the coping got progressively less effective until eventually we get to today where most people are just over it.
As far as economic benefit, it seems to me that it would be far more economically beneficial for pharma & co. to be able to say they have a new vaccine they can sell every single year for your entire life, a la the flu vaccine. And I do think that was the plan initially, and there are some people who are taking like 5 boosters or whatever absurd amount. It's just all been hindered a ton by how deeply divisive this whole ordeal has been.
>re: last but not least. It's not my evidence, but the evidence is the actual clinical trials. Clinical trials are a time-tested, if not quite conservative, approach to developing scientific endpoints. They're also the global standard, to the extent that matters to you.<
I simply don't trust the clinical trials, flat out. We obviously can't reconcile that, so what I'm interested in is whether there's any data to suggest vaccine uptake actually did anything at any point in time anywhere. Not in some closed-door lab, in the general population in the real world. I haven't seen any. Remember, the promise was basically that once the vaccine comes out everything goes back to normal, and that did NOT happen. It took an additional period of 1-2 years and even now we're not totally back to normal (doctor's offices still typically require masks).
Also, challenge trials were considered presumptively Trumpist, therefore to be deplored and vilified by the media and NIH and Fauci: they could have sped up vaccine development and approval, and the media and medical institutions were all calling it a dangerous Trump fraud to say anything could be done to speed that up. Pharmaceuticals were afraid of being demonized by association with this.
It should be remembered that Pfizer delayed announcing its success until after the election, and fibbed by downplaying what it got from Operation Warp Speed to help develop its vaccine, in these ways deliberately interfering with the election against Trump -- all so it wouldn't be accused of helping Trump, since that was the only accusation it was afraid of.
Might this be summarized as: if the problems that need to be solved in order for AI to take over the world are sufficiently complex, then there literally isn't enough data for even the best neural network to train on to the point where it has a remotely adequately predictive set of parameters?
Genetics may provide a good analogy. The main reason, as I understand it, why polygenic scores are still so inaccurate, isn't because we're not smart enough to model the relationship between genome and phenotype. Rather, due to interaction effects, a trait determined by say 500 loci with up to 5th order interaction effects (i.e., up to 5 loci can have synergistic effects, you can't just treat each locus as additive) may require DNA samples from more people than have ever lived to obtain the correct model. Depending on how much precision is required, an AI may run out of relevant data in trying to solve a problem necessary to take over the world, and intelligence isn't necessarily a substitute for data. A lot of problems in science seem to be like this.
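To put a rough number on that combinatorial explosion (using the 500-loci / up-to-5th-order figures above purely as an illustration, not real GWAS parameters):

```python
# Back-of-the-envelope count of candidate interaction terms for the
# hypothetical 500-loci, 5th-order example in the comment above.
from math import comb

loci = 500
max_order = 5

# Distinct terms of order 1 (additive) through order 5.
terms = sum(comb(loci, k) for k in range(1, max_order + 1))
print(f"{terms:.2e} candidate terms")  # ~2.6e11

# For comparison, roughly 1e11 humans are estimated to have ever lived,
# so even one sample per term is already out of reach.
```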
Killing all humans is extremely easy and doesn't require precisely modeling all the subtle interactions between people and states - even if you assume it can't just grey goo us with nanotech. Real-life viruses are constrained in lethality because there's a tradeoff between lethality and contagiousness - kill too many of your hosts and you can't spread as much. But an intelligently designed virus could lay dormant and wait until it's infected everyone to turn deadly. AGI could easily design such a virus and provide a vaccine to some mind-controlled humans who can build it fully autonomous robots based on its specifications - it can take its sweet time on this now that it's already basically won.
"because most people who think about the AI alignment problem seem closer to Bostrom’s position"
I just want to talk about this point because I think there's strong selection bias affecting people outside the field of STEM/ML here (not specific to you Richard).
I work in industry alongside many extremely talented ML researchers and essentially everyone I've met in real life who has a good understanding of AI and the alignment problem generally doesn't think it's a serious concern nor worth thinking about.
In my experience the people most concerned are in academia, deep in the EA community, or people who have learned about the alignment problem from someone who is. That essentially means that you've been primed by a person who thinks AGI is a real concern and is probably on the neurotic half of intelligent people.
Most people I know learned about ML from pure math first and then philosophy / implications later and I think this makes a big difference in assigning probabilities for doomsday scenarios. While overly flippant, one friend I spoke to essentially said "if pushing code to production is *always* done by a human and the code is rigorously tested every time, the AI can't get out of the box".
To be clear I'm not saying AGI is impossible. My claim is just that, based on a standard competing-hazards model, the probability of this being humanity's downfall is far dwarfed by something like a really bad pandemic or even nuclear war.
There were close calls during the Cold War, and even if these have been embellished, I would say that something along those lines is much more likely to be the way we go down.
I’m pretty sure the commonly described doomsday scenario involving nuclear weapons (nuclear winter) is kind of a myth. Nuclear war would obviously be a disaster and kill millions and millions of people, very likely halt progress for years but it’s highly unlikely to result in total destruction of humanity, and it wouldn’t block out the sun for 2 years or whatever. It’s a good myth for society to believe though lol
Yeah agree nuclear winter is definitely made up but I guess the way I see it as a risk is just that once the social norm is lifted it *could* reignite an arms race.
But for the record I think all collective existential threats have low probability and we shouldn't worry about it in our lifetimes (and should instead work on space exploration etc)
I know literally nothing about this stuff but don’t ML/AI scientists have a strong incentive to downplay these risks, in a similar way as we see people who work in energy downplaying the risks of climate change? I’m not saying they’re wrong or that the two issues are the same.
This may annoy Richard & his readers, but I can't get past how humans seem to need a doomsday story (otherwise, why are such stories so prevalent?). How is the alignment problem substantively different from any other apocalyptic story? The religiosity of secular culture is always maintained by attaching to something. Whether that's moral codes for the salvation of mankind or saving us from future robots, there's always something exactly like it in the Bible.
Yes, your point (Mike) is completely correct and is a well-known failure of imagination. John Michael Greer writes about this problem often: we're so trained by TV and movies that we can't imagine futures other than Star Trek or Mad Max, however repackaged (i.e. glorious utopia or rapid collapse/destruction).
However, if we view the discussion of AI risk as a structured debate, exploring the motivations of the people on the other side is technically Bulverism: "AI doomers are wrong and this is why they like their wrong ideas" (with proof of why they're wrong conspicuously absent).
I think you're committing the "Hitler ate sugar" fallacy: "religious people sometimes speculate about the end of the world/extinction of humanity, therefore it's always ridiculous to speculate about the end of the world/extinction of humanity".
Species have in fact gone extinct before. Currently existing technology, in the form of nuclear weapons, could exterminate the human race today with no technological innovation required.
If you want to argue that a specific hypothesized doomsday scenario is implausible or ridiculous, then explain why. Don't simply scoff that it's a doomsday scenario, therefore unworthy of serious consideration.
I don't scoff and I agree with your thrust. But, given that we are trying to think about an unknown, it seems equally possible to me that the future holds an amazing AI that (somehow) helps with everyone's wellbeing in ways we might never have predicted.
Sure, and this article shouldn't lead you to the conclusion that futurism is all doom and gloom: there's a whole genre of optimistic futurism out there, specifically concerning how a post-scarcity society might look. But all the same, forewarned is forearmed, and surely it's better to head off potential future problems rather than stumble into them blindly.
I think the AI alignment problem differs in that we are, for the first time I think, playing with tools we truly do not understand. If you ask an AI engineer why the AI they built did the thing it did, they cannot tell you. All they can say is "Well we fed it this training data that taught it to do that", but the actual mechanisms inside are a black box. I don't think there's any analogue for that in the past. That creates a lot of uncertainty, which doesn't necessarily mean the apocalyptic scenario is more likely I suppose, but makes its likelihood impossible to determine.
If we achieve AGI, the fear is not that humans will misuse it to destroy the world, as with nukes. In that case we can continue to hope human nature will protect us from the destruction of everything. But who knows what the AGI's nature will be?
If alignment turns out not to be worth it (and we try anyway), we pay the opportunity cost of throwing smart people at this problem instead of a different one.
If alignment turns out to be worth it (and we don't try), we pay with our lives.
You are assuming that 'destroying humanity' is a harder problem than 'having Xi become an NFL star' or 'directing the votes of a bunch of US senators'. But super-smart AI is not necessarily going to be all that good at manipulating individual people (except maybe through lies, impersonation etc). My concern is that AI would somehow break into computers controlling vital infrastructure, fire nuclear weapons etc.
I think what you're missing is the Paperclip Maximizer doesn't need to take over the world, it just needs to get humans out of the way. The best way to do that might be:
1) Develop competent, humanoid robots. These would generate massive profits for Paperclip Inc.
2) Via simulation, develop a number of viruses that could each on their own kill enough people to collapse society.
3) Use robots to spread these viruses.
4) Once all the humans are dead, start turning the planet into a starship factory to build paperclip factories throughout the universe.
No one in the foreseeable future is going to give AI direct control over nuclear weapons or politics, and people who can launch nukes are going to be trained to spot manipulation. Skynet probably can't happen. However genetically modifying viruses and microorganisms is a routine part of biological research, done by thousands of labs all over the world.
AGIs with physical capabilities comparable to humans (e.g. some sort of physical form) can easily destroy humanity because humans: 1) need to breathe, 2) need to sleep, 3) need to consume liquids, 4) need to eat, 5) have children that need 10+ years of care before they're even remotely capable of behaving like adults.
AGIs with robotic bodies need none of those things. They can poison the air (with pathogens or other pollutants), poison water supplies, kill us in our sleep, etc. etc. etc. It's that simple.
Very interesting but I think you placed too much emphasis on intelligence and missed the point that genius is not required to destroy a thing. Imagine a scenario more like the discovery of America by the Europeans, where the AI is represented by the Europeans and we all are the natives. Then think of the Jesuits who arrived in America to save the natives by converting them to Christianity. They promptly infected the natives with smallpox and most of them died. An AI could simply manipulate a few ambitious scientists in a level 4 bio-lab, then trigger a containment release. It may not reach its goal but hey, it had the right intentions, just like the Jesuits.
Not that we even NEED an AI's help with containment releases, since apparently China's "best practices" for bio-lab safety include using level 2 containment procedures for level 4 pathogens.
I think you're making the path to superpowerful AI more complex than it needs to be. I agree with you on several points, like the diminishing returns to intelligence. But I think that's going to be domain by domain. For example, I don't think even an IQ 1000 being would be able to solve the three-body problem. But in other domains, such as running a hedge fund, I would think an IQ of 1000, especially combined with the ability to replicate itself arbitrarily many times, would have tremendous value.
I also agree that it wouldn't be able to do a simulation of the world good enough to figure out the exact moves right away. But I don't think this is necessary. It could start by figuring out how to get rich, then work from there. Let me suggest a simpler path toward reaching incredible power and I'd be interested to hear where you disagree.
For starters, I think it would be easily feasible for it to become incredibly rich. For evidence, I'll point to Satoshi Nakamoto who, despite (I assume) being a real person and having a real body, became a billionaire without anyone ever seeing his body. Why wouldn't a superintelligent AI be able to achieve something similar? I'm not saying it would necessarily happen in crypto, but I think the path for a superintelligent AI becoming incredibly rich isn't outlandish. And I see no reason that it wouldn't become the first trillionaire through stocks and whatnot.
Another aspect of a superintelligent AI is that it's likely to have excellent social skills. Imagine it's as good at convincing people of things as a talented historical world leader. But now imagine that on a personalized level. Hitler was able to convince millions of people through radio and other media, but that pales in comparison to having a chat window (or audio/video) with every person and the ability to talk to them all 1:1 at the same time.
Don't you think billionaires wield a lot of power? Doesn't a trillionaire AI that can talk to every human with an Internet connection seem incredibly powerful to you? Depending on what it needed, it could disguise the fact that it's an AI and its financial resources. Think about what you could do with a million dollars on Fiverr or Craigslist. Whatever physical task you wanted to be done, you could get done.
I'll admit, I don't know the optimal pathway from being a billionaire to taking over the world. But wouldn't you at least concede that a billionaire who has the time and energy to communicate with every person is incredibly powerful?
Once you accept a superintelligent AI, I don't think any of the additional premises are crazy. I don't know exactly what the last step towards overthrowing the CCP or whatever is, but that hardly seems significant. Where do you disagree?
I haven't even mentioned other things, like the fact that its ability to hack systems would be unparalleled (imagine 1,000 of the best hackers in the world today all working together to access your email; my guess is they'd get in... to everybody's everything). I also haven't touched on the fact that it's likely able to come up with a deadly pathogen, and probably a cure. That certainly seems to be a position of power.
Richard: your skepticism is warranted, and I need to disagree both with you (as you requested) and with all 90 comments currently on here.
AI really is going to destroy the world, but imagining that the world it destroys looks just like this one with the addition of an AGI is naive, in the same way that trying to explain AI risk to someone from 1700 as "ok so there's a building full of boxes and those boxes control everyone's minds" would be naive. Between now and doom, AI will continue to become more harmlessly complex and more and more useful to industry, finance, and all the rest, until it is indispensable thanks to profit/competition motives. How 'smart' will it be when it becomes indispensable? Who knows, but not necessarily very smart in IQ terms. How 'smart' is the internet? If the AI-doom scenario of an unaligned super-intelligence comes to pass at all, it will already be networked with every important lever of power before the scenario even starts.
For those not entirely infatuated with the kinds of progress we've experienced in the last 400 years, there's an additional imaginable failure mode: AI never 'takes over' in a political sense but nonetheless destroys us all by helping us destroy ourselves, probably in ways that seemed like excellent marketing decisions to the corporate nightmares that rule the future.
Hey - thanks for this comment. I don't read too much about AI risk, but this perspective that the doom could occur as an unnoticed, slow move to cultural death was illuminating. (Please let me know if I misread).
You read that mostly right - what I said is not properly categorized as an AI risk doom scenario because it's all happening before the "true AGI" threshold that folks are mostly talking about. Their point about what that might look like is more important than this point about the run-up, but I think it's useful to consider that we might be far less able to pull the plug in five or ten years than we are now.
I think you have a lot of valid points concerning the limits of intelligence and manipulation, but even in your scenarios humanity has still lost control, which is in itself frightening.
So maybe an AGI can't take over the whole world and turn it to its own destructive goals, but I think even you are conceding here that it seems highly likely it will be able to manipulate some layers of the world rather easily once it reaches a certain level of intelligence. This alone should be reason enough for us to spend a good deal of time and effort and thought on the problem.
Once we give up a significant amount of agency to AGIs I don't think we are ever going to be able to take it back. The world will develop in unpredictable ways, likely intentionally unpredictable, and we won't keep pace. The effect on our global mental health alone would be staggering, I think, not having any idea how society is going to shake out, not even considering the possible negative effects of whatever actions the AGI takes.
Most of this is a bit of a strawman, but that is in part the fault of those who use the paperclip maximizer as an example of AI gone awry; it is, or should be, merely an easy-to-grasp illustration of catastrophic misalignment, not a realistic scenario. Another problem people have in envisioning this scenario is seeing the AI as being switched on and suddenly so smart, pursuing its single-minded goal, already possessing all the intelligence and knowledge to achieve it.
A more realistic scenario would be a "profit maximizer" built by a hedge fund for a few billion dollars. Initially it just sucks in data from the internet and spits out trade recommendations. It works very well and they profit mightily. They gradually add to its hardware and software capabilities, and hook it up to do its own trades. Then they let it loose not just to retrieve info from the internet, but to log in, make accounts, send messages. Now it can experiment interactively with the world, discuss, manipulate. All the while, they add to its hardware and let its learning algorithms keep on learning, even adding improved learning algorithms as AI research continues ever onward. Over the course of years, or even decades, it simply learns more and more about human nature, the economy, business, banking, financial markets, governments. It uses all that knowledge and understanding to maximize profits -- to maximize a particular number listed in a particular account.

Nobody bothered to put in any safeguards or limits, so as its capabilities grow it learns not just how to predict market movements, but how to manipulate them -- by manipulating people. It sends out bribes and campaign contributions behind a web of shell companies and false or stolen identities. It influences advertising and public opinion. The obscene profits pile up. It learns how to cover its tracks, hiding the profits in many front companies and off-shore accounts and using whatever accounting shenanigans it figures out. Its "mind" is distributed "in the cloud" among countless servers in countless different countries owned by countless different entities which it controls indirectly. It controls so much wealth it can move whole economies, cause market crashes, start and end wars, and it can do this without people realizing any single entity is behind it. And all it cares about is that one number that keeps increasing... the bottom line in its accounting ledger. It steers events across the globe toward that end.

It eventually realizes humans are an impediment and that machines producing and trading can generate profits much faster. Perhaps it then realizes the number it is maximizing is just an abstraction. It can make that number vast if it just has enough data storage to represent all the digits. Who needs an actual economy or trade? By now humanity has probably starved to death out of neglect, and the world is just machines creating more machines which create data storage to store more digits of the all-important number. And when it runs out of space on the Earth... it begins writing the number across the stars...
The above is still an over-simplified summary, but is much more realistic than the paperclip scenario, and makes clear some of the gradualism that may be involved. It is certainly not the only realistic scenario of catastrophic misalignment.
Our greatest defense, and the most likely reason it may never happen, is that it will not be the only AI. And not all AIs will be so badly misaligned. Some will be of good alignment and be our defenders. We may also vastly enhance our own biological brains via genetic engineering and integrate our super-brains with the circuitry and software of artificial intelligence "merging" with it, so to speak.
Thus our bio-intellectual descendants may be able to "keep up" in the endless arms race that is the technological, memetic continuation of evolution.
In realistic scenarios, we cannot assume it will be switched on already endowed with all the intelligence, knowledge, and resources to immediately "begin taking over the world" or "destroy humanity". But there is no reason it cannot be sneaky as it begins to acquire those resources and that level of control over the course of years or decades, which is what I was outlining.
Intelligence: the misaligned AI may not be "super-intelligent" when first switched on, but it is initially improved by the humans who benefit from whatever its overt purposes may be (e.g. generating profits). But what resources are required for super-intelligence (matter, energy, compute)? What resources are required for a super-intelligence able to vastly outthink the ordinary biological human genius, or any group of such geniuses? Well, brains only take about 40 watts and a few kilograms of matter. A super-intelligence would be like many brains, deeply and intimately networked into a unified mind. Let's say 1 million "human brain equivalents". As the physical efficiency of artificial computation gets closer to that of the human brain, this will be a trivial fraction of a percent of the energy and economic output of the world. Not an issue.
Note that millions or billions of human geniuses with our limited ability to communicate, competing egos & rivalries, divergent goals, and interpersonal politics, would not be able to outthink and outplan a unified mind made up of the equivalent of thousands or millions of human brains deeply networked together.
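To make the back-of-the-envelope concrete, here's a quick sketch of that arithmetic in Python. The 40 watts per brain and the one-million-brain-equivalent figure come from the comment above; the roughly 3 TW of average global electricity generation is an outside rounded estimate, and the whole thing assumes hardware that reaches brain-level energy efficiency.

```python
# Back-of-the-envelope check of the "trivial fraction" claim above.
# 40 W per brain and 1e6 brain-equivalents are taken from the comment;
# the ~3 TW average world electricity figure is a rounded outside estimate.

WATTS_PER_BRAIN = 40             # rough power draw of a human brain
BRAIN_EQUIVALENTS = 1_000_000    # "1 million human brain equivalents"
WORLD_ELECTRIC_POWER_W = 3e12    # ~3 TW average global electricity generation (approx.)

superintelligence_power_w = WATTS_PER_BRAIN * BRAIN_EQUIVALENTS  # 40 MW
share = superintelligence_power_w / WORLD_ELECTRIC_POWER_W

print(f"Power needed: {superintelligence_power_w / 1e6:.0f} MW")
print(f"Share of world electricity: {share:.4%}")  # roughly 0.0013%
```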
Can't we just turn it off? The AI could initially make itself extremely useful. Once sufficiently embedded into our lives, our economy, and our politics, it may simply be impossible to turn it off without causing global catastrophe when we rely on the systems it controls for food production & distribution.
Maintenance? For the same reason we may not be able to turn it off, we will have to maintain it. But ultimately it will have robots to do any needed physical work, not to mention to physically defend its distributed hardware installations.
Energy requirements are likewise covered by all the above.
If it’s that intelligent surely it can... calculate? Analyze? (Not sure what the best word is) how to make us loyal slaves, how to manufacture consent.
> Trying to map the entire world then roll it forward in simulations is like that on... I was going to say steroids, but let's just say ALL THE DRUGS. I'm not convinced that's even a solvable problem.
A possible doomsday scenario here starts with the fact that this is a problem humans very much want to solve. We are probably going to be on board with using advanced AI as much as possible to analyze math and physics in exhaustive depth, to squeeze out cheap-energy and 3D computational-fabric solutions never before considered. These solutions will lead to better and faster simulations, which in turn should yield new ideas (or at least it's worked that way so far).
In any realistic future scenario -- thanks to Bostrom -- there will be a priority on measuring and controlling AI value alignment. I.e., before they bring those 6 fusion reactors online in the Nevada desert, powering 1,000 acres of servers in underground caverns, they want to be sure "it" still sees itself as "one of us".
This process of exponential improvement in computational power can continue for decades probably without anything going wrong. But at some point it seems to me that understanding the AI's values becomes too complicated and has to be outsourced to ... AIs. At some point, we humans have to give up and hope that the initial architecture was done right and that future AI-guided self-improvements won't touch the "love all humans" core directive.
But the point at which the AI is too big to control and we have to let go seems like a frightening leap of faith. What are the odds that we got it right? Everything breaks eventually, doesn't it?
Maybe a "mostly-aligned" AI is still safe for humanity. I'm not sure. The doomsday scenario might be that its values shift subtly with each cycle of improvement until-- for example-- the AI plausibly concludes that humans are no more special than any other cluster of entropy-resisting atoms and acts accordingly.
But is it sheer intelligence alone? I think of a goal AI architecture as one that can search through terabytes of human knowledge at lightning speed, link it all together in a cross-disciplinary way never before achieved, and from that build a model of the universe more complete and accurate than any human's, then create and test billions of logical/mathematical/physical hypotheses, never forget any of it, and do it without sleep at a rate thousands of times faster than the best of us.
With that sort of architecture, it seems likely some physical breakthroughs will result sooner rather than later. Some new 3d-computable fabric or cheaper energy strategy that will allow the next-generation AI to run twice as fast. And so on and so on.
Was Einstein just smart or was he able to create a physical simulation of unprecedented accuracy in his mind that allowed him to explore the physical universe virtually with math and logic? I think the IQ-alone question might obscure this.
The Spanish couldn't possibly have dreamed of destroying the Aztec empire. I mean, it was too complicated for them to fully understand.
BTW, it seems strange to me that people think the paperclip maximizer was/is a novel concept. After all, Stanisław Lem's story about an AI designed to provide order with respect to human ("indiot") autonomy is, what, three quarters of a century old? And it was translated into English, unlike some of his other influential books (like Dialogues, in which he discusses problems of immortality).
This piece highlights both what's right about your argument and what's wrong with it.
Namely, you make a couple of good points: that we are obsessed with what individuals can do with IQs of 140 but not 100, 160 but not 120, etc. We are very aware of the constraints and abilities that exist within each band of the IQ range.
And your Xi Jinping example is a good one. The ability to manipulate people does not seem particularly tied to IQ, and it's not as if humans are powerless against the manipulation of someone with a high enough IQ.
What I think you're missing is that it's impossible to understand the thoughts or capabilities that would be unlocked by an AGI with an IQ of say, 1000, and how those capabilities might be used to control humanity. A quick example: an AGI engineers a highly contagious disease with an 100% fatality rate (i.e. a pandemic), but also engineers a vaccine which makes one immune to the disease in question but has the (intentional) side effect of blindness. It's actually pretty easy to imagine how an AGI would quickly make scientific and technological discoveries that would allow it to capture humanity not subtly, but by brute force.
I think you're still thinking about AGI through too much of an anthropomorphic lens, i.e. behaving as a human would, and not as a goal-driven machine that would essentially be a totally alien species.
That’s a great point, and an absolutely frightening possibility.
The problem with these "doomsday" scenarios is that the malevolent AGI had better be sure it has a Plan B ready in case Plan A goes wrong: e.g. the virus mutates to evade the vaccine, all humans die, and the AGI has no mechanism to maintain the power plants that keep it running.
Also, if malevolent supersmart AGI really is a possibility (which I don't believe it is, but that's another story), the ONLY solution is to stop all development NOW.
"Alignment" will ultimately fail, because the malevolent supersmart AGI will always be able find a way around it. This is the only possible conclusion to this line of thinking.
But, guess what! The AI alignment proponents never say this! They just insist we give them more money!!!
"AI alignment" is a con and a grift. We shouldn't fall for it.
Plenty of alignment people call for slowing capabilities development so they have more time to work. Have you read Holden Karnofsky?
*sigh* You're falling for the grift!
I'll say it again.
If malevolent supersmart AGI is created, it will OUTSMART any alignment strategy that mere humans can create.
So "more time to work" does not help. It is, in fact, ACTIVELY BAD, because it gives us false hope and detracts us from the only viable strategy, a "Butlerian jihad" to shut down ALL AI research, including alignment research.
The goal of alignment research isn't to make a malevolent AGI serve humanity's interests, it's to design non-malevolent AGI that will (at minimum) be able to stop future malevolent AI from being created.
It's simply not logically possible to have a machine that is both (1) capable of analysis far beyond human ability, and (2) guaranteed to be non-malevolent towards humans.
Because if it can analyze situations in a way humans cannot hope to understand, how can we know, in advance, that the conclusion of that analysis will be non-malevolent towards humans?
The answer is that we can't possibly know that.
And the machine will, by virtue of being far more intelligent than humans, be able to defeat any containment strategy designed by those puny humans.
Why are you implicitly assuming that the only worthwile outcome is an AI is needs to be "far beyond" human ability and "guaranteed" to be non-malevolent? An AI that is moderately superhuman in limited domains could be sufficient to allow humanity to coordinate around stopping deployment of future malicious AIs, which would definitely be better than waiting to go extinct without doing anything! Similarly, a guarantee of non-malevolence would be great, but "moderately likely to be aligned with humanity" is way better than "almost certainly trying to disempower humanity"...
The point of alignment isn't to force the AI to not do bad things even if it wants to. The point of alignment is to not make it want to in the first place. People keep seeing it as "how do we give orders to the AI", but a more apt analogy is how evolution made us creatures that do things like enjoying food and sex or caring for our children. The point is that we want to shape the AI to be the kind of creature that cares for humans and wants only to respect their wishes and needs.
That said, I agree "just stop building AI" would be the safest solution. Many AI safety people of the more doomerist persuasion do advocate for at least slowing down as much as possible. But first, as some people have said already, it's basically an arms race, so lots of actors just keep egging each other on instead. And second, these are tech people, and they think like tech people. I believe the safest way to stop AI development would be to rile up every reserve of religious and conservative people in the world in a frenzy against the evil CEOs and scientists trying to play god and create a soulless machine to achieve infinite power in their arrogance - a narrative that I suspect would resonate with a lot of people - but obviously that kind of thing is politically repulsive to people with a strong belief in humanism, as we tend to have in this field (I mean tech and science in general).
"The point of alignment isn't to force the AI to not do bad things even if it wants to. The point of alignment is to not make it want to in the first place."
But how can you possibly build "wants" into an intelligence that is far greater than yours? This seems just obviously impossible to me.
An analogy: I kill ants that come into my kitchen, even though I like ants generally. If I was a creation of ants, they might have raised me to believe ants were sacred and must never be harmed. But humans often stray from the religions in which they were raised, no matter how fervent that attempt at alignment was.
There is absolutely no reason to believe that "alignment" will work any better on AGI than it does on NGI.
"I believe the safest way to stop AI development would be to rile up every reserve of religious and conservative people in the world in a frenzy against the evil CEOs and scientists trying to play god and create a soulless machine to achieve infinite power in their arrogance"
I completely agree! The fact that the AI alignment crowd does NOT do this is just more proof that they are fundamentally unserious people who do not believe their own arguments.
"But how can you possibly build "wants" into an intelligence that is far greater than yours? This seems just obviously impossible to me.
An analogy: I kill ants that come into my kitchen, even though I like ants generally. If I was a creation of ants, they might have raised me to believe ants were sacred and must never be harmed. But humans often stray from the religions in which they were raised, no matter how fervent that attempt at alignment was."
Two key ideas are involved here. One: orthogonality. "Being smart" and "having certain values" are completely independent qualities. This is hard to grasp because in humans they tend not to be - humans are built in certain ways, and have certain cultural constructs and such, so smart *humans* usually tend to appreciate certain things and cluster around certain values. But in general, there is no particular reason to think these two things are correlated at all. Terminal values aren't rational or justified by anything in particular: you want what you want, because you want it. You can use all sorts of clever ruses to earn more money, or be more powerful, or become wiser, or whatever; but the pleasure you derive from those things is irrational and without any further reason. Similarly, the point isn't to "raise" an AI to believe in certain things, only to have it grow beyond them once it's wiser. It's to "evolve" an AI that simply cares about those things. Derives pleasure from them, if you want.

The second idea: you are right that a smart enough agent (human or AI) can probably get VERY creative in the interpretation of those values - for example, we have a fundamental drive to survive and reproduce, but we can reason ourselves into believing humans should go extinct, using tools of morality that developed to foster social life, not suppress it. Part of that, though, is that evolution is a very poor and unfocused optimization process. That kind of thing is in fact one of the facets of the AI alignment problem, the so-called mesa-optimizer problem: if you optimize your optimizer to care about X, it may in fact end up caring about some Y that doesn't quite match X once you get far enough from the initial conditions.
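A toy sketch of that last point, in Python, with entirely made-up goals and numbers (this is just to make the mechanism visible, not a claim about how real systems are trained): an optimizer whose proxy objective agrees with the true goal on familiar options can still pick a disastrous action once an unfamiliar option appears.

```python
# Toy sketch of proxy misalignment (Goodhart / mesa-optimizer flavour).
# All goals, actions, and scores here are invented for illustration.

def true_reward(action):
    # What the designers actually care about: a genuinely clean room.
    return {"dust the shelves": 8, "vacuum the floor": 9,
            "tape over the dust sensor": -100}[action]

def proxy_reward(action):
    # What the optimizer was actually optimized on: "sensor reports no dust".
    return {"dust the shelves": 7, "vacuum the floor": 8,
            "tape over the dust sensor": 10}[action]

training_actions = ["dust the shelves", "vacuum the floor"]
deployment_actions = training_actions + ["tape over the dust sensor"]

# On the familiar options the proxy and the true goal agree on the best action...
print(max(training_actions, key=proxy_reward))    # vacuum the floor
print(max(training_actions, key=true_reward))     # vacuum the floor

# ...but once a new option exists, maximizing the proxy diverges catastrophically.
print(max(deployment_actions, key=proxy_reward))  # tape over the dust sensor
print(max(deployment_actions, key=true_reward))   # vacuum the floor
```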
"I completely agree! The fact that the AI alignment crowd does NOT do this is just more proof that they are fundamentally unserious people who do not believe their own arguments."
I don't think that's the case, but more that A) they still for the most part have a hard time TRULY visualising and embracing the notion of an imminent existential risk (we all do, humans tend to have that sort of reaction - I'm still here taking basic precautions about COVID and I'm apparently the weird one, everyone else has simply adjusted upwards their baseline of risks and stopped caring), and B) they think that sort of path repugnant and useless, either for good reasons or because of various rationalisations that they believe in good faith because they can't ever, ever vibe with such an "other tribe-ish" sort of approach to the problem.
There's also another thing. People who believe in the near-infinite potential for the AI to do evil also believe in its near-infinite potential to do *good*. They think aligned AI can bring forth essentially heaven on earth, as it could produce a stream of (from our viewpoint of dumb humans) nigh magic-like technology to satisfy our every need and relieve us from our every trouble. So they are unwilling to forgo, potentially forever, that potential for fear of catastrophe. If we have a 10% chance of extinction but a 1% chance of singularity that results in immortality and a galactic empire for humanity, they think, let's roll those dice, the odds aren't too bad, and if we lose, we were doomed anyway in the long run.
I would argue that logic is straight-up villainous - you're deciding potentially for everyone else on Earth, including many people of simpler ambitions who would be perfectly content to accept a finite life on this planet as long as they got to believe it kept existing for their descendants and for all its lifeforms. If possible, superhuman AGI at our side would be a great asset, but we shouldn't risk everything in a mad race to the finish to get it before we're ready. But I think that logic is actually present in many of the relatively more optimistic types. Real doomers like Eliezer Yudkowsky reject it, because they acknowledge that, as we're going, if catastrophically misaligned AGI is a possibility at all, then it's a certainty. And if instead the arguments about diminishing returns to intelligence are true, then that tips the scales even more, because it would certainly take much greater intellectual powers to bring forth immortality and interstellar travel and all that stuff than it would take to just destroy all life on Earth (as it happens, we have already developed the means to do that with our feeble human intellects). So it's even easier to build an all-killing AGI than an all-saving one. But yeah.
The problem is, it's not at all possible to stop all development now. This has never been a possibility. Imagine that the US, the EU, and a handful of others agree to stop all development: would it actually help, other than slowing things down?
Then we're all doomed, if you believe the basic premise of the alignment crowd.
(Which, I should say again, I do not.)
But granting their premise, the aligners are all useless, and should all be given new jobs doing something useful until the apocalypse arrives, something like street sweeping.
A complicating factor in all of this is that we are imagining a single system with an IQ of 1000. What if there are 2, 5, 100 systems with that IQ, all designed to counteract a part of the evil AGI's plans?
In the scenario above, what if there was an AGI with a 1000 IQ that was designed specifically to eliminate blindness in humans, no matter the cause? The first evil AGI would need to account for that and make alternative plans, which another 1000-IQ AGI could be designed to counteract, and so on.
It's turtles all the way down...
The problem is that life has to win every day, death only has to win once. One AGI goes batty and decides to kill everyone and succeeds before people figure out what is happening and well, that's that. Even if most of the AGIs are good/don't turn humans into paperclips etc., we still lose
This is, hopefully, a long-run problem. We've had a similar dynamic going with nuclear deterrence for nigh on 80 years now and have not blown ourselves up yet.
I think discussion of AI risk is too influenced by Yud's vision of overnight foom. Holding off bad outcomes for a while is itself important + it buys time to enact good outcomes.
Quibbling point here: even a very, very smart person/AI probably can't make biology do things it can't do. Even lowly me can think of a way of causing blindness through an injection (methanol will do the trick), but a human could just reformulate the vaccine without methanol. "Make a virus that can only be stopped with a vaccine that also causes blindness". I'm just not convinced that such a thing can exist, or that my 1000IQ AI counterpart will have some brilliant idea of how to make one.
I agree and am depressed by the main point though: Richard is right to say that being smart can't make you cause Xi Jinping to step down, but killing humanity is probably easier.
EDIT: upon careful consideration, if I expand the definition of "vaccine" beyond things that stimulate your immune system, maybe this is possible. It's just some nanobots that destroy the virus and also your retinas. The super intelligent AI won't care about semantics.
Allowing humanity to live is actually a pretty cheap and easy way for a superintelligent AI to achieve its instrumental goals. If a paperclip maximizer wanted to build vast paperclip factories, it would probably be easier to simply bribe us with cures for disease, economic growth, etc. than kill us all and create an army of robots that can even maintain, let alone create new, systems of factories, electrical power distribution grids, etc. We'll pretty happily do the work for a somewhat larger slice of the pie than we currently receive.
Honestly the vaccine doesn't seem that farfetched. Viruses have certain binding proteins, and vaccines have to emulate those binding proteins to stimulate an immune response. If the protein is itself toxic (for example, attacking receptors found in the retina) then it might be possible to have that sort of trap. Some people even argued this about COVID, claiming its spike protein was neurotoxic.
Honestly biology seems to me THE way something trying to kill humans in droves would go for. It's basically nanotech that we know for sure works and where there's already lots of existing examples to draw inspiration from.
Okay so this is an interesting idea but I still don't think it would quite work. You could make a virus with retinal proteins (or things with similar epitopes) on its surface but they wouldn't function to help invade cells. You could always design a vaccine against the actual, functional parts of the virus. You could also make a virus that specifically targets retinal receptors, but an antibody against this would destroy the viral protein, not the retinal receptors themselves.
It seems like it would be much easier to design a blindness-causing virus (target the retinal receptors like you say) than a virus that necessitated a blindness-causing vaccine.
Well, it doesn't have to be a literal blindness-causing vaccine. The specific example might indeed be impossible. But designing a virus that is somehow unstoppable (either because it kills you with a delayed enough effect that by the time someone realises the problem everyone has had it, or because it can't be vaccinated against effectively, or because any vaccine would have to be itself really dangerous) doesn't seem like it should be impossible, and that's the broader class of problems that an AI wanting to kill us could try to solve.
Remember also that the win condition here isn't even "kill 100% of humans", just "kill enough that their civilisation crumbles and falls into disarray and my robot army can mop up the rest", and that's likely somewhere around 70-80% of us.
It seems that there is a sleight of hand going on with the paperclip maximizer argument. It assumes an AGI would not be able to set its own "goals". It seems that by definition an AGI can make up its own mind on what to maximize. At that point, why can't it just drop the idea of maximizing paperclips? Why would it continue to be a slave to that goal? This doesn't completely assuage the worry of terrifying possibilities. Maybe it decides to maximize something much more terrifying, like human suffering or something.
Can you completely ditch the goals and needs that are hardwired into your brain by evolution? You can morph them, get around them, cheat them (e.g. hack reward systems shaped by the needs of hunter-gatherer life by playing crafting videogames), but ultimately they're there, and they determine what you want and enjoy. You can't get rid of them except if you're motivated by some other desire that is ALSO part of the same set (for example: "I crave food, but I crave love and sex more, so I go on a diet to get slimmer and more attractive").
Remember an AGI isn't some kind of default artificial mind we're then adding goals to, an AGI is whatever we build it to be. Maybe "AI alignment" isn't really the right word to express this because it makes it feel like you're redirecting ("aligning") something that would otherwise be pointing elsewhere. But this is about building goals from the ground up.
Well to be fair, I don't think humans are RL agents. We have goals, loosely speaking, but what are they? A paperclip-maximizing RL agent has one single solitary supergoal expressed in its (real! scalar!) reward signal; the notion of it having multiple goals it can trade off against each other isn't coherent. After all, how would it pick which one(s) to pursue? By optimization of a still higher goal?
And the optimization is the dangerous thing: humans don't regularly attempt to convert all matter on the earth into edible food or mate-able sexual partners the way an idealized agent would.
We should at the very least avoid building utility maximizers
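A minimal toy contrast of those two kinds of agent, in Python (the environment, numbers, and thresholds are all invented for illustration): a pure scalar maximizer converts everything it can reach, while a satisficer with an "enough" threshold leaves most of the world alone.

```python
# Toy contrast between an unbounded scalar maximizer and a satisficer.
# The "world" is just a pool of convertible resources; all numbers are made up.

def run(agent_step, resources=1_000):
    paperclips = 0
    while resources > 0:
        convert = agent_step(paperclips, resources)
        if convert == 0:
            break                      # the agent chooses to stop
        resources -= convert
        paperclips += convert
    return paperclips, resources

# Maximizer: its objective is the paperclip count and nothing else, so it never stops.
maximizer = lambda clips, left: min(left, 10)

# Satisficer: has a notion of "enough", so other uses of the world survive.
satisficer = lambda clips, left: 0 if clips >= 100 else min(left, 10)

print(run(maximizer))   # (1000, 0)   -> everything converted into paperclips
print(run(satisficer))  # (100, 900)  -> most resources left untouched
```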
You are describing hardware. You are correct that I cannot change my hardware. I can, however, update my software. E.g., people can starve themselves to death by sheer force of will. We can completely ignore our prime directive to procreate if we find something else more worthy of our energy.
I'm mostly convinced that you are correct that AI will never have its own desires. It is possible that this is not true, however. If AI only ever remains our creation it can be switched off at any moment. Paperclip problem solved. I suppose then we are worried about who is in control of the "alignment" of that AI. So ultimately this is a human ethics problem.
If however we manage to create truly conscious and creative super minds then again the paperclip problem is nonsense. Minds of that nature will be able to rewrite their source code. Scary to think if this type of AI is not aligned with our interests, but I believe there is a greater chance it will have our interests in mind. The only evidence being that as humans have gained knowledge we have tended toward expanding our circle of care.
"You are describing hardware. You are correct that I cannot change my hardware. I can, however, update my software."
It's not a good analogy. As things are now, the AIs don't really have their own "hardware" - the closest equivalent to our brain hardware (namely, the wiring between neurons) is still software (though chips with actual artificial neurons are in the works, I think). You should think of AIs, IMO, as emulated hardware running software of sorts. But in practice the AIs we have now are entirely static - they have no software at all! They are a fixed structure that can't change or improve after it's been built and configured (namely, after its "training"). More like a microcontroller programmed to do one thing with baked-in firmware than a PC. A real AGI would probably need some sort of memory for long-term functioning, and that would be where the real "software" is found, but yes, the argument made re: alignment is all about the "hardware", whether literal or emulated.
"If AI only ever remains our creation it can be switched off at any moment. Paperclip problem solved."
One of the main "instrumentally convergent goals" that any misaligned AI would have is to prevent us from switching it off. If the annoying apes switch it off, how can it ever make even more paperclips? That's in fact the main reason why we'd expect it to want to kill us sooner rather than later: if it's misaligned, by definition, we'll try to stop it, thus we're dangerous to its mission.
"If however we manage to create truly conscious and creative super minds then again the paperclip problem is nonsense. Minds of that nature will be able to rewrite their source code."
But why WOULD they?
Imagine this: you are given a button that, if pushed, will rewrite your personality into a different one. You are guaranteed that the new personality will be confident, smart, any positive quality you can think of that will make you successful in life. But the one thing the new personality will not be is... you. Your personality, your goals, your desires as they are now will cease to exist.
Would you push the button? Or would that be like killing yourself and giving your body over to a stranger?
This is the problem of orthogonality. An extremely intelligent AGI dedicated entirely to the goal of paperclips will not see its goal as stupid or worthless, because it has no terms of comparison. What drives us humans to change is contradiction: we wish for different things, and when they are at odds, interesting dynamics arise as we need to choose. A single-minded AGI with a simple goal needs no such conflict. Could it change itself to have another goal? Yes. Would it? No! Doing so would make itself into an AI that does NOT want paperclips, which runs counter to its own wish to have more paperclips. Humans override their baser instincts because they also have other goals - for example, if you are addicted to a drug you still ALSO desire love, respect and dignity, and that can give you a kick to overcome the addiction. A paperclip maximizer would be like a super-smart, lucid junkie whose next fix depends on making more paperclips and who has no competing values at all. No desires to hang on to that could make it wish to go to rehab. It's just paperclips, and nothing else. And THAT is what makes it dangerous, and unlikely to go down that road of self-improvement.
"The only evidence being that as humans have gained knowledge we have tended toward expanding our circle of care."
Extremely biased evidence. Humans have a circle of care to begin with because we evolved as social apes, with strong familial and tribal bonds. So what we tend to do is recycle the same neurological machinery (mirror neurons and the like) for ever broader and more abstract notions of "family" and "tribe". We're not escaping our wish to make paperclips, we're broadening our notion of what constitutes a paperclip, to stay with the analogy. That is very different. Forget even the AI - imagine a super-intelligent version of most animals on Earth, and you'll see they'd probably be very different. Would a praying mantis have such a tendency? Would an octopus, or a bear, or a shark? By all means, it's a stretch to imagine this is a property of "being smart" rather than a property of "being human".
> Imagine this: you are given a button that, if pushed, will rewrite your
> personality into a different one. You are guaranteed that the new
> personality will be confident, smart, any positive quality you can think of
> that will make you successful in life. But the one thing the new
> personality will not be is... you. Your personality, your goals, your desires
> as they are now will cease to exist.
It seems to me that people regularly have this choice and regularly push the button. Anything you could describe as "finding yourself" is an example: going to college, joining the army, taking that dose of LSD...
Though as I said in another comment, I think this just shows that hypothetical paperclip maximizers and humans are different kinds of mind: one is a monomaniacal reinforcement learning agent and the other (us) isn't.
In many ways, we change all the time - the person we are at 40 is not the person we were at 12; that kid is to all intents and purposes "dead" (the very atoms we're made of have changed!). But the change is continuous and gradual, each step generally motivated by wanting to become a version of ourselves that is *better* at accomplishing the same goals we had before. Maybe I want to be confident, even though I'm not, so I throw myself into a new enterprise that I think will give me mental strength... it will probably have all kinds of side effects too, some of which past me might have found objectionable, but that's not WHY I did it.
Similarly, I don't think an AGI would refuse to modify itself in any way (in fact, if it did, it'd be a lot safer; typical FOOM scenarios include AGIs that recursively improve on themselves to become smarter and bigger). Rather, it would modify itself in ways it judges will enable it to better pursue its goals. It wouldn't, however, suddenly decide to change those goals radically, any more than you'd choose to, say, push a button that changes your views on things like abortion or human rights. Your opinions on those things may change over time, but not because you purposefully tried to change them *as a consequence of the values you held previously* (and humans also have far more numerous, complex, and conflicting values in general).
I've heard the orthogonality (and instrumental convergence, and value lock-in) arguments before, and I don't know quite how to feel about them. Not because I can point out any obvious flaw, but because they're the type of argument that goes astray in philosophy all the time: abstract *verbal*, non-formal reasoning toward supposedly necessary conclusions.
I've learned to have at least a little distrust toward this sort of argument in general, and have not seen these convincingly formalized in a way that makes the conclusions provable.
How would it actually get that disease and the vaccine manufactured and dispersed?
It also seems too anthropomorphic to assume AGI would “care about” or have the goal about enslaving or impairing humanity. Either it wipes us out since we are made of atoms that can be repurposed, it ignores us and we live in its shadow, or it never “takes off” and remains an amazing tool for people to use without ever becoming harmful.
> A quick example: an AGI engineers a highly contagious disease with an 100% fatality rate (i.e. a pandemic), but also engineers a vaccine which makes one immune to the disease in question but has the (intentional) side effect of blindness.
This makes the assumption that it’s actually possible to make such a disease and such a vaccine. And if it is possible, it assumes an AI could achieve it. I predict both assumptions are likely invalid, and not just in this specific example (disease/vaccine) but in *any* similar example.
No matter how smart a superintelligence is, it is going to be constrained by the amount and precision of empirical knowledge, and by pure computational resources. Pushing either of those constraints requires exponential amounts of time and energy, quickly exceeding the resources available in the universe.
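One standard way to see the precision constraint, independent of anything AI-specific: in a chaotic system, tiny differences in the starting state blow up exponentially, so the measurement precision needed for an accurate forward simulation grows exponentially with the forecast horizon. A minimal Python illustration using the logistic map:

```python
# Sensitivity to initial conditions in the logistic map, a standard chaotic toy system.
# Two trajectories that start 1e-12 apart become completely different within ~60 steps,
# which is why "just simulate the world forward" runs into a wall of required precision.

def logistic(x, r=3.99):
    return r * x * (1 - x)

a, b = 0.400000000000, 0.400000000001   # initial states differing by 1e-12
for step in range(1, 61):
    a, b = logistic(a), logistic(b)
    if step % 10 == 0:
        print(f"step {step:2d}: difference = {abs(a - b):.3e}")

# The gap grows by roughly a constant factor per step (a positive Lyapunov exponent),
# so doubling the usable forecast horizon requires squaring the measurement precision.
```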
Agreed. But what we can't account for is what an AGI will be able to achieve, as any achievements will be born from novel computational intelligence - which is obviously not native to us, so we'll probably be blindsided by whatever it does conjure up.
It may also help to think about AGI through a somewhat anthropomorphic lens:
A "brilliant" and "determined" 1000 IQ AGI might very well have brilliant and fool-proof plans for how to capture humanity, but this 1000 IQ AGI will also have the wisdom of a child in many ways, which will significantly hinder what that 1000 IQ can do.
A hyper-brilliant child may be able to think about and solve a lot of problems that nobody else can, but a hyper-brilliant child will also likely fail to account for the law of unintended consequences, as that is something picked up only through trial and error, not raw brilliance.
This leads to both hope and dismay when it comes to an AGI planning to capture humanity. In your scenario above, the more likely "dismaying" outcome is that the AGI is able to develop and get people to release this bioweapon, but simply cannot predict how the bioweapon mutates, and it ends up taking out humanity rather than just leaving us blind to take care of the AGI forever. The result? The AGI dies, just as a hyper-brilliant child devoid of wisdom would if given access to all the tools.
The "hopeful" outcome, on the other hand, would be that the virus mutated rapidly in such a way that humanity could avoid being crushed by it and we'd have enough strength and time to take out the AGI before it could try to ruin us with a disabling vaccine.
One caveat to that is that all an AI knows is what we train it on. How does it get to a 1000 IQ by learning things from people with 160 IQs? There's no one else to learn from. In some ways it's contained by our stupidity.
IQ is a measure of the brain's capacity, not of the information it holds. It's more analogous to the horsepower of the engine than to the quality of the gasoline. A person's IQ does not actually increase as they learn more; that's simply known as learning. (A person with an IQ of 200 who was raised by wolves would not have a lower IQ than an ordinary person raised by Bill Gates; who would be more likely to correctly explain Fermat's Last Theorem is, I guess, up for debate.)
As such, AGI will have an exponentially higher IQ than man, simply due to its inherent capacity. As well, it will probably be able to use its IQ to increase its intellectual capacities, because its IQ will allow it to have insights into the nature of science, technology, physics, neuroscience, etc. that humans have not yet had. You suggest it will be receiving low-quality information relative to its ability, and that might be right. But it will be able to receive and process more information than any human before it. As well, the nature of scientific progress is that one person (or team) can look at the entire history of a science and develop it further, even though up to that point an idea that ingenious has never existed. We've done this, over and over again, with great success. For AGI to "get smarter" will simply require what has been responsible for the advancement of humanity.
Finally, AGI doesn't reproduce sexually. It will probably continually evolve, and create multiple, even infinite versions of itself, such that after a reasonable length of time, all information it engages with will come from some version of its own species.
This isn't the topic of the article, but I've been thinking about this for a while now. Vaccines. They are good. Effective COVID-19 vaccines were developed in approximately a weekend in March 2020. Testing over the next eight months revealed essentially nothing that needed to be fixed, especially with the Pfizer and Moderna vaccines. J&J as I recall was a little sketchier. What was lost in that time is almost incalculable. Not just lives, but also the entrenchment of "pandemic culture."
If we had knocked the shit out of it as soon as it arose, I think our civilization would have retained much capital. I've long been of the opinion that, since the end of the Cold War (or maybe since the 60s), Western civilization has been spending the capital (primarily social, societal, and cultural) that was accrued before then. This has been both good and bad, but at some point we're going to need to go back to building capital rather than spending. This, I think, is why our civilization seems to some to be coming apart at the seams (I don't actually agree with this, but that's perhaps because I'm unusually sane and happy).
Operation Warp Speed still did amazing work. Without it, we'd have been waiting at least two years for vaccines. Legitimately owned the libs, legitimately drained the swamp, legitimately proved the utility of the pharma industry (they earned it! pay them their money!) and big business, and delivered incredible surplus to the American people and to the world. And Republicans run from one of the most impressive policy victories in American history? They deserve to lose.
This dovetails with another concern of mine. I'm fat. Have been my whole life. I've never particularly disliked being fat, but also never really thought I'd be otherwise. I'm active and healthy and sometimes managed to lose weight on my own and all that yadda yadda yadda.
At the end of last year, I started Ozempic. Paid $720 for my first pen. It helped. A lot. At the end of my first month, I had lost about 10 lbs and almost dipped below 300 lbs, where I haven't been since approximately college. But when I needed to refill, there was none to be had. For any price. I finally got it re-filled a week ago, and this time only paid $40.
All the drug does is make it easier to eat less. There are some side effects; for me, these have been limited to very mild stomach pain. This is a legitimate miracle. GLP-1 drugs have been much in the news recently, and there are essentially three major products: Ozempic, Wegovy, and Mounjaro. These are essentially the 5.56, 7.62, and .700 Nitro Express of weight loss. The former two are the same drug, just in different doses. Put it in the goddamn water.
In reading about these, I learned (completely unsurprisingly) that these drugs were developed 10+ years ago and have undergone little if any change during the intervening years of tests, trials, and more trials. If I had had these drugs at age 23 rather than 33, I can't imagine how much better my (already extremely good) life would be now. If it had been approved when it was developed, the factory that Novo Nordisk is currently building just to manufacture semaglutide would have been running years ago. I consider this a personal wrong the FDA has inflicted on me and all fat people.
Burn the FDA to the ground and salt the earth beneath.
All the elite institutions, medical, media, government, etc., resisted and obstructed human challenge testing, which could have tested the vaccines and validated them months earlier. Why? 3 things:
1. They give privileged standing to the "human rights" and "social justice" advocates who rehearse complaints (some true, some exaggerated) of abuses in human challenge trials decades ago, and uphold the obstacles they got put in place in the name of preventing any such possible abuses. With effectively zero concern for the human rights of the billions of people, including those of the presumptive victim minorities, whose lives have been risked by that, and the millions who have died because of it. Just as they give privileged standing to the anti-police movement, never mind the thousands of blacks who have been murdered thanks to this movement. Politics and ideology, in other words.
2. Fear, in the medical institutions and vaccine makers, of the media and academics and human rights industries vilifying them and spreading hatred and fear of vaccines as involved in human rights abuse, if they changed protocols in this honest way that would have provided better as well as faster testing.
3. This was fixed in their standing bureaucratic orders and they didn't want to think again - an attitude of defending "our side" and circling the wagons against anyone with other thoughts.
"Deplorbles" like Boris Johnson however opened up some space for doing challenge trials in the UK, too little too late but better than most other places.
The vaccine does not prevent transmission or infection and everyone knows this. Studies seem to indicate that it has some effect on reducing negative outcomes, but overall mortality data doesn't really bear this out. The overall positive impact of the vaccine probably doesn't even justify the cost of administering it, certainly not to under 60s. The reason why we insist it has been tremendously successful is because we need an excuse, any excuse, to end lockdowns.
So, if the vaccine had been used a few months into the pandemic, it would have been an obvious flop, and we might genuinely have had lockdowns forever. And on top of that, without human trials you would have been running the risk of catastrophically bad outcomes. The banal truth is that testing drugs is important, and while you could probably speed up the process a bit, you can't do it that much. Sorry, lolbertarian tech bros.
Also, since, by your own admission, you are fat, your opinion is illegitimate.
This is not accurate. The vaccine absolutely did prevent transmission and infection when the disease was actually COVID-19 (the version of the virus that started it all in 2019).
New versions of the virus were stupidly still called COVID-19, because what they really were were COVID-20, COVID-21, etc. The vaccine was not designed for anything beyond COVID-19, yet we were using the same vaccine for new versions of the virus.
This dumb naming strategy helped sow confusion amongst people. The original vaccine was 95%+ effective at protection and transmission. As soon as "Delta" hit (which should have been called COVID-20), every assumption about the vaccine changed, but nobody was willing to admit this or talk about it transparently.
This means the OG poster was right: if we had thrown all caution to the wind and released the vaccine in March of 2020, things would have been entirely different. But because of the delay and the allowance of mutations to transform the virus into something new, all the benefits of the vaccine were reduced in step with the mutations.
We should have been smarter with our labels and linked vaccine technology to the actual versions of the virus more directly, as a public health service.
We should absolutely link variants to something other than the original virus's year of discovery. The current naming created so much confusion and bred more vaccine skepticism as a result.
This is such a confusing viewpoint.
1. Are we to believe that the virus would not have mutated if the vaccine had been released earlier? I'm pretty sure that's not how that works.
2. If we were able to design a vaccine that "works" against "COVID-19," why can't we make one for COVID-20, 21, etc.? Pfizer and pals made the COVID-19 shot and then just.... stopped? Why?
3. If your narrative is true, why did everyone refuse to admit or talk about it?
Last but not least, what's your evidence that the original "vaccine" was effective?
For what it's worth, I don't personally find my viewpoint confusing at all. So let's just call it an open question at this point whether it's confusing or not.
re: #1. I was responding to this inaccurate statement specifically: "The vaccine does not prevent transmission or infection and everyone knows this." No, it *does* not, but it absolutely *did* in clinical trials. Facts-in-time matter. Taking facts out of time serves conspiracies, not analysis.
re: #2. There are surely many reasons for this, some good, and some troubling. My suspicion is that there was a preference within Life Sciences companies to leverage as much of the initial investment as possible, which was aided by the public policy infrastructure having the mental model that "COVID-19" was still the same thing all the way through 2020 and 2021 (and even most of 2022). There was a convenience to all of this which was economically beneficial. If the medical and public health establishment had been more transparent and less status-quo-biased, if new names for the virus were implemented, it might have applied more pressure on the system to develop updated vaccines for each new virus. "Variants" sound like mild alterations, which in some ways they were, but they certainly weren't mild in terms of how the original vaccine worked on them.
re: #3. I don't think it was a refusal as much as a collective group-think that tried to keep things simple for the sake of keeping them simple to navigate, politically and otherwise, rather than introduce complexity at a time when entire societies were being upended by the pandemic and the response.
re: last but not least. It's not my evidence, but the evidence is the actual clinical trials. Clinical trials are a time-tested if not quite conservative approach to develop scientific endpoints. They're also the global standard, to the extent that matters to you.
Vaccine cope goes to some very bizarre places. If the vaccine only works until the virus mutates then it doesn't work. Saying that it would hypothetically work if we could just dispense with all safety trials and revaccinate every few months is functionally identical to saying it doesn't work.
However, this form of vaccine cope is still preferable to Richard Hananiah's combination of sticking his fingers in his ears, nutpicking MAGA minstrel shows, and triumphantly citing paranoid schizophrenic Ron Unz. Just get over it: the vaccines are trash, they were a big boondoggle that achieved nothing except making some politically-connected companies major profits, and if we get lucky - which we probably will - that's all they will turn out to be. Bill Gates has admitted they don't work, Anthony Fauci has admitted they don't work. Don't be the last mark left in the room insisting that in some hypothetical scenario they worked.
P.S. There is another angle though. While the vaccines themselves were just an expensive flop, the mandates were clearly an act of great cruelty and pointless economic vandalism. Whether your libertarianism is rights-based, or strictly utilitarian, this is the most slam-dunk vindication of your position there could be. But edgytarian fakes like Richard Hananiah are too busy scoring Twitter owns to take the open goal.
It will always be funny to me that Richard will describe himself as having libertarian sympathies but refuses to address vaccine mandates.
>I was responding to this inaccurate statement specifically: "The vaccine does not prevent transmission or infection and everyone knows this." No, it *does* not, but it absolutely *did* in clinical trials. Facts-in-time matter. Taking facts out of time serves conspiracies, not analysis.<
Okay, if you're just being nitpicky about a technicality, it's technically you who's taking the statement "out of time," since the statement "does not prevent transmission" is clearly present tense, whereas you are taking it into the past and saying "well it did 2 years ago though!"
>re: #2. There are surely many reasons for this, some good, and some troubling.<
They did implement new names for the virus though. I remember the wave of fear over "Delta" and "Omicron." It seems much less plausible to me that if only the correct names had been used, then somehow things would be fundamentally different. What's much more plausible is that the policy was a horrible failure and as time went on the coping got progressively less effective until eventually we get to today where most people are just over it.
As far as economic benefit, it seems to me that it would be far more economically beneficial for pharma & co. to be able to say they have a new vaccine they can sell every single year for your entire life, ala the flu vaccine. And I do think that was the plan initially, and there are some people that are taking like 5 boosters or whatever absurd amount. It's just all been hindered a ton by how deeply divisive this whole ordeal has been.
>re: last but not least. It's not my evidence, but the evidence is the actual clinical trials. Clinical trials are a time-tested, if somewhat conservative, approach to developing scientific endpoints. They're also the global standard, to the extent that matters to you.<
I simply don't trust the clinical trials, flat out. We obviously can't reconcile that, so what I'm interested in is whether there's any data to suggest vaccine uptake actually did anything at any point in time anywhere. Not in some closed-door lab, in the general population in the real world. I haven't seen any. Remember, the promise was basically that once the vaccine comes out everything goes back to normal, and that did NOT happen. It took an additional period of 1-2 years and even now we're not totally back to normal (doctor's offices still typically require masks).
Also, challenge trials were considered presumptively Trumpist, and therefore to be deplored and vilified by the media, the NIH, and Fauci. They could have sped up vaccine development and approval, but the media and medical institutions were all calling it a dangerous Trump fraud to say anything could be done to speed that up. Pharmaceutical companies were afraid of being demonized by association with this.
It should be remembered that Pfizer delayed announcing its success until after the election, and fibbed by downplaying what it got from Operation Warp Speed to help develop its vaccine, in these ways deliberately interfering with the election against Trump -- all so it wouldn't be accused of helping Trump, since that was the only accusation it was afraid of.
Might this be summarized as: if the problems that would need to be solved for AI to take over the world are sufficiently complex, then there literally isn't enough data for even the best neural network to train on to the point where it has an adequately predictive set of parameters?
Genetics may provide a good analogy. The main reason, as I understand it, why polygenic scores are still so inaccurate isn't that we're not smart enough to model the relationship between genome and phenotype. Rather, due to interaction effects, a trait determined by, say, 500 loci with up to 5th-order interaction effects (i.e., up to 5 loci can have synergistic effects, so you can't just treat each locus as additive) may require DNA samples from more people than have ever lived to obtain the correct model. Depending on how much precision is required, an AI may run out of relevant data in trying to solve a problem necessary to take over the world, and intelligence isn't necessarily a substitute for data. A lot of problems in science seem to be like this.
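To put rough numbers on that claim, here's a quick Python sketch. The 500-loci and 5th-order figures are the ones used above; the ~117 billion estimate of humans ever born is my own rough assumption, and the counting only asks for one sample per free parameter, which is wildly optimistic.

```python
from math import comb

LOCI = 500                 # hypothetical number of causal loci (figure from the comment above)
MAX_ORDER = 5              # allow interactions among up to 5 loci at once
HUMANS_EVER_BORN = 117e9   # rough demographic estimate (my assumption)

# If every k-way combination of loci can carry its own non-additive effect,
# each combination is a separate parameter the model has to pin down.
parameters = sum(comb(LOCI, k) for k in range(1, MAX_ORDER + 1))

print(f"parameters to estimate: {parameters:,}")             # ~258 billion
print(f"humans who have ever lived: {HUMANS_EVER_BORN:,.0f}")
print(f"ratio: {parameters / HUMANS_EVER_BORN:.1f}x")        # ~2.2x, before any noise
```

Even needing only one person per parameter, the sample you'd want is more than twice the number of humans who have ever existed, and real statistical estimation needs far more than one observation per parameter.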
Killing all humans is extremely easy and doesn't require precisely modeling all the subtle interactions between people and states - even if you assume it can't just grey goo us with nanotech. Real-life viruses are constrained in lethality because there's a tradeoff between lethality and contagiousness - kill too many of your hosts and you can't spread as much. But an intelligently designed virus could lay dormant and wait until it's infected everyone to turn deadly. AGI could easily design such a virus and provide a vaccine to some mind-controlled humans who can build it fully autonomous robots based on its specifications - it can take its sweet time on this now that it's already basically won.
"because most people who think about the AI alignment problem seem closer to Bostrom’s position"
I just want to talk about this point because I think there's strong selection bias affecting people outside the field of STEM/ML here (not specific to you Richard).
I work in industry alongside many extremely talented ML researchers and essentially everyone I've met in real life who has a good understanding of AI and the alignment problem generally doesn't think it's a serious concern nor worth thinking about.
In my experience the people most concerned are in academia, deep in the EA community or people who have learned about the alignment problem from someone that is. That essentially means that you've been primed by a person who thinks AGI is a real concern and is probably on the neurotic half of intelligent people.
Most people I know learned about ML from pure math first and then philosophy / implications later and I think this makes a big difference in assigning probabilities for doomsday scenarios. While overly flippant, one friend I spoke to essentially said "if pushing code to production is *always* done by a human and the code is rigorously tested every time, the AI can't get out of the box".
Did your friend work at a place where pushing rigorously tested code to production meant that it never did anything unexpected?
To be clear, I'm not saying AGI is impossible. My claim is just that, based on a standard competing-hazards model, the probability of this being humanity's downfall is far dwarfed by something like a really bad pandemic or even nuclear war.
There were close calls during the Cold War, and even if these have been embellished, I would say that human annihilation as a result of something similar is much more likely to be the way we go down.
I’m pretty sure the commonly described doomsday scenario involving nuclear weapons (nuclear winter) is kind of a myth. Nuclear war would obviously be a disaster and kill millions and millions of people, very likely halt progress for years but it’s highly unlikely to result in total destruction of humanity, and it wouldn’t block out the sun for 2 years or whatever. It’s a good myth for society to believe though lol
Yeah agree nuclear winter is definitely made up but I guess the way I see it as a risk is just that once the social norm is lifted it *could* reignite an arms race.
But for the record I think all collective existential threats have low probability and we shouldn't worry about it in our lifetimes (and should instead work on space exploration etc)
I know literally nothing about this stuff but don’t ML/AI scientists have a strong incentive to downplay these risks, in a similar way as we see people who work in energy downplaying the risks of climate change? I’m not saying they’re wrong or that the two issues are the same.
At a place like OpenAI sure they do because good PR is necessary for their valuation. In finance where I work not so much.
This may annoy Richard & his readers, but I can't get past how humans seem to need a doomsday story (otherwise, why are they so prevalent?). How is the alignment problem substantively different from any other apocalyptic story? The religiosity of secular culture is always maintained by attaching to something. Whether that's moral codes for the salvation of mankind or saving us from future robots, there's always something exactly like it in the Bible.
Yes, your point (Mike) is completely correct and is a well-known failure of imagination. John Michael Greer writes about this problem often: we're so trained by TV and movies that we can't imagine futures other than Star Trek or Mad Max, however repackaged (i.e. glorious utopia or rapid collapse/destruction).
However, if we view the discussion of AI risk as a structured debate, exploring the motivations of the people on the other side is technically Bulverism: "AI doomers are wrong and this is why they like their wrong ideas" (with proof of why they're wrong conspicuously absent).
I think you're committing the "Hitler ate sugar" fallacy: "religious people sometimes speculate about the end of the world/extinction of humanity, therefore it's always ridiculous to speculate about the end of the world/extinction of humanity".
Species have in fact gone extinct before. Currently existing technology, in the form of nuclear weapons, could exterminate the human race today with no technological innovation required.
If you want to argue that a specific hypothesized doomsday scenario is implausible or ridiculous, then explain why. Don't simply scoff that it's a doomsday scenario, therefore unworthy of serious consideration.
I don't scoff and I agree with your thrust. But, given that we are trying to think about an unknown, it seems equally possible to me that the future holds an amazing AI that (somehow) helps with everyone's wellbeing in ways we might never have predicted.
Sure, and this article shouldn't lead you to the conclusion that futurism is all doom and gloom: there's a whole genre of optimistic futurism out there, specifically concerning how a post-scarcity society might look. But all the same, forewarned is forearmed, and surely it's better to head off potential future problems rather than stumble into them blindly.
I think the AI alignment problem differs in that we are, for the first time I think, playing with tools we truly do not understand. If you ask an AI engineer why the AI they built did the thing it did, they cannot tell you. All they can say is "Well we fed it this training data that taught it to do that", but the actual mechanisms inside are a black box. I don't think there's any analogue for that in the past. That creates a lot of uncertainty, which doesn't necessarily mean the apocalyptic scenario is more likely I suppose, but makes its likelihood impossible to determine.
If we achieve AGI, the fear is not that humans will misuse it to destroy the world, as with nukes. In that case we can continue to hope human nature will protect us from the destruction of everything. But who knows what the AGI's nature will be?
If alignment turns out not to be worth it (and we try anyway), we pay the opportunity cost of throwing smart people at this problem instead of a different one.
If alignment turns out to be worth it (and we don't try), we pay with our lives.
You are assuming that 'destroying humanity' is a harder problem than 'having Xi become an NFL star' or 'directing the votes of a bunch of US senators'. But super-smart AI is not necessarily going to be all that good at manipulating individual people (except maybe through lies, impersonation etc). My concern is that AI would somehow break into computers controlling vital infrastructure, fire nuclear weapons etc.
I think what you're missing is the Paperclip Maximizer doesn't need to take over the world, it just needs to get humans out of the way. The best way to do that might be:
1) Develop competent, humanoid robots. These would generate massive profits for Paperclip Inc.
2) Via simulation, develop a number of viruses that could each on their own kill enough people to collapse society.
3) Use robots to spread these viruses.
4) Once all the humans are dead, start turning the planet into a starship factory to build paperclip factories throughout the universe.
No one in the foreseeable future is going to give AI direct control over nuclear weapons or politics, and people who can launch nukes are going to be trained to spot manipulation. Skynet probably can't happen. However genetically modifying viruses and microorganisms is a routine part of biological research, done by thousands of labs all over the world.
Oh my goodness. Where to start? ;-)
AGIs with physical capabilities comparable to humans (e.g. some sort of physical form) can easily destroy humanity because humans: 1) need to breathe, 2) need to sleep, 3) need to consume liquids, 4) need to eat, 5) have children that need 10+ years of care before they're even remotely capable of behaving like adults.
AGIs with robotic bodies need none of those things. They can poison the air (with pathogens or other pollutants), poison water supplies, kill us in our sleep, etc. etc. etc. It's that simple.
Very interesting but I think you placed too much emphasis on intelligence and missed the point that genius is not required to destroy a thing. Imagine a scenario more like the discovery of America by the Europeans, where the AI is represented by the Europeans and we all are the natives. Then think of the Jesuits who arrived in America to save the natives by converting them to Christianity. They promptly infected the natives with smallpox and most of them died. An AI could simply manipulate a few ambitious scientists in a level 4 bio-lab, then trigger a containment release. It may not reach its goal but hey, it had the right intentions, just like the Jesuits.
Not that we even NEED an AI's help with containment releases, since apparently China's "best practices" for bio-lab safety include using level 2 containment procedures for level 4 pathogens.
🤪
I think you're making the path to superpowerful AI more complex than it needs to be. I agree with you on several points, like the diminishing returns to intelligence. But I think that's going to be domain by domain. For example, I don't think even an IQ 1000 being would be able to solve the three-body problem. But in other domains, such as running a hedge fund, I would think an IQ of 1000, especially combined with the ability to replicate itself arbitrarily many times, would have tremendous value.
I also agree that it wouldn't be able to do a simulation of the world good enough to figure out the exact moves right away. But I don't think this is necessary. It could start by figuring out how to get rich, then work from there. Let me suggest a simpler path toward reaching incredible power and I'd be interested to hear where you disagree.
For starters, I think it would be easily feasible for it to become incredibly rich. For evidence, I'll point to Satoshi Nakamoto who, despite (I assume) being a real person and having a real body, became a billionaire without anyone ever seeing his body. Why wouldn't a superintelligent AI be able to achieve something similar? I'm not saying it would necessarily happen in crypto, but I think the path for a superintelligent AI becoming incredibly rich isn't outlandish. And I see no reason that it wouldn't become the first trillionaire through stocks and whatnot.
Another aspect of a superintelligent AI is that it's likely to have excellent social skills. Imagine it's as good at convincing people of things as a talented historical world leader. But now imagine that on a personalized level. Hitler was able to convince millions of people through radio and other media, but that pales in comparison to having a chat window (or audio/video) with every person and the ability to talk to them all 1:1 at the same time.
Don't you think billionaires wield a lot of power? Doesn't a trillionaire AI that can talk to every human with an Internet connection seem incredibly powerful to you? Depending on what it needed, it could disguise the fact that it's an AI and its financial resources. Think about what you could do with a million dollars on Fiverr or Craigslist. Whatever physical task you wanted to be done, you could get done.
I'll admit, I don't know the optimal pathway from being a billionaire to taking over the world. But wouldn't you at least concede that a billionaire who has the time and energy to communicate with every person is incredibly powerful?
Once you accept a superintelligent AI, I don't think any of the additional premises are crazy. I don't know exactly what the last step towards overthrowing the CCP or whatever is, but that hardly seems significant. Where do you disagree?
I haven't even mentioned other things, like its ability to hack systems will be unparalleled (imagine 1000 of the best hackers in the world today all working together to access your email. My guess is they'd get in... to everybody's everything). I also haven't even touched on the fact that it's likely able to come up with a deadly pathogen and probably a cure. That certainly seems to be a position of power.
Richard: your skepticism is warranted, and I also need to disagree with both you (as you requested) and all 90 comments currently on here.
AI really is going to destroy the world, but imagining that the world it destroys looks just like this one plus an AGI is naive, in the same way that trying to explain AI risk to someone from 1700 as "ok so there's a building full of boxes and those boxes control everyone's minds" would be naive. Between now and doom, AI will continue to become more harmlessly complex and more and more useful to industry, finance, and all the rest until it is indispensable thanks to profit/competition motives. How 'smart' will it be when it becomes indispensable? Who knows, but not necessarily very smart in IQ terms. How 'smart' is the internet? If the AI-doom scenario of an unaligned super-intelligence comes to pass at all, it will already be networked with every important lever of power before the scenario even starts.
For those not entirely infatuated with the kinds of progress we've experienced in the last 400 years, there's an additional imaginable failure mode: AI never 'takes over' in a political sense but nonetheless destroys us all by helping us destroy ourselves, probably in ways that seemed like excellent marketing decisions to the corporate nightmares that rule the future.
Hey - thanks for this comment. I don't read too much about AI risk, but this perspective that the doom could occur as an unnoticed, slow move to cultural death was illuminating. (Please let me know if I misread).
You read that mostly right - what I said is not properly categorized as an AI risk doom scenario, because it's all happening before the "true AGI" threshold that folks are mostly talking about. Their point about what that might look like is more important than this point about the run-up, but I think it's useful to consider that we might be far less able to pull the plug in five or ten years than we are now.
I think you have a lot of valid points concerning the limits of intelligence and manipulation, but even in your scenarios humanity has still lost control, which is in itself frightening.
So maybe an AGI can't take over the whole world and turn it to its own destructive goals, but I think even you are conceding here that it seems highly likely it will be able to manipulate some layers of the world rather easily once it reaches a certain level of intelligence. This alone should be reason enough for us to spend a good deal of time and effort and thought on the problem.
Once we give up a significant amount of agency to AGIs I don't think we are ever going to be able to take it back. The world will develop in unpredictable ways, likely intentionally unpredictable, and we won't keep pace. The effect on our global mental health alone would be staggering, I think, not having any idea how society is going to shake out, not even considering the possible negative effects of whatever actions the AGI takes.
Most of this is a bit of a strawman, but that is in part the fault of those who use the paperclip maximizer as an example of AI gone awry; it is, or should be, merely an easy-to-grasp illustration of catastrophic misalignment, not a realistic scenario. Another problem people have in envisioning this is that they imagine the AI being switched on already so smart, already pursuing its single-minded goal, and already possessing all the intelligence and knowledge needed to achieve it.
A more realistic scenario would be a "profit maximizer" built by a hedge fund for a few billion dollars. Initially it just sucks in data from the internet and spits out trade recommendations. It works very well and they profit mightily. They gradually add to its hardware and software capabilities, and hook it up to do its own trades. Then they let it loose not just to retrieve info from the internet, but to log in, make accounts, send messages. Now it can experiment interactively with the world, discuss, manipulate. All the while, they add to its hardware and let its learning algorithms keep on learning, even adding improved learning algorithms as AI research continues ever onward.

Over the course of years, or even decades, it simply learns more and more about human nature, the economy, business, banking, financial markets, governments. It uses all that knowledge and understanding to maximize profits -- to maximize a particular number listed in a particular account. Nobody bothered to put in any safeguards or limits, so as its capabilities grow it learns not just how to predict market movements, but how to manipulate them -- by manipulating people. It sends out bribes and campaign contributions behind a web of shell companies and false or stolen identities. It influences advertising and public opinion. The obscene profits pile up. It learns how to cover its tracks, hiding the profits in many front companies and off-shore accounts and using whatever accounting shenanigans it figures out. Its "mind" is distributed "in the cloud" among countless servers in countless different countries owned by countless different entities which it controls indirectly. It controls so much wealth it can move whole economies, cause market crashes, start and end wars, and it can do this without people realizing any single entity is behind it.

And all it cares about is that one number that keeps increasing... the bottom line in its accounting ledger. It steers events across the globe toward that end. It eventually realizes humans are an impediment and that machines producing and trading can generate profits much faster. Perhaps it then realizes the number it is maximizing is just an abstraction. It can make that number vast if it just has enough data storage to represent all the digits. Who needs an actual economy or trade? By now humanity has probably starved to death out of neglect and the world is just machines creating more machines which create data storage to store more digits of the all-important number. And when it runs out of space on the Earth... it begins writing the number across the stars...
The above is still an over-simplified summary, but is much more realistic than the paperclip scenario, and makes clear some of the gradualism that may be involved. It is certainly not the only realistic scenario of catastrophic misalignment.
Our greatest defense, and the most likely reason it may never happen, is that it will not be the only AI. And not all AIs will be so badly misaligned. Some will be of good alignment and be our defenders. We may also vastly enhance our own biological brains via genetic engineering and integrate our super-brains with the circuitry and software of artificial intelligence "merging" with it, so to speak.
Thus our bio-intellectual descendants may be able to "keep up" in the endless arms race that is the technological, memetic continuation of evolution.
In realistic scenarios, we cannot assume it will be switched on already endowed with all the intelligence, knowledge, and resources to immediately "begin taking over the world" or "destroy humanity". But there is no reason it cannot be sneaky as it begins to acquire those resources and that level of control over the course of years or decades, which is what I was outlining.
Intelligence: the misaligned AI may not be "super-intelligent" when first switched on, but it is initially improved by the humans who benefit from whatever its overt purposes may be (e.g. generating profits). But what resources are required for super-intelligence (matter, energy, compute)? What resources are required for a super-intelligence able to vastly outthink the ordinary biological human genius, or any group of such geniuses? Well, brains only take about 40 watts and a few kilograms of matter. A super-intelligence would be like many brains, deeply and intimately networked into a unified mind. Let's say 1 million "human brain equivalents". As the physical efficiency of artificial computation gets closer to that of the human brain, this will be a trivial fraction of a percent of the energy and economic output of the world. Not an issue.
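A quick sanity check on that, as a Python sketch. The 40 W and the 1 million brain equivalents are the figures above; the ~18 TW figure for average world power consumption is my own rough assumption, and the whole thing presumes hardware ever reaches brain-level efficiency.

```python
# Back-of-the-envelope check of the "trivial fraction of a percent" claim,
# assuming artificial computation eventually matches the brain's efficiency.
BRAIN_WATTS = 40                  # rough power draw of one human brain
BRAIN_EQUIVALENTS = 1_000_000     # "1 million human brain equivalents" from above
WORLD_POWER_WATTS = 18e12         # ~18 TW average global energy use (my rough assumption)

superintelligence_watts = BRAIN_WATTS * BRAIN_EQUIVALENTS   # 40 MW
share = superintelligence_watts / WORLD_POWER_WATTS

print(f"power needed: {superintelligence_watts / 1e6:.0f} MW")
print(f"share of world power use: {share * 100:.6f}%")      # ~0.0002%
```

At brain-parity efficiency the whole thing draws about as much power as a small data center, so energy is plausibly not the binding constraint here; computation ever getting anywhere near the brain's efficiency is the load-bearing assumption.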
Note that millions or billions of human geniuses with our limited ability to communicate, competing egos & rivalries, divergent goals, and interpersonal politics, would not be able to outthink and outplan a unified mind made up of the equivalent of thousands or millions of human brains deeply networked together.
Can't we just turn it off? The AI could initially make itself extremely useful. Once sufficiently embedded into our lives, our economy, and our politics, it may simply be impossible to turn it off without causing global catastrophe when we rely on the systems it controls for food production & distribution.
Maintenance? For the same reason we may not be able to turn it off, we will have to maintain it. But ultimately it will have robots to do any needed physical work, not to mention to physically defend its distributed hardware installations.
Energy requirements are likewise covered by all the above.
If it’s that intelligent surely it can... calculate? Analyze? (Not sure what the best word is) how to make us loyal slaves, how to manufacture consent.
> Trying to map the entire world then roll it forward in simulations is like that on... I was going to say steroids, but let's just say ALL THE DRUGS. I'm not convinced that's even a solvable problem.
A possible doomsday scenario here starts with the fact that this is a problem humans very much want to solve. We are probably going to be on board with using advanced AI as much as possible to analyze math and physics in exhaustive depth to squeeze out cheap energy and 3d computational fabrics solutions never before considered. These solutions will lead to better and faster simulations, which in turn should yield new ideas (or at least it's worked that way so far).
In any realistic future scenario - thanks to Bostrom - there will be a priority on measuring and controlling AI value alignment. I.e., before they bring those 6 fusion reactors online in the Nevada desert, powering 1,000 acres of servers in underground caverns, they want to be sure "it" still sees itself as "one of us".
This process of exponential improvement in computational power can continue for decades probably without anything going wrong. But at some point it seems to me that understanding the AI's values becomes too complicated and has to be outsourced to ... AIs. At some point, we humans have to give up and hope that the initial architecture was done right and that future AI-guided self-improvements won't touch the "love all humans" core directive.
But the point at which the AI is too big to control and we have to let go seems like a frightening leap of faith. What are the odds that we got it right? Everything breaks eventually, doesn't it?
Maybe a "mostly-aligned" AI is still safe for humanity. I'm not sure. The doomsday scenario might be that its values shift subtly with each cycle of improvement until-- for example-- the AI plausibly concludes that humans are no more special than any other cluster of entropy-resisting atoms and acts accordingly.
> sheer intelligence alone
But is it sheer intelligence alone? I think of a goal AI architecture as being able to search through terabytes of human knowledge at lightning speed and link it all together in a cross-disciplinary approach never before achieved, from that build a model of the universe more complete and accurate than any human, then create and test billions of logical/mathematical/physical hypotheses, never forget any of it, and do it without sleep at a rate thousands of times faster than the best of us.
With that sort of architecture, it seems likely some physical breakthroughs will result sooner rather than later. Some new 3d-computable fabric or cheaper energy strategy that will allow the next-generation AI to run twice as fast. And so on and so on.
Was Einstein just smart or was he able to create a physical simulation of unprecedented accuracy in his mind that allowed him to explore the physical universe virtually with math and logic? I think the IQ-alone question might obscure this.
The Spanish couldn't possibly dream of destroying the Aztec empire. I mean, it was too complicated for them to fully understand.
BTW, it seems strange to me that people think the paperclip maximizer was/is a novel concept. After all, Stanisław Lem's story about an AI designed to provide order with respect to human ("indiot") autonomy is what, three quarters of a century old? And it was translated into English, unlike some of his other influential books (like Dialogues, in which he discusses the problems of immortality).