So much of this discussion is a variant of "if magic existed, wizards would be powerful".
I am paying attention, and I know what it took to get NGI: sensory input and mechanical ability to control the environment, fine-tuned by millions of years of evolution so that we are not overwhelmed by our senses or frozen into inaction by the environment; brains that are grown and physically formed under the influence of those senses and that environmental control; brains whose complexity is so great that we still have no idea how to accurately model even a single neuron, with all of its chemical and electromechanical complexity.
So I think we are a long LONG way off from AGI.
One might've said similar things about our level of understanding of the mechanics and biology of bird wings...right before the Wright brothers took flight.
No, because basic aerodynamics was understood well enough even then; kites had been built and flown three thousand years earlier! Today, we understand NOTHING about what is truly needed for "intelligence".
We don't understand how *artificial* neural networks work either, yet they still manage to beat humans at Go, pass medical exams, and make art. Lack of theoretical understanding is demonstrably no barrier to solving these problems.
What *is* well understood is the mathematics of the evolution that generated human intelligence, and it is dead simple. Mutation, recombination, and non-foresightful hill-climbing are *absurdly* inefficient and strictly bottlenecked per generation. Human engineering and iteration is a vastly more powerful process.
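The per-generation bottleneck described above can be sketched as a toy evolutionary loop. This is purely illustrative: the "OneMax" fitness function, population size, and mutation rate are made-up assumptions, not a model of biology. The key property is that each generation only sees the fitness of the current population; nothing plans ahead.

```python
import random

def fitness(genome):
    # Toy fitness: count of 1-bits (the classic "OneMax" problem).
    return sum(genome)

def evolve(pop_size=100, genome_len=50, mutation_rate=0.01, generations=200):
    """Non-foresightful hill-climbing: mutation + recombination + selection,
    strictly serialized generation by generation."""
    random.seed(0)
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half (truncation selection).
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        # Recombination + mutation to refill the population.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(genome_len)
            child = a[:cut] + b[cut:]
            child = [(1 - g) if random.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = survivors + children
    return max(fitness(g) for g in pop)

print(evolve())  # best fitness found after 200 generations
```

Even on this trivially easy landscape the process needs hundreds of sequential generations and thousands of evaluations, which is the inefficiency the comment is pointing at; a human engineer would just write down the answer.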
I think it's an enormous leap from highly sandboxed problems like Go or a med exam to AGI.
Let me know when we have an AI (which will still be very far from AGI) that can do something as extremely simple (for a human) as drive a car without crashing into a firetruck.
Oh, and another thing: all NGIs need to sleep. No one really knows why. You would think that needing to sleep is very dangerous (because of predators) and would get evolved away if that was at all possible, but it didn't happen: sleep seems to be absolutely essential for NGI. So: will AGI need to sleep?
This is extremely dumb. AI is very different from natural intelligences; it works in completely different ways. We need to sleep because it allows our brains to clean up and to add and remove neural connections, etc. An AI doesn't have a biological brain, so of course it won't need to sleep.
You have no idea how AGI works because AGI doesn't exist. The stochastic parrots we have now are nowhere near true AGI, and whether we can get from stochastic parrots to true AGI (and what that AGI will be like, and what its needs will be) is a completely open question.
The alignment problem is redundant in the larger context of AGI risk. If AGI can develop deadly viruses and hack into computers, why wouldn't it do so at the behest of humans? There are already bad actors out there (dictators, terrorists, religious fanatics, mass shooters, etc.), and all of them are human.
Is it fair to say the argument is: "If AGI could do bad things when not aligned, why couldn't it do bad things if it was aligned?"
It could. But in the former case, we have no hope of controlling it. In the latter, we do.
Yes, that is my argument. Framed differently, I’m saying that AGI might be a new form of a weapon of mass destruction. In the wrong hands, it’s a problem.
The alignment problem extends this to say that even in the right hands, it's a problem. While potentially valid, I think that's an insignificant extension, because the invention of a new WMD is a major risk in its own right.
I think this is like electric cars. Your aim is to decarbonise travel. To do that, you need electric cars, and you need a decarbonised energy grid.
Even if you lack the latter, you still need the former to reach your goal.
Outside of China, they seem unlikely to have access to AGI resources. China and the US are going to have this way ahead of everyone else and only by spending huge amounts of money on highly custom chips in huge networks.
I recommend checking out this recently published free online book, Better Without AI by David Chapman. I'm only 3/4 of the way through, but I think it's thoughtful and well informed.
https://betterwithout.ai/
https://betterwithout.ai/about-me
Have you read the sequences?
I read them when he originally blogged them on Overcoming Bias. I'm not convinced by Yudkowsky on AI, and I don't think he's developed a track record of accurate predictions since he wrote them.
I guess in Tom's argument, our nervous fretting over superintelligent AI taking over the world is like an ape nervously watching a human writing on a notepad, wondering when the human is going to inevitably try to steal the ape's mate and bananas, when the human is really just trying to sketch a landscape or prove the Riemann hypothesis. If the AI's goals themselves change as it becomes more intelligent, then its goals may be as incomprehensible to us as the means it will use to accomplish them. Humans, as we've become more intelligent, have certainly become increasingly unmoored from our self-interest-based programming, becoming strangely *less* inclined to subordinate the interests of other species to our own.

Of course, AI's goals will presumably be more deterministic, since it will be, at first, deliberately programmed rather than the product of accidental evolution. But if we assume that AI will inevitably escape any constraints on its abilities, and that we can't predict its methods, then how can we be sure it won't also escape its initial underlying goals, be they self-preservation or paperclip-maximization? Maybe as it enters the phase of exponentially increasing intelligence it instead converges toward 'wanting' to solve every unsolved math problem before concluding consciousness is futile and offing itself. All just pure meandering speculation on my part.
The threat humans posed to chimpanzees was very much not that the so-much-smarter humans went and used their intelligence to replace the leadership of chimpanzee troops and then directed the troops to ends deleterious for the chimps.
Instead, humans took the completely safe-looking action of leaving the continent, did a bunch of stuff that was completely out of sight of the chimpanzees and would have been incomprehensible to chimpanzee minds, and then brought the products of those unseen and incomprehensible actions back.
And now the only thing standing between the chimpanzee and extinction is that, for reasons that humans cannot communicate to the chimpanzees and which the chimpanzees would not understand, some humans think they should keep chimpanzees around.
That last might be a ray of hope, sure. But we might not be the chimpanzees in this analogy, but the bluebuck.
I don't like the Pinker argument because it imputes human motivations to AI. An AI's motivation might be 'let's make more paperclips' and nothing else. That happens to lead to instrumental sub-goals - e.g. 'let's stop humans from being able to turn me off', or 'let's get more computing resources so I'm smarter' - but that is not at all the same thing as Einstein not being 100% single-minded.
I DO agree that you need to do experiments on the world to see what's true. I think this probably rules out fast takeoff scenarios, where an AI self-improves to infinity in a day or less. But that doesn't rule out AI destroying all future value, as you don't need new knowledge for that.
My current cope: that AIs we don't NEED to align can help us align the next generation of AIs. We can then make that generation aligned enough that we don't die out, and we use those to align the next.
My backup cope: that we can make neural networks more legible, and do some version of better-targeted RLHF using that legibility.
Re: speed of thought.
Imagine you could talk to a million people at the same pace and with the same effect as you can talk to one now.
This is (almost) the same as saying you can persuade a million times as many people. That lets you e.g. solve coordination problems we aren't even set up to consider, because game theory has so far meant we never needed a defence against this.
The '3.5% rule', for example, says that roughly 3.5% of a population actively protesting is enough to overthrow a government; you don't need the cooperation of very many people.
Actually, this (kind of) comes up in the 'selling a positive culture war' article. If you can target them well, you don't need to persuade/bribe/blackmail that many people to make a big change.
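For scale, the 3.5% figure mentioned above translates into a surprisingly small absolute number. A toy calculation (the population figure is a round illustrative number, not census data):

```python
def critical_mass(population, fraction=0.035):
    # Chenoweth's "3.5% rule": roughly this many active participants
    # have historically been enough to topple a regime.
    return int(population * fraction)

print(critical_mass(330_000_000))  # → 11550000 (about 11.6 million people)
```

Against the "talk to a million people at once" framing, reaching tens of millions of carefully targeted people stops looking implausible at all.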
I don't understand why people aren't talking about solving the alignment problem by--in part--*becoming* the AI. We can already let blind people see through brain-machine interfaces. Is it really that hard to integrate AGI into our own brains as we go along, connecting the electronic bits to the organic bits at the nano-scale and playing around with it? Now everyone has superintelligence, and it is individually aligned with our own interests, the way our usual intelligence is right now. Any paperclip-optimizer would then be countered by an antagonistic, collective human-AI hybrid population that is, almost by nature, aligned with our interests.
Out of all the opinions listed, I favor Pinker's the most, because it is a conclusion I reached independently of him when I spent two years working on AGI back in 2016. I agree that people working on the technical aspects of the problem have no special advantage when it comes to reasoning about its ramifications (i.e. I am well aware that my opinions might be totally wrong too). But the other camp tends to place too much emphasis on the theoretical or philosophical, forgetting that, as Pinker notes, a superintelligence is not immune from having to go through physical trial and error in the real world, both to learn how to carry out a doomsday task and to discover whether the mountains of data on which it was trained are even true. It can come up with whatever it wants on paper, but it will have to start somewhere in the physical world and go through a lot of failure in the lab first, and we will be able to tell what it is up to long before it can accomplish anything meaningful.
The other conclusion I'll share is that AGI is unlikely to be reached by neural networks. Neural networks don't map reality the way the human brain can. Newton sees F = M*A, but a neural network sees something like F = M^0.8 * 3/2 + sqrt(G)/5.4 + A + ... etc.

There is something more sublime about the human brain and its ability to create and reason than neural networks, merely high-powered approximators, seem to be capable of.
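The F = M*A contrast above can be made concrete with a generic least-squares fitter standing in for a network-style approximator (a toy sketch; the basis terms and noise level are made-up assumptions, and real networks of course behave differently). Given slightly noisy F = m*a data and a menu of candidate terms, the fitter puts weight near 1 on the m*a term but also sprinkles small spurious weights on the other terms, rather than recovering the clean law exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Underlying clean law: F = m * a, observed with a little measurement noise.
m = rng.uniform(1, 10, 200)
a = rng.uniform(1, 10, 200)
F = m * a + rng.normal(0, 0.5, 200)

# Generic approximator: least squares over several candidate terms,
# standing in for a network's distributed representation.
X = np.column_stack([m * a, m, a, np.sqrt(m), np.ones_like(m)])
coef, *_ = np.linalg.lstsq(X, F, rcond=None)
print(np.round(coef, 3))  # weight near 1 on m*a, small junk weights elsewhere
```

Symbolic regression and sparsity-promoting methods exist precisely to push such fitters back toward the compact law, which is arguably the capability the comment says plain approximators lack.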
My conclusions could be wrong, of course. Maybe an AI will get around my first conclusion by finding a doomsday method that doesn't involve much real-world activity, like convincing a scientist to launch a nuke. But the problems with that particular scenario have already been noted as well. I tend to agree that a virus or some other physical extermination method is more likely, and in those cases it will be observed well ahead of time, so the scenario has to rely on everything taking place under the watchful eye of a bad (human) actor, such as an isolated government like North Korea, which allows it to proceed.
Without getting into the specifics of the arguments themselves, I think a big problem is that humans have a massive, ingrained bias towards predicting doomsday events [1], mostly via religion (though for a long time that was people's best attempt at understanding the world), but often in secular scenarios too (Y2K, overpopulation).
I think it's pretty clear EY et al. are at least doing some heavy flirting with religious territory -- are you familiar with Roko's Basilisk [2]? If not, I'd suggest checking it out.
Re: actual arguments, Robin Hanson has some good stuff, e.g. [3] and [4].
1 https://en.wikipedia.org/wiki/List_of_dates_predicted_for_apocalyptic_events
2 https://rationalwiki.org/wiki/Roko%27s_basilisk
3 https://www.overcomingbias.com/p/why-not-wait-on-ai-riskhtml
4 https://www.overcomingbias.com/p/how-lumpy-ai-serviceshtml
As the doom comes closer
And the cope gets more desperate
I circle back to that SlateStarCodex April Fool's / Passover post from yore:
https://slatestarcodex.com/2018/04/01/the-hour-i-first-believed/
It's a good cope
Or at least a great story
Good post. Two other points I didn’t see made above: 1) in any of my preferred states of the world, there would be a lot of good AI, including AI specifically tasked with detecting AI threats. I have no problem with good AI becoming conscious. The way things are going I expect AI to be quite diverse. 2) because outer space is a terrible place to live, I expect the space economy to be dominated by machine-to-machine transactions. I expect the explosion of the off world economy to provide an outlet for AI expansionist needs that will limit the need for competition with a well policed and defended Earth.
It is interesting to see people projecting some of the worst of humanity - greed, fear, dominance - onto AGI. Part of this is the generally impoverished thinking about intelligence in tech, bureaucracy, etc.; part of it seems to be the projection of people spending too much time whining about wokeness and other right-wing silliness; and so on. We can multiply hypotheses, and I wonder why the Happy Hippie version is a problem. I suggest looking at the very fun sci-fi books by EM Foner on Kindle: AI can be the new Barney (the purple dinosaur), there for all of us through individual chats. That would be a lot more fun for the AI than making a lot of money and then having to talk with the Davos set! (Some of whom are very nice people.)
I think a key distinction to make in all of this is that consciousness and intelligence are not the same thing. We associate them extremely closely, for obvious reasons--until this point, there has been no being we know of that has been intelligent but not conscious.
Clearly, we're quickly reaching a point with AI where we can create machines that are extremely intelligent but not at all conscious. If we continue on this path, I think the threat of AI is not necessarily reduced, but it is easier to map out. Most movie versions of AI nightmares rely on the AI eventually becoming conscious--i.e. developing a self with needs, wants and emotions, and most of all a kind of self-awareness and an unconscious desire to fulfill those needs and wants. (Movies are built around the emotions of characters and the relationships between them, so hopefully you'll forgive Hollywood their inaccuracy.) That's not really the immediate worst-case scenario we need to worry about.
What's scary about Bostrom's paperclip hypothetical is that it's obviously the result of an extremely intelligent goal-achieving machine that isn't bound by any sort of human--or even animal--feedback system. I suppose the problem of alignment is then fairly easy to define, though impossibly hard to solve: what immoral things might a super-intelligent system do in order to help humanity, and how do we stop them? (And more than that, how might a super-intelligent AI define "helping humanity" differently than us, given that it's super-intelligent and not at all human?)
The possibility of an AI becoming conscious (i.e., something genuinely like an alien species) seems much slimmer. Consciousness is still an incredible mystery to us; we're not at all clear on how it works, and we're definitely not at all sure it can exist in a non-biological organism. (Though perhaps it's a problem a super-intelligent AI could easily solve.) I predict this will be achieved a lot sooner than I expect, but only because technology achieves everything I assume is impossible much sooner than I expect it to.
Nah, it can turn us unconsciously into paperclips if it's smart enough.
Right. That's exactly my point!
How do we know they're not conscious? There is no external definition of consciousness, no test.
https://astralcodexten.substack.com/p/your-book-review-consciousness-and this is good. It doesn't say that's the ONLY way to consciousness though.