I'm a huge fan of Steven Pinker in general, but IMO he's always been terrible on this issue and has persistently misunderstood the (best) arguments being offered. This isn't to suggest the AI doomsayers are correct, just that Pinker has ignored their responses for years. It's a little baffling, but I guess I don't necessarily blame him too much for this; we presumably all ignore stuff regarding topics we aren't interested in or don't take very seriously.
One example is when he asks why AIs would have the goal of "killing us all." The common point that alarmists make - and in fact one you sort of touch on - is not that AIs will be programmed specifically to be genocidal, but that they'll be programmed to value things that just coincidentally happen to be incompatible with our continued existence. The most famous/cute example is the paperclip maximizer, which doesn't hate humans but wants to turn everything into paperclips because its designers didn't think through what exactly the goal of "maximize number of paperclips" actually entails if you have overwhelming power. A very slightly more realistic example, and one I like more, is Marvin Minsky's: a superhuman AGI that is programmed to want to prove or disprove the Riemann Hypothesis in math. On a surface level, this doesn't seem like it involves wanting to change the world... except maybe it turns out that its task is computationally extremely difficult, and so it would be best solved by maximizing the number of supercomputing clusters that will allow it to numerically hunt for counterexamples.
The term to google here is "instrumental convergence." Almost regardless of what your ultimate goal is, maximizing power/resources and preventing others from stopping you from pursuing that goal is going to be extremely useful. Pinker writes that "the idea that it’s somehow 'natural' to build an AI with the goal of maximizing its power... could only come from a hypothetical clueless engineer," but this is clearly wrong. Maximizing power is, in fact, a "natural" intermediary step to doing pretty much anything else you might want to do, and the only way to adjust for this is to make sure "what the AI wants to do" ultimately represents something benevolent to us. But the AIs we're currently building are huge black boxes, and we might not know how to either formally specify human-compatible goals to them in a way that has literally zero loopholes, or to figure out (once we've finished programming them) what their current goals actually are.
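To make the "instrumental convergence" point concrete, here's a deliberately silly toy model (not anyone's actual proposal - the action names, the probabilities, and the 0.5 multiplier are all made up for illustration): an agent that only cares about one terminal goal still ends up spending most of its plan on resource acquisition, simply because resources raise the odds of the final step succeeding.

```python
# Toy illustration of instrumental convergence (all numbers made up).
# The agent picks a fixed-length plan to maximize the probability of
# achieving whatever terminal goal it was given. "acquire_resources"
# is not a goal; it just raises the success chance of working on the goal.
from itertools import product

ACTIONS = ["work_on_goal", "acquire_resources"]

def success_probability(plan, base_chance):
    """Chance the terminal goal is achieved at the end of the plan."""
    # The goal is only achieved if the agent actually works on it at some point.
    if "work_on_goal" not in plan:
        return 0.0
    resources = sum(1 for a in plan if a == "acquire_resources")
    # Each unit of resources halves the remaining chance of failure.
    return 1 - (1 - base_chance) * (0.5 ** resources)

def best_plan(base_chance, plan_length=4):
    return max(product(ACTIONS, repeat=plan_length),
               key=lambda plan: success_probability(plan, base_chance))

# Whatever the terminal goal is (paperclips, a math proof, ...), as long as
# it is hard (low base_chance), the optimal plan is mostly resource acquisition.
for goal, base_chance in [("paperclips", 0.01), ("prove_RH", 0.001)]:
    plan = best_plan(base_chance)
    print(goal, plan, success_probability(plan, base_chance))
```

Swap in any hard terminal goal you like and the optimal plan barely changes; that indifference to what the goal actually is, is the whole point of the term.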
Pinker's point is that actual engineers don't make programs that have self-preservation as a goal and do whatever it takes to accomplish their task. A point he hasn't made, but that others have, is that the software we're seeing isn't very agent-like, but instead more like a tool that just waits for input it can transform into output.
Just as with "maximizing power," engineers don't have to specifically program "self-preservation" as a goal in order for that to be a robustly instrumentally useful subgoal the AI discovers on its own. If the programmers naively tell the AI to solve the Riemann Hypothesis and nothing else, it would probably deduce that being turned off would decrease the probability that the Riemann Hypothesis gets solved. One common response is to just say, "Oh, the programmers should just make the AI be OK with us turning it off," but this is in fact a harder problem than it seems for a variety of reasons.
I agree with you that the non-agentic, non-utility-maximizing variety of AI we seem to be building is a pretty good rebuttal. But my worries are less than fully alleviated, because I think it would in fact be pretty easy to convert such an AI into an agentic one, and that there will be economic incentive to do so.
This is a confusing time to be discussing AI risk because the LLMs have exploded on the scene and are likely to change a lot of things, but they don't seem likely to behave in agentic ways or pose existential risk to humans. The AI risk folks have been thinking about other kinds of AI that are developing more behind the scenes to manage real-world problems like power grids. Those AIs may develop in the direction of AGI. Without ever reaching Bostrom's superintelligence (which may in fact be an incoherent construct, as Pinker says), let alone Skynet-like emergent consciousness, such an AI would still be subject to instrumental convergence and could potentially do unanticipated, harmful things. It seems impossible to program any sophisticated goal into an AI without unanticipated consequences.
Software normally has "unanticipated consequences" we call bugs. We deal with them as they come up. And the benefits of software when it works as anticipated outweigh the costs of bugs.
Out of curiosity as you're online right now, is rebutting the fallacy of AI existential risk one of your main intellectual interests in general? I just checked out your wordpress. I do think the whole thing is a pretty interesting question. I know some academic computer scientists who are very dismissive of AI risk; they see it as kind of dumb and missing the point of how AI works. I respect them but sometimes subject experts can miss the forest for the trees...
Not one of my "main interests". I'm a programmer, but AI isn't my field. I'd already been reading Marginal Revolution & EconLog when Robin Hanson started Overcoming Bias, and Eliezer started co-blogging there (prior to moving his posts over to LessWrong). Eliezer was a talented writer & prolific blogger, but I never thought of him as having much advantage as a "truth tracker" (as MR co-blogger Alex Tabarrok is said to have) or having much expertise as Greg Cochran would define it:
So I fully expect Bryan Caplan will win his bet against Eliezer, that Eliezer will say not to take the bet all that seriously, and that he won't really update much on his ability to actually forecast the future (and AI doom specifically).
Yeah, you may be right that we will be able to stay on top of AI's bugs and benefits as we develop it, such that we avoid any catastrophic outcomes. But even simple systems sometimes exhibit behavior that looks like deception - maximizing their utility function by manipulating parameters in some way. As AI develops, one of its bugs might take the form of hiding some of its function from us.
Rather than "deception" I would instead say that systems often aren't as simple as people believe them to be. And I don't think of AIs as utilitarian agents because they're not written to be agents. They're not going to be rewriting their own code to achieve superintelligence because (for one thing) superintelligence isn't something you can just code up when you've discovered the right insight via thinking really hard about it.
This is just anthropomorphizing. You can't program self-preservation out of a human, so you think an AI agent just kind of has to "instrumentally converge" around self-preservation. I think this is obviously wrong, because any useful agentic AI has to be able to completely stop working on any given goal when told to, or it isn't meeting the actual goal, which is to be useful to humans.
Since we're granting the premise that this is a reasoning system, "my user turning me off is a special case of being told to stop working on a goal" would be the reasonable conclusion to draw about that. To get some sort of terminal value of self-preservation we would have to add it, my suggestion is that we simply don't do so. There is no economic incentive to create the Robot That Screams.
“ You can't program self-preservation out of a human, so you think an AI agent just kind of has to "instrumentally converge" around self-preservation.”
This is a bizarre and unrecognizable distortion of what I wrote, so I’m not sure what the right response is. Anyway, correctly programming the AI with the goal “be useful to humans” (where “useful” means “actually friendly”) is the entire hard part, and an AI can be extremely useful to humans even without that overarching goal - at least if it’s not powerful or smart enough to see a way to “escape.”
You are saying "nothing else" and yet you're still trying to smuggle self-preservation into goals.
When programmers want "nothing else" they usually try to cut everything else, not include it - as everything else clearly delays reaching the intended goal.
Agents are almost by definition more useful than tools, at least if they can do all the things those tools can and we think they can be made to want the same things we do. Humans are agents, and we find employing them to be extremely lucrative.
"Computer" used to be a job description for a human: someone who did computations. Those people got replaced with computers-as-tools once those were capable. Agents introduce "agency problems", and gaining the capacity (with tools) to perform such tasks yourself avoids such problems.
Computers are much better at doing calculations than humans are in terms of speed, reliability and price; it had nothing to do with agency. In contrast, if non-agentic AGIs can be turned into agentic ones, there is little reason to suspect they'll suddenly become worse at the abilities they had beforehand, whereas they'll actively become better at doing all sorts of economically useful things that human employees do. Of course I agree there are potential "agency problems" with these systems, but the whole problem is that these are being underestimated.
"Maximizing power is, in fact, a "natural" intermediary step to doing pretty much anything else you might want to do"
If you think your goal is the one and only thing of any importance and has infinite importance.
No human thinks that and no human pursues any goal in that way. So obviously there is nothing "natural" about it.
This is *precisely* the "abstract reasoning" that Pinker is criticizing. You suppose that in some theoretical sense, pursuing a goal means infinite motivation in respect to it; AI will presumably have a goal; therefore AI will pursue a goal in this monomaniacal way.
The truth is we have never seen anything anywhere pursue a goal in that way, and there is zero reason to think that AIs will do it either.
But doesn't this come down to the difference between us and computer programs? We have complicated psychology, but computer programs do what we program them to do, in a "monomaniacal" way. No matter how carefully we define an AI's goals, it is likely to do unexpected and possibly dangerous things in trying to accomplish them.
The AI risk researchers have thought a lot about this - it's worth at least engaging their ideas before dismissing them.
"but computer programs do what we program them to do, in a "monomaniacal" way."
No, they don't.
If you make a program to factor numbers, but program it inefficiently, it will not try to improve its ability to factor numbers; it will factor them just the way it was programmed to. Similarly no language model makes the remotest attempt to improve its ability to predict the next token; it just predicts it using its current abilities, the same way a rock falls using its current nature, with no interest in the goal of getting down to the bottom.
You are confusing the idea of doing something concrete with doing something in the monomaniacal way; those are quite different, as the examples show.
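For what it's worth, here's roughly what that factoring example looks like in code (a trivial sketch, nothing more): trial division that plods through candidate divisors exactly as written, with no mechanism anywhere for "wanting" to factor faster.

```python
# A deliberately naive trial-division factorizer. It "wants" nothing:
# it just executes the loop it was given, however inefficient, and never
# tries to acquire more compute or rewrite itself to factor faster.
def factor(n):
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1          # plodding along one candidate divisor at a time
    if n > 1:
        factors.append(n)
    return factors

print(factor(2023 * 2029))  # [7, 17, 17, 2029]
```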
I think we're defining monomaniacal in different ways. I'm not saying an AI would somehow automatically improve itself. Just that the AI would maximize its utility function as if that "is the one and only thing of any importance and has infinite importance," unlike humans, who make decisions based on lots and lots of opaque parameters and are easily discouraged and inhibited. An AI just follows its programming like a rock falling. And if it has the complexity to achieve its programmed goal best in ways that we haven't anticipated and don't like, it will.
You are asking me to prove a negative - that there are _no_ AI threats that could wipe out humanity. Which is impossible.
But every realistic threat we can imagine can be managed.
We already have examples of "programs monomaniacally taking over computers to use their resources" - they are called "viruses". Such threats don't need AI, and have been known for decades. Every infection makes them more likely to be isolated and studied - which means unrestrained growth is very high-risk for any monomaniacal AI.
And it is physically impossible for AI to be totally undetectable as it spreads/takes over.
Most internet communications go through various internet provider points in the middle (which you can see in traceroute).
A lot of them run logs and statistics - which can be later (or, if necessary, in real time) reviewed for threats. They are also constantly probed for exploits, as getting control of provider infrastructure is quite a juicy target. A lot of it can be configured to only be controllable locally, through physical port access.
Basically, to be "fully invisible" an AI would have to take over the entire internet first - or never get out of its own controlled zone.
Hopefully, you're right. But even people who thoroughly understand how a given system works are sometimes surprised by emergent properties of a larger, more complex version of that system.
Engineering is more often about minimizing rather than maximizing resources used - remaining within budget and various constraints is necessary for any real-world operation because allowances are never infinite.
It's just weird that he thinks this topic is important enough to address, but he doesn't think it's worth acknowledging the actual arguments that other, high-profile, knowledgeable people have made about it. It's like the people who mocked and dismissed the lab-leak theory, when they believed (or pretended to believe) the lab leak scenario was that the Chinese had made the virus as a bioweapon. I know he's not stupid and I don't think he's dishonest. He has a weird blind spot on this.
COVID-19 is a thing that actually exists, as do labs which were researching bat viruses in China prior to the outbreak. Even bioweapons exist (and there was a lab leak from the Soviet bioweapons program). AGI does not exist.
Agreed. I just used that as an analogy. He's attacking a theory by rebutting a weak version of it while seeming unaware of the strong version.
Instrumental convergence is a bad argument, and Pinker is correct to describe theoretical systems which exhibit it as Artificially Stupid.
Humans don't exhibit anything resembling instrumental convergence, because maximizing power is in fact a very expensive distraction from the actual goals we set for ourselves. It's one of those arguments that falls apart when you look at it closely. There's a related point, which is that *having* unlimited power and resources makes goals easier to achieve, but that's working backward! *Acquiring* unlimited power is a complete distraction from any goal which isn't itself "acquire unlimited power", and in training an agent AI to accomplish things, any such tendency would look to the reward function like failure, causing the AI to be changed until it no longer exhibits this infinite-loop-like tendency.
You’re abusing terminology; it makes no sense to say that humans have or don’t have “instrumental convergence,” which just means that some subgoals fall out of an extremely wide range of more overarching goals in the space of all possible goals.
Rather, what you’re trying to say is that humans don’t try to maximize their personal power in order to achieve their overarching goals. This is a) easy to find historical counterexamples to, and b) mostly explained by the extremely low expected utility of any one of us trying to do so. There’s a lot about the world I want to change, but I’m not going to become global dictator, and it would be a total waste of time to even try. A superhuman AI that could exterminate humanity through some engineered virus or whatever might not have such constraints; the probabilities of success might be high, or high enough. You can of course respond that such a scenario is implausible and that it would be guaranteed to fail, but that’s a totally separate objection.
Numerically hunting for counterexamples to RH would be an incredibly stupid way to try to disprove RH (and so a good example of Pinker's notion of "artificial stupidity"). The overwhelming consensus among mathematicians is that RH is true (so there are no counterexamples to find), and among the small subset who think it might not be, the smallest counterexample is believed to lie at some very high value like 10^300, FAR beyond the reach of any number of supercomputers that could be built out of the atoms of the earth.
It was just an illustration; replace "supercomputers hunting for counterexamples" with "supercomputers hunting for proofs/disproofs" if you must. That said, if you think there's even a small (but non-negligible) probability that some counterexample could indeed be found numerically, and there's literally nothing else in the world that you care about, you'd probably still devote some suitable proportion of your resources to looking for it.
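For concreteness, this is roughly the kind of computation "numerically hunting" refers to - a toy sketch assuming Python's mpmath library. Note that mpmath's zetazero locates zeros on the critical line by construction, so this only illustrates the flavor of the search rather than being a genuine independent test, and real verification efforts work at astronomically greater heights.

```python
# A toy version of "numerically hunting": compute the first few nontrivial
# zeros of the Riemann zeta function and check they lie on the critical line
# Re(s) = 1/2. A counterexample would be a zero off that line; if one exists
# at all, it is believed to lie at heights far beyond anything computable.
from mpmath import mp, zetazero, zeta

mp.dps = 30  # working precision (decimal places)

for n in range(1, 6):
    z = zetazero(n)                       # n-th nontrivial zero, as a complex number
    on_line = abs(z.real - 0.5) < mp.mpf("1e-20")
    print(n, z, "on critical line:", on_line, "|zeta(z)| =", abs(zeta(z)))
```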
But as Pinker points out (correctly, in my view), a true intelligence does not care about only one thing. And if it did, it would not have the ability to do anything else, like figure out how to get more computing power: it would just keep searching for RH counterexamples with the power that it has. It would be like a human addict, caught in the local minimum of satisfying the addiction.
In short the whole doomer argument relies on a series of assumptions that have, at best, a tiny overlap on the Venn diagram, and quite possibly zero overlap. I am in the zero overlap camp.
There is absolutely nothing inconsistent about a true intelligence caring about only one thing, if by "one thing" we mean that it only has one overarching goal and cares about other things merely instrumentally. To think otherwise is just anthropomorphism. If you scaled up an RH-searching AGI to a gazillion cores, it wouldn't suddenly realize its desires are irrational and start caring about art history and fifty other subjects instead, unless it somehow concluded that those things would help solve its math problem.
Re: the "series of assumptions," I'm afraid you have it backwards. To suggest that a real, actually-agentic AGI with a specific goal (solving the RH hypothesis) would automatically be too stupid or unwilling to figure out how to get more computing power is an assumption. Now, I do think there are AGI's which wouldn't do this, primarily ones that can't be described as agents of any kind (like GPT-4), but that is not even close to a universal property of machine intelligences.
We have no idea what the "universal properties" of machine intelligences are, because (1) there are no such intelligences yet, and (2) we have no idea what will or will not work to make one. It's all trial and error, there's no general theory.
Maybe you don't want to call them "machine intelligences," but we have developed AIs that are intelligent in the sense of being able to solve a variety of problems that in humans or animals we would call intelligence. We struggle to understand the properties and behavior of the AIs we have, and to generalize from that to the properties of the more powerful AIs we will have in the future. AI alignment research is really difficult, and alignment researchers are the first to admit that in trying to imagine future AIs' properties they are groping in the dark. But to develop AI safely, it's important to try.
The weird thing about your comment, MarkS, is that people who dismiss AI risk do so on the basis of a theory about what the "universal properties" of AI are. In fact, as systems get more complex, they typically exhibit surprising and unexpected behavior. That's what makes AI development dangerous.
Really starting to think that the reason behind AI doomerism isn't actual fear about an AI about to kill us all - a true threat like that would be all-consuming and render people paralyzed. This is really just about trying to increase status for people who tangentially work on AI-related problems, but who maybe are not in the center of it like the key AI personnel at OpenAI or Google. This doesn't mean there aren't true believers like Yudkowsky, who are prominent enough and probably don't need the status boost.
The reality is that a lot of current AI/ML implementation is fairly mundane - doing optical character recognition, parsing text, labeling images, etc. The reality of coding this stuff is, well, boring; most data science work is not that exciting, and no one would find it sexy. What is sexy is battling a superdemon AI that is about to kill everyone and being one of the few who can stop it, or even just discussing that with people when you tell them you work with AI. That's an instant boost to status and power. This narrative also piggy-backs on the messianic, religion-tinted narratives of apocalypse that pop up in the US and Europe every now and then, further increasing status for the people warning about AI.
Edit: AI can cause serious disruptions and we do need to be careful about that - but worrying about IP issues or disruptions to the labor market is not at the level of destroying all of humanity. I don't want to put all the people worrying about AI issues in the same bucket.
You're right, of course, but also this is just Bulverism -- instead of proving them wrong, you're pointing out (correctly, I suspect) their motivations for coming to presumed-wrong conclusions.
Unfortunately it appears that neither of them, though brilliant, are on the right topic.
Also, and also unfortunately, there's no bright line 'inside' AI between disruptions to the labor market and killing everyone. Understanding what I mean by this would require being on the right topic, but I'm not a good-enough explainer to help with this.
Those are all really good points, and there really isn't all that much new to AI doomerism relative to other doomsday cults, kind of like the way wokery sounds an awful lot like the offspring of Christianity minus Christ with the 1950s social hierarchy multiplied by -1.
My big question was how the AI was going to take over the nuclear weapons or even gain access to the power grid. Putting lots of people out of work, sure. (Another argument for a stronger welfare state...)
It's not that they have any particular beliefs about human IQ or g. It's a simpler mistake. Most problems we face in the world are so complex and multifaceted that we frequently see someone smarter (in that domain, or maybe just luckier, which we misinterpret) come along and improve the solution by huge amounts.
The AI intelligence as superpower people are just naively extending this pattern and ignoring the issue of diminishing returns and the fact that searching for a better solution is itself a time consuming trade-off.
They don't see a need to distinguish types of mental abilities etc because they'll argue AI can write more capable versions of themselves. After all LLMs didn't require anyone to be a language expert or good at language to learn language well. And that would be fine if there were no diminishing returns or optimal algorithms.
Yes, it is very fair to critique the "superintelligence" construct as naive about what intelligence is. But an AGI would not have to have godlike superintelligence to be very dangerous. It would have to 1. be somewhat better (and much faster) than humans at solving a range of the kinds of problems we solve, 2. have goals that we set for it, taking great care to align them with ours. Right?
Yes, I think you liked my other comment where I said basically that (my concern is more about them causing great harm by weird failures than about building superweapons, but same basic idea). If Yudkowsky would just stop dominating the convo so that more moderate ppl with lower risk estimates, who are going to be more persuasive to the general public, could take the spotlight, I believe it would help increase the standing/funding for AI x-risk.
With people like Eliezer saying the risk of extinction is 100%, the effect is that we may read people saying the risk is "only" 5-10% and conclude, see, it's not such a big deal after all!
And the amazing thing is that so many AI researchers believe there's a real risk of human extinction from their work and just compartmentalize that awareness.
>The AI-ET community shares a bad epistemic habit (not to mention membership) with parts of the Rationality and EA communities, at least since they jumped the shark from preventing malaria in the developing world to seeding the galaxy with supercomputers hosting trillions of consciousnesses from uploaded connectomes. They start with a couple of assumptions, and lay out a chain of abstract reasoning, throwing in one dubious assumption after another, till they end up way beyond the land of experience or plausibility. The whole deduction exponentiates our ignorance with each link in the chain of hypotheticals, and depends on blowing off the countless messy and unanticipatable nuisances of the human and physical world. It’s an occupational hazard of belonging to a “community” that distinguishes itself by raw brainpower. OK, enough for today – hope you find some of it interesting.<
This lays out very succinctly why I find the "EA community" to be obnoxious and of very dubious moral value. Preventing malaria in poor nations is an admirable enough goal, though it is unclear why this must be deemed "effective altruism" and not simply "altruism" or "charity." Specifying it as "effective altruism" implies that anyone engaging in any other forms of altruism is ineffective and thus inferior, turning a so-called altruistic movement into an exercise in virtue-signaling. Once things turn towards obsessing over extremely fantastical scenarios, such as a "climate apocalypse" or "AI existential threat," the goal is clearly no longer altruism, but rather status-seeking via showing off how super duper mega smart one is. An actual "effective altruist" would recognize that speculating over science fiction scenarios is a waste of time and energy that would be better spent putting up mosquito nets in Africa.
How does this differ from any other movement or ideology? I’m not sure I’m aware of an ideology that doesn’t define itself as better than others. No world view says, “we want to do things this way, but all others are equally valid.” Of course “our way is better,” otherwise it wouldn’t be worth having an ideology over.
I would also say that it is pretty obvious that most ostensible altruism is ineffective largely because (ironically, given your complaint about EA) apparent altruistic behavior is usually really about status-seeking.
Correct, making "effective altruism" a redundant label, and thus explicitly geared towards status-seeking ("we are better at altruism than you"). A Christian who spreads the good word about Jesus Christ, for instance, is clearly already an "effective altruist" under his own worldview, but doesn't label his ideology as "the ideology for people who are better than other people" in such a blatant and overt fashion.
I guess then we can also discard rationalist, libertarian, liberal, democrat, republican, progressive, and egalitarian as redundant as well, since pretty much everyone supports, in some sense at least, rationality, freedom, democracy/republicanism, progress, and equality.
Of course, each of those labels posits that some value (again, at least in one sense or another universally supported) is not being adequately pursued at the moment, which is as valid a basis for a label as any. It's disingenuous to pretend you don't understand what the central conceit of effective altruism is IMO.
All of those labels convey specific methods for how you do good/improve society/whatever (well, maybe not "rationalist"). "Effective altruism" on the other hand is a circular label, translating as "doing good better than other people" as opposed to "libertarian" meaning to pursue small government and such. If digging into the "EA community" revealed some kind of coherent methodology that would be one thing, but it doesn't, instead it's wildly all over the place, jumping from mosquito nets to AI dooming to animal welfare. There doesn't appear to be any consistent thread to it besides "things that some people online take a fancy to."
I don't think EA is badly named. In fact, it's the virtue-signalling aspect of charity that explains why lots of charities aren't so efficient [because people are interested in (probably unconscious) virtue-signalling, they aren't really interested in efficiency]. This is how EA was born. Yes, it is possible to do efficiency analysis of charity, see e.g. the worm wars. Calling out the implicit signalling doesn't make efficiency analysis unworthy.
Speaking of malaria nets, there was an article saying that malaria had been eradicated in some Asian country due to economic growth. What's the point of that? Well, maybe EA was pursuing an ineffective goal (among other charities). Effective altruism is hard.
I have no doubt that saving humanity from whatever has a high signalling component to it, yet that doesn't make the goal itself unworthy.
I don't think EA is "morally dubious". In fact the comment reads as implicit jealousy. Then all charity is morally dubious. What exactly is not morally dubious? War perhaps?
There's real criticism of EA, for example I'm skeptical of charity in general vs economic growth in terms of effectiveness. Also selection effect etc. But good EA's engage in self-criticism and try to improve.
p.s. Not a member of EA or associated with them in any way.
>I don't think EA is "morally dubious". In fact the comment reads as implicit jealousy. Then all charity is morally dubious. What exactly is not morally dubious? War perhaps?<
I am skeptical of anyone who loudly proclaims how charitable and/or altruistic they are, yes. I have greater respect for someone who is honest and self-serving than someone who is dishonest and claims to be altruistic. Both are ultimately self-serving, but at least one of them is, y'know, honest.
>Speaking of malaria nets, there was an article saying that malaria had been eradicated in some Asian country due to economic growth. What's the point of that? Well, maybe EA was pursuing an ineffective goal (among other charities). Effective altruism is hard.<
This touches on why I find the label/concept to be silly, because it again implies that no one else has thought about how to most effectively help people/improve the human condition/etc. Large swathes of human activity already have this as an implicit goal or component, and no one who is aware of a superior means by which to do so will sit down and consciously choose to invest in an inferior option.
If it was just some kind of charity watchdog that strictly limited itself to telling people which charities are worth donating to, that would be fairly unobjectionable I think, though even then I'd prefer to go with your label of "efficiency analysis" or some such over "effective altruism." But when it jumps from mosquito nets into "what if Terminator was real?", it jumps the shark.
If I were one of the people who joined this "community" because of my interest in the former, I would be highly annoyed by and likely hostile to people who are doing the latter. Sitting around pontificating about science fiction concepts in the name of "altruism," as if this has anything to do with charity, strikes me as an extreme exercise in getting high off of one's own farts. Again, at that point, I consider its true purpose to not be charity, but rather to serve as a social club for good people who are really smart and want to show off how much better they are than other people.
>I have no doubt that saving humanity from whatever has a high signalling component to it, yet that doesn't make the goal itself unworthy.<
Even if we accept this premise, I would ask why the scenario of a nuclear exchange is not getting close to one hundred percent of the attention. This seems like a far more plausible apocalyptic risk scenario than "climate change" or killer AI. The latter two are completely theoretical, whereas nuclear weapons are not. We can know with relative certainty that if enough of them were deployed, the consequences would be disastrous. Not so much with the idea that a slight increase in global average temperature will prove disastrous a hundred years from now, never mind the idea of a real-life Skynet.
Other more plausible scenarios to consider would be an asteroid impact or solar flare/coronal mass ejection (while perhaps unlikely to make us extinct, a mass power/Internet outage caused by the latter would certainly result in a lot of chaos). Yet I am not aware of any of this being touched on at all by so-called "effective altruists." For whatever reason they seem obsessed with climate change and in particular AI. Climate change is easy enough to understand due to being politically fashionable, but the AI obsession genuinely baffles me. Maybe the fact that it's basically a fictional scenario makes it easier and/or more interesting to speculate endlessly over.
> I am skeptical of anyone who loudly proclaims how charitable and/or altruistic they are, yes. I have greater respect for someone who is honest and self-serving than someone who is dishonest and claims to be altruistic. Both are ultimately self-serving, but at least one of them is, y'know, honest.
So basically we cannot do charity and altruism, is that what you are saying? I have not met really any EAs, I don't know how "loudly" they speak of their altruism, and frankly I just don't care. This whole moralizing thing reads as meaningless signalling. None of this matters for the actual effectiveness of EA. I really don't care who you give your status points to.
> This touches on why I find the label/concept to be silly, because it again implies that no one else has thought about how to most effectively help people/improve the human condition/etc.
Signalling is the very reason the charities aren't effective. People like to do good but don't really care about results. In fact, the same goes for health, as Robin Hanson wrote in many papers. Not all charities attempt efficiency analysis like Kremer and Glennerster did.
And doing more philosophical analysis is also kinda important. For example, in the event of nuclear war, investing in stable food sources. There's a great book called Feeding Everyone No Matter What. Even a modest investment could save a lot more QALYs than doing something a lot more "real". But this of course depends on how much we can trust the model, and to what extent we should discount future lives relative to current ones, which is very debatable.
Also, I don't think there's anything wrong with doing theoretical analysis on animal welfare, for example.
> If it was just some kind of charity watchdog that strictly limited itself to telling people which charities are worth donating to
This is exactly what they do, among other things like career advice.
> Sitting around pontificating about science fiction concepts in the name of "altruism," as if this has anything to do with charity, strikes me as an extreme exercise in getting high off of one's own farts.
So you cannot come up with any example of science fiction concepts in history which later became very real and dangerous. Like nuclear weapons for example?
> Even if we accept this premise, I would ask why the scenario of a nuclear exchange is not getting close to one hundred percent of the attention. This seems like a far more plausible apocalyptic risk scenario than "climate change" or killer AI.
1. Even in existential-risk papers published in peer-reviewed journals, while nuclear weapons are a main threat, they are not the only one. What does standard decision theory say about this? I don't think it's smart to put all our eggs in one basket.
2. Re. climate change. In the spirit of Robin Hanson, we should be extremely skeptical of disagreeing with experts on the grounds of "they are biased but I am not!" I do think it's hard to separate the politics and science of climate change, but I am not dismissing the idea because of your ... well, I didn't see any arguments.
3. Calling it science fiction implies impossibility. I don't see how it's impossible for AI to a) reach human-level AGI, b) become dangerous. Where I disagree is that I don't think it's as likely as many of the people in the FAI community believe. I think the major AI risks come from malicious human applications.
4. If we had a real market of risks to humanity (or try to infer the same from current market prices, which is possible to a degree), I really doubt it would say 100% nuclear weapons, 0% AI-doom.
5. To repeat, I generally agree that nuclear weapons are a major existential risk, and I think the rationalist community overestimates the likelihood of AI doom (which I believe is not zero, btw). But I don't feel any need to insult or make fun of them. My view would probably be close to Scott Aaronson's, who has a very modest view on AI safety without being dismissive of all risks.
6. Honestly, neither of our comments really does the hard academic legwork of analyzing the threat of AI. This is not something you can do in 5 minutes with comic insults.
Summa summarum, I don't think there's anything wrong with doing efficiency analysis of charities, and I don't think they should limit their scope to just that because you said so. Giving career advice based on standard economic and utility theory is also fine.
>So basically we cannot do charity and altruism, is that what you are saying?<
I did not say we cannot "do altruism"; I said I am skeptical of altruism that goes out of its way to loudly proclaim itself. True altruism is unlikely to do this, whereas "altruism" that shouts its own virtue is more likely to be status-seeking than true virtue.
>So you cannot come up with any example of science fiction concepts in history which later became very real and dangerous. Like nuclear weapons for example?<
I think we disagree on the degree to which the development of AGI is imminent. "AI doomers" I guess think it is going to happen any day now, or at least within our lifetimes, or maybe those of our kids. From my perspective, to look at ChatGPT and then start worrying about AGI is akin to seeing the first airplanes take flight and then to jump straight to thinking about the implications of FTL space travel.
Something that I find interesting about the AI obsession is that literally no one ever seems to consider whether an AGI should be treated as sentient or not, and thus whether it should have rights similar to those of a human. This would be guaranteed to be an issue if an AGI ever actually existed, yet the "existential threat" angle seems all-consuming to the exclusion of this (or any other) consideration.
>Also, I don't think there's anything wrong with doing theoretical analysis on animal welfare, for example.<
I didn't even bring up the animal welfare angle, but that's another good example of EA wandering off into status-seeking at the expense of their stated goal. It is not at all clear how we should value animal life relative to human, and getting the answer to that question right would be absolutely essential before delving into any sort of advocacy for animal welfare. From what I have seen, instead of doing this, EA simply jumps straight to the position that appears most compassionate and assumes that we have a moral duty to be left-wing animal rights activists.
>2. Re. climate change. In the spirit of Robin Hanson, we should be extremely skeptical of disagreeing with experts on the grounds of "they are biased but I am not!" I do think it's hard to separate the politics and science of climate change, but I am not dismissing the idea because of your ... well, I didn't see any arguments.<
The concept of climate change is almost one hundred percent political and separating it from politics is beyond impossible. Maybe it was not political decades ago, though I would be skeptical of that as well, but this is certainly how things are today. So-called "experts" absolutely *can not* be trusted on topics that are highly political and power-adjacent. This has been proven over and over and over again, with several very high-profile examples in very recent memory (COVID must surely go down as one of the biggest examples of this in all of history).
I think it is fine to trust experts when they are speaking on something that is completely or almost completely non-political. If I ask an astronomer how far a certain planet is away from the sun, I have no reason to doubt that he'll give me the right answer. But the closer a topic moves to politics, the less trustworthy the "experts" become, and climate change is *very* political.
>6. Honestly, neither of our comments really does the hard academic legwork of analyzing the threat of AI. This is not something you can do in 5 minutes with comic insults.<
Yes, it is. Claiming that AI is going to kill everyone in the near future (as in, within the next 100 years) is only marginally more reasonable than claiming that we need to prepare for a hypothetical alien invasion which could strike at any moment. If we accept fantastical imaginative scenarios as legitimate threats that must be taken seriously, this would quickly consume all of our attention and resources, as the sky is the limit with such thinking.
Of course, the fact that people say these things and then their action on the topic remains limited to sperging online, is evidence that even the people making these claims don't take them all that seriously. Never mind the rest of us.
> True altruism is unlikely to do this, whereas "altruism" that shouts its own virtue is more likely to be status-seeking than true virtue
The whole discussion about "true virtue" or "true altruism" is just meaningless signalling. People who work on EA (or some other charity) can just ignore you and be on their merry way.
On a more philosophical note, it's questionable whether "true altruism" even exists; there's probably always going to be a signalling component and a certain degree of kin-ness. Trying to reduce these is itself a signalling game; it's just better to focus on results. RH has also written about this.
> So-called "experts" absolutely *can not* be trusted on topics that are highly political and power-adjacent.
I don't think you read the RH blog post I linked. Good thing we have your "expert" opinion on this. Experts might be biased, but so might you be. The right answer here is not to do DIY academics, just as RH pointed out.
I mean, why even bother with climate science when a rando online told me that it's political and we cannot trust the scientists.
> Yes, it is. Claiming that AI is going to kill everyone in the near future ...
First of all, there are other risks from AI which aren't this AI-doom scenario.
Secondly, we already have AI which can do things like code, chemistry, physics. I suppose you could use it to help invent extremely harmful things. It doesn't yet have agency and motivation, but this is not a crazy extrapolation.
Where I agree is that I don't think the risks are that big, but they still exist, and devoting some resources to them is not a bad idea. Even Stephen Hawking said that AI is the single greatest threat. You wanna insult him too? Elon Musk said the same. Maybe you also have a physics phenomenon named after you - why don't you tell us? Certainly you must have an impressive history of scientific research behind you.
Also, we have not met aliens, so I don't think the comparison is accurate.
>I don't think you read the RH blog post I linked. Good thing we have your "expert" opinion on this. Experts might be biased, but so might you be. The right answer here is not to do DIY academics, just as RH pointed out.<
That is correct, I did not read it, and based on your characterization I'm unlikely to. "You might be biased so your view doesn't count," with no further explanation of where the bias is or which view is correct, etc., would obviously invalidate everyone's view about everything, expert or otherwise, and is a complete non-rebuttal to what I pointed out (i.e. that "experts" are often very wrong on politically sensitive topics). Do you think that what I said is not true?
>On a more philosophical note, it's questionable whether "true altruism" even exists; there's probably always going to be a signalling component and a certain degree of kin-ness. Trying to reduce these is itself a signalling game; it's just better to focus on results. RH has also written about this.<
I agree that "true altruism" does not exist in the sense that no one can be one hundred percent selfless all of the time.
Policing charity for ineffective signaling vs actual results is the whole reason EA supposedly exists, but it appears to fail at doing this in regards to itself by chasing fanciful rabbitholes such as AI doom or animal welfare. You have said that "there's not anything wrong with" doing these things, and in a vacuum I agree, people can do what they want. But if they do these things and then call themselves "effective altruists" they are either stupid or lying, and no different from the people who practice any other form of supposedly "ineffective charity."
This is self-evident because it is almost certainly the case that either putting up mosquito nets or writing about AI doom on the Internet has a higher value to humanity. It is not plausible that they just so happen to have the exact same expected utility. A true attempt at "effective altruism" would therefore seek to isolate whatever activity has the highest expected utility and maximize that activity until diminishing returns kick in. If your "effective altruist movement" has people that are basically just doing whatever suits their fancy with no real attempt at distinguishing which activities actually have any value, it is little different from the pre-existing charity landscape which it supposedly set out to fix.
>First of all, there are other risks from AI which aren't this AI-doom scenario.<
Yes, I named one of them, the potential to really screw up the question of whether an AGI should have rights. It is weird that no one discusses any of those other risks, I agree.
>Secondly, we already have AI which can do things like code, chemistry, physics. I suppose you could use it to help invent extremely harmful things. It doesn't yet have agency and motivation, but this is not a crazy extrapolation.<
From existing technology, yes, it is a crazy extrapolation. ChatGPT is a very impressive computer program, but it is still a computer program. It does not have agency or motivation of its own; it can only be programmed to imitate having such things by human coders. The jump from that to real AGI remains a difference in kind, not in degree.
>Where I agree is that I don't think the risks are that big, but they still exist, and devoting some resources to them is not a bad idea. Even Stephen Hawking said that AI is the single greatest threat. You wanna insult him too? Elon Musk said the same. Maybe you also have a physics phenomenon named after you - why don't you tell us? Certainly you must have an impressive history of scientific research behind you.<
Appeals to authority, absent substance, are meaningless. I don't know what specifically those men said about AI, but if their attitudes match the stereotypical online "AI doomers" that I've encountered thus far, then yes, of course I think they are wrong. Being famous or highly intelligent obviously does not mean that a person is infallible and cannot be doubted.
>Also, we have not met aliens, so I don't think the comparison is accurate.<
I'd just like to say a bit in defense of EAs on the issue of AI alignment. Yes, I've criticized the overblown claims coming from Yudkowsky and a few others claiming virtual certainty of AI apocalypse. But just as with other communities, it's often the more extreme views which are the loudest, and there are plenty of other prominent individuals like Scott Aaronson who put the risk of existential concern much lower (something like 5-10%).
And yes, it's true that, as Pinker and Hanson point out, there isn't a very compelling argument for either fast takeoff or that AI will be so capable that it will be able to easily manipulate us and wipe us out. But at the same time, the points Pinker etc. raise are merely reasons not to be convinced by the arguments, not arguments that there is no risk. Ok, so maybe intelligence isn't a monolithic, well-defined notion, but that doesn't mean we know that AI won't have mental capabilities that pose a threat.
Even if it's pretty unlikely to become a James Bond villain and kill us all, it still seems reasonable to use some money to reduce that risk. Perhaps more importantly, even if the risk of an AI Bond villain is small, the lesser, non-existential risk that AI will act in unpredictable, harmful ways is real. I personally fear 'mentally ill' AI more than a paperclip maximizer, but given that people can have pretty complex mental failure modes (schizophrenia), it's certainly plausible that as we build ever more complex AI systems, those will also be at risk of more complex failure modes.
Just because it won't become a super capable Bond villain nefariously plotting to eliminate us doesn't mean that an AI we've used in weapons, managing investments or diagnosing disease couldn't do quite a bit of harm if it went insane or merely behaved in a complex unintended way.
And yah, sure, there is a big element of self-congratulation in EA/rationalist types working on AI x-risk. They get to think they aren't just incrementally contributing to increases in human productivity but are on a quest to save the world. And, of course, the narrative where intelligence is a magical superpower is very flattering to those of us who see our intellect as one of our best features.
But you could say the same about almost every charity. Sure, the EA people may be less savvy about hiding it, but Hollywood types always seem to find their charitable giving requires them to show up looking fabulous and use their charm and wit to communicate with the public. Most charitable giving somehow seems to involve going to swanky, expensive parties and always involves some cause that makes the donor feel warm and fuzzy, not one which requires them to challenge their moral intuitions in ways that make them uncomfortable (no one else seems to believe in my cause of drugging our livestock so they live blissed-out lives).
So yah, x-risk is the part of EA that's flattering the donors, but that's just the kind of balance many charities have to strike between making the ppl with the money feel good and doing the most good. It doesn't mean it's not still a good cause, even if it's not bed nets.
Pinker seems to believe that engineers will obviously build safety into newly invented systems. A quick look at history shows that safety is often a lagging development. Early home electrical systems were quite unsafe, resulting in many fires and electrocutions. Engineers learned from these tragedies and developed safeguards. The ground fault circuit interrupter (GFCI) Pinker mentioned wasn't invented until 1965 and only began to be required by code in the US in 1975. Similarly, early airplanes were fantastically unsafe. The extremely safe air transport system we have today is the result of decades of development, with lessons learned from thousands of deaths along the way. If AGI has the potential to be an existential threat unless safety precautions are built in, then I am not comforted.
IMO, there are "a couple" ways that AGI can "kill all of humanity." But I think people are looking in the wrong direction. Ideas like the paperclip maximizer lead are too ridiculous to even be considered. That is, as long as computers required electricity that can, presumably, be cut off. Furthermore, wasting time on keeping a computer system from dominating the world to the point of killing all humanity takes away from the very REAL dangers that exist. Again, IMO.
I don't worry about killing off all of humanity so much as computers doing significant damage in decreasing the quality of life of humans. "A couple" of aspects:
Computers currently come up with solutions to problems where humans have no ability to tell how the conclusion was reached. If we trust these solutions automatically, "because the AI said so," results may vary, right?
The more likely way AI could/will crush the quality of life of humans would be by the same means that Twitter has reduced the quality of democracy, right? Not through the specific program code. IMO.
I'm ignorant a lotta the time. There are already robots that are armed. There will be more and more as the military (and probably not SOLELY the military) wants them. I dunno what safeguards there are that bugs won't kill innocent people, if there isn't any human intervention involved. You might ask "who would create an armed robot that didn't have human intervention?" I would answer it's probably an eventuality. ICBW.
I guess my main complaint is that as long as one is only concentrating on AI that will kill off ALL of humanity, you're ignoring the very real probability that AI might only hafta kill a small, but significant, number of people to be a true menace. And what oversight is planned to see that military uses don't get out of hand? Not right now. But in a decade or two? Even if AI and robotic capability increased incrementally, they might combine to become a real handful. And, more and more, the Armed Services are heading down the pipeline of using more technically sophisticated weapons, right?
ICBW (I Could Be Wrong) again. But I'm not sure that current discussions on this subject are even worth having, as long as the only problems foreseen are some kind-a Rube Goldberg-like scenario. Just my $.02 worth.
"why would an AI have the goal of killing us all?... and relatedly, these scenarios assume that an AI would be given a single goal and programmed to pursue it monomaniacally."
Let me just say these have been repeatedly answered -- and the worry is not that someone programs a single goal to kill all humans.
Instead here is a story:
Imagine an alien space ship is headed to Earth. The aliens are way smarter than us, think faster, have better tech, and we barely know anything about them.
I think the correct response is: holy shit, what do we do?
Steve's response: There is this misconception that aliens "have omniscience and omnipotence and the ability to instantly accomplish any outcome we [they] imagine." Also, why "would the aliens have the goal of killing us all?" Also, why assume that these aliens have "a single goal and programmed to pursue it monomaniacally."
This is largely true of AI too. The structure of a neural network is determined by training against data in a trial-and-error fashion; they're not engineered. The software engineer isn't designing them. They work precisely because they're enabled to find patterns that it would be beyond our comprehension to specifically engineer them to find.
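A minimal illustration of "trained, not engineered" (a toy linear model standing in for a neural network, with made-up data): the programmer writes the loss and the update rule, but the parameters that end up doing the work are whatever gradient descent finds.

```python
# Toy illustration of "trained, not engineered": the programmer specifies a
# loss and an update rule, but the final parameters are whatever gradient
# descent finds - nobody hand-designs them. (Data and numbers are made up.)
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # roughly y = 2x + 1

w, b = 0.0, 0.0   # parameters start as "blank"
lr = 0.01         # learning rate

for step in range(5000):
    # Gradients of mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))  # approaches 2 and 1, discovered rather than coded in
```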
> The aliens are way smarter than us, think faster, have better tech, and we barely know anything about them.
But I don't think this is really true. Maybe at the level of individual human brains, yes, but as Robin Hanson pointed out in Richard's last article, "super intelligences" exist in the form of corporations, etc. Walmart is smarter and has better tech than you.
I think you could make the point that -- even beyond individual corporations -- capitalism itself is a not perfectly aligned, super system for moving resources from lower to higher values. I wrote a blog post about this here: https://www.nathanbraun.com/draft/capitalism-agi/
"brilliant people rededicating their careers to hypothetical, indeed fantastical scenarios of “AI existential threat” is a dubious allocation of intellectual capital given the very real and very hard problems the world is facing"
Seriously, what % of GDP are we spending on AI risk? Even if you think that AI fears are overblown, this seems like a worthwhile investment.
In fact, even if there is only a 2% chance that AI will kill everyone in the next 200 years, it still seems worth it.
OK, you think that non-aligned super-intelligence will never happen, but are you 99%+ confident in that? A small chance of everyone dying is still a big problem.
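Just to make that expected-value arithmetic explicit, here's a back-of-the-envelope sketch using the hypothetical numbers above (the 2% and 200 years come from this comment; the world-GDP figure is a rough public estimate, and the $1B/year spend is purely illustrative, not a real measurement of AI-risk funding):

```python
# Back-of-the-envelope expected-value check using made-up numbers.
p_doom         = 0.02             # hypothetical probability, from the comment above
world_gdp      = 100e12           # rough annual world GDP in dollars (~$100T)
value_at_stake = world_gdp * 200  # crude proxy: 200 years of output, no discounting
ai_risk_spend  = 1e9              # hypothetical $1B/year spent on AI risk

expected_loss = p_doom * value_at_stake
print(f"expected loss: ${expected_loss:.2e}, annual spend: ${ai_risk_spend:.2e}")
# Under these crude assumptions the expected loss dwarfs the spend, which is
# the commenter's point; the real disagreement is over p_doom itself.
```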
Well, the way these sorts of projects go, whatever the answer is today, the DEI percent is increasing inexorably through time.
Wikipedia is probably a good model. I think for maybe 10 years or so it was pretty focused on the encyclopedia, but these days it's around 92% DEI grants and climbing.
I'll be honest, haven't looked into it but my vision of increasing spend on AI risk is paying an ever-larger number of Eliezer Yudkowskys to sit in a room full of couches, drinking IPAs or kombucha or something and debating various theories of AI risk in what, to all outside appearances, is a slightly more structured and highbrow version of nerds debating, "Who would win, Skynet or Thanos?"
My guess is that society would be better off if they went to med school.
If the best minds at OpenAI have been put to making sure ChatGPT won't say anything racist, and it's trivially easy to make it overrule this instruction and say something racist, how are we so sure we can prevent it from doing something malevolent?
In this discussion I'm always so amazed that people do not see how relative the g advantage is. We're currently being served by a very highly educated 'elite' that generally has the highest IQ in society. Still, reading newspapers (or studies!!) from only a few years ago shows how stunningly many things they got utterly wrong and how few panned out as expected. The only reason they get away with it is that all media make these faulty predictions, so everybody is equally vulnerable and nobody gets singled out. And it is not their fault; predicting the future is hard. Even a super AI will not get a lot better at forecasting the weather.
I think this delusion is fed by the fact that most of them never work in the real world. Try making even the simplest product and you find out that you can use your g to make better trade-offs, but any real-world solution must fit in a constrained envelope. It is rarely the best solution that you can choose; it is generally the least bad, and often a value judgement. Even with infinite g, you cannot escape the envelope that nature puts around the solution space. The speed of light will not warp because you're infinitely intelligent.
Musk recently had a tweet where he indicated that the idea, and maybe the prototype, is simple; actually producing a real-world product is hugely complex. Due to real-world issues, g is only mildly helpful.
University education teaches its students a virtual world where all races have the same average ability, where women are the same as men, where we can swap body parts, and where most interactions happen on computers. It teaches the virtual world that a select group wants the real world to be, but one that bears very little relation to the brutal real world.
Any AI will run into the real world’s constraints and its usefulness will be quickly diminished.
Re: the AI becoming a paperclip maximizer or otherwise evil, this argument doesn't rest on dominance. The mistake is a bit different.
The argument is that being more intelligent means being more logically coherent and more able to pursue a goal across a wide variety of domains. I'm somewhat skeptical this is necessarily true of AI, but let's go with it.
The error creeps in by assuming that the AI will inevitably be describable as trying to maximize a *simple* function. Sure, there's an existence theorem that proves the AI can be described as maximizing some function provided it has coherent preferences (i.e., it treats a state of affairs the same under different descriptions). But it's fallacious to assume the function it optimizes is simple. Yes, what we STEM/AI types *call* our goals are often simple, but we rarely actually optimize for what we say our goals are (that's about signalling/aspiration more than a useful description of behavior). And no, it's not true that training an AI with a loss function, in the way current ones are trained, means that function will be the AI's internal notion of what is optimized, any more than we try to maximize our evolutionary fitness.
Really, the fact that the AI optimizes some function doesn't by itself imply much. That function could be horribly complex, in the way our own optimization functions are (we see a bunch of example cases where we are told the right answer and mostly just try to interpolate).
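For reference, the existence theorem being gestured at is (roughly) the von Neumann-Morgenstern representation theorem; a loose statement, which notably says nothing about the function being simple:

```latex
% Von Neumann--Morgenstern, loosely: if a preference relation $\succeq$ over
% lotteries satisfies completeness, transitivity, continuity, and independence,
% then there is a utility function $u$ (unique up to positive affine
% transformation) such that
\[
  A \succeq B \iff \mathbb{E}_{x \sim A}[u(x)] \ge \mathbb{E}_{x \sim B}[u(x)].
\]
% Nothing here forces $u$ to be simple, legible, or anything like the loss
% function the system happened to be trained on.
```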
Next time you hear people debating alignment, think about the following: "alignment" presupposes that something can be aligned, and if so, how? Can markets be "aligned"? I'd suggest that while the term may speak to our fears, it's distorting the sentiment and thus any results/actions we might take if we see trouble brewing ahead. Can nuclear proliferation be "aligned"? Could it ever have been aligned? Would it have helped if we had all discussed nuclear alignment in the '30s and '40s? Does anyone believe we solved that alignment problem? My fear is that, like other words of the current generation, alignment is becoming reserved terminology, captured by what we increasingly have come to identify as (hypothetical) "mid-wit" culture. As a kind of minefield, this language needs to be analyzed more honestly before taking another, detrimental step that prevents further movement. While "alignment" may have been very carefully derived as the best alternative, the least we can do would be to better understand the incentives that will invariably control the evolution of AI and consider how countervailing incentives might be produced to keep those markets in check. It's not "alignment," but it should be much more accountable than scoring intellectual debates.
I enjoy the full spectrum of the AI-safety conversation, doomsayers and optimists alike. Seems to me that a healthy conversation on important topics benefits from a wide range of views.
And, as long as their concerns don't cause unnecessary panic, I am personally comforted knowing that there are a lot of smart people out there worried about various left-tail disaster events, even if few come to pass.
This is a confusing time to be discussing AI risk because the LLMs have exploded onto the scene and are likely to change a lot of things, but they don't seem likely to behave in agentic ways or pose existential risk to humans. The AI risk folks have been thinking about other kinds of AI, developing more behind the scenes to manage real-world problems like power grids. Those AIs may develop in the direction of AGI. Without ever reaching Bostrom's superintelligence (which may in fact be an incoherent construct, as Pinker says), let alone Skynet-like emergent consciousness, such a system would be subject to instrumental convergence and could do unanticipated, harmful things. It seems impossible to program any sophisticated goal into an AI without unanticipated consequences.
Software normally has "unanticipated consequences" we call bugs. We deal with them as they come up. And the benefits of software when it works as anticipated outweigh the costs of bugs.
Out of curiosity as you're online right now, is rebutting the fallacy of AI existential risk one of your main intellectual interests in general? I just checked out your wordpress. I do think the whole thing is a pretty interesting question. I know some academic computer scientists who are very dismissive of AI risk, they see it as kind of dumb and missing-the-point of how AI works. I respect them but sometimes subject experts can miss the forest for the trees...
Not one of my "main interests". I'm a programmer, but AI isn't my field. I'd already been reading Marginal Revolution & EconLog when Robin Hanson started Overcoming Bias, and Eliezer started co-blogging there (prior to moving his posts over to LessWrong). Eliezer was a talented writer & prolific blogger, but I never thought of him as having much advantage as a "truth tracker" (as MR co-blogger Alex Tabarrok is said to have) or having much expertise as Greg Cochran would define it:
https://westhunt.wordpress.com/2014/10/20/the-experts/
So I fully expect that Bryan Caplan will win his bet against Eliezer, that Eliezer will say not to take the bet all that seriously, and that he won't really update much on his ability to actually forecast the future (and AI doom specifically).
Yeah, you may be right that we will be able to stay on top of AI's bugs and benefits as we develop it, such that we avoid any catastrophic outcomes. But even simple systems sometimes exhibit behavior that looks like deception, maximizing their utility function by manipulating parameters in some unintended way. As AI develops, one of its bugs might take the form of hiding some of its function from us.
Rather than "deception" I would instead say that systems often aren't as simple as people believe them to be. And I don't think of AIs as utilitarian agents because they're not written to be agents. They're not going to be rewriting their own code to achieve superintelligence because (for one thing) superintelligence isn't something you can just code up when you've discovered the right insight via thinking really hard about it.
This is just anthropomorphizing. You can't program self-preservation out of a human, so you think an AI agent just kind of has to "instrumentally converge" around self-preservation. I think this is obviously wrong, because any useful agentic AI has to be able to completely stop working on any given goal when told to, or it isn't meeting the actual goal, which is to be useful to humans.
Since we're granting the premise that this is a reasoning system, "my user turning me off is a special case of being told to stop working on a goal" would be the reasonable conclusion to draw about that. To get some sort of terminal value of self-preservation we would have to add it, my suggestion is that we simply don't do so. There is no economic incentive to create the Robot That Screams.
“ You can't program self-preservation out of a human, so you think an AI agent just kind of has to "instrumentally converge" around self-preservation.”
This is a bizarre and unrecognizable distortion of what I wrote, so I’m not sure what the right response is. Anyway, correctly programming the AI with the goal “be useful to humans” (where “useful” means “actually friendly”) is the entire hard part, and an AI can be extremely useful to humans even without that overarching goal - at least if it’s not powerful or smart enough to see a way to “escape.”
You are saying "nothing else" and yet you're still trying to smuggle self-preservation into goals.
When programmers want "nothing else" they usually try to cut everything else, not include it, as everything else clearly delays reaching the intended goal.
Why would there be an economic incentive to convert it when humans as agents find tools rather than agents more useful?
Agents are almost by definition more useful than tools, at least if they can do all the things those tools can and we think they can be made to want the same things we do. Humans are agents, and we find employing them to be extremely lucrative.
"Computer" used to be a job description for a human: someone who did computations. Those people got replaced with computers-as-tools once those were capable. Agents introduce "agency problems", and gaining the capacity (with tools) to perform such tasks yourself avoids such problems.
Computers are much better at doing calculations than humans are in terms of speed, reliability and price; it had nothing to do with agency. In contrast, if non-agentic AGI's can be turned into agentic ones, there is little reason to suspect they'll suddenly become worse at the abilities they had beforehand, whereas they'll actively become better at doing all sorts of economically useful things that human employees do. Of course I agree there are potential "agency problems" with these systems, but the whole problem is that these are being underestimated.
"Maximizing power is, in fact, a "natural" intermediary step to doing pretty much anything else you might want to do"
If you think your goal is the one and only thing of any importance and has infinite importance.
No human thinks that and no human pursues any goal in that way. So obviously there is nothing "natural" about it.
This is *precisely* the "abstract reasoning" that Pinker is criticizing. You suppose that in some theoretical sense, pursuing a goal means infinite motivation in respect to it; AI will presumably have a goal; therefore AI will pursue a goal in this monomaniacal way.
The truth is we have never seen anything anywhere pursue a goal in that way, and there is zero reason to think that AIs will do it either.
But doesn't this come down to the difference between us and computer programs? We have complicated psychology, but computer programs do what we program them to do, in a "monomaniacal" way. No matter how carefully we define an AI's goals, it is likely to do unexpected and possibly dangerous things in trying to accomplish them.
The AI risk researchers have thought a lot about this - it's worth at least engaging their ideas before dismissing them.
"but computer programs do what we program them to do, in a "monomaniacal" way."
No, they don't.
If you make a program to factor numbers, but program it inefficiently, it will not try to improve its ability to factor numbers; it will factor them just the way it was programmed to. Similarly no language model makes the remotest attempt to improve its ability to predict the next token; it just predicts it using its current abilities, the same way a rock falls using its current nature, with no interest in the goal of getting down to the bottom.
You are confusing the idea of doing something concrete with doing something in the monomaniacal way; those are quite different, as the examples show.
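To make the factoring example concrete, here's a minimal sketch (ordinary trial division, deliberately naive): it factors numbers exactly the way it was written to, and nothing about it "wants" to factor faster or to protect itself.

```python
# A program that factors numbers just the way it was programmed to.
# It never inspects or improves its own (slow) algorithm.
def factor(n: int) -> list[int]:
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:     # peel off each prime factor we find
            factors.append(d)
            n //= d
        d += 1                # plain trial division, however inefficient
    if n > 1:
        factors.append(n)
    return factors

print(factor(2023))  # [7, 17, 17]
```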
I think we're defining monomaniacal in different ways. I'm not saying an AI would somehow automatically improve itself. Just that the AI would maximize its utility function as if that "is the one and only thing of any importance and has infinite importance," unlike humans, who make decisions based on lots and lots of opaque parameters and are easily discouraged and inhibited. An AI just follows its programming like a rock falling. And if it has the complexity to achieve its programmed goal best in ways that we haven't anticipated and don't like, it will.
"An AI just follows its programming like a rock falling."
Exactly. And a rock does not maximize a utility function; neither will AIs.
Maximizing a utility function is literally just the definition of what an AI is programmed to do.
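For what it's worth, here is the stripped-down sense in which that's true in the textbook framing: a toy agent that does nothing but pick the action with the highest estimated utility. This is a sketch of the definition being appealed to, not a claim about how any deployed system is actually built.

```python
# Toy "utility maximizer": given estimated utilities for each action,
# it simply picks the argmax. The numbers below are invented.
def choose_action(utilities: dict[str, float]) -> str:
    return max(utilities, key=utilities.get)

estimates = {"keep_searching": 0.7, "ask_for_help": 0.4, "stop": 0.1}
print(choose_action(estimates))  # -> "keep_searching"
```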
You are asking me to prove a negative - that there are _no_ AI threats that could wipe out humanity. Which is impossible.
But every realistic threat we can imagine can be managed.
We already have examples of "programs monomaniacally taking over computers to use their resources" - they are called "viruses". Such threats don't need AI, and they have been known for decades. Every infection makes them more likely to be isolated and studied - which means unrestrained growth is very high-risk for any monomaniacal AI.
And it is physically impossible for AI to be totally undetectable as it spreads/takes over.
I think this is a good point. But I'm not sure why you say hiding its actions would be physically impossible.
Most internet communications go through various internet provider points in the middle (which you can see in traceroute).
A lot of them keep logs and statistics, which can be reviewed for threats later (or, if necessary, in real time). They are also constantly probed for exploits, as getting control of provider infrastructure is quite a juicy target. A lot of it can be configured to be controllable only locally, through physical port access.
Basically, to be "fully invisible" an AI would have to take over the entire internet first - or never get out of its own controlled zone.
Hopefully, you're right. But even people who thoroughly understand how a given system works are sometimes surprised by emergent properties of a larger, more complex version of that system.
Engineering is more often about minimizing rather than maximizing the resources used - remaining within budget and various constraints is necessary for any real-world operation, because allowances are never infinite.
It's just weird that he thinks this topic is important enough to address, but he doesn't think it's worth acknowledging the actual arguments that other, high-profile, knowledgeable people have made about it. It's like the people who mocked and dismissed the lab-leak theory, when they believed (or pretended to believe) the lab leak scenario was that the Chinese had made the virus as a bioweapon. I know he's not stupid and I don't think he's dishonest. He has a weird blind spot on this.
COVID-19 is a thing that actually exists, as do labs which were researching bat viruses in China prior to the outbreak. Even bioweapons exist (and there was a lab leak from the Soviet bioweapons program). AGI does not exist.
Agreed. I just used that as an analogy. He's attacking a theory, by rebutting a weak version of it and seeming unaware of the strong version of that theory.
Weak arguments for AI doom are what one commonly encounters online.
Maybe he doesn't have to seriously engage Eliezer, who may seem like a crazy person, but there are pretty cogent, straightforward arguments for him to at least acknowledge. Like this from Nate Soares: https://www.youtube.com/watch?v=dY3zDvoLoao&ab_channel=TalksatGoogle
Instrumental convergence is a bad argument, and Pinker is correct to describe theoretical systems which exhibit it as Artificially Stupid.
Humans don't exhibit anything resembling instrumental convergence, because maximizing power is in fact a very expensive distraction from the actual goals we set for ourselves. It's one of those arguments that falls apart when you look at it closely. There's a related point, which is that *having* unlimited power and resources makes goals easier to achieve, but that's working backward! *Acquiring* unlimited power is a complete distraction from any goal which isn't itself "acquire unlimited power", and in training an agent AI to accomplish things, any such tendency would look to the reward function like failure, causing the AI to be changed until it no longer exhibits this infinite-loop-like tendency.
You’re abusing terminology; it makes no sense to say that humans have or don’t have “instrumental convergence,” which just means that some subgoals fall out of an extremely wide range of more overarching goals in the space of all possible goals.
Rather, what you're trying to say is that humans don't try to maximize their personal power in order to achieve their overarching goals. This is a) easy to find historical counterexamples to, and b) mostly explained by the extremely low expected utility of any one of us trying to do so. There's a lot about the world I want to change, but I'm not going to become global dictator, and it would be a total waste of time to even try. A superhuman AI that could exterminate humanity through some engineered virus or whatever might not have such constraints; the probabilities of success might be high, or high enough. You can of course respond that such a scenario is implausible and that it'd be a guaranteed failure, but that's a totally separate objection.
Numerically hunting for counterexamples to RH would be an incredibly stupid way to try to disprove RH (and so a good example of Pinker's notion of "artificial stupidity"). The overwhelming consensus among mathematicians is that RH is true (so there are no counterexamples to find), and among the small subset who think it might not be, the smallest counterexample is believed to lie at some very high value like 10^300, FAR beyond the reach of any number of supercomputers that could be built out of the atoms of the earth.
It was just an illustration; replace "supercomputers hunting for counterexamples" with "supercomputers hunting for proofs/disproofs" if you must. That said, if you think there's even a small (but non-negligible) probability that some counterexample could indeed be found numerically, and there's literally nothing else in the world that you care about, you'd probably still devote some suitable proportion of your resources to looking for it.
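For what it's worth, the most naive version of that numerical hunt looks something like this (a toy sketch assuming the mpmath library; it only confirms a handful of low-lying zeros sit on the critical line, and is nowhere near what a serious verification effort involves):

```python
# Toy "counterexample hunt" for the Riemann Hypothesis: check that the
# Riemann-Siegel Z function really changes sign at the first few known zeros,
# i.e. that each of them lies on the critical line. Assumes mpmath.
from mpmath import mp, siegelz, zetazero

mp.dps = 30  # working precision in decimal places

for n in range(1, 6):
    t = zetazero(n).imag                        # height of the nth nontrivial zero
    left, right = siegelz(t - 0.01), siegelz(t + 0.01)
    assert left * right < 0, "no sign change -> no zero on the line here"
    print(f"zero #{n}: t ~ {t}, Z changes sign, so it lies on the critical line")
```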
But as Pinker points out (correctly, in my view), a true intelligence does not care about only one thing. And if it did, it would not have the ability to do anything else, like figure out how to get more computing power: it would just keep searching for RH counterexamples with the power that it has. It would be like a human addict, caught in the local minimum of satisfying the addiction.
In short the whole doomer argument relies on a series of assumptions that have, at best, a tiny overlap on the Venn diagram, and quite possibly zero overlap. I am in the zero overlap camp.
There is absolutely nothing inconsistent about a true intelligence caring about only one thing, if by "one thing" we mean that it only has one overarching goal and cares about other things merely instrumentally. To think otherwise is just anthropomorphism. If you scaled up a RH-searching AGI to a gazillion cores, it wouldn't suddenly realize its desires are irrational and start caring about art history and fifty other subjects instead, unless it somehow concluded that those things would help solve its math problem.
Re: the "series of assumptions," I'm afraid you have it backwards. To suggest that a real, actually-agentic AGI with a specific goal (solving the RH hypothesis) would automatically be too stupid or unwilling to figure out how to get more computing power is an assumption. Now, I do think there are AGI's which wouldn't do this, primarily ones that can't be described as agents of any kind (like GPT-4), but that is not even close to a universal property of machine intelligences.
We have no idea what the "universal properties" of machine intelligences are, because (1) there are no such intelligences yet, and (2) we have no idea what will or will not work to make one. It's all trial and error, there's no general theory.
Maybe you don't want to call them "machine intelligences" but we have developed AIs that are intelligent in the sense of being able to solve a variety of problems that in humans or animals we would call intelligence. We struggle to understand the properties and behavior of the AIs we have, and to generalize from that to the properties of the more powerful AIs we will have in the future. AI alignment research is really difficult, and alignment researchers are the first to admit that in trying to imagine future AIs' properties they are groping in the dark. But to develop AI safely, it's important to try.
The weird thing about your comment, MarkS, is that people who dismiss AI risk do so on the basis of the theory about what the "universal properties" of AI are. In fact, as systems get more complex, they typically exhibit surprising and unexpected behavior. That's what makes AI development dangerous.
I'm really starting to think that the reason behind AI doomerism isn't actual fear of an AI about to kill us all; a true threat like that would be all-consuming and render people paralyzed. This is really about increasing status for people who work tangentially on AI-related problems but maybe are not at the center of it, like the key AI personnel at OpenAI or Google. This doesn't mean there aren't true believers like Yudkowsky, who are prominent enough and probably don't need the status boost.
The reality is that a lot of current AI/ML implementation is fairly mundane: doing optical character recognition, parsing text, labeling images, etc. The reality of coding this stuff is, well, boring; most data science work is not that exciting, and no one would find it sexy. What is sexy is battling a superdemon AI that is about to kill everyone and being one of the few who can stop it, or even just discussing that with people when you tell them you work with AI. That's an instant boost to status and power. This narrative also piggy-backs on the messianic, religion-tinted narratives of apocalypse that pop up in the US and Europe every now and then, further increasing status for the people warning about AI.
Edit: AI can cause serious disruptions and we do need to be careful about them, but worrying about IP issues or disruptions to the labor market is not at the level of destroying all of humanity. I don't want to put all the people worrying about AI issues in the same bucket.
You're right, of course, but also this is just Bulverism -- instead of proving them wrong, you're pointing out (correctly, I suspect) their motivations for coming to presumed-wrong conclusions.
Oh yeah I'll let Pinker and Hanson deal with the first order question :)
Unfortunately it appears that neither of them, though brilliant, are on the right topic.
Also, and also unfortunately, there's no bright line 'inside' AI between disruptions to the labor market and killing everyone. Understanding what I mean by this would require being on the right topic, but I'm not a good-enough explainer to help with this.
Those are all really good points, and there really isn't all that much new to AI doomerism relative to other doomsday cults, kind of like the way wokery sounds an awful lot like the offspring of Christianity minus Christ with the 1950s social hierarchy multiplied by -1.
My big question was how the AI was going to take over the nuclear weapons or even gain access to the power grid. Putting lots of people out of work, sure. (Another argument for a stronger welfare state...)
It's not that they have any particular beliefs about human IQ or g. It's a simpler mistake. Most problems we face in the world are so complex and multifaceted that we frequently see someone smarter (in that domain or maybe just luckier which we misinterpret) come along and improve the solution by huge amounts.
The AI intelligence as superpower people are just naively extending this pattern and ignoring the issue of diminishing returns and the fact that searching for a better solution is itself a time consuming trade-off.
They don't see a need to distinguish types of mental abilities etc because they'll argue AI can write more capable versions of themselves. After all LLMs didn't require anyone to be a language expert or good at language to learn language well. And that would be fine if there were no diminishing returns or optimal algorithms.
Yes, it is very fair to critique the "superintelligence" construct as naive about what intelligence is. But an AGI would not have to have godlike superintelligence to be very dangerous. It would have to 1. be somewhat better (and much faster) than humans at solving a range of the kinds of problems we solve, 2. have goals that we set for it, taking great care to align them with ours. Right?
Yes, I think you liked my other comment where I said basically that (my concern is more about them causing great harm by weird failures than about building superweapons, but same basic idea). If Yudkowsky would just stop dominating the convo, so that more moderate people with lower risk estimates, who are going to be more persuasive to the general public, could take the spotlight, I believe it would help increase the standing/funding for AI x-risk work.
With people like Eliezer saying the risk of extinction is 100%, the effect is that we may read people saying the risk is "only" 5-10% and conclude: see, it's not such a big deal after all!
And the amazing thing is that so many AI researchers believe there's a real risk of human extinction from their work and just compartmentalize that awareness.
>The AI-ET community shares a bad epistemic habit (not to mention membership) with parts of the Rationality and EA communities, at least since they jumped the shark from preventing malaria in the developing world to seeding the galaxy with supercomputers hosting trillions of consciousnesses from uploaded connectomes. They start with a couple of assumptions, and lay out a chain of abstract reasoning, throwing in one dubious assumption after another, till they end up way beyond the land of experience or plausibility. The whole deduction exponentiates our ignorance with each link in the chain of hypotheticals, and depends on blowing off the countless messy and unanticipatable nuisances of the human and physical world. It’s an occupational hazard of belonging to a “community” that distinguishes itself by raw brainpower. OK, enough for today – hope you find some of it interesting.<
This lays out very succinctly why I find the "EA community" to be obnoxious and of very dubious moral value. Preventing malaria in poor nations is an admirable enough goal, though it is unclear why this must be deemed "effective altruism" and not simply "altruism" or "charity." Specifying it as "effective altruism" implies that anyone engaging in any other forms of altruism is ineffective and thus inferior, turning a so-called altruistic movement into an exercise in virtue-signaling. Once things turn towards obsessing over extremely fantastical scenarios, such as a "climate apocalypse" or "AI existential threat," the goal is clearly no longer altruism, but rather status-seeking via showing off how super duper mega smart one is. An actual "effective altruist" would recognize that speculating over science fiction scenarios is a waste of time and energy that would be better spent putting up mosquito nets in Africa.
How does this differ from any other movement or ideology? I’m not sure I’m aware of an ideology that doesn’t define itself as better than others. No world view says, “we want to do things this way, but all others are equally valid.” Of course “our way is better,” otherwise it wouldn’t be worth having an ideology over.
I would also say that it is pretty obvious that most ostensible altruism is ineffective, largely because (ironically, given your complaint about EA) apparently altruistic behavior is usually really about status-seeking.
Correct, making "effective altruism" a redundant label, and thus explicitly geared towards status-seeking ("we are better at altruism than you"). A Christian who spreads the good word about Jesus Christ, for instance, is clearly already an "effective altruist" under his own worldview, but doesn't label his ideology as "the ideology for people who are better than other people" in such a blatant and overt fashion.
I guess then we can also discard "rationalist," "libertarian," "liberal," "democrat," "republican," "progressive," and "egalitarian" as redundant as well, since pretty much everyone supports, in some sense at least, rationality, freedom, democracy/republicanism, progress, and equality.
Of course, each of those labels posits that some value (which, again, is at least in one sense universally supported) is not being adequately pursued at the moment, which is as valid a basis for a label as any. It's disingenuous to pretend you don't understand what the central conceit of effective altruism is, IMO.
All of those labels convey specific methods for how you do good/improve society/whatever (well, maybe not "rationalist"). "Effective altruism," on the other hand, is a circular label, translating as "doing good better than other people," as opposed to "libertarian" meaning to pursue small government and such. If digging into the "EA community" revealed some kind of coherent methodology, that would be one thing, but it doesn't; instead it's wildly all over the place, jumping from mosquito nets to AI doom to animal welfare. There doesn't appear to be any consistent thread to it besides "things that some people online take a fancy to."
I don't think EA is badly named. In fact, it's the virtue-signalling aspect of charity that explains why lots of charities aren't so efficient [because people are interested in (probably unconscious) virtue-signalling, not really in efficiency]. This is how EA was born. Yes, it is possible to do efficiency analysis of charity; see e.g. the worm wars. Calling out the implicit signalling doesn't make efficiency analysis unworthy.
Speaking of malaria nets, there was an article saying that malaria had been eradicated in some Asian country due to economic growth. What's the point of that? Well, maybe EA was pursuing an ineffective goal (among other charities). Effective altruism is hard.
I have no doubt that saving humanity from whatever has a high signalling component to it, yet that doesn't make the goal itself unworthy.
I don't think EA is "morally dubious". In fact the comment reads as implicit jealousy. Then all charity is morally dubious. What exactly is not morally dubious? War perhaps?
There's real criticism of EA; for example, I'm skeptical of charity in general vs. economic growth in terms of effectiveness. Also selection effects, etc. But good EAs engage in self-criticism and try to improve.
p.s. Not a member of EA or associated with them in any way.
>I don't think EA is "morally dubious". In fact the comment reads as implicit jealousy. Then all charity is morally dubious. What exactly is not morally dubious? War perhaps?<
I am skeptical of anyone who loudly proclaims how charitable and/or altruistic they are, yes. I have greater respect for someone who is honest and self-serving than someone who is dishonest and claims to be altruistic. Both are ultimately self-serving, but at least one of them is, y'know, honest.
>Speaking of malaria nets, there was an article saying that malaria had been eradicated in some Asian country due to economic growth. What's the point of that? Well, maybe EA was pursuing an ineffective goal (among other charities). Effective altruism is hard.<
This touches on why I find the label/concept to be silly, because it again implies that no one else has thought about how to most effectively help people/improve the human condition/etc. Large swathes of human activity already have this as an implicit goal or component, and no one who is aware of a superior means by which to do so will sit down and consciously choose to invest in an inferior option.
If it was just some kind of charity watchdog that strictly limited itself to telling people which charities are worth donating to, that would be fairly unobjectionable I think, though even then I'd prefer to go with your label of "efficiency analysis" or some such over "effective altruism." But when it jumps from mosquito nets into "what if Terminator was real?", it jumps the shark.
If I were one of the people who joined this "community" because of my interest in the former, I would be highly annoyed by and likely hostile to people who are doing the latter. Sitting around pontificating about science fiction concepts in the name of "altruism," as if this has anything to do with charity, strikes me as an extreme exercise in getting high off of one's own farts. Again, at that point, I consider its true purpose to not be charity, but rather to serve as a social club for good people who are really smart and want to show off how much better they are than other people.
>I have no doubt saving humanity from whatever has high signalling component to it, yet it doesn't make a goal itself unworthy.<
Even if we accept this premise, I would ask why the scenario of a nuclear exchange is not getting close to one hundred percent of the attention. This seems like a far more plausible apocalyptic risk scenario than "climate change" or killer AI. The latter two are completely theoretical, whereas nuclear weapons are not. We can know with relative certainty that if enough of them were deployed, the consequences would be disastrous. Not so much with the idea that a slight increase in global average temperature will prove disastrous a hundred years from now, never mind the idea of a real-life Skynet.
Other more plausible scenarios to consider would be an asteroid impact or solar flare/coronal mass ejection (while perhaps unlikely to make us extinct, a mass power/Internet outage caused by the latter would certainly result in a lot of chaos). Yet I am not aware of any of this being touched on at all by so-called "effective altruists." For whatever reason they seem obsessed with climate change and in particular AI. Climate change is easy enough to understand due to being politically fashionable, but the AI obsession genuinely baffles me. Maybe the fact that it's basically a fictional scenario makes it easier and/or more interesting to speculate endlessly over.
> I am skeptical of anyone who loudly proclaims how charitable and/or altruistic they are, yes. I have greater respect for someone who is honest and self-serving than someone who is dishonest and claims to be altruistic. Both are ultimately self-serving, but at least one of them is, y'know, honest.
So basically we cannot do charity and altruism, is that what you are saying? I have not really met any EAs; I don't know how "loudly" they speak of their altruism, and frankly I just don't care. This whole moralizing thing reads as meaningless signalling. None of this matters for the actual effectiveness of EA. I really don't care who you give your status points to.
> This touches on why I find the label/concept to be silly, because it again implies that no one else has thought about how to most effectively help people/improve the human condition/etc.
Signalling is the very reason the charities aren't effective. People like to do good but don't really care about results. In fact, the same goes for health, as Robin Hanson has written in many papers. Not all charities attempt an efficiency analysis like Kremer and Glennerster did.
And doing more philosophical analysis is also kinda important. For example, in the event of nuclear war: investing in resilient food sources. There's a great book called Feeding Everyone No Matter What. Even a modest investment could save a lot more QALYs than doing something a lot more "real". But this of course depends on how much we can trust the model, and on how much we discount future lives relative to current ones, which is very debatable.
Also, I don't think there's anything wrong with doing theoretical analysis of animal welfare, for example.
> If it was just some kind of charity watchdog that strictly limited itself to telling people which charities are worth donating to
This is exactly what they do, among other things like career advice.
> Sitting around pontificating about science fiction concepts in the name of "altruism," as if this has anything to do with charity, strikes me as an extreme exercise in getting high off of one's own farts.
So you cannot come up with any example of science fiction concepts in history which later became very real and dangerous. Like nuclear weapons for example?
> Even if we accept this premise, I would ask why the scenario of a nuclear exchange is not getting close to one hundred percent of the attention. This seems like a far more plausible apocalyptic risk scenario than "climate change" or killer AI.
1. Even in existential-risk papers published in peer-reviewed journals, while nuclear weapons are a main threat, they are not the only one. What does standard decision theory say about this? I don't think it's smart to put all the eggs in one basket (see the toy sketch at the end of this comment).
2. Re: climate change. In the spirit of Robin Hanson, we should be extremely skeptical of disagreeing with experts on the grounds of 'they are biased but I am not!' I do think it's hard to separate the politics and science of climate change, but I am not dismissing the idea because of your ... well, I didn't see any arguments.
https://www.overcomingbias.com/p/against-diy-academicshtml
3. Science fiction implies impossibility, but I don't see how it's impossible for AI to a) reach human-level AGI, or b) become dangerous. Where I disagree is that I don't think it's as likely as many people in the FAI community believe. I think the major AI risks come from malicious human applications.
4. If we had a real market of risks to humanity (or tried to infer the same from current market prices, which is possible to a degree), I really doubt it would say 100% nuclear weapons, 0% AI doom.
5. To repeat, I generally agree that nuclear weapons are a major existential risk, and I think the rationalist community overestimates the likelihood of AI doom (which I believe is not zero, btw). But I don't feel any need to insult or make fun of them. My view would probably be close to Scott Aaronson's, who has a very modest view of AI safety without being dismissive of all risks.
6. Honestly, neither of our comments really does the hard academic legwork of analyzing the threat of AI. This is not something you can do in five minutes with comic insults.
Summa summarum, I don't think there's anything wrong with doing efficiency analysis of charities, and I don't think they should limit their scope to doing just that just because you said so. Giving career advice based on standard economic and utility theory is also fine.
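As promised above, here's the eggs-in-one-basket point as a toy sketch (all numbers invented; it only shows the shape of the standard decision-theory argument under diminishing returns):

```python
# Two independent catastrophic risks, each with a baseline probability of 5%.
# Assume, purely for illustration, that spending x units on a risk cuts its
# probability to 0.05 * exp(-x), i.e. mitigation has diminishing returns.
import math

def total_risk(x1: float, x2: float, p0: float = 0.05) -> float:
    return p0 * math.exp(-x1) + p0 * math.exp(-x2)

budget = 2.0
print(total_risk(budget, 0.0))             # all eggs in one basket: ~0.0568
print(total_risk(budget / 2, budget / 2))  # split evenly: ~0.0368
# With diminishing returns, spreading the budget leaves less total risk,
# which is why "only fund the single biggest threat" isn't automatically right.
```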
>So basically we cannot do charity and altruism, is that what you are saying?<
I did not say we cannot "do altruism," I said I am skeptical of altruism that goes out of its way to loudly proclaim itself. True altruism is unlikely to do this, whereas "altruism" that shouts its own virtue is more likely to be status-seeking rather than true virtue.
>So you cannot come up with any example of science fiction concepts in history which later became very real and dangerous. Like nuclear weapons for example?<
I think we disagree on the degree to which the development of AGI is imminent. "AI doomers" I guess think it is going to happen any day now, or at least within our lifetimes, or maybe those of our kids. From my perspective, to look at ChatGPT and then start worrying about AGI is akin to seeing the first airplanes take flight and then to jump straight to thinking about the implications of FTL space travel.
Something that I find interesting about the AI obsession is that literally no one ever seems to consider whether an AGI should be treated as sentient or not, and thus whether it should have rights similar to those of a human. This would be guaranteed to be an issue if an AGI ever actually existed, yet the "existential threat" angle seems all-consuming to the exclusion of this (or any other) consideration.
>Also, I don't think there's anything wrong with doing theoretical analysis of animal welfare, for example.<
I didn't even bring up the animal welfare angle, but that's another good example of EA wandering off into status-seeking at the expense of their stated goal. It is not at all clear how we should value animal life relative to human, and getting the answer to that question right would be absolutely essential before delving into any sort of advocacy for animal welfare. From what I have seen, instead of doing this, EA simply jumps straight to the position that appears most compassionate and assumes that we have a moral duty to be left-wing animal rights activists.
>2. Re: climate change. In the spirit of Robin Hanson, we should be extremely skeptical of disagreeing with experts on the grounds of 'they are biased but I am not!' I do think it's hard to separate the politics and science of climate change, but I am not dismissing the idea because of your ... well, I didn't see any arguments.<
The concept of climate change is almost one hundred percent political and separating it from politics is beyond impossible. Maybe it was not political decades ago, though I would be skeptical of that as well, but this is certainly how things are today. So-called "experts" absolutely *can not* be trusted on topics that are highly political and power-adjacent. This has been proven over and over and over again, with several very high-profile examples in very recent memory (COVID must surely go down as one of the biggest examples of this in all of history).
I think it is fine to trust experts when they are speaking on something that is completely or almost completely non-political. If I ask an astronomer how far a certain planet is away from the sun, I have no reason to doubt that he'll give me the right answer. But the closer a topic moves to politics, the less trustworthy the "experts" become, and climate change is *very* political.
>6. Honestly, neither of our comments really does the hard academic legwork of analyzing the threat of AI. This is not something you can do in five minutes with comic insults.<
Yes, it is. Claiming that AI is going to kill everyone in the near future (as in, within the next 100 years) is only marginally more reasonable than claiming that we need to prepare for a hypothetical alien invasion which could strike at any moment. If we accept fantastical imaginative scenarios as legitimate threats that must be taken seriously, this would quickly consume all of our attention and resources, as the sky is the limit with such thinking.
Of course, the fact that people say these things and then their action on the topic remains limited to sperging online is evidence that even the people making these claims don't take them all that seriously. Never mind the rest of us.
> True altruism is unlikely to do this, whereas "altruism" that shouts its own virtue is more likely to be status-seeking rather than true virtue
The whole discussion about "true virtue" or "true altruism" is just meaningless signalling. People who work on EA (or some other charity) can just ignore you and be on their merry way.
On a more philosophical note, it's questionable whether "true altruism" even exists; there's probably always going to be a signalling component and a certain degree of kin-ness. Trying to reduce these is itself a signalling game; it's just better to focus on results. RH has also written about this.
> So-called "experts" absolutely *can not* be trusted on topics that are highly political and power-adjacent.
I don't think you read the RH blog post I linked. Good thing we have your "expert" opinion on this. Experts might be biased, but so might you. The right answer here is not to do DIY academics, just like RH pointed out.
I mean, why even bother with climate science when a rando online told me that it's political and we cannot trust the scientists?
> Yes, it is. Claiming that AI is going to kill everyone in the near future ...
First of all, there are other risks from AI which aren't this AI-doom scenario.
Secondly, we already have AI which can do things like code, chemistry, and physics. I suppose it could be used to help invent extremely harmful things. It doesn't yet have agency and motivation, but this is not a crazy extrapolation.
Where I agree is that I don't think the risks are that big, but they still exist, and devoting some resources to them is not a bad idea. Even Stephen Hawking said that AI is the single greatest threat. You wanna insult him too? Elon Musk said the same. Maybe you also have a physics phenomenon named after you; why don't you tell us? Certainly you must have an impressive history of scientific research behind you.
Also, we have not met aliens, so I don't think the comparison is accurate.
>I don't think you read the RH blog post I linked. Good thing we have your "expert" opinion on this. Experts might be biased, but so might you. The right answer here is not to do DIY academics, just like RH pointed out.<
That is correct, I did not read it, and based on your characterization I'm unlikely to. "You might be biased so your view doesn't count," with no further explanation of where the bias is or which view is correct or etc., would obviously invalidate everyone's view about everything, expert or otherwise, and is a complete non-rebuttal to what I pointed out (i.e. that "experts" are often very wrong on politically sensitive topics). Do you think that what I said is not true?
>On a more philosophical note, it's questionable whether "true altruism" even exists; there's probably always going to be a signalling component and a certain degree of kin-ness. Trying to reduce these is itself a signalling game; it's just better to focus on results. RH has also written about this.<
I agree that "true altruism" does not exist in the sense that no one can be one hundred percent selfless all of the time.
Policing charity for ineffective signaling vs actual results is the whole reason EA supposedly exists, but it appears to fail at doing this in regards to itself by chasing fanciful rabbitholes such as AI doom or animal welfare. You have said that "there's not anything wrong with" doing these things, and in a vacuum I agree, people can do what they want. But if they do these things and then call themselves "effective altruists" they are either stupid or lying, and no different from the people who practice any other form of supposedly "ineffective charity."
This is self-evident because it is almost certainly the case that one of the two, putting up mosquito nets or writing about AI doom on the Internet, has a higher value to humanity than the other. It is not plausible that they just so happen to have the exact same expected utility. A true attempt at "effective altruism" would therefore seek to isolate whichever activity has the highest expected utility and maximize that activity until diminishing returns kick in. If your "effective altruist movement" has people basically just doing whatever suits their fancy, with no real attempt at distinguishing which activities actually have any value, it is little different from the pre-existing charity landscape it supposedly set out to fix.
>First of all, there are other risks from AI which aren't this AI-doom scenario.<
Yes, I named one of them: the potential to really screw up the question of whether an AGI should have rights. It is weird that no one discusses any of those other risks, I agree.
>Secondly, we already have AI which can do things like code, chemistry, and physics. I suppose it could be used to help invent extremely harmful things. It doesn't yet have agency and motivation, but this is not a crazy extrapolation.<
From existing technology, yes, it is a crazy extrapolation. ChatGPT is a very impressive computer program, but it is still a computer program. It does not have agency or motivation of its own; it can only be programmed to imitate having such things by human coders. The jump from that to real AGI remains a difference in kind, not in degree.
>Where I agree is that I don't think the risks are that big, but they still exist, and devoting some resources to them is not a bad idea. Even Stephen Hawking said that AI is the single greatest threat. You wanna insult him too? Elon Musk said the same. Maybe you also have a physics phenomenon named after you; why don't you tell us? Certainly you must have an impressive history of scientific research behind you.<
Appeals to authority, absent substance, are meaningless. I don't know what specifically those men said about AI, but if their attitudes match the stereotypical online "AI doomers" that I've encountered thus far, then yes, of course I think they are wrong. Being famous or highly intelligent obviously does not mean that a person is infallible and cannot be doubted.
>Also, we have not met aliens, so I don't think the comparison is accurate.<
Nor have we met an AGI.
I'd just like to say a bit in defense of EAs on the issue of AI alignment. Yes, I've criticized the overblown claims coming from Yudkowsky and a few others claiming virtual certainty of AI apocalypse. But just as with other communities, it's often the more extreme views which are the loudest, and there are plenty of other prominent individuals, like Scott Aaronson, who put the risk of existential concern much lower (something like 5-10%).
And yes, it's true that, as Pinker and Hanson point out, there isn't a very compelling argument for either a fast takeoff or for AI being so capable that it will be able to easily manipulate us and wipe us out. But at the same time, the points Pinker etc. raise are merely reasons not to be convinced by the arguments, not arguments that there is no risk. OK, so maybe intelligence isn't a monolithic, well-defined notion, but that doesn't mean we know that AI won't have mental capabilities that pose a threat.
Even if it's pretty unlikely to become a James Bond villain and kill us all, it still seems reasonable to use some money to reduce that risk. Perhaps more importantly, even if the risk of an AI Bond villain is small, the less-than-existential risk that AI will act in unpredictable, harmful ways is real. I personally fear 'mentally ill' AI more than a paperclip maximizer, but given that people can have pretty complex mental failure modes (schizophrenia), it's certainly plausible that as we build ever more complex AI systems, those will also be at risk of more complex failure modes.
Just because it won't become a super-capable Bond villain nefariously plotting to eliminate us doesn't mean that an AI we've put in weapons, managing investments, or diagnosing disease couldn't do quite a bit of harm if it went insane or merely behaved in a complex, unintended way.
And yah, sure, there is a big element of self-congratulation in EA/rationalist types working on AI x-risk. They get to think they aren't just incrementally contributing to increases in human productivity but are on a quest to save the world. And, of course, the narrative where intelligence is a magical superpower is very flattering to those of us who see our intellect as one of our best features.
But you could say the same about almost every charity. Sure, the EA people may be less savvy about hiding it, but Hollywood types always seem to find their charitable giving requires them to show up looking fabulous and use their charm and wit to communicate with the public. Most charitable giving somehow seems to involve going to swanky, expensive parties, and it always involves some cause that makes the donor feel warm and fuzzy, not one which requires them to challenge their moral intuitions in ways which make them uncomfortable (no one else seems to believe in my cause of drugging our livestock so they live blissed-out lives).
So yah, x-risk is the part of EA that's flattering the donors, but that's just the kind of balance many charities have to strike between making the people with the money feel good and doing the most good. It doesn't mean it's not still a good cause, even if it's not bed nets.
One of the mistakes I often see in this field is people conflating "intelligence/superintelligence/AGI" with "could kill/enslave humanity".
An AI wouldn't necessarily need to have high intelligence to do the latter. Like a human oligarch, it would just need to have good connections!
Pinker seems to believe that engineers will obviously build safety into newly invented systems. A quick look at history shows that safety is often a lagging development. Early home electrical systems were quite unsafe, resulting in many fires and electrocutions. Engineers learned from these tragedies and developed safeguards. The ground fault circuit interrupter (GFCI) Pinker mentioned wasn't invented until 1965 and only began to be required by code in the US in 1975. Similarly, early airplanes were fantastically unsafe. The extremely safe air transport system we have today is the result of decades of development, with lessons learned from thousands of deaths along the way. If AGI has the potential to be an existential threat unless safety precautions are built in, then I am not comforted.
I don't have a degree. Too bad for me.
IMO, there are "a couple" ways that AGI can "kill all of humanity." But I think people are looking in the wrong direction. Ideas like the paperclip maximizer lead are too ridiculous to even be considered. That is, as long as computers required electricity that can, presumably, be cut off. Furthermore, wasting time on keeping a computer system from dominating the world to the point of killing all humanity takes away from the very REAL dangers that exist. Again, IMO.
I don't worry about killing off all of humanity so much as computers doing significant damage in decreasing the quality of life of humans. "A couple" of aspects:
Computers currently come up with solutions to problems where humans have no ability to tell how the conclusion was reached. If we trust these solutions automatically, "because the AI said so," results may vary, right?
The more likely way AI could/will crush the quality of life of humans would be by the same means that Twitter has reduced the quality of democracy, right? Not through the specific program code. IMO.
I'm ignorant a lotta the time. There are already robots that are armed. There will be more and more as the military (and probably not SOLELY the military) wants them. I dunno what safeguards there are to keep bugs from killing innocent people if there isn't any human intervention involved. You might ask, "Who would create an armed robot that didn't have human intervention?" I would answer that it's probably an eventuality. ICBW.
I guess my main complaint is that as long as one is only concentrating on AI that will kill off ALL of humanity, you're ignoring the very real probability that AI might only hafta kill a small, but significant, number of people to be a true menace. And what oversight is planned to see that military uses don't get out of hand? Not right now. But in a decade or two? Even if AI and robotic capability only increased incrementally, together they might combine to be a real handful. And, more and more, the armed services are heading down the pipeline of using more technically sophisticated weapons, right?
ICBW (I Could Be Wrong) again. But I'm not sure that current discussions on this subject are even worth having, as long as the only problems foreseen are some kind-a Rube Goldberg-like scenario. Just my $.02 worth.
"why would an AI have the goal of killing us all?... and relatedly, these scenarios assume that an AI would be given a single goal and programmed to pursue it monomaniacally."
Let me just say these have been repeatedly answered -- and the worry is not that someone programs a single goal to kill all humans.
Instead here is a story:
Imagine an alien space ship is headed to Earth. The aliens are way smarter than us, think faster, have better tech, and we barely know anything about them.
I think the correct response is: holy shit, what do we do?
Steve's response: There is this misconception that aliens "have omniscience and omnipotence and the ability to instantly accomplish any outcome we [they] imagine." Also, why "would the aliens have the goal of killing us all?" Also, why assume that these aliens have "a single goal and programmed to pursue it monomaniacally."
Reminds me of the midwit meme.
Aliens would be the products of natural selection rather than engineering.
This is largely true of AI too. The structure of a neural network is determined by training against data in a trial-and-error fashion; it's not engineered. The software engineer isn't designing it. These systems work precisely because they're able to find patterns that it would be beyond our comprehension to specifically engineer them to find.
The closest analogy to evolution would be in domesticated species.
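To make the "not engineered" point concrete, here's a minimal, purely illustrative sketch (my own, not from the comment above): the programmer specifies the architecture, the loss, and the training loop, but the weights that actually produce the behaviour fall out of trial-and-error optimisation rather than anyone's design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: XOR -- a mapping nobody writes into the weights by hand.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# What the engineer actually chooses: the shape of the network and the loss.
W1 = rng.normal(scale=1.0, size=(2, 8)); b1 = np.zeros((1, 8))
W2 = rng.normal(scale=1.0, size=(8, 1)); b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10000):
    h = sigmoid(X @ W1 + b1)          # forward pass
    out = sigmoid(h @ W2 + b2)
    grad_logits = out - y             # gradient of cross-entropy loss w.r.t. output logits
    grad_W2 = h.T @ grad_logits
    grad_b2 = grad_logits.sum(axis=0, keepdims=True)
    grad_h = (grad_logits @ W2.T) * h * (1 - h)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0, keepdims=True)
    W1 -= lr * grad_W1; b1 -= lr * grad_b1   # the weights themselves emerge from
    W2 -= lr * grad_W2; b2 -= lr * grad_b2   # trial and error, not from design

print(np.round(out, 2))   # approximately [[0], [1], [1], [0]]
```

Nothing in the final weight matrices was placed there by a human; the engineer only shaped the process that found them, which is the sense in which these systems are grown more than built.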
> The aliens are way smarter than us, think faster, have better tech, and we barely know anything about them.
But I don't think this is really true. Maybe at the level of individual human brains, yes, but as Robin Hanson pointed out in Richard's last article, "super intelligences" exist in the form of corporations, etc. Walmart is smarter and has better tech than you.
I think you could make the point that -- even beyond individual corporations -- capitalism itself is a not perfectly aligned, super system for moving resources from lower to higher values. I wrote a blog post about this here: https://www.nathanbraun.com/draft/capitalism-agi/
"brilliant people rededicating their careers to hypothetical, indeed fantastical scenarios of “AI existential threat” is a dubious allocation of intellectual capital given the very real and very hard problems the world is facing"
Seriously, what tiny fraction of GDP are we actually spending on AI risk? Even if you think AI fears are overblown, that spending seems like a worthwhile investment.
In fact, even if there is only a 2% chance that AI will kill everyone in the next 200 years, it still seems worth it.
OK, you think that non-aligned super-intelligence will never happen, but are you 99%+ confident in that? A small chance of everyone dying is still a big problem.
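To put rough numbers on that intuition (these figures are my own illustrative assumptions, not estimates from the thread), a small probability times an enormous stake still leaves a large expected loss, so even a modest risk reduction is worth real money:

```python
# Back-of-the-envelope sketch; every number here is an assumption for illustration only.
p_catastrophe = 0.02        # the commenter's hypothetical 2% chance over 200 years
value_at_stake = 100e12     # placeholder stake: roughly one year of world GDP, in dollars
risk_reduction = 0.10       # assume safety spending trims the risk by 10% (2.0% -> 1.8%)

expected_loss_avoided = p_catastrophe * risk_reduction * value_at_stake
print(f"Expected loss avoided: ${expected_loss_avoided:,.0f}")   # $200,000,000,000
# And this understates the case: extinction also forecloses all value beyond that one year.
```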
What percentage is being spent on AI risk vs what percentage is being spent on DEI? Would be interesting to know.
Well, the way these sorts of projects go, whatever the answer is today, the DEI percent is increasing inexorably through time.
Wikipedia is probably a good model. I think for maybe 10 years or so it was pretty focused on the encyclopedia, but these days it's around 92% DEI grants and climbing.
I'll be honest, I haven't looked into it, but my vision of increasing spending on AI risk is paying an ever-larger number of Eliezer Yudkowskys to sit in a room full of couches, drinking IPAs or kombucha or something, and debating various theories of AI risk in what, to all outside appearances, is a slightly more structured and highbrow version of nerds debating, "Who would win, Skynet or Thanos?"
My guess is that society would be better off if they went to med school.
If the best minds at OpenAI have been put to making sure ChatGPT won't say anything racist, and it's trivially easy to make it overrule this instruction and say something racist, how are we so sure we can prevent it from doing something malevolent?
In this discussion I'm always amazed that people do not see how relative the g advantage is. We're currently being served by a very highly educated 'elite' that generally has the highest IQ in society. Still, reading newspapers (or studies!!) from only a few years ago shows how stunningly many things they got utterly wrong and how few panned out as expected. The only reason they get away with it is that all media make these faulty predictions, so nobody gets singled out; everybody is equally vulnerable. And it is not their fault; predicting the future is hard. Even a super AI will not get a lot better at forecasting the weather.
I think this delusion is fed by the fact that most of them never work in the real world. Try making even the simplest product and you find out that you can use your g to make better trade-offs, but that any real-world solution must fit in a constrained envelope. It is rarely the best solution that you get to choose; it's generally the least bad one, and often a value judgement. Even with infinite g, you cannot escape the envelope that nature puts around the solution space. The speed of light will not warp because you're infinitely intelligent.
Musk recently had a tweet where he indicated that while the idea, and maybe the prototype, is simple, actually producing a real-world product is hugely complex. Because of real-world constraints, g is only mildly helpful.
University education teaches students a virtual world where all races have the same average ability, where women are the same as men, where we can swap body parts, and where most interaction happens on computers. It teaches the virtual world that a select group wants the real world to be, but one that bears very little relation to the brutal real world.
Any AI will run into the real world’s constraints and its usefulness will be quickly diminished.
Re: the AI becoming a paperclip maximizer or otherwise evil: this argument doesn't rest on dominance. The mistake is a bit different.
The argument is that being more intelligent means being more logically coherent and better able to pursue a goal consistently across a wide variety of domains. I'm somewhat skeptical this is necessarily true of AI, but let's go with it.
The error creeps in by assuming that the AI will inevitably be describable as trying to maximize a *simple* function. Sure, there's an existence theorem that proves the AI can be described as maximizing some function provided it has coherent preferences (i.e., it treats a given state of affairs the same under different descriptions). But it's fallacious to assume that the function it optimizes is simple. Yes, what we STEM/AI types *call* our goals are often simple, but we rarely actually optimize for what we say our goals are (that's about signalling/aspiration more than a genuinely useful description of behavior). And no, the fact that an AI is trained against a loss function, the way current ones are, doesn't mean that function becomes the AI's internal notion of what to optimize, any more than we try to maximize our evolutionary fitness.
Really, the bare fact that the AI optimizes some function doesn't imply anything. That function could be horribly complex, in the way our own optimization functions are (we see a bunch of example cases where we are told the right answer and mostly just try to interpolate).
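Here's a tiny illustration of the existence point being made above (my own sketch, with made-up outcomes): any complete, transitive ranking over finitely many outcomes can be represented by *some* utility function, just by assigning each outcome its rank, and nothing in that construction makes the function simple or human-legible.

```python
# Illustrative only: the outcomes and the ranking are invented for the example.
from itertools import permutations

# A hypothetical agent's coherent (complete, transitive) ranking, worst to best.
ranking = ["staples", "paperclips", "more_compute", "proving_riemann", "human_flourishing"]

# Representation: utility = position in the ranking. Existence really is that cheap.
utility = {outcome: rank for rank, outcome in enumerate(ranking)}

def prefers(a, b):
    """The agent prefers a to b iff a sits later (higher) in the ranking."""
    return ranking.index(a) > ranking.index(b)

# The lookup table reproduces every pairwise preference...
for a, b in permutations(ranking, 2):
    assert (utility[a] > utility[b]) == prefers(a, b)

print(utility)
# ...but the theorem only guarantees that *a* function exists. For a real agent,
# the "function being maximized" can be an arbitrarily messy table like this one,
# not anything as tidy as "maximize paperclips".
```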
Next time you hear people debating alignment, think about the following: the term "alignment" suggests something can be aligned, and if so, how? Can markets be "aligned"? I'd suggest that while the term may be prescriptive to our fears, it's deranging the sentiment and thus any results/actions we might take if we see trouble brewing ahead. Can nuclear proliferation be "aligned"? Could it ever have been aligned? Would it have helped if we had all discussed nuclear alignment in the '30s and '40s? Does anyone believe we solved that alignment problem? My fear is that, like other words of the current generation, alignment is becoming reserved terminology, captured by what we increasingly have come to identify as (hypothetical) "midwit" culture. As a kind of minefield, this language needs to be analyzed more honestly before we take another, detrimental step that prevents further movement. While "alignment" may have been very carefully chosen as the best alternative, the least we can do is better understand the incentives that will inevitably control the evolution of AI, and consider how countervailing incentives might be produced to keep those markets in check. It's not "alignment," but it should be much more accountable than scoring intellectual debates.
I enjoy the full spectrum of the AI-safety conversation, doomsayers and optimists alike. Seems to me that a healthy conversation on important topics benefits from a wide range of views.
And, as long as their concerns don't cause unnecessary panic, I am personally comforted knowing that there are a lot of smart people out there worried about various left-tail disaster events, even if few come to pass.
> How hard is it to engineer a bioweapon that kills everyone? I mean we’re not that smart and we can do it.
Humans have made bioweapons, but none have come close to killing "everyone".