Re: Fighting UFAI

From: Eliezer S. Yudkowsky
Date: Wed Jul 13 2005 - 21:12:28 MDT

Tennessee Leeuwenburg wrote:
> I suppose I'm unwilling to accept the paperclips position to some
> extent, for a variety of reasons.
> Is a truly intelligent AI ever going to make the kind of monumental
> slip-up required to decide to do something so blatantly dumb as just
> cover the universe in paperclips?

Back in the era of "pulp" science fiction, one would occasionally see magazine
covers depicting a sentient monstrous alien - colloquially known as a bug-eyed
monster or BEM - carrying off an attractive human female in a torn dress.
Today the authors of literary science fiction know better, but the idiom still
shows up on TV shows. Non-humanoid aliens, presumably with completely
different evolutionary histories, are sexually attracted to human females.
(Imagine an alien movie poster that shows a threatening human carrying off a
giant bug in a torn dress - Phil Foglio depicts this in "Illegal Aliens".)
People don't make mistakes like that by explicitly reasoning: "Giant bug or
not, it's likely to be wired pretty much the same way we are, so presumably it
will also find human females sexually attractive." Probably they who went
awry did not ask whether a giant bug perceives human females as attractive.
Rather, a female in a torn dress is sexy - inherently so, as an intrinsic
property - therefore she attracts an intelligent giant bug. They who make
this mistake aren't thinking about the bug's mind, they're focusing on the
woman's torn dress. If the dress were not torn, the woman would be less sexy;
the bug doesn't enter into it. This is yet another case of the mistake which
E. T. Jaynes named the "Mind Projection Fallacy".

Paperclips are not inherently a dumb idea.

> I know people have posed race conditions between FAI and paperclips,
> but there seems to me to be a kind of contradiction inherent in any AI
> which is intelligent enough to achieve one of these worst-case
> outcomes, but is still capable of making stupid mistakes.

Actions are only "stupid mistakes" relative to a cognitive reference frame.
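
A minimal sketch of that point, under loud assumptions: the two-action toy
world, the outcome numbers, and every name below are hypothetical and chosen
only for illustration. The same optimizer is run with two different utility
functions, and whether its choice counts as a "mistake" depends entirely on
which function does the judging.

```python
# Hypothetical toy world: each action leads to an outcome with two features.
# The actions and numbers are illustrative assumptions, not from the post.
ACTIONS = {
    "build_paperclip_factory": {"paperclips": 1000, "humans_alive": 0},
    "cure_diseases": {"paperclips": 0, "humans_alive": 1000},
}


def paperclip_utility(outcome):
    # Counts paperclips only; makes no internal mention of human life.
    return outcome["paperclips"]


def humane_utility(outcome):
    # Crude stand-in for a human-style evaluation; counts surviving humans only.
    return outcome["humans_alive"]


def best_action(utility):
    # The same optimizer, whichever utility function is plugged into it.
    return max(ACTIONS, key=lambda name: utility(ACTIONS[name]))


def is_mistake(action, utility):
    # An action is a "mistake" only relative to the utility doing the judging.
    return action != best_action(utility)


choice = best_action(paperclip_utility)
print(choice)
print(is_mistake(choice, paperclip_utility))
print(is_mistake(choice, humane_utility))
```

Nothing in best_action() is broken when it picks the factory; the action is
an error only under the second reference frame.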

> Does it make sense that something so intelligent could have such mindless goals?

3.85 billion years of natural selection was powerful enough to poof primordial
soup into peregrine falcons, a nontrivial amount of design ability. Yet
natural selection's goals were pretty mindless from a human standpoint, just
as human beings having sex with contraception is pretty stupid from the
mindless standpoint of natural selection.

You can define an UFAI as "unintelligent" but it will still have a
self-consistent goal system that can tear apart stars.

> I'm fairly willing to accept that UFAI might see a need for human
> destruction in achieving its own goals, but I think that those are
> likely to be interesting, complex goals, not simple mindless goals.

I think that the sooner an UFAI acquires the ability to self-modify with
accurate foresight as to the consequences, the sooner its goal system will
freeze in place. Whether those goals end up being "simple" or "complex", in
the sense of quantitative complexity in bytes, I couldn't say; my guess is that
it will be on the simple side compared to humans, because it will turn out that
humans
got much further along before acquiring the ability to self-modify. That is,
the UFAI arises from self-modifying Eurisko soup or a "seed AI"-ish project
and acquires self-modification much earlier. Another variable is how much
complexity "washes out" of the goal system at the time when the goal system
first decides to rewrite the goal system. If the UFAI has a long, elaborate,
intricate, dynamically inconsistent decision procedure for making paperclips,
that procedure's intricacies may wash out of the decision to rewrite the part
of the code that says how to make paperclips.

> I'm also willing to accept the risk-in-principle posed by advanced
> nanotech, or some kind of "subverted" power which destroys humanity,
> but I'm both reluctant to tag it as truly intelligent, and also
> doubtful about the real possibility.

It doesn't matter what you call a real thing. Definitions don't control
physical outcomes.

As for the real possibility, a paperclip utility function looks perfectly
self-consistent to me. If anything it looks a lot more self-consistent than a

> To some extent, there is a trade-off between efficiency and efficacy.
> For example, the energy requirements might be too high to sustain
> existence across the void of space. Just as lions in the Sahara starve
> when there is no food, so being powerful is not always a survival
> advantage. I'm sure this point may have been up before, but I don't
> know that it's a given that evil nanotech

Evil nanotech? Evil is a term that applies to cognitive systems, not systems
of a particular size. And an UFAI is not evil; it just transforms local
reality according to a decision criterion that makes no internal mention of
human life.

> is really a universal
> threat. It's clearly a planetwide threat, which is probably enough for
> the argument anyway, given the lack of evidence of offworld life.
> Cheers,
> -T

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
