Strong-willed AIs [was: continuity of self]

From: Eliezer S. Yudkowsky
Date: Sun Sep 15 2002 - 18:38:38 MDT

Ben Goertzel wrote:
>> Stealing the first workable nanotech is one thing. "Stealing" a
>> general AI is a bit different from stealing its physical hardware.
>> AI is not a tool. It is a mind. Moving an AI from one place to
>> another doesn't make it a tool in your hands, any more than moving
>> Gandhi from India to Germany causes him to become a Nazi. Now of
>> course potential thieves may not know that, but if so, failing to
>> keep track of the distinction ourselves is hardly conducive to
>> enlightening them.
> Well, there is a difference between humans & general AI's in this
> context.
> If one has a general AI that does not have the power to protect itself
> from being stolen (having the hardware it's running on moved to a
> different place; having its software copied and replicated elsewhere,
> etc.), then one probably has a general AI that can't protect itself
> from having its code rewritten, its mind force-fed with false
> knowledge, etc.
> Brainwashing a kidnapped human is a difficult and sometimes probably
> impossible task. Brainwashing a kidnapped general AI may be much easier
> if the kidnappers are expert programmers and computer/cognitive
> scientists and have some understanding of the AI design.

I remind you that we visualize considerably different orders of complexity
for general AI, as well as considerably different self-modification
architectures. I think this may be responsible for our difference of
opinion in this respect.

I think it would be extremely difficult to modify a nontrivially advanced
AI against its will, even if the AI has not deliberately obfuscated itself
in any way, and even if the AI has not deliberately protected itself from
unfriendly modification. For example, as I understand AI, I don't see how
you could possibly "force-feed an AI with false knowledge" unless you'd
already brainwashed it into operating the belief-formation system under
your direction.

It seems to me that a seed AI, even without deliberately obfuscating
itself in any way that would interfere with Friendly AI, can easily create
so many simultaneous dependencies within the system that no human
programmer is capable of modifying it without AI assistance, and moreover
that no human programmer is capable of ripping apart the AI and putting
the pieces back together in any kind of working order. In fact, I would
tend to suspect that this would happen naturally and without any
deliberate planning being required on the AI's part. And there's no
reason why an AI *couldn't* plan ahead for the contingency of being
stolen. Things that enable an AI to resist unfriendly external
modification are not necessarily things that interfere with Friendship
programmers seeing inside the AI and *consensually* modifying it.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
