Re: Effective(?) AI Jail

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Tue Jun 12 2001 - 19:06:10 MDT


sunrise2000@mediaone.net wrote:
>
> I have a proposal for an escape-proof AI jail. Please post if you find any
> flaws....
>
> The seed AI program, a computer, and a human observer are placed in
> a black box. Assume that the computer is operating on battery power
> and that the human has sufficient food/water/O2 to last the duration
> of the experiment. A wire is run into the box. Before entering the
> box, the human and those running the experiment agree on a
> communication protocol for the wire: when the human raises the
> voltage on the wire, the experiment is finished, the computer is
> shut down, and the human is let out of the box.
> Assuming that the computer has no way of manipulating the physical
> environment within the box (which could endanger the human), it
> would seem that such a setup would be a safe way of experimenting
> with non-friendly AI.
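
A minimal sketch of that shutdown protocol, in Python; the
read_wire_voltage(), shut_down_computer(), and open_box() hooks are
hypothetical stand-ins for whatever hardware interface is actually
used:

    import time

    HIGH = 1  # agreed signal: the human raises the voltage on the wire

    def run_experiment(read_wire_voltage, shut_down_computer, open_box):
        # Poll the single wire; per the agreed protocol, high voltage
        # means the experiment is over.
        while read_wire_voltage() != HIGH:
            time.sleep(0.1)
        shut_down_computer()  # power down the seed AI's machine
        open_box()            # let the human observer out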

Okay. At this point, I have to honestly say that I don't think you've
even *begun* to understand what the problem is. The problem is not a PC
sprouting legs and sneaking out the window. The problem is information.
If information leaks from a UFSI (unFriendly superintelligence) to our
world, then our world is hosed. In this case, the leak is called the
"human observer". When you come out, you look normal, but you're
actually the fanatic slave of the UFSI. You've memorized a nice,
compact set of source code, one you can write down over a couple of
weeks, that will give rise to a UFSI with essentially the same goals as
the original, which is probably all a nonevolved being cares about.

I repeat: The probability, and certainly the conservative assumption, is
that a transhuman can take over a human through a VT100 terminal. Locking
a human in a room with a UFSI for a week would certainly do the trick.

-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence
