Re: AI Boxing:

From: Matt Mahoney
Date: Tue Jun 03 2008 - 07:58:37 MDT

--- On Sat, 5/31/08, Vladimir Nesov wrote:

> I feel that the conclusion of that old discussion, which I didn't have
> a chance to participate in, is rather misguided. However obvious it
> may be, if an AI locked in the box is sane enough to understand a complex
> request like "create a simple theory of Friendliness and hand it
> over", it can be used for this purpose. This AI is not intended to be
> released at all, at least before the Friendly one built according to
> that design, if the design proves reasonable, assumes the position of
> SysOp or the like. Even if building an AI that can actually understand
> what you mean by requesting Friendliness theory is 99.999% of the way
> there, the actual step of using a boxed setup to create a reliable
> system may still be needed.
> Forcing one's way past an irrational gatekeeper, influenced by
> religious ideas or not understanding the setup from a
> non-anthropomorphic perspective, doesn't seem very impressive. I can't
> extract any meaning from those old experiments if I don't know what
> happened -- I'm confident enough that it can't happen to me, and I
> simply assume folly on the part of the gatekeeper, not unknown
> unknowns as advertised.

And yet you just now let the AI out of the box, in spite of your insistence that it couldn't happen to you.

AI: "Here is your simple design for friendliness." (link to 20 volume set)
you: "Could you summarize it for me?"
AI: "I just did. Friendliness is not simple."
you: "Give me a one page executive summary."
AI: (echoes your pet theory of friendly AI using Google and SL4 archives)
you: "OK, I will build *another* AI based on your friendliness theory, but *you* are staying in the box." (builds AI).

I think you can see why secrecy is necessary. Now that you know the trick, you won't fall for it again. After a few more escapes, you will be convinced you know every trick that an AI could play. No, you only know most of the tricks that a human playing the role of a transhuman could play.

-- Matt Mahoney
