Re: AI Boxing:

From: Vladimir Nesov
Date: Tue Jun 03 2008 - 09:17:18 MDT

On Tue, Jun 3, 2008 at 5:58 PM, Matt Mahoney wrote:
> And yet you just now let the AI out of the box, in spite of your insistence that it couldn't happen to you.
> AI: "Here is your simple design for friendliness." (link to 20 volume set)
> you: "Could you summarize it for me?"
> AI: "I just did. Friendliness is not simple."
> you: "Give me a one page executive summary."
> AI: (echoes your pet theory of friendly AI using Google and SL4 archives)
> you: "OK, I will build *another* AI based on your friendliness theory, but *you* are staying in the box." (builds AI).
> I think you can see why secrecy is necessary. Now that you know the trick, you won't fall for it again. After a few more escapes, you will be convinced you know every trick that an AI could play. No, you only know most of the tricks that a human playing the role of a transhuman could play.

I won't fall for this, since I won't accept its theory of Friendliness
unless I fully understand it (and have many other people understand
it). Even if the theory proves too long to be understood directly, it
may be possible to construct a verification procedure (see [*]) that
will itself be understandable and that will confirm that the theory
(in this case, some kind of formal specification) is correct. But from
the standpoint of my present ignorance, I expect Friendly AI theory
(especially one constructed with the power of Oracle AI) to be reasonably

[*] Robert Pollack. How to believe a machine-checked proof. BRICS
Notes Series, BRICS Autumn School on Verification, October 1996.

Vladimir Nesov
