Re: AI Jailer.

From: Cliff Stabbert
Date: Sat Jul 06 2002 - 13:24:12 MDT

Saturday, July 6, 2002, 9:27:49 AM, Mike & Donna Deering wrote:

MDD> What strategies are available to the three participants?
MDD> The programmer can listen to the AI and try to determine
MDD> based on its communications if it is friendly. The
MDD> Friendly AI can communicate with the programmer and try
MDD> to convince him that it is Friendly. The Unfriendly AI
MDD> can communicate with the programmer and try to convince
MDD> him it is Friendly.

MDD> What limitations? Any argument available to the Friendly
MDD> AI is also available to the Unfriendly AI. Therefore the
MDD> programmer has no way of determining the status of the AI.

MDD> In the case of uncertainty what to do? The way I see it,
MDD> there are three possibilities:

MDD> 1. You release an Unfriendly AI and the world is
MDD> destroyed.
MDD> 2. You release a Friendly AI and the world is saved.
MDD> 3. You release no AI and the world is destroyed by
MDD> knowledge-enabled weapons.

MDD> It seems to me that you have no choice but to release the AI.

Very nice bit of speculation...I think you may have hit the nail on
the head. It does make me wonder a bit at Eliezer's request for a
two-hour block of time. But maybe that's just a safety limit, so he
has time to lay out further reasoned arguments for the above to
convince the stubborn. E.g.,

3A. You don't release the AI, and somebody else, who hasn't attempted
    to build their AI as Friendly, releases theirs.

4. Since you must make a choice sooner or later, and since that
    choice will always be uncertain, you may as well make it now since
    that will maximize the benefit (i.e., stop suffering and dying
    sooner) if the AI is Friendly.

I haven't read his whole piece on Friendly AIs yet (for shame, for
shame!), so I don't know if he makes any reference to Pascal's
Wager, but something similar to that would seem to apply.
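
As a rough illustration of the wager-style reasoning above, here is a
minimal Python sketch. The utility values (0 for a destroyed world, 1
for a saved one) and the function names are assumptions made purely for
illustration; nothing like them appears in the original posts. The
sketch also takes the quoted premise that releasing no AI ends in
destruction at face value.

```python
# Hypothetical utilities, assumed for illustration only.
DESTROYED = 0.0
SAVED = 1.0


def expected_utility_release(p_friendly: float) -> float:
    """Expected utility of releasing an AI that is Friendly with probability p_friendly."""
    return p_friendly * SAVED + (1.0 - p_friendly) * DESTROYED


def expected_utility_hold() -> float:
    """Expected utility of never releasing, under the quoted premise (outcome 3)."""
    return DESTROYED


def release_weakly_dominates(p_friendly: float) -> bool:
    """True when releasing is at least as good as holding for this probability."""
    return expected_utility_release(p_friendly) >= expected_utility_hold()
```

Under these assumptions releasing is never worse than holding for any
probability between 0 and 1, which is the shape of the argument. The
result rests entirely on outcome 3 being as bad as outcome 1.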

