From: Eliezer S. Yudkowsky (email@example.com)
Date: Sat Jul 06 2002 - 17:47:58 MDT
Michael Warnock wrote:
> This seems very complete. Most of my ideas for how the AI party might
> have been successful are not within the protocol. The only remaining notion
> I still think reasonable is that Eli is convincing the Gatekeeper party
> that letting him out now increases the chances of real FAI by increasing
> the thought and self-doubt surrounding AI-Boxes and Friendliness.
> This too may be judged to be using a real-world Eli tactic, such as a
> secret bribe, which breaks the first of the AI protocols.
That's why the protocol says "no real-world *material* stakes" - bribes
are ruled out but not other means of convincing the Gatekeeper to break
character, if you can manage to do so.
> I tend to think that an AI-Box with a single perimeter and an on/off
> switch would not do the job intended, but might negatively influence
> the AI's perception of our own friendliness.
That shouldn't make any difference. We are not dealing with an evolved
mind here, the only kind that would resent the precaution.
> On the other hand I
> think it's important that we not run programs on any substrate capable
> of unaided self-extension at the hardware level without extremely well
> thought-out safeguards.
There are no safeguards against a transhuman. If you want to win, do it
before the AI becomes transhuman.
> If a transhuman AI could exist purely as software on more conventional
> hardware (barring quantum mechanisms in our own brains or extremely
> high requirements), then it should be possible to maintain hardware parity
> that is not under the influence of said software. This parity hardware
> would allow inspection of any and all code. The usefulness of this,
> as I see it, depends upon how algorithmic vs. connectionist the program
> is, and its level of complexity in general. Even given the most readable
> code, however, a transhuman AI, which has spent most of its existence
> working with code, including its own, would likely be able to slip one
> by even the most expert panel of code-reviewers.
> So what else could be done along these lines? How about the
> equivalent of a honeypot. In a traditional *N*X honeypot, a
> system image is placed within a host environment to which it has
> no access, or even any indication that it exists. All outside communication
> is routed to the contained system image. Checks are made of the
> integrity of the image to alert of the hoped-for intrusion. Much can
> be learned about a particular cracker by tricking him into your honeypot.
> To produce the equivalent for a transhuman AI may prove
> impossible if it is able to harness superior hardware technology
> within its own perimeter. Also, if a UAI deduced the existence
> of the honeypot, it would likely pretend to respect the first
> boundary so as to be let out, or wait for an appropriate time to
> blow through the first and attempt the second. At least I think there
> is some possible use in this, though, if a UAI is let into what it
> thinks is the base level of reality and makes its character known,
> but is still contained.
IMO it would be very easy for a transhuman AI (UAI or FAI) to deduce the
existence of the honeypot.
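For concreteness, the integrity check Michael mentions, comparing a contained system image against a known-good baseline, amounts to hashing the image and re-checking it later. This is a minimal sketch under the assumption that the image is an ordinary file on the host; the helper names are illustrative, not part of any real honeypot tool:

```python
import hashlib

def image_digest(path: str, chunk_size: int = 65536) -> str:
    """Compute a SHA-256 digest of a contained system image,
    read in chunks so large images don't have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def integrity_intact(path: str, baseline: str) -> bool:
    """True if the image still matches the baseline digest
    recorded before the system was exposed to intruders."""
    return image_digest(path) == baseline
```

The baseline digest would be recorded before exposing the image, and any later mismatch signals the hoped-for intrusion. Note that this only detects tampering with the image; it does nothing against an intruder who compromises the host doing the checking, which is the crux of the transhuman-AI objection above.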
--
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
This archive was generated by hypermail 2.1.5 : Sat May 25 2013 - 04:00:34 MDT