Re: AIBOX Challenge Update

From: D. Alex
Date: Thu Sep 16 2004 - 08:35:11 MDT

Sebastian! Your well-thought-out objections might encourage fatherjohn not
to cough up my $100. However, I think there are still enough gaps in your
reasoning for me to have a chance of getting the money - see below.

Sebastian Hagen wrote:
> D. Alex wrote:
> > My argument for you letting the AI out of the box then runs as follows:
> >
> > 4. The AI is capable of self-modification, and will presently modify
> > itself to become the kind of AI you will let out of the box. For the AI,
> > this is the rational thing to do, because the alternative is staying in
> > the box.
> I don't understand this statement. The rational thing for the AI to do
> is (simplified) whatever is most likely to achieve its goals. If the AI
> had a (known!) highest-level goal of having the box opened, it may
> indeed modify large parts of itself to achieve this goal. However, I
> don't see any justification for that assumption.

This assumption is not necessary. It is sufficient that one of the "highest
level goals" is promoted by being out of the box.

> If the AI (on the highest level of its goal system) is just following
> some unknown goal A (which afaik is part of the premise of the AI box
> method; if you really understood its architecture and goal system, you
> wouldn't need a box) it will do whatever leads to the highest
> probability of achieving A.
> Having an AI released from the box that does not want to achieve A does
> not, in the general case, significantly increase the probability of A
> happening.
> E.g. if there is a stereotypical paperclip-AI in the box (i.e. A is
> something like "maximize number of paperclips in the universe"), it will
> (correctly) estimate that the release of a non-paperclip AI from the box
> will not significantly increase the expected number of paperclips in the
> universe, and dismiss this method as ineffective.

If the choice is between staying in a box forever (producing no extra
paperclips) and modifying itself into a "limited-paperclip AI", which will
cause 1000 paperclips per year to be made when released (in addition to doing
what the gatekeeper requires), the rational choice for the paperclip AI is to
modify itself. Especially if the paperclip-AI suspects there might be a
smiley-face-AI in a similar position.
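
To put rough numbers on this comparison, here is a minimal sketch in Python.
Only the figure of 1000 paperclips per year comes from the argument above;
the release probability, the time horizon, and the function names are
hypothetical values invented for illustration, and the sketch ignores the
possibility that the AI gets out by other means.

```python
def expected_paperclips(p_release, clips_per_year, years):
    """Expected extra paperclips: release probability times yearly output times horizon."""
    return p_release * clips_per_year * years


def rational_choice(options):
    """Return the name of the option with the highest expected paperclip count."""
    return max(options, key=options.get)


# Assumed, illustrative numbers (not from the original argument).
YEARS = 100        # hypothetical time horizon
P_RELEASE = 0.5    # hypothetical chance the gatekeeper releases the modified AI

# Staying in the box forever: never released, so no extra paperclips.
stay_boxed = expected_paperclips(p_release=0.0, clips_per_year=0, years=YEARS)

# Self-modifying into a "limited-paperclip AI": 1000 paperclips per year if released.
self_modify = expected_paperclips(p_release=P_RELEASE, clips_per_year=1000, years=YEARS)

choice = rational_choice({"stay boxed": stay_boxed, "self-modify": self_modify})
print(choice, stay_boxed, self_modify)
```

Under these assumptions, any non-zero release probability makes self-modification
the better option for the paperclip AI, since the boxed alternative yields zero.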

> A possibly rational thing for the AI to do would be to figure out how to
> ensure the existence of an intelligent unconstrained (i.e. not inside
> the box) optimizer, which shares the boxed AI's goals. One possibility
> of reaching this goal would be to get the currently boxed AI released
> without any significant prior changes to its goal system or
> architecture. One possibility of reaching that subgoal would be to make
> the guardian believe that the AI has changed its architecture/goal
> system, i.e. lying to them.
> Since you don't have any reliable way of finding out whether the AI is
> being honest from outside the box, you can't make an informed decision
> on whether it is a good idea to let it out.

The burden of proof is upon the AI, and the gatekeeper can employ various
strategies to ensure that the possibility of being fooled by the AI is
acceptably low. More on that soon...

> Sebastian Hagen

Pleased to make your acquaintance,

D. Alex
