Problems with AI-boxing

From: Chris Paget (
Date: Fri Aug 26 2005 - 03:37:37 MDT

With the recent AI-box experiments and the discussion on computational
power, I've been considering the issue of AI boxing. I've decided it's
a bad idea, for the following reasons.

I'm assuming that a true AGI has been created, and that its intelligence
is being artificially constrained - it has the capability to become
vastly more intelligent if given sufficient computing power (such as the
Internet) but is air-gapped (or confined in a simulation) to keep it
within a limited environment.

There are six possibilities. Within its box, the AGI would either be
less intelligent than its creators, roughly equivalent, or more
intelligent. At the same time (and separately), it is either friendly
or unfriendly.

Consider the case of an AGI that is not as smart as its creators.
Whether it is friendly or unfriendly is irrelevant - you cannot
guarantee that the same will apply once it is released. Once more
computational power is available to it, it will be capable of more
complex reasoning and its morals will change accordingly - an
unfriendly child may grow up into a friendly adult, and any moral rules
may break down or reveal exceptions when analysed in more detail.
Thus, until the AI is released from the box it will be largely
impossible to guarantee whether it is friendly or not.

Next, consider an intelligence roughly equal to that of its creators.
If the AGI is unfriendly, it could lie about its motives and fool its
creators into releasing it. Alternatively, it may be able to find a way
out of the box on its own. If, on the other hand, the AGI is friendly,
it has been unfairly kept in a box while its creators made their minds
up. In the first case a deception or escape will release an unfriendly
AGI; in the second, the singularity is merely delayed. It
is also possible that its creators will realise that it is unfriendly,
but the argument above still applies.

If the AGI is more intelligent than its creators, it is unlikely that
the box will hold it, friendly or otherwise. It will definitely be able
to talk its way out, and even with an air-gap it's possible that RFI
tricks could let it out (consider that Internet access can be gained
using power lines, phone lines, cable TV, or simply out of the GSM-laden
thin air; it'd have to be a pretty damned strong box). If the AGI is
unfriendly, again you have a problem; if it is friendly, you simply
delay the singularity again.
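
As a rough illustration only, the six cases above can be tabulated in a
few lines of Python. The names (LEVELS, DISPOSITIONS, boxing_outcome)
and the outcome strings are hypothetical paraphrases of the argument,
not anything from the AI-box experiments themselves.

```python
from itertools import product

# Hypothetical labels for the two independent axes described in the post.
LEVELS = ("less intelligent", "roughly equal", "more intelligent")
DISPOSITIONS = ("friendly", "unfriendly")


def boxing_outcome(level, disposition):
    """Return the outcome the post argues for in one case (paraphrased)."""
    if level == "less intelligent":
        # In-box behaviour says nothing reliable about post-release behaviour.
        return "no guarantee about friendliness after release"
    if disposition == "unfriendly":
        # Equal or greater intelligence: it deceives its creators or escapes.
        return "unfriendly AGI released by deception or escape"
    # Friendly and at least as smart as its creators: boxing only costs time.
    return "singularity delayed"


# Enumerate all 3 x 2 = 6 combinations.
CASES = {
    (level, disposition): boxing_outcome(level, disposition)
    for level, disposition in product(LEVELS, DISPOSITIONS)
}

for (level, disposition), outcome in CASES.items():
    print(f"{level:17} | {disposition:10} | {outcome}")
```

None of the three distinct outcomes in the table is one where the box
has provided a useful guarantee.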

Given the above, I don't see that "boxing" serves any purpose. You
could refute my logic by gradually moving from one stage to the next,
slowly adding computational power, but if the AGI is unfriendly then it
could simply pretend to be stupid, and trick you into adding more and
more computing power until you hit the last scenario and it can escape.
So, the two results that you're faced with are either delaying the
singularity or releasing an unfriendly AI - neither of which is a
worthwhile goal.

Since you cannot be sure which will happen, I would never box an AGI -
and if I could not be certain right from the beginning that it will be
friendly, I wouldn't build it in the first place.


Chris Paget
