Re: AI boxing

From: Russell Wallace
Date: Thu Jul 21 2005 - 20:26:02 MDT

On 7/21/05, Chris Capel wrote:
> I thought I'd get this in before the killthread, as I've never seen a
> meta-discussion on AI-boxing, and it might be more fruitful than the
> actual discussion.

I'll throw in another meta-level observation: arguments about how to
safely box a UFAI aren't useful unless they also show that boxing it
would be a useful thing to do in the first place.

For one thing, I don't think an AI anywhere near human level is
possible in a box. To become intelligent in any useful sense in the
first place, it'll have to interact with the outside world; there's
no algorithm that'll magically generate "intelligence" when run in a
vacuum. Apart from anything else, a superintelligence is going to
require more computers than you'll fit in a basement anytime soon.

But let's postulate, for the sake of argument, an unfriendly
superintelligent AI in a box, connected to nothing except a single
serial terminal. I don't think an AI could take over my mind through
a serial terminal, the results of Eliezer's experiments
notwithstanding (not because my mind is inherently stable at a higher
level of stress or anything, but because I've been on the receiving
end of RDF-level charisma, so I know what to look out for). Is that
an argument for me to go ahead and turn on the machine and sit down
at the terminal? No. I either make use of
information obtained from the interaction, or I don't. In the former
case, I've opened a window of vulnerability; in the latter case, I'm
at best wasting time I could spend doing something more productive
(such as trying to figure out how to create a Friendly AI!), and at
worst making a very big gamble that there isn't a loophole I haven't
thought of.

Whether unfriendly superintelligent AI in a box is safe depends on
your assumptions; but I claim that there are _no_ plausible
assumptions under which it would be _both safe and useful_.

- Russell
