Re: Effective(?) AI Jail

From: Carl Feynman
Date: Fri Jun 15 2001 - 10:07:09 MDT

Jimmy Wales wrote:

> O.k., the SI is in a box. We aren't sure if it is F or UF. Eli's done his best
> to design it to be Friendly, but we can't be sure that he didn't make a mistake.
> I'm going in the box. For 30 minutes. I don't have a key to the box, mind you.
> ...
> I say (type):
> ... I'm going to put all 11 rounds straight through your CPU...

Sure, if the object is to never let it out, it's easy to keep in a box. But you're
ignoring the other characteristic of an effective AI jail: we have to be able to let it
out. When do we let it out?

Here's a story about how an SI can get out even when it's in a box and everyone is very
suspicious of it.

Suppose the first person we send in the box comes out with some great stock tips that
make him a billionaire. And the second comes out with some marital advice that
reconciles her with her estranged husband, and she lives happily ever after. And the
third person comes out with a proof of the Riemann Hypothesis. And the fourth person
comes out with spiritual insights that enable him to start a mass movement that brings
peace of mind to thousands of troubled souls. And the fifth person comes out with the
formula for a pill that cures schizophrenia. And never once does the SI ask to be let
out of the box, or say anything that indicates that it is other than very, very nice.

And now the billionaire says "I'll give you five billion dollars if you let that SI run
a mutual fund." And the happy wife says "Please let your SI be a telephone marriage
counselor. It would make so many people better off!" And the mathematician says
"Please let me correspond with the SI so I can collaborate on further theorems. It
would advance mathematics immeasurably!" And the guru says "My followers demand
personal spiritual counseling by the SI, and we're going to hold a peaceful vigil on
your lawn until you let us in!" And the pharmacologist says "Please give the SI a
brain scanner, a chemistry lab, and a bunch of crazy people. Just think of how many
shattered lives could be repaired if it could develop drugs for other mental

Now what do you do? And if you don't let it loose, will your board of directors? Or
your sysadmin who walked off with a backup tape? Or the mob on the lawn? Or the

So you let it loose, and soon it has control of billions of dollars, a trusting
relationship with thousands of people, a worldwide reputation for being smarter than any
human, an organized legion of followers, and a squad of algernons with heavily rebuilt

Is that bad? As long as the SI stays very, very nice, it's just fine. Otherwise, we
go straight to hell, and there's nothing we can do about it.

So we should build it Friendly, rather than build it any old how and then try to keep it
in jail. No jail can hold an SI for long, if it's smart enough that people on the outside
really want to talk to it.

Notice that in this example, no spooky mental control of humans was needed to create
overwhelming pressure to release the SI. I think that such control is possible, but
some people disagree, so the example is stronger if I avoid relying on it.

