Re: AI boxing

From: James Higgins
Date: Sat Jul 27 2002 - 16:40:08 MDT

outlawpoet - wrote:
> The challenge is not to design
> a perfect prison or vault, but to create an AI that we can trust to some
> reasonable degree, whether through character or design. Now character is
> shifty, and while I'd love to trust ver on that, I'm afraid I probably couldn't
> take the chance. So design it is. And at this point, all I've heard are

Yup, and designs are sketchy at best so far...

> Asimov Law-type injunctions, Friendly Goalsystems, and Goertzel's Novamente's
> FriendlyNode thingey. Of those, Friendliness (a la Yudkowsky) and perhaps
> Goertzel's system are the only types I think would survive, although I'd
> have to see more of the documentation on both in their specific instances
> of course.

As far as I know Eliezer doesn't have a complete AI design yet, so it
isn't practical to analyze his approach. Ben doesn't have the
Friendliness thing fully worked out. So both need more work and study...

> Most other schemes depend on determining the character of the
> AI, which is so insanely dangerous and ill omened, I hardly have words for
> it. (well, of course I have words for it, I just told you some, don't let
> my propensity for hyperbole scare you off, it's just really really a bad
> idea) An AI will not be a person like a human is. Its personality and goals
> are likely to follow rules unintuitive and alien to us. Our social reflexes
> are entirely inappropriate. And very misleading, as the AI boxing experiment
> illustrates. Absolutely none of the participants let the AI out for the
> right reasons, and I believe that to be because the right reasons can't
> be determined by interaction at that stage and in that medium.

AI Boxing will almost certainly be employed, however. It may even
prove very useful. There is a good chance that an infra-human AI won't
be able to lie effectively. A human-level AI would most likely be less
successful at lying than, at least, very good human liars. Plus, it is
likely that such AIs would be much less socially developed/skilled than
most humans. At the very least, they will be unpracticed at lying and
will most likely need to develop that ability (which we would,
hopefully, notice). Thus we may well be able to spot trouble using Box
Testing before the AI gets to the trans-human stage. Once there,
however, it will likely be very difficult to spot an UnFriendly AI using
Box Testing.

James Higgins
