Re: AI Boxing:

From: Vladimir Nesov
Date: Tue Jun 03 2008 - 01:51:12 MDT

On Sat, May 31, 2008 at 1:27 PM, Vladimir Nesov wrote:
> I feel that the conclusion of that old discussion, which I didn't have
> a chance to participate in, is rather misguided. However obvious it
> may be, if an AI locked in the box is sane enough to understand a complex
> request like "create a simple theory of Friendliness and hand it
> over", it can be used for this purpose. This AI is not intended to be
> released at all, at least not before the Friendly one, built according
> to that design (if the design proves reasonable), assumes the position
> of SysOp or the like. Even if building an AI that can actually understand
> what you mean by requesting a Friendliness theory is 99.999% of the way
> there, the actual step of using a boxed setup to create a reliable
> system may still be needed.

In a recent post on Overcoming Bias, Eliezer mentioned that Nick Bostrom
suggested the same setup years ago. In that description, the
sane-enough-AI-in-the-box is called an Oracle AI, which is used to help
with building a theory of an actual Friendly AI. Here is a relevant
passage:

> Nick Bostrom, however, once asked whether it would make sense to
> build an Oracle AI, one that only answered questions, and ask it our
> questions about Friendly AI. I explained some of the theoretical
> reasons why this would be just as difficult as building a Friendly
> AI: The Oracle AI still needs an internal goal system to allocate
> computing resources efficiently, and it has to have a goal of
> answering questions and updating your mind, so it's not harmless
> unless it knows what side effects shouldn't happen. It also needs
> to implement or interpret a full meta-ethics before it can answer
> our questions about Friendly AI. So the Oracle AI is not
> necessarily any simpler, theoretically, than a Friendly AI.
> Nick didn't seem fully convinced of this. I knew that Nick knew
> that I'd been thinking about the problem for years, so I knew he
> wasn't just disregarding me; his continued disagreement meant
> something. And I also remembered that Nick had spotted the problem
> of Friendly AI itself, at least two years before I had (though I did
> not realize this until later, when I was going back and reading some
> of Nick's older work). So I pondered Nick's idea further. Maybe,
> whatever the theoretical arguments, an AI that was supposed to only
> answer questions, and designed to the full standards of Friendly AI
> without skipping any of the work, could end up a pragmatically safer
> starting point. Every now and then I prod Nick's Oracle AI in my
> mind, to check the current status of the idea relative to any
> changes in my knowledge. I remember Nick has been right on previous
> occasions where I doubted his rightness; and if I am an expert, so
> is he.

Vladimir Nesov

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:01:03 MDT