Re: Think of it as AGI suiciding, not boxing

From: Nick Hay
Date: Sun Feb 19 2006 - 04:01:39 MST

Phillip Huggan wrote:
> To clarify, there would be no user interface.

There is necessarily a user interface if we are using the AI to design systems.
We need to influence it to direct the design process, and it needs to
influence us to communicate the designs.

> And nothing but mechanical
> computer components; no danger of physics tricks. A purely mechanical
> computer log. Yes I know AI Boxing talk is passe, but can someone quickly
> point out why this architecture would fail?

This scenario is effectively equivalent to the standard AI box; the box
necessarily has a hole to allow communication between human and AI.

There is nothing to stop the AI producing a design which convinces the human
reading it to release the AI (e.g. by connecting it to the internet). It is
difficult to state with certainty that there is no such design, as the space of
possibilities is exponentially huge and humans are knowably imperfect.

Alternatively, the AI could design a useful machine which looks definitely
harmless to a human but in actuality releases the AI. As the space of
possibilities is exponentially huge, and as humans have limited ability to
predict a design's true effect, a sufficiently smart AI may be able to invent
such a design.

> If the AGI is friendly, we lose out on a great deal of its immediate
> engineering prowess building only safe product designs it spits out. If it
> is unfriendly, we are saved. It doesn't matter how smart an entity is if the
> jail permits no key.

In any AI boxing scenario humans form a lock, to which any influence the AI has
on us (e.g. any time we look at anything it's produced) gives it access. The
smarter the entity the better it is at cracking locks.

> Phillip Huggan wrote:
> Can friendliness be implemented by specifying the AGI shut itself down if vis
> self-coding directly interacts with the universe beyond a very limited set of
> computer operations?

If successful, this wouldn't implement Friendliness, as SIAI defines it, but a
form of "safe" usefulness. Such an AI has no protection against humans getting
what they wished for but didn't want, and no moral judgement to protect against
misuse.

Incidentally, which limited set of computer operations?

-- Nick Hay

> In concert with a relatively inert computer substrate
> such as a molecular computer, how could it cause harm? We could still
> benefit from such a limited architecture a great deal.
