Re: Think of it as AGI suiciding, not boxing

From: Nick Hay
Date: Wed Feb 22 2006 - 11:41:59 MST

Phillip Huggan wrote:
> It is feasible to police as computation increases towards ubiquity if
> policing implements also increase at the same rate or higher. It isn't
> happening now, but there is a class of technologies on the horizon that do
> encompass the mass production of sensor devices. The technical side of
> things seems feasible (one day, maybe not before AGI is built or something
> else earth-shattering happens), it is the political and administrative risks
> that scare me. That is why I'm only advancing a *let's never allow an AGI to
> get out* vision of the future as a suggestion to chew on.

I agree, although I would strengthen this. We can never allow a nonFriendly AGI
to exist. Bottling an entity smarter than us seems like a lost cause.
Explicitly allowing it to influence us, in the hopes of getting it to work for us,
is even worse! (This is assuming an arbitrary AGI design; if the design is
already safe, we don't have a problem.)

> I didn't mean to suggest limiting the output of an AGI as the only safeguard.
> I meant that it might be safer to tack it on top of whatever programmed
> friendly architectures already exist, than would be allowing the AGI to
> embark upon a full-throttle engineering project.

Ah, ok. It might be absolutely safer, assuming this didn't cause humans to feel
safer and thereby relax their guard, but the improvement is probably tiny.
Intelligence is a powerful thing; it can achieve ends through rather indirect
means. Focusing on boxing the AI can distract us from the more important, but
harder to discuss, issues of designing AGIs that are actually safe (or, at
least, safe enough to be worth building).

Restricting output like this can protect against accidental error on the AI's
part, which is important. I don't think we can remove the AI's ability to
manipulate humans, especially given that humans will scrutinise its output. Not
unless we reduce the AI's total influence over reality to a small number of bits
(which is probably impossible, given EM radiation leaks etc).

Reducing its influence correspondingly reduces its ability to be useful. If we
want it to invent things, it requires a large space of outputs, e.g. the space of
valid blueprints. This is the risk with trying to use an unsafe AGI (i.e. not
explicitly constructed and known to be safe) to do work.

On the other hand, safety measures within the AGI, e.g. explicitly building an
AGI that doesn't "want" to escape, are an entirely different matter. Although
these are considerably more complex (e.g. you have to understand how the AGI
actually works), they offer the only reasonable way to create safe AGIs
smarter than us.

> It seems far from certain
> to me that we can't screen out all of an AGI's magic. It still has to obey
> the laws of physics. As a thought experiment, imagine an AGI entombed within
> a collapsing hollow sphere of mini blackholes. Such an AGI is toast no
> matter how smart it is if massive engineering precursors don't accompany it.
> Believing in the certainty of AGI magic means that you believe for all
> classes of AGI architectures, that its own substrate material offers
> sufficient resources for it to escape.

Sure, that AGI is probably toast (it's still hard to be certain). But any AGI
we're milking for our benefit isn't in that situation, and is in a fine
situation to influence the universe indirectly through humans.

It's not believing in the certainty of AGI magic, just the possibility. Any
situation where the AI isn't entirely boxed, i.e. encased in concrete with no
information channels connecting it to us (that includes peeking at its internal
state), explicitly has potential resources for escape: humans.

> We can't even define this discussion until we identify what AGI magic is.

AGI magic is the AGI doing something we didn't expect it could do (e.g. through a
failure of our intelligence or imagination). There is no restriction on the methods
it can use.

> I
> can't suggest a safer output medium than engineering blueprints if I don't
> know what the risks are. What are they?

The general risk is its ability to influence reality. I don't think this
problem can be fixed, given that we don't know all the possible loopholes --
especially those in humans -- it could exploit. For instance, perhaps we try to
exclude the AI's ability to apply persuasive arguments by ensuring its output
contains no recognisable letters. This doesn't stop the AI influencing us
through concepts, since the process of understanding any design it produces will
invoke them.

> Inducing a spiritual experience and
> hypnotism via sensor input to humans? Mutating nearby viruses with EM radiation
> to make mini robots? Manipulating nearby power grids? I know our physics is
> presently incomplete. Give me a hint, will the AGI attempt to utilize
> Gravity magic or everything else combined?

Whatever works. There is a fundamental limit on the escape avenues and methods
we can see, due to the difference in intelligence levels. That's the crux of
the matter.

-- Nick Hay
