Re: Problems with AI-boxing

From: Daniel Radetsky
Date: Sat Aug 27 2005 - 21:24:13 MDT

On Sun, 28 Aug 2005 02:32:40 +0100
Chris Paget wrote:

> To be clear: I'm saying that past friendliness is no indication of
> future friendliness if intelligence is increasing over time.

And by the same reasoning, past unfriendliness is no indication of future
unfriendliness. So why bother trying to build a friendly AI in the first place?
Suppose it were easier to build UFAI than FAI. I'd say this is reasonable, as
"Make everything cupcakes" is more straightforward than "Give us Apotheosis."
Then, on your view, we have a better chance of achieving FAI by building UFAI.

Maybe you're just way, way smarter than me.

> Any moral system evolves over time as complexity is introduced and new
> capabilities are found. The best example of this is the legal system.

If you want to generalize from legal systems, you raise the questions:

1. How often do legal systems go from 'friendly' to 'unfriendly?'

2. Of the instances of legal systems going from 'friendly' to 'unfriendly,' how
many were the result of the judges' (or whoever was in charge) misguided
good-faith efforts to carry out the spirit of the system, as opposed to
their intentionally perverting the system for some reason or other?

> It is widely recognised that a child may be incapable of distinguishing
> the difference between right and wrong since they do not understand
> morality itself in sufficient depth. Why should a limited AI be treated
> any differently?

Because AIs are not children. Why should we accept your analogy? Suppose I
proposed that limited AIs should be treated no differently from rocks, since
rocks are also incapable of distinguishing between right and wrong because they
do not understand morality in sufficient depth. I would need a compelling
argument as to why the case of rocks sheds any light on the case of limited
AIs. Thus when you say:

> If we make the assumption that an intelligence-limited
> AI is a reasonable approximation of a child, then limiting it is
> pointless

You are quite right, but it isn't clear that we need to make that assumption.

> the AI must either be designed from scratch to have
> super-human morals to match its super-human intelligence, kept in a box
> for its entire existence, or never built in the first place.

So 'super-human morals' would guarantee that a currently friendly AI would
remain friendly? What, then, are super-human morals? Are they amenable to any
analysis other than "that which would guarantee that a currently friendly AI
remain friendly"?

