Re: Friendliness not an Add-on

From: Richard Loosemore
Date: Sat Mar 04 2006 - 11:30:34 MST

I agree, although (as I said before) I take the position that all
arguments about provability or guarantees of friendliness have to start
with at least some definition of 'friendliness' that actually means
something in the world of real systems, as opposed to the world of
fantasy-AI systems based on abstract mathematical formalisms.

Richard Loosemore

Philip Goetz wrote:
> It seems to me everyone is making a mistake in thinking that checking
> for friendliness of a program is like checking whether the program
> halts.
> Computational proofs of validity, halting behavior, etc., must
> consider every component of a program or theorem. Friendliness
> involves only the actions taken by an AI. Given 2 AIs that construct
> identical plans of action, they have equivalent Friendliness, even if
> one has evil intent and one has benign intent. You don't need to look
> inside the computations. You only need to check the proposed course
> of action.
> This means, for example, that protests that a program to verify the
> Friendliness of an AI's output must be as complex as the AI itself
> are wrong, since the complexity of the AI's output is many orders
> of magnitude less than the complexity of the AI itself.
> What I've just said makes the argument for friendliness verifiers
> easier. However, I'm not on the side of the verifiers - I think there
> is absolutely no hope of being able to formally verify anything about
> the results of a proposed course of action in the world. Given any
> formal system to prove actions benign, I could make it prove any set
> of actions benign by rigging the perceptual system that connected the
> action representations with their real-world semantics. Besides
> which, decades of experience show that systems with provable
> properties are useless in the real world.
> - Phil
