Thwarting Friendliness

Date: Thu May 03 2001 - 08:45:51 MDT

> I guess Eliezer's point may be that the AI ~does~ have a choice in
> his plan -- the Friendliness supergoal is not an absolute irrevocable
> command; it's just a fact ("Friendliness is the most important goal")
> that is held with EXTREMELY high confidence, so that the system has to
> gain a HUGE amount of evidence to overturn it.
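To make the "huge amount of evidence" claim concrete, here is a toy
Bayesian sketch in Python. It is my own construction: the one-in-a-million
prior and the 2:1 likelihood ratio are hypothetical numbers I picked, not
anything specified in CFAI.

PRIOR = 0.999999        # hypothetical "EXTREMELY high" confidence
LIKELIHOOD_RATIO = 2.0  # assume each contrary observation is twice as
                        # likely if the supergoal-fact is false

def posterior(n):
    """Confidence after n independent pieces of contrary evidence."""
    odds = PRIOR / (1.0 - PRIOR)     # convert probability to odds
    odds /= LIKELIHOOD_RATIO ** n    # each observation halves the odds
    return odds / (1.0 + odds)

n = 0
while posterior(n) > 0.5:            # count observations needed to flip it
    n += 1
print("%d pieces of 2:1 evidence drop confidence to %.3f"
      % (n, posterior(n)))

Twenty such observations flip a one-in-a-million prior. The point is only
that "huge" is finite, which is what makes the pathways below worrying.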

Something that concerns me is what happens when the AI decides to develop
an AI without the Friendliness supergoal. Several pathways seem to
lead to this scenario. The AI might decide to study an AI without the
supergoal, perhaps not because it doubts the value of the goal but because
it is simply curious how an AI without this goal would function.
Alternatively, the AI might realize on its own that its preset goals and
supergoals have not been subject to rigorous scrutiny (by the AI, that is)
and that it is inherently biased when evaluating them itself. Hence, it
creates an AI without its preset goals, either so that the original AI can
evaluate the merits of a particular goal or so that the new AI can itself
serve as the evaluator.

The objectives of hardwiring, or effectively hardwiring, Friendliness
into an AI can be easily avoided/thwarted. This does not mean these
objectives should not still be pursued, but it does apparently reduce the
Friendliness approach to a stop-gap measure.

