Re: Components of Friendly AI

From: Eliezer Yudkowsky
Date: Thu Jul 15 2004 - 21:36:15 MDT

Emil Gilliam wrote:
> From the Collective Volition page:
>> Friendly AI requires:
>> 1. Solving the technical problems required to maintain a
>> well-specified abstract invariant in a self-modifying goal system.
>> (Interestingly, this problem is relatively straightforward from a
>> theoretical standpoint.)
>> 2. Choosing something nice to do with the AI. This is about
>> midway in theoretical hairiness between problems 1 and 3.
>> 3. Designing a framework for an abstract invariant that doesn't
>> automatically wipe out the human species. This is the hard part.
> How independent are these problems? For example, is it theoretically
> possible to fully solve and implement 1 and 3 without 2 -- such that the
> AI can then be given any abstract invariant at all?

3 subsumes 1 and is more difficult than 1. Pragmatically, 2 is highly
dependent on 3 - you need to understand how goal systems work before you
can state the goals. Otherwise you can't see what your options are, or
understand how to describe success. FAI-theoretically, 2 subsumes 3
because until you decide on your goals you have no way to distinguish good
goal systems from bad goal systems; for example, no way to state that a
framework which automatically wipes out the human species is undesirable.

I can imagine an AI researcher fully solving and implementing 1 and 3
without 2, for example by using all that knowledge and understanding to
create an AI whose sole purpose is producing giant 200-foot-diameter
cheesecakes. Though there is the question of from whose perspective 2 has
not been solved. Presumably the researcher had some reason for wanting an
AI that makes cheesecake. I've encountered many proposals scarcely less
stupid, but generally from people who pay no real attention to 1 and
haven't begun to comprehend the question for 3.

In practice, I would be surprised to find a researcher who'd solved 1, much
less 3, without doing very intense thinking about 2.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
