From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Thu Jan 25 2001 - 23:28:54 MST
Dale Johnstone wrote:
>
> > Under the _Friendly AI_ semantics, the goal system's description is itself
> > a design goal. A redesign under which the goal system "always returns
> > true" may match the "speed" subgoal, but not the "accuracy" subgoal, or
> > the "Friendly decisions" parent goal. The decision to redesign or
> > not-redesign would have to be made by the current system.
>
> How do you measure the 'accuracy' of the goal system's description?

Programmatically this is self-referential but conceptually it's not, if
you see what I mean. In other words, you have the current goal system, in
which the supergoals are probabilistic so that the AI can conceive of the
possibility of a supergoal being "wrong". Still, if you propose an
alternate goal system implementation which has totally different
supergoals, then the alternate implementation will get kicked into the
reject bin because the real-world result of the redesigned AI acting on
those supergoals is projected to be unFriendly.

Another way to phrase it is that you can describe verification and
production separately. The current supergoal content is the verification
predicate. The proposed new implementation is the production system. If
the new production system is predicted to produce results that don't pass
the verification predicate, then the AI won't adopt the new production
system.
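Purely as an illustration (a minimal sketch, not any actual implementation; the names should_adopt, project_outcomes, and satisfied_by are hypothetical placeholders), the verification/production separation described above might look like:

    # A minimal sketch, assuming hypothetical helpers: current_supergoals is a
    # list of goal objects with a satisfied_by() check, and project_outcomes()
    # predicts the real-world results of running the proposed goal system.
    def should_adopt(proposed_goal_system, current_supergoals, project_outcomes):
        """Judge a proposed goal-system redesign under the CURRENT supergoals.

        The current supergoal content plays the role of the verification
        predicate; the proposed implementation is the production system
        being judged.
        """
        projected_results = project_outcomes(proposed_goal_system)
        # Adopt only if every projected result passes the current verification
        # predicate; a redesign that "always returns true" might win on speed
        # but its projected decisions would fail the accuracy and Friendliness
        # checks here, so it lands in the reject bin.
        return all(goal.satisfied_by(result)
                   for goal in current_supergoals
                   for result in projected_results)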
-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence