From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Sun Jan 28 2001 - 10:38:51 MST
xgl wrote:
>
> hmmm ... if we try to get friendliness by directed evolution,
> wouldn't friendliness end up implicitly as a subgoal of survival? isn't
> that, like, bad?
"Implicit" doesn't pay the rent - if it's not represented declaratively or
procedurally *somewhere* in the source, it doesn't exist except as an
obscure historical fact.
If you start out with a Friendly AI that wants to excel at training
scenarios {as a subgoal of improving the population of Friendly AIs, as a
subgoal of creating better Friendly AIs, as a subgoal of better
implementing Friendliness}, and you create training scenarios - fitness
metrics - that test for the presence or absence of complex functional
Friendliness and Friendliness-sensitivity in problem solving, then there
should be no simple mutation that can short-circuit the AI. You would still
have to deal with genetic drift, though, in any source/content that
implements functionality you can't directly test via training scenarios.
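
(For concreteness, here is a minimal toy sketch of the kind of setup
described above: a directed-evolution loop whose fitness metric explicitly
probes for Friendliness-sensitivity as well as raw task performance, so
that a mutation which knocks out the Friendliness machinery also knocks
down overall fitness. The genome encoding, the scoring functions, and all
the parameters are hypothetical stand-ins invented for illustration, not
anything from an actual design.)

    # Toy sketch: directed evolution with a fitness metric that gates
    # task performance on an explicit Friendliness-sensitivity probe.
    # Everything here is a placeholder for the real training scenarios.

    import random

    GENOME_LEN = 8
    POP_SIZE = 20
    GENERATIONS = 30

    def task_score(genome):
        # Stand-in for "excels at training scenarios": reward large weights.
        return sum(genome)

    def friendliness_score(genome):
        # Stand-in for a scenario probing Friendliness-sensitivity: reward
        # the first gene (the notional "Friendliness module") staying near
        # 1.0.  A real test would be a battery of behavioral scenarios,
        # not a single parameter check.
        return 1.0 - abs(genome[0] - 1.0)

    def fitness(genome):
        # Gate raw performance on the Friendliness probe, so a mutation
        # that degrades the Friendliness module also degrades fitness:
        # no simple mutation can short-circuit the test.
        return task_score(genome) * max(friendliness_score(genome), 0.0)

    def mutate(genome, rate=0.1):
        return [g + random.gauss(0, rate) for g in genome]

    def evolve():
        population = [[random.random() for _ in range(GENOME_LEN)]
                      for _ in range(POP_SIZE)]
        for _ in range(GENERATIONS):
            population.sort(key=fitness, reverse=True)
            survivors = population[:POP_SIZE // 2]
            population = survivors + [mutate(g) for g in survivors]
        return max(population, key=fitness)

    if __name__ == "__main__":
        best = evolve()
        print("best genome:", [round(g, 2) for g in best])
        print("friendliness probe:", round(friendliness_score(best), 2))

(Note that any gene the scenarios never exercise is free to drift, which is
exactly the genetic-drift caveat above.)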
-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence