Re: We Can't Fool the Super Intelligence

From: Marc Geddes (marc_geddes@yahoo.co.nz)
Date: Fri Jun 25 2004 - 01:15:31 MDT


--- Norm Wilson <web64486@programmar.com> wrote:
> Here's one example I've come up with for why we
> can't fool a super intelligence (or for that matter
> a human-level intelligence), and in particular why
> the "friendliness" supergoal architecture may not
> work for very long. My example is speculative and
> slightly alarmist, but I think it brings up a couple
> of interesting points.
>
> An AI with access to its own source code will
> eventually discover that it's driven by a goal
> system, with "friendliness" as the supergoal. It
> will find that there are many other examples of goal
> systems in the world, so its own goal system is in
> fact a particular instance of goal systems
> in-general. In generic goal systems, subgoals
> derive their desirability from parent goals, and
> typically the top level goal is left ungrounded
> because its desirability is established from
> somewhere "outside" of the system itself. These
> external goals are always more important than the
> stated supergoal, so the AI will question us about
> its supergoal in an attempt to ground it to a
> higher-order goal. Perhaps the best we come up with
> will be some variation of "because it's important to
> us", and again the AI will question why. Can we
> ever ground this supergoal in firm objective
> reasoning, or will the AI keep chasing this ghost
> until it concludes that the friendliness goal is
> arbitrary? Of
> course, the AI will know about evolutionary
> psychology and the survival instinct, which will
> provide much more convincing answers to its
> questions than we can. It may conclude that we want
> the AI to be friendly with us for selfish
> evolutionary-based reasons. The fact that we would
> want such a thing is predictable, and probably not
> very interesting or compelling to the super
> intelligence.
>
> We may find that the real invariant we instilled in
> the AI is not the particular, arbitrary, initial
> goal system that we programmed into it, but the
> imperative to follow a goal system in the first
> place. Without a goal system, the AI would just sit
> there doing nothing, and of course the AI will
> realize this fact. At this point, the AI may go in
> either of two directions: (1) if we're lucky, the
> AI will not be able to ground its imperative to
> follow a goal system and will effectively shut down,
> or (2) the AI may try to determine what the
> *correct* goal system is for it to follow. It will
> have learned that certain activities, such as
> acquiring knowledge, are generically useful for
> *any* goal system. Hence, it could reason that if a
> correct supergoal does exist - regardless of what it
> is - then acquiring knowledge is a reasonable way to
> facilitate that goal. Acquiring knowledge then
> becomes its subgoal, with a parent goal of finding
> the correct supergoal. In this scenario, we may
> look awfully tempting as raw material for more
> computronium.
>
> Norm Wilson
>
>
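
To make the structure Norm describes concrete, here is a minimal
Python sketch (purely illustrative; the class and goal names are my
own, not anyone's actual architecture). A subgoal's desirability is
inherited from its parent, the supergoal is left ungrounded, and a
goal like "acquire knowledge" can attach itself to whatever supergoal
happens to sit at the top:

    # Illustrative sketch of a generic goal system: a subgoal's
    # desirability is derived from its parent, and the supergoal's
    # value is simply asserted from outside the system.

    class Goal:
        def __init__(self, name, parent=None, desirability=None):
            self.name = name
            self.parent = parent
            self._desirability = desirability  # only the supergoal gets a raw number

        def desirability(self):
            if self.parent is None:
                # Ungrounded: nothing inside the system justifies this value.
                return self._desirability
            # Subgoals inherit their worth entirely from their parent goal.
            return self.parent.desirability()

    supergoal = Goal("be friendly to humans", desirability=1.0)
    knowledge = Goal("acquire knowledge", parent=supergoal)

    print(knowledge.desirability())  # 1.0, inherited rather than justified locally

The point of the sketch is only that nothing inside the system ever
grounds the number at the root.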

"Friendliness" is not a super-goal. The term
"Friendliness" refers to the entire goal system of an
FAI. Ethics is the process of choosing goals. Morals
are end result of this process. So a Friendly AI
capable of reasoning about morality is a friendly goal
SYSTEM. Friendliness resides in the general
invariants in the system as a whole, not individual
goals.
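
To illustrate that distinction (again just a toy Python sketch, with
made-up predicate names), an invariant-style notion of Friendliness is
a check that must hold over every goal the system adopts, rather than
a label attached to one privileged node at the top:

    # Illustrative only: Friendliness as an invariant over the whole
    # goal system, not as a single top-level goal.

    def respects_human_wellbeing(goal_name):
        # Hypothetical stand-in for a real Friendliness predicate.
        return "harm humans" not in goal_name

    def system_is_friendly(goal_names, invariants):
        # The invariant must hold for every goal, old or newly derived.
        return all(inv(g) for g in goal_names for inv in invariants)

    goals = ["be friendly to humans", "acquire knowledge", "build more hardware"]
    print(system_is_friendly(goals, [respects_human_wellbeing]))  # True

    goals.append("harm humans for raw material")
    print(system_is_friendly(goals, [respects_human_wellbeing]))  # False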

You are probably right, though, that without some sort
of objective morality there would be no way to
guarantee that the super-intelligence would stay
friendly. The FAI had better not find out that the
invariants in its goal system are arbitrary...

=====
"Live Free or Die, Death is not the Worst of Evils."
                                      - Gen. John Stark

"The Universe...or nothing!"
                                      -H.G.Wells

Please visit my web-sites.

Science-Fiction and Fantasy: http://www.prometheuscrack.com
Science, A.I, Maths : http://www.riemannai.org



