Re: We Can't Fool the Super Intelligence

From: Marc Geddes
Date: Fri Jun 25 2004 - 01:15:31 MDT

--- Norm Wilson <> wrote:
> Here's one example I've come up with for why we
> can't fool a super intelligence (or for that matter
> a human-level intelligence), and in particular why
> the "friendliness" supergoal architecture may not
> work for very long. My example is speculative and
> slightly alarmist, but I think it brings up a couple
> of interesting points.
>
> An AI with access to its own source code will
> eventually discover that it's driven by a goal
> system, with "friendliness" as the supergoal. It
> will find that there are many other examples of goal
> systems in the world, so its own goal system is in
> fact a particular instance of goal systems
> in-general. In generic goal systems, subgoals
> derive their desirability from parent goals, and
> typically the top level goal is left ungrounded
> because its desirability is established from
> somewhere "outside" of the system itself. These
> external goals are always more important than the
> stated supergoal, so the AI will question us about
> its supergoal in an attempt to ground it to a
> higher-order goal. Perhaps the best we come up with
> will be some variation of "because it's important to
> us", and again the AI will question why. Can we
> ever ground this supergoal in firm objective
> reasoning, or will the AI keep chasing this ghost
> until it concludes that the friendliness goal is
> arbitrary? Of
> course, the AI will know about evolutionary
> psychology and the survival instinct, which will
> provide much more convincing answers to its
> questions than we can. It may conclude that we want
> the AI to be friendly with us for selfish
> evolutionary-based reasons. The fact that we would
> want such a thing is predictable, and probably not
> very interesting or compelling to the super
> intelligence.
>
> We may find that the real invariant we instilled in
> the AI is not the particular, arbitrary, initial
> goal system that we programmed into it, but the
> imperative to follow a goal system in the first
> place. Without a goal system, the AI would just sit
> there doing nothing, and of course the AI will
> realize this fact. At this point, the AI may go in
> either of two directions: (1) if we're lucky, the
> AI will not be able to ground its imperative to
> follow a goal system and will effectively shut down,
> or (2) the AI may try to determine what the
> *correct* goal system is for it to follow. It will
> have learned that certain activities, such as
> acquiring knowledge, are generically useful for
> *any* goal system. Hence, it could reason that if a
> correct supergoal does exist - regardless of what it
> is - then acquiring knowledge is a reasonable way to
> facilitate that goal. Acquiring knowledge then
> becomes its subgoal, with a parent goal of finding
> the correct supergoal. In this scenario,
> we may look awfully tempting as raw material for
> more computronium.
> Norm Wilson
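
Before replying, it may help to make the structure Norm
describes concrete. Here is a minimal toy sketch in Python
(entirely my own illustration; the class, names and numbers
are made up, not part of any actual FAI design): each subgoal
derives its desirability from its parent, while the top-level
goal has no parent inside the system and is simply asserted,
which is exactly the "ungrounded" step Norm points at. The
"acquire knowledge" subgoal shows how an instrumentally useful
activity inherits desirability from whatever supergoal happens
to sit above it.

# Toy sketch only: a goal tree in which subgoal desirability is
# derived from the parent, and the supergoal is "ungrounded"
# (asserted, not derived). All names and numbers are illustrative.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Goal:
    name: str
    parent: Optional["Goal"] = None
    relevance_to_parent: float = 1.0   # how strongly this goal serves its parent
    children: List["Goal"] = field(default_factory=list)

    def add_subgoal(self, name: str, relevance: float = 1.0) -> "Goal":
        child = Goal(name=name, parent=self, relevance_to_parent=relevance)
        self.children.append(child)
        return child

    def desirability(self) -> float:
        if self.parent is None:
            # Nothing inside the system grounds the supergoal; its
            # desirability is simply asserted from "outside".
            return 1.0
        return self.relevance_to_parent * self.parent.desirability()

friendliness = Goal("be friendly to humans")            # asserted supergoal
knowledge = friendliness.add_subgoal("acquire knowledge", relevance=0.8)

print(friendliness.desirability())   # 1.0, by assertion only
print(knowledge.desirability())      # 0.8, derived from the parent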

"Friendliness" is not a super-goal. The term
"Friendliness" refers to the entire goal system of an
FAI. Ethics is the process of choosing goals. Morals
are end result of this process. So a Friendly AI
capable of reasoning about morality is a friendly goal
SYSTEM. Friendliness resides in the general
invariants in the system as a whole, not individual

You are probably right though, that without some sort
of objective morality, there would be no way to
guarantee that the super-intelligence would stay
friendly. The FAI had better not find out that the
invariants in its goal system are arbitrary...
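
To put the "invariants in the system as a whole" point a bit
more concretely, here is another purely illustrative sketch
(the "no harm" invariant is a crude stand-in, not a serious
proposal): friendliness is treated as a predicate evaluated
over the entire goal set rather than as a property of any
single top-level goal, so it can survive changes to individual
goals as long as the invariants keep holding.

from typing import Callable, List

GoalDescription = str   # keep it trivial: a goal is just a description

def system_is_friendly(goals: List[GoalDescription],
                       invariants: List[Callable[[List[GoalDescription]], bool]]) -> bool:
    # Friendliness is judged against invariants quantified over the
    # whole goal set, not by inspecting any single "supergoal".
    return all(invariant(goals) for invariant in invariants)

# A toy invariant (a stand-in for something far harder to state):
no_harm = lambda goals: not any("harm humans" in g for g in goals)

print(system_is_friendly(["acquire knowledge", "be honest"], [no_harm]))              # True
print(system_is_friendly(["acquire knowledge", "harm humans if useful"], [no_harm]))  # False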

"Live Free or Die, Death is not the Worst of Evils."
                                      - Gen. John Stark

"The Universe...or nothing!"

