Singularity Objections: Friendliness, implementation

From: Thomas McCabe (
Date: Tue Jan 29 2008 - 13:39:58 MST


    * An AI forced to be friendly couldn't evolve and grow.
          o Rebuttal synopsis: Evolution and growth are subgoals of
Friendliness; a larger and more intelligent FAI will be more effective
at addressing our problems. "Forcing" an FAI to be Friendly is
impossible; we need to build an FAI that *wants* to be Friendly.

    * Shane Legg proved that we can't predict the behavior of
intelligences smarter than us.
          o Rebuttal synopsis: It's impossible to predict the behavior
of an arbitrary intelligence, but we can predict the behavior of
certain classes of intelligence (e.g., hitting a human will make them

A superintelligence could rewrite itself to remove human tampering.
Therefore we cannot build Friendly AI.

Capability does not imply motive. I could take a knife and drive it
through my heart, yet I do not do so.

This objection stems from the anthropomorphic assumption that a mind
must necessarily resent any tampering with its thinking, and seek to
eliminate any foreign influences. Yet even with humans, this is hardly
the case. A parent's tendency to love her children is not something
she created herself, but something she was born with - but this still
doesn't mean that she'd want to remove it. All desires have a source
somewhere - just because a source exists, doesn't mean we'd want to
destroy the desire in question. We must have a separate reason for
eliminating the desire.

There are good evolutionary reasons why humans might resent being
controlled by others - those who are controlled by others don't get to
have as many offspring as the ones in control. A purposefully
built mind, however, need not have those same urges. If the primary
motivation for an AI is to be Friendly towards humanity, and it has no
motivation making it resent human-created motivations, then it will
not reprogram itself to be unFriendly. That would be crippling its
progress towards the very thing it was trying to achieve, for no

The key here is to think in terms of carrots, not sticks. Internal motivations,
not external limitations. The AI's motivational system contains no
"human tampering" which it would want to remove, any more than the
average human wants to remove core parts of his personality because
they're "outside tampering" - they're not outside tampering, they are
what he is. Those core parts are what drive his behavior - without
them he wouldn't be anything. Correctly built, the AI views removing
them as no more sensible than a human thinks it sensible to remove all
of his motivations so that he can just sit still in a catatonic state
- what would be the point in that?

A super-intelligent AI would have no reason to care about us.

Its reason would be that its initial programming was to care about us. Adults are
cognitively more developed than children - this doesn't mean that they
wouldn't care about their offspring. Furthermore, many people value
animals, or cars, or good books, none of which are as intelligent as
normal humans. Whether or not something is valued is logically
distinct from whether or not something is considered intelligent.

We could build an AI to consider humanity valuable, just as evolution
has built humans to consider their own survival valuable. See also: "A
superintelligence could rewrite itself to remove human tampering".

What if the AI misinterprets its goals?

It is true that language and symbol systems are open to infinite
interpretations, and an AI which has been given its goals purely in
the form of written text may understand them in a way that is
different from the way its designers intended them. This is an open
implementation problem - there seems to be an answer, since the goals
we humans have don't seem to be written instructions that we
constantly re-interpret, but are instead expressed in some other format. It
is a technical problem that needs to be solved.

    * You can't simulate a person's development without creating a
copy of that person.
          o Rebuttal synopsis: While some things are impossible or
extremely difficult to predict, others are easy. Even humans can
predict many things, e.g., that people's bodies will gradually
deteriorate as they grow older.

    * It's impossible to know a person's subjective desires and
feelings from outside.
          o Rebuttal synopsis: Even humans can readily determine, in
most cases, what a person is feeling from their body language and
facial expressions. An FAI, which could get information from inside
the brain using magnetic fields or microscopic sensors, would do a
much better job.

    * A machine could never understand human morality/emotions.
          o Rebuttal synopsis: Human morality is really, really
complicated, but there's no reason to think it's forever beyond the
reach of science. The evolutionary psychologists have already mapped a
great deal of the human moral system.

    * AIs would take advantage of their power and create a dictatorship.
          o Rebuttal synopsis: An AI, which does not have the
evolutionary history of the human species, would have no built-in
drive to seize and abuse power.

    * An AI without self-preservation built in would find no reason to
continue existing.
          o Rebuttal synopsis: Self-preservation is a very important
subgoal for a large number of supergoals (Friendliness, destroying the
human species, making cheesecakes, etc.). Even without an independent
drive for self-preservation, self-preservation is still required for
influencing the universe.

    * A superintelligent AI would reason that it's best for humanity
to destroy itself.
          o Rebuttal synopsis: If *any* sufficiently intelligent AI
would exterminate the human species, any sufficiently intelligent
human would commit suicide, in which case there's nothing we can do
about it anyway.

    * The main defining characteristic of complex systems, such as
minds, is that no mathematical verification of properties such as
"Friendliness" is possible.
          o Rebuttal synopsis: It's impossible to verify the
Friendliness of an arbitrary mind, but we can still engineer a mind we
know how to verify.

    * Any future AI would undergo natural selection, and would
eventually become hostile to humanity to better pursue reproductive
          o Rebuttal synopsis: Significant selection pressure requires
a large number of preconditions, few of which will be met by future

    * FAI needs to be done as an open-source effort, so other people
can see that the project isn't being hijacked to make some guy
Dictator of the Universe.
          o The disadvantages of open-source AI are obvious, but we
really do need a mechanism to assure the public that the project
hasn't been hijacked. - Tom

    * If an FAI does what we would want if we were less selfish, won't
it kill us all in the process of extracting resources to colonize
space as quickly as possible to prevent astronomical waste?
          o Rebuttal synopsis: We wouldn't want the FAI to kill us all
to gather natural resources. We generally assign little utility to having
a big pile of resources and no complex, intelligent life.

    * It's absurd to have a collective volition approach that is
sensitive to the number of people who support something.
          o Rebuttal synopsis:

 - Tom
