Re: In defense of Friendliness was RE: [wta-talk] My own A.I project: 'Show-down with Singularity Institute"

From: Eliezer S. Yudkowsky
Date: Thu Oct 17 2002 - 01:25:32 MDT

Marc Geddes wrote:

> Fair enough, but I did kind of get the impression that Eliezer
> expected the FAI to act like a wish-granting genie.

No, if I inflated to superintelligence, then this is my best current
understanding of how I would be ethically obligated to interact with any
still-basic humans around.

Why? In the absence of an objective morality (an undetermined possibility
that needs to be projected both ways) then I don't see any moral reason
that would allow a superintelligence to meddle in other people's lives
without their consent. As a human, interacting with other humans, I can
try to convince people that they should want different things. I can do
this because I don't have the intelligence to see humans as systems whose
actions are effectively determined by which of several sentences I choose
to say to them. At that level, even simple conversation becomes meddling
with people's minds if you start choosing between innocent-sounding
statements on the basis of a superintelligent understanding of how those
statements will affect that person's trajectory, according to goals not
shared by that person. I don't see any privileged moral frame of
reference under human morality that would enable one entity's strictly
personal goals to justly supervene on another entity's strictly personal
goals. But giving people what *they* want, as opposed to what *you* want
or even what you would prefer that they want, is okay.

Another way of thinking about it is this. Suppose that you're interacting
with a human-derived superintelligence that happens to be a devout
Zoroastrian (a thought that makes my brain hurt, but let's suppose for the
sake of discussion that this is cognitively possible). Let's suppose
there are no external guards on how this superintelligence interacts with
you. What constraints do you hope this superintelligence feels obliged to
obey in interacting with you? I would feel that my rights had been
violated if our conversation seemed innocent on the surface - in
accordance with my wishes - but had actually been chosen out of a vast
space of possibilities with the intent of making my life serve the overall
goals of that ZSI (Zoroastrian superintelligence). I would not wish to
find my life trajectory warped to serve goals I find repugnant.

But in the absence of objective morality, how can the same ethics not also
apply to a rational superintelligence interacting with a human
Zoroastrian? Fair is fair. If I don't want my life warped to serve a
view of destiny that I disagree with, then undoubtedly neither do
Zoroastrians. Now it could be that there is some view that is simply
*right* and that I and the Zoroastrians are both wrong. But in the
absence of objective morality, what's left for the ethics of
superintelligences interacting with mortals is the idea that sentient
beings should get what *they* want. And "what they want" should be
defined however *they* want to define it.

If you're interacting as an equal, you can try and convince people to want
other things, and they can try and convince you back. But if you have
substantially transhuman brainpower and can model people from the design
stance, such that you have the power to convince people of arbitrary
propositions, at that point what you are doing ceases to be "argument" and
becomes "mind control", unless the discussion is carried out *solely*
according to the wishes of the person being talked to.

Now you might say that an obvious way around this - in ethical terms - is
for a superintelligence to create an interacting system on the same level
as the human. And if I were talking with a voluntarily ethical
Zoroastrian superintelligence for some reason, this is in fact what I
would prefer. Given my current limited understanding of ethics, that is;
obviously you should first ask a more trusted superintelligence about
whether this is what a smarter you would regard as the ethical way to

But if you're talking about a Transition Guide or Sysop - a
superintelligence occupying a unique position with respect to the
trajectory of human destiny - then *any* transfer of personal preferences
by the original programmer constitutes a hijacking of human destiny.
Would you want a transcending upload with personal goals you found
repugnant using her unique position as Transition Guide to leverage those
goals over all human destiny? No? Then, failing an objective morality,
the general ethical principle must be that a mind which will occupy a
unique position with respect to human destiny, such as the first seed AI,
should not be leveraging the personal goals of its programmers. Not in
*any* way, big or small, because you wouldn't like it if someone else did
that to you. Eliminate sensitivities on the choice of original
programmers. If any mind-in-general needs *some* set of highly developed
personal preferences, then use the modeled personal preferences of a
generic human, not the personal preferences of the original programmers.
And that's even *if* the personal preferences of this mind-in-general play
no role in how it interacts with not-yet-upgraded intelligences, or play a
role only on the specific request of those intelligences. Don't hijack
human destiny. Being a seed AI programmer doesn't make you morally
different from any other human.

Of course my current understanding of morality may be wrong, but my
current understanding of morality is all that I have available to use as
raw material to illustrate to an AI how human moral philosophy works.

Marc Geddes: Your wish is to achieve your immediate subgoals as a result
of personal effort, rather than having them granted directly as wishes.
(If you Google on "Singularity Fun Theory", you'll find out *why* you
prefer this... but that's a separate story.) That wish is certainly
suboptimally achieved with respect to your current environment. You
didn't create your shoes, much less your feet. Your functional components
are systems that evolution designed and then handed to you on a silver
platter. The various artifacts around you were created by other humans
and operate as independent causal systems divorced from your sensorimotor
architecture; you control them as external tools rather than feeling them
as parts of yourself. If your wish is to find happiness in achieving your
goals as the result of your personal creative efforts, rather than having
things handed to you on a silver platter, then that's certainly a wish
that could be achieved to a much greater degree after the Singularity than

> All that's needed is an injunction not to violate the rights of
> others. (i.e. to regard all other sentients as 'ends in themselves'
> and not initiate force against them except in self-defense)
> A little bit of self-centeredness should not be a problem if the
> Kantian imperative is backed up by reason. (i.e. if the A.I reasons
> that it is indeed 'ethically correct' to regard other sentients
> as 'ends in themselves')
> Agreed. In fact as part of my suggested supergoals the A.I would co-
> operate with humans. (As part of its quest for knowledge I
> suggested a desire to 'teach' what it learns, so it would be sharing
> its knowledge with us. In fact this sharing of knowledge is
> actually probably the best way an A.I could help us)


You can't mess around with AI morality this way - adding a piece here,
taking a piece there according to your whims. Do you construct your own
morality using this kind of thinking? No? Then why do you believe you
can use it to construct an AI's?

This is not a question of making up random stuff and asking the AI to do
it. What do *you* believe is right? What do *you* believe is wrong? No
matter how imperfect your moral philosophy is, it's something that can
exist at least temporarily in at least one mind.

By thinking of FAI as something separate from yourself, you are destroying
your ability to even get started on understanding metamorality. We
interact with other humans by cajoling them, giving them orders, making
alliances, wondering about their other loyalties. We *create* only
ourselves. Building AI is an act of creation, not a matter of giving
orders. How do *you* choose between actions? What are *your* supergoals?

If you're wondering how this gets reconciled with "no sensitivity on
choice of initial programmer", it's because opening a channel to your own
moral substance is how you provide the AI with an *interim* approximation
of the substance of humanity. In other words, just because your supergoals
are good enough for you doesn't mean they're good enough for the AI. But if
they're not good enough for *you*, they're *certainly* not good enough for
the AI. That doesn't mean inventing whatever you want piecemeal, then
rationalizing why each piece *would be* good enough for you. It means
using what actually genuinely *was* good enough for you.

> It may be that a 'non-self
> oriented supergoal' might turn out to be harmful in some way that we are
> currently unaware of.

You're going to have to offer something more specific than "might" if you
want to convince me that observer-centered supergoals are necessary to the
integrity of minds-in-general. I don't believe that having ten fingers is
necessary to all possible minds-in-general. I don't believe that being
built from amino acid chains twisted into complex shapes by van der Waals
forces is a requirement of minds-in-general. Why should the feature you
cite be anything except another contingent product of our evolutionary
origins? You are arguing from your own ignorance of the evolutionary
psychology of goal systems. I suppose this can be an effective argument
if the audience doesn't know either - i.e., Jeremy "How do we *know*
genetically engineered organisms won't spontaneously explode?" Rifkin -
but it ill behooves a wannabe specialist.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
