Re: On Our Duty to Not Be Responsible for Artificial Minds

From: Eliezer S. Yudkowsky
Date: Tue Aug 09 2005 - 11:09:46 MDT

Mark Walker wrote:
> I think we are making progress, for I think I see what you mean here. If
> I understand you, you are saying that someone who FALSELY claims that
> "this entity I created is an autonomous being, I'm not responsible for
> its actions" has no defense. If this is what you are saying, I agree.
> Suppose, however, the AI passes whatever tests we have for autonomy and
> does as well as you or I. Can the creators of the AI not now claim that
> they are no more responsible for the entity's actions than your parents
> are responsible for your actions? If not then why is this not speciesism?

Just because you invent a word does not give it a real referent nor even a
well-defined meaning.

What exactly is "autonomy"? What are these tests that measure it? I cannot
even conceive of such a test.

Considering the relation between my parents and myself, "autonomy" consists of
my parents being able to control a small set of variables in my upbringing and
unable to control a much larger set of variables in my cognitive design. Not
because my parents *chose* to control those variables and no others, but
because my parents were physically and cognitively *unable* to select my
genome on the basis of its consequences. Furthermore, my cognitive design -
fixed beyond parental control - determined how I reacted to parental
upbringing. My fixed cognitive design placed some variables within my
parents' deliberate control, in the sense that they could, by speaking
English, ensure I would grow up speaking English. However, some variables
that my parents greatly desired to control, such as my religion, were beyond
the reach of their best efforts at upbringing. It is not that they chose not
to control this variable but that they were incapable of controlling it.

In the case of an AI researcher we have many, many possibilities. Here are
some possibilities that occur to me:

1) The AI researcher is fully capable of choosing between AI designs on the
basis of their consequences, and chooses an AI design which invokes no
significant moral processing within the AI. In this case I would assign moral
responsibility to the AI researcher alone, for all consequences good or ill;
the AI itself is not a moral agent.

I assign full responsibility to the AI researcher for all consequences,
intended or unintended. An AI researcher has a responsibility to choose an AI
design with predictable consequences. If the AI researcher negligently uses
an AI design the AI researcher can't predict, the AI researcher is still fully
responsible for all actual consequences.

2) A competent AI researcher, acting on a perceived moral responsibility to
create moral children, deliberately constructs an AI which is tightly or
loosely analogous to a human - such that the initial design responds to
environmental stimuli and parental upbringing much as does a human, learns and
argues morality much as does a human, is partially unaware of its own emotions
like a human, etc.

We presume that this is a deliberate attempt to create a child of humankind, a
worthy participant in the story of Earth-originating life.

In this case I would assign a mixture of moral responsibility to the AI
researcher and to the AI. The AI, having been deliberately constructed as a
moral agent, bears responsibility for its actions. If we hold the AI to
account, it will understand what it means to be held to account, and treat
this as a moral argument in the same way we do. (In contrast a paperclip
maximizer would care about human moral arguments only as a sort of cognitive
activity in humans that might be exploited to create paperclips.) The AI
researcher is responsible for all predictable consequences of "constructing a
humanlike moral agent", including liability for child abuse if later
authorities determine the initial design to have been botched. But I would
not say that the AI researcher is responsible for all actions of the created
AI, presuming that the created AI was at least as initially benevolent as an
average human. Deliberately creating an AI that is worse than average, for
example, an AI that starts out with the same emotional makeup as an autistic
or a serial killer, makes the AI researcher liable for both child abuse and
for the consequences of the AI's actions.

3) The AI researcher deliberately chooses an AI design which involves complex
moral processing, but a different sort of complex moral processing than a
human being. Coherent Extrapolated Volition, for example. In this case,
assigning moral responsibility becomes difficult; we're operating outside the
customary problem space. An AI researcher, responding to a perceived moral
duty, invents an AI which takes its direction from a complexly computed
property of the human species as a whole. If this AI saves a life, to whom
belongs the credit? The researcher? The human species? The AI?

I would assign moral responsibility to the AI programmer for the predictable
consequences of creating such an AI, but not the unpredictable consequences,
provided that the AI as a whole has predominantly good effects (even if there
are some negative ones). If the AI has a predominantly negative effect,
whether by bug or by unintended consequence, then I would assign full
responsibility to the programmer.

If a CEV saves you from dying, I would call that a predictable (positive)
consequence and assign at least partial responsibility to the programmers and
their supporters. I would not assign them responsibility for the entire
remaining course of your life in detail, positive or negative, even though
this life would not have existed without the CEV. I would forgive the
programmers that your evil mother-in-law will also live forever; they didn't
mean to do that to you specifically.


I don't believe there exists any such thing as "autonomy".

The causal graph of physics goes back at least to the Big Bang. If you don't
know the cause, that's your own ignorance; it doesn't mean there is no cause.

I am not "autonomous". I am a Word spoken by evolution, which determined both
my tendencies and my susceptibility to environmental influence. Where there
is randomness in me it is because my design permits random effects.
Evolution created me via a subtle and broken algorithm, which caused the goals
of my internal psychology to depart far from natural selection's sole
criterion of inclusive genetic fitness. Either way, evolution bears no moral
responsibility because natural selection is too far outside the humane space
of optimization processes to internally represent moral arguments.

My parents were almost entirely powerless compared to an AI designer. My
parents can bear moral responsibility only for what they could control. Given
those fixed background circumstances, I understand, respect, and am grateful
to my parents where they deliberately chose not to exercise a possible
control, seeing an obligation to let me make up my own mind. Which is to say
that my parents handed determination back to the internal forces in my mind,
which they did not choose to create. My parents let me make my own decision
rather than crushing me, in a case where my internal cognitive forces would
exist regardless. Had my parents also knowingly selected my nature, their
decision not to nurture too hard would take on a stranger meaning.

It is not clear what, if anything, an AI researcher can deliberately do that
is analogous to the choice a human parent faces - even if we understand and
respect and attach significant moral value to a human parent's choice not to
determine offspring too strongly. The mechanisms of "autonomy", if we value
them, would need to be deliberately created in a nonhuman mind. It is
predictable that if you construct a mind to love it will love, and if you
construct a mind to hate it will hate. In what sense would the AI programmer
*not* be responsible? Perhaps we can rule that we value human likeness in
artificial minds, that it is good to grant them many emotions sometimes in
conflict. We could hold the AI researcher responsible for the choice to
construct a humanlike mind, but not for the specific and unpredictable outcome
of the humanlike emotional conflicts.

This exception requires that the AI researcher gets it right and creates a
healthy child of humankind. Screw it up - create a mind whose internal
conflicts turn out to be simpler and less interesting than the human average, or
whose internal conflicts turn out to be more painful - and I would hold the
designers fully responsible. If you can't do it right, then DON'T DO IT. If
you aren't sure you can do it right, WAIT until you are. I would like to see
humankind get through the 21st century without inventing new and horrible
forms of child abuse.

An AI researcher who deliberately builds an AI that is unpredictable to its
designer, but that does not qualify as a healthy child of humankind, bears full
responsibility for the consequences of the AI's unpredictable actions whatever
they may be. This is so even if the AI researcher claims deliberate refusal
to understand in order to preserve the quote autonomy unquote of the AI. I
would advise that you not believe the claim. Incompetence is not a moral
duty, but people often try to excuse it as a moral duty. "Moral autonomy" is
not randomness. There is nothing moral about randomness. Nor is everything
that you're too incompetent to predict "autonomous".

Moral autonomy requires a specific kind of cognitive complexity which will
take high artistry to create in an artificial mind. The designers might
*choose* not to compute out in advance the child's destiny, nor fine-tune the
design on the basis of such predictions. But be very sure, the designers do
understand *all* the forces involved - if they possess the art to create a
healthy child of humankind.

Ignorance exists in the mind, not in reality. The blank spot on the map does
not correspond to a blank spot on the territory. To whatever extent "moral
autonomy" invokes designer ignorance about outcomes, "moral autonomy" must be
a two-place predicate relating a designer and a designee, not a one-place
predicate intrinsically true of the designee. There are mysterious questions
but never mysterious answers, etc.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
