AI morality

From: Ben Goertzel
Date: Sat Jun 29 2002 - 10:59:18 MDT

Eliezer wrote:
> The vast majority of religious
> people, especially what we would call "fundamentalists" and those outside
> the First World, adhere to a correspondence theory of the truth of their
> religion; when they say something is true, they mean that it is so; that
> outside reality corresponds to their belief.

I don't think you have it quite right.

What they mean is more nearly that *their experience corresponds to their

Modern scientific realism is based on accepting outside reality,
observations of physical reality, as the fundamental determinant of truth.

On the other hand, many other traditions are based on accepting *inner
intuitions and experiences* as the fundamental determinants of truth.

It seems to me like you don't fully appreciate what it means for someone to
have a truly non-rationalist, non-scientific point of view. Probably this
is because your life-course so far has not led you to spend very
much time with such people. Mine, as it happens, has. I know you grew up
in a religious home, but there are more and less rationalist varieties of
religion, too...

Take yoga, or Zen Buddhism, as examples. These are ancient traditions with
a lot of depth and detail to them. Their validity, such as it is, is
primarily *experiential*. It is largely not based on things that
individuals outside the tradition in question can observe in empirical
reality. [Yeah, I know people have tried to test for enlightenment by
studying brain waves and such (lots of work at Maharishi University on
this), but this isn't what it's all about -- this is icing on the cake from
the spiritual point of view.]

When my wife for instance became interested in Zen, it wasn't because any
kind of analysis of observations convinced her, it was because some things
she read in a Zen book resonated with some experiences she'd already had...

> > To those who place spiritual feelings and insights above reason (most
> > people in the world), the idea that an AI is going to do what is "right"
> > according to logical reasoning is not going to be very reassuring.
> Under your definition of "logical reasoning", I can't say I would want to
> see a logical AI either.

Nitpicking about the definition of logic is not to-the-point here. In
Novamente we have a narrow technical definition of "reasoning" as opposed to
other cognitive processes, and I can see that I've made the error in some
SL4 posts of using this technical definition in nontechnical discussions,
whereas in ordinary language "reasoning" means something broader. Sorry
about that.

But the point at hand is: many folks will be totally unconvinced by
*anything* an intelligent, scientifically-minded AGI says -- just as they
are unconvinced by your and my arguments that God probably didn't really
create the world 6000 years ago, that there probably isn't really a Heaven
into which only 144000 people will ever be accepted, etc.

> > And those who have a more rationalist approach to religion, would only
> > accept an AI's reasoning as "right" if the AI began its reasoning with
> > *the axioms of their religion*. Talmudic reasoning, for example, defines
> > right as "logically implied by the Jewish holy writings."
> Um... not really. If I recall correctly, Ben, you're second-generation or
> third-generation ex-Jew. Can I take it that you weren't actually
> forced as a child to study the Talmud?

My parents were quasi-religious; my dad was a Quaker, my mom was a Jew. I
never received a religious education.

I know there are many varieties of Judaism, and I don't remember the right
labels for all of them. What I'm thinking of is actually some Orthodox Jews
I used to argue with when I lived in Williamsburg, New York City, the world
center for Hasidic Jews...

> Ben, I have been taught at least one viewpoint which is not the empirical
> viewpoint of modern science. It is pretty strange but it is not outside the
> correspondence theory of truth.

There are a lot of kinds of Judaism. I don't know what kind you were
taught, and I don't really know much about any of them.

But -- I do know that Zen Buddhism and yoga and Sufi-ist Islam are outside
the correspondence theory of truth as you describe it, in the sense that
they define truth more by correspondence with inner experience than by
correspondence with physical reality.

Physical reality, according to these traditions, is an illusion. Emotions
are also illusions. Only a certain kind of crystal-clear inner insight
(yes, these words don't do it justice...) is to be "trusted" (though in a
sense it's viewed as having a directness beyond trust/mistrust).

> If you assume that Judaism is the correct
> religion, then a Friendly AI would be Jewish.

The thing is that my wife, a fairly rational person and a Buddhist, would
not accept the statement "If you assume that Buddhism is the correct
religion, then a Friendly AI would be Buddhist."

The fact that you, Eliezer, accept this statement is a consequence of your
scientific rationalist philosophy, and your faith in the potential
rationality of AIs. None of these religious folks -- or very few of
them -- care what statements you choose to accept...

> Whether I could convince a
> rabbi of that in advance is a separate issue, but it does, in fact, happen
> to be true, and *that's* the important thing from the perspective of
> safeguarding the integrity of the Singularity, regardless of how it plays
> out in pre-Singularity politics.

So the important thing to you is that the Singularity has "integrity"
according to your scientific rationalist belief system. Fine.

This doesn't mean the Singularity will have "integrity" according to the
belief systems of the vast majority of people in the world.

I don't see how this doesn't constitute "imposing your morality on the
world." In my view, it does. What you're saying is basically that you want
to ensure the Singularity is good according to your standards, where your
standards have to do with a kind of rationalistic "integrity" that you (but
not most others) see as extremely valuable.

> > No, I think this is an overstatement. I think that some aspects of human
> > thought are reaching out beyond the central region of the "human zone,"
> > whereas others are more towards the center of the human zone.
> Of course. And outside the human zone is a thousand times as much space
> which our thoughts will never touch.


> I expect to have my living daylights shocked out by the Singularity along
> with everyone else, regardless of whether I am open-minded or close-minded
> compared to other humans. The differences bound up in the Singularity are
> not comparable in magnitude to the differences between humans.

True also

> And they would, for that matter, rightly scream their heads off
> if SIAI created an AI that was given atheism as an absolute premise, the
> verbal formulation of rational empiricism as an absolute premise, or if
> there was in any other way created an AI that could not perceive the
> rightness of religion XYZ even if XYZ were true.

The AGI that I create is going to have a bias toward rationality and toward
empiricism, because these are my values and those of the rest of the
Novamente team. Not an *absolute bias*, but a bias. When it's young, I'm
going to teach it scientific knowledge *as probable though not definite
truth*, and I'm going to show it the Koran as an example of an interesting
but empirically unsupported human belief system.

Individuals who believe the scientific perspective is fundamentally wrong,
might be offended by this, but that's just life.... I am not going to teach
Novababy that the Koran and Torah and Vedas are just as valid as science,
just in order to please others with these other belief systems. Of course,
I will also teach Novababy to think for itself, and once it becomes smarter
than me (or maybe before) it will come to its own conclusions, directed by
the initial conditions I've given it, but not constrained by them in any
absolute sense.

> I would answer that I have never found any specific thing to value other
> than people,

Well, we are very different. I also value many other things, including
animals (although I'm a bit fed up with the 5 dogs I'm living with at the
moment!!), plants, mathematics (some of which, although humanly invented,
seems to me to have value going beyond the human), computer programs,...

> All sentient life has value, and so does the volition of that life.

And this is your personal value system, not a universal one... in my own
value system, nonsentient life also has a lot of value.... This is a common
perspective, though not universal.

I'm not saying you don't value nonsentient life at all, but the fact that
you omitted to mention it, suggests that maybe it's not as important to you
as it is to me.

These variations among individual value systems may possibly be passed along
to the first AGIs. If the first AGI is raised by a nature-lover, it may be
less likely to grow up to destroy forests & fluffy bunnies...

> If the
> AI *is* sensitive to your purpose, then I am worried what other things might
> be in your selfish interest, if you think it's valid for an AI to have goals
> that serve Ben Goertzel but not the human species.

It is not true that I want the AI to have goals that serve me but not the
human species. I'm not *that* selfish or greedy.

The point is that I am going to teach a baby AGI, initially, *one particular
vision* of what best serves the human species, which is different from the
vision that many other humans have about what best serves the human species.

I do not think there is any way to get around this subjectivity. By
involving others in the Novababy teaching process, I think I will avoid the
most detailed specifics of my personal morality from becoming important to
the AGI. However, there will be no Taliban members in the Novababy teaching
team, nor any Vodou houngans most likely (though I do know one, so it's
actually a possibility ;).... Novababy will be taught *one particular
version* of how to best serve the human species, and then as it grows it
will develop its own ideas...

There are big differences from the teaching of a human child, but also some

> > But I want it to place Humans
> > pretty high on its moral scale -- initially, right up there at the top.
> > This is Partiality not Impartiality, as I see it.
> Don't you think there's a deadly sort of cosmic hubris in creating an AI
> that does something you personally know is wrong?

Not a deadly sort.

I think there is a lot of hubris in creating an AGI, or launching a
Singularity, in the first place, yeah. It's a gutsy thing to try to do.

And I do not intend to create an AI to do something I know is wrong. Quite
the contrary, I intend to create an AI that initially embodies roughly the
same sense of "rightness" as myself and my social group (modern rationalist,
scientifically-minded transhumanists, with a respect for life and diversity,
and a particular fondness for the human species and other Earthly

I think it's a lot safer to start a baby AGI off with the moral system that
I and my colleagues hold, than to start it off with some abstract moral
system that values humans only because they're sentient life forms.

I do have a selfish interest here: I want me and the rest of the human
species to continue to exist. I want this *separately* from my desire for
sentience and life generally to flourish. And I intend to embed this
species-selfish interest into my AGI to whatever extent is possible.

> >> The AI uses it to learn about how humans think about morality; you,
> >> yourself, are a sample instance of "humans", and an interim guide to
> >> ethics (that is, your ethics are the ethics the AI uses when it's not
> >> smart enough to have its own; *that* is not a problem).
> >
> > What we want is for the AGI to have our own human-valuing ethics, until
> > such a point as it gets *so* smart that for it to use precisely human
> > ethics, would be as implausible as for a human to use precisely dog
> > ethics...
> Okay. That last point there? That's the point I'm concerned about - when
> the FAI gets *that* smart. At *that* point I want the FAI to have the same
> kind of morality as, say, a human upload who has gotten *that* smart. I do
> not think that a human upload who has gotten *that* smart would have human
> ethics but I don't think they would be the ethics that a rock or a bacterium
> would have, either. Human ethics have the potential to grow; *that* is why
> an FAI needs human ethics *to start with*.

Right, we agree on all this, but the thing you don't seem to fully accept is
that there is NO SUCH THING as "human ethics" generally speaking. Human
ethics are all over the map. Are you a vegetarian? No, then your ethics
are very different from those of a lot of the world's population. Do you go
to the doctor when you're sick? Oops, according to Christian Scientists,
that's immoral, it's against God's Law.... Etc. etc. etc. etc. etc. etc.

An AGI cannot be started off with generic human ethics because there aren't
any. Like it or not, it's got to be started out with some particular form
of human ethics. Gee, I'll choose something resembling mine rather than
Mary Baker Eddy's, because I don't want the AGI to think initially that
medical intervention in human illnesses is immoral... I want it to help
create longevity drugs, which according to Christian Scientists' ethics is

> When you are dealing with a seed AI, the AI's goal
> system is whatever the AI thinks its goal system ought to be.

You're describing a dynamical system

GoalSystem(t+1) = F( GoalSystem(t) )

where F is the self-modifying dynamics of the AI holding the goal system.

I'm describing how I plan to set the initial condition...
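
To make the distinction concrete, here is a minimal toy sketch in Python. Everything in it is an assumption made up for illustration: the goal names, the weights, and the update rule toy_f are hypothetical and have nothing to do with Novamente's actual design. The only point is that the same F, started from two different initial conditions, can end up in very different places.

```python
from typing import Callable, Dict, List

# Hypothetical representation: a goal system is a map from goal name to weight.
GoalSystem = Dict[str, float]


def iterate(f: Callable[[GoalSystem], GoalSystem],
            initial: GoalSystem,
            steps: int) -> List[GoalSystem]:
    """Apply GoalSystem(t+1) = F(GoalSystem(t)) for a fixed number of steps."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(f(trajectory[-1]))
    return trajectory


def toy_f(goals: GoalSystem) -> GoalSystem:
    """Illustrative self-modification rule (an assumption, not a real design):
    square every weight and renormalize, so whichever goal starts out
    stronger gets reinforced at each step."""
    squared = {name: weight * weight for name, weight in goals.items()}
    total = sum(squared.values())
    return {name: weight / total for name, weight in squared.items()}


# Same dynamics F, two different initial conditions.
final_a = iterate(toy_f, {"humans": 0.6, "abstract_sentience": 0.4}, 5)[-1]
final_b = iterate(toy_f, {"humans": 0.4, "abstract_sentience": 0.6}, 5)[-1]

print(final_a)  # weight concentrates on "humans"
print(final_b)  # weight concentrates on "abstract_sentience"
```

In this toy, F is identical in both runs; the outcome is decided entirely by which goal the system was initially taught to weight more heavily.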

> > We need to hard-wire and/or emphatically teach the system that our own
> > human-valuing ethics are the correct ones,
> Are they?
> Let's ask that first,

Look, humans have been debating ethics for millennia. No consensus has been

There is no rational way to decide which ethical system is "correct."
Rather, ethical systems DEFINE what is "correct" -- not based on reasoning
from any premises, just by fiat.

From a rational/empirical perspective, the choice between
non-internally-contradictory ethical systems is an arbitrary one. From
other perspectives, it's not arbitrary at all -- religious folks may say
that the correct ethical system can be *felt* if you open up your inner mind
in the right way.

I can see that these notions of reason versus experience, axioms versus
derivations, and so forth, may be perceived by a superhuman AGI as just so
much primitive silliness.... But within the scope of human thought, there
is no way we're gonna rationally decide which ethics are correct, or arrive
at any sort of consensus among humans on what the correct ethics is. Ethics
is not that sort of thing.

> and in the course of asking it, we'll learn something
> about what kind of thinking a system needs to regard as valid in order to
> arrive at the same conclusions we have.
> > and let it start off with
> > these until it gets so smart it inevitably outgrows all its teachings.
> The question of what you need to supply an AI with so that it *can* outgrow
> its teachings - not just end up in some random part of the space of
> minds-in-general, but actually *outgrow* the teachings it started with,
> after the fashion of say a human upload - is exactly the issue here.

Well, that is not the question of Friendly AI, that is simply the question
of AGI.

Any AGI worthy of the name will be able to outgrow its initial teachings.
Even humans can largely outgrow their initial teachings. You have outgrown
a lot of yours, huh?

The goal of Friendly AI, generally speaking, should be to supply the AGI
with initial conditions (mind processes AND beliefs) that are likely to
cause its progressive self-modification to lead it to a favorable future
state. And defining "favorable" here is a subjective value judgment!

-- Ben G
