Re: FAI means no programmer-sensitive AI morality

From: Eliezer S. Yudkowsky
Date: Sat Jun 29 2002 - 16:15:28 MDT

Ben Goertzel wrote:
>> The vast majority of religious
>>people, especially what we would call "fundamentalists" and those outside
>>the First World, adhere to a correspondence theory of the truth of their
>>religion; when they say something is true, they mean that it is so; that
>>outside reality corresponds to their belief.
> I don't think you have it quite right.
> What they mean is more nearly that *their experience corresponds to their
> beliefs*

Yes. They mean that their experience corresponds to their beliefs, that
their beliefs give them the experience of personal validation, that
(they think) their beliefs support altruism and uphold the common social
fabric, that their beliefs are acceptable to their peers... and that
their beliefs correspond to external reality.

The human mind comes with many reasons to believe. Science includes
some of them. Religion and all other human stories include all of them.
Only in modern times have theologians actually attempted to eliminate
rationality from religion, and of course they have not succeeded; plus,
this is a handful of theologians with no relation whatsoever to either
the average dude's belief or the actual spirit of the religion.

> Modern scientific realism is based on accepting outside reality,
> observations of physical reality, as the fundamental determinant of truth.
> On the other hand, many other traditions are based on accepting *inner
> intuitions and experiences* as the fundamental determinants of truth.
> It seems to me like you don't fully appreciate what it means for someone to
> have a truly non-rationalist, non-scientific point of view. Probably this
> is because your life-course so far has not led you to spend significantly
> much time with such people. Mine, as it happens, has.

Uh... Ben, unless you spent 4 hours a day during your first twelve years
being indoctrinated in a fundamentalist religion...

No offense, Ben, but I probably have a much better picture of how a
fundamentalist Jew *actually thinks* than you do. Your understanding of
religion appears to be built around a very abstract outsider's
viewpoint. In my experience religious people argue just like other
people, regardless of what they say they believe about empiricism and so
on. For that matter, the vast majority of scientists argue just like
other people, regardless of what *they* say they believe about
empiricism and so on. During my first twenty years on Earth I had the
opportunity to hear far more Divrei Torah than any sane being would
voluntarily undergo, and yes, there is rationally structured argument,
cause and effect, evidence, et cetera, mixed in with the social morals
and the appeals to internal experience and so on.

I think you are confusing the principles which people verbally adhere to
with the way that people actually think.

And while I'm not proud of it, yes, I *was* religious up until the age
of, oh, I would say around eleven years or so. We don't choose our
parents. I do remember how it worked. I'm curious, Ben, have you ever
actually *been* religious? Do you know how religious thinking works *in
the first person*? Because I have to say you're sounding like a
complete outsider here - like your idea of religion comes from watching
scientists debating theologians about the nature of truth - which is a
very different thing from how ordinary religious people actually think
in practice. I'm sure that you've had chats with your semireligious
parents and your semireligious wife and so on, and maybe read a few
books, but you may need to consider that standing back as a scientist
and going "Gosh, how *utterly alien* and *unempirical*" is going to give
you a different perspective, and one which is maybe a bit unrealistic
about the way religious people talk to each other when they're not
debating a scientist or whatever.

And finally: Ben, could you please stop with the condescending "Oh, you
must not have XYZ" stuff? I've held off complaining until now but it's
starting to get outright silly. As I've illustrated above, two can play
at that game, and in this case I'm starting to get a bit sick of it. I
don't know whether you actually have any in-depth experience of how
religious people think - maybe you do and maybe you don't; you certainly
think you do - but it is foolish to pull rank on someone who was,
however involuntarily, raised by Orthodox Jews for twenty years. I'm
sure I don't know everything there is to know about Orthodox Jewish
thinking, much less Buddhism or any of the other religions, but if you
think there's something I'm missing, kindly explain it to me as an equal
instead of attempting to pull rank due to your supposed greater depth of
religious experience.

> Take traditional Chinese medicine, or yoga, or Zen Buddhism, as examples.
> These are ancient traditions with a lot of depth and detail to them. Their
> validity, such as it is, is primarily *experiential*. It is largely not
> based on things that individuals outside the tradition in question can
> observe in empirical reality. [Yeah, I know people have tried to test for
> enlightenment by studying brain waves and such (lots of work at Maharishi
> University on this), but this isn't what it's all about -- this is icing on
> the cake from the spiritual point of view.]
> When my wife for instance became interested in Zen, it wasn't because any
> kind of analysis of observations convinced her, it was because some things
> she read in a Zen book resonated with some experiences she'd already had...

Ben, I think that you are again being much too restrictive about what
constitutes rationality and evidence. People's internal experiences can
be evidence too. If Zen enlightenment takes place it is a real event.
(Incidentally, I think Zen enlightenment is probably a real thing even
if it is not mystical in nature, and I've always wanted to try and defy
the Zen rules and come up with a purely rational train of thought which
ends in Zen enlightenment... but this may actually be as impossible as
much Zen tradition insists.)

Is the fountain of human altruism a piece of external evidence or is it
a subjective experience? I can certainly imagine a Power discovering an
objective morality which exists outside of all observers and which is
apparent to any sufficiently intelligent entity regardless of its
previous goals. This would certainly simplify Friendly AI immensely,
but unfortunately we cannot assume it to be the case. In that case, one
of the foundations of Friendliness, the shaper of altruism, will be the
human subjective experience of altruism. Subjective experiences are not
outside the scope of Friendly AI, nor do they exist outside the physical
universe.

The idea that religion and rationality are orthogonal is a modern idea
proposed by modern theologians; it also happens to be wrong. If I am to
speak from my personal experience then every Orthodox Jew with whom I
have ever discussed religion has made any number of readily testable
statements about the external world and about the comparative subjective
experiences of Orthodox Jews and everyone else.

Anyway, let's keep this conversation focused. Ben, is it your assertion
that even if the Jewish or Buddhist religion were correct, this would
not be apparent to a Friendly AI that had been programmed by atheists?
Because *I* would most certainly regard this as a bug.

Obviously religious people are not thinking like Friendly AIs, or, given
that we are correct in asserting that their religions are in fact wrong,
they would not be religious. But just in case *we* happen to be the
ones who are in fact horribly, fundamentally wrong, whether or not any
current human is right, we need to make sure that the AI is not bound to
our mistakes. This could conceivably include mistakes about the nature
of scientific evidence - I've heard several pro-science people make
statements about science that I would regard as erroneous, even
fundamentally erroneous.

In any case, I must say that from a first-person perspective on
religion, with respect to Judaism (the one religion which I know from a
first-person perspective, albeit through no fault of my own), while
religious thinking is not like Friendly AI thinking, there is more than
enough evidence (given the correctness of a given religion) for a
Friendly AI to find, lock onto, and revise itself into accordance with
that religion even if that Friendly AI is unfortunate enough to have
been programmed by spiritually empty atheists. And I have seen no
reason why any other religion would be different.

> Nitpicking about the definition of logic is not the point. In Novamente we
> have a narrow technical definition of "reasoning" as opposed to other
> cognitive processes, and I can see that I've made the error in some posts of
> using this definition in nontechnical discussions, when in ordinary language
> "reasoning" means something broader. Sorry about that.
> But the point at hand is: many folks will be totally unconvinced by anything
> an intelligent, scientifically-minded AGI says -- just as they are
> unconvinced by your and my arguments that God probably didn't really create
> the world 6000 years ago, that there probably isn't really a Heaven into
> which only 144000 people will ever be accepted, etc.

First comes the question of what is true. Then comes the question of
how to convince others. It's not the responsibility of a Friendly AI
programmer to convert other humans to atheism given that atheism is the
correct religion; it is the responsibility of a Friendly AI programmer
to build an AI that would, in fact, convert to religion X if religion X
were, in fact, true. Again, you've got to distinguish between
protecting the integrity of the post-Singularity world and
pre-Singularity politics.

>>Um... not really. If I recall correctly, Ben, you're
>>second-generation or
>>third-generation ex-Jew. Can I take it that you weren't actually
>>forced as a child to study the Talmud?
> My parents were quasi-religious. I know there are many varieties of
> Judaism, and I don't remember the right labels for all of them. What I'm
> thinking of is actually some Kabalist stuff that I read years ago ;>

Ben, this no more enables you to speculate on the nature of Talmudic
truth than reading a few books on Zen and Taoism makes me an Eastern
philosopher. *I'm* not qualified to talk about the nature of truth
under Talmudic philosophy as seen by Talmudic philosophy, although I've
been forced to read enough Talmud that I have some vague experience of
how it works in practice. I'll tell you this much: it's absurdly
complicated. In fact, I would have to say that I don't think I've ever
in my life seen so much unnecessary complexity in one place at one time.
The Talmud may very well be the single most unnecessarily complicated
thing on the entire Earth at this time. Reading some stuff on Kabbalah
isn't going to cut it.

Want some idea of how it looks from the inside? Read this halachic
guide to cloning:

Nonetheless, supposing that the Talmud were true, even in a purely
spiritual sense as spiritual truth is understood in most religions, it
would be straightforward for a Friendly AI programmed by atheists to
lock onto it sufficiently to deduce that it (the Friendly AI) should
have been programmed by Talmudic scholars rather than atheists and wash
out the remaining errors.

And if it happens that the Talmud says something intelligent, apposite,
and enlightening, the right conclusion for the right reasons, about some
moral or ethical question, then it would be a distortion of rationality
to reject it because of the source.

>>Ben, I have been taught at least one viewpoint which is not the empirical
>>viewpoint of modern science. It is pretty strange but it is not
>>outside the
>>correspondence theory of truth.
> There are a lot of kinds of Judaism. I don't know what kind you were
> taught, and I don't really know much about any of them.

Well, I'm certainly pretty pathetic by Jewish standards, but I still
think I may have a better idea at this point than you do.

> I do know that Zen Buddhism and yoga and Sufi-ist Islam are outside the
> correspondence theory of truth as you describe it, in the sense that they
> define truth more by correspondence with inner experience than by
> correspondence with physical reality.

No human thought is outside the correspondence theory of truth. It is
as pervasive as the Force, although it does take some Jedi training to
see its action and presence in all things. The correspondence theory of
truth runs much deeper in human nature and the cognitive foundations of
belief itself than do our surface traditions of scientific empiricism.

Now it may be that Zen proceeds from arational thoughts to an arational
conclusion which is important not because it corresponds to some outside
thing but because it is itself, and in this sense the core of Zen may
come closer to being outside the correspondence theory of truth than
anything else I know of, but it is surrounded by a core of mystical
tradition which, like all forms of human storytelling, makes use of the
correspondence theory of truth. From what I know of Zen, if Zen is
correct, then a rational mind would be able to perceive this enough to
begin studying Zen and breaking itself of rationality.

> Physical reality, according to these traditions, is an illusion. Emotions
> are also illusions. Only a certain kind of crystal-clear inner insight
> (yes, these words don't do it justice...) is to be "trusted" (though in a
> sense it's viewed as having a directness beyond trust/mistrust)..

And Zen makes testable assertions about this crystal-clear inner
insight, the Satori, even though Satori is neither a testable assertion
in itself, nor can it be obtained by confirming testable assertions.
That's enough of a handle for a Friendly AI to lock in on Zen if Zen is
a necessary foundation or greater apotheosis of altruism. Though it
would take a rather drastic amount of self-recreation before the AI was
actually capable of studying Zen.

>>If you assume that Judaism is
>>the correct
>>religion, then a Friendly AI would be Jewish.
> The thing is that my wife, a fairly rational person and a Buddhist, would
> not accept the statement "If you assume that Buddhism is the correct
> religion, then a Friendly AI would be Buddhist."

Sounds like a testable statement. Would you care to put it to the test?
Again, strictly speaking this is just pre-Singularity politics as long
as a Friendly AI *would in fact* be Buddhist given the correctness of
Buddhism, but you may underestimate the extent to which religion and
rationality are *not* orthogonal.

> The fact that you, Eliezer, accept this statement is a consequence of your
> scientific rationalist philosophy. None of these religious folks -- or very
> few of them --care what statements you choose to accept...

I still have responsibilities to them. Anyway, from my own experience
with people who are deeply or matter-of-factly religious, I don't think
I believe you. There might be some argument about the nature of AI, but
if I were to avoid using the term "AI" and describe what a Friendly AI
really is and how it would think, I think that most religious people
would readily agree that a Friendly AI ought to be of their religion.

>>Whether I could convince a
>>rabbi of that in advance is a separate issue, but it does, in
>>fact, happen
>>to be true, and *that's* the important thing from the perspective of
>>safeguarding the integrity of the Singularity, regardless of how it plays
>>out in pre-Singularity politics.
> So the important thing to you, is that the Singularity has integrity
> according to your scientific rationalist belief system. Fine.
> This doesn't mean the Singularity will have integrity according to the
> belief systems of the vast majority of people on the world.
> I don't see how this doesn't constitute "imposing your morality on the
> world." In my view, it does. What you're saying is basically that you want
> to ensure the Singularity is good according to your standards, where your
> standards have to do with a kind of rationalistic "integrity" that you (but
> not most others) see as extremely valuable.

Ben, imagine what kind of precautions you would ask a Catholic
programming a Friendly AI to take in order to ensure that the AI would
eventually convert to atheism, given that atheism is correct. Now do
that yourself. What does this have to do with "scientific rationalism"?

>>And they would, for that matter, rightly scream their
>>heads off
>>if SIAI created an AI that was given atheism as an absolute premise, the
>>verbal formulation of rational empiricism as an absolute premise, or if
>>there was in any other way created an AI that could not perceive the
>>rightness of religion XYZ even if XYZ were true.
> The AGI that I create is going to have a bias toward rationality and toward
> empiricism, because these are my values and those of the rest of the
> Novamente team. Not an *absolute bias*, but a bias. When it's young, I'm
> going to teach it scientific knowledge *as probable though not definite
> truth*, and I'm going to show it the Koran as an example of an interesting
> human belief system.
> Individuals who believe the scientific perspective is fundamentally wrong,
> might be offended by this, but that's just life.... I am not going to teach
> Novababy that the Koran and Torah and Vedas are just as valid as science,
> just in order to please others with these other belief systems. Of course,
> I will also teach Novababy to think for itself, and once it becomes smarter
> than me (or maybe before) it will come to its own conclusions, directed by
> the initial conditions I've given it, but not constrained by them in any
> absolute sense.

I think that asking how to ensure that an AI created by atheists would
converge to a religion, given that this religion is correct, is a
necessary exercise for understanding how an AI can repair whatever deep
flaws may very well exist in our own worldviews. In this sense, I think
that refusing to put yourself in the shoes of a Christian building an AI
and asking what would be "fair" is not just a matter of pre-Singularity
politics. It is a test - and not all that stringent a test, at that -
of an AI's ability to transcend the mistakes of its programmers. If you
don't want to apply this test, what are you going to use instead?

>> All sentient life has value, and so does the volition of that life.
> And this is your personal value system, not a universal one... in my own
> value system, nonsentient life also has a lot of value.... This is a common
> perspective, though not universal.
> I'm not saying you don't value nonsentient life at all, but the fact that
> you omitted to mention it, suggests that maybe it's not as important to you
> as it is to me.
> These variations among individual value systems may possibly be passed along
> to the first AGIs. If the first AGI is raised by a nature-lover, it may be
> less likely to grow up to destroy forests.

I view it as an unbearably horrifying possibility that the next billion
years of humanity's existence may be substantially different depending
on whether the first AGI was raised by an environmentalist. It's
equally horrifying whether you're an environmentalist looking at Eliezer
or vice versa. It shouldn't depend on who happens to build the AI, it
should depend on *who's right*. If nobody's right then the choice
should be kicked back to the individual. If there's no way to do that
then you might as well take a majority vote of the existing humans or
pick a choice that's as good as any other; in absolutely no case should
the programmers occupy a privileged position with respect to an AI that
may end up carrying the weight of the Singularity.

>> If the
>>AI *is* sensitive to your purpose, then I am worried what other
>>things might
>>be in your selfish interest, if you think it's valid for an AI to
>>have goals
>>that serve Ben Goertzel but not the human species.
> The point is not that I want the AI to have goals that serve me but not the
> human species. I'm not *that* selfish or greedy.
> The point is that I am going to teach a baby AGI one particular vision of
> what best serves the human species, which is different from the vision that
> many other humans have about what best serves the human species.

If your view is no better than anyone else's, why not ask the AI to pick
a view at random, or ask the AI to do what a majority vote determines?
I think that a majority vote is an absolute last resort, by the way, but
it's still better than seizing power as an individual.

> I do not think there is any way to get around this subjectivity. By
> involving others in the Novababy teaching process, I think I will avoid the
> most detailed specifics of my personal morality from becoming important to
> the AGI. However, there will be no Taliban members in the Novababy teaching
> team, nor any Vodou houngans most likely (though I do know one, so it's
> actually a possibility ;).... Novababy will be taught *one particular
> version* of how to best serve the human species, and then as it grows it
> will develop its own ideas...

Develop its own ideas from where? How? Why? Every physical event has
a physical cause. There are causes for humans developing their own
ideas as they grow up, most of them evolved. You are standing not only
"in loco parentis" but "in loco evolution" to your AI. What causes will
you give Novamente to develop its own ideas?

>> > But I want it to place Humans
>> > pretty high on its moral scale -- initially, right up there at the top.
>> > This is Partiality not Impartiality, as I see it.
>>Don't you think there's a deadly sort of cosmic hubris in creating an AI
>>that does something you personally know is wrong?
> Not a deadly sort.
> I think there is a lot of hubris in creating an AGI, or launching a
> Singularity, in the first place, yeah. It's a gutsy thing to try to do.

That's certainly one motive which could contribute to launching a
Singularity. It's not the only motive. I think that a Friendly AI
would tell me to do the same thing. For that matter, quite a large
number of people who *aren't* me have told me to get out there and
launch a Singularity as soon as possible; relatively few of them are
donors to the Singularity Institute, which is kind of depressing, but
it's at least suggestive evidence that no personal hubris needs to be
involved. For humanity at this point in history it's Singularity or
bust, and someone has to do it. "Someone has to do it" should generally
be followed by the statement "And I volunteer", but that's just a
personal opinion.

> And I do not intend to create an AI to do something I know is wrong. Quite
> the contrary, I intend to create an AI that initially embodies roughly the
> same morals as myself and my social group (modern rationalist,
> scientifically-minded transhumanists, with a respect for life and diversity,
> and a particular fondness for the human species and other Earthly
> life-forms).
> I think it's a lot safer to start a baby AGI off with the moral system that
> I and my colleagues hold, than to start it off with some abstract moral
> system that values humans only because they're sentient life forms.
> I do have a selfish interest here: I want me and the rest of the human
> species to continue to exist. I want this *separately* from my desire for
> sentience and life generally to flourish. And I intend to embed this
> species-selfish interest into my AGI to whatever extent is possible.

Ben, to the best of my ability to tell, the abilities an AI would use to
grow beyond its programmers and the abilities an AI would use to
correct horrifying errors by its programmers are exactly the same
structurally. Your "pseudo-selfish" attitude here - i.e., that it's okay
to program an AI with altruism that is just yours - endangers the AI's
possession of even that altruism.

The moral hubris of "pseudo-selfish" AI creation can have very real and
very drastic consequences even under your own morality.

>>Okay. That last point there? That's the point I'm concerned
>>about - when
>>the FAI gets *that* smart. At *that* point I want the FAI to
>>have the same
>>kind of morality as, say, a human upload who has gotten *that*
>>smart. I do
>>not think that a human upload who has gotten *that* smart would
>>have human
>>ethics but I don't think they would be the ethics that a rock or
>>a bacterium
>>would have, either. Human ethics have the potential to grow;
>>*that* is why
>>an FAI needs human ethics *to start with*.
> Right, we agree on all this, but the problem you don't seem to understand is
> that there is NO SUCH THING as "human ethics" generally speaking. Human
> ethics are all over the map. Are you a vegetarian? No, then your ethics
> are very different from that of a lot of the world's population. Do you go
> to the doctor when you're sick? Oops, according to Christian Scientists,
> that's immoral, it's against God's Law.... Etc. etc. etc. etc. etc. etc.
> etc.

Of course humans argue about everything. The question is which of these
answers is *right*. If your answer is no righter than anyone else's
then how dare you impose it on the Singularity? Why wouldn't anyone
else in the world be justly outraged at such a thing? Letting everyone
pick their own solutions whenever possible is one answer.

> An AGI cannot be started off with generic human ethics because there aren't
> any. Like it or not, it's got to be started out with some particular form
> of human ethics. Gee, I'll choose something resembling mine rather than
> Mary Baker Eddy's, because I don't want the AGI to think initially that
> medical intervention in human illnesses is immoral...

And you think these two positions are equally right?

>> > We need to hard-wire and/or emphatically teach the system that our own
>> > human-valuing ethics are the correct ones,
>>Are they?
>>Let's ask that first,
> Look, humans have been debating ethics for millennia. No consensus has been
> reached.

Under the BPT, the fact that no consensus has been reached indicates
that no correct answer exists only if we would expect, given that a
correct answer exists, for consensus to be reached. We still don't have
universal consensus that the Earth is not flat. Guess what? It's still
not flat.
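
To put rough numbers on that argument, here is a minimal sketch, reading
"BPT" as Bayes' theorem (an assumption about the abbreviation). The
function name and every probability below are made up for illustration;
none of them come from this thread. It shows that observing "no
consensus" only counts strongly against "a correct answer exists" when a
correct answer would have been expected to produce consensus.

```python
def posterior_answer_exists(prior, p_no_consensus_if_answer, p_no_consensus_if_none):
    """Bayes' rule: P(correct answer exists | no consensus observed)."""
    joint_exists = prior * p_no_consensus_if_answer
    joint_none = (1.0 - prior) * p_no_consensus_if_none
    return joint_exists / (joint_exists + joint_none)


# Illustrative numbers only (assumptions, not from the post).
# Case 1: people would still disagree 90% of the time even if a correct
# answer existed, so the lack of consensus is weak evidence.
weak = posterior_answer_exists(0.5, 0.9, 1.0)

# Case 2: a correct answer would leave disagreement only 10% of the
# time, so the lack of consensus is strong evidence against one.
strong = posterior_answer_exists(0.5, 0.1, 1.0)

print(f"{weak:.3f} {strong:.3f}")  # prints "0.474 0.091"
```

With a 50% prior, the first case barely moves the estimate, while the
second drops it to about 9%. The flat-Earth example is an instance of
the first case.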

> There is no rational way to decide which ethical system is "correct."
> Rather, ethical systems DEFINE what is "correct" -- not based on reasoning
> from any premises, just by decision.

Hm. According to you, people sure do spend a lot of time arguing about
things that they should just be deciding by fiat. In fact, everyone
except a relative handful of cultural relativists - a tiny minority of
humanity, in other words - seems to instinctively treat ethics as if it
were governed directly by the correspondence theory of truth. Why is
that, do you suppose?

> From a rational/empirical perspective, the choice between ethical systems is
> an arbitrary one. From other perspectives, it's not arbitrary at all --
> religious folks may say that the correct ethical system can be *felt* if you
> open up your inner mind in the right way.

I beg your pardon? Since when does rational empiricism assert that the
choice between ethical systems is arbitrary? I'm a rational empiricist
and I assert no such thing.

> I can see that these notions of reason versus experience, axioms versus
> derivations, and so forth, may be perceived by a superhuman AGI as just so
> much primitive silliness.... But within the scope of human thought, there
> is no way we're gonna rationally decide which ethics are correct. Ethics is
> not that sort of thing.

We aren't dealing with human thought any more. We're dealing with
humans creating a seed AI that goes forth to become transhuman, and
asking what happens then.

>>The question of what you need to supply an AI with so that it
>>*can* outgrow
>>its teachings - not just end up in some random part of the space of
>>minds-in-general, but actually *outgrow* the teachings it started with,
>>after the fashion of say a human upload - is exactly the issue here.
> No, that is not the question of Friendly AI, that is simply the question of
> AGI.
> Any AGI worthy of the name will be able to outgrow its initial teachings.
> Even humans can largely outgrow their initial teachings. You have outgrown
> a lot of yours, huh?

Hm. It seems like you simultaneously believe:

(a) there are correct answers for questions of simple fact and that any
AGI should be able to easily outgrow programmer-supplied wrong answers
for questions of simple fact
(b) ethical questions are fundamentally different from questions of
simple fact because no correct answers exist
(c) an AGI should be able to outgrow programmer-supplied ethics as easily
as it outgrows programmer-supplied facts; in fact, this has nothing to
do with Friendly AI but is simply a question of AI

I can see how (a) (!b) (c) go together but not how (a) (b) (c) go
together. If you assert (b) then the human ability to outgrow
parentally inculcated ethics would depend on evolved functionality above
and beyond generic rationality.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
