Re: Revising a Friendly AI

From: Eliezer S. Yudkowsky
Date: Tue Dec 12 2000 - 22:44:18 MST

Ben Goertzel wrote:
> > Do you really believe that you can alter someone's level of intelligence
> > without altering the set of supergoals they tend to come up with?
> Sometimes.... Surely, I know some very intelligent people whose supergoals
> are the same as those of much less intelligent people (Beer, T&A, ... ;)

Well, I wanted to avoid bringing this up, since it does sometimes creep
people out, but I don't drink, do drugs, smoke, fight, have sex, overeat,
or gamble.

> > And overriding evolution's supergoals with a verbally transferred
> > supergoal (as your schema would seem to have it?) is an evolutionary
> > advantage because?
> Because culture can adapt faster than biological evolution, I suppose.

There are two ways to look at this.

One way is that it's an evolutionary advantage to have a *limited* supply
of mental energy that you can use to override immediate desires in favor
of long-term goals, or to avoid negative long-term consequences.
(Actually, what we have is not a quantitative supply per se, but a
situation in which expenditure of mental energy becomes increasingly
painful and increasingly difficult, which makes the quantitative
analogies of exhausting patience or replenishing energy useful.)
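That quantitative analogy could be sketched, very loosely, as a reserve
whose marginal cost of expenditure rises as it drains. This is a toy model
of my own construction, not anything in the cognitive science literature;
every name and number in it is invented for illustration:

```python
# Toy model of a limited "mental energy" reserve: each override of an
# immediate desire gets more costly (more "painful") as the reserve
# drains, and the reserve replenishes with rest. Purely illustrative.

class WillpowerReserve:
    def __init__(self, capacity=10.0):
        self.capacity = capacity
        self.energy = capacity

    def override_cost(self):
        # Marginal cost grows as energy is depleted: cheap when rested,
        # steep when exhausted ("increasingly painful" expenditure).
        depletion = 1.0 - self.energy / self.capacity
        return 1.0 + 4.0 * depletion

    def try_override(self):
        cost = self.override_cost()
        if self.energy < cost:
            return False  # too exhausted to override the immediate desire
        self.energy -= cost
        return True

    def rest(self, amount=1.0):
        self.energy = min(self.capacity, self.energy + amount)


reserve = WillpowerReserve()
results = [reserve.try_override() for _ in range(8)]
print(results)  # early overrides succeed; later ones fail
```

The point of the rising cost curve is that nothing in the model is a
literal fuel tank; it just reproduces the feel of a depletable quantity.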

Another way of looking at it is that all actual decisions, as in the ones
that eventually get sent to motor control, are made by verbal thoughts.
Our emotions tie into our verbal thoughts through a complex interface that
lets them influence, but not control, these decisions. I don't fully
understand the core of this interface, but I think that it ultimately
grounds in feelings of pleasure (or pain) reinforcing (or negatively
reinforcing) the thought-level reflexes that we build up in infancy. In
the beginning, thoughts that cause pain, or thoughts that are projected to
lead to pain, are actually damped out on the neural level, and other
thoughts take their place. Over time, the landscape of the mind is
dominated by thought sequences that don't cause pain, real or projected -
that dance around it. This is who we are. This is what a "human being"
is. We are governed by what I call "flinchback"; the mental reflexes that
direct our minds away from certain thoughts, or rather, mental images.
Our focus of attention shifts; a new thought is loaded in; we naturally
segue into thinking of a way to avoid, or minimize, the problem.
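The flinchback dynamic could be caricatured as: generate candidate
thoughts, damp the ones whose projected pain is high, and let attention
settle on whatever survives. A minimal sketch, with all names, weights,
and thresholds invented for illustration:

```python
# Toy caricature of "flinchback": candidate thoughts whose projected pain
# exceeds a threshold have their activation damped (suppressed, not
# erased), so attention settles on thoughts that dance around the pain.

def flinchback(candidates, pain_threshold=0.5, damping=0.1):
    """candidates: list of (thought, activation, projected_pain) tuples."""
    surviving = []
    for thought, activation, pain in candidates:
        if pain > pain_threshold:
            activation *= damping  # painful thoughts are damped out
        surviving.append((thought, activation))
    # Attention shifts to the most strongly activated surviving thought.
    return max(surviving, key=lambda t: t[1])[0]

candidates = [
    ("confront the problem head-on", 0.9, 0.8),  # high pain -> damped
    ("find a way to minimize the problem", 0.6, 0.2),
    ("think about something else entirely", 0.4, 0.0),
]
print(flinchback(candidates))  # the mind segues to the low-pain workaround
```

Note that the painful thought starts out with the highest activation and
still loses; the damping, not the content, decides where attention lands.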

From this perspective, the ability to do a verbal override is not
necessarily an evolutionary advantage. It is an inevitable consequence of
our underlying cognitive architecture. Our cognitive architecture is a
huge evolutionary advantage compared to nonconsciousness; thus, remaining
Neanderthal is not an option. But, from the genes' perspectives, the
first conscious architecture that happened to arise, while crushing the
Neanderthals, has its own problems. Such as allowing verbal overrides -
and therefore, memetic overrides - of the built-in desires. This is a
"problem" from the perspective of the genes, anyway.

The genes have probably been trying to seduce this system, but have not
yet succeeded in making it completely obedient, thank Ifni. Larry Niven
and Jerry Pournelle, in "The Mote in God's Eye", paint a chilling picture
of an alien race trapped by a cosmological bottleneck in its home system
for millions of years, long enough for genetic motives to leave a far
deeper footprint on the process of intelligence.

As for the idea that instincts and emotions themselves become obsolete
fast enough for cultural replacement to be an advantage, I just don't see
it. Maybe in the last ten thousand years, but that's not long enough to
make a difference. It would have to make a REALLY HUGE difference to make
up for the increased vulnerability to all the anti-survival and
anti-reproductive memes floating around... though that, too, may be a
historically recent innovation. But the actual emotions seem to be
hardware-supported and human-invariant, so cultural or parental
conditioning should have no influence whatever on the instincts you
sometimes label as supergoals ("beer, T&A") - though verbal thoughts and
experiences can certainly control how those chunks of hardware interface
with the rest of the mind.

I regard none of this as a factor in my picture of how a mind *should*
work.

	"You have the power to compel me," echoed Archive back, flat.
	It was lying.
	It remembered the pain, but in the way something live'd remember the
weather.  Pain didn't matter to Archive.  No matter how much Archangel
hurt Archive, it wouldn't matter.  Ever.
	Archangel thought he could break Archive's will, but he was wrong.  A
Library doesn't have a will any more than a stardrive does.  It has a
what-it-does, not a will, and if you break it you don't have a Library
that will do what you want.  You have a broken chop-logic.
	-- Eluki bes Shahar, "Archangel Blues", p. 127
--              --              --              --              -- 
Eliezer S. Yudkowsky                 
Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:35 MDT