Re: Goertzel's _PtS_

From: Eliezer S. Yudkowsky
Date: Mon May 21 2001 - 19:13:35 MDT

Ben Goertzel wrote:
> > > What I mean by "hard-wiring Friendliness" is placing Friendliness at
> > > the top of the initial goal system and making the system express all
> > > other goals as subgoals of this. Is this not what you propose? I
> > > thought that's what you described to me in New York...
> >
> > Yes, that's what I described, but by that description *I'm* hard-wired
> > Friendly, since this is one of the properties I strive for in my own
> > declarative philosophical content.
> No, you're just **deluded** ;>

(For those of you who are not also subscribed to the Extropians list, this
is funny because there was a discussion about self-delusion in Bayesian
reasoners at the time.)

> You don't *really* have Friendliness as your ultimate supergoal ... you just
> have a false self-model in which Friendliness is your ultimate supergoal!

Okay, pardon me, I should have specified declarative goals. Almost all my
declarative goals are justified in terms of Friendliness maximization via
Singularity. (The declarative goals that aren't so justified are exotic
goals justified by the possibility that something other than Friendliness
is ethically preferable or that the Singularity is not the best way to

Now, since human cognition is messy, I should specify down even further
what I mean by this. When I take a deliberate action or make a deliberate
choice, I sometimes do so based on a deliberate value assessment, and I
sometimes do so based on a reflexive value assessment. The reflexes are
derived both from instinctive/intuitive/emotional values and from previous
(cached) experience in applying my deliberate values. Where I apply
deliberate values, I do so using a store of deliberate statements or
beliefs about goals. Where I consciously set out to justify a value or
goal belief, I do so by reference to my beliefs about moral maximization
via the Singularity.

For obvious reasons, however, this system also tends to emergently (or, in
my case, not so emergently) bubble up the belief that reflexive
assessments which differ from the justified assessments are "bad", and
that deliberate statements that are unjustified because unexamined are
also "bad". Major actions and life-choices in particular are made almost
entirely using the "top-level" "goal" of a Friendly Singularity.

In plainer English, this is only the standard self-conflict of a human
being rewritten as cognitive science. We all have at least one emotion or
instinct that we'd as soon be rid of and that we expend effort on
combating. In my case, I've won, or at least won to a much higher degree
than I'm led to believe is usual. My moral philosophy is now the dominant
factor in decision making to a degree that becomes even higher for larger
or more deliberate decisions, especially decisions about how to make
decisions or decisions about how to organize my mind.

A seed AI (a) has less junk cognition and (b) can rewrite vis own source
code. So I may consider myself hot stuff by modern-day Western cultural
standards, but I expect a mature seed AI to be a heck of a lot better.

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
