Re: Friendliness and blank-slate goal bootstrap

From: Metaqualia
Date: Fri Oct 03 2003 - 23:59:01 MDT

> Not necessarily. It is still possible for a self-bootstraping goal
> system to become Friendly.

I am not excluding this possibility, but not being sure whether the AI will
become Friendly is equivalent to having failed to create a framework for
Friendly AI.

> If you consider the history of life on
> Earth as such a self-bootstrapping system and each of us sentients
> as a leaf node in this forward branching process and that one
> individual or group of us can produce Friendly AI, then that
> possibility is still open.

The fact that humans have evolved the concept of friendliness has, in my
view, nothing to do with whether a seed AI will ultimately be able to
self-bootstrap itself into friendliness. We are imperfectly deceptive
systems: social morality arises because if we kick someone's butt they will
kick ours, and sooner or later this gets encoded genetically.

Aside from the genetically encoded "feeling bad about doing harm", the
reason _I_ decide to be friendly (not kill, slay, or massacre people) is not
because I have this evolved instinct to sometimes be nice (I choose to
ignore much stronger instincts at times) but because I take the leap of
faith of saying: they contain a subjective consciousness just like mine, so
they must experience pain as I do, so I won't crack their skull open. If I
had absolutely no experience of pain, and saw it only as a physical process
happening in someone's brain, I would probably think differently. Just as
I do not mind killing insects that annoy me or cutting tree branches when
they are in the way, because I do not know about _their_ subjectivity (well,
I do have moments of hesitation, but only because I am nice :)

> When thinking about such possibilities, it is also useful to
> consider the vast number of afriendly systems, ie. those that
> are neither friendly nor unfriendly. A blank-slate self-
> bootstrapping goal system, might tend to one of those as well.
> Or if you don't think so, maybe you can offer some reasons why.

I don't think a system can be neutral; it is either friendly or unfriendly.
The AI can mind its own business until someone comes along and tries to mess
with it. At that point, if it is friendly it will go out of its way not to
cripple/erase/disintegrate the person; otherwise it will take the shortest
path to eliminating the obstacle.

> > It would be really nice if we found that the most basic
> > morality system (do
> > not make beings feel pain, make them feel pleasure when you
> > can) can also be
> > bootstrapped.
> Basic morality would be nice. Knowing how to boostrap a satisfactory
> transhuman morality would be nicer :)

I was not referring to basic morality as human morality, but as the most
basic set of moral rules that can be seen as potentially universal. "Do not
create subjective pain" is basically what I mean by "basic morality".
Actually, I shouldn't have used the word "basic"; let's call it "physical
morality", "universal morality", or "objective morality".

> I don't agree. I do not personally know the pain of being thrown
> in a wheat thresher. But I can tell you with certainty that I want
> to avoid it.

That is because you have experienced disruption to your body tissues before:
it felt painful, so you want to avoid repeating the experience. If you had
never experienced pain, what would prevent you from looking at millions
of creatures burning in agony and interpreting their screams as purely
physical phenomena?

> Besides, the AIs I like to conceive of experience all the qualia we
> do, plus qualia we cannot yet imagine ;-)

hence my nickname. But it is not clear whether even a transhuman AI will
have qualia. Being intelligent is not equivalent to being sensitive: doing
extreme mental exercise does not make me taste strawberries or hear sounds.

> > Possibly, this could be incorporated into its baby cognitive
> > framework as:
> >
> > 4. since humans claim to have qualia and you don't, humans still have
> > information that you need, or so it would seem.
> > 5. therefore, _some_ humans must be around until you have
> > good indications
> > you are also experiencing qualia >> therefore, don't kill all
> > humans just
> > yet
> That seems like a thin shield of protection. Hopefully we can come up
> with something safer than that.

If you can come up with it, let me know. Remember that it must be something
arising from simple logic, not a simple "block" put somewhere to restrict
the AI from accomplishing an otherwise desirable goal.

> > [bonus. if you are already superintelligent and still are not
> > experiencing
> > qualia it would be a good idea to get some humans to
> > transcend and see if a
> > qualia-augmented superintelligence is better at finding out
> > what qualia are
> > than you qualia-less superintelligence are.]
> Ok, one femto second passes. Done. What happens next?

My augmented self will think about what's next.

> > PS: The termination of the AI would still be an ultimate
> > evil, and death of
> > all humans will be preferred in an extreme situation in which
> > it's either
> > one way or the other, since the AI is better off looking for
> > the meaning of
> > life without humans than dead
> Ah, but this is where you probably want to read CFAI to get a
> good sense of how a Friendly AI might want to arrange its
> goals. The Ultimate Evil, and any class of evil hopefully,
> would be avoided by setting Friendliness as the supergoal.

I read CFAI, but I disagree that it says you must set Friendliness as the
supergoal. One of the basic lessons I took home from CFAI is that you
can't just put in a supergoal of "be friendly" and expect the machine to stay
friendly as it improves itself; you must find a way to let it discover
friendliness on its own. Besides, "be friendly" is not a clear rule at all;
I can think of conflicting scenarios in which friendliness cannot be
achieved, period (disarming a human trying to kill another necessarily means
going against the will of the first person and possibly creating distress).
Better to let the AI come to a set of morals on its own and figure out the
specific cases from a superior moral standpoint.

> I try to avoid focusing on the issue of qualia with regard to Super
> Intelligence, other than for pure recreational thinking. Qualia are only
> interesting to me in the sense that they are part of my own personal
> goal system. Most likely, once I am consciously in charge of all that

Qualia have nothing to do with goal systems; the two can exist separately.

> Are qualia important for designing a Friendly AI? If so, then they are
> important. Otherwise, I'd rather think about something else.

This was my point: is experiencing pain necessary for elaborating a
self-bootstrapping moral framework in which pain is evil?


This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:42 MDT