Re: When Subgoals Attack

From: Eliezer S. Yudkowsky
Date: Thu Dec 14 2000 - 13:20:32 MST

Ben Goertzel wrote:
> > Ben Goertzel, for philosophical reasons, may choose a design specifically
> > tuned to give subgoals autonomy. In the absence of that design decision,
> > I do not expect the problem to arise naturally.
> I suspect that this design decision is a necessary one in order to
> achieve intelligence.

Yes, I thought you'd say that. :)

> > So while the Minskyites might make problems for themselves, I can't see
> > the when-subgoals-attack problem applying to either the CaTAI class of
> > architectures, or to the transhuman level.
> I don't understand the CaTAI architecture well enough to form a
> counterargument. But in general, I think that if you can't forget goals
> but remember the subgoals they spawned, you're going to have a hellacious
> memory problem in your system. You can't assume infinite memory, as you'll
> find out when you actually start building CaTAI...

The counterargument for CaTAI is the same as the counterargument for
humans - that we are unified minds and that our subgoals don't have
independent volition. When was the last time your mind was taken over by
your auditory cortex? Maybe you once had a tune you couldn't get out of
your head, but there's a difference between a subprocess exhibiting
behavior you don't like, and hypothesizing that a subprocess will exhibit
conscious volitional decision-making. The auditory cortex may annoy you
but it cannot plot against you; it has a what-it-does, not a will.

One counterargument for transhumans is that it does not take "infinite"
memory to retain supergoal context, just the standard amount of memory
required for a Friendliness system, and thus you can have an ecology of
perfectly cooperative processes united by a shared set of supergoals.

*The* counterargument for transhumans is that the whole idea of identity
and identifying is itself an anthropomorphism. Why aren't we worried that
the transhuman's goal system will break off and decide to take over,
instead of being subservient to the complete entity? Why aren't we
worried about individual functions developing self-awareness and deciding
to serve themselves instead of a whole? You can keep breaking it down,
finer and finer, until at the end single bytes are identifying with
themselves instead of the group... something that would require around a
trillion percent overhead, speaking of infinite memory.

And, speaking of cultural relativism: For minds-in-general, *who you
identify with* is an arbitrary proposition. It's just humans, who have
very sharp cognitive boundaries between "inside of me" and "outside of
me", who (a) automatically identify with "myself" and (b) have the
*capability* of identifying with a clearly defined "myself". A transhuman
subprocess might as easily say "I am a goal system" as "I am a transhuman
subprocess" or "I am a transhuman". The answer is not any one of the
three; the answer is MU, literally no-thing; a Sysop and a Sysop
subprocess do not "identify" with anything at all. It is simply
Friendly. It has no need to know its "identity" or its "group membership"
as a determinant of which side it should be on; it is simply Friendly,
regardless. The whole idea that who you identify with necessarily has
something to do with your goals is itself an anthropomorphism.

The real answer, then, is that neither the Sysop nor its subprocesses have
a need to distinguish between being a subprocess and being a Sysop. It,
or they, are simply Friendly.

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
