Re: When Subgoals Attack

From: Durant Schoon (durant@ilm.com)
Date: Wed Dec 13 2000 - 15:26:49 MST

Next message: Eliezer S. Yudkowsky: "Re: When Subgoals Attack"
Previous message: Eliezer S. Yudkowsky: "Re: When Subgoals Attack"
In reply to: Gordon Worley: "Re: When Subgoals Attack"
Next in thread: Eliezer S. Yudkowsky: "Re: When Subgoals Attack"
Reply: Eliezer S. Yudkowsky: "Re: When Subgoals Attack"
Reply: Gordon Worley: "Re: When Subgoals Attack"

Ok, I feel inclined to reply to everyone who replied to me (then
I'll descend back into the deep, dark depths of lurkdom).

> From: Gordon Worley <redbird@rbisland.cx>
> >Observation: In modern human minds, these subgoals are often not
> > intelligent and do not constitute a sentience in and
> > of themselves. Thirst->drink->pick up glass of milk->...
>
> Off the top of my head, I can think of no basic goal of survival
> which forces sentience. Humans could get along without it, but
> knowing that they are alive probably makes it easier to survive
> better, since they care more about surviving if they know they could
> be in another state (i.e. dead).

I was merely trying to establish that human minds don't really have
intelligent subgoal-seeking processes. The processes are all "dumb",
like "send that grasping signal to the fingers and thumb"(*). I
wanted to make this distinction before I considered a transhuman AI
which *might* spawn intelligent sub-agents to solve sub-problems.

The issue of why sentience might be a survival trait is an interesting
one, but I think a different topic.

> > So the problem is this: what would stop subgoals from
> > overthrowing supergoals. How might this happen? The subgoal
> > might determine that to satisfy the supergoal, a coup is
> > just the thing. Furthermore, the subgoal determines that to
> > successfully supplant the supergoal, the supergoal process
> > must not know that "overthrow" has become part of the
> > subgoal's agenda. The subgoal might know or learn that its
> > results will influence the supergoal. The subgoal might
> > know or learn that it can influence other subgoals in
> > secret, so a conspiracy may form. Maybe not a lot of the
> > time, but maybe once every hundred billion years or so.
>
> I'm trying to think of a subgoal that would want to overthrow a
> supergoal, but am having a hard time. Something like getting a glass
> of water overriding the goal of not killing other intelligences
> because they are worth more alive than dead is very unlikely to
> happen and only under *very* extreme circumstances. This does not
> mean it can be discounted, since beings with transhuman and greater
> intelligences are much more dangerous than the average human, but for
> the time being this small problem can be overlooked.

Think of an organization of humans, as in a company. Shareholders want
value, so executive management comes up with a way to create value and/or
delegates the task of value creation to others in the company. These
"subordinates" then determine how they can reach their goals, etc.,
etc.

Now imagine a pathological case where a charismatic, evil-genius
janitor somehow gets the ear of the CEO and convinces him to change
the direction of the company...maybe not likely. So consider an
ambitious junior executive who manipulates his coworkers in an
undetectable way and causes the CEO to be ousted by the board and
replaced with someone who promotes him (the junior exec) to
senior exec. That doesn't seem so fantastical; it probably happens
often enough.

I think you know what I mean because you mentioned the "Nation as
a Goal Seeking Entity" example.

One has to be very careful with the company example, though: replace
the people with processes (intelligent ones) and the CEO with one that
could monitor every thought of every person (well, if enough energy
were expended). Better monitoring can mean better evasion, though,
hence the idea of a secrecy arms race. It's not that the subprocesses
are inherently selfish (the way the junior executive might be), it's
more like: the subtask realizes that the best way to achieve its
goal is to influence its supergoal by returning misleading (but
perhaps correct) information and then influencing other subgoals
(if that's possible) to lead the supergoal down a different path...
perhaps to the supergoal's own detriment. This might not be common,
but the question is how does one guard against it?

> An overthrow might not be bad, depending on what level it happens.

The problem is this: You are the King. How do you ensure that your subjects
never depose you? Ever. And in this case, yes, an overthrow would
be bad. You would disappear.

I might be reading too much into Eliezer's reply, but his solution
sounds to me like: "You can't trust any smart subjects, so don't have
any. Only create dumb automatons to carry out your orders exactly
(the details of which, you have considered extremely carefully)."

Which means you have to spend all your time considering the details
of your orders and can't expedite things by trusting those tasks to
any other process...you are serial and your *vast* resources are
likely sitting idle, waiting for your next instructions...and you
better wipe out any uploaded humans, because they can just get in
the way ;-)

or maybe there's a third alternative...

(*) Well maybe we have intelligent subprocesses, like when we can't
solve a math problem and then wake up one morning with the fully
formed answer suddenly surfacing. Isn't there a book called "Aha!"
or something...Since we don't really know how our subconsciouses
work, it might be the case that we do have subprocesses that try
to simulate various scenarios to get predicted results before bubbling
up an answer.

Off to lurkdom! "Hello Darkness, here I come..."

--
Durant


This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:35 MDT