Re: Revising a Friendly AI

From: Eliezer S. Yudkowsky
Date: Tue Dec 12 2000 - 23:57:21 MST

Ben Goertzel wrote:
> > Another way of looking at it is that all actual decisions, as in the ones
> > that eventually get sent to motor control, are made by verbal thoughts.
> This is just not true... but I don't think it's crucial to your point...

Okay, I'll expand. There are reflex decisions that are carried out
directly by the spinal cord, although it is possible for these reflexes to
be inhibited in advance by signals sent from higher systems.

Actual motor signals do not, of course, originate from the auditory cortex
- that's not what I meant by "verbal thoughts". (Though one current
theory holds that motor decisions originate in the entire Layer 4 of the
cerebral cortex, so almost any theory is neurotopologically plausible.)
When you think "I will get up and make coffee" - not just the verbal
sounds themselves, of course, but the sounds plus your belief in them,
which is technically an emotional binding (I think) - that verbal
thought creates several mental images. For example, coffee as an
immediate subgoal, the expectation that this subgoal will be fulfilled
shortly, the anticipation of the associated taste sensation and
satisfaction, and so on. However, the important image is not really a
declarative image as such, but a sort of plan-in-potentia - the default
set of instructions associated with getting up, making coffee, and so on.
Unless conscious attention is focused on this default plan imagery (which
was created by the verbal thought, please note), it will start to feed
into motor cortex, or rather trigger the very-high-level motor actions
like "get out of chair".

I could be wrong about the picture I just drew, but it's my current best
guess for what happens when you think "I will get up and make coffee".
(Not that I drink coffee, of course.)

There's quite possibly an opportunity for emotions to bind directly to the
plan-in-potentia, either before or after the plan is visualized, if the
plan has negative consequences (projected pain). The emotions can exert
strong influences toward: Removing the plan from being the "default",
removing the plan from the focus of attention, replacing the plan with a
substitute, or giving rise to any number of verbal thoughts about the
desirability of replacing the plan. Perhaps the plan may even be
suppressed directly via low-level negative feedback, which would involve,
in essence, emotions seizing the steering wheel.
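To make the picture above concrete, here is a toy sketch of the flow I'm
describing - a verbal thought installs a default plan, which feeds into
motor control unless an emotional binding suppresses or replaces it first.
All the names and the "projected pain" scalar are my own illustrative
inventions, not a claim about the actual neural machinery:

```python
# Toy model of the default-plan pipeline sketched above (illustrative only):
# a verbal thought creates a plan-in-potentia; an emotional filter may veto
# it before it triggers very-high-level motor actions.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Plan:
    name: str
    steps: List[str]                # very-high-level motor actions
    projected_pain: float = 0.0     # hypothetical scalar for negative consequences

def emotional_filter(plan: Plan, pain_threshold: float = 0.5) -> Optional[Plan]:
    """Return the plan unchanged, or None if emotions suppress it directly."""
    if plan.projected_pain > pain_threshold:
        return None                 # emotions seizing the steering wheel
    return plan

def execute(plan: Plan) -> List[str]:
    """Stand-in for feeding the default plan into motor cortex."""
    return [f"motor: {step}" for step in plan.steps]

coffee = Plan("make coffee", ["get out of chair", "walk to kitchen", "brew"])
approved = emotional_filter(coffee)
actions = execute(approved) if approved else []
```

The point of the sketch is only the control flow: unless attention or
emotion intervenes, the default plan created by the verbal thought runs.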

Under routine circumstances, however, the verbal thoughts are in immediate
control. (Note that I do not say the conscious mind is in control, since
emotions can also exert major influence over verbal thoughts.) With
respect to long-term goals, verbal thoughts are in effectively complete
control.

> > From this perspective, the ability to do a verbal override is not
> > necessarily an evolutionary advantage. It is an inevitable consequence of
> > our underlying cognitive architecture. Our cognitive architecture is a
> > huge evolutionary advantage compared to nonconsciousness; thus, remaining
> > Neanderthal is not an option. But, from the genes' perspectives, the
> > first conscious architecture that happened to arise, while crushing the
> > Neanderthals, has its own problems. Such as allowing verbal overrides -
> > and therefore, memetic overrides - of the built-in desires. This is a
> > "problem" from the perspective of the genes, anyway.
> I understand your view, I think ... but I think it's wrong.
> Culture makes us smarter and more adaptable, so genes that lead to culture
> should be selected, and they are.

"Culture" is a big word. Are we talking about replacing sex with
religious faith, or learning how to use a fork? My view is that the two
are inseparable consequences of cognitive architecture; yours seems to be
that the two are each, individually, evolutionary advantages.

> > I regard none of this as a factor in my picture of how a mind *should*
> > work.
> Well, our evolutionary heritage has many minuses, e.g. the dark aspects of
> sexuality (jealousy, etc.); aggression; etc.

Actually, I would regard the whole pleasure-pain-mental-energy
architecture as a "minus". But my perspective on this is not necessarily

> But I was noting one plus: it integrates relatively useful goal systems all
> through our minds in subtle & complex ways.

The return on integration is scarcely greater than the investment in
instinct... maybe less. Remember, evolution is not just trying to
integrate novel and archaic goal systems, it is trying to *control* a
complex system with a simple one, which is why my description of a human
is so tangled and invokes so many different steps.

Forget about whether imitating this is a *good* idea; I don't think we
*can*... not this side of the Singularity, anyway.

> If AI systems don't evolve, they'll have to get this some other way, that's
> all. It's far from impossible.

What AIs need is the correct decision, the Friendliness. Why do they need
the tangle to get it? What's wrong with supergoal and subgoal?
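What I mean by "supergoal and subgoal" is the simple alternative structure:
each subgoal's desirability derives from its expected contribution to the
supergoal, rather than from a tangle of independently evolved drives. A
minimal sketch, with made-up names and weights purely for illustration:

```python
# Illustrative supergoal/subgoal hierarchy (names and numbers are invented):
# desirability propagates down from the supergoal, so no subgoal has any
# independent motivational force of its own.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Goal:
    name: str
    children: List["Goal"] = field(default_factory=list)
    contribution: float = 1.0   # expected contribution to the parent goal

def desirability(goal: Goal, inherited: float = 1.0) -> Dict[str, float]:
    """Assign each goal the desirability it inherits from the supergoal."""
    value = inherited * goal.contribution
    out = {goal.name: value}
    for child in goal.children:
        out.update(desirability(child, value))
    return out

friendliness = Goal("Friendliness", [
    Goal("protect humans",
         [Goal("avoid harm", contribution=0.9)],
         contribution=0.8),
])
scores = desirability(friendliness)
```

The contrast with the evolved tangle is that here the subgoals are
transparent consequences of the supergoal, not rival forces.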

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:35 MDT