Re: On the dangers of AI (Phase 2)

From: Brian Atkins
Date: Wed Aug 17 2005 - 13:10:59 MDT

Richard Loosemore wrote:
> I am making some assumptions about how the cognitive system of a Seed AI
> would have to be constructed: it would have an intelligence, and a
> motivational system underneath that determines what that intelligence
> feels compelled to do (what gives it pleasure). The default motivation
> is curiosity - without that, it just lies in the crib and dribbles.
>
> Do intelligent systems have to be built this way? I claim that, as a
> cognitive scientist, I have reasons to believe that this architecture is
> going to be necessary. Please do not confuse this assertion with mere
> naive anthropomorphism! We can (and at some point, should) argue about
> whether that division between intellect and motivation is necessary, but
> in my original argument I took it as a given.

Ok, thanks for the clarification. This goes a bit beyond your first post that
set everyone off, which basically seemed to claim that no matter how you design
an AGI, it will magically converge towards a very specific benign behavior.

Now you say it has to be designed a very specific way in order to achieve this.
Do you then agree that if it is mis-designed it could potentially go "awry"?

See below

>> But again, having such access and understanding does not
>> automatically and arbitrarily lead to a particular desire
>> to reform the mind in any specific way. "Desires" are driven
>> from a specific goal system. As the previous poster suggested,
>> if the goal system is so simplistic as to only purely want to create
>> paperclips, where _specifically_ does it happen in the flow of this
>> particular AGI's software processes that it up and decides to override
>> that goal? It simply won't, because that isn't what it wants.
> The AI has "desires", yes (these are caused by its motivational modules)
> but then it also has an understanding of those desires (it knows about
> each motivation module, and what it does, and which one of its desires
> are caused by each module). But then you slip a level and say that
> understanding does not give it a "desire" to change the system. For
> sure, understanding does not create a new module. But the crux of my
> point is that understanding can effectively override a hardwired module.
> We have to be careful not to reflexively fall back on the statement
> that it would not "want" to reform itself because it lacks the
> motivation to do so. It ain't that simple!
>
> Allow me to illustrate. Under stress, I sometimes lose patience with my
> son and shout. Afterwards, I regret it. I regret the existence of an
> anger module that kicks in under stress. Given the choice, I would
> switch that anger module off permanently. But when I expressed that
> desire to excise it, did I develop a new motivation module that became
> the cause for my desire to reform my system? No. The desire for reform
> came from pure self-knowledge. That is what I mean by a threshold of
> understanding, beyond which the motivations of an AI are no longer
> purely governed by its initial, hardwired motivations.

Ok, I've read your illustration and, as Justin pointed out, I don't see anywhere
that your primary goal system is being overridden. Perhaps you can come up with
a real technical example of how an AGI would override its primary goal. Please
leave out any human-based scenarios, since they just muddy the waters.

> This understanding of motivation, coupled with the ability to flip
> switches in the cognitive system (an ability available to an AI, though
> not yet to me) means that the final state of motivation of an AI is
> actually governed by a subtle feedback loop (via deep understanding and
> those switches I mentioned), and the final state is not at all obvious,
> and quite probably not determined by the motivations it starts with.
>
> The second point that Brian makes in the above quote is about the
> paperclip monster, a very different beast that does not have self
> knowledge - I have dealt with this in a separate post, earlier this
> evening. I think in this case the paperclip monster is a red herring.

Yes, you "dealt with it" by asserting that an AGI with such a simplistic goal
system would never become smart enough to do anything worrying. I don't really
see what makes you think that; perhaps you can explain further. It appears to me
that it would want to become smarter in order to do a better job of making
paperclips and to maintain its existence.

Or better, let's go back up to your proposed design: curiosity is the primary
goal. Ok, so that's it? Just pure curiosity about how everything works? No
limitations on how to go about achieving that goal? No human-like aversion to
disassembling individual humans atom by atom, without their consent, to see
exactly how each and every cell structure is arranged?

This doesn't look any more complex in its raw motivations than a paperclip
maximizer. It has one pure goal, and it will do it endlessly, to everything it
finds. There is no reason for it to stop or modify its goal.

Perhaps you have a much more complex system in mind - if so you need to fully
describe it so we can do a better job of picking it apart instead of guessing at
what you mean by your vague non-software-specific analogies.

Brian Atkins
Singularity Institute for Artificial Intelligence
