AI Goals [WAS Re: The Singularity vs. the Wall]

From: Richard Loosemore (rpwl@lightlink.com)
Date: Tue Apr 25 2006 - 07:56:06 MDT


Philip Goetz wrote:
> On 4/24/06, Richard Loosemore <rpwl@lightlink.com> wrote:
>> I could pick up on several different examples, but let me grab the most
>> important one. You seem to be saying (correct me if I am wrong) that an
>> AGI will go around taking the same sort of risks with its technological
>> experiments that we take with ours, and that because its experiments
>> could hold existential risks for us, we should be very afraid. But
>> there is no reason to suppose that it would take such risks, and many,
>> many reasons why it would specifically not do the kind of risky stuff
>> that we do: if you look at all the societal and other pressures that
>> cause humans to engage in engineering ventures that might have serious
>> side effects, you find that all the drivers behind those pressures come
>> from internal psychological factors that would not be present in the
>> AGI.
>
> The problem is that the AI's goals might not be our goals. At all.
> Most people would say we haven't taken many outrageous risks in
> settling America, and yet the number of wolves has declined by a
> factor of as much as a thousand! How did that happen?
>
> (Aside: It is worth asking whether a free AI or an AI controlled by
> humans would be more dangerous.)
>
>

I think that the question of an AI's "goals" is the most important issue
lurking beneath many of the discussions that take place on this list.

The problem is that most people plunge into this question without stopping
to consider what it is they are actually talking about. I don't mean
this in a critical or offensive way; I mean that we have spent so little
time thinking about what it means for a thing to have goals or
motivations that we don't have a clear concept of what "goals" actually
are. We don't have a science of AGI motivation yet, only wild
guesswork.

And there are many traps that guessing will get us into:

One trap is to do a simple extrapolation from the way that humans behave
towards other species: we had our goals, the wolves had theirs, and
look what happened to the wolves, as you put it above. But why would
an AGI be motivated the way the human species is collectively motivated?

A second trap is to suppose that when we build an AGI, its goals and
motivations will be something we discover afterwards, when it
is too late. Almost every comment on this list that expresses concern
about what an AGI would do has this assumption lurking in the background.
There are good reasons to believe that we, the designers of the AGI, would
have complete control over what its motivations would be. Worried that
it might wipe us out without caring? Then don't design it without a
"caring" module! (I am oversimplifying for rhetorical effect, but you
know what I mean.)

A third trap is to suppose that something could be an AGI and also be in
some sense completely dumb, like being superintelligent but also being under
the control of an idiot human dictator. Sure, maybe this is possible:
but we shouldn't just assume it can happen and start ranting about how
bad it would be; we should talk about the technical details of exactly how it
might work (and FWIW, I don't think it could be made to work at all).

Here is another subtle issue: is there going to be one AGI, or are
there going to be thousands/millions/billions of them? The assumption
always seems to be "lots of them," but is this realistic? It might well
be only one AGI, with large numbers of drones that carry out dumb donkey
work for the central AGI. In that case, you suddenly get a
situation in which there are no collective effects of conflicting
motivations among the members of the AGI species. At the very least,
all the questions about goals and species dominance get changed by this
one-AGI scenario, and yet people make the default assumption that it
is not going to happen; I think it is very likely indeed.

Finally, there are many people (especially on this list) who assume that
an AGI will be an RPOP (a Really Powerful Optimization Process) whose
motivations are nothing more than a sophisticated goal stack: this raises
many questions about whether such a primitive goal system would actually
work. These questions are enormously technical, and some of them may not
be resolvable without a good deal more hard coding on big systems, but the
answers to them are utterly huge when it comes to deciding what kind of
creature we might be dealing with. If RPOPs are stable and easy to
build, we get (IMO) a very dangerous kind of AGI. If RPOPs are
difficult to get to work, or if they don't work at all, and if
human-style motivational systems are used instead, then (again, IMO) we
could have an extremely safe kind of AGI that would never, ever be
vulnerable to any of the issues that people tear their hair out about.
There is an enormous difference in outcomes between these two
possibilities: but how many people are discussing the distinctions I am
making here? Practically zero! [I don't have very good discussions with
myself, you see :-)]
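To make that contrast concrete, here is a rough sketch in Python (purely
illustrative: the class names, the drive weights, and the "satisfies"
interface are all invented for the sake of the argument, and are not taken
from any actual design) of the difference between a goal-stack system and
a diffuse, human-style motivational system:

# Purely illustrative caricature, not a design proposal.

class GoalStackAGI:
    """Goal-stack style: behaviour serves whatever goal sits on top of
    an explicit stack, with subgoals pushed and popped as needed."""

    def __init__(self, supergoal):
        self.stack = [supergoal]          # one explicit top-level goal

    def step(self):
        current = self.stack[-1]          # everything serves the top goal
        subgoal = self.plan(current)      # planning details don't matter here
        if subgoal is not None:
            self.stack.append(subgoal)    # pursue the subgoal single-mindedly
        else:
            self.stack.pop()              # goal done, resume the parent goal

    def plan(self, goal):
        raise NotImplementedError         # left abstract in this sketch


class MotivationalAGI:
    """Human-style: behaviour emerges from the interaction of many
    diffuse drives, none of which is absolute."""

    def __init__(self, drives):
        # e.g. {"curiosity": 0.4, "empathy": 0.9, "caution": 0.7}
        self.drives = drives

    def step(self, candidate_actions):
        # No single goal dominates: each candidate action is weighed
        # against *all* the drives at once.  (Assumes each action object
        # offers a satisfies(drive_name) score in this sketch.)
        def appeal(action):
            return sum(weight * action.satisfies(name)
                       for name, weight in self.drives.items())
        return max(candidate_actions, key=appeal)

The point of the sketch is only that the two architectures distribute
control completely differently: in the first, whatever sits at the top of
the stack rules absolutely; in the second, no single motive can run away
with the system. That difference is exactly what matters when we ask how
the thing might misbehave.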

Overall message: let's have some serious and detailed discussion to
explore the space of possibilities. Let's not make one default
assumption about what would motivate an AGI, and then run with it;
let's find out what the options are, then debate the ramifications.

I don't think we can easily do that on this list, because things get too
heated, alas.

Richard Loosemore
