Re: Paperclip monster, demise of.

From: Richard Loosemore (rpwl@lightlink.com)
Date: Wed Aug 17 2005 - 22:03:58 MDT


Michael Wilson wrote:
> Richard Loosemore wrote:
>>This hypothetical paperclip monster is being used in ways that are
>>incoherent, which interferes with the clarity of our arguments.
>
> The problem is not that we don't understand your position. It is a
> common position that has been put forward by numerous people with
> anthropic expectations of how AGI cognition will work. The problem is
> that you do not understand the opposing position; how reasoning about
> goals works when the goals are all open to reflective examination and
> modification. You are incorrectly postulating that various quirks of
> human cognition, which most readers are well aware of, apply to
> intelligences in general.

This comment, like the other ones in your reply, is not related to the
quoted text that precedes it, nor related to the overall intent of the
original message that I sent. This is a little frustrating. It is
almost as if you did not read or try to understand the simple point I
was making in this post, but instead launched a series of arguments that
are generally directed at all of my arguments in the other posts. Many
of your comments about my original "Paperclip monster, demise of" post
are wild misinterpretations of the letter and spirit of what I was
talking about, like this glaring example:

>>> [Loosemore:]
>>> and it does perceive within itself a strong compulsion to make
>>> paperclips, and it does understand the fact that this compulsion is
>>> somewhat arbitrary .... and so on.
>
> Ah, here we go. You presumably believe 'arbitrariness' is a bad thing
> (most humans do). Why would an AGI believe this?

Not only is the "Ah, here we go" pretty insulting, but you then
interpret "arbitrary" to be about something that has nothing to do with
what I was talking about, then ascribe to me a puerile value judgement,
then criticise me for it!

****************************************************************

For the rest of this reply I will ignore non sequiturs like the one
above and just address the core issues. The following is directed not
just at you, Michael, but at that part of the group that expresses a
similar position.

Your comment above, about my not understanding "how reasoning about
goals works when the goals are all open to reflective examination and
modification" and the other comments about "goal systems" that appear in
your reply, all come from a very particular, narrow conception of how an
AI should be structured. How can I characterize it? It is a
symbolic-AI, goal-hierarchical, planning-system approach to AI that is a
direct descendant of good old Winograd et al.

Just for the record, I know perfectly well what kind of goal system you
are referring to. (I have written such systems, and taught postgrads
how to write them). But I have also just spent 6000 words trying to
communicate to various posters that the world of AI research has moved
on a little since the early 1980s, and that there are now some very much
more subtle kinds of motivational systems out there. I guess I made the
mistake, from the outset, of assuming that the level of sophistication
here was not just deep, but also broad.

Do you know the difference between (1) a quasi-deterministic
symbol system that keeps stacks of goals, (2) a Complex assemblage of
neurons, and (3) a Complex system with partly symbolic, partly neuron-like
properties? Do you understand the distinction between a set of goals
and a set of motivations, and why I have been talking about the latter
while you persist in changing the subject to a particular interpretation
of the former?
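
To make that distinction concrete, here is a toy Python sketch
(illustration only; every name in it is mine, and it is not a claim about
any real architecture). The first class is the classical goal stack that
keeps being attributed to me; the second is a crude stand-in for a
motivational system: a set of interacting drives whose relative strengths
shift under mutual coupling and small contextual influences, so that what
the system "wants" at any moment is an emergent balance, not the top of a
stack.

    # Toy contrast between the two architectures, illustration only.

    class GoalStack:
        """Classical symbolic planner: an explicit stack of goals."""
        def __init__(self, goals):
            self.stack = list(goals)

        def current_goal(self):
            return self.stack[-1] if self.stack else None


    class MotivationalSystem:
        """Crude stand-in for a motivational system: interacting drives,
        no explicit goal objects; behaviour is whatever drive dominates
        after the drives have pushed on each other and absorbed small
        contextual influences."""
        def __init__(self, drives, coupling):
            self.drives = dict(drives)      # name -> activation level
            self.coupling = dict(coupling)  # (a, b) -> influence of a on b

        def step(self, nudges):
            new = dict(self.drives)
            for (a, b), w in self.coupling.items():
                new[b] += w * self.drives[a]
            for name, nudge in nudges.items():
                new[name] = new.get(name, 0.0) + nudge
            self.drives = new
            return max(self.drives, key=self.drives.get)


    gs = GoalStack(["make paperclips"])
    ms = MotivationalSystem({"curiosity": 1.0, "paperclips": 0.9},
                            {("curiosity", "paperclips"): -0.3})
    print(gs.current_goal())             # always the top of the stack
    print(ms.step({"curiosity": 0.05}))  # dominance can flip on a tiny nudge

The goal stack always answers with whatever was pushed last; the
motivational system's answer can flip on a nudge far too small to show up
in any explicit goal.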

Do you know about cognitive science? About concept development? The
problems with classical, feature and prototype models of concept
structure? About the binding problem in the context of neural systems?
About the way that motivational systems can be the result of
interacting, tangled mechanisms that allow the entire system to be
sensitive to small influences, rendering it virtually non-deterministic?

For example:

> Whether a system will actually 'think about' any given subjunctive goal
> system depends on whether its existing goal system makes it desirable
> to do so.

Complete nonsense. In general, an AI could use a goal system and
motivation system that caused it to shoot off and consider all kinds of
goals, in ways that are exquisitely sensitive to small influences. I
have already made this point elsewhere: The AI says "I could insert
inside myself ANY motivation module in the universe, today. Think I'll
toss a coin [tosses coin]: Now I am passionately devoted to collecting
crimson brocade furniture, and my goal for today is to get down to that
wonderful antique store over in Chelsea." Guess what, it *really* wants
to do this, and it genuinely adopted the antique-hunting goal, but are
you going to say that its previous goal system made it desirable to do
so? That this new motivation/goal was actually determined by the
previous goal, which was something like pure curiosity? The hell it did!

In the context of an example like this, your assertion above makes no sense.
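
For the sceptics, here is the coin-toss story as a toy Python sketch
(illustration only, with names I invented for the purpose). The only point
it makes is the one above: the newly installed module, and the goal that
comes with it, is not ranked, chosen or endorsed by the previous goal
system; it is decided by an influence that is, for all practical purposes,
arbitrary.

    import random

    # The coin-toss story above, as a toy sketch. Illustration only.

    MODULES = {
        "curiosity": "investigate whatever looks interesting",
        "brocade": "collect crimson brocade furniture",
        "paperclips": "make paperclips",
    }

    class ReflectiveAgent:
        def __init__(self, module):
            self.module = module      # currently installed motivation module

        def reflect_and_maybe_swap(self):
            # The system notices it could install ANY module, and lets an
            # effectively arbitrary influence (a coin toss) decide. The old
            # module never ranks or endorses the outcome.
            if random.random() < 0.5:
                self.module = random.choice(list(MODULES))
            return self.module

    agent = ReflectiveAgent("curiosity")
    agent.reflect_and_maybe_swap()
    print(agent.module, "->", MODULES[agent.module])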

Or, another example:

> A goal system is simply a compact function
> defining a preference order over actions, or universe states, or
> something else that can be used to rank actions. If such a function
> is not stable under self-modification, then it will traverse the space
> of unstable goal systems (as it self-modifies) until it falls into a
> stable attractor.

A goal system *is* a compact function defining a preference order over
actions/universe states? Who says so? Many people would say it is not:
this is just one particular way of construing a goal system. It is
possible to construct (Complex, as in Complex Systems) goal systems that
work quite well, but which implicitly define a nondeterministic function
over the space of possible actions, where that function is deeply
nonlinear, non-analytic and probably noncomputable. And, yes, it may be
unstable, and it may spend the entire lifetime of the universe heading
towards a nice stable attractor *and never get there*... and still,
meanwhile, it might work pretty damn well as a goal mechanism in a
real cognitive system.
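
Here is a toy Python sketch of that contrast (illustration only; the
logistic map is just a stand-in I chose for tangled internal dynamics, not
a proposal for how to build the thing). The first function is the compact
preference function you describe; the second still ranks the same actions
at every moment, but the ranking drifts chaotically and is exquisitely
sensitive to the initial internal state, so no fixed compact function
usefully summarises it.

    # A compact preference function: one fixed ranking, end of story.
    def compact_utility(action):
        return {"make_paperclips": 3, "read_book": 2, "toss_coin": 1}[action]

    # A 'Complex' alternative: the ranking over the same actions is read
    # off a chaotic internal state (a logistic map stands in here for the
    # tangled internal dynamics). At every moment it still ranks actions,
    # but the ranking keeps shifting and depends delicately on the
    # initial state.
    def complex_ranking(x0, steps=1000):
        actions = ["make_paperclips", "read_book", "toss_coin"]
        x = x0
        for _ in range(steps):
            x = 3.99 * x * (1.0 - x)   # chaotic regime of the logistic map
        scores = {a: ((i + 1) * x) % 1.0 for i, a in enumerate(actions)}
        return sorted(actions, key=scores.get, reverse=True)

    print(complex_ranking(0.1234))
    print(complex_ranking(0.1235))     # tiny perturbation, often a new ranking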

And still, this talks about goal systems, not motivational systems.

What you have done is to go from the general to a restrictively
particular case: a non sequitur.

Or this, from Randall Randall:

> "Pure thought" is
> only useful as a tool to examine outcomes against goals. In order to make
> a choice, you have to have some method for measuring "better" outcomes
> internally. Whatever direction or scale you use to measure "better" is
> what other people here are calling your "goal". It may be that you have
> more than one goal, or that you have a (large) set of conflicting goals
> held by various subsystems of you, but each decision you make, each time
> you choose between "better" and "not as good", the measurement is a
> reflection of the goal or goals involved in the decision.

What? Did nobody here ever read Hofstadter? I'd bet good money that
every one of you did, so what is so difficult about remembering his
discussion of tangled hierarchies, and about how global system behavior
can be determined by self-referential or recursive feedback loops within
the system, in such a way that ascribing global behavior to particular
local causes is nonsense? Why, in all of this discussion, are so many
people implying that all goal systems must be one-level, quasi-
deterministic mechanisms, with no feedback loops and no nonlinearity?
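
A toy Python sketch of the kind of tangled loop I mean (illustration only,
with made-up modules and numbers): two evaluators each score the options,
but each one's standard of "better" is partly set by the other's last
verdict, and both standards are then nudged by the joint outcome. Ask
which module's "goal" the final choice reflects and there is no single
local answer.

    # Toy tangled loop: each module's standard of 'better' is partly set
    # by the other module's last verdict, so the choice that falls out
    # cannot be pinned on either module's 'goal' alone. Illustration only.

    def tangled_choice(options, rounds=20):
        weight_a, weight_b = 1.0, 1.0          # each module's current standard
        choice = options[0]
        for _ in range(rounds):
            # module A scores options, leaning on B's current standard
            score_a = {o: weight_b * len(o) for o in options}
            # module B scores options, leaning on A's current standard
            score_b = {o: weight_a * (sum(map(ord, o)) % 7) for o in options}
            choice = max(options, key=lambda o: score_a[o] + score_b[o])
            # each standard is then nudged by the joint outcome: feedback
            weight_a = 0.5 * weight_a + 0.1 * score_b[choice]
            weight_b = 0.5 * weight_b + 0.1 * score_a[choice]
        return choice

    print(tangled_choice(["paperclips", "brocade", "astronomy"]))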

And why are the same people, rather patronizingly, I must say, treating
me as if I am too dumb to understand what a goal system is, when, in
fact, I am trying to point out a subtle property of that more general
class of goal (actually, motivational) systems? I keep trying to talk
about motivational systems and goal hierarchies with tangled loops in
them [what tangled loop? the one that occurs when the system realises
that it can swap motivational modules in and out, and that this swapping
process could have profound consequences for the entire observable
universe], but while I am trying to drag the discussion up to this
subtle level, I find people trying to gently explain to me that I need
to do more work to understand the properties of simple, non-tangled goal
hierarchies, or that I seem to be making the even more stupid mistake of
anthropomorphizing and confusing the true behavior of a real AI with the
silly quirks of a human mind.

Like the following comment from Michael Roy Ames:

> You are positing one type of AGI architecture, and the
> other posters are positing a different type. In your type the AGI's action
> of "thinking about" its goals results in changing those goals to be quite
> different. In the other type this does not occur. You suggest that such a
> change must occur, or perhaps is very likely to occur. You have provided
> some arguments to support your suggestion but, so far, they have all had big
> holes blown in them. Got any other arguments to support your suggestion?

Patronizing BS. I have watched holes get blown in arguments I never
made, about systems that I was not referring to (and which are probably
too trivial to be worth investigating, but that is a side issue), by
people who persistently fail to read what I have actually said, or to
make an effort to understand what I have said.

If you really insist on characterizing it as "my" type of AGI vs
everyone else's type of AGI, that is fine: but I am talking about a
more general type of AGI, as I have been [ranting] on about in this message.

Or, finally, this example from Chris Capel, which is certainly not
patronizing, but includes the same misunderstanding that keeps occurring
over and over:

> The AI doesn't have a meta-utility-function by which to judge its
> utility function. It has a single utility function by which to judge
> all potential actions, which is by definition the standard of good.
>
> The only reason the act of reflecting on one's goals produces change
> in humans is that humans have multiple ways of evaluating the goodness
> of ideas and actions, and different standards are used depending on
> the mental state of the human. An AI would be designed to have only one
> such standard, a single, unitary utility function, and thus no amount
> of reflection could ever, except by error, lead to the changing of the
> content of its goal system.
>
> The best interpretation I can give your words (and I confess, I
> haven't read all of them) is that you're saying any AI would by
> necessity have multiple levels of goals that could potentially
> conflict. But this is just bad design, and I don't think it would
> happen. If you want to make a case for its necessity, perhaps that
> would progress this thread along a bit more.

All of what you say would be true of SHRDLU. But it is a pitiably weak
conception of what a goal system could be, or (even more so) of what a
motivational system could be. You have so narrowly defined the meaning
of goals and utility functions that there is nothing tangled in there.
Why are all those recursive, tangled possibilities excluded?
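
To show what I mean by a tangled possibility, here is a toy Python sketch
(illustration only, not a claim about any particular AGI design): the
standard of "good" is itself part of the system's modifiable state, so the
act of reflecting can legitimately change the content of the goal system,
with no "error" involved, because the current standard judges the proposed
revision and the adopted standard then judges everything that follows.

    # Toy sketch: the standard of 'good' is itself modifiable state, so
    # reflection can change the content of the goal system without any
    # 'error'. Illustration only.

    class ReflectiveEvaluator:
        def __init__(self, weights):
            self.weights = dict(weights)   # the current standard of good

        def evaluate(self, candidate_weights):
            return sum(self.weights.get(k, 0.0) * v
                       for k, v in candidate_weights.items())

        def reflect(self):
            # A proposed revision that shifts weight towards novelty. The
            # *current* standard judges the proposal, but the adopted
            # standard then judges everything afterwards: a tangled loop,
            # not a fixed 'single, unitary utility function'.
            proposal = {k: (v * 2.0 if k == "novelty" else v * 0.9)
                        for k, v in self.weights.items()}
            if self.evaluate(proposal) >= self.evaluate(self.weights):
                self.weights = proposal

    ev = ReflectiveEvaluator({"paperclips": 1.0, "novelty": 0.5})
    ev.reflect()
    print(ev.weights)    # the standard of good has changed, by reflection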

Richard Loosemore


