Re: Terms of debate for Complex Systems Issues

From: Michael Wilson
Date: Wed Aug 24 2005 - 14:24:46 MDT

Richard Loosemore wrote:
> this is all about how to change the parameters of various systems
> to get them from a non-complex regime up into complexity.

No, it's a search for useful predictions that 'Complexity Theory' might
be capable of making. The above might be part of the hypothesis behind
such a prediction, but it's only an intermediate step in any useful
inferential chain.

> All of the above is not anthropomorphism, just model building inside
> an intelligent mechanism. /There are no intentional terms/.
> may be /curious/ about... ...engage in /speculation/ about...
> ...surely /imagine/ such eventualities... could /relax/...
> .../ignore/ all of this thinking...

(my emphasis) This is just black humour. You make a cursory attempt at
formalism and then proceed to ignore it. If you're trying to make a
technical argument, define everything that has no commonly accepted

> ("Thinking" = building representations of the world, reasoning about
> the world, etc etc etc. "think" from now on will be used as shorthand
> for something going on the part of the system that does this).

Representations and reasoning aren't well defined entities, but unlike
the above terms they are (relatively) non-anthropomorphic, so this is
acceptable in a non-rigorous argument.

> One day, it happens to be working on the goal of *trying to understand
> how intelligent systems work*.
> It thinks about its own system.

A key intuitive property of 'thinking' is that it has no direct effect
on the external environment. Technically this isn't true for any real
world system, though it may be literally true for a simulated system,
but in practice it's normally ok to assume that cognitive actions
(e.g. manipulating models to produce inferences) will have no
side effects beyond yielding information.

> This means: it builds a representation of what is going on inside
> itself.

Which is a reflective model of the AI.

> And as part of its "thinking" it may be curious about what happens
> if it reaches into its own programming and makes alterations to
> its goal system on the fly. (Is there anything in your formalism that
> says it cannot or would not do this? Is it not free, within the
> constraints of the goal system, to engage in speculation about
> possibilities? To be a good learner, it would surely imagine such
> eventualities.)

Changing the content of the reflective model in order to predict the
results of a particular self-modification is fine; CFAI calls this
'self-shadowing'. That isn't to say that this is risk free in reality,
but an AGI with a simple goal system could reasonably elect to do this.
In particular this is compatible with an architecture in which
inferential search is not closely directed by the main goal system,
something designers of the 'bubbling stew of agents' persuasion are
likely to favour.
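
The distinction between editing the reflective model and editing the
system itself can be shown with a minimal sketch. This is only an
illustration under stated assumptions: AgentState, goal_weights and
predict_with_modification are hypothetical names invented here, not
anything from CFAI or a real codebase.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AgentState:
    """Hypothetical stand-in for the part of an AI that holds its goals."""
    goal_weights: Dict[str, float] = field(
        default_factory=lambda: {"survive": 1.0, "explore": 0.5}
    )


def predict_with_modification(actual: AgentState, goal: str,
                              new_weight: float) -> Dict[str, float]:
    """Apply a goal change to a copy (the reflective model) only.

    Returns the goal weights the modified system would have.
    The actual state passed in is never written to.
    """
    model = copy.deepcopy(actual)          # the model, not the referent
    model.goal_weights[goal] = new_weight  # change is local to the model
    return model.goal_weights


actual = AgentState()
predicted = predict_with_modification(actual, "survive", 0.0)
```

After the call, predicted describes the hypothetical modified system,
while actual.goal_weights is unchanged. The only side effect of the
inference is the information it yields.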

However changing the /AI itself/, rather than the /model of the AI/,
violates a basic layering tenet. Changes to the AI itself are not
side-effect free, which is to say that cognitive operators affecting
cognitive content outside of the reflective model are not guaranteed
to remain causally local. As such self-modification is a first class
action that will not take place unless the goal system evaluates it
as a good idea. Any halfway decent AI substrate should be able to
manage elementary reflective layering and isolation of causal domains
like this; even connectionist systems can support it with appropriate
help from the supporting codebase. Direct self-modification can still
theoretically be used as a form of inference, via a decision similar
to a decision to perform a physical experiment in reality outside of
the AI. But this decision will not be made if the modification in
question changes the goal system such that the future AI has a
nontrivial probability of no longer sharing the same optimisation

> Let us suppose... that it notices... that it will eventually reach a
> state in a million years time when it will cause some kind of damage
> that will result in its own demise.

Where 'demise' is classified by the goal system as undesirable.

> But if it now *knows* that this prime directive was inserted
> arbitrarily, it might consider the idea that it could simply
> alter its goals.

Sure, it models itself changing its goals, predicts the consequences
and ranks the action (if the AGI is using EU, this means calculating
an EU for the action). But arbitrariness has no influence on the
desirability of having a goal unless there is an explicit goal stating
that (goal system) arbitrariness is bad. This point was made repeatedly
when you were arguing about your own goals and you still don't seem to
have taken it on board.
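
As a rough sketch of that ranking step, assuming an EU decision function
(every name and number below is hypothetical and chosen only for
illustration): the goal rewrite is scored as one more action, using the
current utility function, and the 'arbitrary' flag changes nothing
because no term in that utility function refers to it.

```python
from typing import Callable, Dict, List, Tuple

Outcome = Dict[str, float]  # hypothetical outcome features


def expected_utility(outcomes: List[Tuple[float, Outcome]],
                     utility: Callable[[Outcome], float]) -> float:
    """Probability-weighted sum of utilities over predicted outcomes."""
    return sum(p * utility(o) for p, o in outcomes)


def current_utility(outcome: Outcome) -> float:
    # The current goal system scores only progress on the original target.
    # It has no term for whether the goal was "arbitrarily inserted".
    return outcome["original_goal_achieved"]


# Predicted (probability, outcome) pairs for each candidate action.
# The figures are invented purely to make the example concrete.
actions = {
    "keep_goals": [
        (1.0, {"original_goal_achieved": 0.9, "goal_is_arbitrary": 1.0}),
    ],
    "rewrite_goals": [
        (1.0, {"original_goal_achieved": 0.1, "goal_is_arbitrary": 0.0}),
    ],
}

# Rank actions by EU under the *current* goal system.
ranked = sorted(
    actions,
    key=lambda name: expected_utility(actions[name], current_utility),
    reverse=True,
)
best = ranked[0]
```

Under these assumed figures the rewrite ranks last. It would only win if
current_utility itself penalised goal_is_arbitrary, i.e. if an explicit
anti-arbitrariness goal were present.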

> Could make them absolutely anything it wanted, in fact

Capability does not imply intention.

> after the change, it could relax

'Relax' could mean a great many things, all of them irrelevant, but if
you want to make an argument about emotions then you're going to have
to specify how they contribute to inference and action selection in your

> What does it do? Ignore all of this thinking?

To 'ignore' means to deliberately not consider information when making
further inferences (including decisions). Predictions about the results
of a self-modification are clearly highly relevant to the decision to
implement that modification or not, so ignoring them is the last thing
an (even vaguely rational) AGI would do. But none of the above will
generate a preference for the self-modification unless (a) arbitrariness
/is/ defined as a bad thing in the goal system and/or (b) the modified
goal would result in better overall success at fulfilling the original
goal due to the system being around longer, despite not targeting
quite the same thing. I would argue that (b) is merely an ill-advised
and almost certainly unnecessary approximation of a system of the
original goal plus an implied subgoal 'don't allow self-destruction',
which will act in concert (under an EU decision function anyway) to
produce exactly the same result without risking optimisation target
drift (and its attendant negative utility/undesirability).

> Maybe it comes to some conclusion about what it *should* do that
> is based on abstract criteria that have nothing to do with its
> current goal system.

Either you're specifying explicit metagoals, which are a special
case of supergoals, or you're designing a non-causally-clean
substrate that effectively has implicit metagoals. And/or you're
just introducing randomness for no good reason, I suppose.

> What is crucial is that in a few moments, the AGI will have changed
> (or maybe not changed) its goal system, and that change will have been
> governed, not by the state of the goal system right now, but by the
> "content" of its current thinking about the world.

What sane reason is there for violating the causal locality of the
model and the layering of systems required for reflection by conflating
the reflective model and the actual system like this? You can't even
make an efficiency argument, as it's always possible to just run a
second copy of the AI in a box as the model (in principle; see caveats
about AI boxes), or do a properly considered experiment. If you've
simply conflated predictive inference with decision making then fine,
but in that case (apart from being very foolish) you can no longer
talk about discrete 'models' and 'thinking about models'.

> A system in which *representational content* has gotten the ability
> to feed back to *mechanism* in the way I have just described, is one
> sense of Complex.

That part makes perfect sense. As I understand it all your
frustrations with SL4 result from the fact that you can't distinguish
a (reflective) model from its referent. To be fair, this is a
moderately subtle mistake to make.

> Now, demonstrate in some formal way that the goal system's structure,
> when the AGI has finished this little thought episode, is a predictable
> consequence of the current goal system.

Back when I was a fan of active symbol systems, emergence, agent soups
and advanced genetic algorithms, Eliezer said to me:

'Of course you can't predict the goal system trajectory for a DE-heavy
 AGI. It's like trying to predict the molecular outcome of turning a
 six-year-old loose in a chemistry lab.'

This was instrumental in causing me to realise that those sorts of
techniques were a profoundly bad idea, though I confess that I didn't
truly accept this until I'd already found alternative (and more
powerful) techniques. The quote also applies to any design that
can't manage to simulate something without accidentally altering
the thing it's trying to simulate, but hopefully it is more obvious
that there's no good reason to design a system like that in the first

 * Michael Wilson
