Re: Is complex emergence necessary for intelligence under limited resources?

From: Richard Loosemore (rpwl@lightlink.com)
Date: Tue Sep 20 2005 - 13:05:31 MDT


Ben Goertzel wrote:
> Richard,
>
>
>>The answers that others offer to your questions are, pretty much: no
>>you cannot really avoid complex systems, and mathematical verification
>>of their friendliness is the very last thing you would be able to do.
>>The main defining characteristic of complex systems is that such
>>mathematical verification is out of reach.
>
>
> There is no question that human mind/brain is a complex system which
> achieves its intelligence via emergence, self-organization, strange
> attractors, terminal attractors, and all that great stuff....
>
> And there is little question that emergence-based intelligence is
> intrinsically very difficult to predict and control with a high
> degree of reliability, thus rendering verified Friendliness an
> unlikely outcome.
>
> However, these observations don't tell you much about whether it's
> possible to use digital computers to create an intelligence that
> DOESN'T rely critically on the emergent phenomena typically
> associated with biological complex dynamical systems.
>
> Semi-similarly, the human mind/brain probably uses complex emergent
> phenomena to add 2+2 and get 4, but, a calculator doesn't, and a
> calculator does a better job of arithmetic anyway.
>
> One may argue that flexible, creative intelligence is fundamentally
> different than arithmetic, and is not achievable within limited
> computational resources except via complex, unpredictable emergent
> dynamics. In fact I strongly SUSPECT this is true, but I haven't SHOWN
> that it's true in a convincing way, and I'm not 100% convinced
> it's true.
>
> If you have a strong argument why this contention is true, I'd be
> very eager to hear it.
>
> On the other hand, some others seem pretty sure that the opposite is true,
> and that it IS possible to achieve powerful intelligence under
> limited resources without requiring unpredictable emergent phenomena.
>
> However, I haven't seen any strong arguments in this direction either.
>
> -- Ben Goertzel
>
>

Ben,

In reply to your question, I'll see if I can outline my argument in more
detail than previously.

[I am targeting this argument at people who actually understand what a
complex system is: I am beyond the point of trying to educate people
who not only do not understand, but are scornful of even making the
effort to understand, and who repeatedly throw out false arguments based
on caricatures of what I and the complex systems people have claimed.]

Preliminary Remark 1: The best I am going to be able to do is offer
convincing empirical reasons why the thesis is true; certain proof is
beyond reach, alas. So it will always be something of a judgement call
whether one accepts these arguments or not.

*Overview.*

The overall direction of the argument is going to be this: that in all
the work on AGI to date, there has been relatively little emphasis on
getting the "symbols" [please interpret the word loosely: these are just
the basic representational units that encode the smallest chunks of
knowledge about the world] to be constructed entirely without programmer
intervention. In other words, we tend not to let our systems develop
their symbols entirely as a result of the interaction of a learning
mechanism with a stream of environmental input. Rather, we tend to put
"ungrounded" symbols in, which we interpret. The argument is going to
be that there are indications that this facet of an AGI (whatever
apparatus allows the symbols to be fully grounded) is going to be more
important than we suppose, and that it will introduce a great deal of
complexity, and that this, in turn, will be impossible to avoid.

*Detailed Version of this Argument*

We all know that in the Good Old Days, a lot of AI folks would build
systems in which they inserted simple tokens, labelled them "arch",
"hand", "table", "red" and so on, then wrapped a mechanism around those
symbols so the system could manipulate the symbols as a representation
of a world and, as a result, display some intelligent behavior (like being
able to manipulate, and answer questions about, the placement of blocks
on a table).
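
Just to fix ideas, here is a toy sketch of that style of system
(entirely made up for this message; the predicates and rules are mine,
not any particular historical program's). Note that the tokens mean
nothing to the program itself:

    # A Good-Old-Days style symbol system, in miniature. The tokens
    # "block_a", "block_b", "table", "on" are interpreted by us, the
    # programmers; the system never built them from experience.

    facts = {("on", "block_a", "table"),
             ("on", "block_b", "block_a")}

    def clear(x):
        # The table always has room; a block is clear if nothing sits on it.
        if x == "table":
            return True
        return not any(rel == "on" and below == x
                       for (rel, above, below) in facts)

    def move(block, dest):
        # Move a clear block onto a clear destination, updating the facts.
        if not (clear(block) and clear(dest)):
            return False
        for fact in set(facts):
            if fact[0] == "on" and fact[1] == block:
                facts.discard(fact)
        facts.add(("on", block, dest))
        return True

    move("block_b", "table")                     # "intelligent behavior", to us
    print(("on", "block_b", "table") in facts)   # True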

What we now know is that these kinds of systems had problems, not the
least of which was the fact that the symbols were not grounded: the
system never created the symbols itself, so it was up to the programmer
to interpret them manually (so to speak).

What was the solution? Clearly, one part of the solution had to be to
give the system a learning mechanism, so it could build new symbols out
of simpler ones. Or rather, so it could build symbols from scratch, out
of real-world raw input of some kind. Or rather... well, you can see
that it isn't entirely clear, so let's unwrap this in a bit more detail.

Question: Do we have to get the system to build all its symbols ab
initio, from a completely blank slate? Can't we give it some starter
symbols and ask it to develop from there? Surely it would be reasonable
if the system had some innate knowledge of the world, rather than none
at all? The consensus view on this question is that it should not be
necessary to go all the way back to signals coming from raw nerve
endings in eyes, ears, hands, etc., but that we should be able to put in
some primitive symbols and get the rest of them to be generated by the
system. (Aside: the connectionists were pretty much defined to be the
group that broke ranks at this point and insisted that we go down to
much deeper levels ... we will avoid talking about them for the moment,
however).

So learning mechanisms should start with some simple symbols and create
more complex (more abstract) ones as a result of observing and
interacting with the world. Where do we draw the line, though? Which
primitives are acceptable, and which do we think are too high-level?
People disagree. Many who work in the machine learning field do not
really accept any constraints on the high-levelness of their primitive
symbols, and are happy to get any symbols to develop into any others,
not caring if the primitives look low-level enough that we can believe
they escape the grounding problem.

We, however (from the perspective of this essay) care about the
grounding problem and see a learning mechanism as a way to justify our
usage of symbols: we believe that if we find a plausible learning
mechanism (or set of mechanisms) it would be capable of going all the
way from very primitive sensorimotor signals, or very primitive innate
symbols, all the way up to the most abstract symbols the system could
ever use. If we found something as plausible as that, we would believe
we had escaped the grounding problem.

Next step. We notice that, whatever learning mechanism we put our money
on, it is going to complicate our symbols.

[Terminology Note: I will use "complicate" to mean just what it seems,
and *nothing whatsoever* to do with "Complex" as in Complex Systems.
Hopefully this will avoid confusion. If I use "complex" it will mean
"as in 'complex systems'"].

What do I mean by complicating our symbols? Only that if they are going
to develop, they need more stuff inside them. They might become
"frames" or "scripts" or they might be clusters of features, or they
might have prototypes stored in them.... whatever the details, there
seems to be a need to put more apparatus in a symbol if it is going to
develop. Or, if not passive data inside the symbol (the way I have
implied so far), then more mechanism inside it. It seems quite hard,
for a variety of subtle reasons, to build a good system that combines
utterly simple, passive symbols (just a token, an activation
level and links to other tokens) with a completely separate learning
mechanism that sits outside the symbols.
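
To put the contrast in concrete terms (an invented illustration only,
not a design proposal), the pressure is roughly from the first kind of
structure below towards the second:

    from dataclasses import dataclass, field

    @dataclass
    class PassiveSymbol:
        # The "utterly simple" symbol: a token, an activation level,
        # and links to other tokens. All the cleverness lives elsewhere.
        token: str
        activation: float = 0.0
        links: list = field(default_factory=list)    # names of other tokens

    @dataclass
    class ComplicatedSymbol:
        # The kind of thing learning seems to push us towards: a frame-like
        # structure with slots, stored exemplars, a prototype, and even a
        # little mechanism of its own.
        token: str
        activation: float = 0.0
        slots: dict = field(default_factory=dict)        # role -> filler
        exemplars: list = field(default_factory=list)    # remembered instances
        prototype: dict = field(default_factory=dict)    # typical feature values

        def assimilate(self, instance: dict):
            # A sliver of "mechanism inside the symbol": fold a new instance
            # into the prototype by crude incremental averaging.
            self.exemplars.append(instance)
            n = len(self.exemplars)
            for feature, value in instance.items():
                old = self.prototype.get(feature, value)
                self.prototype[feature] = old + (value - old) / n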

Now, I might be being unfair here by implying that the reason our
symbols became more complicated was because, and only because, we needed
learning mechanisms. I don't think I really mean to put it that
strongly (and it might be interesting only to future historians of AI
anyhow). We had other reasons for making them more complicated, one of
them being that we wanted non-serial models of cognition in which less
power was centralized in a single learning mechanism and more power
distributed out to the (now active, not passive) symbols.

So perhaps the best way to summarize is this: we found, for a variety
of reasons, that we were pushed toward more complicated mechanisms
(and/or data structures) inside our symbols, in order to get them to do
more interesting things, or in order to get over problems that they
clearly had.

This is a very subtle point, so although a lot of people reading this
will be right along with me, accepting all of this as obvious, I know
there are going to be some voices that dispute it. For that reason, I
am going to dwell on it for just a moment more.

Sometimes you may think that you do not need complicated, active
symbols, and that in fact you can get away with quite simple structures,
allied with a sophisticated learning mechanism that builds new symbols
and connects them to other symbols in just the right, subtle way that
allows the system as a whole to be generally intelligent. In response
to this position, I will say that there is a trap here: you can always
rearrange one of my complicated-symbol systems to make it look as if the
symbols are actually simple (and maybe passive also), at the cost of
making the learning and thinking mechanism more complicated. You know
the kind of thing I mean: someone proposes that symbols should be
active neuron-like things, and then some anti-neural-net contrarian
insists that they can do the same thing with a centralised mechanism
acting on a matrix of passive data values. We have all seen these kinds
of disputes, so let's just cut through all the nonsense and point out
that you can always reformulate anything to look like anything else, but
that some types of formulation look more natural, more efficient and
(generally) more parsimonious. So when I argue that there is a tendency
towards more complicated symbols, I mean that the consensus intuition of
the community is that the simplest, most parsimonious AGI systems tend
to work with symbols that have more complicated apparatus inside them
than simply a token plus a couple other bits.
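
(To illustrate the "you can always reformulate" point with a throwaway
example of my own: the two fragments below compute exactly the same
spreading-activation step, once with symbols that actively update
themselves and once with a central mechanism sweeping over a passive
matrix of numbers. Neither is more powerful; one is just a more natural
carving-up than the other.)

    # Formulation 1: "active" symbols that update their own activation.
    class ActiveSymbol:
        def __init__(self, name):
            self.name = name
            self.activation = 0.0
            self.inputs = []          # list of (source_symbol, weight)

        def step(self):
            return sum(w * src.activation for src, w in self.inputs)

    a, b, c = ActiveSymbol("a"), ActiveSymbol("b"), ActiveSymbol("c")
    a.activation, b.activation = 1.0, 0.5
    c.inputs = [(a, 0.3), (b, 0.7)]
    new_c = c.step()                  # 0.3*1.0 + 0.7*0.5 = 0.65

    # Formulation 2: passive data values, one central mechanism.
    activations = {"a": 1.0, "b": 0.5, "c": 0.0}
    weights = {("a", "c"): 0.3, ("b", "c"): 0.7}

    def central_step(target):
        return sum(w * activations[src]
                   for (src, dst), w in weights.items() if dst == target)

    assert abs(new_c - central_step("c")) < 1e-9   # same number either way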

So I want to wrap up all of this and put it in a Hypothesis:

*Complicated Symbol Hypothesis*

   "To the extent that AGI researchers acknowledge the need to capture
sophisticated learning capabilities in their systems, they discover that
they need their symbols to be more complicated."

Corollary:

   "The only people who believe that symbols do not need to be
complicated are the ones who are in denial about the need for learning
mechanisms."

***
Please note the status of this claim. It is not a provable contention:
  it is an observation about the way things have been going. It is
based on much thought and intuition about the different kinds of AI
systems and their faults and strengths. The term "complicated" is open
to quite wide interpretation, and it does not mean that there is a
linear growth in complicatedness as the sophistication of the learning
mechanism goes up, only that we seem to need relatively complicated
symbols (as
compared with simple passive tokens with activation levels and simple
links) in order to capture realistic amounts of learning.
***

But now, what of it? What does it matter that we need more stuff in the
symbols? How is this relevant to the original question about complexity
in AGI design?

To answer this, I am going to try to unpack that idea of "complicated
symbols" to get at some of the details.

When we observe humans going through the process of learning the
knowledge that they learn, we notice that they have some extremely
powerful mechanisms in there. I mean "powerful" in the sense of being
clever and subtle. They seem to use analogy a lot, for example. To get
some idea of the subtlety, just go back and look at the stuff Hofstadter
comes up with in GEB and in Metamagical Themas. When I talk about
"learning" I don't mean the restricted, narrow sense in which AI folks
usually talk about learning systems, I mean the whole shebang: the
full-up, flexible kind of learning that people engage in, where jumping
up a level of representation and pulling analogies around seems to be
almost the norm, rather than the exception.

Getting this kind of learning capability into an AGI is the goal, as far
as the present discussion is concerned. Anything less is not good
enough. I think we can all agree that there is a big gap between
current machine capabilities and the awesome generality of the human system.

But how to do it? How do we close that gap?

At this point, everyone has a different philosophy. I look at language
learning in a 2-to-6 year old child and I think I observe that when I
talk to that child, I can define pretty much anything in the whole world
by referring to loose examples and analogies, and the child does an
astonishing job of getting what I am saying. I even define grammatical
subtleties that way, when teaching the child how to talk. But to someone like
Chomsky, Fodor or Pinker, this entire process may be governed by a
massive amount of innate machinery [and, yes, Fodor at least seems to
believe that innateness does not just apply to the grammatical machinery
but to most of the conceptual apparatus as well, with just a little
content filling in to be done during maturation].

Then there are folks who don't go for the Chomskian innateness idea, but
who do insist that there is nothing wrong with our current ideas about
the basic format (data and mechanisms) inside symbols: all we need to do
is build a system that is big enough and fast enough, and connect it up
to a sufficiently rich stream of real-world input (and realistic motor
output systems), and it will develop the same rich repertoire of knowledge that
we see in humans. These people believe that all we need to do is take
symbols the way they are currently conceived, add some richer, improved,
to-be-determined learning mechanisms that browse on those symbols, and
all will eventually be well.

So what do I think? I disagree with the position taken in that last
paragraph. I am now going to try to focus in on exactly why I disagree.

Imagine a hypothetical AGI researcher who first decides what the format
of a symbol should be and then tries to hunt for a learning mechanism
that will allow symbols of that sort to develop as a result of
interaction with the world. Just for the sake of argument (which means
don't ask me to defend this!) let's be really naive and throw down some
example symbol structure: maybe each symbol is a token with activation
level, truth value, labelled connections to some other symbols (with the
labels coming from a fixed set of twenty possible labels) and maybe an
instance number. Who knows, something like that.
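
Written out as a data structure (purely for concreteness; every field
and example name below is just the naive stuff thrown down above), it
might look like this:

    from dataclasses import dataclass, field
    from typing import Optional

    # The hypothetical researcher's naive symbol format: a token, an
    # activation, a truth value, labelled links drawn from some fixed set
    # of labels, and an instance number.

    LINK_LABELS = {"isa", "part-of", "causes", "near", "owns"}   # ... up to the fixed twenty

    @dataclass
    class Symbol:
        token: str
        activation: float = 0.0
        truth_value: float = 1.0
        links: list = field(default_factory=list)    # (label, other_token) pairs
        instance_number: Optional[int] = None

    # Hand-built, programmer-interpreted content (names invented here):
    valve = Symbol("thruster_valve",
                   links=[("part-of", "propulsion_system"),
                          ("causes", "attitude_drift")],
                   instance_number=3)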

First the researcher convinces themself that the proposed system can
work if given ungrounded, programmer-interpreted symbols. They knock up
a system for reasoning about a little knowledge domain and show that
given a stream of predigested, interpreted information, the system can
come to some interesting conclusions within the domain. Maybe the
system gets loaded on a spacecraft bound for the outer solar system and
it can "think" about some of the ship's technical troubles and come up
with strategies for resolving them. And it does a reasonable job, we'll
suppose.

So far, not much learning. It didn't learn about spacecraft repair from
a textbook, or from getting out a wrench and trying to build spacecraft,
or from long conversations with the engineers, it was just preloaded
with symbols and information.

But the researcher is hopeful, so they start adding learning mechanisms
in an attempt to get the system to augment itself. The idea is not just
to get it to add to its existing knowledge, but to start with less
knowledge and get to where it is now, by doing its own learning. We
are, after all, on a quest to eliminate most of that preloaded knowledge
because we want to ground the system in the real world.

But as the researcher tries to devise more and more powerful kinds of
learning mechanisms, they discover a trend. Those more powerful
mechanisms need more complicated stuff inside the symbols. Let's
suppose that they are trying to get analogy to happen: they find that
the existing symbol structure is too limited and they need to add on a
bunch of extra doohickeys that represent... well, they don't represent
anything that easily has a name at the symbol level, but they are needed
anyhow to get the system to do flexible, tangled kinds of stuff that
leads to the building of new symbols out of old.

When and if the researcher tries to avoid this - tries to keep the
symbols nice and clean, like they were before - they discover something
rather annoying: the only way they can do this is to put more stuff in
the learning mechanisms (outside of the symbols) instead. Keep the
symbols clean, but make the (non-symbol-internal) learning mechanisms a
lot more complicated. And then there is even more trouble, because it turns
out that the learning mechanism itself starts to need, not just more
machinery, but its own knowledge content! Now there are two places
where knowledge is being acquired: symbol system and learning engine.
And they don't talk, these two systems. In the process of trying to
keep the symbols clean and simple, the learning system had to invent new
strategies for learning, and (this is the real cause of the trouble)
some of those new learning mechanisms really seemed to be dependent on
the content of the world knowledge.
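
A crude picture of where this leaves us (names invented for this
message, and no claim that any actual system is organised this way):
two separate places where knowledge piles up, neither of which sees the
other's.

    # Two disconnected places where knowledge ends up accumulating.

    class SymbolStore:
        # The "official" world knowledge: clean, passive symbols.
        def __init__(self):
            self.symbols = {}      # token -> whatever structure we settled on

    class LearningEngine:
        # Kept outside the symbols so that they stay clean... but it starts
        # to need knowledge content of its own, e.g. which generalisation
        # strategies have paid off in which domains. That is knowledge about
        # the world, yet it lives here, invisible to the symbol store.
        def __init__(self):
            self.strategy_stats = {}   # (strategy, domain) -> (success rate, n)

        def record_outcome(self, strategy, domain, worked):
            rate, n = self.strategy_stats.get((strategy, domain), (0.0, 0))
            n += 1
            rate += ((1.0 if worked else 0.0) - rate) / n
            self.strategy_stats[(strategy, domain)] = (rate, n)

    engine = LearningEngine()
    engine.record_outcome("analogy-within-domain", "plumbing", worked=True)
    # The SymbolStore never hears about any of this: the two bodies of
    # knowledge have no way to talk to each other.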

Something difficult-to-explain is happening here. It is because
learning (knowledge acquisition and refinement) is apparently so
flexible and reflexive and tangled in humans that we have reason to
believe that the (human) learning mechanism that is the generator of
this behavior must itself involve some quite tangled mechanisms.

What does this seem to imply for the design of an AGI? It seems to
indicate that if we want a natural, parsimonious design, we are going to
inevitably head towards a type of system in which the symbols are
allowed to grow in a tangled way right from the outset, with knowledge
_about_ the world and knowledge about _how to understand the world_
being inextricably intertwined. And then, when we try to get those
systems to actually work with real world I/O, we will discover that we
have to add tweaks and mechanisms inside the symbols, and in the
surrounding architecture, to keep the system stable. And sooner or
later we discover that we have a system that seems to learn new concepts
from [almost] scratch quite well, but we have lost our ability to
exactly interpret what is the meaning of the apparatus inside and around
the symbols. We might find that there is no such thing as a symbol that
represents "cup", there is only a cluster of units and operators that
can be used to stand for the cup concept wherever it is needed, and the
cluster manifests in different ways depending on whether we are picking
up a cup, describing a cup, trying to define a cup, trying to catch a
falling cup, trying to design an artistic-looking cup, and so on.
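
As a cartoon of that last point (made-up feature names, nothing more):
there is no "cup" symbol to be found anywhere, only a pool of small
units, and each task lights up a different pattern over that pool.

    # No single "cup" symbol -- just a cluster of small units whose
    # pattern of activation depends on what we are doing with the cup.

    cup_cluster = {
        "graspable-handle": 0.9,
        "concave-container": 0.8,
        "holds-hot-liquid": 0.7,
        "ceramic-fragile": 0.6,
        "roughly-cylindrical": 0.5,
    }

    task_emphasis = {
        "picking-up":  {"graspable-handle": 1.0, "ceramic-fragile": 0.8},
        "describing":  {"roughly-cylindrical": 1.0, "concave-container": 0.9},
        "catching":    {"ceramic-fragile": 1.0, "graspable-handle": 0.6},
    }

    def manifest(task):
        # The "cup concept" as it shows up in this particular context.
        weights = task_emphasis[task]
        return {unit: act * weights.get(unit, 0.2)
                for unit, act in cup_cluster.items()}

    print(manifest("picking-up"))
    print(manifest("describing"))   # a visibly different pattern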

In short, we may end up discovering, in the process of trying to get a
realistic set of human-like, powerful, recursive learning mechanisms to
actually work, that all the original apparatus we put in those symbols
becomes completely redundant! All the reasons the system now functions
might actually be enshrined in the extra apparatus we had to introduce
to (a) get it to learn powerfully and (b) get it to be stable in spite
of the tangledness.

But wait, that sounds like a presumption on my part: why jump to the
conclusion that "all the original apparatus we put in those symbols
becomes completely redundant"? Why on earth should we believe this
would happen? Isn't this just a random piece of guesswork?

The reason it is not a wild guess is that when the complicated learning
mechanisms were introduced, they were so tangled and recursive (with
data and mechanism being intertwined) that they forced the system away
from the "simple system" regime and smack bang into the middle of the
"complex system" regime. In other words, when you put that kind of
reflexivity and adaptiveness in such a system, it is quite likely that
the low level mechanisms needed to make it stable will look different
from the high-level behavior. We *want* something that approximates our
conventional understanding of symbols to appear in the top level
behavior - that is our design goal - and we want enormously powerful
adaptive mechanisms. The experience of the complex systems community is
that you can't start with design goals and easily get mechanisms that do
that. And you especially cannot start with low level mechanisms that
look like the desired high-level ones, add a soupcon of adaptiveness,
and then expect the high and low levels to still be the same.
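
(If anyone wants a two-minute feel for what "the low level mechanisms
look different from the high-level behavior" means, the stock toy
examples from the complex systems literature will do. The snippet below
runs an elementary cellular automaton, rule 110; it has nothing to do
with AGI as such, but notice that the local rule is a trivially simple
lookup table, and staring at it tells you essentially nothing about the
intricate global patterns it produces.)

    # Elementary cellular automaton, rule 110: a trivially simple local
    # rule whose global behaviour is famously rich. The low-level rule
    # table bears no visible resemblance to the high-level patterns.

    RULE = 110

    def next_row(row):
        n = len(row)
        return [(RULE >> (row[(i - 1) % n] * 4 + row[i] * 2 + row[(i + 1) % n])) & 1
                for i in range(n)]

    row = [0] * 40 + [1] + [0] * 40
    for _ in range(30):
        print("".join("#" if cell else "." for cell in row))
        row = next_row(row)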

Now, the diehard AGI researcher listens to me say this and replies:
"But I am not *trying* to emulate the messy human design: I believe
that a completely different design can succeed, involving fairly clean
symbols and a very limited amount of tangling. Just because humans do
it that way, doesn't mean that a machine has to."

The reply is this. As you put more powerful learning mechanisms in your
designs, I see your symbols getting more complicated. I see an enormous
gap, still, between the power of human learning mechanisms and the power
of existing AGI mechanisms. There is a serious possibility that you can
only keep a clean, non-complex AGI design by steering clear of extremely
tangled, recursive, powerful knowledge acquisition mechanisms. And if
you steer clear of them, you may find that you never get the AGI to
actually work.

To the complex systems person, observing the only known example of a
[non-A]GI (the human mind), it appears to be a matter of faith that real
learning can happen in a clean, non-complex AGI design. Their point of
view (my point of view) is: go straight for the jugular please, and
produce mechanisms of awesome, human level learning power, but *without*
sending the system into Complex territory, and I will have some reason
to believe it possible.

At the moment, in the face of the complexity I see in the human design,
I see no compelling reason to believe that it is possible.

I am sure I could articulate this argument better. But that is my best
shot for the moment.

Richard Loosemore.


