From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Mon Apr 15 2002 - 18:56:36 MDT
Ben Goertzel wrote:
>
> Actually, my definition of pattern allows for a variety of definitions of
> "simplicity".
>
> The most straightforward mathematical definition is "brevity of program
> size", but as I state explicitly, another valuable definition is "brevity of
> runtime".
Not in the stuff I'm reading... maybe one of your other works? Anyway,
brevity of program size is a special case of brevity of runtime, not the
other way around. This should become clear shortly...
> I think that the perspective you outline here can be seen as special case of
> the general framework I've defined.
I'm about to argue that it's the other way around. :)
> So, your definition is my definition with:
>
> -- simplicity defined as some sort of average of space and time cost
>
> -- similarity defined relative to the knowledge base K defined by the system
> in question
(Not "similarity" defined relative to the knowledge base K - what would that
mean? - but "similarity" defined relative to a goal-oriented role. That's
why you need a surrounding AI for the definition to make sense.)
I think it makes a big difference, in practice, whether you call what you're
looking for "simplicity" or "useful complexity". It makes a big difference
whether you're looking for a "representation as something simpler" or
"complexity that renders a problem tractable". It makes a big difference
whether you're looking for "similarity", what I would call a correspondence
mapping, or for "utility" in a goal context. These concepts may be related
mathematically but their intuitive surface manifestations are different. So
psychologically there is a definite difference, one that IMHO shows a strong
surface correspondence with our respective beliefs about how AI should be
pursued.
There are also good mathematical reasons to define simplicity as a special
case of tractability rather than the other way around, especially if you
want to match your intuitions to seed AI (or evolutionary psychology, for
that matter).
How can we mathematically define tractability? It will come as no surprise
to any reader of DGI that I want to define it in terms of fitness
landscapes. Suppose that a system S is at a point P within a landscape L,
and you want to get from P to Q. What is the distance between P and Q? One
way is to define "distance" as inversely proportional to the probability of
system S taking the step to Q from P. Another way is to define distance as
the amount of computing power that needs to be expended to get from P to Q.
The first definition is a special case of the second: if taking a step is
highly improbable, the system needs to try out a large number of candidate
steps before hitting on the right one - that is, a measure of improbability
can always be converted into a measure of required computing power, though
not always vice versa. The utility of the first definition's
phrasing - distance as inversely proportional to probability - is that a
bias in the right direction can be as good as cash; Deacon fans take note.
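
In code, the conversion runs one way. A minimal sketch, assuming blind
sampling - the probability and trial counts are hypothetical:

    import random

    def expected_cost(p_step, trials=10_000, seed=0):
        """Estimate the computing power (samples drawn) needed to take a
        step that succeeds with probability p_step, under blind search."""
        rng = random.Random(seed)
        total = 0
        for _ in range(trials):
            draws = 1
            while rng.random() >= p_step:  # keep sampling until the step lands
                draws += 1
            total += draws
        return total / trials

    # Geometric distribution: expected draws = 1/p.  Improbability converts
    # directly into required computing power; the reverse need not hold,
    # since compute can be spent on things other than retrying a step.
    print(expected_cost(0.01))   # ~100 samples for a step of probability 1%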
This basic unit of tractability - the computational distance between P and Q
for a system S - can be deconstructed from below in a number of ways; "Q"
may represent a class of targets having some minimum utility, any one of
which would be acceptable, and so on. But it will do as a building block
for the other concepts needed.
"Useful complexity", or complexity that contributes to tractability, is a
pattern that, when added to the system S, tends to decrease the distance
between P and Q. If the pattern decreases the distance between a very broad
range of Ps and Qs, then the pattern might be said to have something to do
with "general intelligence", as above.
"Manageability" becomes the ability to reach a final target through a series
of interim steps of incrementally increasing perceived utility. If the
distance between P and Q is inversely proportional to probability, then it's
important to break up the journey from P to Q into as many incremental steps
as possible. In other words, if the system needs to go from P to Q to R
without Q having any perceived utility of its own, it needs to launch a
two-ply search, and the total distance from P to R is the distance from P to
Q *times* the distance from Q to R. If the system can break up the problem
into two steps, by seeing Q as incrementally closer to a solution, then the
total distance from P to R is the distance from P to Q *plus* the distance
from Q to R.
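
With made-up numbers, the difference is stark:

    # Hypothetical step probabilities; distance = 1/probability.
    p_PQ = p_QR = 1e-3

    # Q has no perceived utility: the system must find the P->Q->R jump
    # as a single blind two-ply search, so distances multiply.
    print((1 / p_PQ) * (1 / p_QR))   # 1,000,000 expected samples

    # Q is perceived as incremental progress: the journey decomposes into
    # two separate searches, so distances add.
    print((1 / p_PQ) + (1 / p_QR))   # 2,000 expected samples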
The simplicity c(P) of a pattern P, and the simplicity of Q given P, c(Q|P),
respectively become the amount of computational power the system must expend
to arrive at solution P, and the amount of computational power the system
must expend to arrive at solution Q given solution P. "Simplicity" is a
special case of tractability. Usually, if Q is simple given P under the
complexity-theory definition, then it means that it takes less computational
power to arrive at Q given P. But this is not always the case. When it is
not the case, then the greater simplicity of Q given P does not make Q more
accessible to the AI given P. The tractability definition is the one that
is of immediate use in AI. The simplicity definition sometimes, but not
always, produces useful tractability.
The simplicity definition is less complex, but the tractability definition
is more useful...
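
To see the two definitions come apart concretely, consider a
proof-of-work style search - my own illustration, not drawn from Ben's
framework:

    import hashlib
    from itertools import count

    # Q's description given P is tiny - "the least n whose SHA-256 hex
    # digest starts with five zeros" - so c(Q|P) is small under the
    # complexity-theory definition.  But *arriving* at Q still costs the
    # system roughly 16**5 = ~10^6 hash evaluations: Q is simple given P
    # without being correspondingly tractable given P.
    def find_q():
        for n in count():
            if hashlib.sha256(str(n).encode()).hexdigest().startswith("00000"):
                return n

    print(find_q())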
For seed AI, it is useful to distinguish between solution complexity and
system complexity. The manageability of a problem X, for a given system, is
determined by the structure of X, St(X), which in this case becomes
analogous to the fitness landscape of X. X is manageable if the simpler
solutions in St(X), or the perceived "good intermediates" for partial
solutions, render tractable the complex high-utility solutions in St(X).
However, the distance between any given P and Q is relative to the system S
and its knowledge base K. The accumulation of expertise can render a single
problem tractable if the expertise decreases the distance between P and Q,
or increases the system's ability to perceive goodness in partial solutions
so that the problem can be broken down into a larger number of smaller
steps. In turn, the acquisition of *expertise* is manageable if acquiring
tractable simple expertise enables the system to acquire complex useful
expertise! *This* is what you're looking for in an environment for baby AIs
- not just an environment with manageable problems, but also an environment
with manageable expertise, so that the AI can manageably acquire the
expertise that makes larger and more complex problems manageable!
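
A toy model of that loop, with invented numbers:

    # "Manageable expertise": each piece of expertise, once acquired,
    # halves the distance to acquiring the next piece.
    def cost_to_acquire(n_items, base_distance=1e6):
        distance, total = base_distance, 0.0
        for _ in range(n_items):
            total += distance                  # pay the current distance...
            distance = max(distance / 2, 1.0)  # ...which shrinks the next one
        return total

    # Ten items cost ~2e6 samples in a manageable environment, versus
    # 1e7 if every item had to be learned from scratch at full distance.
    print(cost_to_acquire(10))   # 1,998,046.875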
The complexity that goes into intelligence is the complexity that renders
manageable (a) the problem space and (b) the expertise/content space.
"General intelligence" is system complexity and knowledge that renders
tractable a wide variety of problem spaces and expertise spaces. An AI
begins to mature as a "seed AI" when "adding system complexity that renders
problems and expertise tractable and manageable" first becomes tractable.
Of course that's only the start of the problem. If you see the System
Complexity Space as the problem, and the AI's underlying system at any time
as the points P and Q, then just jumping from P to Q for the first time may
inaugurate the AI's career as a seed, but it hardly finishes it. As with
any fitness landscape there's the problem of not getting stuck in local
optima and so on; there are distinctions between mere optimizations that get
from P to Q faster and true systemic changes that alter which Ps and Qs are
explored. "Recursive self-improvement" is when seed AI becomes
"manageable", in the sense given, as well as being "tractable". There's
also a good way to define "hard takeoff" using these definitions, but I
think I'd better stop here for the day.
But I will say that when you look at AI from this perspective, it does shed
another spotlight on why AI projects tend to fail. The tractability of the
problem is merely the zeroth derivative. The manageability of the problem space
is the first derivative. The manageability of the expertise space is the
second derivative. Coding an AI, or an AI modifying itself, is the third
derivative. Strongly recursive seed AI is the fourth derivative. If all
you see of that is the immediate problem, the zeroth derivative, and you
write some piece of code that solves it directly, then naturally you're
going to do a comical bounce off the problem of Real AI - like backing up,
getting a good wind, and running headlong into a thirty-mile wall of rubber
cement.
-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence