Progress, and One road or many to AI?

From: James Rogers
Date: Wed Sep 10 2003 - 18:33:43 MDT

Hi folks,

I've been meaning to say a little more on this, but I've been very busy working
on my own AGI stuff. In this case, being busy has resulted in some remarkable
theoretical progress (more on that below).


As to whether or not there is one road to AI, I think a lot of you are
unnecessarily trying to segment The Problem into multiple exclusive domains.
Consider the problem of transportation. You have many options: planes, trains,
boats, etc. While one can develop a detailed engineering science about any one
mode of transportation, it tells you very little about engineering the other
modes of transportation because the engineering of any one mode is heavily laden
with assumptions that are both overt and subtle that pervade that particular
specialization. If all you've ever done is study and engineer trains, you'll
have a great deal of difficulty transitioning to engineering airplanes even
though they are both solving the exact same problem: translating matter across
time/space. In this way, there is an artificial perception that the principles
of each of these modes of transportation are more or less exclusive to that mode.

On the other hand, someone who solely studied the physics and fundamentals of
moving matter across time/space can design a mode of transportation that looks
like any of the three nominally exclusive modes of transportation above simply
by constraining the parameters of the physics to a particular phase space. It
is simpler for the physicist to design both an airplane AND an automobile than
for an automobile engineer to design an airplane.

I think those stating that there are many solutions to AI are taking the view of
the automobile engineer. Yes, the automobile is one solution to transportation,
but what will you do when you need to cross an ocean? It is still an identical
problem in the abstract, just with a different set of constraining parameters.
A physicist who assumes no constraining parameters sees exactly one problem to
solve in ALL the cases.

AI is fundamentally a single problem, even if we don't agree on what that single
problem is. A pervasive problem in AGI research is that it is attacked as a
narrow engineering problem. As a result, a lot of the solutions proposed in AI
research have dealt with problems outside their narrowly engineered cases by
cobbling on engineering solutions designed for different phase spaces. So you
end up with "flying cars" and "car-boats", and when those don't work out, they
attempt "flying car-boats", which inevitably ends up being a very screwed-up
kludge by the time it gets even vaguely useful. In this way, they completely
lose sight of precisely what the problem is they are actually trying to solve.
What should happen in these cases is that you step back and abstract the
underlying principles until the engineering solutions theoretically intersect.

What some of you are calling "many roads" to AGI looks more like a distinction
without a difference to me -- it is all solving the same problem -- unless there
is some fundamental disagreement as to what constitutes AGI. The danger is that
a lot of people have a tendency to conflate the solution they have with the
problem they are nominally solving. Better solutions will be effective across a
greater percentage of the phase space for The Problem. Therefore, the better
the solution the higher the probability that it will intersect with other
solutions in a given part of the phase space. Solutions that overlap heavily in
the phase space will also overlap heavily in implementation because the
underlying fundamentals and parameters that determine where they exist in the
phase space reflect the nature of the implementation.


I've been very busy with AGI work over the last few months, largely with
refinements to the algorithm design and implementation. The design testbed is
Python (which works great for this), but Python's memory management is so slow
and inefficient that testing/proving the theoretical correctness of the design
at large scale would have been painfully slow and taxed the memory more than I
would have liked. Python still works well for verifying code
correctness though, and I have libs that make porting to C++ relatively easy
(though still time-consuming). I finished porting the Python to C++ a couple
days ago, which has really allowed me to fully exercise the engine in real
theoretical spaces of interest such that crucial results can now rise above the
noise floor, especially in terms of the effective memory available. The C++
implementation is very fast even on my dev system (533MHz G4), at least three
orders of magnitude faster than Python speed-wise and at least an order of
magnitude more efficient memory-wise.

One of the core design features of my computational model is that the entire
system is an extremely scalable Solomonoff induction engine in its own right.
As is well known, SI implementations have the problem that their memory
requirements grow exponentially with the complexity of the system encoded into
them. However, the effective exponent is attenuated as a function of the
entropy of the information being encoded into them. An SI implementation with a
reasonably high theoretical efficiency (and hence a smaller exponent) can also
theoretically show convergent behavior (i.e. exponent < 1) across a greater
range of possible data sources as the entropy factor attenuates the resource
exponent by a fixed amount.
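
As a rough illustration of the entropy point above (and emphatically not the
engine described in this post), here is a minimal Python sketch. It assumes a
toy "model" that simply stores every distinct fixed-length context it has seen,
and compares how that store grows for a low-entropy source versus a uniformly
random one. The names node_growth and make_source, the four-letter alphabet,
the context length of 6 and the noise levels are all hypothetical choices made
for the example.

```python
import random

def node_growth(symbols, order, checkpoints):
    """Count distinct length-`order` contexts seen at each checkpoint.

    `checkpoints` must be sorted and each value must be >= `order`.
    """
    seen = set()
    counts = []
    marks = set(checkpoints)
    for end in range(order, len(symbols) + 1):
        seen.add(symbols[end - order:end])   # one "node" per distinct context
        if end in marks:
            counts.append(len(seen))
    return counts

def make_source(n, noise, seed):
    """Cycle a -> b -> c -> d -> a, replacing a step with a random symbol
    with probability `noise`. noise=1.0 gives a uniformly random source."""
    rng = random.Random(seed)
    alphabet = "abcd"
    out = [rng.choice(alphabet)]
    for _ in range(n - 1):
        if rng.random() < noise:
            out.append(rng.choice(alphabet))
        else:
            out.append(alphabet[(alphabet.index(out[-1]) + 1) % len(alphabet)])
    return "".join(out)

CHECKPOINTS = [1000, 5000, 20000]
low = node_growth(make_source(20000, 0.1, seed=1), 6, CHECKPOINTS)
high = node_growth(make_source(20000, 1.0, seed=1), 6, CHECKPOINTS)

for n, lo_nodes, hi_nodes in zip(CHECKPOINTS, low, high):
    print(f"{n:>6} symbols  low-entropy nodes={lo_nodes}  random nodes={hi_nodes}")
```

The random source pushes the store toward its ceiling of 4**6 contexts, while
the low-entropy source needs far fewer nodes for the same amount of input. That
is only the qualitative shape of the argument; a context table is nowhere near
a real Solomonoff induction implementation.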

Designing any SI implementation that exhibits a very high theoretical efficiency
is a hard problem; I've been working on it for years. Early work on this led to
designs that, while better than anything existing, still showed sub-optimal
convergence in the cases where they did converge, or failed to converge on
datasets where I know humans do exhibit convergence.

With the minor design tweaks (really just refining it and making it more
elegant) and reimplementation in C++ (which mostly just makes large-scale
testing practical), this aspect of the system is now exhibiting very high
theoretical efficiency such that it exhibits real convergence on essentially
every test corpus that SHOULD show convergence, and at a level that is
definitely comparable to human capability. This includes a pronounced
convergence on things such as the English language text corpus that I use.
(Finding junk to feed it just to see what it would do with it kept me up for
hours -- mildly addictive, that.)

(For those that don't know what "convergence" translates into, it essentially is
the measure of the ability of a system to discover and efficiently encode
complex patterns in an arbitrary system. The resource roll-off is the result of
efficient high-order models automatically being generated as it is exposed to
data, classic Kolmogorov compression. In a sense, it measures the ability of a
given system to grok the essence of another system it is modeling in some finite
amount of space, in a very pure mathematical fashion. In the case of my
software, actual performance is now very close to the theoretical limit in this

I'm also measuring pretty close to theoretical on a number of other key aspects
of the system now. As one might expect, improvements in the theoretical design
elegance are generating metrics that are closer to theoretical limits. The
primary limitation right now is inadequate memory; the convergence on things
that matter to me, like language, wasn't very clear until I got the 10x memory
boost by porting to C++ because the resource roll-off was mostly above the
threshold of the memory available (or more precisely, allocable nodes). There
is a lot of work to be done, but the hard part is mostly done now. Actually
proving highly optimal convergence for complexity spaces that have been
intractable is a major capability milestone.
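
For anyone who wants to see a crude version of this kind of roll-off on their
own machine, here is a hedged sketch using an off-the-shelf compressor as a
stand-in. It assumes zlib (DEFLATE, a fixed-window LZ77 plus Huffman scheme) as
the "model", which is far weaker than anything discussed above and is not the
author's method. The helper names, the tiny vocabulary and the prefix sizes are
hypothetical. It measures compressed bits per input byte as more data is seen,
for patterned text versus random bytes.

```python
import random
import zlib

def bits_per_byte(data, sizes):
    """Compressed bits per input byte for each prefix length in `sizes`."""
    return [8 * len(zlib.compress(data[:n], 9)) / n for n in sizes]

def patterned_text(n, seed):
    """Hypothetical stand-in for a structured corpus: random words drawn
    from a tiny vocabulary, truncated to exactly n bytes."""
    rng = random.Random(seed)
    vocab = ["the", "engine", "model", "encodes", "a", "pattern", "in", "data"]
    parts, length = [], 0
    while length < n:
        word = rng.choice(vocab) + " "
        parts.append(word)
        length += len(word)
    return "".join(parts).encode("ascii")[:n]

def random_bytes(n, seed):
    """Uniformly random bytes: a source with no pattern to discover."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

SIZES = [500, 2000, 20000]
structured = bits_per_byte(patterned_text(20000, seed=7), SIZES)
noise = bits_per_byte(random_bytes(20000, seed=7), SIZES)

for n, s, r in zip(SIZES, structured, noise):
    print(f"{n:>6} bytes  structured={s:.2f}  random={r:.2f} bits/byte")
```

The cost per byte of the patterned text falls as the compressor sees more of
it, while the random bytes stay at roughly 8 bits per byte no matter how much
is fed in. With a small enough input the fixed overhead hides the effect,
which is loosely analogous to the roll-off sitting above the available memory.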

There is a lot of other stuff going on behind the scenes, but I am extremely
pleased with this bit of design verification meeting theoretical targets. I
knew I was close, but this was key.


-James Rogers
