Novamente project progress

From: Ben Goertzel (ben@goertzel.org)
Date: Wed Dec 24 2003 - 08:05:58 MST


Mike Deering, on the AGI list, asked me for a report on the progress of the
Novamente AI project.

I thought this might be of interest to some fraction of the SL4 list as
well, so here it is.

For those who aren't SL4 old-timers: Novamente (www.agiri.org) is an AI
project that's been in progress since mid-2001, with Singularitarian
ambitions, and also with short-term commercial applications in
bioinformatics and other areas.

-- Ben Goertzel

-----------------
Hi Mike,

About Novamente project progress...

The reason I haven't given progress updates to this list lately is that I've
been even more insanely busy than usual, due to a combination of AI work and
(Novamente-related) business work and personal-life developments. So
recreational emailing has fallen by the wayside of late. Now that Christmas
vacation has come, I have a little time to send emails! [Although I'm going on
vacation for 4 days starting tomorrow, so I'll be offline for a little
while....]

Progress on Novamente has actually stepped up considerably as of September,
when a few new people were brought into the project (in connection with a
commercial application of Novamente), some of them working on language
processing and one working on more fundamental AI stuff (representation of
complex knowledge; and evolutionary learning).

However, our recent progress has been of a technical nature; it hasn't yet
yielded big milestones that are exciting to report to the world at large.

Regarding language processing, we have a system that "reads" English text
and outputs semantic relationships contained in the text into Novamente.
This is one way to fill the system's mind up with information, though from
an AGI perspective it must be considered a complement to experiential
interactive learning, rather than as the sole means of providing the system
with knowledge. However, this "relationship extraction" software is far
from complete; there's probably 6 months of work left on the "syntax" side,
which will proceed in parallel with work on the semantic interpretation of
the extracted relationships.

Regarding reasoning, we've made a lot of progress on probabilistic
inference, in the context of experimenting with inference on biological data
(quantitative experimental data and data from relational DB's), and (more
recently) on linguistic knowledge. This has involved a lot of technical
math work on my part, working out various details in the inference system.
The results here are very interesting, although there's a lot more testing
and tweaking required, particularly regarding the control of inference.

We've got the code in place for our own generalization of combinatory logic,
which is the scheme we're using to represent complex knowledge in Novamente
(the knowledge that would be represented using variables and quantifiers in
a traditional logic-based system). And we've generalized the Bayesian
Optimization Algorithm (an extension of genetic algorithms based on
probability theory, invented by Martin Pelikan and David Goldberg) to learn
complex combinatory logic expressions. Experimentation with this is
actively ongoing, and during the first half of 2004 this code will be
integrated with the probabilistic reasoning code. I spent a lot of time
last month working out the nasty software-design details of integrating
inferential processing with combinatory logic; that design will be
implemented in January.
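
To make the "no variables or quantifiers" point concrete, here is a minimal
sketch using the textbook S, K and I combinators, written as curried Python
functions. It is only an illustration of ordinary combinatory logic, not of
Novamente's generalized version (whose details aren't given in this message);
the helper names inc, double and compose are hypothetical.

```python
# Textbook combinators as curried functions (illustration only;
# this is NOT Novamente's generalized combinatory logic).

def I(x):
    # Identity: I x = x
    return x

def K(x):
    # Constant: K x y = x
    return lambda y: x

def S(f):
    # Substitution: S f g x = f x (g x)
    return lambda g: lambda x: f(x)(g(x))

# The identity function needs no bound variable: S K K x = K x (K x) = x
identity = S(K)(K)

# Function composition, also variable-free: B = S (K S) K, so B f g x = f (g x)
compose = S(K(S))(K)

# Hypothetical example functions to compose.
def inc(n):
    return n + 1

def double(n):
    return n * 2

# With a variable one would write: lambda x: inc(double(x))
# With combinators the same function is built without naming x:
inc_after_double = compose(inc)(double)

print(identity(42))
print(inc_after_double(5))
```

The point of the sketch is just that expressions built from a fixed set of
combinators can stand in for expressions with bound variables, which is what
makes them convenient objects for an evolutionary learner to recombine.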

I redesigned our previous "attention allocation" subsystem, which used to
use neural net based ideas, to use a different approach based on
probabilistic inference, thus simplifying the system and (I hope) inducing
more emergence among components. This hasn't been implemented yet though.

Yes, we STILL are not at the phase where we've hooked up Novamente to a
"simulated body" in a simulated world and started teaching it
experientially. We're very much looking forward to that day! As of this
point, I can at least say there are no major components that are
unengineered ... the coding of the generalized-combinatory-logic framework
was the biggest beast AI-wise. However, there's a lot of integration,
testing and tuning work ahead, and some moderate-sized beasts remain uncoded
(like probabilistic attention allocation, and probabilistic logical
unification). Also, there is some nitty-gritty work as yet undone, such as
extending the Novamente core to run on multiple machines as a distributed
system (the design was made to support extension to distributed processing,
but the work hasn't been done yet). And, we've ordered our first 64-bit
machine, and in early 2004 will undertake the task of porting the core to
64-bit Linux on 64-bit hardware.... Depending on how much attention we can
give to AGI as opposed to commercial Novamente applications, we could get to
the "teaching the baby" phase in mid or late 2004, or it might not be till
2005.

Next, some comments on funding.

Our strategy of funding AGI work via commercial applications of the
in-progress AI system is working, in the sense that we're making progress
implementing and testing the AI system, while getting paid for it at a
reasonable level. Some of the commercial applications are also very
interesting in themselves; for instance, in our bioinformatics work we've
made some real progress in creating original diagnostic tools for a couple
of diseases (publications on this will come out in early 2004; in one case we
found a diagnostic using Novamente, and in another case using a simpler,
more specialized software system whose construction was inspired by parts of
Novamente).

On the other hand, there's no doubt that if we had some funding purely
oriented toward AGI research, we'd be progressing significantly faster. It
wouldn't take too much either. Funding on the order of $10,000/month would
enable us to hire 3 programmer/scientists exclusively oriented toward AGI
rather than commercial applications, which would drastically accelerate our
progress, as opposed to the current situation where nearly everyone's time
is either devoted to commercial apps, or fragmented between commercial apps
and AGI.

However, I'm not complaining (though I complained a lot yesterday after an
entire day spent conferring with intellectual property lawyers! ;p) -- I'm
happy that it's proved possible to continue the AGI work in an economically
sustainable way.

Regarding the long-delayed "Novamente book", it got too big and messy and so
it was trifurcated into three books:

* Probabilistic Term Logic (an in-depth treatment of this one component of
Novamente, which is the most mathematically involved component)
* Novamente: Design for an Artificial General Intelligence
* Mind Patterns (a semi-technical review of the conceptual foundations of
Novamente)

I haven't had much time for writing lately, but the PTL manuscript is
largely done, awaiting only some proofreading and a couple final chapters
covering practical applications, to be added in early '04. Now that
Christmas vacation is here and the business world is in a lull, I've been
digging into the main "Novamente Design" manuscript again, and am happily
updating it in accordance with recent simplifications and improvements in
the design. It seems likely that, even with everything else on my plate,
I'll manage to get these manuscripts submitted for publication in 2004,
hopefully by early fall. Once these books are published, we'll probably
make a big push to get some pure-AGI funding to accelerate progress. Unless
we've made soooo much progress by then (due to my delays in finishing the
books!) that more funding isn't needed ;-)

And that's the capsule summary regarding Novamente. Could be better ... but
could also be a lot worse. I've never been more confident of the capability
of the Novamente design to yield AGI, but I've also become more acutely
aware of the time and effort taken to tune and tweak and refine the various
parts of the design to really work as intended. The commercial work we're
doing can be frustrating in that it distracts from pure AGI work, but it has
its rewards in itself (intellectual and human rewards as well as paying the
bills), and it also provides excellent contexts in which to test, tune and
refine various system components as they develop.

-- Ben G


