From: Ben Goertzel (ben@goertzel.org)
Date: Wed Nov 10 2004 - 09:01:35 MST
Hi Christian,
> I don't pretend to be an expert in this area. I do *not*
> have my own
> answer to AI, the Universe and Everything.
> However, Hutter's and Schmidhuber's works are good examples of such
> approaches. I do not want to suggest that the "Goedel machine" is
> practical (I expressed my critiques on this list), still I think that the
> direction is right.
OK, that's reasonable. I disagree but, well, time will tell...
I think their work is beautiful theoretically but isn't going to lead to
anything practical, ever... I think that huge-resources AI and
pragmatic-resources AI are very significantly different problems...
> There are other randomized approaches: for example the
> famous, deep and surprising PCP-Theorem states that
> every mathematical proof can be checked in polynomial
> time with arbitrarily high probability by looking at only a
> *constant number* of bits of a sufficiently transformed form of
> the proof (the transformation can be performed in polynomial time).
> That is, for every epsilon, there is an N such that you have to look
> at only N randomly chosen bits of the proof to be able to say
> with 1-epsilon probability that it is correct.
> Keep in mind that the proof can be arbitrarily long!
This is interesting indeed, but I don't know of any work that makes this
practically useful, do you? If so I'd love to hear about it.
Do you think this will ever be usable as a practical basis for AI work?
It's all a question of how big the constant number of bits N is, and how big
the polynomial is, right?
Suppose epsilon = .01: do you know what N is (for some standard model
of computation), and do you have a bound on the polynomial?
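Just so we're picturing the same thing, here's a toy OCaml sketch of the
randomized spot-checking control flow; the encoding and the local test are
hypothetical stand-ins, not the actual PCP construction:

    (* Toy randomized spot-checker -- PCP-flavored control flow only.
       [proof_length] is the length of the (already transformed) proof and
       [check_local i] is a hypothetical local consistency test that reads
       only a few bits around position i. *)
    let spot_check ~num_queries ~proof_length ~check_local =
      Random.self_init ();
      let ok = ref true in
      for _q = 1 to num_queries do
        let i = Random.int proof_length in
        if not (check_local i) then ok := false
      done;
      !ok
      (* a valid transformed proof always passes; an invalid claim should be
         caught with probability >= 1 - epsilon for a suitable num_queries *)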
> My point is that approximative and stochastic methods are not
> incompatible with theoretically justified mathematics and this applies
> to formal deduction too, since theorem-proving can also be formulated
> as a discrete optimization problem.
In principle, this is true, but so far as I know the theoretically justified
math is not of any use for making practical algorithms for general AI right
now.
I'd be happy to be proved wrong, for sure.
For instance, we make a lot of use of a variant of Martin Pelikan's
"Bayesian Optimization Algorithm" in Novamente. We apply a BOA variant to
combinatory logic expressions using a special set of supercombinators. It
seems to work well, and we have reason to hope this method can be a key part
of a general AI.
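To be concrete about what I mean by this family of methods: here's a
bare-bones estimation-of-distribution loop over bit strings in OCaml. It
keeps only independent per-bit marginals (UMDA-style), whereas BOA proper
learns a Bayesian network over the variables, and our actual variant
operates on combinator expressions rather than bit strings -- so take it
purely as an illustration of the "sample, select, re-estimate, repeat"
pattern, not as Novamente code:

    (* Very simplified estimation-of-distribution loop over bit strings. *)
    let eda ~len ~pop_size ~generations ~fitness =
      Random.self_init ();
      let probs = Array.make len 0.5 in
      let sample () = Array.init len (fun i -> Random.float 1.0 < probs.(i)) in
      for _gen = 1 to generations do
        (* sample a population and keep the better half *)
        let pop = Array.init pop_size (fun _ -> sample ()) in
        Array.sort (fun a b -> compare (fitness b) (fitness a)) pop;
        let elite = Array.sub pop 0 (pop_size / 2) in
        (* re-estimate the per-bit marginals from the elite sample *)
        for i = 0 to len - 1 do
          let ones =
            Array.fold_left (fun n x -> if x.(i) then n + 1 else n) 0 elite in
          probs.(i) <- float_of_int ones /. float_of_int (Array.length elite)
        done
      done;
      probs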
But theoretical math doesn't seem useful yet for answering the really
essential questions we have, such as: "For problems of some particular
character (e.g. recognizing objects among a set of pictures, learning
procedures to carry out certain actions in certain environments, etc.), how
long will BOA take to converge to an answer that's within epsilon of
optimal, based on a population size of N candidate instances?" We've used
some nice though fairly shallow math to do stuff like prove that simplified
combinator-expression formats have general power, but contemporary math
seems impotent to even start addressing the questions that really interest
us.
Contemporary math has failed even to probe the interesting questions about
the behavior of the simple genetic algorithm. David Goldberg has done some
nice stuff that's restricted to particular fitness functions, and Michael
Vose gave a beautiful treatment of the GA as population size tends to
infinity (much improving on a 1993 paper of mine on the topic), but this
stuff doesn't really touch the important questions of how big a pop size you
need and how many generations it takes to get convergence to a near-optimal
solution on practical problems.
> I also think that the gap between timeless and stateful problems is
> not as huge as you suggest. Control theory is also a very well
> established area of mathematical optimization and it deals exactly
> with this type of problem.
Yeah, but control theory totally falls apart when you start dealing with
situations of significant complexity. For instance, control theory helps
for fixed-arm robots but it's almost useless for mobile robotics, where
everyone consequently relies on heuristic methods.
> From a formalist point of view, the functional programming
> community developed the notion of monads, which allows one
> to embed stateful computations into completely
> statelessly defined (purely functional) programs. This simplifies
> reasoning about the correctness of programs with side-effects
> and allows one to look at such computations in a stateless way.
> (In fact, the state is not completely eliminated, but merely made
> explicit.)
>
> A nice thing about functional programming is that it narrows the
> gap between programs and logic (formal reasoning): functional
> programs can be viewed as mathematical formulas and can be
> manipulated, reasoned about and executed at the same time.
> The drawback, of course, is that resource usage is hard
> to reason about. As the saying goes: "A LISP programmer knows
> the value of everything and the cost of nothing." I think that
> this is not a necessary limitation and one could extend the type
> system of functional programs to incorporate information about the
> resource usage of functions. This is of course a hard topic, since
> it would need more sophisticated formal reasoning tools to make it
> feasible.
Yeah, we've actually done something like this with Novamente and it's
cool -- our representation of complex procedures and patterns is based on a
beefed-up version of combinatory logic (which is the math underlying some
functional programming compilers).
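To anchor the monad point for the list, here's a minimal state-monad sketch
in OCaml -- nothing more than the textbook construction, and not Novamente
or BLess code:

    (* A state monad: a "stateful" computation is just a pure function
       from an input state to a result paired with an output state. *)
    type ('a, 's) state = 's -> 'a * 's

    let return x : ('a, 's) state = fun s -> (x, s)

    let ( >>= ) (m : ('a, 's) state) (f : 'a -> ('b, 's) state) =
      fun s -> let (x, s') = m s in f x s'

    let get : ('s, 's) state = fun s -> (s, s)
    let put s' : (unit, 's) state = fun _ -> ((), s')

    (* a "stateful" counter, written entirely without mutation *)
    let tick = get >>= fun n -> put (n + 1) >>= fun () -> return n

    let run m init = m init

Running tick on an initial state 0 yields (0, 1): the "side effect" is just
a value threaded through explicitly, which is exactly what makes such code
amenable to formal reasoning.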
> I would mostly agree, but I would stress that sufficient interaction
> between formal deduction and optimization could be of great relevance.
> I believe that this interaction must be mutual: the deduction subsystem
> should make use of the optimization methods and reason about their
> efficiency at the same time. It should be able to access the practical
> performance results of the optimization and also rewrite itself.
> A formal reasoning system should be able to perform experiments
> and formulate hypotheses. The use of functional programming could
> be practically useful in this regard, since it narrows the gap between
> formulas and programs.
Yes! Our intuitions agree here, and this is very close to what we're doing
in Novamente...
We do deduction using a probabilistic inference framework called
Probabilistic Term Logic (PTL), and we do optimization using the Bayesian
Optimization Algorithm. So the two can be glued together using the common
language of probability theory.
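A purely hypothetical illustration of what I mean by a common language
(none of these names or formulas come from the actual PTL or BOA code; the
chaining rule below is a crude stand-in for a real probabilistic deduction
rule):

    (* deduction step: naive chaining of P(A->B) and P(B->C) into an
       estimate of P(A->C) -- a crude stand-in for a real deduction rule *)
    let deduce p_ab p_bc = p_ab *. p_bc

    (* the optimizer can then fold the probability that inference assigns
       to a candidate into the score it tries to maximize *)
    let score ~fitness ~inferred_prob candidate =
      fitness candidate *. inferred_prob candidate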
This week we are actually merging the PTL branch of Novamente's codebase
with the BOA branch for the first time ;-)
> To test these ideas, I have designed an extremely simple
> yet powerful fully reflective functional programming language
> (I called it *BLess*, which stands for "Be less!") It is currently
> being implemented in OCaml. (A first, very rudimentary preview
> version without type-checking is available.)
Interesting -- more parallels with our own work.
As syntactic sugar, we are working with our own language called Sasha,
designed by Luke Kaiser. Sasha is a derivative of the language Nemerle that
Luke and his friends wrote in 2002-2003, and Nemerle is itself a derivative
of OCaml. So, in short, we have an OCaml-like language that compiles into
combinator trees, which in turn can be loaded into Novamente to form
internal "Novamente nodes and links"...
> The current proof-of-concept (not yet ready) implementation is
> a very inefficient interpreter; I have plans to compile BLess
> programs to OCaml code, which can be compiled to efficient
> native code.
Very cool! Is there more online information about BLess somewhere?
If you can compile BLess to OCaml, then you could probably compile it to
Sasha as well, and BLess programs could then be loaded into Novamente as
combinator expressions ;-)
-- Ben