From: Christian Szegedy (szegedy@or.uni-bonn.de)
Date: Wed Nov 10 2004 - 05:07:51 MST
Ben Goertzel wrote:
>So I'm wondering if you've thought through any of the details of how to make
>an AI based on mathematical deduction of the form used in contemporary
>theorem-provers, and mathematical optimization of the form used in
>contemporary computer algebra systems.
>
I don't pretend to be an expert in this area. I do *not* have my own
answer to AI, the Universe and Everything.
However, Hutter's and Schmidhuber's works are good examples of such
approaches. I do not want to suggest that the "Goedel machine" is
practical (I have expressed my criticisms on this list), but I still
think that the direction is right.
Of course, you are right if you restrict yourself to "contemporary"
systems, but I think that several shortcomings of those systems are
rooted in the separation of deduction and optimization.
You seem to differentiate between traditional timeless exact
deduction and more stochastic, indeterministic, human-like reasoning.
Of course this distinction is justified to some extent, but I believe
that the latter can be formulated very well in the framework of
mathematical optimization.
For example, if you look at the classical (simple) theory of neural
networks from the optimization point of view, it turns out that
algorithms based on them mostly perform a special primal-dual
subgradient method based on Lagrangian relaxation. This does not mean
that all such subgradient methods can be described by neural networks.
For practical purposes, non-neural subgradient methods are the basis
of several successful domain-specific optimization algorithms on
current hardware. Of course, only very few people in the optimization
community associate such methods with neural networks or AI.
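As a rough illustration of what such a subgradient step looks like,
here is a minimal sketch in plain OCaml (the subgradient oracle and
the projection are hypothetical parameters supplied by the caller;
this is only a sketch, not any particular network or library):

  (* Minimal projected subgradient method.  [subgrad] returns one
     subgradient of the convex objective at a point, [proj] projects
     back onto the feasible set.  In the Lagrangian-relaxation picture,
     the same loop, run on the multipliers, maximizes the dual. *)
  let subgradient_method ~subgrad ~proj ~step ~iters x0 =
    let rec loop k x =
      if k >= iters then x
      else begin
        let g = subgrad x in
        let eta = step /. sqrt (float_of_int (k + 1)) in  (* diminishing step *)
        let x' = proj (Array.mapi (fun i xi -> xi -. eta *. g.(i)) x) in
        loop (k + 1) x'
      end
    in
    loop 0 x0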
The area of stochastic methods is a very broad and intensively
studied topic in optimization, and there are a lot of theoretically
justified combinatorial algorithms based on analytical (continuous)
optimization and randomized rounding. This may sound unspectacular,
but in fact there are extremely tricky ways to apply such methods
without recognizing that one has, in fact, performed "randomized
rounding".
There are other randomized approaches: for example, the famous, deep
and surprising PCP theorem states that every mathematical proof can
be checked in polynomial time with arbitrarily high probability by
looking at only a *constant number* of bits of a suitably transformed
form of the proof (the transformation can be performed in polynomial
time). That is, for every epsilon, there is an N such that you have
to look at only N randomly chosen bits of the transformed proof to
accept a correct proof and to detect an incorrect one with
probability at least 1-epsilon.
Keep in mind that the proof can be arbitrarily long!
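(In the usual complexity-theoretic notation: NP = PCP(O(log n), O(1));
the constant soundness error of the constant-query check can be pushed
below any fixed epsilon by repeating it O(log(1/epsilon)) times.)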
There is also a very well-established theory of approximation
algorithms. Such algorithms do not necessarily produce optimal
solutions, only a good approximation of them. There are a lot of
"hard" optimization problems for which very good polynomial-time
approximation algorithms exist (like the metric version of the famous
Travelling Salesman Problem). But there are also very simple-sounding
problems (e.g. finding a maximum clique in a graph) for which no
reasonable polynomial-time approximation algorithm exists (assuming
P!=NP). Interestingly, the proofs of such non-approximability results
make use of the above-mentioned PCP theorem.
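(For concreteness: for the metric TSP, Christofides' algorithm gives a
3/2-approximation in polynomial time, while Hastad showed that maximum
clique cannot be approximated within a factor of n^(1-epsilon) for any
epsilon > 0 unless NP has efficient randomized algorithms; the proof of
the latter builds on the PCP theorem.)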
My point is that approximate and stochastic methods are not
incompatible with theoretically justified mathematics, and this
applies to formal deduction too, since theorem-proving can also be
formulated as a discrete optimization problem.
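(For instance, fixing a proof calculus, one can ask for a shortest
sequence s_1, ..., s_k of formulas such that each s_i is an axiom or
follows from earlier s_j's by an inference rule and s_k is the goal;
finding such a sequence is a discrete optimization problem over proof
candidates, and for propositional logic it specializes to SAT-like
search.)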
I also think that the gap between timeless and stateful problems is
not as huge as you suggest. Control theory is also a very well
established area of mathematical optimization, and it deals exactly
with this type of problem.
From a formalist point of view, the functional programming community
developed the notion of monads, which allows stateful computations to
be embedded into completely statelessly defined (purely functional)
programs. This simplifies reasoning about the correctness of programs
with side effects and allows such computations to be viewed in a
stateless way. (In fact, the state is not completely eliminated, but
merely made explicit.)
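For concreteness, here is a minimal state-monad sketch in plain OCaml
(my own toy names, not those of any particular library): a "stateful"
computation is just a pure function from an initial state to a pair of
result and final state.

  type ('s, 'a) state = 's -> 'a * 's

  let return (x : 'a) : ('s, 'a) state = fun s -> (x, s)

  let bind (m : ('s, 'a) state) (f : 'a -> ('s, 'b) state) : ('s, 'b) state =
    fun s -> let (x, s') = m s in f x s'

  let get : ('s, 's) state = fun s -> (s, s)
  let put (s : 's) : ('s, unit) state = fun _ -> ((), s)

  (* A "mutable" counter, written purely functionally. *)
  let tick : (int, int) state =
    bind get (fun n -> bind (put (n + 1)) (fun () -> return n))

  let () =
    let (value, final) = tick 41 in
    Printf.printf "value = %d, final state = %d\n" value final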
A nice thing about functional programming is that it narrows the gap
between programs and logic (formal reasoning): functional programs
can be viewed as mathematical formulas and can be manipulated,
reasoned about and executed at the same time.
The drawback, of course, is that resource usage is hard to reason
about. As the saying goes: "A LISP programmer knows the value of
everything and the cost of nothing." I think this need not be the
case: one could extend the type system of functional programs to
incorporate information about the resource usage of functions. This
is of course a hard topic, since it would need more sophisticated
formal reasoning tools to make it feasible.
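As a toy illustration of what I mean (plain OCaml, and of course not
a real resource-aware type system): one can at least make the cost an
explicit part of a function's result, so that both the programmer and
a reasoning tool can see it.

  (* Pair every result with an explicit step count. *)
  type 'a costed = { value : 'a; steps : int }

  let ret v = { value = v; steps = 0 }
  let step c = { c with steps = c.steps + 1 }

  (* Costed list length: one counted step per element. *)
  let rec length = function
    | [] -> ret 0
    | _ :: tl ->
        let r = length tl in
        step { value = r.value + 1; steps = r.steps }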
>Regarding theorem-provers, for instance, the truth is that contemporary
>theorem-provers use very oversimplistic inference control strategies. As an
>example, the Otter theorem-prover, which is an excellent system compared to
>most of the competitors, basically does only proofs by contradiction, and
>deals with existential quantifiers solely by Skolemizing them out (which
>only works in the context of proofs by contradiction and other special
>cases). This kind of control strategy appears to bear no relevance to how
>any adaptive, autonomous cognitive system would reason (presumably such
>systems would reason by combining crisp deductive reasoning with
>uncertain-inference-driven inductive and abductive reasoning and speculative
>concept-creation).
>
I would mostly agree, but I would stress that sufficient interaction
between formal deduction and optimization could be of great
relevance. I believe that this interaction must be mutual: the
deduction subsystem should make use of the optimization methods and
argue about their efficiency at the same time. It should be able to
access the practical performance results of the optimization and also
rewrite itself.
A formal reasoning system should be able to perform experiments and
formulate hypotheses. The use of functional programming could be
practically useful in this regard, since it narrows the gap between
formulas and programs.
=========================================
To test these ideas, I have designed an extremely simple yet
powerful, fully reflective functional programming language (I call it
*BLess*, which stands for "Be less!"). It is currently being
implemented in OCaml. (A first, very rudimentary preview version
without type-checking is available.)
It turned out that the power of a programming language can be
extended by generalization. In fact, the power of BLess comes
from the unification of seemingly different concepts. The result
is a framework which is powerful and simple at the same time.
BLess is similar to LISP, but there are a lot of very essential
differences:
Most programming languages are designed by a gradual evolutionary
process. First the applicative system is designed, then a type system
is defined to allow obvious run-time errors in the applicative code
to be detected in advance. Finally, the module system completes the
design of the language.
Although this procedure may seem mandatory to most language
designers, BLess is designed by a completely different approach: it
started with the design of the module system... and it ended with it.
In BLess, the data, the code and the logic is one.
The same internal representation is used to define pure data,
code that operates on data and logic to argue about the
correctness of the code.
In BLess, value, type and module is one.
There are no syntactic or representational differences between
values, types and modules. Although the optimization of BLess
programs requires a separation between these notions, the definition
of the application semantics and of type correctness does not
differentiate between them.
In BLess, function (or functor) application,
interface restriction and type validation is one.
There is no syntactic or representational difference between
the above concepts. In BLess, they are all different
facets of the general concept of *imposition*.
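As a loose analogy only (written in plain OCaml, emphatically not
BLess syntax, and not how BLess works internally): one can represent
values, functions and "types" (seen as constraints) with a single
datatype, and let one impose operation play the role of both
application and checking.

  type term =
    | Int of int
    | Fun of (term -> term)
    | Pred of (term -> bool)          (* a "type" viewed as a constraint *)

  exception Imposition_failure

  let impose (f : term) (x : term) : term =
    match f with
    | Fun g -> g x                                           (* application *)
    | Pred p -> if p x then x else raise Imposition_failure  (* checking *)
    | Int _ -> raise Imposition_failure

  let int_ty = Pred (function Int _ -> true | _ -> false)
  let succ = Fun (function Int n -> Int (n + 1) | t -> impose int_ty t)
  let three = impose succ (impose int_ty (Int 2))   (* both facets in one call *)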
The most important distinguishing feature of BLess is that
it is designed to generate, process and check BLess programs.
This does not only involve syntactic manipulations, but
also access to the type inference and checking mechanisms.
The standard library coming with BLess will contain a fully
featured interface to the internal representation of BLess
code. This library allows for online type inference and
type checking of newly inserted BLess code. It can also
be used to examine and type-check externally supplied
BLess code.
BLess will support deep generic programming:
Already checked applicative BLess code may operate on the full type
and module system before the whole code is completely type-checked.
BLess is a purely functional language, so it does not allow mutable
data. However, the monadic system (built on top of the core language,
as in Haskell) allows for a simple and concise simulation of such
stateful features while maintaining referential transparency.
The current proof-of-concept (not yet finished) implementation is a
very inefficient interpreter; I plan to compile BLess programs to
OCaml code, which can in turn be compiled to efficient native code.
=========================================
Best Regards, Christian