Re: [agi] A difficulty with AI reflectivity

From: Eliezer Yudkowsky
Date: Thu Oct 21 2004 - 02:10:57 MDT

Ben Goertzel wrote:
>> Leaving aside other qualms I have about the Gödel Machine, it appears
>> to me that for the Gödel Machine to execute a change that switches to
>> a new theorem-prover or proof-verifier, the Gödel Machine must prove
>> that if the proof-verifier accepts P, then P.
> Hmmm... I haven't read that paper for a while, but my intuitive response
> is that you're asking the Godel machine to do more than it needs to do.
> Doesn't it just have to prove that:
> "If the proof verifier accepts P, then either P is true or the old
> proof-verifier would have accepted P"
> or something like that?

Wei Dai wrote:
> I think you probably misunderstood the paper. The purpose of searching
> for new theorem provers is to speed up the theorem proving, not to be
> able to prove new theorems that were unprovable using the old theorem
> prover. So the Gödel Machine only needs to prove that the new prover
> will prove a theorem if and only if the current prover will. I don't
> think this should be a problem.

It looks to me like this would require a special amendment to Schmidhuber's
architectural specification - Schmidhuber spoke of proving only that the
code modification had greater expected utility. Even if you proved the new
system would do exactly what the old system did, only faster, it would not
follow that the new system was an *improvement* - it might be generating
dreadfully wrong answers even faster - unless you included some kind of
assumption about the system working. For example, suppose that we start
with a Gödel Machine that, meta-level aside, starts with an object-level
program that repeatedly presses a button delivering the equivalent of a
nasty electrical shock. If the meta-level speeds up the object level so
that we get nasty electrical shocks twice as fast, this is not an expected
utility improvement. If you build the assumption into the meta-system that
the object-system works, this is good knowledge if the object-system *does*
work, but not otherwise. Correspondingly, if the meta-system does work,
but the meta-system cannot prove that the meta-system works, you will need
to include an assumption in the meta-system that the meta-system works.
Even presuming the basic strategy is correct, the next question is how to
do this without trashing the meta-system by running afoul of Gödel and Löb.
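To make the obstacle in that last step precise, here is Löb's theorem (a
standard result, stated in my gloss rather than taken from Schmidhuber's
paper): for any sufficiently strong formal system T with provability
predicate Box_T,

```latex
% Löb's theorem: if T proves "provability of P implies P", then T proves P.
T \vdash \Box_T P \rightarrow P \quad\Longrightarrow\quad T \vdash P
```

So a consistent T can prove the soundness schema "Box_T P -> P" only for
sentences P it already proves outright; in particular it cannot prove that
schema for its own verifier across all P, which is exactly what trusting a
rewritten verifier seems to demand.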

One way that might succeed in salvaging the Gödel-Machine formalism would
be to isolate the rewriting of the theorem verifier from the rewriting of
the rest of the Gödel Machine. When rewriting the remainder of the Gödel
Machine, we try to prove that the remainder rewrite improves expected
utility, assuming that the verifier admits correct object-level theorems
(I'm not sure you can get away with this). When rewriting the verifier,
the verifier has to pass a different test: proving that the new verifier
accepts only theorems that are admissible under the old verifier. I.e.,
any proof acceptable according to the new verifier is an acceptable proof
according to the old verifier. We would not concern ourselves with whether
the proofs are correct - no object-level questions of expected utility -
just with proving an embedding of the new formal system in the old formal
system. But this requires a compartmentalization of the Gödel Machine; it
can no longer rewrite every part of itself using the same criterion of
success, and there must be several hardcoded, non-modifiable aspects of the
architecture (such as the interface between the proof verifier and the rest
of the system). This would weaken Schmidhuber's original claim that the
Gödel Machine was fully self-modifying. It would also require some kind of
diagonalization in the old formal system so that it can possess a full copy
of itself, and prove embedding of the new formal system in "the old formal
system plus a copy of itself as an embedding criterion". You'd have to
prove an embedding of a new diagonalized formal system in the old
diagonalized formal system. This is all new math that doesn't appear in
Schmidhuber's original paper, nor anywhere else I have yet encountered in
reading about reflective formal systems. And since I have not formalized
this entire argument, the whole thing could contain some catastrophic flaw
that isn't apparent in verbal discussion.
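As a toy illustration (my sketch, not Schmidhuber's; the function names and
the finite "proof space" are assumptions of mine, and a real Gödel Machine
would demand a formal proof of inclusion over all possible proofs, not a
finite spot-check), the hardcoded verifier-rewrite criterion might look
like:

```python
# Toy sketch of the compartmentalized verifier-rewrite criterion.
# Assumption: verifiers are modeled as boolean predicates over candidate
# proofs, and we can only spot-check a finite proof space; the actual
# criterion would be a formal embedding proof, not an empirical test.

def verifier_rewrite_ok(new_verifier, old_verifier, proof_space):
    """Accept the rewrite only if every proof the new verifier admits
    is already admissible under the old verifier (the embedding check).
    No object-level question of expected utility is asked here."""
    return all(old_verifier(p) for p in proof_space if new_verifier(p))

# Illustration: integers stand in for proofs. The "old" verifier accepts
# even numbers; a "new" verifier accepting only multiples of four admits
# a strict subset (rewrite passes), while one accepting multiples of
# three admits proofs the old verifier rejects (rewrite fails).
old = lambda p: p % 2 == 0
new_stricter = lambda p: p % 4 == 0
new_unsound = lambda p: p % 3 == 0

space = range(100)
print(verifier_rewrite_ok(new_stricter, old, space))
print(verifier_rewrite_ok(new_unsound, old, space))
```

Note that the criterion is one-directional on purpose: the new verifier may
accept fewer proofs than the old one, but never more.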

Hence my request to see if there was anything on this in the existing
literature.

I would not be satisfied with the solution as given above for a true seed
AI. It seems to me that the difficulty with the Gödel Machine reflects a
very deep AI challenge, and I gave the example of the Gödel Machine only to
clarify the nature of the challenge, not to ask how to patch the Gödel
Machine specifically. Hardcoding that all new proof-verifiers must embed
their admissible sentences in the set of admissible sentences of the old
proof-verifier is an interesting solution, but not as interesting as, say,
an AI that could ponder the relative merit of intuitionistic versus
classical mathematics, which is where I think the deep problem is heading.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:49 MDT