Re: [agi] A difficulty with AI reflectivity

From: Jeff Medina (
Date: Thu Oct 21 2004 - 06:57:47 MDT

"For example, suppose that we start with a Godel Machine that,
meta-level aside, starts with an object-level program that repeatedly
presses a button delivering the equivalent of a nasty electrical
shock. If the meta-level speeds up the object level so that we get
nasty electrical shocks twice as fast, this is not an expected utility

In your analogy, "pushing a button that produces nasty electrical
shocks" is meant to correspond to "proving theorems", and is bad
(results in a net utility loss), and is meant to be plausible because
the theorem-proving our theorem-proving machine is doing may be
fundamentally unsound (in that it may be proving some theorems true
that are actually false.) Correct?

If so, then your problem arises without introducing the possibility of
our rewriting its own theorem-prover. Specifically, if one can solve
the current problem (roughly, "How can we prove that a new
theorem-prover is an improvement, when that theorem-prover might be
wrongly proving falsehoods to be truths (pushing the electric shock
button)?") then one can also solve the problem "How can we prove that
our original theorem-prover's proofs correspond to (and only to) true

We can't expect to prove that a theorem-prover's proofs correspond to
(and only to) true propositions, on pain of infinite regress (because
we can always question the fallibility of the meta-theorem-prover we
decided to use to prove the truth-correspondence of the object-level

So the answer is, as you mentioned in passing, that we must assume, at
some level, correctness of our proof theory, even though the level at
which we make said assumption might be the proof-theoretic equivalent
of jabbing red zappy buttons in an unfortunate, catastrophe-birthing
manner. Our theorem-proving machine can only ever hope to rewrite its
theorem-prover with a performance-level improvement, not a
truth-correspondence-level improvement (other than accidentally, that

You won't ever find anything in the literature on how to prove that a
system of proof is infallibily truth-producing outside of the context
of said system of proof (or even how to prove that system-of-proof A
is more consistently truth-producing than system-of-proof B; not
outside of the context of system-of-proof A, B, or X) because even if
someone were to come up with something that looked like a proof that
"wraps around" as you put it, they (or the journal editors/referees)
would reject it as unsound, due to a reductio ad absurdum. Namely,
because it would contradict that which the rest of proof theory claims
is true; that wrap around is impossible -- truth is forever uncertain
at its base. We cannot avoid inserting potentially unsound

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:49 MDT