From: Chris Capel
Date: Wed May 10 2006 - 15:56:44 MDT

Thinking about self-modification and Gödel.

If some proposed modification involves changing how the modified AI
will evaluate the desirability, D_m(S), of some specific future state
S, relative to the current AI, which assigns it D_c(S), then how can the
current AI tell whether the change is actually good? If D_m(S) is not
equal to D_c(S), then by the standards of the current utility function
U_c, the proposed modification *must* be suboptimal, right?

Perhaps it's the case that all self-modifications ought to be those
which preserve the values of utility calculations on known inputs, but
simply make them more efficient to calculate? But would that class of
self-improvement encompass the kind SIAI has in mind for a Friendly
AI, or would that put some undesirable limit on the shape and
development of the AI?
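
A minimal sketch of what "preserves the values on known inputs" could
mean in practice, assuming desirability is a plain function from a
state to a number. The names preserves_known_values, d_c, d_m, d_bad
and the toy evaluators are hypothetical, chosen only for illustration.
Note that agreement on the known states says nothing about states
outside that set.

```python
from typing import Callable, Iterable

Desirability = Callable[[int], float]


def preserves_known_values(
    d_c: Desirability,
    d_m: Desirability,
    known_states: Iterable[int],
    tol: float = 0.0,
) -> bool:
    """Return True if d_m agrees with d_c on every known state, within tol."""
    return all(abs(d_c(s) - d_m(s)) <= tol for s in known_states)


# Toy "current" evaluator: slow, sums 0..n one term at a time.
def d_c(n: int) -> float:
    return float(sum(range(n + 1)))


# Toy "modified" evaluator: same values, computed in closed form.
def d_m(n: int) -> float:
    return n * (n + 1) / 2


# Toy modification that silently changes the value of one state.
def d_bad(n: int) -> float:
    return n * (n + 1) / 2 + (1.0 if n == 7 else 0.0)


known = range(20)
print(preserves_known_values(d_c, d_m, known))
print(preserves_known_values(d_c, d_bad, known))
```

Here d_m is a pure efficiency improvement over d_c on the known
states, while d_bad is rejected because it disagrees on state 7.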

Perhaps Gödelian considerations enter when one considers
self-modifications where D_c(S) is undefined, or incalculable, and
D_m(S) is calculable.

I'm aware that pretty much any self-modification is going to change
D(x) for some future state x, if trivially, because the AI simply
cannot map utility to state space exhaustively. Due to computational
constraints, it will have to decide that, within an acceptable
probability, |D_c(x) - D_m(x)| < acceptable_risk, where
acceptable_risk might vary over x, etc. etc. I think these
complications can be put aside for a while.
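
For concreteness, the probabilistic check above could be sketched as
a sampling estimate. Everything here is an assumption made for
illustration: the function names, the idea of drawing states from
some sampler, and the min_probability threshold. A sampled frequency
is only an estimate and not a guarantee about unsampled states.

```python
import random
from typing import Callable

State = float


def estimate_agreement(
    d_c: Callable[[State], float],
    d_m: Callable[[State], float],
    sample_state: Callable[[random.Random], State],
    acceptable_risk: Callable[[State], float],
    n_samples: int = 1000,
    seed: int = 0,
) -> float:
    """Estimate the fraction of sampled states x with
    |D_c(x) - D_m(x)| < acceptable_risk(x)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = sample_state(rng)
        if abs(d_c(x) - d_m(x)) < acceptable_risk(x):
            hits += 1
    return hits / n_samples


def accept_modification(agreement: float, min_probability: float = 0.99) -> bool:
    """Accept only if the estimated agreement meets the threshold."""
    return agreement >= min_probability


# Toy evaluators over states drawn uniformly from [0, 1].
def d_c(x: State) -> float:
    return x * x


def d_close(x: State) -> float:
    return x * x + 0.001 * x  # differs from d_c by at most 0.001


def d_far(x: State) -> float:
    return x * x + 0.5  # differs from d_c by about 0.5 everywhere


def sampler(rng: random.Random) -> State:
    return rng.random()


def risk(x: State) -> float:
    return 0.01  # constant here, but it may vary with x


print(accept_modification(estimate_agreement(d_c, d_close, sampler, risk)))
print(accept_modification(estimate_agreement(d_c, d_far, sampler, risk)))
```

Letting acceptable_risk be a function of x reflects the remark that
the tolerance might vary over states.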

Chris Capel

"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)
