Re: Changing the value system of FAI

From: Jef Allbright
Date: Sun May 07 2006 - 11:05:27 MDT

On 5/7/06, Mikko Särelä wrote:

> The first important thing to understand is that a moral system is
> necessarily composed of both values/goals and the means used to pursue them.

Yes, this duality -- subjective values, objective means -- is key.

> means that a rational agent can reexamine his moral system whenever he
> learns new information, and discover, for example, that his system is
> self-defeating (which would be a bad thing).

We immediately run into a problem, because no agent has the god's-eye
view necessary to know whether its own system is actually improving or
"self-defeating". From its viewpoint within the system, it cannot
distinguish between local maxima/minima and broader progress toward
its goals. While there are known algorithms for expanding the search,
such knowledge is fundamentally constrained by limited computational
time and precision, and by limited knowledge of the system, which
cannot be completely defined without also fully defining its
surrounding context (environment).

This is not to say progress is impossible, only that its direction is
fundamentally uncertain.
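A toy sketch can make the local-maximum point concrete. The landscape and names below are entirely hypothetical, not anything from the discussion; the point is only that an agent restricted to local samples stops at the nearest peak, with no internal signal that a higher peak exists elsewhere.

```python
# Hypothetical utility landscape with two peaks: a local one near x=2
# and a higher, global one near x=8.
def utility(x):
    return max(4 - (x - 2) ** 2, 9 - (x - 8) ** 2)

def hill_climb(x, step=0.5, iters=100):
    # The agent can only compare its current position with nearby points.
    for _ in range(iters):
        best = max((x - step, x, x + step), key=utility)
        if best == x:
            break  # no local improvement visible: the search halts here
        x = best
    return x

stuck = hill_climb(0.0)
# The climber halts at the nearby peak (x = 2.0), reporting "no improvement
# possible", even though utility(8) is higher. From inside the search there
# is no way to tell a local maximum from genuine progress.
```

The same limitation applies to any widened search: a bigger step or a restart schedule only trades one horizon for another, which is the constraint of limited time and precision described above.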

> This kind of situation requires the agent to search for a better moral
> system, which can either mean changing goals, values, or means.
> The next step: how should we change our goals or values? As you say,
> we get into an infinite-recursion problem by doing that. The main
> idea, in my opinion, is that the agent should calculate the value of
> the future agent's actions based on its _current_ values and goals. Thus, I
> will only make such changes to my code as result in actions, and in a
> mind, which I, the current me, believe will maximize my own utility (not
> the utility of the future mind making the actual decisions).

Yes, value judgements are necessarily based on current, local values,
but they are understood to be subject to update. Our best guide is that
values which have survived competition and are seen to work over
increasing scope tend to be seen as better in the long run.

> One of the key things about utility theory is that the utilities of two
> different minds cannot be directly compared. When we make
> modifications to our goal/value structure, we are changing the mind, and
> thus the utilities of the future mind and the current self cannot be directly
> compared. The future mind's utility can only be part of the equation if
> the current mind cares about the future mind's utility.
> Does this kind of reasoning help solve the problem of infinite recursion
> you were talking about? The current mind makes decisions that maximize its
> utility, even though it will never as such see the results of that
> maximizing. Instead it is somebody else, the future mind, who does.
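The decision rule Mikko describes can be sketched in a few lines. Everything here is illustrative and hypothetical (the action names, the dictionary-of-weights representation of values): the one structural point is that the proposed future mind's *actions* are scored by the *current* utility function, never by the future one.

```python
def actions_chosen_by(values):
    # The (current or future) mind picks whichever action its own
    # values rank highest. Missing actions score zero.
    options = ["share", "hoard", "explore"]
    return max(options, key=lambda a: values.get(a, 0))

def worth_adopting(current_values, proposed_values):
    # Judge a self-modification by how the CURRENT values score the
    # behavior the PROPOSED values would produce -- not by how the
    # future mind would score its own behavior.
    future_action = actions_chosen_by(proposed_values)
    status_quo_action = actions_chosen_by(current_values)
    return current_values.get(future_action, 0) >= current_values.get(status_quo_action, 0)

current = {"share": 3, "explore": 2, "hoard": 0}
proposal_a = {"hoard": 9}                  # future mind would hoard
proposal_b = {"share": 10, "explore": 1}   # future mind would still share

# worth_adopting(current, proposal_a) is False: the future mind would
# rate its own hoarding highly, but the current mind scores it at 0.
# worth_adopting(current, proposal_b) is True: the modified mind still
# acts in a way the current values endorse.
```

Because `proposed_values` never enters the comparison directly, the two minds' utilities are never compared against each other, which is exactly the incomparability constraint in the quoted paragraph.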

To avoid the paradox implied by an agent in the present taking actions
to increase utility for an agent existing in the future, it makes
sense to say that our moral imperative is to promote our values into
the future. This handles the case of an agent acting on behalf
of its future self (who does not yet exist), as well as an agent acting
on behalf of others even if it may not survive to enjoy the
benefits. Note that the agent acts to promote its subjective values;
it does not act to increase utility.

Likewise, on the objective side of the duality, instrumental knowledge
that survives competition and works over increasing scope will tend to
better promote one's values. Therefore, greater good is achieved not
by working toward a fixed (or recursively updated) goal, but instead
by applying increasingly objective principles of what works toward
the promotion of increasingly subjective values into the future. Note
that the agent acts to promote its subjective values, but does not act
to achieve a specific goal (in a fundamentally uncertain, coevolving
environment).

- Jef
Increasing awareness for increasing morality

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:56 MDT