Re: drives ABC > XYZ

From: Michael Vassar
Date: Tue Aug 30 2005 - 20:06:14 MDT

>We're already
>assuming that. The A B C -> X Y Z example shows how, one step at
>a time, the system can take actions that provide greater utility
>from the perspective of its top-level goals, that nonetheless end
>up replacing all those top-level goals.

Well then: so long as the ultimate goals are, from the perspective of the
original goals, higher utility than the original goals were, why is this a
problem? A human would typically not be able to predict the long-term
expected utility of a change to its top-level goals, but an FAI wouldn't make
such changes unless it could.

>Another question entirely is whether, if the AI is told to maximize
>a score relating to the attainment of its top-level goals, and is
>given write access to those goals, it will rewrite those goals into
>ones more easily attainable? (We could call this the "Buddhist AI",
>perhaps?) The REAL top-level goal in that case
>is "maximize a score defined by the contents of memory locations X",
>but it doesn't help us to say that "maximization" won't be replaced.
>The kinds of goals we don't want to be replaced have referents
>in the real world.

This really is a very old insight for this list. Try to familiarize
yourself with the list archive, or at least with the major articles; that
really applies to everyone who hasn't done so. Suffice it to say that such
concerns were addressed very thoroughly years ago.

>You seem to be proposing that an AI will never make mistakes.

In the human sense, yes. If an AI is superintelligent and Friendly for any
significant time, it will reach a state from which it will never make the
sort of errors of reasoning which humans mean by "mistakes". In fact, any
well-calibrated Bayesian built on a sufficiently redundant substrate should
never make mistakes in the sense of either acting on implicit beliefs other
than its explicit beliefs or holding a belief with unjustified confidence.
Obviously, computing power, architectural details, and knowledge will
determine the degree to which it does or does not act in the manner that
actually maximizes its utility function, but that is not what we humans mean
by a mistake. We are used to constantly taking actions which we have every
reason to expect to regret. An FAI shouldn't do that. This is an important
distinction, and not at all a natural one; it shouldn't be terribly
shocking, but it is. By now we should be used to the idea that computers
can perform long series of mathematical operations without error. Since
performing the right long series of mathematical operations is equivalent to
making a decision under uncertainty, computers should be able to make
decisions under uncertainty without error, though due to the uncertainty
such decisions will usually be less optimal than the decisions that would
have been available given more information.
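The reduction of a decision under uncertainty to a long series of arithmetic operations can be made concrete with a toy sketch. Everything below (the two states, the two actions, the numbers) is invented purely for illustration and is not from the original post: the agent computes a Bayesian posterior and then picks the action with the highest expected utility. A machine that carries out these operations without arithmetic error makes no "mistake" in the sense above, even if the chosen action happens to turn out badly.

```python
# Toy sketch of error-free decision-making under uncertainty.
# All states, actions, and numbers are hypothetical illustrations.

def posterior(prior, likelihoods):
    """Bayes' rule: P(state | evidence) is proportional to
    P(evidence | state) * P(state), normalized to sum to 1."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def best_action(post, utility):
    """Return the action maximizing expected utility under `post`.

    utility[action][state] is the payoff of `action` if `state` holds."""
    def expected_utility(action):
        return sum(p * u for p, u in zip(post, utility[action]))
    return max(utility, key=expected_utility)

# Two hypotheses about the world; the evidence favors state 0.
prior = [0.5, 0.5]
likelihoods = [0.8, 0.2]
post = posterior(prior, likelihoods)   # [0.8, 0.2]

utility = {
    "risky": [10.0, -5.0],   # pays off in state 0, hurts in state 1
    "safe":  [1.0, 1.0],     # same modest payoff either way
}
choice = best_action(post, utility)
```

Here the agent picks "risky", because its expected utility (0.8 * 10 + 0.2 * -5 = 7.0) beats the safe action's 1.0. If the improbable state obtains, the outcome is regrettable in hindsight, but the decision itself was error-free given the information available, which is exactly the distinction drawn above.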

>Making mistakes is a second way in which top-level goals can
>drift away from where they started.

Making sub-optimal decisions can cause top-level goals to drift. This
problem is absolutely unavoidable, but it should not be critical (and if it
is critical, that is, fundamental to the way reason works, we will just have
to do as well as we can). Account must be taken of it when designing an FAI,
but this only requires an incremental development beyond that needed to
protect it from Pascal's Wagers.

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:52 MDT