From: Michael Vassar (michaelvassar@hotmail.com)
Date: Tue Aug 30 2005 - 20:06:14 MDT
>We're already
>assuming that.  The A B C -> X Y Z example shows how, one step at
>a time, the system can take actions that provide greater utility
>from the perspective of its top-level goals, that nonetheless end
>up replacing all those top-level goals.
Well then, so long as the ultimate goals have higher utility, from the 
perspective of the original goals, than the original goals did, why is this 
a problem?  A human would typically not be able to predict the long-term 
expected utility of a change to its top-level goals, but an FAI wouldn't 
make such changes unless it could.
>Another question entirely is whether, if the AI is told to maximize
>a score relating to the attainment of its top-level goals, and is
>given write access to those goals, it will rewrite those goals into
>ones more easily attainable?  (We could call this the "Buddhist AI",
>perhaps?)  The REAL top-level goal in that case
>is "maximize a score defined by the contents of memory locations X",
>but it doesn't help us to say that "maximization" won't be replaced.
>The kinds of goals we don't want to be replaced have referents
>in the real world.
This really is a very very old insight for this list.  Try to familiarize 
yourself with the list archive or at least with the major articles.  That 
really applies to everyone who hasn't done so.  Suffice it to say that such 
concerns were addressed very thoroughly years ago.
>You seem to be proposing that an AI will never make mistakes.
In the human sense, yes.  If an AI is superintelligent and Friendly for any 
significant time it will reach a state from which it will not ever make the 
sort of errors of reasoning which humans mean by mistakes.  In fact, any 
well-calibrated Bayesian built on a sufficiently redundant substrate should 
never make mistakes in the sense of either acting on implicit beliefs other 
than its explicit beliefs or holding a belief with unjustified confidence.  
Obviously, computing power, architectural details, and knowledge will 
determine the degree to which it will or will not act in the manner which 
actually maximizes its utility function, but that is not what we humans mean 
by a mistake.  We are used to constantly taking actions which we have every 
reason to expect to regret.  An FAI shouldn't do that.  This is an important 
distinction and not at all a natural one.  It shouldn't be terribly 
shocking, but it is.  By now we should be used to the idea that computers 
can perform long series of mathematical operations without error.  Since 
performing the right long series of mathematical operations is equivalent to 
making a decision under uncertainty, they should be able to make decisions 
under uncertainty without error, though due to the uncertainty such 
decisions will usually be less optimal than the decisions that would have 
been available given more information.
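
To make that concrete, here is a minimal sketch (my own illustration, not 
anything from the archive) of a decision under uncertainty reduced to a 
series of arithmetic operations: expected-utility maximization over 
explicit probabilities.  The states, probabilities, and utilities are 
made-up numbers; the point is only that every step is mechanical.

# Minimal sketch: a decision under uncertainty as expected-utility
# maximization.  All numbers below are made up for illustration.

# P(state) -- the agent's explicit, calibrated beliefs.
beliefs = {"rain": 0.3, "dry": 0.7}

# utility[action][state] -- how much the agent values each outcome.
utility = {
    "take_umbrella":  {"rain": 5,   "dry": 3},
    "leave_umbrella": {"rain": -10, "dry": 4},
}

def expected_utility(action):
    # Weight each outcome's utility by the probability of its state.
    return sum(p * utility[action][s] for s, p in beliefs.items())

# Pick the action with the highest expected utility.  Every step is plain
# arithmetic, executed without error -- even though the chosen action may
# still turn out worse than an alternative once the true state is known
# (the "less optimal than with more information" point above).
best = max(utility, key=expected_utility)
print(best, {a: expected_utility(a) for a in utility})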
>Making mistakes is a second way in which top-level goals can
>drift away from where they started.
Making sub-optimal decisions can cause top-level goals to drift.  That 
problem is absolutely unavoidable, but it should not be critical (and if it 
is critical, that is, fundamental to the way reason works, we will just have 
to do as well as we can).  Account must be taken of it when designing an 
FAI, but this only requires an incremental development beyond that needed to 
protect it from Pascal's Wagers.
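
(For what it's worth, one crude way such protection is often sketched -- my 
illustration, not a description of any actual FAI design -- is to keep 
expected utility from being dominated by vanishingly small probabilities 
attached to astronomically large payoffs, for instance by bounding the 
utility function:

# Illustrative only: a bounded utility function as one rough guard
# against Pascal's-Wager-style dominance.  The cap is arbitrary.

UTILITY_CAP = 10**6  # hypothetical bound on how good any outcome can be

def bounded_expected_utility(outcomes):
    # outcomes: list of (probability, raw_utility) pairs.
    return sum(p * min(u, UTILITY_CAP) for p, u in outcomes)

wager = [(1e-12, 1e30)]          # unbounded EU would be 1e18
mundane = [(0.9, 10), (0.1, 0)]  # ordinary, well-understood option

print(bounded_expected_utility(wager))    # 1e-06: no longer dominates
print(bounded_expected_utility(mundane))  # 9.0

Whether anything that simple suffices is of course another matter.)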