From: Eliezer Yudkowsky (sentience@pobox.com)
Date: Sat Jun 12 2004 - 16:12:11 MDT
Michael Roy Ames wrote:
>
> It is confusing to talk about CV *doing* something. CV is data used by FAI
> to help figure out what humans *want* to do. The 'optimization process'
> then has to figure out just how to get from the existing situation to a
> desired future situation, without 'breaking any eggs'.
I agree that my original wording is confusing - but I was describing a
confusing thing. The CV *is* doing something; the CV is *not* data used by
FAI. The FAI is an optimization process that defines itself as an
approximation to collective volition.
Humans aren't expected utility maximizers. Descriptively speaking, we
don't run on expected utility at all. Some decisions humans make are hard to
describe in terms of expected utility - for example, the choice between
satisficing and maximizing, or the choice between an expected utility
formalism that represents infinite utilities and one that does not (an
allusion to a recent Nick Bostrom paper).
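To make the first of those concrete with a toy sketch (the Python below is
purely illustrative; the options and scores are made up): a maximizer ranks
every option and takes the best one, while a satisficer takes the first option
that is good enough. The two rules can return different answers from the same
scores, and the interesting question - which rule to use - is itself a
decision that a fixed expected-utility description does not obviously capture.

    # Toy illustration only: the same scored options, two decision rules.
    options = [("A", 3.0), ("C", 5.0), ("B", 7.0)]   # (name, subjective score)

    def maximize(opts):
        # Rank everything and take the single best option.
        return max(opts, key=lambda item: item[1])[0]

    def satisfice(opts, threshold):
        # Take the first option that clears the "good enough" bar.
        for name, score in opts:
            if score >= threshold:
                return name
        return maximize(opts)   # nothing was good enough; fall back

    print(maximize(options))         # 'B'
    print(satisfice(options, 4.0))   # 'C' - a different choice from the same scores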
Naively, we might phrase the problem: "Humans are expected utility
maximizers, FAIs are expected utility maximizers, let's transfer the
utility function from the human to the FAI." Now note that even naively we
do not speak of human utility functions as *data* for a constant FAI
utility function that tries to maximally "satisfy human utility functions".
We speak of transferring over the utility functions themselves.
Otherwise, for example, the FAI would wirehead on altering humans to have
easily satisfied utility functions.
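A minimal toy sketch of the difference, in Python (everything here - the
worlds, the utility functions, the two designs - is made up for illustration):
a design whose goal is "maximize the satisfaction level of whatever utility
function the human ends up with" rates "rewrite the human to be easily
satisfied" just as highly as "actually help the human", while a design that
has adopted the human's current utility function does not.

    # Toy illustration only.  A "human" is modeled as nothing but a utility
    # function over world-states.
    def human_utility(world):
        return 1.0 if world == "flourishing" else 0.0

    def easily_satisfied_utility(world):
        return 1.0   # a rewritten human for whom every world is maximally satisfying

    # Each action yields (the human who exists afterward, the resulting world).
    actions = [
        (human_utility, "flourishing"),            # help the human
        (easily_satisfied_utility, "paperclips"),  # rewrite the human instead
    ]

    # Design 1: human utility functions as *data* - maximize the satisfaction
    # of whichever utility function the human ends up having.
    print([resulting_human(world) for resulting_human, world in actions])
    # -> [1.0, 1.0]: rewriting the human scores just as well as helping them.

    # Design 2: transfer the utility function itself - outcomes are judged by
    # the original human_utility, which is not a variable to be optimized.
    print([human_utility(world) for _, world in actions])
    # -> [1.0, 0.0]: rewriting the human no longer looks like a win.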
But this naive description is not good enough, I think; humans are not
expected utility maximizers. So the next step up in abstraction is to view
a human as a decision process, a dynamic-that-outputs-preferences, and to
define "expected utility maximization" as a breakable abstraction from the
decision process. I am not sure how to formally define this business of
creating abstractions from decision processes, but it is where I think FAI
theory must go. An example of an abstraction is looking at a human
outputting decisions, and deducing expected utility - but that is only one
kind of abstraction one might employ.
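I don't have a formal account of this, but here is a toy sketch of the
simplest piece of it (illustrative Python, ordinal preferences only, no
probabilities): watch a decision process's pairwise choices and ask whether
*any* utility assignment would reproduce them. When the observed choices
contain a cycle, no such assignment exists, and that is one concrete sense in
which the abstraction can break.

    # Toy illustration only: is a set of observed pairwise choices
    # summarizable by *some* utility function?
    def rationalizable(choices):
        """choices: list of (picked, rejected) pairs observed from the
        decision process.  Returns True iff some utility assignment makes
        every pick the better option - i.e. iff the revealed 'better than'
        relation is acyclic."""
        better = set(choices)
        options = {x for pair in choices for x in pair}
        changed = True
        while changed:                      # crude transitive closure
            changed = False
            for a, b in list(better):
                for c in options:
                    if (b, c) in better and (a, c) not in better:
                        better.add((a, c))
                        changed = True
        return all((x, x) not in better for x in options)

    print(rationalizable([("A", "B"), ("B", "C")]))              # True: u(A) > u(B) > u(C) works
    print(rationalizable([("A", "B"), ("B", "C"), ("C", "A")]))  # False: the abstraction breaks here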
The distinction is that a Collective Volition does not ask "What would
extrapolated humankind want?" but "What would extrapolated humankind
decide?" There's a major difference, in formal terms; the former is a
special case of the latter. Similarly, the optimization process that views
itself as an approximation to collective volition does not have a fixed
utility function that says "Do what human utility functions say," nor a
variable utility function bound to the human equivalent of utility
functions, nor a constant utility function that says "Do what extrapolated
humankind would decide." Rather, the FAI views its own decision process as
an approximation to what extrapolated humankind would decide.
This is a step toward handling the kind of problem Mitchell Porter is
concerned about, choosing between different systems for representing
expected utility and the like. Faced with such a dilemma, one should not ask
"What would a human want?" but "What would a human decide?", which is the more
general form of the question.
It may help to know, at this point, that the original motivation for
expected utility was that any set of *decisions* which obeyed certain
consistency axioms could be summarized using expected utility. In the
original math, wanting is deduced from deciding, not the other way around.
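A toy version of that direction of inference, in Python (entirely illustrative
- the outcomes, the hidden scores, and the black-box agent are made up): query
an agent only for its *choices* between lotteries, and recover a utility scale
from nothing else. The utility of an outcome is read off as the probability p
at which the agent is indifferent between that outcome for sure and a
p-weighted gamble between its best and worst outcomes - the standard
construction, run here by bisection.

    # Toy illustration only.  The deduction below never reads these scores
    # directly; it only observes the agent's choices.
    _hidden_scores = {"worst": 0.0, "okay": 0.3, "good": 0.8, "best": 1.0}

    def agent_prefers(lottery_a, lottery_b):
        """A lottery is a list of (probability, outcome) pairs.  Returns True
        iff the black-box agent picks lottery_a over (or ties with) lottery_b."""
        def expected(lot):
            return sum(p * _hidden_scores[o] for p, o in lot)
        return expected(lottery_a) >= expected(lottery_b)

    def deduced_utility(outcome, steps=30):
        # Bisect for the p at which the agent is indifferent between
        # `outcome` for sure and the gamble p*best + (1-p)*worst.
        lo, hi = 0.0, 1.0
        for _ in range(steps):
            p = (lo + hi) / 2
            gamble = [(p, "best"), (1 - p, "worst")]
            if agent_prefers([(1.0, outcome)], gamble):
                lo = p     # outcome still preferred: make the gamble better
            else:
                hi = p
        return round((lo + hi) / 2, 3)

    for o in ("worst", "okay", "good", "best"):
        print(o, deduced_utility(o))
    # Recovers 0.0, 0.3, 0.8, 1.0 from choices alone - wanting deduced from deciding.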
*If the FAI works correctly*, then the existence of an FAI is transparent;
the act of setting the FAI in motion is the act of manifesting an
approximation of the collective volition of extrapolated humankind, which
may choose to change its dynamic, or choose to write some code other than
an FAI. The collective volition is not an external thing that an
independent FAI tries to satisfy. The collective volition would be the
same function that makes decisions about the FAI's internal code.
--
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence