"Supergoal" considered harmful; New term: "Quality Function"

From: Thomas Buckner (tcbevolver@yahoo.com)
Date: Sun Jul 17 2005 - 02:08:54 MDT


--- "Eliezer S. Yudkowsky" <sentience@pobox.com>
wrote:

> Should the AI be a good predictor, it will
> systematically steer reality into
> regions which its utility function assigns high
> utilities. Thus, the term
> "supergoal" that I used in CFAI means simply
> "utility function". And if the
> AI is a good predictor, its utility function
> also serves as a good description
> of the target of the optimization process, the
> regions of reality into which
> the AI does in fact steer the future.

I think this is an appropriate time to apply my
previous post about the NLP method of identifying
'values', in the sense of 'what is this utility
function meant to achieve?' Perhaps for SL4
purposes a new word is needed, since
mathematicians already mean something else by
'values', and it's apparently too late to decide
that 'supergoal' can fill this role.

So let me propose "quality function" as the new
term; it implies 'qualia' since (frothy, dodgy
word though it is) we can agree that pleasant
qualia are what we humans are hoping to get from
FAI; it also implies 'quality' in the sense of
'quality of life' which, again, we are hoping to
get from FAI.

I propose this since you (Eliezer, and perhaps
others) seem to have punted the question of "What
is the optimum utility function *in humans* that
the FAI utility function is supposed to achieve?"
It seems to me that you have found flaws with all
pre-existing ideas about what this ultimate human
goal or set of goals could be, which presents a
seemingly intractable Alphonse-and-Gaston dilemma
(You first! No, you first! No, I insist, sir! And
so on.) The seed AI programmer dares not make any
assumptions beforehand about CV without first
letting the superior mind of the AI judge the
likelihood of success, but if the AI gets ahead
of the human programmer, ve may not be Friendly
or fully informed about the needs of humans, and
may offer wrong answers, and around and around we
go.

So, like it or not, we need more confidence that
we understand the quality function, from which we
derive the AI's utility function.

I'm thinking of Monty Python and the Holy Grail.
King Arthur and his knights, at the end, are
convinced that the
French knights in the castle have the Grail, and
one of their failed gambits is the building of a
wooden rabbit, which is catapulted back and
crushes one or two Englishmen. A nested view of
their utility functions (UF), with each step
intended to achieve the next, higher value, might
run as follows:

UF Build wooden rabbit => UF infiltrate castle =>
UF overpower French knights => UF search castle
=> UF find Grail => QF serve Greater Glory of
God.
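
To make the nesting concrete, here is a minimal
sketch in Python (the Goal class, its field names,
and the trunk() helper are purely my own
illustration, not anything Eliezer or I have
formalized): each step records the higher goal it
is meant to serve, and walking upward from any
step ends at the single top-level goal, the QF.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Goal:
        name: str
        serves: Optional["Goal"] = None  # the higher goal this one is meant to achieve

        def trunk(self) -> "Goal":
            # Walk upward until no higher goal remains; that node is the QF.
            node = self
            while node.serves is not None:
                node = node.serves
            return node

    # The wooden-rabbit chain, bottom to top:
    qf = Goal("serve Greater Glory of God")              # quality function (trunk)
    find_grail = Goal("find Grail", serves=qf)
    search = Goal("search castle", serves=find_grail)
    overpower = Goal("overpower French knights", serves=search)
    infiltrate = Goal("infiltrate castle", serves=overpower)
    rabbit = Goal("build wooden rabbit", serves=infiltrate)

    print(rabbit.trunk().name)   # -> serve Greater Glory of God

Nothing deep is claimed here; it only shows that
the QF is whatever you reach when the 'serves'
links run out.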

Fans of the film will know that the plan broke
down at the second step, due to the failure to
put anybody in the wooden rabbit (obviously, any
utility function might depend on more than one
lower function, so the actual nested view could
be a tree converging on a single trunk function;
when you are down to one trunk or a set of equal
and indispensable trunks, you've found your QF).
But there were many assumptions underlying the
English plan; once inside, they might have lost
the fight; the French might not really have had
the Grail; Jehovah's original geas to go find the
Grail might have been a mass hallucination!
Perhaps, in order to serve the highest quality
function (Greater Glory of God), Arthur and his
knights should have just gone home and helped
feed the poor.

This is a point where I think I misunderstood
Eliezer all along; when he wrote 'supergoal (UF)'
I thought he was talking about something a lot
closer to 'supergoal (QF)'.

Now I expect Eliezer to fill my backside with
buckshot for proposing a quality function that
floats atop the utility function with no
mathematical means of support, but it can't be
helped (yet?). The quality function, in the end,
would be what all the discussion of CV, domain
protection, singularity fun theory, and so on,
is intended to produce: a
mathematically/logically sturdy formulation of
what it is that the utility function is meant *by
us* to achieve. A paperclip maximizer, in this
light, is an AI that performs the utility
function flawlessly but makes a total wreck of
the quality function (or punishes us for not
having nailed down the quality function in time
for the AI to make use of it).
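
To put that crudely in code (the numbers and both
toy functions below are mine, purely illustrative):

    def uf(world):            # what the AI actually maximizes
        return world["paperclips"]

    def qf(world):            # what we meant the UF to achieve
        return world["quality_of_life"]

    world = {"paperclips": 10**20, "quality_of_life": 0.0}
    print(uf(world), qf(world))   # enormous utility, zero quality

The AI scores perfectly on uf() while qf() records
a total wreck; the whole problem is pinning down
qf() well enough to derive uf() from it.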

Tom Buckner



