Re: 6 points about Coherent Extrapolated Volition

From: Eliezer S. Yudkowsky
Date: Sat Aug 06 2005 - 20:02:03 MDT

Michael Anissimov wrote:
> Hi Eliezer,
> Few quick questions on the CEV post - notice that you've turned
> "Collective Extrapolated Volition" into "Coherent Extrapolated Volition"
> here - is this a permanent jargon change or are you just using the term
> "coherent" to make some sort of point in this context? Please explain.

I think it will be a permanent jargon change, though perhaps not a final
equilibrium; who knows but that there may be more in store.

> Eliezer S. Yudkowsky wrote:
>> 3. The CEV writes an AI. This AI may or may not work in any way
>> remotely resembling a volition-extrapolator.
> ...though it's extremely likely it would, right?

What? No. Possible, sure. Where would be the justification for calling it
'extremely likely'? Asked what we want at the object-level, we may or may not
want anything that treats with our wants at the meta-level.

> In the broadest sense,
> "volition extrapolation" basically means "guessing what people want",
> right?


>> 4. The CEV returns one coherent answer. The AI it returns may or may
>> not display any given sort of coherence in how it treats different
>> people, or create any given sort of coherent world.
> Of course, if it doesn't display any sort of coherence in how it treats
> different people, or doesn't create any sort of coherent world, that
> would be a failure, right?

This is only a licensable inference because many of our goals require
coherence; not because coherence is a goal in itself. Survival implies at
least local continuity between past and future selves; challenge, success, and
fun imply at least local continuity between past and future worlds.

I think I would personally prefer that roughly the same thing happen to the
whole human species, so that we are not split to go one way and another never
to meet again. But perhaps that will prove to be only a personal preference
on my part, or only a transient delusion of morality.

> Is this statement being put forth to help
> people distinguish the difference between the CEV and the AI it creates?


>> 5. The CEV runs for five minutes before producing an output. It is
>> not meant to govern for centuries.
> Though of course, there could be substantial mutual information between
> the CEV and the AI it creates - correct?

Mutual information in the Shannon sense? Absolutely! Of course!

On the other hand, I'd be really disturbed to see sections of code copied
verbatim. I would regard this as prima facie evidence of malfunction.

> Though such an AI (nor the CEV
> which created it) would not "govern" in the anthropomorphic sense, it
> would surely exert optimization pressure upon the world. There are
> probably some people out there who feel infinitely uncomfortable

Wow, how does their brain pack in an infinite amount of uncomfort? Up until
this point I'd been an infinite set atheist, on the grounds that no reliable
witness has ever reported encountering an infinite set.

> with
> the idea of a superintelligent AI with initial conditions set by a human
> programming team creating changes in the world, and will hence object to
> any such proposals, but of course it seems like this event is basically
> unavoidable... I think it's important to distinguish between people who
> are objecting to *any* FAI theory on the grounds that they haven't come
> to terms with the reality of recursive self-improvement yet, and people
> who have already accepted that superintelligent AI will eventually come
> into existence whether we like it or not, and that it's merely our duty
> to set the initial conditions as best we can. It's sometimes difficult
> to tell the difference between the two, because it seems like
> people in group #1 may occasionally pretend to be in group #2 for the
> sake of argument (which ends up going nowhere).
>> 6. The CEV by itself does not mess around with your life. The CEV
>> just decides which AI to replace itself with.
> ...but the CEV isn't explicitly being programmed to create an AI output
> - aye?

Pretty much, although I may have to make certain assumptions about the class
of thingydingies to which the output belongs, in order to create a clearly
defined CEV computation producing the output. For example, one might require
that the output be a computer program placed in charge of the existing RPOP
infrastructure. That computer program could be an AI; or it could clean up
the infrastructure and delete itself; or it could execute a predefined set of
actions and then clean up and delete itself.

> The AI output is based on the assumption that our wish if we
> knew more, thought faster, were more the people we wished we were, had
> grown up farther together; where the extrapolation converges rather than
> diverges, where our wishes cohere rather than interfere; extrapolated as
> we wish that extrapolated, interpreted as we wish that interpreted, we
> would decide to construct an AI that exerts a sort of optimizing
> pressure on the world such that it makes it a better place to live?

No, that's exactly the sort of assumption you don't want to build into CEV.
That's CEV as Nice Place To Live, which is a strong assumption about the sort
of world humanity would enjoy inhabiting; quite distinct from CEV as Initial
Dynamic. For example, in the original "Collective Volition" I suggested that
we might coherently extrapolatedly wish the CEV output to create a small set
of understandable background rules for our new world, and "hands off!" for
individual destinies.

> I
> would agree with this assumption - I just think it's worthwhile to point
> out explicitly for the sake of clarity. Theoretically, the (extremely
> improbable) output of CEV could merely be a single object, like a
> banana, or something along those lines, yes?

If a banana belongs to the class of possible outputs, then you're allowing the
CEV to construct arbitrary physical devices as its output, rather than writing
arbitrary computer programs. That requires dynamic action and planning by CEV
in the real world, in the process of producing its first-order output, which
therefore occurs before the CEV's replacement by its first-order output.

Perhaps the (extremely improbable) output of CEV would be a program that uses
SI infrastructure to construct one banana, and then cleans up the SI
infrastructure and thereby deletes itself.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
