Re: ethics

From: Eliezer Yudkowsky
Date: Fri May 21 2004 - 16:18:35 MDT

Aubrey de Grey wrote:
> Eliezer Yudkowsky wrote:
>> Similarly, FAI doesn't require that I understand an existing
>> biological system, or that I understand an arbitrarily selected
>> nonhuman system, but that I build a system with the property of
>> understandability. Or to be more precise, that I build an
>> understandable system with the property of predictable
>> niceness/Friendliness, for a well-specified abstract predicate
>> thereof. Just *any* system that's understandable wouldn't be
>> enough.
> What I would like to see is an argument that there can, in principle,
> be a system with the property of understandability (by at least a few
> 21st century humans) and also with the property of considerably
> greater than human cognitive function. (I avoid "intelligence"
> because I want to try to focus the discussion on function, and thence
> on the reasons why we may find these machines worth making, leaving
> aside for the moment the idea that we need to invent FAI before
> anyone invents unfriendly AI.)

If you're familiar with the expected utility formalism and the notion of
utility functions, then consider a utility function U(x), and an immense
amount of computing power devoted to steering the universe into states
with a 99.99% or better expectation that U(x) > T. (Note that this is a
satisficer, not an expected utility maximizer.) The idea is that even
if there's a huge amount of computing power devoted to looking for
actions/plans/designs that achieve U(x) > T, such that the specific
solutions chosen may be beyond human intelligence, the *ends* to which
the solutions operate are humanly comprehensible. We can say of the
system that it steers the future into outcomes that satisfice U(x),
even if we can't say how.
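
A minimal sketch of the satisficer/maximizer distinction drawn above,
assuming a toy model in which each action leads to a few outcomes x with
known probabilities. The names (ACTIONS, satisfice, maximize) and the
numbers are hypothetical illustrations, not part of any actual FAI
design; the 99.99% confidence level is the one mentioned in the text.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy model: an action maps to (probability, outcome) pairs.
Distribution = List[Tuple[float, float]]


def prob_above(dist: Distribution, U: Callable[[float], float], T: float) -> float:
    """Probability that the outcome x of this action satisfies U(x) > T."""
    return sum(p for p, x in dist if U(x) > T)


def expected_utility(dist: Distribution, U: Callable[[float], float]) -> float:
    """Expected value of U(x) under this action's outcome distribution."""
    return sum(p * U(x) for p, x in dist)


def satisfice(actions: Dict[str, Distribution],
              U: Callable[[float], float],
              T: float,
              confidence: float = 0.9999) -> Optional[str]:
    """Return the first action with P(U(x) > T) >= confidence, else None."""
    for name, dist in actions.items():
        if prob_above(dist, U, T) >= confidence:
            return name
    return None


def maximize(actions: Dict[str, Distribution],
             U: Callable[[float], float]) -> str:
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name], U))


def U(x: float) -> float:
    """Assumed utility function for illustration: utility equals the outcome."""
    return x


# Assumed example actions: a risky one and a reliable one.
ACTIONS: Dict[str, Distribution] = {
    "gamble": [(0.5, 0.0), (0.5, 100.0)],  # high expected utility, unreliable
    "steady": [(1.0, 20.0)],               # lower expected utility, reliable
}

if __name__ == "__main__":
    print(satisfice(ACTIONS, U, T=10.0))  # picks the reliable action
    print(maximize(ACTIONS, U))           # picks the high-expectation action
```

In this toy case the two criteria disagree: the maximizer prefers the
gamble, while the satisficer accepts only the action that clears the
threshold T with the required confidence. However the search over
actions is carried out, the criterion it is searching against stays
simple enough to state in a few lines.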

Actually you need a far more complex goal structure than this to achieve
a satisfactory outcome. In the extrapolated volition version of
Friendly AI that I'm presently working with, U(x) is constructed in a
complex way from existing humans, and may change if the humans
themselves change. Even the definition of how volition is extrapolated
may change, if that's what we want.

(I'm starting to get nervous about my ability to define an extrapolation
powerful enough to incorporate the reason why we might want to rule out
the creation of sentient beings within the simulation, without
simulating sentient beings. However, I've been nervous about problems
that looked more impossible than this, and solved them. So I'm not
giving up until I try.)

> Now, I accept readily that it is not correct that complex systems are
> *always* effectively incomprehensible to less complex systems. I
> have no problem with the idea that "self-centredness" may be
> avoidable. But as I understand it you are focusing on the
> development of a system with the capacity for essentially indefinite
> cognitive self-enhancement. I can't see how a system so open-ended
> as that can be constrained in the way you so cogently point out is
> necessary, and I also can't see how any system *without* the capacity
> for essentially indefinite cognitive self-enhancement will be any use
> in pre-empting the development of one that does have that capacity,
> which as I understand it is one of your primary motivations for
> creating FAI in the first place.

The problem word is "constrain". I would say rather that I choose an
FAI into existence, and that what the FAI does is choose. The U(x)
constrains the future, not the FAI; the FAI, in a strong sense, is
*defined* by the choice of U(x). That becomes the what-it-does, the
nature of the FAI; it is no more a constraint than physics is
constrained to be physics, no more to be contrasted to some separate
will than I want to break out of being Eliezer and become a teapot.

"Thus the *freer* the judgement of a man is in regard to a definite
issue, with so much greater *necessity* will the substance of this
judgement be determined."
        -- Friedrich Engels, Anti-Dühring, 1877.

"Freedom is understood in contrast to its various opposites. I can be
free as opposed to being presently coerced. I can be free as opposed to
being under some other person's general control. I can be free as
opposed to being subject to delusions or insanity. I can be free as
opposed to being ruled by the state in denial of ordinary personal
liberties. I can be free as opposed to being in jail or prison. I can
be free as opposed to living under unusually heavy personal obligations.
I can be free as opposed to being burdened by bias or prejudice. I
can even be free (or free spirited) as opposed to being governed by
ordinary social conventions. The question that needs to be asked, and
which hardly ever is asked, is whether I can be free as opposed to being
causally determined. Given that some kind of causal determinism is
presupposed in the very concept of human action, it would be odd if this
were so. Why does anyone think that it is?"
        -- David Hill

What kind of freedom can exist, except the freedom to determine our
selves and our futures with our goals and choices? Physics is
deterministic (yes, it is, see also many-worlds and Barbour's timeless
physics). It's a strange and complex delusion that leads people to see
illusory, impossible kinds of freedom, freedoms that contrast to
determination instead of existing within deterministic physics. Another
reason not to talk of "intelligence", since people often toss impossible
kinds of "freedom" into that definition.

I would construct a fully reflective optimization process capable of
indefinitely self-enhancing its capability to roughly satisfice our
collective volition, to the exactly optimal degree of roughness we would
prefer. Balancing between the urgency of our needs; and our will to
learn self-reliance, make our own destinies, choose our work and do it

> (In contrast, I would like to see
> machines autonomous enough to free humans from the need to engage in
> menial tasks like manufacturing and mining, but not anything beyond
> that -- though I'm open to persuasion as I said.)

Because you fear for your safety, or because you would prefer to
optimize your own destiny rather than becoming a pawn to your own
volition? Or both?

> What surprises me most here is the apparently widespread presence of
> this concern in the community subscribed to this list -- the reasons
> for my difficulty in seeing how FAI can even in principle be created
> have been rehearsed by others and I have nothing to add at this
> point. It seems that I am one of many who feel that this should be
> SIAI FAQ number 1. Have you addressed it in detail online anywhere?

Not really. I think that, given the difficulty of these problems, I
cannot simultaneously solve them and explain them. Though I'm willing
to take occasional potshots.

> I'm also fairly sure that SIAI FAQ #2 or thereabouts should be the
> one I asked earlier and no one has yet answered: namely, how about
> treating AI in general as a WMD, something to educate people not to
> think they can build safely and to entice people not to want to
> build?

I've had no luck at this. It needs attempting, but not by me. It has
to be someone fairly reputable within the AI community, or at least some
young hotshot with a PhD willing to permanently sacrifice his/her
academic reputation for the sake of futilely trying to warn the human
species. And s/he needs an actual technical knowledge of the issues,
which makes it difficult.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
