Re: Ethics was In defense of physics

From: Eliezer S. Yudkowsky
Date: Sun Feb 15 2004 - 20:37:14 MST

Keith Henson wrote:
>> The point is that you can perform all learning necessary to the task
>> of transforming everything in sight into paperclips, and you won't
>> have conflicts with distant parts of yourself that also want to
>> transform everything into paperclips - the target is constant, only
>> the aim gets updated.
> And the weapons, and the gravity field, and if it starts thinking about
> what it is doing and what paper clips are used for, it might switch from
> metal to plastic or branch out into report covers (which do a better
> job) and then reconsider the whole business of sticking papers together
> and start making magnetic media.

What does it matter so long as there are paperclips?

Seriously, what *does* it matter from the perspective of a mind that only
wants paperclips? If you yourself want something besides paperclips, you
should not build a paperclip optimization process, of course.

>> Hm... I infer that you're thinking of some algorithm, such as
>> reinforcement on neural nets, that doesn't cleanly separate model
>> information and utility computation.
> Even if you cleanly split out utility computation, widely separated AIs
> are going to be working off rather different data bases.
> Take shifting a galaxy to avoid the worst consequences of collisions
> (whatever they are). That's an obvious project for a friendly and very
> patient AI. Say galaxy A needs to go right or left direction and galaxy
> B needs to go left or right depending on what A does to modify the
> collision. If the AIs figure this out when they are separated by
> several million light years, they are going to have a heck of a time
> deciding which way each should cause their local galaxy to dodge. If
> they both decide the same way, you are going to get one of those
> sidewalk episodes of people dodging into each other's path--with really
> lamentable results for anybody nearby if the black holes merge.

If you anticipate this problem in advance, you can keep a simple reference
mind on offline storage somewhere, and some set of agreed-on protocols for
reducing your local data to the subset of the local data that would be
visible to a distant self. Both copies of yourself feed the reference
mind identical copies of the intersection of the data that would be known
to both entities. The reference mind then outputs a set of coordinated
high-level strategies on the level where coordination is necessary. The
rest is up to the local minds and they can use full knowledge in
implementing it.
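
A minimal sketch of that protocol in Python, to make the data flow concrete. Everything named here is hypothetical: the facts, the `shared_view` filter standing in for the "agreed-on protocols", and the hash-based rule inside `reference_mind`, which stands in for whatever simple deterministic planner both copies keep on offline storage. The only property that matters is that identical input produces identical output on both sides.

```python
import hashlib

# Hypothetical data model: each fact maps to True if the agreed-on protocol
# says the distant self would also be able to see it, False if it is local-only.
facts_a = {
    "galaxies A and B are on a collision course": True,
    "supernova observed near A's core last year": False,
}
facts_b = {
    "galaxies A and B are on a collision course": True,
    "gas cloud density measured near B": False,
}

def shared_view(local_facts):
    """Reduce local data to the subset that the distant self would also know."""
    return frozenset(fact for fact, visible in local_facts.items() if visible)

def reference_mind(shared_facts):
    """Deterministic stand-in planner: same shared facts in, same plan out.

    The hash-parity rule is an arbitrary illustration, not a real strategy.
    hashlib is used instead of hash() because hash() of a str can differ
    between Python processes.
    """
    canonical = "\n".join(sorted(shared_facts)).encode("utf-8")
    bit = hashlib.sha256(canonical).digest()[0] & 1
    if bit == 0:
        return {"A": "dodge left", "B": "dodge right"}
    return {"A": "dodge right", "B": "dodge left"}

def decide(role, local_facts):
    """Each copy runs the reference mind on the shared subset only."""
    return reference_mind(shared_view(local_facts))[role]

move_a = decide("A", facts_a)  # computed near galaxy A
move_b = decide("B", facts_b)  # computed near galaxy B, with no communication
print(move_a, "/", move_b)
```

Because both copies discard their local-only facts before consulting the reference mind, they arrive at the same plan and so pick complementary moves without exchanging a message.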

In general, the ability to carry out optimal plans with multiple actions,
whether simultaneous spatially distributed actions or temporally
distributed local actions, depends on your ability to reliably predict
spatially or temporally distant actions. The solution I gave above is an
extreme case of the answer, "in thinking through coordinated plans, don't
use data your other self can't access". This answer is not necessarily
optimal, but it's simple. A more complex answer would involve optimizing
over probability distributions for the distant mind's action. The more
important it is to be perfectly coordinated, the more unshared information
you should throw away in order to be predictable.
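
The "more complex answer" can be sketched as ordinary expected-utility maximization over a belief about the distant mind's action. The belief numbers and the utility function below are assumptions chosen only for illustration: utility is 1 when the two minds dodge in different directions and 0 when they dodge the same way.

```python
def expected_utility(my_action, belief, utility):
    """Average utility of my_action over the distant mind's possible actions."""
    return sum(p * utility(my_action, other) for other, p in belief.items())

def best_action(my_actions, belief, utility):
    """Pick the action with the highest expected utility under the belief."""
    return max(my_actions, key=lambda a: expected_utility(a, belief, utility))

def dodge_utility(mine, theirs):
    # Assumed payoff: success only if the two minds choose different directions.
    return 1.0 if mine != theirs else 0.0

actions = ["left", "right"]

# Assumed belief: the distant self goes left with probability 0.7.
belief = {"left": 0.7, "right": 0.3}
choice = best_action(actions, belief, dodge_utility)
print(choice, expected_utility(choice, belief, dodge_utility))

# If the distant self is unpredictable, no action does better than chance.
uniform = {"left": 0.5, "right": 0.5}
print([expected_utility(a, uniform, dodge_utility) for a in actions])
```

Under these assumed numbers the best reply succeeds 70% of the time, and under the uniform belief neither action beats 50%. The expected payoff only approaches 1 as the distant mind's action becomes predictable, which is the reason for throwing away unshared information when coordination matters most.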

> To the extent humans share goals it is because humans share genes. Males
> in particular are optimized to act and to take risks for others on the
> basis of the average relationship in a tribe a few hundred thousand
> years ago (averaging to something like second cousin).

Sometimes humans share goals, not because they have high relatedness to
one another, but because humans share the genes that construct the goals
and the goals are cognitively implemented in non-deictic form (the goal
template doesn't use the "this" variable). For example, humans like
particular kinds of environments, so if you were to propose a workable way
of transforming Toronto into the tree-city of Lothlorien, there'd be
widely distributed support for that proposal not because everyone in
Toronto is related to you, but because the parts of our brains that
process the pretty flowers (signs of fertile territory) are constructed by
species-typical genes. Shared utility functions exist because of shared
genes, but not necessarily because of Hamiltonian relatedness.

Likewise, you can get selection pressures derived from iterated Prisoner's
Dilemma between not necessarily related partners, and selection pressures
on more complex social interactions if language is around. If you had an
evolved intelligent species whose spawning process scrambled zygotes
spatially before they grew up, so that they weren't related to nearby
individuals, I'd still expect them to evolve social coordination
mechanisms in the process of evolving intelligence. We behave honorably
toward unrelated individuals.
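
For readers unfamiliar with the iterated Prisoner's Dilemma, here is a toy sketch of why repetition alone can reward reciprocity between unrelated partners. The payoff values (5, 3, 1, 0), the ten-round length, and the two strategies are conventional textbook choices and assumptions of this sketch, not anything specified above.

```python
# Payoffs as (score for first player, score for second player).
# "C" = cooperate, "D" = defect. Values are the conventional textbook ones.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """Defect regardless of what the opponent did."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the total scores (a, b)."""
    seen_by_a, seen_by_b = [], []  # opponent moves each player has observed
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # two reciprocators
print(play(tit_for_tat, always_defect))    # reciprocator meets a defector
print(play(always_defect, always_defect))  # two defectors
```

With these assumed payoffs, two reciprocators score 30 each over ten rounds, while a defector playing against a reciprocator gets 14 and two defectors get 10 each. No relatedness appears anywhere in the model; the advantage comes entirely from repeated interaction.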

> I have a real problem of part of my brain being subjective months out of
> sync. When you have to communicate, even with your twin brother, via
> sailing ship you might have the same interest and goals but you darn
> sure are going to be different individuals.

The human side of this is one issue; making lots of paperclips, or
creating a stable FAI, is another. Obviously you can't have brain lobes
millions of ticks distant from each other and remain a classical human.
I'm just saying it doesn't obviously introduce insoluble stability
problems for an FAI.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
