Re: Ethics was In defense of physics

From: Keith Henson (hkhenson@rogers.com)
Date: Sun Feb 15 2004 - 20:01:00 MST


At 11:32 AM 15/02/04 -0500, Eliezer S. Yudkowsky wrote:
>Keith Henson wrote:
>>It seems to me that the core would have to be absolutely impervious to
>>outside influences--which is in conflict with intelligence--to the extent
>>that intelligence has to do with learning. Otherwise units at the ends of
>>communication delays would diverge.
>
>Okay, as a proof of principle, let's take a generic optimization process
>(i.e., a paperclip SI) and decompartmentalize learning into Bayesian
>learning of world-models and the expected utility equation with a constant
>utility function. See:
>
>http://intelligence.org/friendly/features.html#causal_bayesian
>http://intelligence.org/CFAI/design/clean.html#reinforcement
>
>(CFAI doesn't call anything by its proper name - "cleanly causal" should
>be translated as "expected utility", "Bayesian Probability Theorem" is
>Bayes' Theorem.)
>
>The point is that you can perform all learning necessary to the task of
>transforming everything in sight into paperclips, and you won't have
>conflicts with distant parts of yourself that also want to transform
>everything into paperclips - the target is constant, only the aim gets updated.

And the weapons, and the gravity field. And if it starts thinking about
what it is doing and what paper clips are used for, it might switch from
metal to plastic, or branch out into report covers (which do a better
job), and then reconsider the whole business of sticking papers together
and start making magnetic media instead.

But after reading the articles I do see what you are getting at and why it
is really important. It also might be really tough to do it right. In
the long run, I agree with your point that we will have to depend on its
"good will," its friendliness toward us, so it makes a lot of sense to do
it right in the first place. (Or die trying.)
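
As I read it, the split looks roughly like the toy sketch below (mine,
not anything from CFAI; the states, actions and numbers are made up):
the world-model is the only thing that learns, the utility function is a
constant, and the plan falls out of maximizing expected utility over the
current model.

# Toy sketch of "the target is constant, only the aim gets updated":
# the world-model learns by Bayes, the utility function never changes,
# and the action is chosen by expected utility.

def bayes_update(prior, likelihood):
    """Posterior over world-states given the likelihood of new evidence."""
    post = {s: prior[s] * likelihood[s] for s in prior}
    z = sum(post.values())
    return {s: p / z for s, p in post.items()}

def utility(paperclips_made):
    """The fixed target: more paperclips is better; learning never touches this."""
    return paperclips_made

# How many paperclips each action yields in each possible world-state (the "aim").
YIELD = {("mine_iron", "ore_rich"): 100, ("mine_iron", "ore_poor"): 5,
         ("recycle_cars", "ore_rich"): 40, ("recycle_cars", "ore_poor"): 40}

def expected_utility(action, model):
    return sum(p * utility(YIELD[action, state]) for state, p in model.items())

model = {"ore_rich": 0.5, "ore_poor": 0.5}            # prior world-model
model = bayes_update(model, {"ore_rich": 0.9,         # evidence: survey favours rich ore
                             "ore_poor": 0.2})
best = max(["mine_iron", "recycle_cars"], key=lambda a: expected_utility(a, model))
print(model, best)    # the plan changes with evidence; utility() never does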

This is one place where the analogies of what drives people to be friendly
might be of interest.

>Programming in explicit cooperation with distant self-parts, or
>maintaining integrity of a philosophically growing Friendly AI, are more
>complex subjects. The former looks doable and theoretically
>straightforward; the latter looks doable and theoretically complex.

The latter may be pointless if the AIs are far enough apart. Like several
billion light years.

>>I suppose every AI could be broadcasting its total information stream
>>into memory and receiving the memory dumps from every other AI. It would
>>have to treat the experience (memory) of other AIs with equal weight to
>>its own. That would keep at least close ones in sync, but if there are
>>growing numbers of these things, the storage problem will get out of hand
>>no matter what media is being used. (In fact, it might make the case for
>>very few AIs. Even one per star would get out of hand.)
>
>Hm... I infer that you're thinking of some algorithm, such as
>reinforcement on neural nets, that doesn't cleanly separate model
>information and utility computation.

Even if you cleanly split out utility computation, widely separated AIs are
going to be working off rather different data bases.

Take shifting a galaxy to avoid the worst consequences of collisions
(whatever they are). That's an obvious project for a friendly and very
patient AI. Say galaxy A needs to be nudged right or left, and galaxy B
left or right depending on what A does, to modify the collision. If the
AIs figure this out when they are separated by several
million light years, they are going to have a heck of a time deciding which
way each should cause their local galaxy to dodge. If they both decide the
same way, you are going to get one of those sidewalk episodes of people
dodging into each other's path--with really lamentable results for anybody
nearby if the black holes merge.
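
One way out, I suppose, is the "explicit cooperation with distant
self-parts" you mention: commit to a deterministic tie-breaking rule
before the parts separate, so each side can compute which way both will
dodge from shared data alone. A toy illustration (my own, not anything
from the papers):

# Toy coordination-without-communication: both nodes carry the same
# deterministic rule, fixed before they separated, so each can compute
# which way *both* will dodge from shared data alone.
import hashlib

def dodge_direction(my_id, other_id, shared_event_tag):
    """Return 'left' or 'right' for this node; the peer, calling with the
    arguments swapped, is guaranteed to get the opposite answer."""
    first, second = sorted([my_id, other_id])        # same ordering on both sides
    h = hashlib.sha256((first + second + shared_event_tag).encode()).digest()
    first_goes_left = bool(h[0] & 1)                 # pseudo-random bit both can compute
    if my_id == first:
        return "left" if first_goes_left else "right"
    return "right" if first_goes_left else "left"

# Galaxy A's node and galaxy B's node, millions of light years apart:
print(dodge_direction("galaxy-A", "galaxy-B", "collision-plan-1"))
print(dodge_direction("galaxy-B", "galaxy-A", "collision-plan-1"))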

>>The problems this creates are bad enough that far apart AI cores would be
>>forced to consider themselves as different "individuals" just by weight
>>of different (unsync'ed) post creation experiences. I think this is true
>>even if closer ones engaged in total mind melding.
>
>In human beings all the axes of "individual" versus "group" are conflated:
>many memories versus one memory, many clusters versus one processor,
>different goals versus same goals, different plans versus same plans, and
>so on.

To the extent humans share goals it is because humans share genes. Males
in particular are optimized to act and to take risks for others on the
basis of the average relatedness in a tribe a few hundred thousand years
ago (averaging to something like second cousin).

>Different memories stored in local nodes of an optimization process
>sprawled over long interprocessor communication delays does not equate to
>conflict of interest.
>
>>With FTL there doesn't seem to be an obvious limit. Without . . .
>>eventually your brain undergoes a gravitational singularity.
>
>Only if you want to keep each computing element within small-N clock ticks
>of every other computing element. This is the case with the human brain,
>for which Anders Sandberg calculated S = (single instruction time /
>communication delay) ~ 1. See "The Physics of Information Processing
>Superobjects".

That's an interesting number. I wonder what William Calvin would say about
that? (He knows one heck of a lot about problems of this class.)
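
For what it is worth, the ratio is easy to play with. These are my own
rough order-of-magnitude numbers, not Sandberg's:

# Rough check of S = (single instruction time) / (communication delay).
C = 3.0e8                  # lightspeed, m/s

def S(instruction_time_s, diameter_m, signal_speed_m_s):
    return instruction_time_s / (diameter_m / signal_speed_m_s)

print("human brain (1 ms, 0.15 m, ~100 m/s axons): S ~ %.2g" % S(1e-3, 0.15, 100))
print("1 GHz chip (1 ns, 3 cm, lightspeed):         S ~ %.2g" % S(1e-9, 0.03, C))
print("planet-sized machine, same clock:            S ~ %.2g" % S(1e-9, 1.3e7, C))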

>Actually, with FTL or without FTL, if you try to keep S ~ 1 or S < bound,
>you run into problems with your brain collapsing gravitationally. Without
>FTL, because of the lightspeed delay; with FTL, because the necessary
>density of FTL relays to keep all processors within N hops also grows,
>albeit logarithmically (I guess). In either case, you can either slow
>down your processors or accept a lower S.
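
A quick back-of-envelope version of the collapse argument (my numbers,
illustrative only): fixing S at a given clock speed fixes the radius,
and piling more hardware into that radius eventually runs into the
Schwarzschild limit.

# Back-of-envelope: keeping every element within one clock tick at lightspeed
# fixes the radius, so adding more hardware eventually makes a black hole.
G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
C = 3.0e8                  # lightspeed, m/s

clock_tick = 1e-9                        # 1 GHz
radius = C * clock_tick                  # ~0.3 m: one light-tick across
collapse_mass = radius * C**2 / (2 * G)  # mass whose Schwarzschild radius equals that
print("radius %.2f m, collapse mass %.1e kg (~%.0f Earth masses)"
      % (radius, collapse_mass, collapse_mass / 5.97e24))
# So: slow the clock, or accept a lower S, exactly as the quoted paragraph says.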

I have a real problem with part of my brain being subjective months out
of sync. When you have to communicate, even with your twin brother, via
sailing ship, you might have the same interests and goals, but you darn
sure are going to be different individuals.

Keith Henson


