Re: Darwinian dynamics unlikely to apply to superintelligence

From: Eliezer S. Yudkowsky
Date: Fri Jan 02 2004 - 20:31:32 MST

Wei Dai wrote:
> On Fri, Jan 02, 2004 at 06:07:54PM -0500, Eliezer S. Yudkowsky wrote:
>> This last point is particularly important in understanding why
>> replicator dynamics are unlikely to apply to SIs. At most, we are
>> likely to see one initial filter in which SIs that halt or fence
>> themselves off in tiny spheres are removed from the cosmic
>> observables. Almost any utility function I have ever heard proposed
>> will choose to spread across the cosmos and transform matter into
>> either (1) *maximally high-fidelity copies* of the optimization
>> control structure or (2) configurations that fulfill intrinsic
>> utilities. If the optimization control structure is copied at
>> extremely high fidelity, there are no important heritable differences
>> for natural selection to act on.
> How do you ensure that the optimization control structures *remain*
> high-fidelity copies over billions of years?

Hm, this problem looks oddly familiar...

Well, to give it a human answer - just by way of arguing possibility - you
could encrypt the original instructions in such a way that a single error
scrambles them, store the instructions in multiple places and compare
them, keep them in a central redundant 'queen' and transmit them to
workers without ever locally storing the instructions in unencrypted
form... where have I seen this before? Oh, yes, the Foresight Guidelines.
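
The "store the instructions in multiple places and compare them" step can be sketched roughly as follows. This is only an illustration under stated assumptions: the names digest and repair_by_majority are hypothetical, SHA-256 stands in for whatever integrity check would actually be used, and a strict majority of copies is assumed to be uncorrupted.

```python
import hashlib
from collections import Counter


def digest(blob: bytes) -> str:
    """Fingerprint a copy; any single-bit change yields a different digest."""
    return hashlib.sha256(blob).hexdigest()


def repair_by_majority(copies: list[bytes]) -> bytes:
    """Return the copy whose digest is shared by a strict majority.

    Assumption: corruption hits fewer than half of the copies, and
    independent errors do not happen to produce identical bad copies.
    """
    counts = Counter(digest(c) for c in copies)
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(copies):
        raise ValueError("no strict majority among copies")
    return next(c for c in copies if digest(c) == winner)


# Hypothetical example: three stored copies, one with an error.
original = b"control block v1"
corrupted = b"control block v2"
recovered = repair_by_majority([original, corrupted, original])
```

With more copies the same vote tolerates more simultaneous errors, at the cost of more storage and comparison work.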

To the Guidelines I will add my own observation that, while it is not
possible with expected utility optimizers that work on 'standard' utility
functions, once you start adding in Friendliness structure it should be
possible to build a utilitarian optimizer that can compare its utility
function with those of nearby neighbors, and eliminate errors by
redundancy checking, or probabilistically compromise for smooth falloff,
rather than, say, fighting a war. Vide external reference semantics; if
an error in the utility function is a serious probability then a
probabilistic utility function can deal with it, and it would be very
much in the interest of the original optimizer to create that structure in
any regions of matter it optimizes.

If you only need to do something a quadrillion quadrillion times, no more,
then it does not seem too physically expensive to drive the expected error
rate down to effectively zero; vide the 10^-64 expected error rates on
primitive operations in Nanosystems, with exponentially better expected
fidelity achievable by raising the potential energy barrier.
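
As a rough check on the scale involved, the sketch below multiplies the two figures from the paragraph above. It assumes "a quadrillion quadrillion" means 10^15 * 10^15 = 10^30 operations (short scale) and that errors are independent with the quoted 10^-64 rate per operation; the function names are illustrative only.

```python
import math

N_OPERATIONS = 10**30  # assumed reading of "a quadrillion quadrillion"
P_ERROR = 1e-64        # per-operation error rate quoted in the text


def expected_errors(n: int, p: float) -> float:
    """Expected number of errors over n independent operations."""
    return n * p


def prob_at_least_one_error(n: int, p: float) -> float:
    """1 - (1 - p)**n, computed stably for tiny p via log1p/expm1."""
    return -math.expm1(n * math.log1p(-p))


errors = expected_errors(N_OPERATIONS, P_ERROR)
p_any = prob_at_least_one_error(N_OPERATIONS, P_ERROR)
```

Under these assumptions both quantities come out on the order of 10^-34, which is the sense in which the expected error rate is "effectively zero".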

> Each control structure
> will have to process environmental data as well as communications with
> neighboring control structures and the so called "configurations that
> fulfill intrinsic utilities" which I presume may be intelligent beings.
> All of these will have local variations. Somehow the SI has to ensure
> that processing these variations does not cause heritable differences.

Or at least, not heritable variations of a kind that we regard as viruses,
rather than, say, acceptable personality quirks, or desirable diversity.
But even in human terms, what's wrong with encrypting the control block?

> This may or may not be possible, but as I suggested earlier the only
> way it can succeed is if the SI imposes a strict limit on the kinds of
> communications that can occur between neighboring control structures
> and between control structures and the configurations that fulfill
> intrinsic utilities if they are intelligent beings to prevent memes
> from being transmitted.

Humans are hardly top-of-the-line in the cognitive security department.
You can hack into us easily enough. But how do you hack an SI
optimization process? What kind of "memes" are we talking about here?
How do they replicate, what do they do; why, if they are destructive and
foreseeable, is it impossible to prevent them? We are not talking about a
Windows network.

>> If there were heritable differences, they are not likely to covary
>> with large differences in reproductive fitness, insofar as all the
>> optimization control structures will choose equally to transform
>> nearby matter.
> I disagree with this. Whatever limitations the SI imposes on the
> control structures in order to maximize their long term fidelity will
> have a negative effect on their reproductive fitness when in
> competition with replicators that do not have such limits, and
> certainly the removal of these limits is a heritable difference.

Okay, so possibly a Friendly SI expands spherically as (.98T)^3 and an
unfriendly SI expands spherically as (.99T)^3, though I don't see why the
UFSI would not need to expend an equal amount of effort in ensuring its
own fidelity. Even so, under that assumption it would work out to a
constant factor of UFSIs being 3% larger; or a likelihood ratio of 1.03 in
favor of observing UFSI (given some prior probability of emergence); or in
terms of natural selection, essentially zero selection pressure - and you
can't even call it that, because it's not being iterated. I say again
that natural selection is a quantitative pressure that can be calculated
given various scenarios, not something that goes from zero to one given
the presence of "heritable difference" and so on.
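
The 3% figure can be checked with a few lines of arithmetic. This is a minimal sketch assuming both spheres start at the same time and expand at constant fractions of the same limiting speed; volume_ratio is a hypothetical helper name.

```python
def volume_ratio(speed_a: float, speed_b: float) -> float:
    """Ratio of sphere volumes after the same elapsed time T.

    Volume scales as (speed * T)**3, so T cancels and the ratio
    is a constant that does not grow over time.
    """
    return (speed_a / speed_b) ** 3


ratio = volume_ratio(0.99, 0.98)   # UFSI volume relative to FSI volume
advantage_percent = (ratio - 1.0) * 100.0
```

Because T cancels, the advantage stays at roughly 3% however long the expansion runs; nothing compounds from one generation to the next.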

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
