Re: Beyond evolution

From: Eliezer S. Yudkowsky
Date: Mon Feb 05 2001 - 10:57:20 MST

Samantha Atkins wrote:
> > Multiple entities are multiple points of failure of Friendliness and even
> > greater dangers.
> Yes, but we seem to get along pretty well being able to more or less
> balance one another's power and to some extent limit each other's
> possibility of running totally amok in an un-stoppable way.

Part of the reason why I think a Friendly SI will go with the
Sysop scenario is that I think the system of checks and balances breaks
down completely under high technology.

Humans are a special case on at least two counts: First, we exist in the
sort of intermediate technological society where, right up until fifty
years ago, it was *technically impossible* to run completely amok and
destroy the world, and even now, the checks and balances work fairly
well. Second, we're evolved entities who usually aren't trustworthy, or
knowably trustworthy at any rate, in the absence of checks and balances.

> A single
> sysop is missing that sort of checks and balances. The assumption is
> that we can design it so well that it will automatically check and
> balance. I confess to having a lot of doubt that this can be done.

Humans in general are balanced: balanced against each other; balanced in
terms of internal cognitive ability levels; balanced between nature and
nurture; part of one big Gaussian curve. That balance doesn't apply to the
transhuman spaces.

> > A failure of Friendliness in a transcending seed AI results in a total
> > takeover regardless of what a Friendly AI thinks about the Sysop
> > Scenario. Once an AI has *reached* the Sysop point you're either screwed
> > or saved, so forking off more Sysops after that is a particularly
> > pointless risk.
> But a Sysop does "take over" and govern the entire space. The AIs can
> balance each other out.

I really don't think so. First AI to transcend moves into accelerated
subjective time and wins all the marbles. Unless ve decides not to, in
which case you have the "flipping through the deck" problem.

> I have a large worry with the idea of there
> only being one of them and with it perhaps having too limited a notion
> of what "Friendliness" entails.

Having too limited a notion? Sounds unFriendly to me.

> The question is why I should allow myself to be
> limited by your notion of a Sysop.

Mine? I may not be able to shove off all responsibility onto the
shoulders of an SI but I sure intend to try. Letting Samantha Atkins be
limited by "Eliezer Yudkowsky's notion" (that is, a notion which is unique
to Eliezer Yudkowsky and not an inevitable consequence of panhuman ethics)
sounds unFriendly to me.

> If we decide it is not a good
> solution but the Sysop disagrees, then what?

Then either "we" (who's "we"?) are mistaken, or somebody (me) really
screwed up the definition of Friendliness.

> I can see in theory how such a being could not be in the way but I think
> my notion of that is a bit different than yours.

Well, I've yet to hear a concrete proposal that would result in less
summated suffering than a Sysop Scenario - under *any* definition of

> > Yes, and yes. The risks inherent in material omnipotence are inherent in
> > rapid transcendence and thus inherent in AI. The Sysop Scenario adds
> > nothing to that.
> However, your solution is to make one AI and SI with the right
> moral/ethical grounding to have this power without running amok. What
> of the other billions of beings? Is there an evolutionary path for them
> up to and beyond this level of transcendence (assuming they wish it)?

Yes! But strike "evolutionary" from the record, please.

> What of other beings reaching full transcendence and having to learn
> wisdom and morality along the way? Is there room enough for them to do
> so?


> > And, incidentally, the Sysop Scenario isn't what I'm telling a Friendly
> > seed AI to do. It's what I think a Friendly superintelligence will decide
> > to do. I'm neither taking on the building of such, nor need to be
> > convinced that it's the Only Answer.
> OK. I don't think it will decide quite along the Sysop lines, but you
> could well be right.


> OK. As long as the first Sysop doesn't insist Friendliness is only
> compatible with roughly its own solutions to the very complex questions
> involved. My intuition is that there are many possible solution spaces
> that cannot all be explored by any one SI. Some of them may not even
> seem all that "Friendly" from other particular solution spaces.

Which parts of "Friendliness" are more and less arbitrary is itself part
of the understanding that constitutes a Friendship system. Any
sufficiently arbitrary answer shouldn't be part of Friendliness at all; it
should probably just be delegated back to the volitional decisions of
individuals, or at least be overridable by the decisions of individuals.
Even the primacy of pleasure over pain is subject to the volitional
override of static masochists.

> I guess I have a hard time expecting many people to do this. Or at
> least it is doubtful that they wouldn't choose to upgrade pretty soon.
> So what is the significance of "static"? I think I am missing something
> there.

The significance of "static" is that it's the only part of the Universe
about which we can have meaningful discussions.

> > > For entities in a VR who are playing with
> > > designer universes of simulated beings they experience from inside, is
> > > it really harm that in this universe these simulated beings maim and
> > > kill one another? In other words, does the SysOp prevent real harm or
> > > all appearance of harm? What is and isn't real needs answering also,
> > > obviously.
> >
> > I don't see how this moral issue is created by the Sysop Scenario. It's
> > something that we need to decide, as a fundamental moral issue, no matter
> > which future we walk into.
> OK, but I am exploring what you think the Sysop answer is or should be
> to be compatible with Friendliness.

Will you take "I don't know, I'll ask the Sysop" as a legitimate answer?

> > You can still become a better person, as measured by what you'd do if the
> > Sysop suddenly vanished.
> But will you ever know unless it does?

Sure; have your exoself run a predictive scan on your simulated cortex.
Minds are real in themselves and can be understood in themselves; the
external reality is the expression of it, not the test.

> No. Others can go outside who may have more nefarious motives. Are you
> claiming they would never tire of being bloody tyrants, never feel
> remorse, never seek to undo some part of the painful ugly creation they
> made?

Some would, some wouldn't.

> Without experiencing the consequences, how do beings actually
> learn these things?

I think that maybe one of the underlying disagreements is that we disagree
on how much "real experience" is necessary. My own position is that the
human brain has two settings: "Sympathize, using all available hardwired
neurons," and "Project abstractly, using high-level thoughts." For us,
there's a very sharp border between really experiencing something and
thinking about it abstractly, because we can't do enough abstract thought
to simulate all the pixels in a visual cortex. For us, the behaviors that
we abstractly imagine on hearing the phrase "four-dimensional visual
cortex" will never be as sharp, as real, as the experiences of an entity
with a true 4D visual cortex. But this is a distinction that breaks down
for self-modifying entities, like seed AIs or transcendent humans, who can
abstractly think about every pixel and feature extractor in a 4D visual
cortex, and thus understand every facet of intuition and behavior that
would be exhibited by a being with a true 4D visual cortex, even if he or
she or ve retains their original 3D visual cortex the whole while. A seed
AI or a transcendent human with a 3D cortex can look at a 4D Escher
painting and understand it by virtue of their ability to understand a 4D

So, without experiencing the consequences, beings learn by using their
very vivid imaginations.

> Sure. Make a space (probably VR) where entities can do whatever they
> wish including making their own VR spaces controlled by themselves which
> are as miserable or wonderful as they wish and as their skill and wisdom
> allows. Keep the Sysop as an entity that ensures all conscious beings
> created or who become involved come to no permanent or irreparable
> harm. Otherwise they are free to be as good or horrible to one another
> as they wish. And they are free to not know that they can come to no
> irreparable harm or cause any. Would this be compatible?

OK, but it sounds like you're talking about an "unescapable" Sysop, which I
really thought was your whole point in the first place. I mean, if I
understand this scenario correctly, I can't go Outside for fear that I'll
bring an entity to permanent or irreparable harm.

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
