Re: Evolving minds

From: Eliezer S. Yudkowsky
Date: Sat Nov 18 2000 - 12:34:51 MST

Ben Goertzel wrote:
> Once AI systems are smart enough to restructure all the matter of Earth into
> their own mind-stuff, we won't be ABLE to guide their development via
> intelligent tinkering, one would suspect...

I am not sure I agree with your exact phrasing. I would rephrase as follows:

1) Once an AI system is smart enough to restructure all the matter of the
Solar System into vis own mind-stuff, we will not be able to guide vis
development *if ve doesn't want us to*.

1a) Once AI systems have the physical capability to resist tinkering and to
continue improving themselves, you'd better have all the basics right (see

2) After some point - probably, the point where the AI can predict everything
the human programmers will say - the human programmers will not be able to
tinker with the AI, although the AI may take "what the programmers would
think" or "what an idealized programmer would think" into account when
developing its moral system further.

I would also add:

3) Programmer-independent morality (morality which does not contain quirks or
errors which were unique to the particular human programmers) is a desirable
property of a Sysop.

> So, your intuition is that by carefully guiding our AI systems in the very
> early stages, we can mold the early phases of the next generation of
> intelligence so carefully that later phases will be unlikely to go awry
> I tend to doubt it.

I agree with Ben Goertzel, but only with respect to that exact phrasing. The
core of seed AI is the idea that a system can become intelligent enough to
redesign itself, so that the programmers don't need to get everything exactly
right to design an intelligent mind. The core of Friendly AI is the idea that
a system can be Friendly enough to try to become Friendlier. Friendly AI
does not consist of getting the early systems precisely right; it consists of
getting those early systems *right enough* that the system grows into
Friendliness, convergently rather than divergently.

In practice, what is necessary is that the early Friendly AI make reference to
the intentions of the programmer. Not the specific intentions that have
already been embodied, but "the intentions of the programmer", in general,
including the intentions that the early AI doesn't know about yet.

When the programmer says: "I have this new element to include in the design
of your goal system", the AI needs to think: "Aha! Here's an element of
what-should-be-my-design that I didn't know about before!", not "He wants to
give me a new goal system, which leads to suboptimal results from the
perspective of my current goal system... I'd better resist."

In "Friendly AI", I'm working on describing the specific cognitive imagery
necessary for all that to take place.

When the AI encounters a new moral decision which is ambiguous under its
existing goal material, or when it encounters a design decision with respect
to the goal system itself, it needs to think "What would my programmers want
me to do?", come up with the correct answer (by means which may include just
*asking* the programmers), and implement it.
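
As a rough illustration of the loop described above (and nothing more than
that), here is a minimal Python sketch. Everything in it is an assumption made
for the example: the GoalSystem class, the "known" table, the ask_programmers
callback and the "defer" fallback are hypothetical names, and a lookup table is
a stand-in for real goal material, not a proposed design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class GoalSystem:
    """Toy stand-in for goal material: situation -> endorsed action."""
    known: Dict[str, str] = field(default_factory=dict)

    def decide(
        self,
        situation: str,
        ask_programmers: Callable[[str], Optional[str]],
    ) -> str:
        # Unambiguous under the existing goal material: act on it.
        if situation in self.known:
            return self.known[situation]
        # Ambiguous: find out what the programmers would want,
        # here simply by asking them (hypothetical callback).
        answer = ask_programmers(situation)
        if answer is None:
            # No answer available: hold off rather than guess.
            return "defer"
        # Treat the answer as a newly learned piece of what the
        # design should be, not as interference to resist.
        self.known[situation] = answer
        return answer


if __name__ == "__main__":
    goals = GoalSystem(known={"user asks for help": "help"})
    replies = {"user asks for something harmful": "decline"}
    print(goals.decide("user asks for something harmful", replies.get))
```

The point of the sketch is only the control flow: a new element from the
programmers is stored as part of the goal material instead of being scored
against the current goals and rejected.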

Eventually, the AI needs to grok the complexity which underlies those
judgement calls by the programmers, and internalize that complexity - both the
process we use to make moral decisions about what constitutes Friendliness,
and also the process we use to define which kinds of moral reasoning are valid
and extend our philosophy to cover new situations.

> Based on practical experience, it seems to me that even the AI systems we
> now are experimenting with at Webmind Inc. -- which are pretty primitive and
> use only about half of the AI code we've written -- are fucking HARD to
> control. We're already using evolutionary means to adapt system
> parameters... as a complement to, not a substitute for, experimental
> re-engineering of various components, of course.

Well, without more detailed knowledge of Webmind, I can't be sure; however, I
don't *think* you're encountering challenges of the same underlying class as
the challenges that would be involved in Friendly AI. But I don't know. I'm
not a Webminder.

Anyway, I know you don't think that it's possible to do all the work on
Friendly AI in advance. I agree. In the course of building a Friendly AI,
you or I (or both) will probably learn far more about Friendly AI than we
started with. However, there is also work on Friendly AI that *can* be
done in advance. There are classes of avoidable mistakes. There's
terminology needed to know what you're seeing when you see it. There are
things you need to do in the first versions to pave the way for later
versions. That's what's going into "Friendly AI".

A year ago, I believed there was nothing you or I or anyone could or should
know about Friendly AI in advance. I now recognize that this belief was quite

> So the idea that we can proceed with more advanced AI systems based
> primarily on conscious human engineering rather than evolutionary
> programming, seems pretty unlikely to be. It directly goes against the
> complex, self-organizing, (partially) chaotic nature of intelligent
> systems.

If you will pardon a bit of mystical terminology - there is a balance between
order and chaos in living systems.

Again, this is a statement that needs to be split up. As intelligence
advances, moved by evolutionary programming or self-modification, the
underlying mechanisms of that intelligence may become steadily more
incomprehensible to the human programmers. As the intelligence's rules of
reasoning and models of reality advance, the high-level behavior of the
intelligence may become steadily more reliable, comprehensible, easy to
summarize to a human. As the intelligence advances beyond the human, the
decisions of that intelligence may become impossible to predict, because the
intelligence is smarter than we are.

There is still a discipline of seed AI in Artificial Intelligence, and a
discipline of "seed morality" in Friendly AI.

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
