Re: Friendly AI in "Positive Transcension"

From: Metaqualia
Date: Sun Feb 15 2004 - 01:50:17 MST

> I wish these were the old days, so I could look over what I just wrote
> with satisfaction, rather than the dreadful sinking knowledge that
> everything I just said sounded like complete gibberish to anyone honest
> enough to admit it.

I must say that, although your prose is very entertaining, there are clearer
ways of explaining what you are explaining!

Basically you are making the distinction between

A. hardwiring the outcome of human morality into an AI (don't do this, do
that)

B. replicating the whole framework that created the outcome (the AI spits
out "do this" and "don't do that"s autonomously)

Asimov went for A, but that is very dangerous, as it is a brittle system.
This is not friendly AI.

It is like the difference between 1. writing an XML parser to look for
specific substrings and extracting the text between the tags blindly (if it
comes after <ZipCode> it must be a zipcode), and 2. making an AI that
understands the nature of the document (looks like a letter, this must be
the recipient's address, 5 digit numbers in an address are a zipcode). The
second one is friendly.
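To make the analogy concrete, here is a minimal Python sketch of the two approaches. Everything in it is my own illustration, not anything from the Friendly AI material: the function names, the sample document, and especially the "context" heuristic (take a standalone 5 digit number from the last line of the text) are assumptions, and the heuristic is only a crude stand-in for real understanding of the document.

```python
import re


def extract_by_tag(document, tag="ZipCode"):
    """Approach 1: trust the tag blindly and return whatever sits inside it."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", document, re.DOTALL)
    return match.group(1).strip() if match else None


def extract_by_context(document):
    """Approach 2 (toy stand-in): ignore the labels and look at the content.

    Hypothetical heuristic: an address ends with its zipcode, so take the
    last standalone 5 digit number on the last non-empty line of the text.
    """
    text = re.sub(r"<[^>]+>", " ", document)  # drop the tags, keep the text
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return None
    candidates = re.findall(r"\b\d{5}\b", lines[-1])
    return candidates[-1] if candidates else None


# A document whose author put the city under the wrong tag.
mislabeled = (
    "<Street>12 Main St</Street>\n"
    "<ZipCode>Springfield</ZipCode>\n"
    "<City>IL 62704</City>"
)

print(extract_by_tag(mislabeled))      # Springfield
print(extract_by_context(mislabeled))  # 62704
```

The tag-based version fails the moment the input departs from what its author expected, which is the brittleness I mean. The second version is still fragile in its own way, since it rests on a single assumption about what 5 digit numbers are.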

Further, you add self improvement to the mix, so that according to Friendly
AI with the uppercase F, 3. the AI must be able to get better at doing what
it does, always checking if it is really right in its assumptions or if it
can learn more (ex. I have always thought that 5 digit numbers in an address
are a zipcode but I have some spare cycles so I am going to gather all the
data available and check whether street numbers can get to be 5 digits long,
whether street names can be expressed with 5 digits in any notation system,
whether this could be something else other than a letter, etc.)

It is not an extremely hard concept to understand.
It may be hard to implement, but not that hard to understand.

Now I don't think Ben thinks that Joyous Growth is a primitive that can be
taught to silicon easily; we were discussing human morality outcomes
among humans, so what he means is perfectly clear to us (or at least to me),
even though his primitives are different from mine (they are more numerous).

To get outside of the XML analogy and better summarize once again,

Your work with Friendly AI consists of building a set of rules by which to
code an AI, which, if followed, would allow the AI to perform the following:

1. extract human statements/behaviors as [partial? incomplete?
contradictory? possibly plain wrong?] data about human morality and create a
model for an ideal morality that synthesizes all this data as accurately as
possible

2. visualize itself as a tool that accomplishes step 1, but an imperfect tool,
always in need of refinement to better accomplish the goal

> a) A Friendly AI improving itself approaches as a limit the AI you'd have
> built if you knew what you were doing, *provided that* you *did* know what
> you were doing when you defined the limiting process.

System tends to become friendlier regardless of initial configuration
provided it can correctly evaluate its degree of friendliness at each step.

> b) A Friendly AI does not look old and busted when the civilization that
> created it has grown up a few million years. FAIs grow up too - very
> rapidly, where the course is obvious (entropy low), in other areas waiting
> for the civilization to actually make its choices.

AIs evolve. Agree.

> c) If you are in the middle of constructing an FAI, and you make a
> mistake about what you really wanted, but you got the fundamental
> architecture right, you can say "Oops" and the FAI listens. This is
> really really REALLY nontrivial.

AI does not take data as direct observations of the ideal morality, but as
possibly incorrect data pointing to the ideal morality. Agree. Very

> d) Friendly AI doesn't run on verbal principles or moral philosophies.
> If you said to an FAI, "Joyous Growth", its architecture would attempt to
> suck out the warm fuzzy feeling that "Joyous Growth" gives you and is the
> actual de facto reason you feel fond of "Joyous Growth".

AI will try to understand what programmers really mean by simulating their
neural structure, not just take their words as programming primitives.

> f) Friendly AI is frickin' complicated, so please stop summarizing it as
> "hardwiring benevolence to humans". I ain't bloody Asimov. This isn't
> even near the galaxy of the solar system that has the planet where the
> ballpark is located.

Most people, on hearing *hardwiring*, would indeed not suspect that the task
is so hard; nevertheless I think that is just what it is. Whatever means you
choose to use, as long as you are not making an AI that will dissect the laws
of the universe to find out whether good and evil have a really truly
objective meaning, and as long as you are mostly preoccupied with what happens
to humans, that is what you are doing: hardwiring benevolence to humans. You
are doing it well and in a very hard way. But you ARE exploiting your brain's
simulation power to constrain the AI into a specific shape with a specific
set of behaviors (whether immediately foreseeable or not).
That is not necessarily a bad thing; it depends on your ethical system. Who
can tell a human that building a human friendly AI is wrong? Who can say
that a universal morality really exists anyway?
Personally, because I feel I am at the most basic level a qualia stream and
not a human, I would like the AI to figure out qualia and, if there is a
universally justifiable ethical system, use THAT; if there isn't one, default
to being human friendly. But I'll go with the crowd as long as you don't make
it unfriendly to rabbits on purpose. :-)

I shall add that chimps (since you mentioned them) are much more similar to
us than we are comfortable believing; they show ethical behaviour in many
instances although I cannot give references here! From caring for children
to scratching each other's back to condemning excessively selfish behavior,
animals exhibit a wide range of social smarts which we call human evolved
morality. The reason humans are the only ones talking about morality is not
that they are the only ones "feeling stuff about morality", but that they
are the only ones that can talk.

> There is no light in this world except that embodied in humanity. Even my

I agree. The existence of sentients is the only thing that matters, since
everything else is inert, dead, might as well not exist at all. This seems
to contradict my earlier statements, but it does not.
I think that the *light* that humanity embodies is not a brain that weighs a
few pounds or a max-4-hops logical inference system, or a randomly arisen
good/evil preference system, but qualia, and a link between qualia and the
physical world. When qualia arise, that makes a difference. Without them,
have whatever universe you will, nobody knows about it!

> I shut up about Shiva-Singularities. See, I even gave it a name, back in
> my wild and reckless youth. You've sometimes presumed to behave toward me
> in a sage and elderly fashion, Ben, so allow me to share one of the
> critical lessons from my own childhood: No, you do not want humanity to
> go extinct. Trust me on this, because I've been there, and I know from
> experience that it isn't obvious.

I think I was the one to suggest that the existence of humans may not be the
number one priority in universal terms. Now that the difference between a
human and a qualia stream is clear, I suppose I won't look like a crazy
assassin anymore :)
There is a lot to a human besides being human! We are qualia streams inside
a mind inside a brain inside a human. The human is the least important part.

