Re: Why FAI Theory is both Necessary and Hard (was Re: SIAI's flawed friendliness analysis)

From: Cliff Stabbert
Date: Sun May 11 2003 - 07:58:18 MDT

Eliezer wrote:

ESY> The human species is fifty thousand years old. We've come a long
ESY> way during that time. We've come a long way during the last
ESY> century. At the start of the twentieth century, neither women
ESY> nor blacks could vote in the US. The twentieth century was, by
ESY> the standards of today's world, barbaric. Are we, ourselves,
ESY> raving barbarians by any future perspective? Almost certainly.
ESY> For one civilization to leave a permanent impress on the next
ESY> billion years is no more fair than for one individual to do so,
ESY> and just as deadly. Individuals grow in their moralities, as do
ESY> civilizations. That fundamental capability is what needs to be
ESY> transferred into a Friendly AI, not a morality frozen in time. A
ESY> human is capable of understanding the concept of moral
ESY> improvement; a true FAI, a real mind in the humane frame of
ESY> reference, must be able to do the same, or something vital is
ESY> missing.

ESY> If you want to make an FAI that is capable of moral improvement,
ESY> constructed via a fair, species-representative method, then the
ESY> only question is whether you have the knowledge to do that - to
ESY> build an AI that fully understands the concept of moral failure
ESY> and moral improvement as well as we do ourselves, and that
ESY> symbolizes a species rather than any one individual or
ESY> civilization.

ESY> If you commit to that standard, then there are no conflicts of
ESY> interest to fight over, no individually controllable variables
ESY> that knowably correlate with the outcome in a way that creates
ESY> conflicts of interest.

ESY> What is dangerous is that someone who believes that "AIs just do
ESY> what they're told" will think that the big issue is who gets to
ESY> tell AIs what to do. Such people will not, of course, succeed in
ESY> taking over the world; I find it extremely implausible that any
ESY> human expressing such a goal has the depth of understanding
ESY> required to build anything that is not a thermostat AI. The
ESY> problem is that these people might succeed in destroying the
ESY> world, given enough computing power to brute-force AI with no
ESY> real understanding of it.

ESY> Anyone fighting over what values the AI ought to have is simply
ESY> fighting over who gets to commit suicide. If you know how to give
ESY> an AI any set of values, you know how to give it a humanly
ESY> representative set of values. It is really not that hard to come
ESY> up with fair strategies for any given model of FAI. Coming up
ESY> with an FAI model that works is very hard.

ESY> It is not a trivial thing, to create a mind that embodies the
ESY> full human understanding of morality. There is a high and
ESY> beautiful sorcery to it. I find it hard to believe that any human
ESY> truly capable of learning and understanding that art would use it
ESY> to do something so small and mean. And anyone else would be
ESY> effectively committing suicide, whether they realized it or not,
ESY> because the AI would not hear what they thought they had said.

I think these six paragraphs could be productively mined for a
superb "Why We Need FAI Theory" / "Why FAI is Hard" intro, something
I've long felt was missing from SIAI's materials.

Not to say I agree with every detail of the above -- I think Ben
raises some valid questions re para 4. For me, this sentence in para
6 jumped out:

ESY> I find it hard to believe that any human truly capable of
ESY> learning and understanding that art would use it to do something
ESY> so small and mean.

You "find it hard to believe"? _I_ find it hard to believe you would
use such a phrase casually or unintentionally, without awareness
of the implications. *Especially* when such a statement is made about
a probability of the form P(small & mean | capable)...

