From: Thomas McCabe (email@example.com)
Date: Tue Jan 29 2008 - 13:39:19 MST
Feasibility of the concept
* Ethics are subjective, not objective: therefore no truly
Friendly AI can be built.
o Rebuttal synopsis: If ethics are subjective, we can still
build a Friendly AI: we just need to program in our collective
(human-derived) morality, not some external objective morality.
* The idea of a hostile AI is anthropomorphic.
There is no reason to assume that an AI would be actively hostile, no.
However, as AIs can become very powerful, their indifference (if they
haven't purposefully been programmed to be Friendly, that is) becomes
dangerous in itself. Humans are not actively hostile towards the
animals living in a forest when they burn down the forest and build
luxury housing where it once stood. Or as Eliezer Yudkowsky put it:
the AI does not hate you, nor does it love you, but you are made out
of atoms which it can use for something else.
The vast majority of the time, if someone dies, it's not because of
murder - it's because of something accidental. Some random error in
DNA replication caused cancer, or some clump of fatty acid caused a
heart attack. Non-malevolent forces have killed more people than every
genocide in history put together. Even during WWII, the single largest
mass-killing event in human history, more people died of "natural
causes" than were killed by government armies. The same principle
applies on a smaller scale; most of the daily annoyances we live with
aren't caused by deliberate malice.
Even if an AI were not a threat to the very survival of humanity, it could
still threaten our other values. Even among humans, there exist radical
philosophers whose ideas of a perfect society are repulsive to the
vast majority of the populace. Even an AI that was built to care about
many of the things humans value could ignore some values that are
so taken for granted that they are never programmed into it. This
could produce a society we considered very repulsive, even though our
survival was never at stake.
* "Friendliness" is too vaguely defined.
This is true, because Friendly AI is currently an open research
subject. It's not just that we don't know how it should be implemented,
it's that we don't even know what exactly should be implemented. If
anything, this is a reason to spend more resources studying the problem.
Some informal proposals for defining Friendliness do exist. None of
these are meant to be conclusive - they are open to criticism and are
subject to change as new information is gathered. The one that
currently seems most promising is called Coherent Extrapolated
Volition. In the CEV proposal, an AI will be built (or, to be exact, a
proto-AI will be built to program another) to extrapolate what the
ultimate desires of all the humans in the world would be if those
humans knew everything a superintelligent being could potentially
know; could think faster and smarter; were more like they wanted to be
(more altruistic, more hard-working, whatever your ideal self is);
had lived with other humans for a longer time; and had mainly those
parts of themselves taken into account that they wanted to be taken
into account. The ultimate desire - the volition - of everyone is
extrapolated, with the AI then beginning to direct humanity towards a
future where everyone's volitions are fulfilled in the best manner
possible. The desirability of the different futures is weighted by the
strength of humanity's desire - a smaller group of people with a very
intense desire to see something happen may "overrule" a larger group
who'd slightly prefer the opposite alternative but don't really care
all that much either way. Humanity is not instantly "upgraded" to the
ideal state, but instead gradually directed towards it.
CEV avoids the problem of its programmers having to define the wanted
values exactly, as it draws them directly out of the minds of people.
Likewise it avoids the problem of confusing ends with means, as it'll
explicitly model society's development and the development of
different desires as well. Everybody who thinks their favorite
political model happens to objectively be the best in the world for
everyone should be happy to implement CEV - if it really turns out
that it is the best one in the world, CEV will end up implementing it.
(Likewise, if it is the best for humanity that an AI stays mostly out
of its affairs, that will happen as well.) A perfect implementation of
CEV is unbiased in the sense that it will produce the same kind of
world regardless of who builds it, and regardless of what their
ideology happens to be - assuming the builders are intelligent enough
to avoid including their own empirical beliefs (aside from the bare
minimum required for the mind to function) in the model, and trust
that if they are correct, the AI will figure them out on its own.
* Mainstream researchers don't consider Friendliness an issue.
o Rebuttal synopsis: Mainstream researchers don't have a
very good record of carefully thinking out the implications of future
technologies. Even during the Manhattan Project, few of the scientists
took the time to think in detail about the havoc the bomb would
wreak twenty years down the road. FAIs are much more difficult to
understand than atomic bombs, and so if anything, the problem will be worse.
* Human morals/ethics contradict each other, even within individuals.
o Rebuttal synopsis: We, as humans, have a common enough
morality to build a system of laws. We share almost all of our brain
hardware, and we all have most of the same basic drives from
evolutionary psychology. In fact, within any given society, the moral
common ground usually far exceeds the variance between any two people.
* Most humans are rotten bastards and so basing an FAI morality
off of human morality is a bad idea anyway.
o Rebuttal synopsis: Eli listed this as a real possibility
in CEV, so we'll need a serious, possibly technical answer. - Tom
* The best way to make us happy would be to constantly stimulate
our pleasure centers, turning us into nothing but experiencers of
o Rebuttal synopsis: Most people would find this morally
objectionable, and a CEV or CEV-like system would act on our
objections and prevent this from happening.