Re: SIAI's flawed friendliness analysis

From: Bill Hibbard
Date: Fri May 30 2003 - 06:27:42 MDT

On Thu, 29 May 2003, Eliezer S. Yudkowsky wrote:

> "Happiness in human facial expressions, voices and body language, as
> trained by human behavior experts".
> Not only does this one get satisfied by euphoride, it gets satisfied by
> quintillions of tiny little micromachined mannequins. Of course, it will
> appear to work for as long as the AI does not have the physical ability to
> replace humans with tiny little mannequins, or for as long as the AI
> calculates it cannot win such a battle once begun. A nice, invisible,
> silent kill.

The essence of intelligence is a simulation model of the world
that is used to predict the long-term effect of behavior on
values, and hence solve the credit assignment problem for
reinforcement learning. A tight stimulus-response loop
satisfying immediate values has nothing to do with intelligence.
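
The contrast above can be illustrated with a minimal sketch. It is not
taken from any real system: the toy world, its state and action names,
the rewards, and the discount factor are all assumptions invented for
illustration. One agent reacts to immediate reward only; the other
simulates a model of the world forward to estimate long-term value.

```python
# Toy world model: state -> action -> (next_state, immediate_reward).
# All states, actions and numbers are hypothetical, chosen for illustration.
MODEL = {
    "start": {"shortcut": ("dead_end", 1.0), "invest": ("middle", 0.0)},
    "dead_end": {"stay": ("dead_end", 0.0)},
    "middle": {"continue": ("goal", 5.0)},
    "goal": {"stay": ("goal", 0.0)},
}


def greedy_action(state):
    """Stimulus-response: pick the action with the best immediate reward."""
    actions = MODEL[state]
    return max(actions, key=lambda a: actions[a][1])


def simulated_return(state, horizon, discount=0.9):
    """Best discounted return reachable from `state` within `horizon` steps,
    found by simulating the model forward."""
    if horizon == 0:
        return 0.0
    return max(
        reward + discount * simulated_return(nxt, horizon - 1, discount)
        for nxt, reward in MODEL[state].values()
    )


def planned_action(state, horizon=3, discount=0.9):
    """Pick the action whose simulated long-term return is highest."""
    actions = MODEL[state]
    return max(
        actions,
        key=lambda a: actions[a][1]
        + discount * simulated_return(actions[a][0], horizon - 1, discount),
    )


if __name__ == "__main__":
    print("greedy: ", greedy_action("start"))
    print("planned:", planned_action("start"))
```

In this toy world the greedy agent takes the small immediate reward and
gets stuck, while the planning agent accepts zero reward now because its
simulation credits that choice with the larger reward that follows.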

An intelligent mind will develop a model of the world that
equates human happiness with a loving family life, adequate
food and shelter, physical exercise, freedom, a meaningful
vocation, friends, etc. And it will equate human unhappiness
with abusive relations, loneliness, homelessness, hunger, lack
of freedom, poor health, drug addiction, etc. Its behavior
will be based on this model, trying to promote the long-term
happiness of humans.

Human babies love their mothers based on simple values about
touch, warmth, milk, smiles and sounds. But as the baby's
mind learns, those simple values get connected to a rich set
of values about the mother, via a simulation model of the
mother and surroundings. This elaboration of simple values
will happen in any truly intelligent AI. I can't speak for
others, but what rocks my boat is the sound of happiness in
my wife's voice - a simple, grounded value. Not for a second
does my model of the world suggest giving her drugs or
replacing her with a doll and a tape recording.

Your argument can be applied against any grounded definition
of values. Hence the SIAI guidelines define a supergoal that
is not grounded, and hence allows whatever interpretation an
AI designer wants to apply. Similarly, various other SIAI
recommendations use lots of value words without defining them.
The lack of grounding for value words in safe AI guidelines is
intended to protect against the imagined dangers of a
non-intelligent stimulus-response loop satisfying immediate
values, but has the actual effect of providing a loophole to
those with motives to circumvent safe AI guidelines.

> There is no divine right of democracy; it does not confer infallibility.
> What it does confer is faith and the illusion of infallibility. Congress
> is not capable of understanding how little it knows, which is what makes
> it dangerous. Democracy has known bugs; and those bugs, applied to
> Singularity scenarios, result in predictable kills - one of which you
> have just demonstrated.

Thanks for being honest about this. Everyone will have to decide
for themselves whether they trust their elected representatives
or the SIAI.

> Incidentally, Hibbard, if I were given to trying to pass regulations,
> which I'm not, I'd prohibit the use of your reinforcement architecture,
> which is, of course, invariably fatal...

Prohibiting reinforcement learning would ensure safety by
eliminating intelligence.

Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706 608-263-4427 fax: 608-263-6738
