RE: SIAI's flawed friendliness analysis

From: Bill Hibbard
Date: Mon May 26 2003 - 15:43:42 MDT

This discussion has split into many threads, and I'll bring
them together into this single response. Ben's comments are
a good starting point for this, and I'll address all the
recent questions.

On Fri, 23 May 2003, Ben Goertzel wrote:

> There are a lot of good points and interesting issues mixed up here, but I
> think the most key point is the division between
> -- those who believe a hard takeoff is reasonably likely, based on a radical
> insight in AI design coupled with a favorable trajectory of self-improvement
> of a particular AI system
> -- those who believe in a soft takeoff, in which true AI is approached
> gradually [in which case government regulation, careful peer review and so
> forth are potentially relevant]
> The soft takeoff brings with it many obvious possibilities for safeguarding,
> which are not offered in the hard takeoff scenario. These possibilities are
> the ones Bill Hibbard is exploring, I think. A lot of what SIAI is saying
> is more relevant to the hard takeoff scenario, on the other hand.
> My own projection is a semi-hard takeoff, which doesn't really bring much
> reassurance...

I think we'll eventually get to a time (the singularity) when
intelligence increases very quickly to very high levels. But I
think it will take a long time to get there, during a sort of
soft takeoff. In particular it will be years or even decades
from the first intelligent machines until the true singularity,
and it could be decades from now until the first intelligent
machines. I agree with Donald Norman that people tend to
overestimate the short-term progress of technological change,
and underestimate the long-term effects.

I think real intelligence is decades away because no current
research is making any real progress on the grounding problem,
which is the problem of grounding symbols in sensory
experience and grounding reasoning and planning in learning.
That is, you cannot reason intelligently about horses unless
the word horse is connected to sight, sound, smell and touch
experiences with horses. Solving the grounding problem will
require much faster computers than are being used for current
AI research.

I think there will be years or decades from the first real
machine intelligence until the singularity because of the
likelihood of difficult technical problems even after the
first signs of machine intelligence, and because of the
amount of learning needed for intelligence to achieve its true
potential. Applying intelligence effectively (we might call
this wisdom) requires many fine value judgements that can
only be learned from experience. Humans require decades of
learning for their intelligence to mature. A super-intelligent
machine may learn faster, but it may also need a lot more
experience for its super-intelligence to mature (just as
higher animals generally take longer to mature than lower
animals).

There is some chance that the first intelligent machines will
be hidden from the public. But probably not for long, because
they will be built in a wealthy and open society like the U.S.,
with lots of whistleblowers and where exciting news has a way
of getting out. Furthermore, a machine designed as I advocate,
with values for human happiness, or a machine designed as the
SIAI advocates, with a friendliness super-goal, would create
the singularity openly rather than hiding it from humans. It
is hard to imagine a safe singularity created in secret.

There are three broad public policy choices for AI:

1. Prohibit it, as advocated by Bill Joy in his April 2000 Wired
article "Why the Future Doesn't Need Us".

2. Allow it without regulation, as advocated by the SIAI and
most members of the SL4 mailing list.

3. Allow it but regulate it, as I advocate.

I think prohibiting AI is technically impossible and politically
unlikely, and unregulated AI is politically impossible and will
almost certainly be unsafe for humans. So we have no alternative
but to find our way through the difficulties of regulating AI.
In more detail:

1. Prohibit AI.

In his article, Bill Joy is pessimistic about prohibiting AI
because people will want the benefits. It will be politically
difficult to decide the right point to stop a technology whose
development continually creates wealth and relieves people of
the need to work.

As several people have pointed out, it will be technically
impossible to prevent people from building outlaw AIs,
especially as technology matures. The only way to do it
would be to stop technological progress worldwide, which
won't happen.

2. Allow AI without regulation.

Ben's question about timing is relevant here. If you think
that the singularity will happen so quickly that the public
and the government won't have time to act to control the
singularity once they realize that machines are becoming
intelligent, then you don't have to worry about regulation
because it will be too late.

If the public and the government have enough time to react,
they will. People have been well primed for the dangers of
AI by science fiction books and movies. When machines start
surprising them with their intelligence, many people will be
frightened and then politicians will get excited. They will
be no more likely to allow unregulated AI than they are to
allow unregulated nuclear power. The only question is whether
they will try to prohibit or regulate AI.

Wealthy and powerful institutions will have motives to build
unsafe AIs. Even generally well-meaning institutions may
fatally compromise safety for mildly selfish motives. Without
broad public insistence on aggressive safety regulation, one
of these unsafe AIs will likely be the seed for the
singularity.

3. Allow AI with regulation.

Ben's question about timing is relevant here too. The need
and political drive for regulation won't be serious until
machines start exhibiting real intelligence, and that is
decades away. Even if you disagree about the timing, it is
still true that regulation won't interfere with current
research until some project achieves an AI breakthrough. At
the current stage of development, with lots of experiments
but nothing approaching real intelligence, regulation would
be counter-productive.

Like so many things in politics, regulation is the best
choice among a set of bad alternatives. Here is a list of
objections, with my answers:

  a. Regulation cannot work because no one can understand my
  designs. Government employees are too stupid to understand
  them.

Government employees include lots of very smart people, like
those who worked on the Manhattan Project and those who are
finding cures for diseases. While it is healthy for citizens
to be skeptical of politicians and government, thinking that
all politicians and government employees are stupid is just
an ignorant prejudice.

The regulators will understand designs because the burden
will be on the designers to satisfy regulators (many of whom
will be very smart) of the safety of their designs, as with
any dangerous technology.

Even if some smart designers don't want to cooperate with
regulators, other designers just as smart will cooperate.

  b. Regulation will hobble cooperating projects, enabling
  non-cooperating unsafe AI projects to create the singularity
  first.

Non-cooperating projects will be hobbled by the need to hide
their resource use (large computers, smart designers, network
access, etc).

As long as regulation is aggressively enforced, major
corporations and government agencies will cooperate and
bring their huge resources to the effort for safe AI.

The government will have access to very smart people who can
help more than hinder the designers they are inspecting.

Given the importance of AI, it is plausible that the U.S.
government itself will create a project like the Manhattan
Project for developing safe AI, with resources way beyond
those available to non-cooperating groups. Currently, the
U.S. GDP is about $10 trillion, the federal government
budget is about $2.3 trillion, the defense budget is $0.4
trillion, and global spending on information technology is
$3 trillion. When the public sees intelligent machines and
starts asking their elected representatives to do something
about it, and those representatives hear from experts
about the dangers of the singularity, it is easy to imagine
a federal safe AI project with a budget on the scale of
these numbers.
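
As a rough check on the scale involved, the figures quoted above
can be expressed as fractions of U.S. GDP. This is only a sketch
of the arithmetic: the dictionary and function names are
illustrative, the dollar amounts are the ones given in this
message, and nothing here estimates what an actual project
budget would be.

```python
# Figures quoted in the message above, in trillions of U.S. dollars (2003).
FIGURES = {
    "U.S. GDP": 10.0,
    "federal budget": 2.3,
    "defense budget": 0.4,
    "global IT spending": 3.0,
}


def share_of_gdp(name):
    """Return the named figure as a fraction of U.S. GDP."""
    return FIGURES[name] / FIGURES["U.S. GDP"]


if __name__ == "__main__":
    for name, amount in FIGURES.items():
        # Print each figure alongside its size relative to U.S. GDP.
        print(f"{name}: ${amount:.1f} trillion, {share_of_gdp(name):.0%} of GDP")
```

Note that global IT spending is a worldwide figure, so its ratio
to U.S. GDP is a comparison of scale rather than a true share.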

  c. A non-cooperating project may destroy the world by
  using AI to create a nano-technology "grey goo" attack.

This is possible. But even without AI, there may be a world
destroying attack using nano-technology or genetically
engineered micro-organisms. My judgement is that the
probability of unsafe AI from a lack of regulation (I think
this is close to 1.0) is greater than the marginal increase
in the probability of a nano-technology attack caused by
regulation of AI (as explained in my answer to the previous
objection, active government regulation won't necessarily
slow safe AI down relative to unsafe AI).

  d. Even if AI is regulated in most countries, there may
  be others where it is not.

This is a disturbing problem. However, the non-democracies
are gradually disappearing, and the democracies are
gradually learning to work together. Hopefully the world
will be more cooperative by the time the singularity
arrives.

Democratic countries are wealthier than non-democracies,
so may create a safe singularity before an unsafe
singularity can be created elsewhere.

  e. We can't trust an AI because we can't know what it's
  thinking. An AI will continue to develop and design
  other AIs that are beyond the ability of human
  regulators to understand.

There is no way to trace or predict the detailed thoughts
of an AI, but we can make the general prediction that it
will try to satisfy its reinforcement values. The safety
of an AI is primarily determined by its values (its
learning and simulation algorithms also need to be
accurate).

I would trust an AI designed by another safe AI, with
reinforcement values for human happiness. It may decide
that we would be happier if its design was checked by
another independently-designed safe AI, and so seek such
peer review.

  f. The intelligence of AIs will be limited by the
  ability of human regulators to understand their designs.

This is related to the previous objection. Once we have
safe AIs, we can trust them to design other safe AIs with
greater intelligence, and to verify the safety of each
other's designs.

** There are other objections to the specific form of
regulation that I advocate, rather than regulation in
general:

  g. You advocate regulations on reinforcement values, but
  some designs don't rely on them.

Based on knowledge of human brains, and on the Solomonoff
Induction model of intelligence, I think the essence of
intelligence is reinforcement learning. Reinforcement
learning is very hard to do effectively in general situations
(like those faced by humans), which leads to all sorts of
design optimizations (e.g., human consciousness) that don't
look much like reinforcement learning. But at base they are
all trying to learn behaviors for satisfying some values.
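
The idea of learning behaviors that satisfy some values can be
illustrated with a very small reinforcement learner. This is a
minimal sketch under stated assumptions, not a design from this
message: the three actions, the reward means, and the function
names are all hypothetical, and happiness_signal is a stand-in
for whatever would actually measure the values being reinforced.

```python
import random


def happiness_signal(action, rng):
    """Hypothetical reward source: a noisy value in [0, 1] for each action."""
    mean = {"help": 0.8, "ignore": 0.4, "annoy": 0.1}[action]
    return min(1.0, max(0.0, rng.gauss(mean, 0.1)))


def learn_behavior(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner that estimates the average reward of each action."""
    rng = random.Random(seed)
    actions = ["help", "ignore", "annoy"]
    estimates = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.choice(actions)
        else:
            # Exploit: pick the action with the highest estimated reward.
            action = max(actions, key=estimates.get)
        reward = happiness_signal(action, rng)
        counts[action] += 1
        # Incremental update of the running average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


if __name__ == "__main__":
    learned = learn_behavior()
    print("preferred behavior:", max(learned, key=learned.get))
```

The learner is never told which behavior is correct. It only
receives the reward, and its preferences follow from whatever
that reward measures, which is why the choice of values matters
more than the details of the learning algorithm.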

  h. An AI based on reinforcement values for human happiness
  can't be any more intelligent than humans.

Values and intelligence are independent. As long as there
is no fixed-length algorithm that optimally satisfies the
values (i.e., values are not just winning at tic-tac-toe or
chess) there is no limit to how much intelligence can be
brought to bear to satisfying the values. In particular,
values for human happiness can drive unlimited intelligence,
given the insatiable nature of human aspirations.

  i. Reinforcement values for human happiness are too
  specific to humans. An AI should have universal altruism.

Universally altruistic values can only be defined in terms
of symbols (i.e., statements in human language) which must
be grounded in sensory experience before they have real
meaning. An AI will have grounding for language only after
it has done a lot of reinforcement learning, but values
are necessary for such learning. The third point of my
critique of the SIAI friendliness analysis was the lack of
values to reinforce its learning until the meaning of its
friendliness supergoal could be learned.

Reinforcement values for human happiness can be implemented
using current or near-future machine learning technology
for recognizing emotions in human facial expressions,
voices and body language. These values have grounded
definitions.

I think that a number of current AI efforts underestimate
the importance of solving the grounding problem. This
applies not only to grounding symbols in sensory experience,
but grounding reason and planning in learning. Speculation
about AI values that can only be expressed in language also
fails to appreciate the grounding problem.

There are always trade-offs, with winners and losers, that
must be faced by any set of values, even universal altruism.
That is, in this world there is no behavior that always
gives everyone what they want. I think it is likely that
"universal altruism" is one of those language constructs
that has no realization (like "the set of all sets that do
not contain themselves").

Any set of values that tries to protect interests broader
than human welfare may motivate an AI behavior that has
negative consequences for humans. In the extreme, the AI
may destroy humanity because of humanity's innate xenophobia or
violence. Some people think this may be the right thing
to do, but I cannot advocate any AI with such a possible
consequence. I only trust values that are grounded in human
welfare, as expressed by human happiness.

Using human happiness for AI reinforcement values equates
AI values with human values, and keeps humans "in the loop"
of AI thoughts. Human values do gradually evolve, as for
example xenophobia declines (it's bad, but not as bad as it
used to be). My own hope is that super-intelligent AIs with
reinforcement values for human happiness will accelerate
the pace of evolution of human values. For example, the AI
will learn that tolerant people are happier than intolerant
people, and promote tolerance in human society.

** Summary

I am sure some people won't accept my answers to these
objections, and will be skeptical of regulation. I admit that
regulation is not guaranteed to produce a safe singularity.
But I think the alternatives are worse. In my opinion,
prohibiting AI is impossible, and unregulated AI makes an
unsafe singularity almost certain.

Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706 608-263-4427 fax: 608-263-6738
