Re: Flawed Risk Analysis (was Re: SIAI's flawed friendliness analysis)

From: Bill Hibbard
Date: Tue May 20 2003 - 15:20:38 MDT

On Mon, 19 May 2003, Samantha wrote:

> On Sunday 18 May 2003 02:31 pm, Bill Hibbard wrote:
> >
> > We don't have a way to inspect human brains in the way we will
> > inspect AI brains. And even if we could, we don't have the same
> > level of motivation to inspect humans. With AIs, inspection is
> > critical to the future survival of humanity.
> >
> Uh, no. An AI worth the bother of inspecting would have code, even
> if it was in human readable form, complex and voluminous enough that
> no human or team of humans would be able to understand it that well.
> Certainly not well enough to determine motivation, behavior and
> likely future behavior to a sufficiently fine level as to satisfy
> you. In actuality a true AI will be much more difficult to
> understand than a member of your own species.

Humans will create the first AIs, and certainly will
understand their designs and code. If human designers
can understand it, then human regulators will also be
able to understand it. Once we have trusted AIs, we can
let them handle the details of regulation. I would trust
an AI with reinforcement values for human happiness more
than I would trust any individual human.

> > I should make it clear that even with a strong effort to
> > detect and inspect AIs, there is no guarantee that all AIs
> > will be safe. But without that effort, unsafe AIs will be
> > guaranteed.
> >
> You are not competent to inspect AIs themselves in a way that will do
> anything to guarantee safety. This says nothing about your personal
> abilities. It is the nature of the problem.
> > > b) inspecting an AI will be an incredibly complex and difficult
> > > task requiring the intelligence and tracking abilities of a
> > > phalanx of highly talented people with computer support, so it
> > > will take a lot of time to complete, rendering such inspections
> > > out of date and therefore of little value.
> It is worse than that. It is actually impossible for all but the
> simplest AIs.
> >
> > I never said it would be easy. We must take the time and
> > effort to inspect every AI to make sure its design
> > conforms to regulations. Regulation certainly slowed down
> > construction of nuclear power plants (before construction
> > stopped altogether), and it will slow down AI development.
> > But there's no reason to rush.
> >
> There is every reason to rush. Nuclear power was killed by poorly
> informed and politicized implementation of regulations. I am not
> willing to have AI be stillborn on some quest to ensure our safety
> from beings more complex and brighter than ourselves.

I should have said that the reasons to go carefully
outweigh the reasons to hurry. Intelligence is the
ultimate source of power, and super-intelligent AIs
will be the most dangerous human creations. As the
creators of AIs, humans have every right to design
them in a way that protects human interests. I know
some people who think it is right that humans become
extinct, replaced by the superior AIs they create (I
don't know whether you feel that way). I don't feel
that way, but recognize that this difference is
really a difference in basic values and so cannot be
resolved by logic. This difference will have to be
resolved by politics.

> > But inspecting designs is different than inspecting operations.
> > I'll grant that those inspecting designs may want to estimate
> > the intentions of the designers, but the ultimate judgement
> > about the design must come from an inspection of the design
> > itself.
> >
> So you believe that the ethics/behavior of an AI is a matter of the
> initial design? If so, why?

I think that behavior is learned through life experiences
to satisfy reinforcement values. Complex networks of values,
such as ethics and morals, are derived from initial values
via reinforcement learning (derived values are what
the SIAI analysis calls subgoals). Assuming a mind has
accurate simulation and learning algorithms, its behavior
is determined by its initial values, combined with its life
experiences.

Humans create game playing programs that they cannot beat
(I've done it myself). Those humans can't trace the detailed
steps of their programs, but do understand their logic for
simulating the game and for learning. The exact details of
a game playing program's behavior are not pre-determined,
but if its reinforcement learning values are for winning,
then it is predictable that it will play to win. The
extent to which it wins will depend on the accuracy and
efficiency of its algorithms for simulating the game and
for learning.
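
A minimal sketch of the game-playing point, assuming a toy game of Nim
(take 1 to 3 stones, whoever takes the last stone wins) and a simple
tabular self-play learner. The game, the names train_nim and best_move,
and all parameters are illustrative assumptions, not the programs
mentioned above. The only value given to the learner is +1 for winning;
the individual training games are random, yet the learned preference is
predictable from that value.

```python
import random


def legal_moves(pile):
    """Moves allowed from a pile: take 1, 2, or 3 stones (never more than remain)."""
    return range(1, min(3, pile) + 1)


def train_nim(max_pile=12, episodes=20000, alpha=0.5, seed=0):
    """Learn move values for Nim by self-play with random exploration.

    q[(pile, take)] estimates the outcome for the player who moves:
    +1 means a win, -1 means a loss. The only reward is +1 for taking
    the last stone; everything else is derived from it.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    q = {}
    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        while pile > 0:
            take = rng.choice(list(legal_moves(pile)))
            rest = pile - take
            if rest == 0:
                target = 1.0  # the mover took the last stone and wins
            else:
                # The opponent moves next, so the mover's value is the
                # negative of the opponent's best available value.
                target = -max(q.get((rest, a), 0.0) for a in legal_moves(rest))
            old = q.get((pile, take), 0.0)
            q[(pile, take)] = old + alpha * (target - old)
            pile = rest  # the same table now plays the other side
    return q


def best_move(q, pile):
    """Greedy move: the one with the highest learned value."""
    return max(legal_moves(pile), key=lambda a: q.get((pile, a), 0.0))


if __name__ == "__main__":
    q = train_nim()
    for pile in (5, 6, 7):
        print(pile, "->", best_move(q, pile))
```

Nobody needs to trace the thousands of random training games to say
what this learner will prefer. Knowing that the update rule is sound
and that the value is for winning is enough to predict that it will
try to leave its opponent a multiple of four stones.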

Of course intelligent behavior is much more complex than
game playing programs. But the same idea applies. It is
enough to know that the simulation and learning algorithms
are accurate and efficient, and to know the reinforcement
values.

> > > Correct. But your statement seems to imply that the 'artifact'
> > > is unchanging. This is untrue for any of the mind designs I have
> > > seen so far, including the human mind. Minds change, and an AI
> > > is going to be faster and more capable at changing its mind than
> > > humans are.
> >
> > We cannot let it outrun our ability to inspect. There will be
> > no rush.
> >
> That is the most foolish statement I have ever heard.

That's just name calling.

> Do you limit
> humans to not developing and changing faster than your ability to
> sufficiently inspect their thinking processes and psychology? You
> are talking about new minds here, minds vastly more capable in
> potential than our own. To insist they be limited to what your (no
> offense as this applies to all of us) pea brain can encompass and
> predict is to oppose the creation of such intelligences utterly.
> Hell, teams of very bright humans can barely keep something as
> uninspiring as Windows XP running and plug its security holes. The
> notion that similar teams can analyze and judge the safety of an AI
> beyond a very rudimentary stage is ludicrous.

OpenBSD illustrates that operating systems can be much more
secure than Windows XP. I think the difference between
OpenBSD and Windows XP security has a lot to do with the
different values of their creators.

Humans do not have to predict the detailed thoughts of their AIs,
just the mechanisms of their thoughts. This is similar to the
situation of game playing programs that I described earlier.
Humans can write game programs that they cannot beat, because
humans don't have to trace the detailed steps of their programs
to know that they will play the game legally and well.

Humans will design and code the first intelligent AIs, and
other humans will be able to verify the safety of those
designs. Once we have trusted AIs, the details of regulating
AIs can be delegated to them.

> There is no hurry only if we are sufficiently capable to solve the
> problems that face us and face us with more complexity and detail day
> by day. I do not believe that is the case.

As I said before, intelligence is the real source of
power in this world and AIs will be the greatest source
of danger. Thus AIs must be created carefully.

> > In the early days, AI technology won't be widely available
> > so inspection efforts can focus on the few successful groups.
> > No rush. Let's do the first strong AIs slowly, with the public
> > insisting on an intensive effort to formulate and enforce
> > regulations. Humanity can afford to take its time. It cannot
> > afford to get it wrong because of some imagined need to rush.
> >
> No! I will become an "outlaw" myself before I will sit back and
> watch such a farce unfold.

If you want to be an AI outlaw, go for it.

It really comes down to who you trust. I favor a broad
political process because I trust the general public more
than any individual or small group. Of course, democratic
government does enlist the help of experts on technical
questions, but ultimate authority is with the public.

