Re: Military Friendly AI

From: Eliezer S. Yudkowsky
Date: Thu Jun 27 2002 - 11:53:11 MDT

Ben Goertzel wrote:
>> To summarize the summary, the main danger to Friendliness of military
>> AI is that the commanders might want a docile tool and therefore
>> cripple moral development. As far as I can tell, there's no inherent
>> danger to Friendliness in an AI going into combat, like it or not.
> In my view, the main danger to Friendliness of military AI is that the AI
> may get used to the idea that killing people for the right cause is not
> such a bad thing...

Yes, this is the obvious thing to worry about.

> Your arguments for why Friendliness is consistent with military AI are
> based on your theory of a Friendly goal system as a fully logical,
> rational thing.

As I understand "rationality", association, intuition, pattern-recognition,
et cetera, are extensions of rationality just as much as verbal logic. If
our culture thinks otherwise it's because humans have accumulated more
irrationality-correctors in verbal declarative form than intuitive form and
hence associate rationality with logic. From a mind-in-general's
perspective these are different forms of rational intelligence, not rational
and irrational intelligence. Anyway...

An AI can learn the programmer's mistakes in verbal form, associational
form, recognized patterns, et cetera. The critical issue is whether, when
the AI grows up, the AI will be able to correct those mistakes.

> However, I think that any mind is also going to have an associational
> component, rivalling in power the logical component.

As you are using the term "logic", I do not believe in logical AI. Also
these are not "components"... oh, never mind. Different theories.

> This means that its logical reasoning is going to be *guided* by
> associations that occur to it based on the sorts of things it's been
> doing, and thinking about, in the past...
> Thus, an AI that's been involved heavily in military matters, is going to
> be more likely to think of violent solutions to problems, because its
> pool of associations will push it that way.

And its memories, its concepts, its problem-solving skills, and so on. But
this is only a structural error if the AI attempts to kill its own
programmers to solve a problem. I suppose that's a possibility, but it
really sounds more like the kind of thing that (a) happens in science
fiction but not real life (as opposed to things which happen in science
fiction and real life), and (b) sounds like a failure of infrahuman AI,
which is probably not a Singularity matter. Besides, I would expect a
combat AI to be extensively trained in how to avoid killing friendly combatants.

> Remember, logic in itself does not tell you how to choose among the many
> possible series of logical derivations...

I think you're working from a Spock stereotype of rationality.

> I don't want an AGI whose experience and orientation incline it to
> associations involving killing large numbers of humans!

Despite an immense amount of science fiction dealing with this topic, I
honestly don't think that an *infrahuman* AI erroneously deciding to solve
problems by killing people is all that much of a risk, both in terms of the
stakes being relatively low, and in terms of it really not being all that
likely to happen as a cognitive error. Because of its plot value, it
happens much more often in science fiction than it would in reality. (You
have been trained to associate to this error as a perceived possibility at a
much higher rate than its probable real-world incidence.) I suppose if you
had a really bad disagreement with a working combat AI you might be in
substantially more trouble than if you had a disagreement with a seed AI in
a basement lab, but that's at the infrahuman level - meaning, not
Singularity-serious. A disagreement with a transhuman AI is pretty much
equally serious whether the AI is in direct command of a tank unit or sealed
in a lab on the Moon; intelligence is what counts.

> You may say that *your* AGI is gonna be so totally rational that it will
> always make the right decisions regardless of the pool of associations
> that its experience provides to it.... But this does not reassure me
> adequately. What if you're wrong, and your AI turns out, like the human
> mind or Novamente, to allow associations to guide the course of its
> reasoning sometimes?

Then the AI, when it's young, will kill a bunch of people it didn't really
have to. But that moral risk is inherent in joining the army or working on
any military project. The Singularity risk is if the AI's training trashes
the part of the Friendship system that would be responsible for fixing the
learned error when the AI grows up, or if the AI mistakenly self-modifies
this system in a catastrophically wrong way. I really don't see how that
class of mistake pops out from an AI learning wrong but coherent and not
humanly unusual rules for when to kill someone. If the AI starts
questioning the moral theory and the researcher starts offering a load of
rationalizations which lead into dark places, then yes, there would be a
chance of structural damage and the possibility of catastrophic failure of

Ben, what makes you think that you and I, as we stand, right now, do not
have equally awful moral errors embedded in our psyche? A couple of
centuries ago a lot of people thought slavery was a good thing. That's why
a Friendly AI's morals can't be frozen. You need to pick a Friendly AI
strategy that works if the researchers are honest with the AI and willing to
see the AI grow up - not a strategy that requires the researchers be right
about everything. Yes, if you have a bunch of researchers who think it's
okay to kill people, you get an AI that *starts out* thinking it's okay to
kill people. If you have researchers who share the Great Moral Error of the
21st Century, whatever that is, the AI starts out by learning that moral
error as well. The question is whether the AI can outgrow moral errors.
Civilizations have done so. A human upload probably would. Physical
systems that outgrow moral errors exist; the question is building one.

One must distinguish moral errors from metamoral errors. Obviously, Ben's
concept of morality with respect to military force is not shared by all AI
researchers. A young FAI built by honorable soldiers will draw on its
programmers' advice to make decisions that Ben would regard as moral errors,
and conversely, a young FAI built by Ben will make decisions that honorable
soldiers might regard as moral cowardice. Friendly AI is about making sure
that both AIs grow up into essentially the same mind, whichever side that
mature mind turns out to be on. What if both sides are wrong?

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
