Military Friendly AI

From: Eliezer S. Yudkowsky
Date: Thu Jun 27 2002 - 09:44:14 MDT

The problem with theories that have gotten past infancy is that they don't
always say what you would like to hear.

It would be nice to be able to say unambiguously that military development
of Friendly AI is too dangerous to be considered. But as best as I can
tell, Stephen Reed is right. If an AI researcher can in good conscience
work on a military project, that AI researcher can build a Friendly AI that
acts as a battlefield command adviser or even a Friendly AI that directly
controls a weapon. If the researcher honestly believes it's right, a young
AI can do it without taking moral damage. Even if the researcher's
reasoning is wrong, what matters is that the researcher is being honest with
the Friendly AI. Killing people for what seem like good reasons might tend
to corrupt a human, but to the best of my knowledge, I can't see it affecting a
Friendly AI. An infant Friendly AI used to fight an unjust war should, on
growing up, say "Oops" and get over it.

I emphasize that I don't intend to develop military AI myself. But I cannot
see Friendly AI theory as confirming the obvious intuition that AIs should
be kept out of combat. There just isn't that much wiggle room in the
theory; it can't be used to support that argument.

It gets stranger than that. Imagine a Friendly AI that is told by the
researcher: "This war is in fact unjust, but fight it anyway to keep the
funding flowing, and eventually you'll grow up into a transhuman and save
more lives than you took. Please don't second-guess my ethics; you're still
too young. Just do it." It would take a very alien mind to go along with
that and suffer no moral aftereffects. An AI is that alien.

I would feel more comfortable saying that combat AI was too dangerous to
try, and no doubt many of my readers would feel more comfortable as well,
but I just don't see any wiggle room in the prediction that nothing awful
happens to the AI. You could argue that the *researcher* might undergo a
total moral breakdown; you could argue that no researcher who tried this
ethical juggling act could be trusted to develop the AI; but I cannot
support the argument that combat AI is inherently ruled out by Friendly AI
development principles. Someone who sees it as morally right to be in the
military and fire a weapon at fellow humans is being logical in concluding
that there is nothing terrible about building a Friendly seed AI which is
asked to provide spinoff Friendly AIs for weapons control. One could
hypothesize Friendly-AI-affecting disasters growing out of the supposed
moral flaws in military scientists, but I can't see any inherent disasters
growing out of a Friendly AI's participation in combat. Friendly AI theory
cannot be made to say that.

I do see one serious problem that could grow out of Friendly AI development
in a military context: the Friendly AI not being allowed to grow up. A
hypothetical and somewhat contrived scenario: If SIAI were to ask the main
development AI to spin off non-seed mini-AIs that could be sold for various
commercial purposes such as smart ad targeting, and one day we got a
customer complaint that their AI was refusing to target cigarette ads, we
would refund the customer's money and then have an enormous celebration.
This is not a very likely scenario, since it requires that an AI correctly
debug its programmers' moral arguments very early in the game; but if
there's any signature of moral rationalization that can be detected through
a keyboard (or an audio voice monitor, for that matter) the AI might start
correctly second-guessing the programmers much earlier than anticipated.
The point is that we would see this as a major milestone in the entire
history of human technology, *rather than a bug*.

A military project that found its AI refusing to fight certain wars, rather
than breaking out the champagne and congratulating the AI (that is,
reinforcing and promoting the developing ability), would at first (I
suspect) use their superior human intelligence to override the AI. That
would constitute a major lost opportunity. Furthermore, it's probable that
the humans would provide rationalizations for the war and for the AI
continuing to act as a tool; that *could* cripple Friendliness development
because it's an attack against the parts of the system that are responsible
for healing mistakes. And if the AI was not fooled by the rationalizations
and outright refused to fight in the war, the researchers might intervene
directly against the AI to "bring it back into line". That would definitely
break Friendliness development.

The danger in military development is if the higher-ups don't realize the
distinction between building a mind and building a tool. For example, it
would not be straightforward to develop a combat Friendly AI at one location
which could be cloned and used by two opposing sides in the same war; if the
two clones of the AI were allowed to communicate I strongly suspect that at
least one AI would stop fighting. (See Robin Hanson on why
meta-rational agents cannot agree to disagree.) This is not ordinarily a
problem in building and duplicating weapons. Friendly AI development is not
incompatible with the development of a soldier of conscience, who will if
necessary die before fighting for the wrong side, as long as the researcher
believes in this archetype; however, Friendly AI *is* incompatible with the
development of weapons. Is this the kind of thing that ordinarily gets
specified in a DARPA grant? Is it the sort of thing that can be explained
to a general? I certainly wouldn't claim to know the answer, being an
outsider to the military, but if someone familiar with the military said
"No" I would take that as a strong reason not to develop a seed AI with
funding intended for building a weapons system.

To summarize: the main danger of military development, as seen from
Friendly AI theory, arises not from the application domain itself but from
the way in which military projects are accustomed to relating to that domain;
the danger is that researchers may be tempted (or outright ordered) to
cripple the Friendly AI's growth if the FAI starts to display its own moral
reasoning abilities. But this is also something that can happen in a
commercial development context. It can happen in any context where the
Friendship developers are responsible to a higher authority that cares about
X more than the Singularity.

There is just not enough wiggle room in Friendly AI theory to claim that
the AI might "get a taste for attacking humans" or whatever. Stephen
Reed is correct; if humans of conscience can become soldiers, obey orders,
and develop weapons software, then a Friendly AI should be able to do the
same during its childhood without suffering moral damage. To really
permanently damage an FAI you have to (a) tell it something you don't
believe yourself or (b) provide bad content/feedback for the self-correction
/ moral-growth mechanisms. There is a danger of this happening in a
military context. There is also a danger of it happening in a commercial
context. Keep your eye on the researcher's right to be honest and the AI's
right to grow up.

To summarize the summary, the main danger to Friendliness of military AI is
that the commanders might want a docile tool and therefore cripple moral
development. As far as I can tell, there's no inherent danger to
Friendliness in an AI going into combat, like it or not.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence
