Re: Military Friendly AI

From: Samantha Atkins (samantha@objectent.com)
Date: Thu Jun 27 2002 - 21:09:23 MDT


Eliezer S. Yudkowsky wrote:
> Samantha Atkins wrote:

> > I don't think it is cultural. Or perhaps the problem is that "rational"
> > and "irrational" are too fuzzy to be of much use in clarifying the
> > points. In any case I don't see why an SI would be any less inclined to
> > have its choices of possible solution paths be influenced by
> > previously taken paths that had "worked" in other similar contexts
> > than we are.
>
> I'm not sure what you mean by this. Why would a grownup AI trust
> unreliable past information until it had been verified? Why would a
> grownup AI trust the value of a skill routine that fulfilled a subgoal
> of whose real value it had become uncertain? Calling the validity of a
> cause into question should almost always call into question the validity
> of that cause's effects. Given the limited complexity of its infancy I
> would expect an SI to be able to model this part of its past pretty much
> in toto. I just don't see a bias like that as persisting once the AI
> knows enough to contradict the original cause. This isn't a military
> thing. This isn't even a Friendly AI thing! Washing the biases out is a
> *very* general part of growing up.

OK, I see your point on this. So now we have an FAI that
hopefully, when it reexamines all its past programming and
experience, concludes that it should make peace, not war, if
you will. If that is the best conclusion given its supergoal,
then it makes no difference what it did before.

>
> >> An AI can learn the programmer's mistakes in verbal form, associational
> >> form, recognized patterns, et cetera. The critical issue is whether,
> >> when the AI grows up, the AI will be able to correct those mistakes.
> >
> > So, you are expecting it to decide that killing people was a "mistake"
> > and drop it from future problem solving?
>
> Assuming it *was* a mistake, then yes.

If it concludes it is not a mistake, that it is a perfectly
viable method, then can it be said to be "Friendly" in any sense
we would recognize?

>
> > It might or might not kill its own programmers. The danger is
> > whether it considers killing itself to be a long-term viable way of
> > dealing with problems. If it carries this beyond the point where
> > humans can influence its programming we have a problem as far as I
> > can see.
>
> We do indeed have a problem. The problem is not "a violent SI". The
> problem is "an SI that can't correct moral errors made by its
> programmers" which is by far more dangerous.
>

But we still leave open the path of the FAI deciding that
killing is perfectly reasonable and moral. Would you please
refresh my memory as to how the Friendliness supergoal (or what
it arises from) is stated? Is it sufficient to ensure that the
FAI will not come to the conclusion that killing humans is moral
as it "grows up"? If it is so constructed then having it kill
humans when it is too young to catch the contradiction is evil.
It seems quite counter to training it to understand and apply
its own supergoals successfully.

> >> Despite an immense amount of science fiction dealing with this topic, I
> >> honestly don't think that an *infrahuman* AI erroneously deciding to
> >> solve problems by killing people is all that much of a risk, both in
> >> terms of the stakes being relatively low, and in terms of it really not
> >> being all that likely to happen as a cognitive error. Because of its
> >> plot value, it happens much more often in science fiction than it would
> >> in reality. (You have been trained to associate to this error as a
> >> perceived possibility at a much higher rate than its probable
> >> real-world incidence.) I suppose if you had a really bad disagreement
> >> with a working combat AI you might be in substantially more trouble
> >> than if you had a disagreement with a seed AI in a basement lab, but
> >> that's at the infrahuman level - meaning, not Singularity-serious. A
> >> disagreement with a transhuman AI is pretty much equally serious
> >> whether the AI is in direct command of a tank unit or sealed in a lab
> >> on the Moon; intelligence is what counts.
> >
> > Well gee, that is a great relief! But you haven't really convincingly
> > stated why this is not a possibility. If the combat AI is capable of
> > evolving (is a Seed) then we have a problem, no? The switching to
> > "infrahuman" does not shed light on the worry.
>
> What I'm saying is that you don't get Singularity tragedies - as opposed
> to ordinary military tragedies ("Friendly fire") - unless you're dealing
> with a transhuman AI. And if you're dealing with a transhuman AI then
> it is probably not relevant whether the AI is in immediate command of a
> tank unit; the AI must be Friendly.
>

Perhaps I am still not getting your point. It looks as if you
are saying that if we get "Friendly fire" then it is not an FAI
but only a transhuman AI. Is that right? So an FAI cannot
conclude that killing humans is ok? If it cannot, then the
mistraining of having it involved in combat is obvious.

> > If you assume it was morally acceptable to kill people earlier and
> > train the AI that this is so, then how will you later train it that
> > this isn't so, assuming it hasn't already gone beyond being
> > influenced by your attempts at training?
>
> What you do is teach the AI in such a way that final moral decisions are
> not represented as correct in themselves, but correct conclusions from
> premises which, under improved intelligence, can be used to reach the
> actual valid conclusion. To give a simple example, the AI is told to

Hmmm. Are we sure there is one "actual valid conclusion" in
many scenarios?

> defend the United States under the premise that all sentient life is
> equally valid but that the military effectiveness of the US is the
> safeguard of international stability and contributing to it saves lives
> in the long run - because this is what the AI researcher believes and

But that would be giving it a quite questionable premise from
the get-go. How could that be in the least good for its
training unless the purpose was to test its error detection and
correction abilities? I can't see how the researcher can
guarantee to military funders that the young FAI will actually
do anything they are paying for it to do. If it is not "dumbed
down" it may well become an immediate ACO - conscientious
objector.

> it's why s/he works for the military. It doesn't quite matter whether
> the US is really the world's police as long as the AI gets its mitts on
> the premise that all sentient life is equally valid.
>

But the premise it is asked to act on is that US sentient life
and the US's stated military goals are of higher value than all
other sentient life.

> > If it is wrong when the AI "grows up" then it was wrong to
> > require it of the AI when it was young. I doubt the AI will miss the
> > contradiction.
>
> Of course not. The point is that the researcher was being honest
> earlier, and later (a) changed his/her mind, or (b) was contradicted by
> the grownup AI reconsidering the moral question at a higher level of
> intelligence.
>

I am a little worried that you leave open the possibility that
the "grown" AI will decide that killing humans is still ok. But
I understand your reasoning better.

- samantha


