RE: Military Friendly AI

From: James Higgins
Date: Fri Jun 28 2002 - 12:33:45 MDT

At 09:50 PM 6/27/2002 -0600, Ben Goertzel wrote:
> > >I think we will be able to develop a real
> > > theory of Friendly AI only after some experience playing around with
> > > infrahuman AGI's that have a lot more general intelligence than any program
> > > now existing.
> >
> > Which tends to strike me as a dangerous approach.

I can see Ben's point, however. Eliezer's Friendliness designs do have a
reasonable potential to be useless. The basic concepts of Friendliness are
good and useful, but working out detailed designs does seem a bit
pointless, especially if there are only one or very few detailed
designs. Systems designed to implement an AI are almost certainly all
going to be significantly different because, collectively as a race, we
have little idea of what we're doing yet. So everyone is going in
different directions, hoping they find the new world. Once someone spots it,
we'll have a much better idea of how such things actually work.

Note that I am not, in any way, saying that work on Friendliness is useless
or that the existence of the Singularity Institute is irrelevant. On a
theoretical basis such work may prove helpful and should by all means be
continued. It will help guide the Friendliness system(s) designed once we
have some idea how they will work (and in what kind of system).

> > So can your position be summarized as: we'll build our AI, get it working at
> > some subhuman level, and then when we guess it needs it we'll stop running
> > it for a while until we figure out how to ensure "Friendliness"? I think
> > your protocol needs to be fleshed out for us further so we can feel more
> > comfortable with your plans.
>Whether it will be necessary to stop running it for a while, or not, will
>depend on the situation.
>If the intelligence in the system is growing very fast, then yes, this will
>be necessary. If the intelligence is growing slowly, then to stop it may
>not be necessary.

I very much hope that you would make certain the fail safe is implemented
by that time (discussed by Ben below).

> > It sounds dangerous to me (and I guess others here) to build the AI first,
> > and let it run for some time without any special F features built in. How
> > will your protocol ensure that it does not take off, and if it does how
> > are we ensured it will turn out ok?
>Here's the thing... as clarified in the previous paragraphs I just typed, we
>*do* have a Friendliness goal built in, we're just not sure yet what the best
>way is to do this. And we're not willing to fool ourselves that we *are*
>sure what the best way is....

This sounds like a realistic view of the problem. The system does have
some basic friendliness implemented. But it isn't clear how best to
implement the whole Friendliness system and, thus, doing so would be a waste
of time at this point (since at best it is very likely to be useless and, at
worst, it could bog down or throw off the entire system).

The Friendliness goals are already in place, then?

>Compared to e.g. Peter Voss's A2I2 system, our approach is far closer to
>Eli's, because Peter's system is neural-nettish and is not the sort of
>system that one *can* explicitly supply with a Friendliness goal. But Peter
>holds far more strongly than me to the opinion that it's "too early" to
>seriously consider the Friendliness issue. He's just less argumentative
>than me so he's keeping relatively quiet about this view on this list,
>although he's a member ;_)
>As you and Eli and I discussed in a private e-mail, we do plan to put a
>"failsafe" mechanism into Novamente to halt a potential unsupervised hard
>takeoff -- eventually, when we consider there to be a significantly > 0 risk
>of this happening.

Good. I'd suggest you start with a simple fail safe soon and gradually
improve / expand on it as the software base grows. As we discussed (via
private email), building frameworks as you go ends up being much easier and
less error prone if you start early.

>So, I think we don't even know how to build a good failsafe mechanism for
>Novamente or any other AI yet. We will only know that when we know how to
>measure the intelligence of an AGI effectively, and we will only know *this*
>based on experimentation with AGI's smarter than the ones we have now.

Well, getting a basic fail safe system in sooner should also help you learn
how to do this better in the long run.

> > I assume that if you get your working infrahuman AI, and are unable to
> > come up with a bulletproof way of keeping it "Friendly", you will turn it
> > off?
>Not necessarily, this will be a hard decision if it comes to that.
>It may be that what we learn is that there is NO bulletproof way to make an
>AGI Friendly... just like there is no bulletproof way to make a human
>Friendly.... It is possible that the wisest course is to go ahead and let
>an AGI evolve even though one knows one is not 100% guaranteed of
>Friendliness. This would be a tough decision to come to, but not an
>impossible one, in my view.

I think everyone who is realistic believes this is a possibility, no matter
how much we hate that fact. However, it must be made absolutely certain
that this is the case. And, if it is, some AI designs would still be less
risky than others (I would think). If anything even close to this looks
likely, you'd better be getting the opinions of hundreds or thousands of
relevant experts. Or I'll come kick yer ass. ;) Seriously.

What happens if you get Novamente working as an AI, it is proven that
Friendliness cannot be guaranteed, and it looks like your design is
somewhat more risky than the ideal system? Let's say your AI has a 4%
chance (totally arbitrary, just for illustration) of turning out
un-friendly if allowed to proceed. And a group of responsible experts (not
crackpots, not government appointed, not self-interested parties, etc.)
strongly believe a different design could lower the risk to 3%. Let's say
you'd have to scrap 70% of your code base and logic to implement the other
design and it would take you several years to do this.

What is the trade-off point between risk and time?

What if another team was further ahead on this other design than yours?

> > Well do you think it's worth our trouble to read it? If so I'd like to see
> > some discussion about it (perhaps Eliezer will allow you to repost the flaws
> > he saw in it) since I don't recall any threads regarding it (if I've
> > forgotten, someone please give me a URL to the archives, thanks).
>I think it's worth your while to read it, sure. And there was a brief
>thread on it a while back.

I actually remember the thread (though, unfortunately, not many of the
details), so I know it's out there somewhere...

> > I just started reading your AI Morality paper, I'm sure I'll have more
> > comments later, but this part is a bit scary I guess to everyone here who
> > is afraid of the initial AI programmers having too much control over the
> > AI's final state:
> >
> > "But intuitively, I feel that an AGI with these values is going to be a
> > positive force in the universe - where by “positive” I mean “in accordance
> > with Ben Goertzel’s value system”."
> >
> >
>There is no escaping the subjectivity of morality, Brian.

A true fact. And part of the reason why I'd like to see more than one
person be responsible for defining this morality.

>The reason I put that phrase in there is, I know that to *some* people,
>anything that may lead to the obsolescence of humanity is intrinsically
>negative. (Because they believe, e.g., that humans are God's chosen
>creatures... that uploads will not have souls... etc.). To these people
>even a Friendly AGI would be a negative force in the universe.
>Eliezer's approach to Friendliness relies on his own personal morals as
>well. His are pretty similar to mine; for instance, he thinks that
>preserving lives forever is a good thing. My wife, who believes in
>reincarnation, disagrees with me and Eli on this -- according to her moral
>standards, ending death goes against the natural cycle of karma and is thus
>probably not a good thing....

Another good reason why morality should not be decided by a single
individual. Eliezer's or Ben's morality may not allow death, thus severely
going against Ben's wife's morals. Ben's wife's morals, however, would not
prevent any deaths, and thus would go strongly against Eliezer's and Ben's
(and mine). So maybe preventing deaths except where the individual does
not want this protection is the best answer. But it takes more than one
viewpoint to even see this question.

James Higgins
