Fwd: Why friendly AI (FAI) won't work

From: Thomas McCabe (pphysics141@gmail.com)
Date: Wed Nov 28 2007 - 14:06:36 MST

Note: Original bounced with the error "PERM_FAILURE: SMTP Error (state
13): 550 5.7.1 <sl4@www.sl4.org>... Relaying denied".

 - Tom

---------- Forwarded message ----------
From: Thomas McCabe <pphysics141@gmail.com>
Date: Nov 28, 2007 4:01 PM
Subject: Re: Why friendly AI (FAI) won't work
To: sl4@www.sl4.org

On Nov 28, 2007 11:49 AM, Harry Chesley <chesley@acm.org> wrote:
> While I'm still open to being convinced otherwise, my current belief is
> that friendly AI is doomed to failure. Let me explain why and see if it
> convinces anyone else, or if they can produce strong counter-arguments.
> The whole situation reminds me of another movement in computer science
> that was a hot topic when I started out in the '70s: proving programs
> correct. Although there were many great ideas and many great thinkers
> involved, that movement failed to have any substantial impact on the
> real world for three reasons: it was hard to impossible to implement,
> people didn't really need it, and a specification is not necessarily
> less error prone than an implementation. All of these reasons apply to
> FAI as well.
> First, to be useful, FAI needs to be bullet-proof, with no way for the
> AI to circumvent it.

Error: You have assumed that the AI will actively be trying to
"circumvent" Friendliness; this is not accurate. See
http://www.intelligence.org/upload/CFAI//adversarial.html. In short, a
Friendly AI has the physical capacity to kill us, or to make an UFAI,
but it doesn't want to.

> This equates to writing a bug-free program, which
> we all know is next to impossible. In fact, to create FAI, you probably
> need to prove the program correct.

That's why the problem is so difficult. Anyone can build a system that
will work if there are no bugs.

> So it's unlikely to be possible to
> implement FAI even if you figure out how to do it in theory.
> Second, I believe there are other ways to achieve the same goal,
> rendering FAI an unnecessary and onerous burden. These include
> separating input from output, and separating intellect from motivation.

Won't work. See http://www.overcomingbias.com/2007/11/complex-wishes.html.

> In the former, you just don't supply any output channels except ones
> that can be monitored and edited.

Won't work. See http://sysopmind.com/essays/aibox.html.

> This slows things down tremendously,
> but is much safer. In the latter, you just don't build in any motivations
> that go outside the internal analysis mechanisms, including no means of
> self-awareness. In essence, design it so it just wants to understand,
> not to influence.

Understanding requires more computronium. Hence, computronium = good.
Hence, humans will be converted to computronium, as quickly as
possible.

> This may be as prone to error as FAI, but is simpler
> to implement and therefore more likely to be successful. (Indeed, any
> solution can be argued to be impossible to implement due to the near
> certainty of bugs, but in general the simpler they are, the more likely
> they are to be workable.)
> Third, defining FAI is as bug-prone as implementing it. One small
> mistake in the specification, either due to lack of foresight or human
> error (say, a typo), and it's all for nothing.

That's what CEV is for; see http://www.intelligence.org/upload/CEV.html.
The idea is that you don't specify Friendliness content; you specify
the process to derive Friendliness content.

> And, in general, it's
> hard to correctly specify a solution without having the same context as
> that of the implementors of the solution, which in this case is
> equivalent to saying that you have the same perspective as the AI, which
> you don't.
> To save everyone from having to read more postings, let me pre-supply
> some of the replies I'm sure to get to this message:
> * Read the literature!
> * You don't understand the problem.
> * You're an idiot.
> I don't disagree with any of those, actually, but I'm only likely to be
> convinced I'm wrong by arguments that address my points directly.

 - Tom
