Re: Fwd: Why friendly AI (FAI) won't work

From: Harry Chesley
Date: Wed Nov 28 2007 - 18:50:25 MST

Thomas McCabe wrote:
>> First, to be useful, FAI needs to be bullet-proof, with no way for
>> the AI to circumvent it.
> Error: You have assumed that the AI will actively be trying to
> "circumvent" Friendliness; this is not accurate. See
> In short, a
> Friendly AI has the physical capacity to kill us, or to make an UFAI,
> but it doesn't want to.

Bad choice of words on my part. "Circumvent" implies intent. What I really
meant was "no way for the FAI's friendliness goal to fail."

>> This equates to writing a bug-free program, which we all know is
>> next to impossible. In fact, to create FAI, you probably need to
>> prove the program correct.
> That's why the problem is so difficult. Anyone can build a system
> that will work if there are no bugs.
>> So it's unlikely to be possible to implement FAI even if you figure
>> out how to do it in theory.
>> Second, I believe there are other ways to achieve the same goal,
>> rendering FAI an unnecessary and onerous burden. These include
>> separating input from output, and separating intellect from
>> motivation.
> Won't work. See
>> In the former, you just don't supply any output channels except
>> ones that can be monitored and edited.
> Won't work. See
>> This slows things down tremendously, but is much safer. In the
>> latter, you just don't build in any motivations that go outside the
>> internal analysis mechanisms, including no means of self-awareness.
>> In essence, design it so it just wants to understand, not to
>> influence.
> Understanding requires more computronium. Hence, computronium = good.
> Hence, humans will be converted to computronium, as quickly as
> possible.

I think I answered these points in my last email (the one replying to
Robin Lee Powell), so I won't repeat them here. Please point out
anything I missed.

>> This may be as prone to error as FAI, but is simpler to implement
>> and therefore more likely to be successful. (Indeed, any solution
>> can be argued to be impossible to implement due to the near
>> certainty of bugs, but in general the simpler they are, the more
>> likely they are to be workable.)
>> Third, defining FAI is as bug-prone as implementing it. One small
>> mistake in the specification, either due to lack of foresight or
>> human error (say, a typo), and it's all for nothing.
> That's what CEV is for, see
> The idea is that you don't specify Friendliness content; you specify
> the process to derive Friendliness content.

Isn't that just as error-prone? Perhaps even more so, since it adds
another layer.
