Re: [agi] Two draft papers: AI and existential risk; heuristics and biases

From: Ben Goertzel
Date: Tue Jun 06 2006 - 09:09:58 MDT


> The chapters are:
> _Cognitive biases potentially affecting judgment of global risks_
> _Artificial Intelligence and Global Risk_
> The new standard introductory material on Friendly AI. Any links to
> _Creating Friendly AI_ should be redirected here.
> --
> Eliezer S. Yudkowsky

I find these to be excellent papers, and will recommend others to read them.

However, in comparison to "Creating Friendly AI" (CFAI), I must note
that the ambition of what's attempted in "Artificial Intelligence and
Global Risk" (AIGR) is greatly reduced.

CFAI attempted, though ultimately without success, to articulate an
approach to solving the problem of Friendly AI. Or at least, that is
the impression it made on me....

On the other hand, AIGR basically just outlines the problem of
Friendly AI and explains why it's important and why it's hard.

In this sense, it seems to be a retreat....

I suppose the subtext is that your attempts to take the intuitions
underlying CFAI and turn them into a more rigorous and defensible
theory did not succeed.

I also note that your Coherent Extrapolated Volition ideas were not
focused on in AIGR, which I think is correct, because I consider CEV a
fascinating science-fictional speculation without much likelihood of
ever being practically relevant.

I agree with you that a more rigorous mathematical approach is going
to be the way -- if there is one -- to a theory of FAI. However, I am
more optimistic that this approach will lead to a theory of FAI
**assuming monstrously great computational resources** than to a
theory of pragmatic FAI. This is to be expected: thanks to
Schmidhuber, Hutter and crew we now have the beginnings of a theory of
AGI itself assuming monstrously great computational resources, but
nothing approaching a theory of AGI assuming realistic computational
resources.

It would seem to me that FIRST we should try to create a theoretical
framework useful for analyzing and describing AGIs that operate with
realistic computational resources. Then Friendly AI should be
approached, theoretically, within this framework.

I can see the viability of also proceeding in a more specialized way,
and trying to get a theory of FAI under limited resources in the
absence of an understanding of other sorts of AGIs under limited
resources. But my intuition is that the best way to approach "FAI
under limited resources" is to first get an understanding of "AGI
under limited resources."

This brings us back to my feeling that some experimentation with AGI
systems is going to be necessary before FAI can be understood
reasonably well on a theoretical level. Basically, in my view, one
way these things may unfold is

* Experimentation with simplistic AGI systems ... leads to
* Theoretical understanding of AGI under limited resources ... which leads to...
* The capability of theoretically understanding FAI ... which leads to ...
* Building FAI

Now, this building of FAI *may* take the form of creating a whole new
AGI architecture from scratch, *or* it may take the form of making
minor modifications to an existing AGI ... or it may turn out that
some existing AGI design is adequate and there is not really any
Friendliness problem with it. We don't know which of these
eventualities will occur because we don't have the theory of FAI yet.

Your excellent article AIGR, in my view, does not effectively argue
against the sort of perspective I'm advocating here. I
understand that this is not its job, though: it is mostly devoted to
making more basic points, which are not sufficiently widely
appreciated and with which I mainly agree enthusiastically.

-- Ben G

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:56 MDT