Friendliness, Vagueness, self-modifying AGI, etc.

From: Ben Goertzel
Date: Mon Apr 05 2004 - 05:46:59 MDT

Michael Wilson wrote:
> Ben Goertzel wrote;
> > While I find Eliezer's ideas interesting and sometimes deep, based on
> > his posted writings I do not agree that he has "made progress in
> > producing engineering solutions."
> Recent history; I attempted to implement a prototype of LOGI
> and CFAI over the November -> February period this year. In
> the process I discovered just how vague the two documents
> were; I had to fill in a lot of details myself in order to
> produce a constructive account that would serve as a
> blueprint.


Well, this does not surprise me at all.

I think that documents on the level of CFAI and LOGI are important and
necessary, but that they need to come along with more formal, detailed
and precise documents. There are a lot of ideas that *look* sensible on
the abstract conceptual level of CFAI/LOGI, but don't actually
correspond to any sensible detailed design. And then, one level down,
there are a lot of ideas that look sensible on both the conceptual level
and the detailed-design level, but don't actually work for reasons that
are only seen in the course of experimenting with the implementation of
the design. (The only way to avoid this latter problem is to do a full
formal, mathematical analysis of one's detailed design, but I doubt that
is possible.)

In the case of Novamente we have a conceptual picture and a fairly
detailed design (which still has some gaps at the low-to-intermediate
level of precision), and a working codebase implementing some of the
design --- but we understand that we *still* may get taught some
unwelcome lessons during further implementation and experimentation...

> I agree that CFAI etc provide (dangerously
> outdated and flawed) principles but not a HOW TO.

My worry is that, even if the successor to CFAI provides Friendliness
principles that are not obviously flawed on the *conceptual* level,
they'll still fall apart in the experimental phase.

Thus my oft-repeated statement that the true theory of Friendliness will
be arrived at only:

A) by experimentation with fairly smart AGI's that can modify themselves
to a limited degree

B) by the development of a very strong mathematical theory of
self-modifying AGI's and their behavior in uncertain environments

My opinion is that we will NOT be able to do B within the next few
decades, because the math is just too hard.

However, A is obviously pretty dangerous from a Friendliness perspective.

My best guess of the correct path is a combination of A and B, in which
one builds

A) a collection of sub-human-level self-modifying AI's, with strict
limits on the ways they can self-modify and careful observation

B) a smart non-self-modifying AI which is customized for data analysis
and automated theorem-proving, with a modicum of general intelligence as
well

One then uses B to help one model A, and obtain a high-quality theory of
the dynamics of AGI's under self-modification. This theory then tells
one how to proceed next.

> It looks like Eliezer is in fact making good
> progress towards a scientifically well justified and feasible
> to engineer constructive theory. I look forward to his
> publication of the details.

I guess we all do ;-)

-- Ben
