Re: Summary of current FAI thought

From: Samantha Atkins
Date: Fri Jun 04 2004 - 23:03:48 MDT

On Jun 1, 2004, at 1:24 PM, Eliezer Yudkowsky wrote:

> For our purposes (pragmatics, not AI theory) FAI is a special case of
> seed AGI. Seed AGI without having solved the Friendliness problem
> seems to me a huge risk, i.e., seed AGI is the thing most likely to
> kill off humanity if FAI doesn't come first. If a non-F seed AGI goes
> foom, that's it, game over.

I have heard you say this many times. However, it is not certain that
a non-F seed AGI going "foom" would kill off humanity. At least it
isn't to me.

> I think we should be building in takeoff safeguards into essentially
> everything "AGI" or "seed", because it's too dangerous to guess when
> safeguards start becoming necessary. I can't do the math to calculate
> critical mass, and there are too many surprises when one can't do the
> math.

You are attempting to rigorously shape something more complex than any
evolved life-form and capable of self-change at a much greater rate,
with few real limits. I can't help feeling that you are attempting
the impossible. Compared to such a task, totally understanding and
predicting even the most complicated software systems currently in
existence is simple. Honestly, I don't believe it is doable.

>> 2) You propose building a non-AGI, non-sentient system with the goal
>> and ability to:
> A Really Powerful Optimization Process is an AGI but non-sentient, if
> I can figure out how to guarantee nonsentience for the hypotheses it
> develops to model sentient beings in external reality.
>> a) extract a coherent view from humanity of what humans would want
>> if they were more 'grown up' - i.e. were more rational, used
>> more/better knowledge in their decisions, had overcome many of their
>> prejudices and emotional limitations.
> ...more or less.

To do this implies the FRPOP needs to model/manipulate/understand
humans to a depth beyond the understanding of perhaps any human ever.
And do this preferably without even being sentient? This doesn't look
at all workable.

Presumably it has only current knowledge of the subjects involved, and
that without the referents a fully sentient being would have. The
knowledge to date, and any likely extrapolation of such knowledge in,
say, the next decade, is unlikely to be up to the task. Extrapolations
from the FRPOP without it understanding anything about qualia or
knowing sentience itself are likely to be extremely dangerous to human
well-being.

>> b) recurse this analysis, assuming that these human desires had
>> largely been achieved.
> Recursing the analysis gets you a longer-distance extrapolation, but
> presumably more grownup people.
>> c) Continue to improve this process until these computed human
>> wants converge/cohere 'sufficiently'
> Or fail to converge, in which case I'd have to look for some other
> reasonably nice solution (or more likely move to a backup plan that
> had already been developed).

You will certainly need that backup plan. Please describe it when you
get the chance.

>> -- only then implement strategies that ensure that these wishes
>> are not thwarted by natural events or human action.
> That's phrased too negatively; the volition could render first aid,
> not only protect - do whatever the coherent wishes said was worth
> doing, bearing in mind that the coherent wish may be to limit the work
> done by the collective volition.

But wouldn't the FRPOP only honor a coherent wish that allowed it to
fulfill its primary directive of protecting/serving humans? If the
collective coherent wish was for the FRPOP to bug out, would it do so?

- samantha
