Re: Definition of strong recursive self-improvement

From: Eliezer S. Yudkowsky
Date: Sun Jan 02 2005 - 10:38:54 MST

Russell Wallace wrote:
> On Sat, 01 Jan 2005 22:48:31 -0600, Eliezer S. Yudkowsky
> wrote:
>>I intend to comprehend how it is theoretically possible that humans
>>should write code, and then come up with a deterministic or
>>calibrated-very-high-probability way of doing "the same thing" or
>>better. It is not logically necessary that this be possible, but I
>>expect it to be possible.
> I can explain that one for you. Humans write code the same way we do
> other things: a combination of semi-formal reasoning and empirical
> testing.
> By "semi-formal reasoning", I mean where we think things along the
> lines of "A -> B", ignoring the fact that in strict logic it would
> read "A &!C &!D &!E ... -> B", where C, D, E and an indefinitely long
> list of other things we haven't thought of, could intervene to stop A
> from working. We ignore C, D, E etc because if we had to stop to take
> them all into account, we'd starve to death in our beds because we
> couldn't prove it was a good idea to get up in the morning. In
> practice, we're good enough at taking into account only those things
> that are important enough to make a practical difference, that we can
> survive in the real world.
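[The quoted point — checking only the salient defeaters of "A -> B" rather than every conceivable one — can be rendered as a toy sketch. All names and the defeater set here are illustrative, not anything from the original exchange.]

```python
# Toy rendering of "semi-formal reasoning": strict logic demands
# A & !C & !D & !E & ... -> B over every conceivable defeater,
# while human practice checks only the defeaters judged salient.

def strictly_valid(a, defeaters):
    # Strict logic: A implies B only if *every* defeater is absent.
    return a and not any(defeaters.values())

def semi_formal(a, defeaters, salient):
    # Human practice: ignore any defeater outside the salient set.
    return a and not any(defeaters[d] for d in salient)

# E stands in for some rare fluke nobody bothered to consider.
defeaters = {"C": False, "D": False, "E": True}

strict = strictly_valid(True, defeaters)          # fails: E intervened
practical = semi_formal(True, defeaters, {"C", "D"})  # succeeds: E ignored
```

The gap between the two return values is exactly the failure mode Russell describes: the semi-formal reasoner is right often enough to survive, and wrong precisely when an unconsidered defeater fires.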

Thank you for your helpful explanation; go forth and implement it in an
AI code-writing system and put all the programmers out of business.

I do not intend to achieve (it is theoretically impossible to achieve, I
think) a 1.00000... expected probability of success. However, such
outside causes as you name are not *independent* causes of failure among
all the elements of a complex system. A CPU works because all the
individual transistors have extremely low failure rates, much lower than
the real-world probability of the CPU being tossed into a bowl of ice
cream. The probability of a transistor working might be only 99%, when
ice cream is taken into account, yet the probability of the entire CPU
working is scarcely less than 99% (despite the millions of transistors)
because it's not an independent probability for each transistor.
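The arithmetic behind that claim can be made explicit. A minimal sketch, with made-up numbers (a million transistors, intrinsic failure probability 1e-9 each, ice cream striking with probability 0.01): treating each transistor's ~99% reliability as independent predicts certain system failure, while modeling the ice cream as a single shared cause recovers the ~99% system reliability.

```python
n = 1_000_000           # transistors in the CPU (illustrative)
p_intrinsic_fail = 1e-9  # per-transistor intrinsic failure probability
p_ice_cream = 0.01       # probability of the shared catastrophe

# Naive model: fold the ice cream into each transistor's own 99%
# reliability and multiply as if each were an independent trial.
p_transistor = (1 - p_intrinsic_fail) * (1 - p_ice_cream)  # ~0.99
p_system_independent = p_transistor ** n  # vanishingly small: absurd

# Correlated model: the ice-cream event hits every transistor at
# once, so it is counted once for the whole CPU, not once per part.
p_all_survive_intrinsic = (1 - p_intrinsic_fail) ** n        # ~0.999
p_system_correlated = (1 - p_ice_cream) * p_all_survive_intrinsic
# ~0.989: "scarcely less than 99%", despite a million transistors
```

The correlated figure is barely below the single shared-cause probability, which is the whole point: the millionfold multiplication applies only to the tiny independent component.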

I do not say: If I could solve the code-writing problem for
deterministic transistors, I would be done. For there is still more
safety that may be wrung from such a system; with probabilistic
reasoning it may be proofed against cosmic rays. And the problem of
guarding against errors of human interface is more complex still, to
which I devote much thought trying to make it a solvable technical
problem instead of a sterile philosophical one.

But if I knew how to build an FAI that worked so long as no one tossed
its CPU into a bowl of ice cream, I would count myself as having made
major progress.

Meanwhile, saying that humans use "semi-formal reasoning" to write code
is not, I'm afraid, a Technical Explanation. Imagine someone who knew
naught of Bayes, pointing to probabilistic reasoning and saying it was
all "guessing" and therefore would inevitably fail at one point or
another. In that vague and verbal model you could not express the
notion of a more reliable, better-discriminating probabilistic guesser,
powered by Bayesian principles and a better implementation, that could
achieve a calibrated probability of 0.0001% for the failure of an entire
system over, say, ten millennia. (For I do now regard FAI as an interim
measure, to be replaced by some other System when humans have grown up a
little.)
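To put a number on what such a calibration demands: a hypothetical back-of-the-envelope sketch, using the 0.0001% figure from above and assuming (purely for illustration) that each year contributes an independent hazard.

```python
# Hypothetical target echoing the figure above: a calibrated 1e-6
# (0.0001%) probability that the whole system fails at any point
# in ten millennia.
p_fail_total = 1e-6
years = 10_000

# Under an independent per-year hazard model, the per-year failure
# probability p_year must satisfy:
#     1 - (1 - p_year) ** years == p_fail_total
p_year = 1 - (1 - p_fail_total) ** (1 / years)  # on the order of 1e-10
```

That per-year bound, around one in ten billion, is the kind of figure the vague verbal model of "guessing" cannot even express, let alone achieve.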

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:50 MDT