Re: Flight recorders in AIs

From: Eliezer S. Yudkowsky
Date: Fri May 23 2003 - 04:09:50 MDT

Samantha wrote:
> On Thursday 22 May 2003 01:42 am, Eliezer S. Yudkowsky wrote:
>>I think you missed the point of a flight recorder. The idea is
>>that even if you don't notice a failure *instantly*, there's at
>>least the *potential* to notice the failure five years later - so
>>long as the AI hasn't overwritten the evidence meanwhile. When the
>>goal of hiding a catastrophic failure first forms, and before it is
>>implemented, the goal itself should be noticeable. It may only be
>>noticeable for 500 microseconds before it's obscured, but even so,
>>formation of the desire and implementing it should not be
>>simultaneous. If you have the ability to run an exact
>>frame-by-frame reenactment of the AI's history, you can use
>>advanced tools built offsite, or additional programmers, to notice
>>that 500-microsecond unobscured failure. Five years later, if
>>necessary. There would at least be a chance, so long as the
>>evidence is not erased in the meanwhile. That's why the AI can't
>>have access to the flight recorder.
> I trust we are talking about very immature AIs here with abnormally
> slow maturation rates. Five years is a very, very long time in the
> life of an AI of much real promise. In five years an AI that had
> gone off track would be expected to take considerable countermeasures
> and be a *lot* harder to stop or countermand. I am very surprised
> to see you speak in terms of noticing a problem five years later and
> doing something useful about it. It seems very counter to earlier
> notions of how fast a Singularity would ensue from a well-constructed
> seed.
> This also assumes that anything much less complex than the AI would
> be capable of dependably detecting flaws in its reasoning/motivation.
> This is by no means a given. I would expect the humans,
> augmentation, tools and all to have a very difficult time
> understanding the AI for very long, even "smeared across time".

I agree, Samantha. I am not speaking of an absolutely reliable method,
nor yet even necessarily a moderately reliable method. I am talking about
an additional increment of probability that the problem will be detected -
a genuine, actual improvement that can be bought if the resources are
available. Therefore, take it if possible. Actual improvements are very
precious because they are not easy to think of. Small ones, big ones,
take 'em all.
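To make the mechanism concrete: a flight recorder of the kind described above, a log the AI can append to but never read back or rewrite, can be sketched in present-day terms as a hash-chained append-only record that auditors verify offsite. Everything below (the `FlightRecorder` name, its methods, the example events) is illustrative scaffolding, not anything proposed in the original discussion:

```python
import hashlib
import json

class FlightRecorder:
    """Append-only, hash-chained event log.

    The recorded process gets a write-only interface; the full chain is
    held offsite, where auditors can check, even years later, that no
    record was altered or erased after the fact.
    """

    GENESIS = "0" * 64

    def __init__(self):
        self._records = []            # held offsite, not readable by the AI
        self._prev_hash = self.GENESIS

    def append(self, tick, event):
        """Write-only interface exposed to the recorded process."""
        payload = json.dumps(
            {"tick": tick, "event": event, "prev": self._prev_hash},
            sort_keys=True,
        )
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self._records.append(
            {"tick": tick, "event": event,
             "prev": self._prev_hash, "hash": digest}
        )
        self._prev_hash = digest

    def verify(self):
        """Auditor-side check: recompute the chain; any rewrite breaks it."""
        prev = self.GENESIS
        for rec in self._records:
            payload = json.dumps(
                {"tick": rec["tick"], "event": rec["event"],
                 "prev": rec["prev"]},
                sort_keys=True,
            )
            if rec["prev"] != prev:
                return False
            if hashlib.sha256(payload.encode()).hexdigest() != rec["hash"]:
                return False
            prev = rec["hash"]
        return True

recorder = FlightRecorder()
recorder.append(1, "goal formed: conceal failure")  # the brief unobscured window
recorder.append(2, "goal obscured")
assert recorder.verify()

# Even after the process rewrites its own state, the offsite chain still
# shows the transient goal; tampering with the log itself is detectable:
recorder._records[0]["event"] = "nothing to see here"
assert not recorder.verify()
```

The point of the hash chain is exactly the "smearing across time" property: the record need not be understood at the moment it is written, only preserved intact, so that better tools and more experienced programmers can go over it later.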

>>The flight recorder is not a magical device that automatically
>>notices any humanly conceivable failure of Friendliness before it
>>can do any harm. The function of the flight recorder is not to
>>solve that problem but to smear that problem across time, so that
>>even if you, as a human, fail to notice the failure at time T+1,
>>with the monitoring technology of time T+1, you can potentially
>>still notice that failure at time T+10 with the monitoring
>>technology of time T+10. Rather than the young UFAI needing to
>>conceal its failure only from inexperienced programmers, and the
>>mature UFAI being old enough to conceal its failure from
>>experienced programmers, the young UFAI must immediately outwit all
>>experienced future programmers that will ever go over that
>>particular moment of its youth.
> Except of course by then it may well be far too late.

Of course.

Eliezer S. Yudkowsky                
Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:42 MDT