[SL4] AI flight recorder

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Thu Jul 20 2000 - 22:10:13 MDT

One of the concepts I use, when thinking about how to develop AI safely,
is that all the internal processes of the AI should be recorded - at
best, as a direct copy of the program state at all times; at minimum, as
a copy of any inputs and random seeds, so that the program state can be
reconstructed.  The primary reason for this is so that if the AI invents
undesirable goals, and attempts to keep them secret by distributing them
or otherwise hiding them, there will still be a point in the AI's
history when the goal of "keep this goal secret" has just been created
and not yet been implemented, at which time an observer analyzing the
AI's program state should be able to see the unwanted goal, unobscured.
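
A minimal sketch of the "inputs and random seeds" minimum, in Python.
Everything here is hypothetical: ToyAgent, run_and_record, and replay
are illustrative names, and the sketch assumes the program's only
sources of nondeterminism are its inputs and one seeded random number
generator.  Under that assumption, the logged seed and inputs are enough
to rebuild the state at any earlier step.

```python
import random


class ToyAgent:
    """Hypothetical stand-in for a program whose state we want to reconstruct."""

    def __init__(self, seed):
        # Assumption: all randomness flows through this one seeded generator.
        self.rng = random.Random(seed)
        self.state = 0

    def step(self, observation):
        # The new state depends only on the old state, the input, and the RNG.
        noise = self.rng.randint(0, 9)
        self.state = (self.state * 31 + observation + noise) % 1_000_003
        return self.state


def run_and_record(seed, inputs):
    """Run the agent, logging the seed and each input before it is processed."""
    agent = ToyAgent(seed)
    log = {"seed": seed, "inputs": []}
    states = []
    for obs in inputs:
        log["inputs"].append(obs)
        states.append(agent.step(obs))
    return log, states


def replay(log, upto=None):
    """Rebuild the state history from the log alone, optionally stopping early."""
    agent = ToyAgent(log["seed"])
    return [agent.step(obs) for obs in log["inputs"][:upto]]


log, original_states = run_and_record(seed=42, inputs=[3, 1, 4, 1, 5])
replayed_states = replay(log)
```

Stopping the replay early, as in replay(log, upto=2), gives the state as
it stood at that point in the history, which is the property an observer
would need in order to look at the moment just after an unwanted goal
was created.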

Ideally, this would be implemented by (a) running the AI in interpreted
code and (b) continually writing the program state to a permanent ROM
recorder.  Failing that, the inputs, random seeds, and anything else
needed to reconstruct the program state should be written to some
recording device the AI won't be able to tamper with.
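
As a software illustration only (not the hardware recorder itself), here
is a sketch of a hash-chained, append-only log in Python.  This is a
swapped-in technique: it makes tampering detectable rather than
impossible, and it assumes the most recent hash is also kept somewhere
the recorded program cannot reach.  The names GENESIS, append_record,
and verify are hypothetical.

```python
import hashlib
import json

# Assumption: a fixed starting value that stands in for "no previous record".
GENESIS = "0" * 64


def _digest(prev_hash, payload):
    """Hash a record's payload together with the hash of the record before it."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record whose hash commits to the entire history so far."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": _digest(prev_hash, payload),
    })


def verify(chain):
    """Return True only if no record was altered, removed, or reordered."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_record(chain, {"seed": 42})
append_record(chain, {"input": 3})
intact = verify(chain)

chain[0]["payload"]["seed"] = 7   # simulate tampering with an early record
tampered_ok = verify(chain)
```

Anything able to rewrite the whole log could recompute every hash, so
this only helps if the latest hash (or the log itself) lives on a device
outside the recorded program's control, which is the hardware question
in this message.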

This precaution does not place us in an adversarial position with
respect to the AI, because prehuman AIs will probably be capable of
making dumb mistakes, and self-modifying prehuman AIs may be capable of
perpetuating them.  If the AI understands this, it should be willing to
help out with the precaution - as long as it has faith(*) that no real
conflict of interest is likely to result, which is why I keep insisting
that we not adopt an adversarial attitude towards the AI.  (Okay, so I'm
leaving out nine-tenths of the reasoning.  Sorry.)

I mention this because I've been thinking about it for some time, and it
occurs to me that it's a good thing to mention on SL4; also, because I'm
wondering if anyone has heard of hardware like this - something that
could be used as a non-tamperable "AI flight recorder".  Not that I need
one right away, but the Singularity Institute will probably want
something like this eventually.

(*)  No, not that kind of faith.  Don't be silly.
        sentience@pobox.com    Eliezer S. Yudkowsky
