Error detecting, error correcting, error predicting

From: Stuart Armstrong
Date: Mon May 05 2008 - 09:26:08 MDT

Humans are very good at error detecting; when something goes wrong we
are very vocal in pointing it out. We have a mixed record on
correcting these errors once we've detected them - our skill in
correcting errors before they become ingrained is balanced by our
complete inability to correct them once they have become
"traditional" and some people benefit from them.

Finally, we are complete muppets at predicting errors before they
happen. The "everything will be fine" brigade meets the "we're all
going to die" division, and no-one puts their finger on what the real
problems will be.

By contrast, a decent general AI without our ingrained biases
should be much better than us at error predicting. On the other hand
it would be very poor at error detecting - in that we have to tell it
what we would consider to be an error (such as the extinction or
enslavement of humanity).

These facts explain my approach in the Chain/GodAI paper.

Simple solutions to FAI, and even more complicated ones such as CEV,
suffer from the weakness that they may fail in ways that we cannot
predict. Given our poor self-knowledge, they may even fail in ways we
cannot imagine now, but that would undoubtedly be seen as failures
after the fact.

That is why my approach attempts to marry the human ability to see
things going wrong with an AI's ability to predict consequences. Apart
from a few central principles to keep the whole project on the rails,
most of the AI's ethical system would develop organically, with the AI
predicting the consequences of a particular "value", and us making a
decision as to whether those consequences are acceptable.
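
As a rough illustration only, the loop described above might be
sketched as follows. Everything here is hypothetical and not taken
from the paper: the names ValueSystem and review_value, and the
predict and acceptable callables, are placeholders standing in for
the AI's consequence prediction and the human judgement respectively.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ValueSystem:
    """Hypothetical container: fixed core principles plus values added over time."""
    core_principles: List[str]
    adopted: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)


def review_value(
    system: ValueSystem,
    candidate: str,
    predict: Callable[[str, ValueSystem], List[str]],
    acceptable: Callable[[str], bool],
) -> bool:
    """Adopt a candidate value only if every predicted consequence is accepted.

    predict    -- stands in for the AI forecasting what the value would lead to
    acceptable -- stands in for the human decision on each consequence
    """
    consequences = predict(candidate, system)
    if all(acceptable(c) for c in consequences):
        system.adopted.append(candidate)
        return True
    system.rejected.append(candidate)
    return False
```

The point of the split is that the two callables are supplied by
different parties: the prediction comes from the AI, while the
accept/reject decision stays with humans.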

