Re: AI Boxing:

From: Tim Freeman
Date: Sun Jun 08 2008 - 17:40:01 MDT

From: "Stuart Armstrong"
>The oracle AI will only be good (in the form discussed here) if the
>theory of Friendliness it comes up with is simple enough that we can
>go through it and check there are no holes.
>My personal prejudice is that no such simple theory exists - mostly
>because the definition of friendliness depends on the level of power
>and intelligence of the AI, on the evolution of external
>circumstances, and is not a simple universal.

That's FUD.

There's no oracle required, so far as I can tell, and the solution
isn't incomprehensibly complex, nor is it radically different
depending on the intelligence of the AI. A specification of a
proposed solution is at:

It's not practical, but the added hair to make a practical
implementation isn't likely to make it incomprehensible. It has one
known bug:

The bug seems fixable. There might be other bugs, but finding and
fixing them doesn't seem likely to make it incomprehensible either.
This proposal may be incorrect, but it's close enough that the correct
solution isn't going to be enormously more complex.

Basically you want an algorithm that implements "figure out what
people want, and do that". Optimizing for a utility function is
simple, in principle, so the hard part is figuring out a utility
function that describes what people want. In ordinary human
discourse, the concept of "what so-and-so wants" is treated as a
simple thing, so the reasonable assumption is that an implementation
of that concept is reasonably simple as well.
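
To illustrate why the optimization half is the easy part, here is a
minimal sketch in Python. It is not the specification referred to
above. The action names, the outcome probabilities, and the utility
table are all hypothetical and hard-coded; in a real system the
utility function is exactly the piece that would have to be inferred
from what people want.

```python
def expected_utility(action, outcomes, utility):
    """Expected utility of one action.

    outcomes(action) returns a dict mapping each possible outcome to
    its probability; utility(outcome) returns a number.
    """
    return sum(p * utility(o) for o, p in outcomes(action).items())


def best_action(actions, outcomes, utility):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))


# Hypothetical toy world: three actions with made-up outcome probabilities.
def toy_outcomes(action):
    table = {
        "make_tea": {"tea": 0.9, "nothing": 0.1},
        "make_coffee": {"coffee": 1.0},
        "do_nothing": {"nothing": 1.0},
    }
    return table[action]


# Hypothetical stand-in for "what so-and-so wants". This is the hard
# part; here it is simply assumed rather than inferred.
toy_utility = {"tea": 1.0, "coffee": 0.6, "nothing": 0.0}.get

choice = best_action(
    ["make_tea", "make_coffee", "do_nothing"], toy_outcomes, toy_utility
)
print(choice)
```

Everything interesting is hidden in toy_utility. The optimizer itself
is a few lines, which is the sense in which optimizing for a given
utility function is simple in principle.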

It is much more constructive to proceed on that assumption and attempt
to figure out the details than to assume failure and give up before

Tim Freeman      

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:01:03 MDT