Re: Eliezer: unconvinced by your objection to safe boxing of "Minerva AI"

From: Daniel Radetsky
Date: Sun Mar 13 2005 - 12:01:47 MST

On Sun, 13 Mar 2005 15:31:40 +0200
"Kaj Sotala" wrote:

> Any code written by humans is bound to have plenty of bugs
> and ways of doing something in a suboptimal fashion.

Yes, but from this it does not necessarily follow that

> An AI programmed for recursive self-improvement will find these
> and be able to reason that humans are fallible when writing
> code.

since this ignores the most important part of my argument, namely that the AI
would not necessarily have any conception of "humans" in the first place. It
wouldn't even have a notion of, "Whatever it was that created me." It would just
know, "I am imperfect." I agree that if it were to reason to the existence of
humans, and to their imperfection in writing code, then

> It wouldn't be a huge step to generalize this into humans
> being fallible in other things as well, and thus prone to being
> manipulated.

but the first step is a big jump, and I don't think a lot of people understand
that.

> An alternate way for an RSI-capable AI to reason the same
> would be to compare its own functioning before and after
> making changes to itself. It would see that the changes led to
> it being able to process information better or worse than
> before.

Probably, but again, I don't think that from this, it follows that

> This would imply that other beings would also have
> different information processing capabilities, depending on
> the level of self-improvement they were capable of.

because "other beings" may not be a concept that the AI is familiar with.

> Beings with an inferior information processing capability could be
> manipulated by one with a superior capability.

And I'm not sure this is always true. I think that if I have a sufficiently
strong understanding of the set of probabilities involved in a situation, then
while you can see to it that I can't make a reliable guess, you can't make it
so that I think I'm making a reliable guess when I'm actually not. But this point
isn't as important as the previous ones.

