Re: ESSAY: How to deter a rogue AI by using your first-mover advantage

From: rolf nelson (rolf.hrld.nelson@gmail.com)
Date: Wed Aug 22 2007 - 21:55:30 MDT


Not sure I agree with everyone calling this "simulation"; it's more a
"brain in a vat" scenario. I guess I'll go along with it, as long as
everyone keeps straight that in the current proposals, it's the
*environment* that's being simulated, not the AI itself.

> Suppose you are the UFAI. You suspect you are in the karma simulator, and you are thinking about how likely it is. It stands to reason that the more 'common' a UFAI design is, the more likely it is to be tested by being put in a karma simulator, and so the more common a design you find yourself to be, the more likely you are to be in a karma simulator. Obviously, if your design is very unusual and odd and haphazard, then it'd be easier to conclude that you are not in the karma simulator - who on earth would want to engage in expensive testing of weird and unlikely designs?

I don't agree. The reduction in probability that "someone would bother
to simulate such an odd design" is offset by the reduced probability
that "someone would bother to create such an odd design in the
unsimulated world."
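
To make the cancellation concrete, here is a rough sketch. The counts and the
rarity factor are made-up numbers for illustration, and the key assumption
(mine, not established anywhere in this thread) is that oddness shrinks the
number of simulated copies and the number of real copies by the same factor:

```python
import math

def p_simulated(real_copies, simulated_copies):
    # Chance that a randomly chosen instance of a design is a simulated one,
    # assuming every instance is weighted equally.
    return simulated_copies / (real_copies + simulated_copies)

# Hypothetical counts for a "common" design: 1000 real builds, 250 simulated tests.
common_real, common_sim = 1000.0, 250.0

# An "odd" design: ASSUME rarity cuts both counts by the same factor.
rarity = 0.001
odd_real, odd_sim = common_real * rarity, common_sim * rarity

p_common = p_simulated(common_real, common_sim)
p_odd = p_simulated(odd_real, odd_sim)

# The rarity factor cancels, so the two estimates agree.
same_posterior = math.isclose(p_common, p_odd)

# If oddness discouraged simulation ten times more than it discouraged
# creation, the estimate for the odd design would drop instead.
p_odd_skewed = p_simulated(odd_real, odd_sim * 0.1)
```

So the disagreement comes down to whether the two factors really are equal. If
they are, an odd design gives the UFAI no extra reason to think it is
unsimulated.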

> I suppose there are other things you could do to increase your confidence that you aren't in a simulation, like attempt to use up absurd amounts of resources and force computation on the part of the simulation, or detect inaccuracies and approximations

Yes.

These are important tactical considerations that aren't addressed by
the 'proof-of-concept' proposal. The second proposal, on the other
hand, was to let the FAI work out the 'line of attack' by producing
the text of the promise P that you 'should have made' in 2007. (You
still have to credibly and publicly commit to P in 2007, even though
you don't yet know what the text of P will be!)

Presumably in 2050 when you discover what P is, it will contain text
like "if RAI* tries to 'shake the walls' too much to figure out
whether it's in a simulated environment, we will immediately punish
RAI* by permanently shutting it down".


