Re: [sl4] Rolf's gambit revisited

From: Norman Noman (overturnedchair@gmail.com)
Date: Sun Jan 04 2009 - 15:24:04 MST


On Thu, Jan 1, 2009 at 11:39 AM, Matt Mahoney <matmahoney@yahoo.com> wrote:

> --- On Wed, 12/31/08, Norman Noman <overturnedchair@gmail.com> wrote:
>
> > (rolf's gambit is a method for cooperation between powers in different
> > worldlines via mutual simulation, gone over in detail in this earlier sl4
> > thread: http://www.sl4.org/archive/0708/16600.html)
> >
> > (if you haven't read that thread, you really should, it's probably the
> most
> > interesting thing to ever come out of this list)
> >
> > I was thinking today about a puzzle. Let's say you're a friendly AI, and
> > you're going to enact rolf's gambit. But before you do that, you take
> over a
> > few solar systems, and you discover obvious proof that your world is a
> > simulation. For the sake of argument, let's say it's an indestructible
> tuba
> > on the third moon of saturn.
> >
> > The question is this: assuming you continue with rolf's gambit, do you
> > include the tuba in your subsimulations? Why or why not?
>
> First, there is no such thing as indisputable proof.

No one said "indisputable".

> There is only belief.

Oh boy, down the well of pedantry we go. Apparently Matt would like to
point out that we only BELIEVE 2+2=4, we only BELIEVE Sonic the Hedgehog is
blue, etc. etc.

Touché, my friend, touché! This certainly punches a huge hole in my
argument, and indeed in every argument about everything, ever. My hat is off to
you.

> If an AI believes the universe is real or simulated, it is because it was
> programmed that way (on purpose or by accident).

Only in the sense that if it believes anything at all, it's "only because it was
programmed to". 2+2, the sky is blue, etc. If one day everything turns into
wireframe models and the moon is replaced with a big square that says B-FRAME
DECODER LAG, that doesn't PROVE you're in a simulation, but it certainly
INCREASES THE ODDS.
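
Here's a toy Bayes calculation (Python, with numbers I'm inventing purely for
illustration) of what "increases the odds" means:

# Toy Bayesian update: strange evidence shifts the odds that we're simulated
# without proving it. Every number below is invented for illustration.

def posterior_simulated(prior_sim, p_evidence_if_sim, p_evidence_if_real):
    """P(simulated | evidence) via Bayes' theorem."""
    prior_real = 1.0 - prior_sim
    joint_sim = p_evidence_if_sim * prior_sim
    joint_real = p_evidence_if_real * prior_real
    return joint_sim / (joint_sim + joint_real)

# Start fairly agnostic, then observe the wireframe moon / indestructible tuba:
# far more likely under "someone is running and tampering with a simulation"
# than under "baseline physics just does that".
print(posterior_simulated(prior_sim=0.1,
                          p_evidence_if_sim=1e-3,
                          p_evidence_if_real=1e-9))
# ~0.99999 -- the odds went way up, but it's still not "indisputable proof"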

> If the two cases are indistinguishable, then belief one way or the other
> should have no effect on rational behavior because it does not affect
> expected utility. In particular, there is no difference from the viewpoint
> of either AI or humanity between a real world AI wiping out real world
> humanity and a simulated AI wiping out simulated humanity.
>

Well, it makes a difference to me. And although I don't take pride in it, I
consider myself part of humanity.

Maybe you mean it SHOULDN'T make a difference? Why not? Even if you feel
simulated people are 100% sentient, conscious creatures who should be allowed
to vote and drive, why does that mean it doesn't make a difference? Saying
YOU don't care about context is one thing; saying no rational agent is
allowed to is something else.

> But let's say for the sake of argument that it does matter.
>
> An AI running in a simulation cannot know anything about the AI that is
> simulating it.

False, except in the sense that no one can "know" anything about anything.
If the simulation contains cheese, we know the AI running the simulation
doesn't have a problem with simulating cheese. If it manifests itself and
says "Hi, I'm Jupiter brain Xarblox and I'm simulating you because I thought
it would be a laff, my favorite color is green and I'm 77 million years
old", then we don't KNOW any of that is true, but if we just plug our ears
and go "la la la, there's no way we can know anything about the AI that's
simulating us", we're being willfully obtuse.

> Any Turing machine can simulate any other Turing machine. The simulated AI
> might believe otherwise, but only because the simulating AI biased it toward
> those beliefs.
>

The "it could be anything therefore we ignore it" argument applies equally
well (that is to say, badly) to anything and everything. Someone rings your
doorbell! Who is it? Well, it COULD be ANYONE, or ANYTHING, best to pretend
it never rang. I'm reminded of a fake dialog box style popup ad I saw once
that said "Warning! Your computer is currently downloading INFORMATION!
Information is potentially dangerous and could DESTROY YOU!"

Well, it COULD. But chances are it won't. EVERYTHING is ultimately
unknowable, from the color of the pee in your bladder to the square root of
49. But we can still assign things probabilities. There's an infinite number
of Turing machines that could have written Harry Potter, but I'd bet my
money that it was J. K. Rowling. Guessing the author of our own world is a
lot harder, but IT IS NOT A SPECIAL CASE. Probability still applies.
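
The same bookkeeping works for guessing authors, whether of a book or of a
world (a Python sketch; the hypotheses and figures are invented, and the real
priors for "who wrote our world" are obviously far murkier):

# "Probability still applies": weigh each candidate author by prior
# plausibility times how well it explains the evidence, then normalize.
# Hypotheses and figures are invented for illustration only.

candidates = {
    # hypothesis: (prior, P(the text we actually observe | hypothesis))
    "J. K. Rowling wrote Harry Potter": (0.30, 0.9),
    "some other human wrote it":        (0.69, 0.1),
    "a random Turing machine wrote it": (0.01, 1e-6),
}

total = sum(prior * likelihood for prior, likelihood in candidates.values())
for hypothesis, (prior, likelihood) in candidates.items():
    print(f"{hypothesis}: {prior * likelihood / total:.2%}")
# Rowling dominates; the random-Turing-machine hypothesis isn't impossible,
# just vanishingly improbable. Same math, murkier priors, for our own world.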

> If an RAI needs to wipe out humanity to complete its goal of computing C,
> then it will do so. In order for Rolf's gambit to work, the RAI would have
> to be biased to believe in the possibility of an FAI simulating it and biased
> to guess the motivations of the FAI, i.e. biased to believe in the existence
> of a humanity that does not want to be exterminated in the simulating world
> just like in the simulated world, except that the simulating humanity could
> effectively control the FAI*. It would be simpler (or at worst, equivalently
> difficult) just to bias the RAI to not want to wipe out humanity. But if we
> could do that, we wouldn't need to worry about the FAI problem.
>
> *This is logically inconsistent, because it implies that humanity would
> have greater intelligence than the FAI. The RAI (also having superhuman
> intelligence) would know that the friendliness problem is ill defined. Any
> superhuman AI could be made to appear friendly while doing whatever it wants
> by reprogramming our brains' definition of friendliness.
>

If humanity's whims aren't the source of the FAI's motivations, where does
it get them? It sounds like you're taking issue with the possibility of
friendly AI, which is a whole different argument.


