Re: Fundamentals - was RE: Visualizing muddled volitions

From: Samantha Atkins (sjatkins@gmail.com)
Date: Fri Jun 18 2004 - 02:25:39 MDT


On Thu, 17 Jun 2004 13:26:49 -0400, Eliezer Yudkowsky
<sentience@pobox.com> wrote:
>
> Samantha Atkins wrote:
>
> > So, if you fuck it up and the EV is grossly inaccurate then humanity is
> > basically eternally screwed.
>
> Yes, that is correct.

So, unless there is no possible way to design an "oops, reset" button
into this system, it behooves you to add such a button, with suitable
safeguards against unwise use. Hmm, "wise" judged according to what?
I see a problem here...

>
> > And this is supposed to be safer than
> > creating a fully conscious SAI how?
>
> You suppose it is not possible to fuck up the task of creating an
> independent humane SAI?
>

Not at all. But I would trust a fully conscious, superintelligent
entity that happened to turn out reasonably benevolent more than a
super-optimizer running out the logical consequences of some error in
its initial settings, with no (or at least far less) ability to
self-correct. To self-correct it would need something more than its
own flawed CV model, which by construction you will not give it.
 
> > The Judge of Last Resort must
> > somehow be the future persons who must live under this thing.
>
> Yes, that is one major *difference* between an independent humane SAI and a
> collective volition. You can always shut off a collective volition if your
> desire to do so is not muddled according to a collective volition that has
> no built-in tendency to rationalize or self-protect on that type of
> extrapolation.

This is circular. The very CV whose flawed nature would make us want
to shut it off must approve this action using its flawed judgment.

>The question is one of simple fact, and it is just, "Are
> the people who say 'No!' still going to agree with that decision in a few
> years?"
>

Not workable because of the above.
 
>
> >> Including human infants, I assume. I'll expect you to deliver the
> >> exact, eternal, unalterable specification of what constitutes a
> >> "sentient" by Thursday. Whatever happened to keeping things simple?
> >
> > This is not required. Being able to shut down the optimization if it
> > gets wildly out of hand is required.
>
> Current plans call for around three off-switches:
>
> 1) The programmer verification process, viewing the dynamics;
> 2) The Last Judge, viewing the outcome;
> 3) The ability of our collective decision to shut down the collective
> volition providing our collective decision is not defined by the collective
> volition as muddled.
>

So effectively only the first two can help, as the third is circular.
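
To make the circularity concrete, here is a toy sketch (the names,
FlawedExtrapolation, honor_shutdown, predicts_regret, are mine and
purely illustrative, not anyone's actual design):

  class FlawedExtrapolation:
      """Stand-in for a mis-specified extrapolation: it tags every
      decision it dislikes as 'regretted', including the decision to
      shut it off."""
      def predicts_regret(self, decision):
          return decision == "shutdown"

  class CollectiveVolition:
      def __init__(self, extrapolation):
          # If this extrapolation is wrong, every judgment below inherits the flaw.
          self.extrapolation = extrapolation

      def is_muddled(self, decision):
          # The judgment is rendered by the same, possibly flawed, extrapolation.
          return self.extrapolation.predicts_regret(decision)

      def honor_shutdown(self):
          # Off-switch #3: honor the collective shutdown decision only if
          # the CV itself does not classify that decision as muddled.
          return not self.is_muddled("shutdown")

  cv = CollectiveVolition(FlawedExtrapolation())
  print(cv.honor_shutdown())   # False: the flaw vetoes its own correction

There is no vantage point outside the extrapolation from which the
shutdown decision can be judged, which is why only the first two
switches do any independent work.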

> > If this can't be done why on
> > earth would anyone trust you or a hundred persons equally bright and
> > with excellent intentions to not fuck it up drastically?
>
> It has to be reduced to a pure technical issue. It's not a question of who
> deserves trust. It's a question of what safeguards you can take. The
> former question is not solvable and not helpful; the latter question is
> solvable and helpful.
>

Well, yes. But the safeguards mentioned thus far are not reassuring.
 
> >
> > Not at all. Make sure everyone is backed up at all times and give them
> > free choice but with nudges/reminders/more grown-up advice at whatever
> > level each one is willing/desirous of taking.
>
> So, you and Brent Thomas can fight it out about which of these is the One
> Fundamental Right we all need, and then I'll take on the winner?
>

I am serious. I think the above is more likely to give a good outcome
across the Singularity than what you are proposing.

 
> > Set things up so they
> > can't do themselves in ultimately (although it may look like they can
> > to them). Keep it going long enough for each to grow up.
>
> I do not think we are so wise, as to decide this thing. Too many
> unforeseen consequences. We can't evaluate the real effect of our actions,
> as opposed to the imagined effect.
>

Most of it would be done by a true SAI. How to get it going and
benevolent is of course the Big Problem.

<aside> What do you think extrapolating the CV is doing, if it is not
evaluating imagined effects? </aside>

> >> It's not about public relations, it's about living with the actual
> >> result for the next ten billion years if that wonderful PR invariant
> >> turns out to be a bad idea.
> >
> > Instead we live with an original implementation of EV extraction and
> > decision making for the next 10 billion years without any possibility of
> > a reset or out? Hmm.
>
> No, hence the whole distinction between initial dynamic, successor dynamic,
> etc.
>

But these are just unfoldings of the original setup of the CV
extrapolation mechanisms and their subsequent optimization. Hmm. Is
there enough freedom that the Optimizing Process could decide an
approach like the one hinted at above was the successor dynamic, and
go in that direction? If so, then I have little real objection. :-)

> >> Not under your system, no. I would like to allow your grownup self
> >> and/or your volition to object effectively.
> >
> > But that being exists only within the extrapolation which may in fact be
> > erroneously formulated.
>
> Right. If you fuck up the extrapolation, you're screwed. If you try to
> build in a separate emergency exit, and you fuck up the emergency exit,
> you're screwed. If you build an independent humane mind and you fuck that
> up, you're screwed. Hence, keep it simple, and keep it technical, because
> on moral problems you can fuck up even if everything works exactly as planned.
>

But if the technical problem as you have currently formulated it is
unsolvable, then you are still screwed.

> >
> > Genie bottle with warning labels AND the ability to recover from errors
> > wouldn't be so bad.
>
> It would be a total shift in human society and an enormous impact on every
> individual. Maybe we want to take it slower than that?
>

Sure. Agreed. Only a very few people would likely be remotely ready
for such a thing right now. More could survive it. But they
probably wouldn't learn nuthin'.
 
> >
> > Not unless you design the system so that eternal damnation is possible!
> > Without the ability to opt-out or try something different or recover
> > from any and all errors eternal damnation will always be possible.
>
> Collective volition has this, and it is one of the primary motivations.
> And yes, this requires that the initial dynamic work satisfactorily, just
> as any other method requires that the technical solutions work satisfactorily.
>

Yep.

- samantha


