From: Daniel Radetsky (daniel@radray.us)
Date: Wed Jan 26 2005 - 17:00:23 MST
People:
Thanks to everyone who responded. I appreciate it. Some comments on the
responses I received:
On Tue, 25 Jan 2005 23:31:17 -0800
Robin Lee Powell <rlpowell@digitalkingdom.org> wrote:
> > 1. In developing FAI, some form of Last Judge/AI Jail will be
> > used.
>
> Those are *utterly* different things. I disagree in both cases. A
> Jail is insanely stupid, and a Last Judge is a weapon of absolutely
> last resort.
My understanding, which seems to be supported by careful reading of the
responses of other members, is that the Last Judge is the best anyone can come
up with so far. It's better than "Let's just hope to God it's friendly," and,
though it is, as MRA points out, a kludge, it's the only kludge we have.
Therefore, it seems to me to be disingenuous to call your only weapon a weapon
of last resort. Correct me, though; I'm not very well read in these areas.
Another thing: maybe I have the wrong picture of the Last Judge scenario. As
far as I understand it, a Last Judge necessitates an AI Jail of some type. That
is, the AI operates under some type of restriction (e.g. confinement to a
single mainframe, or perhaps a software directive such as "You may not do X";
the specific type of restriction is unimportant to the argument), which the
Last Judge can remove.
Here's how I picture it: The AI is confined to a single large computer, which
runs a simulated world for the AI to manage. The judge allows it to
self-improve and manage the world for a while, then examines the world,
interviews the AI, and decides whether to let it out. Maybe I'm just not as
smart as
the rest of you, but I don't see what the Last Judge *could do* besides being
a jailer. Eliezer writes:
> I think, though I am not sure, that the FAI theory will dictate what
> constitutes a necessary and sufficient verification process, including whether
> we need a Last Judge to peek at the extrapolated final outcome.
What else might be meant by "extrapolated final outcome"? After all, if we
have a UFAI which is all set to start cupcaking us (Bostrom talks about the
paperclip problem, but I prefer the cupcake problem), we sure as hell don't
want the judge saying, "Hey, stop that!" We want it to never happen. Eliezer
admits a Last Judge scenario is "[N]ontrivial to set up safely." I don't see
what else he might be talking about. I've read the archives (albeit not as
carefully as I might have), and all they seem to say on the subject is "AI
Boxing is very hard to do safely and maybe impossible." Please, enlighten me.
Also, regarding the objection
> > 2. If a last judge is used, the judge will have sole authority to
> > determine whether an AI is sufficiently friendly/correctly
> > programmed/whatever he's supposed to determine.
>
> No. The Last Judge accepts or refuses the *outcome* of CV, which is
> very different.
On Wed, 26 Jan 2005 08:06:40 -0800
"Michael Roy Ames" <michaelroyames@yahoo.com> wrote:
> Determining a 'degree' of friendliness will involve a
> complex technical procedure of comparison against ideal data & models -
> something that could only be done by a team of experts with a thorough
> understanding of the models.
I apologize. I was not using the technical terms properly. I meant evaluating
its actions, which I (incorrectly?) equated with evaluating its degree of
friendliness.
> The AI Jail concept will probably be used in the sense of "running on
> computers that are not connected to the network" but not in the sense of "we
> will see how it behaves before we let it out". The first sense is useful in
> that it provides a minimum level of protection against code 'escaping' or
> being stolen. The second sense is not useful because it suggests that AI
> evaluation by humans can be done the same way as human evaluation by humans,
> which is false for the type of AI we are attempting to build.
>
> AI evaluation will use many methods of observation, all of which boil down
> to observing actions.
This may be another case of "I'm not smart enough to grasp the subtle
distinctions," but this sounds to me like "We *will* see how it behaves before
we let it out, we'll just be extra clever in how we evaluate its behavior."
On Wed, 26 Jan 2005 04:06:53 -0800 (PST)
Thomas Buckner <tcbevolver@yahoo.com> wrote:
> As far as I know, I'm the one who brought the
> Sergeant Schulz strategy into the discussion,
> i.e. try to deceive the stupidest jailer. I may
> have given it a name, but certainly didn't invent
> the strategy, which probably is as old as jails.
All I found when I was reading the archives was the name, your brief
description of what it meant, and reference to it having been previously
discussed. I couldn't find any discussion of it specifically. It seems like
everyone (well, Eliezer at least, but one could argue that he proved it) took
for granted that the AI could fool anyone, and didn't think they had to worry
about the dumbest jailer.
On Wed, 26 Jan 2005 08:18:49 -0500
"Ben Goertzel" <ben@goertzel.org> wrote:
> > > 3. It would be intractable to audit the "grown"
> > > code of a seed AI. That is, once
> > > the AI begins to self improve, we have to
> > > evaluate its (I refuse to use
> > > posthuman pronouns) degree of friendliness on
> > > the basis of its actions.
>
> Not entirely true. It is already intractable to audit the code of large
> software projects, though.
I don't get it. FAI would be *way* larger than, say, the Linux kernel, right?
So what makes you think we'd be able to audit it?
On Wed, 26 Jan 2005 07:40:07 -0800 (PST)
michael or aruna vassar <michaelaruna@yahoo.com> wrote:
> SI code is intractable.
>
> I'm confused as to how there could be uncertainty regarding any of these
> questions. Have you read the list archives thoroughly?
This, Michael or Aruna, is why I ask. I don't think there are uncertainties,
and I did read the list archives, but maybe Ben knows something I don't. If
so, I wish he'd tell me.
In any case, unless I get substantive responses to this, I will assume
that Singularitarians grudgingly agree with premises (1) and (3). I will also
assume they agree with (2), because nobody has actually disagreed with it yet.
Yours,
Daniel