From: Daniel Radetsky (firstname.lastname@example.org)
Date: Tue Aug 02 2005 - 18:54:44 MDT
On Tue, 2 Aug 2005 20:03:43 -0400
Randall Randall <email@example.com> wrote:
> When I program, having finished a unit of code, I have no evidence that there
> are any bugs in what I've written. Still, prudence indicates that I should
> *assume* that there are bugs in my code, and write tests to catch them if
> they exist...All I am arguing is that prudence indicates that we should
> assume that exploits exist until we have some reason to believe they do not
But you don't assume there are bugs in your code out of prudence any more than
you would assume that there are monsters under your bed for the same reason.
You assume your code has bugs because most code does, and you don't check under
your bed for monsters because you believe that most beds don't have any. For this
analogy to work, you'd have to believe that most boxes have exploits in the
sense I was discussing before (physical exploits), not in the sense you seem to
have confused my position with (psychological exploits of the jailer).
> Given this, safeguards must be built into the superhuman intelligence
> directly. This, of course, is the Party position, the common wisdom, on this
> list. This is why (I believe) people on this list keep insisting that some
> exploit may exist: assuming that they do not predetermines failure in the
> case that they do, while the reverse is not true.
I have less charitable reasons for why people insist on the existence of
exploits, but even if you are right, that doesn't mean that acting as though
exploits exist is automatically the rational thing to do. For example, suppose we
thought there was a pretty low probability that exploits existed, but that if they
did, we would be screwed unless we accounted for them, and that we probably could
account for them by spending an extra 20 years before the debut of AI. In this
case, given the very low probability of the existence of exploits and the
astronomical waste of 20 years without AI, it seems like the decision to ignore
the possibility of exploits has more expected value, although I may just be
reacting to my bias to avoid sure losses.
> Now, it may or may not be *probable*, with what we know today, that exploits
> do exist, but just as a pack of dogs cannot reason about a helicopter they
> haven't had any experience of, I don't think we have any way to reason about
> the probability of exploits within those parts of physics we don't yet have
> nailed down.
Which is exactly why it is paranoid to worry about exploits.
> Just to be clear, I do not know of any [exploits]. My best guess about
> Eliezer's box-break involves reasoning with the jailer about what future
> jailers will do, combined with a carrot and stick.
For those joining the debate late, the issue of whether or not the AI could
talk its way out of a box is a separate issue from the existence of exploits.
I am in the process of writing up my argument and responses to various
objections. I will post a link when I am finished, and create a wiki
subdirectory of my subscriber page where y'all can write comments.
This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:51 MDT