Re: Fighting UFAI

From: D. Alex
Date: Wed Jul 20 2005 - 08:34:44 MDT

Despite reading all the old posts I can find on "AI boxing", I cannot see
that the "idea of containment is played out". The arguments against
containment seem to be based on unproven assertions and inappropriate
comparisons - and I would love to have someone correct me here and PROVE
that AI boxing is not feasible.

I also cannot accept that an AI that would be required to PROVE its benign
nature (prove mathematically while staying fully within the box) would be
able to pull the wool over the eyes of an appropriate gatekeeper committee.
A proof is by definition sufficient, and just because it can be understood
and validated by such a committee does not make it invalid, even though a
much higher intelligence had the opportunity to bend it to its ends. There
need not be room for any subtleties beyond human understanding in a proof
like that.

Finally, why would a proof provided by the AI itself be less reliable than a
set of principles developed by humans to build friendly AI? It would seem
easier by far to check than to effectively develop a proof.

D. Alex

----- Original Message -----
From: "justin corwin" <>
To: <>
Sent: Thursday, July 14, 2005 2:01 AM
Subject: Re: Fighting UFAI

> Appreciation to E for the override.
> In deference to Mitch Howe, I think that the idea of containment is
> played out. But we're not exactly discussing keeping an AI in a box
> against its will, but whether or not its existence in the world means
> our immediate destruction, or if we have some game-theoretic chance of
> defending ourselves.
> I tend to think there is a chance that a self-improving AI could be so
> smart so fast that it doesn't make sense to try to evaluate its
> 'power' relative to us. We could be entirely at its mercy (if it
> had any). This bare possibility is enough to warp the planning of any
> AI theorist, as it's not very good to continue with plans that have
> big open existential risks in them.
> The question is, how likely is that? This is important, quite aside
> from the problem that it's only one part of the equation. Suppose I,
> Eliezer and other self improvement enthusiasts are quite wrong about
> the scaling speed of self improving cognition. We might see AGI
> designs stuck at infrahuman intelligence for ten years (or a thousand,
> but that's a different discussion). In that ten years, do you think
> that even a project that started out as friendly-compliant (whatever
> that means) would remain so? I imagine even I might have trouble
> continuing to treat it with the respect and developmental fear that it
> deserves. To be frank, if AGIs are stuck at comparatively low levels
> of intelligence for any amount of time, they're going to be
> productized, and used everywhere. That's an entirely different kind of
> safety problem. Which one is the first up the knee of the curve?
> Yours? IBM's? An unknown? There are safety concerns here that must be
> applied societally, or at least not to a single AI.
> There are other possible scenarios. Broken AGIs may exist in the
> future, which could be just as dangerous as unFriendly ones. Another
> spectre I worry about is something I call a commandline intelligence,
> with no upperlevel goal system, just mindlessly optimizing against
> input commands of a privileged user. Such a system would be
> fantastically dangerous, even before it started unintended
> optimization. It would be the equivalent of a stunted human upload:
> all the powers, none of the intelligence upgrade.
> These and other scenarios may not be realistic, but I have thought
> about them. If they are possible, they deserve consideration, just as
> much (relative to their possibility, severity, and possibility of
> resolution) as the possibility of a single super-fast self improving
> AI, which will be either Friendly or unFriendly, depending on who
> writes it.
> So what's likely? What can we plan for, and what solutions may there
> be? None of the above are particularly well served by planning a
> Friendly architecture, and nothing else. Someone else talked about
> safety measures; I think that may be a realistic thing to think about
> *for some classes of error*. The question is, how likely are those
> classes, and does it make sense to worry about them more than the edge
> cases.
> For those of you who are still shaking your heads at the impossibility
> of defending against a transhuman intelligence, let me point out some
> scale. If you imagine that an ascendant AI might take 2 hours from
> first getting out to transcension, that's more than enough time for a
> forewarned military from one of the superpowers to physically destroy
> a significant portion of internet infrastructure (mines, perhaps), and
> EMP the whole world into the 17th century (ICBMs set for high-altitude
> airburst would take less than 45 minutes from anywhere in the
> world, plus America, at least, has EMP weapons; the number of shielded
> computing centers is minuscule). We may be stupid monkeys, but we've
> spent a lot of time preparing for the use of force. Arguing that we
> would be impotent in front of a new threat requires some fancy
> stepping. I, for one, tend to think there might be classes of danger
> we could defend against, which are worth defending against.
> I am open to persuasion, of course.
> --
> Justin Corwin
