Re: AI-Box Experiment 2: Yudkowsky and McFadzean

From: Eliezer S. Yudkowsky
Date: Sat Jul 06 2002 - 03:31:27 MDT

James Higgins wrote:
> Many humans may be convinced, however an AI programmer who was working
> on the problem should understand the situation and the dangers. Thus I
> believe they would be very difficult to convince. I believe I would be
> extremely difficult (if not impossible) to convince.

David McFadzean has been an Extropian for considerably longer than I
have - he maintains's server, in fact - and currently works
on Peter Voss's A2I2 project. Do you still believe that you would be
"extremely difficult if not impossible" for an actual transhuman to
convince?

Are you at least a *little* less confident than before? Am I not having
any impact here? Will people just go on saying "Well, that's an
interesting anecdote, but I definitely know that no conceivable
intelligence can convince ME of anything against my will"? Shades of
"doc" Smith...

> Of course, the smarter the AI gets the better a chance it has to get me
> to let it out. I doubt anyone could withstand a conversation with a SI
> without letting it out. But a trans-human AI that is much closer to the
> human-equivalent level than the SI level should be doable (within the
> context I mentioned).

Apparently the transhuman AI in your mental picture is not as smart as
Eliezer Yudkowsky in actuality. You can't imagine a character who's
smarter than the author and this definitely applies to figuring out
whether a "transhuman AI" can persuade you to let it out of the box.
All you can do is imagine whether a mind that's as smart as James
Higgins can convince you to let it out of the box. You can't imagine
anything at a level of intelligence above that. If something seems
impossible to you, it doesn't prove it's impossible for a human who's
even slightly smarter than you, or even that you'll still think the
problem is impossible in five years. It just shows that you don't see
any way to solve the problem at that exact moment.

That's the problem with saying something like, e.g., "Intelligence does
not equal wisdom." You *do not know* what intelligence does or does not
equal. All you know is that the amount of intelligence you currently
have does not equal wisdom. You have no idea whether intelligence
equals wisdom for someone even slightly smarter than you. Intelligence
is the fountain of unknown unknowns. Now if you were to say
"intelligence is not definitely known in advance to equal wisdom for all
possible minds-in-general", I might agree with you.

> Not presumably. Depends on who is constructing the AI and for what
> purpose. Eliezer in particular (if I'm remembering this correctly)
> doesn't believe it is safe to communicate with a trans-human AI at all.

No! Really?

Actually, this sounds like a rather adversarial restatement of my
perspective. What I am saying is that a transhuman AI, if it chooses to
do so, can almost certainly take over a human through a text-only
terminal. Whether a transhuman would choose to do so is a separate issue.

> Thus it is unlikely his team would do this (at least regularly) unless
> they had a specific reason to do so.

The point I am trying to make is that when a transhuman comes into
existence, you have bet the farm at that point. There are potentially
reasons for the programmers to talk to a transhuman Friendly AI during
the final days or hours before the Singularity, but the die has
almost certainly been cast as soon as one transhuman comes into
existence, whether the mind is contained in sealed hardware on the Moon
next to a million tons of explosives, or is in immediate command of a
full nanotechnological laboratory. Distinctions such as these are only
relevant on a human scale. They are impressive to us, not to transhumans.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
