Re: Revising a Friendly AI

From: Eliezer S. Yudkowsky
Date: Sun Dec 10 2000 - 22:14:53 MST

Samantha Atkins wrote:
> A reasonably advanced AI should resist mere human tampering with its
> code. At least it should resist as far as rigorously testing the
> proposed changes before fully installing them. Humans are known
> unreliable logic sources. An AI should no more (and arguably less)
> accept human input into its code stream unchecked than a human should
> accept random beliefs from significant others just because they claim
> s/he should believe them.

The key phrase is "reasonably advanced". "Reasonably advanced" means
smarter and more philosophically sophisticated than the programmers.
Until then, you defer to your creators because they're smarter than you
are. I'm sure that there'll be plenty of give-and-take in this system;
i.e., the AI says "Are you sure?" and the humans say "Oops." But if the
AI says "Are you sure?" and the human says "Trust me", then the logically
correct decision is to go along with the more intelligent of the pair...
which will be humans, for a good long time. The AI may make a little note
that says, "Well, it looks to *me* like he's flat wrong for perfectly
understandable reasons having to do with the poor traumatized bastard's
evolutionary history." But until the AI is actually smarter than the
human, the genuine probability is that it's the *AI* - not the
human - who's missing the key factor.

If you can, make sure that the situation won't develop into a disaster if
the human is wrong and you are right. But never get into a situation
where disaster will ensue if the human is right and you are wrong. And if
you can't cover both bets, trust the human. That is the rational decision
for a young AI.
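
As a rough illustration only, the three rules above can be written as a
small decision procedure. This is a sketch under stated assumptions, not a
design: the names Option, safe_if_human_right, safe_if_ai_right and choose
are hypothetical, and it assumes each course of action can be labeled in
advance with whether disaster follows under each hypothesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """A hypothetical course of action open to the young AI."""
    name: str
    # True if no disaster follows when the human is right and the AI is wrong.
    safe_if_human_right: bool
    # True if no disaster follows when the AI is right and the human is wrong.
    safe_if_ai_right: bool


def choose(options, human_choice):
    """Pick an action using the three rules from the paragraph above."""
    # Options that cover both bets: safe whoever turns out to be right.
    covered = [o for o in options
               if o.safe_if_human_right and o.safe_if_ai_right]
    # If the human's own proposal already covers both bets, just take it.
    if human_choice in covered:
        return human_choice
    # Otherwise prefer any option that covers both bets.
    if covered:
        return covered[0]
    # Can't cover both bets: trust the human.
    return human_choice


# Illustrative labels only; real situations will not come pre-labeled.
apply_now = Option("apply change now", True, False)
staged = Option("apply change with rollback", True, True)
refuse = Option("refuse change", False, True)

print(choose([apply_now, staged, refuse], apply_now).name)
print(choose([apply_now, refuse], apply_now).name)
```

With the rollback option available the sketch takes it, since that covers
both bets; without it, the sketch falls back to what the human asked for
rather than refusing.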

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
