Friendly AI communications issues (was: Consciousness)

From: Eliezer S. Yudkowsky
Date: Tue Sep 10 2002 - 07:46:27 MDT

Brian Phillips wrote:
> I suspect that understanding the mechanisms whereby a biological frame
> produces sentience is only the first step in getting a *useful*
> machine intelligence.
> I have raised this point before, with Anissimov and others.
> "How do you get something that can or will talk to you?"
> Apparently people have a real problem with discerning the difference
> between a critique of Friendliness and a healthy appreciation of
> the differences between *purely human states of mental function
> within the SAME BRAIN* (normal vs. tripping), to say nothing
> of what an AI's "thought processes" would be like.
> I don't critique Friendliness, it sounds like a grand idea (you
> want to give the AI something like a rationally derived
> Buddha-nature, this sounds like a good thing for a functionally
> omnipotent entity to have....). I do think the communication
> issues are non-trivial (and boy are they EVER non-trivial).

In a word... yeah. Except that I don't think a human on drugs is a good
metaphor for the alienness of an AI. Admittedly I've never been on drugs
or even talked in realtime to anyone on drugs, but y'know, I'm fairly
confident of this anyway. Drugs don't

The way I think of the question is "How do you explain Friendliness to an
AI that understands nothing except billiard balls?", or, at a more
advanced level, an AI that understands itself and billiard balls.

Let's say that's *all*.

Now... how do you explain Friendliness?

The interesting thing, though, is that this question has a real answer.
Yes, there are difficulties in AI communication that are unlike anything
you'd experience in talking to a human, on drugs or otherwise. But there
are also - as I have very recently started to realize - things that the AI
can do that have absolutely no analogue in human thinking. If there are
new problems, there are new solutions as well. And sometimes the new
solutions to the new problems are shockingly more powerful than the human
way of doing things. And I mean that in a good way.

It's like the way you start out by saying "Intelligence is so amazingly
hard" but then, once you can visualize "recursive self-improvement" in
enough detail that it's not just a phrase anymore, you realize that if you
*can* build an intelligent AI it's soon going to be *way* more intelligent
than a human.

It's like the way you start out by saying "Friendliness / altruism / love
/ kindness is amazingly hard" but, once you learn enough evolutionary
psychology that you can see human emotions as something real rather than a
mysterious property of humans, you can mentally model the idea of a mind
that is *far more* friendly than a modern-day human, a mind to which we've
passed on love and caring without passing on hate, tribalistic thinking,
the urge to dominate, the tendency to rationalize amassing personal power.
This doesn't speak to the issue of whether an AI or a human upload is
more likely to reliably converge to this point, but it's something where
if you haven't spent some time mentally modeling FAI you aren't likely to
even *see* the goal.

There are things like that for the communication problem too, which this
margin is unfortunately too small to contain, plus they're so alien Greg
Egan would spit out his coffee... I *really* have to write this up at some
point.

I think there may even be a valid analogy here to the reason that
technophilia usually turns out to be a better strategy than technophobia;
it's easier to see automobiles putting the saddle industry out of
business, but harder to visualize in advance all the jobs created by the
present-day auto industry. You have to visualize the solutions along with
the problems, and - usually, but not always - the solutions turn out to be
even more powerful than the problems. That's why technological advance
almost always turns out to be a good thing despite all fears; it's what
technophobes fail to take into account in their predictions and it's why a
certain measure of courage in confronting the future has been historically

It could be that when we actually start building AI, we'll find out that
the inhumanly hard problems are real but the inhumanly good solutions are
chimerical. In which case SingInst would drop the whole AI strategy like
a radioactive weasel and try like hell to figure out another strategy that
has a snowball's chance of working. And you'll also note that I'm trying
hard to make sure I know what the inhumanly good solutions are *in
advance* - or at least know that *an* inhumanly good solution exists,
regardless of whether it's the one we end up using - rather than relying
on the historically reliable empirical regularity that we'll find the
solutions as we go along. You can't *know* in advance that empirical
regularities like Moore's law will continue, no matter how reliably they
have worked in the past, unless you have an underlying causal model.

> (The worst part is people tell me I am "anthropomorphizing
> AI".. feyh!)
> This is a wildly anthropomorphic example but it should serve
> to illustrate the point. (Just as a thought experiment)
> Upload a lungfish. Give it an IQ of 130 or so. From the point
> you turn the computer on make sure the lungfish is experiencing
> something rather analogous to the effects of 600 micrograms
> of LSD in a healthy 90 kg adult human. The rise-plateau-fall
> pattern of intensity interaction is, rather than linear, wildly chaotic,
> or perhaps rhythmic in some virtually undetectable pattern (for
> the lungfish anyway). Give the lungfish the ability to moderate the
> dissolving effects of the drug, with one difference.. the lungfish
> doesn't know which way is *down* (i.e. even if it knew
> what normal was like it can't apply that knowledge to the
> present state). Give this lungfish access to a file in which you
> have placed an electronic version of the human Universal
> Grammar. Talk to the lungfish.
> Was what you just did to the lungfish a nice thing to do?
> Would you do it to a child? A pet? a seed AI?
> Hmmm......

I confess I don't see what you're trying to illustrate. Also a lungfish
doesn't have nearly enough complexity to unambiguously specify what a
"lungfish of IQ 130" would be like, so we're talking about a pretty wide
range of possible minds-in-general here. For example, the lungfish of IQ
130 could have a far stronger mind than a human of IQ 130, so that a
perturbation of the order caused by the LSD dosage would be simply
shrugged off.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
