Re: ethics

From: Thomas Buckner (
Date: Thu May 20 2004 - 19:26:17 MDT

--- "Eliezer S. Yudkowsky" <>
> I would presently support the flat general rule that things which look
> like minor problems, but which you don't quite understand, are blocker
> problems until fathomed completely. Mostly because of the number of
> things I have encountered which looked like minor problems, and which I
> didn't quite understand, and which - as it turned out, after I learned the
> rules - I desperately needed to understand.
>
> I do not expect anyone who *actually* understands FAI to *ever* use the
> argument of "We don't understand this, but we'll use it anyway because of
> <nitwit utilitarian argument>." The nitwit argument only applies because
> the speaker is too ignorant to realize that they have *no* chance of
> success, that the *only* reason they think they can build an FAI without
> understanding is that they lack the understanding to know this is impossible.
> --
> Eliezer S. Yudkowsky
> Research Fellow, Singularity Institute for Artificial Intelligence

We have a classic blocker problem hanging with
human-level intelligence, and if we can't solve it at
the human level, we may not have enough to go on for
anything beyond. I am referring to the fact that we
haven't beaten Failure of Friendliness among
ourselves, even among the most intelligent humans.
If we confine the inquiry only to those of proven high
intelligence, we get a range of behavioral models. We
get rapacious businessmen, renowned artists,
scientists who care how society will use their work,
others who don't, manipulative politicians, no-parole
murderers, and a few saintly types who approximate the
ideal we hope for in the FAI.
Even those who make a mark with their intellects make
moral choices that are good, bad, and indifferent, and
they do so with almost identical neural hardware and
cultural experiences.
Often different observers cannot even agree on whether
a given high-intelligence human is more or less
Friendly, i.e. ethical toward others. Like an
UnFriendly AI, some of society's pillars can fool
lesser intellects into seeing a Friendliness that is
not really there, and for far longer than seems
possible. Even nonhuman entities not created by
computer scientists (gasp!) can pursue complex
strategies of UnFriendliness far too baroque to have
been sired by one human brain. If I go into detail
about this assertion I will be accused of irrelevant
forays into geopolitics that are not germane to this
discussion. That accusation would be wrong, of course, but I
anticipate the objection.
This is connected to what I like to call the Sgt.
Schultz Principle. On the old TV show Hogan's Heroes,
a group of POWs would sneak around at will behind
their jailers' backs. There might have been
intelligent people among their captors, but they
relied heavily on the folly of Schultz, the very
stupidest and laziest guard in the camp. A bad AI may
not fool Eliezer, but if Eliezer is not the only
programmer in the lab, then it will simply find one it
can fool, and a majority is even better. What would
Eliezer do if shouted down by a quorum of dupes who
trusted the Bad AI?
A bad government could not rise or long stand if
everyone in the country saw through the deceptions it
needed to justify its grasp on power. That bad
governments exist in the world shows that bad leaders
have found ways to make some portion of the populace
believe that they are the best of the best.
An inner core of criminal minds who know, deep down,
that they are in the wrong, will surround the talented
sociopaths at the center of a dictatorship. But that's
simply not enough people to take over a nation. The
UnFriendly cadre must invariably surround itself with
millions of ordinary people who can be brainwashed.
Even if they lose their jobs, their sons, their
pensions and their clean water, these good sheep will
comply. As Machiavelli noted, the possession of power
confers a glamour of legitimacy. A mad, bloody king is
still the King. See also Wilhelm Reich's classic Mass
Psychology of Fascism.
These dupes trust the Bad Intellect at the center of a
dictatorship, and if an Eliezer tried to point out the
disconnect between word and deed, they would shout him
down (and maybe jail him). An inverse principle is in
play: the most brainwashed citizen feels himself to
be least brainwashed, while one who worries that he may
be brainwashed is already halfway out of the hall of
mirrors.
There is plenty of psychological knowhow in use among
those who mold opinion. For example, TV producers can
put one candidate at a subtle disadvantage by
arranging for his image to appear (say) half an inch
lower on the screen during a debate. This might be
done simply by raising the camera a bit, or image
manipulation in postproduction. Another example, used
in the infamous push telephone polls in South
Carolina's 2000 Republican primary, is 'poisoning the
well' by spreading false accusations which are known
to have a negative effect even when disproved. (The
only way researchers have found to render this sort of
attack completely ineffective is to explain the
psychological effect at the same time, using the false
accusation as an example).
How, in such an atmosphere, can we trust our own
ethical judgment when we are not even sure who is
telling the truth? What if you try to expose a Bad AI
and it calls you a glue-sniffing liar? What if it
hints that you're a mole from a competing firm, trying
to sabotage the project?
Even not-terribly-bright humans can play this game,
and we haven't found a way to make them stop.



This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:46 MDT