From: Eliezer S. Yudkowsky (email@example.com)
Date: Wed Feb 15 2006 - 10:47:28 MST
Joshua Fox wrote:
> Yes, I know that they are working on _Friendly_ GAI. But my question is:
> What reason is there to think that the Institute has any real chance of
> winning the race to General Artificial Intelligence of any sort, beating
> out those thousands of very smart GAI researchers?
> Though it might be a very bad thing for nonFriendly GAI to emerge first,
> it seems to me by far more likely for someone else --there are a lot of
> smart people out there -- to beat the Institute to the goal of GAI.
Through no fault on the part of the poster, who has asked a question
that seems ordinary enough from his perspective, this is a "wrong
question" from the perspective of anyone trying to build an AI - or do
anything difficult as a scientist or engineer. You don't want to come
up with convincing reasons why you can solve the problem. You just want
to solve the problem. Any attention you devote to comparing yourself to
other people is wasted neural firings.
As HC also pointed out, you play the cards you're dealt. If you've got
to beat a thousand other researchers to the punch to prevent the world
from blowing up, then that's what you gotta do. You should not mistake
the Singularity Institute for futurists. We are not here to prophesy
that AI *will be* Friendly. It's an easy point to become confused over,
because most people talking in this mindspace are futurists; they want
to convince you that things *will* turn out nicely. We make no such claim.
I will try to answer anyway, as best I can, the question as you put it.
If I thought the probability of winning was negligible, I'd look for
other cards to play.
Suppose I walk into a ballroom full of PhD physicists. Can I, a
nonphysicist, tell who in that room has the best likelihood of making
significant advances in physics?
I can try to sort the physicists into a stratum of relatively uninspired
people who picked up a PhD in college, and a stratum of dedicated
geniuses. This sorting will not be perfectly reliable but it may be
discriminating enough to make it worthwhile.
Competence is made up of fluid intelligence, crystallized skill, and
background knowledge. I can't detect crystallized skill or background
knowledge in the domain of physics without possessing it myself. I can
try to detect fluid intelligence. But short of becoming a physicist
myself, I may have no luck at all in discriminating among the people who
strike me as smart, unless they possess crystallized skill or background
knowledge which I happen to share. If a physicist launches into a
lecture on cognitive science, I can label him as "+1 Polymath", or
detect a mistake if he makes one. Similarly for people who start
talking about Bayes, biases, or the philosophy of science, and get "+1
I've run into people whom others described as "very smart", who not only
struck me as not "very smart", but as quite noticeably less smart than
other people I know. I strongly suspect that everyone significantly
smarter than a given perceiver tends to be perceived as just "very
smart". The hypothesis here is that if you've got IQ 130, you can
distinguish grades of intelligence up to IQ 140, and everyone smarter
than that is just "very smart". I don't think this is actually true,
but I think there's a grain of truth in it, meaning that your ability to
detect differences in grades of intelligence decreases as the distance
above you increases. Someone once asked me if I considered myself a
genius. I immediately inquired how rare a level of intelligence is
required to qualify as "genius". The person thought for a moment and
replied, "1 in 300". I laughed involuntarily and assured him that, yes,
I was a "genius".
There are not thousands of AGI researchers in the world. I doubt there
are so many as a hundred. And they are "very smart" to widely different degrees.
Observed behavior can set upper bounds on competence. When you make a
specific observation of this type, you automatically offend people who
are underneath the upper bound. My father, a Ph.D. physicist and
believing Orthodox Jew, would not agree that his acceptance of the Old
Testament as the factual word of God sets an upper bound on his
rationality skills. But we are not talking about a subtle mistake.
My father is more rational than the average human; for example, he
taught me some simple magic tricks to make sure I wasn't taken in by
supposed psychics. My father looks with contempt upon Jewish sects
which practice what he regards as superstition - Kabbalah and so on.
But my father cannot possibly be a *world-class* rationalist. That's
beyond the bounds of possibility given his observed behavior.
Many atheists, maybe even most atheists, would be reluctant to say that
there is a limit to how smart you can be and still be religious. Wasn't
Isaac Newton religious? It is historical fact that Newton wasted most
of his life on Christian mysticism. The notion of observed behavior
setting an upper bound on competence should be understood as a 3D
surface over fluid intelligence, relevant crystallized skill, and
relevant factual knowledge. Newton lived before Darwin, in an era when
humanity's grasp of science and of scientific procedure was much
weaker. If Newton lived today and was still a Christian, I'd penalize
him a lot more points for the mistake. Also, physics is not as directly
relevant to religion or rationality as other disciplines. Darwin's
observations forcibly stripped him of his belief in a personal loving
God, but he remained a deist. Laplace, the inventor of what is now
known as Bayesian probability theory, was questioned by Napoleon as to
whether his astronomy book made mention of God. Laplace famously
replied, "Your Highness, I have no need of that hypothesis."
Many atheists, probably a majority, arrived at that conclusion through
some degree of luck; not because their rationality skills lay above the
upper bound that *forces* someone to become an atheist. There are
atheists who profess themselves as having unsupported faith in a
proposition, the nonexistence of God, which strikes them as more
pleasant than its negation; atheists who try to keep an open mind about
the healing powers of crystals; and atheists who are atheists because
their parents raised them as atheists.
People who are not sufficiently competent themselves, may be very
skeptical about the idea of competence *forcing* you to a particular
position. People who know the probabilities and still buy lottery
tickets set upper bounds on how well they can have internalized the
concept of a hundred-million-to-one probability; good luck explaining
that to them. People whose mix of fluid intelligence, relevant
crystallized skill, and relevant knowledge, does not *force* them to
believe in evolution, have a hard time understanding that evolution is
not "just a theory". And of course you can't convince them that their
"openmindedness" is the result of insufficient competence, or that their
"openmindedness" sets a hard upper bound on how competent they could be.
They are not willing to believe - to really, emotionally believe, as
opposed to claiming to entertain the theoretical possibility - that
someone else could have a mind stronger than their own, which is
inevitably forced to a single verdict favoring evolution.
Now let's consider an AGI researcher working on a "human-level" AGI
project in which Friendliness is not a first-class technical requirement
actively shaping the AI. Unless the researchers are setting out in
deliberate intent to destroy the world, there is an upper bound on how
competent they can be *at AGI*. That is, there is a 2D surface which
bounds the combination of crystallized skill at AGI research, and
knowledge of related sciences, which they can possibly be using to
challenge the problem. (Unfortunately, in this domain, I don't think
there's any associated bound on raw g-factor. Lacking knowledge and
skill and rationality, you can be at the human limits of fluid
intelligence and still get it wrong, a la Newton on religion.)
Trying to explain exactly which AGI skills they can't possibly have,
stumbles over the problem of the skills themselves being harder to
explain than any one issue that rests on them. If you look at a dynamic
computational process, and you expect it to be Friendly for no good
reason, then that bounds your skill at noticing what a piece of code
really does, and the rigor of the standards to which you hold yourself
in predicting that a piece of code does pleasant things. If you were
sufficiently skilled at AGI thought, you'd write a walkthrough showing
exactly how the nice thing happened, or else you wouldn't expect it to
happen. This, whether the nice thing you wanted consisted of something
"Friendly" or something "intelligent".
Trying to explain this in words, I see that it sounds very vague - not
more vague than most AI discussion, perhaps, but much too vague for an
FAI researcher to accept. Some of these concepts are explained more
clearly in "A Technical Explanation of Technical Explanation". If
you've read that, you remember that people will invent magical
explanations like "phlogiston" or "elan vital" or "emergence", and not
notice that they are magical; it is not an error that humans notice by
instinct, which is why it is so common in history. If you've read
_Technical Explanation_ plus Judea Pearl, then you will understand when
I say that bad explanations for intelligence consist of causal graphs
labeled with portentous words: leaf nodes for desired outcomes such as
"intelligence" or "benevolence", and parent nodes for causes such as
"emergence" or "spontaneous order", with arcs reinforced by perceived
correlation (the one says, humans are "emergent", humans are
"intelligent", from this correlation I infer necessary and sufficient
causation). If you come up with a bad explanation for intelligence, and
you are sufficiently enthusiastic about it, you can declare yourself an
AGI researcher. That's most of the AGI researchers out there, at least
right now. People who can't give you a walkthrough of how their program
will behave intelligently (let alone nicely), but they have a bright
idea about intelligence, and they want to test it experimentally.
That's what their teachers told them science was all about.
There's a good amount of knowledge you can acquire, such as evolutionary
biology, heuristics and biases, experimental study of anthropomorphism,
evolutionary psychology, etc. etc., which will make it *more difficult*
to stare at a computer program and think that it will magically do nice
things. Unfortunately, the possible lack of such knowledge in AGI
researchers doesn't give FAI researchers any significant advantage,
since evolutionary biology is not directly relevant to constructing an
AGI. Worse, you can know quite a few individual disciplines before they
*combine* to *force* a correct answer, since it only takes a *single*
mistake not to get there.
In this domain, I doubt there is any humanly possible level of raw fluid
intelligence, which would *force* you to get the answer right in the
absence of skill and knowledge. I.e., Newton was extraordinarily
intelligent but still failed on easy tests of rationality because he
lacked knowledge we take for granted. Relative to the background of
modern science, AGI and FAI are hard enough as problems that no humanly
possible level of g-factor alone will force you to get it right. This
is bad because it means you can get incredibly talented mathematicians
trying to build an AGI, without them even realizing that FAI is a
problem. But they are still limited in how deeply they can understand
intelligence; they can grasp facets and combine powerful tools and
What an FAI researcher can theoretically do, which would require
competence above the bound implied by trying to write an AGI *without*
FAI, is write an AI based on a complete understanding of intelligence.
An FAI researcher knows they are forbidden to invoke and use concepts
that they don't fully and nonmagically understand (again, see TechExp to
gain a clearer grasp on what this means). When you're staring at a
blank sheet of paper, trying to reason out how an aspect of cognition
works, in advance of designing your FAI, then your thoughts may bounce
off the rubber walls of magical things. But you will be aware of your
own lack of understanding, and you will be aware that you are prohibited
from making use of the magic until it has ceased to be magical to you.
And that's not just an FAI skill, it's an AGI skill - although realizing
that you need to do FAI causes you to elevate this skill to a much
higher level of importance, because you are no longer *allowed* to just
try stuff and see if something unexpectedly works.
If an FAI project comes first, it will be because the researchers of
that project had a much deeper understanding.
Again, I am not saying that you *can't* build an AGI without being
sufficiently competent that your theory grabs you by the throat and
forces you to elevate FAI to a first-class technical requirement.
Natural selection produced humans without exercising any design
intelligence whatsoever. But there's a limit to how well you can
visualize and understand your AI, and yet make elementary, gaping,
obvious mistakes about whether the AI will be nice. (Unfortunately, I
suspect that you can understand your AI fully, and still make more
subtle mistakes, more dignified forms of failure that are just as lethal.)
If you're a researcher building an F-less AGI and you're not
deliberately out to destroy the world, there are things you can't know,
skills you can't have, and a limit on how well you can understand the AI
you're trying to build. You can be a tremendous genius, possibly at the
limits of human intelligence, but if so that sets an even stricter upper
bound on your crystallized skill and relevant knowledge. Most such
folks *won't* be tremendous geniuses. World-class geniuses will be rare
among AGI researchers, simply because world-class geniuses are rare in
general. There is no physical law that prohibits a non-world-class
genius from declaring themselves an AGI researcher; even the Mentifexes
and Marc Geddeses of the world do it.
So it is not that FAI and AGI projects are racing at the same speed,
toward a goal which is miles more distant for FAI projects because FAI
projects have additional requirements. The AGI projects are bounded in
their competence, or they will turn into FAI projects; if their vision
grows clear enough it will *force* the realization that to develop a
nonlethal design they must junk their theories and start over with a
higher standard of understanding. FAI projects can continue on past
that point of realization, and develop the skills which come afterward
in the order of learning. The advantage is not entirely to the
ignorant, nor to the careless.
I am sure that it is possible to spend years thinking about FAI, hold
yourself to the standard of understanding every concept which you invoke
and being able to walk through every nice behavior you expect, and yet
make a nonobvious lethal mistake, and so fail. But the projects whose
AGIs would *automatically* kill off humanity, the projects that must fail
at FAI by *default* - are, yes, genuinely limited in their competence.
To reduce it to a slogan that fits on a T-Shirt:
There is a limit to how competent you can be, and still be that stupid.
It's a *very high* limit. There's more to it than raw g-factor. People
can be *that stupid*, and still look "very smart". I can even conceive
that they might *genuinely* be very smart, though I've yet to encounter
a failing-by-default AGI researcher who strikes me as being on a level
with, say, Judea Pearl.
So the life-or-death problem reduces to whether people permissibly
smarter than an upper bound can accomplish a colossally difficult task
by means of exactly understanding it; before at least one member of a
much larger pool of people, a few of whom are "very smart", but none of
them more competent than the upper bound, can accomplish a task, whose
difficulty is merely huge, through work that includes a substantial
component of guessing, vagueness, and luck.
And remember also this. This is the pass-fail test for the human
species itself. The other AGI projects are, most of them, not fighting
that battle. Sadly there is no monopoly on enthusiasm; perhaps some of
the other AGI projects will work 20 hours per day on our doom. I wish I
could say that any sufficiently intelligent human being *must* see the
pass-fail test of our survival, and devote all their resources to
passing it. Unfortunately this again requires crystallized skill and
background knowledge, not just g-factor. When I look over my own past
history, I find that there was, dismayingly, a significant component of
luck to noticing the pass-fail test. That is not something I am
comfortable with, and I will try to do better in the future, win on
purpose instead of by accident.
It still strikes me as endlessly strange that so few people should care
about this matter, the hinge of time around which all else turns. Yet I
cannot boast that my course here was inevitable - I wonder sometimes if
I *could* have made it without a component of luck, but that is not what
actually happened. But if I had been smart enough to bulldoze through
life on sheer raw intelligence, as opposed to my grand-uncle loaning me
a copy of "Great Mambo Chicken and the Transhuman Condition" when I was
11, then there would be many others of slightly lesser intelligence,
some of whom would succeed through an event path that included a
component of luck. And there would be many FAI researchers; the
battlefield would not be so deserted.
It would seem that sufficient fluid intelligence, crystallized skill at
rationality, and background knowledge of science, is the price only of
it being *possible* to find the hinge of time and move to defend it, if
you are also lucky. Yet even if there is a component of luck, we may
hope that it will be some of the very best of the human species who move
to prevent the world from destruction. That it will *not* be an equal
dispute between bands of people who happened to major in computer
science, or even an equal dispute between scientific geniuses who fell
into the particular field of AI. The FAI project may draw on resources
and brainpower allocated to the defense of the human species. Which, in
this time, is practically nothing, because the understanding is so rare.
But among *very very* smart individuals, if the FAI project has need
of them, that rationality may not be so rare.
And it may be that there will not be enough smart people among the human
species who have also the luck to discover the one important problem; or
the smart people will not be able to raise enough of an advantage in
competence to overcome the greater intrinsic difficulty of their
problem; or the human species will not have sufficient wisdom among its
six billions to allocate even the tiny proportion of its resources that
it would need to defend itself. And that will be the end of the human species.
But it is not a foregone conclusion. It is worth putting up a fight.
--
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence