From: Thomas McCabe (firstname.lastname@example.org)
Date: Tue Jan 29 2008 - 13:41:53 MST
It's too early to start thinking about Friendly AI
The "it is too early to worry about the dangers of AI" argument has
some merit, but as Eliezer Yudkowsky notes, there was very little
discussion about the dangers of AI even back when researchers thought
it was just around the corner. What is needed is a mindset of caution
- a way of thinking that makes safety issues the first priority, and
which is shared by all researchers working on AI. A mindset like that
does not spontaneously appear - it takes either decades of careful
cultivation, or sudden catastrophes that shock people into realizing
the dangers. Environmental activists have been talking about the
dangers of climate change for decades now, but they are only now
starting to get taken seriously. Soviet engineers obviously did not
have a mindset of caution when they designed the Chernobyl power
plant, nor did its operators when they started the fateful experiment.
Most AI researchers do not have a mindset of caution that makes them
consider thrice every detail of their system architectures - or even
makes them realize that there are dangers. If active discussion is
postponed to the moment when AI is starting to become a real threat,
then it will be too late to foster that mindset.
There is also the issue of our current awareness of risks influencing
our AI engineering techniques. Investors who have only been told of
the promising sides are likely to pressure the researchers to pursue
progress by any means available - or if the original researchers are
aware of the risks and refuse to do so, the investors will hire other
researchers who are less aware of them. To quote Artificial
Intelligence as a Positive and Negative Factor in Global Risk:
"The field of AI has techniques, such as neural networks and
evolutionary programming, which have grown in power with the slow
tweaking of decades. But neural networks are opaque - the user has no
idea how the neural net is making its decisions - and cannot easily be
rendered un-opaque; the people who invented and polished neural
networks were not thinking about the long-term problems of Friendly
AI. Evolutionary programming (EP) is stochastic, and does not
precisely preserve the optimization target in the generated code; EP
gives you code that does what you ask, most of the time, under the
tested circumstances, but the code may also do something else on the
side. EP is a powerful, still maturing technique that is intrinsically
unsuited to the demands of Friendly AI. Friendly AI, as I have
proposed it, requires repeated cycles of recursive self-improvement
that precisely preserve a stable optimization target.
The most powerful current AI techniques, as they were developed and
then polished and improved over time, have basic incompatibilities
with the requirements of Friendly AI as I currently see them. The Y2K
problem - which proved very expensive to fix, though not
global-catastrophic - analogously arose from failing to foresee
tomorrow's design requirements. The nightmare scenario is that we find
ourselves stuck with a catalog of mature, powerful, publicly available
AI techniques which combine to yield non-Friendly AI, but which cannot
be used to build Friendly AI without redoing the last three decades of
AI work from scratch."
Development towards AI will be gradual. Methods will pop up to deal with it.
Unfortunately, it is by no means a given that society will have
time to adapt to artificial intelligences. Once roughly human-level
intelligence has been reached, there are many ways for an AI to become
vastly more intelligent (and thus more powerful) than humans in a very
short time:
Hardware increase/speed-up. Once a given amount of hardware reaches
human equivalence, it may be possible to make the AI faster simply by
adding more hardware. While the increase isn't necessarily linear -
many systems must spend a growing fraction of their resources on
managing overhead as their scale increases - it is daunting to
imagine a mind which is human-equivalent and then has five times as
many processors and as much memory added on. AIs might also be capable of
increasing the general speed of development - [Staring into the
Singularity] has a hypothetical scenario with technological
development being done by AIs, which themselves double in (hardware)
speed every two years - two subjective years, which shorten as their
speed goes up. A Model-1 AI takes two years to develop the Model-2 AI,
which takes a year to develop the Model-3 AI, which takes six
months to develop the Model-4 AI, which takes three months to develop
the Model-5 AI...
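
The arithmetic behind this scenario can be made explicit. What follows
is only an illustrative sketch in Python, not anything taken from
Staring into the Singularity itself: the function name
development_schedule and its parameters are made up for this example,
and it assumes that each generation takes exactly half as long as the
one before it. Under that assumption the step times form a geometric
series (2 + 1 + 1/2 + ...), so the total elapsed time approaches four
years but never reaches it, however many generations are added.

```python
def development_schedule(first_step_years=2.0, generations=10):
    """Return a list of (model_built, step_years, elapsed_years) tuples.

    Hypothetical helper: assumes the Model-1 AI needs `first_step_years`
    to build Model-2, and that every later step takes half as long as
    the previous one.
    """
    schedule = []
    elapsed = 0.0
    step = first_step_years
    for builder in range(1, generations + 1):
        elapsed += step
        # Model number `builder` has just finished building `builder + 1`.
        schedule.append((builder + 1, step, elapsed))
        step /= 2.0
    return schedule


if __name__ == "__main__":
    for model, step, elapsed in development_schedule(generations=8):
        print(f"Model-{model} ready after a {step:g}-year step "
              f"(total elapsed: {elapsed:g} years)")
```

Changing first_step_years rescales the whole schedule; the limit is
always twice the first step, which is why the scenario reads as a
finite horizon rather than as steady exponential growth.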
Instant reproduction. An AI can "create offspring" very fast, by
simply copying itself to any system to which it has access. Likewise,
if the memories and knowledge obtained by the different AIs are in an
easily transferable format, they can simply be copied, enabling
computer systems to learn immense amounts of information in an
instant.
Software self-improvement involves the computer studying itself and
applying its intelligence to modifying itself to become more
intelligent, then using that improved intelligence to modify itself
further. An AI could make itself more intelligent by, for instance,
studying its learning algorithms for signs of bias and replacing them
with better ones, developing more effective ways to manage its
working memory, or creating entirely new program modules for handling
particular tasks. Each round of improvement would make the AI smarter
and accelerate continued self-improvement. An early, primitive example
of this sort of capability was EURISKO, a computer program composed of
different heuristics (rules of thumb) which it used for learning and
for creating and modifying its own heuristics. Having been fed
hundreds of pages of rules for the Traveller science fiction wargame,
EURISKO began running simulated battles between different fleets of
its own design, abstracting useful principles into new heuristics and
modifying old ones with the help of its creator. When EURISKO was
eventually entered into a tournament, the fleet of its design won the
contest single-handedly. In response, the organizers of the contest
revised the rules, releasing the new set only a short time
before the next contest. According to the creator of the program,
Douglas Lenat, the original EURISKO would not have had the time to
design a new fleet in such a short time - but now it had learned
enough general-purpose heuristics from the first contest that it could
build a fleet that won the contest, even with the modified rules.
And it is much easier to improve a purely digital entity than it is to
improve human beings: an electronic being can be built in a modular
fashion and have bits of it re-written from scratch. The minds of
human beings have evolved to be hopelessly interdependent and are so
fragile that they easily develop numerous traumas and disorders even
without outside tampering.
* Friendliness is trivially achieved. People evolved from selfish
self-replicators; AIs will "evolve" from programs which exist solely
to fulfill our wishes. Without evolution building them, AIs will
automatically be Friendly.
o Possible rebuttals:
- An AI doesn't need to be selfish in order to be unsafe. The fact
that AIs were not shaped by evolution to have the same moral
intuitions as we do is precisely the problem. Might be good to link to
Eli's posts on the topic (1) (2) here.
- Most humans, when placed in positions of power, are not Friendly in
the FAI sense. History is rife with abuse of power; just look at
Hitler, Stalin, and Mao, who led large portions of the world for
decades.
- Directed evolution on a computer is not likely to resemble human
evolution. Most of the selection pressures which drove human altruism
will be absent or
* Trying to build Friendly AI is pointless, as a Singularity is by
definition beyond human understanding and control.
* Unfriendly AI is much easier than Friendly AI, so we are going
to be destroyed regardless.
o Rebuttal synopsis: There's no point in giving up the
future of humanity just because things seem bleak.
* Other technologies, such as nanotechnology and bioengineering,
are much easier than FAI and they have no "Friendly" equivalent that
could prevent them from being used to destroy humanity.
o Rebuttal synopsis: There's no point in giving up the
future of humanity just because things seem bleak. If we achieve FAI
first, then the risks from the other technologies will be mitigated,
* Any true AI would have a drastic impact on human society,
including a large number of unpredictable, unintended, probably really
o Rebuttal synopsis: AI will be developed sooner or later
anyway. Making it Friendly will minimize the bad consequences.
* We can't start making AIs Friendly until we have AIs around to
look at and experiment with. (Goertzel's objection)
o Rebuttal synopsis: Then we should research Friendliness
and general AI theory side by side.
* Talking about possible dangers would make people much less
willing to fund needed AI research.
o Rebuttal synopsis: Funding AI research without considering
the dangers is much worse than AI research being delayed.
* Any work done on FAI will be hijacked and used to build hostile AI.
o Rebuttal synopsis: That's a valid fear. It's just a risk
we have to take.