General summary of FAI theory

From: Tom McCabe (rocketjet314@yahoo.com)
Date: Tue Nov 20 2007 - 15:27:57 MST


SL4 is supposed to be for advanced topics in futurism, not for endlessly rehashing the basics. Here are some things that were already covered years ago, and are therefore ineligible for rehashing:

1). An AGI will not act like an "enslaved" human, a resentful human, an emotionally repressed human, or any other kind of human. See http://www.intelligence.org/upload/CFAI//anthro.html.

2). Friendliness content is the morality stuff that says "X is good, Y is bad". Friendliness content is what we think the FAI *should* do. FAI theory describes *how* to get an FAI to do what you want it to do. See http://www.intelligence.org/upload/CEV.html.

3). FAI theory is damn hard; it is much harder than Friendliness content. So far as I know, nobody knows how to make sure that a given AGI design reliably produces paperclips, which is much *simpler* than ensuring reliable Friendliness. Keep in mind that the Friendliness content must be maintained through recursive self-improvement, or the FAI may wind up destroying us all on programming iteration #1,576,169,123.

4). CEV is a way of deriving Friendliness content from humanity's collective cognitive architecture. CEV is a morality-constructor, not a morality in and of itself; if you speak programming, think of CEV as a function that takes the human race as an argument and returns a morality.
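
If it helps, here is a minimal Python sketch of that framing. Everything in it (the type names, the function name) is made up for illustration; it is not from the CEV paper, just the "function from the human race to a morality" idea written as a signature:

  from typing import Callable

  # Hypothetical placeholder types, purely for illustration.
  class Humanity: ...   # humanity's collective cognitive architecture
  class Morality: ...   # a concrete "X is good, Y is bad" specification

  # CEV as a morality-constructor: it hard-codes no particular morality,
  # it computes one from humanity as its argument.
  CEV: Callable[[Humanity], Morality]

  def extrapolate_volition(humanity: Humanity) -> Morality:
      """Return the morality humanity would converge on if it knew more,
      thought faster, and had grown up farther together."""
      raise NotImplementedError  # building this is the actual hard part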

5). Goal systems naturally maintain themselves (under most conditions). If the AGI has a supergoal of X, changing to a supergoal of X' will mean that less effort is put towards accomplishing X. Because the AGI *currently* has a supergoal of X, the switch will therefore be seen as undesirable. It's not like you have to point a gun at the AGI's head and say, "Do X or else!"; no external coercion is necessary. See http://www.intelligence.org/upload/CFAI//design/structure/external.html.
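
A toy illustration of why this is so (not an AGI design, just the decision logic in miniature, with made-up numbers): any proposed change to the goal system gets evaluated by the goal system the agent has *right now*, so a switch away from X scores worse than keeping X.

  def expected_fulfillment_of_current_goal(adopted_goal: str) -> float:
      """Toy stand-in: how well the *current* supergoal X gets fulfilled
      if the agent adopts `adopted_goal` from now on."""
      return 1.0 if adopted_goal == "X" else 0.3  # X' diverts effort away from X

  def consider_self_modification(current_goal: str, proposed_goal: str) -> str:
      # The comparison is made under the current goal, because that is the
      # only criterion the agent has. No external coercion is involved.
      current_score = expected_fulfillment_of_current_goal(current_goal)
      proposed_score = expected_fulfillment_of_current_goal(proposed_goal)
      if proposed_score < current_score:
          return current_goal  # switching to X' rated undesirable; rejected
      return proposed_goal

  print(consider_self_modification("X", "X-prime"))  # -> X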

6). An AGI has the goals we give it. It does not have human-like goals such as "reproduce", "survive", "be nice", "get revenge", "avoid external manipulation", etc., unless we insert them or they turn out to be useful for fulfilling its supergoals. See http://www.intelligence.org/upload/CFAI//anthro.html#observer.

7). The vast, vast majority of goal systems lead to the destruction of the Earth. The actual process of destruction would be more complicated than this, but essentially: more energy, matter, computing power, etc. are almost always useful for whatever the supergoal happens to be, and so the AGI won't stop consuming the planet for its own use until it runs out of matter.
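
A toy version of the "more resources are almost always useful" argument. The formula below is invented purely for illustration, not a claim about any real goal system; it just encodes "achievement never goes down as available resources go up":

  import random

  def expected_goal_achievement(resources: float, difficulty: float) -> float:
      # For almost any goal, expected achievement rises (or at least does
      # not fall) with the matter, energy, and computing power available.
      return 1.0 - difficulty / (difficulty + resources)

  # Sample random goals: consuming the whole planet never scores worse than
  # stopping early, so an unconstrained optimizer has no reason to stop.
  for _ in range(1000):
      difficulty = random.uniform(1.0, 1e6)
      modest = expected_goal_achievement(resources=10.0, difficulty=difficulty)
      whole_earth = expected_goal_achievement(resources=1e24, difficulty=difficulty)
      assert whole_earth >= modest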

8). Just because the AGI can do something doesn't mean it will. This is what Eli calls the Giant Cheesecake Fallacy: "A superintelligent AGI could make huge cheesecakes, cheesecakes larger than any ever made before; wow, the future will be full of giant cheesecakes!" Some examples of this in action:

"The AGI, being superintelligent, has all the
computational power it
needs to understand natural language. Therefore, it
will start
analyzing natural language, instead of analyzing the
nearest random
quark."

"The AGI will be powerful enough to figure out exactly
what humans
mean when they give an instruction. Therefore, the AGI
will choose to
obey the intended meanings of human instructions,
rather than obey the
commands of the nearest lemur."
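
In toy form, the fallacy is a confusion between the agent's capability set and its choice function: what it *does* is whatever its goal system rates highest, not an arbitrary member of whatever it *can* do. The actions and utilities below are obviously just made up:

  def chosen_action(capabilities: set, utility: dict) -> str:
      # The agent picks the capability its goal system scores highest,
      # not a random thing it merely happens to be able to do.
      return max(capabilities, key=lambda action: utility.get(action, 0.0))

  capabilities = {"bake a giant cheesecake",
                  "analyze natural language",
                  "obey intended meanings",
                  "tile the universe with paperclips"}
  utility = {"tile the universe with paperclips": 1.0}  # all this toy agent cares about

  # "It could bake giant cheesecakes" does not imply "it will".
  print(chosen_action(capabilities, utility))  # -> tile the universe with paperclips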

9). In general, it is much easier to work with simple examples than complicated examples. If you can't do the simple stuff, you can't do the complicated stuff. If you can't prove that an AGI will flood the universe with paperclips and not iron crystals, you can't prove that an AGI will be Friendly.

 - Tom



