Re: SIAI's flawed friendliness analysis

From: Leonardo Wild (dlwild@access.net.ec)
Date: Sun May 18 2003 - 23:33:39 MDT


Hello,

I agree with Bill Hibbard on the following:

>
> Understanding what humans are, what the AI itself is, and
> what values are, is implicit in the simulation model of the
> world created by any intelligent mind. It uses this model to
> predict the long-term consequences of its behaviors on its
> values. Such understanding is prescribed by an intelligence
> model rather than a safe AI model.
>

And I disagree with Gary Miller that his post of 5/17/03 ...

Gary Miller wrote:
> My proposed solution to friendliness problem.
>
> Note some of you will laugh this off as overkill. But believe me having
> worked as a
> consultant for the government for a number of years, this is just
> business as
> usual for NSA. It is a very expensive but very secure development process.
> It is based upon separation and balance of power. No one person has the
> access
> and knowledge to compromise the system. Relationships between team members
> must be prohibited to prevent possibility of collusion.

<(snip)>

... is really a solution to the "friendliness problem." What he
proposes is a solution to the "safety" of an FAI project.

Naturally, that safety is necessary to make sure that the "intended
friendliness programming" doesn't, or can't, get meddled with. In other
words, it's a (partial) solution to the problem of making sure that
intended outcomes are not changed by those working on such a project,
or by those with an interest in infiltrating it. Yet it does not define
what "friendliness" is, nor what the core values are. Without those
definitions, it is one more way of making sure that the following does
happen (as expressed by Bill Hibbard):

> BH: The ambiguous definitions in the SIAI analysis will be
> exploited by powerful people and institutions to create
> AIs that protect and enhance their own interests.

In his posting he also writes:

> BH: But I think that AI values
> for human happiness link AI values with human values and
> create a symbiotic system combining humans and AIs. Keeping
> humans "in the loop" offers the best hope of preventing any
> drift away from human interests. An AI with humans in its
> loop will have no motive to design an AI without humans in
> its loop. And as long as humans are in the loop, they can
> exert reinforcement to protect their own interests.
>
(...)
>
> The "*deep* understanding of values" is implicit in superior
> intelligence. It is a very accurate simulation model of the
> world that includes understanding of how value systems work,
> and the effects that different value systems would have on
> brains. But choosing between different value systems requires
> base values for comparing the consequences of different value
> systems.

This is the core of the problem, and it also shows up in the partial or
qualitatively limited definitions of "intelligence," "intuition,"
"smartness," "awareness," "values," "understanding," "knowledge," etc.

We can speak of intelligence the way we can speak of the wind: "wind"
is a concept implicit to "superior intelligence," as long as the
"organism" holding the concept has experienced wind. But for someone
whose life (or livelihood) depends on it, the concept of wind comes
with qualitatively different and relatively precise definitions of the
kinds of "wind." The more you depend on it, the more precise (and
differentiating) your definition of wind, or of any other subject,
will be.

"Yes, it is wind, but what _kind_ of wind?"

or "What kind of snow?"
or "What kind of fish?"
or "What kind of intelligence?"

etc.

So, too, friendliness is implicit to superior intelligence, yes (though
not only then, according to studies presented in a book called GOOD
NATURED; I can check on the author another time, for anyone interested).
But what _is_ it that makes someone behave in a friendly way? What kind
of values create the context in which it is "intelligent" to behave in a
friendly way? Sometimes friendliness is the wrong attitude or behavior
for survival in a given context; in fact, it may even be "unintelligent"
to behave in a friendly way.

Friendliness is directly related to "values," just as values are
directly related to "needs." But there are different kinds of needs,
which are reflected in different kinds of values. So certain values can
be viewed as positive or "good" in one context and as completely the
opposite in another. The bottom line is to find the kind of need for
values that will make an AI "friendly" towards humanity at large (though
perhaps not necessarily towards particular individual human beings) or,
rather, towards "Life" in the most general sense. (?)

As Bill Hibbard wrote:

> Keeping humans "in the loop" offers the best hope of preventing any
> drift away from human interests.

Meaning: how can we make sure that an AI's need for the "respected and
respectful" presence and existence of human beings (and the context
they need for survival and evolution) gets reflected in that AI's value
system or structure? And how can such a "set of values" be programmed
into the core or kernel of an AI regardless of future development and
autopoietic growth? Because, for programming's sake, it is not enough
for types of needs, types of values, and types of friendliness to be
"implicit" in superior intelligence (human, AI, or n versions of it);
they must be made "explicit." That means creating a UML (Unified
Modeling Language)-style diagram, or set of diagrams, that lets
programmers, project designers, and everyone else involved in an FAI
project "explicitly agree" on the "ingredients" of concepts such as
friendliness ... which is otherwise only a very broad ideal for a given
type of behavior, as implicit as what we mean when we speak of
"happiness" or "anger."

The thing to consider is that the ambiguous definitions of _any_
analysis will be exploited by powerful people and institutions to
protect and enhance their own interests.

I read, a while ago, posts on the "search for money" to fund SIAI, but
there was never any questioning of why money (which is a man-made
technology) is scarce (not just for funding deep science projects), why
money appears to flow only into certain types of projects, and why it
is necessary for someone (like Eliezer) to _prove_ that the money is
"well invested" ... it even remains ambiguous what a "good investment"
means in monetary terms. There are a lot of assumptions that do not
question the "technology money" at all, which becomes, once again, a
problem similar to the one Bill Hibbard already wrote about: finding
the base values behind the values you wish to work with. If you wish to
work with "funding," then you must necessarily work with money; but if
money's "inner workings" remain implicit or ambiguous in our
understanding, then we can forever chase after it without really
knowing what "it" is, nor why it is not available for "good" projects.
We may be intelligent, but our intelligence doesn't seem willing to
create an awareness, and hopefully an understanding, of one of the most
widely used man-made contraptions: money.

The flaw, if I may say so (and not only in relation to the
"friendliness analysis" but to many apparently implicitly understood
concepts, including something as illusorily clear as "money"), is to
avoid creating UML-diagrammable versions of these concepts so that we
can all agree on them, rather than spending our efforts disagreeing
about them or going off on tangents.

What this means is that certain concepts must be re-analyzed in this
new light (with a different goal in mind: programmability, consensus
agreement) rather than dismissed with "that's already a closed issue."
It was, more or less, a closed issue that the world was "flat." It is,
more or less, an assumed concept that it isn't possible to make
triangles* with three inner angles of 90 degrees each, or squares**
with four inner angles of 105 degrees each (to give but two examples).

Best,

Leonardo Wild

PS: Crocker's Rules apply ...

... considering the fact that I was 'ousted' for a month for being
un-scientific; yet my email sparked a discussion on the subject of
multiple universes and infinite universes running to over 140 emails,
even though that subject had not previously been discussed on this list
(according to the archives), or at least not under such a heading.
Paradoxical, isn't it?

***

*Triangle = 1. in geometry, a figure bounded by three lines, and
containing three angles.

**Square = 1. b) more or less cubical; rectangular and
three-dimensional, as a box.
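
A brief aside on the triangle example, using a standard result rather
than anything new: whether three 90-degree inner angles are possible
depends on the surface the definition tacitly assumes. On a sphere of
radius R, the inner angles of a triangle satisfy

    A + B + C = \pi + \text{Area}/R^2

so a triangle covering one octant of the sphere (Area = \pi R^2 / 2)
has three inner angles of exactly 90 degrees each. The "closed issue"
is closed only under the flat-plane assumption built into definition 1
above.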



