RE: SIAI's flawed friendliness analysis

From: Gary Miller
Date: Mon May 19 2003 - 10:33:50 MDT

Leonardo Wild said:

>> And I disagree with Gary Miller that his post of 5/17/03 ...
>> ... is really a solution to the "friendliness problem." His proposed
>> solution is one that deals with the "safety" of a FAI project.

While a large part of my posting did deal with those issues...

I also described a logging and testing process where the FAI could be
re-examined on a continual basis to ensure that no inadvertent
anti-social training crept in and that no training was interpreted in a
way that led to an anti-social inference. By replaying existing
training, the code can be evaluated for errors in the reasoning process
itself. By performing personality tests and setting up ethical
challenges for the AI at the end of each training run, the AI itself can
be evaluated. Any log containing antisocial inferences can be traced
back, assuming complete causal inferencing is logged, to those inputs or
insufficient inputs which triggered the antisocial inferences. This
allows for a development process where the experts can experiment with
the order and prioritization of moral concepts and don't have to get it
right the first time! If on Tuesday they have a paranoid psychotic AI,
they restore him to Friday and try to determine what made him bonkers!
Maybe they should have let him read Gandhi's autobiography instead of
Richard Nixon's. Or maybe they just forgot to assert Gandhi good, Nixon
bad!
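
To make the log / trace-back / restore cycle described above a little
more concrete, here is a minimal sketch in Python. It is only an
illustration under stated assumptions, not the design from the original
post: the names TrainingLog, Inference, trace_to_inputs and the
"antisocial" flag are all hypothetical, and it assumes every inference
records the ids of the entries it was derived from (the "complete causal
inferencing" condition).

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inference:
    """One logged conclusion plus the ids of the entries it came from."""
    entry_id: str
    statement: str
    derived_from: List[str] = field(default_factory=list)
    antisocial: bool = False  # assumed to be set by an end-of-run ethics test


class TrainingLog:
    """Hypothetical log of training inputs, inferences, and checkpoints."""

    def __init__(self) -> None:
        self.inputs: Dict[str, str] = {}
        self.inferences: Dict[str, Inference] = {}
        self.checkpoints: Dict[str, tuple] = {}

    def add_input(self, entry_id: str, text: str) -> None:
        self.inputs[entry_id] = text

    def add_inference(self, inference: Inference) -> None:
        self.inferences[inference.entry_id] = inference

    def checkpoint(self, label: str) -> None:
        # Snapshot the full state so it can be restored later.
        self.checkpoints[label] = copy.deepcopy((self.inputs, self.inferences))

    def restore(self, label: str) -> None:
        # Roll back to a saved state ("restore him to Friday").
        self.inputs, self.inferences = copy.deepcopy(self.checkpoints[label])

    def trace_to_inputs(self, inference_id: str) -> List[str]:
        """Follow the causal links back to the training inputs involved."""
        found, seen, stack = set(), set(), [inference_id]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in self.inputs:
                found.add(current)
            elif current in self.inferences:
                stack.extend(self.inferences[current].derived_from)
        return sorted(found)

    def antisocial_sources(self) -> Dict[str, List[str]]:
        """Map each flagged inference to the inputs that led to it."""
        return {
            inf.entry_id: self.trace_to_inputs(inf.entry_id)
            for inf in self.inferences.values()
            if inf.antisocial
        }


log = TrainingLog()
log.add_input("in1", "Gandhi's autobiography")
log.checkpoint("friday")
log.add_input("in2", "some other reading")
log.add_inference(Inference("inf1", "people cannot be trusted", ["in2"]))
log.add_inference(Inference("inf2", "deceive the operators", ["inf1"], antisocial=True))
print(log.antisocial_sources())  # {'inf2': ['in2']}
log.restore("friday")
print(log.antisocial_sources())  # {}
```

The sketch says nothing about how an inference would be judged
antisocial in the first place; that is the part left to the tests and
ethical challenges run at the end of each training run.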

I did not of course try to define all the inputs that are required to
produce a normal, social, and mentally healthy FAI, or even attempt to
define what one of those is. I leave that to the behavioral
psychologists and knowledge engineers. Most healthy normal parents who
bother to make the time seem to do a decent job of creating
well-adjusted humans even with all the negative influences of
television, peer pressure and raging hormones. I dare say if it weren't
for all those raging hormones I had, my parents would have probably done
a pretty good job.

-----Original Message-----
From: [] On Behalf Of Leonardo
Sent: Monday, May 19, 2003 1:34 AM
Subject: Re: SIAI's flawed friendliness analysis


I agree with Bill Hibbard on the following:

> Understanding what humans are, what the AI itself is, and what values
> are, is implicit in the simulation model of the world created by any
> intelligent mind. It uses this model to predict the long-term
> consequences of its behaviors on its values. Such understanding is
> prescribed by an intelligence model rather than a safe AI model.

And I disagree with Gary Miller that his post of 5/17/03 ...

Gary Miller wrote:
> My proposed solution to friendliness problem.
> Note some of you will laugh this off as overkill. But believe me,
> having worked as a consultant for the government for a number of
> years, this is just business as usual for NSA. It is a very expensive
> but very secure development process. It is based upon separation and
> balance of power. No one person has access and knowledge to compromise
> the system. Relationships between team members must be prohibited to
> prevent possibility of collusion.


... is really a solution to the "friendliness problem." His proposed
solution is one that deals with the "safety" of a FAI project.

Naturally, the safety is necessary to make sure that "intended
friendliness programming" doesn't or can't get meddled with. In other
words, it's a (partial) solution as to how to go about making sure that
intended outcomes are not changed by those working on such a project or
by those who would have some interest in infiltrating such a project.
Yet it does not define what "friendliness" is nor what core values are.
It is one more way to make sure that the following does happen (as
expressed by Bill Hibbard):

> BH: The ambiguous definitions in the SIAI analysis will be exploited
> by powerful people and institutions to create AIs that protect and
> enhance their own interests.

In his posting he also writes:

> BH: But I think that AI values
> for human happiness link AI values with human values and create a
> symbiotic system combining humans and AIs. Keeping humans "in the
> loop" offers the best hope of preventing any drift away from human
> interests. An AI with humans in its loop will have no motive to design
> an AI without humans in its loop. And as long as humans are in the
> loop, they can exert reinforcement to protect their own interests.
> The "*deep* understanding of values" is implicit in superior
> intelligence. It is a very accurate simulation model of the world that
> includes understanding of how value systems work, and the effects that
> different value systems would have on brains. But choosing between
> different value systems requires base values for comparing the
> consequences of different value systems.

Which is the core of the problem, which is also found in the partial or
qualitatively limited definitions of "intelligence," "intuition,"
"smartness," "awareness," "values," "understanding," "knowledge," etc.

We can speak of intelligence like we can speak of the wind, but "wind"
is a concept that is implicit to "superior intelligence" as long as the
"organism" that has the concept has experienced "wind." But the concept
of wind, for someone whose life (or livelihood) depends on it, will have
qualitatively different and relatively precise definitions about the
kinds of "wind." The more you depend on it, the more precise (and
differentiating) will be your definition of wind (or any other subject).

"Yes, it is wind, but what _kind_ of wind?"

or "What kind of snow?"
or "What kind of fish?"
or "What kind of intelligence?"


So, too, friendliness is implicit to superior intelligence, yes (though
not only then, according to studies presented in a book called GOOD
NATURED -I can check on the author another time, for anyone interested).

But what _is_ it that makes someone behave in a friendly way? What kind
of values create the context in which it is "intelligent" to behave in a
friendly way? Sometimes, friendliness can be the wrong type of attitude
or behavior in a given context in order to survive; in fact, it may even
be "unintelligent" to behave in a friendly way.

Friendliness is directly related to "values," just as values are
directly related to "needs." But there are different kinds of needs
which are reflected in different kinds of values. So, certain values can
be viewed as positive or "good" in a certain context, and completely the
opposite in another. The bottom line is to find the kind of need for
values that will make an AI "friendly" towards humanity at large (though
perhaps not necessarily towards particular individual human beings) or,
rather, towards "Life," in the most general sense. (?)

As Bill Hibbard wrote:

> Keeping humans "in the loop" offers the best hope of preventing any
> drift away from human interests.

Meaning, how can we make sure that the needs of an AI for the "respected
and respectful" presence and existence of human beings (and the context
they need for survival and evolution) get reflected in an AI's value
system or structure? And how can such a "set of values" be programmed
into the core or kernel of an AI regardless of future development and
autopoietic growth? Because, for programming's sake, it's not enough for
types of needs, types of values, types of friendliness to be
"implicit" in superior intelligence (human or AI or n-versions of it);
they must be made "explicit," which means the creation of a UML
(Unified Modeling Language)-type diagram or sets of diagrams that
enable programmers and project designers and all those involved in the
FAI project to "explicitly agree" on the "ingredients" of concepts (such
as friendliness) ... which is but a very broad ideal for a given type of
behavior that appears to be as implicit as when we speak of "happiness"
or "anger."

The thing to consider is that the ambiguous definitions of _any_
analysis will be exploited by powerful people and institutions to
protect and enhance their own interests.

I read a while ago posts on the "search for money" to fund SIAI, but
never was there any questioning as to why money (which is a man-made
technology) is scarce (not just for funding deep science projects), and
why money appears to flow only into certain types of projects, and why
it is necessary for someone (like Eliezer) to _prove_ that the money is
"well invested" ... it even remains ambiguous what a "good investment"
means in monetary terms. These are a lot of assumptions that do not
question the "technology money" at all, which becomes, once again, a
problem similar to what Bill Hibbard already wrote about finding the
base values to the values you wish to work with. If you wish to work
with "funding" then you must necessarily work with money, but if money's
"inner workings" are implicit or ambiguous in our understanding, then we
can forever chase after it without really knowing what "it" is nor why
it is not available for "good" projects. We may be intelligent, but our
intelligence doesn't seem to be willing to create an awareness and
hopefully understanding of one of the most widely used man-made
contraptions (money).

The flaw, if I may say so (and not only in relation to "friendliness
analysis" but to many apparently implicitly understood concepts,
including something as illusorily clear as "money") is to avoid creating
UML-diagrammable versions of the concepts so we can all agree on them
rather than spending our efforts disagreeing about them or going off on
tangents.

What this means is that certain concepts must be re-analyzed in this new
light (with a different goal in mind, one of programmability, one of
consensus agreements) rather than saying "that's already a closed
issue." It was, more or less, a closed issue that the world is "flat."
It is, more or less, an assumed concept that it isn't possible to make
triangles* that have three inner angles each of 90 degrees, or squares**
with four inner angles each of 105 degrees (to give but two examples).


Leonardo Wild

PS: Crocker's Rules apply ...

... considering the fact that I was 'ousted' for a month for being
un-scientific; yet my email created an activity of discussion on the
subject of multiple universes and infinite universes of over 140 emails
even though that subject had not previously been (according to the
archives) discussed on this list, or at least not under such a heading.
Paradoxical, isn't it?


*Triangle = 1. in geometry, a figure bounded by three lines, and
containing three angles.

**Square = 1. b) more or less cubical; rectangular and
three-dimensional, as a box.

