Re: Friendliness and blank-slate goal bootstrap

From: Charles Hixson
Date: Sat Jan 10 2004 - 19:51:00 MST

Nick Hay wrote:

> On 10/01/04 15:55:57, Charles Hixson wrote:
>> Nick Hay wrote:
>> > Metaqualia wrote:
>> > > ...
>> > ...
>> > You could go with "reduce undesirable qualia, increase desirable
>> > ones" if you liked.
>> Be very careful here! The easiest way to reduce undesirable qualia
>> is to kill off everyone who has the potential for experiencing them.
> ...
> It seems like you're thinking in terms of "picking the best morality
> for an AI" (an AI assumed to have the structure humans use to
> implement moralities) rather than, say, "engineering an AI that can
> understand morality, along with the arguments and philosophies we use
> to decide which actions we should or shouldn't take".
> - Nick Hay

Actually, I'm trying to figure out what could be described as a basic
goal (or, I suppose, supergoal). The problem is that when this is
described, the AI will have no world model to speak of, so most things
that one would want to put in as a goal are impossible to describe to
the AI. I can think of how to define visual primitives (I'm thinking of
writing a recognizer based around a superspecified *.xpm file format ...
well, actually the original form will be an internal Python data
structure that can be fed to gtk primitives, and can also be handled in
Python. I plan to start with vertical and horizontal line recognizers,
and then work on corner recognizers. Etc.) I'm planning on using
Python for the introspective layer with Pyrex as an intermediate layer
that connects to "understood" (i.e., compiled C code) modules. I have a
plan that evaluations (i.e., things which can be "summarized as a single
number of desirability") should be in a topologically sorted list.
(That's a bit of an oversimplification, which is why the topological sort.)
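
A minimal sketch of the horizontal and vertical line recognizers
described above. It assumes the image is held as a list of equal-length
strings, with '#' for a set pixel and '.' for background; that layout,
the function names, and the min_len threshold are illustrative
assumptions, not the actual XPM-based structure.

```python
# Hypothetical stand-in for the XPM-like structure: a list of equal-length
# strings, where '#' is a set pixel and '.' is background.
GRID = [
    "......",
    ".####.",
    ".#....",
    ".#....",
    "......",
]


def horizontal_runs(grid, min_len=3, on="#"):
    """Return (row, start_col, length) for each run of `on` pixels >= min_len."""
    runs = []
    for r, row in enumerate(grid):
        start = None
        # A trailing None sentinel closes any run that touches the right edge.
        for c, pixel in enumerate(list(row) + [None]):
            if pixel == on and start is None:
                start = c
            elif pixel != on and start is not None:
                if c - start >= min_len:
                    runs.append((r, start, c - start))
                start = None
    return runs


def vertical_runs(grid, min_len=3, on="#"):
    """Return (start_row, col, length) by scanning the transposed grid."""
    transposed = ["".join(column) for column in zip(*grid)]
    return [(start, col, length)
            for col, start, length in horizontal_runs(transposed, min_len, on)]


def top_left_corners(grid, min_len=3, on="#"):
    """A crude corner recognizer: pixels where both kinds of run begin."""
    h_starts = {(r, c) for r, c, _ in horizontal_runs(grid, min_len, on)}
    v_starts = {(r, c) for r, c, _ in vertical_runs(grid, min_len, on)}
    return sorted(h_starts & v_starts)


print(horizontal_runs(GRID))
print(vertical_runs(GRID))
print(top_left_corners(GRID))
```

The corner recognizer here only builds on the two line recognizers; a
real one would presumably need to handle the other three orientations
and noisy input.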

I'm a bit more at sea about how to do the world models. Direct sensory
experience would be (eventually) anything that touched a port, but I
intend to start off simple, by having it read e-mail. I expect that
I'll build in some initial guidance along the lines of "thought chunks
tend to split at white spaces, connections of thought chunks tend to
split after periods that follow something that has been previously
recognized as a thought chunk. Carriage returns are more definite
splits, and two or more carriage returns in succession are a very
definite split." And I'll probably build in a rule that lets it
automatically peel the headers away from the body. But all of these
will be in a "variable code" section. What the alternatives will be I
don't know, but everything needs to be arranged so that switchable lists
of routines that make sense can have simple substitutions. The
intention is that it will be amenable to genetic programming, either by
me, if I get around to learning it first, or by the AI after it figures
out how to do things.
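
The splitting guidance above could be sketched roughly as follows. This
assumes "\n" line endings and treats any period followed by whitespace
as a split, which is cruder than "periods that follow something
previously recognized as a thought chunk". All function names and the
SPLITTERS table are hypothetical; the table is only there to show how a
"variable code" section might allow simple substitution of routines.

```python
import re


def peel_headers(message):
    """Split a raw message into (headers, body) at the first blank line."""
    headers, _, body = message.partition("\n\n")
    return headers, body


def split_paragraphs(text):
    # Two or more line breaks in succession: a very definite split.
    return [p for p in re.split(r"\n{2,}", text) if p.strip()]


def split_sentences(paragraph):
    # A period followed by whitespace: a likely split between groups of chunks.
    return [s for s in re.split(r"(?<=\.)\s+", paragraph) if s]


def split_chunks(sentence):
    # Thought chunks tend to split at white space.
    return sentence.split()


# The switchable list of routines: replace any entry to try an alternative.
SPLITTERS = {
    "paragraph": split_paragraphs,
    "sentence": split_sentences,
    "chunk": split_chunks,
}


def segment(message, splitters=SPLITTERS):
    """Return paragraphs -> sentences -> chunks for the body of a message."""
    _, body = peel_headers(message)
    return [
        [splitters["chunk"](s) for s in splitters["sentence"](p)]
        for p in splitters["paragraph"](body)
    ]


MESSAGE = "From: a\nSubject: b\n\nHello there. How are you?\n\nBye now."
print(segment(MESSAGE))
```

Passing a modified copy of SPLITTERS to segment() swaps one routine
without touching the others, which is the property a genetic-programming
layer would need.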

But how to structure the goals is something I haven't figured out the
first thing on. Except that:
1) They've got to be teleological
2) They've got to be rankable (that topological list again)
3) They've got to be callable as processes. Either there will be a Goal
class with methods, or some other way.
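
One possible shape for the three requirements above, as a hedged sketch
rather than a settled design: a Goal object scores a state of the world
(teleological), declares which goals outrank it (rankable), and is
callable. The class, the outranked_by field, the example goals, and the
state dictionary are all assumptions made for illustration. The ordering
uses graphlib.TopologicalSorter from the Python standard library
(3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Goal:
    """A goal scores a state of the world and can be called like a function."""

    def __init__(self, name, score, outranked_by=()):
        self.name = name
        self.score = score                        # callable: state -> number
        self.outranked_by = tuple(outranked_by)   # names of higher-priority goals

    def __call__(self, state):
        return self.score(state)


def rank_goals(goals):
    """Order goals so that each one comes after every goal that outranks it.

    Every name used in outranked_by must belong to a goal in `goals`;
    an unknown name raises KeyError in this sketch.
    """
    by_name = {g.name: g for g in goals}
    graph = {g.name: set(g.outranked_by) for g in goals}
    return [by_name[name] for name in TopologicalSorter(graph).static_order()]


# Hypothetical goals for a mail-reading system.
goals = [
    Goal("file_mail", lambda s: s["filed"] / max(s["received"], 1),
         outranked_by=["read_mail"]),
    Goal("read_mail", lambda s: s["read"] / max(s["received"], 1),
         outranked_by=["stay_running"]),
    Goal("stay_running", lambda s: 1.0 if s["running"] else 0.0),
]

ranked = rank_goals(goals)
state = {"running": True, "received": 4, "read": 2, "filed": 1}
for goal in ranked:
    print(goal.name, goal(state))
```

A partial order is all the sorter needs, so goals that are not
comparable with each other can coexist without being forced onto a
single numeric scale.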

In my view of things there are four basic processes of thought. I could
swipe terminology from C. G. Jung, but I prefer to not carry his baggage, so
I call them Earth, Air, Fire, and Water.
Air is the computations that mimic the kinds of things Fortran was designed
to handle. Logic.
Fire is the management of the sorted lists, letting you quickly
determine whether you have previously decided that something is either
very good or very bad. (More complicated decisions, of course, take
Earth is modeling. E.g., an xpm file as an analog of a retina... well,
of visual perception, anyway. This would also cover the transformation
of such things into vector expressions and various manipulations (e.g.,
scaling, rotation, etc.) But the actual techniques for doing the
manipulations would be Earth (this probably means "they're expressed as
programs rather than as data").
Water is goal seeking. And here I get lost. I know that in some sense
Water is the opposite of Earth, as Air and Fire are opposites...but note
that opposite doesn't mean that they directly conflict. It's more like
they use different data structures to represent the same things handled
for different purposes. (Fire makes decisions a lot more quickly than
Air, but it has to "know" the decisions ahead of time.)
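
The contrast between Fire and Air can be sketched as a sorted memory of
past verdicts sitting in front of a slower evaluator. This is only an
illustration under simplifying assumptions: each verdict is a single
number, and the names FireMemory, decide, and slow_evaluate are
hypothetical.

```python
import bisect


class FireMemory:
    """Keeps past verdicts sorted so the extremes can be read off quickly."""

    def __init__(self):
        self._by_score = []   # sorted list of (score, item)
        self._score_of = {}   # item -> score, for direct lookup

    def remember(self, item, score):
        if item in self._score_of:
            # Drop the old verdict before recording the new one.
            self._by_score.remove((self._score_of[item], item))
        bisect.insort(self._by_score, (score, item))
        self._score_of[item] = score

    def recall(self, item):
        """Return the stored score, or None if nothing was decided yet."""
        return self._score_of.get(item)

    def worst(self, n=1):
        return [item for _, item in self._by_score[:n]]

    def best(self, n=1):
        return [item for _, item in reversed(self._by_score)][:n]


def decide(item, fire, slow_evaluate):
    """Use Fire when the decision is already known, otherwise fall back to Air."""
    known = fire.recall(item)
    if known is not None:
        return known
    score = slow_evaluate(item)   # the slow, Air-style computation
    fire.remember(item, score)
    return score


fire = FireMemory()
calls = []


def slow_evaluate(item):
    calls.append(item)
    return float(len(item))   # placeholder for a real evaluation


decide("spam", fire, slow_evaluate)
decide("newsletter", fire, slow_evaluate)
decide("spam", fire, slow_evaluate)   # answered from memory this time
print(calls, fire.best(), fire.worst())
```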

Water's basics clearly need to be built in. I.e., the first level goals
have to exist, or, e.g., Fire can have no basis on which to order its
lists. Air can evaluate probabilities of result based on data from
Earth (here sensory perception), but it can't choose that one choice
should be adopted rather than another without having some goal to
measure it against.
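
To make the dependency concrete: given outcome probabilities for each
action, nothing selects an action until a goal assigns values to the
outcomes. The sketch below uses plain expected-value maximization, which
is a stand-in chosen for illustration rather than a claim about how
Water should work. The action names, probabilities, and goal function
are invented.

```python
# Air's contribution: for each action, the probability of each outcome.
ACTIONS = {
    "open_mail": {"learned_something": 0.7, "nothing": 0.3},
    "idle": {"nothing": 1.0},
}


def expected_desirability(outcomes, goal):
    """Weight the goal's value for each outcome by that outcome's probability."""
    return sum(p * goal(outcome) for outcome, p in outcomes.items())


def choose(actions, goal):
    """Pick the action whose predicted outcomes score best against the goal."""
    return max(actions, key=lambda a: expected_desirability(actions[a], goal))


def curiosity(outcome):
    # A hypothetical first-level goal: learning something is what counts.
    return 1.0 if outcome == "learned_something" else 0.0


def laziness(outcome):
    # A different goal reverses the choice, using the same probabilities.
    return 1.0 if outcome == "nothing" else 0.0


print(choose(ACTIONS, curiosity))
print(choose(ACTIONS, laziness))
```

The probabilities are identical in both calls; only the goal differs,
and that alone changes which action is chosen.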

A lot of the basic work of Air has been done explicitly during the
development of mathematics. Ah, but how to apply it in any particular
environment! Sensory perception (Earth) and the models that involve it
(Earth + Air) can yield predictions as to likely outcomes. But without
goals to measure things against, it's meaningless. If we have goals,
intelligible goals, then we can use the predicted results and try to
guess whether that brings us closer or sends us further from them. This
allows us to place actions/results in the desirable/undesirable

A tabula rasa may not be formally impossible, but it seems so nearly so
as for the difference to be meaningless. But I can't figure out how to
start decent goals for an intelligent entity based on nothing but the
information perceivable from inside a computer. I think I could make
a stab at goals for a specialized entity, say one that categorizes mail
into self-similar groups. But this is a different order of goal.
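
For the specialized case, a goal such as "group similar mail together"
is at least easy to state in code. The following is a rough sketch,
assuming similarity means Jaccard overlap of word sets and using a
greedy pass with an arbitrary threshold; both choices, and all names
here, are illustrative assumptions.

```python
def jaccard(a, b):
    """Overlap of two word sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_messages(messages, threshold=0.3):
    """Greedily assign each message to the first group it resembles.

    A message joins a group when its word set is similar enough to the
    group's first member; otherwise it starts a new group.
    Returns a list of groups, each a list of message indices.
    """
    groups = []   # each group is a list of (index, word_set)
    for index, text in enumerate(messages):
        words = set(text.lower().split())
        for group in groups:
            if jaccard(words, group[0][1]) >= threshold:
                group.append((index, words))
                break
        else:
            groups.append([(index, words)])
    return [[index for index, _ in group] for group in groups]


MESSAGES = [
    "cheap pills buy now",
    "buy cheap pills today",
    "meeting agenda for monday",
    "agenda for monday meeting notes",
]
print(group_messages(MESSAGES))
```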
