Re: Friendliness and blank-slate goal bootstrap

From: Metaqualia
Date: Mon Oct 06 2003 - 04:10:39 MDT

Our ideas seem to be converging.

> > I am suggesting that it isn't clear how you can formulate "cutting
> > fingers is wrong" unless you have experienced an "ouch".
> And I am suggesting you can formulate simple notions of wrong without it,

Yes, I agree with that. Actually, my statement was too broad; it should
have included the condition "with no programmer input".

Of course an AI can represent the moral law "cutting people's fingers is
wrong" in whatever cognitive system you want to implement if you hard-code
it. It could also easily infer the rule on its own from (for instance) 1. a
trainer consistently giving an UNFRIENDLY grade every time the system makes
people cry, and 2. watching people with cut fingers cry.
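
To make the two-step inference concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the event names, the grade labels other than UNFRIENDLY, and the helper functions are all hypothetical, and the "learning" is nothing more than chaining two consistent associations (action to outcome, outcome to grade). It is not meant as a design for a real cognitive system.

```python
from collections import defaultdict

# Hypothetical toy data (not from any real system):
# what the system watches happen after an action ...
observations = [
    ("cut_finger", "person_cries"),
    ("cut_finger", "person_cries"),
    ("hand_over_gift", "person_smiles"),
]
# ... and how a trainer grades the outcomes the system brings about.
trainer_grades = [
    ("person_cries", "UNFRIENDLY"),
    ("person_cries", "UNFRIENDLY"),
    ("person_smiles", "FRIENDLY"),
]


def consistent_label(pairs):
    """Map each key to its label, but only if every example for it agrees."""
    seen = defaultdict(set)
    for key, label in pairs:
        seen[key].add(label)
    return {k: next(iter(v)) for k, v in seen.items() if len(v) == 1}


def infer_action_grades(observations, trainer_grades):
    """Chain action -> outcome -> grade, keeping only consistent links."""
    outcome_grade = consistent_label(trainer_grades)
    action_outcome = consistent_label(observations)
    return {
        action: outcome_grade[outcome]
        for action, outcome in action_outcome.items()
        if outcome in outcome_grade
    }


inferred = infer_action_grades(observations, trainer_grades)
print(inferred)
```

Note that the rule "cut_finger is UNFRIENDLY" is never given directly; it falls out of the trainer's grades plus the observations. That is exactly why this route still depends on outside input, here the trainer.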

What an experience of "ouch" could do for a mind is:

1. let it independently figure out that cutting people's fingers (or putting
live lobsters in the microwave) is wrong (the blank-slate machine could be
delivered to a planet of monkeys, who can't give it any instruction, and
still end up being a friendly AI).

2. give it some real physical evidence for the programmers' statements (I
consider qualia direct physical evidence, more direct than seeing a rock fall
to the ground or whatever other physical observation you think of as direct).

Do we agree on these?

Now, if you just want an assurance that the AI won't blow us up, an
"ouch-less" AI will surely do, and Eliezer has provided tons of clever ways
to produce such a system.

Maybe, while we still don't understand qualia, the most practical way of
proceeding will be to make sure the AI stays within our parameters for
friendliness AND at the same time develops a "figure out qualia" goal as a
subgoal of understanding friendliness better.
