Re: Friendliness and blank-slate goal bootstrap

From: Nick Hay
Date: Sat Oct 04 2003 - 06:02:16 MDT

Metaqualia wrote:
> I agree with many of your points, and not with others.
> "Why have a blank slate moral system?"
> Actually it was just an idea, it doesn't really matter whether it is blank
> or not to start with.
> If we are making a recursively improving AI, it should have a recursively
> improving moral system.

Right, that's one of the main points of Friendliness. Note: "Friendliness" !=
"friendliness" -- it's not about the human concept of friendliness. More
about "what's really good and meaningful?", "how can we create a mind that
understands and can develop humane morality?", "how can we create a mind
that can go back and make sure we made it in the right way?"

Try 24 definitions of Friendliness for more on this kind of thing:

> Start it up with some basic human morality, or start with a blank goal
> system, whichever is easiest.
> The important thing is to allow the AI not to remain stuck in one place but
> to keep improving.

It is important for the AI not to be stuck. But we don't achieve this by
leaving out our evolved moral hardware (the stuff that makes human moral
philosophy more complex than the pseudo-moralities of other primates, what
allows us to start from an infant mind and create an adult, what allows
people to argue about moral issues, etc) and starting with a very simple AI.
We achieve it by giving the AI all we can to help it. Simplicity is a good
criterion, but not in this way.

> I think that just as a visual cortex is important for evolving concepts of
> under/enclosed/occluded, having qualia for pain/pleasure in all their
> psychological variation is important for evolving concepts of
> wrong/right/painful/betrayal.

I suspect qualia are not necessary for this kind of thing -- you seem to be
identifying morality, something which seems easily traceable to some kind of
neural process in the brain, with the ever confusing (at least for me!)
notion of qualia. Where's the connection? The actual feeling of pain - the
quale - is separate from the other cognitive processes that go along with
it: sequiturs forming thoughts like "how can I stop this pain?", the
formation of episodic memories, later recollection of the pain projected onto
others via empathy, and other processes that seem much easier to explain. Or
however it works :)

I think "betrayal", in the human sense, is unnecessary for a truly altruistic
mind, especially a superintelligence. Seems like an evolved hack appropriate
in a human environment, where you want to increase inclusive fitness. Doesn't
seem like a meaningful emotion. Although this, really, is a side issue.

> The AI, without a visual cortex, and if it had access to the outside world
> (nanotechnology?) could still run physics experiments, infer the existence
> of electromagnetic waves, create an array of pixels, and develop a visual
> cortex on its own.


> But would an AI without qualia and with access to the outside world ever
> stumble upon qualia? I don't know.

Not sure. It'd stumble on morality, and understand human morality (afaict),
but of course that's very different from actually *having* a human-like (or
better) morality. Insofar as qualia actually affect physical processes, or
are physical processes, the AI can trace back the causal chain to find the
source, or the gap. For instance, it could look at exactly what happens in a
human brain when people experience pain and say "now there's an uncomfortable
quale!".

> While we can stand to have a temporarily blind AI we can't afford to have a
> temporarily selfish/unfriendly AI on the loose. So IF we could incorporate
> some kind of qualia-system in the AI (of course making sure that it had
> complete control over these "emotions", unlike a human) wouldn't it be a
> good thing?

I guess so, but I don't really understand qualia and what they do. I think
it's better to be thinking about simpler things, like what physical process
allows a human infant to develop a morality, the adaptations we use to reason
and argue about morality between humans, etc. Solve more tractable, and
definitely essential, problems.

> However we don't have a clue how to create a qualia module, so that is why
> I wrote that garbage about trusting humans (or better, the basic set of
> human moral laws, as you said) until qualia are developed in _some_ way.

I think we can do better than this, and CFAI goes into this. If you've only
read it once, I recommend rereading it. I certainly didn't understand it the
first time round. Shaper/anchor semantics, for instance, would be relevant.

> 1. Absorb a refined version of the human moral code until you have qualia
> 2. Then, create your own moral code

Not sure about the "until you have qualia", and I think the steps are closely
integrated, although that does capture the rough progression. Step 1, for
instance, is a *huge* step! It takes a lot to create a mind capable of
absorbing human moral codes, and the tools we unconsciously use to reason
about and use them. CFAI goes into this a lot :)

> does this make sense?

Mostly :) Although I have the strong suspicion qualia only make things more
confusing. Can you explain more on what you mean by the term, and what makes
you think they're centrally important?

- Nick
