From: Anand AI (
Date: Sun Jun 16 2002 - 16:32:05 MDT

Anand wrote:
>01. Does CFAI argue for a set of panhuman characteristics that comprise
>human moral cognition? If so, what characteristics do we have evidence
>for, and what characteristics of human moral cognition will be

The following relates to question #1. The text quoted below rests on the
premise that a philosophy can be grounded in, and approximately derived
from, panhuman characteristics, and then used as content for Friendly AI.
I am requesting a clear explication of, and evidence for, this premise.
Also, what evidence is there for the claim, quoted below, that "panhuman
attributes such as
'altruism'... build up _very_ strongly when all the humans on Earth are

CFAI: SAS: Grounding for external reference semantics:
>In a later section, I give the actual, highly secret, no-peeking target
>definition of Friendliness that is sufficiently convergent, totally
>programmer-independent, and so on. Hopefully, you've seen enough already
>to accept, as a working hypothesis, the idea that a philosophy can be
>grounded in panhuman affectors. The programmers try to produce a
>philosophy that's an approximation to that one. Then, they pass it on to
>the Friendly AI. The Friendly AI's external referent is supposed to refer
>to that programmer-independent philosophy, about which the programmers
>are good sources of information, as long as the programmers give it their
>honest best shot. This is not a complete grounding - that takes causal
>validity semantics - but it does work to describe all the ways that
>external reference semantics should behave. For example, morality does
>not change when words leave the programmers' lips, it is possible for a
>programmer to say the wrong thing, the cognitive cause of a statement
>almost always has priority over the statement itself, manipulating the
>programmer's brain doesn't change morality, and so on.

CFAI: 3.4.4: The actual definition of Friendliness:
>The renormalizing shaper network should ultimately ground itself in the
>panhuman and gaussian layers, without use of material from the personality
>layer of the original programmer. This is how "programmer independence"
>is ultimately defined.
>Humanity is diverse, and there's still some variance even in the panhuman
>layer, but it's still possible to conceive of description for humanity and
>not just any one individual human, by superposing the sum of all the
>variances in the panhuman layer into one description of humanity.
>Suppose, for example, that any given human has a preference for X; this
>preference can be thought of as a cloud in configuration space. Certain
>events very strongly satisfy the metric for X; others satisfy it more
>weakly; other events satisfy it not at all. Thus, there's a cloud in
>configuration space, with a clearly defined center. If you take something
>in the panhuman layer (not the personal layer) and superimpose the
>clouds of all humanity, you should end up with a slightly larger cloud
>that still has a clearly defined center. Any point that is squarely in
>the center of the cloud is "grounded in the panhuman layer of
>Panhuman attributes that we would think of as "selfish" or
>"observer-biased" tend to cancel out in the superposition; since each
>individual human has a drastically different definition, the cloud is very
>thin, and insofar as it can be described at all, would center about
>equally on each individual human. Panhuman attributes such as
>"altruism", especially morally symmetric altruism or altruism that has
>been phrased using the semantics of objectivity, or by other means
>made a little more convergent for use in "morality" and not just the
>originating mind, builds up very strongly when all the humans on Earth
>are superposed. The difference is analogous to that between a beam of
>incoherent light and a laser.
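
One way to picture the laser analogy in the passage above is the toy sketch
below. It is not something CFAI specifies: modeling each person's preference
as a unit vector, using a small Gaussian spread for a shared attribute, and
using a uniformly random direction for an observer-biased attribute are all
assumptions made for illustration, as are the names superpose and simulate.
Under those assumptions, nearly aligned vectors sum to a magnitude close to
N, while randomly oriented ones sum to roughly sqrt(N).

```python
import cmath
import math
import random


def superpose(phases):
    """Return the magnitude of the sum of unit vectors with the given phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases))


def simulate(n_humans=10_000, shared_spread=0.1, seed=0):
    """Superpose a shared attribute and an observer-biased one (toy model)."""
    rng = random.Random(seed)
    # Assumption: a shared, altruism-like attribute points almost the same
    # way for everyone, with a small individual variance.
    shared = [rng.gauss(0.0, shared_spread) for _ in range(n_humans)]
    # Assumption: an observer-biased attribute points at its owner, modeled
    # here as a uniformly random direction per person.
    biased = [rng.uniform(0.0, 2 * math.pi) for _ in range(n_humans)]
    return superpose(shared), superpose(biased)


coherent, incoherent = simulate()
print(f"shared attribute:         {coherent:.0f} out of 10000")
print(f"observer-biased attribute: {incoherent:.0f} out of 10000")
```

The sketch only shows why a superposition can reinforce or wash out
depending on alignment. It says nothing about whether real human moral
attributes are in fact aligned in this way, which is the evidence being
requested above.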

Anand wrote:
>02. Why is volition-based Friendliness the assumed model of Friendliness
>content? What will it and what will it not constitute and allow? If the
>model is entirely incorrect, how is this predicted to affect the AI's
>architecture[, specifically, causal validity semantics]?

I believe CFAI contains only two paragraphs that explicitly relate to the
above question. Please read my second question as a request for
elaboration beyond the text quoted below.

CFAI: 1.3: Seed AI and the Singularity:
>Punting the issue of "What is 'good'?" back to individual sentients
>enormously simplifies a lot of moral issues; whether life is better than
>death, for example. Nobody should be able to interfere if a sentient
>chooses life. And - in all probability - nobody should be able to
>interfere if a sentient chooses death. So what's left to argue about?
>Well, quite a bit, and a fully Friendly AI needs to be able to argue it;
>the resolution, however, is likely to come down to individual volition.
>Thus, Creating Friendly AI uses "volition-based Friendliness" as the
>assumed model for Friendliness content. Volition-based Friendliness
>has both a negative aspect - don't cause involuntary pain, death,
>alteration, et cetera; try to do something about those things if you see
>them happening - and a positive aspect: to try and fulfill the requests
>of sentient entities.

Anand wrote:
>03. What alternatives to volition-based Friendliness have been considered,
>and why were they not chosen?

As far as I recall, CFAI does not contain an answer to the above
question. If I am wrong, please point me to the specific sections.

Anand wrote:
>04. How will the AI know and decide what constitutes "normativeness"?

Specifically, I do not understand what constitutes "normativeness" in the
real world. What constitutes a normative human or normative altruism, and
how can they be achieved?



This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:39 MDT