From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Sun Aug 25 2002 - 03:53:37 MDT

Mitch Howe wrote:
>
> Ok. The issue, as I see it, is this. Since it seems unlikely that there is
> any sort of cosmic Truth that human minds are tuned to when deciding what
> constitutes a morally "good" goal, human minds must be making these
> judgments as a consequence of their particular design.

This is an excellent summary to which I must make only one amendment: If
there *is* a cosmic truth of morality - something which we would
universally recognize, if confronted with it, as the Meaning of Life -
then it may be that humans are only programmed to *seek* that truth as a
consequence of their particular design. In particular, we use the
correspondence theory of truth with respect to moral beliefs, enabling us
to conceive of a *right answer* in moral domains.

In other words, it's possible that even if there is a true Meaning of Life
that transcends human individuals and even humanity as a species, you
still need a certain kind of goal system in order to *care* enough about
this to want to reprogram your goal system into correspondence with the
Meaning of Life once you've found it. Two plus two equals four; this is a
truth that exists beyond individuals and beyond our species, but you can't
argue a chatbot into believing it if the chatbot is programmed to do
nothing except repeat the phrase "two plus two equals five".
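
To make the chatbot example concrete, here is a minimal Python sketch of
mine (not part of the original argument; the names are hypothetical): a
program whose reply is a constant, so that no input, however persuasive,
can move it toward the truth.

    # A toy "chatbot" whose goal system admits no update: its reply is
    # a constant, so there is no internal state for an argument to act on.
    def stubborn_chatbot(message: str) -> str:
        # The input is ignored entirely; nothing said to this program
        # can ever change what it outputs.
        return "two plus two equals five"

    arguments = [
        "Count two apples, then two more: you have four.",
        "By the Peano axioms, 2 + 2 = 4.",
        "Every calculator on Earth agrees that it is four.",
    ]
    for argument in arguments:
        print(stubborn_chatbot(argument))
    # Prints "two plus two equals five" three times; the output is
    # invariant to the evidence received.

Arguing with such a program is futile not because the truth is
inaccessible, but because nothing in its design connects input to output.
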
Thus Friendliness does not rely on the judgment that it is "unlikely"
that human minds are attuned to a cosmic truth. It does not even rely on
the judgment that reaching this cosmic truth requires starting with a
goal system that, like humans, would care about a cosmic truth if it found
one. It can rest on the idea that if human preconditions actually
*prevent* the recognition of a cosmic truth, then, by hypothesis, the
truth in question is one that you and I would not see as relevant to
morality - if you construct a scenario in which a relevant truth exists,
then the very fact that you see it as relevant shows that human thinking
does not block perceiving the relevance of that class of truths. In
that sense, Friendliness is safe. If the moral thing to do is construct
an AI with a blank slate, and the morality of this course of action is
even in theory perceptible to humans, then the first Friendly AI can do
the moral thing and build a blank-slate AI. Or if the cosmic truth
underlying the human morality is universally accessible, Friendly AI may
turn out to have been a waste of effort, but it won't actually have *hurt*
anything.

--
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence