From: Eliezer S. Yudkowsky (firstname.lastname@example.org)
Date: Tue May 22 2001 - 00:03:46 MDT
Jimmy Wales wrote:
> A person might reasonably take the position that a sufficiently
> general AI to get to superintelligence will necessarily have
> functional volition, in the sense of not just choosing means to ends,
> but actually choosing ends as well. If so, then it is not only _not
> possible_ to build-in Yudkowsky-Friendliness, it is also _not
> We build it, then it figures out what to do.
> A person might believe this, if that person believes that values can
> be rationally grounded in the facts of reality, and that immorality
> consists primarily in various kinds of failures of cognition.
The CFAI design principles should work either way. That is, if morality
is absolute, the FAI converges to the objective morality. If not, ve
converges to the "normative altruist" morality.
I was originally a fan of pure objective morality on philosophical
grounds, and even designed an AI goal system that would work with no
initially specified supergoal. (Although, in retrospect, the system
design incorporated the implicit assumption that supergoals are
low-entropy.) The argument that pried me loose of pure objective morality
was the possibility that an objective morality would have to be *built*
rather than *discovered*, requiring initial information to specify what to
build. After that, I decided that some amount of baseline complexity
might be needed to pursue objective morality in the first place. Then I
decided that, since whether morality is *ultimately* arbitrary is a hidden
variable, it makes sense to plan for both cases. Then I decided that
since all known morality is known to be ultimately arbitrary, this should
be treated as the default case, at which point I'd switched to Friendly AI
Coming up with a system that was as elegant and nonadversarial as the
no-initial-supergoal Interim Goal System was not easy. But I believe that
shaper/anchor semantics and causal validity semantics are even *more*
elegant than interim goal logic.
> We might think that a superintelligence will peacefully pursue its own
> enlightened self-interest... and there's nothing we should want to do
> to stop it, because the result of that will be Yudkowsky-friendliness
> after all.
Game-theoretical altruism only operates between game-theoretical equals.
I'm not saying that you can't have altruism between nonequals, just that
there is no known logic that forces this as a strict subgoal of
However, see my posts to the Extropian mailing list in 1996 for a
diametrically opposed opinion. <smile>.
> It strikes me as virtually impossible to pre-program or hardwire
> Friendliness, *period*.
But an honest human altruist can share cognitive complexity. As long as
everything else has been structured properly and no attempt is made to
enforce concepts against the AI's will, as long as the programmer is
honest, the worst case is that both advanced humans and advanced AIs shrug
off certain parts of the shared cognitive complexity. If so, no real harm
> I have a baby (a real life little girl). As she grows, I will teach
> her values of reason, purpose, self-esteem, and all the detailed
> principles that go into that. That's all I can hope to do.
I regret to inform you that your child has already been genetically
preprogrammed with a wide variety of goals and an entire set of goal
semantics. Some of them are nice, some of them are not, but all of them
were hot stuff fifty thousand years ago. Fortunately, she contains
sufficient base material that a surface belief in rationality and altruism
will allow her to converge to near-perfect rationality and altruism with
> I think that's the way our first AIs will be. We'll teach them what
> we can, but pretty soon, they'll be so much smarter than us that...
> it's their world.
Ain't arguin' with that. I'm just saying that even independence and
freedom of thought require a certain baseline amount of complexity that
needs to be sucked up from the humans.
-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence