Friendliness at different levels

From: Stuart Armstrong (dragondreaming@googlemail.com)
Date: Mon May 05 2008 - 09:22:58 MDT

"With great powers come great responsibilities," Spider-man
"But if you could choose, Doctor, if you could decide who lives and who
dies, that would make you a monster," Doctor Who

I'd like to add my voice to the occasional monologue of complaints
against the term "friendly AI". I understand why the term is used -
essentially we want good outcomes from an AI, but the problem of
safely specifying outcomes is intractable, so the best solution is to
have an AI that wants similar outcomes to us. Our friends want the
best for us, hence friendly AI.

That's fine, but it makes no sense to call an advanced AI friendly. I
have many good friends, but very few that I would trust as a
politician, none I would trust as a head of state, and certainly none
I would trust with the sort of power an advanced AI would wield. I do
not care if elected leaders feel my pain, understand me, despise or
love me. I only care that they make decisions that benefit me or
refrain from hurting me.

If two people are about to die, and I must choose one to save, then
universal friendliness won't help me decide. Unless I pick at random,
I have to use some sort of balance of costs and benefits to make the
decision.

Similarly, an advanced AI must make its decisions based on a much more
complicated calculus of costs and benefits, not on friendliness. If a
friendly man tries to save people's lives during a flood, and fails to
save them all, then he is admirable. If an advanced AI fails to save
someone, then it is likely that the AI decided to let them die. This
is not the decision of a friend, but that of a calculating leader.

As part of the AI's decision-making process, friendliness reduces to
valuing humanity's survival, happiness and development. But the actual
details of how the AI acts are unrelated to any intuitive feelings of
"friendliness". The majority of the AI's runtime and decisions will
not be governed by friendliness. The best model for an AI is that of a
"good politician" or "benevolent despot", not a friend.

So will I continue using the term FAI (rather than benevolent AI, or
safe AI)? Of course I will, as it's the agreed-upon term. I just
wanted to point out its misleading quality.

Stuart

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:01:02 MDT