Re: Why playing it safe is the most dangerous thing

From: Philip Goetz
Date: Fri Feb 24 2006 - 08:26:58 MST

On 2/24/06, Ben Goertzel wrote:

> Peter, two points:
> 1)
> Eliezer has sometimes proposed that a Singularity not properly planned
> with regard to Friendly AI is almost certain to lead to human
> extinction. But this has not been convincingly argued for. He has
> merely shown why this is a significant possibility.

Human extinction might be a likely outcome. I was speaking of
extinction of life, which I regard as a definitely bad thing, and an
unlikely outcome.

> 2)
> Phil is not really suggesting that rushing into the Singularity
> blindly is the best possible option. He's merely suggesting that the
> *better-in-principle* options are not very plausible, so that we
> should focus on it because it's by far the highest-probability
> plausible option.
> As I understand it, a caricature of Phil's argument would go something like:
> * If we launch a Singularity before the jerks in power figure out
> what's up, we have a 50/50 or so chance of a good outcome (by the
> Principle of Indifference, since what happens after the Singularity is
> totally opaque to us lesser beings)
> * If we don't launch a Singularity before the jerks in power figure
> out what's up, we have a much lower chance of a good outcome, because
> those jerks are likely to find some way to screw things up
> * The truly better-in-principle approach to the Singularity would
> require a long period of peaceful study and experimentation before
> launching the Singularity: but this is just not feasible because once
> the tech gets to a certain point, the jerks in power will pay people
> to develop it quickly and in an unsafe way
> Specialized to AGI, the argument would go something like:
> -- making provably safe AGI is really hard and will take time X
> -- for a dedicated maverick team to make AGI with unknown safety may
> be easier, and will take time Y
> -- after enough time has passed, some jerks will make unsafe and nasty
> AI; this will take time Z
> If
> Y < Z < X
> then it may be optimal to make AGI with unknown safety.

Yes! That's what I meant, thank you. With emphasis on the idea that
the AI made in time Z would be deliberately designed to lead to

We could add the notion of negative utility. "Negative utility" is my
explanation for why lotteries are so popular in poor communities,
despite the fact that the expected ROI of a lottery ticket is < 1;
also why people choose crack addiction despite knowing in advance the

Suppose, contemplating whether to buy a lottery ticket, a person sums
up the expected utility of their entire future life without buying the
lottery ticket, and concludes it is below the "zero utility level"
below which they would be better off dead. They then consider the
expected utility of buying the lottery ticket. This gives them two
possible outcomes: one of very high probability, with a slightly lower
negative utility; and one of small probability, with positive utility.

Rather than combining these two, the person reasons that they can kill
themselves any time they choose, and thus replaces each of the
negative-utility outcomes with a zero "suicide utility". The
low-probability positive outcome, averaged together with the
high-probability suicide utility of zero, produces an average utility,
which is higher than the suicide utility (zero) of their life without
the lottery ticket.
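
A minimal sketch of this calculation, assuming a discrete set of
outcomes. The probabilities and utilities are made-up illustrative
numbers, and the function names are hypothetical; nothing here comes
from real lottery data.

```python
def expected_utility(outcomes):
    """Plain expected utility over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def floored_utility(outcomes, floor=0.0):
    """Expected utility after replacing every outcome below `floor`
    with `floor` itself (the zero "suicide utility" in the text)."""
    return sum(p * max(u, floor) for p, u in outcomes)


# Hypothetical numbers: life without a ticket is certainly negative;
# with a ticket it is almost certainly slightly worse, plus a tiny
# chance of a large positive outcome.
no_ticket = [(1.0, -10.0)]
with_ticket = [(0.999999, -10.5), (0.000001, 1000.0)]

# Ordinary expected utility says not to buy the ticket...
print(expected_utility(no_ticket) > expected_utility(with_ticket))

# ...but with negative outcomes floored at zero, the ticket wins.
print(floored_utility(with_ticket) > floored_utility(no_ticket))
```

With these numbers the two rules disagree: the ticket lowers ordinary
expected utility but raises the floored one.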

(Note that finding oneself with a losing lottery ticket doesn't then
require one to commit suicide. One merely begins looking for other
low-probability branches - future lottery tickets - leading towards
positive utility.)

(The crack cocaine case involves planning to create a brief period of
positive utility, followed by a long period of negative utility, which
averages out to lower summed utility than without crack.)

More specifically, this negative utility theory says that, when
comparing possible actions, you compare the expected utilities only of
the portions of the probability distributions with positive utility.
If you consider the probability distribution on future expected summed
life utilities, and let

    - U0 be the positive area for the no-ticket distribution (the
      probability-weighted integral of utility over all outcomes
      under which utility is positive)
    - UT be the positive area for the bought-a-ticket distribution

then UT > U0 => you should buy a ticket.

We can apply similar logic to possible outcomes of the Singularity.
If, as I've argued, the careful approach provides us with a near-1
probability of negative utility, and the damn-the-torpedoes approach
provides us with a greater-than-epsilon probability of positive
utility, then we seem to be in a situation where the summed positive
utility of damn-the-torpedoes is greater than the summed positive
utility of the cautious approach, EVEN if the expected utility of the
cautious approach is greater.
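
The same comparison can be sketched for the two approaches. The
distributions below are purely hypothetical placeholders chosen to show
that the two decision rules can disagree; they are not estimates of
anything, and the names are invented for the example.

```python
def expected_utility(outcomes):
    """Plain expected utility over (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)


def positive_area(outcomes):
    """Probability-weighted sum of utility over the outcomes whose
    utility is positive; negative outcomes are ignored."""
    return sum(p * u for p, u in outcomes if u > 0)


# Hypothetical: the cautious approach is almost surely mildly negative.
cautious = [(0.99, -1.0), (0.01, 100.0)]
# Hypothetical: damn-the-torpedoes risks a far worse outcome but has a
# much larger chance of a positive one.
damn_the_torpedoes = [(0.5, -1000.0), (0.5, 100.0)]

# Expected utility favors the cautious approach...
print(expected_utility(cautious) > expected_utility(damn_the_torpedoes))

# ...while the positive-area rule favors damn-the-torpedoes.
print(positive_area(damn_the_torpedoes) > positive_area(cautious))
```

Whether real-world probabilities look anything like these is exactly
the point under dispute; the sketch only shows that the two rules can
rank the options in opposite orders.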

- Phil
