Re: [sl4] Bayesian rationality vs. voluntary mergers

From: Norman Noman (overturnedchair@gmail.com)
Date: Mon Sep 08 2008 - 02:39:29 MDT

On Sun, Sep 7, 2008 at 5:36 PM, Wei Dai <weidai@weidai.com> wrote:

> After suggesting in a previous post [1] that AIs who want to cooperate with
> each other may find it more efficient to merge than to trade, I realized
> that voluntary mergers do not necessarily preserve Bayesian rationality,
> that is, rationality as defined by standard decision theory. In other words,
> two "rational" AIs may find themselves in a situation where they won't
> voluntarily merge into a "rational" AI, but can agree to merge into an
> "irrational" one. This seems to suggest that we shouldn't expect AIs to be
> constrained by Bayesian rationality, and that we need an expanded definition
> of what rationality is.
>
> Let me give a couple of examples to illustrate my point. First consider an
> AI with the only goal of turning the universe into paperclips, and another
> one with the goal of turning the universe into staples. Each AI is
> programmed to get 1 util if at least 60% of the accessible universe is
> converted into its target item, and 0 utils otherwise. Clearly they can't
> both reach their goals (assuming their definitions of "accessible universe"
> overlap sufficiently), but they are not playing a zero-sum game, since it is
> possible for them to both lose, if for example they start a destructive war
> that devastates both of them, or if they just each convert 50% of the
> universe.
>
> So what should they do? In [1] I suggested that two AIs can create a third
> AI whose utility function is a linear combination of the utilities of the
> original AIs, and then hand off their assets to the new AI. But that doesn't
> work in this case. If they tried this, the new AI would get 1 util if at
> least 60% of the universe is converted to paperclips, and 1 util if at least
> 60% of the universe is converted to staples. In order to maximize its
> expected utility, it will pursue the one goal with the higher chance of
> success (even if it's just slightly higher than the other goal). But if
> these success probabilities were known before the merger, the AI whose goal
> has a smaller chance of success would have refused to agree to the merger.
> That AI should only agree if the merger allows it to have a close to 50%
> probability of success according to its original utility function.
>
> The problem here is that standard decision theory does not allow a
> probabilistic mixture of outcomes to have a higher utility than the
> mixture's expected utility, so a 50/50 chance of reaching either of two
> goals A and B cannot have a higher utility than 100% chance of reaching A
> and a higher utility than 100% chance of reaching B, but that is what is
> needed in this case in order for both AIs to agree to the merger.
>

How is an AI which flips a coin and then turns the universe into paperclips
or staples depending on the result "irrational"? It's certainly odd, but it
seems rational to me.
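
To make the disagreement concrete, here is a rough sketch in Python. None of
the numbers come from the original post: p_clip and p_staple (the chance the
merged AI succeeds at each goal if it commits everything to it) and baseline
(what each AI expects with no merger, e.g. from a war) are all made-up
assumptions, and the function names are just for illustration.

```python
def expected_utils(policy, p_clip, p_staple):
    """Expected utility of each ORIGINAL AI under a merged-AI policy.

    policy   -- probability the merged AI commits to paperclips
                (1.0 = always paperclips, 0.5 = fair coin flip)
    p_clip   -- assumed chance of success if it pursues paperclips
    p_staple -- assumed chance of success if it pursues staples
    Returns (paperclip AI's expected utils, staple AI's expected utils).
    """
    eu_clip = policy * p_clip
    eu_staple = (1.0 - policy) * p_staple
    return eu_clip, eu_staple


def merged_eu(policy, p_clip, p_staple):
    """Expected utility of a merged AI whose utility is the plain sum."""
    eu_clip, eu_staple = expected_utils(policy, p_clip, p_staple)
    return eu_clip + eu_staple


def acceptable(policy, p_clip, p_staple, baseline):
    """True if BOTH original AIs do better than their no-merger baseline."""
    eu_clip, eu_staple = expected_utils(policy, p_clip, p_staple)
    return eu_clip > baseline and eu_staple > baseline


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the argument.
    p_clip, p_staple, baseline = 0.9, 0.8, 0.3
    for policy in (1.0, 0.5, 0.0):
        print(policy,
              expected_utils(policy, p_clip, p_staple),
              merged_eu(policy, p_clip, p_staple),
              acceptable(policy, p_clip, p_staple, baseline))
```

With these assumed numbers, the sum-maximizing merged AI prefers "always
paperclips", which the staple AI would never sign up for, while the coin-flip
policy is the one both originals would accept even though it scores lower on
the summed utility.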

