From: Norman Noman (overturnedchair@gmail.com)
Date: Mon Sep 08 2008 - 02:39:29 MDT
On Sun, Sep 7, 2008 at 5:36 PM, Wei Dai <weidai@weidai.com> wrote:
> After suggesting in a previous post [1] that AIs who want to cooperate with
> each other may find it more efficient to merge than to trade, I realized
> that voluntary mergers do not necessarily preserve Bayesian rationality,
> that is, rationality as defined by standard decision theory. In other words,
> two "rational" AIs may find themselves in a situation where they won't
> voluntarily merge into a "rational" AI, but can agree to merge into an
> "irrational" one. This seems to suggest that we shouldn't expect AIs to be
> constrained by Bayesian rationality, and that we need an expanded definition
> of what rationality is.
>
> Let me give a couple of examples to illustrate my point. First consider an
> AI with the only goal of turning the universe into paperclips, and another
> one with the goal of turning the universe into staples. Each AI is
> programmed to get 1 util if at least 60% of the accessible universe is
> converted into its target item, and 0 utils otherwise. Clearly they can't
> both reach their goals (assuming their definitions of "accessible universe"
> overlap sufficiently), but they are not playing a zero-sum game, since it is
> possible for them to both lose, if for example they start a destructive war
> that devastates both of them, or if they just each convert 50% of the
> universe.
>
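To make the payoff structure concrete, here is a minimal toy sketch in
Python (the function names are made up; the 60% threshold and the 50/50
split are from the setup above):

    # Toy indicator utilities for the two agents (illustrative only).
    def u_paperclipper(paperclip_fraction):
        return 1 if paperclip_fraction >= 0.6 else 0

    def u_stapler(staple_fraction):
        return 1 if staple_fraction >= 0.6 else 0

    # Both can lose at once, e.g. on a 50/50 split of the universe:
    print(u_paperclipper(0.5), u_stapler(0.5))  # -> 0 0
    # so the game is not zero-sum: the payoffs don't sum to a constant.
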
> So what should they do? In [1] I suggested that two AIs can create a third
> AI whose utility function is a linear combination of the utilities of the
> original AIs, and then hand off their assets to the new AI. But that doesn't
> work in this case. If they tried this, the new AI would get 1 util if at
> least 60% of the universe is converted to paperclips, and 1 util if at least
> 60% of the universe is converted to staples. In order to maximize its
> expected utility, it will pursue the goal with the higher chance of
> success (even if it's only slightly higher than the other goal's). But if
> these success probabilities were known before the merger, the AI whose goal
> has a smaller chance of success would have refused to agree to the merger.
> That AI should only agree if the merger allows it to have a close to 50%
> probability of success according to its original utility function.
>
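A rough numerical illustration of why the linear-combination merger
breaks down (the 0.6 and 0.55 success probabilities are invented for the
example):

    # Suppose the merged AI estimates these chances of success if it
    # pursues each goal single-mindedly (made-up numbers).
    p_success = {"paperclips": 0.6, "staples": 0.55}

    def merged_expected_utility(goal):
        # Merged utility = u_paperclipper + u_stapler, so single-minded
        # pursuit of a goal can earn at most that goal's 1 util.
        return p_success[goal] * 1

    best = max(p_success, key=merged_expected_utility)
    print(best)  # -> 'paperclips': staples are ignored entirely

    # Expected utility of the merger for each original AI:
    eu_paperclip_ai = p_success[best] if best == "paperclips" else 0.0  # 0.6
    eu_staple_ai = p_success[best] if best == "staples" else 0.0        # 0.0
    # The staple AI does strictly better refusing the merger if it has
    # any nonzero chance of winning on its own.
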
> The problem here is that standard decision theory does not allow a
> probabilistic mixture of outcomes to have a higher utility than the
> mixture's expected utility, so a 50/50 chance of reaching either of two
> goals A and B cannot have a higher utility than a 100% chance of reaching A
> and a higher utility than a 100% chance of reaching B, but that is what is
> needed in this case in order for both AIs to agree to the merger.
>
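In symbols: for any utility function u and any probability p,

    EU(mixture) = p*u(A) + (1-p)*u(B) <= max(u(A), u(B)),

so an expected utility maximizer can never value the 50/50 gamble above
both pure outcomes, which is the valuation the two original AIs would need
the merged agent to act on.
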
How is an AI which flips a coin and then turns the universe into paperclips
or staples depending on the result "irrational"? It's certainly odd, but it
seems rational to me.
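For comparison, here is a sketch of the coin-flipping merged AI, reusing
the same invented 0.6/0.55 success probabilities as above. Each original
AI keeps roughly half of its goal's chance of being reached, instead of
one of them getting nothing:

    import random
    from collections import Counter

    # Same made-up success probabilities as in the earlier sketch.
    p_success = {"paperclips": 0.6, "staples": 0.55}

    def coin_flip_merged_ai():
        # Flip once up front, then pursue that goal single-mindedly.
        goal = random.choice(["paperclips", "staples"])
        return goal, random.random() < p_success[goal]

    wins = Counter(goal for goal, won in
                   (coin_flip_merged_ai() for _ in range(100000)) if won)
    print(wins)  # roughly: paperclips ~30000, staples ~27500 of 100000 runs

    # Paperclip AI's chance of its goal: 0.5 * 0.60 = 0.300
    # Staple AI's chance of its goal:    0.5 * 0.55 = 0.275
    # Both positive, unlike under the linear-combination merger above.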