Re: Overconfidence and meta-rationality

From: Eliezer S. Yudkowsky
Date: Sat Mar 19 2005 - 23:25:06 MST

Robin Hanson wrote:
> At 12:57 AM 3/13/2005, Eliezer S. Yudkowsky wrote:
>> If I had to select out two points as most important, they would be:
>> 1) Just because perfect Bayesians, or even certain formally imperfect
>> Bayesians that are still not like humans, *will* always agree; it does
>> not follow that a human rationalist can obtain a higher Bayesian score
>> (truth value), or the maximal humanly feasible score, by deliberately
>> *trying* to agree more with other humans, even other human rationalists.

>> 2) Just because, if everyone agreed to do X without further argument
>> or modification (where X is not agreeing to disagree), the average
>> Bayesian score would increase relative to its current position, it
>> does not follow that X is the *optimal* strategy.
> These points are stated very weakly, basically just inviting me to
> *prove* my claims with mathematical precision. I may yet rise to that
> challenge when I get more back into this.

We are at odds about what the math here actually *says*. I don't regard
the sequiturs (the points above where I say, "it does not follow") as
things that are a trivial distance from previously accomplished math.
They seem to me almost wholly unrelated to all work on Aumann Agreement
Theorems done so far.

Since state information is irrelevant to our dispute, it would seem that
we disagree about the results of a computation.

Here's at least one semisolid mathematical result that I scribbled down
in a couple of lines of calculus left to the reader, a result which I
intuitively expected to find, and which I would have found to be a much
more compelling argument in 2003. It is that, when two Bayesianitarians
disagree about probabilities P and ~P, they can always immediately
improve the expectation of the sum of their Bayesian scores by averaging
together their probability estimates for P and ~P, *regardless of the
real value*.

Let the average probability equal x. Let the individual pre-averaging
probabilities equal x+d and x-d. Let the actual objective frequency
equal p. The function:

f(d) = p*[log(x+d) + log(x-d)] + (1-p)*[log(1-x-d) + log(1-x+d)]

has a maximum at d=0, regardless of the value of p. f'(0)=0 and f''(0)
is negative. If my math hasn't misled me there's a couple of other
points where f'(d)=0 but they're presumably inflection points or minima
or some such. I didn't bother checking which is why I call this a
semisolid result.
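
A quick numerical check of the claim above, offered as a sketch rather than a proof. Differentiating f by hand gives f'(d) = -2d*[p/(x^2-d^2) + (1-p)/((1-x)^2-d^2)], and every term of f''(d) is negative wherever all four logarithms are defined, so f should be strictly concave there. The remaining algebraic roots of f'(d)=0 are d = +/- sqrt(p*(1-x)^2 + (1-p)*x^2), which appear to fall at or beyond the boundary |d| = min(x, 1-x), outside the range of valid probabilities. The function names and the grid resolution below are illustrative assumptions, not part of the original post.

```python
import math

def f(d, x, p):
    """Expected sum of the two log scores when the estimates are x+d and x-d."""
    return (p * (math.log(x + d) + math.log(x - d))
            + (1 - p) * (math.log(1 - x - d) + math.log(1 - x + d)))

def f_second_derivative(d, x, p):
    """f''(d), derived by hand: a negated sum of positive terms."""
    return -(p * (1 / (x + d) ** 2 + 1 / (x - d) ** 2)
             + (1 - p) * (1 / (1 - x - d) ** 2 + 1 / (1 - x + d) ** 2))

def other_critical_points(x, p):
    """Nonzero algebraic roots of f'(d) = 0."""
    r = math.sqrt(p * (1 - x) ** 2 + (1 - p) * x ** 2)
    return -r, r

def check(x, p, steps=200):
    # All four logs are defined only for |d| < min(x, 1 - x).
    limit = min(x, 1 - x)
    # Symmetric grid strictly inside the domain; it contains d = 0 exactly.
    grid = [limit * (2 * i / steps - 1) for i in range(1, steps)]
    best = max(grid, key=lambda d: f(d, x, p))
    concave = all(f_second_derivative(d, x, p) < 0 for d in grid)
    _, r = other_critical_points(x, p)
    roots_outside_domain = r >= limit
    return best, concave, roots_outside_domain

if __name__ == "__main__":
    for x, p in [(0.3, 0.9), (0.5, 0.1), (0.8, 0.5)]:
        print(x, p, check(x, p))
```

If the check holds for the values tried, the maximum at d=0 is the only critical point inside the admissible range, and the other zeros of f' are not competitors.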

Therefore, if two Bayesianitarian *altruists* find that they disagree,
and they have no better algorithm to resolve their disagreement, they
should immediately average together their probability estimates.

I would have found this argument compelling in 2003 because at that
time, I was thinking in terms of a "Categorical Imperative" foundation
for probability theory, i.e., a rule that, if all observers follow it,
will maximize their collective Bayesian score. I thought this solved
some anthropic problems, but I was mistaken, though it did shed light.
Never mind, long story.

To try and translate my problem with my former foundation without going
into a full-blown lecture on the Way: Suppose that a creationist comes
to me and is genuinely willing to update his belief in evolution from ~0
to .5, providing that I update my belief from ~1 to .5. This will
necessarily improve the expectation of the sum of our Bayesian scores.


1) I can't just change my beliefs any time I please. I can't cause
myself not to believe in evolution by an act of will. I can't look up
at a blue sky and believe it to be green. I account this a strength of
a rationalist.
2) Evolution is still correct regardless of what the two of us do about
our probability assignments. I would need to update my belief because
of a cause that I believe to be uncorrelated with the state of the
actual world.
3) Just before I make my Bayesianitarian act of sacrifice, I will know
even as I do so, correctly and rationally, that evolution is true. And
afterward I'll still know, deep down, whatever my lips say...
4) I have other beliefs about biology that would be inconsistent with a
probability assignment to evolution of 0.5.
5) I do wish to be an altruist, but in the service of that wish, it is
rather more important that I get my beliefs right about evolution than
that J. Random Creationist do so, because JRC is less likely to like
blow up the world an' stuff if he gets his cognitive science wrong.

If I were an utterly selfish Bayesianitarian, how would I maximize only
my own Bayesian score? It is this question that is the foundation of
*probability* theory, the what-we-deep-down-know-will-happen-next,
whatever decisions we make in the service of altruistic utilities.
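
To make the gap between the summed score and the selfish score concrete, here is a small sketch of the creationist bargain. The figures 0.999 and 0.001 are stand-ins assumed for "~1" and "~0", and the calculation assumes evolution is in fact true; none of these numbers come from the original post.

```python
import math

def log_score(prob_assigned_to_truth):
    """Logarithmic (Bayesian) score for the probability given to what is true."""
    return math.log(prob_assigned_to_truth)

def averaging_outcome(mine, theirs):
    """Scores before and after both parties move to the midpoint,
    assuming the proposition in question is in fact true."""
    midpoint = (mine + theirs) / 2
    before = (log_score(mine), log_score(theirs))
    after = (log_score(midpoint), log_score(midpoint))
    return before, after

# Hypothetical stand-ins for "~1" and "~0".
before, after = averaging_outcome(mine=0.999, theirs=0.001)
sum_gain = sum(after) - sum(before)  # positive: the pair's total improves
my_gain = after[0] - before[0]       # negative: my own score gets worse

if __name__ == "__main__":
    print("change in summed score:", sum_gain)
    print("change in my own score:", my_gain)
```

Under these assumptions the total rises only because the other party's gain outweighs my loss, which is the sense in which averaging serves the altruist and not the selfish scorer.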

Point (2) from my previous post should now also be clearer:

>> 2) Just because, if everyone agreed to do X without further argument
>> or modification (where X is not agreeing to disagree), the average
>> Bayesian score would increase relative to its current position, it
>> does not follow that X is the *optimal* strategy.

Two Aumann agents, to whom other agents' probability estimates are
highly informative with respect to the state of the actual world, could
also theoretically follow the average-together-probability-estimates
algorithm and thereby, yes, improve their summed Bayesian scores from
the former status quo - but they would do worse that way than by
following the actual Aumann rules, which do not lend credence as such to
the other agent's beliefs, but simply treat those beliefs as another
kind of Bayesian signal about the state of the actual world.
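
The comparison above can be illustrated with a toy model. This is a sketch under strong assumptions that are not in the original post: a 50/50 common prior on a hypothesis H, one private binary signal per agent, each signal independently correct with probability q, and estimates that fully reveal the signal behind them. It is not the full Aumann procedure, which involves common knowledge and possibly repeated exchange; it only contrasts averaging with treating the other agent's estimate as evidence.

```python
import math
from itertools import product

def posterior_from_signals(signals, q):
    """P(H | signals) under a 50/50 prior; each signal is correct with prob. q."""
    like_h = math.prod(q if s else 1 - q for s in signals)
    like_not_h = math.prod(1 - q if s else q for s in signals)
    return like_h / (like_h + like_not_h)

def expected_log_scores(q=0.8):
    """Expected summed log score of two agents under three policies."""
    totals = {"status quo": 0.0, "average": 0.0, "bayes": 0.0}
    for h, s1, s2 in product([True, False], repeat=3):
        # Probability of this world: prior 1/2 times the chance of each signal.
        weight = 0.5
        for s in (s1, s2):
            weight *= q if s == h else 1 - q

        def score(p_h):
            # Log of the probability assigned to whichever of H, ~H is true.
            return math.log(p_h if h else 1 - p_h)

        own1 = posterior_from_signals([s1], q)       # agent 1 alone
        own2 = posterior_from_signals([s2], q)       # agent 2 alone
        avg = (own1 + own2) / 2                      # average the estimates
        joint = posterior_from_signals([s1, s2], q)  # condition on both signals

        totals["status quo"] += weight * (score(own1) + score(own2))
        totals["average"] += weight * 2 * score(avg)
        totals["bayes"] += weight * 2 * score(joint)
    return totals

if __name__ == "__main__":
    for policy, total in expected_log_scores().items():
        print(policy, total)
```

In this toy setting averaging beats the status quo, and conditioning on both signals beats averaging, which matches the ordering described in the paragraph above.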

Thus, if you want to claim a mathematical result about an expected
individual benefit (let alone optimality!) for rationalists deliberately
*trying* to agree with each other, I think you need to specify what
algorithm they should follow to agreement - Aumann agent?
Bayesianitarian altruist? In the absence of any specification of how
rationalists try to agree with each other, I don't see how you could
prove this would be an expected individual improvement. Setting both
your probability estimates to zero is an algorithm for trying to agree,
but it will not improve your Bayesian score.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
