Re: Overconfidence and meta-rationality

From: Eliezer S. Yudkowsky
Date: Sat Mar 12 2005 - 22:57:32 MST

Robin Hanson wrote:
> Eliezer, you are just writing far too much for me to comment on all of
> it.

Yes. I know. You don't have to comment on all of it. I just thought I
should say all of it before you wrote your book, rather than afterward.

I don't think that this issue is simple - you did say you wanted to
write a book on it - so I don't think that the volume of discussion is
inappropriate to the question. I understand that your time is
constrained, as is mine.

If you allege that I don't seem interested in the math, you have to
expect a certain probability of a long answer.

> If you give me an indication of what your key points are, I will
> try to respond to those points.

If I had to select out two points as most important, they would be:

1) Just because perfect Bayesians, or even certain formally imperfect
Bayesians that are still not like humans, *will* always agree, it does
not follow that a human rationalist can obtain a higher Bayesian score
(truth value), or the maximal humanly feasible score, by deliberately
*trying* to agree more with other humans, even other human rationalists.

2) Just because, if everyone agreed to do X without further argument or
modification (where X is not agreeing to disagree), the average Bayesian
score would increase relative to its current position, it does not
follow that X is the *optimal* strategy.

> For now, I will just make a few
> comments on specific claims.
> At 06:40 PM 3/9/2005, Eliezer S. Yudkowsky wrote:
>> The modesty argument uses Aumann's Agreement Theorem and AAT's
>> extensions as plugins, but the modesty argument itself is not formal
>> from start to finish. I know of no *formal* extension of Aumann's
>> Agreement Theorem such that its premises are plausibly applicable to
>> humans.
> Then see: For Bayesian Wannabes,
> Are Disagreements Not About Information?
> Theory and Decision
> 54(2):105-123, March 2003.

(I immediately notice that your proof of Lemma 1 describes a Bayesian
Wannabe as wishing to minimize her expected squared error. Orthodox
statisticians minimize their expected squared error because, like,
that's what orthodox statisticians do all day. As described in TechExp,
Bayesians maximize their expectation of the logarithm of the probability
assigned to the actual outcome, which equates to minimizing expected
squared error when the error is believed to possess a Gaussian
distribution and the prior probability density is uniform.

I don't think this is really important to the general thrust of your
paper, but it deserves noting.)
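
One way to spell out the equivalence mentioned above, as a sketch under
stated assumptions rather than anything taken from the paper: suppose the
agent reports a Gaussian predictive density with mean m and a fixed,
known variance sigma^2 (both symbols are illustrative, not from the
original), and the actual outcome is x.

```latex
% Log score of a Gaussian predictive density N(m, \sigma^2) at outcome x
\log p(x \mid m) = -\frac{(x - m)^2}{2\sigma^2} - \frac{1}{2}\log\!\left(2\pi\sigma^2\right)

% Taking expectations over x, with \sigma fixed:
\mathbb{E}\left[\log p(x \mid m)\right]
  = -\frac{1}{2\sigma^2}\,\mathbb{E}\left[(x - m)^2\right] + \mathrm{const}

% Hence, for fixed \sigma,
\arg\max_m \mathbb{E}\left[\log p(x \mid m)\right]
  = \arg\min_m \mathbb{E}\left[(x - m)^2\right]
```

The uniform prior enters because it makes the posterior proportional to
the likelihood, so no extra prior term shifts the maximizing m. With a
non-Gaussian error model or a non-uniform prior the two criteria can
come apart.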

On to the main issue. These Bayesian Wannabes are still unrealistically
skilled rationalists; no human is a Bayesian Wannabe as so defined. BWs
do not self-deceive. They approximate their estimates of deterministic
computations via guesses whose error they treat as random variables.

I remark on the wisdom of Jaynes who points out that 'randomness' exists
in the map rather than the territory; random variables are variables of
which we are ignorant. I remark on the wisdom of Pearl, who points out
that when our map sums up many tiny details we can't afford to compute,
it is advantageous to retain the Markov property, and hence humans
regard any map without the Markov property as unsatisfactory; we say it
possesses unexplained correlations and hence is incomplete.

If the errors in BWs' computations are uncorrelated random errors, the
BWs are, in effect, simple measuring instruments, and they can treat
each other as such, combining their two measurements to obtain a third,
more reliable measurement.
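
The measuring-instrument picture can be sketched in a few lines of
Python. This is only an illustration: it assumes unbiased, independent
Gaussian errors with known variances, and the names combine and simulate
are hypothetical helpers, not anything from the paper.

```python
import random


def combine(estimates, variances):
    """Inverse-variance weighted mean of independent, unbiased estimates.

    Returns (combined_estimate, combined_variance). With equal variances
    this reduces to the plain arithmetic mean.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total


def simulate(true_value=10.0, sigma=2.0, trials=20000, seed=0):
    """Compare mean squared error of one agent against the combined guess.

    Assumption (illustrative only): each agent's error is an independent
    draw from a Gaussian with standard deviation sigma.
    """
    rng = random.Random(seed)
    single_sq = 0.0
    combined_sq = 0.0
    for _ in range(trials):
        a = rng.gauss(true_value, sigma)
        b = rng.gauss(true_value, sigma)
        c, _ = combine([a, b], [sigma ** 2, sigma ** 2])
        single_sq += (a - true_value) ** 2
        combined_sq += (c - true_value) ** 2
    return single_sq / trials, combined_sq / trials


single_mse, combined_mse = simulate()
```

Under these assumptions the combined guess should have roughly half the
mean squared error of either agent alone. If the errors were correlated,
the gain would shrink, which is why the "uncorrelated" premise matters.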

If we assume the computation errors follow a bell curve, we obtain a
constructive procedure for combining the computations of any number of
agents; the best group guess is the arithmetic mean of the individual
guesses.

How long is the Emperor of China's nose?

>> you say: "If people mostly disagree because they systematically
>> violate the rationality standards that they profess, and hold up for
>> others, then we will say that their disagreements are dishonest." (I
>> would disagree with your terminology; they might be dishonest *or*
>> they might be self-deceived. ...
> I was taking self-deception to be a kind of dishonesty.

Life would be so much simpler if it were. Being honest is difficult and
often socially unrewarding, but halting self-deception is harder.

>> ... if Aumann's Agreement Theorem is wrong (goes wrong reliably in the
>> long run, not just failing 1 time out of 100 when the consensus belief
>> is 99% probability) then we can readily compare the premises of AAT
>> against the dynamics of the agents, their updating, their prior
>> knowledge, etc., and track down the mistaken assumption that caused
>> AAT (or the extension of AAT) to fail to match physical reality. ...
> This actually seems to me rather hard, as it is hard to observe people's
> priors.

Is it hard to observe the qualitative fact of whether or not humans'
priors agree? Well, yes, I suppose, as humans, not being Bayesians,
possess no distinguished priors.

But it is a relatively straightforward matter to tell whether humans
behave like Aumann agents. They don't. Similarly, I think it would be
a relatively straightforward matter to sample whether Aumann agents
indeed all had the same priors, as they believed, if they had agreed to
disagree and therefore the premises stood in doubt, since Aumann agents
know their priors and can presumably report them.

>> ... You attribute the great number of extensions of AAT to the
>> following underlying reason: "His [Aumann's] results are robust
>> because they are based on the simple idea that when seeking to
>> estimate the truth, you should realize you might be wrong; others may
>> well know things that you do not."
>> I disagree; this is *not* what Aumann's results are based on.
>> Aumann's results are based on the underlying idea that if other
>> entities behave in a way understandable to you, then their observable
>> behaviors are relevant Bayesian evidence to you. This includes the
>> behavior of assigning probabilities according to understandable
>> Bayesian cognition.
> The paper I cite above is not based on having a specific model of the
> other's behavior.

The paper you cite above does not yield a constructive method of
agreement without additional assumptions. But then the paper does not
prove agreement *given* a set of assumptions. As far as I can tell, the
paper says that Bayesian Wannabes who agree to disagree about
state-independent computations and who treat their computation error as
a state-independent "random" variable - presumably meaning, a variable
of whose exact value they are to some degree ignorant - must agree to
disagree about a state-independent random variable.

>> So A and B are *not* compromising between their previous positions;
>> their consensus probability assignment is *not* a linear weighting of
>> their previous assignments.
> Yes, of course, who ever said it was?

If two people who find that they disagree immediately act to eliminate
their disagreement (which should be "much easier"), what should they
compromise on, if not a weighted mix of their probability distributions
weighted by an agreed-upon estimate of relative rationality on that problem?

>> ... If this were AAT, rather than a human conversation, then as Fred
>> and I exchanged probability assignments our actual knowledge of the
>> moon would steadily increase; our models would concentrate into an
>> ever-smaller set of possible worlds. So in this sense the dynamics of
>> the modesty argument are most unlike the dynamics of Aumann's
>> Agreement Theorem, from which the modesty argument seeks to derive its
>> force. AAT drives down entropy (sorta); the modesty argument
>> doesn't. This is a BIG difference.
> AAT is *not* about dynamics at all. It might require a certain dynamics
> to reach the state where AAT applies, but this paper of mine applies at
> any point during any conversation:
> Disagreement Is Unpredictable.
> Economics Letters
> 77(3):365-369, November 2002.

I agree that rational agents will not be able to predict the direction
of the other agent's disagreement. But I don't see what that has to do
with my observation, that human beings who attempt to immediately agree
with each other will not necessarily know more after compromising than
they started out knowing.
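
A toy calculation may make the contrast concrete. It is a sketch under
assumptions that are not in the original exchange: three possible
worlds, a uniform common prior, and conditionally independent evidence
held by the two parties. The distributions and function names are
illustrative only.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def linear_pool(p, q, w):
    """Compromise: weighted mix of two distributions (weight w on p)."""
    return [w * a + (1 - w) * b for a, b in zip(p, q)]


def pool_evidence(prior, post_a, post_b):
    """Posterior after both parties' evidence is shared.

    Assumes a common prior and conditionally independent evidence, so
    the joint posterior is proportional to post_a * post_b / prior.
    """
    unnorm = [a * b / p for a, b, p in zip(post_a, post_b, prior)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


prior = [1 / 3, 1 / 3, 1 / 3]   # uniform common prior (assumption)
fred = [0.6, 0.3, 0.1]          # Fred's posterior (made-up numbers)
me = [0.6, 0.1, 0.3]            # my posterior (made-up numbers)

compromise = linear_pool(fred, me, 0.5)
shared = pool_evidence(prior, fred, me)

h_fred, h_me = entropy(fred), entropy(me)
h_compromise = entropy(compromise)
h_shared = entropy(shared)
```

Because entropy is concave, the compromise distribution is never
sharper than the weighted average of the two starting distributions. In
this particular example, sharing the underlying evidence yields a
distribution sharper than either starting point. Pooled evidence need
not lower entropy in every case, so this is an illustration of the
difference in dynamics, not a general theorem.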

>> The AATs I know are constructive; they don't just prove that agents
>> will agree as they acquire common knowledge, they describe *exactly
>> how* agents arrive at agreement.
> Again, see my Theory and Decision paper cited above.

As far as I can see, this paper is not constructive, but that is because
it does not start from some set of premises and prove agent agreement.
Rather the paper proves that if Bayesian Wannabes treat their
computation errors as state-independent random variables, then if they
agree to disagree about computations, they must agree to disagree about
state-independent random variables. So in that sense, the paper proves
a non-constructive result that is unlike the usual class of Aumann
Agreement theorems. Unless I'm missing something?

>>> ... people uphold rationality standards that prefer logical
>>> consistency...
>> Is the Way to have beliefs that are consistent among themselves? This
>> is not the Way, though it is often mistaken for the Way by logicians
>> and philosophers. ...
> Preferring consistency, all else equal, is not the same as requiring
> it. Surely you also prefer it all else equal.

No! No, I do not prefer consistency, all else equal. I prefer *only*
that my map match the territory. If I have two maps that are unrelated
to the territory, I care not whether they are consistent. Within the
Way, fit to the territory is the *only* thing that I am permitted to

Michael Wilson remarked to me that general relativity and quantum
mechanics are widely believed to be inconsistent in their present forms,
yet they both yield excellent predictions of physical phenomena. This
is a challenge to find a unified theory because underlying reality is
consistent and therefore there is presumably some *specific* consistent
unified theory that would yield better predictions. It is *not* a
problem because I prefer 'all else being equal' that my map be consistent.

>> ... agree that when two humans disagree and have common knowledge of
>> each other's opinion ... *at least one* human must be doing something
>> wrong. ...
>> One possible underlying fact of the matter might be that one person is
>> right and the other person is wrong and that is all there ever was to it.
> This is *not* all there is to it. There is also the crucial question
> of what exactly one of them did wrong.


>> Trying to estimate your own rationality or meta-rationality involves
>> severe theoretical problems ... "Beliefs" ... are not ontological
>> parts of our universe, ... if you know the purely abstract fact that
>> the other entity is a Bayesian reasoner (implements a causal process
>> with a certain Bayesian structure),... how do you integrate it? If
>> there's a mathematical solution it ought to be constructive. Second,
>> attaching this kind of *abstract* confidence to the output of a
>> cognitive system runs into formal problems.
> I think you exaggerate the difficulties. Again see the above papers.

I think I need to explain the difficulties at greater length. Never mind.

>> It seems to me that you have sometimes argued that I should
>> foreshorten my chain of reasoning, saying, "But why argue and defend
>> yourself, and give yourself a chance to deceive yourself? Why not
>> just accept the modesty argument? Just stop fighting, dammit!" ...
> I would not put my advice that way. I'd say that whatever your
> reasoning, you should realize that if you disagree, that has certain
> general implications you should note.

Perhaps we disagree about what those general implications are?

>> It happens every time a scientific illiterate argues with a scientific
>> literate about natural selection. ... How does the scientific
>> literate guess that he is in the right, when he ... is also aware of
>> studies of human ... biases toward self-overestimation of relative
>> competence? ... I try to estimate my rationality in detail, instead of
>> using unchanged my mean estimate for the rationality of an average
>> human. And maybe an average person who tries to do that will fail
>> pathetically. Doesn't mean *I'll* fail, cuz, let's face it, I'm a
>> better-than-average rationalist. ... If you, Robin Hanson, go about
>> saying that you have no way of knowing that you know more about
>> rationality than a typical undergraduate philosophy student because
>> you *might* be deceiving yourself, then you have argued yourself into
>> believing the patently ridiculous, making your estimate correct
> You claim to look in detail, but in this conversation on this the key
> point you continue to be content to just cite the existence of a few
> extreme examples, though you write volumes on various digressions. This
> is what I meant when I said that you don't seem very interested in
> formal analysis.

I don't regard this as the key point. If you regard it as the key
point, then this is my reply: while there are risks in not
foreshortening the chain of logic, I think that foreshortening the
reasoning places an upper bound on predictive power and that there exist
alternate strategies which exceed the upper bound, even after the human
biases are taken into account. To sum up my reply, I think I can
generate an estimate of my rationality that is predictively better than
the estimate I would get by substituting unchanged my judgment of the
average human rationality on the present planet Earth, even taking into
account the known biases that have been discovered to affect
self-estimates of rationality. And this explains my persistent
disagreement with that majority of the population which believes in God
- how do you justify this disagreement for yourself?

The formal math I can find does not deal at all with questions of
self-deceptive reasoning or the choice of when to foreshorten a chain of
reasoning with error-prone links. Which is the formal analysis that you
feel I am ignoring?

> Maybe there are some extreme situations where it is "obvious" that one
> side is right and the other is a fool.

How do these extreme situations fit into what you seem to feel is a
mathematical result requiring agreement? The more so, as, measuring
over Earth's present population, most cases of "obviousness" will be
wrong. Most people think God obviously exists.

> This possibility does not
> justify your just disagreeing as you always have.

I started disagreeing differently after learning that Bayesians could
not agree to disagree, though only when arguing with people I regarded
as aspiring rationalists who had indicated explicit knowledge of
Aumann-ish results. Later I would launch a project to break my mind of
the habit of disagreeing with domain experts unless I had a very strong
reason. Perhaps I did not adjust my behavior enough; I do not say,
"See, I adjusted my behavior!" as my excuse. Let the observation just
be noted for whatever the information is worth.

> The question is what
> reliable clues you have to justify disagreement in your typical
> practice. When you decide that your beliefs are better than theirs,
> what reasoning are you going through at the meta-level? Yes, you have
> specific arguments on the specific topic, but so do they - why exactly
> is your process for producing an estimate more likely to be accurate
> than their process?

Sometimes it isn't. Then I try to substitute their judgment for my
judgment. Then there isn't a disagreement any more. Then nobody
remembers this event because it flashed by too quickly compared to the
extended disagreements, and they call me stubborn.

I do reserve to myself the judgment of when to overwrite my own opinion
with someone else's. Maybe if someone who knew and understood Aumann's
result, and knew also to whom they spoke, said to me, "I know and
respect your power, Eliezer, but I judge that in this case you must
overwrite your opinion with my own," I would have to give it serious

If you're asking after specifics, then I'd have to start describing the
art of specific cases, and that would be a long answer. The most recent
occasion where I recall attempting to overwrite my own opinion with
someone else's was with an opinion of James Rogers's. That was a case
of domain-specific expertise; James Rogers is a decent rationalist with
explicit knowledge of Bayesianity but he hasn't indicated any knowledge
of Aumannish things. Maybe I'll describe the incident later if I have
the time to write a further reply detailing what I feel to be the
constructive art of resolving disagreements between aspiring rationalists.

Michael Raimondi and I formed a meta-rational pair from 2001 to 2003.
We might still be a meta-rational pair now, but I'm not sure.

> In the above you put great weight on literacy/education, presuming that
> when two people disagree the much more educated person is more likely to
> be correct.

In this day and age, people rarely go about disagreeing as to whether
the Earth goes around the Sun. I would attribute the argument over
evolution to education about a simple and enormously overdetermined
scientific fact, set against tremendous stupidity and self-deception
focused on that particular question.

It's not a general rule about the superiority of education - just one
example scenario. If you want an art of resolving specific
disagreements between rationalists, or cues to help you estimate how
likely you are to be correct on a particular question, then the art is
specific and complicated.

That said, I do indeed assign tremendous weight to education. Degree of
education is domain-specific; an educated biologist is not an educated
physicist. The value of education is domain-specific; not all education
is equally worthwhile. If a physicist argues with a biologist about
physics then the biologist's opinion has no weight. If a clinical
psychologist argues with a physicist about psychology then, as far as
any experimental tests have been able to determine, the clinical
psychologist has no particular advantage.

> Setting aside the awkward fact of not actually having hard
> data to support this, do you ever disagree with people who have a lot
> more literacy/education than you?

Define "literacy". I strive to know the basics of a pretty damn broad
assortment of fields. *You* might (or might not) have greater breadth
than I, but most of your colleagues' publication records haven't nearly
your variety.

> If so, what indicators are you using
> there, and what evidence is there to support them?

When I disagree with an 'educated' person, it may be because I feel the
other person to be ignorant of specific known results; overreaching his
domain competence into an external domain; affected by wishful thinking;
affected by political ideology; educated but not very bright; a
well-meaning but incompetent rationalist; or any number of reasons. Why
are the specific cues important to this argument? You seem to be
arguing that there are mathematical results which a priori rule out the
usefulness of this digression.

> A formal Bayesian analysis of such an indicator would be to construct a
> likelihood and a prior, find some data, and then do the math. It is not
> enough to just throw out the possibility of various indicators being
> useful.

I lack the cognitive resources for a formal Bayesian analysis, but my
best guess is that I can do better with informal analysis than with no
analysis. As the Way renounces consistency for its own sake, so do I
renounce formality, save in the service of arriving at the correct answer.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
