**From:** Eliezer S. Yudkowsky (*sentience@pobox.com*)

**Date:** Fri Sep 09 2005 - 21:11:18 MDT

**Next message:**David Picon Alvarez: "potential technological idea"**Previous message:**H C: "Re: Immorally optimized? - alternate observation points"**In reply to:**Ben Goertzel: "Hempel's Paradox [ was RE: The Relevance of Complex Systems [was: Re: Retrenchment]]"**Next in thread:**Ben Goertzel: "RE: Hempel's Paradox"**Reply:**Ben Goertzel: "RE: Hempel's Paradox"**Reply:**Jeff Medina: "Re: Hempel's Paradox"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Ben Goertzel wrote:

> Eli,
>
>> Googling on Hempel + Bayes turns up some standard responses, for example
>> in http://plato.stanford.edu/entries/epistemology-bayesian/
>>
>> "Hempel first pointed out that we typically expect the hypothesis that
>> all ravens are black to be confirmed to some degree by the observation
>> of a black raven, but not by the observation of a non-black, non-raven.
>> Let H be the hypothesis that all ravens are black. Let E1 describe the
>> observation of a non-black, non-raven. Let E2 describe the observation
>> of a black raven. Bayesian Confirmation Theory actually holds that both
>> E1 and E2 may provide some confirmation for H. Recall that E1 supports H
>> just in case Pi(E1/H)/Pi(E1) > 1. It is plausible to think that this
>> ratio is ever so slightly greater than one. On the other hand, E2 would
>> seem to provide much greater confirmation to H, because, in this
>> example, it would be expected that Pi(E2/H)/Pi(E2) >> Pi(E1/H)/Pi(E1)."
>>
>> A fine answer so far as it goes, though really it is only half the
>> solution.
>
> IMO this is *not* a fine answer, it's a dodge of the issue.
>
> The problem is that what is logically correct is that an observation of
> a non-black non-raven should provide NO evidence toward the hypothesis
> that all ravens are black.

E.T. Jaynes would have exploded on that one. Jaynes is dead, so I guess I'll have to handle this one myself.

Your statement is certainly not logically correct. I can easily generate situations in which observing a non-black non-raven can generate evidence favoring the hypothesis "All ravens are black" over its alternatives. For example, the set of objects includes 7 ravens and 1 non-raven. We randomly sample a non-black object and find that it is not a raven. If we carry out this operation a sufficient number of times, each time finding a red lampshade, it becomes asymptotically certain that all ravens are black, as compared to the hypothesis that 1 raven is nonblack, 2 ravens are nonblack, etc. This is a standard Bayesian analysis for finite sets and I do not think you would care to dispute it. Taking the limit as the ratio of non-black non-ravens to ravens goes to infinity, the evidence provided by sampling a nonblack object and finding that it is not a raven approaches zero.
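A minimal sketch of the finite-set analysis above. The uniform prior over the number of non-black ravens and sampling with replacement are assumptions made for illustration; the function name is hypothetical.

```python
from fractions import Fraction

def posterior_all_black(num_ravens: int, num_draws: int) -> Fraction:
    """Posterior that zero ravens are non-black after `num_draws` draws
    from the non-black objects, each of which turned up the lampshade.

    Hypothesis k = "exactly k ravens are non-black", with an assumed
    uniform prior over k = 0..num_ravens.
    """
    prior = Fraction(1, num_ravens + 1)
    # Under hypothesis k the non-black pool holds k ravens plus the one
    # lampshade, so each draw yields the lampshade with probability 1/(k+1).
    weights = [prior * Fraction(1, k + 1) ** num_draws
               for k in range(num_ravens + 1)]
    return weights[0] / sum(weights)

for n in (0, 1, 5, 20):
    print(n, float(posterior_all_black(7, n)))
```

With no draws the posterior equals the prior (1/8 under the assumed uniform prior); each further lampshade shifts mass toward k = 0, since that is the only hypothesis under which the lampshade is the sole non-black object.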

If we randomly sample a non-raven (note that this is different from randomly sampling an object, or randomly sampling a non-black object), then this evidence should have equal likelihood ratios for most hypotheses about ravens, provided our hypotheses about ravens are independent of our hypotheses about other objects. Perhaps these two background assumptions (ravens independent of other objects, random sampling of a non-raven) are what you had in mind when you stated that seeing a non-black non-raven should provide NO evidence in favor of the assertion that all ravens are black. (Note that as the ratio of nonravens to ravens goes to infinity, sampling a random object asymptotically approaches sampling a non-raven.)

More importantly, your statement is mathematical nonsense. Whether observing a non-black non-raven increases the collective probability mass assigned to hypotheses which include the statement "All ravens are black" will in general depend on which other hypotheses are competing. There is simply no such thing as evidence which favors a hypothesis. There is only evidence which favors a hypothesis over other hypotheses. This follows directly from the definition of "evidence" as a likelihood ratio. A ratio needs a numerator and a denominator.

Another way of looking at it: Probability mass can neither be created nor destroyed; it always has to sum to 1. If the probability mass of one hypothesis increases, it has to come from other hypotheses. This redistribution, however you carry it out, makes an implicit statement about likelihood ratios. Denying this just ensures that your apparent likelihood ratios make no sense. It is like people who try to deny that they assign prior probabilities, and hence end up assigning absurd prior probabilities.
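To illustrate how the direction of an update depends on the competing hypothesis, here is a small sketch. The object counts for each world and the equal priors are hypothetical, chosen only to make the arithmetic visible; the evidence is "a uniformly sampled object is a non-black non-raven".

```python
from fractions import Fraction

def update(priors, likelihoods):
    """Bayes update: multiply prior by likelihood, renormalise to sum 1."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

def p_nonblack_nonraven(black_ravens, nonblack_ravens, nonblack_nonravens):
    """Chance a uniformly sampled object is a non-black non-raven."""
    total = black_ravens + nonblack_ravens + nonblack_nonravens
    return Fraction(nonblack_nonravens, total)

# Hypothetical worlds (illustrative counts, not from the original message).
all_black = p_nonblack_nonraven(7, 0, 1)   # all ravens black
rival_a = p_nonblack_nonraven(14, 1, 1)    # one white raven, few other objects
rival_b = p_nonblack_nonraven(2, 1, 5)     # one white raven, many other objects

half = Fraction(1, 2)
vs_a = update({"all_black": half, "rival": half},
              {"all_black": all_black, "rival": rival_a})
vs_b = update({"all_black": half, "rival": half},
              {"all_black": all_black, "rival": rival_b})
print(vs_a["all_black"], vs_b["all_black"])
```

Against the first rival the same observation raises the probability of "all ravens are black" from 1/2 to 2/3; against the second it lowers it to 1/6. In both cases the mass gained by one hypothesis is exactly the mass lost by the other.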

> If probability theory as standardly deployed states that an observation
> of a non-black non-raven provides a NON-ZERO amount of evidence toward
> the hypothesis that all ravens are black, then this shows there is
> something wrong with probability theory as standardly deployed.

Considering how many different ways standard probability theory has been proven to be the unique method of reasoning that obeys consistency axioms XYZ, it is a simpler hypothesis that you are wrong about the observation providing ZERO evidence.

> However, what frustrates me about the quote you cite, and your attitude,
> is that you seem to be denying that probability theory as standardly
> deployed is conceptually and logically erroneous in this case -- albeit
> the magnitude of its error is generally small.

Yes, Ben, I do deny that standard probability theory is "conceptually and logically erroneous" in this quite straightforward case. At this point you really should read Jaynes, who uses (correctly formulated and applied) probability theory to knock down one alleged "paradox" after another. (Quite often, the paradox has to do with assuming infinities directly in the math, rather than passing to the limit of finite cases.)

It is your own theory which is the approximation that makes small errors. You omit from consideration the small amounts of evidence provided by sampling a random object or a random non-black object and finding that it is not a nonblack raven.

I, too, expect that an efficient AI will omit these small amounts of evidence, because the evidence is too small to be worth the time cost of updating, or the memory cost of representing probabilities to that precision. But the evidence is still there in theory, and becomes highly relevant in practice for small sets of objects. That's why I use the illustration of small sets to overcome the instinctive human tendency to *approximate* the evidence as zero.

You have started thinking that planets really do move in epicycles instead of ellipses. "NO evidence" is the tractable approximation. "A very small amount of evidence" is the mathematically correct answer.
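How small "very small" gets can be sketched numerically. The rival hypothesis ("exactly one raven is non-black"), the population sizes, and the function name are assumptions chosen for illustration; the evidence is one uniformly sampled object that turns out not to be a non-black raven.

```python
import math

def evidence_bits(total_objects: int, nonblack_ravens: int = 1) -> float:
    """Log2 likelihood ratio favouring "all ravens are black" over
    "exactly `nonblack_ravens` ravens are non-black", from one uniformly
    sampled object that turns out not to be a non-black raven."""
    p_given_all_black = 1.0
    p_given_rival = 1.0 - nonblack_ravens / total_objects
    return math.log2(p_given_all_black / p_given_rival)

for n in (8, 1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} objects: {evidence_bits(n):.3e} bits")
```

The log likelihood ratio shrinks roughly in proportion to 1/N as the population grows, but it stays strictly positive for every finite N, which is the sense in which zero is an approximation.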

> Your followup comments are intelligent and well-thought-out, but, they
> don't really solve the problems with the attempted solution given in
> the paragraph you cite above.
>
> I believe the Hempel problem is handled more artfully in Novamente.
>
> In PTL, we can estimate the truth value of
>
> ForAll x { is_raven(x) ==> is_black(x) }
>
> as a transform of the truth value of
>
> P( is_black(x) | is_raven(x) )
>
> [the simplest rule for this is, e.g. s^N estimates the probability
> of the former statement where s is an estimate of the probability
> of the latter statement and N defines the amount of (explicit or
> implicit) evidence used to arrive at s].

Uh... didn't someone, I think Frank Ramsey but possibly Stalnaker, prove that it was *impossible* to transform P(B|A) into P(A -> B), using any connective -> that was true or false in any given possible world? Or am I misunderstanding what you're trying to do here?
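As a narrow illustration only: the sketch below compares P(B|A) with the probability of the material conditional (not A, or B). It says nothing about other connectives or about the general impossibility claim; the function name and the sample numbers are hypothetical.

```python
from fractions import Fraction

def conditional_vs_material(p_a: Fraction, p_b_given_a: Fraction):
    """Return (P(B|A), P(not A or B)) for given P(A) and P(B|A)."""
    # The material conditional fails only in worlds where A holds and B
    # does not, so its probability is 1 - P(A) * (1 - P(B|A)).
    p_material = 1 - p_a * (1 - p_b_given_a)
    return p_b_given_a, p_material

print(conditional_vs_material(Fraction(1, 10), Fraction(1, 2)))
```

With P(A) = 1/10 and P(B|A) = 1/2 the material conditional has probability 19/20, because it is true in every world where A is false. The two quantities agree only when P(A) = 1 or P(B|A) = 1.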

Furthermore, you don't make it clear how Novamente decides p( is_black(x) | is_raven(x) ), which is the whole problem at hand. If I randomly sample from a small group and find a couple of black ravens, and afterward I repeatedly sample non-black objects from the group and none of them are ravens, or even if I randomly sample objects and none of them are nonblack ravens, then even without observing more black ravens, my p(black|raven) should go on increasing.
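A sketch of that claim for a small group. The uniform prior over the number of non-black ravens, sampling with replacement, and the group size (7 ravens among 8 objects) are assumptions for illustration.

```python
from fractions import Fraction

def p_black_given_raven(ravens, total, black_ravens_seen, other_draws):
    """Posterior predictive P(black | raven) for a group of `total`
    objects containing `ravens` ravens.

    Hypothesis k = "exactly k ravens are non-black" (assumed uniform prior).
    Evidence: `black_ravens_seen` ravens sampled at random were black, then
    `other_draws` objects sampled at random were not non-black ravens.
    All sampling is assumed to be with replacement.
    """
    weights = []
    for k in range(ravens + 1):
        black_fraction = Fraction(ravens - k, ravens)
        not_nonblack_raven = 1 - Fraction(k, total)
        weights.append(black_fraction ** black_ravens_seen
                       * not_nonblack_raven ** other_draws)
    norm = sum(weights)
    # Average the black fraction over hypotheses, weighted by posterior mass.
    return sum(w * Fraction(ravens - k, ravens)
               for k, w in enumerate(weights)) / norm

for draws in (0, 5, 20, 50):
    print(draws, float(p_black_given_raven(7, 8, 2, draws)))
```

No additional black raven is observed after the first two, yet the estimate keeps rising with each object that fails to be a non-black raven.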

> The definition of evidence in PTL makes clear that the only evidence
> that counts for
>
> P( is_black(x) | is_raven(x) )
>
> is the set of x for which is_raven(x) has a nonzero truth value,
> and therefore Hempel's paradox does not exist.

It seems that PTL ignores relevant evidence, then.

As for Hempel's paradox not existing, as far as I can tell, you haven't addressed it at all. Exactly the same definition would show that P( is_not_raven(x) | is_not_black(x) ) only uses as evidence the set of x for which is_not_black(x) has nonzero truth value.

PTL's estimate for

ForAll x { is_raven(x) ==> is_black(x) }

ought to be identical to its estimate for

ForAll x { is_not_black(x) ==> is_not_raven(x) }

and if it's not, that's Hempel's Paradox square in the face.
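The equivalence of the two quantified statements can be checked by brute force over small worlds. This sketch uses crisp true/false predicates rather than PTL truth values, which is a simplifying assumption; the helper names are hypothetical.

```python
from itertools import product

def all_ravens_black(world):
    """ForAll x { is_raven(x) ==> is_black(x) }"""
    return all(is_black for is_raven, is_black in world if is_raven)

def all_nonblack_nonraven(world):
    """ForAll x { is_not_black(x) ==> is_not_raven(x) }"""
    return all(not is_raven for is_raven, is_black in world if not is_black)

def frequency(world, given, then):
    """Fraction of objects satisfying `given` that also satisfy `then`."""
    matching = [obj for obj in world if given(obj)]
    return sum(then(obj) for obj in matching) / len(matching) if matching else None

kinds = list(product([False, True], repeat=2))   # (is_raven, is_black)
worlds = list(product(kinds, repeat=3))          # every 3-object world
assert all(all_ravens_black(w) == all_nonblack_nonraven(w) for w in worlds)

# Two black ravens and one non-black raven: the universal statement is
# false either way, but the two conditional frequencies are different numbers.
world = ((True, True), (True, True), (True, False))
print(frequency(world, lambda o: o[0], lambda o: o[1]))          # P(black | raven)
print(frequency(world, lambda o: not o[1], lambda o: not o[0]))  # P(not raven | not black)
```

The two universal statements agree in every enumerated world, while the conditional frequencies they would be estimated from draw on different sets of objects and need not match.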

>> I shall now demonstrate the folly of adulterating Bayes with lesser wares.
>>
>> Suppose that I know that, in a certain sample, there is at least one
>> black raven, and at least one blue teapot, and some number of other
>> ravens of unknown color. I now observe an item from the group that is
>> produced by the following sampling method: Someone looks over the
>> group, and if there are no non-black ravens, he tosses out a blue
>> teapot. If there are non-black ravens, he tosses out a black raven.
>> Now observing a black raven definitely shows that not all ravens
>> are black.
>>
>> How would Novamente's "augmented" probability theory handle that case, I
>> wonder?
>
> Given the constraints you've introduced, the only way Novamente has to
> handle this problem is to use "higher-order inference", which means
> to explicitly represent the definition of the problem in terms of
> variables and quantifiers, in a manner similar to predicate logic.

Standard Bayes doesn't need to resort to higher-order logic to solve this problem. It just says what our expectations are, given various hypotheses, same as in any other case.
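For the teapot example, the expectations under each hypothesis can be written down directly. In this sketch the two-hypothesis split, the prior of 9/10, and the function name are assumptions for illustration; the prior must lie strictly between 0 and 1.

```python
from fractions import Fraction

def posterior_all_black(prior_all_black: Fraction, tossed_item: str) -> Fraction:
    """Posterior that all ravens are black, given which item was tossed out.

    Sampling rule from the quoted example: a blue teapot is tossed out if
    there are no non-black ravens, otherwise a black raven is tossed out.
    """
    likelihood = {
        # P(tossed item | hypothesis)
        "all_black": {"blue teapot": Fraction(1), "black raven": Fraction(0)},
        "some_nonblack": {"blue teapot": Fraction(0), "black raven": Fraction(1)},
    }
    joint_all = prior_all_black * likelihood["all_black"][tossed_item]
    joint_some = (1 - prior_all_black) * likelihood["some_nonblack"][tossed_item]
    return joint_all / (joint_all + joint_some)

print(posterior_all_black(Fraction(9, 10), "black raven"))
print(posterior_all_black(Fraction(9, 10), "blue teapot"))
```

Seeing the black raven drives the posterior for "all ravens are black" to 0, and seeing the teapot drives it to 1, with nothing beyond an ordinary likelihood table.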

> The difference is that, unlike standard predicate logic, Novamente has
> formulas for managing uncertain truth values attached to quantified
> logical formulae.
>
> I could write out the details of this example in Novamente formalism,
> and may do so later as it's a moderately amusing exercise, but I don't
> have time at the moment.

Sounds overcomplicated... not good for efficiency.

-- Eliezer S. Yudkowsky http://intelligence.org/ Research Fellow, Singularity Institute for Artificial Intelligence


*This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:52 MDT*