Hempel's Paradox [ was RE: The Relevance of Complex Systems [was: Re: Retrenchment]]

From: Ben Goertzel (ben@goertzel.org)
Date: Fri Sep 09 2005 - 10:17:00 MDT


Eli,

> Googling on Hempel + Bayes turns up some standard responses, for example
> in http://plato.stanford.edu/entries/epistemology-bayesian/
>
> "Hempel first pointed out that we typically expect the hypothesis that
> all ravens are black to be confirmed to some degree by the observation
> of a black raven, but not by the observation of a non-black, non-raven.
> Let H be the hypothesis that all ravens are black. Let E1 describe the
> observation of a non-black, non-raven. Let E2 describe the observation
> of a black raven. Bayesian Confirmation Theory actually holds that both
> E1 and E2 may provide some confirmation for H. Recall that E1 supports H
> just in case Pi(E1/H)/Pi(E1) > 1. It is plausible to think that this
> ratio is ever so slightly greater than one. On the other hand, E2 would
> seem to provide much greater confirmation to H, because, in this
> example, it would be expected that Pi(E2/H)/Pi(E2) >> Pi(E1/H)/Pi(E1)."
>
> A fine answer so far as it goes, though really it is only half the
> solution.

IMO this is *not* a fine answer, it's a dodge of the issue.

The problem is that, logically, an observation of a non-black non-raven
should provide NO evidence toward the hypothesis that all ravens are
black.

If probability theory as standardly deployed states that an observation
of a non-black non-raven provides a NON-ZERO amount of evidence toward
the hypothesis that all ravens are black, then this shows there is
something wrong with probability theory as standardly deployed.
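
To make the ratios in the quoted passage concrete, here is a toy
calculation in Python. Everything in it is an illustrative assumption
rather than part of the quoted text: a world with 10 ravens and 900
non-black non-ravens, a single alternative hypothesis under which 5 of
the ravens are non-black, a prior of 1/2 on H, E1 modeled as "an object
drawn from the non-black things turns out to be a non-raven", and E2
modeled as "an object drawn from the ravens turns out to be black".
Other sampling assumptions give different ratios.

```python
from fractions import Fraction

# Illustrative assumptions only; none of these numbers come from the thread.
RAVENS = 10
NON_BLACK_NON_RAVENS = 900
NON_BLACK_RAVENS_UNDER_ALT = 5   # alternative to H: half the ravens are non-black
PRIOR_H = Fraction(1, 2)


def confirmation_ratio(p_e_given_h, p_e_given_alt, prior_h=PRIOR_H):
    """Return P(E|H) / P(E) for a two-hypothesis world (H versus one alternative)."""
    p_e = prior_h * p_e_given_h + (1 - prior_h) * p_e_given_alt
    return p_e_given_h / p_e


# E1: an object drawn from the non-black things is a non-raven.
p_e1_h = Fraction(1)  # under H there are no non-black ravens
p_e1_alt = Fraction(NON_BLACK_NON_RAVENS,
                    NON_BLACK_NON_RAVENS + NON_BLACK_RAVENS_UNDER_ALT)

# E2: an object drawn from the ravens is black.
p_e2_h = Fraction(1)
p_e2_alt = Fraction(RAVENS - NON_BLACK_RAVENS_UNDER_ALT, RAVENS)

ratio_e1 = confirmation_ratio(p_e1_h, p_e1_alt)
ratio_e2 = confirmation_ratio(p_e2_h, p_e2_alt)

print("E1:", ratio_e1, float(ratio_e1))
print("E2:", ratio_e2, float(ratio_e2))
```

Under these assumptions the E1 ratio is 362/361 (about 1.003) and the
E2 ratio is 4/3: small for E1, but not zero.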

Of course, an approach that yields small errors may still be valuable
for practical AI purposes.

However, what frustrates me about the quote you cite, and your attitude,
is that you seem to be denying that probability theory as standardly
deployed is conceptually and logically erroneous in this case -- albeit
the magnitude of its error is generally small.

Your followup comments are intelligent and well thought out, but they
don't really solve the problems with the attempted solution given in
the paragraph you cite above.

I believe the Hempel problem is handled more artfully in Novamente.

In PTL, we can estimate the truth value of

ForAll x { is_raven(x) ==> is_black(x) }

as a transform of the truth value of

P( is_black(x) | is_raven(x) )

[The simplest rule for this: s^N estimates the probability of the
former statement, where s is an estimate of the probability of the
latter statement and N is the amount of (explicit or implicit)
evidence used to arrive at s.]
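
As a minimal sketch of that simplest rule only (the function name
universal_estimate and the input checks are hypothetical, and this is
not meant to represent the actual PTL formulas), in Python:

```python
def universal_estimate(s: float, n: float) -> float:
    """Estimate the probability of the ForAll statement as s ** n.

    s: estimated probability of P( is_black(x) | is_raven(x) )
    n: amount of evidence used to arrive at s
    Hypothetical helper for illustration; only the s^N form is from the text.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a probability in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return s ** n


# With s = 0.99 backed by 10 observations, the estimate is roughly 0.90.
print(universal_estimate(0.99, 10))
```

Note how quickly the estimate for the universal statement falls as N
grows unless s is exactly 1.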

The definition of evidence in PTL makes clear that the only evidence
that counts for

P( is_black(x) | is_raven(x) )

is the set of x for which is_raven(x) has a nonzero truth value,
and therefore Hempel's paradox does not exist.

The key point is that PTL explicitly defines the concept of evidence
and keeps track of the evidence in favor of each statement. Evidence
is defined as something separate from probability, though related
to probability. Basically, the evidence in favor of an assertion is
the "number of observations" made to estimate the probability of
the assertion.
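
A rough Python sketch of this bookkeeping, under simplifying
assumptions that are not from PTL itself: observations are crisp
(is_raven, is_black) pairs rather than fuzzy truth values, the
strength is a plain relative frequency, and the names TruthValue and
conditional_truth_value are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class TruthValue:
    strength: Optional[float]  # estimated probability; None when there is no evidence
    count: int                 # number of observations that counted as evidence


def conditional_truth_value(
    observations: Iterable[Tuple[bool, bool]]
) -> TruthValue:
    """Estimate P( is_black(x) | is_raven(x) ) from (is_raven, is_black) pairs.

    Only observations with is_raven true count as evidence; everything
    else is skipped and changes neither the strength nor the count.
    """
    relevant = [is_black for is_raven, is_black in observations if is_raven]
    n = len(relevant)
    if n == 0:
        return TruthValue(strength=None, count=0)
    return TruthValue(strength=sum(relevant) / n, count=n)


ravens_only = [(True, True)] * 3
with_teapots = ravens_only + [(False, False)] * 100  # 100 non-black non-ravens

print(conditional_truth_value(ravens_only))
print(conditional_truth_value(with_teapots))
```

Both calls give the same strength and the same evidence count: in this
sketch, the hundred non-black non-ravens leave the truth value untouched.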

I realize these comments are only evocative rather than convincing,
which is pretty much inevitable given the constraints of expression
in a brief and semi-technical email.

Pei Wang and I have recently written and submitted for publication
a paper arguing that, to be adequate, an uncertain logic system must
use at least two numbers to quantify truth value. Examples of
uncertain logic systems using two numbers to quantify truth values
are:

* Novamente's PTL framework

* Pei Wang's NARS framework (which I have some serious issues with)

* Walley's theory of interval probabilities (which I haven't explored
fully, though it has some nice algebraic similarities to PTL
and NARS)

I continue to believe that a purely probabilistic approach is not
adequate, but that if one augments probability theory by considering
truth values with more than one component (e.g. a "weight of evidence"
as well as a probability), then things work out more adequately
(though one still needs to introduce a bunch of heuristic approximations
to tractably handle real-world inferencing).

> I shall now demonstrate the folly of adulterating Bayes with lesser wares.
>
> Suppose that I know that, in a certain sample, there is at least one
> black raven, and at least one blue teapot, and some number of other
> ravens of unknown color. I now observe an item from the group that is
> produced by the following sampling method: Someone looks over the
> group, and if there are no non-black ravens, he tosses out a blue
> teapot. If there are non-black ravens, he tosses out a black raven.
> Now observing a black raven definitely shows that not all ravens
> are black.
>
> How would Novamente's "augmented" probability theory handle that case, I
> wonder?
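
For reference, the sampling rule in the quoted scenario can be spelled
out as a small Python sketch using ordinary Bayesian updating. The
function names, the single alternative hypothesis with one non-black
raven, and the example prior of 9/10 are all illustrative assumptions.

```python
from fractions import Fraction


def tossed_item(non_black_ravens: int) -> str:
    """The quoted rule: a blue teapot if no raven is non-black, else a black raven."""
    return "blue teapot" if non_black_ravens == 0 else "black raven"


def posterior_all_black(prior: Fraction, observed: str,
                        alt_non_black_ravens: int = 1) -> Fraction:
    """Posterior of H = 'all ravens are black' after seeing the tossed item.

    The rule is deterministic, so each likelihood is 0 or 1.
    alt_non_black_ravens is a hypothetical stand-in for 'not H'.
    """
    like_h = 1 if tossed_item(0) == observed else 0
    like_alt = 1 if tossed_item(alt_non_black_ravens) == observed else 0
    evidence = prior * like_h + (1 - prior) * like_alt
    if evidence == 0:
        raise ValueError("observation is impossible under both hypotheses")
    return prior * like_h / evidence


prior = Fraction(9, 10)
print(posterior_all_black(prior, "black raven"))   # H is ruled out
print(posterior_all_black(prior, "blue teapot"))   # H is certain
```

Here the direction of the evidence comes entirely from the sampling
procedure, not from the color or kind of the observed item.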

Given the constraints you've introduced, the only way Novamente has to
handle this problem is to use "higher-order inference", which means
to explicitly represent the definition of the problem in terms of
variables and quantifiers, in a manner similar to predicate logic.

The difference is that, unlike standard predicate logic, Novamente has
formulas for managing uncertain truth values attached to quantified
logical formulae.

I could write out the details of this example in Novamente formalism,
and may do so later as it's a moderately amusing exercise, but I don't
have time at the moment.

-- Ben G.


