From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Thu Sep 15 2005 - 14:02:10 MDT
Ben Goertzel wrote:
> Hey Eli,
>>> http://www.goertzel.org/new_essays/hempel.htm 
>> I find your lack of faith disturbing.
> 
> Well, my wife and I found the flaw in a brief conversation before we went to
> sleep last night but I was really tired and I didn't feel like dealing with
> the computer anymore ;)
> 
> As she pointed out, the situation with states 2 and 4 in my little story
> about the midget is a lot like the situation where you have two coins, one
> with two heads and one regular coin with a head and a tail.  If you are told
> that one has been tossed and the result was a head, then the odds are 2:1
> that the coin in question was the double-headed coin.  [The heads are
> W's, the tail is a B]
> 
> Duuuuhhh...
That was one mistake, yes.
Marcello caught the first one, but his algebra may have been hard for 
some to follow, so I now explain it intuitively.
1)
A mathematician tells you that he has two children.  You ask, "Is one of 
your children a boy?" and the mathematician answers "Yes."  What is the 
probability that the other child is a girl?  Two-thirds.  Why?  Because 
the possible birth orders are BB, BG, GB, and GG, each equally probable. 
For the first three the mathematician certainly answers "Yes" to your 
question, and for the fourth the mathematician certainly answers "No". 
So since the likelihoods are equal for the first three possibilities, 
the mathematician's answer doesn't change your prior belief that the 
mathematician is twice as likely to have a girl and a boy as to have two 
boys.
But suppose that you randomly sample a child, and the child is a boy. 
It is now equally likely that the other child is a boy, or that the 
other child is a girl, because BB is twice as likely to produce a boy in 
a random sample as BG or GB.
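
The two cases can be checked by brute enumeration. What follows is only a sketch: the function names (p_other_is_girl, says_yes, sampled_boy) are mine, and it assumes the four birth orders are equally probable, as stated above.

```python
from fractions import Fraction
from itertools import product

# All four birth orders, each with prior 1/4.
families = list(product("BG", repeat=2))
prior = Fraction(1, 4)

def p_other_is_girl(likelihood):
    """Posterior probability of a mixed family (one boy, one girl),
    given an observation with the supplied per-family likelihood."""
    weights = {f: prior * likelihood(f) for f in families}
    total = sum(weights.values())
    mixed = sum(w for f, w in weights.items() if "G" in f)
    return mixed / total

# Case 1: the mathematician answers "Yes" to "Is one of your children a boy?"
def says_yes(family):
    return Fraction(1 if "B" in family else 0)

# Case 2: one child is sampled uniformly at random and turns out to be a boy.
def sampled_boy(family):
    return Fraction(family.count("B"), 2)

print(p_other_is_girl(says_yes))     # 2/3
print(p_other_is_girl(sampled_boy))  # 1/2
```

The only difference between the two cases is the likelihood function: "Yes" is equally likely under BB, BG, and GB, while a randomly sampled boy is twice as likely under BB.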
Still another way of looking at it is that you have incorrectly 
decomposed your uncertainty into atomic possible worlds.  Atomic 
possible worlds must slice reality as finely as possible, so an atomic 
possible world includes, as background state, the values of 'random' 
variables.  "One white raven and one white nonraven" is not an atomic 
possible world, but a set of possible worlds, because it fails to 
specify the value of an important random variable.  An *atomic* possible 
world, relative to your problem space, is "One white raven and one white 
nonraven exist, and 'random' sampling will produce a white raven".  By 
the definition of random sampling, this possibility contains half the 
probability mass within the set of possible worlds "One white raven and 
one white nonraven."  This atomic possible world is ruled out if random 
sampling of a white object produces a nonraven, and the probability mass 
collectively within the set "One white raven and one white nonraven" 
decreases accordingly.
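
As a rough illustration of the atomic-worlds view, here is a sketch with two coarse sets of worlds. The second set ("two white nonravens") and the 50/50 prior are assumptions of mine, chosen only to make the arithmetic visible.

```python
from fractions import Fraction

# Hypothetical prior over two coarse sets of worlds (numbers are assumptions).
coarse_prior = {
    "one white raven + one white nonraven": Fraction(1, 2),
    "two white nonravens": Fraction(1, 2),
}
# Chance that a randomly sampled white object is a raven in each coarse set.
p_raven_sampled = {
    "one white raven + one white nonraven": Fraction(1, 2),
    "two white nonravens": Fraction(0),
}

# Split each coarse set into atomic worlds that also fix the sampling outcome.
atomic = {}
for world, mass in coarse_prior.items():
    atomic[(world, "raven")] = mass * p_raven_sampled[world]
    atomic[(world, "nonraven")] = mass * (1 - p_raven_sampled[world])

# Observing a nonraven rules out every atomic world that predicted a raven.
survivors = {k: v for k, v in atomic.items() if k[1] == "nonraven"}
total = sum(survivors.values())
posterior = {k[0]: v / total for k, v in survivors.items()}

print(posterior["one white raven + one white nonraven"])  # 1/3
```

Half the mass of the mixed set sat in an atomic world that the observation eliminated, so after renormalization that set drops from 1/2 to 1/3.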
2)
> "If the midget gives you a nonblack entity when asked, that means the bag must be in states 1, 2, 3 or 4."
The bag must be in states 2, 3, 4, 6, or 7.
3)
> "So if the ratio is unchanged via the observation of a white nonraven, i.e. if
> 
> P_prior(black|raven) / P_prior(white|raven) =
> 
> P(black|raven & I have chosen a random white entity and found it to be a nonraven) /
> 
> P(white|raven & I have chosen a random white entity and found it to be a nonraven)
> 
> (as we see from the fact that b_2/b_4 = a_2/a_4)"
??  This doesn't follow at all.  Previously, you would have to calculate 
your prior probability that a randomly sampled raven would be black or 
white by taking into account the probability mass in every one of your 
(sets of) possible worlds 1 through 7.  You cannot calculate your prior 
probability using worlds 2 and 4 alone.
4)
> "So there is no Hempel paradox in this case.  Here we are able to observe evidence increasing our estimate of
> 
> P(non-raven|non-black)
> 
> without affecting our estimate of
> 
> P(black|raven)"
Hempel's Paradox arises because the statements "All ravens are black" 
and "All non-black objects are not ravens" are logically equivalent, 
under the standard mathematical interpretation where "All ravens are 
black" is vacuously true if no ravens exist.
However, it is not the case that p(black|raven) is always locked to 
p(~raven|~black).  Although you failed to construct such an example, it 
can be done.  For example, suppose there are ten objects: a black raven, 
a white raven, and eight nonblack nonravens.  Here p(black|raven) = 1/2 
and p(~raven|~black) = 8/9.  Now suppose a different set of ten objects: 
two black ravens, two white ravens, and six nonblack nonravens. 
p(black|raven) = 1/2, but p(~raven|~black) = 6/8 = 3/4.  Suppose I am 
not sure which of these two (sets of) possible worlds I am in, and I 
randomly sample a nonblack object and it is a nonraven.  This is 
evidence that I occupy the first world, so (after renormalization) 
probability mass shifts from the second (set of) possible worlds to the 
first.
So sampling a random nonblack object that turns out to be a nonraven 
increases p(~raven|~black) but leaves p(black|raven) constant, because, 
*unlike* the statements "All ravens are black" and "All non-black 
objects are not ravens", the two conditional probabilities are not 
logically equivalent, nor do they always change in lockstep.
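
The ten-object example can be worked through numerically. This is a sketch under one assumption not given above: an equal prior on the two worlds. The helper names are mine, and the expected value of p(~raven|~black) is taken as the prior- or posterior-weighted average over the two worlds.

```python
from fractions import Fraction

# Counts from the example: (black ravens, white ravens, nonblack nonravens).
worlds = {"first": (1, 1, 8), "second": (2, 2, 6)}
# Assumption: equal prior on the two worlds.
prior = {"first": Fraction(1, 2), "second": Fraction(1, 2)}

def p_black_given_raven(w):
    black_ravens, white_ravens, _ = worlds[w]
    return Fraction(black_ravens, black_ravens + white_ravens)

def p_nonraven_given_nonblack(w):
    _, white_ravens, nonblack_nonravens = worlds[w]
    return Fraction(nonblack_nonravens, white_ravens + nonblack_nonravens)

def expect(f, dist):
    """Average a per-world quantity over a distribution on worlds."""
    return sum(dist[w] * f(w) for w in worlds)

# Bayes update on "a randomly sampled nonblack object is a nonraven".
joint = {w: prior[w] * p_nonraven_given_nonblack(w) for w in worlds}
evidence = sum(joint.values())
posterior = {w: joint[w] / evidence for w in worlds}

print(posterior["first"])  # 32/59
print(expect(p_nonraven_given_nonblack, prior) < expect(p_nonraven_given_nonblack, posterior))  # True
print(expect(p_black_given_raven, prior) == expect(p_black_given_raven, posterior))  # True
```

Mass moves toward the first world (likelihood 8/9 versus 3/4), which raises p(~raven|~black) while p(black|raven) stays at 1/2 in both worlds.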
At the start of your problem, you left the bounds of Hempel's 
confirmation paradox entirely.
5)
> "To see why we have increased our estimate of P(non-raven|non-black), we need to look at the three other possible states of the universe, ignored above:"
?? Why did you ignore them?
6)
> "P(black|raven)
> 
> is unchanged via the process of observing a random white entity and finding it to be a raven."
I think you mean "non-raven", but anyway...
In several of your possible cases, there are no nonblack objects.  In 
these cases the statement "All non-black objects are not ravens" is 
vacuously true.  But the conditional probability p(~raven|~black) is 
undefined in standard Bayes, involving a literal division by zero - 
after which anything can happen, as in the classic proof that 1=2.  So 
if you are trying to see what happens to p(~raven|~black), you had 
better specify that at least one nonblack object exists.
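
A small sketch of the distinction, with a hypothetical helper of my own naming: the universal statement is vacuously true in a world with no nonblack objects, but the conditional probability has no value there, so the code returns None and does not pretend the answer is 1.

```python
from fractions import Fraction

def p_nonraven_given_nonblack(nonblack_ravens, nonblack_nonravens):
    """Return p(~raven|~black) for one world, or None when it is undefined."""
    nonblack = nonblack_ravens + nonblack_nonravens
    if nonblack == 0:
        return None  # no nonblack object exists: 0/0, not 1
    return Fraction(nonblack_nonravens, nonblack)

def all_nonblack_are_nonravens(nonblack_ravens):
    """The logical statement: true, possibly vacuously, iff no nonblack raven exists."""
    return nonblack_ravens == 0

print(all_nonblack_are_nonravens(0), p_nonraven_given_nonblack(0, 0))  # True None
print(all_nonblack_are_nonravens(0), p_nonraven_given_nonblack(0, 3))  # True 1
```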
Actually, I also need to specify that at least one nonblack object is 
known to exist in every possible world; along with the requirement that, 
in at least one possible world containing nonblack ravens, the ratio of 
these nonblack ravens to all other nonblack objects does not approach 
zero; and the requirement that the proposition "All ravens are black" 
not initially have prior probability equal to zero; in order for my 
general conclusion to hold that randomly sampling a nonblack object and 
finding it to be a nonraven ALWAYS increases the probability assigned to 
the proposition "All ravens are black."  (Assuming I haven't missed any 
other necessary assumptions.)
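
One way to see the general conclusion is a sketch over hypothetical worlds that satisfy the assumptions just listed (the three worlds and their priors are invented for illustration). In every world where all ravens are black the likelihood of the observation is exactly 1; in at least one other world it is below 1; so the evidence term is below 1 and the posterior of "All ravens are black" must rise.

```python
from fractions import Fraction

# Hypothetical worlds: (prior, nonblack ravens, nonblack nonravens).
# Every world has at least one nonblack object, as required above.
worlds = [
    (Fraction(1, 2), 0, 5),   # all ravens are black
    (Fraction(3, 10), 1, 4),
    (Fraction(1, 5), 2, 3),
]

def update_on_nonblack_nonraven(worlds):
    """Return (prior, posterior) probability of "All ravens are black"."""
    prior_h = Fraction(0)
    joint_h = Fraction(0)
    evidence = Fraction(0)
    for prior, nb_ravens, nb_nonravens in worlds:
        nonblack = nb_ravens + nb_nonravens
        if nonblack == 0:
            raise ValueError("every world must contain a nonblack object")
        likelihood = Fraction(nb_nonravens, nonblack)
        evidence += prior * likelihood
        if nb_ravens == 0:
            prior_h += prior
            joint_h += prior * likelihood  # likelihood is exactly 1 here
    return prior_h, joint_h / evidence

prior_h, posterior_h = update_on_nonblack_nonraven(worlds)
print(prior_h, posterior_h)  # 1/2 25/43
```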
If we allow for some possible worlds to contain no nonblack objects, 
then the procedure of randomly sampling a nonblack object has a third 
possibility besides "Raven" and "Non-Raven" which is "Empty".  We then 
have to take into account the likelihoods assigned to this third 
possibility, which changes everything.  For example, states 1 and 5 
assign probability 1 to the result "Empty".  If we assigned most of our 
prior probability mass to state 1 or state 5, then the sample coming up 
with a nonblack object at all, instead of "Empty", could drastically 
decrease the total probability assigned to p(black|raven).
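
To make the "Empty" outcome concrete, here is a sketch with two hypothetical worlds. These are not Ben's seven bag states, and the 9/10 versus 1/10 prior is my assumption. Most of the prior mass sits on a world with no nonblack objects, so merely getting a nonblack object back wipes that world out.

```python
from fractions import Fraction

# Hypothetical two-world example (NOT the seven bag states).
# Each world: (prior, nonblack ravens, nonblack nonravens).
worlds = {
    "only black ravens": (Fraction(9, 10), 0, 0),
    "one white raven, one white nonraven": (Fraction(1, 10), 1, 1),
}

def likelihoods(nonblack_ravens, nonblack_nonravens):
    """Probability of each outcome when asking for a random nonblack object."""
    nonblack = nonblack_ravens + nonblack_nonravens
    if nonblack == 0:
        return {"raven": Fraction(0), "nonraven": Fraction(0), "empty": Fraction(1)}
    return {
        "raven": Fraction(nonblack_ravens, nonblack),
        "nonraven": Fraction(nonblack_nonravens, nonblack),
        "empty": Fraction(0),
    }

def p_all_black(dist):
    # "All ravens are black" holds in worlds with zero nonblack ravens.
    return sum(p for w, p in dist.items() if worlds[w][1] == 0)

prior = {w: spec[0] for w, spec in worlds.items()}
joint = {w: prior[w] * likelihoods(*worlds[w][1:])["nonraven"] for w in worlds}
evidence = sum(joint.values())
posterior = {w: j / evidence for w, j in joint.items()}

print(p_all_black(prior))      # 9/10
print(p_all_black(posterior))  # 0
```

With this prior, observing a nonblack nonraven takes "All ravens are black" from 9/10 to 0, the opposite of the direction seen when every world is known to contain a nonblack object.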
7)
> "There are seven distinguishable states for the interior of the bag, each of which one may assign a certain prior probability."
But which prior probability?
As seen above, the prior probabilities assigned to states 1-7 may 
drastically change the effect of sampling a nonblack object and finding 
it to be a nonraven.
One of the major benefits of training in probability theory a la Jaynes 
is that you learn to stop sweeping critically necessary assumptions 
under the carpet of "no information".  If you have no information, sir, 
do please tell us exactly what information you do not have.
> I wouldn't ever say something like "probability theory is wrong" -- it's a
> branch of math and is correct assuming its axiom systems, just like any
> other branch of math....  It's even a *very useful* branch of math which is
> why Novamente is substantially based upon it....
Ben Goertzel wrote on September 9th, 2005:
> If probability theory as standardly deployed states that an observation
> of a non-black non-raven provides a NON-ZERO amount of evidence toward
> the hypothesis that all ravens are black, then this shows there is
> something wrong with probability theory as standardly deployed.
> 
> Of course, an approach that yields small errors may still be valuable
> for practical AI purposes.
> 
> However, what frustrates me about the quote you cite, and your attitude,
> is that you seem to be denying that probability theory as standardly
> deployed is conceptually and logically erroneous in this case -- albeit
> the magnitude of its error is generally small.
I suppose the "as standardly deployed" leaves you an out.  So, if you 
like, I amend my request:  Ben, stop dissing Bayesian probability theory 
"as standardly deployed".
-- 
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence