From: Eliezer S. Yudkowsky (firstname.lastname@example.org)
Date: Tue Sep 13 2005 - 19:28:42 MDT
Ben Goertzel wrote:
> -- As it turns out standard probabilistic semantics and PTL say
> basically the same thing about Hempel's paradox, though they
> express it in different ways. This is interesting to me,
> though in the big picture not surprising. PTL was created in
> order to formulate probabilistic ideas in a way convenient for
> AGI, not to contradict probability theory.
The probability theory that is currently conventional, sometimes called
"Bayesian", got that way for a reason. There were some very fundamental
proofs that you *had to* do it this way in order to be consistent.
If Novamente has betting odds where it will buy gambles priced under
those odds and sell gambles priced above those odds, and those odds
disobey the laws of conventional probability theory in their relations
to one another, a Dutch book may be made against Novamente - I can arrange
a set of bets which Novamente will individually accept and which lead to
a guaranteed loss regardless of the real state of the world. If PTL's
two numbers lead it to bracket each probability between a lower and an
upper bound, as in Dempster-Shafer theory, so that Novamente will buy
gambles priced under the lower number and sell gambles priced above the
higher number, then a Dutch book cannot be made against Novamente so long
as every bracket
contains the Bayesian value. Perhaps it is faster to compute
probabilities in this way. (Perhaps not.) However, I can still arrange
a set of gambles which Novamente will refuse and which would be a
deductively guaranteed gain, unless all lower and upper brackets
coincide with the exact Bayesian price. That's the cost of saving on
computing power, even if your approximation doesn't give absurd answers.
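
A minimal sketch of both halves of that argument, in Python. Nothing here
is Novamente's or PTL's actual machinery; the function names, the ticket
convention (a ticket pays 1 if its event occurs, 0 otherwise), and all the
prices are assumptions made up for illustration. The first part shows an
agent whose prices for A and not-A sum to more than 1 losing in every
world; the second shows a bracketing agent turning down two offers that
together would have been a sure gain.

```python
from fractions import Fraction as F

def agent_net_payoffs(price_a, price_not_a):
    """Agent's net payoff in each world after trading both tickets.

    Assumption: each ticket pays 1 if its event occurs and 0 otherwise, and
    the agent trades at (arbitrarily close to) its own stated price.
    """
    total = price_a + price_not_a
    if total > 1:
        side = 1    # bookie sells the agent both tickets
    elif total < 1:
        side = -1   # bookie buys both tickets from the agent
    else:
        side = 0    # coherent prices: no sure-loss trade exists here
    # Exactly one of the two tickets pays 1, whichever world is real.
    return {world: side * (1 - total) for world in ("A", "not A")}

def refuses_to_buy(offer_price, lower_bound):
    """A bracketing agent only buys gambles priced under its lower bound."""
    return offer_price >= lower_bound

# Incoherent point prices: P(A) + P(not A) = 11/10.
print(agent_net_payoffs(F(6, 10), F(5, 10)))

# Hypothetical brackets [2/5, 3/5] around a Bayesian value of 1/2 for both
# events. The two offers together cost 93/100 and pay exactly 1.
lower = F(2, 5)
offers = {"A": F(9, 20), "not A": F(12, 25)}
refused = all(refuses_to_buy(price, lower) for price in offers.values())
foregone_gain = 1 - sum(offers.values())
print(refused, foregone_gain)
```

An agent pricing both events at exactly 1/2 would have bought both
tickets, since each is offered below 1/2. The bracketing agent is never
Dutch-booked in this example, but it leaves the difference on the table.
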
Then there are all the theorems which prove that reasoning consistent in
various ways - such as the final probability (betting odds) being the
same, regardless of the order in which the evidence is taken into
account - must be Bayesian. Et cetera. Et cetera.
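
The order-independence property is easy to check directly for Bayes's
rule. The sketch below assumes a binary hypothesis and pieces of evidence
that are conditionally independent given the hypothesis; the prior and the
likelihoods are invented numbers, and the function names are mine. It
updates on the same three pieces of evidence in every possible order and
collects the distinct posteriors that result.

```python
from fractions import Fraction as F
from itertools import permutations

def bayes_update(prior, lik_h, lik_not_h):
    """Posterior P(H | e) from P(H), P(e | H) and P(e | not H)."""
    numerator = prior * lik_h
    return numerator / (numerator + (1 - prior) * lik_not_h)

def posterior_after(prior, evidence):
    """Fold a sequence of (P(e | H), P(e | not H)) pairs into the prior."""
    p = prior
    for lik_h, lik_not_h in evidence:
        p = bayes_update(p, lik_h, lik_not_h)
    return p

# Made-up prior and likelihood pairs, assumed conditionally independent.
prior = F(1, 4)
evidence = [(F(9, 10), F(3, 10)), (F(2, 10), F(6, 10)), (F(7, 10), F(7, 20))]

# Exact rational arithmetic, so equal posteriors compare equal.
posteriors = {posterior_after(prior, order) for order in permutations(evidence)}
print(posteriors)
```

All six orderings land on the same number. An update rule that gave
different answers for different orderings would, by the theorems mentioned
above, be inconsistent in at least that sense.
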
If you haven't read the particular work of Jaynes, "Probability Theory:
The Logic of Science", read it, especially chapters 2 and 15. Or maybe
Jaynes isn't the remedy that's right for you. But what you *needed* to
do was sit bolt upright and say "Wait, something's wrong with my math!"
as soon as you thought PTL gave an answer substantially different from
the Bayesian one. This would have saved some time, even if you didn't
know whether you'd messed up the Bayesian math, the PTL math, or both.
I recently saved Michael Wilson some time by looking at a complex
formula he'd derived, which he'd sent back to me to see if there was a
more efficient way to do it. I looked at the formula and said, "This
has to be doing *something* wrong," because Bayesian probability theory
said he couldn't compute the end result from his premises. It turned
out he was, literally, dividing by zero.
> -- My issue with probability theory as a foundation for AI has
> to do with its inability to deal with some issues in a
> computationally efficient way, without the addition of a
> significant amount of additional concepts. The main
> examples I have in mind here are attention allocation in
> a complex cognitive system, assignment of credit, concept
> creation, and the learning of schemata for the control
> of inference trajectories. These things certainly can be done
> in a way that's *consistent* with probability theory, but they
> seem to require the addition of a lot of structures and
> dynamics that are not suggested by probability theory.
The problem is that you're regarding Bayesian probability theory as a
computational library, a tool for solving problems. It's not.
Probability theory is the underlying mathematics of the thing you're
trying to do. It's how you recognize intuitively when your
approximation delivers an absurd result. If you can compute things out
by brute force probability theory, fine; if not, then approximate using
a different algorithm - but don't lose sight of the privileged status of
probability theory. Probability theory is not a computer program.
Probability theory is not a fast algorithm for AI. It's your sanity
check. It's your casting-out-nines. It's how to spot an absurd end
result, and go back and check your approximation line by line against
the equivalent Bayesian derivation, until you find the line where you
divided by zero. Probability theory tells you the price (in foregone
gains) of saving on computing power, and says what your approximation
has to do to avoid guaranteed losses.
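
As a sketch of what using probability theory as the sanity check might
look like: take a case small enough to compute by brute force, run the
cheap approximation on it, and compare. The joint table below is invented,
and the "approximation" is an ordinary naive-Bayes shortcut that treats two
pieces of evidence as conditionally independent when they are not; it is a
stand-in for whatever fast algorithm is under test, not a model of PTL.

```python
from fractions import Fraction as F

# Hypothetical joint distribution over (h, e1, e2), each 0 or 1.
# e1 and e2 are deliberately correlated given h.
JOINT = {
    (1, 1, 1): F(6, 20), (1, 1, 0): F(1, 20),
    (1, 0, 1): F(1, 20), (1, 0, 0): F(2, 20),
    (0, 1, 1): F(2, 20), (0, 1, 0): F(3, 20),
    (0, 0, 1): F(3, 20), (0, 0, 0): F(2, 20),
}

def prob(predicate):
    """Total probability of all worlds satisfying the predicate."""
    return sum(p for world, p in JOINT.items() if predicate(world))

def exact_posterior(e1, e2):
    """P(h=1 | e1, e2) by brute-force enumeration of the joint table."""
    p_evidence = prob(lambda w: w[1] == e1 and w[2] == e2)
    return prob(lambda w: w[0] == 1 and w[1] == e1 and w[2] == e2) / p_evidence

def naive_posterior(e1, e2):
    """Shortcut that multiplies P(e1 | h) and P(e2 | h) as if independent."""
    score = {}
    for h in (0, 1):
        p_h = prob(lambda w: w[0] == h)
        p_e1 = prob(lambda w: w[0] == h and w[1] == e1) / p_h
        p_e2 = prob(lambda w: w[0] == h and w[2] == e2) / p_h
        score[h] = p_h * p_e1 * p_e2
    return score[1] / (score[0] + score[1])

exact = exact_posterior(1, 1)
approx = naive_posterior(1, 1)
print(exact, approx, exact - approx)
```

The gap between the two numbers is the thing to look at. A small gap is
the price of the shortcut; a large or impossible one is the cue to go back
through the approximation line by line.
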
I wouldn't have gotten on your case if you'd said, "Novamente
approximates the evidence as zero to save on computing power, though the
Bayesian probability is infinitesimally positive." It's the part where
you said Bayesian probability theory was *wrong* where I got out the tar
and feathers and started heaping firewood around the stake in the town
square. PTL doesn't pull enough weight, and intuition doesn't pull
enough weight, to contradict a theory that's only got three million
published proofs of its unique consistency. That sort of thing makes me
think that you don't respect probability theory, bro, and if I let
people go around dissing Bayes, soon Bayes won't get no respect. If you
find a case where you think PTL gives a different result from
probability theory, and you see nothing alarming about this, but instead
conclude offhandedly that probability theory must be wrong, you will end
up not noticing when you divide by zero.
--
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence