From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Wed Sep 07 2005 - 20:11:44 MDT
Richard Loosemore wrote:
>
> I want to conclude by quoting one extract from your message below that
> sums up the whole argument:
>
> [Richard Loosemore wrote:]
>
>>> Like Behaviorists and Ptolemaic Astronomers, they mistake a
>>> formalism that approximately describes a system for the mechanism
>>> that is actually inside the system. They can carry on like this
>>> for centuries, adding epicycles onto their models in order to
>>> refine them. When Bayesian Inference does not seem to cut it,
>>> they assert that *in principle* a sufficiently complex Bayesian
>>> Inference system really would be able to cut it ... but they are
>>> not able to understand that the "in principle" bit of their argument
>>> depends on subtleties that they don't think much about.
>>
>> There are subtleties to real-world intelligence that don't
>> appear in standard Bayesian decision theory (he said controversially),
>> but Bayesian decision theory can describe a hell of a lot more than
>> naive students think. I bet that if you name three subtleties, I can
>> describe how Bayes plus expected utility plus Solomonoff
>> (= AIXI) would do it given infinite computing power.
>
> You make my point for me. The Ptolemaic astronomers would have used
> exactly the same argument that you do: "Name some subtle ways in which
> the heavenly bodies do not move according to the standard set of
> epicycles, and I can describe how an infinite number of epicycles would
> do it...." Yes, yes yes! But they were wrong, because the *real*
> mechanism for planetary movement was not actually governed by epicycles,
> it was governed by something completely different, and all the Ptolemaic
> folks were barking up the wrong tree when they thought their system was in
> principle capable of covering the data.
First off, you didn't answer my challenge. Name three subtleties - heck,
name one subtlety - and see what I make of it.
Your Ptolemaic argument misses the point. Ellipses are not superior to
epicycles because they are local, or cheaply computable, or have any such
computational advantage. Ellipses are superior to epicycles because
they are simpler. If planets really moved in ellipses, but we had to
try and approximate ellipses using epicycles because ellipses were just
too darn expensive to compute directly, that would be an appropriate
analogy to Bayesian probability versus cheaper approximations. If
planets really moved in ellipses, but we had to approximate ellipses
with epicycles because ellipses were intractable, then you'd damn well
better understand that planets *really* move in ellipses. And use that
knowledge to develop your fast epicycle algorithms and your programs
that find good, simple epicyclic approximations when looking at elliptic
data. It wouldn't be good to stare at the computer screen for months,
go nuts, and start thinking that planets *really* moved in epicycles.
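To make the analogy concrete, here is a minimal sketch of my own (not
anything from the original exchange; it assumes numpy, and the eccentricity
and term counts are arbitrary) showing that stacked circles - Fourier terms
in the mean anomaly - can fit an elliptical Kepler orbit to any accuracy you
like, even though the true mechanism contains no circles at all:

    # Sketch only: fit "epicycles" (Fourier circles) to an elliptical orbit.
    import numpy as np

    def kepler_orbit(ecc, n=4096):
        """Points of an elliptical orbit, sampled uniformly in time (mean anomaly)."""
        M = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
        E = M.copy()
        for _ in range(50):  # Newton's method on Kepler's equation E - e*sin(E) = M
            E -= (E - ecc * np.sin(E) - M) / (1 - ecc * np.cos(E))
        return (np.cos(E) - ecc) + 1j * np.sqrt(1 - ecc**2) * np.sin(E)

    def epicycle_fit(z, k):
        """Keep the k largest Fourier components: an offset, a deferent, epicycles."""
        c = np.fft.fft(z) / len(z)
        mask = np.zeros_like(c)
        mask[np.argsort(np.abs(c))[::-1][:k]] = 1
        return np.fft.ifft(c * mask * len(z))

    z = kepler_orbit(ecc=0.2)
    for k in (1, 2, 4, 8, 16):
        print(k, "circles: max error =", np.max(np.abs(z - epicycle_fit(z, k))))

The fit improves as fast as you care to pay for it, which is exactly why
goodness-of-fit alone never told the Ptolemaics they had the wrong mechanism.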
Your claim amounts to saying that since nobody can actually build a Carnot
engine, thermodynamics is invalid as a description of the physical universe
because it fails to take into account the subtleties of real engines, and
Carnot was obviously an idiot who never got his hands dirty on a real
engineering project.
If you want to argue that Bayesian probability can't do something *in
principle*, or that it's the wrong system to describe the ideal we're
trying to approximate, you have to say something other than "It's
computationally intractable". You have to show me a specific case where
Bayesian reasoning breaks down - where some other system produces
answers that are better. Perhaps you can find a case where you can
produce only slightly inferior answers using simpler methods, a la
Gigerenzer - those are fascinating and useful. But much more important
would be a case where an alternate method produces answers that are
*better* than Solomonoff induction using infinite computing power.
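For concreteness, here's a toy comparison of my own devising (the cue
validities and the task are invented, not drawn from Gigerenzer's data):
exact Bayesian inference against two fast-and-frugal rules on a three-cue
prediction problem. The frugal rules give up little or nothing in accuracy
here - which is Gigerenzer's fascinating and useful kind of result - but
their efficacy is still being measured against the Bayesian answer:

    # Toy illustration only; the cue validities below are made up.
    import random
    random.seed(0)

    VALIDITIES = [0.80, 0.70, 0.60]   # P(cue agrees with the true class)

    def sample():
        c = random.random() < 0.5
        return c, [c if random.random() < v else (not c) for v in VALIDITIES]

    def bayes(cues):
        # Exact posterior odds for class True: 50/50 prior, independent cues.
        odds = 1.0
        for cue, v in zip(cues, VALIDITIES):
            odds *= (v / (1 - v)) if cue else ((1 - v) / v)
        return odds > 1

    def one_reason(cues):
        return cues[0]                # follow only the single most valid cue

    def tally(cues):
        return sum(cues) >= 2         # simple majority of the cues

    trials = [sample() for _ in range(100_000)]
    for name, rule in [("Bayes", bayes), ("one-reason", one_reason), ("tally", tally)]:
        acc = sum(rule(cues) == c for c, cues in trials) / len(trials)
        print(name, round(acc, 3))
    # Expect roughly: Bayes ~0.80, one-reason ~0.80 (it coincides with Bayes
    # for these validities), tally ~0.79.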
> And a vital corollary to the above arguments about how to build an AGI
> is the fact that _absolutely guaranteeing_ a Friendly AI is impossible
> the way you are trying to do it. If AGI systems that actually work are
> Complex (and all the indications are that they are indeed Complex), then
> guarantees are impossible. It's a waste of time to look for absolute
> guarantees. (Other indications of Friendliness .... now that's a
> different matter).
That's another non-sequitur. Suppose you have to write localizable,
cheaply computable approximations to an ideal. If the localizable,
cheaply computable approximations are generated by the AI itself, the AI
can guarantee that the approximation is an approximation *of* the ideal,
rather than something else.
The wrong sequence of cosmic-ray transistor flips might turn any FAI bad
- in the limiting case, the bitflips rewrite your whole AI from scratch.
In that sense, there are no guarantees. (Though if you formally
guarantee that no *possible* three bitflips can corrupt your AI, you'll
have gone a long way toward ensuring that any *random* thousand bitflips
are extremely unlikely to corrupt your AI.)
What I want is no significant *independent* sources of failure that
apply to each round of recursive self-improvement. Each transistor in
your computer is less than 99% likely to operate over the next year -
you might drop your computer out a window, or accidentally spoon ice
cream onto the motherboard. But such catastrophes destroy large groups
of transistors simultaneously. Any given transistor has less than a 99%
chance of working for one year - but that doesn't mean that, in a chip
of twenty million transistors, you can raise 0.99 to the power of twenty
million and calculate a negligible probability of the whole chip lasting
a year. The probability that transistor A works for a year is not
independent of the probability that transistor B works for a year, so
you can't just multiply P(A) and P(B) to get P(AB). If each transistor in
your computer had an *independent* 99% chance of working through the year,
the whole computer wouldn't last a day.
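The arithmetic, as a back-of-the-envelope sketch (my own toy numbers for the
correlated case):

    # Why you can't multiply per-transistor reliabilities as if independent.
    p_work = 0.99            # assumed chance one transistor survives the year
    n = 20_000_000           # transistors on the chip
    print(p_work ** n)       # 0.0 in floating point - about 10^-87000 if independent

    # A crude correlated model: a 1% chance of a catastrophe (dropped computer,
    # ice cream on the motherboard) that kills everything at once, otherwise fine.
    p_catastrophe = 0.01
    print(1 - p_catastrophe) # ~0.99 chance the whole chip lasts the year anyway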
A seed AI needs to rewrite its own source code; then that new source
code may rewrite itself again, and so on. If there are independent
sources of error, then 99% reliability on one rewrite is nearly sure to
fail on a thousand sequential rewrites. A Friendly AI needs some way to
assert with extremely high probability that each rewrite maintains some
invariant.
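Again as bare arithmetic (illustrative numbers only):

    p_ok = 0.99
    print(p_ok ** 1000)      # ~4.3e-5: a thousand 99%-reliable rewrites almost surely go wrong somewhere
    print(0.999999 ** 1000)  # ~0.999: the per-rewrite reliability you actually need is extreme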
Somehow human mathematicians manage to scale up abstract mathematical
reasoning to theorems far larger than computers have yet succeeded in
proving. No known algorithm could independently prove a CPU design
correct in the age of the universe, but with human-chosen *lemmas* we
can get machine-*verified* correctness proofs. The critical property of
proof in an axiomatic system is not that it's certain, since the system
might be inconsistent for all we know or can formally prove. The
critical property is that, *if* the system is consistent, then a proof
of ten thousand steps is as reliable as a proof of ten steps. There are
no independent sources of failure. I hope that provably correct
rewrites, like provably correct CPUs, will be manageable if the AI can
put forth deductive reasoning with efficiency, tractability, and
scalability at least equalling that of a human mathematician. An
AI-complete problem? Sure, but let's not forget - we *are* trying to
design an AI.
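To gesture at what "no independent sources of failure" means in code - a toy
sketch of my own, with an invented rule set, nothing like a real verifier:

    # A deterministic checker's only failure mode is rejection; it has no
    # independent per-step chance of silently letting a bad step through.  So,
    # *if* the cited rules are sound, a verified 10,000-step chain is as
    # reliable as a verified 10-step chain.
    def verify(states, justifications, rules):
        for prev, nxt, name in zip(states, states[1:], justifications):
            if rules[name](prev) != nxt:
                return False          # bad step: reject the whole proof
        return True

    rules = {"double": lambda x: 2 * x, "inc": lambda x: x + 1}  # toy rules
    print(verify([1, 2, 3, 6], ["double", "inc", "double"], rules))  # True
    print(verify([1, 2, 5, 6], ["double", "inc", "double"], rules))  # False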
I would like to be able to say that if the framework we're using to
understand what our guarantees *mean* doesn't contain some deep,
concealed conceptual flaw - that's the part of the risk that can never
be wholly eliminated - then the system will not fail catastrophically,
guaranteed with p>0.999 reliability. Barring improbably huge
conjunctions of cosmic-ray transistor flips, or the Simulators reaching
into the Matrix to tweak the AI code.
> These points are so crucial to the issues being discussed on this list,
> that at the very least they need to be taken seriously, rather than
> dismissed out of hand by people who are unbelievably scornful of the
> Complex Systems community.
Give me one good, solid, predictive equation applying to cognitive
systems that stems from CAS. I am interested in knowledge, not proud
statements of ignorance. Don't tell me "some things are intractable".
I already knew that. Show me a tractable approximation of something
important. Show me something better than raw Bayes on bounded computing
power, and I'll be interested in that, as I'm interested in Gigerenzer's
stuff, despite the silly things that some of the Fast and Frugal crowd
have said about the value of Bayesian theory. Of course I'll go on
measuring the efficacy of your shiny new method in Bayesian terms, but
that can't be helped.
Again, Phil Goetz gave an excellent example of what I was looking for
when he made up the fake example of power laws applying to the
distribution of goals and subgoals of various sizes, as related to the
tractability of the system. Can you show me something like that, but
non-fake?
--
Eliezer S. Yudkowsky                          http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence