Re: The dangers of genuine ignorance (was: Volitional Morality and Action Judgement)

From: Eliezer Yudkowsky
Date: Thu May 27 2004 - 00:42:01 MDT

Ben Goertzel wrote:
>> I didn't say my insights were hard to grok, Ben, but neither, it
>> seems, are they so trivial as to be explained without a week of work.
>> I say something that I see immediately, and you say no. Past
>> experience shows that if you and I both have the time to spend a week
>> arguing about the subject, there's a significant chance I can make my
>> point clear, if my point is accessible in one inferential step from
>> knowledge we already share.
> Eliezer, it is unwarranted nastiness on your part to insinuate that I am
> not able to follow more than a single inference step. Gimme a break!

What, that people have trouble following multiple simultaneous inferential
steps? That's just common sense. Anyway, I can't recall ever seeing you
do two at once in one of our arguments.

>> The case of AIXI comes to mind; you made a mistake that seemed
>> straightforward to me because I'd extensively analyzed the problem
>> from multiple directions. And no, my insight was not too subtle for
>> you to comprehend. But it took a week, and the clock went on ticking
>> during that time.
> Heh. That "week arguing about AIXI" that you mention was about an hour
> of my time altogether; as I recall that was a VERY busy week for me, and
> I was reading and writing emails on the SL4 list in spare moments at
> high speed.

Good for you. I can't do that. For me, it ends up being a week full-time.

>> When I came to Novamente, I didn't succeed in explaining to anyone how
>> "curiosity" didn't need to be an independent drive because it was
>> directly emergent from information values in expected utility combined
>> with Bayesian probability. Maybe you've grown stronger since then.
> First of all, you never came to Novamente LLC (which BTW does not have
> an office, though office space is shared with partner firms in the US
> and Brazil). You came to Webmind Inc., which was a different company
> building a different AI system.

I accept your correction.

> Novamente is founded on probability theory, Webmind was not. That is
> one among several significant differences between the two systems.
> Secondly, when you visited Webmind, you (surprise, surprise!) seem to
> have come away with the impression that all of the staff were much
> slower and stupider than they actually are.

What is this leap from "A twenty-year-old poor explainer fails to convey a
point that will someday be a two-semester university course" to "He thinks
the listeners are morons"? This is *your* mistake, Ben, not mine: to
assume that these matters are so easily understood, and that if I think you
do not understand them, I must think you a moron. I have climbed the
mountain. I know how high it is. I know how difficult it was. All that
difficulty is invisible from the outside. From the outside it looks like a
foothill. I am not saying you cannot climb foothills, Ben, I am saying
that you will not know how high this mountain was until after you have
climbed it. Meanwhile, stop insisting that I insult your intelligence, for
by doing so you insult the problem. It is a beautiful problem, and does
not deserve your insults.

> Regarding the particular point you mention, many of us UNDERSTOOD your
> point that you COULD derive curiosity from other, more basic
> motivations. However, we didn't agree that this is the BEST way to
> implement curiousity in an AGI system. Just because something CAN be
> achieved in a certain way doesn't mean that's the BEST way to do it --
> where "best" must be interpreted in the context of realistic memory and
> processing power constraints.

This does not square with my memory. I recall that for some time after my
visit, you would still object that an AI with "narrow supergoals" would be
dull and uncreative (not "creative, but slightly slower"), and that drive
XYZ would not pop up in the system or would not be represented in the
system. (Where XYZ was necessary, and hence straightforwardly desirable
under the expected utility formalism.) You would ask, "But then how does
an expected utility maximizer [you didn't call it that, nor did I, but it
was what we were talking about] have a drive to minimize memory use?"

> Novamente is more probabilistically based than Webmind, yet even so we
> will implement novelty-seeking as an independent drive, initially,
> because this is a lot more efficient than making the system learn this
> from a more basic motivation.

I didn't say you had to make the system learn the value of novelty; I said
to pre-enter an abstract assertion along which the utility would flow.
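
The derivation at issue can be sketched in a few lines. This is my own toy
illustration, not anything from Webmind or Novamente, and every name and
number in it is hypothetical: a two-hypothesis world in which
information-gathering acquires instrumental value directly from the expected
utility calculation, so "curiosity" needs no independent drive.

```python
# Toy setup (hypothetical): under hypothesis H1 lever A pays 1 util,
# under hypothesis H2 lever B pays 1 util.  The agent's belief is
# p = P(H1).  No "curiosity drive" appears anywhere below; the value of
# looking falls out of the expected utility computation itself.

def value_of_acting_now(p):
    """Best expected payoff if the agent must pull a lever immediately."""
    return max(p, 1 - p)

def value_after_observing(p):
    """Expected payoff if the agent first observes which hypothesis holds,
    then pulls the correct lever.  Either way it earns the full payoff."""
    return p * 1.0 + (1 - p) * 1.0

def value_of_information(p):
    """Instrumental value of the observation: how much the expected
    utility calculation is willing to pay for a look at the world."""
    return value_after_observing(p) - value_of_acting_now(p)
```

Maximal uncertainty (p = 0.5) makes the observation worth 0.5 utils;
certainty makes it worth nothing. Information-seeking emerges exactly where
it should, strongest where the agent is most ignorant.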

>> But as far as I can tell, you've never understood anything of Friendly
>> AI theory except that it involves expected utility and a central
>> utility function, which in the past you said you disagreed with.
> Well, I believe I understand what you say in CFAI, I just don't see why
> you think that philosophy would work in a real self-modifying software
> system...

CFAI contains descriptions of *more* than Friendliness-topped systems or
expected utility. I do not see, in your emails, any signs that you realize
the other parts of CFAI exist.

>> I still haven't managed to make you see the point of "external
>> reference semantics" as described in CFAI, which I consider the Pons
>> Asinorum of Friendly AI; the first utility system with nontrivial
>> function, with the intent in CFAI being to describe an elegant way to
>> repair programmer errors in describing morality. It's not that I
>> haven't managed to make you agree, Ben, it's that you still haven't
>> seen the *point*, the thing the system as described is supposed to
>> *do*, and why it's different from existing proposals.
> Your "external reference semantics", as I recall, is basically the idea
> that an AGI system considers its own supergoals to be uncertain
> approximations of some unknown ideal supergoals, and tries to improve
> its own supergoals. It's kind of a supergoal that says "Make all my
> supergoals, including this one, do what they're supposed to do better."

No. That was an earlier system that predated CFAI by years - 1999-era or
thereabouts. CFAI obsoleted that business completely.

External reference semantics says, "Your supergoal content is an
approximation to [this function which is expensive to compute] or [the
contents of this box in external reality] or [this box in external reality
which contains a description of a function which is expensive to compute]."
This is a novel expansion to the expected utility formalism. A goal
system that handles the external reference in the utility function in a
certain specific way will retain the external reference, and continue
improving the approximation, even under self-modification. Or at least,
that was the idea in CFAI. I now know that there are other unsolved
problems. What doesn't change is that if you don't solve the ERS problem,
the system will definitely not be stable under self-modification (or rather
it will be stable, just not stable the way you hoped).
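
As a loose illustration of the mechanism (my own toy sketch, not code from
CFAI; every name here is hypothetical): the supergoal content is stored as an
explicit approximation to an external referent, and a proposed rewrite of
that content is scored against evidence about the referent, not against the
current approximation. That is why the pointer survives the rewrite.

```python
# Toy sketch of external reference semantics.  The "box in external
# reality" that the supergoal content approximates:
EXTERNAL_REFERENT = {"novelty": 0.2, "safety": 0.8}

# Evidence: a (noiseless, for simplicity) reading of the box's contents.
evidence = dict(EXTERNAL_REFERENT)

def distance(approx, evidence):
    """How far stored supergoal content lies from observed evidence about
    the referent (lower is better)."""
    return sum(abs(approx[key] - evidence[key]) for key in evidence)

def adopt_if_better(current_approx, proposed_approx, evidence):
    """Self-modification rule: a rewrite of supergoal content is adopted
    only if it better approximates the external referent.  The criterion
    points at the referent, not at the current content, so the external
    reference itself survives every rewrite."""
    if distance(proposed_approx, evidence) < distance(current_approx, evidence):
        return proposed_approx
    return current_approx

current = {"novelty": 0.5, "safety": 0.5}
proposal = {"novelty": 0.3, "safety": 0.7}
current = adopt_if_better(current, proposal, evidence)  # adopted: closer to the box
```

A goal system that instead scored rewrites against its *current* supergoal
content would freeze its present errors in place; that is the failure mode
the external reference is there to prevent.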

> What I don't understand is why you think this idea is so amazingly
> profound. Yes, this attitude toward one's supergoals is an element of
> what people call "wisdom." But I don't see that this kind of thing
> provides any kind of guarantee of Friendliness after iterated
> self-modification. Seems to me that an AGI with "external reference
> semantics" [TERRIBLE name for the concept, BTW ;-)]

I agree.

> can go loony just as easily as one without.
> But if I ever disagree with one of your ideas, your reaction is "Well
> that's because you don't understand it." ;-p

You did not understand it, as the above paragraphs illustrate. It is not
that I see you disagreeing. It is that I do not see any indication that
you are representing, in your mind, a model of the author's meaning that is
even close to the truth.

> BTW, the idea of goals having probabilities, propagating these to sub
> and supergoals, etc., was there in Webmind and is there in Novamente.
> Goal refinement has been part of my AI design for a long time, and it
> was always applied to top-level supergoals as well as to other goals.

"Goal refinement" is not the big idea. The idea is a *particular set* of
semantics for goal refinement that is stable under self-modification, in a
way that is not possible without expanding the standard expected utility

Supergoal refinement is part of plenty of other proposals I have heard, and
in all of them the supergoal-refinement mechanism immediately washes out
under self-modification.

>> Doesn't excuse every new generation of scientists making the same
>> mistakes over, and over, and over again. Imagine my chagrin when I
>> realized that consciousness was going to have an explanation in
>> ordinary, mundane, non-mysterious physics, just like the LAST THOUSAND
>> FRICKIN' MYSTERIES the human species had encountered.
> 1) I don't call quantum physics "non-mysterious physics"

I do. Everett dispels the mystery completely. The answer has been known
since 1957; it just isn't widely appreciated outside the theoretical
physics community. As I should have suspected, back when I believed the
people boasting about humanity's ignorance. There are mysterious
questions. Never mysterious answers.

> 2) Have you worked out a convincing physics-based explanation of
> consciousness? If so, why aren't you sharing it with us? Too busy to
> take the time?

I worked out something else I had once elevated to the status of a holy
mystery. Reading scientific history is no substitute for undergoing the
experience personally. If only I had *personally* postulated astrological
mysteries and discovered Newtonian physics, *personally* postulated
alchemical mysteries and discovered chemistry, *personally* postulated
vitalistic mysteries and discovered biology. I would have thought of a
mysterious explanation for consciousness and said to myself: "No way am I
falling for that again."

Reading the scientific history doesn't convey how reasonable and plausible
vitalism must have seemed *at the time*, how *surprising* and *embarrassing*
it was when Nature came back and said: "*Still* just physics." I didn't
have to be extraordinarily stupid to make the mistake - just human. Each
new generation grows up in a world where physics and chemistry and biology
have never been mysterious, and past generations were just stupid. And
each new generation says, "But this problem *really is* mysterious," and
makes the same mistake again.

I did not insult you so greatly, by comparing you to, say, a competent,
experienced medieval alchemist. I do not even say that you are an
alchemist, just that you are presently thinking alchemically about the
problem of consciousness, and AI morality. As did I, once upon a time.

But now I've gone through the embarrassment myself, and seen the blindingly
obvious historical parallels. Now I understand in my bones, not just
abstractly, that we live in a non-mysterious universe. Yes! It really is
ALL JUST PHYSICS! And not special mysterious physics either, just ordinary
mundane physics. No exceptions to explain the current sacred mystery, even
if it seems, y'know, REALLY MYSTERIOUS.

And, yes, I'm too busy to explain that formerly mysterious thing I solved,
the warm-up problem that gave me hope of tackling consciousness
successfully. I hope to get around to explaining eventually. But it's fun
philosophy, not crisis-critical, and therefore a low priority. Right now
I'm working on an answer to Aubrey de Grey's pending question about how to
define Friendliness.

I know it sounds odd. Maybe if someone offers SIAI a sufficiently huge
donation, I'll commit to writing it up. But everything I said should be
blindingly obvious in retrospect, so whether the alleged discovery is real
or gibberish, the materialist satori is straightforward.

Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:47 MDT