From: Eliezer S. Yudkowsky (firstname.lastname@example.org)
Date: Sun May 19 2002 - 18:02:21 MDT
Ben Goertzel wrote:
> I don't *know* how the brain works, nobody does, of course.
> My guess is that a simulation at the cellular level, with some attention to
> extracellular diffusion of charge and neurotransmitter chemistry, can do the
Well, yeah, cuz you think that all the higher levels of organization emerge
automatically from the low-level behaviors. The usual, in other words. And
yet, when I try to figure out how to make a higher-level quality emerge, my
mental experience is that there are an unbounded number of wrong ways to do
it, and extreme mental precision (and pessimism) is needed in order to
figure out how to do it. Hence my skepticism of your claim that all higher
levels of organization emerge automatically. You seem to be skipping over
all the issues that I think constitute the real, critical, hard parts of the
> > Also, I happen to feel that incorrect AI
> > designs contribute nontrivially to the amount of work that gets dumped on
> > parameter-tuning, engineering, and performance analysis.
> Why do you think a correct AI design would not require substantial
> parameter-tuning, performance analysis and engineering?
> The human brain clearly has required very substantial "parameter-tuning"
> over an evolutionary time-scale, and it has also been "performance tuned" by
> evolution in many ways.
An AI is not a human. I think that an AI design would start out by working
very inefficiently with a small amount of tuning. After that would come the
task of getting the AI to undertake more and more complex "tuning", a task
which is one of the earliest forms of seed AI.
> In all other software systems that I know of, "complexity of interaction of
> different algorithms" is a good indicator of the amount of parameter-tuning,
> subtle engineering design, and complicated performance analysis that has to
> be done.
> So, why do you think a "correct" AGI design would avoid these issues that
> can be seen in the brain and in all other complex software systems?
Because a correct seed AI design is designed to create and store
complexity. It has other places to store complexity aside from a global
surface of 1500 parameters. These complexity stores are tractable and
manageable because the designer spent time thinking about how to make them
tractable and manageable for the earliest stages of the AI. That is part of
the hard design problem of seed AI. And before you run away screaming, it
is *not* humanly unsolvable, it is a *very hard* problem that lies at the
inescapable center of seed AI.
> From what I know of DGI, I think it's going to require an incredible amount
> of subtle performance analysis and engineering and parameter tuning to get
> it to work at all. Even after you produce a detailed design, I mean. If
> you can produce a detailed design based on your DGI philosophy, that will be
> great. If you can produce such a design that
> -- can be made efficient without a huge amount of engineering effort and
> performance analysis, and
Efficient? The two most critical performance issues are:
1) Making the pieces of the system fit together, at all;
2) Making tuning of the system tractable and manageable for the pieces of
the system working together.
Working to make the system more efficient is really not the point. Working
to make the system more tunable by itself or more capable of tuning itself
is the point. This means tuning the system has a tractable fitness
landscape or poses problems that are understandable for the AI's
I understand that Novamente is a commercial project and hence may put in a
lot of human effort toward achieving a given level of performance at a given
time, but I really don't think that it's possible to understand seed AI by
improving the system yourself and tuning the system using genetic
algorithms. I think that making "the kind of complexity that improves
performance efficiency, on the average", or one kind of complexity like
that, into an AI-solvable problem, is one of the critical thresholds. It's
a critical threshold because a lot of simple parameter tweakings, or pieces
of complexity that are easy to create, improve performance on one problem
but suck up enough computing power to make them bad tradeoffs for
problems-in-general. When you find parameters that can be tuned with simple
quantitative settings, that's great. You just turn the AI loose on the
problem. But since really improving performance on general problems often
requires more complexity than that, getting the AI to leap those gaps is one
of the critical thresholds of seed AI.
Mostly, though, I feel that a correct AI design may work *better* if you set
the parameters for "forgetting" things to exactly the right value, but it
will *still work* even if the parameters are set to different values.
Moreover I feel that seed AI designs will tend to accumulate complexity for
describing when to forget things, rather than setting a global parameter. I
think that the very-high-level behaviors for inference will work OK even if
the mathematical behaviors they use are just rough approximations to the
optimal values. I think that Novamente's sensitivity to the exact equations
it uses for inference is symptomatic of an AI pathology for inference that
results from insufficient functional complexity for inference. A real AI
would be able to use rough approximations to Novamente's exact equations and
still work just fine.
> -- has a small number of free parameters, the values of other parameters
> being given by theory
You can have a huge number of free parameters as long as it is a manageable
problem to find an initial parameter set that will let the pieces of the
system fit together. Not "work optimally or efficiently". Just "fit
together". Just because you can get benefits out of tuning parameters does
not mean that you must spend all day tuning parameters. Rather what you
want to do is build a system that can work even if it has poorly tuned
And where you say "Well, that's not possible" or "It's easier to model the
human brain neuron by neuron," I say "This is one of the impossible-seeming,
critical, unavoidable hard problems of designing a general intelligence."
> THEN, my friend, you will have performed what I would term "One Fucking Hell
> of a Miracle." I don't believe it's possible, although I do consider it
> possible you can make a decent AI design based on your DGI theory.
Yes, an AI design requires One Fucking Hell of a Miracle. But it's a
*different* Miracle than the kind you describe. It's not a question of
solving enormous engineering problems through Incredible Dauntless Efforts
but of creating designs that are So Impossibly Clever they don't run into
those enormous engineering problems. I think many of the problems you're
running into are symptoms of trying to solve the problem too simply.
> In fact, I think that the engineering and performance analysis problems are
> likely to be significantly GREATER for a DGI based AI design than for
> Novamente, because, DGI makes fewer compromises to match itself up to the
> ways and means of contemporary computer hardware & software frameworks.
Engineering problems will be greater for DGI because DGI contains more
complexity, but I consider it a critical, unavoidable problem that the
overall system design should enable *adequate solutions* to work. Usually,
when you add pieces to a system, it becomes easier to break. This is
because people aren't thinking in terms of creating smooth fitness
landscapes, oversolving problems, and creating multiple routes to
solutions. If you oversimplify a mind design, you concentrate too much
functionality into individual pieces of the system. There aren't multiple
routes around things and each individual piece has to do a lot of work.
This is a very hard picture to convey, and I suspect I'm not doing too well
at it... another way of looking at it is that, from my perspective, you're
making generic algorithms do things that I think should be broken up into
interdependent internally specialized subsystems. So I think that one of
the consequences of this is that the generic algorithm finds *one path* from
A to D because it has to search through the whole problem space in one lump,
where the interdependent, internally specialized subsystems would be each
capable of finding *many paths* from A to B, B to C, and C to D. This
picture also gives a rough image of why the generic algorithm tends to be
more sensitive to parameter tuning, tends to require more performance
engineering, and so on.
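A toy sketch of the one-path versus many-paths picture, in Python. Nothing here comes from DGI or Novamente: the function names, the stage layout, and the figure of three workable paths per stage are all assumptions made up for illustration. It only counts end-to-end routes and checks whether any route survives when some paths stop working.

```python
def count_routes(ways_per_stage):
    """Number of distinct end-to-end routes when stage i can be crossed
    in ways_per_stage[i] interchangeable ways."""
    total = 1
    for ways in ways_per_stage:
        total *= ways
    return total


def still_connected(ways_per_stage, failures_per_stage):
    """True if every stage keeps at least one working path after
    failures_per_stage[i] of its paths are knocked out."""
    return all(ways > failed
               for ways, failed in zip(ways_per_stage, failures_per_stage))


# Hypothetical numbers: a generic algorithm that found a single A-to-D path,
# versus three subsystems (A->B, B->C, C->D) with three workable paths each.
one_lump = [1]
staged = [3, 3, 3]

print(count_routes(one_lump), count_routes(staged))   # 1 27
print(still_connected(one_lump, [1]))                 # False
print(still_connected(staged, [1, 1, 1]))             # True
```

In this toy, losing one path breaks the single-lump search outright, while the staged version can lose a path in every stage and still get from A to D.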
I think one of the reasons you're focused on parameter tuning and
performance engineering of Novamente is that Novamente is *just barely*
capable of solving a certain class of engineering problems, because
Novamente is too simple a design. I think that an improved design would
just swallow this whole class of problems whole and hence not require an
enormous amount of parameter tuning and performance engineering to do it.
Of course, there will then be a new fringe of problems, which you swallow
not by tuning parameters but by improving the system design so that these
problems are also "oversolved", swallowed whole. But since you believe the
current Novamente design is already adequate for general intelligence, and
since the design itself has a flat architecture, that kind of space for
design improvement is not really open to you. Which is why you focus on
parameter tuning and performance engineering. That's how I see it, anyway.
This business of very fragile solutions is a symptom of gnawing at the
fringes of the problem space, which in turn is a symptom of (a)
oversimplifying and (b) not being able to bite off big chunks of problem
space. Don't solve the problem. Oversolve it. Have multiple paths to
success in the local subtask, for each of the levels of organization that
makes up the superproblem (you will need multiple levels of organization for
this to work, of course).
This kind of design is totally foreign to software engineering as it stands
today, which typically is interested in *just one* solution to a problem.
If you iterate *just one* solution over and over, it creates systems that
become very fragile as they become large. If you iterate *many possible
paths to success* over and over - which really is one of those things that
you can do in an AI design but not a bank's transaction system - then you
don't get the AI pathology of this incredible fragility.
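One way to put rough numbers on the fragility claim is the short Python calculation below. It rests on assumptions that are not in the argument above: every stage succeeds independently, each path through a stage works with the same probability p, and a stage succeeds if any one of its paths does. The function name and the sample values are hypothetical.

```python
def system_success(p, stages, paths_per_stage=1):
    """Probability that every stage succeeds, assuming independent paths
    that each work with probability p (an illustrative assumption)."""
    stage_ok = 1.0 - (1.0 - p) ** paths_per_stage
    return stage_ok ** stages


# Fifty stages, each path 90% reliable (made-up figures).
single = system_success(0.9, 50)          # one solution per stage
redundant = system_success(0.9, 50, 3)    # three possible paths per stage

print(f"one path per stage:    {single:.4f}")
print(f"three paths per stage: {redundant:.4f}")
```

Under these assumptions the single-solution system almost never works end to end (well under 1%), while the version with three paths per stage succeeds about 95% of the time, and the gap widens as the number of stages grows.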
> > Imagine Lenat saying, "Well,
> > suppose that you
> > need to enter a trillion facts into the system... in this case it
> > would make
> > sense to scan an existing human brain because no programming team could
> > handle the engineering challenge of managing relationships among a dataset
> > that large."
> But this is the worst example you could have possibly come up with! Cyc is
> very easy to engineer precisely because it makes so many simplifying
This is just how I feel about Novamente.
> In almost all cases, I believe, incorrect AI theories have led to overly
> SIMPLE implementation designs,
I quite agree.
> not overly complex ones. AI scientists have
> VERY often, it seems to me, simplified their theories so they would have
> theories that could be implemented without excessive implementation effort
> and excessive parameter tuning.
Noooo... AI scientists have often oversimplified their theories because (a)
they made philosophical connections between observed human behaviors and
simple computational properties based on surface similarities and
enthusiasm; (b) because they didn't have the knowledge, skill, or
pessimistic attitude to perceive really complex systems, and hence could not
"move" in the direction of greater complexity when figuring out which system
> I'm afraid you are fooling yourself when you say that parameter tuning will
> not be a big issue for your AI system.
I don't mind parameter tuning contributing enormously to the AI's
performance, as long as the AI can tune its own parameters or get by with a
fairly simple set of starter parameters. Incidentally, I think that many
things which you consider as global parameters should be local,
context-sensitive parameters. I also think that many things which you think
are quantitative parameters are patterned or holonic variables.
> Even relatively simple AI models like attractor neural nets require a lot of
> parameter tuning. Dave Goldberg has spent years working on parameter tuning
> for the GA. Of course, you can claim that this is because these are all bad
> techniques and you have a good one up your sleeve. But I find it hard to
> believe you're going to come up with the first-ever complex computational
> system for which parameter-tuning is not a significant problem.
> Yes, a sufficiently advanced system can tune its own parameters, and
> Novamente does this in many cases; but intelligent adaptive self-tuning for
> a very complex system presents an obvious bootstrapping problem, which is
> trickier the more complex the system is.
This is an inescapable problem of seed AI, and one of the ways it becomes
more tractable is by, for example, localizing parameters. It's very easy to
see how a problem that is locally tractable could become globally
intractable if all the local parameters are yanked together.
Oversimplifying a fitness landscape can render it intractable. Making
complex parameters into simple parameters can make them intractable. And so
on. I think that the problems you are now experiencing are AI pathologies
of parameters that are too global and too simple. I think the real versions
of many of these parameters will be learned complexity.
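A minimal sketch of one reading of "yanked together", in Python. It assumes three contexts that each prefer a different value of the same parameter, a squared-error loss, and a coarse grid search; none of these details come from Novamente or DGI. Tuned locally, each context finds its own optimum trivially; forced to share one global value, no setting fits all of them.

```python
def loss(thetas, optima):
    """Total squared error between chosen values and per-context optima."""
    return sum((t - o) ** 2 for t, o in zip(thetas, optima))


optima = [0.1, 0.5, 0.9]              # hypothetical per-context best values
grid = [i / 10 for i in range(11)]    # candidate settings 0.0, 0.1, ..., 1.0

# Local parameters: each context searches the grid for itself.
local = [min(grid, key=lambda g, o=o: (g - o) ** 2) for o in optima]

# Global parameter: one value shared by every context.
shared = min(grid, key=lambda g: loss([g] * len(optima), optima))

print("local loss: ", loss(local, optima))
print("global loss:", loss([shared] * len(optima), optima))
```

The local search reaches zero loss, while the best shared value (0.5 here) still leaves a residual error that no amount of tuning the single global knob can remove.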
> > DGI does not contain *more specialized
> > versions* of these subsystems that support specific cognitive
> > talents, which
> > is what you seem to be visualizing, but rather contains a *completely
> > different* set of underlying subsystems whose cardinality happens to be
> > larger than the cardinality of the set of Novamente subsystems.
> Can you give us a hint of what these underlying subsystems are?
> Are they the structures described in the DGI philosophy paper that you
> posted to this list, or something quite different?
Memory. Concept kernel formation. I would say the things from DGI, but I
would add the proviso that I don't think you understood which subsystems DGI
was asking for.
> > I believe this problem is an AI pathology of the Novamente architecture.
> > (This is not a recent thought; I've had this impression ever
> > since I visited
> > Webmind Inc. and saw some poor guy trying to optimize 1500
> > parameters with a
> > GA.)
> Webmind had about 300 parameters, if someone told you 1500 they were goofing
> However, only about 25 of them were ever actively tuned, the others were set
> at fixed values.
Well, that's the word I remember hearing. It might have been a special case
or an exaggeration. I do remember seeing a HECK OF A LOT of parameters
scrolling down the guy's screen - more than 25, I'm sure. Although I also
think I remember seeing a lot of similarity between adjacent lines of text,
so he might have been breaking down parameters into subparameters or
something; I dunno.
> I sure am eager to see how DGI or *any* AGI system is going to avoid this
> sort of problem.
Deep architectures, experiential learning of local patterned variables
instead of optimization of global quantitative variables, multiple solution
pathways on multiple levels of organization, carving the system at the
> > > we'd be better off to focus on brain scanning and
> > > cellular brain simulation.
> > That doesn't help.
> Your extreme confidence in this regard, as in other matters, seems
> relatively unfounded.
> Many people with expertise in brain scanning and biological systems
> simulation disagree with you.
*Cough*. Let me rephrase. Brain scanning and cellular brain simulation are
best understood as helping us to understand what goes on in the brain, *not*
in letting us create intelligence without knowing what it is. I am
extremely skeptical that you can duplicate intelligence without knowledge
unless you have incredible scanning abilities and incredibly fine
simulations, because of the extreme complexity of single neurons and the
irreducibility of this complexity without an understanding of higher levels
> > Novamente has what I would consider a flat architecture, like
> > "Coding a Transhuman AI" circa 1998. Flat architectures come with certain
> > explosive combinatorial problems that can only be solved with deep
> > architectures. Deep architectures are admittedly much harder to
> > think about and invent.
> "Deep architecture" is a cosmic-sounding term; would you care to venture a
> definition? I don't really know what you mean, except that you're implying
> that your ideas are deep and mine are shallow.
Hopefully, what I said above fleshes out the definition a bit.
> My own subjective view, not surprisingly, is that YOUR approach is
> "shallower" than mine, in that it does not seem to embrace the depth of
> dynamical complexity and emergence that exists in the mind. You want to
> ground concepts too thoroughly in images and percepts rather than accepting
> the self-organizing, self-generating dynamics of the pool of intercreating
> concepts that is the crux of the mind. I think that Novamente accepts this
> essential depth of the mind whereas DGI does not, because in DGI the concept
> layer is a kind of thin shell sitting on top of perception and action,
> relying on imagery for most of its substance.
I don't think you're seeing the complexity. (As you have accused me of with
respect to Novamente, of course.) The processes do not "rely on imagery for
most of their substance" but rather require imagery as a level of
organization, and transformation into imagery and back into concept
substance, in order to engage in concept-level dynamics during experiential
learning. It's not *relying on* imagery for all of its substance but rather
doing things for which imagery is a *necessary but not sufficient*
condition. From my perspective, you're trying to use simple generic
processes to do things that require the interaction of interdependent
internally specialized processes. Imagery is one of the things that concept
learning depends on. Also, I'm still not sure that your imagery for
"imagery" is anything like my imagery for "imagery".
> The depth of the Novamente design lies in the dynamics that I believe (based
> on intuition not proof!) will emerge from the system, not in the code
> itself. Just as I believe the depth of the human brain lies in the dynamics
> that emerge from neural interactions, not in the neurons and
> neurotransmitters and glia and so forth. Not even the exalted
The part where we disagree is in the question of whether evolution carefully
and exactingly sculpted those higher levels of organization just as it
sculpted the neural interactions, or whether all higher levels of
organization emerge automatically as the laws of physics supposedly do (I
have my doubts).
I also feel that if you intuit dynamics will emerge, they will not emerge.
If you know what the dynamics are and how they work, you will be able to
create systems that support them; not otherwise. I think that the history
of AI shows that one of the most frequent classes of error is hoping that a
quality emerges when you don't really know exactly how it works.
> > It requires that you listen to your quiet, nagging
> > doubts about
> > shallow architectures and that you go on relentlessly replacing
> > every single
> > shallow architecture your programmer's mind invents, until you
> > finally start
> > to see how deep architectures work.
> Ah, how my colleagues would laugh to see you describe me as having a
> "programmer's mind" !!!
> For sure, I am at bottom a philosopher, much as you are I suspect.
No, I am not. I have fundamental problems with philosophy as an approach to
the mind, many of them the same fundamental problems that I have with
programming as an approach to the mind.
> You may
> disagree with my philosophy but the fact remains that I spent about 8 years
My clock is currently at 6 years and running.
> working on mathematically and scientifically inspired philosophy (while also
> doing various scientific projects), before venturing to design an AGI.
> Novamente is not at all a programming-driven AI project, although at this
> stage we are certainly using all the algorithmic and programming tricks we
> can find, in the service of the design. The design was inspired by a
> philosophy of mind, and is an attempt to realize this philosophy of mind in
> a practical way using contemporary hardware and software.
Yes, I know. I think that trying to implement philosophies is a recipe for
disaster because it involves saying "Effect A arises from B - isn't that
beautiful?" instead of "X, Y, and Z are necessary and sufficient causes of
A; here's a functional decomposition and a walkthrough." You think that all
the higher levels of your system will arise automatically because of
> You seem to have misinterpreted me. I am not talking about anything being
> in principle beyond human capability to comprehend forever. Some things ARE
> (this is guaranteed by the finite brain size of the human species), but
> that's not the point I'm making.
OK. We have different ideas about what a modern-day AI researcher should be
trying to comprehend. Does that terminology meet with your approval?
> I still believe it's possible that the AGI design problem is SO hard that
> detailed brain simulation is easier. I hope this isn't true, but if pressed
> I'd give it a 10%-20% chance of being true. Generally, I am not prone to
> the near 100% confident judgments that you are, Eliezer. I think I tend to
> be more aware of the limitations of my own knowledge and cognitive ability,
> than you are of your own corresponding limitations.
Ben, I don't know for sure that I'm right, but I'm pretty sure that you're
wrong. I do wish this distinction was easier to convey to people, since it
seems to be a pretty common confusion.
> Eliezer, I think it is rather funny for *you* to accuse *me* of flinching
> away from the prospect of trying to do something!
What on Earth are you talking about here? Where did you get the idea that I
am deliberately holding back on anything? I'd be putting together a
programming team right now if SIAI had the funding.
-- -- -- -- --
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence