RE: Review of Novamente

From: Ben Goertzel (ben@goertzel.org)
Date: Wed May 08 2002 - 10:24:04 MDT


Hi Eliezer,

> A note: Ben and I have discussed Novamente privately before now, and I'm
> not sure I want to recap the whole discussion from SL4. However, a couple
> of people have come forward and said that they want to see more discussion
> between me and Ben about Novamente. A problem is that, unlike when I'm
> talking with Ben, I can't quote from the Novamente manuscript and I can't
> assume anyone else has read it. However, I'll give it a shot.

I'm a bit strapped for time; I'm going out of town (again) next week and
have a lot of work to finish up...

So I'll type a long response this time but I may not be able to do 5
consecutive ones!!

> Well, I can only criticize what's *in* the book. If in the whole book
> there's no mention of emergent maps and then you say that you
> expect most of
> Novamente's functionality to come from emergent maps, then
> there's not much
> I can say about;

Well I did send you an extra chapter on emergent maps, drawn from previously
written material that hadn't made its way into the book.

I have found that, in some cases, I'm much worse at explaining my own ideas
than at explaining other people's!

In this case, after working on Webmind and Novamente with the same team for
many years, I think I sort of lost touch with what's obvious and implicit to
non-team-members, versus what's obvious and implicit to team members.

See, everyone on the team is so fucking used to my philosophy of mind, it's
100% implicitly understood by all of them, when they read the manuscript, that
the purpose of all the mechanisms is to give rise to mentally meaningful
emergent structures & dynamics, and that the mindstuff is the emergent maps
and not the individual nodes & links.

And the main, immediate purpose of the book draft was actually to
communicate to the team the *precise variations* on our commonly-known
collection of ideas, that I'd decided to go with.

Unfortunately, the book draft as written is effective *only* for this
purpose, i.e. only for communicating to the existing team who already fully
understand the context within which the details are proposed.

Knowing it was a crude draft, I only showed it to a handful of people
outside the team.

And, some of the others who read it -- who were more a priori sympathetic
than you to my overall philosophy of mind -- seemed more willing to
"fill in the gaps" themselves, and have had a more positive assessment of the
design than you did.

On the other hand, one of the readers said they found it totally
incomprehensible, consisting almost entirely of nonsense ;-)

Other comments included "a stroke of genius" and "atrociously written" ;->

Anyway, we are finding it is a hell of a lot of work to create a systematic
exposition of these ideas in a way that will be comprehensible to outsiders.
This is definitely a worthwhile kind of work, because there are loads of
people in the world who can give us valuable feedback on our ideas.
(Although the most valuable feedback will come from people whose
a priori outlook is a little closer to ours than yours is.)

On the other hand, we're spending most of our time building and applying the
system rather than writing about it, so it's gonna be at least another 6
months or so before we have a book draft we're willing to distribute more
widely.

> When I hadn't
> read the Novamente manuscript, I was basically willing to assume that, in
> the absence of more specific information, Novamente might have
> some process
> that implemented X. But if I read the manuscript and there's no
> mention of
> X, then I'm going to assume X is not supported unless I hear a specific,
> critique-able explanation of X - not in terms of "P.S: Novamente can do X"
> but in terms of "This is how these design features support X."
> Where we are
> at right now is that you've given me the manuscript, I read the
> manuscript,
> I said "This isn't general intelligence", and you said "Ah, but all the
> general intelligence features aren't in the manuscript." Okay.
> It could be
> true. But my working assumption is still going to be that the parts
> discussed in the manuscript describe the parts you actually know
> how to do,
> and all the interesting stuff that isn't in the manuscript are things you
> *want* to do but have not actually worked out in the level of
> design detail
> that would be required to discuss them in the manuscript.

There seems to be a fundamental philosophical point here...

I think most of the X's that you're referring to are things that, according
to our theory of mind, are supposed to be *emergent* phenomena rather than
parts of the codebase.

The book gives a mathematical formalization of the stuff that's supposed to
be in the codebase.

It doesn't talk enough about our experiences getting emergent phenomena out
of preliminary versions of the system, partly because these experiences were
with Webmind not Novamente and the book is about Novamente, and partly
because we didn't do enough to *quantify* and *formalize* these experiences
& observations.... So a lot of our knowledge about emergent behavior of the
system is really at the level of lore, not science. Since it's Webmind
lore, not Novamente lore, to explain it in detail would require us to write
a book on Webmind, which is a conceptually related but different-in-detail
system that we don't own anymore, and don't even have legal rights to write
a book about...

Hence, our inclination is to keep building & playing with Novamente, be more
careful to record, quantify and formalize all observed emergent phenomena
this time around, and then write about Novamente phenomena as we observe
them and as we have time.

Anyway, it's not possible to work out emergent phenomena at the same level
of design detail as code-level stuff. But it is possible to talk about it
in more detail than was done in the book version you read, and that will be
done in a later version.

Anyway, we certainly knew that this version was not going to be solid enough
to convince a skeptic that we have a good design for AGI. In fact, no
matter how well-written the book, it wouldn't pass this test. The only way
to convince a skeptic that we have a good design for AGI would be

a) create the working AGI

b) a distant second-place: create a thoroughly rigorous mathematical theory
of general intelligence, connected to pragmatic real-world intelligence
measurements, and prove that the design can lead to an AGI according to this
theory
[this still wouldn't convince the die-hard skeptics because debate would
just shift to the axioms of the mathematical theory]

My conclusion is that perhaps Peter Voss, James Rogers and others working on
their own AGI designs have the right approach -- which is the same approach
I was taking during the Webmind era and until very recently with Novamente.

I.e., perhaps the wiser attitude is: "Keep your mouth shut except among your
own little group, because talking about your ideas with others is just going
to lead you to spend all your time in arguments and none of your time
getting any work done. You'll never convince anyone you're on the right
track, because nearly everyone in the world believes AGI is impossible, and
nearly everyone who believes AGI is possible believes *they* have the secret
ingredient to AGI and therefore you cannot."

The amount of work required to explain our intuitions about the system, and
our anecdotal experiences with earlier versions, and why we think the system
can give rise to the emergent structures and dynamics we think it can, is,
it's becoming clear to me, a LOT. Is it worth doing this work instead of
working on the system itself, which may produce evidence that will be more
convincing than any words? Maybe not.

> Naturally, if you're going to focus on having a design in hand as
> a critical
> point, then I may be excused for pointing out that you do not
> seem to have a
> design in hand for the most critical parts of your system.

We have a design in hand for a system that WE BELIEVE will lead to an AGI.
We do not believe there are any critical parts of the system yet to be
designed.

The parts of the system you think we have not designed, are phenomena we
believe will emerge from the system.

The book apparently did a very poor job of explaining why we believe these
phenomena will emerge from the system. Slowly, we will produce a better
book draft that explains this more thoroughly. Slowly, because we will
continue to spend more time working on the system itself, than working on
describing our intuitions and experience in a way that will be convincing to
skeptical outsiders.

> > To sum up before giving details, basically, Eliezer's critique is that
> >
> > 1) he doesn't see how a collection of relatively simple,
> generic processes
> > working together can give rise to a rich enough set of emergent
> dynamics and
> > structures to support AGI
>
> I don't see how *your specific* collection of simple, generic processes,
> working in the fashion described in the Novamente manuscript, and
> interacting as I picture them (because their interaction is *not*
> described
> in any specific detail in the Novamente manuscript),

Here the discussion becomes tough, because not all listeners have read the
manuscript (and I don't want to distribute it widely).

I feel these interactions were described in moderate detail in many places,
which you failed to understand or chose to ignore.

> Certainly I feel that if one takes the Novamente
> design at face
> value then it is not an AGI.

The design is not an AGI.

The design is a design.

I think we're just dancing around the same point over & over again.

I need to do a better job of explaining why we think the emergent structures
& dynamics of mind will emerge from a system implemented according to the
Novamente design.

But this is not something I can do in an e-mail, even a long one.

> What kind of emergent meaning? How does it emerge and why? Can
> you give a
> specific example of a case where emergent meaning in Novamente is expected
> to contribute to general intelligence, including the nature of
> the emergent
> meaning, the low-level support of the emergent meaning, and the specific
> contribution made to general intelligence? If these things are not
> documented in your design then it is natural for me to assume
> that they are
> not driving the design.

Yeah, I will give plenty of examples like this in the rewrite of the
manuscript. Maybe I will post something on the topic to you or to this list
in a few weeks. I don't have time right now.

The example I would work out for you first would be the number "two" I
think...

> Your design description makes sense on its own;
> it's hard for me to believe that the entire motivation behind it is
> missing. The emergent behaviors you expect from Novamente seem to me like
> hopes, rather than part of the design.

The emergent behaviors are not PART of the design, they were however the
ENTIRE MOTIVATION for the design. Which is 100% obvious to the team, and,
understandably, not obvious to outsiders.

I spent 8 years thinking about emergent mind, and then 4 years designing a
system that I thought could give rise to it... The team participated in
those 4 years, hence the contextual meaning of the parts of the design is
obvious to them. It would also be obvious to you if the Novamente manuscript
were read in the context of my previous books. But that is too much to ask
of a reader, I understand that.

> My overall prediction for Novamente is that all the behaviors you
> are hoping
> will "emerge naturally", won't.

Gee that's a big surprise!!

And see, even after we revise the book and explain more clearly why we
believe the right structures & dynamics will emerge, you'll still say the
same thing.

Of course, if I came to you with a design for neurons and synapses and an
architecture diagram for the brain, and described the emergent dynamics and
structures I thought would come out of the brain -- and you'd never heard of
a brain before -- you'd probably say the same thing.

> Really? You don't enter nodes directly into Novamente, right now, at the
> current state of the system? Or is this something that you hope
> to do later
> but haven't done yet? How does the node get the name "cat"? In current
> versions of Novamente, how much of the network is semantic and how much is
> not?

Currently for the data analysis work we're doing, we actually have nodes for
numbers and patterns among numbers.
Nothing else ;)

When we hook it up to a simple "ShapeWorld" visual UI, we will have nodes
for (pixel coordinates, color) pairs, for instance.

When we hook it up to a conversation UI, we will have nodes for *characters*
(as described in the NLP chapter), with words being represented as perceived
lists of characters, etc.

Conceptual nodes must be built up from basic perceptual nodes like: numbers,
characters, (pixel coord., color) pairs, ...

If the system recognizes {s, q, u, a, r, e} as a coherent whole (a ListLink
of CharacterNodes), then it knows the word "square" in some sense. If it
then automatically links this ListLink with a PredicateNode P embodying a
certain set of relations between pixels (what we'd call a "square"), then it
is grounding P (the concept of square) in the word "square" (the ListLink
of CharacterNodes).

In this way it can (we hope) build up a semantic network from perceptions.
We did some simple experimenting with this sort of stuff in Webmind and have
not gotten there yet with Novamente.
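To make the grounding step concrete, here is a minimal sketch -- purely
illustrative, not our actual code; the class names, the GroundingLink type
and the is_square test are all made up for this example -- of a word-form (a
ListLink of CharacterNodes) being associated with a perceptual predicate over
(pixel coordinate, color) pairs:

  # Illustrative sketch only -- not actual Novamente code.  The real
  # atom types and truth-value handling differ in detail.
  from dataclasses import dataclass
  from typing import Callable, List, Tuple

  Pixel = Tuple[Tuple[int, int], str]    # ((x, y), color)

  @dataclass
  class CharacterNode:
      ch: str

  @dataclass
  class ListLink:                        # a perceived character sequence, e.g. "square"
      items: List[CharacterNode]

  @dataclass
  class PredicateNode:                   # a predicate over perceived pixels
      name: str
      test: Callable[[List[Pixel]], bool]

  @dataclass
  class GroundingLink:                   # associates the word-form with the predicate
      word: ListLink
      concept: PredicateNode
      strength: float = 0.9              # in principle, learned from co-occurrence

  def is_square(pixels: List[Pixel]) -> bool:
      # Stand-in for a learned relation: four corner pixels whose x and y
      # coordinates each take exactly two values, with equal side lengths.
      xs = {x for (x, _), _ in pixels}
      ys = {y for (_, y), _ in pixels}
      return (len(pixels) == 4 and len(xs) == 2 and len(ys) == 2
              and max(xs) - min(xs) == max(ys) - min(ys))

  word_square = ListLink([CharacterNode(c) for c in "square"])
  concept_square = PredicateNode("square-shape", is_square)
  grounding = GroundingLink(word_square, concept_square)
  print(concept_square.test([((0, 0), "red"), ((0, 2), "red"),
                             ((2, 0), "red"), ((2, 2), "red")]))   # True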

> > The intention is that much of the semantics of the system resides, not
> > directly in individual nodes and links, but rather in "maps" or
> > "attractors" -- patterns of connectivity and interconnection
> involving large
> > numbers of nodes and links.
>
> This is an *intention*. I don't see it in the Novamente design.
> What kinds
> of maps? What kinds of attractors? What functions do they implement?

This does not properly belong in the design. You don't *design* attractors,
they arise.

We can put more verbiage about this in the book, but that will just document
our intuitions and anecdotal experiences.

The proof or disproof will be in the pudding, of course...

> Okay. It's true, as you say below, that I tend to lump together certain
> systems that you think have key distinctions; i.e., I do not believe these
> distinctions are progress toward building a real AI system, while you
> believe that they do.

Well, in this case the distinctions at hand are standard ones in the CS
literature, not ones that I made up (predicate logic vs. term logic, crisp
logic vs. probabilistic logic).

In many cases, it seems, I think that choosing the right tool from the
standard toolkit is very important, whereas you think the whole toolkit is
inadequate.

> Yes. I must say that this idea stands out in my mind as seeming *very*
> GOFAIish - the idea that mathematical reasoning can be implemented/taught
> using Novamente's probabilistic inference on a series of Novamente
> propositions corresponding directly to the formal steps of a Mizar proof.

Well really, GOFAI is not commonly based on supervised learning from a huge
training database.

This is a much more recent meme in AI, which came along with the Net and
powerful computers.

For instance, GOFAI computational linguistics is on the way out, in favor of
corpus linguistics.

I don't think that training-database-based-supervised-learning AI is the
true path either but I think it has more meat to it than GOFAI. At least
one is dealing with a rich body of data and letting a system spontaneously
pick its own patterns from the data.

The Mizar approach to theorem-proving is in the spirit of modern
supervised-learning-based AI, not expert-system/logic-based GOFAI.

There is nothing like it in the computational theorem-proving literature.

However, I don't really think that supervised learning training based on
Mizar will be enough to teach Novamente (or any system) to prove nontrivial
theorems, I think that approach will only suffice for simple set theory
theorems and such. I think a human teacher will be needed to help it learn
from this highly valuable database. Anyway we're a long way from there at
present.

> I expect to be documenting which behaviors are supposed to be
> emerging from
> which other behaviors and which behaviors are supposed to be emergent from
> them, and I expect this to force internal specialization on
> multiple levels
> of organization. If I see you trying to make all the complexity of
> cognition emerge from generic behaviors, then my suspicion is
> naturally that
> you haven't mentally connected the hoped-for emergent behaviors
> to the level
> of organization from which they are supposed to be emergent, and hence
> experience no mental pressure to design low-level behaviors with internal
> specialization that naturally fits the emergent behaviors.

I think you want to design too much in detail. I have more faith/respect
in the power of self-organizing learning than you, I think.

> When I
> think of noticing spontaneous similarities, I think of background
> processes
> that work against current imagery. In Novamente the analogue of this part
> of the mind is Apriori datamining, or at least, if there's anything else
> that does this, it's not in the manuscript.

Inference and evolutionary CRN mining do that. They are described amply in
the manuscript, but evidently their scope of application is not sufficiently
clearly explained.

> > The evolutionary programming in Novamente is not classical ev.
> programming;
> > it has at least two huge innovations (only one of which has
> been tested so
> > far): 1) evolution is hybridized with probabilistic inference, which can
> > improve efficiency by a couple orders of magnitude,
>
> Can improve? Already has improved? You hope will improve?

Already has been shown to improve, in the GA case, by a factor of 50-100.

In the GP case, we didn't finish the experiments, but the initial results
were promising.

For related work, see Pelikan and David Goldberg's work on the Bayesian
Optimization Algorithm (BOA).
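For readers who haven't seen the BOA family of methods, here is a toy
estimation-of-distribution loop (UMDA/PBIL-style), just to illustrate the
general flavor of "evolution guided by a learned probabilistic model." It is
not our actual hybridization, which couples evolution with the inference
engine rather than with a simple marginal model:

  # Toy estimation-of-distribution loop, in the spirit of BOA/PBIL.
  # Purely illustrative -- it just shows a probabilistic model replacing
  # crossover and mutation.
  import random

  def evolve(fitness, n_bits=20, pop_size=100, elite_frac=0.2, generations=50):
      probs = [0.5] * n_bits                     # marginal model over bit values
      pop = []
      for _ in range(generations):
          pop = [[int(random.random() < p) for p in probs] for _ in range(pop_size)]
          pop.sort(key=fitness, reverse=True)
          elite = pop[:int(elite_frac * pop_size)]
          # Re-estimate the model from the fittest individuals, instead of
          # applying crossover/mutation to them directly.
          probs = [sum(ind[i] for ind in elite) / len(elite) for i in range(n_bits)]
      return max(pop, key=fitness)

  best = evolve(fitness=sum)                     # OneMax: maximize the number of 1 bits
  print(best, sum(best))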

> Okay, stepwise classical evolution on a slightly specialized
> representation
> that interacts with other stepwise generic processes. I suppose that from
> your perspective it is indeed unfair to call this "classical evolutionary
> programming". BUT the Novamente manuscript does not give examples of how
> evolutionary programming is supposed to interact with logical
> inference; it
> just says that it is.

The manuscript explains mathematically how the interaction happens, pretty
plainly (at least for those who know both processes well, which is a very
small set ;).

But you are right that it does not give particular examples, and it should.

> Hm, I think we have different ideas of what it means to invoke "Bayesian
> semantics". Of course Bayesian semantics are much more popular in CompSci
> and so your usage is probably closer to the norm. When I say Bayesian
> semantics I am simply contrasting them to, say, Tversky-and-Kahneman
> semantics; what I mean is semantics that obey the Bayesian behaviors for
> quantitative probabilities, not necessarily analogy with the philosophical
> bases or system designs of those previous AI systems that have been
> advertised as "Bayesian" approaches.

Novamente's inference engine uses Bayesian semantics locally but not
globally.

Globally, the system is not consistent with Bayesian probability
calculations, but it is approximately consistent within a single inference.
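Roughly what I mean by "Bayesian locally": a single deduction step like the
one sketched below is consistent with probability theory *given* an
independence (screening-off) assumption, but chaining many such steps through
a big network need not yield any one globally consistent joint distribution.
(This only illustrates the flavor of such a step; it is not our actual
truth-value formulas.)

  # Illustration only -- not our actual inference code.
  def deduce(p_b_given_a, p_c_given_b, p_b, p_c):
      # P(C|A) = P(C|B) P(B|A) + P(C|~B) (1 - P(B|A)),
      # assuming B screens off A from C.  P(C|~B) is recovered from the
      # marginals via P(C) = P(C|B) P(B) + P(C|~B) (1 - P(B)).
      p_c_given_not_b = (p_c - p_c_given_b * p_b) / (1 - p_b)
      return p_c_given_b * p_b_given_a + p_c_given_not_b * (1 - p_b_given_a)

  # One step, locally consistent with Bayesian probability under the assumption:
  print(deduce(p_b_given_a=0.8, p_c_given_b=0.9, p_b=0.5, p_c=0.6))   # 0.78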

> > And I think this is as it should be.
>
> Why did it take so long to scale from spider brains to human brains?

'Cause evolution is a terribly inefficient learning mechanism ;>

> I would say, in fact, that my own reaction to the history of
> the AI field has been to become very strongly prejudiced against
> permitting
> oneself to attribute conceptual and philosophical significance because it
> leads to declaring victory much too early

We are certainly not declaring victory.

> > > The lower
> > > levels of Novamente were designed with the belief that these
> lower levels,
> > > in themselves, implemented cognition, not with the intent
> that these low
> > > levels should support higher levels of organization.
> >
> > This is completely untrue. You were not there when we designed these
> > levels, so how on Earth can you make this presumption??
>
> Because they show no sign of specialization to support higher levels of
> organization.

We have different ideas about how much specialization *should* be there in
the "implementation level" of an AI design...

If you showed me a design with a huge amount of functional specialization in
the codebase, I might well think it was doomed to fail because of being too
complicated and overspecialized at such a low level...

> > I spent the 8 years before starting designing Webmind, writing books and
> > paper on self-organization and emergence in the mind. (See especially
> > Chaotic Logic and From Complexity to Creativity)
>
> Great, you're older. I'm more dedicated.

My terrifyingly advanced age was not the point of that statement. My point
was entirely different: that I had been thinking a lot about *emergent mind*
before launching into detailed AI design, so that for me and my team, our
detailed work was implicitly understood in the context of all this prior
stuff on emergent mind. A context that was not adequately drawn into the
book, as I've said a lot of times already.

Your statement and assumption that you're "more dedicated" than I am is
rather silly, in my view. I suspect we're both extremely dedicated to our
work, enough so that a competitive comparison is not meaningful (or useful).

> I know you had emergence as a goal. What aspects of the system are there
> *specifically* to support this goal? What design requirements were handed
> down from this goal?

Too much to answer in an e-mail

> And bear in mind also that from my
> perspective having
> "a higher level of structure and dynamics" is not a good goal for an AGI
> design; one should have certain specific high-level structures
> and dynamics,
> and certain specific behaviors above *those* dynamics, and so on through
> your system's specified levels of organization. Of course you
> may disagree.

Of course simply having *any* higher level of structure and dynamics is not
good enough.

I think we disagree on the *amount* of specificity one should try to design
into the emergent behaviors of a system, but the amount is far greater than
*zero*, even in my view.

>
> While I worry that Novamente's emergent dynamics will be chained
> to the same
> behaviors as the built-in ones.

An empirical question, fortunately

> What are you calling a "goal system" and what did Webmind do with it?
>

Later -- my time for this email has expired...

> Again, this is the kind of encouraging statement that I used to suspend
> judgement on before I read the Novamente manuscript. Now that
> I've read it,
> I would ask questions such as "What specific applications did you think
> Webmind was better suited for?", "Did you test it?", "What kind of results
> did you get in what you thought of as better-suited applications?", and so
> on.

The problem with text mining apps is that they require dealing with language
in an ungrounded way.

The stuff we're doing now, with analyzing bioinformatic data, is better.
Because the system is perceiving "raw data" from quantitative data files &
databases, and then building up its own concepts from them. When it deals
with language, it will do so only in the context of the empirical data
patterns it has already built up.

The financial analytics apps we did at webmind were better in an AGI sense.

In bio & finance we got really awesome results, in terms of being able to
recognize fancier patterns than anyone else.

Not AGI of course. Either

a) a dead end, or

b) important work tuning the perceptual pattern-recognition part of the
system for interesting real-world domains and building up its base of
perceptual patterns

depending on your perspective ;>

> One of the attitudes that seems very obvious to me is that you should
> estimate the size of the problem without being influenced by what
> resources
> you think will be available to solve it.

Well yeah, if I didn't do *that*, I would have just decided it was a problem
I could solve myself and avoided all the hassle of gathering a team ;->

My estimate of the amount of manpower required to make an AI has gone DOWN
over the last 3 years, not up. It went up from 1997-2000, and has gone down
from late 2000 till now. This is because of the design simplifications that
went into the move from Webmind to Novamente...

ben
