Re: Review of Novamente

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Wed May 08 2002 - 02:54:22 MDT


A note: Ben and I have discussed Novamente privately before now, and I'm
not sure I want to recap the whole discussion on SL4. However, a couple
of people have come forward and said that they want to see more discussion
between me and Ben about Novamente. A problem is that, unlike when I'm
talking with Ben, I can't quote from the Novamente manuscript and I can't
assume anyone else has read it. However, I'll give it a shot.

Ben Goertzel wrote:
>
> Unfortunately, I think that Eliezer did not really understand the basic
> concepts underlying the design, based on his reading of the manuscript.
> Obviously, since Eliezer is very smart and has a fair bit of relevant
> knowledge, this means that the book manuscript is in piss-poor shape. We
> should have a much better draft within 6 months or so. My feeling is that
> Eliezer's understanding of the design was impaired significantly by his
> strong philosophical biases which are different from my own strong
> philosophical biases.

Well, I can only criticize what's *in* the book. If in the whole book
there's no mention of emergent maps and then you say that you expect most of
Novamente's functionality to come from emergent maps, then there's not much
I can say about it, except, of course, that as far as I can tell what *was*
discussed in the Novamente manuscript won't give rise to emergent maps that
suffice for general intelligence. I can see that Novamente's several
generic processes can theoretically exhibit stepwise interaction through a
common representation, but this is mentioned in the manuscript only in
passing; the manuscript doesn't give specific examples of how this works, or
explain to what degree this has been observed to work in Webmind and to what
extent it is only a theoretical expectation for Novamente, and so on. When I hadn't
read the Novamente manuscript, I was basically willing to assume that, in
the absence of more specific information, Novamente might have some process
that implemented X. But if I read the manuscript and there's no mention of
X, then I'm going to assume X is not supported unless I hear a specific,
critique-able explanation of X - not in terms of "P.S: Novamente can do X"
but in terms of "This is how these design features support X." Where we are
at right now is that you've given me the manuscript, I read the manuscript,
I said "This isn't general intelligence", and you said "Ah, but all the
general intelligence features aren't in the manuscript." Okay. It could be
true. But my working assumption is still going to be that the parts
discussed in the manuscript describe the parts you actually know how to do,
and all the interesting stuff that isn't in the manuscript is stuff you
*want* to do but have not actually worked out to the level of design detail
that would be required to discuss it in the manuscript.

Naturally, if you're going to focus on having a design in hand as a critical
point, then I may be excused for pointing out that you do not seem to have a
design in hand for the most critical parts of your system.

> To sum up before giving details, basically, Eliezer's critique is that
>
> 1) he doesn't see how a collection of relatively simple, generic processes
> working together can give rise to a rich enough set of emergent dynamics and
> structures to support AGI

I don't see how *your specific* collection of simple, generic processes,
working in the fashion described in the Novamente manuscript, and
interacting as I picture them (because their interaction is *not* described
in any specific detail in the Novamente manuscript), will support those rich
emergent dynamics that I think are needed to support AGI. I don't know
whether they'll support the rich emergent dynamics that *you* hope for from
Novamente, or whether these hoped-for dynamics would be sufficient for AGI
if you had them, because these dynamics are not documented in the Novamente
manuscript. Certainly I feel that if one takes the Novamente design at face
value then it is not an AGI.

> 2) he doesn't think it's sensible to create a network *some of whose basic
> nodes and links have explicit semantic meaning*, but whose basic cognitive
> dynamics is based on *emergent meaning resident in patterns in the basic
> node-and-link-network*

What kind of emergent meaning? How does it emerge and why? Can you give a
specific example of a case where emergent meaning in Novamente is expected
to contribute to general intelligence, including the nature of the emergent
meaning, the low-level support of the emergent meaning, and the specific
contribution made to general intelligence? If these things are not
documented in your design then it is natural for me to assume that they are
not driving the design. Your design description makes sense on its own;
it's hard for me to believe that the entire motivation behind it is
missing. The emergent behaviors you expect from Novamente seem to me like
hopes, rather than part of the design.

My overall prediction for Novamente is that all the behaviors you are hoping
will "emerge naturally", won't. You may get naturally emergent behaviors
that implement a tiny subset A of the vast problem space B, and you may say,
"Yay! We solved B!", but B itself will not emerge naturally and will not in
fact be solvable by processes of the same basic character as A on reasonable
computing hardware. You may get things that you can call "emergent maps"
and that help solve one or two problems, but you won't get the kind of
emergent maps you need for general intelligence.

> Since I can't prove he's wrong or I'm right on these points, I guess it's
> just gonna remain a difference of intuition for a while.
>
> One nice thing about this sort of work is that it's empirical. Assuming the
> team holds together, we will finish implementing and testing the mofo and
> see if we're right or wrong.

Yes, quite true. And I'm doing my best to predict specific failure modes
that are falsifiable, rather than vague premonitions of doom that would
simply consist of jeering from the audience.

> > Capsule description of Novamente's architecture: Novamente's core
> > representation is a semantic net, with nodes such as "cat" and "fish", and
> > relations such as "eats". Some kind of emotional reaction is called for
> > here, lest others suspect me of secret sympathies for semantic networks:
> > "AAAARRRRGGGHHH!" Having gotten that over with, let's forge ahead.
>
> This is not a correct statement; the core data representation is not a
> semantic network.
>
> It is a NETWORK, with nodes and links. Some nodes and links may have
> transparent semantic meaning, such as "cat" or "eats". Others -- the vast
> majority -- will not. And if a node has a transparent meaning like "cat",
> this meaning (and the node) must be built by the system, not loaded in
> externally.

Really? You don't enter nodes directly into Novamente, right now, at the
current state of the system? Or is this something that you hope to do later
but haven't done yet? How does the node get the name "cat"? In current
versions of Novamente, how much of the network is semantic and how much is
not?
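
(To make the representational question concrete: what I picture as a reader
is something like the toy sketch below - a store of atoms in which a few
carry transparent labels like "cat" and the rest are anonymous, system-built
atoms with only quantitative truth and attention values. This is my own
minimal Python sketch with made-up names; nothing in it is taken from the
Novamente or Webmind codebase.)

    from dataclasses import dataclass, field
    from itertools import count
    from typing import Optional, Tuple

    _next_id = count()

    @dataclass
    class Atom:
        # Every atom carries quantitative truth and attention values; only
        # some atoms carry a transparent semantic label.
        truth: float = 0.5                # strength of the truth value
        attention: float = 0.0            # how much processing it currently attracts
        label: Optional[str] = None       # e.g. "cat"; most atoms would have None
        uid: int = field(default_factory=lambda: next(_next_id))

    @dataclass
    class Link(Atom):
        # A link is itself an atom that relates other atoms, e.g. eats(cat, fish).
        kind: str = ""
        targets: Tuple[Atom, ...] = ()

    cat = Atom(truth=0.9, label="cat")    # transparently semantic node
    fish = Atom(truth=0.9, label="fish")
    anon = Atom(truth=0.6)                # anonymous, system-built node
    eats = Link(truth=0.8, kind="eats", targets=(cat, fish))

The questions above are about how much of the running network consists of
atoms of the first kind versus the second, and how the labeled ones get
their labels.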

> The intention is that much of the semantics of the system resides, not
> directly in individual nodes and links, but rather in "maps" or
> "attractors" -- patterns of connectivity and interconnection involving large
> numbers of nodes and links.

This is an *intention*. I don't see it in the Novamente design. What kinds
of maps? What kinds of attractors? What functions do they implement?

> > Novamente's core representation is not entirely that of a
> > classical AI; Ben
> > insists that it be described as "term logic" rather than
> > "predicate logic",
> > meaning that it has quantitative truth values and quantitative attention
> > values (actually, Novamente can express more complex kinds of truth values
> > and attention values than simple quantities).
>
> Okay, there are two different confusions in this paragraph.
>
> 1) Logical inference is only one among very many dynamics involved in
> Novamente. "Term logic" is not a representation, it is a way of combining
> some links to form new links. The node-and-link representation is designed
> to support probabilistic term logic among many other important dynamics.
>
> 2) The difference between predicate logic and term logic has nothing to do
> with the use of probabilistic truth values. The difference between
> predicate logic and term logic has to do with the structure of the inference
> rules involved. In term logic two statements can only be combined if they
> share common terms; this is not true in predicate logic. This difference
> has a lot of philosophical implications: it means that term logic is not
> susceptible to the same logical paradoxes as predicate logic, and that term
> logic is better suited for implementation in a distributed self-organizing
> knowledge system like Novamente.

Okay. It's true, as you say below, that I tend to lump together certain
systems that you think have key distinctions; i.e., I do not believe these
distinctions represent progress toward building a real AI system, while you
believe they do.
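
(For anyone on the list who hasn't seen the distinction: the structural
constraint Ben describes fits in a few lines. The sketch below is a toy of
my own, not Novamente's probabilistic term logic, and the strength formula
is a placeholder assumption.)

    def deduce(premise1, premise2):
        # Toy term-logic deduction: from (A -> B, s1) and (B -> C, s2),
        # conclude (A -> C).  The rule fires only when the premises share
        # the middle term B - the constraint that distinguishes term logic
        # from predicate-logic resolution.  The strength combination below
        # is a placeholder, not the formula Novamente actually uses.
        a, b1, s1 = premise1
        b2, c, s2 = premise2
        if b1 != b2:
            return None                   # no shared term, no inference
        return (a, c, s1 * s2)

    print(deduce(("cat", "mammal", 0.95), ("mammal", "animal", 0.99)))
    # -> ('cat', 'animal', 0.9405)
    print(deduce(("cat", "mammal", 0.95), ("fish", "animal", 0.99)))
    # -> None: the premises share no term, so the rule does not apply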

> > Similarly, Novamente's
> > logical inference processes are also quantitative; fuzzy logic rather than
> > theorem proving.
>
> Again there are two different confusions overlaid.
>
> First, "Fuzzy logic" in the technical sense has no role in Novamente.
>
> Next, there is a whole chapter in the manuscript on theorem-proving. I
> think this is one thing the system will eventually be able to do quite well.
> In fact, I think that probabilistic inference and other non-inferential
> cognitive aspects like evolutionary concept creation and
> association-formation, are highly critical to mathematical theorem-proving.

Okay.

> And I think that expertise at theorem-proving will be an important partway
> step towards intelligent goal-directed self-modification. There was an SL4
> thread on the possible use of the Mizar theorem/proof database for this
> purpose, about a year ago.

Yes. I must say that this idea stands out in my mind as seeming *very*
GOFAIish - the idea that mathematical reasoning can be implemented/taught
using Novamente's probabilistic inference on a series of Novamente
propositions corresponding directly to the formal steps of a Mizar proof.

> > However, from my perspective, Novamente has very *simple* behaviors for
> > inference, attention, generalization, and evolutionary programming.
>
> We have tried to simplify these basic cognitive processes as much as
> possible.
>
> The complexity of cognition is intended to emerge from the self-organizing
> interaction of the right set of simple processes on a large set of
> information. NOT from complexity of the basic behaviors.

Well, that's a major difference of philosophy between us, not only about how
to approach AI, but what constitutes "having a design" for an AI.

I expect to document which behaviors are supposed to emerge from which
other behaviors, and which behaviors are supposed to emerge from those in
turn, and I expect this to force internal specialization on multiple levels
of organization. If I see you trying to make all the complexity of
cognition emerge from generic behaviors, then my suspicion is naturally that
you haven't mentally connected the hoped-for emergent behaviors to the level
of organization from which they are supposed to be emergent, and hence
experience no mental pressure to design low-level behaviors with internal
specialization that naturally fits the emergent behaviors.

> > For
> > example, Novamente notices spontaneous regularities by handing off the
> > problem to a generic data-mining algorithm on a separate server. The
> > evolutionary programming is classical evolutionary programming.
> > The logical
> > inference has classical Bayesian semantics. Attention spreads
> > outward like
> > ripples in a pond.
>
> All of these statements are wrong, Eliezer.
>
> Novamente notices regularities in internal and external data by many different
> mechanisms. The Apriori datamining algorithm that you mention is a simple
> preprocessing technique used to suggest potentially interesting regularities
> to the main cognition algorithms. It is by no means the sum total or even
> the centerpiece of the system's approach to recognizing regularities.

Hence the wording, "spontaneous regularities". Please bear in mind that
when I look at Novamente I am looking at it through the lens of DGI. When I
think of noticing spontaneous similarities, I think of background processes
that work against current imagery. In Novamente the analogue of this part
of the mind is Apriori datamining, or at least, if there's anything else
that does this, it's not in the manuscript. There are other kinds of
noticeable regularities. The point is that here, at least, there is a part
of Novamente that has a fairly clear analogy with a cognitive process
postulated in DGI, and this is what lets me say that Novamente's way of
handling this process is not complex enough.
I believe that the similarity-noticing parts of the mind are driven by a
certain specific kind of patterned interaction with attention, certain
learnable and instinctive biases, and a certain kind of nongeneral context
sensitivity, which means that handing it all off to Apriori doesn't seem
likely to work right. However, the local regularity-noticing implemented by
content within Novamente and not datamined globally cannot (I think) perform
this function properly either. I realize that everything in Novamente
interacts in at least some ways with everything else, but that doesn't mean
you have the specific interactions you need.
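
(For reference, since Apriori keeps coming up: it is the standard
frequent-itemset algorithm, nothing more. The sketch below is generic
textbook Apriori, my own code rather than Novamente's integration of it,
shown only to make concrete the kind of "spontaneous regularity"
preprocessing under discussion.)

    from itertools import combinations

    def apriori(transactions, min_support=2):
        # Minimal textbook Apriori: return every itemset that occurs in at
        # least min_support of the given transactions.
        transactions = [frozenset(t) for t in transactions]
        frequent = {}
        candidates = list({frozenset([item]) for t in transactions for item in t})
        k = 1
        while candidates:
            counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
            level = {c: n for c, n in counts.items() if n >= min_support}
            frequent.update(level)
            # Build (k+1)-item candidates by joining frequent k-itemsets.
            candidates = list({a | b for a, b in combinations(level, 2)
                               if len(a | b) == k + 1})
            k += 1
        return frequent

    data = [{"cat", "eats", "fish"}, {"cat", "eats", "mouse"}, {"dog", "eats", "meat"}]
    for itemset, n in apriori(data).items():
        print(sorted(itemset), n)         # e.g. ['cat', 'eats'] 2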

> The evolutionary programming in Novamente is not classical ev. programming;
> it has at least two huge innovations (only one of which has been tested so
> far): 1) evolution is hybridized with probabilistic inference, which can
> improve efficiency by a couple orders of magnitude,

Can improve? Already has improved? You hope will improve?

Incidentally, I am willing to believe not only this statement, but even the
statement that hybridizing evolution with inference significantly widens the
solution space; I just still don't think it's wide enough.

> 2) evolution takes place
> on node-and-link structures interpretable as combinatory logic expressions,
> which means that functions with loops and recursion can be much more
> efficiently learned (this is not yet tested). These may sound like small
> technical improvements, but they are specifically improvements that allow
> ev. prog. to become smarter & more effective thru feedback with other parts
> of the mind.

Okay, stepwise classical evolution on a slightly specialized representation
that interacts with other stepwise generic processes. I suppose that from
your perspective it is indeed unfair to call this "classical evolutionary
programming". BUT the Novamente manuscript does not give examples of how
evolutionary programming is supposed to interact with logical inference; it
just says that it does.
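
(Here is the kind of example I mean. The toy below is my own guess at the
general shape of an evolution/inference hybrid - an evolutionary loop whose
variation step is biased by "beliefs" updated from the current elite,
standing in for feedback from an inference process. I am not claiming this
is Novamente's mechanism; it is exactly the sort of thing the manuscript
would need to spell out.)

    import random

    def hybrid_evolve(fitness, length=16, pop_size=40, generations=50):
        # Toy hybrid: an evolutionary loop whose per-bit sampling bias is
        # updated from the current best individuals, a stand-in for feedback
        # from an inference process.  Illustrative only.
        bias = [0.5] * length                       # "beliefs" about each bit
        pop = [[random.random() < b for b in bias] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            elite = pop[: pop_size // 4]
            # "Inference" step: nudge the per-bit beliefs toward the elite.
            for i in range(length):
                freq = sum(ind[i] for ind in elite) / len(elite)
                bias[i] = 0.9 * bias[i] + 0.1 * freq
            # Variation step: resample children from the biased distribution,
            # keeping the elite unchanged.
            pop = elite + [[random.random() < b for b in bias]
                           for _ in range(pop_size - len(elite))]
        return max(pop, key=fitness)

    best = hybrid_evolve(fitness=sum)               # maximize the number of True bits
    print(sum(best), "of 16 bits set")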

> The logical inference system does not have classical Bayesian semantics, not
> at all. No single consistent prior or posterior distribution is assumed
> over all knowledge available to the system. Rather, each individual
> inference constructs its own distributions prior to inference. This means
> that the inference behavior of the system as a whole involves many
> overlapping pdf's rather than one big pdf. This is just NOT classical
> Bayesian semantics in any sense, sorry.

Hm, I think we have different ideas of what it means to invoke "Bayesian
semantics". Of course Bayesian semantics are much more popular in CompSci
and so your usage is probably closer to the norm. When I say Bayesian
semantics I am simply contrasting them to, say, Tversky-and-Kahneman
semantics; what I mean is semantics that obey the Bayesian behaviors for
quantitative probabilities, not necessarily analogy with the philosophical
bases or system designs of those previous AI systems that have been
advertised as "Bayesian" approaches.
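
(A generic worked example of what I mean by "the Bayesian behaviors for
quantitative probabilities" - plain Bayes' theorem, nothing specific to
either of our systems:)

    def bayes_update(prior, likelihood, false_alarm):
        # P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|~H) P(~H)]
        evidence = likelihood * prior + false_alarm * (1.0 - prior)
        return likelihood * prior / evidence

    # prior P(H) = 0.01, P(E|H) = 0.9, P(E|~H) = 0.05
    print(bayes_update(0.01, 0.9, 0.05))            # ~0.154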

> > Novamente does not have the complexity that
> > would render
> > these problems tractable; the processes may intersect in a common
> > representation but the processes themselves are generic.
>
> If by "generic" you mean that Novamente's basic cognitive processes are not
> functionally specialized, you are correct.
>
> And I think this is as it should be.

Why did it take so long to scale from spider brains to human brains?

> > Ben believes that Novamente will support another level of
> > organization above
> > the current behaviors, so that inference/attention/mining/evolution of the
> > low level can support complex constructs on the high level. While I
> > naturally agree that having more than one level of organization is a step
> > forward, the idea of trying to build a mind on top of low-level behaviors
> > originally constructed to imitate inference and attention is... well,
> > Novamente is already the most alien thing I've ever tried to wrap my mind
> > around;
>
> I am afraid that, because the description you read was a very sloppy rough
> draft, and because the design is so intuitively alien to you, you have
> managed to achieve only a very partial understanding of the system. Many
> things that, to me, are highly conceptually and philosophically significant,
> you seem to pass off as "implementation details" or "tweaks to existing
> algorithms."

I completely agree. I think that what you designate as "conceptual and
philosophical significance" is what I would designate as the "trophy
mentality". When I think about AI, I try to maintain a mental state where I
have to show that each element of the design contributes materially to
general intelligence, and conceptual and philosophical significance counts
for nothing. I would say, in fact, that my own reaction to the history of
the AI field has been to become very strongly prejudiced against permitting
oneself to attribute conceptual and philosophical significance because it
leads to declaring victory much too early - I consider it part of Failed
Promise Syndrome, the means by which researchers are seduced by ideas.
There is such a thing as ideas that are important, but if so, you have to
implement those important ideas by finding the one way to do it that
actually works, and not any of the many easy ways that seem to match the
surface description. It is a question of being willing to accept additional
burdens upon oneself, even unreasonable-seeming burdens, in order to meet
the problem on its own terms rather than yours.

> > The lower
> > levels of Novamente were designed with the belief that these lower levels,
> > in themselves, implemented cognition, not with the intent that these low
> > levels should support higher levels of organization.
>
> This is completely untrue. You were not there when we designed these
> levels, so how on Earth can you make this presumption??

Because they show no sign of specialization to support higher levels of
organization. If I see a bird made out of blocky monocolor legos, I assume
that the legos were designed without the bird in mind. If I see a bird made
out of legos that show a surface feather pattern and which have special
curved legos for the beak, I assume that the legos were designed with the
bird in mind. (We are, of course, speaking of human design rather than
evolution; in evolution these assumptions are much trickier.)

> I spent the 8 years before starting to design Webmind writing books and
> papers on self-organization and emergence in the mind. (See especially
> Chaotic Logic and From Complexity to Creativity.)

Great, you're older. I'm more dedicated. Neither of these is necessarily a
cognitive advantage. So if we argue, let's argue our designs, not argue
the audience's a priori estimates of who would be expected to have the
better design. 1 year of thought on my part could easily turn out to be
equivalent to 4 or more years of thought on yours, or vice versa, depending
on how much time we spent thinking, the underlying fruitfulness of our
relative approaches, and of course our native intelligence levels which must
be Shown Not Told.

> OF COURSE, I did not design the lower levels of the system without the
> emergence of a higher level of structure and dynamics as a key goal.

I know you had emergence as a goal. What aspects of the system are there
*specifically* to support this goal? What design requirements were handed
down from this goal? And bear in mind also that from my perspective having
"a higher level of structure and dynamics" is not a good goal for an AGI
design; one should have certain specific high-level structures and dynamics,
and certain specific behaviors above *those* dynamics, and so on through
your system's specified levels of organization. Of course you may disagree.

> > For example, Ben has
> > indicated that while he expects high-level inference on a
> > separate level of
> > organization to emerge above the current low-level inferential
> > behaviors, he
> > believes that it would be good to summarize the high-level patterns as
> > individual Novamente nodes so that the faster and more powerful low-level
> > inference mechanisms can operate on them directly.
>
> I think that the automated recognition *by the system* of high-level
> patterns in the system's mind, and the encapsulation of these patterns in
> individual nodes, is *one valuable cognitive heuristic* among many.

Yes, but it's something that I see as a *profoundly mistaken* approach,
which is why I'm singling it out. I see it as an indication that the higher
level of organization you hope for is being crushed into what I see as the
genericity of the lower level; that you think such a mapping is even
possible is, to me, very worrisome in terms of what it suggests you are
visualizing as the higher level of organization.

> The interplay between the concretely implemented structures/dynamics and the
> emergent ones, in Novamente, is going to be quite complex and interesting.
> This is where the complexity SHOULD lie, not at the level of the basic
> implemented structures and dynamics.

Whereas I worry that Novamente's emergent dynamics will be chained to the same
behaviors as the built-in ones.

> > To see a genuine AI capability, you have to strip away the suggestive
> > English names and look at what behaviors the system supports even
> > if nobody
> > is interpreting it. When I look at Novamente through that lens, I see a
> > pattern-recognition system that may be capable of achieving limited goals
> > within the patterns it can recognize, although the goal system currently
> > described (and, as I understand, not yet implemented or tested)
>
> Webmind's goal system was implemented and tested, Novamente's is not (yet).

What are you calling a "goal system" and what did Webmind do with it?

> > would permit
> > Novamente to achieve only a small fraction of the goals it should
> > be capable
> > of representing. Checking with Ben confirmed that all of the old Webmind
> > system's successes were in the domain of pattern recognition, so
> > it doesn't
> > look like my intuitions are off.
>
> Yes, we were developing Webmind in the context of a commercial corporation,
> and so most of our practical testing concerned pragmatic data analysis
> tasks. This doesn't mean that the architecture was designed to support ONLY
> this kind of behavior, nor even that it was the most natural stuff for us to
> be doing, in AI terms. In fact, we ended up using the system for a lot of
> "text analysis" work that it was really relatively *ill-suited* for, because
> that was what the business's products needed. (And the system performed
> well at text analysis, even though this really wasn't an appropriate
> application for it at that stage of its development).

Again, this is the kind of encouraging statement that I used to suspend
judgement on before I read the Novamente manuscript. Now that I've read it,
I would ask questions such as "What specific applications did you think
Webmind was better suited for?", "Did you test it?", "What kind of results
did you get in what you thought of as better-suited applications?", and so
on.

> Developing AI in a biz context has its plusses and minuses. The big plus is
> plenty of resources. The big minus is that you get pushed into spending a
> lot of time on applications that distract the focus from real AI.

Yes, Ben, hence our 501(c)(3) status.

> > By the standards I would apply to real AI, Novamente is
> > architecturally very
> > simple and is built around a relative handful of generic
> > behaviors; I do not
> > believe that Novamente as it stands can support Ben's stated goals of
> > general intelligence, seed AI, or even the existence of substantial
> > intelligence on higher levels of organization.
>
> You are right: Novamente is architecturally relatively simple and is built
> around a relative handful of generic behaviors.
>
> It is not all THAT simple of course: it will definitely be 100,000-200,000
> lines of C++ code when finished, and it involves around 20 different mental
> dynamics. But it is a lot simpler than Eliezer would like. And I think its
> *relative* simplicity is a good thing.
>
> I suspect that an AI system with 200 more specialized mental dynamics,
> rather than 20 generic ones, would be effectively impossible for a team of
> humans to program, debug and test. So: Eliezer, I think that IF you're
> right about the level of complexity needed (which I doubt), THEN Kurzweil is
> also right that the only viable approach to real AI is to emulate human
> brain-biology in silico. Because I think that implementing a system 10
> times more complex than Novamente via software engineering rather than
> brain-emulation is not going to be feasible.

One of the attitudes that seems very obvious to me is that you should
estimate the size of the problem without being influenced by what resources
you think will be available to solve it. Why? Because, logically, the size
of the problem is totally independent of what kind of resources are easy to
obtain. *First* you estimate the size of the problem, *then* you figure out
what you can do about it using available resources. There is no a priori
guarantee that the requirements are reasonable. In my experience most of
them are unreasonable. This is one of the differences in attitude that
keeps throwing me off my stride when I encounter it in your arguments. You
say that there's no billionaire currently funding AI, where my own attitude,
as a Singularity strategist, is to *first* ask whether a billionaire is
necessary. If so, you don't throw yourself at the problem and go splat; you
recurse on the problem of finding a billionaire or building an organization
large enough to get the necessary level of funding. You meet the problem on
its own terms and continue facing the unreasonably high standards set by the
problem until you come up with unreasonably good solutions, *if* you ever
do.

If it seems like complexity problems in software engineering may turn out to
represent a severe limit, or even the critical limit, then you face that
problem squarely (for example by forking off Flare), instead of shrugging
your shoulders and saying "Well, I'll assume the problem isn't that hard
because otherwise I'm helpless." If you consider your statement
"Implementing a system 10 times more complex than Novamente will not be
feasible" as a possible *death sentence* for the human species, and spend a
week or so thinking that it not only represents a *possible* death sentence
but actually *does* represent a death sentence for you and your family and
your entire species, I'm sure you'd start having creative ideas about how to
manage complexity - even if the most creative idea you could come up with
was trying to push a massive industry effort to develop new ways of managing
software complexity, in the hopes that someone else would solve the problem.

There is no promise that AI will be easy. I only know that it is
necessary. That said, if I didn't think our current approach would work, I
would be doing something else.

-- -- -- -- --
Eliezer S. Yudkowsky http://intelligence.org/
Research Fellow, Singularity Institute for Artificial Intelligence


