RE: ARTICLE: Memory bandwidth

From: Ben Goertzel (ben@webmind.com)
Date: Mon Apr 16 2001 - 05:39:53 MDT


hi,

>Do you have much experience with C++?

Yes, and I fucking hate it. It's an awkward, oversized, nasty language,
which is intensely painful to debug. That's why I chose Java for Webmind.
What a disappointment to find that, all in all, for large-scale projects
Java is even ~worse~ ;(

> What did you find was more of a problem: memory consumption, execution
> speed, or distributed interfaces?

Execution speed was only a problem in one sense: the strict OO nature of
Java forces you to use design patterns that create a lot of objects all the
time. If you implement a similar design in C++ and in Java, and use HotSpot
to optimize the Java, there isn't much speed difference. But C++ is a larger
and nastier language, and as a consequence gives you more design flexibility
to use an object-less, down-and-dirty pure C approach to implementing some
things when it's really necessary. In this sense, the optimal Java program
for solving a problem can sometimes be slower than the optimal C++ program,
because in C++ you always have the option to revert to the nastiest and most
efficient kind of C when you need to.
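
To make the "object-less" idea concrete, here is a minimal sketch of how links could be kept in parallel primitive arrays instead of one object per link. It is only an illustration: the class name FlatLinkStore, its fields, and its methods are hypothetical and not taken from Webmind, and it is written in present-day Java syntax.

```java
import java.util.Arrays;

/**
 * Hypothetical link store: each link is a row index into parallel
 * primitive arrays, so no per-link object is ever allocated.
 */
public class FlatLinkStore {
    private int[] source = new int[16];
    private int[] target = new int[16];
    private float[] weight = new float[16];
    private int size = 0;

    /** Appends a link and returns its row index. */
    public int add(int from, int to, float w) {
        if (size == source.length) {
            // Grow all three arrays together so row indices stay aligned.
            int capacity = size * 2;
            source = Arrays.copyOf(source, capacity);
            target = Arrays.copyOf(target, capacity);
            weight = Arrays.copyOf(weight, capacity);
        }
        source[size] = from;
        target[size] = to;
        weight[size] = w;
        return size++;
    }

    public int size() { return size; }
    public int sourceOf(int link) { return source[link]; }
    public int targetOf(int link) { return target[link]; }
    public float weightOf(int link) { return weight[link]; }
}
```

The trade-off is the one described above: the arrays are compact, but the code gives up the object-oriented design and reads much more like C.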

Memory consumption is a really big problem with Java, and there's no way to
get around it. You can mitigate it by using object pools and other fancy
design patterns, but even so, Java uses at least 4 times as much memory as
C++ for doing the same things. And that's a comparison with OO C++, not
with down-and-dirty C written with a view toward memory conservation.
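
For readers who haven't met the object pool pattern mentioned above, a minimal single-threaded sketch follows. The class name SimplePool and its methods are hypothetical, chosen for illustration only, and the syntax is present-day Java. A pool like this cuts allocation churn by reusing instances; it does not shrink the size of each object.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical, single-threaded object pool that reuses released instances. */
public class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    public SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns a pooled instance if one is free, otherwise creates a new one. */
    public T acquire() {
        T item = free.poll();
        if (item == null) {
            item = factory.get();
            created++;
        }
        return item;
    }

    /** Hands an instance back; the caller must reset its state before reuse. */
    public void release(T item) {
        free.push(item);
    }

    /** Number of instances the pool has actually allocated. */
    public int createdCount() {
        return created;
    }
}
```

Typical use would be `pool.acquire()` at the start of a unit of work and `pool.release(x)` at the end, so that steady-state operation allocates nothing new.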

> Even more mature languages only have a few years on Java in terms of
> distributed architectures, so I suspect that while you had a challenge on
> your hands with that, it wasn't the limiting factor. Am I wrong about that?

Doing distributed programming in Java is REALLY REALLY NICE.

The main problem is that Java's obscene memory consumption, combined with
the 1.2 GB memory limit on every existing JVM, forces you into distributed
programming when dealing with relatively small datasets ("relatively"
meaning "relative to all human knowledge" ;).

There are some other technical issues too. For instance, RMI is elegant and
easy to use, but it wasn't made for high-volume message throughput, and
this shows sometimes.

Also, if you need to send messages between machines really fast, the
serialization/deserialization of objects becomes your main bottleneck --
it's much slower than sending messages across cables these days. And
there's no way to fix this without going inside the JVM.
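
A small sketch can show where some of the serialization cost comes from. It compares the byte count of default Java serialization with a hand-written encoding of the same three fields. The Message class and both method names are hypothetical, and this illustrates only the size overhead of the default format, not a fix for the bottleneck described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationCost {
    /** Hypothetical message type, invented for this example. */
    static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        final int sourceId;
        final int targetId;
        final float weight;

        Message(int sourceId, int targetId, float weight) {
            this.sourceId = sourceId;
            this.targetId = targetId;
            this.weight = weight;
        }
    }

    /** Bytes produced by default Java serialization (stream header and class descriptor included). */
    static int javaSerializedSize(Message m) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(m);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Bytes produced by writing just the three fields: 4 + 4 + 4. */
    static int handEncodedSize(Message m) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(m.sourceId);
            out.writeInt(m.targetId);
            out.writeFloat(m.weight);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }
}
```

The hand-written form carries only the field values, while the default form also carries a description of the class, which is part of why small, frequent messages are expensive to serialize.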

> Do you mind if I ask a few more questions?
>
> How many machines are we talking about - five, fifty, five hundred?

At most, we're working with about 30 quad-processor PCs (4GB RAM each) for
the "mind core", and about 50 ordinary PCs that can sometimes do small bits
of processing (when their users aren't using them).

Of course, most of these machines have now been sold off in the bankruptcy
proceedings ;D

Now we have a few quads left, and some smaller machines... running in the
basement of another WM die-hard along with the mailserver, intranet, etc.

In practice, we rarely ran a Webmind with more than 7-10 of the quads at
once.

> How many JVMs did you run on each machine, and how many
> threads in each JVM?

1-3 JVMs per machine, depending on all sorts of things.

In terms of threads, generally fewer than half a dozen per process. We used
to have an architecture that used more threads than that, but that was
before we implemented our own internal scheduler.
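
The post gives no details of that scheduler, so the following is only a generic sketch of the idea: many small tasks multiplexed onto one thread in round-robin order instead of one thread per task. The class name RoundRobinScheduler and its methods are hypothetical, and the syntax is present-day Java.

```java
import java.util.ArrayDeque;
import java.util.function.BooleanSupplier;

/** Hypothetical cooperative scheduler: runs many small tasks on the calling thread. */
public class RoundRobinScheduler {
    private final ArrayDeque<BooleanSupplier> queue = new ArrayDeque<>();

    /**
     * Registers a task. Each call to the task performs one small step and
     * returns true if the task wants to be scheduled again.
     */
    public void submit(BooleanSupplier task) {
        queue.add(task);
    }

    /** Runs steps in round-robin order until no task remains; returns the number of steps run. */
    public int runAll() {
        int steps = 0;
        while (!queue.isEmpty()) {
            BooleanSupplier task = queue.poll();
            steps++;
            if (task.getAsBoolean()) {
                queue.add(task); // not finished: go to the back of the line
            }
        }
        return steps;
    }
}
```

Because tasks yield voluntarily after each step, a design like this needs only as many threads as there are processors to keep busy, at the cost of every task having to be written as a sequence of short steps.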

> You mentioned billions of objects - are you serious about that? I've never
> worked with a system with so many active objects. Rows in databases, yes.
> But for running objects, even in a distributed C++ architecture, I've never
> seen anything like that many. Were they active (Runnable), serving as data
> storage, or what?

Millions of nodes, billions of links between them. All in RAM, on a cluster
of machines. That's what it takes to make an object-oriented mind!

> I take it you tried every available JVM?

oh yeah...

ben


