Re: [sl4] The Jaguar Supercomputer

From: Matt Mahoney
Date: Fri Nov 20 2009 - 08:31:02 MST

Some specs on the Jaguar XT5.
In particular, it has 37,376 6-core AMD Opterons, 299 TB of DDR2-800 memory, and 10,000 TB of disk. Peak speed is 2332 teraflops. Memory bandwidth is 478 TB/s. Interconnect bandwidth is 378 TB/s.

A human-brain-sized neural network can be roughly modeled by a 10^11 by 10^11 sparse weight matrix with 10^15 nonzero weights. A dense array would have 10^22 elements and, at one byte each, would require 10^10 TB, which would not fit on disk. The weight matrix could be represented instead as a list of 10^15 pointers of 4 bytes each, using 4 x 10^15 bytes = 4000 TB.
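The storage arithmetic above can be checked with a short Python sketch. The one-byte dense element and 1 TB = 10^12 bytes are assumptions chosen to match the figures in the text, and the function and constant names are only illustrative.

```python
# Back-of-envelope storage estimate for the weight matrix.
# Assumptions (not stated in the post): 1 byte per dense element,
# 1 TB = 10**12 bytes.
NEURONS = 10**11
SYNAPSES = 10**15             # nonzero weights
BYTES_PER_DENSE_ELEMENT = 1   # assumption
BYTES_PER_POINTER = 4         # implied by 4 x 10^15 bytes for 10^15 weights
TB = 10**12

JAGUAR_MEMORY_TB = 299
JAGUAR_DISK_TB = 10_000


def dense_storage_tb(neurons=NEURONS, bytes_per_element=BYTES_PER_DENSE_ELEMENT):
    """Size in TB of a full neurons-by-neurons array."""
    return neurons * neurons * bytes_per_element // TB


def sparse_storage_tb(synapses=SYNAPSES, bytes_per_pointer=BYTES_PER_POINTER):
    """Size in TB of a list holding one pointer per nonzero weight."""
    return synapses * bytes_per_pointer // TB


dense_tb = dense_storage_tb()
sparse_tb = sparse_storage_tb()

print("dense array: ", dense_tb, "TB")
print("pointer list:", sparse_tb, "TB")
print("pointer list fits on disk:  ", sparse_tb <= JAGUAR_DISK_TB)
print("pointer list fits in memory:", sparse_tb <= JAGUAR_MEMORY_TB)
```

Under these assumptions the pointer list fits on Jaguar's disk but not in its 299 TB of memory.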

A real-time simulation would require one round of matrix addition and multiplication every few tens of milliseconds. (I am assuming that the signal of interest is the firing rate and not individual spikes.) Unfortunately, memory is optimized for serial access. Random access time is on the order of 50 ns. Thus, one matrix operation touching all 10^15 weights would take 5 x 10^7 seconds, on the order of 10^8 seconds, or about 10^9 times slower than real time.
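A rough Python sketch of the slowdown estimate follows. It assumes one random access per nonzero weight, performed serially, and picks 50 ms as a stand-in for "tens of milliseconds". Both are assumptions for illustration, as are the names used.

```python
# Serial random-access cost of one pass over the sparse weight list.
# Assumptions: one 50 ns random access per weight, no parallelism,
# and a 50 ms real-time step (an assumed value for "tens of milliseconds").
SYNAPSES = 10**15
RANDOM_ACCESS_NS = 50
REAL_TIME_STEP_MS = 50   # assumption


def serial_pass_seconds(accesses, access_ns):
    """Whole seconds needed to perform `accesses` random accesses in a row."""
    return accesses * access_ns // 10**9


def slowdown_factor(accesses, access_ns, step_ms):
    """How many times slower than real time one pass is."""
    step_ns = step_ms * 10**6
    return accesses * access_ns // step_ns


pass_seconds = serial_pass_seconds(SYNAPSES, RANDOM_ACCESS_NS)
slowdown = slowdown_factor(SYNAPSES, RANDOM_ACCESS_NS, REAL_TIME_STEP_MS)

print("one pass takes", pass_seconds, "seconds")
print("slowdown versus real time:", slowdown)
```

With a 10 ms or 100 ms step instead, the slowdown moves by less than an order of magnitude in either direction, so the conclusion does not depend much on the exact step chosen.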

Weights are not uniformly distributed. Rather, neurons that are closer together are more likely to be connected. This suggests a more efficient implementation. The brain has a volume of about 1000 cm^3, or 10^6 mm^3. Axons have a diameter of about 10^-6 m, so on the order of 10^6 of them can cross a 1 mm^2 face. Suppose that 10^6 processors each simulated 1 mm^3 of brain tissue. Each cube would have 10^5 neurons, 10^9 synapses, and 10^6 I/O channels at 10 to 100 bits per second each. The internal weight matrix would be dense. It could be represented as a 10^5 by 10^5 array of 10^10 elements, requiring 10^11 to 10^12 operations per second. This is not unreasonable for one processor.

So it seems that a human brain could be simulated by a million off-the-shelf processors or, at up to 10^6 x 10^12 = 10^18 operations per second against a peak of about 2.3 x 10^15 per machine, a few hundred Jaguars. Furthermore, if a few processors fail, the simulation will degrade gracefully because the human brain is fault tolerant.
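The per-cube figures can be reproduced with the Python sketch below. It assumes each axon occupies a 1 micron square of a cube face, that every element of the dense array is touched once per update at 10 to 100 updates per second, and that operations can be compared directly against Jaguar's peak teraflops. All of these are simplifications for illustration, not claims from the post.

```python
# Partition the brain into 1 mm^3 cubes, one processor per cube.
BRAIN_VOLUME_MM3 = 1000 * 1000    # 1000 cm^3 at 1000 mm^3 per cm^3
NEURONS = 10**11
SYNAPSES = 10**15
CUBE_SIDE_UM = 1000               # 1 mm
AXON_DIAMETER_UM = 1              # 10^-6 m
UPDATES_PER_SECOND = (10, 100)    # assumption: one update per 10-100 ms
JAGUAR_PEAK_FLOPS = 2332 * 10**12

processors = BRAIN_VOLUME_MM3
neurons_per_cube = NEURONS // processors
synapses_per_cube = SYNAPSES // processors

# Assumption: each axon takes a 1 um x 1 um square of one cube face.
axons_per_face = (CUBE_SIDE_UM // AXON_DIAMETER_UM) ** 2

# Dense internal weight matrix: neurons_per_cube x neurons_per_cube.
dense_elements = neurons_per_cube ** 2
ops_per_second = tuple(dense_elements * rate for rate in UPDATES_PER_SECOND)

# Crude comparison of total operations against Jaguar's peak speed.
jaguars_low = processors * ops_per_second[0] // JAGUAR_PEAK_FLOPS
jaguars_high = processors * ops_per_second[1] // JAGUAR_PEAK_FLOPS

print("processors:       ", processors)
print("neurons per cube: ", neurons_per_cube)
print("synapses per cube:", synapses_per_cube)
print("axons per face:   ", axons_per_face)
print("dense elements:   ", dense_elements)
print("ops per second:   ", ops_per_second)
print("Jaguars needed:   ", jaguars_low, "to", jaguars_high)
```

The Jaguar count is sensitive to the assumed update rate, which is why it spans an order of magnitude.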

Of course this is a different problem than writing the software or obtaining appropriate training data.

-- Matt Mahoney
