From: Eugen Leitl (eugen@leitl.org)
Date: Sun Jun 16 2002 - 04:52:44 MDT
A question to AI implementers: what do you think you could do with the
hardware architecture described below? I have my reasons for asking;
please humour me.
Boilerplate:
* substrate: 300 mm wafers, a modern embedded RAM process, wafer-scale
integrated (routing around defect dies done in software)
* memory grain size: depending on yield (80..90% of good dies), but
certainly below 1 MByte (possibly way below, ~100 kBytes)
* number of grains/wafer: depends on die size and structure size, but
~10^3 per wafer, or more.
* asynchronous, ultrawide (say, 256 bit or more) variant of
http://colorforth.com/25x.html (single instance, two shallow stacks,
ultrawide data stack, SIMD instructions on array of short integers). Very high
bandwidth, low latency.
(if you are not familiar with the peculiarities of stack computers, get
current via http://www-2.cs.cmu.edu/~koopman/stack_computers/ )
* no floats, no memory protection, no reentrant multitasking
* wafer-wide low-latency, high-bandwidth (~TBps/die) hardware
message passing, with dynamic remapping and routing around dead dies
(a high-dimensional analog of geographic switching)
* wafers cascadable via the above meshed switching. Total bandwidth a
fair fraction of a true crossbar.
* minimal cooperative multitasking OS, 2-4 kBytes. That's your redundancy
in each die.
* language: most naturally FORTH with SIMD VLIW extensions
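To make the stack-machine bullet concrete, here is a toy simulation. This is my own sketch, not the actual 25x design: it assumes 16 lanes of 16-bit unsigned integers (256 bits per cell, matching the "ultrawide" figure above), a data stack whose cells are whole vectors, a separate shallow return stack, and a single SIMD instruction (lanewise add) standing in for the full instruction set.

```python
# Toy model of a dual-stack machine with SIMD ops on short integers.
# WIDTH and MASK are illustrative assumptions, not 25x specifics.
WIDTH = 16          # 16 lanes x 16 bits = 256-bit cells
MASK = 0xFFFF       # results wrap to 16-bit unsigned, as short-int hardware would

class StackMachine:
    def __init__(self):
        self.data = []   # ultrawide data stack: each cell is a WIDTH-lane vector
        self.ret = []    # shallow return stack (unused in this tiny demo)

    def push(self, vec):
        assert len(vec) == WIDTH
        self.data.append(list(vec))

    def dup(self):       # FORTH DUP: duplicate top-of-stack cell
        self.data.append(list(self.data[-1]))

    def add(self):       # SIMD '+': pop two cells, push lanewise sum
        b, a = self.data.pop(), self.data.pop()
        self.data.append([(x + y) & MASK for x, y in zip(a, b)])

m = StackMachine()
m.push(list(range(WIDTH)))   # lanes 0..15
m.dup()
m.add()                      # lanewise doubling
print(m.data[-1][:4])        # -> [0, 2, 4, 6]
```

A real FORTH word like `: 2* DUP + ;` would compile to exactly this `dup`/`add` pair; the SIMD twist is only that each stack cell carries a whole vector of lanes.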
Question:
Does this completely weird you out, or do you think you could do
meaningful work on the above machine?