Re: Syllabus for Seed Developer Qualifications [WAS Re: Some considerations about AGI]

From: Charles D Hixson (charleshixsn@earthlink.net)
Date: Wed Feb 01 2006 - 21:59:41 MST


On Wednesday 01 February 2006 03:57 pm, Kevin Osborne wrote:
> without putting a fire under yet another religious programming
> language 'discussion' of attrition, I have what for me is an important
> question in regards to AGI development and seed developer syllabus:
>
> what programming language are we going to code an AGI with?
>
> when answering this question I probably want to dismiss convoluted
> combinations of systems, specialist languages/compilers that may
> (already?) be developed for AGI purposes and focus on:
>
> what's the best language to develop the AGI workhorse code in?
>
> 'workhorse' here means the code that will bridge the gap from the
> not-so 'intelligent' systems we have now and be able to bootstrap a
> higher semantic-language/instruction-set that will be part of the
> first steps towards an AGI.
>
> Here's a hopefully not-too-biased critique-that-invites-critique of
> some of the choices as they stand:
>
> C: I'm separating this from C++ as, well, most of the crack C
> programmers I've worked with view C++ as some kind of leprous cousin;
> they're able to make a host of criticisms of C++ in contrast to C but
> I'm not one to repeat them succinctly I think. In summary;
> positives:
> - it's the fastest; without question. It also runs on every board on
> the planet. There are a host of _great_ tools and compilers from the
> likes of GCC, Sun, Intel, IBM, HP etc. If you want to write your own
> OS kernel then you'll be doing it in C.
> - the #1 choice for RTOS. if you want it to run real-time, it'll be in
> C. I've personally had exposure to the VxWorks and Nucleus hard
> real-time systems and a soft real-time Linux (Monta Vista). Even if we
> don't end up with
> kernel=observer/ego/consciousness/insert-your-term-here, going
> real-time seems to be an intuitive requirement for replicating a range
> of faculties. That said, real-time code is very limited in what it can
> do; it runs out of puff pretty quickly when climbing the OSI stack.
> - Macros are a great language feature, and can provide some of the
> extensibility and run-time switching needed.
> - great debugging tools and lint collectors like Purify that can
> pretty much guarantee against a number of errors like memory leaks and
> overwrites
> negatives:
> - plumbing code. it just plain sucks to have to call memset, malloc
> and free everywhere. #ifdef may be damn useful but for me is ugly as
> hell
> - complexity/productivity. add all the plumbing code in; the need to
> track and free all your resources; macros that obfuscate recursive &
> often cyclic function calls to n levels; the need to dick with your
> defines both in the code and in the compiler, and the flag mess that
> ensues when you are linking against every man and his dll and you end
> up with a language great for low-level tweaking of the cpu instruction
> set and a morass once it scales that eats huge amounts of programmer
> time dicking with the maintenance of flags, variables and
> linker/compiler bleats. this inevitably sucks large amounts of time
> away from higher-level functional work, especially during integration
> with other people's code. I'd posit it as a given that we are going to
> have to write more higher-level functional code than for any other
> project ever attempted. To code the capabilities and faculties of a
> smart human is going to be a ridiculously huge endeavour. And once our
> boy is smart enough he'll be rewriting himself bigger and bigger while
> rewriting our code smaller and smaller. we just need to write
> something big enough so that he has the capability to do so.
>
>
> C++: the current language of choice for all large critical systems
> worldwide. The OSes are written in C, but the apps are in C++. When
> your plane lands, it's C++; when the latest NASA space-gadget bleeps,
> it's C++. The vast majority of apps that run global infrastructure are
> in C++.
>
> positives:
> - first things first; every VM and interpreter of note for the
> bytecode/interpreted languages is written in C++. A Java programmer
> bitching about C++ is like a hand bitching about its forearm. If you
> want to hack your own special JIT or JVM, you'll need to be doing it
> in C++
> - Most of the RTOS vendors provide C++ APIs; so real-time application
> development is available (for a cost)
> - There's a larger body of support libraries available than for C when
> it comes to higher-level functions; the STL is a great example
>
> negatives:
> - as noted in one of the posts above, the toolset is awful in
> comparison. g++ is gcc's ugly cousin. linker errors, especially with
> STL code, are even more convoluted than C.
> - complexity/productivity. template soup is a great example. and yes,
> it still has pointers and memory housekeeping requirements, and yes,
> any plumbing work that is programming-language specific is a negative.
> Yes, some programmers thrive in this environment; however it is
> definitely not competing when it comes to the RAD qualities of say,
> Perl or (gasp)VB.
> - Microsoft are dumping C++ like a brazilian baby. For all their
> faults, MS are a cluey bunch, especially when it comes to developers.
> They've got to have seen something plenty nasty in the bathwater to
> eat the cash-outlay gobstopper that is C#/.NET
>
> Java: The contender to replace C++ pretty much; the guys behind the
> language came out and said that they created Java to put the kiss of
> death to Bjarne's creation. This puppy is running some crucial
> high-load apps now, especially in finance. Also becoming the app layer
> of choice on mobiles though I doubt that's relevant here.
>
> positives:
> - a memory managed language; less programmer time spent playing
> nursemaid to an incomplete toolchain.
> - APIs/Libraries/Tools. The core API is simply enormous; if you want
> to do something, think of a class name that fits, and it'll probably
> be in the VM already. What isn't in there yet is probably either in
> the JSR's, sourceforge or IBM. Ant and JUnit(stack, incl things like
> HTTPUnit & JCoverage) are truly revolutionary in the tool space. They
> make makefiles and test stubs look archaic.
> - developers. every tertiary institution on the planet is pumping them
> out like sperm. we can debate their veracity, but the simple fact is
> most coders (of _any_ language) couldn't give a rats' about coding an
> AGI so having a deeper resource pool has got to help
> - reflection. run-time introspection, querying the classloader etc.
> gives more flexibility than most strong/static typed languages
> - remoting. RMI/EJB have their issues, but you have a distributed
> systems stack in the core API. CORBA for C/C++ is an inferior
> (supported) subset.
> - price. it's all pretty much free as in beer, and free as in open
> source otherwise, apart from the spec. this _does_ matter; thirty
> C/C++ VxWorks/Metrowerks/ADS developer licenses would sting a pretty
> penny.
>
> negatives:
> - slowness. Now, this is historically overblown, especially in
> relation to the original GUI (remember applets? anyone?) and I/O
> implementations which have either been superseded or obsoleted. Having
> had a look at some of the Sun source code, their C/C++ programmers are
> kickass (think Solaris). They've spent years refactoring every
> bottleneck and apples-for-apples underperformer in comparison to STL
> C++ until the difference is often negligible (check their marketing
> 'fact'oids). And for performance over a longer run, the application
> servers with their hacked JIT's and pre-loaded code means that Java
> gets quicker the longer you run it (discounting any leaks, which are,
> sadly, still present, though much reduced in comparison to early
> JDKs). Another thing is that slowness seems to pretty much be a
> non-issue where AGI development is concerned; by the time we finish
> hacking at the thing the hardware and tools will be generations
> better. You either need real-time; or you let Moore's law do your work
> for you. My Java apps from 1998 fly on newer RAM-stacked hardware.
> - strong+static typing. my feeling is that writing on-the-fly runtime
> customizable code is going to be needed to replicate what a brain can
> do. Reflection helps but isn't enough; Java is a little too
> monolithically structured when compared to something like Lisp; the
> code is very homogenous, and doesn't seem to have the agility to adapt
> well. I think this is somewhat intended to stop migrating VB
> developers from deciding they now want to be Perl programmers but it
> doesn't aid in dexterity.
>
> .NET/C#
> you can pretty much replicate everything said for Java here as it's a
> flat ripoff; that's why I think Sun had no qualms ripping off ASP and
> calling it JSP.
> positives: they've learned their lessons from Java's mistakes; most
> things are less broken in the IL and the CLR. It's early days though
> and some of the apps I've seen behave atrociously.
> negatives: price; no option for CLR hacking. And it's got to be said,
> MS are evil bastards; try being a chair in Steve Ballmer's office,
> let alone Netscape, Sun or Real.
>
> Perl
> OK I'll state my bias here; I've clearly coded in most of the others
> previously mentioned but Perl took my commercial programming virginity
> - and no, not doing CGI. Perl6/Parrot, while unfinished, seem to me to
> be pretty damn compelling. Once they have Parrot out with plugins for
> Lisp/Haskell/Java etc they'll have a pretty damn decent alternative to
> .NET. and having regex support within the syntax is just plain right.
>
> positives:
> - libraries. CPAN is huge; there's a module for most everything you
> can find in the Java API and plenty else besides
> - speed. competes tidily with C++, especially in batch processing.
> - typing. weak+dynamic. Perl doesn't care what it is or where it came
> from or what you're trying to do with it. 'use strict' can tighten the
> belt if needed for debugging. the auto/dynaloader magick allows
> run-time composition and execution of completely new
> functions/classes. The things the monks and their kin can do with this
> language are spectacular in a very scary way
>
> negatives:
> - oo. people knock their ISA implementation as a bolt-on. has always
> worked fine for me though. but it's definitely not as structured as
> say Java
> - complexity. weak+dynamic gives bad programmers license to kill. some
> perl code is unmaintainable. some wizards also take perverse pleasure
> in writing incredibly obfuscated code, unmatched outside of the
> functional languages I expect
> - toolset. perl is, well, fractured. it's a bit all over the place.
> you can get most anything to work, but just about everything is
> idiosyncratic as hell. Perl6/Parrot should put some kind of nail in
> this, but you never know with these crazy Perl nuts
>
> Lisp
> I have no sodding idea about Lisp apart from doing some reading
> recently and downloading a common Lisp compiler. That said, a good
> portion of the brightest minds in programming reserve a special status
> for Lisp.
> positives: functional/macro language right? good for self-evolving code
> negatives: Lisp already failed as the AI coding language of choice.
> Quibble all you like but it's 0 for 1, and AI Winter and the decline
> of Lisp seem intertwined. Common Lisp doesn't even come close to
> matching the breadth of the bytecode-based APIs.
>
> Candidates dismissed for discussion, and why:
> (these languages seemed to me to have no standout qualities that
> belied their shortcomings; and basically they just don't compete in
> the same league as the heavyweights)
> Pascal/Delphi etc: subsets of C/C++
> Python/PHP/Ruby: subsets of Perl/Java, with piss all supporting
> libraries for non-web applications in comparison
> Haskell or functional-language-of-choice: Useful past the bootstrap
> level, and mixed in via say Parrot could be useful; but underweight
> for workhorse work in terms of developer-space footprint
> ADA/Fortran archaic-failed-language-of-choice: nothing better than Lisp.
> Assembler/Machine language: all great, until you leave x86 to go to
> Cell chips; then you're stuffed

OCaml is one plausible candidate. Pretty fast, compilable & interpretable.
It's an Object Oriented dialect of Caml which is a dialect of ML. SML
doesn't have lots of nice features, like variables, but OCaml (and Caml?)
slips them back in. Also, ANSI C can be embedded into the code. (So the
docs say. I'm not familiar enough to judge.)

Have you looked at Alice, from Saarland University? Still in the early
stages, but it looks like it supports parallelism in a quite interesting
fashion. Check out "Promises". Nice!
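
For anyone who hasn't met the idea, here is a rough sketch of what a
promise looks like. This is Python rather than Alice, and the Promise
class, its fulfill/future methods, and the demo function are names made up
for this illustration; they are not Alice's API. The only point is the
shape of the thing: one thread hands out a slot, another fills it in
exactly once, and anyone who needs the value waits until it shows up.

```python
import threading


class Promise:
    """A write-once slot: one side fulfills it, the other waits on its future.

    Hypothetical illustration only; not the Alice library interface.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._done = threading.Event()
        self._value = None

    def fulfill(self, value):
        """Supply the value exactly once; a second call is an error."""
        with self._lock:
            if self._done.is_set():
                raise RuntimeError("promise already fulfilled")
            self._value = value
            self._done.set()

    def future(self, timeout=None):
        """Block until the value is available, then return it."""
        if not self._done.wait(timeout):
            raise TimeoutError("promise not fulfilled in time")
        return self._value


def demo():
    """Fulfill a promise from a worker thread and read it from the caller."""
    promise = Promise()
    worker = threading.Thread(target=promise.fulfill, args=(6 * 7,))
    worker.start()
    value = promise.future(timeout=5)  # waits here until the worker fulfills it
    worker.join()
    return value
```

In this sketch the wait is an explicit call to future(); a language with
built-in support can hide that wait behind ordinary use of the value.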

Also: If you think of "subsets of some other language" as not worth
considering, please think again. Languages can be made much better by
removing some feature that's just too dangerous.

However, the real problem is: "Where are you going to get your programmers?".
This is why so many projects settle on C or Java, despite all their many
problems. Even when you're paying good money most people are reluctant to
learn a new language. (I'm currently finding myself reluctant to learn OCaml
for a somewhat similar reason. It looks like a lot of effort for an
uncertain return.)

If you think that Python doesn't have any libraries outside of web
programming, you've never looked at the Python libraries. (That's true of
Ruby also, but to a lesser extent.) Also neither Python nor Ruby is very
much like Java. (I rather *like* coding in Python, and Ruby is almost a
joy.) Unfortunately, they're a bit slow. If you use them, figure you'll need
twice the CPU cycles to reach breakthrough as with an efficient language,
like C, D, or SML. (Also, why include Perl? It deserves to be dismissed for
the same reasons, and to the same degree, as Python and Ruby.)
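
As a small illustration of the library point, here is a sketch that leans
only on non-web modules from a current Python 3 standard library (sqlite3,
statistics, fractions). Which modules ship with any particular Python
release is an assumption you should check, and the summarize and exact_sum
helpers and their sample inputs are made up for this example.

```python
import sqlite3
import statistics
from fractions import Fraction


def summarize(samples):
    """Store samples in an in-memory SQL table, then compute summary stats."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (value REAL)")
    conn.executemany("INSERT INTO samples VALUES (?)", [(s,) for s in samples])
    rows = [row[0] for row in
            conn.execute("SELECT value FROM samples ORDER BY value")]
    conn.close()
    return {"median": statistics.median(rows), "mean": statistics.mean(rows)}


def exact_sum(parts):
    """Add (numerator, denominator) pairs with exact rational arithmetic."""
    return sum((Fraction(n, d) for n, d in parts), Fraction(0))
```

For instance, summarize([3.0, 1.0, 2.0]) goes through an embedded database
and a statistics module, and exact_sum([(1, 3), (1, 6)]) adds fractions
without rounding, all with no web framework in sight.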

If you think Ada doesn't have its own unique strengths, you've never used it.
It is, for the right kind of problem, the ideal tool. But nobody knows the
language, and doing lots of simple things takes huge amounts of code.
(Also, it doesn't have a garbage collector. That can be dealt with, but it's
an extra hassle.)

No comments on C#, but if you don't consider programmer availability, then I'd
plunk clearly for D (unless you go for something experimental like Alice).
If you do... SIGH ... it's pretty much GOT to be C, C++, or Java. Perhaps
the gcj subset of Java (so it's actually compilable to native code).
