Re: Threats to the Singularity.

From: Stephen Reed
Date: Thu Jun 13 2002 - 21:59:57 MDT

On Thu, 13 Jun 2002, Samantha Atkins wrote:

> >>Yes, if a deductive inference engine and symbolic knowledge representation
> >>repository were essential for Seed AI, then I agree that less than 20K
> >>lines of java code would be required, especially if a relational database
> >>were the object store.
> >
> > A very nice, efficient, 1-machine AGI-friendly data representation framework
> > can be done in 5K-10K lines of C++.
> It is impossible to evaluate such a statement without knowing
> what you have in mind as the requirements of this subsystem. Of
> course, it is really meaningless to talk about LOC anyway as the
> smallest in LOC is not necessarily either the best or the
> simplest to implement.

Software engineering research over the last 20 years shows that an
accurate estimate of a project's lines of code, made in advance of its
implementation, is the best single predictor of the total project effort.
Of course, the issue then becomes how to achieve an accurate estimate
of lines of code. Having had a long and varied career in medium and large
enterprise systems development and project management, I welcome a
discussion among AI software implementers here on deciding which
features are required for Seed AI and how much effort each feature
will take.
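
As an illustration only, and not something taken from this thread: one
widely published way of turning a LOC estimate into an effort figure is
Basic COCOMO (Boehm, 1981), where effort in person-months is a * KLOC^b.
The sketch below assumes that model and its standard published
coefficients; the function name and the choice of "organic" mode are
hypothetical, and the sample sizes are simply the 20K, 50K and 200K
figures mentioned in this discussion. The results are rough orders of
magnitude, not predictions for any particular Seed AI design.

```python
# Basic COCOMO (Boehm, 1981): effort = a * KLOC**b, in person-months.
# The (a, b) pairs are the standard published coefficients per project mode.
COCOMO_MODES = {
    "organic": (2.4, 1.05),        # small team, well-understood problem
    "semi-detached": (3.0, 1.12),  # mixed experience, moderate constraints
    "embedded": (3.6, 1.20),       # tight hardware/operational constraints
}


def estimate_effort_person_months(lines_of_code, mode="organic"):
    """Return a Basic COCOMO effort estimate in person-months.

    `lines_of_code` is the estimated delivered source size; `mode` picks
    one of the three Basic COCOMO project classes. This is a hypothetical
    helper for illustration, not a calibrated estimator.
    """
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    a, b = COCOMO_MODES[mode]
    kloc = lines_of_code / 1000.0
    return a * kloc ** b


if __name__ == "__main__":
    # Sizes taken from the figures quoted in this thread.
    for loc in (20_000, 50_000, 200_000):
        effort = estimate_effort_person_months(loc)
        print(f"{loc:>7} LOC -> {effort:7.1f} person-months (organic mode)")
```

Because the exponent is greater than 1, doubling the code size more than
doubles the estimated effort, which is one reason the accuracy of the
initial LOC estimate matters so much.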

I give the Cyc lines-of-code example because some programmers can
appreciate the effort required from that figure more readily than from
a statement that it is a certain large number of staff-years. For
example, I know that I can build a self-designed software system in Java
of about 25,000 lines of code in a year, working furiously part time, if
I understand the technical obstacles in the project well. Enterprise
code-writing productivity
rates are much lower because the total system cost includes the years of
maintenance programming in which most of the effort is investigation and

In the case above I intentionally left vague what the requirements for a
minimal symbolic KR and deductive inference software system might be, with
the sole constraint that it not need do everything that Cyc now does.

> >
> >>My question to you then would be: Given a budget of 50K LOC, just what
> >>behaviors would you require the system to have? Or if your base system
> >>merely accepts knowledge for the next layer up, how many person-years of
> >>effort at that next layer would be required to achieve Seed AI, and what
> >>behaviors would be taught to the system with this budget.
> >>
> >
> > I think a complete Novamente could be compressed into 50K lines of C++, at
> > significant cost in code comprehensibility and maintainability. Not a path
> > we're likely to take though, I'd prefer 200K lines of *good* code ;->
> >
> What on earth is all this LOC talk about? I haven't seen the
> like since we used to brag about getting tiny Basic in less than
> 2K bytes of machine code.

See for example the web page:

which describes a software system cost estimation process for NASA.

The LOC metric was in widespread use well before the advent of
microcomputers, which is why you are confused about its use here.


Stephen L. Reed                  phone:  512.342.4036
Cycorp, Suite 100                  fax:  512.342.4040
3721 Executive Center Drive      email:
Austin, TX 78731                   web:
         download OpenCyc at
