RE: Is generalisation a limit to intelligence?

From: Ben Goertzel (
Date: Sun Dec 03 2000 - 06:42:36 MST


> Information theoretic approaches have already demonstrated much of what is
> being questioned, or at least insofar as finite-state machines are
> concerned. Generally speaking, given a finite amount of memory and an
> arbitrarily long sequence of data (generated by any finite state machine
> no matter how complex), it is possible to attain the minimum possible
> predictive error rate using universal prediction schemes.

Yes, I am well aware of these theorems.

However, this kind of "in principle" calculation is not very useful in
practice, now is it? Only in very narrow domains, like text compression...

Here is an excerpt from an e-mail on an internal Webmind Inc. list (the one
devoted to philosophy rather than practical stuff), on a related topic ...

What do I mean when I say I believe that EC (evolutionary computing, GP) is
enough to create a thinking machine, in principle?

Here we go...

You have to posit a system with a goal(s) (e.g. survival, mating)

You have to assume it has a way to assess the degree to which these goals
are satisfied at a given time, at least approximately

You have to assume it has a memory of what programs it was running at each
moment in time, and of the extent to which it achieved its goals. [A memory
of what stimuli it was receiving is optional, but can obviously enhance
efficiency of learning.]
Then, if the system has enough time to learn, EC (or more simply, Monte
Carlo search over program space) will cause the system to arrive at a
program that can achieve its goals.
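
To make the "Monte Carlo search over program space" idea concrete, here is a
minimal sketch under stated assumptions. The four-instruction toy language
(OPS), the scoring function goal_satisfaction, and the target behaviour
2*x + 2 are all hypothetical choices made for illustration; none of them
comes from the original e-mail or from any Webmind code. The sketch only
shows the three ingredients listed above: a goal, a way to score how well
the goal is met, and a memory of which programs were run and how they scored.

```python
import random

# Hypothetical toy instruction set: each instruction maps an integer to an integer.
OPS = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

def run(program, x):
    """Apply the instructions of `program` to x, in order."""
    for name in program:
        x = OPS[name](x)
    return x

def goal_satisfaction(program, cases):
    """Score a program: 0 means every case is matched, lower is worse."""
    return -sum(abs(run(program, x) - want) for x, want in cases)

def monte_carlo_search(cases, max_len=4, trials=20000, seed=0):
    """Sample random programs blindly and keep the best one seen so far."""
    rng = random.Random(seed)
    names = sorted(OPS)
    memory = []  # every (program, score) pair the system has tried
    best, best_score = None, float("-inf")
    for _ in range(trials):
        length = rng.randint(0, max_len)
        program = [rng.choice(names) for _ in range(length)]
        score = goal_satisfaction(program, cases)
        memory.append((program, score))
        if score > best_score:
            best, best_score = program, score
        if best_score == 0:  # goal fully achieved, stop searching
            break
    return best, best_score, memory

# Assumed goal for the demo: behave like f(x) = 2*x + 2 on a few inputs.
cases = [(x, 2 * x + 2) for x in range(-3, 4)]
best, best_score, memory = monte_carlo_search(cases)
print(best, best_score, len(memory))
```

With four instructions there are 4**n distinct programs of length n, so the
expected number of blind samples grows exponentially with the length of the
shortest program that meets the goal. The toy goal here needs only two
instructions, which is why the search finishes quickly.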

The assumptions one needs to make an EC-based thinking machine possible are

-- fast processors
-- huge memory
-- a very very long lifespan
-- a relatively benign environment (the latter so the system doesn't die in
the early stages of its very slow and stupid learning process)

Since none of these conditions obtain, we need something much more
specialized and more complicated than EC inside our thinking machine...


> An optimal
> prediction scheme can be algorithmically generated and the error rate
> figured for any data generated by finite-state machinery. <much lengthy
> theory omitted> In short, it has been demonstrated that for any finite
> state machine, it is possible to ascertain the minimum possible predictive
> error rate for any data sequence given any finite amount of memory.

Yes, but this method will not perform adequately
under the conditions under which real intelligent systems have
to operate.

> An
> optimal prediction scheme will typically approach the theoretical error
> limit quite fast. However, sub-optimal prediction schemes, nonparametric
> or unknown models, and similar types of situations may approach their
> theoretical error rates quite slowly.

The problem is that the prediction schemes that are "optimal" under standard
mathematical assumptions are NOT optimal given the real-world conditions
under which organisms operate.

> It would be trivial for a
> computer today to calculate error rates for any optimal universal
> predictive scheme. These would seem to answer the above question and
> quite a few others I've seen on this thread.

Yes, it answers the question under the glaringly false assumption that minds
are general optimal predictive schemes.

> Among the interesting things that have been shown with respect to this is
> that humans are quite apparently finite state-machines. The first example
> of this was Hagelbarger at Bell Labs (and later Claude Shannon), who first
> demonstrated that humans are apparently unable to generate truly random
> sequences of any kind; computers using information theoretic prediction
> algorithms were able to successfully predict the behavior of humans
> intentionally attempting to generate random data, with an error rate many,
> many orders of magnitude below what would be expected if the human
> participants were actually generating random data.

These experiments certainly do not demonstrate that humans are finite-state
machines. There are many many other explanations for this data. I won't bore
you by reciting them.
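
For readers unfamiliar with the kind of predictor the quoted passage refers
to, here is a minimal sketch, assuming a simple order-k context model: it
guesses each bit from whichever bit has most often followed the same
preceding k bits. This is not Hagelbarger's or Shannon's actual design, and
the function name predict_sequence and its order parameter are hypothetical.
It only illustrates that a small amount of memory can exploit patterns in a
sequence that is not truly random; it says nothing either way about whether
the source of the sequence is a finite-state machine.

```python
from collections import defaultdict

def predict_sequence(bits, order=2):
    """Guess each bit from the previous `order` bits.

    Returns the fraction of bits guessed correctly. The predictor learns
    online: it only uses counts gathered from earlier positions.
    """
    if not bits:
        return 0.0
    # counts[context] = [times a 0 followed, times a 1 followed]
    counts = defaultdict(lambda: [0, 0])
    correct = 0
    for i, actual in enumerate(bits):
        context = tuple(bits[max(0, i - order):i])
        zeros, ones = counts[context]
        guess = 1 if ones > zeros else 0  # ties default to 0
        if guess == actual:
            correct += 1
        counts[context][actual] += 1  # update memory after guessing
    return correct / len(bits)

# A strictly alternating sequence, an exaggerated version of the tendency
# to switch too often when trying to "look random".
alternating = [0, 1] * 50
print(predict_sequence(alternating))
```

On a truly random fair-coin sequence no predictor of this kind can be
expected to do better than 50% in the long run, so accuracy well above that
level indicates exploitable structure in the input.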

> I've actually been using information theoretic approaches in my engines
> for several years now, and with generally superb results across many
> fields. It has been widely rumored that Claude Shannon made his
> fortune by
> "working" the stock market (as an aside, a couple years ago I calculated
> that running an optimal predictive engine against the entire NASDAQ in
> realtime, based on the best engine I had produced to date, would require a
> machine capable of 10^11 Flops sustained. The amount of memory was
> reasonably attainable though.) I've found it odd that information theory
> is routinely overlooked in AI research since it provides such a solid
> foundation for the mathematical basis of the topic.

Information theory is actually NOT a solid basis for AI. It's useful for
simple aspects of AI, mostly on the perceptual side.

But let's face it -- the key properties of the human mind are all about
dealing with lack of memory and processor speed. The distinction between
STM and LTM, the balance between reason (a thorough analysis method that's
resource-intensive, when applied across a broad scope) and intuition
(cheaper but less accurate), and so forth. Most of what makes real minds
interesting is NOT about optimal prediction or modeling, but about pretty
good ways of achieving pretty good intelligence within very limited space
and time resources. This is a whole different story from information theory.

For example, in the area of computational linguistics, Deniz Yuret's
excellent MIT PhD thesis from a few years back uses information theory to
model language ("lexical attraction" he calls it). All well and good. It
doesn't help you deal with the translation of language into meaning. I think
I know how to do the latter, but not using information theory explicitly....

I mean, how do you construct a mind given current computational resources
and real-time learning constraints, inspired primarily by information
theory? You don't have to tell me all the details, just give me the broad
outline.
> I am currently working on putting together a website with a lot of
> the theory and actual application of my work, quite a few parts of which
> have been applied in the commercial sector.

I look forward to reading your stuff!

But still, a deep yet unrealistic general theory, plus some narrow-domain
cool applications, does not make a viable theory of mind-construction or
mind-analysis in the real world.
