RE: anonymous collaborative filtering

From: Ben Goertzel
Date: Tue Sep 24 2002 - 16:56:12 MDT


Actually, Webmind Inc. had two different applications analyzing financial
message boards, for two different purposes.

The Webmind Market Predictor looked for statistical patterns relating

-- the words, phrases and concepts expressed in text (daily news,
messageboard posts, etc.) on days (n-k,...,n-1)
-- the prices of various financial instruments on days (n-k,...,n-1)
-- the price of a target financial instrument on day n
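
As a rough illustration of the data layout described above (not the actual Webmind Market Predictor code), the three ingredients can be arranged as supervised examples: features from days n-k..n-1, label from day n. The function name, the feature-key format, and the use of raw term counts are all assumptions made for this sketch.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_examples(
    term_counts: Sequence[Dict[str, int]],
    prices: Sequence[Dict[str, float]],
    target: str,
    k: int,
) -> List[Tuple[Dict[str, float], float]]:
    """Pair text and price features from days n-k..n-1 with the target price on day n.

    term_counts[d] holds hypothetical per-day counts of words/phrases/concepts;
    prices[d] maps instrument symbols to that day's price.
    """
    if len(term_counts) != len(prices):
        raise ValueError("term_counts and prices must cover the same days")
    examples = []
    for n in range(k, len(prices)):
        features: Dict[str, float] = {}
        for lag in range(1, k + 1):
            day = n - lag
            # Text features observed `lag` days before the target day.
            for term, count in term_counts[day].items():
                features[f"text:{term}@-{lag}"] = float(count)
            # Price features observed `lag` days before the target day.
            for symbol, price in prices[day].items():
                features[f"price:{symbol}@-{lag}"] = price
        examples.append((features, prices[n][target]))
    return examples
```

Any statistical learner could then be fit to these (features, label) pairs; which learner the original system used is not stated here.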

In fact, we did not find messageboard posts to be of significant predictive
value in the markets on which our system was most successful. The only text
source we used consistently was daily news. That financial prediction
system is now in use by Advanced Prediction Corp., based in Greenwich,
Connecticut; I have no connection with it at this time. It seemed to work
well, but in late 1999 we shifted our business model away from financial
prediction to Internet information retrieval (an error, in hindsight, but it
seemed to make sense at the time).

For those who haven't heard my tale of the end of Webmind Inc. already, see or

The Webmind Classification System, a document categorization engine, was
used in many different contexts by different customers. We were working on a
deal with Yahoo! Finance to use it to categorize messageboard postings into
different categories (but our stuff never made it onto the Yahoo! site). A
company called Netcurrents was using it to survey messageboards and extract
information pertinent to particular companies. For instance, if the CEO of
HP subscribed to Netcurrents, he'd get a special password-protected Web page
to log onto, which would contain

-- pointers to everything said about him and his company on the
messageboards that day
-- a summary of the "sentiment" of messages (e.g. pos. or neg. about
earnings, pos. or neg. about management, etc. etc.)

Doing automated classification of news articles by topic (as we did for some
customers, such as Screaming Media in New York) is relatively easy; doing
automated classification of messageboard rants by sentiment was a lot
harder, but Netcurrents and Yahoo! Finance did not require anywhere near
100% accuracy.
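
To make the "sentiment summary" idea concrete, here is a deliberately crude sketch, assuming a keyword-lexicon approach that is NOT what the Webmind Classification System did (it used statistical and machine learning methods). The lexicons, aspect names, and function names are hypothetical. It shows one plausible reason imperfect accuracy could be acceptable: the customer sees per-aspect counts over many messages, not a verdict on each one.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Hypothetical toy lexicons, for illustration only.
ASPECT_TERMS = {
    "earnings": {"earnings", "revenue", "profit"},
    "management": {"ceo", "management", "board"},
}
POSITIVE = {"great", "strong", "beat", "good"}
NEGATIVE = {"terrible", "weak", "miss", "bad", "fire"}


def classify(message: str) -> Dict[str, str]:
    """Label each aspect mentioned in the message as 'pos' or 'neg'."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score == 0:
        return {}  # no clear sentiment; leave the message out of the summary
    label = "pos" if score > 0 else "neg"
    return {aspect: label for aspect, terms in ASPECT_TERMS.items() if words & terms}


def summarize(messages: Iterable[str]) -> Dict[str, Counter]:
    """Count pos/neg messages per aspect across a day's postings."""
    summary = {aspect: Counter() for aspect in ASPECT_TERMS}
    for message in messages:
        for aspect, label in classify(message).items():
            summary[aspect][label] += 1
    return summary
```

A rant full of sarcasm would fool this immediately, which is the sense in which sentiment is harder than topic classification.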

In my current AI project (Novamente), I am not doing any
text analysis stuff at the moment. It's a bit of a digression from the path
to real AI, in the sense that the only "real" way to do text analysis is to
have a system that truly understands linguistic terms via their (direct or
inferential) grounding in its own experience. On the other hand, from
building these apps and doing other language and text processing research,
we gained a lot of understanding of how language understanding and
production relate to other cognitive functions.

For example, we worked out a very neat reduction of grammar parsing to
probabilistic logical unification (building on the work of many others in
"unification feature structure grammars"), which basically shows that, if
you've got

a) probabilistic logical unification
b) the ability to learn fairly simple schemata for reordering lists

then you've got full syntactic capability. This is not mathematically deep,
but it was useful to work it out in a very detailed way, seeing that it
works pragmatically on various examples etc. However, this was R&D stuff
and neither of the two above-mentioned products used our unification feature
structure grammar parser, because it was too slow to meet real-world
response times. Instead we used simpler methods based on statistical
analysis, machine learning categorization, context-free grammar parsing, and
some simple probabilistic inference.
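
For readers unfamiliar with unification over feature structures, the core operation can be sketched as below. This is plain, non-probabilistic unification over nested dictionaries with atomic values, so it covers neither the probabilistic part of ingredient (a) nor the list-reordering schemata of (b), and it is not the Webmind parser. The `unify` name and the dictionary representation are assumptions for illustration.

```python
from typing import Any, Optional


def unify(a: Any, b: Any) -> Optional[Any]:
    """Return the merged feature structure, or None if a and b clash.

    Feature structures are nested dicts; leaves are atomic values.
    Neither input is modified.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy; nested dicts are re-merged below
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is None:
                    return None  # conflicting values somewhere under this key
                result[key] = merged
            else:
                result[key] = b_value
        return result
    # Atomic values unify only when they are equal.
    return a if a == b else None


# A verb that wants a singular subject, and two candidate noun phrases.
wants = {"cat": "NP", "agr": {"num": "sg"}}
singular_np = {"cat": "NP", "agr": {"num": "sg", "per": 3}}
plural_np = {"cat": "NP", "agr": {"num": "pl", "per": 3}}
```

Here `unify(wants, singular_np)` succeeds and carries the person feature along, while `unify(wants, plural_np)` fails on the number clash. A probabilistic version would attach a strength to each match instead of a hard success or failure.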

My work on computational finance in the period 1997-1999 taught me that
there is a LOT of structure out there in the world -- even the human
socioeconomic world -- that humans do not perceive. Of course, this is old
knowledge in the financial world, which is why such a huge percentage of the
trading going on out there is program trading. But it's interesting to get
a feel for such things at first hand.

This underlies my intuition that, in the future, an AGI with human-level but
nonhuman intelligence will be able to clean up on the financial markets. We
may see laws emerging banning such activity, at some point. Though these
laws may only hold for a short time, if the hard takeoff idea is right and
human-level intelligence is rapidly followed by dramatically superhuman
intelligence.

-- Ben G

> -----Original Message-----
> From: []On Behalf Of mike99
> Sent: Tuesday, September 24, 2002 4:33 PM
> To:
> Subject: RE: anonymous collaborative filtering
> From: Simon McClenahan
> ...
> > IIRC, WebMind had an application that parsed the stock message boards to
> > determine the general consensus on whether people thought a stock was a
> > good buy or sell. A system that can detect arbitrary ideas from many human
> > sources seems to me like a general pattern recognition problem rather
> > than a specific pattern recognition such as "buy" or "sell".
> >
> > cheers,
> > Simon
> Could WebMind identify when stocks were being puffed up by multiple
> messages from dummy accounts being used by single individuals trying to
> "pump then dump" a stock? Some teenagers were caught doing this and,
> because of their age, actually got away with keeping some of their
> ill-gotten gains and without doing any jail time.
> Michael LaTorra
