Re: Friendliness, Vagueness, self-modifying AGI, etc

From: Michael Wilson (
Date: Wed Apr 07 2004 - 02:03:15 MDT

I apologise if my remarks to you have appeared aggressive, Ben; despite
my best efforts the crushing mental pressure of existential risks
still causes me to tend towards extremism. To some extent this just
reflects how disappointed I am with myself for being so short-sighted
over Christmas.

Ben Goertzel wrote;
> I think that documents on the level of CFAI and LOGI are important
> and necessary, but that they need to come along with more formal,
> detailed and precise documents.

Neither document is a constructive account, as I understand it, for
three reasons: risk, time and expertise. The risk factor is the most
important; handing out detailed blueprints for building a seed AI is
dangerous even if they're seriously flawed. From experience I know
that trawling large numbers of papers for local solutions and
intelligent use of DE can patch a lot of holes. Dangerously competent
people can and will go ahead and implement a constructive version of
LOGI, and a constructive version of CFAI would make things worse by
giving them a broken version of Friendliness to soothe any lingering
doubts about destroying the world. Remember that this is exactly what
Eliezer wanted to do in 2001; before that he had the breathtakingly
irresponsible plan to implement an AGI with /no/ Friendliness at
all and just hope that objective morality existed and that the AGI
found it before doing anything terminal.

The other issues are that Eliezer doesn't have a lot of time and has
relatively little actual coding or architecture experience. You may
need to write a million words and throw them away to be a good
writer, but you certainly need to write a million lines of code and
throw them away to be an excellent programmer. I'm sure you know Ben
that effective AGI architecture requires absolute focus and tends to
absorb your every waking hour; the constant interference of financial
and personnel concerns with my ability to technically direct was one
of the factors that killed my first startup at the end of last year.

> There are a lot of ideas that *look* sensible on the abstract
> conceptual level of CFAI/LOGI, but don't actually correspond to any
> sensible detailed design.

Without going into detail, I found I had to bend but not break most
of LOGI to make it work. I also put in a lot of novel low-level
structure, some of which had trickle-up implications for the concept,
thought and deliberation layers. I was operating under the New
Bayesian Paradigm (tm) (though I wasn't sold on strict utility back
then), hacked things a bit for easy DE application and also for
efficient loose clustering based on the Destiny Star encrypted grid
computing platform.

> And then, one level down, there are a lot of ideas that look
> sensible on both the conceptual level and the detailed-design
> level, but don't actually work for reasons that are only seen in
> the course of experimenting with the implementation of the design.

There are quite a few things in LOGI and a lot of things in CFAI I
didn't get as far as implementing before sanity intervened, so I may
just have left out the hard bits. The only thing I had a huge amount
of trouble implementing and failed to get to work in any meaningful
fashion is CFAI's shaper networks. Admittedly I was trying to
generate pathetically simple moralities grounded in microworlds,
but still the concept looks unworkable as written to me.

> The only way to avoid this latter problem is to do a full formal,
> mathematical analysis of one's detailed design, but I doubt that
> is possible.

Certainly formal analysis of non-trivial software doesn't have a good
track record. I'm prepared to keep an open mind until Eliezer
publishes his revised cognitive theories though; there appears to be
some impressive mathematical grounding in progress. In the mean time
detailed cognitive competence verification is the watchword for AGI
design; for Friendliness I'm working on verifiably emergence-free
utilitarian descriptive calculus substrates.

> My worry is that, even if the successor to CFAI provides
> Friendliness principles that are not obviously flawed on the
> *conceptual* level, they'll still fall apart in the experimental
> phase.

If my limited progress in understanding Friendliness has revealed one
thing, it is that a verifiably correct theory puts fairly strict
constraints on the architecture (and even some on the code layer). I
don't think conceptual fudging will be necessary to implement it, but
of course in the face of the SIAI's track record it would be foolish
to ignore the possibility of inexplicable early results. In that case
it will presumably be back to the drawing board (with the requirement
of explaining not just what the earlier failure was, but also why we
made it).

> Thus my oft-repeated statement that the true theory of Friendliness
> will be arrived at only: A) by experimentation with fairly smart
> AGI's that can modify themselves to a limited degree.

This is very close to what I thought prior to getting some real
experience with non-traditional (ie usefully complex) AI. Basically
once I'd accepted that there was no escape from the logic of the
Singularity, I tried to help in any way possible. Unlike the average
SIAI volunteer I was used to producing 50,000+ lines of working,
tested code a month, had some experience of GOFAI and was used to
designing complex, asynchronous systems. When my company failed I
also had a lot of free time, a spare 40 gigaflop compute cluster and
frankly a motive to engage in obsessive avoidance behaviour. I
decided to try and produce some experimental data on LOGI and CFAI
to assist the SIAI research, while also developing relevant low-level
optimisation techniques and doing design-ahead for an emergency
last-ditch takeoff seed in case nanowar is about to break out.
Trying to gather data on obsolete techniques was mistake #1, failing
to implement a positive safety strategy for takeoff protection was
mistake #2, considering an illegal and dangerously tempting takeoff
plan was mistake #3, and using every directed evolution trick I could
research or invent in order to cut down development time was mistake
#4. I sincerely hope that relating this reduces the likelihood that
anyone else makes the same mistakes, as frankly I have nightmares
about someone else replicating this without Eliezer catching them in
time and convincing them how stupid they're being.

> B) by the development of a very strong mathematical theory of
> self-modifying AGI's and their behavior in uncertain environments
> My opinion is that we will NOT be able to do B within the next few
> decades, because the math is just too hard.

I'm not qualified to give an opinion on this; I haven't spent years
staring at it. I suspect that a lot of progress could be made if lots
of genius researchers were working on it, but you and Eliezer seem to
be it.

> However, A is obviously pretty dangerous from a Friendliness
> perspective.

ie an existential risk perspective. Developing positive-safety
takeoff protection is just difficult, not near impossible, and is
our duty as AGI researchers. I am not too worried about Novamente at
the moment, but you may well hire a bright spark or two who revises
the architecture in the direction of AI-completeness (I would've
volunteered, if you'd asked a few months back). I think everyone
affiliated with the SIAI would be a lot happier if you adopted a
draconian, triple-redundant and preemptive takeoff prevention policy.

> My best guess of the correct path is a combination of A and B.

I agree; I have been working on a new design that is mostly DE-free
with this in mind. However I don't know if anything useful will be
inferrable from wisdom tournaments occurring under such strict takeoff
protection. Still, it's worth a try, and there should be a lot of
useful low-level design results on optimising complex, partial,
overlaid Bayesian networks and fast reversible perception (achieving
stochastic-like results from non-stochastic algorithms).
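For what it's worth, the bookkeeping behind even a toy Bayesian network
fits in a few lines. Everything below (the Rain/WetGrass network, the
probabilities, the function name) is an illustrative assumption of mine,
nothing from the actual design; real partial, overlaid networks are far
messier, but the underlying arithmetic is still just Bayes' rule:

```python
# A minimal sketch of exact inference in a two-node discrete Bayesian
# network (Rain -> WetGrass). All numbers are made up for illustration.

P_rain = {True: 0.2, False: 0.8}                    # prior P(Rain)
P_wet_given_rain = {True:  {True: 0.9, False: 0.1},  # P(Wet | Rain)
                    False: {True: 0.2, False: 0.8}}  # P(Wet | no Rain)

def p_rain_given_wet():
    """P(Rain | WetGrass) by enumeration: Bayes' rule over both cases."""
    # Joint P(Rain = r, WetGrass = True) for each value of r.
    joint = {r: P_rain[r] * P_wet_given_rain[r][True] for r in (True, False)}
    evidence = sum(joint.values())  # marginal P(WetGrass = True)
    return joint[True] / evidence

print(round(p_rain_given_wet(), 4))  # 0.5294
```

Enumeration like this is exponential in the number of variables, which
is exactly why optimising larger, partially-specified networks is a
research problem rather than a weekend hack.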

I am also working on a commercial AI application primarily as the
most obvious way to raise cash for the SIAI project, but also in some
part due to the realisations I came to when doing design-ahead for
the Emergency Takeoff Device (tm). I'm thus working on an 'expert
system' (actually cut-down LOGI, but don't tell anyone :) for network
security (penetration testing and active defence) applications. The
field is red hot in VC circles right now and things are looking
fairly promising. Of course there's no way (and possibly no point)
to defend the Internet against a transhuman AGI, but widespread
deployment of these kind of systems might make it a lot more
difficult for a viral proto-AGI to get a foothold.

Footnote: embarrassingly I've just discovered that Daniel Dennett
pre-empted the material in my last post by about 14 years; it
could've been a rip-off of 'Consciousness Explained' if the book
had been slightly earlier in my reading list. This does tend to
happen from time to time when you read through five feet of AGI
relevant literature in the space of four months ;> Anyway, here's
a particularly relevant section;
'I have grown accustomed to the disrespect expressed by some of
 the participants for their colleagues in the other disciplines.
 "Why, Dan," ask the people in artificial intelligence, "do you
 waste your time conferring with those neuroscientists? They
 wave their hands about 'information processing' and worry about
 /where/ it happens, and which neurotransmitters are involved,
 but they haven't a clue about the computational requirements of
 higher cognitive functions." "Why," ask the neuroscientists,
 "do you waste your time on the fantasies of artificial
 intelligence? They just invent whatever machinery they want, and
 say unpardonably ignorant things about the brain. The cognitive
 psychologists, meanwhile, are accused of concocting models with
 /neither/ biological plausibility /nor/ proven computational
 powers; the anthropologists wouldn't know a model if they saw
 one, and the philosophers, as we all know, just take in each
 other's laundry, warning about confusions they themselves have
 created, in an arena bereft of both data and empirically
 testable theories. With so many idiots working on the problem,
 no wonder consciousness is still a mystery."
 All these charges are true, and more besides, but I have yet to
 encounter any idiots. Mostly the theorists I have drawn from
 strike me as very smart people - even brilliant people, with the
 arrogance and impatience that often comes with brilliance - but
 with limited perspectives and agendas, trying to make progress
 on the hard problems by taking whatever shortcuts they can see,
 while deploring other people's shortcuts. No one can keep all
 the problems and details clear, including me, and everyone has
 to mumble, guess and handwave about large parts of the problem.'
Put that chapter together with some choice cuts from 'What
Computers Still Can't Do' and you have a good account of why
AI hasn't made much progress. This may be for the best;
'mumbling, guessing and handwaving' about takeoff prevention and
Friendliness is a really bad plan. In any case, 'Consciousness
Explained' has the highest concentration of LOGI-relevant material
of anything I've read to date, so if anyone is having a hard time
understanding Eliezer's ideas I highly recommend it.
 * Michael Wilson
'Elegance is more than just a frill in life; it is one of the driving
 criteria behind survival.' - Douglas Hofstadter, 'Metamagical Themas'

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:46 MDT