RE: SIAI's flawed friendliness analysis

From: Ben Goertzel
Date: Sat May 10 2003 - 06:23:12 MDT

> What is dangerous is that someone who believes that "AIs just do what
> they're told" will think that the big issue is who gets to tell AIs what
> to do. Such people will not, of course, succeed in taking over the world;
> I find it extremely implausible that any human expressing such a goal has
> the depth of understanding required to build anything that is not a
> thermostat AI. The problem is that these people might succeed in
> destroying the world, given enough computing power to brute-force AI with
> no real understanding of it.

To me, this paragraph indicates a potentially dangerous naivete about human
individual and group psychology.

My concerns include:

There are many less extreme and more rational views in the conceptual
vicinity of "AIs just do what they're told" -- such as "real AIs will be
autonomous yet heavily influenceable, just like human beings."

Human nature is complex and contradictory. Certainly, a deep understanding
of cognition and AI design could plausibly be compatible in the same human
with a shallowness on moral and futurist issues.

What about a group of people, where there's a team of scientists who
understand mind and AI but have decided to cede moral and futurist judgments
to their leaders, who do not understand mind and AI but do want to create an
AI that is under their influence (though not absolute control)?

I don't share your view that anyone with the intuition and knowledge to
create AI is necessarily likely to have strong moral standards and wisdom
about morality. I understand that creating a workable AI design is likely
to involve a deep understanding of the human mind [the Novamente design, for
example, certainly involved integration of information from neuroscience,
empirical psychology and introspective psychology, as well as math, CS, etc.
etc.], but ... well, so what? Abstract understanding of mind structures and
dynamics does not imply moral wisdom, or the ability to make sound judgments
in very difficult situations.

> Anyone fighting over what values the AI ought to have is simply fighting
> over who gets to commit suicide. If you know how to give an AI any set of
> values, you know how to give it a humanly representative set of values.

This doesn't sound right to me either. Maybe there is another non-human-ish
set of values that is easier to inculcate into an AI than any set of
"humanly representative" values. My intuition is that this IS in fact the case.

> It is not a trivial thing, to create a mind that embodies the full human
> understanding of morality.

I am not sure it is a desirable or necessary goal, either. The human
understanding of morality has a lot of weaknesses, and a lot of ties to
confusing aspects of human biology. I believe that ultimately an AI can be
*more* moral than nearly any human or human group.

> There is a high and beautiful sorcery to it.
> I find it hard to believe that any human truly capable of learning and
> understanding that art would use it to do something so small and mean.

I am afraid you underestimate the complex, convoluted, and often
self-contradictory, perverse and self-defeating aspects of human nature --
qualities which occur in *very smart, insightful* people, along with many of
the others...

> Indeed the largest computers are the most dangerous, but not in the way
> that you mean. They are dangerous because even people who don't
> understand what they're doing may be able to brute-force AI given truly
> insane amounts of computing power. Friendliness, of course, cannot be
> brute-forced.

I consider it reasonably possible that scientists without any deep
understanding of AI morality could produce a highly clever and functional
seed-AI system by methods far more sophisticated than "brute force."

Of course, I hope to beat them to it with Novamente ;-)

> Your political recommendations appear to be based on an extremely
> different model of AI. Specifically:
> 1) "AIs" are just very powerful tools that amplify the short-term goals
> of their users, like any other technology.

AGIs may start out this way, and then grow into being autonomous beings and
seed AIs.

> 2) AIs have power proportional to the computing resources invested in
> them, and everyone has access to pretty much the same theoretical model
> and class of AI.

Of course, this is not true. But depending on how the future of AI
development goes, it may be roughly true at some point in the future.

If someone figures out a new, workable theory of AGI, and it ends up taking
5-10 years to refine this theory into a viable seed AI, then it's likely
that within this 5-10 year period, said workable AI theory will propagate
through the scientific community, so that computing power will play a major
role (though far from being the only factor) in the relative intelligence
levels of in-development AIs using said workable theory.

> 3) There is no seed AI, no rapid recursive self-improvement, no hard
> takeoff, no "first" AI. AIs are just new forces in existing society,
> coming into play a bit at a time, as everyone's AI technology improves at
> roughly the same rate.

I don't think Bill assumes that; I think his comments are sensible if one
assumes a soft takeoff that takes several years. In the hard takeoff
scenario his comments are less relevant, I agree.

> 4) Anyone can make an AI that does anything. AI morality is an easy
> problem with fully specifiable arbitrary solutions that are reliable and
> humanly comprehensible.

I don't think Bill said AI morality was an easy problem... I did not get
that impression from his book.

Heck, even raising a human child effectively is not an easy problem !!!

> 5) Government workers can look at an AI design and tell what the AI's
> morality does and whether it's safe.

Maybe some government workers could do this about as well as anyone else.
There are a lot of mighty bright, knowledgeable people in gov't research
labs, for example.

> Hm. I think all I can do here is point to Part III of LOGI and say that
> my concern is with FOOMgoing AIs (AIs that go FOOM, as in a hard
> takeoff).
> Computer programs with major social effects, owned by powerful
> organizations, that are *not* capable of rapid recursive self-improvement
> and sparking a superintelligent transition, are not the kind of AI I worry
> about.

But the risk is that one of these socially-pertinent non-seed-AI AIs will
then be grown into a seed AI through repeated iterative re-engineering...

> If there are governmentally understandable variables that
> correlate to democratically disputed social outcomes, and so on, then I
> might indeed write it off as ordinary politics.

I don't understand your attitude toward "governmental understandability."
The government has many scientists on its payroll who are as intelligent and
knowledgeable as anyone on this list. I don't doubt that many employees of
national labs could contribute substantially to the design and evaluation of
seed AIs. I have specific friends at Los Alamos Labs who would be
incredibly helpful in this role.

> The guidelines are not intended as a means of making AI programmers do
> something against their will. I'll be astonished if I can get people to
> understand the method with their wholehearted cooperation and willingness
> to devote substantial amounts of time. I see little or no hope for people
> who are vaguely interested and casually agreeable, unless they can be
> transformed into the former class.

I believe that once there are impressive proto-seed-AI systems in existence,
a lot more people will become *very strongly interested* in this area....
The reason so many scientists are just vaguely interested in AI Friendliness
is that they reckon any kind of serious seed-AI is at least decades away.

> Non-Bayesian? I don't think you're going to find much backing on this
> one. If you've really discovered a non-Bayesian form of reasoning, write
> it up and collect your everlasting fame. Personally I consider such a
> thing almost exactly analogous to a perpetual motion machine. Except that
> a perpetual motion machine is merely physically impossible, while
> "non-Bayesian reasoning" appears to be mathematically impossible. Though
> of course I could be wrong.
> Reinforcement learning emerges from Bayesian reasoning, not the other way
> around.

Well, this is a whole other topic!

Bayesian modeling provides a way to understand cognitive systems. But there
are also other, complementary ways to understand aspects of mental dynamics.
And, of course, the actual dynamic processes carrying out cognition -- even
if modelable using probability theory -- may be more easily describable
using some other formalism.

In short, I think there can be "non-Bayesian reasoning" in the sense that
there can be reasoning whose most concise and useful explanation involves
concepts other than Bayes' rule and elementary probability theory in
general. Probability theory can still be used to explain such reasoning,
but in an irritatingly cumbersome way.

Novamente's cognition involves both explicitly probabilistic (Bayesian)
aspects, and other aspects. The behavior of the
non-explicitly-probabilistic aspects can still be analyzed in terms of
probability theory, if one so wishes; but this is not necessarily the most
incisive analysis.

I think that Bill's comments on reinforcement learning should be taken in
this light. Yeah, any RL scheme can be analyzed using probability theory,
and viewed as an approximate way to make some probabilistic calculations.
But for some RL schemes this is a really useful and productive perspective;
for others it isn't.

-- Ben G
