Re: Mind Reading - Approach to Friendliness

From: H C
Date: Mon Jun 19 2006 - 15:12:03 MDT

>From: Charles D Hixson
>Subject: Re: Mind Reading - Approach to Friendliness
>Date: Sun, 18 Jun 2006 20:59:01 -0700
>H C wrote:
> > Concept is pretty simple. You develop really powerful
> > analytical/debugging applications that can display the entire contents
> > of the AI mind in tractable and useful visual representations and
> > extrapolations into future states of the mind.
> >
> > Strategy:
> > Let the AI 'sleep' (i.e. down-time) long enough to periodically
> > analyze the entire contents of the mind. The point in the analysis is
> > to isolate areas of potential risk/danger and either directly
> > modify/secure these areas, or to instruct the AI via its
> > communication channel with the programmers (and obviously check up on
> > the AI's obedience).
> >
> > Theoretically, the only window for danger would be in the period it is
> > awake and thinking. It would need to come to several conclusions
> > simultaneously that all affirmed some non-Friendly behavior, and
> > develop that intention into a non-Friendly action before it went to
> > sleep.
> >
> > A strategy to combat this possibility would be to develop dynamic
> > diagnostic software, that could actively monitor the entire range of
> > the AI's mental and external actions. A comprehensive security system
> > would need to be developed to set alerts, automatic shut downs,
> > security warnings, and anything abnormal or potentially remarkable.
> >
> > The point of implementing this strategy is to allow a non-verifiably
> > Friendly AGI to help the programmers and mathematicians developing
> > Friendliness theory in a relatively safe and reliable manner.
> >
> > -Hank
>Centers of power are attractive to people who desire to control.
>Allowing the programmers this kind of access to a working AI seems to me
>suicidally wrong. It's all very well when what's available is the
>ability to engage in a lot of hard work for dubious rewards. When it
>becomes a center of power an entirely different crew of people will be
>attracted...and you only need one bad apple.
>How many centuries are you proposing that this setup should endure? It's

My perception is that AGI will arrive much sooner than
Friendliness theory will. We can only hope that AGI creators take
the necessary steps to secure what they have, both from interference by
other humans and from Friendliness threats involving the AI.

I also hope that Friendliness researchers, with the help of a very limited
oracle-like AGI, could produce useful results in less than "centuries".

Optimally though, we wouldn't want anybody to develop an AGI until
Friendliness theory can tell us that we are definitely not going to explode
the Universe by flipping the 'on' switch.

- Hank
