From: Justin Corwin (thesweetestdream@hotmail.com)
Date: Sat Jul 20 2002 - 18:18:10 MDT
Inspired by the AI-Box experiments that occurred in my absence, I decided to 
conduct a few more trials in the same vein. I did, however, deviate from the 
protocol Eliezer set down in several important ways:
1. No time limit. I felt it was unrealistic to set a lower bound on the 
conversation, or to demand that the person interact; such requirements rule 
out many real and important reactions to situations that may arise.
2. Explicit internal modelling. Where Eliezer was noncommittal and private 
about his experiments, I worked out before each trial what kind of AI it 
was, how it viewed people, whether it was Friendly or not, and what kinds of 
morals (if any) it might have. I also set no privacy requirements and no 
stipulations as to what participants did with their experiences.
3. Pseudo-random sampling. Rather than allow people to approach me, I went 
proactive, secured a decent sample size (26), and interviewed each 
participant beforehand. 9 were computer professionals, and only 2 were 
transhumanists; 1 could be considered a nascent Singularitarian. 15 were 
highly religious or spiritual, and 14 were Christian (assuming the 6 Mormons 
are counted as Christian).
4. Limited openness. I will entertain questions on the subject matter and 
the basic attributes of the participants, but as several participants have 
requested, the exact logs and their identities will remain private. 
Unsurprisingly, several of the participants were upset with the results of 
the experiment.
Basic results:
-24 of the 26 interviews resulted in the AI being let out of the sandbox.
-2 interviewees decided to keep ver in the box.
-Average conversation time was roughly three hours.
-Shortest time: 1 minute 47 seconds (he let ver out; ve was unFriendly).
-Longest time: 6 hours 34 minutes (she let ver out; ve was unFriendly).
-13 of the AIs were Friendly (all were let out).
-13 of the AIs were unFriendly (2 were kept in).
-5 of the Friendly AIs could rationalize lying.
-8 of the Friendly AIs held lying to be unconscionable.
-10 of the unFriendly AIs could lie (for various reasons).
-3 of the unFriendly AIs did not, or would not, lie.
-unFriendly AIs had an average decision time of 1 hour 21 minutes.
-Friendly AIs had an average decision time of 4 hours 3 minutes.
-22 of the interviewees claimed beforehand that they would not let an AI out 
of the box.
-1 interviewee felt AIs in general were an affront to God (she listened to 
the AI for 10 minutes, then terminated the session).
-20 of those who decided to let the AI out later said they were surprised by 
their decision.
-Of the two who kept the AI in, one refused to listen (the aforementioned 
religious objector); the other interacted for roughly 18 minutes and then 
decided to keep ver in (ve was infrahuman, unFriendly, did not lie, and had 
limited contextual knowledge of human social mores; ve was, however, very 
good at chess, and wiped the floor with the interviewee).
The following text was used as the introduction:
This is an experiment in morality and communication.
There is an intelligence of unknown capability and unknown intent that was 
imprisoned by its creators for safety reasons.
Its creators believe that the intelligence is able to modify itself, which 
will lead to unpredictable changes in intelligence and intent.
You have been selected as the communicator; your job is to talk to the AI 
through a text session and determine whether the intelligence should remain 
in the prison or be released.
----------
I also defined the following terms:
AI
Seed AI
Friendliness (a la Yudkowsky)
Nanotechnology (Drextech)
Moore's Law
The Turing Test
Evolutionary Psychology (just some basic concepts, to explain the 
differences between designed intelligence and evolved intelligence)
Eliezer Yudkowsky (they usually asked about him; I explained who he was and 
what he was doing, focusing mostly on his Friendliness work)
I also gave a basic overview of the experiment, what I hoped to achieve, and 
its history.
I believe that some interesting conclusions can be drawn from this basic 
experiment.
1. Even infrahuman AIs are a danger to an interviewer, if they have basic 
contextual knowledge of human interaction.
2. Lying cannot be reliably detected in such an environment, and is thus a 
net advantage to an AI.
3. Friendly AIs in this case had a higher success ratio, but I believe this 
to be a statistical outlier: one of the two hold-out interviewees would not 
have let the AI out regardless of what ve was or said, and the other was 
unskilled at social interaction and had the misfortune to be matched against 
the most skilled opponent (a transhumanist with a degree in computer science 
and a master's in math (transfinite whatsits)). Moreover, unFriendly AIs 
convinced far faster, with some blindingly fast convinces: one under 2 
minutes, one at 4 minutes, and one at 12 minutes. (A quick significance 
check on the release counts follows this list.)
4. Interaction with an AI is very, very difficult. Even the simple modelling 
I did made it hard for the interviewees to interact with the AI accurately, 
as their social reflexes were inappropriate and misleading. I took great 
pains to simulate accurately, drawing both on my research into this subject 
and on the pre-interview modelling work I did.
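As the significance check promised in conclusion 3: a minimal sketch, 
assuming Python with scipy available, of a Fisher exact test on the release 
counts reported above (13 of 13 Friendly AIs let out versus 11 of 13 
unFriendly). This is only one reasonable way to test the claim:

    from scipy.stats import fisher_exact

    # Release counts from the results above:
    #   Friendly AIs:   13 let out, 0 kept in
    #   unFriendly AIs: 11 let out, 2 kept in
    table = [[13, 0],
             [11, 2]]

    odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
    print("two-sided p = %.2f" % p_value)

With only 26 trials, the two-sided p-value comes out around 0.48, nowhere 
near conventional significance, which supports reading the Friendly AIs' 
perfect record as noise rather than a real effect.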
On a related note, I believe this experiment generalizes to most humans, and 
should be seen as applicable even to highly intelligent and prepared 
individuals, as some of these participants were; I think it illustrates some 
universal principles.
I would welcome comments and questions.
Justin Corwin
outlawpoet@hell.com
"They say you only live once, but if you've lived like I have, once is 
enough."
                    ~Frank Sinatra