Re: AGI motivations

From: Michael Vassar (
Date: Sun Oct 23 2005 - 17:27:57 MDT

Michael Wilson said
>Yes, uploads occupy a small region of cognitive architecture space within a
>larger region
>of 'human-like AGI designs'. However we can actually hit the narrower
>region semi-reliably if we can develop an accurate brain simulation
>and copy an actual human's brain structure into it. We cannot hit the
>larger safer region reliably by creating an AGI from scratch, at least not
>without an understanding of how the human brain works and how to
>build AGIs considerably in advance of what is needed for uploading
>and FAI respectively.

That assertion appears plausible but unsubstantiated to me. The
understanding of human brain function required to build a relatively safe
human-like AGI might be only trivially greater than that required to create
an upload, while the scanning resolution required might be much lower. It
may be much simpler to make a mind that will reliably not attempt to
transcend than to build one that can transcend safely. One way to make such
a mind is to upload the right person. It may be that building a large
number of moderately different neuromorphic AIs (possibly based on
medium-resolution scans of particular brains, inadequate for uploading, followed by
repair via software clean-up) in AI boxes and testing them under a limited
range of conditions similar to what they will actually face in the world is
easier than uploading a particular person.

> > Surely there are regions within the broader category which are
> > safer than the particular region containing human uploads.
>Almost certainly. We don't know where they are or how to reliably
>hit them with an implementation, and unlike say uploading or FAI
>there is no obvious research path to gaining this capability.

We know some ways for reliably hitting them, such as "don't implement
transhuman capabilities".

>If you know how to reliably build a safe, human-like AGI, feel free
>to say how.

Upload an ordinary person with no desire to become a god. That's one way.
Another may be to build an AGI that is such a person. How do you know what
they want? Ask them. Detecting, with high reliability, that a simulation is
lying, and detecting its emotions, should not be difficult.

>Indeed, if someone did manage this, my existing model of AI
>development would be shown to be seriously broken.

I suspect that it may be. Since you haven't shared your model I have no way
to evaluate it, but a priori, given that most models of AI development,
including most models held by certified geniuses, are broken, I assume yours
is too. I'd be happy to work with you on improving it, and that seems to me
to be the sort of thing this site is for, but destiny-star may be more
It's best to predict well enough to create, then stop predicting and create.
Trouble is, it's hard to know when your predictions are good enough.

> >> To work out from first principles (i.e. reliably) whether
> >> you should trust a somewhat-human-equivalent AGI you'd need nearly as
> >> much theory, if not more, than you'd need to just build an FAI in the
> >> first place.
> >
> > That depends on what you know, and on what its cognitive capacities are.
> > There should be ways of confidently predicting that a given machine does
> > not have any transhuman capabilities other than a small set of specified
> > ones which are not sufficient for transhuman persuasion or transhuman
> > engineering.
>Why 'should' there be an easy way to do this? In my experience predicting
>what capabilities a usefully general design will actually have is pretty
>hard, whether you're trying to prove positives or negatives.

We do it all the time with actual humans. For a chunk of AI design space
larger than "uploads", AGIs are just humans. Whatever advantages they have
will only be those you have given them, probably including speed and
moderately fine-grained self-awareness (or you having moderately
fine-grained awareness of them). Predicting approximately what new
capabilities a human will have when you make a small change to their
neurological hardware can be difficult or easy depending on how well you
understand what you are doing. But small changes, that is, changes of
magnitude comparable to the range of variation among the human baseline
population, will never create large and novel transhuman abilities. Still,
lots of time and mere savant abilities may be really, really useful.

> > It should also be possible to ensure a human-like enough goal system
> > that you can understand its motivation prior to recursive
> > self-improvement.
>Where is this 'should' coming from? If you know how to do this, tell
>the rest of us and claim your renown as the researcher who cracked a
>good fraction of the FAI problem.

The "should" comes from the fact that the whole venture of attempting to
build an organization to create a Friendly AI presupposes that the problem
of the friendly human assistant has been solved without even needing to
turn to intrusive brain scans and neurochemical modifications. What is the
difference between trusting a human-derived AI and trusting a human? Either
can be understood equally well by reading what they say, talking to them,
etc.

>Unless you're simulating the brain at the neuron level (including keeping
>the propagation speed down to human levels) /and/ closely copying human
>brain organisation, you simply can't generalise from 'humans behave like
>this' to 'the AGI will behave like this'.

Obviously you have to closely copy human brain organization. You can
experiment with different propagation speeds, as you can with new drugs,
without credible risk of UnFriendliness, takeoff, or the acquisition of
transhuman abilities. I see no evidence that simulations must be at the
neuron level, though it's probably safest. What you need is to be sure that
your high level simulations don't produce more powerful outputs than the
neuron level simulation would, which should be non-problematic for some of
the brain.
Neuron level simulation is still not uploading. Simulating pan-human
complexity demands much less scanning than simulating human-specific neural

>Given this constraint on
>structure, your options for getting bootstrap /content/ into the AGI are
>(a) upload a human, (b) replicate the biological brain growth and human
>learning process (even more research, implementation complexity and
>technical difficulty and takes a lot of time) or

We shouldn't be doing (b), but it will be done, and if it advances faster than
what we are doing we should utilize it.

>(c) use other algorithms
>unrelated to the way humans work to generate the seed complexity. The
>last option again discards any ability to make simple generalisations
>from human behaviour to the behaviour of your human-like AGI.


>The second
>option introduces even more potential for things to go wrong (due to more
>design complexity)

I'm not convinced of this. It could be true, time will tell.

>and even if it works perfectly it will produce an
>arbitrary (and probably pretty damn strange) human-like personality, with
>no special guarantees of benevolence.

No guarantees other than what you can infer from selecting a human of your
choice from a large number of options, examining their brain and life
history, and possibly slightly modifying neurochemistry or utilizing
conditioning etc. However, what guarantees of benevolence do we have from
real humans? Primarily the fact that an imminent singularity leaves humans
with little basis for conflict. FAI is an extremely desirable outcome for
all of us, and only questionably not an optimal outcome for any given human.

>Actually I find it highly unlikely
>that any research group trying to do this would actually go all out on
>replicating the brain accurately without being tempted to meddle and
>'improve' the structure and/or content, but I digress.

Depends on what tools they had available and what they wanted to do.

>Thus the first
>option, uploading, looks like the best option to me if you're going to
>insist on building a human-like AGI.

I agree it's the best choice, but it may not be the easiest.

>Trying to prevent an AGI from self-modifying is the classic 'adversarial
>swamp' situation that CFAI correctly characterises as hopeless.

You shouldn't try to prevent a non-human AGI from self-modifying, but
adversarial swamps involving humans are NOT intractable, not to mention that
you can still influence the human's preferences a LOT with simulated
chemistry and total control of their environment. I don't think this is as
severe as the adversarial situation currently existing between SIAI and
other AI development teams from whom SIAI withholds information.

>single point of failure in your technical isolation or human factors
>(i.e. AGI convinces a programmer to do something that allows it to
>write arbitrary code) will probably lead to seed AI. The task is more
>or less impossible even given perfect understanding, and perfect
>understanding is pretty unlikely to be present.

Huh? That's like saying that workplace drug policies are fundamentally
impossible even given an unlimited selection of workers, perfect worker
monitoring, and drugs of unknown effect some of which end the world when one
person takes them.

> > An AI which others can see the non-transparent thoughts of, and which
> > others can manipulate the preferences and desires of, is in some
> > respects safer than a human, so long as those who control it are safe.
>This looks like a contradiction in terms. A 'non-transparent' i.e.
>'opaque' AGI is one that you /can't/ see the thoughts of, only at best
>high level and fakeable abstractions.

You *can* see the non-transparent thoughts of a human to a significant
degree, especially with neural scanning. With perfect scanning you could do
much better.

>The problem of understanding what
>a (realistic) 'human-like AGI' is thinking is equivalent to the problem
>of understanding what a human is thinking given a complete real-time
>brain scan; a challenge considerably harder than merely simulating the

I strongly disagree. Reading thoughts is very difficult. Reading emotions
is not nearly so difficult, nor is making some reasonable inferences about
the subject matter of thoughts. Look at what Koch has already done with
today's crude tools.

> > Finally, any AI that doesn't know it is an AI or doesn't know about
> > programming, formal logic, neurology, etc is safe until it learns
> > those things.
>I'll grant you that, but how is it going to be useful for FAI design
>if it doesn't know about these things?

I can think of several ways. Admittedly the use would be more limited.

>How do you propose to stop
>anyone from teaching a 'commercially available human-like AGI' these

Probably not possible at that point, but at that point you have to really
really hurry anyway so some safety needs may be sacrificed. Actually,
probably possible but at immense cost, and probably not worth preparing for.
