Re: [sl4] to-do list for strong, nice AI

From: Pavitra (
Date: Fri Oct 16 2009 - 23:54:19 MDT

Matt Mahoney wrote:
> Pavitra wrote:
>> A[ ] Develop a mathematically formal definition of Friendliness.
> In order for AI to do what you want (as opposed to what you tell it),
> it has to at least know what you know, and use that knowledge at
> least as fast as your brain does.

Doesn't this imply that the relevant data and algorithms are
incompressible? In particular, it's possible that _lossy_ compression
may be acceptable, provided edge cases are properly handled; I can
predict the trajectory of a cannonball to an acceptable precision
without knowing the position or trajectory of any of its individual atoms.
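
To make the cannonball point concrete, here is a minimal sketch, assuming
flat ground, no air resistance, and g = 9.81 m/s^2 (the function name and
the numbers are mine, purely for illustration). The whole state of the
ball is "compressed" to a launch speed and an angle, and that is enough
to predict where it lands:

```python
import math

def cannonball_range(speed_m_s, angle_deg, g=9.81):
    """Horizontal range on flat ground, ignoring drag (idealized model)."""
    theta = math.radians(angle_deg)
    return speed_m_s ** 2 * math.sin(2 * theta) / g

# Lossy summary of the cannonball: two numbers, no per-atom state.
summary = (100.0, 45.0)  # hypothetical launch speed (m/s) and angle (degrees)
print(round(cannonball_range(*summary), 1))  # prints 1019.4
```

Strong wind or a ball that breaks apart in flight are the kind of edge
cases where this lossy summary would stop being acceptable.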

> To satisfy conflicts between people
> (e.g. I want your money), AI has to know what everyone knows. Then it
> could calculate what an ideal secrecy-free market would do and
> allocate resources accordingly.

That assumes an ideal secrecy-free market generates the best possible
allocation of resources. Unless there's a relevant theorem of ethics I'm
not aware of, that seems a nontrivial assumption.

> One human knows 10^9 bits (Landauer's estimate of human long term
> memory). 10^10 humans know 10^17 to 10^18 bits, allowing for some
> overlapping knowledge.

Again, where are you obtaining your estimates of degree-of-compressibility?
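
For reference, the arithmetic behind my question, as a small sketch that
uses only the figures quoted above (the variable and function names are
mine). With no overlap at all the total would be 10^19 bits, so the
quoted range amounts to assuming that shared knowledge shrinks the total
by a factor of 10 to 100:

```python
BITS_PER_HUMAN = 10 ** 9   # Landauer's estimate, as quoted above
HUMANS = 10 ** 10          # population figure, as quoted above

# Total if no two people shared any knowledge at all.
naive_total = BITS_PER_HUMAN * HUMANS

def implied_overlap_factor(claimed_total_bits):
    """How much the claimed total is reduced relative to the no-overlap total."""
    return naive_total // claimed_total_bits

for claimed in (10 ** 17, 10 ** 18):
    print(claimed, implied_overlap_factor(claimed))
# prints 100 for the 10^17 figure and 10 for the 10^18 figure
```

Where that factor of 10 to 100 comes from is the part I'd like to see
justified.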

>> A->B[ ] Develop an automated test for Friendliness with a 0% false
>> positive rate and a reasonably low false negative rate.
> Unlikely. Using an iterative approach, each time that a human gives
> feedback to the AI (good or bad), one bit of information is added to
> the model. Development will be slow.

This sounds relevant to step A, but not to B-assuming-A-is-solved.
Solving A doesn't necessarily rely on iterating over a binary predicate;
I agree with you that it probably shouldn't.

>> C[ ] Develop a mathematically formal definition of intelligence.
> Legg and Hutter propose to define universal intelligence as the
> expected reward given a universal (Solomonoff) distribution of
> environments. However, it is not computable because the number of
> environments is infinite. Other definitions are possible of course,
> e.g. the Turing test.

The Turing test is probably not suitable.

What about computing an approximation? Is it possible to determine that
a given precision of approximation is "good enough" for a given
situation, or would that judgment have to be part of the AI itself, in
which case am I mixing levels?
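
Here is a toy sketch of the kind of approximation I have in mind. It is
not Legg and Hutter's construction, just a finite stand-in under loud
assumptions: each environment gets a made-up code length in bits (the
real complexity is not computable), the code lengths come from a
prefix-free code so the weights 2^-length sum to at most 1, and expected
rewards lie in [0, 1]. Under those assumptions, truncating the sum to a
finite list of environments gives both an estimate and a bound on how
much the omitted environments could add:

```python
def approx_intelligence(envs):
    """envs: list of (code_length_bits, expected_reward), reward in [0, 1].

    Returns (lower, upper) bounds on the weighted sum over all environments,
    assuming the weights 2**-code_length sum to at most 1 overall.
    """
    lower = sum(2.0 ** -k * v for k, v in envs)
    covered = sum(2.0 ** -k for k, _ in envs)
    # Omitted environments carry at most (1 - covered) weight, reward <= 1.
    upper = lower + (1.0 - covered)
    return lower, upper

# Hypothetical environments: (code length in bits, agent's expected reward).
envs = [(1, 0.9), (2, 0.5), (3, 0.2)]
lower, upper = approx_intelligence(envs)
print(round(lower, 3), round(upper, 3))  # roughly 0.6 and 0.725
```

Whether a gap of that size is "good enough" still depends on the
situation, which is the part I don't know how to formalize.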

>> C->D[ ] Develop an automated comparison test that returns the more
>> intelligent of two given systems.
> How? The test giver has to know more than the test taker.

Again, this seems more a criticism of C than of D. However, reading it
as such:

I am not a mathematician, but it feels like the validity of your
argument may be equivalent (or at least analogous) to P=NP.

> However, you don't need C and D. If you solve B then you already have
> a model of all human minds, and therefore have already solved
> intelligence, at least by the Turing test.

The whole point of Singularity-level AGI is that it's a nonhuman
intelligence. By hypothesis, "humanity" ⊉ "intelligence".

I don't hold with the Turing test. It's too fuzzy, subjective, and
fallible, and above all it tests for humanity rather than intelligence.
Chatterbots have been found to improve their Turing test performance
significantly by committing deliberate spelling errors and avoiding
topics that require intelligent or coherent discourse.

>> B,D->E[ ] Develop prototype systems and apply these tests to them
>> iteratively until the Singularity occurs.
> Let's keep in mind that a Singularity is *not* the goal. The goal is
> friendly AI. The Singularity is what happens when we lose control of
> it.

I thought the Singularity was defined as "AGI occurs and goes foom", so
that the Singularity would be either Friendly or unFriendly according to
the nature of the particular AI that first reaches the magic threshold.

The goal, then, would be to ensure that the Singularity will be Friendly.

If Singularity is defined as "unFriendly foom", then of course E should
instead read "...until we get Friendly AGI or die trying."
