Re: [sl4] Friendly AIs vs Friendly Humans

From: Matt Mahoney
Date: Tue Jun 21 2011 - 08:43:13 MDT

DataPacRat <> wrote:
> My understanding of the Friendly AI problem is, roughly, that AIs
> could have all sorts of goal systems, many of which are rather
> unhealthy for humanity as we know it;

I think that viewing AI as a powerful goal-directed optimization process is an
oversimplification. Currently we have to pay people US $60 trillion per year
globally to do work that machines could do if they were smarter. This is the
force that is driving AI. Obviously it is a hard problem or we would have solved
it by now.

In I analysed the software, hardware, and
training costs of building AI sufficient to automate the economy. Software is
the easiest part, because it can't be more complex than the human genome, which
is equivalent to a few million lines of code. It would cost $1 billion, which is
insignificant compared to its value. For hardware, I estimate 10^27 OPS and
10^26 bytes to simulate 10^10 human-brain-sized neural networks. We might find
more efficient models for language and vision (where the best known algorithms
appear to be neural), but in any case Moore's Law should make the cost feasible
in 20-30 years. But the most significant cost is that of training. In order for
machines to do what you want (as opposed to what you tell them, as they do now),
they have to know what you know. I estimate that the sum of human knowledge is
10^17 to 10^18 bits, assuming Landauer's estimate of human long-term memory
capacity at 10^9 bits, and 90% to 99% mutual information between humans (based
on a US Labor Dept. estimate that it costs 1% of lifetime earnings to replace an
employee). Most of this knowledge is in human brains and not written down. (The
searchable Internet is far smaller, at 25 billion web pages.) I figure there are
two ways to extract this knowledge. One is through slow channels like speech and
typing, using a global system of public surveillance, at a cost of $1 quadrillion
assuming a global average wage rate of $5/hour. The other would be through
advanced (and so far nonexistent) brain scanning technology at the synapse
level. To be a viable alternative, it would have to cost less than $100,000 per
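
As a rough check on the arithmetic in the paragraph above, here is a minimal sketch in Python. The population of 10^10 is an assumption borrowed from the brain count used in the hardware estimate, and the 7 billion used for the per-person figure comes from the next paragraph. The post does not state a channel rate for speech and typing, so the sketch only derives the rate that the cost and wage figures would imply.

```python
# Back-of-the-envelope check of the knowledge and cost figures quoted above.
# All constants come from the post except where marked as an assumption.

PEOPLE = 1e10                     # assumption: same 10^10 as the simulated brains
BITS_PER_PERSON = 1e9             # Landauer's long-term memory estimate
SHARED_FRACTIONS = (0.99, 0.90)   # mutual information between humans

TOTAL_COST_USD = 1e15             # $1 quadrillion
WAGE_USD_PER_HOUR = 5.0           # global average wage rate
PEOPLE_TO_MODEL = 7e9             # "7 billion human minds"


def unique_knowledge_bits(shared_fraction):
    """Bits of knowledge not shared with other people, summed over everyone."""
    return PEOPLE * BITS_PER_PERSON * (1.0 - shared_fraction)


def implied_channel_rate(bits):
    """Bits per second of paid time implied by the cost and wage figures."""
    paid_seconds = TOTAL_COST_USD / WAGE_USD_PER_HOUR * 3600
    return bits / paid_seconds


low_bits, high_bits = (unique_knowledge_bits(f) for f in SHARED_FRACTIONS)
cost_per_person = TOTAL_COST_USD / PEOPLE_TO_MODEL

print(f"knowledge: {low_bits:.1e} to {high_bits:.1e} bits")
print(f"implied rate: {implied_channel_rate(low_bits):.2f} to "
      f"{implied_channel_rate(high_bits):.2f} bits/s")
print(f"cost per person: ${cost_per_person:,.0f}")
```

Under these assumptions the totals reproduce the 10^17 to 10^18 bit range, the implied extraction rate works out to roughly 0.14 to 1.4 bits per second of paid time, and the surveillance route costs on the order of $140,000 per person, which is the same order of magnitude as the $100,000 threshold mentioned for brain scanning.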

The result would be models of 7 billion human minds. A model is a function that
takes sensory input and returns a prediction of your actions. Models would have
many uses. A program could run thousands of high-speed simulations of you in
parallel to predict what actions would make you happy, or what actions would
make you buy something. A model that carried out its predictions of your
behavior in real time to control a robot or avatar would be an upload of you.
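
To make the "model as a function" idea concrete, here is a small Python sketch. Every name in it (SensoryInput, Action, toy_model, happiness, best_stimulus, upload_step) is hypothetical and chosen only for illustration; the post defines none of them. The toy model is a stand-in for a learned model of a person, and the simulations run one after another rather than in parallel.

```python
from typing import Callable, Iterable, Sequence

# Hypothetical types for illustration; the post does not define any of these.
SensoryInput = Sequence[float]
Action = str
Model = Callable[[SensoryInput], Action]


def upload_step(model: Model, sensed: SensoryInput) -> Action:
    """Act out the model's prediction for the current input (the 'upload' use)."""
    return model(sensed)


def best_stimulus(model: Model,
                  stimuli: Iterable[SensoryInput],
                  score: Callable[[Action], float]) -> SensoryInput:
    """Simulate the model on each candidate input and return the input whose
    predicted reaction scores highest. Runs sequentially for simplicity."""
    return max(stimuli, key=lambda s: score(model(s)))


def toy_model(sensed: SensoryInput) -> Action:
    # Stand-in for a learned model of a person: reacts to the sign of the input.
    return "smile" if sum(sensed) > 0 else "frown"


def happiness(action: Action) -> float:
    # Assumed scoring rule; a real system would have to learn this as well.
    return 1.0 if action == "smile" else 0.0


candidates = [(-1.0,), (0.5,), (2.0,)]
chosen = best_stimulus(toy_model, candidates, happiness)
predicted = upload_step(toy_model, chosen)
print(chosen, predicted)
```

Swapping happiness for a scoring rule that rewards a purchase gives the "what would make you buy something" use with no change to the model itself, which is the sense in which the same model serves many purposes.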

This could hardly be described as a goal-directed process, except to the extent
that your goals would be modeled. The problem with the goal-directed model is
that we get into trouble with simple goals like "be nice to humans", or even
Eliezer Yudkowsky's CEV, which has a complexity of about 10^5 bits. In reality,
we cannot even specify a goal, because its complexity is of the same order of
magnitude as the whole of human knowledge. It is a description of what everyone
wants, with conflicts somehow resolved according to complex rules. Our global
legal system is a vain attempt to write down how to do this. An equivalent goal
system for AI would have to specify behavior at the level of bit operations,
without human judges to interpret ambiguity in the law.

In any case, I am not convinced that the goal-directed model is even useful.
Does a thermostat "want" to keep the room at a constant temperature? Does a
linear regression algorithm "want" to fit a straight line to a set of points?
Does evolution "want" to maximize reproductive fitness? Any Turing machine could
be described as having a "goal" of outputting whatever it happens to output. So

> and, due to the potential for
> rapid self-improvement, once any AI exists, it is highly likely to
> rapidly gain the power required to implement its goals whether we want
> it to or not.

Intelligence, whether defined as ability to simulate human behavior (Turing) or
as expected reward (Legg and Hutter), depends practically on both knowledge and
computing power. There are theoretical models that depend only on knowledge (a
giant lookup table) or only on computing power (AIXI), but practical systems
need both. A self-improving AI can gain computing power but not knowledge.
Knowledge comes from learning processes such as evolution and induction.
Furthermore, reinforcement learning algorithms like evolution are slow because
each reward/penalty decision (e.g. birth or death) adds at most 1 bit of

The problem of goal stability through self-improvement is unsolved. I believe it
will remain unsolved, because any goal other than maximizing reproductive
fitness will lose in a competitive environment. The question then becomes how
fast a competing species of our own creation could reproduce and evolve. Freitas
analyzed this in Computation has a
thermodynamic cost of kT ln 2 joules per bit operation. Molecular computing
would be close to this limit, 100 times more efficient than neurons and a
million times more efficient than silicon, but not much faster than current
biological operations such as DNA-RNA-protein synthesis. Human extinction would
take a few weeks due to the availability of solar energy.
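
For a sense of scale, the kT ln 2 limit can be evaluated directly. The sketch below assumes a temperature of 300 K, which the post does not specify, and treats the "100 times" and "a million times" ratios as if molecular computing sat exactly at the limit, so the neuron and silicon figures are only rough implied values.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # exact value in the SI
TEMPERATURE_K = 300.0              # assumption: roughly room temperature


def landauer_limit_joules(temperature_kelvin):
    """Minimum energy per bit operation, kT ln 2, in joules."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


limit = landauer_limit_joules(TEMPERATURE_K)

# Implied energy per bit operation if molecular computing were exactly at the
# limit (the post says only "close to"), using the ratios quoted in the post.
implied_neuron_joules = 100 * limit
implied_silicon_joules = 1_000_000 * limit

print(f"kT ln 2 at {TEMPERATURE_K:.0f} K: {limit:.3e} J")
print(f"implied neuron cost:  {implied_neuron_joules:.3e} J per bit operation")
print(f"implied silicon cost: {implied_silicon_joules:.3e} J per bit operation")
```

At 300 K the limit is about 2.9 x 10^-21 joules per bit operation, so the quoted ratios put neurons near 3 x 10^-19 J and silicon near 3 x 10^-15 J per bit operation under these assumptions.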

The system I described is friendly to the extent that people get what they want,
but this may not be much better. People want to be happy. If you want to model a
human mind as a goal-directed process, then maximum happiness is a state of
maximum utility. In this mental state, any thought or perception would be
unpleasant because it would result in a different mental state with lower

 -- Matt Mahoney,

----- Original Message ----
> From: DataPacRat <>
> To:
> Sent: Tue, June 21, 2011 2:36:51 AM
> Subject: [sl4] Friendly AIs vs Friendly Humans
> Since this list isn't officially closed down /quite/ yet, I'm hoping
> to take advantage of the remaining readers' insights to help me find
> the answer to a certain question - or, at least, help me find where
> the answer already is.
>
> My understanding of the Friendly AI problem is, roughly, that AIs
> could have all sorts of goal systems, many of which are rather
> unhealthy for humanity as we know it; and, due to the potential for
> rapid self-improvement, once any AI exists, it is highly likely to
> rapidly gain the power required to implement its goals whether we want
> it to or not. Thus certain people are trying to develop the parameters
> for a Friendly AI, one that will allow us humans to continue doing our
> own things (or some approximation thereof), or at least for avoiding
> the development of an Unfriendly AI.
>
> From what I've overheard, one of the biggest difficulties with FAI is
> that there are a wide variety of possible forms of AI, making it
> difficult to determine what it would take to ensure Friendliness for
> any potential AI design.
>
> Could anyone here suggest any references on a much narrower subset of
> this problem: limiting the form of AI designs being considered to
> human-like minds (possibly including actual emulations of human
> minds), is it possible to solve the FAI problem for that subset - or,
> put another way, instead of preventing Unfriendly AIs and allowing
> only Friendly AIs, is it possible to avoid "Unfriendly Humans" and
> encourage "Friendly Humans"? If so, do such methods offer any insight
> into the generalized FAI problem? If not, does that imply that there
> is no general FAI solution?
>
> And, most importantly, how many false assumptions are behind these
> questions, and how can I best learn to correct them?
>
> Thank you for your time,
> --
> DataPacRat
> lu .iacu'i ma krinu lo du'u .ei mi krici la'e di'u li'u traji lo ka
> vajni fo lo preti
