Re: guaranteeing friendliness

From: Jef Allbright
Date: Sun Dec 04 2005 - 09:52:39 MST

On 12/4/05, Jeff Medina wrote:
> On 12/4/05, Jef Allbright wrote:
> > I will suggest, as I have approximately annually, that you
> > will find that there is no guaranteed *solution* to the problem of
> > friendliness, but that there is an optimum *approach*,
> I'm not sure how it might be said Eli would disagree with this. He may
> well disagree with your proposal concerning the nature of that optimal
> approach, but not with the limitation of having only optimal, and not
> guaranteed, solutions to any given problem.
> Bayesian epistemology requires one never to assign zero probability to
> anything. Hence, where "guaranteed solution" implies certainty
> (P(successful solution)=1), there are no guaranteed solutions for the
> methodologically Bayesian. And should "guaranteed solution" refer to
> something less than complete certainty, one might reasonably wonder
> what difference there could be between a "guaranteed" solution, on
> this construal, and an optimal one. The use of language like
> "guaranteed" or "formally proven" or "verifiable" in discussions of
> FAI does not indicate guaranteed (i.e., certain, P=1), contra 'merely'
> optimal, approaches, as there are possibilities of failure
> ineradicable even in principle let alone practice. For example, a
> formal proof may fail due to human error in the construction of the
> proof itself, no matter how many times one n-tuple-checks one's work,
> or inapplicability of the chosen formal system to the problem at hand,
> because, say, the formal system doesn't accurately map/model reality
> in ways one has not foreseen

Thanks Jeff. Very poor wording on my part; a reminder not to post
while distracted or rushed. I should have used phrasing contrasting
optimal solution with optimal approach, keeping in mind that both are
applied within a context of recursive improvement.

Previously I've perceived clear statements of belief from Eliezer and
Wilson that the threat is too great, and the risk too high, to launch
anything but a formally verified, extremely-high-probability solution
onto the world.

I think the concept is fundamentally flawed because of its dependence
on extrapolation in relative isolation from an evolving environment. A
significant degree of extrapolation will lead to fatal divergence from
evolving external reality due to combinatorial explosion and cumulative
errors. Controlling (rather than merely influencing) the outcome of
such a complex system (without destroying its essential complexity)
would require modeling its relevant features and probable trajectories
with an accuracy beyond the capability of any practical simulation
system. The only way to know with confidence what such a complex system
will do is to play it out. This reasoning applies both to controlling
(the instrumental aspect) and to choosing (the values aspect) the
direction of humanity.
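
As a rough illustration of the cumulative-error point only, here is a
minimal sketch using the logistic map as a toy stand-in for a complex
system. The map, the parameter r=4.0, the starting values, and the
function names are all assumptions chosen for illustration; none of
them come from the discussion above. The sketch runs a "real"
trajectory alongside a "model" whose starting point is off by a tiny
amount, and records how far apart the two drift.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x), a toy stand-in for a complex system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0=0.2, error=1e-10, steps=60):
    """Gap at each step between the 'real' system and a model whose
    starting measurement is off by `error` (hypothetical values)."""
    real = logistic_trajectory(x0, steps)
    model = logistic_trajectory(x0 + error, steps)
    return [abs(a - b) for a, b in zip(real, model)]


gaps = divergence()

# Print the gap every 10 steps to see how the initial error grows.
for step in range(0, len(gaps), 10):
    print(step, gaps[step])
```

In this toy case the model tracks the system only over a short horizon,
after which the extrapolation says little about where the system
actually is. Whether real social or technological systems behave this
way is of course the claim under debate, not something the sketch
establishes.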

I see AI playing a very important part in the future of humanity but I
don't share in the consensus here that a Really Powerful Optimization
Process leads to the AI transcending its intended domain and tiling
the galaxy with paperclips or whatever. I understand that this
assumption underlies the idea that we are in a race to develop
Friendly AI before humanity is made irrelevant by the unfriendly sort.

My suggestion is that a more effective approach to positively
influencing (not solving) the challenges facing humanity is to build a
framework for the next level of morality (decision-making increasingly
seen as increasingly good) based on converging human values exploiting
diverse instrumental/creative knowledge.

I apologize that as usual my post is densely packed, and I have a
close family member who entered the hospital yesterday and may need
most of my attention for a while. However, I am highly motivated to
explore this thinking to the extent that the conversation seems

- Jef
