Guaranteeing Friendliness

From: Michael Vassar
Date: Mon Dec 05 2005 - 09:48:18 MST

Jeff Allbright: I don't understand exactly what you are proposing, but
your critique seems specifically inapplicable to the SIAI approach. It
seems to me that the whole point of formal verification is to avoid
depending on extrapolation carried out in relative isolation from an
evolving environment. Your concern seems entirely correct when applied to
the "test cautiously on infrahuman AIs" approach favored by essentially all
other Friendliness-aware AI researchers. (I would call them insufficiently
paranoid, but I see both Eliezer's paranoia regarding other humans in
general and everyone else's paranoia about the time limit we face before
someone "out there" builds UFAI as detracting from safety, and I don't
think the response that comes from a deep understanding of optimization is
that closely related to paranoia psychologically. BTW, I have actual
professional experience with clinical paranoia, and no one I have
encountered here appears to display anything remotely similar.)

It is precisely for the reasons you named that verification is so extremely
desirable, though time will tell whether it is possible at all, humanly
possible, and possible within our practical time limit. If it is not, my
very low-confidence estimate is that extremely human-like AIs with
relatively predictable abilities and no self-enhancement-enabling
capabilities provide our best hope, though not a very good one. If we can't
figure out how to control optimization processes, we must deal with entities
similar to those we have experience with instead. That could mean uploads,
but could also mean human-derived neuromorphic AIs. Others here disagree,
but I see no evidence that anything infrahuman and no more alien than an
autistic savant from a totally alien culture is a dire take-off risk. A
psychological theory about the conditions under which humans become
increasingly ‘optimization-process-like’ would be very useful. Without one,
I would suggest keeping a human-derived AI away from formal games, and even
more so RPGs. Keeping it away from decision theory, computer science, and
the social sciences is also a no-brainer. Given an upload, you probably
can't do these things, but the take-off risk from a human-speed upload with
no access to its own software should be slight. Simplifications leading to
transhumanly perfect health might lead to it gradually increasing in
intelligence, but if you start from a reasonably trustworthy person this
effect should be visible and only moderately threatening (and can also be
reduced by slowing the upload down).
