Re: Building a friendly AI from a "just do what I tell you" AI

From: Joshua Fox
Date: Mon Dec 03 2007 - 10:10:37 MST

2007/11/21, Tim Freeman:
> ... Unfortunately the paper
> needs revision and hasn't yet made sense to someone who I didn't
> explain it to personally. Maybe I'll be able to make it readable over
> Thanksgiving.


The paper is, in fact, very clearly written. Its style strikes a nice
balance between the heavy formalism of academic papers and the informal
exposition of much transhumanist writing, even the better examples. Each
style has its place, but your article's is just right for what it aims at.

Here are my thoughts about it:

- SIAI is trying to trigger more research into FAI beyond Eliezer's own. It
is really good to see that this is starting to happen.
- Your diagrams are great. But they need more step-by-step explanation. This
is the only less-than-clear part of the paper.
- Thanks for including the Python. Code gives a great tool for verifying the
claims of your article.
- A minor point: There are some places where you mix levels of abstraction,
raising code-level issues (e.g., using integers as indexes) in an article
that is otherwise focused on theory.
- I don't get why compassion and respect have to be separated. Both mean
that the AI needs a utility function which matches another agent's: in the
case of compassion, positive utility, and in the case of respect, negative.
Since the AI can give different weightings to different agents' utilities, it
seems that we can cover "compassion" and "respect" with a single concept.
- Might issues of horizons, time periods, and transaction demarcation be
handled by introducing time into the utility function -- e.g., with
exponential damping/discounting?
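
To make the last two points concrete, here is a minimal Python sketch of what
I have in mind. It is my own illustration, not code from your paper: the names
discount_factor, discounted_utility and weighted_utility, the half_life
parameter, and the example agents are all hypothetical. It assumes each
agent's utility arrives as a list of per-step values, damps step t by
gamma**t, and combines agents with one weight each.

```python
def discount_factor(half_life):
    """Per-step factor gamma chosen so the weight halves every `half_life` steps.

    `half_life` is an illustrative parameter, not something from the paper.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (1.0 / half_life)


def discounted_utility(utilities, half_life):
    """Sum per-step utilities u_t, each weighted by gamma**t (t = 0 is now)."""
    gamma = discount_factor(half_life)
    return sum(u * gamma ** t for t, u in enumerate(utilities))


def weighted_utility(per_agent, weights, half_life):
    """Combine several agents' discounted utilities with one weight per agent."""
    return sum(
        weights[name] * discounted_utility(stream, half_life)
        for name, stream in per_agent.items()
    )


# Hypothetical example: two agents, three time steps each.
streams = {"alice": [1.0, 1.0, 1.0], "bob": [0.0, 2.0, 0.0]}
weights = {"alice": 1.0, "bob": 0.5}
print(weighted_utility(streams, weights, half_life=1))  # prints 2.25
```

With exponential damping, events far in the future contribute almost nothing,
so no hard horizon or explicit transaction boundary has to be chosen. Whether
that is enough for the cases your paper cares about is the open question.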

