Re: Recipe for CEV (was Re: Morality simulator)

From: Nick Tarleton
Date: Sat Nov 24 2007 - 19:16:44 MST

On Nov 24, 2007 8:40 PM, Matt Mahoney wrote:

> The model P will distinguish between descriptions (in words or pictures) of
> friendly and unfriendly behavior by assigning higher probabilities to the
> friendly descriptions. This is different than distinguishing between friendly
> and unfriendly behavior. I don't claim that such a thing is possible.

If this worked at all (that is, if a detailed model of a human mind is
actually the best way to compress a human's output AND your search algorithm
can find its way out of all of the only-slightly-worse local minima), why
would the model predict Friendly descriptions rather than human-typical

