Re: UCaRtMaAI paper

From: Wei Dai
Date: Thu Nov 22 2007 - 19:24:43 MST

Tim Freeman wrote:
> Do you know of a reasonable prior I could have used that dominates the
> speed prior without breaking the algorithm?

Tim, have you read my recent posts titled "answers I'd like from an SI" and
"answers I'd like, part 2"? I think the problem is not just picking a prior,
but that we still don't understand the nature of induction well enough. I
think there are probably some insights beyond Solomonoff Induction that we
are missing.

> The AI might want to periodically
> temporarily stop helping you so it can get more information about your
> utility function and make better decisions about how to prioritize.

I don't think that helps too much, because if everyone knows that the AI
will stop helping only temporarily, they will take that into account and
still not act in ways that reveal their true utilities.

> The same issue arises when raising kids -- if you give them too much
> then everybody involved loses all sense of what's important. It's an
> essential issue with helping people, not something specific about this
> algorithm.

No, it is something specific about this algorithm. Suppose the AI instead
gives each person a fixed quota of resources, and tells him it will only
help him until his quota is used up. This would be similar to giving your
children fixed trust funds instead of helping whichever child you think
needs your help most (which encourages them to exaggerate how much help they
need). (Caveat: I haven't thought this through. It might solve this problem,
but have other bad consequences.)
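
To make the incentive difference concrete, here is a toy sketch. It is not
the allocation rule from the paper. It assumes a deliberately crude model in
which the helper splits a fixed pool either in proportion to reported need or
into equal quotas, and all function names are hypothetical.

```python
# Toy model (an assumption for illustration, not the paper's algorithm):
# a helper divides a fixed pool of resources among several people.

def need_based_shares(reported_needs, total):
    """Split `total` in proportion to each person's *reported* need."""
    s = sum(reported_needs)
    return [total * r / s for r in reported_needs]


def fixed_quota_shares(reported_needs, total):
    """Give everyone the same quota; the reports are ignored."""
    n = len(reported_needs)
    return [total / n for _ in reported_needs]


def gain_from_exaggerating(allocate, true_needs, who, factor, total=100.0):
    """Extra resources person `who` gets by reporting factor * true need."""
    honest_share = allocate(true_needs, total)[who]
    inflated = list(true_needs)
    inflated[who] *= factor
    return allocate(inflated, total)[who] - honest_share


if __name__ == "__main__":
    true_needs = [1.0, 1.0, 2.0]
    print(gain_from_exaggerating(need_based_shares, true_needs, 0, 2.0))
    print(gain_from_exaggerating(fixed_quota_shares, true_needs, 0, 2.0))
```

With true needs of 1, 1 and 2 and a pool of 100, doubling the first person's
report raises his share from 25 to 40 under the proportional rule and changes
nothing under the quota rule. The sketch says nothing about whether equal
quotas are a good outcome, only that they remove the payoff for exaggerating.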

> The main issue here seems to be that I put a plausible algorithm for
> "do-what-we-want" on the table, and we don't have any other
> specification of "do-what-we-want" so there's no way to judge whether
> the algorithm is any good. I can see no approaches to solving that
> problem other than implementing and running some practical
> approximation to the algorithm. This seems unsafe, but less unsafe
> than a bunch of other AGI projects presently in progress that don't
> have a model of friendliness. I would welcome any ideas.

That's why I think your paper is more valuable as an illustration of how
hard friendliness is, by showing that even a non-practical specification of
"do-what-we-want" is non-trivial. Maybe showing your paper to other AGI
projects will help convince them that pursuing AGI before solving
friendliness is not a good idea.

> Yes, sometimes the AI will decide arbitrarily because the situation
> really is ambiguous. If it gets plausible answers to the important
> questions, I'll be satisfied with it. People really are dying from
> many different things, the world is burning in places, etc., so there
> are lots of obvious conclusions to draw about interpersonal comparison
> of utilities.

First, you haven't shown that the AI will actually draw the obvious-to-us
conclusions correctly. Second, if we eventually discover a moral philosophy
that is a big improvement over what is hard-coded into the AI, we are
screwed because we won't be able to reason with it and get it to change.

> I agree about having an infinite number of algorithms, but I don't see
> it as a problem. Life seems to require arbitrary choices.

But if arbitrariness is not a problem, then why not just pick the utility
function of an arbitrary person instead of trying to average them?

> All of the algorithmic priors I've run into depend on measuring the
> complexity of something by counting the bits in an encoded
> representation of an algorithm. There are infinitely many ways to do
> the encoding, but people don't seem to mind it too much. If you're
> looking for indefensible arbitrary choices, the choice of what
> language to use for knowledge representation seems less defensible
> than the algorithm for interpersonal utility comparison we're talking
> about here.

Yep, that's another problem. I mentioned it in the two posts I cited
