# Re: Optimality of using probability

From: Mitchell Porter (mitchtemporarily@hotmail.com)
Date: Tue Feb 06 2007 - 16:25:51 MST

I said

>If you the programmer ('you' being an AI, I assume) already have the
>concept of probability, and you can prove that a possible program will
>estimate probabilities more accurately than you do, you should be able
>to prove that it would provide an increase in utility, to a degree
>depending on the superiority of its estimates and the structure of
>your utility function. (A trivial observation, but that's usually where
>you have to start.)

Suppose that

the environment is a two-state Markov process, pr(A)=p, pr(B)=1-p;
your modelling freedom consists in setting q, your guess at the value of p;
and utility at timestep t is just the cumulative number of correct guesses.

Then at each timestep t, for a given q, the expected gain in utility is
EU_q[t] = pq + (1-p)(1-q).

It should not be hard to prove that
|p-q0|<|p-q1| implies EU_q0[t] > EU_q1[t].
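
Before attempting the proof, the inequality can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes q is read as the probability of guessing A at each step, independently of the environment, and the function names are illustrative. Rearranged, pq + (1-p)(1-q) = (1-p) + q(2p-1), which is linear in q, so it is worth trying pairs on both sides of p as well as the endpoints q=0 and q=1.

```python
def expected_utility(p: float, q: float) -> float:
    """Per-step probability of a correct guess.

    Assumption: the environment emits A with probability p and the
    guesser independently guesses A with probability q.
    """
    return p * q + (1 - p) * (1 - q)


def closer_is_better(p: float, q0: float, q1: float) -> bool:
    """Check the implication |p-q0| < |p-q1| => EU_q0 > EU_q1 for one triple."""
    if abs(p - q0) < abs(p - q1):
        return expected_utility(p, q0) > expected_utility(p, q1)
    return True  # premise is false, so the implication holds vacuously


if __name__ == "__main__":
    p = 0.7
    # Hypothetical test pairs: both below p, one equal to p, one at the endpoint.
    for q0, q1 in [(0.6, 0.4), (0.7, 0.5), (0.7, 1.0)]:
        print(q0, q1,
              expected_utility(p, q0),
              expected_utility(p, q1),
              closer_is_better(p, q0, q1))
```

With p = 0.7 the pairs (0.6, 0.4) and (0.7, 0.5) satisfy the implication, but (0.7, 1.0) does not: under this scoring the expression grows with q whenever p > 1/2, so the claim as stated seems to need a restriction on q0 and q1 or a different utility.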

What I had in mind was a situation in which there is a programmable
external device with higher-precision arithmetic than you have, so
it can estimate p better than you. It's a rather artificial example
(although this is the human situation with respect to electronic
hardware), but the situation involved would not be hard to represent,
superficially anyway, and that would be enough for the deduction to
be made.

So that's a simple case, "where the statistical structure of the
environment is known", as you put it below. The more abstract
cases will revolve around proofs of *algorithmic* superiority,
perhaps.

Eliezer said

>Mitch, I haven't found that problem to be trivial if one seeks a precise
>demonstration. I say "precise demonstration", rather than "formal proof",
>because formal proof often carries the connotation of first-order logic,
>which is not necessarily what I'm looking for. But a line of reasoning
>that an AI itself carries out will have some exact particular
>representation and this is what I mean by "precise". What exactly does it
>mean for an AI to believe that a program, a collection of ones and zeroes,
>"estimates probabilities" "more accurately" than does the AI? And how does
>the AI use this belief to choose that the expected utility of running its
>program is ordinally greater than the expected utility of the AI exerting
>direct control? For simple cases - where the statistical structure of the
>environment is known, so that you could calculate the probabilities
>yourself given the same sensory observations as the program - this can be
>argued precisely by summing over all probable observations. What if you
>can't do the exact sum? How would you make the demonstration precise enough
>for an AI to walk through it, let alone independently discover it?
>
>*Intuitively* the argument is clear enough, I agree.
