I have to go on record here as saying that I (and others who are poorly
represented on this list) fundamentally disagree with this statement. I
would not want readers of these posts to get the idea that this is THE
universally agreed way to build an artificial intelligence. Moreover,
many of the recent debates on this list are utterly dependent on the
assumption that you state above, so to people like me these debates are
just wheel-spinning built on nonsensical premises.

Here is why.

Friendly AIs built on decision theory have goal systems that specify
their goals: but in what form are the goals represented, and how are
they interpreted? Here is a nice example of a goal:

     "Put the blue block on top of the red block"

In a Blocks World, the semantics of this goal - its "meaning" - are not
at all difficult. All fine and good: standard 1970s-issue artificial
intelligence, etc.
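
To make concrete how little interpretation such a goal needs, here is a
minimal Python sketch of a toy Blocks World. The state representation (a
dict mapping each block to whatever it rests on) and all function names
are illustrative assumptions, not taken from any particular system.

```python
# Toy Blocks World. The state maps each block to what it rests on.
# Representation and names are illustrative assumptions only.
TABLE = "table"


def is_on(state, upper, lower):
    """True if `upper` rests directly on `lower`."""
    return state.get(upper) == lower


def is_clear(state, thing):
    """True if nothing rests on `thing`; the table always has room."""
    return thing == TABLE or thing not in state.values()


def put_on(state, upper, lower):
    """Return a new state with `upper` moved onto `lower`, if legal."""
    if upper == lower:
        raise ValueError("cannot put a block on itself")
    if not is_clear(state, upper) or not is_clear(state, lower):
        raise ValueError("both blocks must be clear")
    new_state = dict(state)
    new_state[upper] = lower
    return new_state


def goal_satisfied(state):
    """The whole 'meaning' of the goal is a single lookup."""
    return is_on(state, "blue", "red")


start = {"blue": TABLE, "red": TABLE}
end = put_on(start, "blue", "red")
```

Here `goal_satisfied(start)` is false and `goal_satisfied(end)` is true.
The goal's semantics reduce to one comparison against the world state,
which is why nobody argues about what the goal means.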

But what happens when the goals become more abstract:

     "Maximize the utility function, where the utility function
     specifies that thinking is good"

I've deliberately chosen a silly UF (thinking is good) because people on
this list frequently talk as if a goal like that has a meaning that is
just as transparent as the meaning of "put the blue block on top of the
red block". The semantics of "thinking is good" are clearly not trivial,
and in fact it is by no means obvious that the phrase can be given a
clear enough semantics to enable it to be used as a sensible input to a
decision-theory-driven AGI.

The behavior of an AGI with such a goal would depend crucially on what
mechanisms it used to interpret the meaning of "thinking is good". So
much so, in fact, that it becomes stupid to talk of the system as being
governed by the decision theory component: it is not, it is governed by
whatever mechanisms you can cobble together to interpret that vague goal
statement. What initially looked like the dog's tail (the mechanisms
that govern the interpretation of goals) starts to wag the dog (the
decision-theory-based goal engine).
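
A deliberately crude Python sketch of the tail wagging the dog. Everything
in it is hypothetical: the `Outcome` fields, the two candidate readings of
"thinking is good", and the numbers are invented purely for illustration.
The point is only that the decision-theoretic part (`decide`) stays fixed
while the interpretation determines the behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical summary of what an action would produce."""
    name: str
    cpu_hours: float        # raw computation expended
    novel_conclusions: int  # new results actually derived


# Two made-up candidate actions with made-up numbers.
ACTIONS = [
    Outcome("spin in an idle loop", cpu_hours=100.0, novel_conclusions=0),
    Outcome("prove a small theorem", cpu_hours=2.0, novel_conclusions=3),
]


def thinking_as_computation(outcome):
    """Reading 1 (assumed): 'thinking' means computation performed."""
    return outcome.cpu_hours


def thinking_as_insight(outcome):
    """Reading 2 (assumed): 'thinking' means new conclusions reached."""
    return float(outcome.novel_conclusions)


def decide(actions, utility):
    """The 'decision theory' component: pick the highest-utility action."""
    return max(actions, key=utility)


choice_a = decide(ACTIONS, thinking_as_computation)
choice_b = decide(ACTIONS, thinking_as_insight)
```

Under the first reading the system chooses the idle loop; under the
second it chooses the theorem. `decide` is identical in both calls, so
whatever supplies the interpretation is what actually governs the system.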

The standard response to this criticism is that while the semantics are
not obvious, the whole point of modern AI research is to build systems
that do rigorously interpret the semantics in some kind of compositional
way, even in the cases of abstract goals like "thinking is good". In
other words, the claim is that I am seeing a fundamental problem where
others only see a bunch of complex implementation details.

This is infuriating nonsense: there are many people out there who
utterly disagree with this position, and who have solid reasons for
doing so. I am one of them.

So when you say "Friendly AIs [...] act according to decision theory,"
you mean "The particular interpretation of how to build a Friendly AI
that is common on this list acts according to decision theory."

And, as I say, much of the recent discussion about passive AI and goal
systems is just content-free speculation, from my point of view.

Richard Loosemore

