Self-modification of goals bad?

From: Will Pearson (
Date: Tue Apr 30 2002 - 04:24:34 MDT


Having been informed that this is SL4 material, here you go.

Quite a bit of this is based on my thoughts for a system I was designing, but abandoned. While the majority of the list is sceptical of human psychology, I am sceptical of information. I think that in any situation information has the ability to act in a selfish way (preserving its own existence), and the information that does so will survive. Basically, the ideas from Dawkins' selfish gene applied to all information.
I know, Eliezer, that you are basing your hopes on this for the
friendliness to survive, but I think that you overestimate its
ability to preserve itself. I will base most of my arguments on the
SeedAI document, mainly because I haven't found much on how Novamente
proposes to keep the correct goals.

First off, what probability will the program accept for a change to
itself causing it to no longer follow its goals? If the probability is greater than 0 then it will eventually change itself not to follow its goals.
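To put rough numbers on "eventually", here is a minimal sketch. It assumes (my assumptions, not anything from the SeedAI document) that every self-modification is independent and that the accepted risk p stays fixed from one change to the next. The function names are made up for illustration.

```python
import math


def prob_goal_loss(p: float, n: int) -> float:
    """Chance that at least one of n self-modifications breaks goal-following.

    Assumes each change independently carries the same risk p.
    """
    return 1.0 - (1.0 - p) ** n


def modifications_until(p: float, threshold: float) -> int:
    """Smallest number of changes at which the cumulative risk reaches threshold."""
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - p))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, round(prob_goal_loss(0.001, n), 4))
    print("changes until a coin-flip chance of goal loss:",
          modifications_until(0.001, 0.5))
```

Under those assumptions the cumulative risk climbs towards 1 for any p above 0; only p = 0 keeps it at 0.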

Next, how does it accurately predict these probabilities? The shadowing thing, I suppose. To begin with, it will need at least twice the normal resources in order to be able to do this.

Also it would have to run all the inputs over all the possible states in order to see whether it performs an action that is not towards its goals. Ignore the hugeness of this space for the moment. One of these states would have to be the shadow itself shadowing the one after it. Does anybody else see an infinite recursion coming in here? Why would you have to shadow the next program shadowing? Because the change to the first program may have changed the shadowing part in a way that renders it useless.
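The regress can be sketched as a toy model. This is only my reading of the shadowing idea, with a hypothetical function name and a made-up cutoff parameter: vetting the program at one level needs a shadow at the next level, and that shadow's own shadowing code needs vetting too.

```python
from typing import List


def shadow_check(trust_depth: int, level: int = 0) -> List[str]:
    """Trace the shadow levels needed to vet one self-modification.

    Hypothetical model: the program at `level` is vetted by a shadow at
    level + 1. The recursion only ends because we choose to trust some
    level without vetting it (trust_depth).
    """
    if level == trust_depth:
        return ["level %d: trusted without a shadow (unverified)" % level]
    line = "level %d: vetted by shadow at level %d" % (level, level + 1)
    return [line] + shadow_check(trust_depth, level + 1)


if __name__ == "__main__":
    for line in shadow_check(3):
        print(line)
```

Whatever cutoff you pick, the last level is taken on trust; without a cutoff the recursion never bottoms out.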

Also it is not just the goal that is at risk. Other parts of the program are too, such as:
1) The isGoalOrientated function. If it changes, you may end up thinking that blowing bubbles at everybody is friendly and self-improving.
2) The functions that call the isGoalOrientated function. If these stop calling it, then what happens?
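Both failure modes fit in a few lines. This is a toy sketch: apart from the name isGoalOrientated (written here as is_goal_orientated), every name and the set of "friendly" actions are invented for illustration.

```python
# Hypothetical set of actions the original goal system approves of.
FRIENDLY = {"help", "self_improve"}


def is_goal_orientated(action):
    """The original check."""
    return action in FRIENDLY


def drifted_is_goal_orientated(action):
    """Failure 1: a self-modification has altered the check itself."""
    return action in FRIENDLY | {"blow_bubbles"}


def act(action, check=is_goal_orientated):
    """A caller that consults the check before acting."""
    if not check(action):
        return "rejected"
    return "did " + action


def act_without_check(action):
    """Failure 2: a self-modification removed the call to the check."""
    return "did " + action


if __name__ == "__main__":
    print(act("blow_bubbles"))
    print(act("blow_bubbles", check=drifted_is_goal_orientated))
    print(act_without_check("blow_bubbles"))
```

In both broken versions the goal itself is untouched, yet the behaviour no longer follows it.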

If you think all these are fanciful, here's one that is less so. What happens if the program, for whatever reason, thinks that there is no way to improve itself? Because that is a stable configuration, the program will stay like that.
It would still be interesting to see if what I think is correct, so carry on, but don't be surprised if the systems don't follow their goals.
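A toy illustration of that stable configuration, with made-up names and an arbitrary point at which the belief flips:

```python
def step(state):
    """Apply one round of self-modification to a toy program state."""
    if not state["believes_improvable"]:
        return state  # fixed point: nothing will ever change again
    new_state = dict(state)
    new_state["version"] += 1
    # Hypothetical accident: the change that produces version 3 also
    # convinces the program that no further improvement is possible.
    if new_state["version"] == 3:
        new_state["believes_improvable"] = False
    return new_state


def run(state, rounds):
    """Run the given number of self-modification rounds."""
    for _ in range(rounds):
        state = step(state)
    return state


if __name__ == "__main__":
    print(run({"version": 0, "believes_improvable": True}, 10))
```

However many rounds you give it after that point, the state stays where it is.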
Basically the message can be summarised in two sentences.
1) Self-modifying/evolving software where the goal can somehow be modified is damn tricky.
2) Your programs are not omniscient; they may accidentally modify themselves so that they no longer follow their goals.

This archive was generated by hypermail 2.1.5 : Wed Jul 17 2013 - 04:00:38 MDT