Eliezer S. Yudkowsky wrote:

> Mitchell Porter wrote:

> According to Bishop94 above, for example, it amounts
> to adding an extra term in the loss function, and it is possible to get
> the same benefit without some dangers by modifying the algorithm
> directly instead of training with noise.

Er, sorry, not the loss function. On closer examination, it looks like a
function that represents the cost of different trained networks; that is,
a term that approximates the prior probability of a network.
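
To make the "extra term" idea concrete, here is a toy sketch of my own; it
is not Bishop's general derivation. It assumes a linear model with squared
error, where adding zero-mean Gaussian noise of variance sigma^2 to the
inputs gives, in expectation, the noise-free error plus sigma^2 * |w|^2.
That penalty is the usual weight-decay term, which can also be read as the
negative log of a Gaussian prior over the weights, up to constants. All
names and numbers below are made up for the illustration.

```python
import random

def clean_loss(w, xs, ts):
    """Mean squared error of the linear model y = w . x on the given inputs."""
    return sum((sum(wi * xi for wi, xi in zip(w, x)) - t) ** 2
               for x, t in zip(xs, ts)) / len(xs)

def penalized_loss(w, xs, ts, sigma):
    """Noise-free loss plus the extra term sigma^2 * |w|^2 (no noise involved)."""
    return clean_loss(w, xs, ts) + sigma ** 2 * sum(wi * wi for wi in w)

def noisy_loss(w, xs, ts, sigma, draws, rng):
    """Monte Carlo estimate of the loss with Gaussian noise added to each input."""
    total = 0.0
    for _ in range(draws):
        noisy = [[xi + rng.gauss(0.0, sigma) for xi in x] for x in xs]
        total += clean_loss(w, noisy, ts)
    return total / draws

# Hypothetical toy data: 20 examples, 3 inputs each, arbitrary fixed weights.
rng = random.Random(0)
xs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
ts = [rng.uniform(-1, 1) for _ in range(20)]
w = [0.5, -1.0, 2.0]
sigma = 0.3

estimate = noisy_loss(w, xs, ts, sigma, draws=5000, rng=rng)
exact = penalized_loss(w, xs, ts, sigma)
print("loss averaged over noisy inputs:", estimate)
print("noise-free loss + extra term:   ", exact)
```

The two printed numbers should agree up to sampling error, which is the
sense in which one can modify the algorithm directly (add the term) rather
than train with noise. For nonlinear networks the correspondence is only
approximate, so treat this as the simplest case, nothing more.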

