From: Russell Wallace (russell.wallace@gmail.com)
Date: Mon May 09 2005 - 14:42:34 MDT
On 5/9/05, Sebastian Hagen <sebastian_hagen@gmx.de> wrote:
> This design for Friendliness content is based on your decisions,
> however. And in a lot of similar cases (the basic design, but also some
> parameters; see your responses to my example questions) you are
> obviously willing to assume the role of 'dictator of Earth' (Eliezer
> called it 'taking over the world' in the CV document on the sl4-wiki) -
> that is, you are specifying basic rules that, if implemented by a RPOP,
> could not later be overridden under any circumstances.
Perhaps I wasn't clear, then - I am _not_ proposing the RPOP take over
the world. I don't think that's likely to be possible, and if it were
possible it would be appallingly dangerous, not to mention ethically
dubious. I'm proposing that once the RPOP exists, people decide what
sort of future they want, then give the RPOP the keys and tell it to
go implement humanity's decisions.
At that point the RPOP could be called "dictator of the universe" in
terms of physical capability, but that would be highly misleading,
because it would imply independent volition. In my view the RPOP would
be just an operating system, not sentient or possessed of value
judgements, simply implementing the rules it was given at the start.
(I prefer the term "operating system" to "sysop", to emphasize that I
don't think the RPOP will be sentient, nor understand sentience or
value.)
(This differs from collective volition in that I think humanity's
_actual_ wishes will have to be used, not what the RPOP thinks they
would have been if we were wiser; hence, we need to decide up front to
exercise care and humility in what we do wish for.)
> Is the suggestion to decide whether individual humans are capable of
> making this choice based on their age an attempt to make your model more
> acceptable to individuals used to current society?
It is an acknowledgement that if I suggest a 13-year-old who passes a
bunch of IQ and maturity tests be allowed to hop on a bus to the
Transcend by himself against his parents' wishes, the answer the law
will give to that is no, end of story. I don't make the law. Of course
if you think you can get it amended, by all means feel free to try!
> So all people insisting that they want a domain exclusively for
> themselves (and any other entities they create) would get one?
To filter out most of the "ooh, I want a galaxy for myself so I can
fill it with my obedient slave creations, muhahaha!" crowd, I would
suggest that the creation of a domain require an application by a
group of a certain minimum size, perhaps something like 100 people.
> Are these limits always fixed at domain creation, are they always
> subject to later revision, is their later modifiability determined at
> domain creation, or does it use completely different rules?
Fixed at domain creation. (There could be some domains whose
proponents want to make their rules modifiable later, but there had
better be at least some fixed ones.)
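(To make that concrete, here's a toy sketch of what a domain charter
record might look like. Everything in it is just my illustrative guess
at how those two points could be encoded: the DomainCharter type, the
create_domain() function, the MIN_FOUNDERS threshold of 100 and the
rules_modifiable flag are all made-up names, not a real specification.)

# Illustrative sketch only; the names and the 100-founder threshold are
# assumptions drawn from the discussion above, not a real specification.
from dataclasses import dataclass

MIN_FOUNDERS = 100  # "a group of a certain minimum size"

@dataclass(frozen=True)              # frozen: the charter is fixed at creation
class DomainCharter:
    name: str
    founders: tuple                  # the group that applied for the domain
    rules: tuple                     # fixed at domain creation
    rules_modifiable: bool = False   # a few domains might opt in to this

def create_domain(name, founders, rules, rules_modifiable=False):
    """Reject applications from groups below the minimum size."""
    if len(founders) < MIN_FOUNDERS:
        raise ValueError("a domain application needs at least "
                         "%d people, got %d" % (MIN_FOUNDERS, len(founders)))
    return DomainCharter(name, tuple(founders), tuple(rules), rules_modifiable)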
> > - If you're from the Transcend, for example, you're not going to be
> > allowed into the Amish domain anyway.
> Does this apply to all sentiences that have ever had any kind of contact
> with a transhuman intelligence, even if they themselves are plain
> 'human' in capabilities and appearance? There are a lot of possibilities
> for memetic 'corruption', as perceived by the Amish, here. If the Sysop
> explains this to them, it is quite possible that they will end up
> blocking all incoming data (including visitors) from the vast majority
> of other domains.
I think the Amish are likely to write such a rule for their domain,
and with good reason.
> > - Someone in a domain where lots of computing power is available to
> > individuals could simulate a bunch of sentient beings and not allow
> > them access to the exit.
> That's a valid concern. There's also the possibility that physically
> existing individuals (or individuals existing on the lowest level
> available in the domain) could be denied the possibility to use the exit
> by other entities existing inside it; how exactly this would be possible
> depends on the implementation. If the exit is linked to spatial
> coordinates, one could prevent access to them. If it required specific
> information, one could eliminate the knowledge, etc.
Indeed so.
> Eliezer's Collective Volition is an example of a Friendliness content
> model that defers this kind of decision to an instance that doesn't
> consist of plain humans. I don't know if it's actually workable in
> practice, though if it isn't, it should fail safely.
Only if by "fail safely" you mean "Eliezer will be wise enough to see
it's not a winner before go-live day". If actually implemented, I
think it is highly likely to fail in such a way as to eliminate all
sentient life.
> What about the possibility of domains that use a significant part of the
> available resources developing into "hell worlds" with a significant
> negative utility under most common value systems?
> It is possible that those domains would drag the total utility
> (according to the mentioned value systems) of the entire system below zero.
It is conceivable. I don't think it's going to happen (most people
don't want hell worlds, after all), though if you can come up with a
good way to prevent it, then by all means that could be added as a
metarule.
(It wouldn't surprise me at all, mind you, if 99% of the universe ends
up, after the first billion years or so, in domains empty of sentient
life; that's still better than 100%.)
> > The rules for individual domains? Get a bunch of intelligent,
> > responsible humans who are representative of humanity's various
> > cultures and acceptable to the political process to draw them up. Use
> Making that selection would be quite difficult by itself.
Yep. The process I'd be inclined to suggest would be that any group
that meets a certain minimum size can specify the domain they want to
live in.
I don't know whether political agreement will be possible, or whether
humanity is doomed to fight over the matter until it becomes too late
to do anything; but I think a proposal that goes as far as possible in
the direction of "everyone gets what they want" has the best chance of
avoiding that.
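(In terms of the toy create_domain() sketch above, a group's
application would just be a call like the following; again, all the
names and rule strings are purely illustrative.)

# Purely illustrative, reusing the hypothetical create_domain() from above.
amish_domain = create_domain(
    name="Plain Community",
    founders=["member_%d" % i for i in range(150)],  # comfortably over 100
    rules=("no incoming data or visitors from transhuman domains",
           "the exit must always remain physically reachable"),
    rules_modifiable=False,          # fixed at creation, as discussed
)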
> > Friendly AI for error checking (i.e. have it try to point out where
> > the humans' decisions would have unintended consequences, though the
> > final decision has to be made by humans).
> The conventional assumption would be that even a RPOP couldn't
> predict all of the possible consequences in detail; after all, you
> probably don't want to spend more resources (units of computation) for
> initial simulations than for the actual implementation.
Yes indeed; at the end of the day, it's going to have to come down to
human judgement.
- Russell