does complexity tell us that there are probably exploits?

From: Daniel Radetsky (daniel@radray.us)
Date: Mon Aug 22 2005 - 18:15:39 MDT


I decided I would respond at least once to Michael Vassar's objection before
putting it in my essay, just in case he's right. I'm writing this in a
non-conversational tone, partially because I'm practicing for the essay, and
partially because enough time has elapsed since Vassar's response that it no
longer feels like a conversation. I am also writing more verbosely than is
probably necessary to convey my response. This is because I recognize my
general ignorance about many of these topics, and writing verbosely will allow
a reader to diagnose errors of reasoning in my email, if any exist.

Vassar wants to say that despite my objections, we ought to worry about
exploits but not ninja hippos, because the claim "there are exploits" has a
higher prior probability than "there are ninja hippos." This is because,
according to Vassar, the Kolmogorov complexity of e, the first proposition, is
less than the complexity of h, the second proposition.

Obviously, e and h are not bitstrings and do not straightforwardly have
complexity. To discuss the complexity of objects in the ordinary sense (rocks,
fish, and so on), we need some sort of encoding scheme which maps real-world
objects to
bitstrings. Vassar must hold either:

1. The encoding scheme is a matter of free choice on our part, and every
complexity is complexity-for-scheme-S.

2. There is a universal, correct encoding scheme.

I assume that Vassar holds (2), because (1) is too easy for someone holding my
position to get around. Briefly, I can pick an artificial encoding scheme where
ninja hippos have a simpler representation than exploits.
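
To make this concrete, here is a toy sketch in Python. The two encoding
schemes and their codewords are entirely made up for illustration; nothing
hangs on the details.

# Scheme A: a "natural"-looking encoding where exploits get the short codeword.
scheme_a = {
    "there are exploits":     "01",          # 2 bits
    "there are ninja hippos": "1101001110",  # 10 bits
}

# Scheme B: a deliberately artificial encoding that reverses the ranking.
scheme_b = {
    "there are exploits":     "1101001110",  # 10 bits
    "there are ninja hippos": "01",          # 2 bits
}

def complexity(scheme, proposition):
    """Description length (in bits) of a proposition under a scheme."""
    return len(scheme[proposition])

# Under scheme A exploits are "simpler"; under scheme B ninja hippos are. If
# the scheme is a free choice (option 1), complexity rankings carry no
# argumentative weight on their own.
for name, scheme in (("A", scheme_a), ("B", scheme_b)):
    print(name,
          complexity(scheme, "there are exploits"),
          complexity(scheme, "there are ninja hippos"))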

If Vassar holds (2), then what is the encoding scheme, and what is the basis
for saying that this is THE scheme? I know of only one such basis, and will
assume that it is the only basis here. This basis is the idea that the universe
is fundamentally digital: it consists of a collection of fundamental elements,
each of which is in one of two states. I have heard this theory repeated by a lot of
SL4 types and would not be surprised if Vassar held it. I will call the theory
the Binary Universe Thesis (BUT). One who asserts the BUT will naturally hold
that any proposition p corresponds to some binary state of affairs (BSA), and
can be expressed as a bitstring with a definite complexity. So e and h are
numbers, or sets of numbers (in case more than one BSA would count as there
being exploits or ninja hippos).

I'm not entirely sure how one gets prior probability from complexity, but I'm
willing to accept that it can be done. We'll say there is a function f such
that for some proposition p with a corresponding BSA n, f(K(n)) = P(p) = x,
where x is a real number, K is the complexity of a string, and P is a
probability function. (I will sometimes use the same symbol to represent both
the BSA and the proposition, but I don't think this will be confusing. I
suspect many of the proponents of the BUT would assert that the proposition
and the BSA are identical, and so no distinction need be made.) Using some
technique resembling the above, Vassar holds that P(e) > P(h), and so we
should worry about e before we worry about h. This is a valid defense against
my h-based counterexample, but it doesn't actually get the job done of
defending a worry about exploits.
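
For what it's worth, the standard move in the algorithmic-probability
literature is, up to normalization, f(k) = 2^-k (the Solomonoff-style
universal prior). Here is a sketch under that assumption; the K values are
invented stand-ins, since true Kolmogorov complexity is uncomputable.

def f(k):
    """Map a description length k (in bits) to an unnormalized prior,
    using the conventional universal-prior form f(k) = 2**-k."""
    return 2.0 ** -k

K_e = 20  # hypothetical complexity of e, "there are exploits"
K_h = 35  # hypothetical complexity of h, "there are ninja hippos"

# A shorter description means a higher prior: K(e) < K(h) gives
# P(e) > P(h), which is the inference Vassar needs.
assert f(K_e) > f(K_h)
print(f(K_e), f(K_h))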

The first problem with Vassar's position is a problem for Orthodox Bayesianism
in general. Suppose we want to know the prior probability of p, so we calculate
f(K(p)) and get x. However, to do this, we presuppose BUT. We have not
calculated P(p), but rather P(p|BUT), since we would be wrong to claim the
probability of p is x if BUT were false. However, to get from P(p|BUT) back to
P(p) we need to know P(BUT), and we can't find that out by using f(K(BUT)), as
this would be begging the question. So we need to use ordinary, non-formal
scientific know-how to confirm BUT. I am under the impression that BUT is far
from strongly confirmed, but rather is merely another exciting theory. Is this
true? How confident can we be that BUT is the case? If we cannot be quite
confident, we cannot make the kind of claims about prior probability that
Vassar needs to make.
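
In Bayesian terms the dependence looks like this; the numbers below are
placeholders, and the point is only that P(p) swings with P(BUT).

# By the law of total probability:
#   P(p) = P(p|BUT) * P(BUT) + P(p|not-BUT) * P(not-BUT)
# f(K(p)) hands us only the first conditional; P(BUT) must come from elsewhere.

p_given_but = 0.20      # the value f(K(p)) would deliver, presupposing BUT
p_given_not_but = 0.50  # unknown; some non-complexity-based estimate

for p_but in (0.9, 0.5, 0.1):  # varying confidence in BUT itself
    p = p_given_but * p_but + p_given_not_but * (1 - p_but)
    print("P(BUT)=%.1f  ->  P(p)=%.2f" % (p_but, p))

# P(p) moves substantially with P(BUT), so the complexity-based prior is only
# as good as our independent, non-circular confidence in BUT.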

The second problem has to do with the structure of an argument against my
position. If my intuition tells me that P(a) = P(b), and Vassar (or someone
else) wants to defeat my intuition, he can do it either mathematically or
intuitively. No doubt Vassar wishes he could make a mathematical argument that
P(e) != P(h), but this is not possible, because to do so Vassar would have to
possess knowledge which (as far as I know) he doesn't: the BSAs corresponding
to e and h. So he must defeat my intuition intuitively. Vassar simply claimed
that e is vague and h specific -> K(e) < K(h) -> P(e) > P(h). This is fine,
but it's not clear that m="there is magic" (in the ordinary sense of the word)
or l="there is a lurking horror" or g="there is a god" are any more specific
than e.

Vassar also points out that even if, for example, g and e are equally vague,
and hence have similar prior probability, it still may not be rational to treat
them the same way. Obviously, we need to worry about God only if there is
something relevant we can do about it. Let g'="God will send you to hell no
matter what." How should we respond to the possibility that g'? We shouldn't,
because nothing we can do will change it. On the other hand, g''="God will send
you to hell unless you go to church on Sunday" should be responded to by going
to church on Sunday iff we are justified in believing g''. In this case, we are
justified to the tune of f(K(g'')). However, the proposition g'''="God will
send you to hell unless you avoid going to church on Sunday" tells us to do the
opposite of what g'' tells us. Vassar would claim that f(K(g'')) is equal to
f(K(g''')), and so we can't use our worries about going to hell to decide
whether or not to attend church. The claim is encapsulated by the principle
that we are not justified in worrying about p if there is no evidence for p
and if, for each action X whose payoff p would alter, the complexity of the
relevant version of p remains roughly constant across all X. If we wanted to
make this principle more mathematical, we could require that the distance
between the highest and lowest probability remain below a certain value, with
the value perhaps related to the mean probability or the disutility of the
event.
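
Here is a sketch of that principle with the g''/g''' pair as input; the
priors and the tolerance are invented placeholders.

def spread(priors):
    """Distance between the highest and lowest prior among rival,
    action-opposed hypotheses."""
    return max(priors.values()) - min(priors.values())

# g'' and g''' are equally specific, so they get equal complexity-based priors.
priors = {
    "g'': hell unless you attend church on Sunday": 2.0 ** -40,
    "g''': hell unless you avoid church on Sunday": 2.0 ** -40,
}

TOLERANCE = 1e-15  # could instead be tied to the mean prior or the disutility

if spread(priors) < TOLERANCE:
    print("The hypotheses cancel: no action is favored, so the worry is idle.")
else:
    print("The hypotheses differ enough to favor one action over another.")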

Here's the problem as I see it: I claim that a world which contains exploits
is about as complex as a world which does not (or: there are two possible
worlds w1 and w2 such that both are empirically equivalent to the actual world,
w1 contains exploits, w2 does not, and K(w1) ≈ K(w2)). Suppose we were to
engineer humans
which, for whatever reason, could not be mind-controlled by UFAI. Now we want
to decide whether or not we should box the AI, recognizing that if there are
exploits, we're screwed. Necessarily, we cannot have evidence that there are
exploits, so we consider our complexity-based priors. But since K(w1) ≈ K(w2),
they should have the same prior probability. If w1 were the case, then we
should not box the AI, because if it is going to be friendly it would be a
waste of time and resources to box it, and if it is going to be unfriendly,
boxing won't do any good. But if w2 were the case, then we should box it,
because if the AI is friendly, we'll just have wasted a bit of time and
resources, but if it is unfriendly we've averted disaster. Hence we cannot use
our worry about e to decide between boxing and not boxing, just as in the case
of God. Unless Vassar can compellingly argue that K(w1) and K(w2) differ
substantially, I don't see how
the complexity argument can move forward.
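
The deadlock can be laid out explicitly; the 0.5 priors below just encode
K(w1) ≈ K(w2), and the action descriptions mirror the argument above.

# Two empirically equivalent worlds with equal complexity-based priors.
priors = {"w1 (exploits exist)": 0.5, "w2 (no exploits)": 0.5}

# The recommended action flips with the world.
best_action = {
    "w1 (exploits exist)": "don't box: boxing is futile, and merely a waste",
    "w2 (no exploits)":    "box: cheap insurance against an unfriendly AI",
}

for world, p in priors.items():
    print("P=%.1f: if %s, then %s" % (p, world, best_action[world]))

# Two equally probable worlds issue opposite recommendations -- the same
# deadlock as the g''/g''' case, so the priors alone cannot decide.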

So, to sum up, arguing that complexity tells us to worry about exploits has two
major problems. It relies on a premise that is even more controversial than the
conclusion, and the logical conclusion of its premises seems to be the opposite
of the intended conclusion.

Daniel


