RE: Unambiguous Language WAS: MEME: A.I.: Artificial Intelligence

From: Durant Schoon (durant@ilm.com)
Date: Tue Jun 19 2001 - 17:21:16 MDT


PREFACE: I'm going to extend the idea of my previous post. The
original proposition was: "A variant of English can be used for
unambiguous communication". We might use any natural human language
as a superset, but examples here will be in English. English might be
sufficiently powerful but just needs to be constrained. The question
could be: "Are there examples of the English language which cannot be
disambiguated (ones which we cannot mend through some sort of
additional information)?" or even "Is unambiguous communication
possible?". If the answer to either of these is no, then we have to
give up. The discussion below presumes there is an "unambiguous
intention of the speaker". This might be false in some cases and the
choice of language would neither help nor hinder. We should also be
concerned if there are some things which cannot be communicated.

I might not arrive at solutions, but I mention a lot of things here
:)

> From: "Patrick McCuller" <patrick@kia.net>

[...]

> Very funny! Still, even that might not suffice. There are many dictionaries,
> and all do not agree. Because definitions contain words, and we relate to each
> of those words, we would all have to have exactly the same understandings of
> all English words.

If there are multiple dictionaries and we need them to agree, then
perhaps we can define equivalence mappings:

  commoner(MWOCD,ed7,2) "means the same as" commoner(MSED,ed666,13)*
  foo(HD,ed13,1) "means the same as" foo(PD,ed415,7)+<modification>**

So we can find all the matches that match exactly and then find all
the matches which are close and attach the difference (<modification>
above).

* (MSED,ed666,13) = MicroSoft Encarta Dictionary, edition 666, def'n 13,
  35 cents has been deducted from your account. Thank you for using MSN.

** foo according to (HD,ed13,1) is the same as foo according to
   (PD,ed415,7), EXCEPT foo(PD,ed415,7) also means <modification>.
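
A minimal sketch of the equivalence mappings above, in Python. The
names here (Sense, EQUIVALENCES, compare) are hypothetical and made up
for illustration, and the dictionary codes are just the invented ones
from the examples; this is one way the idea might be written down, not
an existing tool.

```python
from typing import NamedTuple

class Sense(NamedTuple):
    """One numbered definition of a word in one edition of one dictionary."""
    word: str
    dictionary: str
    edition: int
    definition: int

# Hypothetical mapping table. A value of None means the two senses match
# exactly; a string holds the attached difference (<modification> above).
EQUIVALENCES = {
    (Sense("commoner", "MWOCD", 7, 2), Sense("commoner", "MSED", 666, 13)): None,
    (Sense("foo", "HD", 13, 1), Sense("foo", "PD", 415, 7)): "<modification>",
}

def compare(a: Sense, b: Sense) -> str:
    """Report whether two senses are known to match, exactly or closely."""
    for key in ((a, b), (b, a)):  # look the pair up in either order
        if key in EQUIVALENCES:
            difference = EQUIVALENCES[key]
            if difference is None:
                return "exact"
            return f"close, differs by {difference}"
    return "unknown"

print(compare(Sense("commoner", "MWOCD", 7, 2), Sense("commoner", "MSED", 666, 13)))
print(compare(Sense("foo", "PD", 415, 7), Sense("foo", "HD", 13, 1)))
```

The lookup ignores which side the modification belongs to; a fuller
version would have to record that direction as well.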

> Even if we all agreed on the book definitions of all words, and even if
> we were willing to spend huge amounts of processing power cross-referencing
> words (in many varied and fluidly changing dictionaries) we would still have a
> problem presenting ideas without the possibility of misinterpretation.

Regarding word ambiguity:
------------------------

But it might only be a matter of practicality. We might be able to
communicate unambiguously, if we define all our terms ahead of time
(see above footnote style descriptions).

At some point we hit bottom level symbol-referent bindings (as an
example of this I mean the symbol "tree" refers to "the concept of
tree" and is therefore "bound" to it. Or the symbol "Patrick" refers
to and is bound to (my mental symbol of) you, the person.) See also
"More regarding experiential references" below.

Hardware Limitations:
--------------------

One interesting point is that our understanding of English is
*hardware limited*. We (humans) can usually only remember 7 chunks of
information at a time, right?

      http://www.well.com/user/smalin/miller.html

There are examples of English sentences which are grammatically
correct and unambiguous, but which are just too hard for most people
to parse correctly due to memory limitations. I tried to find an
example on the web, but I failed. I almost remember a sample that
looked something like: "The sentence Ben said Gordon said Samantha
said Durant said Patrick said Jimmy said Jordon said Dave said Eli
said was right" - but that's not an actual example, unfortunately...

Regarding grammatical ambiguity:
-------------------------------

Consider the grammatically correct sentence:

        Visiting extropians can be exciting.

Could either mean:
        a) Extropians, who are visiting, can be exciting.
Or:
        b) Going to visit extropians can be exciting.
      
Normally we distinguish between the possibility of (a) or (b) by
*context*, i.e. the rest of the paragraph, chapter, etc. will support
one interpretation over the other. This is not guaranteed, of
course. So there are maybe two patches here: 1) disallow known
ambiguous sentence patterns (we're getting a subset of English here)
or 2) footnote ambiguous grammar in the same way we were foot-noting
ambiguous words above. If the speaker means to convey 7 concepts,
then those 7 concepts should be noted.

Note:
The above example sentence was modified from one in "Patterns of the
Mind : Language and Human Nature" by Ray Jackendoff, p. 48. The book
is pretty simple (just right for me), but a good read.

Another example:

        I gave a copy of CFAI to the man in the chair with the broken leg.

which could mean:
        I gave a copy of CFAI to the man in (the chair with the broken leg).
Or:
        I gave a copy of CFAI to (the man with the broken leg) in the chair.

This would be especially ambiguous if you looked over and saw one man
with a broken leg sitting in a chair and another man in a chair which
had a broken leg, each reading a copy of CFAI. It would be funny,
too.
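
The two readings can be enumerated mechanically. What follows is a
rough sketch in Python, assuming a deliberately simplified model in
which each prepositional phrase modifies either the head noun or the
noun of an earlier phrase. The function name and data layout are
hypothetical, and attachment to the verb is ignored.

```python
from itertools import product

def attachments(head, phrases):
    """Enumerate which noun each prepositional phrase could modify.

    phrases is a list of (preposition, noun) pairs that follow the head
    noun. Phrase i may attach to the head or to the noun of any earlier
    phrase. For three or more phrases this over-generates, because it
    does not rule out crossing attachments.
    """
    nouns = [head] + [noun for _, noun in phrases]
    # Phrase i can attach to nouns[0] through nouns[i].
    choices = [nouns[: i + 1] for i in range(len(phrases))]
    readings = []
    for combo in product(*choices):
        readings.append(
            [(f"{prep} {noun}", target)
             for (prep, noun), target in zip(phrases, combo)]
        )
    return readings

readings = attachments("the man", [("in", "the chair"), ("with", "the broken leg")])
for reading in readings:
    print(reading)
```

For this sentence the sketch yields exactly two readings: "with the
broken leg" attached to "the man", or to "the chair". A footnote would
only need to say which of the two the speaker intends.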

> We each as individuals attach our own feelings, ideas, and experiences to our
> perceptions of words.

Just to carry my ridiculous thought experiment forward, I might
"publish" my meaning prior to using it, in a personal, public
dictionary:

        "Memetic Attractor"(DD,ed1):
                1. a sort of basin of beliefs to which a sentient
                   entity can be "attracted"
                2. a meme which attracts other memes to it, the way
                   that Christianity co-opted Celtic winter
                   festivals.
                diz. using this phrase sometimes makes me dizzy for
                     unknown reasons. This attached note means I'm
                     referencing a feeling of dizziness in addition
                     to any other meaning.

Of course, these definitions would need to be replete with
disambiguating references as well (Is this hard? Am I side stepping
the real problem here? Does it all work if I assume everything
bottoms out?).

More regarding experiential references:
--------------------------------------

Now, there are some cultural and personal experiential definitions
which are hard to convey. Consider the sentence:

     It was like the first time I used a bidet.

Even if you look up the definition of "bidet" in a French dictionary,
you still might have no idea how the author's experience of "it" was
like the author's first experience using a "bidet". But I could
easily imagine an example of a satisfactory description of the
specific details. So the limitation is not the language, just the
cumbersome effort to convey all the relevant particulars.

Regarding reliance on shared experiences:
----------------------------------------

If I talk about colors to a blind person (ok, we're getting
dangerously close to the "qualia" issue) she might have no way of
knowing what I'm talking about. I can probably list connotations
about feeling "hot", or making literary associations to "anger",
which she would probably know. This is different from "ambiguity",
which arises when more than one interpretation is possible. This is
the case of NO possible interpretation.

If I were talking with a sentient machine which has no knowledge or
experience of heat or anger and no "close" mapping to those concepts
exists, we'd have the same problem. We can probably work around this
though in some (dare I suggest it) Skinnerian behavioral notions of
cause and effect (i.e. if the sentient machine knows how these
concepts "affect me behaviorally" we might still be able to
communicate to a lesser extent).

In fact it is very important that we can communicate unambiguously
(or at least effectively) if we are, or someone we know is :), going
to build a self enhancing AI any time soon.

> Language as we use it is fluid. Dictionaries do not contain every relevant
> piece of information, and they cannot tell us how others will interpret a
> given word.

Agreed. We'd have to change the way we use it. Just to make this a
little more SL3-ish, can we assume a fat pipe*** connection to our
hardware enhanced brains, so hardware considerations disappear?

"Dictionaries do not contain every relevant piece of information"
              True. But we could publish our personal meanings first
              and reference them parenthetically (see above).

"Dictionaries cannot tell us how others will interpret a given word."
              Currently true. But if we can define precise enough
              definitions and enumerate all connotations, couldn't
              this be solved? I assert that "precise enough
              definitions" can be defined as needed (it just requires
              extra effort and assumes bottom level common
              experiences exist, so that we can refer to the same
              things...or sufficiently similar things. Then it
              becomes a matter of listing which meanings are
              intended).

*** Fat Pipe(DD,ed1)
    1. High bandwidth connection to a channel that carries lots of
       information.

> Given that we cannot control how others will interpret a word, a perfectly
> unambiguous language may be impossible.

So can we turn that around? "Unambiguous language may be possible if
we can control how others will interpret a word." Our next question
would be: "What kind of control would we need?" Maybe control is
unnecessary here, maybe all we need is verification that when I say
"chocolate chip cookie" your concept of this phrase is sufficiently
similar to mine for what I want to communicate. If I use the invented
phrase "cow chip cookie" and I could check that in your personal,
published lexicon you have similar definitions for "cow chip" and
"cookie", I can feel pretty confident you'll know what I'm talking
about, even if neither of us knows how one tastes.
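
Here is a minimal sketch of that verification step in Python, assuming
each of us publishes a lexicon as a simple mapping from term to
definition text. The word-overlap score and the 0.5 threshold are
arbitrary stand-ins for "sufficiently similar", and the function names
and sample definitions are invented for illustration.

```python
def overlap(definition_a, definition_b):
    """Jaccard overlap of the words in two definitions (a crude proxy)."""
    words_a = set(definition_a.lower().split())
    words_b = set(definition_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 1.0

def terms_needing_footnotes(terms, mine, yours, threshold=0.5):
    """Return the terms that are missing from either lexicon or whose
    definitions overlap less than the (assumed) threshold."""
    problems = []
    for term in terms:
        if term not in mine or term not in yours:
            problems.append(term)
        elif overlap(mine[term], yours[term]) < threshold:
            problems.append(term)
    return problems

# Invented sample lexicons.
mine = {"cow chip": "dried piece of cow dung",
        "cookie": "small sweet baked treat"}
yours = {"cow chip": "piece of dried cow dung",
         "cookie": "small sweet baked snack"}

print(terms_needing_footnotes(["cow chip", "cookie"], mine, yours))
```

An empty result means I can say "cow chip cookie" without footnotes;
any term that comes back is one I should clarify first.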

Communication through English already works (ambiguity and all). The
stronger my anticipation that you'll know what I'm talking about, the
less I'll have to reference and clarify footnote-style. With this
envisioned future footnote/hyperlink scenario, the listener doesn't
have to ask for clarification, since the listener can check the
speaker's dictionary. Cumbersome currently, but we already use the
web to look things up without bothering the speaker or other
listeners...that is if our employers grant us web access :(

For real world, common, tangible things, there is always a little
ambiguity. For example, if I sent you a cookie recipe:

        (Recipe may be halved)

        2 cups butter
        4 cups flour
        2 tsp. soda
        2 cups sugar
        5 cups blended oatmeal (blend in a blender to a fine powder)
        24 oz. chocolate chips
        2 cups brown sugar
        1 tsp. salt
        1 8 oz. Hershey Bar (grated)
        4 eggs
        2 tsp. baking powder
        2 tsp. vanilla
        3 cups chopped nuts (your choice)

        Cream the butter and both sugars. Add eggs and vanilla; mix
        together with flour, oatmeal, salt, baking powder, and
        soda. Add chocolate chips, Hershey Bar and nuts. Roll into
        balls and place two inches apart on a cookie sheet.

        Bake for 10 minutes at 375 degrees. Makes 112 cookies.****

you followed it and successfully baked the cookies. I'd say we
communicated. I don't know how you'll interpret: Hershey Bar, flour,
and soda, but somehow I can be pretty sure you'll interpret them as I
intend. Contextually, and otherwise, I expect these words should have
clear meaning. The little ambiguity that does exist is explained in
parenthetical notes (this actually reveals what I think is
ambiguous).

And yes, a full-blown common sense engine would probably be needed for
a machine to bake these cookies given only the above recipe, because
there are always many, many shared assumptions. This is where
practicality prevents what might be possible.

There might be associations which I do not want to or need to convey.
If the smell of vanilla reminds me of the (hypothetical) time I drank
a whole bottle of vanilla extract and got sick, that information is
irrelevant to you for the task of making cookies. If it were relevant
I could footnote it. *(6)

**** Regarding this cookie recipe, I had just finished a salad at
     Neiman-Marcus cafe in Dallas and decided to have a small
     dessert. Because I'm such a cookie lover, I decided to try the
     "Neiman-Marcus Cookie."

     It was so excellent that I asked if they would give me the
     recipe and the waitress said with a small frown, "I'm afraid
     not." "Well," I said, "would you let me buy the recipe?" With a
     cute smile, she said, "yes." I asked how much and she responded,
     "only two fifty. It's a great deal!" With approval, I said to
     just add it to my tab.

     Thirty days later, I received my statement from Neiman-Marcus
     and it was $285.00. I looked again and I remembered that I had
     only spent $9.95 for a salad and about $20.00 for a hat. As I
     glanced at the bottom of the statement, it said, "COOKIE RECIPE
     - $250.00" That's outrageous!

     I called Neiman's accounting department and told them the
     waitress said it was "two-fifty" which clearly does not mean
     "two hundred and fifty dollars" by any possible interpretation
     of the phrase.

     Neiman-Marcus refused to budge. They would not refund my money
     because according to them, "What the waitress told you is not
     our problem. You have already seen the recipe - we absolutely
     will not refund your money at this point."

     I explained to her the criminal statutes which govern fraud in
     Texas. I threatened to refer them to the Better Business Bureau
     and the state's Attorney General for engaging in fraud. I was
     basically told, "Do what you want, we don't give a crap, and
     we're not refunding your money."

     I waited, thinking of how I could get even or even try and get
     any of my money back. I just said, "Okay, you folks got my
     $250.00 and now I'm going to have $250.00 worth of fun." I told
     her that I was going to see to it that every cookie lover in the
     United States with an e-mail account has a $250.00 cookie recipe
     from Neiman-Marcus for free.

     She replied, "I wish you wouldn't do that." I said, "Well, you
     should have thought of that before you ripped me off," and
     slammed the phone on her.

     So here it is!!!! Please, please, please pass it on to everyone
     you can possibly think of. I paid $250.00 for this ... I don't
     want Neiman-Marcus to ever get another penny off of this
     recipe. *(5)

*(5) Actually none of this happened and most of you already suspected
     this.

*(6) Do not drink entire bottles of vanilla extract, no matter how
     delicious it smells. Though I have not done so, I think you
     might get sick.

PS - I thought I'd mention that that pun about "A wild bore" in one
   of my previous emails was invented by a computer program JAPE (I
   think) written by Kim Binsted (I think) while she was at Oxford
   (I think), in case you didn't recognize the reference. It was
   obscure and didn't need to be explained, but I explained it
   anyway. So there :)

--
Durant Schoon

