Reading Yudkowsky, part 33

by Luke Muehlhauser on May 2, 2011 in Eliezer Yudkowsky, Resources, Reviews

AI researcher Eliezer Yudkowsky is something of an expert on human rationality, and on teaching it to others. His hundreds of posts at Less Wrong are a treasure trove for those who want to improve their own rationality. As such, I’m reading all of them, chronologically.

I suspect some of my readers want to “level up” their rationality, too. So I’m keeping a diary of my Yudkowsky reading. Feel free to follow along.

His 262nd post is Fallacies of Compression:

“The map is not the territory,” as the saying goes.  The only life-size, atomically detailed, 100% accurate map of California is California.  But California has important regularities, such as the shape of its highways, that can be described using vastly less information – not to mention vastly less physical material – than it would take to describe every atom within the state borders.  Hence the other saying:  “The map is not the territory, but you can’t fold up the territory and put it in your glove compartment.”

A paper map of California, at a scale of 10 kilometers to 1 centimeter (a million to one), doesn’t have room to show the distinct position of two fallen leaves lying a centimeter apart on the sidewalk.  Even if the map tried to show the leaves, the leaves would appear as the same point on the map; or rather the map would need a feature size of 10 nanometers, which is a finer resolution than most book printers handle, not to mention human eyes.

Reality is very large – just the part we can see is billions of lightyears across.  But your map of reality is written on a few pounds of neurons, folded up to fit inside your skull.  I don’t mean to be insulting, but your skull is tiny, comparatively speaking.

Inevitably, then, certain things that are distinct in reality, will be compressed into the same point on your map.

…Sometimes fallacies of compression result from confusing two known things under the same label – you know about acoustic vibrations, and you know about auditory processing in brains, but you call them both “sound” and so confuse yourself.  But the more dangerous fallacy of compression arises from having no idea whatsoever that two distinct entities even exist.  There is just one mental folder in the filing system, labeled “sound”, and everything thought about “sound” drops into that one folder.  It’s not that there are two folders with the same label; there’s just a single folder.  By default, the map is compressed; why would the brain create two mental buckets where one would serve?

And, never forget, Categorizing Has Consequences:

Any way you look at it, drawing a boundary in thingspace is not a neutral act.  Maybe a more cleanly designed, more purely Bayesian AI could ponder an arbitrary class and not be influenced by it.  But you, a human, do not have that option.  Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind.  One more reason not to believe you can define a word any way you like.

In fact, it is often an attempt at Sneaking in Connotations, and there are other perils of Arguing By Definition to watch out for. Where to Draw the Boundary reminds us:

Just because there’s a word “art” doesn’t mean that it has a meaning, floating out there in the void, which you can discover by finding the right definition.

It feels that way, but it is not so.

Wondering how to define a word means you’re looking at the problem the wrong way – searching for the mysterious essence of what is, in fact, a communication signal.

So is there a project concerning the definitions of words that is worthy of a rationalist? Yup! Read: Where to Draw the Boundary?

If you define “eluctromugnetism” to include lightning, include compasses, exclude light, and include Mesmer’s “animal magnetism” (what we now call hypnosis), then you will have some trouble asking “How does eluctromugnetism work?”  You have lumped together things which do not belong together, and excluded others that would be needed to complete a set.  (This example is historically plausible; Mesmer came before Faraday.)

We could say that eluctromugnetism is a wrong word, a boundary in thingspace that loops around and swerves through the clusters, a cut that fails to carve reality along its natural joints.

Figuring where to cut reality in order to carve along the joints - this is the problem worthy of a rationalist.  It is what people should be trying to do, when they set out in search of the floating essence of a word.

And make no mistake: it is a scientific challenge to realize that you need a single word to describe breathing and fire.  So do not think to consult the dictionary editors, for that is not their job.

Eliezer’s example concerns the definition of “art,” and it is highly worth reading. Entropy, and Short Codes uses entropy to illustrate an otherwise simple point:

The key to creating a good code – a code that transmits messages as compactly as possible – is to reserve short words for things that you’ll need to say frequently, and use longer words for things that you won’t need to say as often.

When you take this art to its limit, the length of the message you need to describe something, corresponds exactly or almost exactly to its probability.  This is the Minimum Description Length or Minimum Message Length formalization of Occam’s Razor.

And so even the labels that we use for words are not quite arbitrary.  The sounds that we attach to our concepts can be better or worse, wiser or more foolish.  Even apart from considerations of common usage!
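The correspondence Yudkowsky gestures at runs through the logarithm of probability: in an optimal prefix code, a symbol with probability p gets a codeword of roughly -log2(p) bits, and the expected message length equals the entropy of the source. A minimal numeric sketch, with word frequencies invented purely for illustration:

```python
import math

# Toy message source: frequent symbols should get short codes.
# These probabilities are made up for illustration.
probs = {"the": 0.5, "of": 0.25, "chair": 0.125, "recliner": 0.125}

# In an optimal prefix code, a symbol with probability p gets a
# codeword of about -log2(p) bits (Shannon's source coding bound).
code_lengths = {w: -math.log2(p) for w, p in probs.items()}

# The expected bits per symbol equals the entropy of the source.
entropy = sum(p * -math.log2(p) for p in probs.values())

print(code_lengths)  # {'the': 1.0, 'of': 2.0, 'chair': 3.0, 'recliner': 3.0}
print(entropy)       # 1.75 (bits per symbol)
```

The most frequent word gets the shortest code, and the rare words get the longest ones, which is exactly the pattern natural languages approximate with short common words like “the” and “of.”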

The length of words also plays a nontrivial role in the cognitive science of language:

Consider the phrases “recliner”, “chair”, and “furniture”.  Recliner is a more specific category than chair; furniture is a more general category than chair.  But the vast majority of chairs have a common use – you use the same sort of motor actions to sit down in them, and you sit down in them for the same sort of purpose (to take your weight off your feet while you eat, or read, or type, or rest).  Recliners do not depart from this theme.  “Furniture”, on the other hand, includes things like beds and tables which have different uses, and call up different motor functions, from chairs.

In the terminology of cognitive psychology, “chair” is a basic-level category.

People have a tendency to talk, and presumably think, at the basic level of categorization – to draw the boundary around “chairs”, rather than around the more specific category “recliner”, or the more general category “furniture”.  People are more likely to say “You can sit in that chair” than “You can sit in that recliner” or “You can sit in that furniture”.

And it is no coincidence that the word for “chair” contains fewer syllables than either “recliner” or “furniture”.

Mutual Information, and Density in Thingspace admonishes:

…the way to carve reality at its joints, is to draw your boundaries around concentrations of unusually high probability density in Thingspace.
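A cluster in thingspace shows up statistically as properties that carry information about each other, which mutual information measures directly. A small sketch, with a joint distribution over two binary properties invented for illustration:

```python
import math

# Invented joint distribution P(has_feathers, flies) over "things".
# The probability mass concentrates on (1,1) and (0,0): a cluster.
joint = {
    (1, 1): 0.40,  # feathered and flies
    (1, 0): 0.10,  # feathered but flightless
    (0, 1): 0.05,  # featherless flier
    (0, 0): 0.45,  # neither
}

def mutual_information(joint):
    """I(X;Y) = sum over (x,y) of p(x,y) * log2(p(x,y) / (p(x) * p(y)))."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

print(mutual_information(joint))  # positive (≈0.40 bits): the properties cluster
```

Under an independent (uniform-density) distribution the mutual information would be zero; a boundary drawn around the high-density region is one where knowing some properties of a thing predicts its others.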

Superexponential Concepts and Simple Words concludes:

Presenting us with the word “wiggin”, defined as “a black-haired green-eyed person”, without some reason for raising this particular concept to the level of our deliberate attention, is rather like a detective saying:  “Well, I haven’t the slightest shred of support one way or the other for who could’ve murdered those orphans… not even an intuition, mind you… but have we considered John Q. Wiffleheim of 1234 Norkle Rd as a suspect?”



MarkD May 2, 2011 at 10:28 am

Slight href failure for Entropy, and short codes…

One of the takeaways from work on MDL and topics like Latent Semantic Analysis (which I argue are closely related in a ’98 Cog Sci paper) is that the smearing and extent of semantic categories that have been debated ad nauseam since Fodor and Putnam are actually a crucial feature of the plasticity of learning. Which is why I tend to be dismissive of the notion that AI research should be interested in perfecting rationality rather than just continuing, slowly, to get a good handle on the underlying informational physics.


mopey May 2, 2011 at 11:39 am

There is just one mental folder in the filing system, labeled “sound”, and everything thought about “sound” drops into that one folder. It’s not that there are two folders with the same label; there’s just a single folder. By default, the map is compressed; why would the brain create two mental buckets where one would serve?

I’m cuckoo for cocoa puffs!

