Friday, May 20, 2011

Data, metaphorically

Hmm ... haven't been in here in a while.  Everything still looks OK, just a bit musty.  Let's open up the curtains, blow the dust off the bookends.  Ah ... better.

Now where was I?

In the previous post, I tried to find an ordinary mass noun that behaved like data in its mass noun form, but without great success.  I'm not going to try to fix that here.  Instead, I'm going to try to explain why the effort failed, and to do that I want to explore how data behaves, metaphorically.  But first a bit about metaphor.

Metaphor generally connotes figurative speech used for poetic effect, whether well:

    That time of year thou mayst in me behold
    When yellow leaves, or none, or few, do hang
    Upon those boughs which shake against the cold,
    Bare ruined choirs, where late the sweet birds sang.

or perhaps not so well:

    Head down into the storm they went, pressing barehanded to their chests an unshielded sense of peril.

Um ... right.

That's all well and good, but it's not the whole story.  Even the definition your English teacher gave was probably more like "A comparison made by referring to one thing as another."*  That's closer to Aristotle's definition and to the etymology from the Greek for "carry over", and in my view it's an apt one.  There is a strong case to be made that metaphor in this sense is not merely a figure of speech reserved for flowery poetry and purple prose, but rather a fundamental aspect of how we think, whether we put those thoughts into words or not.

Lakoff and Johnson, for example, make this case in Metaphors We Live By, which pulls together dozens of examples of particular metaphors and shows how, taken together, they imply underlying mental metaphors.  Far from grinding away at a desk in English class to produce a figure of speech that will survive the dreaded red pen, we effortlessly produce metaphors -- in Aristotle's sense -- in nearly every sentence.  These metaphors, on the order of "more is up/less is down" and "anger is a hot liquid" (it can boil over, you can get rid of it by blowing off steam, it behaves as a mass noun ...), are so pervasive we don't even see them as metaphors unless we look -- at which point we see them everywhere.

(To get the flavor, go over that last paragraph.  Clearly "grinding away" is metaphoric, but so are "see" in "see them as metaphors", "pulls together" and even "in" in "in nearly every sentence".  Well, a sentence isn't really a container or bounded space, is it?)

It's perfectly normal, indeed probably universal, to have more than one metaphoric view of a concept, and the different views don't have to be consistent.  For example, we can view ourselves as moving through time ("Let's just get through today.") or ourselves as stationary with time moving past ("What's coming next week?") depending on what works best at the moment.

So, from this point of view, what is data (in the computing sense)?
  • It's a fluid.  It can flow or otherwise move from place to place.  It can leak.  It can fill up space.  It can also be compressed, but generally it acts more like a liquid than a gas.  If your data isn't flowing fast enough, you need a bigger pipe.
  • It's made up of discrete parts, ultimately bits.  It can be partitioned into chunks of uniform or varying size.  You can change parts independently, but only down to the bit level.
  • It's something of value. It can be secured, tampered with, stolen, bought, sold or given away.
  • It's a form of text.  It can be written, read, erased and copied.
  • You can search through it, organize it and make it universally accessible and useful ... wait, where did that come from?
I'm sure that with a little more thought I could come up with several more metaphors for data, but I think that's enough to make two points.  First, data, like very many other concepts, can be described by internally consistent metaphors.  Second, because these metaphors, as with those for other concepts, aren't always consistent with each other, there's no one concrete noun that could serve as a universal metaphor for data.  In other words, trying to fit water or stone or gravel or rice to data as a whole was doomed to failure from the beginning.

That's to be expected, I suppose.  One definition of equality is that two things are the same if they can stand in for each other in all circumstances.  If it looks like rice, tastes like rice and is generally like rice in every observable way, then we may as well say it is rice.  Which leads me to a different definition of metaphor that I don't like nearly as much as the one I used:  A comparison of two unlike things that have something in common.

That's fine as far as it goes, and in particular the things in a metaphor do have to be unlike, but it implies that the things being compared are interchangeable.  They aren't.  One thing is being explained by referring to it as the other.  Moreover, the thing being explained is always more abstract than the thing being referred to.  In the first data example, something very abstract (data) is being referred to as something more concrete (a fluid, for example).


As always, the definition you choose makes a difference.  Seeing metaphor as a comparison between two unlike things with something in common provides a formula for incoherent images (What do you mean "The stop sign was a fire truck." isn't a good metaphor?  They're unlike but they're both red!).  Seeing metaphor as one thing carried over to stand in for another -- the original metaphor for metaphor -- opens up a vast and surprising new world.






* It took me a while to find a definition I liked.  This one is courtesy of Gideon Burton's Silva Rhetoricae.