Thursday, December 8, 2011

Speculative degrees of difficulty


How hard could it be?

Everyone loves a good round of blue-sky speculation and "what if?"  What if people could live for centuries?  What if electricity were too cheap to meter?  What if we could send messages telepathically?  As the wild ideas start flying, it can be hard to remember that some what-ifs really could happen in our lifetimes and some are, well, just impossible.

With that in mind, here's a sort of Mohs hardness scale (or maybe Beaufort scale) for speculative ideas, using aerospace as a running example (except the last item, where I couldn't come up with a suitable example for aerospace).  The categories here are broad, partly because there's a lot of ground to cover and partly in hopes that it will generally not be too hard to agree what category something fits in.  In other words, I've traded precision for accuracy.  The exact boundaries are not necessarily so important as simply asking what it would actually take to realize a given idea and getting a rough but believable idea of the answer.  Here's my proposed scale:
  1. Most people could do it easily.  Example: Making a paper airplane or something else that flies.
  2. Many people do it, particularly in richer countries, but at noticeable expense.  Example: Taking a trip on a commercial airliner.
  3. Only the richest individuals or smallish corporations could do it.  Example: Orbiting the earth (using someone else's rocket) (see note a).
  4. Generally done by large corporations or small countries (see note b).  Example: Producing a system to put a satellite in orbit [actually, this is level 3 now, thanks to SpaceX. A better example might be manufacturing commercial airliners].
  5. Only done by large countries or groups of countries.  Example: Sending an interplanetary probe.
  6. Requires bleeding-edge technology in untested combinations and would require a concerted effort by one or more large countries.  Example: Sending a manned interplanetary mission.
  7. Requires yet-to-be built technology, but based on known principles.  Example: Getting any macroscopic amount of matter to any star (other than the sun) with travel time under a millennium (see note c).
  8. Does not require a new understanding of the universe, but no plausible technology exists, even on paper.  Example: Getting a manned mission to any star (other than the sun) with travel time under a decade in Earth's frame of reference (see note d).
  9. Would require a new understanding of the universe, but not logically impossible.  Example: Travel between galaxies on human time scales (see note e).
  10. No way.  Logically impossible or in blatant conflict with any reasonable understanding of the universe.  Example: Travel back in time.
    Note a: There's a bit of leeway here.  Orbital flights cost tens of millions of dollars.  Not many individuals could afford that, but the very richest are considerably richer than those who could merely afford a single orbital flight.

    Note b: "Large" and "small" here refer to economy (say, GDP), not population or area.

    Note c: To get to Proxima Centauri in a millennium an object would have to be traveling approximately 1/250th of light speed, or about 1200 km/s relative to Earth.  New Horizons maxed out around 20 km/s after it flew by Jupiter.  A probe with the same mass going 1200 km/s would require 3600 times as much energy.  An ion drive with an exhaust velocity of around 400 km/s -- the one propelling the Dawn spacecraft has more like 30 km/s -- could provide the required acceleration if the thing starts out as 95% fuel, but I'm completely handwaving about the power source.
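The arithmetic in note c can be checked with a short sketch. The distance to Proxima Centauri (about 4.24 light-years) and the value of light speed are figures supplied here rather than taken from the note, and the function names are made up for illustration. The fuel fraction comes from the classical Tsiolkovsky rocket equation, which ignores relativity and, like the note, says nothing about the power source.

```python
import math

C_KM_S = 299_792.458   # speed of light in km/s
PROXIMA_LY = 4.24      # assumed distance to Proxima Centauri in light-years


def cruise_speed_km_s(distance_ly: float, years: float) -> float:
    """Average speed needed to cover distance_ly in the given number of years."""
    return C_KM_S * distance_ly / years


def energy_ratio(v_new: float, v_old: float) -> float:
    """Kinetic energy ratio for equal masses (Newtonian, energy ~ v squared)."""
    return (v_new / v_old) ** 2


def fuel_fraction(delta_v: float, exhaust_velocity: float) -> float:
    """Share of launch mass that must be propellant (Tsiolkovsky equation)."""
    return 1.0 - math.exp(-delta_v / exhaust_velocity)


if __name__ == "__main__":
    speed = cruise_speed_km_s(PROXIMA_LY, 1000)
    print(f"required speed: {speed:.0f} km/s")
    print(f"energy vs. a 20 km/s probe: {energy_ratio(1200, 20):.0f}x")
    print(f"fuel fraction, 400 km/s exhaust: {fuel_fraction(1200, 400):.1%}")
```

With a 30 km/s exhaust like Dawn's, the same equation calls for a mass ratio of e^40, on the order of 10^17, which is why the note reaches for a much faster drive.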

    Note d: Traveling four light-years in ten years implies that relativity will become noticeable for at least part of the trip.  The (true) astronauts on board would experience a somewhat shorter travel time than mission control would.  Going a hundred times faster than the previous example would require 10,000 times as much energy.  Compared to the New Horizons probe, that's 36 million times more energy at the very minimum.  A real manned craft would have to be significantly bigger than New Horizons, even without a propulsion system (most of the New Horizons propulsion system fell away shortly after launch).  It would also be nice to be able to slow down when we got there, and, ideally, turn around and come back.  A factor of a billion is probably more realistic.
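To put rough numbers on "somewhat shorter" and "at the very minimum", here is a small sketch using the standard special-relativity formulas. It assumes the craft travels at a constant 0.4 times light speed for the whole trip, ignoring acceleration and braking, and the function names are made up for illustration.

```python
import math


def lorentz_gamma(beta: float) -> float:
    """Lorentz factor for a speed given as a fraction of light speed."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def ship_years(earth_years: float, beta: float) -> float:
    """Time elapsed on board for a constant-speed trip lasting earth_years."""
    return earth_years / lorentz_gamma(beta)


def relativistic_to_newtonian_ke(beta: float) -> float:
    """Ratio of relativistic kinetic energy to the Newtonian (1/2)mv^2 value."""
    return (lorentz_gamma(beta) - 1.0) / (0.5 * beta ** 2)


if __name__ == "__main__":
    beta = 4.0 / 10.0  # four light-years in ten Earth years
    print(f"gamma: {lorentz_gamma(beta):.3f}")
    print(f"years on board: {ship_years(10, beta):.2f}")
    print(f"kinetic energy correction: {relativistic_to_newtonian_ke(beta):.2f}x")
```

Under those assumptions the crew would age a bit over nine years, and the true kinetic energy is roughly 14% above the Newtonian estimate, so the 36 million figure really is a floor.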

    Note e: At the very least this would require some form of faster-than-light travel, and not just a little bit faster like the famous neutrinos might or might not have been doing [They weren't, of course].  The Canis Major Dwarf Galaxy, probably our nearest neighbor, is 25,000 light-years away.  To get there in a decade you'd need to be going 2500 times light speed.  The nearest big pretty spiral galaxy, Andromeda, is about a hundred times farther still.  See the comments section for a little about why this is probably not level 10, and Wikipedia's article on faster-than-light for a lot more.

    Friday, October 28, 2011

    The hour and other records


    On 25 October, 1972, Eddie Merckx got on his bicycle at an outdoor track in Mexico City and rode just over 49 kilometers in an hour, a record which would hold for 28 years(*).  If you haven't heard of Merckx, here's a quick comparison with a couple of cyclists you may have heard of:

                           Eddie Merckx   Miguel Indurain   Lance Armstrong
    Tours de France        5              5                 7
    Giri d'Italia          5              2                 0
    Vueltas de España      1              0                 0
    Hour record?           Yes (*)        Yes (*)           No
    World Championships    3              0                 1
    6-day races            17             0                 0

    [If you've heard of Lance Armstrong, you probably know why I've struck through that 7 -- D.H. 26 Aug 2023]

    To be fair, I believe the 6-day races weren't the old-style 144-hour-straight solo affairs, but "only" the two-person 6pm-to-2am-on-six-consecutive-nights variety, mainly done so that the other riders could make a bit more money than if Eddie hadn't been riding.  Still, it's about 17 more than most of us will ever win.

    Merckx's hour ride is a cycling legend.  He'd arrived in Mexico from Europe only days before, though he had previously trained using an oxygen mask to simulate the altitude (2,300m/7550ft.), and had only minimal practice on the track itself.  On the day, Merckx breakfasted on ham, coffee and toast (spread with Belgian cheese from home) and started his ride.

    His times for the first 5km, 10km and 20km broke the existing records for those distances, records that had been set in a 20km record ride, not as the first part of an hour ride.  At his trainer's advice he slowed considerably after 10km, but only temporarily -- from about 30km on, his pace steadily increased.  At the finish Merckx declared the ride the hardest of his life and said he would never try the Hour again, a promise he has so far kept.

    Merckx's ride was an extraordinary achievement in sporting history in general, to say nothing of cycling in particular.  But records are set to be broken.  Merckx's eventually fell, which leads me near to the point of this post ... but hang on ... Miguel Indurain also holds an hour record.  Did he break Merckx's?  According to the UCI, which governs most cycling records, no, and not because someone else had broken it first.  His is a different record (or at least it is now).

    Thus the asterisks.

    (*) Equipment is crucial in cycling.  High-end bicycle makers are famous for shaving every last gram off the weight of the bike and making every possible tweak to the aerodynamics.   For example, I was fortunate enough to be on the Champs Elysees in 1989 to see Greg LeMond make up 58 seconds on Laurent Fignon to win the Tour by 8 seconds (out of over 87 hours total).  LeMond's win in that stage was widely attributed to the advantage of his triathlon handlebars and aerodynamic helmet.  Fignon rode bareheaded (!) on conventional drop bars.

    LeMond may or may not have been the stronger rider on the day, but the difference in equipment is generally considered to have accounted for a time difference of at least two seconds per kilometer -- a pretty significant amount in cycling races -- accumulating over the course of 25 kilometers to ... at least 50 seconds.

    Shave two seconds off every kilometer of Merckx's ride and he's got an extra 98 seconds, enough for another 1300 meters or so.
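As a rough check on that figure, here is the back-of-the-envelope calculation as a sketch. It takes the ride as a round 49 km in 3600 seconds and assumes the rider holds the same average speed during the time saved; the function names are made up for illustration.

```python
def time_saved_s(distance_km: float, saving_s_per_km: float) -> float:
    """Total seconds gained if every kilometer is ridden saving_s_per_km faster."""
    return distance_km * saving_s_per_km


def extra_distance_m(distance_km: float, duration_s: float,
                     saving_s_per_km: float) -> float:
    """Extra meters covered in the saved time at the original average speed."""
    speed_m_per_s = distance_km * 1000.0 / duration_s
    return speed_m_per_s * time_saved_s(distance_km, saving_s_per_km)


if __name__ == "__main__":
    print(f"time saved: {time_saved_s(49, 2):.0f} s")
    print(f"extra distance: {extra_distance_m(49, 3600, 2):.0f} m")
```

This is a first-order estimate only, since the extra distance would itself be ridden on the faster equipment.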

    The bike Merckx rode in 1972 looks like an ordinary bike.  The bike Miguel Indurain rode has two wheels.  They're solid, the front one is noticeably smaller than the rear, the monocoque frame that they connect to is an irregular, aerodynamically sculpted blob and the handlebars stick out in front instead of curving ... but it does have two wheels.  Indurain's bike is by no means the most unconventional record-setting bike, but it's still an entirely different machine from Merckx's.

    For this reason the UCI eventually split the hour record into two records: one using equipment essentially like Merckx's, and the "Best Human Effort" record for any upright bike (an entirely different governing body deals with human-powered vehicles in general, and recumbent bikes are in a whole different class, with the current hour record standing at 91.56km).  Merckx's record was, by definition, the UCI record.  Indurain's record was the Best Human Effort record.

    This splitting of records is frankly a bit unfair to those who came before Merckx.  He, of course, took advantage of the best technology he could get his hands on.  The bike may look ordinary, but it weighs 5.5kg, with a 90g front tire, a specially welded titanium stem and material strategically drilled away at every possible place, including the chain.  By the same logic, should Merckx be considered to have beaten Frank Dodds's hour record of 1876?  He wasn't riding a penny-farthing after all.

    Could some previous record holder have done better with Merckx's bike?  Who knows?  Essentially, if you go for the UCI Hour record, you are competing with Merckx, and the implication is, no, no one before could have done what he did, even with the same equipment.

    The split is arguably also unfair to those who came afterward, since the split was done retroactively in 1996, reclassifying the nine records that had been set in the interim.  In one sense, these riders had been doing exactly the same as Merckx: Trying to ride as far as possible using the best training and equipment they had available.  In this view, the split is nothing more than an attempt to enshrine Merckx's record as particularly butt-kicking and Merckx as the gold standard butt-kicking cyclist.

    On the other hand, how do you compare a ride like Merckx's to one like Indurain's?  At the very least you'd have to adjust for the difference in equipment, and that's problematic at best.  Ideally, you would send the aspiring record-breaker back in time to Mexico City in 1972, sit them down in a replica of Merckx's bike and let them at it. 

    Or would you?  Is your challenger the same height and weight as Merckx?  Indurain is considerably taller and heavier.  Do they ride in the same style?  Surely there should be some allowance for differences of that sort.  A rider taking on Eddie in a stage race had a choice of bike, within limits.  So perhaps you set general guidelines of design and equipment on the assumption that within those guidelines the advantage you could get over Eddie's own customized machine would be minimal.

    This is what the UCI in fact did, and it seems reasonable.  It's notable that since the split the Best Human Effort record has not been broken, while the UCI record has been broken twice.  Cyclists have voted with their pedals, so to speak.

    Even so, it's a mess:
    • How long did Merckx's record stand?  The UCI record stood from 1972 to 2000, but that's in part because everyone was chasing what turned out to be the Best Human Effort record.  That is, they weren't trying to replicate Merckx's ride.  Merckx's record was just short of the overall best human-powered performance at the time, Francis Faure's 50.53km, set in 1938.  It was the best upright bicycle performance for twelve years, until Francesco Moser beat it in 1984 using a mutant beast with two wheels.  It was the best performance-on-a-bike-like-Eddie-Merckx's for 28 years, until Chris Boardman beat it by about ten meters in 2000.
    • Could Merckx have done even better if he'd focused on the Hour instead of keeping to more or less his usual schedule?  Merckx was a road racer at heart and in the run-up to the hour attempt he won fifty road races, including his fourth Tour and his third Giro.  I don't know how many he entered and didn't win.   Unlike Merckx, Boardman (along with a couple of other record holders) specialized in track racing and time trials and followed a training regimen aimed specifically at beating the hour record [That's not quite right.  Boardman did race in the Tour and other stage races, but not to nearly the extent Merckx or Indurain did.  I should also mention Boardman's rival Graeme Obree, whose innovations in bicycle design had a large part in the UCI splitting the record, who focused mainly on the Hour.  --DH].  Can we really compare the two?
    • The Mexico City track was outdoors, and though conditions were good on the day, there must have at least been some amount of smog in the air from the city's notorious traffic.  Would Merckx have done better had he used an indoor track in a place with better air? 
    • Then there's altitude.  Mexico City's air is considerably thinner than at sea level, important in a sport like cycling where aerodynamics are crucial.  In the 1968 Olympics there, Bob Beamon broke the world long jump record by 55cm (21.75in), about half of which can be explained by a tail wind and the thinner air.  That record stood for 23 years, and in the same hour Lee Evans set a record for 400m that stood for 20 years.  On the other hand, it has to matter in an endurance event that just as there is only about 75% as much air to get in your way, there is also only about 75% as much to breathe.  Marathon records tend to be set at or near sea level.  Boardman broke the UCI record in Manchester.  Would he have done better or worse at altitude?  Most of the recent record rides have been near sea level, which gives some indication of current thinking.
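The "about 75%" figure in the altitude item can be roughly reproduced with the simplest model of the atmosphere. The sketch below assumes an isothermal atmosphere with a scale height of about 8.4 km, and it assumes all of a rider's power goes into aerodynamic drag (so power scales as density times speed cubed). Both are simplifications, and the second deliberately ignores the other half of the trade-off, that there is less oxygen to breathe. The names are made up for illustration.

```python
import math

SCALE_HEIGHT_M = 8400.0  # assumed isothermal scale height of the atmosphere


def relative_air_density(altitude_m: float) -> float:
    """Air density as a fraction of sea-level density (isothermal model)."""
    return math.exp(-altitude_m / SCALE_HEIGHT_M)


def speed_gain_at_equal_power(altitude_m: float) -> float:
    """Speed multiplier if power ~ density * speed**3 and power is unchanged."""
    return relative_air_density(altitude_m) ** (-1.0 / 3.0)


if __name__ == "__main__":
    altitude = 2300.0  # Mexico City, in meters
    print(f"relative density: {relative_air_density(altitude):.2f}")
    print(f"speed multiplier at equal power: "
          f"{speed_gain_at_equal_power(altitude):.3f}")
```

The second number is an upper bound on the aerodynamic benefit, not a prediction. How much of it survives once the rider's reduced power output is accounted for is exactly the open question.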
    So what's really going on here?  Merckx is unquestionably one of the great riders of all time, and his hour ride was something that very few people in history could have done.  We would like to acknowledge that, and also have some way to compare his performance to those of today's and future stars.

    The first part is easy.  Cycling fans know who Eddie Merckx is and just how impressive his record is.

    The second part is only possible if you let go of the idea of perfect comparison.  If you really wanted to be scientific about it, you would build and maintain an official track and a set of official bikes in various sizes.  Each cyclist would do, say, twenty or thirty hour rides on an official bike.  The aggregate results would then be compared statistically to answer the question "What is the probability that rider A is faster than rider B?"  You could then say that -- to totally make up some numbers -- Merckx is faster than Indurain with 60% confidence and Moser with 85% confidence.
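A minimal sketch of that kind of comparison, with entirely made-up ride distances: it estimates the chance that a randomly chosen ride by rider A beats a randomly chosen ride by rider B by counting over all pairs. That is a simpler quantity than a formal confidence statement, and the function name and data are hypothetical.

```python
from itertools import product
from typing import Sequence


def prob_a_faster(rides_a: Sequence[float], rides_b: Sequence[float]) -> float:
    """Fraction of (a, b) ride pairs in which A went farther; ties count half."""
    wins = 0.0
    for a, b in product(rides_a, rides_b):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    return wins / (len(rides_a) * len(rides_b))


if __name__ == "__main__":
    # Made-up hour distances in km, same hypothetical official bike and track.
    rider_a = [49.4, 49.1, 48.9, 49.6, 49.0]
    rider_b = [49.2, 48.8, 49.3, 48.7, 49.0]
    print(f"P(A beats B on a random day) ~ {prob_a_faster(rider_a, rider_b):.2f}")
```

With only five rides each the estimate is very coarse, which is the point of asking for twenty or thirty.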

    That's probably not going to happen, but that's really just part of sports fandom.  Would Merckx have beaten Indurain in the Tour?  Would Ali have beaten Tyson?  Would Jordan's Tar Heels have beaten Manning's Jayhawks?  Would Bird's Celtics have beaten Russell's?  Would 1970 Brazil have beaten France in 1998? Would ... it's all part of the fun.

    In a physics lab you can measure the same thing over and over again and say with some certainty that X is greater than Y.  In much of life we look for these clear distinctions, but we forget about the error bars.  We don't take uncertainty into account.  We can feel gut-certain that Merckx's Hour was The Best Hour Ride and anything that might look better is really down to something else, like a better track, but we don't really know.

    In fact, Merckx, Boardman and the rest are all great riders with abilities far beyond what most of us could ever hope to have.  Ranking them, or designating "the best" and "all the rest" has a certain primitive appeal to our brains, but tends to obscure that fact.  Likewise, a record says something about a competitor's ability and determination, but it only says so much.

    Saturday, September 24, 2011

    That whole neutrino thing

    The past few days have been rife with headlines of the form "CERN Scientists Claim Neutrinos Travel Faster Than Light."  Sometimes the story underneath explains what's really going on, but too often it veers off into time travel and speculation that Einstein Was Wrong.  In fact, no one at CERN (which produced the neutrinos) or OPERA (which detected them) is claiming any such thing.  As their actual article makes clear, they have a measurement they can't explain and they're asking for help explaining it.

    This being a particle physics article, the list of authors takes up the first two pages, but there's plenty of solid information after that.  I haven't read it in detail and, not being a particle physicist, probably wouldn't understand all the detail, but they talk a lot about how they measured both the times and distances involved, including taking into account the movement of the earth's crust and the 2009 Italian earthquake, along with a host of other factors.

    They present their data and explain how they analyzed it, but specifically don't make any claims about faster-than-light particles or anything else.  They simply claim they have an anomaly they don't know how to explain and they're still looking into it.  Their precise words:
    Despite the large significance of the measurement reported here and the stability of the analysis, the potentially great impact of the result motivates the continuation of our studies in order to investigate possible still unknown systematic effects that could explain the observed anomaly. We deliberately do not attempt any theoretical or phenomenological interpretation of the results.
    Statements I've seen in the press from the actual physicists involved reflect this.  A better headline would be "Scientists Can't Explain Neutrino Speed Measurement."

    There is ample reason to doubt that OPERA observed neutrinos traveling faster than light.  First, the measurements underpinning special and general relativity are quite solid by now.  Relativity predicts not just that nothing travels faster than light, but a large number of other effects -- for example that clocks run faster in weaker gravity than stronger -- that have been measured to great accuracy.  The odds that those measurements are wrong are very small.  Much more likely that we just haven't found the flaw in the neutrino measurement.

    Second, there is strong evidence from astronomy that neutrinos do not travel faster than light.  Supernovae put out both neutrinos and light, and they arrive here at essentially the same time, having travelled for hundreds of thousands of years.  The OPERA anomaly of one part in 40,000 or so would accumulate to two and a half years or so over 100,000 years.  In practice, the neutrinos from a supernova do arrive sooner, but only on the order of hours, and astronomers have good reason to believe this is because they leave about that much sooner.  Physicist Matt Strassler has a good summary on his blog Of Particular Significance.

    Even if the measurements did hold up, and it turned out that neutrinos can travel faster than the observed speed of light, we're quite a way from time travel.  It might not even be evidence that relativity is wrong.  I've seen speculation that the photon, as we already know the neutrino is, might actually be ever-so-slightly massive.  This would leave relativity's absolute speed limit intact and imply that we just hadn't had the tools to measure the difference between the speeds of photons and the actual upper limit.  I'm not sure I quite buy that that squares with all the observations of light over the last several decades, but I haven't looked at the details (and I'm still not a physicist).

    Failing that, it's quite possible that relativity is only mostly right and breaks down in some extreme cases, the same way that Newtonian physics breaks down at extreme speeds and other places.  Who knows?  Such a breakdown might even clear the way for unifying gravitation and quantum mechanics.

    But again, no one involved is claiming we're anywhere near that point.

    [Prof. Strassler has added a post about the OPERA anomaly.  Among other things, he says that the speed of light not quite being the ultimate speed limit -- that is, not quite the c in E = mc² -- would be a plausible explanation for slightly-faster-than-light particles.  Since he really is a particle physicist, I'm going to bow out and suggest that non-physicists interested in the subject follow his blog (if you are a physicist, I'm sure you already know where to go, but then what are you doing reading this?) -- D.H.]
    [I had originally referred to the "CERN/OPERA" anomaly, but I've changed that.  Although CERN did produce the neutrinos and its name is now associated with the results, it did not conduct the measurements in question. -- D.H.]
    [And, of course, it now appears the measurements were wrong, due to a faulty cable.  Kind of anticlimactic, except to two of the project leads involved, who resigned -- D.H.]

    Sunday, August 21, 2011

    And then it became self-aware

    Something in our mind likes magic thresholds -- crisp clear dividing lines, to one side of which is X and to the other side not-X.  The world has other notions.  Accepting this takes continual effort.

    When I was first learning how logic gates worked, my mathematical mind was enchanted by the clean symbolism of boolean logic, its Ands, Ors and Nots dancing their beautifully symmetrical algebraic ballet, its truth tables laying out precisely how the various operators combined True and False.

    I would spend hours poring over component catalogs, drawing circuit diagrams full of gates and lines and little circles representing Not.  I had some notion of how those gates broke down into individual transistors, a transistor being an idealized beast that modulated perfect high and low voltages with other perfect high and low voltages.

    And then I started looking at the technical specs more closely.  With growing discomfort I came to realize that there simply is no perfect step function from low to high.  The transition in the middle might be more or less exponential, but it is not perfectly vertical.  As I struggled to understand flip-flops and latches, I puzzled over metastable states and propagation delays.  Those weren't on the pretty circuit diagrams, were they?  There came a time when my eye could no longer filter out the symbols for resistors and capacitors sprinkled among the transistors -- and then there were those stowaway analog components like operational amplifiers skulking around, daring to use the same transistors as the digital circuits.  What had happened to my digital world?  When you got down to it, it was all analog at heart.
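To make the contrast concrete, here is a small sketch comparing the textbook picture with a continuous one. The logistic curve is only a stand-in for a real gate's transfer characteristic, and the 1.4 V switching point, the steepness, and the 0.8 V and 2.0 V logic thresholds are illustrative assumptions rather than figures from any particular data sheet.

```python
import math


def ideal_buffer(v_in: float, threshold: float = 1.4) -> float:
    """Textbook gate: output snaps from 0 V to 5 V exactly at the threshold."""
    return 5.0 if v_in >= threshold else 0.0


def smooth_buffer(v_in: float, threshold: float = 1.4,
                  steepness: float = 8.0) -> float:
    """Stand-in for a real gate: a steep but continuous logistic curve."""
    return 5.0 / (1.0 + math.exp(-steepness * (v_in - threshold)))


def classify(v: float, low_max: float = 0.8, high_min: float = 2.0) -> str:
    """Label a voltage as a logic level, or as neither."""
    if v <= low_max:
        return "low"
    if v >= high_min:
        return "high"
    return "undefined"


if __name__ == "__main__":
    for v_in in (0.0, 1.0, 1.3, 1.4, 1.5, 2.0, 5.0):
        out = smooth_buffer(v_in)
        print(f"in={v_in:.1f} V  ideal={ideal_buffer(v_in):.1f} V  "
              f"smooth={out:.2f} V ({classify(out)})")
```

Near the switching point the smooth version produces outputs that are neither a valid low nor a valid high, which the truth table has no row for.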

    I could cite several other cases of learning that simple on/off distinctions generally don't hold up to close scrutiny, but one more will suffice.  From time to time, sometimes in classrooms but usually not, I would try to learn to draw, something I'm still not at all good at.  Along the way, studying shading, I learned the old saw that there are no lines in nature.  Where one might draw a line in a sketch or cartoon, there was actually a sharp, but not perfectly sharp, change in shading.  It was the eye that inferred a line, the same eye that could therefore accept a line drawing as realistic even when, objectively, it was anything but.


    Understanding of intelligence, whether natural or artificial, can suffer from the same tendency to create lines where none exist.  It's tempting to try to come up with a clear, crisp definition of intelligence, but intelligence is not a binary attribute.  There are many different ways to be intelligent, some of which can manifest to significantly varying degrees.  Cognitive science has identified scores of intelligent behaviors, from counting to recognizing faces to remembering a path and far beyond.

    Most notions of intelligence require the ability to learn, but what's learning?  The best answer I know is that there are many kinds of learning, just as there are many aspects to intelligence -- and there is quite likely no simple relationship between the two.

    Which brings me to the title.  A recurring motif in science fiction and its cousins is the notion of a machine becoming self-aware and therefore, by a commonly-accepted notion of intelligence, intelligent.  This magical moment brings us spine-tinglingly near the very engines of creation, to say nothing of providing an infinitely more formidable opponent for Our Hero.  That's fine for plot purposes, but just as there are many kinds of learning and intelligence, there must be many sorts of awareness, self- or otherwise.

    For example, many things with eyes react to other things with eyes watching them, in some cases even playing it to their advantage.  Without trying to put together a nice crisp definition of awareness -- after all my whole point here is that such definitions never stand up to a good round of "But what about ..?" -- I will posit that a bird watching you watch it is in some sense aware of you.

    Statements like that can cause a certain discomfort among human readers because we all agree, quite possibly correctly, that a bird is not aware of the world in the same way we are.  If awareness is a binary attribute then, perforce, birds must not have it, because we do have awareness and birds don't have the same awareness we have.  QED.  Unfortunately, as airtight as that logic may be, it doesn't really tell us much.  We already knew birds weren't humans.

    If, however, we allow that there may be many kinds of awareness, we can make fairly concrete assertions, in fact more detailed and meaningfully testable assertions, without getting backed into logical corners.  For example, if we assert that there is such a thing as watching -- actively behaving so as to keep something in sight, say -- and there is also awareness of being watched -- leaving aside what exactly that might comprise -- we can assert that both we and birds have those capabilities without saying that we apprehend the world the same way birds do.

    There are many sorts of awareness that we share with birds and many other kinds of animal.  For example, many animals can recognize individuals, reacting differently depending on whether the other party is a stranger or familiar.  Both we and birds can be aware of where things are hidden, and in fact some species of bird appear to be much better at that than we are.  Both we and they can find our way from point A to point B and back and remember new routes that we find.

    This is leaving aside a host of simple capabilities that seem too trivial to note until one realizes that not every living thing has them:  For example, knowing that some things are safe to eat and some aren't, that some animals are liable to attack you and some aren't, that there are objects in the world and we can manipulate them, that things dropped tend to fall, and so forth.

    So how do we differ from birds in awareness?  For one thing, birds probably have some sorts of awareness that we lack.  Migratory birds appear to be aware of the strength and orientation of the Earth's magnetic field, and flying birds in general must surely have a richer awareness of three-dimensional space than we do.

    Likewise, of course, we must surely be aware of things that birds aren't, but once we get done congratulating ourselves on being such vastly more sophisticated creatures, what would those things be?

    A bird may be aware of the local magnetic field, but I'll boldly assert here that it isn't aware that said field is caused by electric currents in the Earth's outer core.  Fine, but just what is it here that we have that they lack that allows us to be aware of such things?  If you want to say "abstract concepts", bear in mind that at least some birds can count and appear to distinguish "same" from "different".  Also bear in mind that not every human is aware of such things (I had to look up the part about it being the outer core), so we're probably grasping at some sort of abstract awareness of cause and effect.  I'm not denying that there's something there, but we do have to be careful trying to define what it is.  Just saying "it's abstract" doesn't really help.


    Here's a stab at something more like what our hypothetical AI villain would have to be able to grasp in order to become the dangerously-aware creature we'll pay ten bucks to see:
    Last week, John met Martha at a party on a boat on Lake Michigan.  It turned out that they had grown up within a mile of each other, but never known it.
    From that short paragraph, you now know not only where John and Martha met, and when, and that they grew up close together without knowing of each other, but also that I know that John and Martha know that fact, but they hadn't until last week, and I know that you now know that, and ... well, you get the drift.  This is the sort of awareness that seems, if not completely unique to humans, rare in the animal world.  It's the sort of awareness that can make one a cunning adversary.  If you don't know that I know you're sneaking up on me, I may well have a crucial tactical advantage.

    But is it self-awareness?  There is a famous experiment in which an animal is given access to a mirror.  All animals tend to react to the animal in the mirror as a different animal initially -- this includes humans who haven't seen a mirror before.  Some animals, however, will eventually start to behave differently, for example by poking at a spot painted on their forehead or positioning the mirror or themselves in order to see places they can't ordinarily see.

    Animals that can do so include humans, bonobos, chimpanzees and orangutans, but also bottlenose dolphins, orcas and European magpies.  On the other hand most animals, including ones much more closely related to these animals than they are to each other, don't seem to be able to make the same leap.  Nor, for that matter, can humans less than about eighteen months old.

    We may as well call mirror-test awareness self-awareness, but clearly passing the mirror test doesn't necessarily mean being able to make the kind of I-know-you-know inference described above.  It's also at least logically possible to reason sophisticatedly about who knows what without being able to pass the mirror test.  In short, just as there are many kinds of awareness, there are most likely many kinds of self-awareness.

    What we're really looking for here goes by the name "Theory of Mind", which is a good topic for another post ...

    Friday, June 10, 2011

    The non-metric mind

    I grew up in the US using English units.  I know my height in feet and inches, my weight in pounds, the distance to various places in miles, the area of my house in square feet, the area of my grandparents' property in acres, the capacity of my car's gas tank in gallons, the temperature in degrees Fahrenheit and so forth.

    I've visited, and even lived in, places where it would be centimeters, kilos, kilometers, square meters, hectares, liters and Celsius, but never really got to the point where it felt natural to use metric units.  If I hear it's 86 degrees out, I know it's warm.  If I hear it's 30 on a summer day, I have to remind myself it's not below freezing and then think "30 ... that's warm, right? ... that's what, 80? 90?"

    Non-metric units are still in use here and there outside the US, to be sure.  Even the English still use some English units, posting speed limits in miles per hour, quoting weights in stone (14 pounds) and quaffing beer by the pint (officially 568ml). All the same, the US is widely recognized as the world's least metricated nation.

    Officially, this isn't supposed to have happened.  The US signed the Treaty of the Meter in 1875 and re-defined traditional measures such as the ounce and gallon in terms of metric units in 1893.  Then, for about a hundred years, the metric system was known to exist but largely ignored.

    In 1975 Congress passed the Metric Conversion Act, thereby establishing the US Metric Board.  The board was abolished as part of a round of spending cuts in 1982, so we tried again in 1988 with the Omnibus Trade and Competitiveness Act, which among other things required the federal government to go metric by 1992.  For all that the federal government is supposed to intrude into every aspect of Americans' lives and dictate the smallest details of behavior, I can't say I have any idea as to whether it actually did.

    The benefits of the metric system are well known, or at least widely touted.  Instead of a hodgepodge of arcane conversions from, say, teaspoons to tablespoons to ounces to cups to pints to quarts to gallons, you have (to continue the example) just liters, optionally with one of a standard set of prefixes should the numbers accumulate too many zeroes (in the case of cooking units, milliliters are fairly common).

    Moreover, even if the metric system were based on multiples of random prime numbers rather than uniformly using base ten, it is the system that most of the world uses, giving a strong incentive for anyone interested in trading with the rest of the world to use it.  So why do we persist in going our own way? I don't know, but I can conjecture, can't I?

    Two conjectures come to mind: The first is that standardization is by far the more pressing reason to use metric units, and that the US does just that when it matters.  US chemists and physicists do not insist on using Fahrenheit degrees or measuring liquids in gills and minims.  They use the same units as everyone else.  A US mechanic fixing a car and faced with a 13mm bolt head reaches for a 13mm wrench.  For all that the US was supposed to have been faced with a crisis in competitiveness unless the metric system was made mandatory, market forces seem to have sorted this one out.

    As far as I can tell the remaining differences in units matter mostly as an annoyance to travelers, and for better or worse, Americans in the aggregate don't spend much time traveling to foreign countries.  Even for those who do, metric units are just one more item on a long list of things to get used to: different languages, cultural customs, food, currency, traffic signs, line voltage and frequency, electrical socket designs, light switches, etc., etc.

    It's also worth asking whether ease of conversion is all that important.  Advocating a single system of units, metric or otherwise, assumes that a single system is appreciably more convenient than having multiple systems.  This is a hypothesis to be verified, not an axiom.  In practice, people seem to tolerate quirks in measurement systems remarkably well.  The traditional profusion of units of measurement arose naturally, after all, which leads me to my second conjecture:  The hodgepodge of different units is in fact a reflection of how we think about measurement, even in ostensibly metricated environments.

    For example, if I'm buying soda in the US, I can buy a two-liter bottle without caring that the gas in my car is measured in gallons.  Drinking soda and using gasoline are completely different experiences.  I'm not going to drink the gasoline or pour soda in my gas tank.  Two liters of soda is a lot of soda to drink. 17 gallons (about 64l) is way more soda than I even want to think about drinking.  Two liters of gas will just about get me to work in the morning.

    From a practical point of view, soda could be sold by the ngogn and gasoline by the firkin so long as the numbers didn't get too out of hand (2 liters is about 172 ngogn; 17 gallons is about 1.6 firkins).  In fact, there are two prevalent units of soda in the US: 2-liter bottles and 12-ounce cans or bottles, generally packaged in multiples of six.  As it happens, a six-pack of 12-ounce cans is about two liters, but that's not exact and it doesn't matter much whether it is.
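
    For anyone who wants to check the soda arithmetic, here is a minimal sketch.  It relies only on the standard definitions of the US gallon (231 cubic inches, or 3.785411784 liters) and the US fluid ounce (1/128 of a gallon).  The function names are my own, and I've left out ngogns and firkins since their definitions are less settled.

```python
# Exact by definition: 1 US gallon = 231 cubic inches = 3.785411784 liters,
# and 1 US fluid ounce = 1/128 of a US gallon.
LITERS_PER_US_GALLON = 3.785411784
LITERS_PER_US_FL_OZ = LITERS_PER_US_GALLON / 128


def pack_liters(cans: int = 6, ounces_per_can: float = 12.0) -> float:
    """Total volume of a pack of cans, in liters (hypothetical helper)."""
    return cans * ounces_per_can * LITERS_PER_US_FL_OZ


def gallons_to_liters(gallons: float) -> float:
    """Convert US gallons to liters (hypothetical helper)."""
    return gallons * LITERS_PER_US_GALLON


if __name__ == "__main__":
    print(f"six-pack of 12-ounce cans: {pack_liters():.2f} l")
    print(f"17-gallon gas tank: {gallons_to_liters(17):.1f} l")
```

    The six-pack comes out a bit over two liters, which is the point: close, but not exact, and nobody minds.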

    We learn to associate measures with the physical world on a case by case basis.  You learn how far a mile or kilometer is by traveling.  You learn how much a pound or kilogram is by handling things by the pound or kilo*.  Cognitively, there's not a lot of overlap.  I really don't need to know that a gallon of milk weighs about 8.3 pounds.  It weighs as much as a gallon of milk.  If I'm in the dairy business, I care how many gallons of milk I can load on my truck, but that's just another piece of specialized knowledge.

    In general, there is either a natural or conventional unit for many things we deal with, and, because different things have different properties, that unit will vary.  It would be of little use to require, say, perfume to be sold only in liter sizes or copper wire to be packaged in meter lengths.  Instead, perfumers have developed standard-sized bottles and wire comes in standard spools.  Whether these happen to measure a round number of ounces or liters or yards or meters is not particularly important.


    Not only is it not a problem to use different units for different things, ad hoc units seem ubiquitous, once you look at the actual unit and not the number on the package.  Even in counting, where the unit is essentially the thing being counted, distinctions can be seen.  Many languages use different words for counting different sorts of things.  For example, in Japanese you would use bu in counting, say, copies of a newspaper but dai in counting, say, cars or bicycles.  English has signs of this as well.  Driving through the midwestern US, you might say you see five cows out in a field, but the owner of that field will almost certainly call them five head of cattle.




    The metric system was designed for scientific use, and it's there that it really comes into its own.  In the physical sciences, there actually can be a need to deal with different orders of magnitude, for example, so it's very good to be able to shift decimal points instead of trying to figure out how many feet are in 1000 inches (83' 4").  In everyday life, shifting decimal points is not so important.  If you're doubling a recipe, it's actually not so bad to be using 1/2 teaspoons and 1/4 cups.

    Even in the sciences, though, idiosyncratic units find their way in:
    • How far away is Alpha Centauri?  To an astronomer, about 1.3 parsecs, not about 40Pm (petameters, not afternoon).  The parsec itself was originally defined in terms of the Astronomical Unit (about 150Gm) and the arc second (1/3600 of a degree, or about 4.8 microradians).
    • If you want to make sodium chloride (salt), you'll need about 23g of sodium and 35g of chlorine to make 58g of salt (a fume hood and other equipment will probably be a good idea).  Sodium atoms are less massive than chlorine atoms.  If you used the same mass of each you'd have sodium left over.  Chemists use a gram mole to represent the mass of Avogadro's number (about 600,000,000,000,000,000,000,000) of a given atom or molecule to account for this.  One gram mole of sodium plus one gram mole of chlorine makes one gram mole of salt.
    • In theoretical particle physics, Planck units (or "God's units") set five fundamental physical constants to 1, which simplifies a number of equations.  For example e = m when c is 1.  I'm not sure how often they're used, but they do turn up (for example here and here, to pick a couple more or less at random).
    • I previously remarked on compugeeks measuring data in units of K/kilo- (1024), M/mega- (1048576) and so forth.  That doesn't mean that a compugeek will think a kilowatt is 1024 watts (well, maybe, depending on how hardcore the geek).  It's only data that is measured this way.  The convention that K,M,G,T etc. refer to powers of two flows directly from addresses being represented in binary [Some folks prefer to say things like "Mebi" and use abbreviations like MiB in order to explicitly call out the distinction between a million and 1,048,576.  This can be a very good idea in some particular situations, but in everyday speech even geeks tend to ignore the difference and just say K, M, G, T etc.  --D.H.].
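
    The parsec figures in the first item above can be reproduced in a few lines.  This is a sketch under stated assumptions: it uses the textbook definition of the parsec as the distance at which one astronomical unit subtends one arc second, and the modern exact value of the AU in meters.  The function names are mine.

```python
import math

# The astronomical unit in meters (exact under the 2012 IAU definition;
# "about 150 Gm" in round numbers).
AU_METERS = 149_597_870_700

# One arc second is 1/3600 of a degree, expressed here in radians.
ARCSEC_RADIANS = math.radians(1 / 3600)


def parsec_meters() -> float:
    """Distance at which 1 AU subtends an angle of one arc second."""
    return AU_METERS / math.tan(ARCSEC_RADIANS)


def parsecs_to_petameters(parsecs: float) -> float:
    """Convert parsecs to petameters (1 Pm = 1e15 m)."""
    return parsecs * parsec_meters() / 1e15


if __name__ == "__main__":
    print(f"arc second: {ARCSEC_RADIANS * 1e6:.2f} microradians")
    print(f"1 parsec: {parsec_meters():.4e} m")
    print(f"1.3 parsecs: {parsecs_to_petameters(1.3):.0f} Pm")
```

    Either unit describes the same distance, of course.  The astronomer's choice just keeps the number small and tied to how the distance was measured.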
    Even the plain vanilla metric system offers choices.  Strictly speaking, the liter is redundant.  We could just use cubic meters.  In practice, we choose the unit that fits best:
    • Liters and cubic meters both act like basic units of volume.
    • Square meters and hectares both act like basic units of area.
    • Grams and kilos both act like basic units of mass.
    That's leaving aside square and cubic centimeters.  Which units you use will depend on what you're doing.  A recipe might call for 15ml of oil but an olympic swimming pool will hold about 2500 cubic meters (2.5Ml) of water**.  Housing space is measured in square meters, but land is measured in hectares.


    Finally, there is one area of common measurement that has universally and persistently resisted metrication: time.  Everybody uses days, years and some notion of months.  Hours comprising sixty minutes of sixty seconds are approximately as widespread as writing.  Even thoroughly metricated places use kilometers per hour instead of meters per second.  Outside very specialized contexts, long periods of time are measured in years, though which exact definition of the year may depend on context and most definitions vary over time.

    There have been various efforts to "rationalize" time measurement, but none even close to successful.  The natural units are just too strong.  Only when dealing with very short periods of time, outside the realm of everyday experience, do we use "correct" units and talk about microseconds and such.


    Measurement is not an abstraction.  It is a concrete action dealing with the physical world.  The experience of measurement depends on what is being measured, and our mental representations reflect this, making distinctions that appear illogical from an abstract point of view.




    * No discussion of pounds and kilos would be complete without a pedantic comment that pounds measure weight, that is, force, while kilos measure mass.  In everyday life, we deal in force.  It takes careful observation to realize that there is a difference (for example, a diver at neutral buoyancy has little weight but just as much mass as on dry land).  Thus the pedantic distinction.  A kilo of something will normally weigh about 9.8 Newtons, which is what a scale "should" typically read if you put a kilo of something on it.

    ** I originally left out "of water" here.  After all, an olympic pool could just as well hold 2500 cubic meters of beer, or silly putty, or whatever.  But it's hard to think of a container without its expected contents.  Which more or less goes to prove my point.

    Friday, May 20, 2011

    Data, metaphorically

    Hmm ... haven't been in here in a while.  Everything still looks OK, just a bit musty.  Let's open up the curtains, blow the dust off the bookends.  Ah ... better.

    Now where was I?

    In the previous post, I tried to find an ordinary mass noun that behaved like data in its mass noun form, but without great success.  I'm not going to try to fix that here.  In fact, I'm going to try to explain why the effort failed, and to do that I want to explore how data behaves, metaphorically.  But first a bit about metaphor.

    Metaphor generally connotes figurative speech used for poetic effect, whether well
    That time of year thou mayst in me behold
    When yellow leaves, or none, or few, do hang
    Upon those boughs which shake against the cold,
    Bare ruined choirs, where late the sweet birds sang.
    or perhaps not so well
    Head down into the storm they went, pressing barehanded to their chests an unshielded sense of peril.
    Um ... right.

    That's all well and good, but it's not the whole story.  Even the definition your English teacher gave was probably more like "A comparison made by referring to one thing as another."*  That's closer to Aristotle's definition and to the etymology from the Greek for "carry over", and in my view it's an apt one.  There is a strong case to be made that metaphor in this sense is not merely a figure of speech reserved for flowery poetry and purple prose, but rather a fundamental aspect of how we think, whether we put those thoughts into words or not.

    Lakoff and Johnson, for example, make this case in Metaphors We Live By, which pulls together dozens of examples of particular metaphors and shows how, taken together, they imply underlying mental metaphors.  Far from grinding away at a desk in English class to produce a figure of speech that will survive the dreaded red pen, we effortlessly produce metaphors -- in Aristotle's sense -- in nearly every sentence.  These metaphors, on the order of "more is up/less is down" and "anger is a hot liquid" (it can boil over, you can get rid of it by blowing off steam, it behaves as a mass noun ...), are so pervasive we don't even see them as metaphors unless we look -- at which point we see them everywhere.

    (To get the flavor, go over that last paragraph.  Clearly "grinding away" is metaphoric, but so is "see" in "see them as metaphors", "pulls together" and even "in" in "in nearly every sentence".  Well, a sentence isn't really a container or bounded space, is it?)

    It's perfectly normal, indeed probably universal, to have more than one metaphoric view of a concept, and the different views don't have to be consistent.  For example, we can view ourselves as moving through time ("Let's just get through today.") or ourselves as stationary with time moving past ("What's coming next week?") depending on what works best at the moment.

    So, from this point of view, what is data (in the computing sense)?
    • It's a fluid.  It can flow or otherwise move from place to place.  It can leak.  It can fill up space.  It can also be compressed, but generally it acts more like a liquid than a gas.  If your data isn't flowing fast enough, you need a bigger pipe.
    • It's made up of discrete parts, ultimately bits.  It can be partitioned into chunks of uniform or varying size.  You can change parts independently, but only down to the bit level.
    • It's something of value. It can be secured, tampered with, stolen, bought, sold or given away.
    • It's a form of text.  It can be written, read, erased and copied.
    • You can search through it, organize it and make it universally accessible and useful ... wait, where did that come from?
    I'm sure that with a little more thought I could come up with several more metaphors for data, but I think that's enough to make two points: First, that data, like very many other concepts, can be described by internally consistent metaphors, and second, because these metaphors, as with those for other concepts, aren't always consistent with each other, there's no one concrete noun that could serve as a universal metaphor for data.  In other words, trying to fit water or stone or gravel or rice to data as a whole was doomed to failure from the beginning.

    That's to be expected, I suppose.  One definition of equality is that two things are the same if they can stand in for each other in all circumstances.  If it looks like rice, tastes like rice and is generally like rice in every observable way, then we may as well say it is rice.  Which leads me to a different definition of metaphor that I don't like nearly as much as the one I used:  A comparison of two unlike things that have something in common.

    That's fine as far as it goes, and in particular the things in a metaphor do have to be unlike, but it implies that the things being compared are interchangeable.  They aren't.  One thing is being explained by referring to it as the other.  Moreover, the thing being explained is always more abstract than the thing being referred to.  In the first data example, something very abstract (data), is being referred to as something more concrete (a fluid, for example).


    As always, the definition you choose makes a difference.  Seeing metaphor as a comparison between two unlike things with something in common provides a formula for incoherent images (What do you mean "The stop sign was a fire truck." isn't a good metaphor?  They're unlike but they're both red!).  Seeing metaphor as one thing carried over to stand in for another -- the original metaphor for metaphor -- opens up a vast and surprising new world.






    * It took me a while to find a definition I liked.  This one is courtesy of Gideon Burton's Silva Rhetoricae.

    Thursday, April 7, 2011

    Data and rice

    It may seem like this blog is turning into yet another "my usage can whip your usage" column, but bear with me for one more post.

    In computing contexts (where I spend most of my time), you'll almost invariably see "The data is ..." as opposed to "The data are ...".  The naive analysis of this is that data is plural (singular datum) and it is therefore simply incorrect to say "The data is ...".  Computer geeks simply don't know any better.

    A more reasonable analysis is that data here is a mass noun, like water or gravel.  Mass nouns are measured as opposed to counted.  You ask how much water, as opposed to how many eggs.  There is no such thing as a water, except in special cases like "I ordered a water," meaning a glass or similar serving, or "Bubblyfritz is a water that really refreshes," meaning a kind of water.  Mass nouns act singular.  You say "This water is salty," not "These water are salty."

    Fair enough, but in contexts outside computing people often use data as a plural:  "These data support my theory," or "We don't have many data to work with here."  A naive explanation is that people who say such things are just stuffier than computer geeks, who are notoriously playful in their use of language.  A more reasonable explanation is that such speakers are using data as an ordinary (count) noun, albeit one with an irregular plural carried over from Latin.  Consistent with this, people also use the singular datum in various ways, including forms like "This datum doesn't fit with the other data."

    Why do different groups adopt different usages, each perfectly defensible?  No doubt culture plays a part.  One's use of data indicates whether one is a grammatical ignoramus who doesn't realize that data is a plural form or an uptight pedant who insists on applying arbitrary rules from dead languages.  However, I think that, even allowing for this effect, there is another reason to use data as either singular or plural depending on circumstances.

    We computer geeks typically deal with lots and lots of data.  The more the better.  We eat terabytes for breakfast and gigabytes for a light snack.  Further, the individual bits generally don't carry any particular significance.  The fourth bit of the ASCII or Unicode representation of the 'T' at the beginning of this sentence didn't come from some physical measurement, but from an arbitrary encoding.  We also process data differently.  One doesn't generally take the mean or standard deviation of the bytes in a blog post or audio clip.

    In short, our data is a different beast from a statistician's or biologist's.  It really only makes sense when considered in aggregate.  Metaphorically it acts much like a substance.  We speak of storing data, or moving it from place to place.  We wonder how much space we have left for storing data.  We even speak of compressing data, some of which might be lost in the process, and of "memory leaks" filling up available heap space.  In short, computer data acts like a mass noun.

    Conversely, individual statistical or scientific data are significant.  If I measure the temperature today, that's a datum (but more commonly data point -- I'll come back to that).  If I measure the temperature again tomorrow, that's another datum.  Once I've accumulated a data set [hmm ... not "datum set"?], I try to derive some aggregate measure from that, but the key word here is "derive".  The individual data are the source of truth.

    To get a better measure, I may throw out particular data.  I may present the data sorted or grouped in various ways to make particular points.  I might note some property of a particular datum, or I might call attention to the source of some subset of the data as opposed to the others.  In all these cases the individual data have their own identity and it's perfectly logical to refer to them collectively in the plural and to an individual one in the singular.



    I mentioned rice in the title, didn't I?

    While casting about for more ordinary, physical analogs to computer data, I started looking for mass nouns that fit the part.  I didn't like water because computer data is ultimately discrete.  A terabyte is a lot of bytes (somewhat over a trillion*), but still a finite number of distinct bytes. For practical purposes water is infinitely divisible.

    What about rock or stone?  They can certainly behave as mass nouns.  You can order a ton of rock or fill your pickup truck with stone.  But you can also use the same word (not a variant form) in the singular.  You can say "a rock" or "a stone".  Not even a compugeek would normally say "a data".

    I had been looking for an aggregate of individual pieces which are so numerous that we treat the aggregate as a substance.  You can certainly gather rocks together until somewhere along the line you've shifted from "some rocks" to "some rock".  But on the other hand, rock and stone can be treated as substances themselves.  You can refer to a chunk of rock or a statue made of stone.  In fact, you could argue that we have two mass nouns called "stone": stone as a substance, and an aggregate of stones, which are usually made of stone in the first sense (but even if the individual stones were made of some clever compound of plastic, chances are a ton of them would still be called "stone").  Computer data doesn't have that extra level.

    I had originally called this post Data and gravel, but computer data is made up of individual parts of uniform size (ultimately bits) while gravel is visibly irregular.  Sand is closer, and was my second try.  Grains of sand may be irregular, but they're small and there are so many that you don't really notice.  Ultimately, though, something like rice seems closest.  The parts are small, numerous and visually uniform.

    If you have an aggregate of smaller objects that you treat collectively as a mass noun, you may still need to refer to the individual pieces from time to time.  There are several ways of doing this:
    • In cases like brick, rock or stone, the name of the substance, meaning a small chunk of that substance (a brick, a rock, a stone).
    • In most cases, "a ____ of ...", where the blank may be filled in by something specialized (a  kernel of corn, a grain of sand) or, failing that, the generic piece (a piece of gravel).
    • In some but not all cases this can be turned around (a sand grain, a corn kernel, but a gravel piece sounds a bit odd to me).
    This may help explain why people like to say "data point".  If you have enough data to do meaningful statistics, it's easy (and even useful) to start thinking of it in the aggregate.  Working back from that, you can get a point of data, or more usually a data point.

    Finally, why would we choose data as the form for the mass noun, rather than the singular datum, by analogy with rock and stone?  First, data is simply used a lot more.  Second, it's data, not datum, that is used in contexts that work for both count nouns and mass nouns ("I can't interpret the data", "Could you send me your data?").  Further, mass nouns like brick and stone only seem to occur in the pattern mentioned above of substance ➝ small chunk of that substance ➝ aggregation of said small chunks.



    *  Yes, I've heard of tebibyte but I've never, ever heard it used seriously in real life.  Yes, it's an IEC standard.  Yes, IEEE officially says that a terabyte is exactly a trillion bytes as opposed to 2^40 = 1,099,511,627,776.  No one outside the standards committees cares, and I doubt even they care all that much most of the time.  In theory, it matters whether your disk holds exactly a trillion bytes or close to 10% more.  In practice, either your disk is nearly full or it isn't.  When it fills up, you buy more [I've since seen notations like TiB in the wild, but it adds little, as far as I can tell.  If it says TB, you still know it means the power of two and not an even trillion.  If you see TiB, you know it means the same thing, but whoever said TiB instead of TB wants you to know they know the difference --D.H.]

    Standards are great, and if a standards body wants to, say, limit files to 4,294,967,296 bytes they should either say "files shall be no larger than 4 gibibytes" or be clear that "GB" means 2^30 bytes.  Or they can just say "file sizes are 32 bits".  The rest of us will continue to blithely use the "wrong" units.

    That said, perhaps the distinction is becoming more important as the numbers get larger.  Where this all started, with 2^10 = 1,024 being practically 1000,  the error is only 2.4%.  At the megabyte level, the difference is 4.9% and at the gigabyte level, 7.4%.  Once we get into peta- and exa- territory, the errors are 13% and 15%, harder and harder to ignore.  Even then, manufacturers, who one would think might stand to gain by saying 1.1TB instead of 1TB, seem content to say 1TB anyway.  No harm, no foul.
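
    The drift between the power-of-two and power-of-ten readings is easy to compute for each prefix.  This is just a quick sketch of the arithmetic; the function name and prefix list are mine.

```python
# Decimal prefixes in order; the n-th one means 1000**n in SI usage and
# 1024**n in the traditional computing usage.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa"]


def binary_excess_percent(n: int) -> float:
    """Percent by which 1024**n exceeds 1000**n (hypothetical helper)."""
    return (1024 ** n / 1000 ** n - 1) * 100


if __name__ == "__main__":
    for n, name in enumerate(PREFIXES, start=1):
        print(f"{name:>5}: {binary_excess_percent(n):.1f}% larger in binary")
```

    The gap grows with each step because the 2.4% compounds: each prefix multiplies the previous ratio by another 1.024.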

    Friday, March 11, 2011

    Koala bears

    Admit it: You're thinking "But koalas aren't bears, they're marsupial!"

    Fair enough.  This is, after all, the response we've all had drilled into our heads since grade school.

    But why should it matter whether a koala is a marsupial and not an ursid?  Lots of things we call bears aren't members of family Ursidae.  For example:
    • Teddy bears
    • Chicago Bears football players
    • Statues of bears
    • Final exams
    • Goldilocks' Three Bears
    It would generally sound silly to point out that teddy bears aren't really bears, or that bears don't actually talk, eat porridge and live in houses, or that "That test was a bear!" is just an expression, so why the urge to point out that koalas aren't in Ursidae?

    Given the regularity with which it is pointed out that koalas "aren't really bears", it hardly adds much to point it out again.  There is, however, a fairly plausible explanation based not on some fundamental need for taxonomic accuracy, but on normal rotten human nature: It serves to make known that, yes, you went to school and you, too, know that koalas are marsupial (or at least "not real bears" if "marsupial" escapes you at the moment).  You thus mark yourself as belonging to the "in" group (albeit not a particularly exclusive one) of Those Who Know Koalas Aren't Closely Related To Those Other Animals We Call "Bears".

    Behind this is a more general notion: The "technical" definition is the "right" one and anything else is "incorrect" or, more cynically, "If I had it drummed into my head, so should you".

    Again, fair enough.  I have no doubt that such balding-ape behavior is at work here, but what triggers it? Once more, why do we not feel compelled to mark ourselves as belonging to Those Who Know That The Chicago Bears Are Not Really Hairy Carnivores (well, actually ...)?

    It seems this sort of behavior only comes out in borderline cases, where there is some chance that the listener isn't one of Those Who Know.  Koalas look and act a fair bit like ursids.  It's perfectly understandable that a European encountering a koala would think "That's a funny-looking bear," and so they did.  But now we know better, or at least Some of Us do.

    It has been said that academic disputes are bitter precisely because the stakes are so low.  Just so, quibbles over usage are most heated precisely when they are inherently least consequential, that is, when the distinction in question makes little difference.  If I say "koala bear", you won't think I'm talking about a squid.  You'll know exactly which animal I'm talking about, but feel a strong urge to whisper "He doesn't know it's not really a bear" to the first Person in the Know that you can find.

    The perfect in-group marker, evidently, is content-free.



    One might be tempted to think that when it was discovered that koalas weren't actually placental at all and thus were not on the same taxonomic branch as the Ursidae, all educated persons began calling them by their right name and "koala bear" fell into immediate disuse.   Not so.  Search Google books for "marsupial koala bear" in quotes and you'll find at least three books, clearly written by trained biologists.  Why would a biologist say "koala bear"?  Why not?  From a biological point of view common names carry no particular weight.  If you want to be clear and unambiguous, you say Phascolarctos cinereus (or P. cinereus for short).


    This sort of thing seems to happen a fair bit -- those who would presumably know best tend to be more casual in their usage than those who wish to appeal to them for authority.  It has been said, for example, that the term "tide" can only be properly applied to phenomena due to gravitational gradients, centuries of usage before and after Newton's gravitational explanation of the tides notwithstanding.  This does not, however, keep atmospheric scientists from studying "atmospheric tides".

    These small daily fluctuations in air pressure, due to heating from the sun and in no significant way related to the moon, are nicely analogous to the usual oceanic tides.  Three possible explanations for this presumably "incorrect" name come to mind:
    • Whoever coined the term "atmospheric tide" mistakenly thought they were caused by the moon's gravity.
    • Whoever coined the term was unaware of the rule that anything called a "tide" or "tidal" must have a gravitational cause.
    • No such rule exists.
    I'm going with the last of these.

    Saturday, January 8, 2011

    You

    Change is a part of language.  Of all the ways to justify a pedantic claim that one's pet usage is "correct" and Kids These Days are borderline illiterate, the appeal to history is one of the weakest.  OK, so originally decimate (or rather, decimare) referred to executing a randomly selected tenth of an insubordinate legion.  That meaning hasn't been current in English for decades or centuries, if indeed it ever was.  In today's English, decimate means "destroy almost totally", because that's how people use and understand the word.

    (Not that I don't inwardly cringe when I see the growing use of the possessive marker 's for the plural marker, as for example "Employee's only")

    One change that appears to be gaining acceptance is the use of they as a gender-neutral substitute for he or she, handily filling a gap that seems to have grown more noticeable over time.  It has the advantage of being a real word, as opposed to any of several newly-minted words that have been proposed for the purpose, so people already know how to pronounce it and what verb form to use with it.  Of course, everyone knows that they is actually plural, and so using it as a singular is incorrect and thus liable to lead upcoming generations horribly astray if not destroy their verbal capacity entirely.

    But of course, everyone is forgetting about you.

    Just like they, you is syntactically plural.  In particular it takes the plural form of its verb, as in you are.  Nonetheless, it is used for both singular and plural.

    English is unusual in this respect, at least as far as European languages go.  Most European languages use separate forms for singular and plural in the second person, as indeed English used to.  The singular form is almost always tu (as in the Romance languages) or some variant (e.g., German du).  There is more variation in the plural form, though in the Romance languages it's consistently a derivative of Latin vos.

    So why would English not distinguish, but instead use the plural form for both singular and plural? Well ...

    European languages don't just distinguish singular you from plural.  They generally also distinguish familiar from formal.  For example, in Dutch, a shopkeeper or bank teller will generally address a new customer as U, but two friends or family members will address each other as jij (pronounced "yiy" to rhyme with "sigh", or "yuh", depending on whether it's stressed).  There are two main patterns for this:
    1. The formal comes from the third person, as with German Sie or Spanish Usted (from Vuestra merced, literally "your mercy", more loosely "your grace").
    2. The formal comes from the plural, as with French vous.
    European languages also tend to distinguish case, as English still does with I vs. me, he vs. him and she vs. her.  English has lost almost all of its other case distinctions and seems intent on losing the rest.  The who vs. whom distinction is essentially gone, and even he/she vs. him/her only seems to matter in simpler contexts.  In my own experience, most people will say either me and him, if they think no one's looking, or he and I, if they think they need to use "proper grammar", regardless of the actual case involved.

    What does all this have to do with you?

    English used to use a perfectly ordinary European system: singular thou (cognate with tu) and plural ye, with the plural doubling as the formal.  In the accusative case (direct object of a sentence), the forms are thee and you.  So:
    • Shall I compare thee to a summer's day?
    • Thou art more lovely and more temperate.
    • (I had to go back to Chaucer for clear examples of you as opposed to ye):  But first I pray you, of your courtesy/That ye narrate it not my villainy
    I don't know the actual order of the first two events, but over time
    • The ye/you case distinction was lost in favor of you
    • The you form became the (singular and plural) formal as well as the plural familiar
    • The familiar/formal distinction was lost, again in favor of you (to the extent that to modern ears the familiar thou tends to sound "formal")
    So, behind that simple word you, one-size-fits-all-numbers-and-cases, lurks an elaborate structure of case, number and familiar/formal distinctions, most of which is now long forgotten.


    And if only it were that simple.  There are plenty of wrinkles in the basic pattern of "tu, some plural form, either plural or third person for formal (both singular and plural)":
    • German uses third-person (Sie) while its close cousin English used the plural (you)
    • Dutch (about as close as you can get to both German and English) uses different forms (jij and U) both cognate with the English plural you (and German accusative plural euch), for the singular familiar and formal respectively.
    • Spanish uses a third-person formal, but the Vuestra in Vuestra merced implies that it had previously used the plural vos.
    • Spanish distinguishes between formal singular and plural (Usted and Ustedes -- Vuestras mercedes), while most European languages only distinguish singular and plural in the familiar form
    • French uses the plural vous for the formal -- actually, French is relatively straightforward in that respect.
    • Italian has both (Lei, a third-person form, is more widely used, but some dialects retain the older voi).
    • English has several unofficial plural forms (y'all, you guys, youse, you-uns etc.), which leave the once-plural you as a purely singular form