Wednesday, October 27, 2021

Mortality by the numbers

The following post talks about life expectancy, which inevitably means talking about people dying, and mostly-inevitably doing it in a fairly clinical way.  If that's not a topic you want to get into right now, I get it, and I hope the next post (whatever it is) will be more appealing. 


Maybe I just need to fix my news feed, but in the past few days I've run across at least two articles stating that for most of human existence people only lived 25 years or so.

Well ... no.

It is true that life expectancy at birth has taken a large jump in recent decades.  It's also true that estimates of life expectancy from prehistory up to about 1900 tend to be in the range of 20-35 years, and that estimates for modern-day hunter-gatherer societies are in the same range.  As I understand it, that's not a complete coincidence since estimates for prehistoric societies are generally not based on archeological evidence, which is thin for all but the best-studied cases, or written records, which by definition don't exist.  Rather, they're based on the assumption that ancient people were most similar to modern hunter-gatherers, so there you go.

None of this means that no one used to live past 25 or 30, though.  The life expectancy of a group is not the age by which everyone will have died.  That's the maximum lifespan.  Now that life expectancies are in the 70s and 80s, it's probably easier to confuse life expectancy with maximum lifespan, and from there conclude that life expectancy of 25 means people didn't live past 25, but that's not how it works.  For example, in the US, based on 2018 data, the average life expectancy was 78.7 years, but about half the population could expect to still be alive at age 83, and obviously there are lots of people in the US older than 78.7 years.  The story is similar for any real-world calculation of life expectancy.

A life expectancy of 25 years means that if you looked at everyone in the group you're studying, say, everyone born in a certain place in a given year, then counted up the total number of years everyone lived and divided that by the number of people in your group, you'd get 25 years.  For example, if your group includes ten people, three of them die as infants and the rest live 10, 15, 30, 35, 40, 50 and 70 years, that's 250 person-years.  Dividing that by ten people gives 25 years.
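If you like seeing the bookkeeping spelled out, here's the same calculation as a few lines of Python, counting the three infants as living zero years (a simplification, but it matches the arithmetic above):

```python
# The made-up group above: three infants counted as zero years,
# plus seven people who lived 10, 15, 30, 35, 40, 50 and 70 years.
ages_at_death = [0, 0, 0, 10, 15, 30, 35, 40, 50, 70]

life_expectancy = sum(ages_at_death) / len(ages_at_death)
print(life_expectancy)  # 25.0, even though no one in the group died at 25
```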

No matter what particular numbers you use, the only way the life expectancy can equal the maximum lifespan is if everybody lives to exactly that age.  If some people in a particular group died younger than the life expectancy, that means that someone else lived longer. 

Sadly, the example above is a plausible distribution for most times and places.  Current thinking is that for most of human existence, infant mortality has been much higher than it is now.  If you survived your first year, you had a good chance of making it to age 15, and if you made it that far, you had a good chance of living at least into your forties and probably your fifties.  In the made-up sample above, the people who made it past 15 lived to an average age of 45.  However, there was also a tragically high chance that a newborn wouldn't survive that first year.

Life expectancies in the 20s and 30s are mostly a matter of high infant mortality, and to a lesser extent high child mortality, not a matter of people dying in their mid 20s.  For the same reason, the increase in life expectancy in the late 20th century was largely a matter of many more people surviving their first year and of more children surviving into adulthood (even then, the rise in life expectancy hasn't been universal).


In real environments where average life expectancy is 25, there will be many people considerably older, and a 24-year-old has a very good chance of making it to 25, and then to 26 and onward.  The usual way of quantifying this is with age-specific mortality, which is the chance at any particular birthday that you won't make it to the next one (this is different from age-adjusted mortality, which accounts for age differences when comparing populations).

At any given age, you can use age-specific mortality rates to calculate how much longer a person can expect to live.  By itself, "life expectancy" means "life expectancy at birth", but you can also calculate life expectancy at age 30, or 70 or whatever.  From the US data above, a 70-year-old can expect to live to age 86 (85.8 if you want to be picky).  A 70-year-old has a significantly higher chance of living to be 86 than someone just born, just because they've already lived to 70, whether or not infant mortality is low and whether the average life expectancy is in the 70s or 80s or in the 20s or 30s.  They also have a 100% chance of living past 25.
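To make that concrete, here's a minimal sketch of the calculation.  The mortality table is a made-up Gompertz-style curve (roughly 0.075% at age 20, growing about 8% a year), not the actual CDC table the figures above come from, so the printed numbers are only illustrative; the point is the pattern, not the exact values:

```python
def remaining_life_expectancy(age, q):
    """Expected additional whole years of life at `age`, where q[a] is
    the chance of dying between birthday a and birthday a+1."""
    expected, alive = 0.0, 1.0
    for a in range(age, len(q)):
        alive *= 1.0 - q[a]   # probability of reaching birthday a + 1
        expected += alive     # each birthday reached adds one expected year
    return expected

# Hypothetical mortality table for ages 0-119 (not real CDC data).
q = [min(1.0, 0.00075 * 1.08 ** (a - 20)) for a in range(120)]

for age in (0, 30, 70):
    print(age, round(age + remaining_life_expectancy(age, q), 1))
# The exact outputs depend on the toy table, but the pattern is the point:
# expected age at death is higher at 70 than at birth.
```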

Looking at it from another angle, anyone who makes it to their first birthday has a higher life expectancy than the life expectancy at birth, anyone who makes it to their second birthday has a higher life expectancy still, and so forth.  Overall, the number of years you can expect to live beyond your current age goes down each year, because there's always a chance, even if it's small, that you won't live to see the next year.  However, it goes down by less than a year each year, because that chance isn't 100%.  Even as your expected number of years left decreases, your expected age of death increases, but more and more slowly as you age.

Past a certain point in adulthood, age-specific mortality tends to increase exponentially.  Since the chances of dying at, say, age 20 are pretty low, and the doubling period is pretty long, around 8-10 years, and the maximum for any probability is 100%, this doesn't produce the hockey-stick graph that's usually associated with exponential growth, but it's still exponential.  Every year, your chance of dying is multiplied by a fairly constant factor of around 1.08 to 1.09, or 8-9% annual growth, compounded.  Again from the US data, at age 20 you have about a 0.075% chance of dying that year.  At age 87, it's about 10%.  At age 98, it's about 30%.

This isn't a law of nature, but an empirical observation, and it doesn't seem to quite hold up at the high end.  For example, CDC data for the US shows a pretty plausibly exponential increase up to age 99, where the table stops, but extrapolating, the chance of death would become greater than 100% somewhere around age 110, even though people in the US have lived longer than that.
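For what it's worth, here's the back-of-the-envelope version of that extrapolation, using only the two figures quoted above rather than the full CDC table; it lands in the same neighborhood:

```python
# Extrapolating a pure exponential through the two figures quoted above:
# about a 10% chance of dying in the year at age 87 and about 30% at 98.
growth = (0.30 / 0.10) ** (1 / (98 - 87))   # ~1.105, i.e. ~10.5% per year
q, age = 0.30, 98
while q < 1.0:
    q *= growth
    age += 1
print(round(growth, 3), age)   # crosses 100% at 111 here, i.e. around age 110
```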



It's been predicted that at some point, thanks to advances in medicine and other fields, life expectancy will start to increase by more than one year per year, and as a consequence anyone young enough when this starts to happen will live forever.  Life expectancy doesn't work that way, either.  There could be a lot of reasons for life expectancy in some population to go up by more than a year in any given year.

Again, the important measure is age-specific mortality.  If the chances of living to see the next year increase just a bit for people from, say, 20 to 50, life expectancy could increase by a year or more, but that just means that more people are going to make it into old age.  It doesn't mean that they'll live longer once they get there.

The key to extending the maximum lifespan is to increase the chances that an old person will live longer, not to increase the chances that someone will live to be old.  If, somehow, anyone 100 or older, and only them, suddenly had a steady 99% chance of living to their next birthday, then the typical (median) 100-year-old could look forward to living to about 169.  This wouldn't have much effect on overall life expectancy, though, because there aren't that many 100-year-olds to begin with.
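Here's where the 169 comes from, under that purely hypothetical 99%-per-year assumption:

```python
import math

# Under the hypothetical above -- a steady 99% chance each year of reaching
# the next birthday once you're 100 -- half of the 100-year-olds would
# still be alive this many years later:
half_life = math.log(0.5) / math.log(0.99)
print(round(100 + half_life, 1))   # about 169.0
# (The mean would be higher, close to 199, since a lucky few live far longer.)
```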

What are the actual numbers, once you get past, say, 100?  It's hard to tell, because there aren't very many people that old.  How many people live to a certain age depends not only on age-specific mortality, but on how many people are still around at what younger ages.  This may seem too obvious to state, but it's easy to lose track of this if you're only looking at overall probabilities.

Currently there's no verified record of anyone living to 123 and only one person has been verified to live past 120.  No man has been verified to live to 117, and only one has been verified to have lived to 116.  Does that mean that no one could live to, say, 135?  Not necessarily.  Does it mean that women inherently live longer than men?  Possibly, but again not necessarily.  Inference from rare events is tricky, and people who do this for a living know a lot more about the subject than I do, but in any case we're looking at handfuls out of however many people have well-verified birth dates in the early 1900s.

Suppose, for the sake of illustration, that after age 100 you have a steady 50/50 chance of living each subsequent year.  Of the people who live to 100, only 1/2 will live to 101, 1/4 to 102, then 1/8, 1/16 and so forth.  Only 1 in 1024 will live to be 110 and only 1 in 1,048,576 -- call it one in a million -- will live to 120.
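The halving looks like this, with an extra column showing the expected survivors if you start with a million 100-year-olds (which is where the next paragraph picks up):

```python
# The 50/50 illustration above: of those who reach 100, the fraction
# still alive halves at each later birthday.
for target in (101, 102, 110, 120):
    fraction = 0.5 ** (target - 100)
    print(target, fraction, round(1_000_000 * fraction, 2))
# Age 110: 1/1024 of the group (about 977 out of a million).
# Age 120: about one in a million -- an expected 0.95 survivors.
```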

If there are fewer than a million 100-year-olds to start with, the odds are against any of them living to 120, but they're not zero.  At any given point, you have to look at the ages of the people who are actually alive, and (your best estimate of) their odds of living each additional year.  If there are a million 100-year-olds now and each year is a 50/50 proposition, you'd expect somewhere around one 120-year-old in twenty years, quite possibly none, but if there does happen to be a 119-year-old after 19 years, there's a 50% chance there will be a 120-year-old a year later.  By the same reasoning, it's less likely that there were any 120-year-olds a thousand years ago, not only because age-specific mortality was very likely higher, but because there were simply fewer people around, so there were fewer 100-year-olds with a chance to turn 101, and so forth.

In real life, a 100-year-old has a much better than 50% chance of living to be 101, but we don't really know whether age-specific mortality ever levels off.  We know that it's less than 100% at age 121, because someone lived to be 122, but that only tells us that at some point the exponential increase in age-specific mortality stops (otherwise it would hit 100% before then, based on the growth curve at ages where we do have a lot of data).  It doesn't mean that the mortality rate levels off.  It might still be climbing toward 100%, just slowly enough that it doesn't actually get there until sometime after age 121.

It may well be that there's some sort of mechanism of human biology that prevents anyone from living past 122 or thereabouts, and some mechanism of female human biology in particular that sets the limit for women higher than for men.  On the other hand, it may be that there aren't any 123-year-olds because so far only one person has made it to 122, and their luck ran out.

Similarly, there may not have been any 117-year-old men because not enough men made it to, say, 80, for there to be a good chance of any of them making it to 116.  That in turn might be a matter of men being more likely to die younger, for example in the 20th-century wars that were fought primarily by men.  I'm sure that professionals have studied this and could probably confirm or refute this idea.  The main point is that after a certain point the numbers thin out and it becomes very tricky to sort out all the possible factors behind them.

On the other hand, even if it's luck of the draw that no one has lived to 123, there could still be an inherent limit, whether it's 124, 150 or 1,000, just that no one's been lucky enough to get there.


Along with the difference between life expectancy and lifespan, and the importance of age-specific mortality, it's important to keep in mind where the numbers come from in the first place.  Life expectancy is calculated from age-specific mortality, and age-specific mortality is measured by looking at people of a given age who are currently alive.  If you're 25 now, your age-specific mortality is based on the population of 25-year-olds from last year and what proportion of them survived to be 26.  Barring exceptional circumstances like a pandemic, that will be a pretty good estimate of your own chances for this year, but it's still based on a group you're not in, because you can only measure things that have happened in the past.

If you're 25 and you want to calculate how long you can expect to live, you'll need to look at the age-specific mortalities for age 25 on up.  The higher the age you're looking at, the more out-of-date it will be when you reach that age.  Current age-specific mortality for 30-year-olds is probably a good estimate of what yours will be at age 30, but current age-specific mortality at 70 might or might not be.  There's a good chance that 45 years from now we'll be significantly better at making sure a 70-year-old lives to be 71.  

Even if medical care doesn't change, a current 70-year-old is more likely to have smoked, or been exposed to high levels of carcinogens, or any of a number of other risk factors, than someone who's currently 25 will have been when they're 70.  Diet and physical activity have also changed over time, not necessarily for the better or worse, and it's a good bet they will continue to change.  There's no guarantee that our future 70-year-old's medical history will include fewer risk factors than a current 70-year-old's, but it will certainly be different.

For those and other reasons, the further into the future you go, the more uncertain the age-specific mortality becomes.  On the other hand, it also becomes less of a factor.  Right now, at least, it won't matter to most people whether age-specific mortality at 99 is half what it is now, because, unless mortality in old age drops by quite a bit, most people alive today are unlikely to live to be 99.

Sunday, May 2, 2021

Things are more like they are now than they have ever been

I hadn't noticed until I looked at the list, but it looks like this is post 100 for this blog.  As with the other blog, I didn't start out with a goal of writing any particular number of posts, or even on any particular schedule.  I can clearly remember browsing through a couple dozen posts early on and feeling like a hundred would be a lot.  Maybe I'd get there some day or maybe I wouldn't.  In any case, it's a nice round number, in base 10 anyway, so I thought I'd take that as an excuse to go off in a different direction from some of the recent themes like math, cognition and language.


The other day, a colleague pointed me at Josh Bloch's A Brief, Opinionated History of the API (disclaimer: Josh Bloch worked at Google for several years, and while he was no longer at Google when he made the video, it does support Google's position in the Google v. Oracle suit).  What jumped out at me, probably because Bloch spends a good portion of the talk on it, was just how much the developers of EDSAC, generally considered "the second electronic digital stored-program computer to go into regular service", anticipated back in 1949.

Bloch argues that its subroutine library -- literally a file cabinet full of punched paper tapes containing instructions for performing various common tasks -- could be considered the first API (Application Program Interface), but the team involved also developed several other building blocks of computing, including a form of mnemonic assembler (a notation for machine instructions designed for people to read and write without having to deal with raw numbers) and a boot loader (a small program whose purpose is to load larger programs into the computer memory).  For many years, their book on the subject, Preparation of Programs for Electronic Digital Computers, was required reading for anyone working with computers.

This isn't the first "Wow, they really thought of everything" moment I've had in my field of computing.  Another favorite is Ivan Sutherland's Sketchpad (which I really thought I'd already blogged about, but apparently not), generally considered the first fully-developed example of a graphical user interface.  It also laid foundations for object-oriented programming and offers an early example of constraint-solving as a way of interacting with computers.  Sutherland wrote it in 1963 as part of his PhD work.

These two pioneering achievements lie on either side of the 1950s, a time that Americans often tend to regard as a period of rigid conformity and cold-war paranoia in the aftermath of World War II (as always, I can't speak for the rest of the world, and even when it comes to my own part, my perspective is limited). Nonetheless, it was also a decade of great innovation, both technically and culturally.  The TX-2 computer at MIT's Lincoln Laboratory that Sketchpad ran on, operational in 1958, had over 200 times the memory EDSAC had in 1949 (it must also have run considerably faster, but I haven't found the precise numbers).  This development happened in the midst of a major burst of progress throughout computing.  To pick a few milestones:

  • In 1950, Alan Turing wrote the paper that described the Imitation Game, now generally referred to as the Turing test.
  • In 1951, Remington Rand released the UNIVAC-I, the first general-purpose production computer in the US.  The transition from one-offs to full production is a key development in any technology.
  • In 1951, Bell Labs announced the junction transistor, the first practical design for the solid-state transistor (the point-contact transistor had been demonstrated in 1947).
  • In 1952, Grace Hopper published her first paper on compilers. The terminology of the time is confusing, but she was specifically talking about translating human-readable notation, at a higher level than just mnemonics for machine instructions, into machine code, exactly what the compilers I use on a daily basis do.  Her first compiler implementation was also in 1952.
  • In 1953, the University of Manchester prototyped its Transistor Computer, the world's first transistorized computer, beginning a line of development that includes all commercial computers running today (as of this writing ... I'm counting current quantum computers as experimental).
  • In 1956, IBM prototyped the first hard drive, a technology still in use (though it's on the way out now that SSDs are widely available).
  • In 1957, the first FORTRAN compiler appeared.  In college, we loved to trash FORTRAN (in fact "FORTRASH" was the preferred name), but FORTRAN played a huge role in the development of scientific computing, and is still in use to this day.
  • In 1957, the first COMIT compiler appeared, developed by Victor Yngve et al.  While the language itself is quite obscure, it begins a line of development in natural-language processing, one branch of which eventually led to everyone's favorite write-only language, Perl.
  • In 1958, John McCarthy developed the first LISP implementation.  LISP is based on Alonzo Church's lambda calculus, a computing model equivalent in power to the Turing/Von Neumann model that CPU designs are based on, but much more amenable to mathematical reasoning.  LISP was the workhorse of much early research in AI and its fundamental constructs, particularly lists, trees and closures, are still in wide use today (Java officially introduced lambda expressions in 2014).  Its explicit treatment of programs as data is foundational to computer language research.  Its automatic memory management, colloquially known as garbage collection, came along a bit later, but is a key feature of several currently popular languages (and explicitly not a key feature of some others). For my money, LISP is one of the two most influential programming languages, ever.
  • Also in 1958, the ZMMD group gave the name ALGOL to the programming language they were working on.  The 1958 version included "block statements", which supported what at the time was known as structured programming and is now so ubiquitous no one even notices there's anything special about it.  The shift from "do this action, now do this calculation and go to this step in the instructions if the result is zero (or negative, etc.)" to "do these things as long as this condition is true" was a major step in moving from a notation for what the computer was doing to a notation specifically designed for humans to work with algorithms.  Two years later, Algol 60 codified several more significant developments from the late 50s, resulting in a language famously described as "an improvement on its predecessors and many of its successors".  Most if not all widely-used languages -- for example Java, C/C++/C#, Python, JavaScript/ECMAScript, Ruby ... can trace their control structures and various other features directly back to Algol, making it, for my money, the other of the two most influential programming languages, ever.
  • In 1959, the CODASYL committee published the specification for COBOL, based on Hopper's FLOW-MATIC work over the course of the 1950s.  As with FORTRAN, COBOL is now the target for widespread derision, and its PICTURE clauses turned out to be a major issue in the notorious Y2K panic.  Nonetheless, it has been hugely influential in business and government computing, and until not too long ago more lines of code were written in COBOL than anything else (partly because COBOL infamously requires more lines of code than most languages to do the same thing).
  • In 1959, Tony Hoare wrote Quicksort, still one of the fastest ways to sort a list of items, the subject of much deep analysis and arguably one of the most widely-implemented and influential algorithms ever written.
This is just scratching the surface of developments in computing, and I've left off one of the great and needless tragedies of the field, Alan Turing's suicide in 1954.  On a different note, in 1958, the National Advisory Committee on Aeronautics became the National Aeronautics and Space Administration and disbanded its pool of computers, that is, people who performed computations, and Katherine Johnson began her career in aerospace technology in earnest.

It wasn't just a productive decade in computing.  Originally, I tried to list some of the major developments elsewhere in the sciences, and in art and culture in general in 1950s America, but I eventually realized that there was no way to do it without sounding like one of those news-TV specials and also leaving out significant people and accomplishments through sheer ignorance.  Even in the list above, in a field I know something about, I'm sure I've left out a lot, and someone else might come up with a completely different list of significant developments.


As I was thinking through this, though, I realized that I could write much the same post about any of a number of times and places.  The 1940s and 1960s were hardly quiet.  The 1930s saw huge economic upheaval in much of the world.  The Victorian era, also often portrayed as a period of stifling conformity, not to mention one of the starkest examples of rampant imperialism, was also a time of great technical innovation and cultural change.  The idea of the Dark Ages, where supposedly nothing of note happened between the fall of Rome and the Renaissance, has largely been debunked, and so on and on.

All of the above is heavily tilted toward "Western" history, not because it has a monopoly on innovation, but simply because I'm slightly less ignorant of it.  My default assumption now is that there has pretty much always been major innovation affecting large portions of the world's population, often in several places at once, and the main distinction is how well we're aware of it.


While Bloch's lecture was the jumping-off point for this post, it didn't take too long for me to realize that the real driver was one of the recurring themes from the other blog: not-so-disruptive technology.  That in turn comes from my nearly instinctive tendency to push back against "it's all different now" narratives and particularly the sort of breathless hype that, for better or worse, the Valley has excelled in for generations.

It may seem odd for someone to be both a technologist by trade and a skeptical pourer-of-cold-water by nature, but in my experience it's actually not that rare.  I know geeks who are eager first adopters of new shiny things, but I think there are at least as many who make a point of never getting version 1.0 of anything.  I may or may not be more behind-the-times than most, but the principle is widely understood: Version 1.0 is almost always the buggiest and generally harder to use than what will come along once the team has had a chance to catch a breath and respond to feedback from early adopters.  Don't get me wrong: if there weren't early adopters, hardly anything would get released at all.  It's just not in my nature to be one.

There are good reasons to put out a version 1.0 that doesn't do everything you'd want it to and doesn't run as reliably as you'd like.  The whole "launch and iterate" philosophy is based on the idea that you're not actually that good at predicting what people will like or dislike, so you shouldn't waste a lot of time building something based on your speculation.  Just get the basic idea out and be ready to run with whatever aspect of it people respond to.

Equally, a startup, or a new team within an established company, will typically only command a certain amount of resources (and giving a new team or company carte blanche often doesn't end well).  At some point you have to get more resources in, either from paying customers or from whoever you can convince that yes, this is really a thing.  Having a shippable if imperfect product makes that case much better than having a bunch of nice-looking presentations and well-polished sales pitches.  Especially when dealing with paying customers.

But there's probably another reason to put things out in a hurry.  Everyone knows that tech, and software in general, moves fast (whether or not it also breaks stuff).  In other words, there's a built-in cultural bias toward doing things quickly whether it makes sense or not, and then loudly proclaiming how fast you're moving and, therefore, how innovative and influential you must be.  I think this is the part I tend to react to.  It's easy to confuse activity with progress, and after seeing the same avoidable mistakes made a few times in the name of velocity, the eye can become somewhat jaundiced.

As much as I may tend toward that sort of reaction, I don't particularly enjoy it.  A different angle is to go back and study, appreciate, even celebrate, the accomplishments of people who came before.  The developments I mentioned above are all significant advances.  They didn't appear fully-formed out of a vacuum.  Each of them builds on previous developments, many just as significant but not as widely known.

Looking back and focusing on achievements, one doesn't see the many false starts and oversold inventions that went nowhere, just the good bits, the same way that we remember and cherish great music from previous eras and leave aside the much larger volume of unremarkable or outright bad.

Years from now, people will most likely look back on the present era much the same and pick out the developments that really mattered, leaving aside much of the commotion surrounding it.  It's not that the breathless hype is all wrong, much less that everything important has already happened, just that from the middle of it all it's harder to pick out what's what.  Not that there's a lack of opinions on the matter.



The quote in the title has been attributed to several people, but no one seems to know who really said it first.