Wednesday, November 30, 2011

On exa, zetta, and beyond

Anyone who lives in a metric-system country knows what "kilo" means. A kilogram is 1000 grams, a kilometer is 1000 meters. Of course frequencies are measured in kilohertz, and in the computer world we have kilobits and kilobytes (although we are never quite sure whether that is 1000 or 1024!).

Most people even know that "mega" means a million. Power stations output megawatts of electricity, FM radios receive at megahertz frequencies, and atomic bombs deliver megatons. For years our disks were measured in megabytes, and for most of us our Internet connections are in megabits per second (although we are not quite sure whether that is 1,000,000 or 1024*1024!).

People with state-of-the-art computers are aware that giga means a (US) billion (a thousand million), and that tera means a thousand of those, but only because disk capacities have increased so rapidly. When you ask people what comes next, you tend to get puzzled looks. Most people aren't even sure whether a billion means a thousand million or a million million, so don't expect them to be expert in anything bigger than that!

Until recently only astrophysicists were interested in such large numbers, but with global data traffic increasing at over 30% per year, networking people are becoming accustomed to them as well.

For those who are interested, the next prefixes are peta (10^15), exa (10^18), zetta (10^21), and finally yotta (10^24). The last two names were only formally accepted in 1991. For those who prefer powers of two, the IEC has standardized kibi (Ki) for 2^10, mebi (Mi) for 2^20, gibi (Gi) for 2^30, tebi (Ti) for 2^40, etc., although these terms don't seem to have caught on.
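
To make the difference concrete, here is a small Python sketch of the two conventions (the helper name human_readable is just for illustration):

SI_PREFIXES  = ["", "kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]
IEC_PREFIXES = ["", "kibi", "mebi", "gibi", "tebi", "pebi", "exbi", "zebi", "yobi"]

def human_readable(n_bytes, binary=False):
    # Format a byte count using SI (powers of 1000) or IEC (powers of 1024) prefixes.
    base = 1024 if binary else 1000
    prefixes = IEC_PREFIXES if binary else SI_PREFIXES
    value, index = float(n_bytes), 0
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {prefixes[index]}bytes"

print(human_readable(10**15))               # 1.00 petabytes
print(human_readable(10**15, binary=True))  # 909.49 tebibytes -- same count, different convention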

Several years ago I heard that the total amount of printed information in the world's libraries does not exceed a few hundred petabytes. On the other hand, present estimates are that global IP traffic now amounts to about 30 exabytes per month, or about ten times the world's accumulated printed knowledge every day. By the middle of this decade traffic should surpass 100 exabytes per month, i.e., about the entire world's printed knowledge per hour.
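
As a back-of-the-envelope check in Python, taking 100 petabytes as a (low-end) stand-in for "a few hundred petabytes":

PB = 10**15
EB = 10**18

printed_knowledge = 100 * PB          # low-end of "a few hundred petabytes"
daily_traffic     = 30 * EB / 30      # ~30 exabytes per month, spread over ~30 days

print(daily_traffic / printed_knowledge)      # ~10 "libraries" worth of traffic per day

future_hourly = 100 * EB / (30 * 24)          # projected 100 exabytes per month, per hour
print(future_hourly / printed_knowledge)      # ~1.4, i.e., roughly one "library" per hour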

These data rates, and particularly their time derivatives, present the telecommunications community with major challenges. We have grown accustomed to sophisticated applications that transfer massive amounts of data. A prime example is the new breed of cellphone voice and meaning recognition services that send copious amounts of raw data back to huge servers for processing. Such applications can only continue to be provided efficiently and inexpensively if the transport infrastructure can keep up with the demand for data rates.

And that infrastructure is simply not scaling up quickly enough. We haven't found ways to continually increase the number of bits/sec we can put into long-distance fiber to compensate for the >30% annual increase in demand (although new research into mode-division multiplexing may help). Moore's law is only marginally sufficient to cover the increases in raw computational power needed in routers, and we will need Koomey's law for power consumption (MIPS per unit of energy doubles every year and a half) to continue unabated as well. And we haven't even been able to transition away from IPv4 now that all of its addresses have been exhausted!
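
To get a feel for what such growth rates mean when compounded (using only the figures mentioned above), a quick calculation:

annual_growth = 1.30                   # >30% per year demand growth
print(annual_growth ** 10)             # ~13.8: demand grows about 14x per decade
print(annual_growth ** 20)             # ~190: and about 190x over two decades

koomey_gain = 2 ** (10 / 1.5)          # efficiency doubling every 1.5 years
print(koomey_gain)                     # ~101.6: what Koomey's law delivers per decade, if it continues unabated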

If we don't find ways to make the infrastructure scale, then keeping up with exponential increases in demand will require exponential increases in cost.


Y(J)S