Quantifying data
An information explosion is well under way. Peter Cheney looks at its causes and effects.
The near-collapse of the world’s banking systems, in the not too distant past, was partly triggered by vast amounts of financial data being misread by the computer models designed to analyse risk. The crisis illustrated both the scale of data in today’s world and the problems that arise when information is not handled properly.
Digital information is now increasing tenfold every five years as more is accumulated, stored and shared through the internet. The number of blogs alone is thought to double every six months.
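To put those two rates on a common footing, the short calculation below converts each into an equivalent yearly growth factor. It is a rough sketch that assumes smooth compound growth, which the quoted figures do not guarantee; the function name `annual_factor` is purely illustrative.

```python
def annual_factor(multiple: float, years: float) -> float:
    """Per-year growth factor implied by growing `multiple`-fold over `years` years.

    Assumes steady compound growth (an assumption, not a claim from the article).
    """
    return multiple ** (1.0 / years)


data_factor = annual_factor(10, 5)    # tenfold every five years
blog_factor = annual_factor(2, 0.5)   # doubling every six months

print(f"Digital information: about {100 * (data_factor - 1):.0f}% growth per year")
print(f"Blogs: about {100 * (blog_factor - 1):.0f}% growth per year")
```

On these assumptions, a tenfold rise over five years works out at a little under 60 per cent a year, while doubling every six months means a fourfold rise each year.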
This certainly has its benefits. Holiday bookings, shopping, personal banking and academic research are quicker and often more convenient online than in person. Communication is virtually instant. However, the same principles apply to the problems. Data loss or theft, IT breakdowns and the spread of rumour and false information are made all the easier in a digital world.
Information explosion and big data are two of the terms used to describe these ongoing trends.
The former phrase was coined during the 1960s and was already “much discussed” by 7 June 1964, according to that day’s New York Times. The latter is a more recent invention by scientists and IT experts, describing the exponential rise in information experienced in the late 2000s.
With no global ‘governing body’ per se, data is hard to quantify exactly, although the US-based International Data Corporation estimates that 1,250 ‘exabytes’ of information will be created this year, compared with fewer than 250 in 2005. One exabyte equals one billion gigabytes. Sources for such data include the callers on over four billion mobile phones and up to two billion internet users.
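For a sense of scale, the figures above can be converted into gigabytes and compared directly. This is a back-of-the-envelope sketch using only the numbers quoted in the text; the variable names are illustrative, and the 2005 value is treated as an upper bound because the estimate is given only as under 250 exabytes.

```python
GIGABYTES_PER_EXABYTE = 1_000_000_000  # one exabyte = one billion gigabytes

created_this_year_eb = 1_250  # IDC estimate quoted in the text
created_2005_eb = 250         # upper bound: the text gives only "under 250"

# Because the 2005 figure is an upper bound, the ratio is a lower bound.
ratio = created_this_year_eb / created_2005_eb
gigabytes = created_this_year_eb * GIGABYTES_PER_EXABYTE

print(f"At least {ratio:.0f} times the 2005 figure")
print(f"{gigabytes:,} gigabytes created this year")
```

In other words, the volume created in a single year is at least five times the 2005 level.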
The technological reasons for this trend, i.e. the growth of the internet and cheaper, more widely available handsets, are obvious, although they do not tell the whole story. There is a cultural and scientific side to this too.
Economic growth is accompanied by the better literacy needed to learn new technologies. While the original technology was invented in Europe and America, its use is now growing fastest among the emerging middle classes in the Third World.
The Chinese Government claims that there were 338 million internet users in that country last June, up from 298 million six months previously. That 13.4 per cent increase came on the back of a 41.9 per cent annual increase over 2008; internet growth that year was 60.8 per cent in the countryside. India reportedly has around 80 million subscribers, Brazil 67.5 million.
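The six-month figure can be checked from the two user totals quoted above. The snippet is a minimal sketch; `percent_increase` is an illustrative helper, not part of any official methodology.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    return 100 * (new - old) / old


# Reported internet users in China, in millions, six months apart.
half_year_growth = percent_increase(298, 338)
print(f"Six-month increase: {half_year_growth:.1f} per cent")
```

The result agrees with the 13.4 per cent quoted for the half-year.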
In addition, it is no coincidence that commentators started talking about an ‘information explosion’ during the space race. Computer technology advanced rapidly to meet engineering’s demands, while the first satellites opened up a completely new perspective on the world, accompanied by a flood of data. Modern telescopes have become some of the largest accumulators of data, by photographing and analysing the universe in detail.
Making sense of the data is a challenge in itself, but a manageable one with the right hardware and software. The real problem is storage. Up to 2006 or 2007, the ‘explosion’ was contained because there was more storage space globally than there was information. Since then, information ‘stores’ have effectively overflowed.
According to the International Data Corporation, there is a 500 exabyte gap between the total amount of information and the total storage space, which is set to increase. As the Economist put it in February, research scientists now “collect what they can and let the rest dissipate into the ether.”
‘Big data’ can be problematic, as the financial crisis showed. The natural lesson seems to be that data needs to be better controlled and analysed. As the task gets bigger, there are clearly more opportunities for businesses and graduates to find solutions.