A data revolution
‘Big Data: A Revolution That Will Transform How We Live, Work and Think’ is a new book written by the Economist’s Data Editor, Kenneth Cukier, and Oxford professor Viktor Mayer-Schonberger. Owen McQuade reviews their conclusions on this key technological trend.
Whilst the authors say they are “messengers of big data, not its evangelists”, their enthusiasm for the subject comes across. They document, with lively examples beyond the obvious Amazon and Google case studies, how big data is happening now.
The era of big data has come about due to three drivers: the ability to analyse large amounts of data, an acceptance of data’s “real-world messiness” and a focus on correlations rather than causes.
The book looks at emerging concepts in the world of big data including datafication, in which any phenomenon can be put into a quantified format so it can be tabulated and analysed. This is different from the process of digitisation, which converts analogue data into binary code that a computer can handle.
One example is Google’s scanning of books, which began with digitisation, capturing each page as a digital image, and then moved to datafication, using optical character recognition software that could take this digital image and recognise the letters, words and sentences. The result was datafied text which could be indexed and thus searched. The book sees a trend towards treating location as data and expects much growth in the datafication of interactions on social media platforms. It even goes so far as to say that with a little imagination most, if not all, things can be rendered into data form.
Although the book sees huge potential in the value of big data, and the economic efficiency it brings, it does look at “the dark side of big data”. The most obvious downside is that big data makes privacy much harder to protect. The authors say they are both “troubled” by an entirely new menace, “propensity”, in which big data predictions could be used to judge and punish people before they have even acted. A third danger is what they call the “dictatorship of data”, with its “fetishization” of data leading organisations to defer blindly to what the data says without understanding its limitations.
There are many examples throughout the book, such as police forces using algorithmic models to decide where and when to patrol. In the final chapter, entitled ‘Next’, the authors give an excellent detailed case study which brings many of the issues in the book together. It shows how New York City’s Director of Analytics has used data from a variety of sources to tackle the problem of illegal property conversions, which were major fire hazards as well as hotspots for crime. By using big data techniques they increased the percentage of cases investigated that ended up with a warrant to vacate from 13 per cent to an impressive 70 per cent.
This book is a very well written and balanced look at big data with interesting examples throughout. It deals with the negatives as well as enthusing about its potential, describing big data as “the moment when the information society finally fulfils the promise implied by its name.” We are living in the time “when the possession of knowledge, which once meant an understanding of the past, is coming to mean an ability to predict the future.”