Buzzword Bingo (1) – Big Data

“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…” (Dan Ariely)

I was working with a data product before I had ever heard the term “big data”. Nesstar had taken an EU-funded development all the way to an excellent commercial product, developing the concept of the semantic web to give structure and meaning to data, so that those data could be disseminated on the web to users with a standard web browser. Users could browse, analyse and visualise data from very large data sets. The aim was to allow non-expert users to find data relevant to their needs and then, most importantly, to derive meaning from the data.

Ten years on, big data is a hot topic, often in relation to the Internet of Things, where the proliferation of sensors will generate lots of data. In many ways the challenge is the same as that addressed by Nesstar – how to gain meaningful insight from large amounts of data. The difference is in the diversity of the data. Nesstar worked on static data that had been carefully prepared and uploaded onto a server – data preparation was a major effort. The new big data story is often about dynamic data collected from many sources, some of them real sensors, others on the internet – and those sources might be controlled by a range of organisations. The data will often be near real-time, so there is likely to be little scope for manual intervention in data preparation. The process will in many cases be machine-based from end to end. The common feature is that for big data the challenge is for systems to make sense of the information, either for presentation to humans or as input into machine-controlled systems.

This all means that there’s something for everyone in big data, ranging from acquisition, communication, data management and storage to, perhaps most importantly, the applications that make sense of and make use of the data. There are also questions of who owns the data (whether it’s analysis of your location or browsing habits as collected by Google, or your electricity consumption as measured by your smart meter), so even the lawyers can take part.

Fortunately for those of us involved in product development, there is still plenty to be done in pulling all these elements together.