What is the right question, “What is Big Data” or “How to use the data”? – Didem Gundogdu

by ThePercept

“It is appropriate to say: there is no Big Data, just Data.”

If you are enthusiastic about computers and old enough to have followed the trends, you have probably come across a few buzzwords over the last 30 years. In the early 90s it was the web and e-mail, followed by ERP (Enterprise Resource Planning) and CRM (Customer Relationship Management) software. Today it is mostly AI (Artificial Intelligence), Machine Learning, Deep Learning and Big Data.

So, what is big data? According to the Oxford Dictionary, “data” is the plural of the Latin “datum”, which means a piece of information. Based on that definition, we can treat every piece of information as data: how many steps we take in a day, whom we are friends with on Facebook, which tweets we like and share, or what we buy at the grocery store. Each trace we leave behind is a breadcrumb in a massive pile of data. It does not even have to be digital; our waste, too, can be treated as part of big data [1]. Perhaps it is more appropriate to say that there is no Big Data, just Data.

So why has it become so important, and why do we hear about it almost every day? The reason is that our lives are becoming more digital, and the data we leave behind can be accessed and processed more easily with current technologies. Big data is commonly characterized by the 3Vs: Volume, Velocity, and Variety. Reportedly, 90% of the data in the world was generated within the last two years (Volume and Velocity) [2]. The availability of various data types (Variety), such as your location (GPS), your posts on social networks, your expenditures, and what you watch on TV, makes it possible to develop more intelligent applications. Access to a variety of data types lets us correlate them and draw meaningful conclusions.

In fact, data is still data, whether it is big or sits in a spreadsheet. However, there is a wind of change in the air: corporations have come to understand the value of data. Today, companies such as Netflix make decisions based on it. Netflix decides which TV series to buy because it knows the tastes of millions of users. The success of “House of Cards” was not a surprise [3].

To benefit more from data, you have to know how to exploit it. So how does it work? What most of us are not aware of is that there are patterns hidden in the data. If you have a kaleidoscope (statistical intelligence or, more broadly, machine learning algorithms), you can gain magical insights.

In 1814, the well-known French mathematician Pierre-Simon Laplace described what is now known as Laplace's Demon: “if someone (the Demon) knows the precise location and momentum of every atom, their past and future values at any given time are entailed” [4]. This deterministic view relates easily to the problems we face today when we try to understand data. A very recent example is the last presidential election in the USA: contrary to the forecasts, Trump won against Clinton [5].

The power of data is enormous. However, it can be misleading if you do not have enough data or if your model is inadequate to represent the real world. In that case, when we expect magic from the crystal ball, we end up with an illusion.
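The "not enough data" half of that warning can be made concrete with a toy simulation. The sketch below is only an illustration under invented assumptions: a two-candidate race in which candidate A truly has 52% support, perfectly random sampling, and hypothetical function names and sample sizes. It is not a model of any real election or polling method, and it says nothing about the second risk, an inadequate model.

```python
import random


def simulate_poll(true_share, sample_size, rng):
    """Ask `sample_size` random voters; return the share supporting candidate A."""
    votes_for_a = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return votes_for_a / sample_size


def wrong_call_rate(true_share, sample_size, n_polls=500, seed=0):
    """Fraction of simulated polls that fail to show the true leader ahead.

    Assumes candidate A truly leads (true_share > 0.5), so a poll counts
    as wrong when it shows A at or below 50%.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wrong = 0
    for _ in range(n_polls):
        if simulate_poll(true_share, sample_size, rng) <= 0.5:
            wrong += 1
    return wrong / n_polls


if __name__ == "__main__":
    # Assumed values for illustration only: 52% true support, two sample sizes.
    for size in (50, 2000):
        rate = wrong_call_rate(true_share=0.52, sample_size=size)
        print(f"sample size {size}: wrong call in {rate:.1%} of polls")
```

With only 50 respondents per poll, a large share of the simulated polls miss the true leader; with 2000 respondents, far fewer do. The data is drawn from the same "reality" in both cases; only the amount of it changes.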

About Didem Gündoğdu:

Didem Gündoğdu is a PhD student in the Department of Information Engineering and Computer Science at the University of Trento, Italy, on a scholarship from Fondazione Bruno Kessler, in the Mobile and Social Computing Lab (MobS Lab). She completed her MSc in computer science at Bogazici University, Turkey, after 15 years in the information technologies industry. Her main research topic is anomaly detection in telecommunication data using stochastic methods.

References:

  1. http://www.hurriyet.com.tr/istanbulun-cop-haritasi-40256009
  2. http://www.vcloudnews.com/every-day-big-data-statistics-2-5-quintillion-bytes-of-data-created-daily/
  3. http://www.nytimes.com/2013/02/25/business/media/for-house-of-cards-using-big-data-to-guarantee-its-popularity.html
  4. https://en.wikipedia.org/wiki/Laplace%27s_demon
  5. http://www.nytimes.com/2016/11/10/technology/the-data-said-clinton-would-win-why-you-shouldnt-have-believed-it.html