Big Data? Small Data?
If there is something that students of Inference and Econometrics learn quickly, it is that very few statistics apply to complete populations. Logically, they are based on samples and exclusions. In the rearview mirror we have the Spanish elections of 2016 and the various polls, which threw overboard assumptions that had been taken as firm facts. Why? People can lie, feel coerced or simply not be comfortable talking about sensitive matters.
To understand the difference between big and small data, we can look at micro- and macroeconomics. Both operate under the same precepts. Microeconomics focuses on the consumer, the distributor or the producer; macroeconomics, on whole populations and countries. But let's think about the austerity paradox: at the microeconomic level, it is best to watch one's spending, since the greater the savings, the greater the possibility of future investment. At the macroeconomic level, however, this is disastrous: consumption decreases, companies sell less, prices rise, future demand soars, GDP falls, unemployment increases...
Why are we telling you this? Well, because neither is better than the other. On the one hand, we have machines churning out graphs based on algorithms, harvesting raw data to produce coherent conclusions: "big data". On the other, ordinary people out on the street conducting interviews, creating "small data". We will first define these two supposed rivals, who seem not to get along.
- Big Data: refers to the storage of large amounts of data in order to find patterns. It could be said that 'Big Data: the revolution of mass data' is its Bible, and Viktor Mayer-Schönberger its prophet.
- Small Data: if big data refers to large volumes of data harvested by machines, small data refers to small amounts of data drawn from people, through active observation, testimony and unhurried analysis. Martin Lindstrom could be considered its particular prophet, with the bestseller 'Small Data: small clues that reveal the most important trends' as the central axis of his thinking.
What does big data hide?
Take the example of cinema. There are two films that, thanks to their fragmented editing, try to evoke a series of emotions in the viewer. Elephant (Gus Van Sant, 2003) and 71 Fragments of a Chronology of Chance (Michael Haneke, 1994) are two good examples of how a director, through the cold presentation of information, aims to suggest without judging, so that the puzzle is assembled in the viewer's head and viewers draw their own conclusions.
Big Data is all of those fragments. Once analyzed, they would lead to an immediate conclusion: this or that character is the murderer, for such and such reasons. Small Data would not analyze the footage, but the viewer. The viewer's emotional conclusions would be its record.
However, Big Data is essential for any company. It covers more than the collection and processing of information, and it does not refer to a specific amount. It feeds on quantity: the more information, the better. It is a matter of volume and variety. It "understands" how we use social networks, how we drive on the road and what kind of films we prefer, and it collects every like on Facebook and every fav on Twitter. All those friend suggestions, similar websites, mountain holidays and restaurant ads are determined by an analysis of our habits and tastes.
Big Data is not limited to piling up information digitally; it also measures and communicates movements. This is especially relevant when a quick response is needed, for example among high-frequency traders. HFT systems buy and sell listed financial assets, from stock indexes and raw materials to cars at auction. The fact is that these robots have the stock market reeling thanks to their ability to operate at the speed of light.
This speed is essential for the user. Surely you have heard about cloud and fog computing. Without going deeper into these concepts, we will say that fog computing works as follows: instead of sending all the collected information to a provider and having it analyzed by the provider's servers, this method of computing harnesses the power of our own smartphone to generate an answer. The goal is simple: to shorten waiting times.
How do you get all this data?
There is a common fallacy that says human beings produce more and more data. Leaving aside the number of inhabitants of the planet, what increases is our ability to capture it. And the capacity to store astronomical quantities of numbers certainly does increase every day.
In the first place we would have population censuses, medical records, taxes, fines, etc. To this we would have to add every transaction and every conversation (Twitter generates 12 terabytes of tweets daily, Facebook stores about 100 petabytes of photos and videos, and as for YouTube... better take a look at this infographic) and, finally, all the learning that the machines themselves do when interpreting that data: so-called cognitive computing and M2M (machine to machine). If we add up all the activity of all the mobile phones in the world, we would reach 3 quintillion bytes of data per day.
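To put those magnitudes side by side, here is a small, purely illustrative sketch. It assumes decimal (SI) units and reads "quintillion" on the short scale (10^18); the variable and function names are ours, and the figures are simply the ones quoted above.

```python
# Decimal (SI) byte units: an assumption, since the text does not say
# whether decimal or binary units are meant.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * UNITS[unit]


# Figures as quoted in the text.
tweets_per_day = to_bytes(12, "TB")    # daily tweets on Twitter
facebook_media = to_bytes(100, "PB")   # photos and videos stored by Facebook
mobile_per_day = 3 * 10**18            # "3 quintillion bytes" (short scale)

# Daily mobile activity expressed in exabytes.
print(mobile_per_day / UNITS["EB"])

# How many "Twitter days" fit in one day of worldwide mobile activity.
print(mobile_per_day // tweets_per_day)

# How many "Twitter days" fit in Facebook's stored photos and videos.
print(facebook_media // tweets_per_day)
```

Under these assumptions, one day of mobile activity is about 3 exabytes, several orders of magnitude above Twitter's daily output.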
This rate is growing exponentially. Today almost everything is connected, in what is known as the Internet of Things: the smartphone tops the list, but we would have to add our TV, the printer, any wearable (bracelets, watches), the new wireless audio systems, smart home appliances (washing machine, oven, hob, fridge) and even some musical instruments. Each new connected gadget is a new real-time source of information.
Finding connections with the user
But all that is just the tip of the iceberg. It is a machine language that we cannot understand, and not everything can be reduced to variables of ones and zeros. Ingvar Kamprad, founder of IKEA, said that the cheapest and most effective research that exists is to ask each customer why they buy this or that product: this is Small Data.
Martin Lindstrom, the backbone of Small Data, tells an interesting anecdote. He recalls how Lego, on the verge of bankruptcy in 2003, stopped paying attention to Big Data, which said that its brick sets were finished because the current generation had become accustomed to instant gratification. The company talked to children and ended up making its pieces smaller and more versatile, a decision that affected the packaging and automatically increased the difficulty. This contrarian decision sent sales soaring.
Lindstrom is known as the father of neuromarketing, which consists of applying neuroscientific techniques to marketing in order to study the levels of emotion, attention and memory that users show in response to different stimuli. It is something that advertising has been practicing for 150 years. Small Data appeals to creativity, to accidents, to whatever escapes raw statistics: the success of Snapchat, the conversion of Post-its into a work tool, the virality of just the opposite of what you expected.
Big Data wants to find consistent correlations in large volumes of data. Small Data prefers to sit down with users, get to know them, study them, or not even that: to act as people act, without much reasoning, with a hint of madness and risk. Hence Small Data is the new topic of conversation and an obsession for large companies. Why are there startups that perform better than meticulously organized corporations? Why has Amazon not made small publishers disappear but quite the opposite, with Spain leading Europe despite its economy?
However, Lindstrom's book hides a great fallacy: "Big Data is data, and data favors analysis over emotion." This is a romantic perspective that leads to error: Big Data is data just as much as Small Data is. And it is used to determine emotions (just ask Spotify's favorite-artists algorithm) as well as desires such as "purchase intention".
Finally, we should take into account that maxim called chance. The essayist and financial researcher Nassim Nicholas Taleb uses the term "black swan" for an event so unlikely that its consequences can only be explained a posteriori. Such events leave us disoriented, and we can only understand them once we have carried out a cold analysis, as with cases of workplace pressure or the terrible consequences of the Columbine High School massacre, portrayed precisely in the film Elephant.