The seven Vs of Big Data

Digital transformation / 02 December 2019

BBVA API Market

Data chains are the new value chains. Big Data is a concept that describes the large volume of structured and unstructured data that currently floods the business world. However, what really matters is what you do with the data rather than how much of it there is.

The term Big Data was coined to describe the processing of large quantities of data, which can be analyzed to obtain information or insights that lead to better decisions and business strategies in the long term. Its benefits extend to areas as varied as customer relations, operational optimization and fraud prevention. The sheer volume, variety and velocity of information today make it indispensable to capture, store and analyze this complex machinery of data. That is why Big Data is characterized by “five Vs”: volume, velocity, variety, veracity and value.

Volume refers to the amount of data. Data stored in company repositories have gone from taking up megabytes to taking up gigabytes and then petabytes of space. Ninety percent of existing information was created in the last two years. To give you an idea: in 2008, Google was already processing over 20 petabytes of data per day!

By 2020, it is estimated that 40 zettabytes of data will be processed across the world, and the amount of data in the world should double every two years. A major contributor to this data volume is the Internet of Things (IoT), which retrieves an immense amount of information via sensors.
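
To get a feel for what doubling every two years means, here is a minimal Python sketch. It takes the 40-zettabyte figure above as a starting point and assumes the doubling rate holds unchanged; the function name `projected_volume` and the chosen time horizons are illustrative assumptions, not a forecast.

```python
def projected_volume(start_zb, years, doubling_period_years=2.0):
    """Return the data volume (in zettabytes) after `years`,
    assuming it doubles every `doubling_period_years` (an assumption)."""
    return start_zb * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # 40 ZB is the 2020 estimate quoted in the text; the horizons are arbitrary.
    for years in (0, 2, 4, 10):
        print(f"+{years:>2} years: {projected_volume(40.0, years):,.0f} ZB")
```

Under that assumption, ten years means five doublings, which is a 32-fold increase over the starting volume.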

The velocity of data movement, processing and capture within and outside companies has increased significantly. Models based on traditional business intelligence normally take days to process, whereas today’s analytical needs, driven by the high-speed flow of data, require that data be captured and processed practically in real time.

This near-real-time velocity stems from the ubiquity and availability of devices connected to the internet, both wireless and wired. Information is currently transmitted at extraordinary speed. For example, it is estimated that 500 hours of video are uploaded to YouTube per minute and that 200 million emails are sent in the same period of time.
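
Per-minute figures like these are easier to compare once converted to other time units. The sketch below does that arithmetic in Python; the helper names `per_day` and `per_second` are hypothetical, and the input rates are simply the estimates quoted above.

```python
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_MINUTE = 60


def per_day(rate_per_minute):
    """Convert a per-minute rate into a per-day rate."""
    return rate_per_minute * MINUTES_PER_DAY


def per_second(rate_per_minute):
    """Convert a per-minute rate into a per-second rate."""
    return rate_per_minute / SECONDS_PER_MINUTE


if __name__ == "__main__":
    video_hours_per_minute = 500      # estimate quoted in the text
    emails_per_minute = 200_000_000   # estimate quoted in the text

    print(f"Video uploaded per day: {per_day(video_hours_per_minute):,} hours")
    print(f"Emails sent per second: {per_second(emails_per_minute):,.0f}")
```

At those rates, 500 hours per minute adds up to 720,000 hours of video per day, and 200 million emails per minute is a little over 3.3 million emails per second.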

Data variety has burgeoned, going from structured data stored in business databases to unstructured data, semi-structured data and data in different formats (audio, video, XML, etc.). For example, over 3.5 million people make calls, send SMS messages, tweet and browse the internet from their cell phones.

Estimates indicate that 90% of today’s data are generated in an unstructured form. Not every analysis method can be applied to every kind of data; consequently, these methods must be adapted to the nature of the information.
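
As a rough illustration of methods adjusting to the nature of the data, the sketch below routes structured (CSV), semi-structured (JSON) and unstructured (free text) input to different parsers from Python's standard library. The function `extract_records`, the `kind` labels and the sample payloads are all assumptions made for this example; real pipelines would be considerably more involved.

```python
import csv
import io
import json


def extract_records(payload, kind):
    """Turn a raw string into a list of dict records, using a parser
    suited to the kind of data (hypothetical labels: csv, json, text)."""
    if kind == "csv":
        # Structured: fixed columns, one record per row.
        return list(csv.DictReader(io.StringIO(payload)))
    if kind == "json":
        # Semi-structured: fields may vary from record to record.
        data = json.loads(payload)
        return data if isinstance(data, list) else [data]
    if kind == "text":
        # Unstructured: no schema, so fall back to simple tokenization.
        return [{"tokens": payload.lower().split()}]
    raise ValueError(f"unsupported kind: {kind!r}")


if __name__ == "__main__":
    print(extract_records("id,amount\n1,9.5\n", "csv"))
    print(extract_records('{"id": 2, "tags": ["iot"]}', "json"))
    print(extract_records("Sensor reported high temperature", "text"))
```

The point of the sketch is only that each kind of input needs its own handling before any common analysis can run on the resulting records.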

Veracity concerns how trustworthy the data are. The aim is to pursue data veracity so that reliable information can be retrieved. Accurate data are more useful because of their quality. This is particularly important for organizations whose business is centered on information.

However, given the existing amount of information, some people believe that veracity is a secondary characteristic of Big Data.

Value is the return resulting from data management. The key to Big Data is not the sheer amount of information but rather how it is used and handled. Even though it is very expensive to implement IT infrastructure to handle large volumes of data, doing so may offer companies major competitive advantages.

A common reference point when discussing Big Data’s value is the number of people connected to the internet around the world: 3.149 billion hyper-connected users, a pool of data whose return in many sectors has yet to be estimated.

Two additional Vs

In addition to the Vs mentioned above, some experts suggest other aspects. For example, Mark Van Rijmenam, one of the 10 global influencers linked to this topic, argues that variability and visualization should be added to the five Vs:

Variability refers to changes in meaning. This is important when you analyze perceptions: algorithms must be able to understand the context and decode the exact meaning of every word in its specific setting, which makes for a much more complex analysis.

Visualization means making the collected and analyzed data understandable and easy to read. Without the right visualization, it is impossible to make the most of raw data.
