How many Vs should Big Data have?
From the outset, IBM's and Gartner's approach to Big Data envisaged a model in three dimensions (volume, velocity and variety), known as the "three Vs" model.
Based on the three Vs, Big Data can be defined as a set of tools that combine to support the compilation, storage and management of large volumes of varied data at high velocities, generating information to support sound decision-making.
Data volume
As the name indicates, Big Data technology must be capable of handling the large volumes of data generated every day by companies and organizations worldwide. For example, a supermarket chain can record up to 1 million sales transactions every hour and needs to identify each product that its customers buy; more than 100,000 GB of data is saved on the social media site Facebook every day, while the App Store handles over 72 million downloads.
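To put the figures above on a common scale, they can be converted to average per-second rates. The short Python sketch below is only an illustration: it assumes the load is spread evenly over the period, which real traffic is not, and the helper name per_second is hypothetical. The download figure is left out because the text gives no time period for it.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def per_second(total: float, period_seconds: int) -> float:
    """Average rate per second, assuming the total is spread evenly over the period."""
    return total / period_seconds


# Figures quoted in the text above.
sales_per_hour = 1_000_000      # supermarket sales transactions per hour
facebook_gb_per_day = 100_000   # GB saved on Facebook per day

sales_rate = per_second(sales_per_hour, SECONDS_PER_HOUR)
storage_rate = per_second(facebook_gb_per_day, SECONDS_PER_DAY)

print(f"Supermarket chain: about {sales_rate:.0f} transactions per second")
print(f"Facebook: about {storage_rate:.2f} GB stored per second")
```

On these assumptions the supermarket chain averages roughly 278 transactions per second and Facebook roughly 1.16 GB per second. Peak rates would be higher, since activity is never evenly distributed.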
Data variety
Big Data must be able to combine a wide variety of digital information in numerous formats, be it video, audio or text. The sources are just as diverse: new wearable technology monitors physical activity; the Internet of Things is set to connect devices and machines; millions of messages are posted to social networks such as Facebook and Twitter, and millions of videos are uploaded to YouTube every day. These are just some examples of different sources generating different data types.
Velocity
Big Data technology needs to work in real time with data sources such as sensors, video cameras, social networks, blogs and websites, which generate millions and millions of items of data every second. It also needs to analyze that data quickly, slashing the long processing times associated with traditional analysis tools.
With Big Data already generating plenty of debate, IBM decided to extend the "three Vs" to include one more, Veracity.
Veracity
Big Data must be able to intelligently handle and analyze large data volumes in order to generate accurate and useful information to support better decision-making.
This fourth V was roundly welcomed by the community, which demands not only large quantities of data from diverse sources that can be assessed and capitalized on, but also that the data be veracious and reliable.
Veracity, however, is more a facet of business intelligence, and is used as part of data analysis to support decision-making, thus forming part of Value generation. Will Value be the fifth V?
For many people (especially Big Data engineers and developers), new Vs such as veracity and other potential additions (value, validity, viability...) are more appropriate for characterizing data than for describing Big Data, which seeks to process huge amounts of data to generate information that was beyond the reach of traditional systems. The V of Veracity has also attracted some humorous derision.