Big Data Ecosystem
Ecosystem of Big Data
The rapid development of digital technologies, IoT products and connectivity platforms, social networking applications, and video, audio and geolocation services has created opportunities for collecting and accumulating large amounts of data. While in the past corporations dealt with static, centrally stored data collected from various sources, with the birth of the web and cloud services, cloud computing is rapidly overtaking the traditional in-house system as a reliable, scalable and cost-effective IT solution. The high volumes of structured and unstructured data, stored in a distributed manner, and the wide variety of data sources pose problems related to data and knowledge representation and integration, data querying, business analysis and knowledge discovery.
In 2001, in an attempt to characterize and visualize the changes likely to emerge in the future, Douglas Laney of META Group (now Gartner) proposed three dimensions that characterize the challenges and opportunities of increasingly large data: Volume, Velocity, and Variety, known as the 3 Vs of big data. Thus, according to Gartner:
"Big data" is high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insights and decision making.
According to Manyika et al., this definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data. Along these lines, big data to Amazon or Google is quite different from big data to a medium-sized insurance or telecommunications organization. Hence, many different definitions have emerged over time, but in general the term refers to "datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze" and to technologies that address "data management challenges" and process and analyze data to uncover valuable information that can benefit businesses and organizations. Additional "Vs" of data have been added over the years, but Volume, Velocity, and Variety are the three main dimensions that characterize the data.
The volume dimension refers to the sheer size of the data. The data size in a big data ecosystem can range from dozens of terabytes to a few zettabytes and is still growing. In 2010, the McKinsey Global Institute estimated that enterprises globally stored more than 7 exabytes of new data on disk drives, while consumers stored more than 6 exabytes of new data on devices such as PCs and notebooks.
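To get a feel for the scale involved, the units mentioned above can be compared with a little arithmetic. The following is a minimal sketch, assuming decimal (SI) units in which each step from terabyte to petabyte to exabyte to zettabyte is a factor of 1,000; the `UNITS` table and the `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) byte units. This is an assumption: binary units
# (TiB, PiB, ...) would use powers of 1024 instead of powers of 1000.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def convert(value, from_unit, to_unit):
    """Convert a data size from one SI byte unit to another."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# The 7 exabytes of enterprise data cited above, expressed in terabytes.
print(convert(7, "EB", "TB"))  # 7000000.0

# One zettabyte expressed in terabytes.
print(convert(1, "ZB", "TB"))  # 1000000000.0
```

Under this assumption, the 7 exabytes attributed to enterprises correspond to seven million terabytes, and a single zettabyte is a billion terabytes, which illustrates how wide the range "from dozens of terabytes to a few zettabytes" really is.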
The velocity dimension refers to the increasing speed at which big data is created and the increasing speed at which the data need to be stored and analysed, while the variety dimension refers to increased diversity of data types.
Variety introduces additional complexity to data processing as more kinds of data need to be processed, combined and stored. While the 3 Vs have been continuously used to describe big data, the additional dimensions of veracity and value have been added to describe data integrity and quality, in what is called the 5 Vs of big data. More Vs have been introduced, including validity, vulnerability, volatility, and visualization, bringing the total to the 10 Vs of big data. Regardless of how many descriptors are isolated when describing big data, it is abundantly clear that its nature is highly complex and that, as such, it requires special technical solutions for every step in the data workflow.
Big Data Ecosystem
The term ecosystem is defined in scientific literature as a complex network of interconnected systems. As cloud computing overtakes the traditional in-house system, large datasets (log files, social media sentiments, click-streams) are no longer expected to reside within a central server or within a fixed place in the cloud.