Big Data, or big data: a one-click definition

Big Data, or big data, is the term for the very large volumes of data available to organizations within their IS. Heterogeneous and continuously collected, this data often remains to be exploited.

With the acceleration of digitization, the volume of data generated each year is growing at a breakneck pace. In 2018, IDC estimated this data at 33 zettabytes (ZB). The Data Age 2025 study predicts that this figure will reach 175 ZB, or 175 billion terabytes, by 2025. The promise of Big Data and its technologies is therefore to enable the storage and processing of this data in order to derive value for the company.

However, the categories of data are very diverse. The concept of Big Data is indeed associated with the rule of the 3Vs, sometimes extended to 4 or 5 Vs. At the outset, Big Data is described through three characteristics: volume, variety and velocity.

The 3Vs of Big Data

Data is thus collected on a massive scale. It is also heterogeneous, since it can be structured, semi-structured or unstructured in nature. As for speed, or velocity, it designates both the rate at which this data is created (thanks to the IoT in particular) and the speed of its processing, which can go as far as real time.

Experts have since expanded this list of 3Vs to incorporate the notions of volatility and validity. If Big Data is the science of exploiting data, that data must remain exploitable over time (volatility) and be of sufficient quality (validity). These additional characteristics emerged to correct the shortcomings of the first Big Data projects.

Initially, companies collected and dumped masses of data, considering that processing it would naturally allow the extraction of valuable information, or “insights”. Data lakes then flourished… leading in many cases to data swamps.

From Data Lake to Data Swamp

Storing growing volumes of data is not enough to unlock value, despite the emergence of infrastructure technologies that can handle it. The Big Data era is thus marked by the arrival of solutions such as Hadoop, but also NoSQL databases and the Spark distributed computing framework.

But technologies are only part of Big Data projects. It is also important to define data governance, to have the right skills and, above all, to identify specific use cases. Between 2010 and 2015, investments in Big Data therefore mainly resulted in PoCs and experiments.

In 2015, Gartner estimated that nearly two-thirds of Big Data projects did not make it past the pilot stage and were ultimately abandoned. Organizations have since matured, especially when it comes to business applications of data. To get there, they first had to acculturate internally and develop a culture of data and its possible uses.

Business use cases for Data

Big Data is only of interest to a company or a business line if the available data can be used within the framework of a well-defined project. This could, for example, mean collecting and processing the operating data of an industrial tool in order to prevent breakdowns, and therefore production stoppages. This is the purpose of predictive maintenance.

Cross-referencing current and historical data makes it possible to detect weak signals suggesting an impending failure. Maintenance interventions can then be planned optimally to increase the availability of the production tool.
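The weak-signal idea described above can be sketched very simply: compare each new sensor reading against its recent history and flag readings that deviate sharply. This is a minimal illustration, not a production approach; the sensor values, window size and threshold are all invented assumptions.

```python
# Minimal predictive-maintenance sketch: flag sensor readings whose
# rolling z-score exceeds a threshold (a "weak signal" of failure).
# Window size, threshold and data are illustrative assumptions.
from statistics import mean, stdev

def weak_signals(readings, window=5, z_threshold=2.0):
    """Return indices of readings that deviate strongly from the
    recent rolling window (possible precursors of a breakdown)."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# Vibration values from a hypothetical industrial motor
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.1, 0.95, 3.2, 1.0, 3.5]
print(weak_signals(vibration))
```

Real systems cross-reference many sensors and use trained models rather than a fixed threshold, but the principle, comparing live operating data against history, is the same.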

Big Data also finds many uses in marketing and customer relations. Data is used to identify the customers most likely to terminate their contract: this is churn, or attrition. Behavioral data analysis plays a role in tackling attrition, but it also contributes to personalizing offers or e-mailings to improve the conversion rate and, in e-commerce, the average basket.
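Churn identification typically means scoring each customer from behavioral features. The sketch below shows the idea with a hand-set logistic score; the feature names and weights are purely hypothetical assumptions, where a real project would fit them with machine learning on historical attrition data.

```python
# Hedged churn-scoring sketch. WEIGHTS and BIAS are invented for
# illustration; in practice they are learned from labeled history.
import math

WEIGHTS = {  # hypothetical fitted coefficients
    "months_since_last_order": 0.30,
    "support_tickets": 0.40,
    "discount_dependency": 1.20,
}
BIAS = -3.0

def churn_probability(customer):
    """Logistic score: estimated probability the customer leaves."""
    z = BIAS + sum(WEIGHTS[k] * customer[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

loyal = {"months_since_last_order": 1, "support_tickets": 0,
         "discount_dependency": 0.1}
at_risk = {"months_since_last_order": 6, "support_tickets": 3,
           "discount_dependency": 0.9}
print(f"loyal:   {churn_probability(loyal):.2f}")
print(f"at risk: {churn_probability(at_risk):.2f}")
```

Customers above a chosen probability cutoff would then be targeted with retention offers, which is where the personalization mentioned above comes in.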

In the banking world, the first Big Data use cases focused on the fight against fraud. Projects exist for every business line and process. And while the beginnings of Big Data centered on infrastructure, issues related to skills such as data science and artificial intelligence have since been added. Machine learning, whose models are trained on data, is in a way the intelligence of Big Data.
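To make the "models are trained on data" point concrete, here is a deliberately tiny training loop: it learns a fraud-amount cutoff from labeled transactions by picking the threshold that best separates fraudulent from legitimate ones. The data and the single-feature approach are illustrative assumptions; real fraud systems use far richer features and models.

```python
# Toy illustration of training: learn a decision threshold from
# labeled examples. Transaction data is invented for illustration.

def train_threshold(transactions):
    """transactions: list of (amount, is_fraud) pairs. Return the
    amount cutoff maximizing classification accuracy."""
    amounts = sorted({a for a, _ in transactions})
    best_cut, best_acc = None, -1.0
    for cut in amounts:
        # Predict "fraud" when amount >= cut; count correct labels.
        correct = sum((a >= cut) == fraud for a, fraud in transactions)
        acc = correct / len(transactions)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut, best_acc

data = [(12, False), (30, False), (25, False), (480, True),
        (520, True), (45, False), (610, True), (80, False)]
cut, acc = train_threshold(data)
print(cut, acc)
```

The "intelligence" lies in the fact that the cutoff is not hard-coded but derived from the data itself, which is exactly what larger machine learning models do at scale.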
