Data Management Glossary
Deep analytics is the process of applying data mining and data processing techniques to analyze large amounts of data and surface the results in a form that is useful for new applications. Deep analytics can apply to both structured and unstructured data.
In the context of unstructured data, deep analytics is the process of examining file metadata (both standard and extended) across billions of files to find data that fits specific criteria. A petabyte of unstructured data can be a few billion files. Analyzing petabytes of data typically involves analyzing tens to hundreds of billions of files. Because analysis of such large workloads can require distribution over a farm of processing units, deep analytics is often associated with scale-out distributed computing, cloud computing, distributed search, and metadata analytics.
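The idea of finding data that fits specific criteria by looking only at file metadata can be sketched on a single machine as follows. This is a minimal illustration, not a description of any particular product: the names `FileMeta`, `scan_metadata`, `find_matching`, and `cold_and_large` are hypothetical, and only standard metadata (size and modification time) is examined. A real deployment at the scale described above would spread this work over many nodes.

```python
import os
import time
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class FileMeta:
    """Standard metadata for one file (hypothetical record type)."""
    path: str
    size_bytes: int
    mtime: float  # last-modified time, seconds since the epoch


def scan_metadata(root: str) -> Iterator[FileMeta]:
    """Walk a directory tree and yield metadata only; file contents are never read."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            yield FileMeta(full, st.st_size, st.st_mtime)


def find_matching(root: str, predicate: Callable[[FileMeta], bool]) -> List[FileMeta]:
    """Return the metadata records under root that satisfy the predicate."""
    return [m for m in scan_metadata(root) if predicate(m)]


def cold_and_large(min_bytes: int, older_than_days: float,
                   now: Optional[float] = None) -> Callable[[FileMeta], bool]:
    """Build a predicate: at least min_bytes in size and not modified for older_than_days."""
    now = time.time() if now is None else now
    cutoff = now - older_than_days * 86400
    return lambda m: m.size_bytes >= min_bytes and m.mtime < cutoff
```

For example, `find_matching("/data", cold_and_large(1 << 30, 365))` would list files of 1 GiB or more that have not been modified in a year, where `/data` is a placeholder path.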
Deep analytics of unstructured file data requires efficient indexing and search of files and objects across a distributed farm. Financial services, genomics, research and exploration, biomedicine, and pharmaceuticals are among the early adopters of deep analytics. In recent years, enterprises more broadly have started to show interest in deep analytics as the amount of corporate data has increased, and with it, the desire to extract value from that data.
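Indexing and search across a distributed farm is often built on a partition-and-fan-out pattern: each metadata record is assigned to one shard, and a query is sent to every shard in parallel and the partial results are merged. The sketch below simulates that pattern in a single process, with threads standing in for separate nodes. The class and function names (`Shard`, `ShardedIndex`, `shard_for`) and the dictionary record layout are assumptions made for illustration, not part of any specific system.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, object]  # assumed layout, e.g. {"path": ..., "size_bytes": ...}


def shard_for(path: str, num_shards: int) -> int:
    """Map a path to a shard number. crc32 is stable across runs,
    unlike the built-in hash() for strings."""
    return zlib.crc32(path.encode("utf-8")) % num_shards


class Shard:
    """One partition of the index; stands in for one node of the farm."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def add(self, record: Record) -> None:
        self.records.append(record)

    def search(self, predicate: Callable[[Record], bool]) -> List[Record]:
        return [r for r in self.records if predicate(r)]


class ShardedIndex:
    """Routes records to shards on insert and fans queries out to all shards."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, record: Record) -> None:
        idx = shard_for(str(record["path"]), len(self.shards))
        self.shards[idx].add(record)

    def search(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # Query every shard in parallel, then merge the partial results.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(lambda s: s.search(predicate), self.shards))
        merged = [r for part in partials for r in part]
        return sorted(merged, key=lambda r: str(r["path"]))
```

Hashing on the path keeps inserts cheap and spreads records roughly evenly, at the cost that every query must visit every shard. Other partitioning schemes trade these properties differently.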
Deep analytics enables additional use cases such as big data analytics, artificial intelligence, and machine learning.
When the result of a deep analytics query is a virtual data lake, data does not have to be moved from its original location or otherwise disrupted to enable reuse. This makes it possible to put deep analytics to work quickly, since large data sets are slow and costly to move.
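One way to picture a virtual data lake is as a manifest of references to the matching files rather than a copy of them. The sketch below takes that interpretation as an assumption: `DataRef`, `build_virtual_lake`, the `uri` and `size_bytes` fields, and the manifest layout are all hypothetical and chosen only to show that a query result can be packaged for reuse without touching the underlying data.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class DataRef:
    """A pointer to data that stays where it is (hypothetical record type)."""
    uri: str
    size_bytes: int


def build_virtual_lake(name: str, hits: Iterable[Dict[str, object]]) -> Dict[str, object]:
    """Turn query hits into a manifest of references; no file is read or copied."""
    refs: List[DataRef] = [
        DataRef(uri=str(h["uri"]), size_bytes=int(h["size_bytes"])) for h in hits
    ]
    return {
        "name": name,
        "total_bytes": sum(r.size_bytes for r in refs),
        "entries": [asdict(r) for r in refs],
    }


def save_manifest(lake: Dict[str, object], path: str) -> None:
    """Persist the manifest as JSON so downstream tools can resolve the references."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(lake, f, indent=2)
```

A downstream job would then read the manifest and open each `uri` in place, so the cost of building the lake scales with the number of references, not with the volume of data.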