Modernizing Unstructured Data Management in the Automotive Industry with AWS


Recent years have seen ample disruption in the automotive industry – from self-driving cars to the rise of electric cars and an increasingly digital driving experience. Given these trends, we sought out Paul Baccaro, Storage Specialist at AWS, who works with Amazon’s global auto customers. In a webinar hosted by Komprise and AWS on June 28th, we discussed the new requirements for unstructured data management in the auto sector and the evolution of data lakes and ML services on Amazon to help automakers and related suppliers remain relevant and competitive in their markets. Komprise is an AWS partner for Migration and Modernization.

Delivering the Right Data to the Cloud in the Automotive Sector

Auto customers need data storage solutions that can scale while simultaneously incorporating all types of data, according to Baccaro. As in many industries, the variety of data has grown along with the velocity of new data creation from sensor data, images and streaming content such as video. This puts pressure not just on enterprise storage capacity but on the bandwidth required to serve data to end users.

“The volume of unstructured data is so large that you can’t just move it all to cloud,” added Krishna Subramanian, President and COO of Komprise. “You need to filter the data across all storage and send the right data sets to the cloud while also deleting what you no longer need.”

Komprise delivers a systematic, simple way to look at data across all platforms and data centers, consistently move data to the optimal cloud storage class, and then apply the power of cloud processing to it. Without that visibility, enterprises are just transferring costs from one area to another. Cloud data storage architecture is increasingly nuanced: a solid understanding of your data and the various storage classes is critical to determining whether moving data off on-premises storage will deliver any savings.

The Self-Driving Car Data Deluge Calls for Unstructured Data Analytics

Baccaro and Subramanian then delved into the demand for mining these massive data sets to meet the needs of new market segments. “By 2025, there will be 8 million autonomous or semi-autonomous cars on the road,” Baccaro said. “That is a very high number and so much of these vehicles’ operation is AI and ML-driven that it’s really creating challenges for our customers.” Autonomous cars will generate as much as 40 TB of data an hour from sensors.
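To put the 40 TB an hour figure in perspective, the short sketch below converts it into the sustained network bandwidth a single vehicle would need to upload everything in real time. It assumes decimal terabytes (10^12 bytes) and ignores compression and protocol overhead; the function name is illustrative only.

```python
def sustained_gbps(terabytes_per_hour: float) -> float:
    """Convert a data rate in TB/hour to gigabits per second.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 Gbit = 10**9 bits.
    """
    bits_per_hour = terabytes_per_hour * 10**12 * 8
    bits_per_second = bits_per_hour / 3600
    return bits_per_second / 10**9


if __name__ == "__main__":
    # Upper-bound sensor rate quoted in the webinar: 40 TB per hour.
    print(f"{sustained_gbps(40):.1f} Gbit/s")
```

Under these assumptions, one car at the upper bound would need a sustained link of nearly 90 Gbit/s, which illustrates why sending every byte to the cloud is not practical.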

This is creating urgency for automakers to manage their data differently than in the past. “They really need the ability to analyze and filter data at the edge,” Baccaro said. Giving customers the ability to save on infrastructure costs by sending only critical data sets to the cloud is key. The ability to intelligently tier data from on-premises storage to the cloud and back again is another critical capability for automotive companies.
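As a rough illustration of what an age-based tiering rule could look like, the sketch below maps a file's last-access time to an Amazon S3 storage class. The storage class names are real S3 identifiers, but the age thresholds and the function itself are hypothetical; this is not Komprise's policy engine or an AWS API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds: data younger than each limit goes to that class.
TIERS = [
    (timedelta(days=30), "STANDARD"),
    (timedelta(days=180), "STANDARD_IA"),
    (timedelta(days=365), "GLACIER"),
]
COLDEST_CLASS = "DEEP_ARCHIVE"  # anything older than the last threshold


def pick_storage_class(last_accessed: datetime, now: datetime) -> str:
    """Return the storage class for a file based on time since last access."""
    age = now - last_accessed
    for limit, storage_class in TIERS:
        if age < limit:
            return storage_class
    return COLDEST_CLASS


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for days in (10, 90, 200, 400):
        print(days, pick_storage_class(now - timedelta(days=days), now))
```

In practice the thresholds would come from analyzing actual access patterns, and retrieval fees and minimum storage durations for the colder classes would need to be weighed against the storage savings.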

Data Modernization: Bringing Industry Data to Cloud Analytics Platforms

Finally, there is the exciting potential in using modern data lake and data lakehouse technologies to uncover new value in machine data. “This is a big shift for automotive companies,” Baccaro said. “Traditional analytics platforms had just a few sources of data such as a CRM or ERP and you would build out ETL to push the data into a data warehouse and run business intelligence on it. But over time data sources can change as well as the structure of data and every time you add or change the data, you then had to modify the ETL structure.”

With data lakes, which are much more open and flexible, you can run machine learning models continuously on the data, query data directly on the data lake and process it there too without modifying the code, he explained. “Storage is the fabric of that and customers use AWS S3 for data lakes because of its scalability, security, availability and durability. S3 also delivers high performance for response time which is important for analytics.”

Komprise adds value to S3 data lakes by managing unstructured data for cloud-native access. During the demo, Subramanian showed the file-object duality that Komprise Transparent Move Technology™ provides: users can access moved data as files from the original location and also use the same data as objects in AWS.

Cloud-native access to data is a requirement for data to be ingested into data lakes and other analytics platforms such as Amazon Macie or Amazon SageMaker.

“Our partnership with AWS helps auto customers maximize unstructured data value because Komprise delivers the ability to analyze and mobilize data to leverage all the AWS storage tiers at the right time and in an open fashion,” Subramanian said. “This means customers can save the most and use data fully in the cloud.”

Recently, Komprise announced Smart Data Workflows, new functionality that allows IT users to create automated workflows for all the steps required to find the right data across storage assets, tag it for easier search and send it to external tools for analysis.

You can watch the entire webinar on demand.
