How to find your data "sweet spot" with AI-led data auto-discovery

Artificial intelligence is enabling organizations to swiftly identify and integrate powerful new data sets at scale amid a global information explosion.


By Musaddiq Rehman, EY Americas Principal and Partner Digital, Data Analytics


The amount of data being generated today is truly staggering. The World Economic Forum calculates that 44 zettabytes (ZB) of digital information was collected globally in 2020. If that data were downloaded onto 1GB thumb drives and the drives were laid end to end, they would stretch across more than four billion football fields. Never in human history has there been such a wealth of information.

The opportunities for businesses looking to harness this torrent of data are equally profound. Data analytics can enable real-time visibility of the most convoluted supply chains, help explain complex market trends, deliver predictive insights and help automate complicated processes. This is achieved by identifying patterns and correlations in data that can be tracked and extrapolated, delivering game-changing business insights.

The big challenge, however, is organizing and making sense of this increasing volume of information – spotting and separating out the golden nuggets of data that will help deliver the intelligence that businesses require.

The conventional way of achieving this is labor intensive. It involves analysts sifting through data sets, deciding which will add value, trying to locate system owners, requesting their co-operation, and only then extracting the data. Even after this process is complete, there is still a high probability that the data is inaccurate or invalid.

The process is also slow, and it must be repeated every time a new use case is identified.

Thankfully, there is now a solution in the shape of AI-led data auto-discovery, which can automate and fast-track the data discovery and management process, connecting new data sources with a corresponding semantic layer and reducing the human effort needed to make data sets consumable by 50% to 90%.


Fast-tracking the data discovery and management process

The conventional process of preparing data is a little like dumpster diving – the data analyst has to really get their hands dirty. AI-led data auto-discovery, however, uses machine learning to do the heavy lifting of discovery, preparation, and operationalization into transactional systems.

While AI-led data auto-discovery isn’t yet sophisticated enough to independently execute the full end-to-end data management process, that day is coming. At present, the technology works alongside a data analyst, interrogating data sets and making recommendations.

This process begins with an algorithm that uses training data to identify high-value data. Recommendations and exceptions are flagged to a human data steward, who reviews them and provides feedback that enables the algorithm to improve its performance.
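The review loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm any particular product uses: the feature names (`completeness`, `freshness`, `usage`), the starting weights, the threshold and the perceptron-style update are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HighValueRecommender:
    """Toy linear scorer for data sets (hypothetical features scaled 0..1)."""
    weights: dict = field(default_factory=lambda: {
        "completeness": 0.4, "freshness": 0.3, "usage": 0.3})
    threshold: float = 0.6       # assumed cut-off for "high value"
    learning_rate: float = 0.2   # assumed step size for steward feedback

    def score(self, features: dict) -> float:
        # Weighted sum of the features the recommender knows about.
        return sum(w * features.get(name, 0.0)
                   for name, w in self.weights.items())

    def recommend(self, features: dict) -> bool:
        return self.score(features) >= self.threshold

    def apply_feedback(self, features: dict, steward_says_high_value: bool) -> None:
        """Perceptron-style update: adjust weights only when the steward disagrees."""
        if self.recommend(features) == steward_says_high_value:
            return
        direction = 1.0 if steward_says_high_value else -1.0
        for name in self.weights:
            self.weights[name] += (
                self.learning_rate * direction * features.get(name, 0.0))


recommender = HighValueRecommender()
# Made-up profile of one candidate data set.
ledger_extract = {"completeness": 0.9, "freshness": 0.2, "usage": 0.5}

before = recommender.recommend(ledger_extract)
# The data steward overrules the model and marks the data set as high value.
recommender.apply_feedback(ledger_extract, steward_says_high_value=True)
after = recommender.recommend(ledger_extract)
print(before, after)
```

In this toy run the recommender initially passes over the data set and, after a single correction from the steward, recommends it. A production system would learn from many reviewed examples rather than one.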

As the AI identifies high-value data sets, it strips away any existing titles, table names or cataloging and recategorizes the data in a standardized way, according to the functional area or use case (for example, P&L or procurement). This creates a business-facing, business-centric semantic layer which sits above the available data elements, ensuring they can be speedily added into a company’s data fabric so they are ready for consumption.
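As a rough sketch of what such a semantic layer might look like, the Python below maps legacy column names onto standardized business terms grouped by functional area. Simple keyword rules stand in for the machine-learning classifier, and the vocabulary, column names and function names are all hypothetical.

```python
import re

# Hypothetical business vocabulary:
# functional area -> standardized name -> hint keywords.
SEMANTIC_LAYER = {
    "P&L": {
        "net_revenue": ("revenue", "sales", "turnover"),
        "cost_of_goods_sold": ("cogs",),
    },
    "Procurement": {
        "supplier_id": ("vendor", "supplier"),
        "purchase_order_amount": ("po", "purchase"),
    },
}


def tokenize(raw_name: str) -> set:
    """Split a legacy name such as 'SALES_AMT_USD' into lowercase word tokens."""
    return {t for t in re.split(r"[^a-z]+", raw_name.lower()) if t}


def classify(raw_name: str):
    """Return (functional_area, standardized_name), or None if nothing matches."""
    tokens = tokenize(raw_name)
    for area, terms in SEMANTIC_LAYER.items():
        for standard_name, hints in terms.items():
            if tokens & set(hints):
                return area, standard_name
    return None


def build_semantic_layer(raw_columns):
    """Map what can be mapped; collect the rest for human review."""
    mapped, unmapped = {}, []
    for raw in raw_columns:
        match = classify(raw)
        if match is None:
            unmapped.append(raw)
        else:
            mapped[raw] = match
    return mapped, unmapped


legacy_columns = ["SALES_AMT_USD", "LEGACY_VENDOR_NO", "X_PO_VAL", "misc_flag"]
mapped, unmapped = build_semantic_layer(legacy_columns)
for raw, (area, name) in mapped.items():
    print(f"{raw} -> {area} / {name}")
print("needs steward review:", unmapped)
```

Columns the rules cannot place are returned separately, so they can be flagged to a data steward instead of being silently dropped.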

This plug-and-play approach enables companies to understand business fundamentals, such as margin, in greater detail than ever before.

Every organization is under pressure as its industry progresses through the maturity curve and more competitors enter the market and chip away at margins. The companies that survive are those that are agile. They can interrogate their disparate data sources quickly, understand where the actual profitability of their product lines lies, and leverage that information to make powerful business decisions.

If, on the other hand, a big corporation fails to understand its P&L data, for instance, there’s a real risk it could be selling hundreds of millions of stock-keeping units (SKUs) that generate no margin whatsoever. This could easily shift a company from the black into the red.

A lack of corporate self-awareness in areas as critical as P&L is not unusual. A large privately owned US company with 100+ legal entities recently approached EY because it couldn’t account for a sudden 10% to 15% drop in sales. The organization has grown through M&A activity, so it has multiple legacy companies, each with its own systems and data sources. AI-led data auto-discovery is the perfect fit for such a convoluted legacy structure.


How AI-led data auto-discovery makes a global procurement strategy possible

Effective localized, short-term business decisions can still be made using conventional data management methods, but as organizations start to make bigger strategic decisions, involving more data and more variables, the value of AI-led data auto-discovery becomes clear.

Take the example of a major battery brand that my team worked with recently. It needs to source large amounts of steel for its manufacturing processes. To achieve economies of scale, the company created a centralized steel-purchasing function that can analyze a wide range of data. This data includes the buying signals generated globally by consumers, fluctuations in steel price, the data being generated across its international manufacturing operations, as well as numerous other data points.

To make the best steel-purchasing decisions, the company must gather, harmonize and consolidate all this information across its global operations in near real time. This would be a huge – if not impossible – challenge using conventional data management methods. AI-led data auto-discovery, however, can identify the relevant data sets and make them consumable in just a fraction of the time, increasing the company’s margins and giving it a real competitive advantage.


Leveraging the power of third-party data sets

AI-led data auto-discovery also has significant potential to leverage third-party data. Market research company IDC predicts that by the end of 2021, 75% of enterprises will use external data sources to enhance their cross-functional decision-making capabilities in ways that increase value compared to using internal data in isolation.

Companies of every shape and size are adopting this approach. Trade credit insurance companies, for example, are known to import maritime monitoring and tracking data to help measure global trade flows, while international fast-food restaurants use weather-mapping data to inform their procurement processes and offer their customers personalized dining recommendations.

Unprecedented amounts of third-party data are now available, but once again, the challenge is identifying the high-value data sets and processing them at speed so they are fit for consumption. Using AI-led data auto-discovery to process large, plug-and-play external data sets is helping organizations make sense of their own data, adding context and enabling them to make better decisions, produce better forecasts and plan more effectively.

We’re not talking about using 10,000 data elements here. We’re talking about identifying a sweet spot of around 50 to 100 data elements needed to make powerful decisions. AI-led data auto-discovery is perfect for interrogating and filtering the available information to achieve that sweet spot.
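One simple way to picture this kind of filtering is to rank candidate data elements by how strongly they track a business outcome and keep only the top few. The Python sketch below uses absolute Pearson correlation as the ranking signal, which is an assumption made for illustration and not necessarily how auto-discovery tools score relevance. The element names and values are invented toy data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def select_sweet_spot(candidates, target, k):
    """Keep the k data elements most strongly correlated with the target."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Invented toy values: five periods of margin and four candidate elements.
margin = [10.0, 12.0, 9.0, 15.0, 11.0]
candidates = {
    "steel_price": [100.0, 90.0, 105.0, 80.0, 95.0],
    "units_sold": [5.0, 6.0, 4.0, 8.0, 5.0],
    "office_headcount": [40.0, 40.0, 40.0, 40.0, 40.0],
    "day_of_week": [1.0, 5.0, 2.0, 4.0, 3.0],
}

sweet_spot = select_sweet_spot(candidates, margin, k=2)
print(sweet_spot)
```

In practice `k` would sit somewhere in the 50 to 100 range mentioned above, and the shortlist would still go to a data steward for review, since correlation alone does not show that an element is useful for a decision.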

Clearly, with 44ZB of data collected globally in 2020, and that figure expected to rise to 149ZB by 2024, simply ingesting more and more data from multiple sources into data lakes is not enough to make it usable. In fact, the conventional labor-intensive method of making data consumable constitutes a serious barrier to progress.

Overall, AI-led data auto-discovery capabilities are getting better by the month. Companies should launch pilot programs to monitor the accuracy and throughput of such capabilities. On a use-case by use-case basis, these capabilities can streamline the steps of incorporating new internal and external data sources quickly and at scale.


