Somewhere, hidden in the depths of your business, lies a threat you may never have considered. Lurking in the shadows, it silently multiplies beyond your control. It threatens your business’s security and could cost you dearly under new regulations. The threat? Dark data.
Also referred to as unstructured data, dark data is growing at a rate of 62% per year, according to IDG. By 2022, IDG says, 93% of all data will be unstructured.
Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes”. It consists of data from a huge variety of sources – emails, documents, instant messages, digital media posts, partly developed applications – or simply information that isn’t being used or analyzed. The name alone makes it sound foreboding.
With new regulations such as the GDPR coming into force, businesses must gain a clear understanding of the data they hold. For structured data, this is straightforward. But dark data is much harder to manage, because it is stored across a distributed IT environment with no single owner.
A ‘bottomless lake of data’
Dark data tends to be text-based data, as well as video, audio files and images. It’s generated by a wide range of sources, gathered from mobile devices, social platforms, apps and internal systems, to name but a few. Much of the data generated by the Industrial Internet and the Internet of Things is unstructured, so this also falls under the dark data shadow.
In the workplace, employees are responsible for generating a lot of dark data. In fact, says Sony Shetty from Gartner, “Across the enterprise, employees are blindly building a bottomless lake of data and, in many cases, a corporate mantra of ‘save everything, just in case’ is encouraging the behavior”.
Think about the amount of data you, personally, generate, filter and store each working day. Did you record your last conference call in case anyone missed it? Did you make it available as a podcast and save that, too? What about your customer calls – do you record them ‘for training purposes’ and store them as audio files? Do you have a chat function on your website and keep a record of the interactions, or use an instant message function on your desktop? One study found enterprises to be using almost 500 business applications, each generating data.
All the data generated by this activity falls under the definition of dark data, and is stored across different devices, drives, desktops and SaaS platforms. Most of it will never see the light of day again. Employees leave – taking their passwords with them – customers move on, business priorities change, and no one has the remit, the ability or the time to remove the data. The information quickly becomes out of date and inaccessible.
The need to understand data
Prior to the GDPR, dark data would have been an accepted part of legacy business. In the UK, the 1998 Data Protection Act didn’t provide any minimum or maximum period for data to be stored, so it would have been a case of ‘out of sight, out of mind’. Now, though, the GDPR requires businesses to gain an in-depth understanding of how data flows across their organization, along with stringent data governance. The new Data Protection Bill coming into force will implement the GDPR into UK law.
From May 25th, if a ‘data subject’ – a client, employee or other stakeholder – asks what data a company holds on them, the company must know and share this. If they ask to see a record of when and how they gave consent for their data to be used, the company must provide this too. Only information necessary for its original purpose should be processed.
“Inaccurate or outdated data should be deleted or amended and data controllers are required to take ‘every reasonable step’ to comply with this principle”, says Debbie Heywood from Taylor Wessing. This is extremely hard to fulfil if data is held in silos across an organization. “Because unstructured data is text heavy and irregular, making sense of what is being said and how it’s being said – positively or negatively – is not for the faint of heart,” says a report from the Medallia Institute.
Tapping into uncharted territory
The time has come for businesses to bring their dark data into the light. Doing so helps drive GDPR compliance, but the benefits of understanding dark data stretch far beyond compliance. Think of it as discovering uncharted territory: analyzing this unstructured data offers the opportunity to extract invaluable business insight that would otherwise lie dormant. It transforms raw data into strategic intelligence.
Gartner notes: “Some examples of data that is often left dark include server log files that can give clues to website visitor behavior, customer call detail records that can indicate consumer sentiment and mobile geolocation data that can reveal traffic patterns to aid in business planning.”
For example, most of us know that retailers are experts at using psychology to drive product placements. They understand our thought process and how we tend to move around a store, and place products accordingly. Studying filmed footage of how consumers move around stores helps retailers refine their product placement strategies even further. As Deloitte says, “A retailer may be able to gain a more nuanced understanding of customer mood or intent by analyzing video images of shoppers’ posture, facial expressions, or gestures”. This intelligence, extracted by analyzing dark data, can translate directly into revenue as retailers apply it to their store layout.
By analyzing dark data, businesses can:
- Create a truly 360-degree single customer view, to drive engagement and boost interactions
- Anticipate, understand and respond to changes in market and consumer demand
- Develop an in-depth understanding of consumer sentiment on their brands, gleaned from social platforms and multichannel interactions
- Lock down and secure vulnerable data points, and give personal data the protection it requires
- Refine the accuracy of risk management models
- Address recurring pain points for customers and direct customer support to those areas most affected
- Identify any links and connections between data sets
- Generate a strong foundation for accurate forecasting
- Gain a deeper understanding of website performance from web analytics
- Identify new revenue streams
By the end of this year, according to IDC, “50% of Large Enterprises Will Be Generating Data-as-a-Service (DaaS) Revenue from the Sale of Raw Data, Derived Metrics, Insights, and Recommendations”.
Now, analyzing unstructured dark data is simpler than ever before. Advanced, high-performance Customer Information Management tools automate and accelerate processes, connecting data sets for clarity and insight. Software scans both structured and unstructured data, using different data profiling techniques. The results of the scan are used to automatically generate a library of documentation, which describes a company’s assets and creates a metadata repository.
You can then start to explore the opportunities and possibilities which lie within the data – and that’s when it starts to get really exciting.
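To make the scan-and-catalog idea described above a little more concrete, here is a minimal Python sketch. It is not how any particular Customer Information Management product works. The function names (`profile_file`, `build_catalog`, `stale_entries`), the record fields and the extension-based split between structured and unstructured files are all assumptions made for illustration. Real profiling tools inspect file contents, not just names and timestamps.

```python
import mimetypes
import os
from datetime import datetime, timezone

# Assumption: a crude, extension-based notion of "structured" data.
# Real profiling tools look inside the files rather than at their names.
STRUCTURED_EXTENSIONS = {".csv", ".json", ".xml", ".parquet"}


def profile_file(path):
    """Return one metadata record describing a single file."""
    stat = os.stat(path)
    extension = os.path.splitext(path)[1].lower()
    mime_type, _encoding = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "path": path,
        "size_bytes": stat.st_size,
        "modified_utc": modified.isoformat(),
        "mime_type": mime_type or "unknown",
        "kind": "structured" if extension in STRUCTURED_EXTENSIONS else "unstructured",
    }


def build_catalog(root):
    """Walk a directory tree and collect a metadata record for every file."""
    catalog = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in sorted(filenames):
            catalog.append(profile_file(os.path.join(folder, name)))
    return catalog


def stale_entries(catalog, cutoff):
    """Return records last modified before `cutoff` (a timezone-aware datetime)."""
    return [
        entry
        for entry in catalog
        if datetime.fromisoformat(entry["modified_utc"]) < cutoff
    ]
```

Calling `build_catalog` on a shared drive path yields a list of records that could be stored as a simple metadata repository, and `stale_entries` gives a first, rough list of files nobody has touched since a chosen date. Whether such files should be reviewed, archived or deleted remains a governance decision, not something the script can settle.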