Are You Drowning in Dark Data?

The opportunities and threats to your business from masses of hidden data

Whether you like it or not, countless companies are using your data. Some of this information is structured and easily accessible. Some of it, though, is not. In the course of machine learning and big data analysis, companies can unintentionally collect unstructured data that isn’t readily accessible. This is called dark data, and it dwells inside every email, text, Word document, invoice and contract. From a business perspective, data generally informs decision making for the better. However, dark data can be more of a hindrance than a help. Why has unstructured data become such an important issue, and how should businesses respond?

Shedding light on dark data
Data makes the world go round. It’s becoming more and more important for companies to collect information and use it to improve their strategies, but unwittingly accumulating vast stores of dark data can be damaging. Without machine learning techniques such as pattern recognition or natural language processing, dark data is useless. Reliable data storage doesn’t come cheap, so data hoarding is an expensive habit – especially when over half of the data stored today is unused. Nonetheless, companies appear to be willing to fork out for additional storage. Another issue with storing dark data is that it’s difficult to know what information you’ve actually got. This is obviously problematic in terms of data privacy and protection, but companies are unwilling to delete data in case they need it to justify certain decisions. Ultimately, this all boils down to the fact that companies know they need to collect data, but aren’t unlocking its full potential. Worse, failing to keep track of what information they collect could lead to serious breaches of regulations – which is ironic, given that compliance with data laws has contributed to the build-up of so-called databergs. Of course, there are ways of handling dark data. Turning unstructured data into structured data with machine learning techniques is one option. Another is to ensure interoperability (in other words, compatibility) between IT systems. This is easier said than done for SMEs with limited resources, however.
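To make the idea of structuring unstructured data a little more concrete, here is a minimal Python sketch. It is not a machine learning system: it uses simple rule-based pattern matching (regular expressions) as a stand-in for the pattern recognition mentioned above. The field names, the invoice number format and the sample message are all assumptions made up for illustration, and real documents would need patterns, or a trained model, of their own.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns, invented for this sketch. Real invoices and emails
# will use different formats and need their own rules or a trained model.
INVOICE_RE = re.compile(
    r"invoice\s*(?:no\.?|number|#)?\s*:?\s*([A-Z]{2,4}-\d+)", re.IGNORECASE
)
AMOUNT_RE = re.compile(r"£\s?(\d{1,3}(?:,\d{3})*(?:\.\d{2})?)")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


@dataclass
class InvoiceRecord:
    """A structured view of one free-text message (illustrative fields only)."""
    invoice_id: Optional[str]
    amount_gbp: Optional[float]
    contact_email: Optional[str]


def structure_text(text: str) -> InvoiceRecord:
    """Pull a few known fields out of free text; missing fields become None."""
    invoice = INVOICE_RE.search(text)
    amount = AMOUNT_RE.search(text)
    email = EMAIL_RE.search(text)
    return InvoiceRecord(
        invoice_id=invoice.group(1).upper() if invoice else None,
        amount_gbp=float(amount.group(1).replace(",", "")) if amount else None,
        contact_email=email.group(0) if email else None,
    )


# An invented example of the kind of text that sits unused in an inbox.
sample = (
    "Hi team, please settle invoice no. ACM-10423 for £1,250.00. "
    "Questions to billing@example.com. Thanks!"
)
record = structure_text(sample)
print(record)
```

Once text has been reduced to records like this, it can be searched, counted and audited like any other structured data, which is also what makes it possible to know what information is actually being held.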

Disruptive dark data
Dark data is a thorn in the side of businesses that don’t have the relevant data analysis tools. The unintended accumulation of unstructured, unused data in organisations could have various consequences, and not just for the business itself. The more information that’s tied up in a company, the more scope there is for cybercriminals to get hold of sensitive data. If a company uses a black-box artificial intelligence platform, it may become impossible to trace decisions. The idea that opaque AI systems could sift through unprecedented amounts of personal data without anyone knowing is worrying, to say the least. But, in fairness, dark data only presents a problem if it isn’t dealt with appropriately. It has massively disrupted the way that businesses work and, despite the many potential issues, this has had a positive impact on strategy. Structuring unstructured data has given companies clues about how best to tackle markets, maximise ROI and establish a closer relationship with customers. With advanced data analysis, this will continue to happen. For the consumer, though, the quantity of personal data that enterprises can access is undoubtedly unsettling. New legislation (for instance, a new UK Data Protection Bill) could be the much-needed catalyst for change. Additionally, as businesses become more familiar with data analysis techniques, they will be more willing to apply them – and apply them well.

Dark data is a double-edged sword for businesses. On the one hand, it can provide valuable insights. On the other, it can be a costly and risky burden. Larger businesses are generally better equipped to handle dark data, but modern analytics solutions have enabled SMEs to integrate similar systems. Without quality data insights, businesses can lag behind their competitors. Structuring dark data, rather than hoarding it in additional storage, could help organisations to better understand their market – as well as allowing them to reliably trace decision making. Companies with a mountain of unused data need to begin deciphering the information they have – because soon, thanks to strict data regulations, they won’t have a choice.

Will regulations combat the trend of data hoarding? What other techniques could businesses use to decipher dark data? How will addressing mass data storage impact cloud computing companies? Comment below with your thoughts and experiences.