Don’t ignore active data, the trees in the big data forest
This post was also published on Enterprise CIO.
Big data has been in vogue for years, but many businesses still struggle to extract value and insight from the large volumes of data they collect. However, there is an often-ignored set of data in the enterprise that is truly actionable, data that I call “active” data.
Active data is “in-flight data” that represents things that are changing or that need some action taken to move forward. Examples include open purchase orders, new PTO or family leave requests, sales opportunities that are changing in scope, orders that shipped late and so on.
Surprisingly, there is a relatively small amount of active data, even in companies with tens or even hundreds of thousands of employees. Yes, there is a lot of data floating around the enterprise, but there are only so many open purchase order requests or Key Performance Indicators, which is the data that employees actually need and use.
The problem with focusing only on big data is that most active data is buried in software that employees are reluctant to use. IT spends a lot of time extracting big data out of various systems, but assumes that things like purchase orders or sales opportunities are easy for employees to find and use with existing software. In reality, that data sits in legacy systems and hard-to-use SaaS systems.
Take the sales team, for example. They need to know if or when their goals change or whether they are meeting their projections. To do their jobs, they need only a small portion of all the data they have access to, such as the news that a sales executive’s top prospect has run into an issue and filed it with the customer support organization. This is the kind of data that usually falls through the cracks. Big data, on the other hand, can help tune the sales organization, for example by making it more efficient at processing leads.
Sometimes the complexity that comes with big data ends up scaring employees away, so that no one uses the data at all. Because big data has no immediate effect on day-to-day business, employees are less likely to care about it. Gartner even says that the issue is not so much big data itself, but rather how it is used.
Every department, from IT (ironically enough) to sales to HR, is guilty of not using active data. That’s why it’s important for IT to consider the end user when it comes to extracting data from enterprise systems. IT should consider both the power users and the occasional users of various HR, CRM, or finance systems. Both types of users want access to data, but their needs differ. An HR representative is a power user of Workday, and is in the system every day examining open PTO or family leave requests. Joe in marketing is an occasional user who just needs to know the status and next steps of his personal requests, his active data.
The IT team is often the most guilty of not distilling big data into active data for employees to act on. When I was the CIO at CBS Interactive, we processed almost one billion events a day that flowed from our web and application servers over message queues to a huge cluster of twelve-core Hadoop nodes that then fed a Teradata data warehouse. Of course, we analyzed that data and distilled insights to better package our ad products. But what day-to-day managers needed was for Key Performance Indicators like the bounce rate to be easily accessible, and to be notified if those indicators shifted unexpectedly.
For IT, big data isn't the be-all and end-all. Instead, enterprises should focus on the data that matters most to specific users now, so they can be as productive as possible. We know that today’s enterprise software offerings need to be modernized, or "go micro" as I like to say. According to a Forrester survey, the average worker spends one day a week searching for information across various systems. And unfortunately, it is only getting worse as the amount of enterprise data doubles every 18 months.
Data should be personalized and delivered to employees in small, digestible sets. It's easy for IT teams to get lost in the process of building out big data infrastructure and forget that data needs to be usable, actionable and personalized. If IT arms employees with the personalized active data they need, productivity will increase.