Clean Up Your Enterprise Data Mess the Easy Way: Ignore It

If you’re responsible for data strategy in a large organization, there’s a good chance you’ve got a mess on your hands. In recent years, most organizations have let their discipline for deploying enterprise data slip a bit. Some have let it slip a lot. In their rush to deploy valuable applications, project teams developed solutions in relative isolation, collecting the data they needed for their own purposes and missing the opportunity to reuse data across applications and analytic use cases. On top of this, well-intentioned “self-service” initiatives have often spun out of control. What were supposed to be prototypes, experiments, or ad-hoc analytics have turned into de facto production solutions, with all the risks and support burdens that go with them.

As a result, data is replicated everywhere – data about customers, sales, products, inventory, and so on – leading to inconsistency, excessive interface costs, inability to adequately protect sensitive data, and challenges linking data together for enterprise initiatives. And, ironically, the desire to build solutions quickly led to a dramatic slow-down of projects, since every project had to spend time and effort to collect, manage, and organize data on its own, regardless of how many other applications needed the same data.

Now you own this mess. What do you do? If you’re like most data leaders, you take the most obvious step: You set about cleaning it up. Maybe you’ll take advantage of technology “modernization” – such as moving to the cloud or implementing advanced data integration and data management tools – to re-architect, re-design, and re-implement existing data resources to create the well-structured and thoughtful enterprise data resources that should have been built in the first place.

Yeah, don’t do that.

Think about it: If your goal is to simply clean up the mess, what will you really achieve? After spending literally years reorganizing and rewriting complex structures and carefully ensuring that all existing applications continue to function, you’ll have exactly the same results you have today. And that’s if you execute the transition perfectly.

Meanwhile, the rest of the company will continue to focus on initiatives to address the real priorities – customer experience, supply chain optimization, manufacturing automation – without the benefit of your help because you’re distracted by all that rework. Maybe you’ll try to fold in support of new use cases while simultaneously re-writing everything. That just introduces more complexity and slows critical business projects while they wait on your data projects. They’ll just work around you.

There is a much better way.

Instead of turning toward the mess and diving in, you’d be better off ignoring it. (Well, not completely, but mostly.) Focus your attention where it’s needed most: the approved, funded business initiatives that are set to transform the enterprise. All major business initiatives require data for their success. You should deliver that data, just-in-time and just-enough. You should be able to show how every data element delivered, every data quality issue addressed, and every act of data stewardship contributes directly to specific application or analytic needs within approved business initiatives. And because you’ll drive the data deployment, you can organize the data rationally and coherently – ready for reuse across the enterprise.

But wait a minute, what about the mess?

Here’s what’s going to happen: As you deploy data in support of new business initiatives, outdated data structures will begin to fall away. As a result of the changes within the initiatives, source systems will be replaced, as will applications that rely on the old data structures. Little by little, you can just decommission obsolete resources as the new structures support modern versions of retired applications. No need for an archeological dig to decipher the internal complexity of existing systems, many of which have no documentation and no one who knows or remembers how they work.

Ok, yes, there are some exceptions, but they should be very targeted. You should make surgical changes to current systems only when necessity or value justifies them. For example, you may discover that personally identifiable information (PII) is scattered and vulnerable. You need to deal with that. Or there may be some technical processes that require excessive maintenance, and a rewrite of those specific processes is justified. You may also want to salvage elements of, or at least connect to, existing data stores – very carefully and selectively – as you build the new, integrated solution. And if moving to the cloud is a priority, you can just “forklift” current structures without getting in the way of new development work.

As part of this strategy, it’s also important to continue to enable self-service, but in a closely monitored and supported form. You’ll find that when you publish your roadmap, then show results by delivering certified, reliable, and coherent production data quickly and iteratively, users will not feel so compelled to build pseudo-production solutions. Users don’t collect, organize, and manage their own data because they want to. They do it because they don’t have a choice. They’d much rather focus on value-producing analytics and save their data curation skills to experiment with new and innovative data sets.

Moving away from disjointed data resources that hinder your company’s ability to compete, move, and change quickly is an admirable goal. But if there’s a way to directly support the most urgent business needs of the company, build out trustworthy enterprise data resources, and avoid the muck of indecipherable complexity all at the same time with streamlined effort and speed, shouldn’t you give that approach a try?
