Techerati editor James Orme spoke with Jiri Vojtek, COO at data management company CloverDX, to understand the data mapping challenge and how businesses can tackle it
Q&A: The enterprise data mapping problem, with Jiri Vojtek, COO at CloverDX
Thu 7 May 2020
What is the enterprise data mapping problem and why does it need to be solved?
Does this sound familiar? “It’s gonna be just this one thing.” I’ve heard it maybe a thousand times during my career. It always goes like: “We use the HL7 standard (or insert your own complicated data format), but we just need to remap some segments, triggers… easy.” Or especially these days, companies get sold on “Our system (insert your favourite here) will completely change your business. You just need to send your data in THIS FORMAT.” The enterprise data mapping problem is always about reducing something complex and all-encompassing into a manageable subset. It’s perceived as “we just need this one small thing”.
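To make the “just remap some segments” scenario concrete, here is a deliberately simplified sketch in the style of an HL7 v2 pipe-delimited message. The message content, field positions and helper names are invented for illustration; real HL7 adds escaping rules, field repetitions, sub-components and site-specific Z-segments, which is exactly where the hidden complexity tends to live.

```python
# Illustrative only: a toy HL7-v2-flavoured message with two segments.
# Real messages have many more segments, repeats and escaping rules.
message = "\r".join([
    "MSH|^~\\&|SENDER|FAC|RECEIVER|FAC|202005070800||ADT^A01|123|P|2.3",
    "PID|1||555-44-3333||DOE^JOHN||19610615|M",
])

def parse_segments(msg):
    """Split an HL7-style message into (segment_id, fields) pairs."""
    segments = []
    for line in msg.split("\r"):
        fields = line.split("|")
        segments.append((fields[0], fields))
    return segments

def remap_patient_name(segments):
    """The 'one small remap': flip family^given to given^family in PID-5."""
    out = []
    for seg_id, fields in segments:
        if seg_id == "PID" and len(fields) > 5 and fields[5]:
            family, _, given = fields[5].partition("^")
            fields = fields[:5] + [f"{given}^{family}"] + fields[6:]
        out.append((seg_id, fields))
    return out

remapped = remap_patient_name(parse_segments(message))
```

Even this toy remap already has to guard against missing fields, and it silently ignores name repetitions and sub-components, so the “easy” one-segment change quickly grows edge cases in practice.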
The misrepresentation of the problem is quite natural. It’s subject matter experts we’re dealing with, and they operate in the realm of understanding their data from beginning to end. And they’re quite often “blind”, so to speak, to the complexity of the actual data structures. On one hand it’s the complexity of the formats, but there’s also another thing: there is no such thing as a “standardised”, uniform company. And therefore there’s no such thing as “standard” data. There’s always this base that shares many similarities, but then you’ve got this 10-20 percent, maybe even more, that’s just too specific, too special.
Next, there’s no time to spare. We need to solve the problem as quickly and effectively as possible. You really don’t want to spend weeks or months onboarding a new customer or vendor just because you have to modify your systems for them. “I just need to add a few more fields” is usually one of the most feared sentences.
Why is this problem often underestimated?
The data we typically deal with is not perfect, far from it. How? Well, have you ever heard of someone “repurposing” (what an innocent term!) certain data fields for something other than intended, because it just worked perfectly for them at the time? This poses big problems when you’re “naively” mapping data from A to B: suddenly you end up with something that should fit but doesn’t. This is one of the easier problems to deal with, but still more common than you’d think.
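The repurposed-field failure mode can be sketched in a few lines. All field names and values below are invented: imagine a source system whose unused fax field was quietly co-opted to hold an internal account code, and a naive field-by-field mapping that trusts the schema.

```python
import re

# Hypothetical source record: the "fax" field was repurposed long ago
# to store an internal account code, because it happened to be free.
source_record = {
    "name": "Acme Corp",
    "phone": "+1 555 0100",
    "fax": "ACC-2021-0042",  # repurposed: not a fax number at all
}

def naive_map(record):
    """Copy fields from A to B by name, trusting the declared schema."""
    return {
        "contact_name": record["name"],
        "contact_phone": record["phone"],
        "contact_fax": record["fax"],  # silently carries the bad data over
    }

def is_phone_like(value):
    """Loose sanity check: optional '+', then digits, spaces, dashes."""
    return bool(re.fullmatch(r"\+?[\d\s-]+", value))

mapped = naive_map(source_record)
print(is_phone_like(mapped["contact_phone"]))  # True
print(is_phone_like(mapped["contact_fax"]))    # False: the mapping "fits"
                                               # the schema but not the data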
Why is it not that easy?
Because people tend to focus on the current problem and don’t think “big”. Or they do, but the cost is simply too high for now. So as the problem evolves you are getting more interim solutions, cheap and dirty. One day you end up with a very good example of “spaghetti code”. In IT everyone knows that temporary solution has to be done properly because it will last forever. The reasons for that are cost and time. You need it quickly and cheaply. Later on, it becomes too big to refactor and it is not worth the cost. That is the trap.
What are the key steps to tackling this deeper complexity?
Selecting the right toolset is the first thing. It should give you a good manoeuvring space, but should not require the immediate complexity – you should be able to start small but grow big later. Another one to keep in mind is that you are building for the next 10 years. Just think like that. You don’t need to cover all the possible cases – that would be too costly and frankly impossible. But, the design has to be done right. Don’t cheat. Keep the best practices and understand that you are just in the first iteration because it will change.
That is actually the most important fact – it will change as soon as someone starts using it. Use common standards for software development and you should be safe. Do you want to do everything on your own? There are no “free features”, maintenance etc when you go DIY road.
Find someone whose job is to deal with such projects. Learning everything from scratch is expensive so at least having a mentor or consultant is a good idea.