Cleaning the big dirty data problem with The Classification Guru’s Susan Walsh

Written by Thu 9 Feb 2023

Ahead of Tech Show London, Techerati spoke with Susan Walsh, Founder and MD of The Classification Guru Ltd, about the detrimental effects of having dirty data.

Susan will appear at Big Data & AI World 2023 to share more on how to clean your data, the importance of data accuracy, and making sure your data has its COAT on.

Identifying dirty data as a problem

While working for a spend analytics company almost six years ago, Susan recognised the issue of dirty data. Despite the platforms’ attractive dashboards, she found that the large majority of the work behind them was the cleaning and classification of the data.

This is when Susan questioned why very few people were focusing on this issue. “Why are we pretending that like AI is doing this when it’s real people? There must be people out here who don’t want the dashboards, they just want the clean data,” she said.

As a result, Susan was inspired to set up her own company: The Classification Guru Ltd.

What has changed in the last six years?

Susan has recognised that there is more investment in data reporting and analytics to drive businesses forward, yet data cleaning is still not seen as a priority. This may be due to a perception that data cleaning is menial. Her business is nonetheless growing, serving those who realise the importance of keeping data sets clean to achieve efficiency and accuracy.

“There are software companies that say ‘buy our tool and we will fix all your data’. But it doesn’t fix the data at all,” added Susan.

What are the benefits of cleaning dirty data?

For Susan, data cleaning consists of making sure data is spelled, formatted, categorised, and classified correctly.
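As a rough illustration of what that can involve in practice, the sketch below standardises the spelling and formatting of supplier names and then assigns each one a category. It is a minimal example under stated assumptions, not Susan's own method: the lookup tables, supplier names, categories, and function names are all hypothetical.

```python
import re

# Hypothetical lookup tables, for illustration only.
KNOWN_SUPPLIERS = {
    "ibm": "IBM",
    "i b m": "IBM",
    "international business machines": "IBM",
    "dell": "Dell",
    "dell computers": "Dell",
}
CATEGORIES = {"IBM": "IT Hardware", "Dell": "IT Hardware"}

# Common company suffixes that add noise when matching names.
_SUFFIXES = re.compile(r"\b(ltd|limited|inc|plc|corp|corporation)\b\.?", re.IGNORECASE)


def clean_supplier(raw: str) -> str:
    """Return a consistently spelled and formatted supplier name."""
    text = _SUFFIXES.sub("", raw)            # drop legal suffixes
    text = re.sub(r"[^\w\s]", " ", text)      # punctuation becomes spaces
    text = re.sub(r"\s+", " ", text).strip().lower()
    # Use the agreed spelling if we know it, else fall back to title case.
    return KNOWN_SUPPLIERS.get(text, text.title())


def classify(supplier: str) -> str:
    """Return the category for a cleaned supplier name."""
    return CATEGORIES.get(supplier, "Unclassified")


for raw in ["  I.B.M. Ltd ", "DELL Computers, Inc.", "Acme Widgets PLC"]:
    name = clean_supplier(raw)
    print(f"{raw!r} -> {name} ({classify(name)})")
```

Anything that comes back as "Unclassified" still needs a person to review it, which is consistent with Susan's point that some of the work has to be done manually.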

“If you don’t put clean data in first, you’re not going to get clean data out. There is some amazing technology that can help, but sometimes there is no quick fix; you have to just do it manually and suck it up,” said Susan.

The benefits of data cleaning include compliance with privacy and data protection laws, cost savings, fraud prevention, and more.

How can businesses keep their data clean?

Susan created the COAT methodology, a memorable acronym that encourages businesses to keep their data Consistent, Organised, Accurate, and Trustworthy.

Consistent data follows standard processes, organised data is sorted into appropriate categories, and accurate data correctly reflects what it represents. Together, these result in trustworthy data.

“This means you can make better business decisions,” said Susan.

However, Susan noticed that most businesses don’t realise how bad their dirty data problem is, which is often caused by silos. Business leaders are therefore advised to spend more time familiarising themselves with their data.

“Data problems are really people problems. We are not all data professionals, but we have to speak to everybody. Talking to them about algorithms and neural networks and meshes means nothing to most people and it turns them off. That is why I created the COAT methodology, as it is easy to remember by everyone in a business.

“This is why we have ended up with companies spending tens of millions on software they cannot use because they have been blinded by big fancy words,” added Susan.

What does a good data leader look like?

Susan believes a good data leader is inclusive and leads by example. Speaking on a level that everyone can understand will help your cause.

“For CEOs and business leaders, cleaning your data will increase your profitability and have an impact on your bottom line,” said Susan.

The work does not stop once a business has cleaned its data; the data must also be maintained. Susan advises businesses to conduct regular spot checks, as data can be accidentally deleted or overwritten.
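One way such a spot check might look is sketched below: it pulls a random sample of records and flags any whose required fields have gone missing, for example after an accidental deletion. This is an assumption about how a check could be automated rather than a process described by Susan, and the field names and the `spot_check` function are hypothetical.

```python
import random


def is_blank(value) -> bool:
    """True if a field is absent or contains only whitespace."""
    return value is None or str(value).strip() == ""


def spot_check(records, required_fields, sample_size=5, seed=None):
    """Sample records at random and report those with blank required fields.

    Returns a list of (record, missing_fields) pairs for the sampled
    records that failed the check.
    """
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    problems = []
    for record in sample:
        missing = [f for f in required_fields if is_blank(record.get(f))]
        if missing:
            problems.append((record, missing))
    return problems


# Hypothetical records; the second has lost its supplier name.
records = [
    {"supplier": "IBM", "category": "IT Hardware"},
    {"supplier": "", "category": "IT Hardware"},
]
for record, missing in spot_check(records, ["supplier", "category"], sample_size=2):
    print(f"Check {record}: missing {missing}")
```

A check like this only catches blanks in the sample it draws, so it complements rather than replaces a person looking over the data.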

Susan promises both fun and actionable advice during her session at Big Data & AI World on 8-9 March at ExCeL London.

Join Big Data & AI World

8-9 March 2023, ExCeL London

