How to Address Data Quality and Avoid the Hidden Costs of Poor Data

You may not see a Fortune 1000 company list “Data Quality” in its annual report as a major initiative for the year, but there’s no doubt that data quality issues affect virtually every department at these companies. While data quality may not be a hot topic, it’s an important one, because bad data could be a driving factor in keeping your organization, business unit, or department from achieving a return on your analytics investment.

The Cost of Data Quality Issues

Data quality issues create distrust in your analytics solutions, slow workflows, and generate extra work. The downstream effect is poor decision making, which can end up costing you a lot.

  • Bad decisions or no decision at all. A decision is only as good as the information it is based upon, so making decisions based on bad data leads to lost opportunities and expensive rework, research, and remediation. The cost of bad decisions will inevitably surface, whether financially, through bad publicity, or by losing a competitive edge.  What they say about “garbage in, garbage out” is true.
  • Decreased productivity. If you don’t have to worry about data quality issues, staff can be more productive. Instead of spending time validating and fixing data errors, they can focus on more profitable activities.
  • Missed opportunities. When bad or inaccurate information hides an underlying opportunity, you lose the chance to get ahead of competitors, and this cost is almost never known or understood.

Because it’s hard to empirically measure the cost of bad data, directors and managers often don’t have time or budget allocated to ensure good data. But prioritizing data quality as a core competency supports proactive decision-making and helps you thrive with your data.

Real Client Scenario

One of our clients, a bridge manufacturer, had problems delivering on time due to data quality issues. The procurement team ordered parts from a range of suppliers, but they often failed to put shipping information into the ERP system. In the meantime, a day’s work would be planned out, only for the team to discover that parts hadn’t arrived. This meant that employees sat idle waiting for work, or that a manufacturing space had been prepped and now needed to be cleared, which hurt completion dates and profitability. The solution was to address the procurement team’s work processes to capture and maintain shipping data accurately. The manufacturing team could then plan work more accurately and dramatically reduce idle time.

While this may appear to be a simple problem to address, sometimes it’s hard to change processes because of company politics, culture, or the perception of extra work. But because we were able to express, in dollars, the cost of bad data each time employees were idle or work spaces weren’t used, the issue became much more real to our client and was then prioritized.
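To make that kind of dollar figure concrete, here is a minimal sketch of the arithmetic, assuming the cost of an idle incident is simply workers × hours × a loaded hourly rate. The function name and every number below are hypothetical illustrations, not the client's actual figures.

```python
def idle_cost(idle_workers: int, idle_hours: float, hourly_rate: float) -> float:
    """Labor cost, in dollars, of one incident where staff waited on parts."""
    return idle_workers * idle_hours * hourly_rate


# Hypothetical incidents: (workers idle, hours idle, loaded hourly rate in dollars)
incidents = [
    (6, 4.0, 45.0),
    (3, 8.0, 45.0),
]

total = sum(idle_cost(*incident) for incident in incidents)
print(f"Estimated idle-labor cost: ${total:,.2f}")
```

A fuller estimate would likely add the cost of prepping and clearing unused work spaces and any late-delivery penalties, but even a rough labor-only number gives stakeholders something tangible to weigh.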

Types of Bad Data

When assessing the quality of your data, focus on these common issues:

  • Missing data: This is the most common issue, and it impedes accurate analysis.  If this is your issue, you need to develop a strategy to address missing data. Do you suppress rows with missing data? Do you define default values? Your strategy must be unique to your issues and data.
  • Inaccurate data: When you have inaccurate data, the cause may not be discoverable without detailed profiling and cross verification. The root cause of inaccurate data is often difficult to find, and decisions based on incorrect information have a costly impact on your business.
  • Duplicate data: Duplicate data is often caused by errant business processes, and it leads to a variety of issues. Luckily, it can be readily addressed through consistent profiling and remediation.
  • Wrong data source: Data is often acquired through a dizzying array of sources. In the early days of computer systems, human error from manual input was by far the most common issue. Today, data comes from partners, clients, internal systems, and external systems, and companies often choose the most readily available sources, which are not necessarily the most trusted.
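As a rough illustration of what profiling for missing and duplicate data can look like, the sketch below uses only the Python standard library. The records, field names, and helper functions are all hypothetical; a real profile would run against your own tables, often with a dedicated profiling tool.

```python
from collections import Counter

# Hypothetical purchase-order records; None marks a missing value.
records = [
    {"po": "1001", "supplier": "Acme", "ship_date": "2024-03-01"},
    {"po": "1002", "supplier": "Bolt Co", "ship_date": None},
    {"po": "1002", "supplier": "Bolt Co", "ship_date": None},
    {"po": "1003", "supplier": None, "ship_date": "2024-03-04"},
]


def missing_counts(rows):
    """Count missing (None or empty string) values per field."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if value in (None, ""):
                counts[field] += 1
    return dict(counts)


def duplicate_keys(rows, key):
    """Return the key values that appear on more than one row."""
    seen = Counter(row[key] for row in rows)
    return sorted(k for k, n in seen.items() if n > 1)


def drop_incomplete(rows, required):
    """Strategy A: suppress rows that lack any required field."""
    return [r for r in rows if all(r.get(f) not in (None, "") for f in required)]


def fill_defaults(rows, defaults):
    """Strategy B: substitute a default where a field is missing."""
    return [
        {f: (defaults.get(f, v) if v in (None, "") else v) for f, v in row.items()}
        for row in rows
    ]


print(missing_counts(records))
print(duplicate_keys(records, "po"))
```

Whether to suppress rows or fill defaults depends on how the data is used downstream, which is why the two strategies are kept as separate functions here.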

How to Address Your Data Quality Issues

At Analytics8, we believe that creating a culture of analytics is the foundational step to addressing data quality issues. 10 Critical Behaviors of Analytics Maturity is a great place to start. These 10 Behaviors will put you on the path to addressing data quality over the long term. Beyond creating a data-driven culture, here are some tactical steps you can take now to improve your data:

1.)  Conduct a data audit. To start, we recommend conducting a data audit to document things like what data you are collecting, where it’s kept, and who has access to it. Take a full inventory of all the data your company uses and processes.

2.)  Assess your data quality issues. Identify and rank the business intelligence activities that are the most susceptible to the impact of bad data. For each of these, document how the data is processed, tracing it back to the source. Creating data flow diagrams is an excellent way to identify the “data path,” the process that data goes through from source to target analytics.

Then, flag poor sources and processes. You may discover an Excel spreadsheet right in the middle of your data flow that gets manually updated. Profile the data and use subject matter experts to validate it.

Do: Be iterative and agile in your assessment, and don’t be afraid to mark a source complete if it checks out. Document the sources and ensure a process is in place to govern future changes.

Don’t: Overfocus on creating documentation as part of the assessment. Doing so delays critical remediation efforts. Do only as much documentation as necessary, and focus documentation efforts on what’s wrong.
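A data path does not have to start as a polished diagram. As a lightweight sketch, the steps can be listed in order and the risky ones flagged automatically. The step names, the `FlowStep` structure, and the rule "manual or spreadsheet-based means risky" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlowStep:
    name: str
    kind: str     # e.g. "system", "spreadsheet", "report"
    manual: bool  # True if a person edits the data by hand at this step


# Hypothetical data path, listed from source to target analytics.
flow = [
    FlowStep("ERP purchase orders", "system", False),
    FlowStep("Shipping tracker.xlsx", "spreadsheet", True),
    FlowStep("Data warehouse load", "system", False),
    FlowStep("Production planning dashboard", "report", False),
]


def flag_risky_steps(steps):
    """Return names of steps that are hand-edited or spreadsheet-based."""
    return [s.name for s in steps if s.manual or s.kind == "spreadsheet"]


for name in flag_risky_steps(flow):
    print(f"Review with a subject matter expert: {name}")
```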

3.)  Evaluate the assessment and create a prioritization matrix. Once you’ve identified issues, you have action items that need to be prioritized. Prioritize those items by considering Business Value on one dimension and Feasibility of Remediation on the other.  The items can then be categorized as high feasibility-high value, low feasibility-high value, low feasibility-low value, and high feasibility-low value.

Do: Communicate findings and gain critical consensus. Get everyone committed to resolving data quality issues early in the project. Consensus can often be reached when people understand the impact to their respective operations.

Don’t: Skip communication. Skipping this critical element will only create reactionary work later.
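The prioritization matrix from step 3 can be sketched in a few lines. This example assumes each action item is scored from 1 to 5 on Business Value and Feasibility of Remediation, and that a score of 3 or more counts as "high". The scale, the threshold, the sort order, and the sample items are hypothetical choices, not a prescribed method.

```python
def quadrant(value: int, feasibility: int, threshold: int = 3) -> str:
    """Place an item in one of the four matrix quadrants."""
    f = "high feasibility" if feasibility >= threshold else "low feasibility"
    v = "high value" if value >= threshold else "low value"
    return f"{f}-{v}"


# Hypothetical action items: (description, business value, feasibility)
items = [
    ("Capture shipping dates in ERP", 5, 4),
    ("Deduplicate supplier records", 2, 5),
    ("Replace manually updated spreadsheet", 4, 2),
    ("Backfill legacy product codes", 1, 1),
]

# Assumed ordering: highest combined score first, as a starting point for discussion.
ranked = sorted(items, key=lambda item: item[1] + item[2], reverse=True)

for description, value, feasibility in ranked:
    print(f"{quadrant(value, feasibility):<32} {description}")
```

The scores themselves should come out of the consensus-building conversations described above; the code only keeps the categorization consistent once those scores are agreed.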

4.)  Put a plan in place that parallels the prioritization matrix.  Many make the mistake of starting with what’s familiar. Refer to your prioritization matrix to identify what actions can be taken early on for high impact, and plan out what should be accomplished next.  

Do: Work iteratively and stay agile. Get started quickly and work incrementally. Set the scope of work in short, manageable efforts, and don’t be afraid to reprioritize between sprints.

Don’t: Make dramatic changes in scope in the middle of a sprint. It’s important to complete work that’s been started and to show progress with each iteration. Reprioritizing between sprints is fine, but avoid changing scope within one.

5.) Finally, as with any project, ensure that you have adequate time, tools, and skills on hand to address data quality. Even when data quality projects are difficult to justify among competing high priority projects, remember the importance of data quality to the success of your business.

If you need help assessing the state of your data and analytics, let us know. When you sign up for our Analytics Assessment, we review where you are today, help you map out where you'd like to go, and provide a plan for how to get there. If this seems like too much right now, you can start with a free 30-minute Data Strategy Session.
