Data Quality Best Practices

Donna Burbank and Nigel Turner

Global Data Strategy, Ltd.
August 26th, 2021

Donna Burbank

• Recognized industry expert in information management with over 25 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture
• Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology
• Worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences
• Excellence in Data Management Award from DAMA International
• Past President and Advisor to the DAMA Rocky Mountain chapter
• Co-author of several books on data
• Regular contributor to industry publications
• She can be reached at [email protected]
• Donna is based in Boulder, Colorado, US

Follow on Twitter @donnaburbank and @GlobalDataStrat

Nigel Turner

• Worked in Information Management (IM) and related areas for over 25 years. Experience has embraced Data Governance, Information Strategy, Data Quality, Master Data Management & Business
• Spent much of his career in British Telecommunications Group (BT) where he led a series of enterprise-wide IM & data governance initiatives.
• Has also been VP of Information Management Strategy at Harte Hanks Trillium Software, and Principal Consultant at FromHereOn and IPL.
• Nigel is very active in professional Data Management organizations and is an elected Data Management Association (DAMA) UK Committee member.
• He was the joint winner of DAMA International’s 2015 Community Award for the work he initiated and led in setting up a mentoring scheme in the UK where experienced DAMA professionals coach and support newer data management professionals.
• Nigel is based in Cardiff, Wales, UK.

Follow on Twitter @NigelTurner8

What We’ll Cover Today

• Tackling data quality problems requires more than a series of tactical, one-off improvement projects.
• By their nature, many data quality problems extend
across and often beyond an organization.
• Addressing these issues requires a holistic architectural
approach combining people, process and technology.



• Discuss how to deliver data quality improvements in the Baseline & Develop
phases of the A2E methodology
• Highlight the critical role of Business Rules in improving Data Quality
• Illustrate why getting Business Rules right is critical
• Outline how to use Business Rules to correct poor data quality and sustain
improved data quality

Data Quality is Part of a Wider Data Strategy
A Successful Data Strategy links Business Goals with Technology Solutions

• “Top-Down” alignment with business priorities
• Managing the people, process, policies & culture around data
• Leveraging & managing data for strategic advantage
• Coordinating & integrating disparate data sources
• “Bottom-Up” management & inventory of data sources

Tackling Data Quality: the A2E approach

Step: Purpose

• Assess Business: Understand what data exists and how it is used within the organization
• Baseline Data Sources: Baseline the current quality of the data and assess how well it is meeting business needs
• Converge on Business Critical Areas: Focus priorities to optimise early business benefits and set ‘fit for purpose’ quality targets to guide improvement activities
• Develop Improvements: Design & deploy improvement initiatives (encompassing people, process, and technology) and measure the impact against targets
• Evaluate Benefits & ROI: Regularly measure the data and continue to improve it so that it continues to meet current and future business needs

[Figure: Cycle of Continuous Data Quality Improvement (Assess, Baseline, Converge, Develop, Evaluate)]

Data Quality Improvement: The Importance of Business Rules

“A Business Rule is a criterion used to guide day-to-day business activity, shape operational business judgments, or make operational business decisions.”
Ronald Ross, quoted in

• In a data context, business rules are used to define and enforce the standards that data must conform to
• Have a key role in assessing, baselining and improving data quality
• Can be used to:
• Cleanse and enhance existing data
• Become standards which new data must conform to
• Guide data design in new developments
• Enforce data standards in existing applications and platforms
• Stop poor quality data being entered at source, e.g. via drop down lists, screen entry validation etc.

How Do You Classify Business Rules?

• Many different ways to classify business rules – can be very complex

• A simple classification is:


Format rules: Specify the format standards data should comply with

Include:
• Field length (fixed, variable etc.)
• Character format (e.g. Alphabetic, Numeric, Alphanumeric etc.)

Content rules: Specify the allowable content of records or fields

Include:
• Allowable values
• Whether mandatory or optional
• Relationships with other fields or records

Example Data Related Business Rules


• A UK National Insurance Number must be in the format: aa nn nn nn a

• An employee must have a unique Employee ID in the format: aa nnnn
• Date of birth should be in North American format of MM/DD/YYYY
• A full US zip code must be in the format nnnnn-nnnn
• Internet router identifier must be in the format Aaa_Nan_Naa
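
As a rough illustration, format rules like those above can be expressed as regular expressions. The sketch below assumes that ‘a’ stands for one letter and ‘n’ for one digit; the rule names and the `conforms` helper are hypothetical, and the router identifier is left out because its notation is not defined here. These checks cover format only: a date such as 02/31/2020 would still pass.

```python
import re

# Assumed reading of the slide notation: 'a' = one letter, 'n' = one digit.
# Rule names are hypothetical labels chosen for this sketch.
FORMAT_RULES = {
    # aa nn nn nn a
    "uk_ni_number": re.compile(r"[A-Za-z]{2} \d{2} \d{2} \d{2} [A-Za-z]"),
    # aa nnnn
    "employee_id": re.compile(r"[A-Za-z]{2} \d{4}"),
    # MM/DD/YYYY (month 01-12, day 01-31, four-digit year)
    "date_of_birth": re.compile(r"(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}"),
    # nnnnn-nnnn
    "us_zip_full": re.compile(r"\d{5}-\d{4}"),
}


def conforms(rule_name: str, value: str) -> bool:
    """Return True if the whole value matches the named format rule."""
    return FORMAT_RULES[rule_name].fullmatch(value) is not None


print(conforms("employee_id", "AB 1234"))       # well-formed Employee ID
print(conforms("date_of_birth", "30/12/1969"))  # day and month swapped
```

Uniqueness of the Employee ID is a content rule rather than a format rule, so it would need a separate check across all records.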


Example Data Related Business Rules

• Every Sales Representative must be assigned to one and only one Sales Region
• A valid email address must be entered by a customer to enable a customer’s
order to be accepted
• Gender codes must have the valid value of Male, Female or Unknown
• A supplier must have at least one associated geographical address
• Product Price should be Product Unit Cost + 25%
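
Content rules such as these can also be written as small checks. The following is a minimal sketch, not a reference implementation: the function names, the rounding tolerance on price, and the shape of the input data are all assumptions made for illustration.

```python
from math import isclose

ALLOWED_GENDERS = {"Male", "Female", "Unknown"}
MARKUP = 0.25  # "Product Unit Cost + 25%"


def gender_is_valid(code: str) -> bool:
    """Allowable-values rule: the code must be one of the permitted values."""
    return code in ALLOWED_GENDERS


def price_follows_markup(price: float, unit_cost: float) -> bool:
    """Cross-field rule: price should equal unit cost plus 25%.

    The half-cent tolerance is an assumption to absorb currency rounding.
    """
    return isclose(price, unit_cost * (1 + MARKUP), abs_tol=0.005)


def reps_violating_single_region(reps, assignments):
    """Relationship rule: every rep must have one and only one region.

    reps: iterable of rep identifiers; assignments: (rep, region) pairs.
    Returns the reps with zero regions or with more than one region.
    """
    regions = {rep: set() for rep in reps}
    for rep, region in assignments:
        regions.setdefault(rep, set()).add(region)
    return {rep for rep, assigned in regions.items() if len(assigned) != 1}


violations = reps_violating_single_region(
    ["ann", "bob", "cy"],
    [("ann", "North"), ("bob", "North"), ("bob", "South")],
)
print(sorted(violations))  # bob has two regions, cy has none
```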

How Do You Identify Business Rules?
• Business rules can be discovered or derived from:
• Data models (Business / Logical / Physical)
• Business documentation (e.g. Process Descriptions, User Instructions)
• IT Documentation (e.g. requirements specifications, system manuals)
• Source code (e.g. ‘If A Then B’ statements)
• Master and / or Reference Data Sources (e.g. currency codes, product
master data)
• Documented metadata (e.g. Business Glossaries, Data Dictionaries,
Metadata Repositories)
• Data profiling outputs
• Talking to key stakeholders:
• Data owners and data stewards (if in place)
• Data producers and consumers
• Other business and IT subject matter experts

VITAL IMPORTANCE OF STAKEHOLDER ENGAGEMENT:
• Business rules are frequently implicit (i.e. locked in people’s heads) and not formally documented
• Where business rules are documented, documentation is often out of date and not updated in line with system changes

Data Models Describe the Organization

• Relationships define the data-centric Business Rules of an organization

• You should be able to “read” a data model like a sentence
• The Entities / Concepts are the “nouns” – the boxes on a data model
• It’s often helpful to start by taking some text describing the organization (or transcripts
from stakeholder interviews) and draw boxes around the nouns to find the core entities
• An employee can work for more than one department.
• A customer can have more than one account.
• A department can contain more than one employee.

Deriving Business Rules: Business Data Model
• Communication & definition of core data concepts & their definitions
• A business data model provides core definitions of key data
• It also shows key relationships between data
• Even a simple diagram such as the one on the right can tell a powerful “story” ... and uncover key business rules

[Figure: business data model. Legible annotations: “BUSINESS RULE: An EMPLOYEE must be on the active payroll”; “current or former client who must have had an account active within the last 6 months”; “contain 1 or more customers with an”]

Why Do Business Rules Matter? DQ ‘Short’comings
• Liam Thorp made headline news in the UK in Feb 2021
• Received a priority invite for a Covid-19 vaccination because he was medically classed as ‘morbidly obese’
• The reason – his local health board had recorded his height as 6.2 centimetres and not his real height of 6 feet 2 inches
• This made his Body Mass Index (BMI) 28,000, calculated from his weight and height (weight in kg divided by the square of height in metres)
• A BMI of 40 and above is classed as ‘morbidly obese’
• Now corrected, and he was put back in his rightful place in the vaccine queue

[Images: Beatles statue, City of Liverpool; Liam Thorp, 32 years old, Liverpool resident]

“I can see the funny side of this story but also recognise there is an important issue for us to address”
Chair of the Liverpool Clinical Commissioning Group (leading the city’s vaccine roll out)

KEY PROBLEM - ABSENCE OF BUSINESS RULES TO SPECIFY:
• Minimum Height
• Maximum BMI
(Content)
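
To make the missing rules concrete, here is a minimal sketch of a minimum-height and maximum-BMI check, using the standard BMI formula (weight in kg divided by height in metres squared). The threshold values, the function names, and the 100 kg weight in the usage lines are hypothetical; real limits would need to be agreed with the business and clinical owners of the data.

```python
# Hypothetical plausibility limits chosen for this sketch only.
MIN_HEIGHT_CM = 50.0
MAX_HEIGHT_CM = 250.0
MAX_PLAUSIBLE_BMI = 100.0


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body Mass Index: weight in kg divided by height in metres squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def check_measurements(weight_kg: float, height_cm: float) -> list[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []
    if not MIN_HEIGHT_CM <= height_cm <= MAX_HEIGHT_CM:
        problems.append(f"height {height_cm} cm outside {MIN_HEIGHT_CM}-{MAX_HEIGHT_CM} cm")
        return problems  # BMI is not meaningful when the height is implausible
    value = bmi(weight_kg, height_cm)
    if value > MAX_PLAUSIBLE_BMI:
        problems.append(f"BMI {value:.1f} above {MAX_PLAUSIBLE_BMI}")
    return problems


# 6 feet 2 inches is 74 inches, i.e. 74 * 2.54 = 187.96 cm.
print(check_measurements(100.0, 6.2))     # height recorded as 6.2 cm: rejected
print(check_measurements(100.0, 187.96))  # plausible height: no violations
```

Either rule alone would have flagged the record before it influenced the vaccination priority list.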
Why Do Business Rules Matter? ‘Miss’ing weight

• UK Air Accidents Investigation Branch (AAIB) report (April 2021) declared a ‘Serious Incident’ at Birmingham airport, UK
• Report highlighted that 3 flights to Europe in July 2020 had taken off with the weight of the plane load underestimated by an average 1,200kg
• This miscalculation could have caused a ‘serious incident’ on take off as it determines take off speed, thrust etc.
• Problem happened because all passengers with the title ‘Miss’ were automatically assumed by outsourced IT suppliers to be children and not adults
• A child’s standard estimated weight is 35kg; an adult 69kg
• The airline described it as ‘a simple flaw in its IT system’
• In reality, there was a serious problem with its business rules!
• The airline has now introduced manual validation of all passengers at check in to ensure adults titled ‘Miss’ are changed to ‘Ms’ on the passenger roster (?)

KEY PROBLEMS:
• Reliance on IT, and not the business, to specify the business rules
• Making cultural assumptions that were incorrect
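
The effect of the faulty rule can be shown with a small sketch that compares a title-based weight estimate with an age-based one. The standard weights come from the report figures quoted above; the age threshold, the function names, and the passenger list are hypothetical, and this is not a description of the airline’s actual system.

```python
CHILD_WEIGHT_KG = 35  # standard estimated weight of a child
ADULT_WEIGHT_KG = 69  # standard estimated weight of an adult
CHILD_MAX_AGE = 11    # hypothetical threshold for this sketch


def weight_by_title(title: str) -> int:
    """Faulty rule: everyone titled 'Miss' is assumed to be a child."""
    return CHILD_WEIGHT_KG if title == "Miss" else ADULT_WEIGHT_KG


def weight_by_age(age: int) -> int:
    """Alternative rule: classify by age, a fact about the passenger, not a title."""
    return CHILD_WEIGHT_KG if age <= CHILD_MAX_AGE else ADULT_WEIGHT_KG


# Hypothetical passengers as (title, age) pairs.
passengers = [("Miss", 34), ("Miss", 8), ("Mr", 40), ("Ms", 29)]

title_total = sum(weight_by_title(title) for title, _ in passengers)
age_total = sum(weight_by_age(age) for _, age in passengers)
shortfall = age_total - title_total

print(title_total, age_total, shortfall)
```

Each adult wrongly treated as a child lowers the estimate by 69 - 35 = 34 kg, which is how an assumption about a title accumulates into a load-sheet error.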

Four Step Process: Using Business Rules for Data Quality Improvement


[Figure: four-step cycle centred on DATA QUALITY. Legible labels: “priority DQ problems”, “Monitor & ... adherence to Business”]

Step 1: Quantifying Data Problems - The Value of Data Profiling

• The benefits of data profiling include:

• Checks conformance of the dataset with
business rules
• Enables fact-based discussion of the causes and
impacts of data problems
• Great starting point for Data Quality
improvement workshops
• Automatic generation of metadata
• Supports both data quality focus &
improvement and metadata capture
• Data profiling tools automate the process
of assessing and reporting on the quality
of data sources
• Data profiling can also be done via SQL,
without purchasing a tool
Example partial Data Profiling report
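
As a hedged illustration of profiling with plain SQL, the sketch below loads a handful of HR-style records into an in-memory SQLite database and runs three common profiling queries: completeness, value distribution, and duplicate keys. The table name, column names, and the assignment of values to columns are assumptions made for this example.

```python
import sqlite3

# Hypothetical column layout: emp_no, surname, first_name, gender, dob, role_code.
rows = [
    ("802540", "Smith", "Brian", "Female", "31/01/56", "PM16"),
    ("YN4176B", "Gregg", None, "Male", "07/09/80", "9999"),
    ("811609", "Patel", "Priya", "XXXX", "25/12/78", "AL60"),
    ("22298", "Bothroyd", "Bridget", "Female", "28/08/09", "TBD"),
    ("802540", "Smith", "Bryan", "Male", "31/01/56", "PM10"),
    ("855265", "Hayes", "Leslie", "Female", "00/00/00", "AL76"),
    (None, "Taylor", "Kevin", "Unknown", "12/30/69", "US18"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(emp_no TEXT, surname TEXT, first_name TEXT, gender TEXT, dob TEXT, role_code TEXT)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)", rows)

# Completeness: how many values are missing in each column of interest?
null_counts = conn.execute(
    "SELECT SUM(emp_no IS NULL), SUM(first_name IS NULL) FROM employee"
).fetchone()

# Value distribution: which gender codes occur, and how often?
gender_counts = conn.execute(
    "SELECT gender, COUNT(*) FROM employee GROUP BY gender ORDER BY gender"
).fetchall()

# Uniqueness: which employee numbers appear more than once?
duplicate_emp_nos = conn.execute(
    "SELECT emp_no, COUNT(*) FROM employee "
    "WHERE emp_no IS NOT NULL GROUP BY emp_no HAVING COUNT(*) > 1"
).fetchall()

print(null_counts)
print(gender_counts)
print(duplicate_emp_nos)
```

The same three queries run unchanged against most relational databases, which makes them a low-cost starting point before investing in a profiling tool.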

Step 1: An Alternative Approach to Quantifying Data Problems

“Only 3% of Companies’ Data Meets Basic Quality Standards”
Tadhg Nagle, Thomas C. Redman & David Sammon
Harvard Business Review, September 11 2017

Step 1: Data Profiling & Potential Data Quality Problem Identification


Employee No.  Surname   First Name  Gender   Date of Birth  Role code
802540        Smith     Brian       Female   31/01/56       PM16
YN4176B       Gregg                 Male     07/09/80       9999
811609        Patel     Priya       XXXX     25/12/78       AL60
22298         Bothroyd  Bridget     Female   28/08/09       TBD
802540        Smith     Bryan       Male     31/01/56       PM10
855265        Hayes     Leslie      Female   00/00/00       AL76
              Taylor    Kevin       Unknown  12/30/69       US18

Note: Records extracted and anonymized from an actual HR database

Step 1: Data Profiling & Potential DQ Problem Identification


Employee No.  Surname   First Name  Gender   Date of Birth  Role code
802540        Smith     Brian       Female   31/01/56       PM16
YN4176B       Gregg                 Male     07/09/80       9999
811609        Patel     Priya       XXXX     25/12/78       AL60
22298         Bothroyd  Bridget     Female   28/08/09       TBD
802540        Smith     Bryan       Male     31/01/56       PM10
855265        Hayes     Leslie      Female   00/00/00       AL76
              Taylor    Kevin       Unknown  12/30/69       US18

Key: Data Quality / Duplicate

ANSWER: Total number of potential Data Quality problems is 13 or 19, depending on whether Smith is a duplicate record

Step 2: Business Review & Validation
• Data profiling findings should be reviewed by appropriate business & IT
• If formal Data Governance is in place, this should ideally be led by the Data Stewards responsible for the specific data domains
• Aim to reach consensus on what the business impact is
• Ways of doing this:
• Workshops and / or meetings (virtual or F2F)
• By workflows, seeking views on the potential problem areas
• For priority areas, agree Business Rules which should be in place to drive and
enforce data quality improvement
• Create and deploy Business Rules
• Test rules first in case of unforeseen downstream impacts
• Embed in appropriate operational systems or Data Quality Rules Engine (see later)

Step 3: Using Business Rules to steer and enforce Data Quality standards

Example potential format business rules:
• Employee No. must be in format nnnnnn. Blank Employee Numbers are allowed if new starter awaiting Emp. No. allocation
• First Name must not be blank
• Role code must be in format AAnn
• Date of Birth must be in format nn/nn/nn

Example potential content business rules:
• Gender should align with First Name derived from Common Names Reference file
• Allowable Genders are FEMALE,
• Date of Birth must be expressed as DD/MM/YY and in the range 01/01/1940 to 12/12/2005
• Employee No. should be unique. Only one Emp. No. should be allocated to any individual
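
One possible way to apply rules like these is a small rule table run against each record. The sketch below is illustrative only: the field names, the rule labels, and the century pivot used to expand two-digit years (40-99 read as 19yy, 00-39 as 20yy) are assumptions, and a blank Employee No. is accepted without checking whether the person really is a new starter.

```python
import re
from datetime import date


def emp_no_format_ok(value: str) -> bool:
    """nnnnnn, or blank (assumed to mean a new starter awaiting allocation)."""
    return value == "" or re.fullmatch(r"\d{6}", value) is not None


def first_name_present(value: str) -> bool:
    return value.strip() != ""


def role_code_format_ok(value: str) -> bool:
    """AAnn: two upper-case letters followed by two digits (assumed reading)."""
    return re.fullmatch(r"[A-Z]{2}\d{2}", value) is not None


def dob_format_ok(value: str) -> bool:
    return re.fullmatch(r"\d{2}/\d{2}/\d{2}", value) is not None


def dob_in_range(value: str) -> bool:
    """DD/MM/YY, a real calendar date, between 01/01/1940 and 12/12/2005."""
    if not dob_format_ok(value):
        return False
    day, month, yy = (int(part) for part in value.split("/"))
    year = 1900 + yy if yy >= 40 else 2000 + yy  # assumed century pivot
    try:
        dob = date(year, month, day)
    except ValueError:  # e.g. month 30 or day 00
        return False
    return date(1940, 1, 1) <= dob <= date(2005, 12, 12)


# (rule label, field it applies to, check function)
RULES = [
    ("Employee No. format", "emp_no", emp_no_format_ok),
    ("First Name not blank", "first_name", first_name_present),
    ("Role code format", "role_code", role_code_format_ok),
    ("Date of Birth format", "dob", dob_format_ok),
    ("Date of Birth range", "dob", dob_in_range),
]


def failed_rules(record: dict) -> list[str]:
    """Return the labels of every rule the record breaks."""
    return [label for label, field, check in RULES if not check(record[field])]


record = {"emp_no": "YN4176B", "first_name": "", "dob": "07/09/80", "role_code": "9999"}
print(failed_rules(record))
```

Keeping the rules in a table rather than scattered through application code makes it easier to review them with data stewards and to amend them later.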

Step 3: Deploying Business Rules - Approaches

• Data Entry Guidelines, Business Glossary & Training
• Master & Reference Data Management
• Application Code (e.g. data input validation)
• Data Quality Tool: DQ Business Rules Engine

Step 3: Automating Data Quality Business Rules via a DQ Rules Engine

Real Time Data Validation



Step 4: Monitor & Report Adherence

• Once implemented, Business Rules can be used to:
• Check continued adherence of existing data
• Enforce the rules on new data to prevent new problems
• Best monitored via Data Quality Dashboards
• Provide regular reports on adherence of data to Business Rules
• Set KPIs to drive continuous data improvement
• Identify data quality trends
• Highlight areas where corrective action is required
• Indicate where / if Business Rules may need to be amended to meet changing business needs
• When reporting always try to relate data quality to business
• Address the ‘so what’ objection
• Puts a financial or other benefit on continued data quality

[Figure: Data Quality Dashboard]
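
A dashboard figure for rule adherence can be as simple as the percentage of values that pass a rule, compared against a target. The sketch below shows one way to compute that; the function names, the sample values, and the 95% target are hypothetical.

```python
from typing import Callable, Iterable


def adherence_pct(values: Iterable[str], rule: Callable[[str], bool]) -> float:
    """Percentage of values that satisfy the rule."""
    values = list(values)
    if not values:
        raise ValueError("no values to measure")
    passed = sum(1 for value in values if rule(value))
    return 100.0 * passed / len(values)


def kpi_status(pct: float, target_pct: float) -> str:
    """Compare measured adherence with the KPI target."""
    return "on target" if pct >= target_pct else "corrective action required"


allowed_genders = {"Male", "Female", "Unknown"}
sample = ["Male", "Female", "XXXX", "Unknown"]  # hypothetical field values

pct = adherence_pct(sample, lambda value: value in allowed_genders)
print(pct, kpi_status(pct, target_pct=95.0))
```

Recording this figure at regular intervals gives the trend line, and attaching a cost to each failing record is one way to answer the ‘so what’ objection.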

• Business Rules are key to uncovering data quality problems and driving data quality improvement
• Business Rules can be explicit or implicit so have to be discovered and created in a variety of ways
• Follow the simple 4 Step process outlined to ensure you optimize the value of Business Rules in your data quality
• Remember that Business Rules are not set in stone and need to be monitored and amended in line with changing organizational needs and requirements
• With data quality the business always ultimately rules, so Business Rules provide the means to enable this

Who We Are: Business-Focused Data Strategy
Maximize the Organizational Value of Your Data Investment

In today’s business environment, showing rapid time to value for any technical investment is critical.

But technology and data can be complex. At Global Data Strategy, we help demystify technical complexity to help you:

• Demonstrate the ROI and business value of data to your
• Build a data strategy at your pace to match your unique culture and organizational style.
• Create an actionable roadmap for “quick wins”, while building towards a long-term scalable architecture.

Global Data Strategy shares experience from some of the largest international organizations scaled to the pace of your unique team.

Global Data Strategy has worked with organizations globally in the following industries:
Finance · Retail · Social Services · Health Care · Education · Manufacturing · Government · Public Utilities · Construction · Media & Entertainment · Insurance ... and more
Thoughts? Ideas?

