Before we go too deep into Data Quality, we need a shared vocabulary and a sense of the breadth and depth of the concepts involved. The brief definitions below cover the topics that any data quality effort needs to address. This is by no means an exhaustive list of Data Quality terms, and the terms are not fully defined here. The definitions are just enough to get you started.
Completeness - Do you have a complete record or dataset?
Accuracy - Does the data match the real world?
Consistency - Is the data consistently cleansed and formatted? (Chris, CHRIS, chris, cHrIS)
Validity - Does the data represent real-world expected values? (ZIP Codes are ZIP Codes, countries are countries, etc.)
Uniqueness - Is each record suitably unique? (Each business key should appear only once in a Type 1 dimension.)
Integrity - As data moves through a system, are you losing or duplicating records?
Accessibility - Do the right people have access to the correct data in the right way? (Business users via reports and dashboards, analysts via query tools, data engineers via sources, transformations, and errors)
Timeliness - Do you have the data within the acceptable SLA? (Real-time/near real-time, micro-batch, batch)
Relevance - Is the data relevant to the business needs?
Actionable - Can the business act on the data, and is the data recommending the right actions?
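To make a few of these definitions concrete, here is a minimal Python sketch of how Completeness, Consistency, Validity, and Uniqueness might be checked. Everything in it is an assumption made for illustration: the sample records, the field names (id, name, zip, country), the US-style ZIP pattern, and the helper function names are all hypothetical, not part of any particular tool or framework.

```python
import re

# Hypothetical sample records; field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Chris", "zip": "30301", "country": "US"},
    {"id": 2, "name": "CHRIS", "zip": "3030", "country": "US"},
    {"id": 2, "name": "dana", "zip": "98101", "country": None},
]

REQUIRED_FIELDS = ("id", "name", "zip", "country")
# Assumes US ZIP Codes: five digits, optionally followed by -NNNN.
ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")


def completeness(rows):
    """Completeness: fraction of rows with every required field populated."""
    complete = sum(
        all(row.get(field) not in (None, "") for field in REQUIRED_FIELDS)
        for row in rows
    )
    return complete / len(rows)


def normalize_name(name):
    """Consistency: trim whitespace and apply one casing rule to every name."""
    return name.strip().title()


def invalid_zips(rows):
    """Validity: ids of rows whose zip does not look like a ZIP Code."""
    return [row["id"] for row in rows if not ZIP_PATTERN.match(row.get("zip") or "")]


def duplicate_keys(rows, key="id"):
    """Uniqueness: business keys that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)


if __name__ == "__main__":
    print("completeness:", completeness(records))
    print("normalized names:", [normalize_name(r["name"]) for r in records])
    print("invalid zips:", invalid_zips(records))
    print("duplicate keys:", duplicate_keys(records))
```

Checks like these only cover the dimensions that can be measured from the data itself. Accuracy, Relevance, and Actionable usually need a trusted reference source or input from the business.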
DATA QUALITY BLOG SERIES
A new Data Quality blog post will be released each day at 8:45 AM.
DATA QUALITY - Part 1 January 6th
DATA QUALITY CONCEPTS - Part 2 January 7th
DATA QUALITY FOR EVERYONE - Part 3 January 10th
DATA QUALITY FRAMEWORK - Part 4 January 11th
DATA QUALITY DEVELOPMENT - Part 5 January 12th
QUALITY DATA - Part 6 January 13th