&
QUALITY ISSUES
Information Search and Analysis Skills
Venue : NIIT Ltd, Agra.
Semester: 4
Credits:
Amol Shrivastav
Mohit Bhaduria
Harsha Rajwanshi
Guidance & support
Gunjan Verma
Contents
Introduction
Measuring Data Quality
Tools for Data Quality
Data Quality Methodology
ETL
Section 1
By
Amol Shrivastav
A producer wants to know…
Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
What is the most effective distribution channel?
[Barry Devlin]
Data Flow
Section 2
By
Mohit Bhaduria
Measuring Data Quality
Attributes for measuring Data Quality
Interpretability: Do I know what the fields mean? Do I know when the data I'm using was last updated?
Usefulness: Is the data relevant for my needs? Is the data current?
Am I missing too much data? Are there strong biases? Is the data quality consistent?
Do the people who need access to the data have the proper access? Is the system crashing or too slow?
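Two of the questions above, whether too much data is missing and when the data was last updated, can be turned into simple measurements. The sketch below is only an illustration: the sample rows, the field names (customer, margin, updated) and the helper functions are assumptions made for the example, not part of the original material.

```python
from datetime import date

# Hypothetical sample rows; None marks a missing value.
rows = [
    {"customer": "A", "margin": 0.12, "updated": date(2024, 1, 10)},
    {"customer": "B", "margin": None, "updated": date(2024, 1, 12)},
    {"customer": None, "margin": 0.30, "updated": date(2023, 11, 1)},
    {"customer": "D", "margin": 0.05, "updated": date(2024, 1, 15)},
]


def missing_ratio(rows, field):
    """Share of rows in which `field` is missing (None)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(field) is None) / len(rows)


def days_since_last_update(rows, today, field="updated"):
    """Age in days of the most recent update; None if no timestamps exist."""
    stamps = [r[field] for r in rows if r.get(field) is not None]
    if not stamps:
        return None
    return (today - max(stamps)).days


# "Am I missing too much data?" expressed as completeness per field.
completeness = {f: 1.0 - missing_ratio(rows, f) for f in ("customer", "margin")}

# "When was the data last updated?" expressed as staleness in days.
staleness = days_since_last_update(rows, today=date(2024, 1, 20))
```

What counts as "too much" missing data or "too old" is a threshold that each warehouse team would have to set for itself.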
Linking Quality Factors to DW quality
[Figure: quality factors linked to data warehouse quality. Recoverable labels: DW design, update policy, data sources, models and language, DW evolution, query processing, DW process.]
Quality metamodel
The quality metamodel can be used for both design and analysis purposes. The DWQ quality metamodel is based on the Goal-Question-Metric approach.
DWQM is a continuous process in the life of the DW.
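The Goal-Question-Metric idea can be pictured as a small hierarchy: a goal is refined into questions, and each question is answered by one or more metrics. The sketch below is a minimal illustration of that hierarchy only. The class names, the threshold rule and the sample values are assumptions, and it does not reproduce the actual DWQ metamodel.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Metric:
    name: str
    measure: Callable[[], float]  # returns the currently measured value
    threshold: float              # minimum acceptable value (assumed rule)

    def satisfied(self) -> bool:
        return self.measure() >= self.threshold


@dataclass
class Question:
    text: str
    metrics: List[Metric] = field(default_factory=list)

    def answered_positively(self) -> bool:
        # A question is satisfied only if all of its metrics are.
        return all(m.satisfied() for m in self.metrics)


@dataclass
class Goal:
    purpose: str
    questions: List[Question] = field(default_factory=list)

    def achieved(self) -> bool:
        # A goal is achieved only if all of its questions are satisfied.
        return all(q.answered_positively() for q in self.questions)


# Hypothetical goal with made-up measurements.
goal = Goal(
    "Assess the quality of customer data",
    [
        Question("Am I missing too much data?",
                 [Metric("completeness", lambda: 0.97, 0.95)]),
        Question("Is the data current?",
                 [Metric("freshness", lambda: 0.80, 0.90)]),
    ],
)
```

Because the metrics are callables, the same goal can be re-evaluated each time the warehouse is refreshed, which fits the idea of quality management as a continuous process.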
Section 3
By
Harsha Rajwanshi
Tools for Data Warehouse Quality
Tools for Data quality
The tools that may be used to extract/transform/clean the source data or to measure/control the quality of the inserted data can be grouped in the following categories:
Data cleansing tools are used in the intermediate staging area. The data cleansing tools
contain features which perform the following functions:
A data migration tool is responsible for converting data from one platform to another.
Cleansing
Data Augmentation
Profiling & Assessment
There are many different techniques and processes for data profiling. They can be grouped into three major categories: