
Data quality: there are 6 dimensions on the basis of which we validate the data.

For example, a gender field should hold male/female; if a null is there, the record is wrong.


1. Completeness: 100 source records should give 100 target records, but there is still the possibility that null values come through, or that a number ends up in the email field.
IDQ: historical data; inconsistent data is not to be added.
Country field: India, IN, IND all signify a single country.
select * from table where ctry ='IN'
Statistical measures, according to the data.
2. Accuracy: decimal points were expected but are not coming through ... uniqueness
3. Consistency: data should be consistent all across the data set.
reports
4. Integrity: the relationship link is missing. An order has been placed by a customer, but say the customer id in the order table is null.
5. Validity: a date was entered, e.g. creation date, but the date is not valid.
So the record should be correct, i.e. the date should be current/future.
6. Uniqueness
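The checks above can be sketched in plain Python. This is only an illustration, not how IDQ implements profiling: the sample rows, the field names and the `check_dimensions` helper are all hypothetical.

```python
from datetime import date

# Hypothetical sample rows; field names are illustrative only.
customers = [
    {"customer_id": 1, "gender": "male", "email": "a@example.com", "country": "IN"},
    {"customer_id": 2, "gender": None, "email": "9876543210", "country": "IND"},
    {"customer_id": 2, "gender": "female", "email": "c@example.com", "country": "India"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "creation_date": "2024-02-30"},
    {"order_id": 11, "customer_id": None, "creation_date": "2024-03-15"},
]


def is_valid_date(text):
    """Validity: True only if the text is a real calendar date (YYYY-MM-DD)."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def check_dimensions(customers, orders):
    """Return the offending rows or values for each dimension."""
    known_ids = {c["customer_id"] for c in customers}
    seen, duplicates = set(), set()
    for c in customers:
        if c["customer_id"] in seen:
            duplicates.add(c["customer_id"])
        seen.add(c["customer_id"])
    return {
        # Completeness: row positions where gender is null.
        "completeness": [i for i, c in enumerate(customers) if c["gender"] is None],
        # Validity: an email without "@" (e.g. a phone number) is not valid.
        "validity_email": [i for i, c in enumerate(customers) if "@" not in (c["email"] or "")],
        "validity_date": [o["order_id"] for o in orders if not is_valid_date(o["creation_date"])],
        # Integrity: the order must link to a known customer.
        "integrity": [o["order_id"] for o in orders if o["customer_id"] not in known_ids],
        # Uniqueness: customer ids that occur more than once.
        "uniqueness": sorted(duplicates),
        # Consistency: more than one spelling here means the field is inconsistent.
        "consistency": sorted({c["country"] for c in customers}),
    }


report = check_dimensions(customers, orders)
```

Accuracy is left out because it needs a trusted source to compare against, which this sample does not have.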
Informatica Analyst: web console, opened through a web-based URL. Different projects (shared/unshared), folders. Take a data object; no different source/target.
When you import, it will ask source/target/lookup. If you want to know about the data set:
create a new profile: column-based profiling / quick profiling.
If only 2-3 columns are needed, select them and run.
Informatica Developer:
Profiling rule, master rule: total number of records / valid / invalid.
Scorecard: graphical representation of valid records and invalid records.
Options: create scorecard / existing; threshold option.
Threshold: in which range do we call the records excellent, and in which range is it acceptable / not acceptable.
Depends on the criticality of the data, or on the client.
We usually look at the pass values, not the failed ones.
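A minimal sketch of the scorecard idea: compute the pass percentage and grade it against thresholds. The `score` and `grade` helpers and the 80/95 cut-offs are assumptions for illustration; real thresholds depend on the criticality of the data or on the client.

```python
def score(valid: int, total: int) -> float:
    """Percentage of records that passed the rule."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * valid / total


def grade(pct: float, acceptable: float = 80.0, excellent: float = 95.0) -> str:
    """Map a pass percentage to a band; the cut-offs are hypothetical."""
    if pct >= excellent:
        return "excellent"
    if pct >= acceptable:
        return "acceptable"
    return "not acceptable"


# Example: 97 of 100 records passed the rule.
result = grade(score(97, 100))
```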
groups can be created based
Then, looking at the data (data stewardship), set new rules and give them to the developer for cleansing.
Data reference table created in Data Analyst: has all the possible values, valid/invalid. Less likely there;
mostly it is done in the Developer.
Bidirectional: Data Analyst/Developer.
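How a reference table drives standardization can be sketched as a lookup from every accepted variant to one valid value. The `REFERENCE` contents and the helper names are hypothetical; an actual IDQ reference table is maintained in the tool, not in code.

```python
# Hypothetical reference table: valid value -> accepted variants.
REFERENCE = {
    "IN": ["IN", "IND", "INDIA"],
}


def build_lookup(reference):
    """Flatten the table into variant -> valid value (upper-cased)."""
    return {
        variant.upper(): valid
        for valid, variants in reference.items()
        for variant in variants
    }


def standardize(value, lookup):
    """Return the valid value, or None when the input is not in the table."""
    if value is None:
        return None
    return lookup.get(value.strip().upper())


lookup = build_lookup(REFERENCE)
cleaned = [standardize(v, lookup) for v in ["India", " ind ", "IN", "XX"]]
```

Values that come back as None are the ones to treat as invalid and hand over for cleansing.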
MRS (Model Repository Service): data relevant to mappings and workflows.
DIS (Data Integration Service): takes care of data flow from one to another.
Analyst Service: needed for the Data Analyst; otherwise the Analyst cannot be accessed.
Content Management Service: Address Validator / AddressDoctor. Informatica has address reference data for 240 countries; this reference data is provided by Informatica.
Pincode / geocode: if we check
Identity transformation: 60 countries
Rules: validate as mapplet: for cleansing
validate as rule: as a validation

Profile: new rule, then apply rule


if we want to
Mapplet with an active transformation: cannot be used as a rule.
Standardizer transformation: standardizes data. New strategy: 4 options: replace with
Match analysis, e.g.:
Ravi Verma
Ravi
Verma
some have address
phone
pincode
Sequence: data object -> Key Generator -> Match transformation -> Association transformation -> Consolidation transformation (golden record)
Unique key, group key. Same-sounding (pronunciation) strategy in Key Generator: Soundex-based, considers only one vowel
character/string
NYSIIS: counts all vowels
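To show what a sound-based group key looks like, here is a sketch of the standard American Soundex code in Python. It is not Informatica's implementation, and the `soundex` and `group_by_key` helpers are hypothetical names; it only illustrates why similar-sounding names land in the same group.

```python
from collections import defaultdict

# Consonant classes of standard American Soundex.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """First letter plus three digits for the consonants that follow."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two equal codes
        digit = _CODES.get(ch, "")  # vowels have no code
        if digit and digit != prev:
            result.append(digit)
        prev = digit  # a vowel resets prev, so a repeated code is kept
    return ("".join(result) + "000")[:4]


def group_by_key(names):
    """Group names by their Soundex code, as a group key would."""
    groups = defaultdict(list)
    for name in names:
        groups[soundex(name)].append(name)
    return dict(groups)


groups = group_by_key(["Verma", "Varma", "Ravi"])
```

Only records that share a group key are compared with each other in the match step, which keeps the number of comparisons down.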
Cluster ID in the Match transformation
IsSurvivor port: Y/N values
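A rough sketch of the consolidation step: within each cluster one record is flagged as the survivor. The rule used here (the record with the most populated fields wins, first one on a tie) is an assumption for illustration, as are the field names and the `mark_survivors` helper; the actual strategy is configured in the transformation.

```python
def filled(rec):
    """Count populated fields, ignoring the cluster id itself."""
    return sum(1 for key, value in rec.items()
               if key != "cluster_id" and value not in (None, ""))


def mark_survivors(records):
    """Add is_survivor = "Y"/"N" per record; one "Y" per cluster."""
    best = {}  # cluster_id -> index of the current survivor
    for idx, rec in enumerate(records):
        cid = rec["cluster_id"]
        if cid not in best or filled(rec) > filled(records[best[cid]]):
            best[cid] = idx
    return [
        dict(rec, is_survivor="Y" if best[rec["cluster_id"]] == idx else "N")
        for idx, rec in enumerate(records)
    ]


# Hypothetical matched records: cluster 1 holds two versions of one person.
matched = [
    {"cluster_id": 1, "name": "Ravi Verma", "phone": None, "pincode": "110001"},
    {"cluster_id": 1, "name": "Ravi", "phone": "99999", "pincode": "110001"},
    {"cluster_id": 2, "name": "Verma", "phone": None, "pincode": None},
]
flags = [r["is_survivor"] for r in mark_survivors(matched)]
```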
