Data quality: there are 6 dimensions on the basis of which we validate the data.
Example: a gender field should hold male/female; if a null is there, the record is wrong.
1. Completeness: 100 source records should give 100 target records. Even then, null values can come through, or a number can land in the email field. IDQ, historical data: inconsistent data is not to be added. Example: the country field holds India, IN, IND, which all signify a single country, so select * from table where ctry = 'IN' misses the other spellings. Statistical measures are chosen according to the data.
2. Accuracy: decimal points were expected but are not coming through.
3. Consistency: data should be consistent all across the data set and in reports.
4. Integrity: a relationship link is missing. Example: an order was placed by a customer, but the customer id is null in the order table.
5. Validity: a date was entered (creation date) but the date is not valid. The record should be correct, e.g. the date should be current/future.
6. Uniqueness.

Informatica Analyst: web console, opened through a web-based URL. Different projects (shared/unshared), folders. Take a data object; there is no separate source/target here, whereas on import it asks source/target/lookup. If you want to know about the data set: create a new profile (column-based profiling / quick profiling). If only 2-3 columns are needed, select them and run.

Informatica Developer: profiling rule. Master rule: total number of records, valid, invalid.
Scorecard: graphical representation of valid records and invalid records. Option: create a scorecard or add to an existing one. Threshold option: in which range we call a record excellent, and in which range it is acceptable/not acceptable. This depends on the criticality of the data, or on the client. We usually look at the pass value, not the failed ones. Groups can be created.
After looking at the data: data stewardship, set new rules, give them to the developer.
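The master rule and scorecard threshold idea above can be sketched in plain Python. This is only an illustration outside Informatica: the names `gender_rule`, `score`, and `band`, and the 90/70 threshold values, are assumptions made up for this sketch and are not Informatica settings.

```python
from dataclasses import dataclass


@dataclass
class Score:
    total: int   # total number of records profiled
    valid: int   # records that passed the rule

    @property
    def invalid(self) -> int:
        return self.total - self.valid

    @property
    def pass_pct(self) -> float:
        # Scorecards are usually read by the pass value, not the failed one.
        return 100.0 * self.valid / self.total if self.total else 0.0


def gender_rule(record: dict) -> bool:
    """Valid only when gender is present and is male/female (null is invalid)."""
    value = record.get("gender")
    return isinstance(value, str) and value.strip().lower() in {"male", "female"}


def score(records, rule) -> Score:
    """Apply one rule to every record and count total/valid."""
    records = list(records)
    return Score(total=len(records), valid=sum(1 for r in records if rule(r)))


def band(pass_pct: float, good: float = 90.0, acceptable: float = 70.0) -> str:
    """Map a pass percentage to a band. Thresholds are hypothetical defaults."""
    if pass_pct >= good:
        return "good"
    if pass_pct >= acceptable:
        return "acceptable"
    return "unacceptable"


rows = [{"gender": "male"}, {"gender": "Female"}, {"gender": None}, {"gender": "x"}]
result = score(rows, gender_rule)
print(result.total, result.valid, result.invalid, band(result.pass_pct))
```

In practice the thresholds would be set per column, depending on how critical the data is or on what the client accepts.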
For cleansing data: a reference table is created in Data Analyst and holds all the possible values, valid/invalid. Creating it in Analyst is the less likely case; mostly it is done in Developer. It is bidirectional between Data Analyst and Developer.

MRS (Model Repository Service): holds data relevant to mappings and workflows.
DIS (Data Integration Service): takes care of data flow from one to another.
Analyst Service: needed for Data Analyst; otherwise Analyst cannot be accessed.
Content Management Service: Address Validator / Address Doctor. Informatica has reference data for 240 countries (address data); this reference data is provided by Informatica. Pincode / geocode. For the identity transformation: 60 countries.

Rules: validate as mapplet for cleansing; validate as rule for use as a validation.
Profile -> new rule -> apply rule.
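The reference-table cleansing mentioned in these notes (one canonical value plus all its known variants, as with India / IN / IND) can be sketched like this. It is a minimal Python illustration, not the Informatica reference table format; `COUNTRY_REFERENCE`, `build_lookup`, and `standardize` are hypothetical names.

```python
# Hypothetical reference table: canonical value -> known variants.
COUNTRY_REFERENCE = {
    "IN": {"in", "ind", "india"},
}


def build_lookup(reference: dict) -> dict:
    """Flatten the reference table into variant -> canonical value."""
    lookup = {}
    for canonical, variants in reference.items():
        lookup[canonical.lower()] = canonical
        for variant in variants:
            lookup[variant.lower()] = canonical
    return lookup


def standardize(value, lookup: dict):
    """Return the canonical value, or None when the input is null or unknown."""
    if value is None:
        return None
    return lookup.get(value.strip().lower())


lookup = build_lookup(COUNTRY_REFERENCE)
print(standardize("India ", lookup), standardize("ind", lookup), standardize("Bharat", lookup))
```

A value that is not in the table comes back as None here, so it can be flagged as invalid instead of being loaded as yet another spelling.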
If the mapplet has an active transformation, it cannot be validated as a rule.

Standardizer transformation: standardizes data. New strategy: 4 options, e.g. replace with ...

Match analysis: e.g. "ravi verma" and "ravi verma", compared on address, phone, pincode.
Sequence: data object -> Key Generator -> Match transformation -> Association transformation -> Consolidation transformation (golden record).
Unique key, group key.
Names that are pronounced the same: strategy in Key Generator.
Soundex-based: keeps a vowel only when it is the first character of the string.
NYSIIS: takes all vowels into account.
Cluster ID in the Match transformation. IsSurvivor port: Y/N values.
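To make the group key idea concrete, here is a small sketch of a Soundex-based key. It implements classic American Soundex in Python and assumes plain ASCII names; it is not claimed to match the exact output of the Informatica Key Generator. The helper `group_by_surname_key` is a hypothetical name for this sketch.

```python
from collections import defaultdict

# Classic Soundex consonant codes; vowels, H, W and Y have no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code, or "" when there are no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]                 # first letter is kept, even a vowel
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:      # skip vowels and repeated codes
            out.append(code)
        if ch not in "HW":             # H and W do not separate equal codes
            prev = code
    return ("".join(out) + "000")[:4]


def group_by_surname_key(names):
    """Group full names by the Soundex code of the last word (group key)."""
    groups = defaultdict(list)
    for name in names:
        parts = name.split()
        if parts:
            groups[soundex(parts[-1])].append(name)
    return dict(groups)


groups = group_by_surname_key(["ravi verma", "ravi varma", "ravi sharma"])
print(groups)
```

Only records that share a group key would then be compared with each other in the match step, which keeps the number of comparisons down.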