MDM overview
Financial services scenario
ibm.com/redbooks
International Technical Support Organization
April 2009
SG24-7704-00
Note: Before using this information and the product it supports, read the information in
“Notices” on page xvii.
This edition applies to Version 8, Release 0, Modification 1 of IBM InfoSphere Information Server
(5724-Q36) and Version 8, Release 0, Modification 1 of IBM InfoSphere Master Data
Management Server (5724-V51).
Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
The team that wrote this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx
Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiv
Contents v
vi MDM: RDP for MDM
Figures
1-1 Optimal architecture combining MDM Hub with Data Integration Hub . . . . 3
1-2 Synchronization of master data in the enterprise . . . . . . . . . . . . . . . . . . . . 4
1-3 IBM Information Server suite in the flow . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1-4 IBM Information Server functionality in MDM deployment . . . . . . . . . . . . . 6
1-5 RDP for MDM solution overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1-6 MDM Logical Data Model & Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1-7 Role and flow of RDP for MDM solution . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1-8 Enabling Suspect Duplicate Processing in MDM Server UI . . . . . . . . . . . 17
1-9 Configuring the critical matching fields for person matching . . . . . . . . . . . 18
1-10 Configuring the critical matching fields for organization matching . . . . . 19
1-11 Creating a new application accessing MDM Server . . . . . . . . . . . . . . . . 21
1-12 Modifying an existing application . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1-13 Creating a new adapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1-14 Using a Change Data Capture solution. . . . . . . . . . . . . . . . . . . . . . . . . . 22
1-15 Overview of RDP for MDM solution scenarios . . . . . . . . . . . . . . . . . . . . 23
2-1 Main components of RDP for MDM processing . . . . . . . . . . . . . . . . . . . . 29
2-2 CDIDTP table contents: Corresponds to the I’n’ columns . . . . . . . . . . . . . 48
2-3 CDCONTMETHTP table contents: Corresponds to the C’n’ columns . . . 48
2-4 Contact (P) and Contract (C) RT/ST combinations . . . . . . . . . . . . . . . . . . 50
2-5 Import SIF flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2-6 Validation and Standardization flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2-7 Error Consolidation and Referential Integrity flow. . . . . . . . . . . . . . . . . . . 58
2-8 Match Pers and Org flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2-9 Match LOB flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2-10 Bulk Load flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2-11 Upsert Load flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3-1 TBank environment configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3-2 Rapid MDM approach used in the scenario for the initial load . . . . . . . . . 73
3-3 Checking table data, part 1 of 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3-4 Checking table data, part 2 of 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3-5 Savings table data, part 1 of 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3-6 Savings table data, part 2 of 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3-7 Savings table data, part 3 of 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3-8 Loans table data, part 1 of 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3-9 Loans table data, part 2 of 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3-10 Loans table data, part 3 of 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3-11 Loans table data, part 4 of 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3-12 DQA approach: Data assessment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Figures ix
3-76 Creating a reference table using Information Analyzer, part 1 of 6 . . 154
3-77 Creating a reference table using Information Analyzer, part 2 of 6 . . 155
3-78 Creating a reference table using Information Analyzer, part 3 of 6 . . 156
3-79 Creating a reference table using Information Analyzer, part 4 of 6 . . 157
3-80 Create SIF, part 5 of 6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
3-81 Create SIF, part 6 of 6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
3-82 Create SIF tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
3-83 Create SIF, part 1 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
3-84 Create SIF, part 2 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
3-85 Create SIF, part 3 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3-86 Create SIF, part 4 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
3-87 Create SIF, part 5 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
3-88 Create SIF, part 6 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
3-89 Create SIF, part 7 of 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
3-90 Launch RDP for MDM jobs, part 1 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 182
3-91 Launch RDP for MDM jobs, part 2 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 182
3-92 Launch RDP for MDM jobs, part 3 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 183
3-93 Launch RDP for MDM jobs, part 4 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 184
3-94 Launch RDP for MDM jobs, part 5 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 185
3-95 Launch RDP for MDM jobs, part 6 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 186
3-96 Launch RDP for MDM jobs, part 7 of 7 . . . . . . . . . . . . . . . . . . . . . . . . . 187
3-97 Verify successful load, part 1 of 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
3-98 Verify successful load, part 2 of 5 . . . . . . . . . . . . . . . . . . . . . . . . . . 193
3-99 Verify successful load, part 3 of 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
3-100 Verify successful load, part 4 of 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
3-101 Verify successful load, part 5 of 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
3-102 Suspect resolution, part 1 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
3-103 Suspect resolution, part 2 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
3-104 Suspect resolution, part 3 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
3-105 Suspect resolution, part 4 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
3-106 Suspect resolution, part 5 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
3-107 Suspect resolution, part 6 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
3-108 Suspect resolution, part 7 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
3-109 Suspect resolution, part 8 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
3-110 Suspect resolution, part 9 of 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
3-111 Hierarchy scenario example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
3-112 TBank hierarchy scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
3-113 RDP for MDM jobs Director output, part 1 of 2 . . . . . . . . . . . . . . . . . . 214
3-114 RDP for MDM jobs Director output, part 2 of 2 . . . . . . . . . . . . . . . . . . 215
3-115 Hierarchy view using MDM Server UI, part 1 of 15 . . . . . . . . . . . . . . . 217
3-116 Hierarchy view using MDM Server UI, part 2 of 15 . . . . . . . . . . . . . . . 218
3-117 Hierarchy view using MDM Server UI, part 3 of 15 . . . . . . . . . . . . . . . 219
Tables

Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
Oracle, JD Edwards, PeopleSoft, Siebel, and TopLink are registered trademarks of Oracle Corporation
and/or its affiliates.
Red Hat, and the Shadowman logo are trademarks or registered trademarks of Red Hat, Inc. in the U.S. and
other countries.
SAP, and SAP logos are trademarks or registered trademarks of SAP AG in Germany and in several other
countries.
EJB, J2EE, J2SE, Java, JSP, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in
the United States, other countries, or both.
Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other
countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Nagraj Alur is a Project Leader with the IBM ITSO, San Jose Center. He has
more than 33 years of experience in database management systems (DBMSs),
and has been a programmer, systems analyst, project leader, independent
consultant, and researcher. His areas of expertise include DBMSs, data
warehousing, distributed systems management, database performance,
information integration, and client/server and Internet computing. He has written
extensively on these subjects and has taught classes and presented at
conferences all around the world. Before joining the ITSO in November 2001, he
was on a two-year assignment from the Software Group to the IBM Almaden
Research Center, where he worked on Data Links solutions and an eSourcing
prototype. He holds a master’s degree in computer science from the Indian
Institute of Technology (IIT), Mumbai, India.
Mike Carney is an Executive Architect within the IBM Software Group, and the
lead architect and development team lead for RDP DataStage Jobs. He has over
20 years of experience with Software application development and data
warehousing, including 10 years with IBM Advanced Consulting Group for
DataStage, where he has contributed to many innovations to the Information
Server Product. Mike holds a BA in Mathematics from Boston College.
Priyanka Deswal is a Senior IT Specialist with the IBM Software group in India.
She has more than nine years of experience in Information Management. She is
currently working as a Technical Pre-Sales specialist for Information Platform and
Solutions. She is an advanced cluster certified DB2® Expert. Her areas of expertise include DB2, InfoSphere Information Server, InfoSphere MDM Server, and DB2 Content Manager. She holds a bachelor's degree in computer science and engineering.
Elizabeth Dial is a Technical Architect with the IBM Software Group. She is
currently part of the Trusted Information Agenda team, supporting the
advancement of industry leading architectures for IBM customers worldwide. As
a member of the InfoSphere Worldwide Center of Excellence, Elizabeth has
created and contributed to best practices pertaining to data quality and the
iterations methodology, and has participated in the design of data quality
components to support the InfoSphere suite of products. Elizabeth has 10 years
of professional experience designing and implementing data integration projects
that include data warehousing, SOA, and the integration of QualityStage with
IBM MDM Server and other enterprise applications.
Preface xxi
Patrick Owen began his IT career at Acxiom Corporation, one of the world's
largest Data Service Providers specializing in personal identification and
name/address hygiene. After his experiences there with the Orchestrate®
Parallel Framework, he moved to Ascential® Software and now holds the
position of world-wide InfoSphere Information Server Architect, specializing in
performance, High-Availability, and Grid. Patrick has worked on projects
spanning many industries (including: insurance, package delivery, mortgage,
utilities, retail, and entertainment rental). He holds a BS in Computer Science
from the University of Arkansas at Little Rock, where he published several papers
on Optical Character Recognition, and Water Vapor Mapping Systems for
Extra-Terrestrial Landers.
David Borean
Karen Chouinard
Charles Jia
Linda Park
Joseph Tsang
Lena Woolf
IBM Canada
Aarti Borkar
Paul Christensen
Stacy Scoggins
Neil D Potter
Kiranmayi Potu
Brian L Tinnel
Ningning (Kevin) Wang
Balakumaran (Bala) Vaithyalingam
Larissa Wojciechowski
IBM USA
Srinivas Mudigonda
IBM India
Your efforts will help increase product acceptance and customer satisfaction. As
a bonus, you will develop a network of contacts in IBM development labs, and
increase your productivity and marketability.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
The content we provide in this Redbooks publication dives into the details of RDP for MDM, to give you a better understanding of the technical underpinnings, operational metrics, and deployment methods for the RDP for MDM solution.
The context for the RDP for MDM solution is described as follows:
MDM and the enterprise
Typical enterprises run a myriad of applications and systems that all work together over the enterprise network to accomplish the management, control, and reporting of the business. Each of these systems holds some slice of data that is critical to the enterprise and in fact represents the gold copy of that data: the master data. For all of the systems in the enterprise to work effectively together, this master data must be managed, standardized, and synchronized. Otherwise, it would be like people in the United Nations trying to communicate without translators.
Over time, we have shown that the optimal method of managing this
master data in an enterprise is through the use of flexible, scalable MDM and
data integration hubs working in unison, as shown in Figure 1-1 on page 3. It
is critical that a common data governance practice be used across the enterprise, and that these hubs be driven by a common set of data transformation and data quality business rules, giving consistency across enterprise deployments. These goals, together with the goal of providing a rapid deployment framework for MDM in the enterprise, led to the development of the RDP for MDM solution.
Figure 1-1 Optimal architecture combining MDM Hub with Data Integration Hub
Chapter 1. Rapid Deployment Package for Master Data Management solution overview 3
Synchronization of master data in the enterprise
Synchronization of master data in the enterprise is where the flexibility and
scalability of the MDM Hub/Data Integration Hub approach really shows its
power. By providing access and consumption control of MDM data, driven by
a common set of business and data quality rules, we have the best of all
worlds. Synchronization can now be done at an end-of-day batch, delta load,
intraday trickle feed, as a service-oriented architecture (SOA) service call, or
as a full XA-compliant transaction under high availability with two-phase commit and rollback, as shown in Figure 1-2.
Synchronization can happen at either the MDM Hub or Data Integration Hub
layer through a common set of rules. Metadata gold copies with known data
quality can now be managed in the enterprise under a common set of
precedence and synchronization rules.
Figure 1-3 IBM Information Server suite in the flow
Data Integration and MDM — why use RDP MDM in every MDM Server
deployment
It is critical that RDP for MDM be used in all MDM Server deployments for
both loading and delta processing in order to allow fully automated end-to-end
metadata and data quality management in the enterprise. When using data
management services in the MDM Server hub, it is critical that these services
update the metadata when changes are made to MDM Server resident gold
copy metadata so that the metadata lineage linkage is maintained.
It is also critical that QualityStage be called and used for all data quality
processing so that a single set of data quality rules apply for the enterprise.
By abiding by these rules, both MDM Server and Information Server can now
be used to deliver synchronized master data across all domains in the
enterprise, no matter how large or complex. Figure 1-4 shows the breadth of
Information Server functionality contained in the RDP for MDM solution to
achieve the stated objectives.
The RDP for MDM solution is an important first step towards implementing the MDM vision and realizing the business value it offers. Your first
implementation phase should scope an attainable business objective to
validate the advantages of continuing with subsequent phases that
incrementally incorporate all the features of MDM Server.
The RDP for MDM solution reflects the three main phases of an implementation
of MDM Server:
Source data analysis
MDM Server point-in-time load
MDM Server data consumption
In the following sections, we describe the following aspects of the MDM solution:
MDM Server implementation phases
RDP for MDM solution
Configuring the RDP for MDM solution
Main configuration scenarios
Best practices
1.2 MDM Server implementation phases
This section describes the three main phases of an implementation of MDM
Server:
Source Data Analysis (also known as Data Profiling)
MDM Server point-in-time load
MDM Server data consumption
The source data analysis phase is critical to ensuring a successful data integration effort that delivers timely and superior data quality.
During the initial load (as well as delta processing), appropriate translation and
transformation of the source data needs to occur. Also required is any potential
data cleansing activity that includes standardization, matching, and
de-duplication.
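As an illustration of the kind of standardization applied during this step, the sketch below normalizes an address field. The actual rules are configured in QualityStage rule sets rather than hand-coded, and the suffix mappings here are assumptions for the example:

```python
import re

# Illustrative suffix mappings; real QualityStage rule sets are far richer.
STREET_SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "BOULEVARD": "BLVD"}

def standardize_address(raw: str) -> str:
    """Uppercase, strip punctuation, collapse whitespace, normalize suffixes."""
    tokens = re.sub(r"[^\w\s]", "", raw.upper()).split()
    tokens = [STREET_SUFFIXES.get(t, t) for t in tokens]
    return " ".join(tokens)

print(standardize_address("123  Main street,"))  # -> 123 MAIN ST
```

Standardizing both sides of a comparison in this way is what makes the later matching step reliable: "123 Main street," and "123 MAIN ST" become the same token sequence.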
As part of the data cleansing activity during initial load and delta processing,
potential duplicate records are identified. You need to take appropriate action to
confirm or deny whether the identified duplicate condition exists. You need to
determine the match rules for identifying potential duplicates between two or
more records. This involves defining match criteria, which vary depending upon
whether the records are of the type person or organization. For a person, the
default business duplicate key is the Social Security Number. For an
organization, the default duplicate key is the Corporate Tax Identification.
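A duplicate-key check along these lines can be sketched as follows. The field names (`party_type`, `ssn`, `corp_tax_id`) are illustrative, not the MDM Server schema:

```python
def is_potential_duplicate(rec_a: dict, rec_b: dict) -> bool:
    """Compare two party records on the default business duplicate key:
    Social Security Number for persons, Corporate Tax Identification
    for organizations. Field names are hypothetical."""
    if rec_a.get("party_type") != rec_b.get("party_type"):
        return False
    key = "ssn" if rec_a["party_type"] == "person" else "corp_tax_id"
    # A missing or empty key never triggers a duplicate.
    return bool(rec_a.get(key)) and rec_a.get(key) == rec_b.get(key)

a = {"party_type": "person", "ssn": "123-45-6789", "name": "J. Smith"}
b = {"party_type": "person", "ssn": "123-45-6789", "name": "John Smith"}
print(is_potential_duplicate(a, b))  # True
```

In practice the match criteria are configurable (see the critical matching fields discussed later in this chapter), so a production rule would score several fields rather than test a single key.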
Chapter 1. Rapid Deployment Package for Master Data Management solution overview 9
When duplicates are identified, you need to survive (combine and consolidate) data about an entity from the multiple records and fill in gaps left by missing fields, creating a more complete record based on the fields in the duplicate records. The survivorship process can be automated when we are confident that the records are duplicates. If any degree of doubt exists, the records are stored separately and marked as potential duplicates. The Data Stewards then have to perform manual consolidation of duplicate data using the MDM Server Data Stewardship UI, which calls MDM Server business transactions.
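A simple field-level survivorship policy can be sketched as below. Real survivorship rules in MDM Server also weigh source precedence and recency, so treat this most-complete-record policy as an assumption for illustration:

```python
def survive(records) -> dict:
    """Build a consolidated golden record by taking, for each field,
    the first non-empty value across the duplicate records."""
    golden = {}
    for rec in records:
        for field, value in rec.items():
            if value and not golden.get(field):
                golden[field] = value
    return golden

dups = [
    {"name": "John Smith", "phone": "", "email": "js@example.com"},
    {"name": "J. Smith", "phone": "555-0100", "email": ""},
]
print(survive(dups))
# {'name': 'John Smith', 'email': 'js@example.com', 'phone': '555-0100'}
```

Note how the golden record is more complete than either input: the phone number survives from the second record even though the first record wins the name field.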
Note: The RDP for MDM solution described in 1.3, “RDP for MDM solution” on
page 11 provides a user interface, as well as business services that are Web
services or Enterprise Java™ Bean (EJB™)-enabled. That does not preclude
other consumption mechanisms (such as stored procedures or extracts), but
those solutions tend not to be as scalable as Web services or EJBs.
Figure 1-5 shows how the components that make up the RDP for MDM solution
correspond to the three MDM Server implementation phases.
Figure 1-5 RDP for MDM solution overview
The components shown in Figure 1-5 on page 11 are as follows:
IBM InfoSphere Information Analyzer
InfoSphere Information Analyzer assesses the quality of the reference data, the frequency of data per column, and the cardinality across columns.
IBM InfoSphere FastTrack
InfoSphere FastTrack accelerates the translation of business requirements
into data integration projects, which requires collaboration across analysts,
data modelers, and developers. It allows business logic to be captured and
translated into DataStage ETL jobs.
IBM InfoSphere DataStage
InfoSphere DataStage is an ETL tool that uses a graphical notation to
construct data integration solutions.
IBM InfoSphere QualityStage
InfoSphere QualityStage (QS in Figure 1-5 on page 11) is used to assess the
quality of free-form fields such as names and addresses.
IBM InfoSphere Master Data Management (MDM) Server
IBM InfoSphere Master Data Management (MDM) Server allows businesses
to centrally manage customer, product, and account data for use
enterprise-wide.
Standard Interface Format (SIF)
Standard Interface Format (SIF) is a flat file format delimited by the pipe symbol between columns and the new-line character between records. The first two columns identify the type of data record and are denoted by the Record Type (RT) followed by the Sub Type (ST). After the first two columns, the Source System Key (SSK) is provided to allow the record to be referenced using the existing source systems’ keys. The SIF is described in detail in Appendix B.1, “SIF details” on page 248.
DataStage and QualityStage jobs
These jobs perform validation, standardization, matching, de-duplication,
suspect processing, loading, and delta processing of SIF input data into the
MDM data repository.
MDM Server data model subset
RDP for MDM solution supports a subset of the MDM Server data model —
this subset is shown in Figure 1-6 on page 13.
Figure 1-6 MDM Logical Data Model & Domains
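As a minimal illustration of the SIF layout described in the component list (pipe-delimited columns, one record per line, with RT, ST, and SSK as the leading columns), the parser sketch below peels off the common leading columns. The per-type column layouts from Appendix B.1 are not modeled, and all sample values are invented:

```python
from typing import NamedTuple

class SifRecord(NamedTuple):
    record_type: str        # RT: kind of data record (first column)
    sub_type: str           # ST (second column)
    source_system_key: str  # SSK: key from the originating source system
    rest: tuple             # remaining columns; layout depends on RT/ST

def parse_sif(text: str) -> list:
    """Split pipe-delimited SIF lines into records."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        rt, st, ssk, *rest = line.split("|")
        records.append(SifRecord(rt, st, ssk, tuple(rest)))
    return records

# Invented sample: one person record and one contract record.
sample = "P|01|SRC-1001|SMITH|JOHN\nC|02|SRC-2001|SAVINGS|OPEN"
for rec in parse_sif(sample):
    print(rec.record_type, rec.sub_type, rec.source_system_key)
```

Keeping the SSK on every record is what allows loaded master data to be traced back to, and synchronized with, the originating source system.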
Figure 1-7 shows the general flow of the RDP for MDM solution and the various
roles and supporting InfoSphere products participating in the different phases of
the implementation.
Figure 1-7 Role and flow of RDP for MDM solution
Delta processing: RDP for MDM uses the Information Server engine to maintain metadata lineage automatically. If you choose to use the Maintenance Business Services method for delta processing, an additional set of custom Java metadata update services needs to be developed and connected into the X-Meta hub for data lineage to be automatically maintained and available through Metadata Workbench.
MDM Business Services Load provides support for standard MDM Server
XML layout, as well as MDM Server SIF layout as the file input format.
This Business Services Load method uses the MDM Server SIF layout. This
method requires that the records with the same context be loaded as one
grouping. This is accomplished by producing the SIF layout and passing it
through the SIF Sequencer DataStage job, as shown in Figure 1-5 on
page 11. The SIF Sequencer sorts the records to allow the input to match
what would have been expected by the MDM Server Business Services if
XML was used as the input format.
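The sequencing step can be sketched as a sort that groups records by source system key, with parent records ahead of their dependents. The real SIF Sequencer is a DataStage job, and the type ordering below is an assumption for illustration:

```python
def sequence_sif(records) -> list:
    """Order (record_type, sub_type, ssk) tuples so that all records
    sharing a source system key form one contiguous grouping, with an
    assumed parent-before-child type ordering within each group."""
    type_order = {"P": 0, "A": 1, "C": 2}  # party, address, contract (assumed)
    return sorted(records, key=lambda r: (r[2], type_order.get(r[0], 99)))

mixed = [
    ("A", "01", "SRC-2"), ("P", "01", "SRC-1"),
    ("C", "01", "SRC-1"), ("P", "01", "SRC-2"),
]
print(sequence_sif(mixed))
# [('P', '01', 'SRC-1'), ('C', '01', 'SRC-1'), ('P', '01', 'SRC-2'), ('A', '01', 'SRC-2')]
```

After this sort, each grouping arrives intact at the MDM Server Business Services, matching what the services would have received had XML been the input format.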
In the MDM consumption phase, the master data that is eventually loaded into
MDM Server (and updated per data latency requirements of the application)
is consumed using one of the interfaces identified earlier.
Potential duplicate master data that could not be automatically collapsed
during load needs to be examined by the Data Steward. Often, duplicate
records have conflicting data in the same fields. It is the Data Steward's role
to choose which data will survive and be carried forward into the master
record. The Data Steward uses the MDM Server UI to perform this task.
The MDM Server UI invokes MDM Server business transactions, which create the new master record and inactivate duplicate records. As part of
this process, MDM Server invokes duplicate suspect processing again to
identify any potential new suspects. MDM Server will call QualityStage
runtime matching jobs with the same set of rules as in batch QualityStage job
used during initial load to guarantee the same matching logic.
Also provided are the following features:
– Data Stewardship UI, which allows users to perform various processing
activities on party data
– Party Maintenance UI, which is a Web services-based UI featuring a
graphical 360 view of customer data
– Sample out-of-the-box (OOTB) reports and a framework to build your own custom reports.
This feature allows a client to implement reporting requirements around
Suspect Duplicate Processing in support of the Data Stewardship role.
Because the critical fields for person matching and organization matching can be
configured independently, the UI uses two different screens.
Note: A1, A2, and B matches are described in 2.8, “Match” on page 60.
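Using the default minimum match scores shown in the configuration screens (A1/Duplicate 275.0, A2/Suspect 250.0, B/Unresolved Suspect 230.0), classification of a match score into a suspect category can be sketched as:

```python
# Default minimum match score per suspect category, as shown in the
# organization matching configuration screen.
THRESHOLDS = [
    ("A1/Duplicate", 275.0),
    ("A2/Suspect", 250.0),
    ("B/Unresolved Suspect", 230.0),
]

def categorize(score: float) -> str:
    """Map a QualityStage match score to a suspect category; A1 matches
    can be collapsed automatically, while A2 and B go to a Data Steward."""
    for category, minimum in THRESHOLDS:
        if score >= minimum:
            return category
    return "No match"

print(categorize(281.0))  # A1/Duplicate
print(categorize(240.0))  # B/Unresolved Suspect
```

Because the thresholds are ordered from highest to lowest, a score is always assigned to the strongest category it clears.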
Select the matching critical data fields for a person by moving the appropriate fields from the left pane to the right pane under the “Matching Critical Data Fields” section.
The fields selected are as follows:
– Name
– Address
– City
– State/Province
– Country
– ZIP/Postal Code
– Gender
– Birth Date
– Social Security Number
– Driver License Number
The screen lists the available fields (including County, Passport, Home E-mail
Address, Home Telephone, and Mobile Telephone) in the left pane, and the
selected matching critical data fields in the right pane, with Add and Remove
buttons to move fields between the panes.
Figure 1-9 Configuring the critical matching fields for person matching
For organization matching, the Matching Critical Data for Organization screen
shows the suspect match categories with their minimum match scores
(A1/Duplicate: 275.0, A2/Suspect: 250.0, B/Unresolved Suspect: 230.0). The
selected matching critical data fields for an organization are as follows:
– Name
– Address
– City
– State/Province
– Country
– ZIP/Postal Code
– Established Date
– Corporate Tax Identification
– DUNS Number
– Business Telephone
Figure 1-10 Configuring the critical matching fields for organization matching
1.4.2 Main configuration scenarios
Figure 1-15 on page 23 shows the options available for the initial load of the
MDM data repository using the RDP for MDM solution, delta load using the Delta
RDP for MDM option, and the Batch Framework with SIF or XML layout as input.
Note: Direct database access to MDM Server should only be allowed for
inquiry purposes, because for add/update purposes it would bypass data
quality validations, thereby compromising the master data management
standards. It is highly recommended that you use either the MDM Server
Web services layer or its EJB layer for processing data.
– New applications would access the master data by interfacing with MDM
Server and use the Source System Keys stored therein to access the
non-master data stored in the existing systems as shown in 3.8, “MDM
consumption application” on page 230.
Figure 1-11 is the simplest scenario for consuming data within MDM
Server. The new application can be designed to ensure that customer
master data is read directly from the MDM Server, as well as use MDM
Server to add new customers before adding new accounts in its own
system.
Figure 1-11 New application using the MDM Server Add/Update/Search Web services
– Existing applications will continue to read data from their own repositories.
A synchronization process would need to be written to update existing
systems with clean and accurate master data from the MDM Server. The
MDM Server provides a notification mechanism to inform external systems
about significant MDM events, such as the collapse of duplicate parties.
The synchronization process should listen for the data changes that have
occurred in master repository and update existing systems accordingly.
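A synchronization process along these lines could be sketched as follows. The event shape and the in-memory remap table are hypothetical, not the actual MDM Server notification payload (which a real implementation would consume through the MDM Server notification mechanism and apply via each system's own update interface):

```python
def handle_mdm_event(event, source_systems):
    """Propagate a duplicate-party collapse event to existing systems.

    Hypothetical sketch: 'event' stands in for an MDM Server
    notification, and the per-system 'remap' dict stands in for a real
    update call against that system's repository.
    """
    if event["type"] != "PARTY_COLLAPSE":
        return []
    survivor, merged = event["survivor_id"], event["merged_ids"]
    updated = []
    for system in source_systems:
        # Repoint every merged party's key at the surviving party
        for old_id in merged:
            system.setdefault("remap", {})[old_id] = survivor
            updated.append((system["name"], old_id, survivor))
    return updated
```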
Three scenarios are briefly described here as follows:
• In the “Modifying an existing application” scenario (Figure 1-12), an
existing application is modified to use MDM Server Web Services to
search for new customers and update their data within MDM Server.
This could involve the creation of new screens within the existing
application, or by re-coding existing screens.
Figure 1-12 Existing application modified to use the MDM Server Add/Update/Search Web services
In the legacy adapter scenario, the existing application is left unchanged
and a legacy adapter invokes the MDM Server Add/Update/Search Web
services on its behalf.
The delta load portion of Figure 1-15 shows the invocation of the
customizable (as indicated by the red border around this box) RDP
Maintenance Business Services component of the MDM Server by the Batch
Framework when processing delta records in the SIF or XML layout format. It
also shows how the SIF records can be consumed directly by the delta RDP
for MDM functionality. This is not covered in this IBM Redbooks publication.
On the consumption side, new applications consume MDM data through custom
stored procedures or the MDM UI, while legacy applications connect through
an ESB or legacy adapters.
1.4.3 Best practices
The goal of any initial implementation of a technology should be a successful first
phase that fulfills an acceptable set of business requirements.
The goal of the RDP for MDM implementation is to minimize the amount of
configuration necessary within client implementations, by providing prebuilt
configuration assets at the most common configuration points, as follows:
Name Standardization
Address Standardization
Party Matching
Data Model Extensions
If you want greater flexibility with standardizing and matching your input source
data, we suggest you use Configuration C, described in 1.4.2, “Main
configuration scenarios” on page 20 for your RDP for MDM implementations
where you choose to perform both standardization and matching in the SIF to
Load section as shown in Figure 1-15 on page 23.
Note: These configuration parameters are described in Table 2-2 on page 36.
Currently, the SIF and the DataStage/QualityStage jobs have some flexibility for
customization to accommodate the specific needs of your organization (such as
modifying the code table values in the MDM Server and modifying the
standardization rules and de-duplication process). In addition, you can add
columns and change the precision and scale of existing MDM Server columns,
which would require modification of the SIF and DataStage jobs.
If the initial load uses the configuration options in the config file while the
runtime uses different configuration options in the CONFIGELEMENT table, the
two matching processes diverge and the data will no longer be clean after a
few days in production.
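A simple guard against this divergence is to compare the two sets of options before going live. The following sketch is illustrative (the parameter dictionaries stand in for the config file and the CONFIGELEMENT table, which in practice you would read with your own file and database access code):

```python
def find_config_drift(initial_load_params, configelement_params):
    """Report parameters whose initial-load (config file) value differs
    from the runtime value in the CONFIGELEMENT table.

    Returns {name: (load_value, runtime_value)} for every mismatch.
    """
    drift = {}
    for name, load_value in initial_load_params.items():
        runtime_value = configelement_params.get(name)
        if runtime_value is not None and runtime_value != load_value:
            drift[name] = (load_value, runtime_value)
    return drift
```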
1 The configuration parameters to be overridden have “configelement ==” in the help text field
Note: The names of the error files vary widely. Therefore, you should
construe that other error files conform to the template shown here for
Party.
1. By Delta RDP for MDM, which was not available at the time of writing of this IBM Redbooks
publication
QS_B_MATCH_CUTOFF_ORGANIZATION (150)
QS_EXCLUDE_FIELDS_FROM_MATCH_ORGANIZATION
QS_A1_MATCH_CUTOFF_PERSON (205)
QS_A2_MATCH_CUTOFF_PERSON (175)
QS_B_MATCH_CUTOFF_PERSON (150)
QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON
QS_MATCH_ORG_NATID (I2)
QS_MATCH_ORG_1
QS_MATCH_ORG_2
QS_MATCH_ORG_3
QS_MATCH_ORG_4
QS_MATCH_PERSON_NATID
QS_MATCH_PERSON_1
QS_MATCH_PERSON_2
QS_MATCH_PERSON_3
QS_MATCH_PERSON_4
DB_SCHEMA
DB_USERID
DB_PASSWORD
a. These parameters must be customized for the user’s particular environment; where a
recommended value exists, it is shown in parentheses
Table 2-2 on page 36 through Table 2-5 on page 41 describe all the
parameters available for customization. Reviewing them can be a daunting task.
However, it is our opinion that there are a few key parameters that fall in the MUST MODIFY list
(described in “MUST MODIFY parameters” on page 43), others that should be in
the CONSIDER MODIFYING list (described in “CONSIDER MODIFYING
parameters” on page 45), while the rest can be left to default until sufficient
experience has been gained to attempt to customize them as well.
MDM_DEPLOYMENT_NAME WebSphere® Customer Center MDM deployment name required by the jobs
reading and writing to the CONFIGELEMENT
table. Must match the deployed MDM
application name in order for it to update the
correct values.
File location DS_SUPPORT_FILE_DIR /mdmisdata03/data/MDMIS/PARAMETERS/ Directory where required files are installed,
for instance the FREQUENCY files used by QS
Match (at present these appear to be the only
files stored there)
QualityStage QS_A1_MATCH_CUTOFF_ORGANIZATION 205 Specify Org A1a Minimum Match Score - 205
QS_MATCH_ORG_2 I2 (blank)
QS_MATCH_ORG_4 I3 (blank)
QS_MATCH_PERSON_2 C3 (blank)
QS_MATCH_PERSON_3 C5 (blank)
QS_MATCH_PERSON_4 C7 (blank)
DROP DS_DETECTED_DUPLICATES_ACTION E Action to take if duplicate (same key) records are
detected in the SIF file. The duplicate records will be
removed from input. E: Error all duplicates / K: Keep first,
error others.
DS_PARTY_DROP_SEVERITY_LEVEL 4 Party will be dropped if there are errors with severity <=
DS_PARTY_DROP_SEVERITY_LEVEL. Severity level
ranges from 0 (worst) to 10 (least severe)
DS_EMAIL_ERROR_CHECK_REPORT 1 Flag to indicate whether the error report (of SIF file error
counts) should be e-mailed at all. (Whether to abort is
controlled by three parameters: DS_SIF_ERROR_THRESHOLD,
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD, and
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT)
DS_SIF_ERROR_THRESHOLD 101 Percentage of ALL SIF records with errors that will cause
the job stream to abort (any value above 100 will skip this
check)
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT 101 Number of Individual SIF Files, whose Error Threshold has
been exceeded, that are required for an abort.
Error DROP_ON_ASSIGNEDBY_ERR 1 Identifier assigned-by party was dropped. 0: do not drop the
Consolidation party, but drop the identifier record; 1: drop the party.
ReasonCode 100385, severity <= party drop
DROP_ON_FROM_ERR 1 Contact Rel from-party error action. 0: do not drop the party,
but drop the contact rel record; 1: drop the party.
ReasonCode 100383, severity <= party drop
Runtime FS_HIERARCHY_SIF_FILE_PATTERN /mdmisdata03/Projects/MDMISINT3/SIF Hierarchy SIF files pattern. Includes full path and file
_IN/sanitycheck/*.hsif mask. All files meeting this pattern are read by the RDP
jobs
FS_SIF_FILE_PATTERN /mdmisdata03/Projects/MDMISINT3/SIF SIF files pattern. Includes full path and file mask. All files
_IN/sanitycheck/*.sif meeting this pattern are read by the RDP jobs
DS_PROCESSING_DATE (auto assigned) 1900-01-01 00:00:00 Generated at runtime. Can be used to fix the processing
date if you are restarting the load at a later date.
SK_MID_ADDRESS_ID_SF skMid_ADDRESS_ID.sf The file that holds the previous surrogate key
SK_MID_ALERT_ID_SF skMid_ALERT_ID.sf The file that holds the previous surrogate key
SK_MID_CONT_EQUIV_ID_SF skMid_Contacts_CONTEQUIV_ID.sf The file that holds the previous surrogate key
SK_MID_CONT_ID_SF skMid_Contacts_CONT_ID.sf The file that holds the previous surrogate key
SK_MID_CONT_REL_ID_SF skMid_ContactRel_CONT_REL_ID.sf The file that holds the previous surrogate key
SK_MID_CONTACT_METHOD_ID_SF skMid_CONTACT_METHOD_ID.sf The file that holds the previous surrogate key
SK_MID_CONTR_COMP_VAL_ID_SF skMid_CONTR_COMP_VAL_ID.sf The file that holds the previous surrogate key
SK_MID_CONTR_COMPONENT_ID_SF skMid_CONTR_COMPONENT_ID.sf The file that holds the previous surrogate key
SK_MID_CONTRACT_ID_SF skMid_CONTRACT_ID.sf The file that holds the previous surrogate key
SK_MID_CONTRACT_ROLE_ID_SF skMid_CONTRACT_ROLE_ID.sf The file that holds the previous surrogate key
SK_MID_HIER_ULT_PAR_ID_SF skMid_HIER_ULT_PAR_ID.sf The file that holds the previous surrogate key
SK_MID_HIERARCHY_ID_SF skMid_HIERARCHY_ID.sf The file that holds the previous surrogate key
SK_MID_HIERARCHY_NODE_ID_SF skMid_HIERARCHY_NODE_ID.sf The file that holds the previous surrogate key
SK_MID_HIERARCHY_REL_ID_SF skMid_HIERARCHY_REL_ID.sf The file that holds the previous surrogate key
SK_MID_IDENTIFIER_ID_SF skMid_Identifier_IDENTIFIER_ID.sf The file that holds the previous surrogate key
SK_MID_LOB_REL_ID_SF skMid_LOB_REL_ID.sf The file that holds the previous surrogate key
SK_MID_LOCATION_GROUP_ID_SF skMid_LOCATION_GROUP_ID.sf The file that holds the previous surrogate key
SK_MID_MISCVALUE_ID_SF skMid_MISCVALUE_ID.sf The file that holds the previous surrogate key
SK_MID_NATIVE_KEY_ID_SF skMid_NativeKey_NATIVE_KEY_ID.sf The file that holds the previous surrogate key
SK_MID_ORG_NAME_ID_SF skMid_OrgName_ORG_NAME_ID.sf The file that holds the previous surrogate key
SK_MID_PERSON_NAME_ID_SF skMid_PersonName_PERSON_NAME_ID.sf The file that holds the previous surrogate key
SK_MID_PERSON_SEARCH_ID_SF skMid_PersonName_PERSON_SEARCH_ID.sf The file that holds the previous surrogate key
SK_MID_PPREF_ID_SF skMid_PrivPref_PPREF_ID.sf The file that holds the previous surrogate key
SK_MID_ROLE_LOCATION_ID_SF skMid_ROLE_LOCATION_ID.sf The file that holds the previous surrogate key
SK_MID_SUSPECT_ID_SF skMid_Contacts_SUSPECT_ID.sf The file that holds the previous surrogate key
SK_PREFIX_CONT_ID_NEXT_VAL 1 Surrogate key value for xxxx
SK_PREFIX_CONT_ID_SF skPrefix_Contacts_CONT_ID.sf The file that holds the previous surrogate key
SK_PREFIX_CONTRACT_ID_SF skPrefix_Contracts_CONTRACT_ID.sf The file that holds the previous surrogate key
SK_PREFIX_HIERARCHY_ID_SF skPrefix_HIERARCHY_ID.sf The file that holds the previous surrogate key
DB_INSTANCE (blank)
DB_PASSWORD (blank)
DB_SCHEMA (blank)
DB_USERID (blank)
$APT_DB2INSTANCE_HOME /home/dsadm/remote_db2config
$APT_STRING_PADCHAR (blank)
DS_PARALLEL_APT_CONFIG_FILE /opt/IBM/InformationServer/Server/Configuratio
ns/MDM_Default.apt
DS_SEQUENTIAL_APT_CONFIG_FILE /opt/IBM/InformationServer/Server/Configuratio
ns/MDM_1X1.apt
DS_LANGUAGE_TYPE_CODE 100
FS_DATA_SET_HEADER_DIR /mdmisdata03/Projects/MDMISINT3/DATA/
FS_ERROR_DIR /mdmisdata03/Projects/MDMISINT3/ERROR/
FS_LOG_DIR /mdmisdata03/data/MDMIS/LOG/
FS_PARAM_SET_DIR ./ParameterSets/
FS_REJECT_DIR /mdmisdata03/Projects/MDMISINT3/REJECT/
FS_SK_FILE_DIR /mdmisdata03/Projects/MDMISINT3/SK/
FS_TMP_DIR /mdmisdata03/data/MDMIS/TMP/
FS_HIERARCHY_SIF_FILE_PATTERN /mdmisdata03/Projects/MDMISINT3/SIF_IN/san
itycheck/*.hsif
FS_SIF_FILE_PATTERN /mdmisdata03/Projects/MDMISINT3/SIF_IN/san
itycheck/*.sif
$APT_IMPORT_PATTERN_USES_FILESET true
$APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS true
$APT_SORT_INSERTION_OPTIMIZATION true
QS_MATCH_PERSON_NATID I1
QS_PERFORM_ORG_MATCH 1
QS_PERFORM_PERSON_MATCH 1
QS_STAN_ADDRESS 1
QS_STAN_ORG_NAME 1
QS_STAN_PERSON_NAME 1
a. This name must match the name used when deploying the MDM application
DS_USE_NATIVE_KEY 1
SK_MID_ALERT_ID_NEXT_VAL 1
SK_MID_CONT_EQUIV_ID_NEXT_VAL 1
SK_MID_CONT_ID_NEXT_VAL 1
SK_MID_CONT_REL_ID_NEXT_VAL 1
SK_MID_CONTACT_METHOD_ID_NEXT_VAL 1
SK_MID_CONTR_COMP_VAL_ID_NEXT_VAL 1
SK_MID_CONTR_COMPONENT_ID_NEXT_VAL 1
SK_MID_CONTRACT_ID_NEXT_VAL 1
SK_MID_HIER_ULT_PAR_ID_NEXT_VAL 1
SK_MID_HIERARCHY_ID_NEXT_VAL 1
SK_MID_HIERARCHY_NODE_ID_NEXT_VAL 1
SK_MID_HIERARCHY_REL_ID_NEXT_VAL 1
SK_MID_IDENTIFIER_ID_NEXT_VAL 1
SK_MID_LOB_REL_ID_NEXT_VAL 1
SK_MID_LOCATION_GROUP_ID_NEXT_VAL 1
SK_MID_MISCVALUE_ID_NEXT_VAL 1
SK_MID_NATIVE_KEY_ID_NEXT_VAL 1
SK_MID_ORG_NAME_ID_NEXT_VAL 1
SK_MID_PERSON_NAME_ID_NEXT_VAL 1
SK_MID_PERSON_SEARCH_ID_NEXT_VAL 1
SK_MID_PPREF_ID_NEXT_VAL 1
SK_MID_ROLE_LOCATION_ID_NEXT_VAL 1
SK_MID_SUSPECT_ID_NEXT_VAL 1
SK_PREFIX_CONT_ID_NEXT_VAL 1
SK_PREFIX_CONTRACT_ID_NEXT_VAL 1
SK_PREFIX_HIERARCHY_ID_NEXT_VAL 1
QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON (blank)
QS_MATCH_ORG_1 (blank)
QS_MATCH_ORG_2 (blank)
QS_MATCH_ORG_3 (blank)
QS_MATCH_ORG_4 (blank)
QS_MATCH_PERSON_1 C1
QS_MATCH_PERSON_2 C3
QS_MATCH_PERSON_3 C5
QS_MATCH_PERSON_4 C7
QS_PHONETIC_CODING_TYPE_ADDRESS QSNYSIIS
QS_PHONETIC_CODING_TYPE_ORGANIZATION QSNYSIIS
QS_PHONETIC_CODING_TYPE_PERSON QSNYSIIS
QS_REJECT_ADDRESS_IF_NOT_STANDARDIZED 0
QS_REJECT_ORG_NAME_IF_NOT_STANDARDIZED 0
QS_REJECT_PERSON_NAME_IF_NOT_STANDARDIZED 0
DS_PARTY_DROP_SEVERITY_LEVEL 4
Notification DS_EMAIL_ERROR_CHECK_DISTRIBUTION
DS_EMAIL_ERROR_CHECK_REPORT 1
Abort DS_DROP_MAX_ITERATIONS 10
handling
DS_FAILED_COLUMNIZATION_ACTION F
DS_FAILED_RECORDIZATION_ACTION F
DS_SIF_ERROR_THRESHOLD 101
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD 101
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT 101
Note: A fix is being developed to ensure that these parameters are processed
properly by the IL_000_AutoStart_PS_IL job.
Each record type / sub-record type (also referred to as RT/ST) combination has a
unique layout (metadata). The record type identifies the primary subject areas
which are Contact (P) and Contract (C). The contact and contract RT/ST
combinations are shown in Figure 2-4.
Contact record type (‘P’) and sub types:
PP Person Contact
PO Organization Contact
PG Organization Name
PH Person Name
PE External Match
PA Address
PC Contact Method
PI Identifier
PB Line of Business Relationship
PR Contact Relationship
PM Person Miscellaneous Value
PS Privacy Preference
PT Person Alert
Contract record type (‘C’) and sub types:
CH Contract
CK Native Key
CC Contract Component
CR Contract Component Role
CL Role Location
CV Contract Component Value
CM Contract Misc Value
CT Contract Alert
Note: The layout of the RT/ST closely mirrors the tables defined in the MDM
Server repository. The financial services scenario described in Chapter 3,
“Financial services business scenario” on page 67 includes a spreadsheet
template identifying the metadata associated with each RT/ST combination.
This template can be used to define the mapping specification of columns in a
source system to those in the SIF for creating the SIF from that particular
source system.
In the above record layout, <CR><LF> is the DOS end-of-record character
sequence (carriage return followed by line feed).
The domain values of key columns in the SIF must contain the values defined
by the MDM Server. This will require transformation of domain values in the
source system to that of the MDM Server. For example, the domain values for
Gender in the MDM Server are ‘M’ and ‘F’, while the source system may have
‘0’ and ‘1’. The process creating the SIF is responsible for mapping the
domain values appropriately.
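The Gender example above can be sketched as follows; the mapping-table contents for a given source system are hypothetical (only the MDM domain values ‘M’ and ‘F’ come from the text):

```python
# Hypothetical code-mapping table for one source system;
# '0'/'1' are the source codes, 'M'/'F' the MDM Server domain values.
GENDER_MAP = {"0": "M", "1": "F"}

def map_domain_value(source_value, mapping, default=None):
    """Translate a source-system code to its MDM Server domain value,
    returning 'default' when no mapping exists (so unmapped codes can
    be flagged rather than silently passed through)."""
    return mapping.get(source_value, default)
```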
When a column is identified as being not nullable, a value must be provided
for it and that value cannot be null.
The Timestamp format is configurable using a format string such as
YYYY-MM-DD.HH.MM.SS. Refer to IBM WebSphere DataStage and
QualityStage Version 8 Parallel Job Developer Guide, SC18-9891-00 for
details on format strings.
The order of rows does not matter, because the rows will be sorted in the
proper order by the DataStage jobs.
Tip: We recommend that you use naming conventions to identify the origin,
content, and date and time attributes of the SIF. Implement a directory
structure where multiple SIFs can be queued in a READY directory and moved
to a LOADING directory, before finally being moved to a COMPLETED
PROCESSING directory when complete.
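The queueing directory structure suggested in the tip can be sketched as follows (the directory names READY, LOADING, and COMPLETED come from the tip; the helper itself is illustrative):

```python
import shutil
from pathlib import Path

def promote_sif(sif_file: Path, from_dir: Path, to_dir: Path) -> Path:
    """Move a SIF between queue directories, for example
    READY -> LOADING when a load starts, then LOADING -> COMPLETED
    when processing finishes."""
    to_dir.mkdir(parents=True, exist_ok=True)
    target = to_dir / sif_file.name
    shutil.move(str(from_dir / sif_file.name), str(target))
    return target
```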
Figure 2-5 shows the high level flow of the Import SIF step.
In this flow, the SIF file is split by RT/ST into 18 groups. Each group passes
through a Column Import stage and a Duplicate Drop stage into its own data set,
while rejected rows (invalid RT/ST, failed recordization, failed columnization,
invalid date format, and duplicates) are funneled into an error file.
The IL_010_IS_Import_SIF job reads one or more SIF files through the File Set
facility of DataStage. During this step, the row number and input file name
are captured and appended to the input data before the end-of-record
(DOS Line Feed) characters. This is done to enable error reconciliation back to
the original input files.
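The annotation step can be sketched as follows; the delimiter and field order are illustrative, not the actual SIF layout, and for simplicity the sketch appends the fields after stripping the record terminator rather than before it:

```python
def annotate_sif_rows(lines, filename):
    """Append the source file name and row number to each SIF record
    so that any later error can be reconciled back to the exact line
    of the original input file."""
    annotated = []
    for row_number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        annotated.append(f"{record}|{filename}|{row_number}")
    return annotated
```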
If the file (set) fails this basic recordization, the job fails. However, the process
captures all such errors before failing. The errors are written to a special reject
flat file. When parsing the data in the columns, invalid values in RT/ST and
Admin_ID generate error log rows and are directed to a reject file named
SIF_Import_ERR_MSG.[batchid].txt. Errors detected with columns are called
columnization errors.
Statistics are collected on the number of input errors, the number of rows written
to each RT/ST link, and the number of rows rejected due to recordization and
columnization. Error handling includes recordization and columnization
processing as follows:
During record-level parsing, if a SIF row is not properly formed (record type,
sub-record type, admin_sys_tp_cd, admin_client_id_or_contract_id, data
string, end of record DOS Line Feed character), any additional columns
detected after the final expected column (as defined by the metadata) are
ignored and the following warning message is written to the Director log, as
shown in Example D-2 on page 277.
[“Import consumed only ‘m’ bytes of the record's ‘n’ bytes (no
further warnings will be generated from this partition)”]
Note: Carefully review the Director log output for such warnings, because
they do not appear in the RDP for MDM error logs.
During column parsing, the individual RT/ST record types are processed
through the Column Importer stage after being split. The number of columns
and data types as well as any NOT NULL SIF column restriction is enforced at
this stage. Note that this is not necessarily the same as a NOT NULL
restriction on a column in the target database.
Rejected data is written to an error data set with the record/offset number
within the source file, RT/ST, Admin columns along with an error code values.
One error data set is created for each RT/ST combination. In a later process
(Error Consolidation) all error data sets are consolidated and written to a
sequential file.
Note: A configuration option (Fail entire load if any row fails column
import?) specifies whether the load process should be halted or continued
on occurrence of this error. If this is set to allow the process to continue,
implicitly-related rows may be rejected in the Error Consolidation step.
Each of the 18 output data sets of the SIF parser is processed in a separate
run. Rows pass through a user exit (for custom validations), lookups against
the MDM code and reference data, a Transformer stage, and QualityStage
standardization, producing an output data set of validated rows. Rejected rows
(invalid codes, invalid date bounds, and so forth) are funneled into an error
file. The DS_PROCESSING_DATE parameter supplies the processing date used by
the lookups.
Prior to the first validation stage, you may perform some custom validation or
some additional defaulting or value substitution in the Pre-Code Table Validation
Exit. This occurs after the NOT NULL column rules have been applied. Whether
there was an error in the exit or not, the input row is not dropped at this point. If
an error is detected in the exit, it must set the “Error on Line Flag from User Exit”
column to 1. The row is then passed on to the MDM Code Table lookups. A
discussion of the Pre-Code Table Validation Exit is beyond the scope of this IBM
Redbooks publication.
Note: The key to MDM Code Tables requires the code value, a language
code, and the expiration date. The default language code from the MDM
Configuration Manager and the processing date will be used.
Each of the tests may write an error log row and set the 'Error on Line' flag, but
the data will be passed to the next edit to enable all possible errors to be
captured in a single pass. The exception is an error in metadata validation.
Rows that are to be rejected will generate an entry in the Error Log and
set the “Error on Line” flag to 1. All tests are performed on all rows. Each of the
following example tests (partial list) will place a 1 in an 'Error on Line' column and
generate an individual entry in the Error Log:
“Error on Line from User Exit” contains a 1.
A column is mandatory but not present.
A column is present but must be empty.
A value should exist on an MDM Code Table, and does not.
A parent key should be located, but was not.
A date column contains data in the wrong format.
The Error message log contains the record number of the row in the SIF, a
message number and description of the error including the column referred to,
and optionally a copy of the data or some snippet of the data that is important to
understanding the error.
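The single-pass pattern described above (set the flag, keep going, log every error) can be sketched as follows; the column names, the code-table shape, and the simplified ISO date check are illustrative, not the actual RDP job logic:

```python
import re

def validate_row(row, code_tables):
    """Run every edit on a parsed SIF row in a single pass: each failed
    test adds an error-log entry, but the row is not dropped until all
    tests have run, so all errors are captured at once."""
    errors = []
    if not row.get("name"):
        errors.append("MANDATORY_COLUMN_MISSING: name")
    gender = row.get("gender")
    if gender and gender not in code_tables.get("gender", set()):
        errors.append("CODE_TABLE_MISS: gender=%s" % gender)
    birth = row.get("birth_date")
    if birth and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", birth):
        errors.append("BAD_DATE_FORMAT: birth_date=%s" % birth)
    return {"error_on_line": 1 if errors else 0, "errors": errors}
```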
Special edits are performed for Person Name, Organization Name, and Address
with respect to standardization.
For the Name RT/STs, a configuration option ("Stan_Person_Name" and
"Stan_Org_Name" for Person Names and Organization Names respectively)
is used to specify whether standardization is to be performed by
QualityStage.
The Address RT/ST contains a column (OVERRIDE_IND) to indicate whether
QualityStage should perform standardization on a particular address. If the
value is N, then the normal processing occurs. If the value is Y then
standardization by QualityStage has been overridden and is bypassed.
Note: Phone number standardization does not have its own parameter. It
is driven by Address Standardization. If Address Standardization is on, the
phone number gets standardized as part of the IL_020_ContactMethod
process.
For both Address and Name, a configuration option ("Phonetic Coding Type")
is used to specify whether QualityStage should generate NYSIIS, or Soundex
phonetic values, or none.
Edits are performed to ensure the other data on the SIF is consistent with the
flags/options.
After all tests have been performed, the row is checked to see if the "Error On
Line" flag is a 1. If it is, the row is discarded and the number of rows discarded is
tracked. If it is not, the row is written to the "Valid RT/ST Data Set".
The validated data sets (person, organization name, and so forth) are joined
and checked; errors from each join, including referential integrity (RI)
violations, are funneled into an error file.
Two error logs are produced, one for Contact-related errors and the other for
Contract-related errors.
Error consolidation is the process that collects all errors including those from the
previous steps (Import SIF and Validation and Standardization) and RI validation
in this step and copies the data into two processing streams as follows:
One stream picks up the error severity associated with the errors and only
passes the row forward if the associated severity is greater than or equal to
the configuration option ("Party Drop Severity Level"). The errors kept are
sorted and duplicates removed so that only one row for each source system
key (SSK) is kept.
The other stream contains all previous errors and will be joined with the errors
generated by the following association test to consolidate them into one error
data stream for Contacts and another data stream for Contracts.
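The severity-based filtering of the first stream can be sketched as follows. This is an illustrative sketch following the table definition of DS_PARTY_DROP_SEVERITY_LEVEL (a party is dropped when any of its errors has severity <= the configured level, where 0 is worst and 10 least severe); the error-row shape is hypothetical:

```python
def parties_to_drop(error_rows, drop_severity_level):
    """Return one drop decision per source system key (SSK): a party
    is dropped when any of its errors has severity <= the configured
    'Party Drop Severity Level'. Deduplicates so each SSK appears
    at most once."""
    dropped = set()
    for err in error_rows:
        if err["severity"] <= drop_severity_level:
            dropped.add(err["ssk"])
    return sorted(dropped)
```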
Note: All prior jobs must have completed and will have consolidated their own
errors and written them to their own Error Parallel Data Sets. The Job
Sequencer ensures that this and other dependencies are controlled.
The next process is to drop rows with Association Errors. The process is built
with a configuration option ("Party Drop Severity Level") to specify if the process
should drop all rows for a new parent (Contact or Contract) if any of the data
associated with that parent is in error. Such dropped rows have an “error by
association” entry that is sent to the log.
Error log entries generated in this step are funneled together with all the errors
from the previous steps and a consolidated error file is created.
The validated contact data (with address, contact method, and organization
name data joined in) is split into person and organization streams and passed
to the match process, which outputs A1 matches and A2/B suspects.
If a party is matched to another party and the SIF also contains a Contact
Relationship row for the matched parties, we have a conflict. If the two parties are
the same party (A1) they cannot also have a party-to-party relationship because
there is only one party. In this situation, the match results will be overridden and
the match type is changed to an A2. A suspect row is generated but the parties
are not merged into one party. This processing is done before the generation of
implied matches.
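The conflict rule described above can be sketched as follows; the match and relationship structures are hypothetical, but the rule itself (an A1 pair that also has a Contact Relationship row is downgraded to A2 so a suspect row is generated instead of a merge) follows the text:

```python
def resolve_relationship_conflict(match, contact_relationships):
    """Downgrade an A1 match to A2 when the same pair of parties also
    appears as a party-to-party Contact Relationship in the SIF: one
    party cannot have a relationship with itself, so the records are
    kept separate and flagged as suspects instead of being merged."""
    pair = frozenset((match["party_a"], match["party_b"]))
    if match["category"] == "A1" and pair in contact_relationships:
        return {**match, "category": "A2"}
    return match
```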
Note: A LOBa Relationship SIF row can override the results of the match
process. A configuration option (Allow Match Across LOB) will identify if
matching across lines of business (LOB) is allowed. If configuration options
specify that matching across LOB is not allowed (default is “allowed”), match
pairs identified as A1 are sent through a process to determine if a group of
matched records must be broken into multiple groups based on LOB.
a. Large enterprises often have multiple lines of business (LOB). Privacy laws and company practices
often restrict the sharing of data across lines of business. An example is an insurance company
that has individual life, group life, and property and casualty lines of business. A party may be a
group life and individual life customer only. In this example, the party would have two lines of
business relationships.
The LOB relationship data is used to split the matched person and organization
groups, producing A2, A3, and B suspects where matching across LOB is not
allowed.
Match Execution outputs data where the input rows and candidate rows are
grouped and include a Match Category (A1, A2, and so forth) and Match Score.
If one or more contacts have an A1 match to each other, the data associated
with the contacts are merged at the row level, not at the column level.
Note: From a survivorship perspective, the last row’s values are retained. It
should be noted that the concept of last row is not deterministic because it
depends upon the output of a sort.
If one or more contacts do not have an A1 match, then suspect rows are
created for each match.
The MDM Server model provides the ability to store addresses as a top level
subject domain and then link them to their various usages. Multiple contacts,
contact methods within or across contacts, and contracts can share the same
address. Duplicate addresses may exist within the SIF input or between the SIF
and rows already in the database. Removal of duplicates and altering references
to point to the survivor is achieved using a cryptographic checksum. The
critical address columns (specified by the
DS_MD5_CRITICAL_ADDRESS_COLUMNS parameter) that determine whether
an address is a duplicate are passed into MD5 (Message-Digest Algorithm 5), a
widely used cryptographic hash function with a 128-bit hash value. The
resulting value is stored with the address. This value is also calculated for
inserts and updates received in the SIF. The SIF values can then be used to
check for uniqueness within the SIF and to do a quick join/lookup to the
database to see whether a duplicate already exists.
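The checksum technique can be sketched as follows. The column names and the normalization (trimming and uppercasing before hashing) are illustrative assumptions; only the use of MD5 over the critical address columns comes from the text:

```python
import hashlib

def address_checksum(address, critical_columns):
    """MD5 checksum over the critical address columns (named by the
    DS_MD5_CRITICAL_ADDRESS_COLUMNS parameter in the real jobs).
    Two addresses with the same checksum are treated as duplicates,
    so comparing 128-bit digests replaces comparing every column."""
    key = "|".join((address.get(col) or "").strip().upper()
                   for col in critical_columns)
    return hashlib.md5(key.encode("utf-8")).hexdigest()
```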
With RDP for MDM, you have two choices for loading the MDM data repository,
Bulk Load and Upsert. Bulk Load can only be done natively, because there is
no ODBC support for bulk loading.
Figure 2-10 shows the load flow of the operational tables and the history
tables.
Tip: For superior performance, Bulk Load is the preferred loading choice.
In the Bulk Load flow, Load Ready Data is loaded into the operational tables
and the records are copied into the history tables.
Tip: If you do not want to install and configure the components required for
native access to the database, then Upsert is the preferred loading method
over Bulk Load.
In the Upsert flow, Load Ready Data is loaded into the operational tables and
copied into the history tables; rejected records are written to rejects files.
You may also choose to load the history tables during the load by setting the
LOAD_HISTORY_FLAG configuration parameter to “C” for compound1 history
records (default), “S” for simple2 history records, or none.
1. Includes the history of all the changes that have occurred, plus the original record.
2. Includes only the history of all the changes that have occurred. If a record never changed, it will not
be in the history table.
The RedHat Enterprise Linux 4 platform was chosen as the platform for the DB2
for LUW MDM repository.
The environment consisted of two RedHat Enterprise Linux hosts,
orion.itsosj.sanjose.ibm.com (9.43.86.101) and phoenix.itsosj.sanjose.ibm.com
(9.43.86.102), hosting the legacy systems, the MDM repository, the IADB, and
the DataStage Engine of IBM InfoSphere Information Server, with users and
administrators connecting to both.
Attention: Our objective was to showcase the RDP for MDM implementation
on a RedHat Enterprise Linux platform. For convenience, we chose our data
sources and target MDM repository to be hosted on a single Linux platform,
even though we recognize that in a real world environment these systems
would likely be hosted on an eclectic mix of operating systems, servers, and
database management systems. The configuration we used was only meant
to showcase the functionality of the RDP for MDM solution, and should in no
way be seen as meeting the scalability and performance requirements of
your business solution.
Note: DQA is assumed to have occurred and only the results of this task
are presented here.
2. Review the MDM data model and customize it to the specific requirements of
your organization.
If you customize the data model, the RDP for MDM jobs need to be modified as well.
Note: Customization of the MDM data model is not covered in this IBM
Redbooks publication.
3. Create the code mapping tables from source to SIF, and update the MDM
code tables with domain values if appropriate.
4. Create a canonical form from the various data sources (three sources in this
scenario).
Attention: The canonical form was a concept we invented for this scenario
and is not defined in the RDP for MDM solution.
Note: In our case, we thoroughly cleaned the data prior to creating the SIF
so that no errors occurred.
However, to show the error messages generated by the RDP for MDM
jobs, we created other SIFs containing the most frequently occurring errors
and ran them through the RDP for MDM jobs. The purpose was to show the
correspondence between a particular error and the error messages
generated for it by the RDP for MDM jobs. This is described in Appendix D,
“Error processing” on page 271.
9. Verify the successful loading of the MDM repository using the MDM Server
Reporting facility.
10.Resolve any suspect parties that were not automatically collapsed by the load
jobs but are suspected to be duplicates, using the MDM Server Data
Stewardship UI.
Note: The Delta RDP for MDM solution was not available at the time of
writing of this IBM Redbooks publication. It will be addressed later as an
update to this IBM Redbooks publication or a separate IBM Redpaper.
We also ran the RDP for MDM jobs with SIFs containing commonly encountered
problems to show the correspondence between a specific error condition in the
SIF and the error messages generated by the RDP for MDM jobs. This is
described in Appendix D, “Error processing” on page 271.
Figure 3-2 Rapid MDM approach used in the scenario for the initial load
Briefly, a DQA is performed on the data sources to identify the master data
columns and the domain values in these master data columns for inclusion in the
MDM repository. The MDM data model’s master data columns and
corresponding domain values are reviewed against those of the three data sources.
Based on this review, the MDM Code Reference tables may need to be updated
with additional values, and source-to-SIF code mapping tables generated
between the source master data columns and corresponding MDM master data
columns.
The master data from the data sources is loaded into a canonical form that
closely mirrors the format of the SIF records consumed by the RDP for MDM
jobs. During this process, you need to ensure that all MDM required columns (as
described in Appendix B, “Standard Interface File details” on page 247) have
valid data in them to avoid rejection by the RDP for MDM jobs. It is more efficient
to detect and fix these errors early in the cycle (potentially in the source system
itself) than after the RDP for MDM jobs have flagged them.
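Such an early pre-check can be sketched as follows. The column names below are hypothetical placeholders; the authoritative list of required SIF columns is in Appendix B:

```python
# Hypothetical subset of required SIF columns for a person record.
REQUIRED = ["CUSTOMERID", "SRCSYSTEMID", "LASTNAME"]

def validate_row(row: dict) -> list:
    """Return the list of required columns that are missing or empty."""
    return [c for c in REQUIRED if not str(row.get(c, "")).strip()]

rows = [
    {"CUSTOMERID": "10000024", "SRCSYSTEMID": "1", "LASTNAME": "Jensen"},
    {"CUSTOMERID": "10000025", "SRCSYSTEMID": "1", "LASTNAME": ""},
]
for r in rows:
    missing = validate_row(r)
    if missing:
        # Flag the row before it reaches the RDP for MDM jobs.
        print(r["CUSTOMERID"], "rejected, missing:", missing)
```

Running a check of this kind against the canonical form lets you route bad rows back to the source system owners before the load, rather than mining the reject files afterward.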
The purpose of creating a canonical form is to have a single format for validating
the efficacy of the RDP for MDM rulesets, and for simplifying the DataStage jobs
for creating the SIF, regardless of the number of data sources involved. Typically,
the data used for validating the efficacy of the RDP for MDM rulesets would be a
representative sample of all the data. If the RDP for MDM rulesets are modified
to address your organization’s data, these modified rulesets must replace the
corresponding default ones in the RDP for MDM jobs.
The data in the canonical form is then loaded into the SIF using the
source-to-SIF column mapping templates you have created, and the
source-to-SIF code mapping tables generated earlier.
Important: Before the RDP for MDM jobs can be run, you must drop all
referential integrity constraints and triggers defined in the MDM repository.
The referential constraints and triggers must be recreated before the MDM
repository can be considered operational and consumable by business
applications.
This overall flow is covered in more detail for our particular scenario as follows:
TBank checking, savings, and loans systems
Data Quality Assessment (DQA)
Create a canonical form from the data sources
Validate efficacy of the RDP for MDM rulesets and modify to suit
Create SIF
Execute RDP for MDM jobs
Verify successful load
The DDL of the three tables is shown in Example 3-1 on page 76, while the data
content in each of these tables is shown in Figure 3-3 on page 78 through
Figure 3-11 on page 86. Note that all the columns are defined as being nullable
with no Primary Key defined. In a real-world environment, you would most likely
have a Primary Key defined for each table.
The master data columns in each table are highlighted in bold in Example 3-1 on
page 76.
For example, the customer Anton T & Larue Jensen appears as 10000024 in the CHECKING system, 20000024 in SAVINGS, and 30000019 in LOAN.
Because the business is the ultimate recipient and user of the data resulting from
the integration effort, the success of a DQA is dependent upon the ability and
commitment of the business community to participate in the process, and more
importantly, to resolve semantic and business rule differences at the functional
level. Figure 3-12 on page 89 provides a high-level overview of the main steps of
the DQA process.
Prepare the data for assessment
Select the data sources to be investigated and analyzed.
Conduct data discovery
The DA and SME perform the investigation and analyses using tools such as
IBM WebSphere Information Analyzer and IBM InfoSphere AuditStage. This
involves checking metadata integrity, structural integrity, entity integrity,
relational integrity, and domain integrity.
Document data quality issues and decisions
After all information about data quality is known, the appropriate data
alignment and cleansing decisions can be made and implemented.
Figure 3-12 depicts the IT data analyst and SME using Information Analyzer and
AuditStage to run full-volume profiling of the staged source(s), review all the
targeted information, and reach data alignment decisions. The checks cover:
Metadata/domain integrity: column analysis, completeness, consistency, and pattern consistency, including translation table creation
Domain integrity: business rule identification and validation
Structural integrity: table analysis and key analysis
Entity integrity: duplicate analysis and targeted data accuracy
Relational integrity: cross-table analysis and redundancy analysis
The IBM InfoSphere Information Server product provides three tools for data
assessment:
IBM InfoSphere Information Analyzer
This product enables you to assess large volumes of data in a fraction of the
time that manual analysis would require. Through its Column Analysis, Primary
Key Analysis and Cross-Table Analysis functions, IBM InfoSphere Information
Analyzer enables systematic analysis and reporting of results, allowing the
data analyst and subject matter expert to focus on the real problem of data
quality issues.
IBM InfoSphere QualityStage
This product complements IBM InfoSphere Information Analyzer by
investigating free-form text fields such as names, addresses, and
descriptions. IBM InfoSphere QualityStage allows you to define rules for
standardizing free-form text domains, which is essential for effective
probabilistic matching of potentially duplicate master data records. This level
of sophisticated data assessment is critical to understanding the total quality of your data.
In this scenario, we focus on determining the domain values in the columns in the
source systems that need to be mapped to the corresponding columns in the
MDM repository. The determination of the domain values in the source systems
may necessitate adding new domain values to the MDM repository to
accommodate values that exist in the source systems.
Note: As part of the implementation preparation, the MDM code tables must
be populated with appropriate values. The steps in the MDM implementation
process to determine what these values should be, and how they are loaded,
are not within the scope of this IBM Redbooks publication.
For example, if the source system rates a customer into five categories (one
through five) and the MDM repository only allows four categories, you will need
to add an additional category to the MDM repository code reference table for
customer rating. Also, because the SIF must be loaded with domain values
expected by the MDM repository, the process creating the SIF must map the
values in the source systems to the values in the MDM repository. Mapping
tables are required for each code reference table in the MDM repository. For
example, gender may be stored as 0 (female) and 1 (male) in the source
systems, while the MDM repository expects M (male) and F (female). This
requires a mapping table for gender that maps 0 to F and 1 to M.
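A source-to-SIF code mapping of this kind can be sketched as follows. The mapping contents come from the scenario, but the lookup function itself is a hypothetical illustration, not the generated DataStage logic:

```python
# Hypothetical in-memory code mapping tables, one per MDM code column.
GENDER_MAP = {"0": "F", "1": "M"}                  # SAVINGS 0/1 -> MDM F/M
CUSTSTATUS_MAP = {"A": "1", "B": "2", "C": "3", "D": "4"}

def map_code(value, mapping):
    """Map a source domain value to the MDM domain value. Nulls pass
    through; unmapped values are surfaced rather than loaded as bad codes."""
    if value is None:
        return None
    if value not in mapping:
        raise ValueError(f"unmapped source value: {value!r}")
    return mapping[value]

print(map_code("0", GENDER_MAP))       # F
print(map_code("A", CUSTSTATUS_MAP))   # 1
```

Raising on an unmapped value is a deliberate choice here: it forces the gap to be resolved in the code reference tables instead of silently loading an invalid domain value into the SIF.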
Table 3-2 shows the columns that need to be mapped between the sources and
the target MDM repository. This list was arrived at after an analysis of the code
reference tables in the MDM repository and the ones in the source systems.
Table 3-2 Code table mapping between the sources and the MDM repository
Country: COUNTRY in CHECKING (US, (null), and other country codes), COUNTRY in SAVINGS (US, (null)), and COUNTRY in LOAN (US, (null)) map to COUNTRY_TP_CD in CDCOUNTRYTP
Gender: GENDER in SAVINGS (1, 0, (null)) and GENDER in LOAN (M, F, (null))
Attention: If the MDM repository is populated from the same column (such as
GENDER) in multiple data sources, it is possible that there could be
overlapping values that have different semantic meanings. For example, in
one system, the value 0 could represent a female, while 0 in another system
represents a male. When creating the canonical form, semantic conflicts must
be resolved before populating the column. This situation did not exist in our
scenario.
Figure 3-17 on page 96 shows the mapping between the master data columns in
the source systems to the corresponding columns in the canonical form table
(Example 3-2 on page 95).
There are two columns (SRCSYSTEMID and ZIPCODE) that do not have
corresponding columns in the source. The SRCSYSTEMID column is
generated based on the source system columns being mapped (1 for
Checking, 2 for Savings, and 3 for Loan), while the zip code is embedded in
other columns in the source systems and therefore not explicitly mapped.
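The derivation of these two columns can be sketched as follows. The extraction of a US zip code by regular expression is our simplifying assumption for illustration, not the actual job logic:

```python
import re

# SRCSYSTEMID is derived from which source table the row came from.
SRC_IDS = {"CHECKING": "1", "SAVINGS": "2", "LOAN": "3"}

def to_canonical(source: str, address: str) -> dict:
    # The zip code is not a separate source column, so we pull a
    # five-digit US zip out of the address text (a simplifying assumption).
    m = re.search(r"\b(\d{5})(?:-\d{4})?\b", address)
    return {"SRCSYSTEMID": SRC_IDS[source],
            "ZIPCODE": m.group(1) if m else None}

print(to_canonical("SAVINGS", "555 Bailey Ave, San Jose, CA 95141"))
# {'SRCSYSTEMID': '2', 'ZIPCODE': '95141'}
```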
In Figure 3-17, the master data columns of the CHECKING, SAVINGS, and
LOAN source systems (columns such as CUSTOMERID, ACCOUNTID, NAME,
ADDRESS, STREET, CITY, COUNTRY, PHONE, CELLPHONE, EMAIL, SSN,
PASSPORTNB, GENDER, DOB, and DOD) are mapped to the corresponding
canonical form columns (SRCSYSTEMID, CUSTOMERID, ACCOUNTID,
FIRSTNAME, LASTNAME, INITIALS, FREEFORMNAME, STREETADDRESS,
FREEFORMADDRESS, CITY, COUNTRY, ZIPCODE, PHONENB, CELLNB,
EMAIL, SSN, PASSPORTNB, DRIVERLICNB, AGEVERIFICATIONNB, DOB,
DOD, GENDER, MARITALSTATUS, NATIONALITY, CUSTOMERSTATUS,
CUSTOMERPERF, WORKSTATUS, PREF_LANGUAGE, SALUTATION,
SOLICITATIONALLOW, and STARTDATE).
SRCSYSTEMID is assigned a value of 1 (checking), 2 (savings) or 3 (loans) depending upon the source
ZIPCODE has no assignment from any of the input sources
Note: We assume that the Data Quality Assessment (DQA) has taken place
previously, and the required ODBC data sources (see Figure 3-18 on page 98,
which includes the definition of the TBANK and IADB data sources) have been
defined for both the sources and target systems. All the data sources that
were imported using the InfoSphere Information Server console are also available
to FastTrack users. The metadata acquired from these data sources is used to
identify the target columns and tables in FastTrack, and to configure ODBC
connectivity in the generated DataStage jobs.
1 Template jobs for more complex requirements
Figure 3-19 on page 101 through Figure 3-31 on page 112 describe the main
screenshots in creating a specification that maps the SAVINGS source columns
to the corresponding CANONICAL target columns using FastTrack, and the
generation and configuration of the DataStage job for that specification.
Note: The mapping is repeated for the CHECKING and LOAN sources as
well, but that is not repeated here.
To define the sources to canonical form table target mapping, perform the
following steps:
1. Log in to the appropriate server (virgo) with the user ID isadmin, who is
assumed to have the required permissions to access InfoSphere Information
Server, as shown in Figure 3-19 on page 101.
2. FastTrack source-to-target mapping specifications are contained in projects.
We opened a previously created project named SourceToSif_Canonical for
our mapping specification, as shown in Figure 3-20 on page 102.
Note: The mapping specification is complete when all the columns have
been mapped correctly.
Important: The generated job shown in Figure 3-31 on page 112 is the job
corresponding to CHECKING_TO_CANONICAL instead of
SAVINGS_TO_CANONICAL. This was an error on our part while capturing
screenshots.
Figure 3-32 on page 114 through Figure 3-35 on page 115 show the main
screenshots of the execution of the generated job. After all the sources were
processed, the partial contents of the canonical form table are shown in
Example 3-3 on page 116.
3.5.4 Validate efficacy of the RDP for MDM rulesets & modify to suit
The purpose of creating a canonical form is to have a single format for validating
the efficacy of the RDP for MDM rulesets, and for simplifying the DataStage jobs
for creating the SIF, regardless of the number of data sources involved. The data
used for validating the efficacy of the RDP for MDM rulesets should be a
representative sample of all the data.
Note: In our test environment, the volume of data was quite small. We
therefore chose to use all of it as input to this process.
If the RDP for MDM rulesets are modified to address your organization’s data,
then these modified rulesets must replace the corresponding default ones in the
RDP for MDM jobs.
Figure 3-36 on page 119 through Figure 3-42 on page 123 show some of the
main screenshots that describe the import process. To import OOTB RDP for
MDM rulesets into a DataStage project, perform the following steps:
1. Launch the WebSphere DataStage and QualityStage Designer. From the task
bar, navigate to Import → DataStage Components, as shown in Figure 3-36
on page 119.
2. In the DataStage Repository Import window (Figure 3-37 on page 119),
specify the RDP for MDM jobs dsx file. Select the Import selected radio
button to select the components to import. Click OK.
3. Figure 3-38 on page 120 through Figure 3-40 on page 122 show the available
components. Because we were only interested in the components related to
name and address standardization, we only selected them (four shared
containers and all the rulesets) and clicked OK, as shown in Figure 3-40 on
page 122. The progress of the import of the selected components is shown in
Figure 3-41 on page 122.
At the completion of the import, you see the imported components in the
ValidationStanContainers in the navigation pane in Figure 3-42 on page 123.
You can now proceed to validate the efficacy of the OOTB RDP for MDM rulesets
in the standardization job in “Validating the RDP for MDM rulesets on the
standardization job” on page 123.
Figure 3-37 Import OOTB RDP for MDM rulesets into standardization job,
part 2 of 7
Figure 3-41 Import OOTB RDP for MDM rulesets into standardization job,
part 6 of 7
Our objective was to ensure that most, if not all of the input data was loaded
by RDP for MDM into the MDM data repository. Towards this end, we focused
on ensuring that the critical columns (CITY in this case) had the necessary
information. This led us to work on the USPREP ruleset. (The USPREP
ruleset is not supplied with the RDP for MDM jobs. It is a part of the standard
QualityStage ruleset.) Due to time constraints, we did not modify
the other rulesets to enhance the quality of the standardization performed by
the OOTB RDP for MDM rulesets. However, given our recommendation to
work with all the OOTB RDP for MDM rulesets, we demonstrate here the
process of validating the efficacy of the OOTB RDP for MDM rulesets and
modifying them if necessary for subsequent replacement of the original OOTB
rulesets.
Figure 3-43 on page 127 through Figure 3-59 on page 138 show some of the
main screenshots that describe the validation process. Perform the following
steps to validate RDP for MDM rulesets on the standardization job:
1. Launch the WebSphere DataStage and QualityStage Designer and display
the VSSTANAddress shared container on the Designer canvas, as shown in
Figure 3-43 on page 127 through Figure 3-45 on page 129. VSSTANAddress
is the shared container that contains the USPREP stage to be validated. This
stage processes address data from one or more source columns and moves it
into appropriate domain columns. Because we were only interested in the
address fields, our focus was on reviewing the street address in the
AddressDomain_USPREP column and the city name, state, and zip code in
the AreaDomain_USPREP column.
2. The USPREP stage was inspected to see which rulesets were used and how
they were used. We did not modify it; we reviewed it in order to generate
a corresponding standardization job (J02_ORGUSPREP_STAN) to test the
OOTB RDP for MDM rulesets.
1 FREEFORMADDRESS is really the column we wanted to target, but we knew that either
FREEFORMADDRESS or STREETADDRESS contained the data we needed. Therefore, this
setup ensures that it generates one address to be standardized for each row.
Figure 3-46 Validate RDP for MDM ruleset on standardization job, part 4 of 17
Figure 3-48 Validate RDP for MDM ruleset on standardization job, part 6 of 17
Figure 3-50 Validate RDP for MDM ruleset on standardization job, part 8 of 17
Figure 3-55 Validate RDP for MDM ruleset on standardization job, part 13 of 17
Figure 3-57 Validate RDP for MDM ruleset on standardization job, part 15 of 17
1 The USPREP ruleset is modified rather than ORGUSPREP, because that is the ruleset in the RDP
for MDM jobs which would need to be replaced.
Note: The override codes A (for ADDRESS) circled in the Enter Input
Pattern text field correspond to the literal ZQADDRZQ explained earlier.
The boxed values correspond to the characters overridden with A in the
Override Code column of the Current Pattern List.
3. The modified rulesets are provisioned as shown in Figure 3-63 on page 143.
You need to provision new, copied, or customized rule sets in the Designer
client before you can compile and run a job that uses them.
4. A copy of the J02_ORGUSPREP_STAN job is created as
J12_USPREP_STAN using a stage named USPREP, as shown in Figure 3-64
on page 143.
The USPREP stage is modified to refer to the canonical form data columns
STREETADDRESS and FREEFORMADDRESS, as shown in Figure 3-65 on
page 144.
Figure 3-66 on page 144 shows the execution of this job.
5. Figure 3-67 on page 145 shows the results of processing by the modified
USPREP ruleset, which shows the AreaDomain_USPREP column populated
with the city name for the relevant rows. This indicates successful input
pattern overrides.
6. Figure 3-68 on page 146 through Figure 3-70 on page 147 show the
successful export of the modified USPREP ruleset as a dsx file
(USPREPCHANGED.dsx).
We proceeded to import the modified USPREP ruleset into the RDP for MDM
jobs as described in “Import modified RDP for MDM rulesets into RDP for MDM
project” on page 147.
Figure 3-64 Override Input Pattern & rerun the standardization job, part 5 of 11
Figure 3-66 Override Input Pattern & rerun the standardization job, part 7 of 11
Figure 3-70 Override Input Pattern & rerun the standardization job, part 11 of 11
Import modified RDP for MDM rulesets into RDP for MDM
project
Figure 3-71 on page 149 through Figure 3-75 on page 151 describe some of the
main screenshots involved in importing the modified RDP for MDM rulesets in
“Override Input Pattern, rerun the standardization job, and export modified
ruleset” on page 138 into the RDP for MDM project using WebSphere DataStage
Designer.
Note: After provisioning, the job using the modified rulesets must be
recompiled.
With the creation of the SIF (that proceeded in parallel and is described in 3.5.5,
“Create SIF” on page 151), we proceeded to execute the RDP for MDM jobs as
described in 3.5.6, “Execute RDP for MDM jobs” on page 175.
Figure 3-73 Import modified RDP for MDM rulesets into RDP for MDM jobs,
part 3 of 5
Figure 3-74 Import modified RDP for MDM rulesets into RDP for MDM jobs,
part 4 of 5
Because we had used Information Analyzer in the DQA, we briefly describe the
process of creating one reference table to serve as a lookup for mapping values
in the canonical form table to code values stored in the MDM code tables.
Figure 3-76 on page 154 through Figure 3-81 on page 159 describe the main
screenshots for creating a single reference table. It involves determining the code
values in the appropriate MDM code table and then creating the reference table
in Information Analyzer using these values as follows:
1. Figure 3-76 on page 154 shows the navigation pane in the MDM Server UI.
Navigate to Administration Console → Navigation tree → Code Tables. In
the content pane, select a code table of interest (CdAdminSysTp) from the
drop-down list and click GO.
2. Figure 3-77 on page 155 shows the list of valid values in this table. We added
code values for the Checking (1000000), Savings (1000001), and Loan
(1000002) systems through this GUI.
Note: Repeat this process for all the code tables of interest in the MDM data
repository for which reference tables need to be created.
Note: Repeat this process for all the columns in the canonical form table
that have code values.
Table 3-3 on page 160 summarizes the code table value mappings between the
source and the target MDM data repository for our scenario.
CUSTOMERSTATUS maps to CLIENT_ST_TP_CD in CDCLIENTSTTP: A to 1, B to 2, C to 3, D to 4, and (null).
Generate the SIF from the data stored in the canonical form
data table
We used FastTrack Version 8.0.1 to define the mapping between the columns in
the canonical form data table and the SIF, and generated a DataStage job to load
the SIF tables. A subsequent DataStage job extracted the data from the SIF
tables and created the SIF file for processing by the RDP for MDM jobs.
Because the FastTrack process was similar to the one described in the creation
of the canonical form data table, it is not repeated here.
Figure 3-83 on page 165 through Figure 3-88 on page 170 show the main
screenshots of the lookup of the reference tables used to transform the
code table values.
Table 3-4 on page 163 shows the mapping between the columns in the canonical
form data table to the corresponding SIF columns.
Figure 3-89 on page 171 shows the execution of the job jpGenerateOutputSIF
that extracts the contents of the SIF tables and generates the SIF file with
pipe (|) delimiters between the columns.
Example 3-4 on page 171 shows the partial contents of the SIF file generated by
this process corresponding to the canonical form data.
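Writing pipe-delimited SIF records of this kind can be sketched as follows. The row contents are hypothetical, and the actual file is produced by the jpGenerateOutputSIF DataStage job:

```python
import csv, io

# Hypothetical SIF rows: record type, source key, source system, data columns.
rows = [
    ["PH", "10000024", "1", "Anton T", "Jensen"],
    ["PA", "10000024", "1", "555 Bailey Ave", "San Jose"],
]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="|", lineterminator="\n")
writer.writerows(rows)
print(buf.getvalue(), end="")
# PH|10000024|1|Anton T|Jensen
# PA|10000024|1|555 Bailey Ave|San Jose
```

In a real extract, `buf` would be a file handle opened against the SIF output directory, and nullable columns would be emitted as empty fields between consecutive delimiters.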
1 These tables map one-to-one with the RT/ST combinations described in Appendix B.1, “SIF details”
on page 248. The DDL for these tables can be downloaded from the IBM Redbooks publications Web
page:
http://www.redbooks.ibm.com/redpieces/abstracts/sg247704.html
CANONICAL_TBL.FREEFORMADDRESS,CANONICAL_TBL.STREETADDRESS PA ADDRESS.ADDR_LINE_ONE
CANONICAL_TBL.CUSTOMERID ADDRESS.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE ADDRESS.ADMIN_SYS_TP_CD
CANONICAL_TBL.CITY ADDRESS.CITY_NAME
CTS_LKP_COUNTRY.TRANSFORMVALUE ADDRESS.COUNTRY_TP_CD
CANONICAL_TBL.ZIPCODE ADDRESS.POSTAL_CODE
CANONICAL_TBL.CUSTOMERID PP CONTACT.ADMIN_CLIENT_ID
CTS_LKP_SRCSYSTEM.TRANSFORMVALUE CONTACT.ADMIN_SYS_TP_CD
CTS_LKP_AGEVERDOC.TRANSFORMVALUE CONTACT.AGE_VER_DOC_TP_CD
CANONICAL_TBL.DOB CONTACT.BIRTH_DT
CTS_LKP_NATIONALITY.TRANSFORMVALUE CONTACT.CITIZENSHIP_TP_CD
CTS_LKP_CUSTPERF.TRANSFORMVALUE CONTACT.CLIENT_IMP_TP_CD
CTS_LKP_CUSTSTATUS.TRANSFORMVALUE CONTACT.CLIENT_ST_TP_CD
CANONICAL_TBL.DOD CONTACT.DECEASED_DT
CTS_LKP_GENDER.TRANSFORMVALUE CONTACT.GENDER_TP_CODE
CTS_LKP_MARITALST.TRANSFORMVALUE CONTACT.MARITAL_ST_TP_CD
CTS_LKP_PREFLANG.TRANSFORMVALUE CONTACT.PREF_LANG_TP_CD
CANONICAL_TBL.CUSTOMERID PC CONTACTMETHOD.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTACTMETHOD.ADMIN_SYS_TP_CD
CANONICAL_TBL.CELLNB (when it is NOT NULL) CONTACTMETHOD.REF_NUM
CANONICAL_TBL.CUSTOMERID PC CONTACTMETHOD.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTACTMETHOD.ADMIN_SYS_TP_CD
CANONICAL_TBL.EMAIL (when it is NOT NULL) CONTACTMETHOD.REF_NUM
CANONICAL_TBL.CUSTOMERID PC CONTACTMETHOD.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTACTMETHOD.ADMIN_SYS_TP_CD
CANONICAL_TBL.PHONENB (when it is NOT NULL) CONTACTMETHOD.REF_NUM
CANONICAL_TBL.ACCOUNTID CH CONTRACT.ADMIN_CONTRACT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTRACT.ADMIN_SYS_TP_CD
CANONICAL_TBL.ACCOUNTID CC CONTRACTCOMPONENT.ADMIN_CONTRACT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTRACTCOMPONENT.ADMIN_SYS_TP_CD
CTS_LKP_PRODTP.TRANSFORMVALUE CONTRACTCOMPONENT.PROD_TP_CD
CANONICAL_TBL.CUSTOMERID CR CONTRACTROLE.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTRACTROLE.ADMIN_CLIENT_SYS_TP_CD
CANONICAL_TBL.ACCOUNTID CONTRACTROLE.ADMIN_CONTRACT_ID
CTS_LKP_SRCID.TRANSFORMVALUE CONTRACTROLE.ADMIN_SYS_TP_CD
CTS_LKP_PRODTP.TRANSFORMVALUE CONTRACTROLE.PROD_TP_CD
REGIONS_STAGING.REGION_ID HN HIERARCHY_NODE.ADMIN_CLIENT_ID
REGIONS_STAGING.REGION_DESCRIPTION HIERARCHY_NODE.DESCRIPTION
REGIONS_STAGING.REGION_ID HN HIERARCHY_REL.ADMIN_CLIENT_ID_CHILD
REGIONS_STAGING.PARENT_REGION_ID HIERARCHY_REL.ADMIN_CLIENT_ID_PARENT
(when PARENT_REGION_ID IS NOT NULL)
REGIONS_STAGING.REGION_ID HN HIERARCHY_UP.ADMIN_CLIENT_ID
REGIONS_STAGING.REGION_DESCRIPTION HIERARCHY_UP.DESCRIPTION
(when PARENT_REGION_ID is null)
CANONICAL_TBL.CUSTOMERID PI IDENTIFIER.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE IDENTIFIER.ADMIN_SYS_TP_CD
CANONICAL_TBL.DRIVERLICNB (when it is NOT NULL) IDENTIFIER.REF_NUM
CANONICAL_TBL.CUSTOMERID PI IDENTIFIER.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE IDENTIFIER.ADMIN_SYS_TP_CD
CANONICAL_TBL.PASSPORTNB (when it is NOT NULL) IDENTIFIER.REF_NUM
CANONICAL_TBL.CUSTOMERID PI IDENTIFIER.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE IDENTIFIER.ADMIN_SYS_TP_CD
CANONICAL_TBL.SSN (when it is NOT NULL) IDENTIFIER.REF_NUM
CANONICAL_TBL.CUSTOMERID PH PERSONNAME.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE PERSONNAME.ADMIN_SYS_TP_CD
CANONICAL_TBL.FREEFORMNAME PERSONNAME.FREE_FORM_NAME
CANONICAL_TBL.FIRSTNAME,CANONICAL_TBL.FREEFORMNAME PERSONNAME.GIVEN_NAME_ONE
CANONICAL_TBL.FREEFORMNAME,CANONICAL_TBL.LASTNAME PERSONNAME.LAST_NAME
CTS_LKP_SALUTATION.TRANSFORMVALUE PERSONNAME.PREFIX_NAME_TP_CD
CANONICAL_TBL.CUSTOMERID CL ROLELOCATION.ADMIN_CLIENT_ID
CTS_LKP_SRCID.TRANSFORMVALUE ROLELOCATION.ADMIN_SYS_TP_CD
CANONICAL_TBL.ACCOUNTID ROLELOCATION.ADMIN_CONTRACT_ID
CTS_LKP_SRCID.TRANSFORMVALUE ROLELOCATION.ADMIN_CLIENT_SYS_TP_CD
CTS_LKP_PRODTP.TRANSFORMVALUE ROLELOCATION.PROD_TP_CD
Deactivate RI constraints
The following command was used to deactivate the RI constraints:
db2 -tsvf Deactivate_FK.sql
DB_INSTANCE db2inst1
DB_SCHEMA DB2INST1.a
DB_USERID db2inst1
$APT_IMPORT_PATTERN_USES_FILESET_MOUNTED True
$APT_STRING_PADCHAR (blank)
DS_PARALLEL_APT_CONFIG_FILE /opt/IBM/InformationServer/Server/Configuratio
ns/MDM_Default.apt
DS_SEQUENTIAL_APT_CONFIG_FILE /opt/IBM/InformationServer/Server/Configuratio
ns/MDM_1X1.apt
DS_LANGUAGE_TYPE_CODE 100
FS_DATA_SET_HEADER_DIR /data/RDP/DATA/
FS_ERROR_DIR /data/RDP/ERROR/
FS_LOG_DIR /data/RDP/LOG/
FS_PARAM_SET_DIR ./ParameterSets/
FS_REJECT_DIR /data/RDP/REJECT/
FS_SK_FILE_DIR /data/RDP/SK/
FS_TMP_DIR /data/RDP/TMP/
FS_HIERARCHY_SIF_FILE_PATTERN /data/RDP/SIF_IN/canonical_1/*.hsif
FS_SIF_FILE_PATTERN /data/RDP/SIF_IN/canonical_1/*.sif
$APT_IMPORT_PATTERN_USES_FILESET True
$APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS True
$APT_SORT_INSERTION_OPTIMIZATION True
QS_MATCH_PERSON_NATIDc I1
QS_PERFORM_ORG_MATCHd 1
QS_PERFORM_PERSON_MATCHe 1
QS_STAN_ADDRESSf 1
QS_STAN_ORG_NAMEg 1
QS_STAN_PERSON_NAMEh 1
a. Period is required
b. Default is the passport number — this should be I2 which is the Corporate Tax Identification
c. The default setting of C2 equates to Business Phone Number, which is not a reasonable national id document. We therefore changed it to I1, which
is SSN
d. We chose to perform Org match
e. We chose to perform Person match
f. We chose to perform standardization on address
g. We chose to perform standardization on OrgName
h. We chose to perform standardization on PersonName
DS_USE_NATIVE_KEY 1
SK_MID_ALERT_ID_NEXT_VAL 1
SK_MID_CONT_EQUIV_ID_NEXT_VAL 1
SK_MID_CONT_ID_NEXT_VAL 1
SK_MID_CONT_REL_ID_NEXT_VAL 1
SK_MID_CONTACT_METHOD_ID_NEXT_VAL 1
SK_MID_CONTR_COMP_VAL_ID_NEXT_VAL 1
SK_MID_CONTR_COMPONENT_ID_NEXT_VAL 1
SK_MID_CONTRACT_ID_NEXT_VAL 1
SK_MID_HIER_ULT_PAR_ID_NEXT_VAL 1
SK_MID_HIERARCHY_ID_NEXT_VAL 1
SK_MID_HIERARCHY_NODE_ID_NEXT_VAL 1
SK_MID_HIERARCHY_REL_ID_NEXT_VAL 1
SK_MID_IDENTIFIER_ID_NEXT_VAL 1
SK_MID_LOB_REL_ID_NEXT_VAL 1
SK_MID_LOCATION_GROUP_ID_NEXT_VAL 1
SK_MID_MISCVALUE_ID_NEXT_VAL 1
SK_MID_NATIVE_KEY_ID_NEXT_VAL 1
SK_MID_ORG_NAME_ID_NEXT_VAL 1
SK_MID_PERSON_NAME_ID_NEXT_VAL 1
SK_MID_PERSON_SEARCH_ID_NEXT_VAL 1
SK_MID_PPREF_ID_NEXT_VAL 1
SK_MID_ROLE_LOCATION_ID_NEXT_VAL 1
SK_MID_SUSPECT_ID_NEXT_VAL 1
SK_PREFIX_CONT_ID_NEXT_VAL 1
SK_PREFIX_CONTRACT_ID_NEXT_VAL 1
SK_PREFIX_HIERARCHY_ID_NEXT_VAL 1
QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON (blank)
QS_MATCH_ORG_1b I2
QS_MATCH_ORG_2 (blank)
QS_MATCH_ORG_3 (blank)
QS_MATCH_ORG_4 (blank)
QS_MATCH_PERSON_1 C1
QS_MATCH_PERSON_2 C3
QS_MATCH_PERSON_3 C5
QS_MATCH_PERSON_4 C7
QS_PHONETIC_CODING_TYPE_ADDRESS QSNYSIIS
QS_PHONETIC_CODING_TYPE_ORGANIZATION QSNYSIIS
QS_PHONETIC_CODING_TYPE_PERSON QSNYSIIS
QS_REJECT_ADDRESS_IF_NOT_STANDARDIZED 0
QS_REJECT_ORG_NAME_IF_NOT_STANDARDIZED 0
QS_REJECT_PERSON_NAME_IF_NOT_STANDARDIZED 0
DS_PARTY_DROP_SEVERITY_LEVELc 0
DS_EMAIL_ERROR_CHECK_REPORT 1
Abort handling DS_DROP_MAX_ITERATIONS 10
DS_FAILED_COLUMNIZATION_ACTIONd C
DS_FAILED_RECORDIZATION_ACTIONe C
DS_SIF_ERROR_THRESHOLD 120
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD 50
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT 12
a. We chose to adopt the time stamp format for finer granularity information
b. We did not have organizations in our input data. But if we had, then the default value of C1 (which is SSN) is not appropriate
c. Defines the severity level below which parties are dropped — we chose the least sensitive setting
d. Defines what you want to do with a parsing failure — we chose C(ontinue)
e. Defines what you want to do with a parsing failure — we chose C(ontinue)
The Job Run Options are shown in Figure 3-91 through Figure 3-95 on
page 186.
The successful completion of the job is shown in Figure 3-96 on page 187 with
an elapsed time of 7 minutes and 2 seconds.
We then proceeded to verify the successful load of the MDM data repository as
described in 3.5.7, “Verify successful load” on page 188.
Note: If the RDP for MDM jobs have successfully validated all the rows, then
no errors should be highlighted by this process. If RI constraints are found to
be violated, then an error (SQLSTATE 23512) is raised and the table is put into
a check pending state. You will then have to resolve these errors before
proceeding further.
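DB2 for LUW flags such tables with STATUS = 'C' in the SYSCAT.TABLES catalog view, and the SET INTEGRITY statement revalidates the constraints and clears the state. The helper below is an illustrative sketch of how the pending tables could be located and the corrective statements generated; the class and method names are ours, and it assumes an existing JDBC connection to the MDM database:

```java
import java.sql.*;
import java.util.*;

public class CheckPendingHelper {
    // DB2 for LUW marks set-integrity-pending (check pending) tables with
    // STATUS = 'C' in the SYSCAT.TABLES catalog view.
    static final String FIND_PENDING =
        "SELECT TABSCHEMA, TABNAME FROM SYSCAT.TABLES WHERE STATUS = 'C'";

    // Build the statement that validates RI constraints and takes the
    // table out of the check pending state. TABSCHEMA/TABNAME are CHAR
    // columns in the catalog, so trailing blanks are trimmed.
    public static String setIntegrityStatement(String schema, String table) {
        return "SET INTEGRITY FOR " + schema.trim() + "." + table.trim()
             + " IMMEDIATE CHECKED";
    }

    // List the corrective statements for all pending tables over an
    // existing JDBC connection (not executed in this sketch).
    public static List<String> findPendingStatements(Connection con)
            throws SQLException {
        List<String> pending = new ArrayList<>();
        try (Statement s = con.createStatement();
             ResultSet rs = s.executeQuery(FIND_PENDING)) {
            while (rs.next()) {
                pending.add(setIntegrityStatement(rs.getString(1),
                                                  rs.getString(2)));
            }
        }
        return pending;
    }
}
```

Running the generated SET INTEGRITY statements (for example, through the DB2 command line) resolves the RI errors before proceeding further.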
Figure 3-97 on page 192 through Figure 3-101 on page 196 show the search
and successful retrieval of information relating to a customer whose given name
is “Torben”.
Note: The information associated with Torben Andersom has been merged
from different recordsa in the input because the RDP for MDM jobs were
able to automatically match (“A1”) the Torben Andersom records in the
CHECKING, SAVINGS, and LOAN systems.
a. Passport Number is from the LOAN system, while Social Security Number is
from the CHECKING/SAVINGS systems.
Note: You should review the master data of other important customers as
well. Once the information retrieved is deemed to be accurate, you can
conclude that the load by the RDP for MDM jobs was successful.
The process for manually reviewing these suspects and resolving duplicates is
called suspect resolution. The MDM Server UI provides the capability to find the
identified suspects and resolve (and mark) them as duplicates or not, as shown
in Figure 3-102 on page 199 through Figure 3-110 on page 207. It involves
searching for suspects, reviewing their details, and collapsing them into a single
record and choosing the column values to store in the collapsed record.
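The column-survivorship aspect of a collapse can be illustrated with a small sketch. The class, method, and column values below are hypothetical; in practice the data steward chooses the surviving value for each column in the MDM Server UI:

```java
import java.util.*;

public class CollapseSketch {
    // Collapse two suspect records into one survivor. For each column, the
    // preferred record's value wins when present; otherwise the other
    // record's value is kept. (Illustrative only; not MDM Server code.)
    public static Map<String, String> collapse(Map<String, String> preferred,
                                               Map<String, String> other) {
        Map<String, String> survivor = new LinkedHashMap<>(other);
        for (Map.Entry<String, String> e : preferred.entrySet()) {
            String v = e.getValue();
            if (v != null && !v.isEmpty()) {
                survivor.put(e.getKey(), v); // preferred value wins when present
            }
        }
        return survivor;
    }
}
```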
Note: Using the realtime services of MDM Server ensures that resolved
suspects are not raised as possible duplicates again.
Note: Each time parties are collapsed, the MDM Server will invoke suspect
processing for a newly created party to identify any new potential suspects. If
you have modified the Standardization and Matching Quality Stage rules in the
RDP for MDM initial load, the same rules must be deployed in MDM Server
runtime to ensure identical business logic between runtime services requests
and load. For more information about integrating runtime Quality Stage rules
with MDM Server see InfoSphere MDM Server Developer Guide. This guide is
part of the documentation available to you after you have installed the MDM
Server.
You should repeat this process for all persons that have suspects associated with
them.
You may now integrate the realtime services of MDM Server into your existing
applications as described earlier.
Figure 3-111 on page 210 illustrates the concept. It shows the following
information:
3 hierarchies
– National
– Western Region
– Eastern Region
6 parties
– Austin
– Bill
– Charles
– David
– Estelle
– Frank
1. Each node must reference a valid Hierarchy using the Hierarchy Name (such as Legal, Marketing, and Finance) and TypeCode (1, 2, and 3).
2. The RDP for MDM data model supports only a party/contact hierarchy, even though MDM Server supports product hierarchies as well.
Note: Each party has 2 corresponding hierarchy nodes associated with it.
Note: Business rules have been defined to ensure that a cyclic graph does not
occur.
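As an illustration of what such a rule enforces (this is not the MDM Server implementation), acyclicity can be checked with a depth-first traversal of the parent-to-children hierarchy relationships:

```java
import java.util.*;

public class HierarchyCycleCheck {
    // Returns true if the parent -> children relationships contain a cycle.
    public static boolean hasCycle(Map<String, List<String>> children) {
        Set<String> done = new HashSet<>();   // fully explored nodes
        Set<String> inPath = new HashSet<>(); // nodes on the current DFS path
        for (String node : children.keySet()) {
            if (dfs(node, children, done, inPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean dfs(String node, Map<String, List<String>> children,
                               Set<String> done, Set<String> inPath) {
        if (inPath.contains(node)) return true; // back edge: cycle found
        if (done.contains(node)) return false;  // already known acyclic
        inPath.add(node);
        for (String child : children.getOrDefault(node,
                                       Collections.<String>emptyList())) {
            if (dfs(child, children, done, inPath)) return true;
        }
        inPath.remove(node);
        done.add(node);
        return false;
    }
}
```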
Hierarchy data is processed as a separate feed after all other party or contact
data has been validated, matched, keys assigned, and the data loaded into the
MDM data repository. The input hierarchy data is validated against the hierarchy
data and party or contact information already in the MDM data repository.
The Hierarchy RT/ST (Table B-20 on page 260 through Table B-23 on page 261)
data is processed in the same manner as the non-hierarchy RT/ST party/contact
and contract data.
Note: There were no organizations in our TBank data. For the purposes of
creating a hierarchy, we loaded organization records into the MDM data
repository. The loading of these organization records is not described here.
Figure 3-112 shows the MARKETING hierarchy, the various party (person1 and
organization2) hierarchy nodes, and the hierarchy node relationships defined for
the TBank scenario. We combined persons and organizations in the same
hierarchy. Some organizations in our scenario had no persons (an unlikely
situation in the real world).
The hierarchy comprises a US – Wide Marketing root with Local Marketing nodes
for San Jose, San Francisco, Eugene, Salem, and Seattle, and party nodes for
Yesica Anderson, A Carter, Christina Anderson, Alex Skov, Denise Farrel,
Kurt Madi, and Barry Rosen.
1. Persons are shown as ovals.
2. Organizations are shown as rectangles.
The sequence of the SIF records is immaterial because they are sorted into the
required sequence by the RDP for MDM jobs.
As mentioned earlier, you will typically integrate the realtime services of MDM
Server into your existing applications in order to access the master data therein.
However, our scenario involved writing a simple new MDM consumption
application that obtains a 360-degree view of a customer: master data is
obtained from the MDM Server through a Web service call, and non-master data
is retrieved from the corresponding CHECKING, SAVINGS, and LOAN source systems.
Our application provided a GUI for searching on first name and last name in
the MDM repository, returning the address (master data from the MDM
repository) and the balance (non-master data) from the appropriate source
systems CHECKING, SAVINGS, and LOAN. The MDM consumption application was
developed as a JSP and J2EE™ application and can be downloaded from the
IBM Redbooks Web page:
http://www.redbooks.ibm.com/redpieces/abstracts/sg247704.html
Note: In our sample application, we did not provide for wild card searches,
and we assumed that the search would return either zero or one row from the
MDM repository with the associated party ID.
Uses the party ID to retrieve the address information from the MDM
repository, and the corresponding source system keys (SSK) for the checking,
saving, and loan systems as highlighted in Example 3-10 on page 233.
Uses the SSKs to connect to the DB2 for LUW source systems to retrieve the
balance (non-key) data (as highlighted in Example 3-10 on page 233) and
present back to the user as shown in Figure 3-131 on page 232.
In this case, the customer Renee Jackson only has a Savings account, no
Checking or Loan accounts.
Note: The code shown in Example 3-10 on page 233 is only meant to show
the Web service calls and subsequent access to the source systems. It has no
error handling capabilities, which would be essential in a real-world
application.
<html>
<head>
<title>test</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="GENERATOR" content="Rational Application Developer">
</head>
<body>
<%
if (request.getParameter("query") != null) {
    try {
        // Search the MDM repository through the Web service call
        // (connection setup and search request are elided in this excerpt)
        PersonSearchResult myPSrchR = myPSRR.getSearchResult(0);

        // CHECKING system: the SSK retrieval that sets strCustId and
        // proceedFlag for this system is elided in this excerpt
        proceedFlag = false;
        if (proceedFlag) {
            query = "select balance from db2inst1.checking where customerid = ?";
            stmt = con.prepareStatement(query);
            custid = Integer.parseInt(strCustId);
            stmt.setInt(1, custid);
            rs = stmt.executeQuery();
            if (rs.next()) { %>
                <%-- display the CHECKING balance (markup elided) --%>
            <% }
            stmt.close();
        } else { %>
            <%-- no CHECKING account for this customer (markup elided) --%>
        <% }

        // SAVINGS system: retrieve the source system key (SSK) for the party
        myPASKR = myPSP.getPartyAdminSysKeyByPartyId(myControl, "1000001", partyId);
        strCustId = myPASKR.getAdminSysKey().getAdminSysPartyId();
        strCustId = strCustId.substring(0, strCustId.length() - 1);
        proceedFlag = true; // set to false when no SAVINGS SSK exists (elided)
        if (proceedFlag) {
            query = "select balance from db2inst1.savings where savingsid = ?";
            custid = Integer.parseInt(strCustId);
            stmt = con.prepareStatement(query);
            stmt.setInt(1, custid);
            rs = stmt.executeQuery();
            if (rs.next()) { %>
                <%-- display the SAVINGS balance (markup elided) --%>
            <% }
            stmt.close();
        } else { %>
            <%-- no SAVINGS account for this customer (markup elided) --%>
        <% }

        // LOAN system: retrieve the SSK for the party
        myPASKR = myPSP.getPartyAdminSysKeyByPartyId(myControl, "1000002", partyId);
        strCustId = myPASKR.getAdminSysKey().getAdminSysPartyId();
        proceedFlag = true; // set to false when no LOAN SSK exists (elided)
        if (proceedFlag) {
            query = "select balance from db2inst1.loan where customerid = ?";
            custid = Integer.parseInt(strCustId);
            stmt = con.prepareStatement(query);
            stmt.setInt(1, custid);
            rs = stmt.executeQuery();
            if (rs.next()) { %>
                <%-- display the LOAN balance (markup elided) --%>
            <% }
            stmt.close();
        } else { %>
            <%-- no LOAN account for this customer (markup elided) --%>
        <% } %>
</table>
<%
    } catch (Exception e) {
        e.printStackTrace();
    }
} else { %>
    <%-- display the search form (markup elided) --%>
</form>
<%
}
%>
</body>
</html>
Master data usage and functionality can be categorized into 3 different styles:
Collaborative
Collaborative use of master data involves creating, defining, verifying, and
augmenting master data to establish a single version of the truth about
customers, products, suppliers, and accounts.
Operational
Operational use focuses on the management, delivery, and consumption of
master data in day-to-day operations.
Analytical
Analytical use stages master data destined for analytical systems or supplies
rich insight to operational processes.
After the decision is made to consolidate data from multiple sources, you need to
decide how closely to integrate the data between the systems and how often to
keep it up to date.
A.2.2 Registry
The Registry style creates a skeleton record with the minimum amount of data
required to identify the master record and to facilitate the linking of accounts
across multiple source systems. This is the most popular style implemented for a
first phase of IBM InfoSphere MDM Server. The data collected is not used to
update other systems; the system of record remains with the individual
source systems.
A.2.3 Coexistence
The Coexistence style implements all the features of the Registry style, but also
provides data elements that the client wants to track at the party level. Master
data can be updated in source systems or in MDM Server, in which case the data
is fed back to source systems. The Rapid Deployment Package (RDP) for MDM
solution facilitates the process of feeding MDM Server with master data from
source systems in batch fashion. The coexistence style is one step closer than
the Registry style to becoming the system of record, but the existing source
systems still remain as the system of record.
A.2.4 Transaction
The Transaction style implements centralized management of master data. All
data updates happen directly to the MDM solution and can be distributed to other
applications and systems, which implement read-only access. The MDM
repository becomes the system of record for master data.
InfoSphere MDM Server maintains master data for multiple domains including
customer, account, and product, as well as other data types such as location and
privacy preferences. Through business services, InfoSphere MDM Server
facilitates integration with all applications and business processes that consume
master data.
You can interface with MDM Server using one of the supported interfaces1,
including:
RMI
JMS
Batch
Web Services
1. MDM Server supports an XML-based transaction interface. It comes with a request and a response
schema, defined in XSD. All input XMLs must conform to the request schema, while MDM Server
always responds with an XML conforming to the response schema. The schemas define the
structure of the business objects, which should be passed in or returned from MDM Server
transactions.
Extension Framework
This component provides mechanisms for extending the behavior and data
model of the product.
A code generation tool is provided to allow clients to add columns to
existing tables and to add new tables to implement new business features.
The code generation tool also generates the Web services integration code
for the data extension or data addition.
External Components
These are components delivered as part of MDM Server and are consumers
of MDM Server’s services.
MDM Server provides a framework for batch processing. The batch processor
is a common J2SE™ component that supports pluggable readers/writers,
multiple instances, and concurrent processing within an instance for high
throughput. The batch processor invokes the request framework for each
transaction read. Therefore, all MDM Server services and client-defined
services are available to batch processing.
Note: For full details on IBM InfoSphere MDM Server functionality and the
data model, refer to MDM Server documentation available with the RDP for
MDM packaged software.
The key data columns in your source systems must be mapped to the
corresponding columns in the appropriate SIF RT/ST records before the data can
be loaded into the MDM repository using the RDP for MDM jobs.
Note: The SIF supports both inserts to and updates of records in the MDM
repository, but not delete operations. In this IBM Redbooks publication, we
cover both inserts (for initial load) and updates to perform delta processing.
To map the columns in your source systems to the SIF, you must know the data
type of each column in the RT/ST; the data types are defined in the RT/ST
templates provided as part of the RDP for MDM solution. Table B-1 on page 250
through Table B-23 on page 261 do not contain the data type information.
When the value of a column in an RT/ST record can be null (as indicated by a
“Y” in the “Can be empty?” column in Table B-1 on page 250), you can define the
action to be taken on the corresponding column of the MDM data repository when
NULL is supplied in that RT/ST column, as follows:
Set the null indicator for that column in the RT/ST to a 1 or 0. The Mapping Rule
specifies the action to be taken on the value in the corresponding column of the
MDM data repository. The null indicator columns (names beginning with
“NULL_”) and their corresponding Mapping Rules are shown in Table B-1 on
page 250 through Table B-23 on page 261. For example, in the RT/ST, the
NULL_PREF_LANG_TP_CD column in Table B-1 on page 250 corresponds to
the PREF_LANG_TP_CD column (which can be empty), and its Mapping Rule
specifies that the following action be taken:
If 1 then set to null, if 0 and column is empty use prior value, if 0
and column is not empty overwrite prior value
Note: If the PREF_LANG_TP_CD column has a value, then the null indicator
setting does not apply.
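The stated Mapping Rule can be sketched as a small function. The class and method names below are illustrative, not part of the RDP for MDM code:

```java
public class NullIndicatorRule {
    // Applies the SIF mapping rule: if the null indicator is 1, the target
    // column is set to null; if 0 and the supplied value is empty, the
    // prior value is kept; if 0 and the supplied value is not empty, it
    // overwrites the prior value.
    public static String apply(int nullIndicator, String supplied, String prior) {
        if (nullIndicator == 1) {
            return null;                         // explicitly null the column
        }
        if (supplied == null || supplied.isEmpty()) {
            return prior;                        // keep the prior value
        }
        return supplied;                         // overwrite the prior value
    }
}
```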
Table B-1 on page 250 through Table B-23 on page 261 provide a high level
overview of the individual columns and mapping rules for each of the 23 RT/ST
combinations.
RECTYPE N "P"
SUBTYPE N "P" or "O" (Cannot be updated)
ADMIN_SYS_TP_CD N CDADMINSYSTP
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
FORCE_MATCH N "Y" or "N"
CONTEQUIV_DESCRIPTION Y
ACCE_COMP_TP_CD Y CDACCETOCOMPTP
PREF_LANG_TP_CD Y CDLANGTP
CONTACT_NAME Y
SOLICIT_IND Y
CONFIDENTIAL_IND Y
CLIENT_IMP_TP_CD Y CDCLIENTIMPTP
CLIENT_ST_TP_CD Y CDCLIENTSTTP
CLIENT_POTEN_TP_CD Y CDCLIENTPOTENTP
RPTING_FREQ_TP_CD Y CDRPTINGFREQTP
LAST_STATEMENT_DT Y
ALERT_IND Y
PRVBY_ADMIN_SYS_TP_CD Y CDADMINSYSTP
PRVBY_ADMIN_CLIENT_ID Y
DO_NOT_DELETE_IND Y
SOURCE_IDENT_TP_CD Y CDSOURCEIDENTTP
LAST_USED_DT Y
LAST_VERIFIED_DT Y
SINCE_DT Y
LEFT_DT Y
ACCESS_TOKEN_VALUE Y
ORG_TP_CD Y MUST BE EMPTY if SUBTYPE = "P", REQUIRED FOR SUBTYPE = "O" CDORGTP
INDUSTRY_TP_CD Y MUST BE EMPTY if SUBTYPE = "P" CDINDUSTRYTP
ESTABLISHED_DT Y MUST BE EMPTY if SUBTYPE = "P"
BUY_SELL_AGR_TP_CD Y MUST BE EMPTY if SUBTYPE = "P" CDBUYSELLAGREETP
PROFIT_IND Y MUST BE EMPTY if SUBTYPE = "P"
MARITAL_ST_TP_CD Y MUST BE EMPTY if SUBTYPE = "O" CDMARITALSTTP
BIRTHPLACE_TP_CD Y MUST BE EMPTY if SUBTYPE = "O" CDCOUNTRYTP
CITIZENSHIP_TP_CD Y MUST BE EMPTY if SUBTYPE = "O" CDCOUNTRYTP
HIGHEST_EDU_TP_CD Y MUST BE EMPTY if SUBTYPE = "O" CDHIGHESTEDUTP
AGE_VER_DOC_TP_CD Y MUST BE EMPTY if SUBTYPE = "O" CDAGEVERDOCTP
GENDER_TP_CODE Y MUST BE EMPTY if SUBTYPE = "O" not validated
BIRTH_DT Y MUST BE EMPTY if SUBTYPE = "O"
DECEASED_DT Y MUST BE EMPTY if SUBTYPE = "O"
CHILDREN_CT Y MUST BE EMPTY if SUBTYPE = "O"
DISAB_START_DT Y MUST BE EMPTY if SUBTYPE = "O"
DISAB_END_DT Y MUST BE EMPTY if SUBTYPE = "O"
USER_IND Y MUST BE EMPTY if SUBTYPE = "O"
NULL_DESCRIPTION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ACCE_COMP_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREF_LANG_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CONTACT_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOLICIT_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CONFIDENTIAL_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CLIENT_IMP_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CLIENT_ST_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CLIENT_POTEN_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_RPTING_FREQ_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_STATEMENT_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ALERT_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PROVIDED_BY_CONT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DO_NOT_DELETE_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_USED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SINCE_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LEFT_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ACCESS_TOKEN_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_INDUSTRY_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ESTABLISHED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BUY_SELL_AGR_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PROFIT_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_MARITAL_ST_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BIRTHPLACE_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CITIZENSHIP_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_HIGHEST_EDU_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AGE_VER_DOC_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_GENDER_TP_CODE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BIRTH_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DECEASED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CHILDREN_CT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DISAB_START_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DISAB_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_USER_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "G"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
ORG_NAME_TP_CD N CDORGNAMETP
ORG_NAME N
S_ORG_NAME Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
LAST_USED_DT Y
LAST_VERIFIED_DT Y
SOURCE_IDENT_TP_CD Y CDSOURCEIDENTTP
P_ORG_NAME Y
NULL_S_ORG_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_USED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "H"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
PREFIX_NAME_TP_CD Y CDPREFIXNAMETP
PREFIX_DESC Y
NAME_USAGE_TP_CD N CDNAMEUSAGETP
FREE_FORM_NAME Y Must be supplied if LAST_NAME is empty. Must be empty if GIVEN_NAME or LAST_NAME present.
GIVEN_NAME_ONE Y
GIVEN_NAME_TWO Y
GIVEN_NAME_THREE Y
GIVEN_NAME_FOUR Y
LAST_NAME Y Must be empty if FREE_FORM_NAME supplied
GENERATION_TP_CD Y CDGENERATIONTP
SUFFIX_DESC Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
USE_STANDARD_IND Y
LAST_USED_DT Y
LAST_VERIFIED_DT Y
SOURCE_IDENT_TP_CD Y CDSOURCEIDENTTP
P_LAST_NAME Y
P_GIVEN_NAME_ONE Y
P_GIVEN_NAME_TWO Y
P_GIVEN_NAME_THREE Y
P_GIVEN_NAME_FOUR Y
GIVEN_NAME_ONE_SEARCH Y
GIVEN_NAME_TWO_SEARCH Y
GIVEN_NAME_THREE_SEARCH Y
GIVEN_NAME_FOUR_SEARCH Y
LAST_NAME_SEARCH Y
NULL_PREFIX_NAME_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREFIX_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_GIVEN_NAME_ONE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_GIVEN_NAME_TWO N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_GIVEN_NAME_THREE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_GIVEN_NAME_FOUR N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_GENERATION_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SUFFIX_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_USE_STANDARD_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_USED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "E"
ADMIN_SYS_TP_CD N CDADMINSYSTP
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
DESCRIPTION Y
LINKTO_ADMIN_SYS_TP_CD N Not required.
LINKTO_ADMIN_CLIENT_ID N
RECTYPE N "P"
SUBTYPE N "A"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
UNDEL_REASON_TP_CD Y CDUNDELREASONTP
MEMBER_IND Y
PREFERRED_IND N
SOLICIT_IND Y
EFFECT_START_MMDD Y
EFFECT_END_MMDD Y
EFFECT_START_TM Y
EFFECT_END_TM Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
LAST_USED_DT Y
LAST_VERIFIED_DT Y
SOURCE_IDENT_TP_CD Y CDSOURCEIDENTTP
CARE_OF_DESC Y
ADDR_USAGE_TP_CD Y CDADDRUSAGETP
COUNTRY_TP_CD Y CDCOUNTRYTP
RESIDENCE_TP_CD Y CDRESIDENCETP
PROV_STATE_TP_CD Y CDPROVSTATETP
ADDR_LINE_ONE Y
ADDR_LINE_TWO Y
ADDR_LINE_THREE Y
CITY_NAME Y
POSTAL_CODE Y
ADDR_STANDARD_IND Y
OVERRIDE_IND Y
RESIDENCE_NUM Y
COUNTY_CODE Y
LATITUDE_DEGREES
LONGITUDE_DEGREES
POSTAL_BARCODE
P_CITY
BUILDING_NAME
STREET_NUMBER
STREET_NAME
P_STREET_NAME
STREET_SUFFIX
PRE_DIRECTIONAL
POST_DIRECTIONAL
BOX_DESIGNATOR
BOX_ID
STN_INFO
STN_ID
REGION
DEL_DESIGNATOR
DEL_ID
DEL_INFO
NULL_UNDEL_REASON_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_MEMBER_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREFERRED_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOLICIT_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_START_MMDD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_END_MMDD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_START_TM N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_END_TM N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_USED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CARE_OF_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_COUNTRY_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_RESIDENCE_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PROV_STATE_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ADDR_LINE_TWO N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ADDR_LINE_THREE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_POSTAL_CODE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ADDR_STANDARD_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_OVERRIDE_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_RESIDENCE_NUM N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_COUNTY_CODE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LATITUDE_DEGREES N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LONGITUDE_DEGREES N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_POSTAL_BARCODE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BUILDING_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_STREET_NUMBER N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_STREET_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_STREET_SUFFIX N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PRE_DIRECTIONAL N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_POST_DIRECTIONAL N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BOX_DESIGNATOR N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BOX_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_STN_INFO N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_STN_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_REGION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DEL_DESIGNATOR N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DEL_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DEL_INFO N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "C"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
UNDEL_REASON_TP_CD Y CDUNDELREASONTP
MEMBER_IND Y
PREFERRED_IND Y
SOLICIT_IND Y
EFFECT_START_MMDD Y
EFFECT_END_MMDD Y
EFFECT_START_TM Y
EFFECT_END_TM Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
LAST_USED_DT Y
LAST_VERIFIED_DT Y
SOURCE_IDENT_TP_CD Y CDSOURCEIDENTTP
CONT_METH_TP_CD N CDCONTMETHTP
METHOD_ST_TP_CD Y CDMETHODSTATUSTP
ATTACH_ALLOW_IND Y
TEXT_ONLY_IND Y
MESSAGE_SIZE Y
COMMENT_DESC Y
REF_NUM N
CONT_METH_STD_IND Y
COUNTRY_CODE Y
AREA_CODE Y
EXCHANGE Y
PH_NUMBER Y
EXTENSION Y
NULL_UNDEL_REASON_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_MEMBER_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREFERRED_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOLICIT_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_START_MMDD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_END_MMDD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_START_TM N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EFFECT_END_TM N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_USED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_METHOD_ST_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTACH_ALLOW_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_TEXT_ONLY_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_MESSAGE_SIZE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_COMMENT_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CONT_METH_STD_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_COUNTRY_CODE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AREA_CODE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EXCHANGE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PH_NUMBER N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EXTENSION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
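The NULL_* indicator rule that repeats throughout these tables can be sketched as follows. This is a minimal illustration only; the function and its signature are hypothetical and not part of RDP for MDM:

```python
# Hypothetical sketch of the NULL_* indicator rule described in the tables
# above: indicator "1" forces the column to null, "0" with an empty incoming
# value keeps the prior value, and "0" with a non-empty value overwrites it.
def resolve_column(null_indicator, incoming, prior):
    if null_indicator == "1":
        return None       # explicitly set the column to null
    if incoming == "":
        return prior      # column is empty: keep the previously stored value
    return incoming       # column is not empty: overwrite the prior value
```

For example, resolve_column("0", "", "416") keeps the prior area code "416", while resolve_column("1", "", "416") nulls it out.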
RECTYPE N "P"
SUBTYPE N "I"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
ID_TP_CD N CDIDTP
ID_STATUS_TP_CD Y CDIDSTATUSTP
REF_NUM Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
EXPIRY_DT Y
ASSIGNEDBY_ADMIN_SYS_TP_CD Y CDADMINSYSTP
ASSIGNEDBY_ADMIN_CLIENT_ID Y
IDENTIFIER_DESC Y
ISSUE_LOCATION Y
LAST_USED_DT Y
LAST_VERIFIED_DT Y
SOURCE_IDENT_TP_CD Y CDSOURCEIDENTTP
NULL_ID_STATUS_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_REF_NUM N ref_num can only be null for 1 identifier status type. If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EXPIRY_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ASSIGNED_BY N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_IDENTIFIER_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ISSUE_LOCATION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_USED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "B"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
ENTITY_NAME N "CONTACT"
LOB_TP_CD N CDLOBTP
LOB_REL_TP_CD N CDLOBRELTP
START_DT Y Use Processing Date if not supplied.
END_DT Y
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "R"
ADMIN_SYS_TP_CD_TO N Not required.
ADMIN_CLIENT_ID_TO N
ADMIN_SYS_TP_CD_FROM N Not required.
ADMIN_CLIENT_ID_FROM N TO and FROM SSKs cannot be the same.
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
REL_TP_CD N CDRELTP
REL_DESC Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
REL_ASSIGN_TP_CD Y CDRELASSIGNTP
END_REASON_TP_CD Y CDENDREASONTP
NULL_REL_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_REL_ASSIGN_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_REASON_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "C"
SUBTYPE N "H"
ADMIN_SYS_TP_CD N CDADMINSYSTP
ADMIN_CONTRACT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
CONTR_LANG_TP_CD Y CDLANGTP
CURRENCY_TP_CD Y CDCURRENCYTP
FREQ_MODE_TP_CD Y CDFREQMODETP
BILL_TP_CD Y CDBILLTP
PREMIUM_AMT Y
NEXT_BILL_DT Y
CURR_CASH_VAL_AMT Y
LINE_OF_BUSINESS Y
BRAND_NAME Y
SERVICE_ORG_NAME Y
BUS_ORGUNIT_ID Y
SERVICE_PROV_ID Y
REPLBY_ADMIN_SYS_TP_CD Y Required if Replaced By contract ID present. CDADMINSYSTP
REPLBY_ADMIN_CONTRACT_ID Y
ISSUE_LOCATION Y
PREMAMT_CUR_TP Y CDCURRENCYTP
CASHVAL_CUR_TP Y CDCURRENCYTP
ACCESS_TOKEN_VALUE Y
MANAGED_ACCOUNT_IND Y WARNING: Leave null unless advised by MDM Server expert.
AGREEMENT_NAME Y
AGREEMENT_NICKNAME Y
SIGNED_DT Y
EXECUTED_DT Y
END_DT Y
ACCOUNT_LAST_TRANSACTION_DT Y
TERMINATION_DT Y
TERMINATION_REASON_TP_CD Y CDTERMINATIONREASONTP
AGREEMENT_DESCRIPTION Y
AGREEMENT_ST_TP_CD Y CDAGREEMENTSTTP
AGREEMENT_TP_CD Y CDAGREEMENTTP
SERVICE_LEVEL_TP_CD Y CDSERVICELEVELTP
LAST_VERIFIED_DT Y
LAST_REVIEWED_DT Y
PRODUCT_ID Y NOT USED
CLUSTER_KEY Y
NULL_CONTR_LANG_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CURRENCY_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_FREQ_MODE_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BILL_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREMIUM_AMT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_NEXT_BILL_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CURR_CASH_VAL_AMT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LINE_OF_BUSINESS N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BRAND_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SERVICE_ORG_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BUS_ORGUNIT_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SERVICE_PROV_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_REPL_BY_CONTRACT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ISSUE_LOCATION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREMAMT_CUR_TP N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CASHVAL_CUR_TP N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ACCESS_TOKEN_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_MANAGED_ACCOUNT_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AGREEMENT_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AGREEMENT_NICKNAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SIGNED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EXECUTED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_REPLACES_CONTRACT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ACCOUNT_LAST_TRANSACTION_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_TERMINATION_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_TERMINATION_REASON_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AGREEMENT_DESCRIPTION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AGREEMENT_ST_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_AGREEMENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SERVICE_LEVEL_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_VERIFIED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_LAST_REVIEWED_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PRODUCT_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CLUSTER_KEY N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "C"
SUBTYPE N "K"
ADMIN_FLD_NM_TP_CD N CDADMINFLDNMTP
ADMIN_CONTRACT_ID N
LINKTO_ADMIN_FLD_NM_TP_CD N CDADMINFLDNMTP
LINKTO_ADMIN_CONTRACT_ID N
CONTRACT_COMP_IND Y ANY VALUE INPUT WILL BE OVERRIDDEN TO "N"
RECTYPE N "C"
SUBTYPE N "C"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CONTRACT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
PROD_TP_CD N CDPRODTP
CONTRACT_ST_TP_CD N CDCONTRACTSTTP
CURR_CASH_VAL_AMT Y
PREMIUM_AMT Y
ISSUE_DT Y
VIATICAL_IND Y
BASE_IND Y
CONTR_COMP_TP_CD Y CDCONTRCOMPTP
SERV_ARRANGE_TP_CD Y CDARRANGEMENTTP
EXPIRY_DT Y
PREMAMT_CUR_TP Y CDCURRENCYTP
CASHVAL_CUR_TP Y CDCURRENCYTP
CLUSTER_KEY Y
NULL_CURR_CASH_VAL_AMT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREMIUM_AMT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ISSUE_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VIATICAL_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_BASE_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SERV_ARRANGE_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_EXPIRY_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PREMAMT_CUR_TP N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CASHVAL_CUR_TP N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CLUSTER_KEY N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "C"
SUBTYPE N "R"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CONTRACT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
ADMIN_CLIENT_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
CONTR_COMP_TP_CD Y CDCONTRCOMPTP
PROD_TP_CD N CDPRODTP
CONTR_ROLE_TP_CD N CDCONTRACTROLETP
REGISTERED_NAME Y
DISTRIB_PCT Y
IRREVOC_IND Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
RECORDED_START_DT Y
RECORDED_END_DT Y
SHARE_DIST_TP_CD Y CDSHAREDISTTP
ARRANGEMENT_TP_CD Y CDARRANGEMENTTP
ARRANGEMENT_DESC Y
END_REASON_TP_CD Y CDENDREASONTP
NULL_REGISTERED_NAME N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DISTRIB_PCT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_IRREVOC_IND N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_RECORDED_START_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_RECORDED_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SHARE_DIST_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ARRANGEMENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ARRANGEMENT_DESC N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_REASON_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "C"
SUBTYPE N "L"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CONTRACT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
ADMIN_CLIENT_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
CONTR_COMP_TP_CD Y CDCONTRCOMPTP
PROD_TP_CD N CDPRODTP
CONTR_ROLE_TP_CD N CDCONTRACTROLETP
ADDR_USAGE_TP_CD N CDADDRUSAGETP
START_DT Y Use Processing Date if not supplied.
END_DT Y
UNDEL_REASON_TP_CD Y CDUNDELREASONTP
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_UNDEL_REASON_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "C"
SUBTYPE N "V"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CONTRACT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
CONTR_COMP_TP_CD Y CDCONTRCOMPTP
PROD_TP_CD N CDPRODTP
DOMAIN_VALUE_TP_CD N CDDOMAINVALUETP
VALUE_STRING N
START_DT Y Use Processing Date if not supplied.
END_DT Y
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUE_STRING N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PRIORITY_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_SOURCE_IDENT_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DESCRIPTION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_0 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR0_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_1 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR1_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_2 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR2_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_3 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR3_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_4 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR4_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_5 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR5_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_6 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR6_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_7 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR7_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_8 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR8_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_VALUEATTR_TP_CD_9 N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ATTR9_VALUE N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "P"
SUBTYPE N "S"
ADMIN_SYS_TP_CD N Not required.
ADMIN_CLIENT_ID N
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
PPREF_REASON_TP_CD N CDPPREFREASONTP
SOURCE_IDENT_TP_CD N CDSOURCEIDENTTP
VALUE_STRING Y
START_DT Y Use Processing Date if not supplied.
END_DT Y
PPREF_TP_CD N CDPPREFTP
PPREF_ACT_OPT_ID Y PPREFACTIONOPT
NULL_VALUE_STRING N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_PPREF_ACT_OPT_ID N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_REMOVED_BY_USER N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_CREATED_BY_USER N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_ALERT_SEV_TP_CD N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_END_DT N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
NULL_DESCRIPTION N If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
RECTYPE N "H"
SUBTYPE N "H"
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
NAME N
HIERARCHY_TP_CD N CDHIERARCHYTP
DESCRIPTION
START_DT
END_DT
RECTYPE N "H"
SUBTYPE N "N"
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
NAME N
HIERARCHY_TP_CD N CDHIERARCHYTP
ADMIN_SYS_TP_CD N
ADMIN_CLIENT_ID N
ENTITY_NAME
DESCRIPTION
START_DT
END_DT
NODEDESIG_TP_CD
LOCALEDESCRIPTION
RECTYPE N "H"
SUBTYPE N "R"
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
NAME N
HIERARCHY_TP_CD N CDHIERARCHYTP
ADMIN_SYS_TP_CD_PARENT N CDADMINSYSTP
ADMIN_CLIENT_ID_PARENT N
ADMIN_SYS_TP_CD_CHILD N CDADMINSYSTP
ADMIN_CLIENT_ID_CHILD N
DESCRIPTION
START_DT
END_DT
RECTYPE N "H"
SUBTYPE N "U"
LOAD_TYPE Y “U” update, “A” add, “empty” either add or update as applicable
NAME N
HIERARCHY_TP_CD N CDHIERARCHYTP
ADMIN_SYS_TP_CD N CDADMINSYSTP
ADMIN_CLIENT_ID N
DESCRIPTION
START_DT
END_DT
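The LOAD_TYPE column follows the same convention in every record type above: “U” forces an update, “A” forces an add, and an empty value lets the load add or update as applicable. A minimal sketch of that dispatch, assuming a plain in-memory store (the function and the store are hypothetical, not RDP for MDM code):

```python
# Hypothetical sketch of LOAD_TYPE handling: "A" = add, "U" = update,
# "" (empty) = add or update as applicable (upsert). `store` is a plain
# dict keyed by the record's source system key (SSK).
def apply_record(store, ssk, record, load_type):
    exists = ssk in store
    if load_type == "A" and exists:
        raise ValueError("add requested but record already exists")
    if load_type == "U" and not exists:
        raise ValueError("update requested but record does not exist")
    store[ssk] = record   # add or overwrite, as applicable
    return store
```

With an empty LOAD_TYPE the same call succeeds whether or not the SSK is already present, which is the “either add or update as applicable” behavior the tables describe.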
Note: MDM Server also comes with MDM Server Workbench, a development
tool to help with the creation of these data and behavior extensions. This
workbench comes in the form of a plug-in to Rational® Software Architect.
You may also create new transactions or services using the MDM Server
application framework. You can build transactions by constructing new
controller/business components and using the existing Request Framework and
Common Components.
1 MDM Server also uses its own extension framework to plug in some modules, such as Rules of Visibility, in order to keep them loosely coupled and easy to turn on or off.
The Extension Controller uses the parameters to determine if any Extension Sets
must be further evaluated. Relevant Extension Sets are then interrogated and
qualified extensions, either Java or rules sets, are invoked.
Because RDP for MDM loads directly into the MDM target tables, creating new
MDM Server services or behavior extensions will have no impact on RDP for
MDM. However, with extensions to the MDM data model, changes must be made
to both the MDM Server and the corresponding RDP for MDM assets.
MDM Server provides a code generation tool to allow clients to change existing
column attributes, add new columns to existing tables, and add new tables to
satisfy business requirements. The code generation tool also generates the Web
Services integration code for these data extensions.
Data extensions and additions:
- Add a new element to an existing SIF record, or modify an existing element's data type and precision/scale/length, when that element does not participate in some transformation or aggregation: a change to the corresponding ImportSIF shared container (names starting with ILIS…) will propagate through to the target. For BulkLoad, no further changes are required. For Insert (Upsert), a change to the corresponding DB shared container (names starting with ILDBIN…) is also required.
- Add a new table (new SIF record): this is beyond the scope of a typical RDP for MDM implementation.
In this way, as long as the extensions are confined to existing shared containers,
it should be possible to upgrade core RDP jobs without losing client-specific
customizations.
Using RCP judiciously facilitates re-usable job designs based on input metadata,
rather than using a large number of jobs with hard-coded table definitions to
perform the same tasks. Furthermore, RCP facilitates re-use through parallel
shared containers.
By using RCP, only the columns explicitly referenced within the shared container logic need to be defined; the remaining columns pass through at runtime, provided that each stage in the shared container has RCP enabled in its stage Output properties.
There are some RDP for MDM jobs, stages, and shared containers where RCP is explicitly disabled. In most cases, RCP is disabled for QualityStage match, as this is standard practice (only matching key columns are output). However, there are other objects and jobs where RCP is disabled that should be reviewed to ensure that the additional columns are passed down when necessary.
Table C-2 on page 270 summarizes the jobs and containers that may require
review.
Note: This is an incomplete table, and may change with new releases of RDP
assets.
MDMIS R4 IL_000_PS_Stage_ErrReasonTbl
MDMIS R4 IL_010_IS_Import_SIF
MDMIS R4 IL_020_VS_Address
Figure D-1 Main components of RDP processing & error logs generated
The general format of the error messages in the error logs is shown in Table D-1
on page 273. Please refer to the download site for a document on the error
codes:
http://www.redbooks.ibm.com/redpieces/abstracts/sg247704.html
3 ADMIN_SYS_TP_CD These two fields are the SSK (Source System Key)
4 ADMIN_CLIENT_ID_OR_
CONTRACT_ID
5 CONT_ID This is the surrogate key (SK) generated for each row
6 SIF_FILE_NAME This is the physical location of the error row among the
input data files
7 SIF_ROW_NUMBER
13 INTERNAL_ID This is a surrogate key that we apply inside the RDP for
MDM jobs for use there; it is dropped before loading the
database (where the CONT_ID is used instead).
15 ERR_STAGE_NAME This is the name of the stage that detected the error and
produced the error row.
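Given the field positions documented above (3, 4, 5, 6, 7, 13, and 15), a single pipe-delimited error-log record can be picked apart as sketched below. The parser function is hypothetical; the sample record is taken from Example D-4, and positions not documented in the table are left unnamed:

```python
# Sketch of reading one pipe-delimited RDP for MDM error-log record, using
# the field positions documented in the table above; undocumented positions
# are simply skipped.
def parse_error_record(line):
    f = line.rstrip("\n").split("|")
    return {
        "ADMIN_SYS_TP_CD": f[2],                  # field 3: first half of the SSK
        "ADMIN_CLIENT_ID_OR_CONTRACT_ID": f[3],   # field 4: second half of the SSK
        "CONT_ID": f[4],                          # field 5: surrogate key for the row
        "SIF_FILE_NAME": f[5],                    # field 6: input file holding the error row
        "SIF_ROW_NUMBER": f[6],                   # field 7: row number within that file
        "INTERNAL_ID": f[12],                     # field 13: internal surrogate key
        "ERR_STAGE_NAME": f[14],                  # field 15: stage that detected the error
    }

# First record from Example D-4, rejoined onto one line.
record = ("P|P|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|2|1624|"
          "100054|The following is not correct: ClientPotentialType|0|canonical_errCode|"
          "8611|2008-11-01 09:24:13|020_Contact.CheckCodeAndContentValidationErrors|"
          "IL_020_VS_Contact")
fields = parse_error_record(record)
```

Running this against the Example D-4 record yields the SSK (1000002, 8000090), SIF row number 2, and the stage name 020_Contact.CheckCodeAndContentValidationErrors, matching the walkthrough that follows.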
In this appendix, we created a number of SIF files containing the most commonly
encountered errors to identify the corresponding error messages generated by
the RDP for MDM jobs. The contents of the consolidated error log are shown here.
Example D-2 on page 277 shows the contents of the error log for this error:
The first record highlights row 28 in the SIF file SIF_Out.pipe, with the SSK of
(1000000,70005817), which the SIF parser is unable to parse. The error message
shows Unable to parse record at RT/ST Level, and the error severity level
is 0. The name of the stage (tx_RTST_ci_Rejects) and the job name
(IL_010_Parse_Columnization) are also provided.
This row is rejected.
Note: Currently, the pipe character cannot be substituted as the field delimiter,
nor is an escape character provided.
Example D-4 on page 280 shows the contents of the error log for this error:
The first record highlights row 2 in the SIF file SIF_Out.CodeError, with the
SSK of (1000002,8000090), that is in error. The error message shows “The
following is not correct: ClientPotentialType”, and the error severity level is 0.
The name of the stage
(020_Contact.CheckCodeAndContentValidationErrors) and the job name
(IL_020_VS_Contact) are also provided.
The second row has the error message “Record In Error Dropped” for the
same row (2) in the SIF. It also shows the name of the stage in which this
occurs, 020Contact.DropErrorRows, and the job name,
IL_020_VS_Contact.
The subsequent records show the rows (689, 780, and 41) in the SIF that are
also rejected because they are associated with row 2, which was dropped. The
messages “Invalid PersonName Records: No Matching Contact Record” (row
689) and “Record dropped by association. Fatal errors were detected on
related party records.” (rows 780 and 41) are generated.
Example: D-3 Validation error with the code table error—partial contents of SIF
P|P|1000002|8000719|A|N|||||||1||||||||||||||||||||3||||||1984-05-07
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000090|A|N|||||||2||-13||||||||||||||||||3|||||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000071|A|N|||||||2|||||||||||||||||||||||||F|1989-08-23
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000041|A|N|||||||1|||||||||||||||||||||||||M|1998-08-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70008172|A|N|||100|||||2|||||||||||||||||||||185|||M|1937-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000037|A|N|||||||1||||||||||||||||||||3|||||F|1986-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
Example: D-4 Validation error with the code table error log output
P|P|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|2|1624|100054|
The following is not correct: ClientPotentialType|0|canonical_errCode|8611|2008-11-01
09:24:13|020_Contact.CheckCodeAndContentValidationErrors|IL_020_VS_Contact
P|P|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|2|110184|10006
6|Record In Error Dropped|0|canonical_errCode|8611|2008-11-01
09:24:13|020Contact.DropErrorRows|IL_020_VS_Contact
P|H|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|689|110126|100
246|Invalid PersonName Records: No Matching Contact
Record|0|canonical_errCode|8611|2008-11-01 09:25:10|030_CONTACT_RIV.Party Join
Proc|IL_030_RI_Contact_Person_Org
P|I|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|780|110387|100
387|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errCode|8611|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|A|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|41|110387|1003
87|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errCode|8611|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
When this error occurs, any additional columns detected after the final expected column (as defined by the metadata) are ignored, and a warning message (“Import consumed only 74 bytes of the record's 164 bytes (no further warnings will be generated from this partition)”) is written to the Director log, as shown in Figure D-2 on page 284.
Attention: The main point here is to carefully review the Director log output for such warnings, because they do not appear in the RDP for MDM error logs. The count of bytes (74 in our example) begins after the SSK, because that is where the columns begin, and the count includes the column delimiter pipe (|) character.
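Because these warnings surface only in the Director log, it can help to scan an exported copy of that log programmatically. The following Python sketch is not part of the product; it assumes the warning wording matches the sample quoted above and simply extracts the consumed and expected byte counts:

```python
# Sketch: scan Director log lines for short-record warnings.
# The warning text pattern is an assumption based on the sample
# message quoted in this appendix.
import re

WARNING = re.compile(
    r"Import consumed only (\d+) bytes of the record's (\d+) bytes"
)

def find_short_record_warnings(log_lines):
    """Yield (consumed_bytes, expected_bytes) for each warning found."""
    for line in log_lines:
        m = WARNING.search(line)
        if m:
            yield int(m.group(1)), int(m.group(2))

sample = [
    "Import consumed only 74 bytes of the record's 164 bytes "
    "(no further warnings will be generated from this partition)"
]
print(list(find_short_record_warnings(sample)))  # [(74, 164)]
```

Running this over the full exported log quickly shows which partitions silently dropped trailing columns.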
Figure D-2 End of record missing error: partial contents of Director log output
Example D-9 on page 286 shows the contents of the error log for this error:
The second record highlights row 8 in the SIF file SIF_Out.endBeforeStartDate, with the SSK of (1000002,8000212), that is in error. The error message shows “EndDate must be after StartDate”, and the error severity level is 0. The name of the stage (020_Contact.CheckCodeAndContentValidationErrors) and the job name (IL_020_VS_Contact) are also provided.
The first record also highlights the fact that row 8 is dropped, with the error message “Record In Error Dropped”.
The subsequent records are errors resulting from the invalid date bounds. Note the various rows (157, 254, 353, and 675), the error messages, and the stage and job in which these errors were detected.
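Reading the pipe-delimited error records by eye is tedious. The following Python sketch splits one record into named fields; the field names are our own labels inferred from the examples in this appendix, not official column names, so treat them as assumptions:

```python
# Sketch: split one RDP for MDM error-log record into named fields.
# Field names are inferred from the log examples in this appendix
# and are NOT official product column names.
FIELDS = [
    "party_type",      # e.g., P
    "record_type",     # e.g., P, H, I, A, C
    "source_id",       # first part of the SSK, e.g., 1000002
    "source_key",      # second part of the SSK, e.g., 8000212
    "flag",
    "error_file",      # path of the SIF error output file
    "row",             # row number in the SIF file
    "reject_code",
    "error_code",
    "message",
    "severity",
    "error_category",
    "run_id",
    "timestamp",
    "stage",           # stage that detected the error
    "job",             # job that detected the error
]

def parse_error_record(line):
    """Split a pipe-delimited error record into a dict of named fields."""
    return dict(zip(FIELDS, line.rstrip("\n").split("|")))

record = parse_error_record(
    "P|P|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/"
    "SIF_Out.endBeforeStartDate|8|102|100056|EndDate must be after "
    "StartDate|0|canonical_endBeforeStartDate|9107|2008-11-04 "
    "04:44:17|020_Contact.CheckCodeAndContentValidationErrors|"
    "IL_020_VS_Contact"
)
print(record["row"], record["message"], record["job"])
# 8 EndDate must be after StartDate IL_020_VS_Contact
```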
Example D-9 Start date after end date error log output
P|P|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|8|110184|100066|Record In Error Dropped|0|canonical_endBeforeStartDate|9107|2008-11-04 04:44:17|020Contact.DropErrorRows|IL_020_VS_Contact
P|P|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|8|102|100056|EndDate must be after StartDate|0|canonical_endBeforeStartDate|9107|2008-11-04 04:44:17|020_Contact.CheckCodeAndContentValidationErrors|IL_020_VS_Contact
P|H|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|675|110126|100246|Invalid PersonName Records: No Matching Contact Record|0|canonical_endBeforeStartDate|9107|2008-11-04 04:45:44|030_CONTACT_RIV.Party Join Proc|IL_030_RI_Contact_Person_Org
P|I|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|254|110387|100387|Record dropped by association. Fatal errors were detected on related party records.|0|canonical_endBeforeStartDate|9107|2008-05-30 09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|C|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|353|110387|100387|Record dropped by association. Fatal errors were detected on related
Example D-11 on page 288 shows the contents of the error log for this error:
The first record highlights row 1 in the SIF file SIF_Out.dateFormatError, with the SSK of (1000002,8000719), that is in error. The error message shows “Unable to parse record at RT/ST Level”, and the error severity level is 0. The name of the stage (tx_RTST_ci_Rejects) and the job name (IL_010_Parse_Columnization) are also provided.
The subsequent records are errors resulting from the invalid date format. Note the various rows (227, 422, 513, and 859), the error messages, and the stage and job in which these errors were detected.
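Because one bad source record produces a cascade of entries across several jobs, it can help to group the error log by SSK and review each source record's trail in one place. The sketch below is illustrative only; the field positions are inferred from the log examples in this appendix:

```python
# Sketch: group error-log records by SSK to trace one bad source
# record through the jobs. Field positions (2 and 3 for the SSK,
# 9 for the message) are inferred from the examples in this appendix.
from collections import defaultdict

def group_by_ssk(lines):
    """Return {(source_id, source_key): [(record_type, message, job)]}."""
    groups = defaultdict(list)
    for line in lines:
        f = line.rstrip("\n").split("|")
        ssk = (f[2], f[3])                      # (source id, source key)
        groups[ssk].append((f[1], f[9], f[-1])) # record type, message, job
    return groups

log = [
    "P|P|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/"
    "SIF_Out.endBeforeStartDate|8|110184|100066|Record In Error Dropped|0|"
    "canonical_endBeforeStartDate|9107|2008-11-04 04:44:17|"
    "020Contact.DropErrorRows|IL_020_VS_Contact",
    "P|H|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/"
    "SIF_Out.endBeforeStartDate|675|110126|100246|Invalid PersonName "
    "Records: No Matching Contact Record|0|canonical_endBeforeStartDate|"
    "9107|2008-11-04 04:45:44|030_CONTACT_RIV.Party Join Proc|"
    "IL_030_RI_Contact_Person_Org",
]
for ssk, errors in group_by_ssk(log).items():
    print(ssk, errors)
```

For the two sample records above, both entries appear under the single SSK (1000002, 8000212), making the cascade from the validation job into the referential-integrity job easy to follow.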
Select the Additional materials and open the directory that corresponds with
the IBM Redbooks form number, SG247704.
RDP for MDM overview

IBM InfoSphere Rapid Deployment Package (RDP) for MDM provides a rapid deployment approach to implementing MDM solutions that provide immediate return on investment. It provides a seamless upgrade path to IBM InfoSphere MDM Server, which provides the complete range of MDM functionality in the market today.

Financial services scenario

In this IBM Redbooks publication, we use a simple financial services MDM scenario to describe in detail the RDP for MDM offering and show how it can deliver a return on investment in a short timeframe, using a phased approach that ensures minimal risk.

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

SG24-7704-00 0738432636