Informatica Data Quality

(Version 9.0.1 HotFix 2)

Repository Migration Guide

January 2015
Copyright (c) 2010-2015 Informatica Corporation. All rights reserved.
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use
and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in
any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S.
and/or international Patents and other Patents Pending.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as
provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14
(ALT III), as applicable.
The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us
in writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange,
PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange Informatica
On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and
Informatica Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world.
All other company and product names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights
reserved. Copyright Sun Microsystems. All rights reserved. Copyright RSA Security Inc. All Rights Reserved. Copyright Ordinal Technology Corp. All rights
reserved. Copyright Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright Meta
Integration Technology, Inc. All rights reserved. Copyright Intalio. All rights reserved. Copyright Oracle. All rights reserved. Copyright Adobe Systems
Incorporated. All rights reserved. Copyright DataArt, Inc. All rights reserved. Copyright ComponentSource. All rights reserved. Copyright Microsoft Corporation. All
rights reserved. Copyright Rogue Wave Software, Inc. All rights reserved. Copyright Teradata Corporation. All rights reserved. Copyright Yahoo! Inc. All rights
reserved. Copyright Glyph & Cog, LLC. All rights reserved. Copyright Thinkmap, Inc. All rights reserved. Copyright Clearpace Software Limited. All rights
reserved. Copyright Information Builders, Inc. All rights reserved. Copyright OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved.
Copyright Cleo Communications, Inc. All rights reserved. Copyright International Organization for Standardization 1986. All rights reserved. Copyright ejtechnologies GmbH. All rights reserved. Copyright Jaspersoft Corporation. All rights reserved. Copyright International Business Machines Corporation. All rights
reserved. Copyright yWorks GmbH. All rights reserved. Copyright Lucent Technologies. All rights reserved. Copyright (c) University of Toronto. All rights reserved.
Copyright Daniel Veillard. All rights reserved. Copyright Unicode, Inc. Copyright IBM Corp. All rights reserved. Copyright MicroQuill Software Publishing, Inc. All
rights reserved. Copyright PassMark Software Pty Ltd. All rights reserved. Copyright LogiXML, Inc. All rights reserved. Copyright 2003-2010 Lorenzi Davide, All
rights reserved. Copyright Red Hat, Inc. All rights reserved. Copyright The Board of Trustees of the Leland Stanford Junior University. All rights reserved. Copyright
EMC Corporation. All rights reserved. Copyright Flexera Software. All rights reserved. Copyright Jinfonet Software. All rights reserved. Copyright Apple Inc. All
rights reserved. Copyright Telerik Inc. All rights reserved. Copyright BEA Systems. All rights reserved. Copyright PDFlib GmbH. All rights reserved. Copyright
Orientation in Objects GmbH. All rights reserved. Copyright Tanuki Software, Ltd. All rights reserved. Copyright Ricebridge. All rights reserved. Copyright Sencha,
Inc. All rights reserved. Copyright Scalable Systems, Inc. All rights reserved. Copyright jQWidgets. All rights reserved.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and/or other software which is licensed under various versions
of the Apache License (the "License"). You may obtain a copy of these Licenses at http://www.apache.org/licenses/. Unless required by applicable law or agreed to in
writing, software distributed under these Licenses is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the Licenses for the specific language governing permissions and limitations under the Licenses.
This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software
copyright 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under various versions of the GNU Lesser General Public License
Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any
kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.
The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California,
Irvine, and Vanderbilt University, Copyright (c) 1993-2006, all rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and
redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.
This product includes Curl software which is Copyright 1996-2013, Daniel Stenberg, <daniel@haxx.se>. All Rights Reserved. Permissions and limitations regarding this
software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or
without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
The product includes software copyright 2001-2005 (c) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.dom4j.org/license.html.
The product includes software copyright 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to
terms available at http://dojotoolkit.org/license.
This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations
regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.
This product includes software copyright 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at
http://www.gnu.org/software/kawa/Software-License.html.
This product includes OSSP UUID software which is Copyright 2002 Ralf S. Engelschall, Copyright 2002 The OSSP Project Copyright 2002 Cable & Wireless
Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.
This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are
subject to terms available at http://www.boost.org/LICENSE_1_0.txt.
This product includes software copyright 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at
http://www.pcre.org/license.txt.
This product includes software copyright 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.eclipse.org/org/documents/epl-v10.php and at http://www.eclipse.org/org/documents/edl-v10.php.
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License,
http://www.stlport.org/doc/license.html, http://asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html,
http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html,
http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html,
http://fusesource.com/downloads/licenseagreements/fuse-message-broker-v-5-3-license-agreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/;
http://www.bouncycastle.org/licence.html; http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html;
http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231; http://www.slf4j.org/license.html; http://nanoxml.sourceforge.net/orig/copyright.html;
http://www.json.org/license.html; http://forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html,
http://www.tcl.tk/software/tcltk/license.html, http://www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.slf4j.org/license.html;
http://www.iodbc.org/dataspace/iodbc/wiki/iODBC/License; http://www.keplerproject.org/md5/license.html; http://www.toedter.com/en/jcalendar/license.html;
http://www.edankert.com/bounce/index.html; http://www.net-snmp.org/about/license.html; http://www.openmdx.org/#FAQ; http://www.php.net/license/3_01.txt;
http://srp.stanford.edu/license.txt; http://www.schneier.com/blowfish.html; http://www.jmock.org/license.html; http://xsom.java.net; http://benalman.com/about/license/;
https://github.com/CreateJS/EaselJS/blob/master/src/easeljs/display/Bitmap.js; http://www.h2database.com/html/license.html#summary; http://jsoncpp.sourceforge.net/LICENSE;
http://jdbc.postgresql.org/license.html; http://protobuf.googlecode.com/svn/trunk/src/google/protobuf/descriptor.proto; https://github.com/rantav/hector/blob/master/LICENSE;
http://web.mit.edu/Kerberos/krb5current/doc/mitK5license.html; http://jibx.sourceforge.net/jibx-license.html; https://github.com/lyokato/libgeohash/blob/master/LICENSE;
https://github.com/hjiang/jsonxx/blob/master/LICENSE; and https://code.google.com/p/lz4/.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution
License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License
Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the new BSD License
(http://opensource.org/licenses/BSD-3-Clause), the MIT License (http://www.opensource.org/licenses/mit-license.php), the Artistic License (http://www.opensource.org/licenses/artisticlicense-1.0) and the Initial Developers Public License Version 1.0 (http://www.firebirdsql.org/en/initial-developer-s-public-license-version-1-0/).
This product includes software copyright 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this
software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab.
For further information please visit http://www.extreme.indiana.edu/.
This product includes software Copyright (c) 2013 Frank Balluffi and Markus Moeller. All rights reserved. Permissions and limitations regarding this software are subject
to terms of the MIT license.
This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775;
6,640,226; 6,789,096; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,243,110; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422;
7,676,516; 7,720,842; 7,721,270; 7,774,791; 8,065,266; 8,150,803; 8,166,048; 8,166,071; 8,200,622; 8,224,873; 8,271,477; 8,327,419; 8,386,435; 8,392,460;
8,453,159; 8,458,230; 8,707,336; 8,886,617 and RE44,478, International Patents and other Patents Pending.
DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the
implied warranties of noninfringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is
error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and
documentation is subject to change at any time without notice.
NOTICES
This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT
INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT
LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Part Number: DQ-MIG-90100-HF2-0003

Table of Contents
Preface
  Informatica Resources
    Informatica My Support Portal
    Informatica Documentation
    Informatica Product Availability Matrixes
    Informatica Web Site
    Informatica How-To Library
    Informatica Knowledge Base
    Informatica Support YouTube Channel
    Informatica Marketplace
    Informatica Velocity
    Informatica Global Customer Support

Chapter 1: Introduction to Data Quality Repository Migration
  Overview of Repository Migration
  Informatica Data Quality 8.6.2 Repository Features
  Data Quality Plan and Mapping Comparisons
  Changes to Data Quality Transformations
  Changes to Data Quality Sources and Targets
  Component Comparison Checklist
  Changes to Reference Data
  Migration and Data Profiling

Chapter 2: Migrating Repository and Reference Data
  Overview of Migration Process
  Migration Report Files
    PackageReport Status Information
  Migration Prerequisites
    Migration.Properties File Settings
    Finding the EDR.Port Value
    Database Considerations
    Reference Table Considerations
  Exporting Data from the Data Quality 8.6.2 Repository
  Exporting Data from the 8.6.2 Workbench Machine
  Exporting Data from the 8.6.2 Server Machine
  Importing Data to Informatica Data Quality 9.0.1

Chapter 3: Troubleshooting Migration of Data Quality Objects
  Overview
  Token Labeling and Token Parsing
  Rule-Based Analyzer Components Without Input Fields
  Text Qualifier Support for Sources and Targets
  Empty Reference Tables
  Data Quality 8.6.2 Settings Not Available in Data Quality 9.0.1
    Changes in Address Validator Functionality
    Identity Match Settings
    Partial Support for the Context Parser Merge Option
    Substring Dictionary Labeling

Preface
The Informatica Data Quality Repository Migration Guide is written for data quality developers. This guide
assumes that you have an understanding of data quality concepts, flat file and relational database concepts,
and the database engines in your environment. This guide also assumes that you are familiar with the
concepts presented in the Informatica Developer User Guide.

Informatica Resources
Informatica My Support Portal
As an Informatica customer, you can access the Informatica My Support Portal at
http://mysupport.informatica.com.
The site contains product information, user group information, newsletters, access to the Informatica
customer support case management system (ATLAS), the Informatica How-To Library, the Informatica
Knowledge Base, Informatica Product Documentation, and access to the Informatica user community.

Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you
have questions, comments, or ideas about this documentation, contact the Informatica Documentation team
through email at infa_documentation@informatica.com. We will use your feedback to improve our
documentation. Let us know if we can contact you regarding your comments.
The Documentation team updates documentation as needed. To get the latest documentation for your
product, navigate to Product Documentation from http://mysupport.informatica.com.

Informatica Product Availability Matrixes


Product Availability Matrixes (PAMs) indicate the versions of operating systems, databases, and other types
of data sources and targets that a product release supports. You can access the PAMs on the Informatica My
Support Portal at https://mysupport.informatica.com/community/my-support/product-availability-matrices.

Informatica Web Site


You can access the Informatica corporate web site at http://www.informatica.com. The site contains
information about Informatica, its background, upcoming events, and sales offices. You will also find product
and partner information. The services area of the site includes important information about technical support,
training and education, and implementation services.

Informatica How-To Library


As an Informatica customer, you can access the Informatica How-To Library at
http://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more
about Informatica products and features. It includes articles and interactive demonstrations that provide
solutions to common problems, compare features and behaviors, and guide you through performing specific
real-world tasks.

Informatica Knowledge Base


As an Informatica customer, you can access the Informatica Knowledge Base at
http://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known
technical issues about Informatica products. You can also find answers to frequently asked questions,
technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge
Base, contact the Informatica Knowledge Base team through email at KB_Feedback@informatica.com.

Informatica Support YouTube Channel


You can access the Informatica Support YouTube channel at http://www.youtube.com/user/INFASupport. The
Informatica Support YouTube channel includes videos about solutions that guide you through performing
specific tasks. If you have questions, comments, or ideas about the Informatica Support YouTube channel,
contact the Support YouTube team through email at supportvideos@informatica.com or send a tweet to
@INFASupport.

Informatica Marketplace
The Informatica Marketplace is a forum where developers and partners can share solutions that augment,
extend, or enhance data integration implementations. By leveraging any of the hundreds of solutions
available on the Marketplace, you can improve your productivity and speed up time to implementation on
your projects. You can access Informatica Marketplace at http://www.informaticamarketplace.com.

Informatica Velocity
You can access Informatica Velocity at http://mysupport.informatica.com. Developed from the real-world
experience of hundreds of data management projects, Informatica Velocity represents the collective
knowledge of our consultants who have worked with organizations from around the world to plan, develop,
deploy, and maintain successful data management solutions. If you have questions, comments, or ideas
about Informatica Velocity, contact Informatica Professional Services at ips@informatica.com.

Informatica Global Customer Support


You can contact a Customer Support Center by telephone or through Online Support.
Online Support requires a user name and password. You can request a user name and password at
http://mysupport.informatica.com.
The telephone numbers for Informatica Global Customer Support are available from the Informatica web site
at http://www.informatica.com/us/services-and-training/support-services/global-support-centers/.

CHAPTER 1

Introduction to Data Quality Repository Migration
This chapter includes the following topics:

Overview of Repository Migration

Informatica Data Quality 8.6.2 Repository Features

Data Quality Plan and Mapping Comparisons

Changes to Data Quality Transformations

Changes to Data Quality Sources and Targets

Component Comparison Checklist

Changes to Reference Data

Migration and Data Profiling

Overview of Repository Migration


Informatica provides batch files that you can use to export the contents of an Informatica Data Quality 8.6.2
repository to a 9.0.1 Model repository.
The batch files perform the following tasks:

Export Data Quality 8.6.2 repository objects and reference data to the file system in XML format.

Convert the 8.6.2 repository objects to 9.0.1 format.

Import the reference data as reference tables to the 9.0.1 Model repository and staging area.

You complete the migration in the Developer tool by importing the XML package containing the
transformation, mapplet, and mapping XML to the Model repository.
Note: If a Data Quality 8.6.2 object reads a database source, the migration process preserves the database
connection information. You do not need to re-create the database connection in Data Quality 9.0.1.
To migrate from Data Quality 8.6.2 to Data Quality 9.0.1 HotFix 1, run the migration files associated with Data
Quality 9.0.1 HotFix 1.
To migrate from Data Quality 8.6.2 to Data Quality 9.0.1, run the migration files associated with Data
Quality 9.0.1.
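
Because the export step writes repository objects and reference data to the file system as XML, a well-formedness check on the exported files may help catch truncated or damaged output before the import step. The following is a minimal Python sketch under stated assumptions, not part of the Informatica tooling: the function name `check_exported_xml` and the export directory are hypothetical, and this guide does not prescribe such a check.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def check_exported_xml(export_dir):
    """Return (well_formed, malformed) lists of XML file names under export_dir.

    export_dir is a hypothetical location; point it at wherever the
    export batch file wrote its XML output in your environment.
    """
    well_formed, malformed = [], []
    for path in sorted(Path(export_dir).rglob("*.xml")):
        try:
            # ET.parse raises ParseError if the file is not well-formed XML.
            ET.parse(str(path))
            well_formed.append(path.name)
        except ET.ParseError:
            malformed.append(path.name)
    return well_formed, malformed
```

The check only confirms that each file parses as XML. It says nothing about whether the content is valid for import to the Model repository.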

Informatica Data Quality 8.6.2 Repository Features


The Informatica Data Quality 8.6.2 repository shows the following similarities and differences when compared
with the 9.0.1 Model repository:

The Informatica Data Quality 8.6.2 repository contains two types of object: projects and plans. The 8.6.2
repository does not store transformation or data source definitions as separate objects. The 8.6.2
repository stores all metadata as XML.

An 8.6.2 repository project is similar to a 9.0.1 Model repository project. Both display user-defined folders
in the repository structure.

An 8.6.2 plan equates to a mapping in the 9.0.1 Model repository. A plan contains a data source and data
target connected by zero or more transformations. It runs in the same manner as a mapping.

The Informatica Data Quality 8.6.2 user creates and runs plans in a client application called Data Quality
Workbench. The application installs with a local repository. Informatica Data Quality 8.6.2 enables remote
clients to connect to an 8.6.2 repository in a client-server manner, but all Informatica Data Quality 8.6.2
repositories are identical.

Data Quality Plan and Mapping Comparisons


The migration process converts all Informatica Data Quality 8.6.2 plans to 9.0.1 mappings. Each mapping
has a folder in the Model repository.
Some sources, targets, and transformations in the migrated plans convert directly to sources, targets, and
transformations in the Model repository. Some sources, targets, and transformations convert to multiple
objects or to mapplets in the Model repository.

Changes to Data Quality Transformations


Some transformations are functionally identical across the product versions, while others convert to different
transformations. Some transformations do not migrate.
The following types of transformation change can occur:

The 8.6.2 transformation has a direct counterpart in 9.0.1. Informatica Data Quality 9.0.1 includes
transformations that are effectively copies of 8.6.2 transformations. For example, the Merge, ToUpper,
and Rule-Based Analyzer transformations in Informatica Data Quality 8.6.2 become the Merge, Case, and
Decision transformations, respectively, in Informatica Data Quality 9.0.1.

9.0.1 transformations provide equivalent functionality to or have evolved from 8.6.2 transformations. For
example, the 9.0.1 Comparison transformation combines the functionality of the Bigram, Jaro, Hamming
Distance, and Edit Distance transformations. These 8.6.2 transformations convert seamlessly to a
Comparison transformation.

The 8.6.2 transformation does not have a direct counterpart in 9.0.1 but the transformation functionality is
maintained in other transformations. In such cases, the 8.6.2 transformation metadata transfers to other
transformations. For example, the Word Manager transformation does not migrate to 9.0.1, but its
metadata transfers to the Standardizer transformation, which enables the same functionality.

The 8.6.2 transformation is not supported in 9.0.1 and the transformation functionality does not transfer to
other transformations. In such cases, the 8.6.2 transformation input and output metadata is applied to
another transformation, for example an Expression transformation.
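
For migration planning, the conversion examples named in this section can be collected into a small lookup. This Python sketch is illustrative only: it holds just the transformations mentioned above, the category labels paraphrase the list items, and `describe_conversion` is a hypothetical helper name rather than anything in the product.

```python
# Examples taken from the text of this section; not a complete list.
DIRECT_COUNTERPART = {
    "Merge": "Merge",
    "ToUpper": "Case",
    "Rule-Based Analyzer": "Decision",
}
EQUIVALENT_OR_EVOLVED = {
    "Bigram": "Comparison",
    "Jaro": "Comparison",
    "Hamming Distance": "Comparison",
    "Edit Distance": "Comparison",
}
METADATA_TRANSFERRED = {
    "Word Manager": "Standardizer",
}

CATEGORIES = (
    ("direct counterpart", DIRECT_COUNTERPART),
    ("equivalent or evolved", EQUIVALENT_OR_EVOLVED),
    ("metadata transferred", METADATA_TRANSFERRED),
)


def describe_conversion(name):
    """Return (category, 9.0.1 transformation) for an 8.6.2 transformation name."""
    for category, table in CATEGORIES:
        if name in table:
            return category, table[name]
    # Anything not named in this section is outside the sketch.
    return "not listed", None
```

For example, `describe_conversion("Jaro")` returns `("equivalent or evolved", "Comparison")`. The fourth category, unsupported transformations whose port metadata is applied to another transformation, is omitted because this section names no 8.6.2 example for it.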

Changes to Data Quality Sources and Targets


Some Informatica Data Quality 8.6.2 sources and targets are fully compatible with 9.0.1 data source and
target definitions. For example, a CSV Source from Informatica Data Quality 8.6.2 migrates to a file-based
data source object in 9.0.1. These sources and targets convert seamlessly to 9.0.1 source and target
definitions.
Some 8.6.2 sources and targets incorporate transformation functionality and do not have a one-to-one
correspondence with source and target definitions in 9.0.1. They convert to 9.0.1 sources and targets and
also generate 9.0.1 transformations that perform the operations configured in 8.6.2.
The following types of source and target do not correspond one-to-one with source and target definitions in
9.0.1:

Sources and targets used in grouping data records before duplicate analysis.

Sources and targets used in field matching procedures.

Sources and targets used in identity matching procedures.
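
The difference between one-to-one conversions and conversions that also generate transformations can be sketched with a few rows drawn from the Component Comparison Checklist in this chapter. This is a hedged Python illustration: the dictionary covers only a sample of components, and the helper names `is_one_to_one` and `generated_transformations` are hypothetical.

```python
# Sample rows from the Component Comparison Checklist; not the full table.
CHECKLIST_SAMPLE = {
    "CSV Source": ["File-based data source"],
    "DB Source": ["Relational data source"],
    "DB Match Source": ["Relational data source", "Match transformation"],
    "DB Target": ["SQL transformation", "Relational data target"],
    "CSV Dual Match Source": [
        "File-based data source",
        "File-based data source",
        "Match transformation",
    ],
}


def is_one_to_one(component):
    """True if the 8.6.2 component converts to exactly one 9.0.1 object."""
    return len(CHECKLIST_SAMPLE[component]) == 1


def generated_transformations(component):
    """List the 9.0.1 transformations generated alongside the source or target."""
    return [obj for obj in CHECKLIST_SAMPLE[component]
            if obj.endswith("transformation")]
```

A component such as DB Match Source is not one-to-one in this sense, because the migration yields both a relational data source and a Match transformation.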

Component Comparison Checklist


The following table lists the source, target, and transformation components available in Data Quality
Workbench and describes how they convert to objects in the 9.0.1 Model repository:

8.6.2 Component                     9.0.1 Component
----------------------------------  ----------------------------------------------------------------
Aggregation                         Aggregator transformation
Association [for PowerCenter]       Association transformation
Bigram                              Comparison transformation
Character Labeler                   Labeler transformations
Consolidation [for PowerCenter]     Consolidation transformation
Context Parser                      Labeler and Parser transformations. The Parser transformation is set to pattern-based parsing mode.
Count                               Mapplet containing Aggregator, Union, Expression, Joiner, Sorter, and Filter transformations
CSV Dual Match Source               Two file-based data sources and a Match transformation
CSV Identity Group Source           File-based data source and Match transformation. May convert to a mapplet.
CSV Match Target                    File-based data target
CSV Match Source                    File-based data target and Match transformation
CSV Merge Target                    File-based data target
CSV Target                          File-based data target
CSV Source                          File-based data source
DB Identity Group Source            Relational data source and Match transformation. May convert to a mapplet.
DB Match Source                     Relational data source and Match transformation
DB Report Target                    File-based data target
DB Target                           SQL transformation and relational data target
DB Source                           Relational data source
Dual Group Source                   Multiple file-based data sources and Union transformation if required. May convert to a mapplet.
Edit Distance                       Comparison transformation
Fixed Width Target                  File-based data target
Fixed Width Source                  File-based data source
Global AV [Address Doctor engine]   Address Validator transformation. This transformation needs additional configuration following import to the 9.0.1 Model repository.
Global AV [Melissa Data engine]     Address Validator transformation. This transformation needs additional configuration following import to the 9.0.1 Model repository.
Global AV [QAS engine]              Address Validator transformation. This transformation needs additional configuration following import to the 9.0.1 Model repository.
Global AV [SDK]                     Not supported
Group Target                        Flat-file data target and Sorter and Expression transformations
Group Source                        Multiple file-based data sources and Union transformation if required. May convert to a mapplet.
Hamming Distance                    Comparison transformation
Identity Group Target               Mapplet output transformation
Identity Match                      Match transformation
Jaro Distance                       Comparison transformation
Match Key Target                    Relational data target

Merge

Merge transformation

MinAvgMax

Mapplet containing Aggregator, Union, Expression, Joiner, and Router


transformations

Missing Values

Mapplet containing Aggregator, Expression, Joiner transformations

Mixed Field Matcher

Not supported

Normalization [SDK]

Not supported

NYSIIS

Key Generator transformation

Parsing [SDK]

Not supported

Profile Standardizer

Parser transformation set to pattern-based parsing mode

Range Counter

Linear Range: Aggregator, Expression, Joiner, and Sorter transformations


Variable Range: Aggregator, Expression, Union transformations

12

Realtime Target

Mapplet containing data target

Realtime Source

Mapplet containing data source

Report Target

Flat-file data target

Rule Based Analyzer

Decision transformation

SAP Target

Not supported

SAP Source

Not supported

Scripting

Not supported

Search Replace

Standardizer transformation

Similarity [SDK]

Not supported

Soundex

Key Generator transformation

Splitter

Labeler, Parser, and Expression transformations. The Parser is set to patternbased parsing mode.

Sum

Mapplet containing Aggregator, Expression, Joiner, Sorter, and Union


transformations

To Upper

Case Converter transformation

Token Labeler

Labeler transformation

Token Parser

Parser transformation

Chapter 1: Introduction to Data Quality Repository Migration

8.6.2 Component

9.0.1 Component

Weight Based Analyzer

Weighted Average transformation

Word Manager

Standardizer transformation

Changes to Reference Data


Informatica Data Quality 8.6.2 can read reference data from dictionary files and database tables. The
migration process can export this data for use in Data Quality 9.0.1.
The migration process exports the following types of reference data:

Reference data that you created in file or database form. If you created database dictionaries in Data
Quality 8.6.2, the export process converts them to files. The import process reads reference data files into
the 9.0.1 Model repository and staging area.

Informatica dictionary files that the process does not recognize as part of the Data Quality 9.0.1 Content
Installer file set. The process exports Country Pack and Region Pack files.

The migration process does not export the following types of reference data:

Address reference data

Identity population data

Informatica reference data shipped by default with the Data Quality 9.0.1 Content Installer

Note: Each version of Informatica 9.0.1 performs reference data migration in a different way. You must run
migration files that are compatible with your version of Informatica 9.0.1.
The following table describes the differences between each release:

| Informatica Release | Dictionary File Treatment | Cross-Version Compatibility |
|---|---|---|
| Data Quality 9.0.1 | Does not copy Informatica reference dictionary files. | Not compatible with other versions of Data Quality 9.0.1. |
| Data Quality 9.0.1 HotFix 1 | Does not copy Informatica reference dictionary files. | Compatible with Data Quality 9.0.1 HotFix 2. |
| Data Quality 9.0.1 HotFix 2 | Does not copy a dictionary file if the Content Installer contains an updated version of the file. Copies all other dictionary files. | Compatible with Data Quality 9.0.1 HotFix 1. |


Run the Data Quality Content Installer to install all Informatica reference data.
The migration process can recognize that a plan reads reference data when it exports the plan from the 8.6.2
repository. In such cases, the migration process retains the link between the plan and the reference data in
the exported XML. When you import the project and the plan metadata to the 9.0.1 Model repository, the
reference data is copied into reference tables in the Model repository and staging area. The mapping created
for the plan reads the reference data from these tables. You do not need to reconnect the mapping to the
reference tables.


The migration process recognizes Informatica reference data even if the reference data file name has
changed between versions 8.6.2 and 9.0.1. If an 8.6.2 plan reads a reference data file that is represented by
a reference table in 9.0.1, the migration process updates the imported mapping to read the new reference
table.
The migration process requires that Data Quality 8.6.2 dictionaries use UTF-8 encoding. If your Data Quality
8.6.2 dictionaries use encodings other than UTF-8, convert the dictionaries to UTF-8 before migration.
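
One way to prepare the dictionaries is to re-encode them with a short script. The following Python sketch is not part of the migration package. It assumes that the source files use Windows-1252 (`cp1252`), which is only a placeholder: replace it with the encoding that your 8.6.2 dictionaries actually use, and work on backup copies. The function name is illustrative.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite a dictionary file in place as UTF-8.

    source_encoding is an assumption: set it to the encoding that the
    8.6.2 dictionary actually uses. Returns True if the file was rewritten.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()
    try:
        # If the bytes already decode as UTF-8, leave the file untouched.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
        file_path.write_bytes(text.encode("utf-8"))
        return True
```

Files that contain only ASCII characters are already valid UTF-8, so the function leaves them unchanged.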

Migration and Data Profiling


Informatica Data Quality 8.6.2 performs profiling differently from Informatica Data Quality 9.0.1.
Informatica Data Quality 8.6.2 users create plans to profile data sources and write the results to data targets.
The migration process preserves the logic of these plans, so that the files or database tables written by the
mapping in 9.0.1 contain data that corresponds to profile results.


CHAPTER 2

Migrating Repository and Reference Data

This chapter includes the following topics:

Overview of Migration Process

Migration Report Files

Migration Prerequisites

Exporting Data from the Data Quality 8.6.2 Repository

Exporting Data from the 8.6.2 Workbench Machine

Exporting Data from the 8.6.2 Server Machine

Importing Data to Informatica Data Quality 9.0.1

Overview of Migration Process


To migrate repository and reference data, you must run batch files provided by Informatica. Informatica
provides the files in the IDQMigration.zip file.
IDQMigration.zip contains the following files:

ClientPackage. Exports the 8.6.2 repository contents and copies reference dictionary data to the file
system. The batch process compresses the files and saves them in a format that the ServerImport batch
file can read.
You can append parameters to the ClientPackage batch file to read plan metadata from the file system
and not from the 8.6.2 repository. You must use these parameters when migrating metadata from a Data
Quality Server repository.

ServerImport. Extracts and writes reference metadata to the 9.0.1 Model repository. Extracts and writes
reference data to the 9.0.1 staging database. The batch file also saves plan metadata in a format that the
9.0.1 Model repository can read. It does not write the plan metadata to the Model repository.

Note: You must manually import the plan metadata to the 9.0.1 Model repository.


Migration Report Files


The migration process creates HTML report files when you run the ClientPackage and ServerImport batch
files.
The ClientPackage report files describe the status of the objects you export from Data Quality 8.6.2. The
ServerImport report files describe the status of the objects you import to Data Quality 9.0.1.
Review the reports as you run each batch file to verify the success of each stage in the process. Take note of
any objects that do not export or import as expected or that require user tuning for use in Data Quality 9.0.1.

ClientPackage Report Files


ClientPackage.bat creates a single report file named PackageReport.html. It writes this file to the Package
directory.
The Package directory is located in the same directory as ClientPackage.bat.

ServerImport Report Files


ServerImport.bat creates a report file for each plan imported to the 9.0.1 Model repository. It also creates a
summary file named ServerMigrationReport.html.
ServerImport.bat writes the report files to the migration_reports directory. This directory is located in the
same directory as ServerImport.bat.
Note: ClientPackage and ServerImport also create log files in their respective directories. These files provide
additional information on the success of export and import operations.

PackageReport Status Information


The ClientPackage process generates a report named PackageReport.html. Review this report file before you
run the server import process. If you find issues in PackageReport.html, you can address them before you
import items to Data Quality 9.0.1.
Pay attention to the following items:
Warnings and Errors
If the report includes warnings or errors, you must manually edit the items affected. You can edit them
following import to Data Quality 9.0.1, or you can edit the plans or files in your 8.6.2 environment before
proceeding. If you make any changes in Data Quality 8.6.2, rerun the ClientPackage batch file.
Unused Dictionaries
An unused dictionary is not used by any plan packaged for migration. If you have many unused
dictionaries, you may want to exclude them from the 9.0.1 import. Use the RTM.ImportSet property in the
migration.properties file to control the import of dictionary files.
Missing Dictionaries
A missing dictionary is one that a plan is configured to read but that is absent from the expected location
on the Data Quality 8.6.2 machine. If you continue with the migration, you must edit any mapping
configured to read such a dictionary.
If you change your dictionary location or replace missing dictionaries in Data Quality 8.6.2, rerun the
ClientPackage batch file.


Migration Prerequisites
You must verify that the client batch file can access all Informatica Data Quality 8.6.2 objects and data. You
must also understand the changes that migrated objects can undergo during the migration process.
Before you begin the migration process, answer the following questions:
Do the plans read reference data provided by Informatica?
Informatica Data Quality 8.6.2 uses dictionary files as reference data. If you migrate plans that read
dictionary files, you must verify that the dictionaries are accessible on the Data Quality 8.6.2 machine.
The migration process reads the location of the dictionary files from the Data Quality config.xml file.
Default location for configuration file: [install_dir]\config.xml
Example: C:\Program Files\Informatica Data Quality\config.xml
Default location for dictionaries: [install_dir]\Dictionaries
Example: C:\Program Files\Informatica Data Quality\Dictionaries
Note: The migration process ignores most Informatica dictionary files when it exports items from Data
Quality 8.6.2. Use the Data Quality Content Installer to add Informatica reference data to Informatica
Data Quality 9.0.1. Ensure that you include Country Pack dictionaries and premium address reference
data files read by the 8.6.2 plans when you run the Content Installer.
Run the Server and Client Content Installers before you perform any migration tasks on an Informatica
Data Quality 9.0.1 machine.
Do the plans read from or write to database tables?
If the plans read from or write to a database, take note of the database connection details. Verify that the
9.0.1 Data Integration Service can access the database host machines.
If the plans read from or write to files, copy these files to a location accessible to the 9.0.1 Data
Integration Service. You can set the location of source and target files in the migration.properties file.
Is Informatica Data Quality 9.0.1 installed, and are the required services running?
The following 9.0.1 services must be running before you import migrated files:

Model Repository Service

Data Integration Service

Analyst Service

Have you created a project in the Model repository for the data you want to import?
Create this project before you import migrated files. Create a folder in the project to store the reference
tables created from the 8.6.2 dictionary files.
Have you reviewed the migration.properties file?
Before you run the ServerImport process on the Data Quality 9.0.1 system, you must review the
migration.properties file and verify that the property settings are correct for your environment and the
migration objects.


Migration.Properties File Settings


Before you migrate repository objects or files from Data Quality 8.6.2, review the settings in the
migration.properties file. Verify that the settings are correct for your Data Quality environment and the items
that you migrate. You find this file in the Config directory of the IDQMigration.zip package.
The ServerImport process reads all properties in this file. The ServerImport and ClientPackage processes
read the Migration.Formatter, Migration.LogLevel, Report.Format, and Report.Generate properties.
The following table describes the most frequently used options in migration.properties:

| Property | Description |
|---|---|
| DSO.DefaultSourceFolder | The path to the folder that you want to contain flat file data sources in Data Quality 9.0.1. Set this property if you want all flat file data objects to read data from a single location. This location must be accessible in the Data Quality 9.0.1 server environment. |
| DSO.DefaultTargetFolder | The path to the folder that you want to contain flat file data targets in Data Quality 9.0.1. The ServerImport process reads this property. Set this property if you want all flat file data objects to write data to a single location. This location must be accessible in the Data Quality 9.0.1 server environment. |
| EDR.Host | The Data Integration Service host machine. |
| EDR.Port | The port number that the ServerImport process uses to communicate with Informatica 9.0.1 services. This port number must match the Service Manager port used during the Data Quality 9.0.1 installation process. Default is 6006. |
| Locale.Client | The locale used by Informatica Developer. An incorrect locale may result in incorrect settings on metadata items such as match thresholds. Default is en. |
| Migration.Formatter | The quantity of additional information that ClientPackage and ServerImport can write to log messages. Enter Default to enable ClientPackage and ServerImport to add a single line of information to log messages, in addition to the information defined by Migration.LogLevel. Enter Custom to enable ClientPackage and ServerImport to add multiple lines. Default is Custom. |
| Migration.LogLevel | The level of logging performed during the ClientPackage and ServerImport processes. Default is Normal. If you see issues during the ServerImport process, set this property to Debug and rerun the process. |
| Report.Format | The file format of report files. Set the property to HTML or XML. Default is HTML. |
| Report.Generate | Determines if the ClientPackage and ServerImport processes generate report files. Ensure that this property is set to Yes. You must review the report files to verify the results of each process. Default is Yes. |
| RTM.AtService | The Analyst Service name in Data Quality 9.0.1. |
| RTM.ContentProject | The Model repository project that contains reference data installed by the Content Installer. If the plans you export from Data Quality 8.6.2 read Informatica dictionaries, the migration process can link the imported transformations to the Informatica 9.0.1 reference data. Set RTM.MapRTM to Yes to enable imported objects to read Informatica reference data. |
| RTM.ContentRootDirectory | The folder within RTM.ContentProject that contains reference data read by the imported transformations. |
| RTM.Host | The Model repository host machine name in Data Quality 9.0.1. |
| RTM.ImportSet | Determines the dictionary files that are written as reference tables during the import process. If the ClientPackage process identifies a large quantity of unused dictionaries, set this to UsedOnly. Default is All. |
| RTM.MapRTM | Determines if imported objects read Informatica reference data installed to Data Quality 9.0.1 in place of the dictionary files configured in Data Quality 8.6.2. The recommended setting is Yes. If set to No, ServerImport creates empty reference tables, and imported transformations reference the empty tables. Default is No. |
| RTM.Repository | The Model repository name in Data Quality 9.0.1. |
| RTM.UserProject | The Model repository project that reference data and mappings import to. Create the project before you run ServerImport. |
| RTM.UserRootDirectory | The folder within RTM.UserProject to contain the reference data read by the imported objects. Create the folder before you run ServerImport. |
| Server.Password | Password for Data Quality 9.0.1. You must have read and write permissions on the project folders that you import to. |
| Server.UserName | User name for Data Quality 9.0.1. You must have read and write permissions on the project folders that you import to. |
| Stage.Oracle, Stage.SqlServer, Stage.ODBC, Stage.MySQL | The staging database type. If you have configured a staging database and schema of a particular type, update the property with the name of the connection that uses the database and schema. Default for each property is blank. |
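
Before you run ServerImport, it can help to check that the connection-related properties are not blank. The following Python sketch is a convenience only and is not shipped with the migration package. It assumes that migration.properties uses plain `key=value` lines with `#` or `!` comment lines, which this guide does not confirm, and the `EXPECTED` list is an illustrative checklist drawn from the table above rather than an official list of mandatory properties.

```python
def load_properties(text):
    """Parse simple key=value lines; '#' and '!' start comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Illustrative checklist built from the property table, not an official list.
EXPECTED = [
    "EDR.Host", "EDR.Port", "RTM.Host", "RTM.Repository", "RTM.AtService",
    "RTM.UserProject", "RTM.UserRootDirectory",
    "Server.UserName", "Server.Password",
]


def missing_settings(props):
    """Return the expected property names that are absent or blank."""
    return [name for name in EXPECTED if not props.get(name)]
```

Read the file contents into a string, pass the string to `load_properties`, and review the names that `missing_settings` returns before you start the import.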

Finding the EDR.Port Value


If you know the Service Manager port number that was defined during the Data Quality 9.0.1 installation
process, use it as the EDR.Port value in migration.properties. The default Service Manager port number is
6006.
If you do not know the Service Manager port number, you can find it in nodemeta.xml in the Data Quality
9.0.1 installation.
1. Find nodemeta.xml.
The default location of this file is
<Informatica_services_installation_directory>/isp/config

2. Copy the file to another location.

3. Open the file in a text editor.

4. Search the file for a string in this format:

<address imx:id="ID_2" xsi:type="common:NodeAddress" host="servername"
httpPort="6005" port="6006"/> <portals>
The port value in this file is the Service Manager port number. In this example, the number is 6006.

Use the port value from nodemeta.xml as the EDR.Port value.
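
The search for the port value can also be scripted. The following Python sketch assumes that nodemeta.xml contains an `address` element with the attribute layout shown in the example above; the function name is illustrative. Run it against a copy of the file, not the original.

```python
import re


def service_manager_port(nodemeta_text):
    """Return the port attribute of the address element, or None."""
    # Find the <address ...> element first, then read its attributes.
    element = re.search(r"<address\b[^>]*>", nodemeta_text)
    if element is None:
        return None
    # \bport matches the port attribute but not httpPort.
    port = re.search(r'\bport="(\d+)"', element.group(0))
    return int(port.group(1)) if port else None
```

A regular expression is used here in place of an XML parser so that the sketch does not depend on the namespace declarations in the file.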

Database Considerations
If Data Quality 8.6.2 uses multiple staging database types, you must ensure that a database or schema and a
connection object exist for each type. Add the name of each connection object to the Stage.<databasetype>
property in the migration.properties file.
Informatica connects to Microsoft SQL Server and MySQL databases through ODBC. The migration process
creates connection objects for these databases, but you must ensure that the ODBC Data Source has been
created in the Informatica 9.0.1 server environment.
For example, if Data Quality 8.6.2 connects to a Microsoft SQL Server database through the ODBC DSN
'MS_SQL_CONNECTION', the connection object that the migration process creates also uses this name. If the
ODBC DSN on the server has a different name, edit the name of the DSN or the connection object so that they
are consistent.
If you are using MySQL as a staging database in Data Quality 8.6.2, complete the following tasks before
migrating to Data Quality 9.0.1:
1. Create a MySQL 5.0 database that is accessible to the Data Quality 9.0.1 server system.

2. Create an ODBC DSN to access the MySQL 5.0 database.

3. In the Data Quality 9.0.1 connection manager, create a new ODBC connection object that uses this
ODBC DSN.

4. Open the migration.properties file and set the value of the Stage.MySQL property to the name of the
ODBC connection object.

When you run the server import process, plans that refer to the Data Quality 8.6.2 staging database are
configured to use the connection object specified by Stage.MySQL.

Reference Table Considerations


Use migration.properties to configure how the ServerImport process imports reference data to Data Quality
9.0.1. You can also configure migration.properties to link imported mapping objects to the Informatica
reference tables defined in the Model repository.

Reference Data Import


Use the RTM.ImportSet property in migration.properties to determine how ServerImport imports reference
data.
Set the property to one of the following options:
All
Imports all reference data from the migration package.

UsedOnly
Imports reference data from the migration package if the data is used by an imported mapping object.
You may want to prevent the import of unused reference data if your Data Quality 8.6.2 installation
contains many unused dictionaries.
None
Does not import any dictionaries.
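
The effect of each RTM.ImportSet value can be modeled in a few lines. The following Python sketch only illustrates the behavior described above. It is not how ServerImport is implemented, and the function and argument names are hypothetical.

```python
def dictionaries_to_import(import_set, packaged, used):
    """Return the dictionaries imported for a given RTM.ImportSet value.

    packaged: dictionaries in the migration package.
    used: dictionaries read by at least one imported mapping object.
    """
    if import_set == "All":
        return sorted(packaged)
    if import_set == "UsedOnly":
        return sorted(set(packaged) & set(used))
    if import_set == "None":
        return []
    raise ValueError("unknown RTM.ImportSet value: " + import_set)
```

Comparing the All and UsedOnly results for the dictionaries listed in PackageReport.html shows how many unused dictionaries the UsedOnly setting would exclude.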

Reference Table Creation


Use the RTM.MapRTM property to determine if the mappings you import to Data Quality 9.0.1 read reference
tables in the Model repository.
Set the property to one of the following options:
Yes
If ServerImport finds a dictionary file link during the conversion of a plan to a mapping, it searches for a
corresponding reference table in the Model repository and links the new mapping to that reference table.
No
If ServerImport finds a dictionary file link during the conversion of a plan to a mapping, it creates a
dummy reference table for the dictionary and adds a warning to the report files for the affected plan.

Informatica Reference Tables


The Data Quality Content Installer provides a default set of reference tables for Data Quality 9.0.1. These
reference tables include updated versions of the standard dictionary files issued for Data Quality 8.6.2.
If your plans use Informatica dictionaries in Data Quality 8.6.2, install the Informatica reference tables to Data
Quality 9.0.1 before proceeding with the migration. Select the Informatica reference data file when you run
the Content Installer. Update the RTM.ContentProject and RTM.ContentRootDirectory properties in the
migration.properties file with the project and directory locations of the reference data.
Note: If you download and install an accelerator pack for Data Quality 9.0.1, you must also install the default
Content Installer reference data when you install the accelerator reference data. Complete this task even if
you have previously installed the default Content Installer data. This step maintains the links between the
Informatica reference tables in the Model repository and the mappings that read them.

Exporting Data from the Data Quality 8.6.2 Repository

Run ClientPackage.bat to create a compressed file that contains repository metadata and reference data.
The export procedures differ for Data Quality Workbench and Data Quality Server installations.
Workbench installations
Run ClientPackage.bat on the Workbench machine to export Workbench repository contents and
reference data to a compressed migration file.

Server installations
Use Workbench to export plan metadata to the file system on a Server repository machine. Run
ClientPackage.bat on the Server repository machine to create a compressed migration file that contains
the plan metadata and the Server reference data.

Exporting Data from the 8.6.2 Workbench Machine


Run ClientPackage.bat on the Workbench machine to export the local repository contents and reference
data.
ClientPackage.bat creates a compressed file that contains these items. The default name for the file is
MigrationPackage.zip.
1. Copy the IDQMigration.zip file to the Workbench host machine.

2. Extract the IDQMigration.zip file.

3. Run ClientPackage.bat. You can apply the following optional parameters:

| Option | Description |
|---|---|
| -d | Path to the directory where the batch file creates MigrationPackage.zip. |
| -f | Path to a folder that contains plans already exported from the Data Quality repository. Use this parameter if you have used Workbench to export repository contents to file. Do not use with the -r parameter. |
| -o | Alternative name for MigrationPackage.zip. |
| -r | Server repository export only. Specifies that ClientPackage.bat will run on a remote Data Quality repository and extract plan and reference data to the Workbench file system. |
| -s | Staging directory for temporary files. |

The batch process creates the compressed migration file that contains the exported repository and
reference data files.

4. Review the PackageReport.html file created by the export process.
This report lists the plans, reference files, and database connection information copied to the migration
file during the export process.
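
If you script the export, a small helper can assemble the ClientPackage.bat arguments and enforce the rule that -f and -r are not combined. The following Python sketch rests on assumptions: this guide does not show the exact command-line syntax, so the sketch supposes that each option is followed by its value as a separate argument and that -r takes no value. The function and argument names are hypothetical.

```python
def client_package_command(output_dir=None, plans_dir=None, package_name=None,
                           remote=False, staging_dir=None):
    """Build an argument list for ClientPackage.bat from the documented options."""
    if plans_dir and remote:
        # The option table says -f must not be combined with -r.
        raise ValueError("-f and -r cannot be used together")
    command = ["ClientPackage.bat"]
    for flag, value in (("-d", output_dir), ("-f", plans_dir),
                        ("-o", package_name), ("-s", staging_dir)):
        if value:
            # Assumed syntax: the flag followed by its value.
            command += [flag, value]
    if remote:
        command.append("-r")
    return command
```

The returned list can be passed to a process launcher such as `subprocess.run` from the directory that contains ClientPackage.bat.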


Exporting Data from the 8.6.2 Server Machine


Use Data Quality Workbench to export the Server repository contents to the Server file system. Run
ClientPackage.bat on the Server machine to read the exported repository data and copy reference data from
the Server machine.
ClientPackage.bat creates a compressed file that contains these items. The default name for the file is
MigrationPackage.zip.
1. Use Data Quality Workbench to export plans from the Data Quality Server repository.
Create the exported XML files on the Server repository machine.

2. Copy the IDQMigration.zip file to the Server repository host machine.

3. Extract the IDQMigration.zip file.

4. Run ClientPackage.bat. You can apply the following parameters:

| Option | Description |
|---|---|
| -d | Optional. Path to the directory where the batch file creates the MigrationPackage.zip. |
| -f | Path to the directory that contains plans exported from the Data Quality Server repository. Do not use with the -r parameter. |
| -o | Optional. Alternative name for MigrationPackage.zip. |
| -r | Optional. Specifies that ClientPackage.bat will run on a remote repository and extract plan metadata to the local file system. You can run ClientPackage.bat -r in place of step 1 if you do not have access permissions on the Server repository machine. |
| -s | Optional. Staging directory for temporary files. |

The batch process creates a MigrationPackage.zip file that contains the exported repository and data files.

Importing Data to Informatica Data Quality 9.0.1


You import data in two steps. First, you run the ServerImport batch process to create one or more XML files
that are compatible with Data Quality 9.0.1. Then, you import the XML files to the 9.0.1 Model repository and
staging area.
1. Copy the compressed file that contains the exported repository objects and reference data to the Model
repository host machine.
The default name of the compressed repository objects file is MigrationPackage.zip.

2. Extract the contents of the file.

3. Open the migration.properties file from the extracted files. Update migration.properties with the following
information:

Informatica 9.0.1 Model repository host machine name.

Model repository name.

Analyst Service name.

Name of the project and folder to contain the user-defined reference data.

Name of the project and folder that contains the Informatica reference data.

The locale setting on the Data Quality Workbench that last edited the plans. If required, use the
Locale.Client property to set the locale.

4. Set the RTM.MapRTM property to Yes.

5. Save and close migration.properties.

6. Run ServerImport.bat or ServerImport.sh. Use the following parameters:

| Option | Description |
|---|---|
| -f | Required. Path to the folder that contains the compressed repository objects file. |
| -d | Optional. Specify an alternative Output folder for the mapping XML file. |
| -o | Optional. Specify a new name for the exported objects XML file. |
| -p | Optional. Specify an alternative properties file to migration.properties. |
| -s | Optional. Specify an alternative temporary working folder. |

The ServerImport batch process creates the XML that you import to the 9.0.1 Model repository and
staging area. The process creates the XML in a subfolder named Output in the folder that contains the
ServerImport batch file.

7. Review the ServerMigrationReport.html file and any other report files created by the ServerImport
process. Address any issues that arose during the process.

8. Copy the XML file or files to the Developer tool machine.

9. Open the Developer tool and import the mapping XML to the Model repository.
If the number of 8.6.2 plans is too large for a single XML file, ServerImport creates multiple files. In this
case, you must import the files in numerical order by file name, starting with the lowest-numbered file.
When the plans are imported, they appear as mappings in a folder in the Model repository. Any 8.6.2
transformations that convert to mapplets are also saved to a separate folder.

Note: The migration process creates connection objects for the databases that are used by the migrated
plans. Update the JDBC string information in the database connection objects.
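
When ServerImport creates multiple XML files, a plain alphabetical listing can place a file numbered 10 before a file numbered 2. The following Python sketch orders file names by the number they contain so that you can import them in the required sequence. The file names in the usage note are hypothetical, because this guide does not state the naming pattern, and the sketch assumes that the sequence number is the last run of digits in each name.

```python
import re


def numeric_import_order(file_names):
    """Sort file names by the last run of digits in each name."""
    def file_number(name):
        digits = re.findall(r"\d+", name)
        # Names without a number sort first.
        return int(digits[-1]) if digits else -1
    return sorted(file_names, key=file_number)
```

For example, the hypothetical names `Mappings_10.xml`, `Mappings_2.xml`, and `Mappings_1.xml` are returned with `Mappings_1.xml` first and `Mappings_10.xml` last.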


CHAPTER 3

Troubleshooting Migration of Data Quality Objects

This chapter includes the following topics:

Overview

Token Labeling and Token Parsing

Rule-Based Analyzer Components Without Input Fields

Text Qualifier Support for Sources and Targets

Empty Reference Tables

Data Quality 8.6.2 Settings Not Available in Data Quality 9.0.1

Overview
Because some transformations function differently in Data Quality 9.0.1 than in Data Quality 8.6.2, you may
observe differences in mapping configuration and data output following migration. For example, the 9.0.1
Labeler transformation outputs some tokens differently than the 8.6.2 Token Labeler component.
Review the ServerMigrationReport.html file and associated report files to troubleshoot the effects of
migration on Data Quality 8.6.2 plans.

Token Labeling and Token Parsing


Labeling and parsing functionality differs from Data Quality 8.6.2 to Data Quality 9.0.1.

Labeler Transformation Changes


In Data Quality 8.6.2, you use the Token Labeler or Character Labeler components to label strings. Data
Quality 9.0.1 combines token labeling and character labeling functionality in the Labeler transformation.


Review the following changes in labeling behavior:

In Data Quality 9.0.1, some token label names have changed. The following table lists the changes:

| Data Quality 8.6.2 | Data Quality 9.0.1 |
|---|---|
| codesymbol | code |
| num | number |
| text | word |

A Data Quality 9.0.1 mapping does not preserve delimiters between tokens in the token stream. The
migration process replaces all delimiters with a space character.

Data Quality 9.0.1 does not support case-sensitive token labels. The following table shows how different
versions of Data Quality label the string seattle Seattle SEATTLE:

| Data Quality 8.6.2 | Data Quality 9.0.1 |
|---|---|
| word Word WORD | word word word |
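
When you compare the output of an 8.6.2 plan with the output of the migrated mapping, it can help to normalize the 8.6.2 label strings first. The following Python sketch applies only the documented label renames and the loss of case sensitivity. It assumes that labels in the 8.6.2 output are separated by whitespace, and the function name is illustrative. It approximates the expected 9.0.1 labels and does not reproduce the Labeler transformation.

```python
# Label renames listed for Data Quality 9.0.1.
RENAMED_LABELS = {"codesymbol": "code", "num": "number", "text": "word"}


def to_901_labels(labels_862):
    """Map a space-separated 8.6.2 label string to the 9.0.1 equivalents."""
    converted = []
    for label in labels_862.split():
        lowered = label.lower()  # 9.0.1 labels are not case-sensitive
        converted.append(RENAMED_LABELS.get(lowered, lowered))
    # 9.0.1 separates tokens with a single space character.
    return " ".join(converted)
```

Labels that still differ after this normalization point to a real change in labeling behavior, such as the narrower definition of a word token in 9.0.1.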

Labeler and Parser Transformation Changes


In Data Quality 8.6.2, you use the Token Parser, Context Parser, or Profile Standardizer to parse strings by
the type of information they contain. Data Quality 9.0.1 combines parsing functionality in the Parser
transformation. In Data Quality 8.6.2, parsing components can read outputs from a Token Labeler
component, and in Data Quality 9.0.1, the Parser transformation can read output from a Labeler
transformation.
Review the following change in parsing behavior, and note that the Labeler transformation is affected:

The migration process converts Token Labeler word outputs and Token Parser text outputs into word
tokens in the 9.0.1 Labeler and Parser transformations. However, the definition of a 9.0.1 word token is
more restrictive than 8.6.2 text output settings and word tokens. A mapping that uses the 9.0.1 word token
may have different output than the original plan.

Rule-Based Analyzer Components Without Input Fields
In Data Quality 8.6.2, valid plans can contain Rule-Based Analyzer components with no input ports. When
you migrate these plans to 9.0.1 mappings, the migration process creates Decision transformations with no
input fields.
After migration, you need to connect a field from an upstream component in the mapping to the Decision
transformation.


Text Qualifier Support for Sources and Targets


In Data Quality 9.0.1, you can use single or double quotes as text qualifiers for sources and targets. In Data
Quality 8.6.2, you can use other symbols as text qualifiers for sources and targets.
When you migrate 8.6.2 sources or targets that use text qualifiers other than quotation marks and
apostrophes, the migration process sets the text qualifier to None.
If a Data Quality 8.6.2 plan uses text qualifiers not supported in Data Quality 9.0.1, migration may produce
mappings that process data differently than the 8.6.2 plan. For example, consider that an 8.6.2 plan contains
a semicolon as a text qualifier and that you process the following data stream in both the 8.6.2 plan and the
9.0.1 mapping:
data1,;data2, data3;,data4
In Data Quality 8.6.2, the plan parses the data into three data columns:
data1 | data2, data3 | data4
In Data Quality 9.0.1, the mapping parses the data into four columns:
data1 | ;data2 | data3; | data4
The differences that you experience in data processing depend on the unique combination of input data and
text qualifiers that you use.
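
The difference can be reproduced outside Data Quality with any delimited-text parser. The following sketch uses the Python standard library csv module as a stand-in. It is an illustration of the parsing behavior only and does not call any Data Quality component.

```python
import csv

line = "data1,;data2, data3;,data4"

# Semicolon as the text qualifier, as in the 8.6.2 plan described above.
with_qualifier = next(csv.reader([line], delimiter=",", quotechar=";"))

# No text qualifier, which is what migration sets for unsupported qualifiers.
without_qualifier = next(
    csv.reader([line], delimiter=",", quoting=csv.QUOTE_NONE)
)

print(len(with_qualifier), with_qualifier)
print(len(without_qualifier), without_qualifier)
```

With the qualifier, the parser returns three fields and keeps the embedded comma inside the second field. Without it, the semicolons become ordinary characters and the line splits into four fields. Note that the Python reader keeps the space that follows the second comma, so a comparison against real output may need to trim whitespace.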

Empty Reference Tables


If you find a reference table within a mapping folder, the reference table is empty.
Empty reference tables can occur for the following reasons:

Informatica reference data was not installed to the Data Quality 9.0.1 system.
The ServerImport process assumes that you have run the Content Installer to add Informatica reference
data to Data Quality 9.0.1. The ClientPackage process does not migrate an Informatica dictionary if the
dictionary data is present in the Content Installer file set or in an accelerator pack.
Run the Content Installer on the Data Integration Service machine, or on a machine that the Data
Integration Service can access, to install the reference data you need. Then run the ServerImport process.

A customized dictionary file was not present on the Data Quality 8.6.2 machine when the ClientPackage
process ran.
The ClientPackage process migrates any dictionary file that is not included in an Informatica reference
data set or accelerator pack. In this case, find the missing dictionary, add it to the dictionary folder on the
Data Quality 8.6.2 machine, and rerun the ClientPackage process.


Data Quality 8.6.2 Settings Not Available in Data Quality 9.0.1
Some Data Quality 8.6.2 settings for Token Labeler, Identity Match, Character Labeler, and Context Parser
components are not available in Data Quality 9.0.1 transformations.

Changes in Address Validator Functionality


Some Address Validator transformation ports may be unconnected in an imported mapping. This can arise if
the functionality of a port has changed between 8.6.2 and 9.0.1.
You can resolve this issue in the following ways:

Edit the Address Validator transformation and delete the bad links.

Edit the Address Validator transformation to use different ports and recreate the port links.

Create another Address Validator transformation in Data Quality 9.0.1, configure it in parse-only mode,
and connect the required data ports to the new transformation.

The following table lists the affected ports in Data Quality 8.6.2 and the alternative ports you can use on a
parse-only Address Validator transformation:

Data Quality 8.6.2 Port Name     Data Quality 9.0.1 Port Name
GlobalAV_ParsedSuiteName         SubBuildingNumber1
GlobalAV_ParsedSuiteRange        SubBuildingName1
GlobalAV_ParsedPre_Direction     StreetPreDirectional1
GlobalAV_ParsedSuffix            StreetPostDescriptor1
GlobalAV_ParsedPost_Direction    StreetPostDirectional1
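
If you keep a list of the unconnected ports found after import, the table above can serve as a lookup. The following is a minimal sketch under that assumption: the dictionary holds only the five pairs from this guide, and the function name and the input list are hypothetical.

```python
# Port pairs copied from the table above (8.6.2 port -> 9.0.1 alternative).
ALTERNATIVE_PORTS = {
    "GlobalAV_ParsedSuiteName": "SubBuildingNumber1",
    "GlobalAV_ParsedSuiteRange": "SubBuildingName1",
    "GlobalAV_ParsedPre_Direction": "StreetPreDirectional1",
    "GlobalAV_ParsedSuffix": "StreetPostDescriptor1",
    "GlobalAV_ParsedPost_Direction": "StreetPostDirectional1",
}


def suggest_ports(unconnected_ports):
    """Map each unconnected 8.6.2 port to its listed alternative.

    Ports that do not appear in the table map to None and need manual review.
    """
    return {port: ALTERNATIVE_PORTS.get(port) for port in unconnected_ports}


for old, new in suggest_ports(
    ["GlobalAV_ParsedSuffix", "GlobalAV_ParsedSuiteName"]
).items():
    print(old, "->", new)
```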

Identity Match Settings


The following Identity Match component settings in Data Quality 8.6.2 are not available in Data Quality 9.0.1:
Stop on Error
When selected, this option stops identity matching plans if the plan generates an error.
Identity Match Decision Port
The output from this port indicates the status of identity match decisions.

Partial Support for the Context Parser Merge Option


Data Quality 9.0.1 provides partial support for the functionality enabled by the Merge option in the 8.6.2
Context Parser component.
The Data Quality 8.6.2 Context Parser component could merge many repeated tokens into a single block of
data. The migration process uses Context Parser components to create Labeler and Pattern-Based Parser
transformations. Although these transformations can merge repeated tokens, input that produces more than
five repeated tokens may result in output data that differs from Data Quality 8.6.2 plan output.
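
To find records that are likely to be affected, you can scan label sequences for runs of more than five identical tokens. The following sketch assumes that you have exported each record as a list of token labels. The function name and the sample data are hypothetical, and only the threshold of five comes from this guide.

```python
from itertools import groupby

MAX_REPEATS = 5  # threshold stated in the text above


def long_runs(labels, limit=MAX_REPEATS):
    """Return (label, run_length) for each run of identical consecutive
    labels that is longer than the limit."""
    runs = []
    for label, group in groupby(labels):
        length = sum(1 for _ in group)
        if length > limit:
            runs.append((label, length))
    return runs


sample = ["word"] * 6 + ["number"] + ["word"] * 2
print(long_runs(sample))
```

Records for which the function returns a non-empty list are candidates for a side-by-side comparison of 8.6.2 plan output and 9.0.1 mapping output.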

Substring Dictionary Labeling


In the Data Quality 8.6.2 Character Labeler component, you can set a start and length value to limit the
amount of text processed when using a dictionary. You cannot use these settings in Data Quality 9.0.1.
In Data Quality 9.0.1, the Labeler transformation applies dictionary labeling across all of the data in the input
port. The migration operation adds a warning to the migration report file if substring dictionary labeling is
present in a migrated plan.
