Informatica PowerCenter®
(Version 7.1.1)
Informatica PowerCenter Workflow Administration Guide
Version 7.1.1
August 2004
This software and documentation contain proprietary information of Informatica Corporation; they are provided under a license agreement
containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No
part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)
without prior consent of Informatica Corporation.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software
license agreement as provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR
12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to
us in writing. Informatica Corporation does not warrant that this documentation is error free.
Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks
of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be
trade names or trademarks of their respective owners.
Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington
University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.
Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU
Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials
are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the
implied warranties of merchantability and fitness for a particular purpose.
Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark
of Meta Integration Technology, Inc.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/).
The Apache Software is Copyright (c) 1999-2004 The Apache Software Foundation. All rights reserved.
DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information
provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or
changes in the products described in this documentation at any time without notice.
Table of Contents
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxv
New Features and Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvi
PowerCenter 7.1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvi
PowerCenter 7.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxviii
PowerCenter 7.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlii
About Informatica Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlviii
About this Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlix
Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlix
Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Webzine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Developer Network . . . . . . . . . . . . . . . . . . . . . . l
Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . li
Writing Historical Information to the Repository . . . . . . . . . . . . . . . . . . 10
Sending Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Data Transformation Manager (DTM) Process . . . . . . . . . . . . . . . . . . . . . . . 11
Reading the Session Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Expanding Variables and Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Creating the Session Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Validating Code Pages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Verifying Connection Object Permissions . . . . . . . . . . . . . . . . . . . . . . . 12
Running Pre-Session Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Running the Processing Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Running Post-Session Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Sending Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Understanding Processing Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Thread Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Threads and Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
PowerCenter Server Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Reading Source Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Blocking Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Block Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
System Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
CPU Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Load Manager Shared Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
DTM Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Cache Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Code Pages and Data Movement Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
ASCII Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Unicode Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Output Files and Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
PowerCenter Server Log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Workflow Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Session Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Session Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Performance Detail File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Row Error Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Recovery Tables and Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Indicator File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Output File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Zooming the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Working with Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Viewing Object Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Entering Descriptions for Repository Objects . . . . . . . . . . . . . . . . . . . . . 73
Renaming Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Checking Out and In Versioned Repository Objects . . . . . . . . . . . . . . . . . . . 74
Checking Out Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Checking In Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Searching For Versioned Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Copying Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Copying Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Copying Workflow Segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Comparing Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Steps for Comparing Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Working with Metadata Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Creating a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Editing a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Deleting a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Keyboard Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Scheduling a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Creating a Reusable Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Configuring Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Editing Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Disabling Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Validating a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Expression Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Task Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Workflow Properties Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Running Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Running the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Selecting a Server to Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . 122
Assigning the PowerCenter Server to a Workflow . . . . . . . . . . . . . . . . . 122
Running a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Running a Part of a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Running a Task in the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Suspending the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Configuring Suspension Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Stopping or Aborting the Workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Server Handling of Stop and Abort . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Stopping or Aborting a Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Working with File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Configuring Source Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Configuring Fixed-Width File Properties . . . . . . . . . . . . . . . . . . . . . . . 220
Configuring Delimited File Properties . . . . . . . . . . . . . . . . . . . . . . . . . 222
Configuring Line Sequential Buffer Length . . . . . . . . . . . . . . . . . . . . . 225
Server Handling for File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Multibyte Character Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Null Character Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Row Length Handling for Fixed-Width Flat Files . . . . . . . . . . . . . . . . . 228
Numeric Data Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Using a File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Creating the File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Configuring a Session to Use a File List . . . . . . . . . . . . . . . . . . . . . . . . 231
Working with File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Configuring Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Configuring Fixed-Width Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Configuring Delimited Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
Server Handling for File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Writing to Fixed-Width Flat Files with Relational Target Definitions . . 268
Writing to Fixed-Width Files with Flat File Target Definitions . . . . . . . 269
Writing Multibyte Data to Fixed-Width Flat Files . . . . . . . . . . . . . . . . 270
Null Characters in Fixed-Width Files . . . . . . . . . . . . . . . . . . . . . . . . . 272
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Writing Metadata to Flat File Targets . . . . . . . . . . . . . . . . . . . . . . . . . 273
Working with Heterogeneous Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Recovering a Suspended Workflow with Sequential Sessions . . . . . . . . . 305
Recovering a Suspended Workflow with Concurrent Sessions . . . . . . . . 306
Steps for Recovering a Suspended Workflow . . . . . . . . . . . . . . . . . . . . . 307
Recovering a Failed Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Recovering a Failed Workflow with Sequential Sessions . . . . . . . . . . . . . 308
Recovering a Failed Workflow with Concurrent Sessions . . . . . . . . . . . . 309
Steps for Recovering a Failed Workflow . . . . . . . . . . . . . . . . . . . . . . . . 310
Recovering a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Recovering Sequential Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Recovering Concurrent Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Steps for Recovering a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Server Handling for Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Verifying Recovery Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Running Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Completing Unrecoverable Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Chapter 16: Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
Workflow Log Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Configuring Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Viewing Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Session Log Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Load Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Detailed Transformation Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Configuring Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Viewing Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Locating Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Reading Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Calculating the Lookup Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 629
Calculating the Lookup Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 631
Rank Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Calculating the Rank Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Calculating the Rank Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 763
Welcome to PowerCenter, Informatica's software product that delivers an open, scalable data
integration solution addressing the complete life cycle of data integration projects,
including data warehouses and data marts, data migration, data synchronization, and
information hubs. PowerCenter combines the latest technology enhancements for reliably
managing data repositories and delivering information resources in a timely, usable, and
efficient manner.
The PowerCenter metadata repository coordinates and drives a variety of core functions,
including extracting, transforming, loading, and managing data. The PowerCenter Server can
extract large volumes of data from multiple platforms, handle complex transformations on the
data, and support high-speed loads. PowerCenter can simplify and accelerate the process of
moving data warehouses from development to test to production.
New Features and Enhancements
This section describes new features and enhancements to PowerCenter 7.1.1, 7.1, and 7.0.
PowerCenter 7.1.1
This section describes new features and enhancements to PowerCenter 7.1.1.
Data Profiling
♦ Data sampling. You can create a data profile for a sample of source data instead of the
entire source. You can view a profile from a random sample of data, a specified percentage
of data, or for a specified number of rows starting with the first row.
♦ Verbose data enhancements. You can specify the type of verbose data you want the
PowerCenter Server to write to the Data Profiling warehouse. The PowerCenter Server can
write all rows, the rows that meet the business rule, or the rows that do not meet the
business rule.
♦ Session enhancement. You can save sessions that you create from the Profile Manager to
the repository.
♦ Domain Inference function tuning. You can configure the Data Profiling Wizard to filter
the Domain Inference function results. You can configure a maximum number of patterns
and a minimum pattern frequency. You may want to narrow the scope of patterns returned
to view only the primary domains, or you may want to widen the scope of patterns
returned to view exception data.
♦ Row Uniqueness function. You can determine unique rows for a source based on a
selection of columns for the specified source.
♦ Define mapping, session, and workflow prefixes. You can define default mapping,
session, and workflow prefixes for the mappings, sessions, and workflows generated when
you create a data profile.
♦ Profile mapping display in the Designer. The Designer displays profile mappings under a
profile mappings node in the Navigator.
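The Row Uniqueness function described above can be pictured with a short sketch (for illustration only; this is not PowerCenter code, and the sample column names are invented):

```python
from collections import Counter

def unique_rows(rows, key_columns):
    """Return the rows whose values in key_columns occur exactly once."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [r for r in rows if counts[tuple(r[c] for c in key_columns)] == 1]

customers = [
    {"id": 1, "name": "Ada",  "city": "London"},
    {"id": 2, "name": "Ada",  "city": "London"},    # duplicate on (name, city)
    {"id": 3, "name": "Alan", "city": "Bletchley"},
]
print(unique_rows(customers, ["name", "city"]))     # only the third row is unique
```

Uniqueness is evaluated over the selected columns only, so rows that differ in other columns can still be flagged as duplicates.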
PowerCenter Server
♦ Code page. PowerCenter supports additional Japanese language code pages, such as JIPSE-kana, JEF-kana, and MELCOM-kana.
♦ Flat file partitioning. When you create multiple partitions for a flat file source session, you
can configure the session to create multiple threads to read the flat file source.
♦ pmcmd. You can use parameter files that reside on a local machine with the Startworkflow
command in the pmcmd program. When you use a local parameter file, pmcmd passes
variables and values in the file to the PowerCenter Server.
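A local parameter file is a plain text file of section headings and assignments. A minimal sketch follows (the folder, workflow, and parameter names are invented for illustration; see the parameter file documentation for the exact heading syntax):

```
[MyFolder.WF:wf_daily_load]
$$LoadDate=08/01/2004
$InputFile1=/data/src/orders.dat
```

When you pass a file like this to the Startworkflow command, pmcmd sends these values to the PowerCenter Server for use when the workflow runs.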
♦ SuSE Linux support. The PowerCenter Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM DB2, Oracle, and Sybase sources, targets, and repositories using
native drivers. Use ODBC drivers to access other sources and targets.
♦ Reserved word support. If any source, target, or lookup table name or column name
contains a database reserved word, you can create and maintain a file, reswords.txt,
containing reserved words. When the PowerCenter Server initializes a session, it searches
for reswords.txt in the PowerCenter Server installation directory. If the file exists, the
PowerCenter Server places quotes around matching reserved words when it executes SQL
against the database.
♦ Teradata external loader. When you load to Teradata using an external loader, you can
now override the control file. Depending on the loader you use, you can also override the
error, log, and work table names by specifying different tables on the same or different
Teradata database.
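The effect of reswords.txt on generated SQL can be sketched as follows (an illustration of the quoting behavior only, not PowerCenter code; the quote character actually applied depends on the database):

```python
def quote_reserved(identifiers, reserved_words, quote='"'):
    """Wrap any identifier that matches a reserved word in quotes."""
    reserved = {w.upper() for w in reserved_words}
    return [f"{quote}{name}{quote}" if name.upper() in reserved else name
            for name in identifiers]

# e.g. a table with columns named after database reserved words
cols = ["ORDER", "customer_id", "MONTH"]
print(quote_reserved(cols, ["order", "month"]))
# ['"ORDER"', 'customer_id', '"MONTH"']
```

Only names that match an entry in the reserved word list are quoted; all other identifiers pass through unchanged.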
Repository
♦ Exchange metadata with other tools. You can exchange source and target metadata with
other BI or data modeling tools, such as Business Objects Designer. You can export or
import multiple objects at a time. When you export metadata, the PowerCenter Client
creates a file format recognized by the target tool.
Repository Server
♦ pmrep. You can use pmrep to perform the following functions:
− Remove repositories from the Repository Server cache entry list.
− Enable enhanced security when you create a relational source or target connection in the
repository.
− Update a connection attribute value when you update the connection.
♦ SuSE Linux support. The Repository Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM DB2, Oracle, and Sybase repositories.
Security
♦ Oracle OS Authentication. You can now use Oracle OS Authentication to authenticate
database users. Oracle OS Authentication allows you to log on to an Oracle database if you
have a logon to the operating system. You do not need to know a database user name and
password. PowerCenter uses Oracle OS Authentication when the user name for an Oracle
connection is PmNullUser.
♦ Pipeline partitioning. You can create multiple partitions in a session containing web
service source and target definitions. The PowerCenter Server creates a connection to the
Web Services Hub based on the number of sources, targets, and partitions in the session.
XML
♦ Multi-level pivoting. You can now pivot more than one multiple-occurring element in an
XML view. You can also pivot the view row.
PowerCenter 7.1
This section describes new features and enhancements to PowerCenter 7.1.
Data Profiling
♦ Data Profiling for VSAM sources. You can now create a data profile for VSAM sources.
♦ Support for verbose mode for source-level functions. You can now create data profiles
with source-level functions and write data to the Data Profiling warehouse in verbose
mode.
♦ Aggregator function in auto profiles. Auto profiles now include the Aggregator function.
♦ Creating auto profile enhancements. You can now select the columns or groups you want
to include in an auto profile and enable verbose mode for the Distinct Value Count
function.
♦ Purging data from the Data Profiling warehouse. You can now purge data from the Data
Profiling warehouse.
♦ Source View in the Profile Manager. You can now view data profiles by source definition
in the Profile Manager.
♦ PowerCenter Data Profiling report enhancements. You can now view PowerCenter Data
Profiling reports in a separate browser window, resize columns in a report, and view
verbose data for Distinct Value Count functions.
♦ Prepackaged domains. Informatica provides a set of prepackaged domains that you can
include in a Domain Validation function in a data profile.
Documentation
♦ Web Services Provider Guide. This is a new book that describes the functionality of Real-time
Web Services. It also includes information from the version 7.0 Web Services Hub Guide.
♦ XML User Guide. This book consolidates XML information previously documented in the
Designer Guide, Workflow Administration Guide, and Transformation Guide.
Licensing
Informatica provides licenses for each CPU and each repository rather than for each
installation. Informatica provides licenses for product, connectivity, and options. You store
the license keys in a license key file. You can manage the license files using the Repository
Server Administration Console, the PowerCenter Server Setup, and the command line
program, pmlic.
PowerCenter Server
♦ 64-bit support. You can now run 64-bit PowerCenter Servers on AIX and HP-UX
(Itanium).
♦ Partitioning enhancements. If you have the Partitioning option, you can define up to 64
partitions at any partition point in a pipeline that supports multiple partitions.
♦ PowerCenter Server processing enhancements. The PowerCenter Server now reads a
block of rows at a time. This improves processing performance for most sessions.
♦ CLOB/BLOB datatype support. You can now read and write CLOB/BLOB datatypes.
Repository Server
♦ Updating repository statistics. PowerCenter now identifies and updates statistics for all
repository tables and indexes when you copy, upgrade, and restore repositories. This
improves performance when PowerCenter accesses the repository.
♦ Increased repository performance. You can increase repository performance by skipping
information when you copy, back up, or restore a repository. You can choose to skip MX
data, workflow and session log history, and deploy group history.
♦ pmrep. You can use pmrep to back up, disable, or enable a repository, delete a relational
connection from a repository, delete repository details, truncate log files, and run multiple
pmrep commands sequentially. You can also use pmrep to create, modify, and delete a
folder.
Repository
♦ Exchange metadata with business intelligence tools. You can export metadata to and
import metadata from other business intelligence tools, such as Cognos Report Net and
Business Objects.
♦ Object import and export enhancements. You can compare objects in an XML file to
objects in the target repository when you import objects.
♦ MX views. MX views have been added to help you analyze metadata stored in the
repository. REP_SERVER_NET and REP_SERVER_NET_REF views allow you to see
information about server grids. REP_VERSION_PROPS allows you to see the version
history of all objects in a PowerCenter repository.
Transformations
♦ Flat file lookup. You can now perform lookups on flat files. When you create a Lookup
transformation using a flat file as a lookup source, the Designer invokes the Flat File
Wizard. You can also use a lookup file parameter if you want to change the name or
location of a lookup between session runs.
♦ Dynamic lookup cache enhancements. When you use a dynamic lookup cache, the
PowerCenter Server can ignore some ports when it compares values in lookup and input
ports before it updates a row in the cache. Also, you can choose whether the PowerCenter
Server outputs old or new values from the lookup/output ports when it updates a row. You
might want to output old values from lookup/output ports when you use the Lookup
transformation in a mapping that updates slowly changing dimension tables.
♦ Union transformation. You can use the Union transformation to merge multiple sources
into a single pipeline. The Union transformation is similar to using the UNION ALL SQL
statement to combine the results from two or more SQL statements.
♦ Custom transformation API enhancements. The Custom transformation API includes
new array-based functions that allow you to create procedure code that receives and
outputs a block of rows at a time. Use these functions to take advantage of the
PowerCenter Server processing enhancements.
♦ Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.
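The Union transformation passes every row from every input into the output pipeline, duplicates included, much like UNION ALL. A short sketch of that merge (for illustration only, not PowerCenter code):

```python
from itertools import chain

def union_all(*pipelines):
    """Merge several row sources into one stream, keeping duplicates,
    as the Union transformation (and SQL UNION ALL) does."""
    return list(chain(*pipelines))

east = [("E1", 100), ("E2", 250)]
west = [("W1", 300), ("E2", 250)]   # the repeated row is kept
print(union_all(east, west))
```

Because no duplicate elimination occurs, the output row count is always the sum of the input row counts.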
Usability
♦ Viewing active folders. The Designer and the Workflow Manager highlight the active
folder in the Navigator.
♦ Enhanced printing. The quality of printed workspace has improved.
Version Control
You can run object queries that return shortcut objects. You can also run object queries based
on the latest status of an object. The query can return local objects that are checked out, the
latest version of checked in objects, or a collection of all older versions of objects.
Note: PowerCenter Connect for Web Services allows you to create sources, targets, and
transformations to call web services hosted by other providers. For more information, see the
PowerCenter Connect for Web Services User and Administrator Guide.
Workflow Monitor
The Workflow Monitor includes the following performance and usability enhancements:
♦ When you connect to the PowerCenter Server, you no longer distinguish between online
and offline mode.
♦ You can open multiple instances of the Workflow Monitor on one machine.
♦ You can simultaneously monitor multiple PowerCenter Servers registered to the same
repository.
♦ The Workflow Monitor includes improved options for filtering tasks by start and end
time.
♦ The Workflow Monitor displays workflow runs in Task view chronologically with the most
recent run at the top. It displays folders alphabetically.
♦ You can remove the Navigator and Output window.
XML Support
PowerCenter XML support now includes the following features:
♦ Enhanced datatype support. You can use XML schemas that contain simple and complex
datatypes.
♦ Additional options for XML definitions. When you import XML definitions, you can
choose how you want the Designer to represent the metadata associated with the imported
files. You can choose to generate XML views using hierarchy or entity relationships. In a
view with hierarchy relationships, the Designer expands each element and reference under
its parent element. When you create views with entity relationships, the Designer creates
separate entities for references and multiple-occurring elements.
♦ Synchronizing XML definitions. You can synchronize one or more XML definitions when
the underlying schema changes. You can synchronize an XML definition with any
repository definition or file used to create the XML definition, including relational sources
or targets, XML files, DTD files, or schema files.
♦ XML workspace. You can edit XML views and relationships between views in the
workspace. You can create views, add or delete columns from views, and define
relationships between views.
♦ Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.
♦ Support for circular references. Circular references occur when an element is a direct or
indirect child of itself. PowerCenter now supports XML files, DTD files, and XML
schemas that use circular definitions.
♦ Increased performance for large XML targets. You can create XML files of several
gigabytes in a PowerCenter 7.1 XML session by using the following enhancements:
− Spill to disk. You can specify the size of the cache used to store the XML tree. If the size
of the tree exceeds the cache size, the XML data spills to disk in order to free up
memory.
− User-defined commits. You can define commits to trigger flushes for XML target files.
− Support for multiple XML output files. You can output XML data to multiple XML
targets. You can also define the file names for XML output files in the mapping.
PowerCenter 7.0
This section describes new features and enhancements to PowerCenter 7.0.
Data Profiling
If you have the Data Profiling option, you can profile source data to evaluate its structure and
content and to detect patterns and exceptions. For example, you can determine implicit
datatypes, suggest candidate keys, detect data patterns, and evaluate join criteria. After you
create a profiling
warehouse, you can create profiling mappings and run sessions. Then you can view reports
based on the profile data in the profiling warehouse.
The PowerCenter Client provides a Profile Manager and a Profile Wizard to complete these
tasks.
Documentation
♦ Glossary. The Installation and Configuration Guide contains a glossary of new PowerCenter
terms.
♦ Installation and Configuration Guide. The connectivity information in the Installation
and Configuration Guide is consolidated into two chapters. This book now contains
chapters titled “Connecting to Databases from Windows” and “Connecting to Databases
from UNIX.”
♦ Upgrading metadata. The Installation and Configuration Guide now contains a chapter
titled “Upgrading Repository Metadata.” This chapter describes changes to repository
objects impacted by the upgrade process. The change in functionality for existing objects
depends on the version of the existing objects. Consult the upgrade information in this
chapter for each upgraded object to determine whether the upgrade applies to your current
version of PowerCenter.
Functions
♦ Soundex. The Soundex function encodes a string value into a four-character string.
SOUNDEX works for characters in the English alphabet (A-Z). It uses the first character
of the input string as the first character in the return value and encodes the remaining
three unique consonants as numbers.
♦ Metaphone. The Metaphone function encodes string values. You can specify the length of
the string that you want to encode. METAPHONE encodes characters of the English
language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
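The Soundex encoding described above can be sketched in Python. This is an illustrative
implementation of the classic Soundex algorithm, not PowerCenter's SOUNDEX function;
handling of empty strings and non-alphabetic input is simplified.

```python
def soundex(name):
    """Classic Soundex sketch: keep the first letter, encode the
    remaining consonants as digits, and pad or truncate to four
    characters."""
    codes = {}
    for digit, letters in [("1", "BFPV"), ("2", "CGJKQSXZ"),
                           ("3", "DT"), ("4", "L"),
                           ("5", "MN"), ("6", "R")]:
        for letter in letters:
            codes[letter] = digit

    name = name.upper()
    result = name[0]
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        if ch not in "HW":  # H and W do not separate same-code letters
            prev = digit
    return (result + "000")[:4]
```

For example, soundex("Robert") and soundex("Rupert") both encode to R163, since the
vowels are dropped and b and p share the same consonant code.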
Installation
♦ Remote PowerCenter Client installation. You can create a control file containing
installation information, and distribute it to other users to install the PowerCenter Client.
You access the Informatica installation CD from the command line to create the control
file and install the product.
PowerCenter Server
♦ DB2 bulk loading. You can enable bulk loading when you load to IBM DB2 8.1.
♦ Distributed processing. If you purchase the Server Grid option, you can group
PowerCenter Servers registered to the same repository into a server grid. In a server grid,
PowerCenter Servers balance the workload among all the servers in the grid.
♦ Row error logging. The session configuration object has new properties that allow you to
define error logging. You can choose to log row errors in a central location to help
understand the cause and source of errors.
♦ External loading enhancements. When using external loaders on Windows, you can now
choose to load from a named pipe. When using external loaders on UNIX, you can now
choose to load from staged files.
♦ External loading using Teradata Warehouse Builder. You can use Teradata Warehouse
Builder to load to Teradata. You can choose to insert, update, upsert, or delete data.
Additionally, Teradata Warehouse Builder can simultaneously read from multiple sources
and load data into one or more tables.
♦ Mixed mode processing for Teradata external loaders. You can now use data-driven load
mode with Teradata external loaders. When you select data-driven loading, the
PowerCenter Server flags rows for insert, delete, or update. It writes a column in the target
file or named pipe to indicate the update strategy. The control file uses these values to
determine how to load data to the target.
♦ Concurrent processing. The PowerCenter Server now reads data concurrently from
sources within a target load order group. This enables more efficient joins with minimal
usage of memory and disk cache.
♦ Real-time processing enhancements. You can now use real-time processing in sessions that
also process active transformations, such as the Aggregator transformation. You can apply
the transformation logic to rows defined by transaction boundaries.
Repository Server
♦ Object export and import enhancements. You can now export and import objects using
the Repository Manager and pmrep. You can export and import multiple objects and
object types. You can export and import objects with or without their dependent objects.
You can also export objects from a query result or object history.
♦ pmrep commands. You can use pmrep to perform change management tasks, such as
maintaining deployment groups and labels, checking in, deploying, importing, exporting,
and listing objects. You can also use pmrep to run queries. The deployment and object
import commands require you to use a control file to define options and resolve conflicts.
♦ Trusted connections. You can now use a Microsoft SQL Server trusted connection to
connect to the repository.
Security
♦ LDAP user authentication. You can now use default repository user authentication or
Lightweight Directory Access Protocol (LDAP) to authenticate users. If you use LDAP, the
repository maintains an association between your repository user name and your external
login name. When you log in to the repository, the security module passes your login name
to the external directory for authentication. The repository maintains a status for each
user. You can now enable or disable users from accessing the repository by changing the
status. You do not have to delete user names from the repository.
♦ Use Repository Manager privilege. The Use Repository Manager privilege allows you to
perform tasks in the Repository Manager, such as copying objects, maintaining labels, and
changing object status. You can perform the same tasks in the Designer and Workflow Manager if
you have the Use Designer and Use Workflow Manager privileges.
♦ Audit trail. You can track changes to repository users, groups, privileges, and permissions
through the Repository Server Administration Console. The Repository Agent logs
security changes to a log file stored in the Repository Server installation directory. The
audit trail log contains information, such as changes to folder properties, adding or
removing a user or group, and adding or removing privileges.
Transformations
♦ Custom transformation. Custom transformations operate in conjunction with procedures
you create outside of the Designer interface to extend PowerCenter functionality. The
Custom transformation replaces the Advanced External Procedure transformation. You can
create Custom transformations with multiple input and output groups, and you can
compile the procedure with any C compiler.
You can create templates that customize the appearance and available properties of a
Custom transformation you develop. You can specify the icons used for the transformation,
the colors, and the properties a mapping developer can modify. When you create a Custom
transformation template, distribute the template with the DLL or shared library you
develop.
♦ Joiner transformation. You can use the Joiner transformation to join two data streams that
originate from the same source.
Version Control
The PowerCenter Client and repository introduce features that allow you to create and
manage multiple versions of objects in the repository. Version control allows you to maintain
multiple versions of an object, control development on the object, track changes, and use
deployment groups to copy specific groups of objects from one repository to another. Version
control in PowerCenter includes the following features:
♦ Object versioning. Individual objects in the repository are now versioned. This allows you
to store multiple copies of a given object during the development cycle. Each version is a
separate object with unique properties.
♦ Check out and check in versioned objects. You can check out and reserve an object you
want to edit, and check in the object when you are ready to create a new version of the
object in the repository.
♦ Compare objects. The Repository Manager and Workflow Manager allow you to compare
two repository objects of the same type to identify differences between them. You can
compare Designer objects and Workflow Manager objects in the Repository Manager. You
can compare tasks, sessions, worklets, and workflows in the Workflow Manager. The
PowerCenter Client tools allow you to compare objects across open folders and
repositories. You can also compare different versions of the same object.
♦ Delete or purge a version. You can delete an object from view and continue to store it in
the repository. You can recover or undelete deleted objects. If you want to permanently
remove an object version, you can purge it from the repository.
♦ Deployment. Unlike copying a folder, copying a deployment group allows you to copy a
select number of objects from multiple folders in the source repository to multiple folders
in the target repository. This gives you greater control over the specific objects copied from
one repository to another.
♦ Deployment groups. You can create a deployment group that contains references to
objects from multiple folders across the repository. You can create a static deployment
group that you manually add objects to, or create a dynamic deployment group that uses a
query to populate the group.
♦ Labels. A label is an object that you can apply to versioned objects in the repository. This
allows you to associate multiple objects in groups defined by the label. You can use labels
to track versioned objects during development, improve query results, and organize groups
of objects for deployment or export and import.
♦ Queries. You can create a query that specifies conditions to search for objects in the
repository. You can save queries for later use. You can make a private query, or you can
share it with all users in the repository.
♦ Track changes to an object. You can view a history that includes all versions of an object
and compare any version of the object in the history to any other version. This allows you
to see the changes made to an object over time.
XML Support
PowerCenter contains XML features that allow you to validate an XML file against an XML
schema, declare multiple namespaces, use XPath to locate XML nodes, increase performance
for large XML files, format your XML file output for increased readability, and parse or
generate XML data from various sources. XML support in PowerCenter includes the
following features:
♦ XML schema. You can use an XML schema to validate an XML file and to generate source
and target definitions. XML schemas allow you to declare multiple namespaces so you can
use prefixes for elements and attributes. XML schemas also allow you to define some
complex datatypes.
♦ XPath support. The XML wizard allows you to view the structure of an XML schema. You
can use XPath to locate XML nodes.
♦ Increased performance for large XML files. When you process an XML file or stream, you
can set commits and periodically flush XML data to the target instead of writing all the
output at the end of the session. You can choose to append the data to the same target file
or create a new target file after each flush.
♦ XML target enhancements. You can format the XML target file so that you can easily view
the XML file in a text editor. You can also configure the PowerCenter Server to not output
empty elements to the XML target.
Usability
♦ Copying objects. You can now copy objects from all the PowerCenter Client tools using
the copy wizard to resolve conflicts. You can copy objects within folders, to other folders,
and to different repositories. Within the Designer, you can also copy segments of
mappings to a workspace in a new folder or repository.
♦ Comparing objects. You can compare workflows and tasks from the Workflow Manager.
You can also compare all objects from within the Repository Manager.
♦ Change propagation. When you edit a port in a mapping, you can choose to propagate
changed attributes throughout the mapping. The Designer propagates ports, expressions,
and conditions based on the direction that you propagate and the attributes you choose to
propagate.
♦ Enhanced partitioning interface. The Session Wizard is enhanced to provide a graphical
depiction of a mapping when you configure partitioning.
♦ Revert to saved. You can now revert to the last saved version of an object in the Workflow
Manager. When you do this, the Workflow Manager accesses the repository to retrieve the
last-saved version of the object.
♦ Enhanced validation messages. The PowerCenter Client writes messages in the Output
window that describe why it invalidates a mapping or workflow when you modify a
dependent object.
♦ Validate multiple objects. You can validate multiple objects in the repository without
fetching them into the workspace. You can save and optionally check in objects that
change from invalid to valid status as a result of the validation. You can validate sessions,
mappings, mapplets, workflows, and worklets.
♦ View dependencies. Before you edit or delete versioned objects, such as sources, targets,
mappings, or workflows, you can view dependencies to see the impact on other objects.
You can view parent and child dependencies and global shortcuts across repositories.
Viewing dependencies helps you modify objects and composite objects without breaking
dependencies.
♦ Refresh session mappings. In the Workflow Manager, you can refresh a session mapping.
About Informatica Documentation
The complete set of documentation for PowerCenter includes the following books:
♦ Data Profiling Guide. Provides information about how to profile PowerCenter sources to
evaluate source data and detect patterns and exceptions.
♦ Designer Guide. Provides information needed to use the Designer. Includes information to
help you create mappings, mapplets, and transformations. Also includes a description of
the transformation datatypes used to process and transform source data.
♦ Getting Started. Provides basic tutorials for getting started.
♦ Installation and Configuration Guide. Provides information needed to install and
configure the PowerCenter tools, including details on environment variables and database
connections.
♦ PowerCenter Connect® for JMS® User and Administrator Guide. Provides information
to install PowerCenter Connect for JMS, build mappings, extract data from JMS messages,
and load data into JMS messages.
♦ Repository Guide. Provides information needed to administer the repository using the
Repository Manager or the pmrep command line program. Includes details on
functionality available in the Repository Manager and Administration Console, such as
creating and maintaining repositories, folders, users, groups, and permissions and
privileges.
♦ Transformation Language Reference. Provides syntax descriptions and examples for each
transformation function provided with PowerCenter.
♦ Transformation Guide. Provides information on how to create and configure each type of
transformation in the Designer.
♦ Troubleshooting Guide. Lists error messages that you might encounter while using
PowerCenter. Each error message includes one or more possible causes and actions that
you can take to correct the condition.
♦ Web Services Provider Guide. Provides information you need to install and configure the Web
Services Hub. This guide also provides information about how to use the web services that the
Web Services Hub hosts. The Web Services Hub hosts Real-time Web Services, Batch Web
Services, and Metadata Web Services.
♦ Workflow Administration Guide. Provides information to help you create and run
workflows in the Workflow Manager, as well as monitor workflows in the Workflow
Monitor. Also contains information on administering the PowerCenter Server and
performance tuning.
♦ XML User Guide. Provides information you need to create XML definitions from XML,
XSD, or DTD files, and relational or other XML definitions. Includes information on
running sessions with XML data. Also includes details on using the midstream XML
transformations to parse or generate XML data within a pipeline.
About this Book
The Workflow Administration Guide is written for developers and administrators who are
responsible for creating workflows and sessions, running workflows, and administering the
PowerCenter Server. This guide assumes you have knowledge of your operating systems,
relational database concepts, and the database engines, flat files, or mainframe systems in your
environment. This guide also assumes you are familiar with the interface requirements for
your supporting applications.
The material in this book is available for online use.
Document Conventions
This guide uses the following formatting conventions:
italicized monospaced text This is the variable name for a value you enter as part of an
operating system command. This is generic text that should be
replaced with user-supplied values.
bold monospaced text This is an operating system command you enter from a prompt to
run a task.
Warning: The following paragraph notes situations where you can overwrite
or corrupt data, unless you follow the specified procedure.
Other Informatica Resources
In addition to the product manuals, Informatica provides these other resources:
♦ Informatica Customer Portal
♦ Informatica Webzine
♦ Informatica web site
♦ Informatica Developer Network
♦ Informatica Technical Support
The Informatica Developer Network site contains information on how to create, market, and
support customer-oriented add-on solutions based on Informatica’s interoperability interfaces.
Belgium
Phone: +32 15 281 702
Hours: 9 a.m. - 5:30 p.m. (local time)
France
Phone: +33 1 41 38 92 26
Hours: 9 a.m. - 5:30 p.m. (local time)
Germany
Phone: +49 1805 702 702
Hours: 9 a.m. - 5:30 p.m. (local time)
Netherlands
Phone: +31 306 082 089
Hours: 9 a.m. - 5:30 p.m. (local time)
Singapore
Phone: +65 322 8589
Hours: 9 a.m. - 5 p.m. (local time)
Switzerland
Phone: +41 800 81 80 70
Hours: 8 a.m. - 5 p.m. (local time)
Chapter 1
Overview
The PowerCenter Server moves data from sources to targets based on workflow and mapping
metadata stored in a repository. You can register multiple PowerCenter Servers to a repository.
A workflow is a set of instructions that describes how and when to run tasks related to
extracting, transforming, and loading data. The PowerCenter Server runs workflow tasks
according to the conditional links connecting the tasks. You can run a task by placing it in a
workflow.
When you have multiple PowerCenter Servers, you can assign a server to start a workflow or a
session. This allows you to distribute the workload. You can increase performance by using a
server grid to balance the workload. A server grid is a server object that allows you to
automate the distribution of sessions across multiple servers. For more information about
server grids, see “Working with Server Grids” on page 446.
A session is a type of workflow task. A session is a set of instructions that describes how to
move data from sources to targets using a mapping. Other workflow tasks include commands,
decisions, timers, pre-session SQL commands, post-session SQL commands, and email
notification. For details on workflow tasks, see “Working with Tasks” on page 131.
Use the Designer to import source and target definitions into the repository and to build
mappings. A mapping is a set of source and target definitions linked by transformation
objects that define the rules for data transformation. Use the Workflow Manager to develop
and manage workflows. Use the Workflow Monitor to monitor workflows and stop the
PowerCenter Server.
When a workflow starts, the PowerCenter Server retrieves mapping, workflow, and session
metadata from the repository to extract data from the source, transform it, and load it into
the target. It also runs the tasks in the workflow. The PowerCenter Server uses Load Manager
and Data Transformation Manager (DTM) processes to run the workflow.
Figure 1-1 shows the processing path between the PowerCenter Server, repository, source, and
target:
[Figure 1-1: the PowerCenter Server reads source data from the source, follows instructions
from metadata in the repository, and writes transformed data to the target.]
Workflow Processes
The PowerCenter Server uses both process memory and system shared memory to perform
these tasks. It runs as a daemon on UNIX and a service on Windows. The PowerCenter Server
uses the following processes to run a workflow:
♦ The Load Manager process. Starts and locks the workflow, runs workflow tasks, and starts
the DTM to run sessions.
♦ The Data Transformation Manager (DTM) process. Performs session validations. Creates
threads to initialize the session, read, write, and transform data, and handle pre- and post-
session operations.
Pipeline Partitioning
When running sessions, the PowerCenter Server can achieve high performance by
partitioning the pipeline and performing the extract, transformation, and load for each
partition in parallel. To accomplish this, use the following session and server configuration:
♦ Configure the session with multiple partitions.
♦ Install the PowerCenter Server on a machine with multiple CPUs.
You can configure the partition type at most transformations in the pipeline. The
PowerCenter Server can partition data using round-robin, hash, key-range, database
partitioning, or pass-through partitioning.
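As a sketch of how two of these partition types distribute rows, a minimal Python example
follows. It assumes nothing about PowerCenter internals: round-robin spreads rows evenly
across partitions, while hash partitioning sends rows with the same key to the same
partition.

```python
def round_robin(rows, num_partitions):
    """Round-robin sketch: distribute rows evenly across partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions

def hash_partition(rows, key, num_partitions):
    """Hash sketch: rows with equal key values land in the same
    partition. The hash function here is illustrative only."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions
```

Hash partitioning is the natural choice when downstream transformations, such as an
Aggregator, must see all rows for a given key in one partition.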
For relational sources, the PowerCenter Server creates multiple database connections to a
single source and extracts a separate range of data for each connection. For XML or file
sources, the PowerCenter Server reads multiple files concurrently. The files must have the
same structure or hierarchy.
When the PowerCenter Server transforms the partitions concurrently, it passes data between
the partitions as needed to perform operations such as aggregation. When the PowerCenter
Server loads relational data, it creates multiple database connections to the target and loads
partitions of data concurrently. When the PowerCenter Server loads data to file targets, it
creates a separate file for each partition. You can choose to merge the target files.
Figure 1-2 shows a mapping that contains two partitions:
For more information about pipeline partitioning, see “Pipeline Partitioning” on page 345.
[Figure: the PowerCenter Server communicates with the Repository Server over TCP/IP; the
Repository Agent accesses the PowerCenter repository through a Native/ODL connection.]
Table 1-1 summarizes the software you need to connect the PowerCenter Server to the
platform components, source databases, and target databases:
Load Manager Process
The Load Manager is the primary PowerCenter Server process. It accepts requests from the
PowerCenter Client and from pmcmd. The Load Manager runs and monitors the workflow. It
performs the following tasks:
♦ Manages workflow scheduling.
♦ Locks and reads the workflow.
♦ Reads the parameter file.
♦ Creates the workflow log file.
♦ Runs workflow tasks and evaluates the conditional links connecting tasks.
♦ Starts the DTM, which runs the session.
♦ Writes historical run information to the repository.
♦ Sends post-session email in the event of DTM failure.
When the Load Manager starts the DTM, it writes a message similar to the following to the
workflow log:
INFO : LM_36302 : (2076|2224) Started DTM process [pid = 508] for session
instance [s_BOOKINGS].
For more information on workflow log files, see “Log Files” on page 455.
For more information on session log files, see “Log Files” on page 455.
Thread Types
The master thread creates different types of threads for a session. The types of threads the
master thread creates depend on the following factors:
♦ Pre- and post-session properties
♦ Types of transformations in the mapping
Table 1-2 lists the types of threads that the master thread can create:
♦ Mapping thread. One thread for each session. Fetches session and mapping information,
compiles the mapping, and cleans up after session execution.
♦ Pre- and post-session threads. One thread each to perform pre- and post-session
operations.
♦ Reader thread. One thread for each partition for each source pipeline. Reads from sources.
Relational sources use relational reader threads, and file sources use file reader threads.
♦ Transformation thread. One or more transformation threads for each partition. Processes
data according to the transformation logic in the mapping.
♦ Writer thread. One thread for each partition, if a target exists in the source pipeline.
Writes to targets. Relational targets use relational writer threads, and file targets use file
writer threads.
The mapping in Figure 1-4 contains a single partition. In this case, the master thread creates
one reader, one transformation, and one writer thread to process the data. The reader thread
controls how the PowerCenter Server extracts source data and passes it to the source qualifier,
the transformation thread controls how the PowerCenter Server processes the data, and the
writer thread controls how the PowerCenter Server loads data to the target.
When the pipeline contains only a source definition, source qualifier, and a target definition,
the data bypasses the transformation threads, proceeding directly from the reader buffers to
the writer. This type of pipeline is a pass-through pipeline.
Figure 1-5 shows the threads for a pass-through pipeline with one partition:
Note: The previous examples assume that each session contains a single partition. For
information on how partitions and partition points affect thread creation, see “Threads and
Partitioning” on page 16.
Reader Threads
The master thread creates reader threads to extract source data. The number of reader threads
depends on the partitioning information for each pipeline. The number of reader threads
equals the number of partitions. For more information, see “Threads and Partitioning” on
page 16.
The PowerCenter Server creates an SQL statement for each reader thread to extract data from
a relational source. For file sources, the PowerCenter Server can create multiple threads to
read a single source.
Writer Threads
The master thread creates writer threads to load target data. The number of writer threads
depends on the partitioning information for each pipeline. If the pipeline contains one
partition, the master thread creates one writer thread. If it contains multiple partitions, the
master thread creates multiple writer threads. For more information, see “Threads and
Partitioning” on page 16.
Each writer thread creates connections to the target databases to load data. If the target is a
file, each writer thread creates a separate file. You can configure the session to merge these
files.
If the target is relational, the writer thread takes data from buffers and commits it to session
targets. When loading targets, the writer commits data based on the commit interval in the
session properties. You can configure a session to commit data based on the number of source
rows read, the number of rows written to the target, or the number of rows that pass through
a transformation that generates transactions, such as a Transaction Control transformation.
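The commit-interval behavior can be sketched as a simple counter. This assumes a target-based
commit interval and ignores source-based and transaction-based commits, as well as the final
commit at end of load:

```python
def count_commits(rows_written, commit_interval):
    """Target-based commit sketch: the writer issues a commit each
    time the number of rows written reaches a multiple of the
    commit interval."""
    commits = 0
    for written in range(1, rows_written + 1):
        if written % commit_interval == 0:
            commits += 1  # stand-in for a database COMMIT
    return commits
```

With a commit interval of 10,000 and 25,000 rows written, this sketch issues two interval
commits; the remaining 5,000 rows would be committed at end of load.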
The mapping in Figure 1-6 contains four stages by default. The partition point at the source
qualifier marks the boundary between the first (reader) and second (transformation) stages.
The partition point at the Aggregator transformation marks the boundary between the second
and third (transformation) stages. The partition point at the target instance marks the
boundary between the third (transformation) and the fourth (writer) stages.
If you use PowerCenter, you can add and delete partition points at other transformations. For
information on valid partition points, see “Pipeline Partitioning” on page 345. When you add
a partition point, you increase the number of pipeline stages by one. When you remove a
partition point, you decrease the number of pipeline stages by one.
[Figure: with an added partition point, the partition points divide the pipeline into five
stages, First Stage through Fifth Stage.]
Number of Partitions
The number of threads that process each pipeline stage depends on the number of partitions.
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread.
The number of partitions in any pipeline stage equals the number of threads in that stage. If
you do not specify otherwise, the PowerCenter Server creates one partition in every pipeline
stage. If you purchased the partitioning option, you can configure multiple partitions for a
single pipeline stage.
You can specify the number of partitions at any partition point. The number of partitions
must be consistent across a pipeline. Therefore, if you define two partitions at the source
qualifier, the Workflow Manager sets two partitions at all transformations that are partition
points, and two partitions at the target instances.
For example, suppose you need to use the mapping in Figure 1-6 on page 17 to read data from
three flat files. To do this, you need to specify three partitions at the source qualifier. When
you do this, the Workflow Manager sets three partitions at all other partition points in the
pipeline.
The master thread creates three sets of threads. Figure 1-8 shows thread creation for a
mapping with three partitions:
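The thread arithmetic above is simply one thread per partition in each pipeline stage. A
minimal sketch (the helper name is illustrative, not a PowerCenter API):

```python
def thread_counts(num_stages, num_partitions):
    """One thread per partition in each pipeline stage, so the master
    thread creates num_stages * num_partitions processing threads."""
    return {"threads_per_stage": num_partitions,
            "total_threads": num_stages * num_partitions}
```

For the four-stage mapping with three partitions, this gives 12 threads in total: three
reader threads, six transformation threads across the two transformation stages, and three
writer threads.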
Each source pipeline in Figure 1-9 contains a transformation thread. The Joiner
transformation is not a partition point, so both transformation threads can process data at the
Joiner and Expression transformations. However, only one transformation thread processes a
row at any given time. The target load order group contains one target, so the master thread
creates only one writer thread.
Suppose you add a partition point at the Joiner transformation in Figure 1-9. Figure 1-10
shows the mapping in Figure 1-9 with a partition point at the Joiner transformation:
Blocking Data
You can include multiple input group transformations in a mapping. The PowerCenter Server
passes data to the input groups concurrently. However, sometimes the transformation logic of
a multiple input group transformation requires that the PowerCenter Server block data on
one input group while it waits for a row from a different input group.
Blocking is the suspension of the data flow into an input group of a multiple input group
transformation. When the PowerCenter Server blocks data, it reads data from the source
connected to the input group until it fills the reader and transformation buffers. Once the
PowerCenter Server fills the buffers, it does not read more source rows until the
transformation logic allows the PowerCenter Server to stop blocking the source. When the
PowerCenter Server stops blocking a source, it processes the data in the buffers and continues
to read from the source.
The PowerCenter Server blocks data at one input group when it needs a specific row from a
different input group to perform the transformation logic. Once the PowerCenter Server
reads and processes the row it needs, it stops blocking the source.
Block Processing
The PowerCenter Server reads and processes a block of rows at a time. The number of rows in
the block depends on the row size and the DTM buffer size. In the following circumstances,
the PowerCenter Server processes one row in a block:
♦ Log row errors. When you log row errors, the PowerCenter Server processes one row in a
block.
♦ Connect CURRVAL. When you connect the CURRVAL port in a Sequence Generator
transformation, the session processes one row in a block. For optimal performance,
Informatica recommends that you connect only the NEXTVAL port in mappings. For
more information, see “Sequence Generator Transformation” in the Transformation Guide.
♦ Configure row-based mode for a Custom transformation procedure. When you configure
the data access mode for a Custom transformation procedure to be row-based, the
PowerCenter Server processes one row in a block. By default, the data access mode is array-
based, and the PowerCenter Server processes multiple rows in a block. For more
information, see “Custom Transformation Functions” in the Transformation Guide.
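The sizing rule above can be sketched in Python. This is an illustration only: the function name and the integer-division formula are assumptions, not the PowerCenter Server's actual internal calculation; the `one_row_mode` flag stands in for the three circumstances listed above.

```python
# Illustrative sketch only -- the real block size is computed internally
# by the PowerCenter Server; names and the sizing rule are assumptions.
def rows_per_block(buffer_block_size, row_size, one_row_mode=False):
    """Approximate how many rows fit in one block.

    one_row_mode mirrors the cases above (row error logging, a connected
    CURRVAL port, or a row-based Custom transformation procedure), where
    the server processes a single row per block.
    """
    if one_row_mode:
        return 1
    return max(1, buffer_block_size // row_size)

print(rows_per_block(64 * 1024, 512))        # 128 rows in a 64 KB block
print(rows_per_block(64 * 1024, 512, True))  # 1
```

Larger rows yield fewer rows per block; a row wider than the block still occupies one block.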
CPU Usage
The PowerCenter Server performs read, transformation, and write processing for a pipeline in
parallel. It can process multiple partitions of a pipeline within a session, and it can process
multiple sessions in parallel.
If you have a symmetric multi-processing (SMP) platform, you can use multiple CPUs to
concurrently process session data or partitions of data. This provides increased performance,
as true parallelism is achieved. On a single processor platform, these tasks share the CPU, so
there is no parallelism.
The PowerCenter Server can use multiple CPUs to process a session that contains multiple
partitions. The number of CPUs used depends on factors such as the number of partitions,
the number of threads, the number of available CPUs, and the amount of resources required
to process the mapping.
For more information about partitioning, see “Pipeline Partitioning” on page 345.
Cache Memory
The DTM process creates in-memory index and data caches to temporarily store data used by
the following transformations:
♦ Aggregator transformation (without sorted input)
♦ Rank transformation
♦ Joiner transformation
♦ Lookup transformation (with caching enabled)
You configure memory size for the index and data cache in the transformation properties. By
default, the PowerCenter Server allocates 1,000,000 bytes for the index cache and 2,000,000
bytes for the data cache.
By default, the DTM creates cache files in the directory configured for the $PMCacheDir
server variable. If the DTM requires more space than it allocates, it pages to local index and
data files.
The DTM process also creates an in-memory cache to store data used by a Sorter
transformation. You configure the memory size for the cache in the transformation properties.
By default, the PowerCenter Server allocates 8,388,608 bytes for the cache, and the DTM
creates cache files in the directory configured for the $PMTempDir server variable. If the
DTM requires more cache space than it allocates, it pages to local cache files.
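The default allocations and overflow behavior above can be summarized in a short sketch. The byte values come from this section; the `pages_to_disk` helper is an illustration of the paging rule, not a PowerCenter API.

```python
# Default cache allocations stated in this section.
DEFAULTS = {
    "index_cache_bytes": 1_000_000,   # index cache (Aggregator, Rank, Joiner, Lookup)
    "data_cache_bytes": 2_000_000,    # data cache (Aggregator, Rank, Joiner, Lookup)
    "sorter_cache_bytes": 8_388_608,  # Sorter transformation cache
}

def pages_to_disk(required_bytes, allocated_bytes):
    # The DTM pages to local cache files when it needs more space than
    # the configured allocation; the session does not fail unless the
    # cache directory runs out of disk space.
    return required_bytes > allocated_bytes

print(pages_to_disk(3_000_000, DEFAULTS["data_cache_bytes"]))  # True
```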
When processing large amounts of data, the DTM may create multiple index and data files.
The session does not fail if it runs out of cache memory and pages to the cache files. It does
fail, however, if the local directory for cache files runs out of disk space.
After the session completes, the DTM releases memory used by the index and data caches and
deletes any index and data files. However, if the session is configured to perform incremental
aggregation or if a Lookup transformation is configured for a persistent lookup cache, the
DTM saves all index and data cache information to disk for the next session run.
For more information about caching, see “Session Caches” on page 613.
ASCII Mode
Use ASCII mode when all sources and targets are 7-bit ASCII or EBCDIC character sets. In
ASCII mode, the PowerCenter Server recognizes 7-bit ASCII and EBCDIC characters and
stores each character in a single byte. When the PowerCenter Server runs in ASCII mode, it
does not validate session code pages. It reads all character data as ASCII characters and does
not perform code page conversions. It also treats all numerics as U.S. Standard and all dates as
binary data.
Unicode Mode
Use Unicode mode when sources or targets use 8-bit or multibyte character sets and contain
character data. In Unicode mode, the PowerCenter Server recognizes multibyte character sets
as defined by supported code pages.
If you configure the PowerCenter Server to validate data code pages, the PowerCenter Server
validates source and target code page compatibility when you run a session. If you configure
the PowerCenter Server for relaxed data code page validation, the PowerCenter Server lifts
source and target compatibility restrictions.
When reading a source, the PowerCenter Server converts data from the source character set to
Unicode based on the source code page. The PowerCenter Server allots two bytes for each
character when moving data through a mapping. The PowerCenter Server converts data from
Unicode to the target character set based on the target code page when writing to the target. It
also treats all numerics as U.S. Standard and all dates as binary data.
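The two-bytes-per-character figure can be illustrated with Python's UTF-16 encoding, which also uses two bytes for every character in the Basic Multilingual Plane. This is only an analogy; the section does not state which internal Unicode representation the PowerCenter Server uses.

```python
# Illustration only: two bytes per character, analogous to the internal
# Unicode representation described above (analogy, not PowerCenter code).
for text in ["abc", "données", "データ"]:
    raw = text.encode("utf-16-le")  # little-endian, no byte-order mark
    print(text, len(text), "chars ->", len(raw), "bytes")
    assert len(raw) == 2 * len(text)
```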
The PowerCenter Server code page must be compatible with the code pages of the
PowerCenter Client.
For details on code page compatibility and validation, see “Globalization Overview” in the
Installation and Configuration Guide.
Some message codes are embedded within other codes, for example:
CMN_1050 [LM 2041 Received request to start session]
You can also configure the PowerCenter Server on Windows to write error messages to the
Application Log, which you can view with the Event Viewer. Messages sent from the
PowerCenter Server display PowerCenter in the Source column, the code prefix in the
Category column, and the code number in the Event column. However, since some message
codes are embedded within other codes, to ensure you are viewing the true message code, you
must view the text of the message.
Figure 1-12 shows a sample application log:
Error Messages
Using the listed error code, consult the Troubleshooting Guide for probable causes and actions
to correct the problem.
Session Details
When you run a session, the Workflow Manager creates session details that provide load
statistics for each target in the mapping. You can monitor session details during the session or
after the session completes. Session details include information such as table name, number of
rows written or rejected, and read and write throughput. You can view this information by
double-clicking the session in the Workflow Monitor.
For more information on session details, see “Monitoring Session Details” on page 434.
Reject Files
By default, the PowerCenter Server creates a reject file for each target in the session. The
reject file contains rows of data that the writer does not write to targets.
The writer may reject a row in the following circumstances:
♦ It is flagged for reject by an Update Strategy or Custom transformation.
♦ It violates a database constraint, such as a primary key constraint.
♦ A field in the row was truncated or overflowed, and the target database is configured to
reject truncated or overflowed data.
By default, the PowerCenter Server saves the reject file in the directory entered for the server
variable $PMBadFileDir in the Workflow Manager, and names the reject file
target_table_name.bad.
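The default reject file location described above can be sketched as a path-building helper. The directory and table name in the example are hypothetical; only the $PMBadFileDir default and the target_table_name.bad naming come from this section.

```python
import os

# Sketch of the default reject file location: $PMBadFileDir plus the
# target table name with a .bad extension. Example values are made up.
def reject_file_path(bad_file_dir, target_table_name):
    return os.path.join(bad_file_dir, target_table_name + ".bad")

print(reject_file_path("/opt/pmserver/BadFiles", "T_CUSTOMERS"))
# /opt/pmserver/BadFiles/T_CUSTOMERS.bad
```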
Note: If you enable row error logging, the PowerCenter Server does not create a reject file.
For more information about the reject file, see “Log Files” on page 455.
Email
You can compose and send email messages by creating an Email task in the Workflow
Designer or Task Developer. You can place the Email task in a workflow, or you can associate
it with a session. The Email task allows you to automatically communicate information about
a workflow or session run to designated recipients.
Email tasks in the workflow send email depending on the conditional links connected to the
task. For post-session email, you can create two different messages, one to be sent if the
session completes successfully, the other if the session fails. You can also use variables to
generate information about the session name, status, and total rows loaded.
For example, if your database administrator wants to track how long a session takes to
complete, you can configure the session to send an email containing the time and date the
session starts and completes. Or, if you want to notify your Informatica administrator when a
session fails, you can configure the session to send an email only if it fails and attach the
session log to the email.
For more information, see “Sending Email” on page 319.
Indicator File
If you use a flat file as a target, you can configure the PowerCenter Server to create an
indicator file for target row type information. For each target row, the indicator file contains a
number to indicate whether the row was marked for insert, update, delete, or reject. The
PowerCenter Server names this file target_name.ind and stores it in the same directory as the
target file. For more information about configuring the PowerCenter Server, see the
Installation and Configuration Guide.
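A reader of the target_name.ind file might tally row types as in the sketch below. The numeric codes used here (0 = insert, 1 = update, 2 = delete, 3 = reject) are an assumption for illustration; check the Installation and Configuration Guide for the actual values the PowerCenter Server writes.

```python
# Sketch of summarizing an indicator file (target_name.ind).
# The 0-3 code mapping is an assumption, not taken from this section.
ROW_TYPES = {0: "insert", 1: "update", 2: "delete", 3: "reject"}

def summarize_indicator(lines):
    counts = {name: 0 for name in ROW_TYPES.values()}
    for line in lines:
        counts[ROW_TYPES[int(line.strip())]] += 1
    return counts

print(summarize_indicator(["0", "0", "1", "3"]))
# {'insert': 2, 'update': 1, 'delete': 0, 'reject': 1}
```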
Output File
If the session writes to a target file, the PowerCenter Server creates the target file based on a
file target definition. By default, the PowerCenter Server names the target file based on the
target definition name. If a mapping contains multiple instances of the same target, the
PowerCenter Server names the target files based on the target instance name.
Cache Files
When the PowerCenter Server creates memory cache it also creates cache files. The
PowerCenter Server creates index and data cache files for the following transformations in a
mapping:
♦ Aggregator transformation
♦ Joiner transformation
♦ Rank transformation
♦ Lookup transformation
♦ Sorter transformation
By default, the DTM creates the index and data files for Aggregator, Rank, Joiner, and
Lookup transformations in the directory configured for the $PMCacheDir server variable.
The PowerCenter Server names the index file PM*.idx, and the data file PM*.dat. The
PowerCenter Server creates the index and data files for the Sorter transformation in the
$PMTempDir server variable directory.
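The PM*.idx and PM*.dat naming convention makes cache files easy to identify in the cache directory, for example when checking for leftover files. The file names in this sketch are invented; only the wildcard patterns come from this section.

```python
import fnmatch

# Sketch: separate index and data cache files by the PM*.idx / PM*.dat
# naming convention described above. Example file names are made up.
files = ["PMAGG_0_1.idx", "PMAGG_0_1.dat", "PMLKUP3_5.idx", "session.log"]

index_files = [f for f in files if fnmatch.fnmatch(f, "PM*.idx")]
data_files = [f for f in files if fnmatch.fnmatch(f, "PM*.dat")]

print(index_files)  # ['PMAGG_0_1.idx', 'PMLKUP3_5.idx']
print(data_files)   # ['PMAGG_0_1.dat']
```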
The PowerCenter Server writes to the cache files during the session in the following cases:
♦ The mapping contains one or more Aggregator transformations configured without sorted
ports.
♦ The session is configured for incremental aggregation.
♦ The mapping contains a Lookup transformation that is configured to use a persistent
lookup cache, and the PowerCenter Server runs the session for the first time.
♦ The mapping contains a Lookup transformation that is configured to initialize the
persistent lookup cache.
♦ The DTM runs out of cache memory and pages to the local cache files. The DTM may
create multiple files when processing large amounts of data. The session fails if the local
directory runs out of disk space.
After the session completes, the DTM generally deletes the overflow index and data files. It
does not delete the cache files under the following circumstances:
♦ The session is configured to perform incremental aggregation.
♦ The session is configured with a persistent lookup cache.
Overview
Before you can use the Workflow Manager to create workflows and sessions, you must
configure the Workflow Manager. You can configure display options and connection
information in the Workflow Manager. You must register a PowerCenter Server before you
can start it or create a workflow to run against it.
You can configure the following information in the Workflow Manager:
♦ Configure Workflow Manager options. You can configure options such as grouping
sessions or docking and undocking windows. For details, see “Customizing the Workflow
Manager Options” on page 39.
♦ Register PowerCenter Servers. Before you can start a PowerCenter Server, you must
register it with the repository. For details, see “Registering the PowerCenter Server” on
page 46.
♦ Create a server grid. When you have multiple PowerCenter Servers registered to the same
repository, you can create a server grid to balance workloads. For details, see “Working with
Server Grids” on page 446.
♦ Create source and target database connections. Create connections to each source and
target database. You must create connections to a database before you can create a session
that accesses the database. For details, see “Setting Up a Relational Database Connection”
on page 53.
♦ Create connection objects. Create connection objects in the repository when you define
database, FTP, and external loader connections. For details, see “Configuring Connection
Object Permissions” on page 51.
Table 2-1 describes general options you can configure in the Workflow Manager:
♦ Reload Tasks/Workflows When Opening a Folder. Reloads the last view of a tool when you
open it. For example, if you have a workflow open when you disconnect from a repository,
select this option so that the same workflow displays the next time you open the folder and
Workflow Designer. Enabled by default.
♦ Ask Whether to Reload the Tasks/Workflows. Appears only when you select Reload Tasks/
Workflows When Opening a Folder. Select this option if you want the Workflow Manager
to prompt you to reload tasks, workflows, and worklets each time you open a folder.
Disabled by default.
♦ Overview Window Pans Delay. By default, when you drag the focus of the Overview
window, the focus of the workbook moves concurrently. When you select this option, the
focus of the workspace does not change until you release the mouse button. Disabled by
default.
♦ Allow Invoking In-Place Editing Using the Mouse. By default, you can press F2 to edit
objects directly in the workspace instead of opening the Edit Task dialog box. Select this
option so you can also click the object name in the workspace to edit the object. Disabled
by default.
♦ Open Editor When Task Is Created. Opens the Edit Task dialog box when you create a
task. By default, the Workflow Manager creates the task in the workspace. If you do not
enable this option, double-click the task to open the Edit Task dialog box. Disabled by
default.
♦ Workspace File Directory. The directory for workspace files created by the Workflow
Manager. Workspace files maintain the last task or workflow you saved. This directory
should be local to the PowerCenter Client to prevent file corruption or overwrites by
multiple users. By default, the Workflow Manager creates files in the PowerCenter Client
installation directory.
♦ Display Tool Names On Views. Displays the name of the tool in the upper left corner of
the workspace or workbook. Enabled by default.
♦ Always Show the Full Name of Selected Task. Shows the full name of a task when you
select it. By default, the Workflow Manager abbreviates the task name in the workspace.
Enabled by default.
♦ Show the Expression On a Link. Shows the link condition in the workspace. If you do not
enable this option, the Workflow Manager abbreviates the link condition in the workspace.
Enabled by default.
♦ Launch Workflow Monitor when Workflow Is Started. The Workflow Monitor launches
when you start a workflow or a task. Enabled by default.
♦ Receive Notifications from Server. Allows you to receive notification messages from the
Repository Server. The Repository Server sends notification about actions performed on
repository objects. Enabled by default. For details, see “Understanding the Repository” in
the Repository Guide.
Table 2-2 describes the format options for the Workflow Manager:
♦ Show Solid Lines for Links. Displays links as solid lines. By default, the Workflow
Manager displays links as dotted lines.
♦ Workspace Colors. Displays all items that you can customize in the selected tool. Select an
item to change its color.
♦ Font Categories. Select the Workflow Manager tool for which you want to customize the
display font.
♦ Change Font. Select to change the display font and language script for the Workflow
Manager tool you choose from the Categories menu.
♦ Reset All. Resets all format options to their original default values.
Figure 2-3. Copy Wizard, Versioning, and Target Load Type Options
Table 2-3 describes the options for the Copy Wizard, Versioning, and Target Load Type:
♦ Generate Unique Name When Resolved to “Rename”. Generates unique names for copied
objects if you select the Rename option. For example, if the workflow wf_Sales has the
same name as a workflow in the destination folder, the Rename option generates the
unique name wf_Sales1. Enabled by default.
♦ Get Default Object When Resolved to “Choose”. Uses the object with the same name in
the destination folder if you select the Choose option.
♦ Show Check Out Image in Navigator. Displays the Check Out icon when an object has
been checked out. Enabled by default.
♦ Reset All. Resets all Copy Wizard and Versioning options to their default values.
♦ Target Load Type. Sets the default load type for sessions. You can choose normal or bulk
loading. Any change you make takes effect after you restart the Workflow Manager. You
can override this setting in the session properties. Default is Bulk. For more information
on normal and bulk loading, see Table A-15 on page 697.
♦ Owner. Read, Write, and Execute permissions.
♦ World. No permissions.
If you do not enable enhanced security, the Workflow Manager assigns Read, Write, and
Execute permissions to all users or groups for the connection.
Enabling enhanced security does not lock the restricted access settings for connection objects.
You can continue to change the permissions for connection objects after enabling enhanced
security.
If you delete the Owner from the repository, the Workflow Manager automatically assigns
ownership of the object to Administrator.
1. Choose Tools-Options.
2. Click the Advanced tab.
3. Select Enable Enhanced Security.
4. Click OK.
Server Variables
You can define server variables for each PowerCenter Server you register. Some server variables
define the path and directories for workflow output files and caches. By default, the
PowerCenter Server places output files in these directories when you run a workflow. Other
server variables define server attributes such as log file count. In a server grid, you must use
the same server variables for each server.
The installation process creates directories in the location where you install the PowerCenter
Server. To use these directories as the default location for the session output files, you must
first set the server variable $PMRootDir to define the path to the directories.
Table 2-5 describes the server variables:
♦ $PMRootDir (Required). A root directory to be used by any or all other server variables.
Informatica recommends you use the PowerCenter Server installation directory as the root
directory.
♦ $PMCacheDir (Required). Default directory for the index and data cache files. Defaults to
$PMRootDir/Cache. To avoid performance problems, always use a drive local to the
PowerCenter Server for the cache directory. Do not use a mapped or mounted drive for
cache files.
♦ $PMSuccessEmailUser (Optional). Email address to receive post-session email when the
session completes successfully. Use to address post-session email. The default value is an
empty string. For details, see “Sending Email” on page 319.
♦ $PMFailureEmailUser (Optional). Email address to receive post-session email when the
session fails. Use to address post-session email. The default value is an empty string.
♦ $PMSessionLogCount (Optional). Number of session logs the PowerCenter Server
archives for the session. Use to archive session logs. For details, see “Viewing Session Logs”
on page 474. Defaults to 0.
♦ $PMSessionErrorThreshold (Optional). Number of non-fatal errors the PowerCenter
Server allows before failing the session. Non-fatal errors include reader, writer, and DTM
errors. If you want to stop the session on errors, enter the number of non-fatal errors you
want to allow before stopping the session. The PowerCenter Server maintains an
independent error count for each source, target, and transformation. Use to configure the
Stop On option in the session properties. Defaults to 0. If you use the default setting,
non-fatal errors do not cause the session to stop.
♦ $PMWorkflowLogCount (Optional). Number of workflow logs the PowerCenter Server
archives for the workflow. Defaults to 0.
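The directory-valued server variables hang off $PMRootDir. Only the $PMCacheDir default ($PMRootDir/Cache) is stated in this section; the installation path in the sketch below is hypothetical.

```python
import posixpath

# Sketch of deriving the default cache directory from $PMRootDir.
# The Cache subdirectory name comes from this section; the root path
# used in the example is made up.
def default_cache_dir(pm_root_dir):
    return posixpath.join(pm_root_dir, "Cache")

print(default_cache_dir("/opt/informatica/pmserver"))
# /opt/informatica/pmserver/Cache
```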
The TCP/IP options for the PowerCenter Server include the following:
♦ Server Name (Required). The name of the PowerCenter Server. This name must be unique
to the repository.
♦ Host Name or IP Address (Required). Host name or IP address of the PowerCenter Server
machine.
♦ Resolved IP Address (Read-only). The IP address resolved by the Workflow Manager.
♦ Port Number (Required). Port number the PowerCenter Server uses. Must be the same
port listed in the PowerCenter Server configuration parameters.
♦ Timeout (Required). Number of seconds the Workflow Manager waits for a response from
the PowerCenter Server.
♦ Code Page (Required). Character set associated with the PowerCenter Server. Select the
code page identical to the PowerCenter Server operating system code page. Must be
identical to or compatible with the repository code page.
7. For $PMRootDir, enter a valid root directory for the PowerCenter Server platform.
Informatica recommends using the PowerCenter Server installation directory as the root
directory because the PowerCenter Server installation creates the default server directories
there. If you enter a different root directory, make sure to create the necessary directories.
8. Enter the server variables, as desired.
Do not use trailing delimiters. A trailing delimiter might invalidate the directory used by
the PowerCenter Server. For example, enter c:\data\sessionlog, not c:\data\sessionlog\.
See Table 2-5 on page 47 for a list of server variables.
9. Click OK.
The new PowerCenter Server appears in the Navigator below the repository.
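The trailing-delimiter caution in step 8 can be checked with a short sketch. The helper name is an assumption; the example paths mirror the ones given in the step.

```python
# Sketch: strip a trailing path separator before entering a directory
# value, per the caution in step 8. Helper name is illustrative.
def strip_trailing_delimiter(path):
    cleaned = path.rstrip("\\/")
    return cleaned if cleaned else path  # keep a bare root like "/"

print(strip_trailing_delimiter("c:\\data\\sessionlog\\"))  # c:\data\sessionlog
print(strip_trailing_delimiter("/opt/pm/logs/"))           # /opt/pm/logs
```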
To configure permissions for a connection object:
1. Open the Connection Browser dialog box for the connection object. For example, choose
Connections-Relational to open the Connection Browser dialog box for a relational
database connection.
2. Select the connection object you want to configure in the Connection Browser dialog
box.
3. Click Permissions to open the Permissions dialog box.
4. Add the user or group you want to assign permissions for the connection, and click OK.
5. For relational database connections, enter the connection information listed in Table 2-9:
♦ User Name (Required). Database user name with the appropriate read and write database
permissions to access the database. If you are using Oracle OS Authentication, or you are
using databases such as ISG Navigator that do not allow user names, enter PmNullUser.
For Teradata connections, this overrides the default database user name in the ODBC
entry.
♦ Password (Required). Password for the database user name. For Oracle OS
Authentication, or for databases such as ISG Navigator that do not allow passwords, enter
PmNullPassword. For Teradata connections, this overrides the database password in the
ODBC entry. Passwords must be in 7-bit ASCII only.
♦ Connect String (Required for all databases except Microsoft SQL Server and Sybase).
Connect string used to communicate with the database. For syntax, see “Database
Connect Strings” on page 53.
♦ Code Page (Required). Specifies the code page the PowerCenter Server uses to read from a
source database or write to a target database or file.
6. For each type of relational database connection, enter the attributes listed in Table 2-10:
♦ Rollback Segment (Oracle). The name of the rollback segment. A rollback segment
records database transactions in the event that you want to undo the transaction.
♦ Enable Parallel Mode (Oracle). Enables parallel processing when loading data into a table
in bulk mode.
♦ Environment SQL (all relational databases). Enter SQL commands to set the database
environment when you connect to the database.
♦ Database Name (Sybase, Microsoft SQL Server, and Teradata). The name of the database.
For Teradata connections, this overrides the default database name in the ODBC entry. If
you do not enter a database name here for a Teradata connection, the PowerCenter Server
uses the default database name in the ODBC entry.
♦ Data Source Name (Teradata). The name of the Teradata ODBC data source.
♦ Server Name (Sybase and Microsoft SQL Server). Database server name. Used to
configure workflows.
♦ Packet Size (Sybase and Microsoft SQL Server). Used to optimize the ODBC connection
to Sybase and Microsoft SQL Server.
♦ Domain Name (Microsoft SQL Server). The name of the domain. Used for Microsoft
SQL Server on Windows.
♦ Use Trusted Connection (Microsoft SQL Server). If selected, the PowerCenter Server uses
Windows authentication to access the Microsoft SQL Server database. The user name that
starts the PowerCenter Server must be a valid Windows user with access to the Microsoft
SQL Server database.
7. Click OK.
The new database connection appears in the Connection Browser list.
8. To add more database connections, repeat steps 3-7.
1. Choose Connections-Relational.
The Relational Connection Browser appears.
Overview
In the Workflow Manager, you define a set of instructions called a workflow to execute
mappings you build in the Designer. Generally, a workflow contains a session and any other
task you may want to perform when you execute a session. Tasks can include a session, email
notification, or scheduling information. You connect each task with links in the workflow.
You can also create a worklet in the Workflow Manager. A worklet is an object that groups a
set of tasks. A worklet is similar to a workflow, but without scheduling information. You can
execute a batch of worklets inside a workflow.
After you create a workflow, you run the workflow in the Workflow Manager and monitor it
in the Workflow Monitor. For details on the Workflow Monitor, see “Monitoring Workflows”
on page 401.
Workflow Tasks
You can create the following types of tasks in the Workflow Manager:
♦ Assignment. Assigns a value to a workflow variable. For details, see “Working with the
Assignment Task” on page 140.
♦ Command. Specifies a shell command to run during the workflow. For details, see “Using
Workflow Variables” on page 103.
Figure 3-2 shows the Workflow Manager windows:
[Figure 3-2: the Workflow Manager workspace, with the Overview window, the Output window, and the status bar labeled]
Using Toolbars
The Workflow Manager can display the following toolbars to help you select tools and
perform operations quickly:
♦ Standard. Contains buttons to connect to and disconnect from repositories and folders,
toggle windows, zoom in and out, pan the workspace, and find objects.
♦ Connections. Contains buttons to open connection browsers and to assign servers.
♦ Repository. Contains buttons to connect to, disconnect from, and add repositories, open
folders, close tools, save changes to repositories, and print the workspace.
♦ View. Contains buttons to customize toolbars, toggle the status bar and windows, toggle
full-screen view, create a new workbook, and view the properties of objects.
♦ Layout. Contains buttons to arrange and restore objects in the workspace, find objects,
zoom in and out, and pan the workspace.
♦ Tasks. Contains buttons to create tasks.
♦ Workflow. Contains buttons to edit workflow properties.
♦ Run. Contains buttons to schedule the workflow, start the workflow, or start a task.
1. In any Workflow Manager tool, click the Find in Workspace toolbar button or choose
Edit-Find in Workspace.
The Find in Workspace dialog box opens:
2. Choose whether you want to search for tasks, links, variables, or events.
3. Enter a search string, or select a string from the list.
The Workflow Manager saves the last 10 search strings in the list.
4. Specify whether or not to match whole words and whether or not to perform a case-
sensitive search.
5. Click Find Now.
The Workflow Manager lists task names, link conditions, event names, or variable names
that match the search string at the bottom of the dialog box.
6. Click Close.
1. To search for a task, link, event, or variable, open the appropriate Workflow Manager
tool and click a task, link, or event. To search for text in the Output window, click the
appropriate tab in the Output window.
2. Enter a search string in the Find field on the standard toolbar.
The search is not case-sensitive.
3. Choose Edit-Find Next, click the Find Next button on the toolbar, or press Enter or F3
to search for the string.
The Workflow Manager highlights the first task name, link condition, event name, or
variable name that contains the search string, or the first string in the Output window
that matches the search string.
4. To search for the next item, press Enter or F3 again.
The Workflow Manager alerts you when you have searched through all items in the
workspace or Output window before it highlights the same objects a second time.
Checking In Objects
You commit changes to the repository by checking in objects. When you check in an object,
the repository creates a new version of the object and assigns it a version number. The
repository increments the version number by one each time it creates a new version.
You can check in an object from the Workflow Manager workspace. To do this, select the
object and choose Versioning-Check in.
You can check in an object when you review the results of the following tasks:
♦ View object history. You can check in an object from the View History window when you
view the history of an object.
♦ View checkouts. You can check in an object from the View Checkouts window when you
search for checked out objects.
♦ View query results. You can check in an object from the Query Results window when you
search for object dependencies or run an object query.
To check in an object, select the object or objects and choose Versioning-Check in.
Enter text into the comment field in the Check In dialog box.
From the Query Browser, you can create, edit, and delete queries. You can also configure
permissions for each query from the Query Browser. You can run any queries for which you
have read permissions from the Query Browser.
For information about working with object queries, see “Grouping Versioned Objects” in the
Repository Guide.
Copying Sessions
When you copy a Session task, the Copy Wizard looks for the database connection and
associated mapping in the destination folder. If the mapping or connection does not exist in
the destination folder, you can select a new mapping or connection. If the destination folder
does not contain any mapping, you must first copy a mapping to the destination folder in the
Designer before you can copy the session.
When you copy a session that has mapping variable values saved in the repository, the
Workflow Manager either copies or retains the saved variable values.
1. Open the folders that contain the objects you want to compare.
2. Open the appropriate Workflow Manager tool.
3. Choose Tasks-Compare, Worklets-Compare, or Workflow-Compare.
A dialog box similar to the following one opens:
In the dialog box, differences between objects are highlighted and the nodes are flagged, differences between object properties are marked, and the properties of the node you select are displayed. You can drill down to further compare objects.
You can further compare differences between object properties by clicking the Compare
Further icon or by right-clicking the differences.
6. If you want to save the comparison as a text or HTML file, choose File-Save to File.
User-Defined Metadata Extensions
This tab lists the existing user-defined and vendor-defined metadata extensions. User-defined metadata extensions appear in the User Defined Metadata Domain. If they exist, vendor-defined metadata extensions appear in their own domains.
5. Click the Add button.
A new row appears in the User Defined Metadata Extension Domain.
6. Enter the information in Table 3-1:
Extension Name (Required). Name of the metadata extension. Metadata extension names must be unique for each type of object in a domain. Metadata extension names cannot contain any special characters except underscores and cannot begin with numbers.
Precision (Required for string objects). The maximum length for string metadata extensions.
UnOverride (Optional). Restores the default value of the metadata extension when you click Revert. This column appears only if the value of one of the metadata extensions was changed.
7. Click OK.
Edit the text of a cell: F2, then move the cursor to the desired location.
Find all combination and list boxes: type the first letter on the list.
Paste copied or cut text from the clipboard into a cell: Ctrl+V.
Table 3-3 lists the Workflow Manager keyboard shortcuts for navigating in the workspace:
Create links: Ctrl+F2. Press Ctrl+F2 to select the first task you want to link, press Tab to select the rest of the tasks you want to link, then press Ctrl+F2 again to link all the tasks you selected.
Expand the selected node and all its children: Shift+* (use the asterisk on the numeric keypad).
Overview
A workflow is a set of instructions that tells the PowerCenter Server how to execute tasks such
as sessions, email notifications, and shell commands. After you create tasks in the Task
Developer and Workflow Designer, you connect the tasks with links to create a workflow.
In the Workflow Designer, you can specify conditional links and use workflow variables to
create branches in the workflow. The Workflow Manager also provides Event-Wait and Event-
Raise tasks so you can control the sequence of task execution in the workflow. You can also
create worklets and nest them inside the workflow.
Every workflow contains a Start task, which represents the beginning of the workflow.
Figure 4-1 shows a sample workflow:
After you create a workflow, select a PowerCenter Server to run the workflow. You can then
start the workflow using the Workflow Manager, Workflow Monitor, or pmcmd.
Use the Workflow Monitor to see the progress of a workflow during its run. The Workflow
Monitor can also show the history of a workflow. For more information about the Workflow
Monitor, see “Monitoring Workflows” on page 401.
Use the following guidelines when you develop a workflow:
1. Create a new workflow. Create a new workflow in the Workflow Designer. For details on
creating a new workflow, see “Creating a New Workflow” on page 91.
2. Add tasks in the workflow. You might have already created tasks in the Task Developer.
Or, you can add tasks to the workflow as you develop the workflow in the Workflow
Designer. For details on workflow tasks, see “Working with Tasks” on page 131.
3. Connect tasks with links. After you add tasks in the workflow, connect them with links
to specify the order of execution in the workflow. For details on links, see “Working with
Links” on page 92.
4. Specify conditions for each link. You can specify conditions on the links to create
branches and dependencies. For details, see “Working with Links” on page 92.
5. Validate workflow. Validate the workflow in the Workflow Designer to identify errors.
For details on validation rules, see “Validating a Workflow” on page 119.
6. Save workflow. When you save the workflow, the Workflow Manager validates the
workflow and updates the repository.
7. Run workflow. In the workflow properties, select a PowerCenter Server to run the
workflow. Run the workflow from the Workflow Manager, Workflow Monitor, or
pmcmd. You can monitor the workflow in the Workflow Monitor. For details on starting
a workflow, see “Running the Workflow” on page 122.
For a complete list of workflow properties, see “Workflow Properties Reference” on page 721.
Workflow Privileges
You need one of the following privileges to create a workflow:
♦ Use Workflow Manager privilege with read and write folder permissions
♦ Super User privilege
You need one of the following privileges to run, schedule, and monitor the workflow:
♦ Workflow Operator privilege
♦ Super User privilege
For information on using the Workflow Wizard, see “Using the Workflow Wizard” on
page 99.
To create a workflow automatically:
The Workflow Manager does not allow you to create a workflow that contains a loop. Figure 4-4 shows a loop where the three sessions may run multiple times:
Use the following procedure to link tasks in the Workflow Designer or the Worklet Designer.
2. In the workspace, click the first task you want to connect and drag it to the second task.
A link appears between the two tasks.
If you have a number of tasks to link, you do not need to connect each link manually. To link multiple tasks concurrently, use the following procedure.
Note: Do not use Ctrl+A or Edit-Select to choose tasks.
After you specify the link condition in the Expression Editor, the Workflow Manager validates
the link condition and displays it next to the link in the workflow.
Figure 4-6 shows the link condition displayed in the workspace:
1. In the Workflow Designer workspace, double-click the link you want to specify.
or
Right-click the link and choose Edit. The Expression Editor displays.
2. In the Expression Editor, enter the link condition.
The Expression Editor provides pre-defined workflow variables, user-defined workflow
variables, variable functions, and boolean and arithmetic operators.
3. Validate the expression using the Validate button. The Workflow Manager displays error
messages in the Output window.
Tip: Click and drag the end point of a link to move it from one task to another without losing
the link condition.
The Expression Editor displays system variables, user-defined, and pre-defined workflow
variables such as $Session.status. For details on workflow variables, see “Using Workflow
Variables” on page 103.
The Expression Editor also displays a list of functions. PowerCenter uses a SQL-like language
that contains many functions designed to handle common expressions. For example, you can
use the ABS function to find the absolute value. For a complete list of functions, see the
Transformation Language Reference.
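For instance, a link condition might combine a function with pre-defined task variables; the session name s_LoadOrders below is illustrative:
ABS($s_LoadOrders.SrcSuccessRows - $s_LoadOrders.TgtSuccessRows) = 0
This condition evaluates to true only when the session wrote as many rows to the targets as it read from the sources.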
Validating Expressions
You can use the Validate button to validate an expression. If you do not validate an
expression, the Workflow Manager validates it when you close the Expression Editor. You
cannot run a workflow with invalid expressions.
Expressions in link conditions and Decision task conditions must evaluate to a numerical
value. Workflow variables used in expressions must exist in the workflow.
Deleting a Workflow
You may decide to delete a workflow that you no longer use. When you delete a workflow,
you delete all non-reusable tasks and reusable task instances associated with the workflow.
Reusable tasks used in the workflow remain in the folder when you delete the workflow.
If you delete a workflow that is running, the PowerCenter Server aborts the workflow. If you
delete a workflow that is scheduled to run, the PowerCenter Server removes the workflow
from the schedule.
You can delete a workflow in the Navigator window, or you can delete the workflow currently
displayed in the Workflow Designer workspace.
♦ To delete a workflow from the Navigator window, open the folder, select the workflow and
press the Delete key.
♦ To delete a workflow currently displayed in the Workflow Designer workspace, choose
Workflows-Delete.
Editing a Workflow
When you edit a workflow, the repository updates the workflow information when you save
the workflow. If a workflow is running when you make edits, the PowerCenter Server uses the
updated information the next time you run the workflow.
1. In the Worklet Designer or Workflow Designer, right-click a task and choose Highlight
Path.
2. Choose Forward Path, Backward Path, or Both.
The Workflow Manager highlights all links in the branch you select.
1. In the Worklet Designer or Workflow Designer, select all links you want to delete.
Tip: You can use the mouse to click and drag the selection, or you can Ctrl-click the tasks
and links.
2. Choose Edit-Delete Links.
The Workflow Manager removes all selected links.
1. In the Workflow Manager, open the folder containing the mapping you want to use in
the workflow.
2. Open the Workflow Designer.
3. Choose Workflows-Wizard.
To create a session:
1. In the second step of the Workflow Wizard, select a valid mapping and click the right
arrow button.
The Workflow Wizard creates a Session task in the right pane using the selected mapping
and names it s_MappingName by default.
2. You can select additional mappings to create more Session tasks in the workflow.
When you add multiple mappings to the list, the Workflow Wizard creates sequential
sessions in the order you add them.
3. Use the arrow buttons to change the session order.
4. Specify whether the session should be reusable.
When you create a reusable session, you can use the session in other workflows. For
details on reusable sessions, see “Working with Tasks” on page 131.
5. Specify how you want the PowerCenter Server to run the workflow.
You can specify that the PowerCenter Server runs sessions only if previous sessions
complete, or you can specify that the PowerCenter Server always runs each session. When
you select this option, it applies to all sessions you create using the Workflow Wizard.
1. In the third step of the Workflow Wizard, configure the scheduling and run options. For
more information about scheduling a workflow, see “Scheduling a Workflow” on
page 112.
2. Click Next.
The Workflow Wizard displays the settings for the workflow:
3. Verify the workflow settings and click Finish. To edit settings, click Back.
The completed workflow opens in the Workflow Designer workspace. From the
workspace, you can add tasks, create concurrent sessions, add conditions to links, or
modify properties.
4. When you finish modifying the workflow, choose Repository-Save.
When you build an expression, you can select pre-defined variables on the Pre-Defined tab.
You can select user-defined variables on the User-Defined tab. The Functions tab contains
functions that you can use with workflow variables.
Use the point-and-click method to enter an expression using a variable. For information on
using the Expression Editor, see “Using the Expression Editor” on page 96.
You can use the following keywords to write expressions for user-defined and pre-defined
workflow variables:
♦ AND
♦ OR
♦ NOT
♦ TRUE
♦ FALSE
♦ NULL
♦ SYSDATE
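For example, these keywords can appear in a link condition alongside workflow variables; the session name and the user-defined variable below are hypothetical:
$s_LoadOrders.Status = SUCCEEDED AND NOT ($$Reload = TRUE)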
EndTime (All tasks; Date/time). Date and time the associated task ended.
ErrorCode (All tasks; Integer). Last error code for the associated task. If there is no error, the PowerCenter Server sets ErrorCode to 0 when the task completes.
ErrorMsg (All tasks; Nstring*). Last error message for the associated task. If there is no error, the PowerCenter Server sets ErrorMsg to an empty string when the task completes.
FirstErrorCode (Session; Integer). Error code for the first error message in the session. If there is no error, the PowerCenter Server sets FirstErrorCode to 0 when the session completes.
PrevTaskStatus (All tasks; Integer). Status of the previous task in the workflow that the PowerCenter Server ran. Statuses include ABORTED, FAILED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the previous task. For more information, see “Evaluating Task Status in a Workflow” on page 107.
SrcFailedRows (Session; Integer). Total number of rows the PowerCenter Server failed to read from the source.
SrcSuccessRows (Session; Integer). Total number of rows successfully read from the sources.
StartTime (All tasks; Date/time). Date and time the associated task started.
Status (All tasks; Integer). Status of the previous task in the workflow. Task statuses include ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the current task. For more information, see “Evaluating Task Status in a Workflow” on page 107.
TgtFailedRows (Session; Integer). Total number of rows the PowerCenter Server failed to write to the target.
TgtSuccessRows (Session; Integer). Total number of rows successfully written to the targets.
All pre-defined workflow variables except Status have a default value of null. The
PowerCenter Server uses the default value of null when it encounters a pre-defined variable
from a task that has not yet run in the workflow. Therefore, expressions and link conditions
that depend upon tasks not yet run are valid. The default value of Status is NOTSTARTED.
The Expression Editor displays the pre-defined workflow variables on the Pre-defined tab.
The Workflow Manager groups task-specific variables by task and lists system variables under
the Built-in node. To use a variable in an expression, double-click the variable. The
Expression Editor displays task-specific variables in the Expression field in the following
format:
$<TaskName>.<Pre-definedVariable>
Link condition:
$Session2.Status = SUCCEEDED
When you run the workflow, the PowerCenter Server evaluates the link condition and returns
the value based on the status of Session2.
Disabled Task
Link condition:
$Session2.PrevTaskStatus = SUCCEEDED
When you run the workflow, the PowerCenter Server skips Session2 because the session is
disabled. When the PowerCenter Server evaluates the link condition, it returns the value
based on the status of Session1.
Tip: If you do not disable Session2, the PowerCenter Server returns the value based on the
status of Session2. You do not need to change the link condition when you enable and disable
Session2.
Default values by datatype: Double, 0; Integer, 0.
To schedule a workflow:
6. Click OK.
To remove a workflow from its schedule, right-click the workflow in the Navigator window
and choose Unschedule Workflow.
Schedule Options: Run Once/Run Every/Customized Repeat (Optional). Required if you select Run On Server Initialization, or if you do not choose any setting in Run Options. If you select Run Once, the PowerCenter Server runs the workflow once, as scheduled in the scheduler. If you select Run Every, the PowerCenter Server runs the workflow at regular intervals, as configured. If you select Customized Repeat, the PowerCenter Server runs the workflow on the dates and times specified in the Repeat dialog box. When you select Customized Repeat, click Edit to open the Repeat dialog box. The Repeat dialog box allows you to schedule specific dates and times for the workflow run. The selected scheduler appears at the bottom of the page.
Start Options: Start Date/Start Time (Optional). Start Date indicates the date on which the PowerCenter Server begins the workflow schedule. Start Time indicates the time at which the PowerCenter Server begins the workflow schedule.
End Options: End On/End After/Forever (Required/Optional). Required if the workflow schedule is Run Every or Customized Repeat. If you select End On, the PowerCenter Server stops scheduling the workflow on the selected date. If you select End After, the PowerCenter Server stops scheduling the workflow after the set number of workflow runs. If you select Forever, the PowerCenter Server schedules the workflow as long as the workflow does not fail.
Repeat Every (Required). Enter the numeric interval at which you would like the PowerCenter Server to schedule the workflow, and then select Days, Weeks, or Months, as appropriate. If you select Days, select the appropriate Daily Frequency settings. If you select Weeks, select the appropriate Weekly and Daily Frequency settings. If you select Months, select the appropriate Monthly and Daily Frequency settings.
Weekly (Required/Optional). Required to enter a weekly schedule. Select the day or days of the week on which you would like the PowerCenter Server to run the workflow.
Daily (Optional). Enter the number of times you would like the PowerCenter Server to run the workflow on any day the session is scheduled. If you select Run Once, the PowerCenter Server schedules the workflow once on the selected day, at the time entered on the Start Time setting on the Time tab. If you select Run Every, enter Hours and Minutes to define the interval at which the PowerCenter Server runs the workflow. The PowerCenter Server then schedules the workflow at regular intervals on the selected day. The PowerCenter Server uses the Start Time setting for the first scheduled workflow of the day.
Disabling Workflows
You may want to disable the workflow while you edit it. This prevents the PowerCenter Server
from running the workflow on its schedule. Select the Disable Workflows option on the
General tab of the workflow properties. The PowerCenter Server does not run disabled
workflows until you clear the Disable Workflows option. Once you clear the Disable
Workflows option, the PowerCenter Server reschedules the workflow.
Expression Validation
The Workflow Manager validates all expressions in the workflow. You can enter expressions in
the Assignment task, Decision task, and link conditions. The Workflow Manager writes any
error message to the Output window.
Expressions in link conditions and Decision task conditions must evaluate to a numerical
value. Workflow variables used in expressions must exist in the workflow.
The Workflow Manager marks the workflow invalid if a link condition is invalid.
Task Validation
The Workflow Manager validates each task in the workflow as you create it. When you save or
validate the workflow, the Workflow Manager validates all tasks in the workflow except
Session tasks. It marks the workflow invalid if it detects any invalid task in the workflow.
The Workflow Manager verifies that attributes in the tasks follow validation rules. For
example, the user-defined event you specify in an Event task must exist in the workflow. The
Workflow Manager also verifies that you linked each task properly. For example, you must
link the Start task to at least one task in the workflow. For details on task validation rules, see
“Validating Tasks” on page 139.
When you delete a reusable task, the Workflow Manager removes the instance of the deleted
task from workflows. The Workflow Manager also marks the workflow invalid when you
delete a reusable task used in a workflow.
The Workflow Manager verifies that there are no duplicate task names in a folder, and that
there are no duplicate task instances in the workflow.
Running Validation
When you validate a workflow, you validate worklet instances, worklet objects, and all other
nested worklets in the workflow. You validate task instances and worklets, regardless of
whether you have edited them.
The Workflow Manager validates the worklet object using the same validation rules for
workflows. The Workflow Manager validates the worklet instance by verifying attributes in
the Parameter tab of the worklet instance. For details on validating worklets, see “Validating
Worklets” on page 171.
If the workflow contains nested worklets, you can select a worklet to validate the worklet and
all other worklets nested under it. To validate a worklet and its nested worklets, right-click the
worklet and choose Validate.
Example
For example, you have a workflow that contains a non-reusable worklet called Worklet_1.
Worklet_1 contains a nested worklet called Worklet_a. The workflow also contains a reusable
worklet instance called Worklet_2. Worklet_2 contains a nested worklet called Worklet_b.
In the example workflow in Figure 4-15, the Workflow Manager validates links, conditions,
and tasks in the workflow. The Workflow Manager validates all tasks in the workflow,
including tasks in Worklet_1, Worklet_2, Worklet_a, and Worklet_b.
You can validate a part of the workflow. Right-click Worklet_1 and choose Validate. The
Workflow Manager validates all tasks in Worklet_1 and Worklet_a.
Figure 4-15 shows the example workflow:
3. From the Choose Server list, select the server you want to assign.
4. From the Show Folder list, select the folder you want to view. Or, choose All to view
workflows in all folders in the repository.
5. Select the Select check box for each workflow you want to run on the PowerCenter
Server.
6. Click Assign.
Running a Workflow
When you choose Workflows-Start, the PowerCenter Server runs the entire workflow.
To run a workflow from pmcmd, use the startworkflow command. For details on using
pmcmd, see “Using pmcmd” on page 581.
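As a sketch, a startworkflow invocation from the command line might look like the following; the connection options and all names (server host and port, user, password, folder, and workflow) are placeholders, so verify the exact syntax against the pmcmd reference:
pmcmd startworkflow -s salesserver:4001 -u Administrator -p mypassword -f SalesFolder wf_LoadOrders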
To suspend a workflow:
4. Click OK.
Overview
The Workflow Manager contains many types of tasks to help you build workflows and
worklets. You can create reusable tasks in the Task Developer. Or, create and add tasks in the
Workflow or Worklet Designer as you develop the workflow.
Table 5-1 summarizes workflow tasks available in Workflow Manager:
Assignment (Workflow Designer, Worklet Designer; not reusable). Assigns a value to a workflow variable. For details, see “Working with the Assignment Task” on page 140.
Command (Task Developer, Workflow Designer, Worklet Designer; reusable). Specifies shell commands to run during the workflow. You can choose to run the Command task only if the previous task in the workflow completes. For details, see “Working with the Command Task” on page 143.
Control (Workflow Designer, Worklet Designer; not reusable). Stops or aborts the workflow. For details, see “Working with the Control Task” on page 147.
Email (Task Developer, Workflow Designer, Worklet Designer; reusable). Sends email during the workflow. For details, see “Sending Email” on page 319.
Session (Task Developer, Workflow Designer, Worklet Designer; reusable). Set of instructions to run a mapping. For details, see “Working with Sessions” on page 173.
Timer (Workflow Designer, Worklet Designer; not reusable). Waits for a specified period of time to run the next task. For details, see “Working with Event Tasks” on page 153.
The Workflow Manager validates task attributes and links. If a task is invalid, the workflow
becomes invalid. Workflows containing invalid sessions may still be valid. For details on
validating tasks, see “Validating Tasks” on page 139.
1. In the Task Developer, choose Tasks-Create. The Create Task dialog box appears.
2. Select the task type you want to create, Command, Session, or Email.
3. Enter a name for the task.
4. For session tasks, select the mapping you want to associate with the session.
5. Click Create.
The Task Developer creates the workflow task.
6. Click Done to close the Create Task dialog box.
When you use a task in the workflow, you can edit the task in the Workflow Designer and
configure the following task options in the General tab:
♦ Treat input link as AND or OR. Choose to have the PowerCenter Server run the task
when all or one of the input link conditions evaluates to True.
♦ Disable this task. Choose to disable the task so you can run the rest of the workflow
without the task.
♦ Fail parent if this task fails. Choose to fail the workflow or worklet containing the task if
the task fails.
♦ Fail parent if this task does not run. Choose to fail the workflow or worklet containing
the task if the task does not run.
1. In the Workflow Designer, double-click the task you want to make reusable.
2. In the General tab of the Edit Task dialog box, check the Make Reusable option.
3. When prompted whether you are sure you want to promote the task, click Yes.
4. Click OK to return to the workflow.
5. Choose Repository-Save.
The newly promoted task appears in the list of reusable tasks in the Tasks node in the
Navigator window.
Disabling Tasks
In the Workflow Designer, you can disable a workflow task so that the PowerCenter Server
runs the workflow without the disabled task. The status of a disabled task is DISABLED.
Disable a task in the workflow by selecting the Disable This Task option in the Edit Tasks
dialog box.
1. In the Workflow Designer, click the Assignment icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Assignment Task for the task type.
2. Enter a name for the Assignment task. Click Create. Then click Done.
The Workflow Designer creates and adds the Assignment task to the workflow.
3. Double-click the Assignment task to open the Edit Task dialog box.
4. On the Expressions tab, click Add to add an assignment.
6. Select the variable for which you want to assign a value. Click OK.
7. Click the Edit button in the Expression field to open the Expression Editor.
The Expression Editor shows pre-defined workflow variables, user-defined workflow
variables, variable functions, and boolean and arithmetic operators.
8. Enter the value or expression you want to assign. For example, if you want to assign the
value 500 to the user-defined variable $$custno1, enter the number 500 in the
Expression Editor.
Validate the expression before you close the Expression Editor.
For a UNIX server, you would use the following command to perform a similar operation:
cp sales/sales_adj marketing/
Each shell command runs in the same environment (UNIX or Windows) as the PowerCenter
Server. Environment settings in one shell command script do not carry over to other scripts.
To run all shell commands in the same environment, call a single shell script that invokes
other scripts.
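A minimal sketch of that pattern, with all paths and script names hypothetical: a single wrapper script exports the environment once, then invokes each step script, so every step sees the same settings.

```shell
#!/bin/sh
# Hypothetical wrapper: set the shared environment once, then call each
# step script. Exported variables carry over because the steps run as
# children of this one script.
set -e
export STAGE_DIR="${TMPDIR:-/tmp}/stage_demo"    # shared setting (illustrative)
mkdir -p "$STAGE_DIR"

# Stand-ins for the step scripts a Command task might normally call:
printf 'echo "adjusting files in $STAGE_DIR"\n' > "$STAGE_DIR/adjust_sales.sh"
printf 'echo "copying files from $STAGE_DIR"\n' > "$STAGE_DIR/copy_to_marketing.sh"

sh "$STAGE_DIR/adjust_sales.sh"      # both steps inherit STAGE_DIR
sh "$STAGE_DIR/copy_to_marketing.sh"
```

A Command task would then invoke only the wrapper, rather than each step script separately.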
1. In the Workflow Designer or the Task Developer, click the Command Task icon on the
Tasks toolbar.
or
Choose Tasks-Create. Select Command Task for the task type.
2. Enter a name for the Command task. Click Create. Then click Done.
3. Double-click the Command task in the workspace to open the Edit Tasks dialog box.
4. In the Commands tab, click the Add button to add a command.
7. Enter the command you want to perform. Enter only one command in the Command
Editor.
8. Click OK to close the Command Editor.
9. Repeat steps 3-8 to add more commands in the task.
10. Click OK.
If you specify non-reusable shell commands for a session, you can promote the non-reusable
shell commands to a reusable Command task. For details, see “Creating a Reusable Command
Task from Pre- or Post-Session Commands” on page 191.
1. In the Workflow Designer, click the Control Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Control Task for the task type.
2. Enter a name for the Control task. Click Create. Then click Done.
The Workflow Manager creates and adds the Control task to the workflow.
3. Double-click the Control task in the workspace to open it.
Fail Me. Marks the Control task as “Failed.” The PowerCenter Server fails the Control task if you choose this option. If you choose Fail Me in the Properties tab and choose Fail Parent If This Task Fails in the General tab, the PowerCenter Server fails the parent workflow.
Fail Parent. Marks the status of the workflow or worklet that contains the Control task as failed after the workflow or worklet completes.
Stop Parent. Stops the workflow or worklet that contains the Control task.
Abort Parent. Aborts the workflow or worklet that contains the Control task.
Example
For example, you have a Command task that depends on the status of the three sessions in the
workflow. You want the PowerCenter Server to run the Command task when any of the three
sessions fails. To accomplish this, use a Decision task with the following decision condition:
$Q1_session.status = FAILED OR $Q2_session.status = FAILED OR
$Q3_session.status = FAILED
You can then use the pre-defined condition variable in the input link condition of the
Command task. Configure the input link with the following link condition:
$Decision.condition = True
You can configure the same logic in the workflow without the Decision task. Without the
Decision task, you need to use three link conditions and treat the input links to the
Command task as OR links.
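With that approach, each of the three input links to the Command task carries its own condition, for example:
$Q1_session.status = FAILED
$Q2_session.status = FAILED
$Q3_session.status = FAILED
The Command task then runs when any one of the link conditions evaluates to true.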
Figure 5-5 shows the example workflow without the Decision task:
You can further expand the example workflow in Figure 5-4. In Figure 5-4, the PowerCenter
Server runs the Command task if any of the three Session tasks fails. Suppose now you want
the PowerCenter Server to also run an Email task if all three Session tasks succeed.
$Decision.condition = True
$Decision.condition = False
1. In the Workflow Designer, click the Decision Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Decision Task for the task type.
2. Enter a name for the Decision task. Click Create. Then click Done.
The Workflow Designer creates and adds the Decision task to the workspace.
4. Click the Open button in the Value field to open the Expression Editor.
5. In the Expression Editor, enter the condition you want the PowerCenter Server to
evaluate.
Validate the expression before you close the Expression Editor.
6. Click OK.
1. In the Workflow Designer workspace, create an Event-Raise task and place it in the
workflow to represent the user-defined event you want to trigger. A user-defined event is
the sequence of tasks in the branch from the Start task to the Event-Raise task.
3. Click the Open button in the Value field on the Properties tab to open the Events
Browser for user-defined events.
1. In the workflow, create an Event-Wait task and double-click the Event-Wait task to open
the Edit Task dialog box.
2. In the Events tab of the Edit Tasks dialog box, select User-Defined.
Perform the following steps to wait for a pre-defined event in the workflow.
1. Create an Event-Wait task and double-click the Event-Wait task to open it.
5. Click OK.
You can use a Timer task anywhere in the workflow after the Start task.
1. In the Workflow Designer, click the Timer task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Timer Task for the task type.
2. Double-click the Timer task to open it.
3. On the General tab, enter a name for the Timer task.
Specify attributes for Absolute Time or Relative Time described in Table 5-2:
Absolute Time: Specify the exact time to start. The PowerCenter Server starts the next task in the workflow at the exact date and time you specify.
Absolute Time: Use this workflow date-time variable to calculate the wait. Specify a user-defined date-time workflow variable. The PowerCenter Server starts the next task in the workflow at the time you choose. The Workflow Manager verifies that the variable you specify has the Date/Time datatype. The Timer task fails if the date-time workflow variable evaluates to NULL.
Relative time: Start after. Specify the period of time the PowerCenter Server waits to start executing the next task in the workflow.
Relative time: from the start time of this task. Choose this option to wait a specified period of time after the start time of the Timer task to run the next task.
Relative time: from the start time of the parent workflow/worklet. Choose this option to wait a specified period of time after the start time of the parent workflow/worklet to run the next task.
Relative time: from the start time of the top-level workflow. Choose this option to wait a specified period of time after the start time of the top-level workflow to run the next task.
Overview
A worklet is an object that represents a set of tasks. It can contain any task available in the
Workflow Manager. You can run worklets inside a workflow. The workflow that contains the
worklet is called the parent workflow. You can also nest a worklet in another worklet.
Create a worklet when you want to reuse a set of workflow logic in several workflows. Use the
Worklet Designer to create and edit worklets.
When the PowerCenter Server runs a worklet, it expands the worklet. The PowerCenter
Server then runs the worklet as it would any other workflow, executing tasks and evaluating
links in the worklet.
The worklet does not contain any scheduling or server information. To run a worklet, include
the worklet in a workflow. The worklet runs on the PowerCenter Server you choose for the
workflow. The Workflow Manager does not provide a parameter file or log file for worklets.
The PowerCenter Server writes information about worklet execution in the workflow log.
Suspending Worklets
When you choose Suspend On Error for the parent workflow, the PowerCenter Server also
suspends the worklet if a task in the worklet fails. When a task in the worklet fails, the
PowerCenter Server stops executing the failed task and other tasks in its path. If no other task
is running in the worklet, the worklet status is “Suspended.” If one or more tasks are still
running in the worklet, the worklet status is “Suspending.” The PowerCenter Server suspends
the parent workflow when the status of the worklet is “Suspended” or “Suspending.”
For details on suspending workflows, see “Suspending the Workflow” on page 127.
To create a worklet, choose Worklets-Create in the Worklet Designer. The Create Worklet
dialog box appears.
Nesting Worklets
You can nest a worklet within another worklet. When you run a workflow containing nested
worklets, the PowerCenter Server runs the nested worklet from within the parent worklet. You
can group several worklets together by function or simplify the design of a complex workflow
when you nest worklets.
You might choose to nest worklets to load data to fact and dimension tables. Create a nested
worklet to load fact and dimension data into a staging area. Then, create a nested worklet to
load the fact and dimension data from the staging area to the data warehouse.
You might choose to nest worklets to simplify the design of a complex workflow. Nest
worklets that can be grouped together within one worklet. In the workflow in Figure 6-1, two
worklets relate to regional sales and two worklets relate to quarterly sales.
Figure 6-1 shows a workflow that uses multiple worklets:
When you run the example workflow shown in Figure 6-3, the persistent worklet variable
retains its value from Worklet1 and becomes the initial value in Worklet2. After the
PowerCenter Server executes Worklet2, it retains the value of the persistent variable in the
repository and uses the value the next time you run the workflow.
Worklet variables only persist when you run the same workflow. A worklet variable does not
retain its value when you use instances of the worklet in different workflows.
3. Click the Open button in the User-Defined Worklet Variables field to select a worklet
variable.
4. Click the Open button in the Parent Workflow Variable field to select a workflow
variable to assign to the worklet variable.
5. Click Apply.
The worklet variable in this worklet instance now has the selected workflow variable as its
initial value.
Overview
A session is a set of instructions that tells the PowerCenter Server how and when to move data
from sources to targets. A session is a type of task, similar to other tasks available in the
Workflow Manager. In the Workflow Manager, you configure a session by creating a Session
task. To run a session, you must first create a workflow to contain the Session task.
When you create a Session task, you enter general information such as the session name,
session schedule, and the PowerCenter Server to run the session. You can also select options to
execute pre-session shell commands, send On-Success or On-Failure email, and use FTP to
transfer source and target files.
Using session properties, you can also override parameters established in the mapping, such as
source and target location, source and target type, error tracing levels, and transformation
attributes. When you assign a server in a server grid to a session, the server you specify at the
session level overrides the server you specify at the workflow level.
You can run as many sessions in a workflow as you need. You can run the Session tasks
sequentially or concurrently, depending on your needs.
The PowerCenter Server creates several files and in-memory caches depending on the
transformations and options used in the session. For more details on session output files and
caches, see “Output Files and Caches” on page 28.
Session Privileges
To create sessions, you must have one of the following sets of privileges and permissions:
♦ Use Workflow Manager privilege with read, write, and execute permissions
♦ Super User privilege
You must have read permission for connection objects associated with the session in addition
to the above privileges and permissions.
PowerCenter allows you to set a read-only privilege for sessions. The Workflow Operator
privilege allows a user to view, start, stop, and monitor sessions without being able to edit
session properties.
1. In the Workflow Designer, click the Session Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Session Task for the task type.
2. Enter a name for the Session task.
3. Click Create. The Mappings dialog box appears.
For a target instance, you can change writers, connections, and properties settings.
Table 7-1 shows the options you can use to apply attributes to objects in a session. You can
apply different options depending on whether the setting is a reader or writer type, a
connection, or an object property.
♦ Apply Type to All Instances (Reader, Writer). Applies a reader or writer type to all instances of the same object type in the session. For example, you can apply a relational reader type to all the other readers in your session.
♦ Apply Type to All Partitions (Reader, Writer). Applies a reader or writer type to all the partitions in a pipeline. For example, if you have four partitions, you can change the writer type in one partition for a target instance. Then you can use this option to apply the change to the other three partitions.
♦ Apply Connection Type (Connections). Applies the same type of connection to all instances. Connection types are relational, FTP, queue, application, or external loader.
♦ Apply Connection Value (Connections). Applies a connection value to all instances or partitions. The connection value defines a specific connection that you can view in the connection browser. You can only apply a connection value that is valid for the existing connection type.
♦ Apply Connection Attributes (Connections). Applies only the connection attribute values to all instances or partitions. Each type of connection has different attributes. You can apply connection attributes separately from connection values. To view sample connection attributes, see Figure 7-3 on page 181.
♦ Apply Connection Data (Connections). Applies the connection value and its connection attributes to all the other instances that have the same connection type. This option combines the connection option and the connection attribute option.
♦ Apply All Connection Information (Connections). Applies the connection value and its attributes to all the other instances even if they do not have the same connection type. This option is similar to Apply Connection Data, but it allows you to change the connection type.
♦ Apply Attribute to All Instances (Properties). Applies an attribute value to all instances of the same object type in the session. For example, if you have a relational target, you can choose to truncate a table before you load data. You can apply the attribute value to all the relational targets in your session.
♦ Apply Attribute to All Partitions (Properties). Applies an attribute value to all partitions in a pipeline. For example, you can change the reject file name in one partition for a target instance, then apply the file name change to the other partitions.
5. Select an option from the list and choose to apply it to all instances or all partitions.
6. Click OK to apply the attribute or property.
4. Click the Browse button in the Config Name field to choose a session configuration.
Select a user-defined or default session configuration object from the browser.
5. Click OK.
For session configuration object settings descriptions, see “Config Object Tab” on page 675.
Error Handling
You can configure error handling on the Config Object tab. You can choose to stop or
continue the session if the PowerCenter Server encounters an error issuing the pre- or post-
session SQL command.
Figure 7-6. Stop or Continue the Session on Pre- or Post-Session SQL Errors
Perform the following steps to create pre- or post-session shell commands for a specific
session:
1. In the Components tab of the session properties, select Non-reusable for pre- or post-
session shell command.
2. Click the Edit button in the Value field to open the Edit Pre- or Post-Session Command
dialog box.
3. Enter a name for the command in the General tab.
5. In the Commands tab, click the Add button to add shell commands.
Enter one command for each line.
6. Click OK.
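For example, the following commands could serve as non-reusable pre-session shell commands that archive the previous run's target file before the session writes a new one. The paths and file names are hypothetical; enter one command per line in the Commands tab:

```shell
# Hypothetical pre-session shell commands; paths are illustrative only.
mkdir -p /tmp/pmdemo/archive
# Stand-in for the previous run's target file.
touch /tmp/pmdemo/target.out
# Archive it so the session starts with a clean target location.
mv /tmp/pmdemo/target.out /tmp/pmdemo/archive/target.out.bak
```

Whether the session stops or continues when one of these commands fails depends on the shell command error-handling option described later in this chapter.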
1. In the Components tab of the session properties, click Reusable for the pre- or post-
session shell command.
2. Click the Edit button in the Value field to open the Task Browser dialog box.
3. Select the Command task you want to run as the pre- or post-session shell command.
4. Click the Override button in the Task Browser dialog box if you want to change the order
of the commands, or if you want to specify whether to run the next command when the
previous command fails.
Changes you make to the Command task from the session properties only apply to the
session. In the session properties, you cannot edit the commands in the Command task.
5. Click OK to select the Command task for the pre- or post-session shell command.
The name of the Command task you select appears in the Value field for the shell
command.
Figure 7-8. Stop or Continue the Session on Pre-Session Shell Command Error
Threshold Errors
You can choose to stop a session on a designated number of non-fatal errors. A non-fatal error
is an error that does not force the session to stop on its first occurrence. Establish the error
threshold in the session properties with the Stop On option. When you enable this option,
the PowerCenter Server counts non-fatal errors that occur in the reader, writer, and
transformation threads.
The PowerCenter Server maintains an independent error count when reading sources,
transforming data, and writing to targets. The PowerCenter Server counts the following non-
fatal errors when you set the Stop On option in the session properties:
♦ Reader errors. Errors encountered by the PowerCenter Server while reading the source
database or source files. Reader threshold errors can include alignment errors while
running a session in Unicode mode.
♦ Writer errors. Errors encountered by the PowerCenter Server while writing to the target
database or target files. Writer threshold errors can include key constraint violations,
loading nulls into a not null field, and database trigger responses.
♦ Transformation errors. Errors encountered by the PowerCenter Server while transforming
data. Transformation threshold errors can include conversion errors, and any condition set
up as an ERROR, such as null input.
When you create multiple partitions in a pipeline, the PowerCenter Server maintains a
separate error threshold for each partition. When the PowerCenter Server reaches the error
threshold for any partition, it stops the session. The writer may continue writing data from
one or more partitions, but it does not affect your ability to perform a successful recovery.
Note: If alignment errors occur in a non line-sequential VSAM file, the PowerCenter Server
sets the error threshold to 1 and stops the session.
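The per-partition threshold behavior described above can be sketched as follows. This is an illustrative model, not Informatica code; the class name and interface are invented for the example:

```python
# Sketch of the Stop On error threshold: each partition keeps an
# independent count of non-fatal errors, and reaching the threshold in
# any one partition stops the session.
class PartitionErrorCounter:
    def __init__(self, stop_on: int):
        self.stop_on = stop_on   # error threshold from the session properties
        self.counts = {}         # independent error count per partition

    def record_error(self, partition: str) -> bool:
        """Count one non-fatal error; return True when the session must stop."""
        self.counts[partition] = self.counts.get(partition, 0) + 1
        return self.counts[partition] >= self.stop_on

counter = PartitionErrorCounter(stop_on=3)
assert not counter.record_error("partition1")
assert not counter.record_error("partition2")  # partitions are counted separately
assert not counter.record_error("partition1")
assert counter.record_error("partition1")      # third error in partition1 stops the session
```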
Fatal Error
A fatal error occurs when the PowerCenter Server cannot access the source, target, or
repository. This can include loss of connection or target database errors, such as lack of
database space to load data.
ABORT Function
Use the ABORT function in the mapping logic to abort a session when the PowerCenter
Server encounters a designated transformation error.
For more information about ABORT, see “Functions” in the Transformation Language
Reference.
User Command
You can stop or abort the session from the Workflow Manager. You can also stop the session
using pmcmd.
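For example, the following command lines stop and abort a running workflow from pmcmd. The workflow name is hypothetical, the bracketed values are placeholders, and the exact flags vary by pmcmd version, so check the pmcmd help output for your installation:

```
pmcmd stopworkflow -s <server:port> -u <username> -p <password> -f <folder> wf_daily_load
pmcmd abortworkflow -s <server:port> -u <username> -p <password> -f <folder> wf_daily_load
```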
The PowerCenter Server handles each stop or abort condition as follows:
♦ Error threshold met due to reader errors, or Stop command issued from the Workflow Manager or pmcmd. The PowerCenter Server performs the following tasks:
- Stops reading.
- Continues processing data.
- Continues writing and committing data to targets.
If the PowerCenter Server cannot finish processing and committing data, you need to issue the Abort command to stop the session.
♦ Abort command issued from the Workflow Manager. The PowerCenter Server performs the following tasks:
- Stops reading.
- Continues processing data.
- Continues writing and committing data to targets.
If the PowerCenter Server cannot finish processing and committing data within 60 seconds, it kills the PowerCenter Server process.
♦ Fatal error from the database, or error threshold met due to writer errors. The PowerCenter Server performs the following tasks:
- Stops reading and writing.
- Rolls back all data not committed to the target database.
If the session stops due to a fatal error, the commit or rollback may or may not be successful.
♦ Error threshold met due to transformation errors, an ABORT( ) call, or invalid evaluation of a transaction control expression. The PowerCenter Server performs the following tasks:
- Stops reading.
- Flags the row as an abort row and continues processing data.
- Continues to write to the target database until it hits the abort row.
- Issues commits based on commit intervals.
- Rolls back all data not committed to the target database.
1. In the Navigator window of the Workflow Manager, right-click the Session task and
select View Persistent Values.
Overview
In the Workflow Manager, you can create sessions with the following sources:
♦ Relational. You can extract data from any relational database that the PowerCenter Server
can connect to. When extracting data from relational sources and Application sources, you
must configure the database connection to the data source prior to configuring the session.
♦ File. You can create a session to extract data from a flat file, COBOL, or XML source. The
PowerCenter Server can extract data from any local directory or FTP connection for the
source file. If the file source requires an FTP connection, you need to configure the FTP
connection to the host machine before you create the session.
♦ Heterogeneous. You can extract data from multiple sources in the same session. You can
extract from multiple relational sources, such as Oracle and SQL Server. Or, you can
extract from multiple source types, such as relational and flat file. When you configure a
session with heterogeneous sources, configure each source instance separately.
Globalization Features
You can choose a code page that you want the PowerCenter Server to use for relational sources
and flat files. You specify code pages for relational sources when you configure database
connections in the Workflow Manager. You can set the code page for file sources in the session
properties. For more information about code pages, see “Globalization Overview” in the
Installation and Configuration Guide.
Source Connections
Before you can extract data from a source, you must configure the connection properties the
PowerCenter Server uses to connect to the source file or database. You can configure source
database and FTP connections in the Workflow Manager.
For more information on creating database connections, see “Configuring the Workflow
Manager” on page 37. For more information on creating FTP connections, see “Using FTP”
on page 559.
Partitioning Sources
You can create multiple partitions for relational, Application, and file sources. For relational
or Application sources, the PowerCenter Server creates a separate connection to the source
database for each partition you set in the session properties. For file sources, you can
configure the session to read the source with one thread or multiple threads.
For more information on partitioning data, see “Pipeline Partitioning” on page 345.
Configuring Sources in a Session
Configure source properties for sessions in the Sources node of the Mapping tab of the session
properties. When you configure source properties for a session, you define properties for each
source instance in the mapping.
Figure 8-1 shows the Sources node on the Mapping tab:
The Sources node lists the sources used in the session and displays their settings. To view and
configure settings for a source, select the source from the list. You can configure the following
settings for a source:
♦ Readers
♦ Connections
♦ Properties
Configuring Readers
You can click the Readers settings on the Sources node to view the reader the PowerCenter
Server uses with each source instance. The Workflow Manager specifies the necessary reader
for each source instance in the Readers settings on the Sources node.
Figure 8-2. Readers Settings in the Sources Node of the Mapping Tab
Configuring Connections
Click the Connections settings on the Sources node to define source connection information.
For relational sources, choose a configured database connection in the Value column for each
relational source instance. By default, the Workflow Manager displays the source type for
relational sources. For details on configuring database connections, see “Selecting the Source
Database Connection” on page 214.
For flat file and XML sources, choose one of the following source connection types in the
Type column for each source instance:
♦ FTP. If you want to read data from a flat file or XML source using FTP, you must specify
an FTP connection when you configure source options. You must define the FTP
connection in the Workflow Manager prior to configuring the session.
You must have read permission for any FTP connection you want to associate with the
session. The user starting the session must have execute permission for any FTP
connection associated with the session. For details on using FTP, see “Using FTP” on
page 559.
♦ None. Choose None when you want to read from a local flat file or XML file.
Configuring Properties
Click the Properties settings in the Sources node to define source property information. The
Workflow Manager displays properties, such as the source file name and location, for flat file
sources.
Figure 8-4. Properties Settings in the Sources Node of the Mapping Tab
For more information on configuring sessions with relational sources, see “Working with
Relational Sources” on page 214. For more information on configuring sessions with flat file
sources, see “Working with File Sources” on page 218. For more information on configuring
sessions with XML sources, see the XML User Guide.
Treat Source Rows As Property
Table 8-1 describes the options you can choose for the Treat Source Rows As property:
♦ Insert. The PowerCenter Server marks all rows to insert into the target.
♦ Delete. The PowerCenter Server marks all rows to delete from the target.
♦ Update. The PowerCenter Server marks all rows to update the target. You can further define the update operation in the target options. For more information, see “Target Properties” on page 241.
♦ Data Driven. The PowerCenter Server uses the Update Strategy transformations in the mapping to determine the operation on a row-by-row basis. You define the update operation in the target options. If the mapping contains an Update Strategy transformation, this option defaults to Data Driven. You can also use this option when the mapping contains Custom transformations configured to set the update strategy.
Once you determine how to treat all rows in the session, you also need to set update strategy
options for individual targets. For more information on setting the target update strategy
options, see “Target Properties” on page 241.
For more information on setting the update strategy for a session, see “Update Strategy
Transformation” in the Transformation Guide.
Figure 8-8. Properties Settings in the Sources Node for a Flat File Source
♦ Source File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server looks in the server variable directory, $PMSourceFileDir, for file sources. If you specify both the directory and file name in the Source Filename field, clear this field. The PowerCenter Server concatenates this field with the Source Filename field when it runs the session. You can also use the $InputFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Source Filename (Required). Enter the file name, or file name and path. Optionally use the $InputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Source File Directory field when it runs the session. For example, if you have “C:\data\” in the Source File Directory field, then enter “filename.dat” in the Source Filename field. When the PowerCenter Server begins the session, it looks for “C:\data\filename.dat”. By default, the Workflow Manager enters the file name configured in the source definition. For details on session parameters, see “Session Parameters” on page 495.
♦ Source Filetype (Required). Indicates whether the source file contains the source data, or a list of files with the exact same file properties, which allows you to configure multiple file sources using a file list. Choose Direct if the source file contains the source data. Choose Indirect if the source file contains a list of files. When you select Indirect, the PowerCenter Server finds the file list and reads each listed file when it runs the session. For details on file lists, see “Using a File List” on page 230.
♦ Set File Properties link (Optional). Opens a dialog box that allows you to override source file properties. By default, the Workflow Manager displays file properties as configured in the source definition. For more information, see “Configuring Fixed-Width File Properties” on page 220 and “Configuring Delimited File Properties” on page 222.
To edit the fixed-width properties, select Fixed Width and click Advanced. The Fixed-Width
Properties dialog box appears. By default, the Workflow Manager displays file properties as
configured in the mapping. Edit these settings to override those configured in the source
definition.
Figure 8-10 shows the Fixed-Width Properties dialog box:
The Fixed-Width Properties dialog box contains the following options:
♦ Null Character: Text/Binary (Required). Indicates the character representing a null value in the file. This can be any valid character in the file code page, or any binary value from 0 to 255. For more information about specifying null characters, see “Null Character Handling” on page 227.
♦ Repeat Null Character (Optional). If selected, the PowerCenter Server reads repeat NULL characters in a single field as a single NULL value. If you do not select this option, the PowerCenter Server reads a single null character at the beginning of a field as a null field. Important: For multibyte code pages, Informatica recommends that you specify a single-byte null character if you are using repeating non-binary null characters. This ensures that repeating null characters fit into the column exactly. For more information about specifying null characters, see “Null Character Handling” on page 227.
♦ Code Page (Required). Select the code page of the fixed-width file. The default setting is the client code page.
♦ Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File Format option, the PowerCenter Server ignores this option.
♦ Number of Bytes to Skip Between Records (Optional). The PowerCenter Server skips the specified number of bytes between records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return and line feed appear at the end of each line. If you want the PowerCenter Server to skip these two single-byte characters, enter 2. If you have an ASCII file on UNIX with one record for each line, ending in a carriage return, skip the single character by entering 1.
♦ Strip Trailing Blanks (Optional). If selected, the PowerCenter Server strips trailing blank spaces from records before passing them to the Source Qualifier transformation.
♦ Line Sequential File Format (Optional). Select this option if the file uses a carriage return at the end of each record, shortening the final column.
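The Number of Bytes to Skip Between Records option can be illustrated with a short sketch. This is not Informatica code; it only models how fixed-width records are read when a terminator follows each record:

```python
# Illustrative model of "Number of Bytes to Skip Between Records" for
# fixed-width files: read record_len bytes, then skip the terminator.
def read_fixed_records(data: bytes, record_len: int, skip_between: int):
    records, pos = [], 0
    while pos + record_len <= len(data):
        records.append(data[pos:pos + record_len])
        pos += record_len + skip_between
    return records

# Windows-style file: each 5-byte record is followed by \r\n, so skip 2.
assert read_fixed_records(b"AAAAA\r\nBBBBB\r\n", 5, 2) == [b"AAAAA", b"BBBBB"]
# UNIX-style file: each record ends in a single \n, so skip 1.
assert read_fixed_records(b"AAAAA\nBBBBB\n", 5, 1) == [b"AAAAA", b"BBBBB"]
```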
To edit the delimited properties, select Delimited and click Advanced. The Delimited File
Properties dialog box appears. By default, the Workflow Manager displays file properties as
configured in the mapping. Edit these settings to override those configured in the source
definition.
Figure 8-12 shows the Delimited File Properties dialog box:
♦ Delimiters (Required). Character used to separate columns of data in the source file. Use the button to the right of this field to enter a different delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters. The delimiter must be in the same code page as the flat file code page.
♦ Treat Consecutive Delimiters as One (Optional). By default, the PowerCenter Server reads pairs of delimiters as a null value. If selected, the PowerCenter Server reads any number of consecutive delimiter characters as one. For example, a source file uses a comma as the delimiter character and contains the following record: 56, , , Jane Doe. By default, the PowerCenter Server reads that record as four columns separated by three delimiters: 56, NULL, NULL, Jane Doe. If you select this option, the PowerCenter Server reads the record as two columns separated by one delimiter: 56, Jane Doe.
♦ Optional Quotes (Required). Select No Quotes, Single Quote, or Double Quotes. If you select a quote character, the PowerCenter Server ignores delimiter characters within the quote characters. Therefore, the PowerCenter Server uses quote characters to escape the delimiter. For example, a source file uses a comma as a delimiter and contains the following row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and reads the row as four fields. If you do not select the optional single quote, the PowerCenter Server reads six separate fields. When the PowerCenter Server reads two optional quote characters within a quoted string, it treats them as one quote character. For example, the PowerCenter Server reads the following quoted string as I’m going tomorrow: 2353, ‘I’’m going tomorrow.’, MD. Additionally, if you select an optional quote character, the PowerCenter Server only reads a string as a quoted string if the quote character is the first character of the field. Note: You can improve session performance if the source file does not contain quotes or escape characters.
♦ Code Page (Required). Select the code page of the delimited file. The default setting is the client code page.
♦ Remove Escape Character From Data (Optional). This option is selected by default. Clear this option to include the escape character in the output string.
♦ Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip title or header rows in the file.
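Python's csv module happens to follow the same two quoting rules described for Optional Quotes, so the examples above can be reproduced with it. This is an illustration, not PowerCenter code; the rows below omit the spaces after the delimiters, since a quote character only counts when it is the first character of the field:

```python
import csv

# Commas inside the single quotes are not treated as delimiters,
# so this row is read as four fields.
row = next(csv.reader(["342-3849,'Smith, Jenna','Rockville, MD',6"],
                      delimiter=",", quotechar="'"))
assert row == ["342-3849", "Smith, Jenna", "Rockville, MD", "6"]

# Two quote characters inside a quoted string are read as one.
doubled = next(csv.reader(["2353,'I''m going tomorrow.',MD"],
                          delimiter=",", quotechar="'"))
assert doubled == ["2353", "I'm going tomorrow.", "MD"]
```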
Figure 8-13. Line Sequential Buffer Length Property for File Sources
Character Set
You can configure the PowerCenter Server to run sessions in either ASCII or Unicode data
movement mode.
Table 8-5 describes source file formats supported by each data movement path in
PowerCenter:
Table 8-5. Support for ASCII and Unicode Data Movement Modes
For example, the PowerCenter Server supports EBCDIC-based MBCS source files in Unicode data movement mode but not in ASCII mode, where it terminates the session.
If you configure a session to run in ASCII data movement mode, delimiters, escape
characters, and null characters must be valid in the ISO Western European Latin 1 code page.
Any 8-bit characters you specified in previous versions of PowerCenter are still valid. In
Unicode data movement mode, delimiters, escape characters, and null characters must be
valid in the specified code page of the flat file.
For more information about configuring and working with data movement modes, see
“Globalization Overview” in the Installation and Configuration Guide.
The way the PowerCenter Server evaluates null columns depends on the null character type and the Repeat Null Character option:
♦ Binary null character, Repeat Null Character disabled. A column is null if the first byte in the column is the binary null character. The PowerCenter Server reads the rest of the column as text data only to determine the column alignment and track the shift state for shift-sensitive code pages. If data in the column is misaligned, the PowerCenter Server skips the row and writes the skipped row and a corresponding error message to the session log.
♦ Non-binary null character, Repeat Null Character disabled. A column is null if the first character in the column is the null character. The PowerCenter Server reads the rest of the column only to determine the column alignment and track the shift state for shift-sensitive code pages. If data in the column is misaligned, the PowerCenter Server skips the row and writes the skipped row and a corresponding error message to the session log.
♦ Binary null character, Repeat Null Character enabled. A column is null if it contains only the specified binary null character. The next column inherits the initial shift state of the code page.
♦ Non-binary null character, Repeat Null Character enabled. A column is null if the repeating null character fits into the column exactly, with no bytes left over. For example, a five-byte column is not null if you specify a two-byte repeating null character. In shift-sensitive code pages, shift bytes do not affect the null value of a column: a column is still null if it contains a shift byte at the beginning or end of the column. Informatica recommends you specify a single-byte null character if you use repeating non-binary null characters. This ensures that repeating null characters fit into a column exactly.
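The repeating-null rule for non-binary null characters can be sketched as follows. This is an illustrative model, not Informatica code, and it ignores shift bytes for simplicity:

```python
# Sketch of the repeating-null rule with Repeat Null Character enabled:
# a column is null only if the null character repeats to fill it exactly.
def is_null_column(column: bytes, null_char: bytes) -> bool:
    if len(column) == 0 or len(column) % len(null_char) != 0:
        return False
    return column == null_char * (len(column) // len(null_char))

assert is_null_column(b"**", b"*")          # null character fills the column exactly
assert not is_null_column(b"ABABA", b"AB")  # five-byte column, two-byte null character
assert is_null_column(b"ABAB", b"AB")       # fits exactly, so the column is null
```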
d:\data\eastern_trans.dat
e:\data\midwest_trans.dat
f:\data\canada_trans.dat
Once you create the file list, place it in a directory local to the PowerCenter Server.
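As a hedged illustration, the file list above can be generated with a short script. The staging directory and list file name are hypothetical; the PowerCenter Server only requires a local file containing one source file path per line, which you then reference as the Source Filename with an Indirect file type:

```shell
# Hypothetical staging location for the file list.
mkdir -p /tmp/pmdemo
# Write one source file path per line; the quoted heredoc delimiter
# keeps the backslashes literal.
cat > /tmp/pmdemo/trans_list.txt <<'EOF'
d:\data\eastern_trans.dat
e:\data\midwest_trans.dat
f:\data\canada_trans.dat
EOF
```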
Overview
In the Workflow Manager, you can create sessions with the following targets:
♦ Relational. You can load data to any relational database that the PowerCenter Server can
connect to. When loading data to relational targets, you must configure the database
connection to the target before you configure the session.
♦ File. You can load data to a flat file or XML target. The PowerCenter Server can load data
to any local directory or FTP connection for the target file. If the file target requires an
FTP connection, you need to configure the FTP connection to the host machine before
you create the session.
♦ Heterogeneous. You can output data to multiple targets in the same session. You can
output to multiple relational targets, such as Oracle and Microsoft SQL Server. Or, you
can output to multiple target types, such as relational and flat file. For more information,
see “Working with Heterogeneous Targets” on page 274.
Globalization Features
You can configure the PowerCenter Server to run sessions in either ASCII or Unicode data
movement mode.
Table 9-1 describes target character sets supported by each data movement mode in
PowerCenter:
Table 9-1. Support for ASCII and Unicode Data Movement Modes
PowerCenter allows you to work with targets that use multibyte character sets. You can choose
a code page that you want the PowerCenter Server to use for relational objects and flat files.
You specify code pages for relational objects when you configure database connections in the
Workflow Manager. The code page for a database connection used as a target must be a
superset of the repository code page.
When you change the database connection code page to one that is not two-way compatible
with the old code page, the Workflow Manager generates a warning and invalidates all
sessions that use that database connection.
Target Connections
Before you can load data to a target, you must configure the connection properties the
PowerCenter Server uses to connect to the target file or database. You can configure target
database and FTP connections in the Workflow Manager.
For details on creating database connections, see “Setting Up a Relational Database
Connection” on page 53. For details on creating FTP connections, see “Using FTP” on
page 559.
Partitioning Targets
When you create multiple partitions in a session with a relational target, the PowerCenter
Server creates multiple connections to the target database to write target data concurrently.
When you create multiple partitions in a session with a file target, the PowerCenter Server
creates one target file for each partition. You can configure the session properties to merge
these target files.
For details on configuring a session for pipeline partitioning, see “Pipeline Partitioning” on
page 345.
Configuring Targets in a Session
Configure target properties for sessions in the Transformations view on the Mapping tab of the
session properties. Click the Targets node to view the target properties. When you configure
target properties for a session, you define properties for each target instance in the mapping.
Figure 9-1 shows where you define target properties in a session: the Targets node in the Transformations view of the Mapping tab, with its Writers, Connections, and Properties settings.
The Targets node contains the following settings where you define properties:
♦ Writers
♦ Connections
♦ Properties
Configuring Writers
Click the Writers settings in the Transformations view to define the writer to use with each
target instance.
Figure 9-2. Writers Settings on the Mapping Tab of the Session Properties
When the mapping target is a flat file, an XML file, an SAP BW target, or an IBM MQSeries
target, the Workflow Manager specifies the necessary writer in the session properties.
However, when the target in the mapping is relational, you can change the writer type to File
Writer if you plan to use an external loader.
Note: You can change the writer type for non-reusable sessions in the Workflow Designer and
for reusable sessions in the Task Developer. You cannot change the writer type for instances of
reusable sessions in the Workflow Designer.
When you override a relational target to use the file writer, the Workflow Manager changes
the properties for that target instance on the Properties settings. It also changes the
connection options you can define in the Connections settings.
After you override a relational target to use a file writer, define the file properties for the
target. Click Set File Properties and choose the target to define. For more information, see
“Configuring Fixed-Width Properties” on page 265 and “Configuring Delimited Properties”
on page 266.
Configuring Connections
View the Connections settings on the Mapping tab to define target connection information.
Figure 9-3. Connections Settings on the Mapping Tab of the Session Properties
For relational targets, the Workflow Manager displays Relational as the target type by default.
In the Value column, choose a configured database connection for each relational target
instance. For details on configuring database connections, see “Target Database Connection”
on page 241.
For flat file and XML targets, choose one of the following target connection types in the Type
column for each target instance:
♦ FTP. If you want to load data to a flat file or XML target using FTP, you must specify an
FTP connection when you configure target options. FTP connections must be defined in
the Workflow Manager prior to configuring sessions.
You must have read permission for any FTP connection you want to associate with the
session. The user starting the session must have execute permission for any FTP
connection associated with the session. For details on using FTP, see “Using FTP” on
page 559.
♦ Loader. You can use the external loader option to improve the load speed to Oracle, DB2,
Sybase IQ, or Teradata target databases.
To use this option, you must use a mapping with a relational target definition and choose
File as the writer type on the Writers settings for the relational target instance. The
PowerCenter Server uses an external loader to load target files to the Oracle, DB2, Sybase IQ, or Teradata database.
Configuring Properties
View the Properties settings on the Mapping tab to define target property information. The
Workflow Manager displays different properties for the different target types: relational, flat
file, and XML.
Figure 9-4 shows the Properties settings on the Mapping tab:
Figure 9-4. Properties Settings on the Mapping Tab of the Session Properties
For more information on relational target properties, see “Working with Relational Targets”
on page 240. For more information on flat file target properties, see “Working with File
Targets” on page 261. For more information on XML target properties, see “Working with
Heterogeneous Targets” on page 274.
For more information on configuring sessions with multiple target types, see “Working with
Heterogeneous Targets” on page 274.
Target Properties
You can configure session properties for relational targets in the Transformations view on the
Mapping tab, and in the General Options settings on the Properties tab. Define the properties
for each target instance in the session.
When you click the Transformations view on the Mapping tab, you can view and configure
the settings of a specific target. Select the target under the Targets node.
Figure 9-5. Properties Settings on the Mapping Tab for a Relational Target
Table 9-2 describes the properties available in the Properties settings on the Mapping tab of
the session properties:
♦ Insert* (optional). If selected, the PowerCenter Server inserts all rows flagged for insert. By default, this option is selected.
♦ Update (as Update)* (optional). If selected, the PowerCenter Server updates all rows flagged for update. By default, this option is selected.
♦ Update (as Insert)* (optional). If selected, the PowerCenter Server inserts all rows flagged for update. By default, this option is not selected.
♦ Update (else Insert)* (optional). If selected, the PowerCenter Server updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert. By default, this option is not selected.
♦ Delete* (optional). If selected, the PowerCenter Server deletes all rows flagged for delete. By default, this option is selected.
♦ Truncate Table (optional). If selected, the PowerCenter Server truncates the target before loading. By default, this option is not selected. For details on this feature, see “Truncating Target Tables” on page 245.
♦ Reject File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Reject Filename (required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see “Session Parameters” on page 495.
*For details on target update strategies, see “Update Strategy Transformation” in the Transformation Guide.
Test Load Options
Table 9-3 describes the test load options on the General Options settings on the Properties
tab:
♦ Enable Test Load (optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files, and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. Note: You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.
♦ Number of Rows to Test (optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the exact number you configure for the test load.
The command the PowerCenter Server issues to truncate a target table depends on the target database and on whether or not the table contains a primary key referenced by a foreign key.
If the PowerCenter Server issues a truncate target table command and the target table instance
specifies a table name prefix, the PowerCenter Server verifies the database user privileges for
the target table by issuing a truncate command. If the database user is not specified as the
target owner name or does not have the database privilege to truncate the target table, the
PowerCenter Server automatically issues a delete command instead and writes the following
error message to the session log:
WRT_8208 Error truncating target table <target table name> trying DELETE
FROM query.
If the PowerCenter Server issues a delete command and the database has logging enabled, the
database saves all deleted records to the log for rollback. If you do not want to save deleted
records for rollback, you can disable logging to improve the speed of the delete.
For all databases, if the PowerCenter Server fails to truncate or delete any selected table
because the user lacks the necessary privileges, the session fails.
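The truncate-or-delete fallback described above can be sketched as follows; the `execute` and `log` callables are hypothetical stand-ins for the server's database and session-log interfaces, not an actual PowerCenter API:

```python
def clear_target_table(execute, table: str, log) -> None:
    """Attempt a truncate; on a privilege failure, log the WRT_8208
    message and fall back to a logged DELETE, as described above."""
    try:
        execute(f"TRUNCATE TABLE {table}")
    except PermissionError:
        log(f"WRT_8208 Error truncating target table {table} "
            "trying DELETE FROM query.")
        execute(f"DELETE FROM {table}")
```

With a database stub that rejects TRUNCATE, the function issues `DELETE FROM <table>` and writes the error message, mirroring the behavior in the session log.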
If you use truncate target tables with one of the following functions, the PowerCenter Server
fails to successfully truncate target tables for the session:
♦ Incremental aggregation. When you enable both truncate target tables and incremental
aggregation in the session properties, the Workflow Manager issues a warning that you
cannot enable truncate target tables and incremental aggregation in the same session.
4. In the Properties settings, select Truncate Target Table Option for each target table you
want the PowerCenter Server to truncate before it runs the session.
5. Click OK.
Deadlock Retry
Select the Session Retry on Deadlock option in the session properties if you want the
PowerCenter Server to retry target writes on a deadlock. A deadlock might occur when the
PowerCenter Server attempts to take control of the same lock for a row when loading
partitioned targets or when running two sessions simultaneously to the same target.
Constraint-Based Loading
In the Workflow Manager, you can specify constraint-based loading for a session. When you
select this option, the PowerCenter Server orders the target load on a row-by-row basis. For
every row generated by an active source, the PowerCenter Server loads the corresponding
transformed row first to the primary key table, then to any foreign key tables. Constraint-
based loading depends on the following requirements:
♦ Active source. Related target tables must have the same active source.
♦ Key relationships. Target tables must have key relationships.
♦ Target connection groups. Targets must be in one target connection group.
♦ Treat rows as insert. Use this option when you insert into the target. You cannot use
updates with constraint-based loading.
Active Source
When target tables receive rows from different active sources, the PowerCenter Server reverts
to normal loading for those tables, but loads all other targets in the session using constraint-
based loading when possible. For example, a mapping contains three distinct pipelines. The
first two contain a source, source qualifier, and target. Since these two targets receive data
from different active sources, the PowerCenter Server reverts to normal loading for both
targets. The third pipeline contains a source, Normalizer, and two targets. Since these two
targets share a single active source (the Normalizer), the PowerCenter Server performs
constraint-based loading: loading the primary key table first, then the foreign key table.
For more information on active sources, see “Working with Active Sources” on page 259.
Key Relationships
When target tables have no key relationships, the PowerCenter Server does not perform
constraint-based loading. Similarly, when target tables have circular key relationships, the
After loading the first set of targets, the PowerCenter Server begins reading source B. If there
are no key relationships between T_5 and T_6, the PowerCenter Server reverts to a normal
load for both targets.
If T_6 has a foreign key that references a primary key in T_5, since T_5 and T_6 receive data
from a single active source, the Aggregator AGGTRANS, the PowerCenter Server loads rows
to the tables in the following order:
♦ T_5
♦ T_6
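Constraint-based loading amounts to ordering targets so that every primary key table loads before the tables that reference it. A minimal sketch of that ordering, using the T_5/T_6 example above (the foreign-key map is illustrative):

```python
from graphlib import TopologicalSorter

# Map each target to the primary key tables it references.
foreign_keys = {
    "T_5": set(),        # primary key table, no references
    "T_6": {"T_5"},      # T_6 has a foreign key referencing T_5
}

# Load primary key tables before the tables that reference them.
load_order = list(TopologicalSorter(foreign_keys).static_order())
print(load_order)  # ['T_5', 'T_6']
```

A circular key relationship makes this ordering impossible, which is why the PowerCenter Server reverts to a normal load in that case.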
1. In the General Options settings of the Properties tab, choose Insert for the Treat Source
Rows As property.
2. Select the Constraint Based Load Ordering option.
3. Click OK.
Bulk Loading
You can enable bulk loading when you load to DB2, Sybase, Oracle, or Microsoft SQL Server.
If you enable bulk loading for other database types, the PowerCenter Server reverts to a
normal load. Bulk loading improves the performance of a session that inserts a large amount
of data to the target database. Configure bulk loading on the Mapping tab.
When bulk loading, the PowerCenter Server invokes the database bulk utility and bypasses
the database log, which speeds performance. Without writing to the database log, however,
the target database cannot perform rollback. As a result, you may not be able to perform
recovery. Therefore, you must weigh the importance of improved session performance against
the ability to recover an incomplete session.
For more information on increasing session performance when bulk loading, see “Bulk
Loading” on page 642.
Note: When loading to DB2, Microsoft SQL Server, and Oracle targets, you must specify a
normal load for data driven sessions. When you specify bulk mode and data driven, the
PowerCenter Server reverts to normal load.
Oracle Guidelines
Oracle allows bulk loading for the following software versions:
♦ Oracle server version 8.1.5 or higher
♦ Oracle client version 8.1.7.2 or higher
You can use the Oracle client 8.1.7 if you install the Oracle Threaded Bulk Mode patch.
Use the following guidelines when bulk loading to Oracle:
♦ Do not define CHECK constraints in the database.
♦ Do not define primary and foreign keys in the database. However, you can define primary
and foreign keys for the target definitions in the Designer.
♦ To bulk load into indexed tables, choose non-parallel mode. To do this, you must disable
the Enable Parallel Mode option. For more information, see “Configuring a Relational
Database Connection” on page 56.
Note that when you disable parallel mode, you cannot load multiple target instances,
partitions, or sessions into the same table.
To bulk load in parallel mode, you must drop indexes and constraints in the target tables
before running a bulk load session. After the session completes, you can rebuild them. If
you use bulk loading with the session on a regular basis, you can use pre- and post-session
SQL to drop and rebuild indexes and key constraints.
♦ When you use the LONG datatype, verify it is the last column in the table.
♦ Specify the Table Name Prefix for the target when you use Oracle client 9i. If you do not
specify the table name prefix, the PowerCenter Server uses the database login as the prefix.
For more information, see your Oracle documentation.
DB2 Guidelines
Use the following guidelines when bulk loading to DB2:
♦ You must drop indexes and constraints in the target tables before running a bulk load
session. After the session completes, you can rebuild them. If you use bulk loading with
the session on a regular basis, you can use pre- and post-session SQL to drop and rebuild
indexes and key constraints.
♦ When you bulk load to DB2, the DB2 database writes non-fatal errors and warnings to a
message log file in the session log directory. The message log file name is
<session_log_name>.<target_instance_name>.<partition_index>.log. You can check both
the message log file and the session log when you troubleshoot a DB2 bulk load session.
For more information, see your DB2 documentation.
1. In the Workflow Manager, open the session properties and click the Transformations
view on the Mapping tab.
2. Select the target instance under the Targets node.
Reserved Words
If any table name or column name contains a database reserved word, such as MONTH or
YEAR, the session fails with database errors when the PowerCenter Server executes SQL
against the database. You can create and maintain a reserved words file, reswords.txt, in the
PowerCenter Server installation directory. When the PowerCenter Server initializes a session,
it searches for reswords.txt. If the file exists, the PowerCenter Server places quotes around
matching reserved words when it executes SQL against the database.
Use the following rules and guidelines when working with reserved words:
♦ The PowerCenter Server searches the reserved words file when it generates SQL to connect
to source, target, and lookup databases.
♦ If you override the SQL for a source, target, or lookup, you must enclose any reserved
word in quotes.
♦ You may need to enable some databases, such as Microsoft SQL Server and Sybase, to use
SQL-92 standards regarding quoted identifiers. You can use environment SQL to issue the
command. For example, with Microsoft SQL Server, you can use the following command:
SET QUOTED_IDENTIFIER ON
The following is a sample reswords.txt file:

MONTH
DATE
INTERVAL
[Oracle]
OPTION
START
[DB2]
[SQL Server]
CURRENT
[Informix]
[ODBC]
MONTH
[Sybase]
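The effect of reswords.txt can be sketched in Python; the parsing and quoting below are a simplified illustration, not the PowerCenter Server's actual implementation:

```python
def load_reswords(lines):
    """Collect reserved words from reswords.txt-style lines,
    skipping [database] section headers."""
    words = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("["):
            words.add(line.upper())
    return words

def quote_identifier(name: str, reswords: set) -> str:
    # Place quotes around identifiers that match a reserved word.
    return f'"{name}"' if name.upper() in reswords else name

sample = ["[Oracle]", "OPTION", "START", "[ODBC]", "MONTH"]
reswords = load_reswords(sample)
print(quote_identifier("MONTH", reswords))   # "MONTH"
print(quote_identifier("AMOUNT", reswords))  # AMOUNT
```

For the quoted identifiers to be accepted, the target database must honor SQL-92 quoting, which is why the guide suggests environment SQL such as SET QUOTED_IDENTIFIER ON.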
Figure 9-9. Properties Settings on the Mapping Tab for a Flat File Target
Table 9-5 describes the properties you define in the Properties settings for flat file target
definitions:
♦ Merge Partitioned Files (optional). When selected, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. If the PowerCenter Server fails to create the merged file, it does not delete the individual output files. You cannot merge files if the session uses FTP, an external loader, or a message queue. For details on configuring a session for partitioning, see “Pipeline Partitioning” on page 345.
♦ Merge File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes the merged file in the server variable directory, $PMTargetFileDir. If you enter a full directory and file name in the Merge File Name field, clear this field.
♦ Merge File Name (optional). Name of the merge file. Default is target_name.out. This property is required if you select Merge Partitioned Files.
♦ Output File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes output files in the server variable directory, $PMTargetFileDir. If you specify both the directory and file name in the Output Filename field, clear this field. The PowerCenter Server concatenates this field with the Output Filename field when it runs the session. You can also use the $OutputFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Output Filename (required). Enter the file name, or file name and path. By default, the Workflow Manager names the target file based on the target definition used in the mapping: target_name.out. If the target definition contains a slash character, the Workflow Manager replaces the slash character with an underscore. When you use an external loader to load to an Oracle database, you must specify a file extension. If you do not specify a file extension, the Oracle loader cannot find the flat file and the PowerCenter Server fails the session. For more information about external loading, see “Loading to Oracle” on page 533. Optionally use the $OutputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Output File Directory field when it runs the session. For details on session parameters, see “Session Parameters” on page 495. Note: If you specify an absolute path file name when using FTP, the PowerCenter Server ignores the Default Remote Directory specified in the FTP connection. When you specify an absolute path file name, do not use single or double quotes.
♦ Reject File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Reject Filename (required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see “Session Parameters” on page 495.
♦ Set File Properties (optional). Opens a dialog box that allows you to define flat file properties. For more information, see “Configuring Fixed-Width Properties” on page 265 and “Configuring Delimited Properties” on page 266. When you output to a flat file using a relational target definition in the mapping, make sure you define the flat file properties by clicking the Set File Properties link.
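The Merge Partitioned Files behavior can be sketched as follows; the function is an illustration of the rule that individual output files are deleted only when the merge succeeds, not PowerCenter code:

```python
import os

def merge_partition_files(partition_files, merge_file):
    """Concatenate per-partition output files into one merge file,
    deleting the individual files only if the merge succeeds."""
    try:
        with open(merge_file, "wb") as out:
            for path in partition_files:
                with open(path, "rb") as part:
                    out.write(part.read())
    except OSError:
        # On failure, keep the individual output files.
        return False
    for path in partition_files:
        os.remove(path)
    return True
```

On success the individual partition files are removed and only the merge file remains; on any I/O failure, the partition files are left in place for inspection.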
Test Load Options
Table 9-6 describes the test load options in the General Options settings on the Properties
tab:
♦ Enable Test Load (optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. Note: You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.
♦ Number of Rows to Test (optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the number you configure for the test load.
To edit the fixed-width properties, select Fixed Width and click Advanced.
Figure 9-12 shows the Fixed Width Properties dialog box:
♦ Null Character (required). Enter the character you want the PowerCenter Server to use to represent null values. You can enter any valid character in the file code page. For more information about using null characters for target files, see “Null Characters in Fixed-Width Files” on page 272.
♦ Repeat Null Character (optional). Select this option to indicate a null value by repeating the null character to fill the field. If you do not select this option, the PowerCenter Server enters a single null character at the beginning of the field to represent a null value. For more information about specifying null characters for target files, see “Null Characters in Fixed-Width Files” on page 272.
♦ Code Page (required). Select the code page of the fixed-width file. The default setting is the client code page.
Table 9-8 describes the options you can define in the Delimited File Properties dialog box:
♦ Delimiters (required). Character used to separate columns of data. Use the button to the right of this field to enter a non-printable delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters.
♦ Optional Quotes (required). Select None, Single, or Double. If you select a quote character, the PowerCenter Server does not treat delimiter characters within the quote characters as a delimiter. For example, suppose an output file uses a comma as a delimiter and the PowerCenter Server receives the following row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and writes the row as four fields. If you do not select the optional single quote, the PowerCenter Server writes six separate fields.
♦ Code Page (required). Select the code page of the delimited file. The default setting is the client code page.
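The optional quote behavior can be demonstrated with Python's csv module as an analogy (not PowerCenter code): with a single-quote character the example row parses as four fields; without one, every comma splits a field.

```python
import csv
import io

row = "342-3849,'Smith, Jenna','Rockville, MD',6"

# With the optional single quote, commas inside quotes are not delimiters.
quoted = next(csv.reader(io.StringIO(row), quotechar="'"))
print(len(quoted))    # 4

# With no quote character, every comma separates a field.
unquoted = next(csv.reader(io.StringIO(row), quoting=csv.QUOTE_NONE))
print(len(unquoted))  # 6
```

The quoted parse keeps “Smith, Jenna” and “Rockville, MD” intact as single fields, matching the four-field output described above.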
Table 9-10 lists field length measurements for fixed-width flat file targets; for string columns, the field length is measured by the precision.
Table 9-11 lists the characters you must account for when you configure the precision or
field width for flat file target definitions to cover the total length of the target field:
Table 9-11. Characters to Include when Calculating Field Length for Fixed-Width Targets
Datetime - Date and time separators, such as slashes (/), dashes (-), and colons (:).
For example, the format MM/DD/YYYY HH24:MI:SS has a total length of 19 bytes.
When you edit the flat file target definition in the mapping, define the precision or field
width great enough to accommodate both the target data and the characters in Table 9-11.
For example, suppose you have a mapping with a fixed-width flat file target definition. The
target definition contains a number column with a precision of 10 and a scale of 2. You use a
comma as the decimal separator and a period as the thousands separator. You know some rows
of data might have a negative value. Based on this information, you know the longest possible
number is formatted with the following format:
-NN.NNN.NNN,NN
Open the flat file target definition in the mapping and define the field width for this number
column as a minimum of 14 bytes.
For more information on formatting numeric and datetime values, see “Working with Flat
Files” in the Designer Guide.
Notation - Description
A - Double-byte character
-o - Shift-out character
-i - Shift-in character
For the first target column, the PowerCenter Server writes only three of the double-byte
characters to the target. It cannot write any additional double-byte characters to the output
column because the column must end in a single-byte character. If you add two more bytes to
the first target column definition, then the PowerCenter Server can add shift characters and
write all the data without truncation.
For the second target column, the PowerCenter Server writes all four single-byte characters to
the target. It does not write shift characters to the column because the column begins and
ends with single-byte characters.
Character Set
You can configure the PowerCenter Server to run sessions with flat file targets in either ASCII
or Unicode data movement mode.
If you configure a session with a flat file target to run in Unicode data movement mode, the
target file code page must be a superset of the PowerCenter Server code page and the source
code page. Delimiters, escape, and null characters must be valid in the specified code page of
the flat file.
If you configure a session to run in ASCII data movement mode, delimiters, escape, and null
characters must be valid in the ISO Western European Latin1 code page. Any 8-bit character
you specified in previous versions of PowerCenter is still valid.
The column width for ITEM_ID is six. When you enable the Output Metadata For Flat File
Target option, the PowerCenter Server writes the following text to a flat file:
#ITEM_ITEM_NAME PRICE
100001Screwdriver 9.50
100002Hammer 12.90
For information about configuring the PowerCenter Server to output flat file metadata, see
the Installation and Configuration Guide.
Understanding Commit Points
This chapter covers the following topics:
♦ Overview, 276
♦ Target-Based Commits, 277
♦ Source-Based Commits, 278
♦ User-Defined Commits, 283
♦ Understanding Transaction Control, 287
♦ Setting Commit Properties, 292
Overview
A commit interval is the interval at which the PowerCenter Server commits data to targets
during a session. The commit point can be a factor of the commit interval, the commit
interval type, and the size of the buffer blocks. The commit interval is the number of rows
you want to use as a basis for the commit point. The commit interval type is the type of rows
that you want to use as a basis for the commit point. You can choose between the following
commit types:
♦ Target-based commit. The PowerCenter Server commits data based on the number of
target rows and the key constraints on the target table. The commit point also depends on
the buffer block size, the commit interval, and the PowerCenter Server configuration for
writer timeout.
♦ Source-based commit. The PowerCenter Server commits data based on the number of
source rows. The commit point is the commit interval you configure in the session
properties.
♦ User-defined commit. The PowerCenter Server commits data based on transactions
defined in the mapping properties. You can also configure some commit and rollback
options in the session properties.
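The difference between target-based and source-based row counting can be sketched as follows. This is a simplified illustration, not the server's implementation; `commit_points` and its arguments are invented for the example, and the target-based branch ignores buffer block size and writer timeout, which the text notes also affect the commit point:

```python
def commit_points(source_rows, dropped, interval, commit_type):
    """Model where interval commits fall for the two counting modes.

    source_rows: rows produced by the active source
    dropped: rows removed by transformation logic before the target
    interval: configured commit interval
    Returns the row counts at which an interval commit is issued.
    """
    points = []
    if commit_type == "source":
        # Source-based: count rows leaving the active source.
        for n in range(interval, source_rows + 1, interval):
            points.append(n)
    elif commit_type == "target":
        # Target-based: count rows reaching the target; dropped rows
        # never arrive, so commits fall later relative to the source.
        target_rows = source_rows - dropped
        for n in range(interval, target_rows + 1, interval):
            points.append(n)
    return points

# Source-based commit, interval 10,000, 40,000 source rows:
print(commit_points(40000, 0, 10000, "source"))
# Target-based commit where 3,000 of 10,000 rows are dropped
# (7,000 target rows never reach the interval, so no interval commit):
print(commit_points(10000, 3000, 10000, "target"))
```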
Source-based and user-defined commit sessions have partitioning restrictions. If you
configure a session with multiple partitions to use source-based or user-defined commit, you
can only choose pass-through partitioning at certain partition points in a pipeline. For more
information, see “Specifying Partition Types” on page 356.
The PowerCenter Server might commit fewer rows to the target than the number of rows
produced by the active source. For example, you have a source-based commit session that
passes 10,000 rows through an active source, and 3,000 rows are dropped due to
transformation logic. The PowerCenter Server issues a commit to the target when the 7,000
remaining rows reach the target.
The number of rows held in the writer buffers does not affect the commit point for a source-
based commit session. For example, you have a source-based commit session with a commit
interval of 10,000 that passes rows through an active source. When the first 10,000 rows
reach the targets, the PowerCenter Server issues a commit, regardless of how many rows
remain in the writer buffers. If the session completes successfully, the PowerCenter
Server issues commits after 10,000, 20,000, 30,000, and 40,000 source rows.
If the targets are in the same transaction control unit, the PowerCenter Server commits data
to the targets at the same time. If the session fails or aborts, the PowerCenter Server rolls back
all uncommitted data in a transaction control unit to the same source row.
If the targets are in different transaction control units, the PowerCenter Server performs the
commit when each target receives the commit row. If the session fails or aborts, the
PowerCenter Server rolls back each target to the last commit point. It might not roll back to
the same source row for targets in separate transaction control units. For more information on
transaction control units, see “Understanding Transaction Control Units” on page 289.
Note: Source-based commit may slow session performance if the session uses a one-to-one
mapping. A one-to-one mapping is a mapping that moves data from a Source Qualifier, XML
Source Qualifier, or Application Source Qualifier transformation directly to a target. For
more information about performance, see “Performance Tuning” on page 635.
Transformation Scope
property is All Input.
The mapping contains a target load order group with one source pipeline that branches from
the Source Qualifier transformation to two targets. One pipeline branch contains an
Aggregator transformation with the All Input transformation scope, and the other contains an
Expression transformation. The PowerCenter Server identifies the Source Qualifier
transformation as the commit source for t_monthly_sales and the Aggregator as the commit
source for T_COMPANY_ALL. It performs a source-based commit for both targets, but uses
a different commit source for each.
Connected to an XML Source Qualifier transformation with multiple connected output
groups. PowerCenter Server uses target-based commit when loading to these targets.
Connected to an active source that generates commits, AGG_Sales. PowerCenter Server
uses source-based commit when loading to this target.
This mapping contains an XML Source Qualifier transformation with multiple output groups
connected downstream. Because you connect multiple output groups downstream, the XML
Source Qualifier transformation does not generate commits. You connect the XML Source
Qualifier transformation to two relational targets, T_STORE and T_PRODUCT. Therefore,
these targets do not receive any commit generated by an active source. The PowerCenter
Server uses target-based commit when loading to these targets.
However, the mapping includes an active source that generates commits, AGG_Sales, between
the XML Source Qualifier transformation and T_YTD_SALES. The PowerCenter Server uses
source-based commit when loading to T_YTD_SALES.
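The rule this example describes, that a target uses source-based commit only when an upstream active source generates commits and otherwise falls back to target-based commit, can be sketched as follows (a hypothetical model; the function and data shapes are invented for illustration):

```python
def commit_type_for_target(upstream_active_sources):
    """Pick the commit handling for one target, following the rule in
    the text: if any active source upstream of the target generates
    commits, the server uses source-based commit with that source as
    the commit source; otherwise it uses target-based commit.

    upstream_active_sources: list of (name, generates_commits) pairs,
    ordered from the target upward. Names here are illustrative.
    """
    for name, generates_commits in upstream_active_sources:
        if generates_commits:
            return ("source-based", name)
    return ("target-based", None)

# T_STORE sits below an XML Source Qualifier with multiple connected
# output groups, which does not generate commits:
print(commit_type_for_target([("XMLSQ_Orders", False)]))
# T_YTD_SALES sits below AGG_Sales, which generates commits:
print(commit_type_for_target([("AGG_Sales", True), ("XMLSQ_Orders", False)]))
```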
When the PowerCenter Server writes all rows in a transaction to all targets, it issues commits
sequentially for each target.
The PowerCenter Server rolls back data based on the return value of the transaction control
expression or error handling configuration. If the transaction control expression returns a
rollback value, the PowerCenter Server rolls back the transaction. If an error occurs, you can
choose to roll back or commit at the next commit point.
If the transaction control expression evaluates to a value other than commit, rollback, or
continue, the PowerCenter Server fails the session. For more information about valid values,
see “Transaction Control Transformation” in the Transformation Guide.
When the session completes, the PowerCenter Server may write data to the target that was not
bound by commit rows. You can choose to commit at end of file or to roll back that open
transaction.
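The evaluation rules above can be sketched as a small loop. This is an illustrative model, not PowerCenter's writer logic; the function name and row format are invented, and a real session commits and rolls back per target rather than against a single list:

```python
def handle_transaction(rows, expression, commit_on_eof=True):
    """Sketch of user-defined commit handling as described in the text.

    expression(row) must return "commit", "rollback", or "continue";
    any other value fails the session. Rows remaining after the last
    commit row form an open transaction, committed at end of file
    only when commit_on_eof is set.
    """
    committed, pending = [], []
    for row in rows:
        pending.append(row)
        action = expression(row)
        if action == "commit":
            committed.extend(pending)
            pending = []
        elif action == "rollback":
            pending = []            # roll back the open transaction
        elif action != "continue":
            raise RuntimeError("invalid transaction control value: session fails")
    if pending and commit_on_eof:
        committed.extend(pending)   # commit the open transaction at EOF
    return committed

# Each row carries its transaction control result in the second field:
rows = [(1, "continue"), (2, "commit"), (3, "rollback"), (4, "continue")]
print(handle_transaction(rows, lambda r: r[1]))
```

With Commit on End of File cleared (`commit_on_eof=False`), row 4 stays in an open transaction and is rolled back instead of committed.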
Note: If you use bulk loading with a user-defined commit session, the target may not recognize
the transaction boundaries. If the target connection group does not support transactions, the
PowerCenter Server writes the following message to the session log:
WRT_8234 Warning: Target Connection Group’s connection doesn’t support
transactions. Targets may not be loaded according to specified transaction
boundaries rules.
Rollback Evaluation
If the transaction control expression returns a rollback value, the PowerCenter Server rolls
back the transaction and writes a message to the session log indicating that the transaction
was rolled back. It also indicates how many rows were rolled back.
The following message is a sample message that the PowerCenter Server writes to the session
log when the transaction control expression returns a rollback value:
WRITER_1_1_1> WRT_8326 User-defined rollback processed
WRT_8162 ===================================================
WRT_8330 Rolled back [333] inserted, [0] deleted, [0] updated rows for the
target [TCustOrders]
The following message is a sample message indicating that Commit on End of File is enabled
in the session properties:
WRITER_1_1_1> WRT_8143
The following row indicators appear in the reject file for rolled-back transactions in a
failed transaction control unit:
4 Rolled-back insert
5 Rolled-back update
6 Rolled-back delete
Note: The PowerCenter Server does not roll back a transaction if it encounters an error before
it processes any row through the Transaction Control transformation.
The following table describes row indicators in the reject file for committed transactions in a
failed transaction control unit:
7 Committed insert
8 Committed update
9 Committed delete
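Putting the two lists together, the reject-file row indicators written around a failed transaction control unit can be decoded with a small lookup. The assumption that the indicator is the first comma-separated field of each reject-file line is for illustration only:

```python
# Row indicator codes listed in the text for reject-file entries
# written around a failed transaction control unit.
ROW_INDICATORS = {
    4: "Rolled-back insert",
    5: "Rolled-back update",
    6: "Rolled-back delete",
    7: "Committed insert",
    8: "Committed update",
    9: "Committed delete",
}

def describe_reject_row(line, sep=","):
    """Return the row-indicator description for one reject-file line.

    Assumes the indicator is the first delimited field; treat the
    exact reject-file layout as an assumption of this sketch.
    """
    code = int(line.split(sep, 1)[0])
    return ROW_INDICATORS.get(code, "Unknown indicator")

print(describe_reject_row("7,100001,Screwdriver,9.50"))
```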
Transformation Scope
You can configure how the PowerCenter Server applies the transformation logic to incoming
data with the Transformation Scope transformation property. When the PowerCenter Server
processes a transformation, it either drops transaction boundaries or preserves transaction
boundaries, depending on the transformation scope and the mapping configuration.
You can choose one of the following values for the transformation scope:
♦ Row. Applies the transformation logic to one row of data at a time. Choose Row when a
row of data does not depend on any other row. When you choose Row for a
Transaction Control Unit 1
Target Connection Group 2
Note that T5_ora1 uses the same connection name as T1_ora1 and T2_ora1. Because
T5_ora1 is connected to a separate Transaction Control transformation, it is in a separate
transaction control unit and target connection group. If you connect T5_ora1 to
tc_TransactionControlUnit1, it will be in the same transaction control unit as all targets, and
in the same target connection group as T1_ora1 and T2_ora1.
Table 10-2 describes the session commit properties that you set in the General Options
settings of the Properties tab: Commit Type, Commit Interval, Commit on End of File, and
Roll Back Transactions on Error.
Commit on End of File behaves as follows for each commit type:
♦ Target-based commit. Commits data at the end of the file. Enabled by default. You
cannot disable this option.
♦ Source-based commit. Commits data at the end of the file. Clear this option if you want
the PowerCenter Server to roll back open transactions.
♦ User-defined commit. Commits data at the end of the file. Clear this option if you want
the PowerCenter Server to roll back open transactions.
Recovering Data
Overview
If you stop a session or if an error causes a session to stop unexpectedly, refer to the session
logs to determine the cause of the failure. Correct the errors, and then complete the session.
The method you use to complete the session depends on the configuration of the mapping
and the session, the specific failure, and how much progress the session made before it failed.
If the PowerCenter Server did not commit any data, run the session again. If the session
issued at least one commit and is recoverable, consider running the session in recovery mode.
Recovery allows you to restart a failed session and complete it as if the session had run
without pause. When the PowerCenter Server runs in recovery mode, it continues to commit
data from the point of the last successful commit. For more information on PowerCenter
Server processing during recovery, see “Server Handling for Recovery” on page 314.
All recovery sessions run as part of a workflow. When you recover a session, you also have the
option to run part of the workflow. Consider the configuration and design of the workflow
and the status of other tasks in the workflow before you choose a method of recovery.
Depending on the configuration and status of the workflow and session, you can choose one
or more of the following recovery methods:
♦ Recover a suspended workflow. If the workflow suspends due to session failure, you can
recover the failed session and resume the workflow. For details, see “Recovering a
Suspended Workflow” on page 305.
♦ Recover a failed workflow. If the workflow fails as a result of session failure, you can
recover the session and run the rest of the workflow. For details, see “Recovering a Failed
Workflow” on page 308.
♦ Recover a session task. If the workflow completes, but a session fails, you can recover the
session alone without running the rest of the workflow. You can also use this method to
recover multiple failed sessions in a branched workflow. For details, see “Recovering a
Session Task” on page 311.
For more information on session failure, see “Stopping and Aborting a Session” on page 200.
REP_GID VARCHAR(240)
WFLOW_ID NUMBER
SUBJ_ID NUMBER
TASK_INST_ID NUMBER
TGT_INST_ID NUMBER
PARTITION_ID NUMBER
TGT_RUN_ID NUMBER
RECOVERY_VER NUMBER
CHECK_POINT NUMBER
ROW_COUNT NUMBER
LAST_TGT_RUN_ID NUMBER
Note: If you manually create the PM_TGT_RUN_ID table, you must specify a value other
than zero in the LAST_TGT_RUN_ID column to ensure that the session runs successfully in
recovery mode.
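Conceptually, recovery resumes from the last successful commit by skipping rows that were already written, as sketched below. This is a simplified model; treating a single stored row count (such as the ROW_COUNT column above) as the checkpoint is an assumption made for illustration:

```python
def rows_to_process_in_recovery(all_rows, last_commit_row_count):
    """Sketch of recovery-mode processing: resume from the point of
    the last successful commit by skipping already-committed rows.

    last_commit_row_count stands in for a stored checkpoint (for
    example, ROW_COUNT in the recovery tables above); this mapping
    is an assumption of the sketch, not documented server behavior.
    """
    if last_commit_row_count == 0:
        # No commit was issued: simply run the session again.
        return list(all_rows)
    return list(all_rows)[last_commit_row_count:]

rows = list(range(1, 11))
# Six rows committed before the failure; recovery processes the rest:
print(rows_to_process_in_recovery(rows, 6))
```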
Code Description
12 The PowerCenter Server cannot start recovery because the session or workflow is scheduled, suspending,
waiting for an event, waiting, initializing, aborting, stopping, disabled, or running.
19 The PowerCenter Server cannot start the session in recovery mode because the workflow is configured to run
continuously.
For details on additional pmcmd return codes, see “pmcmd Return Codes” on page 590.
Transformation Produces Repeatable Data
Aggregator Always
Rank Always
Union Never
To run a session in recovery mode, you must first enable the failed session for recovery.
When you enable a session for recovery, the Workflow Manager verifies that all targets in the
mapping receive data from transformations that produce repeatable data. The Workflow
Manager uses the values in Table 11-4 to determine whether you can enable a session for
recovery. However, the Workflow Manager cannot verify whether you configure some
transformations, such as the Sequence Generator transformation, correctly, and it always
allows you to enable these sessions for recovery. You may get inconsistent results if you do
not configure these transformations correctly.
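The verification the Workflow Manager performs can be modeled as propagating a "repeatable" flag along a pipeline path using Table 11-4 values. This single-path sketch is illustrative only; the transformation names and the inherits rule for Lookup and Expression follow the examples in the text:

```python
# Whether a transformation produces repeatable output, per Table 11-4:
# True = always, False = never, "inherits" = repeatable only if the
# input is repeatable (Lookup and Expression behave this way per the
# examples in the text; the Custom entry follows the second example).
REPEATABLE = {
    "Aggregator": True,
    "Rank": True,
    "Union": False,
    "Custom": False,
    "Lookup": "inherits",
    "Expression": "inherits",
    "Sorter(distinct)": True,   # distinct sorted output is repeatable
}

def target_receives_repeatable(path):
    """Walk one source-to-target path and report whether the target
    receives repeatable data, so the session can be enabled for
    recovery. A simplified single-path model of the check.
    """
    repeatable = True   # assume the source output starts repeatable
    for transformation in path:
        rule = REPEATABLE[transformation]
        if rule != "inherits":
            repeatable = rule   # "always" or "never" overrides input
    return repeatable

# The first mapping example: Aggregator feeding Lookup and Expression.
print(target_receives_repeatable(["Aggregator", "Lookup", "Expression"]))
# The second example: Union and Custom never produce repeatable data.
print(target_receives_repeatable(["Union", "Custom", "Lookup"]))
# Adding a Sorter configured for distinct rows after the Custom
# transformation makes the session recoverable again:
print(target_receives_repeatable(["Union", "Custom", "Sorter(distinct)"]))
```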
The mapping contains an Aggregator transformation that always produces repeatable data.
The Aggregator transformation provides data for the Lookup and Expression transformations.
Lookup and Expression transformations produce repeatable data if they receive repeatable
data. Therefore, the target receives repeatable data, and you can enable this session for
recovery.
The mapping contains two Source Qualifier transformations that produce repeatable data.
However, the mapping contains a Union and Custom transformation downstream that never
produce repeatable data. The Lookup transformation only produces repeatable data if it
receives repeatable data. Therefore, the target does not receive repeatable data, and you
cannot enable this session for recovery.
You can modify this mapping to enable the session for recovery by adding a Sorter
transformation configured for distinct output rows immediately after transformations that
never output repeatable data. Since the Union transformation is connected directly to another
transformation that never produces repeatable data, you only need to add a Sorter
transformation after the Custom transformation, as shown in the mapping in Figure 11-3:
Example
Suppose the workflow w_ItemOrders contains two sequential sessions. In this workflow,
s_ItemSales is enabled for recovery, and the workflow is configured to suspend on error.
Suppose s_ItemSales fails, and the PowerCenter Server suspends the workflow. You correct the
error and resume the workflow in recovery mode. The PowerCenter Server recovers the
session successfully, and then runs s_UpdateOrders.
If s_UpdateOrders also fails, the PowerCenter Server suspends the workflow again. You
correct the error, but you cannot resume the workflow in recovery mode because you did not
enable the session for recovery. Instead, you resume the workflow. The PowerCenter Server
starts s_UpdateOrders from the beginning, completes the session successfully, and then runs
the StopWorkflow control task.
Example
Suppose you have the workflow w_ItemsDaily, containing three concurrent sessions,
s_SupplierInfo, s_PromoItems, and s_ItemSales. In this workflow, s_SupplierInfo and
s_PromoItems are enabled for recovery, and the workflow is configured to suspend on error.
Workflow
configured to
suspend on error.
Suppose s_SupplierInfo fails while the PowerCenter Server is running the three sessions. The
PowerCenter Server places the workflow in a suspending state and continues running the
other two sessions. s_PromoItems and s_ItemSales also fail, and the PowerCenter Server then
places the workflow in a suspended state.
You correct the errors that caused each session to fail and then resume the workflow in
recovery mode. The PowerCenter Server starts s_SupplierInfo and s_PromoItems in recovery
mode. Since s_ItemSales is not enabled for recovery, the PowerCenter Server restarts that
session from the beginning.
The PowerCenter Server runs the three sessions concurrently.
After all sessions succeed, the PowerCenter Server runs the Command task.
Example
Suppose the workflow w_ItemOrders contains two sequential sessions. s_ItemSales is enabled
for recovery and also configured to fail the parent workflow if it fails.
Figure 11-6 illustrates w_ItemOrders:
Example
Suppose the workflow w_ItemsDaily contains three concurrent sessions, s_SupplierInfo,
s_PromoItems, and s_ItemSales. In this workflow, each session is enabled for recovery and
configured to fail the parent workflow if the session fails.
Figure 11-7 illustrates w_ItemsDaily:
1. Select the failed session in the Navigator or in the Workflow Designer workspace.
2. Right-click the failed session and choose Recover Workflow from Task.
The PowerCenter Server runs the failed session in recovery mode, and then runs the rest
of the workflow.
Suppose s_ItemSales fails and the PowerCenter Server fails the workflow. s_PromoItems and
s_SupplierInfo also fail. You correct the errors that caused the sessions to fail.
After you correct the errors, you individually recover each failed session. The PowerCenter
Server successfully recovers the sessions. The workflow paths after the sessions converge at the
Command task, allowing you to start the workflow from the Command task and complete
the workflow.
Alternatively, after you correct the errors, you could also individually recover two of the three
failed sessions. After the PowerCenter Server successfully recovers the sessions, you can
recover the workflow from the third session. The PowerCenter Server then recovers the third
session and, on successful recovery, runs the rest of the workflow.
1. Select the failed session in the Navigator or in the Workflow Designer workspace.
2. Right-click the failed session and choose Recover Task.
The PowerCenter Server runs the session in recovery mode.
Running Recovery
If a session enabled for recovery fails, you can run the session in recovery mode. The
PowerCenter Server moves a recovery session through the states of a normal session:
scheduled, waiting, running, succeeded, and failed. When the PowerCenter Server starts the
recovery session, it runs all pre-session tasks.
Sending Email
Overview
You can send email to designated recipients when the PowerCenter Server runs a workflow.
For example, if you want to track how long a session takes to complete, you can configure the
session to send an email containing the time and date the session starts and completes. Or, if
you want the PowerCenter Server to notify you when a workflow suspends, you can configure
the workflow to send email when it suspends.
When you create a workflow or worklet, you can include the following types of email:
♦ Email task. You can include reusable and non-reusable Email tasks anywhere in the
workflow or worklet. For more information, see “Using Email Tasks in a Workflow or
Worklet” on page 341.
♦ Post-session email. You can configure the session so the PowerCenter Server sends an
email when the session completes or fails. You create an Email task and use it for post-
session email. For more information, see “Working with Post-Session Email” on page 332.
When you configure the subject and body of post-session email, you can use email
variables to include information about the session run, such as session name, status, and
the total number of records loaded. You can also use email variables to attach the session
log or other files to email messages. For more information, see “Email Variables and
Format Tags” on page 333.
♦ Suspension email. You can configure the workflow so the PowerCenter Server sends an
email when the workflow suspends. You create an Email task and use it for suspension
email. For more information, see “Working with Suspension Email” on page 339.
Before you can configure a session or workflow to send email, you need to create an Email
task. For more information, see “Working with Email Tasks” on page 328.
The PowerCenter Server on Windows sends email in MIME format. This allows you to
include characters in the subject and body that are not in 7-bit ASCII. For more information
on the MIME format or the MIME decoding process, see your email documentation.
Before creating Email tasks, configure the PowerCenter Server to send email. For more
information, see “Configuring Email on UNIX” on page 321 and “Configuring Email on
Windows” on page 322.
1. Log on to the UNIX system as the Informatica user who starts the PowerCenter Server.
2. Type the following lines at the prompt and press Enter:
rmail <your fully qualified email address>,<second fully
qualified email address>
From <your_user_name>
1. Log on to the UNIX system as the Informatica user who starts the PowerCenter Server.
2. Type the following line at the prompt and press Enter:
rmail <your fully qualified email address>,<second fully
qualified email address>
3. To indicate the end of the message, type . on a line of its own and press Enter.
Or, type ^D.
You should receive a blank email from the email account of the Informatica user. If not,
locate the directory where rmail resides and add that directory to the path.
Once you verify that rmail is installed correctly, you can send email. For more information on
configuring email, see “Working with Email Tasks” on page 328.
1. Open the Control Panel on the machine running the PowerCenter Server.
2. Double-click the Mail (or Mail and Fax) icon.
3. On the Services tab of the user Properties dialog box, click Show Profiles.
The Mail dialog box displays the list of profiles configured for the computer.
4. If you have a Microsoft Outlook profile set up for the Informatica Service startup
account, skip to “Step 3. Configure Logon Network Security” on page 325. If you do not
already have a Microsoft Outlook profile set up for the Informatica Service startup
account, continue to the next step.
5. Click Add in the mail properties window.
The Microsoft Outlook Setup Wizard appears.
7. Enter a profile name. You can enter any name, but Informatica recommends that you
enter a text string that matches the Informatica Service startup account. Click Next.
11. Indicate whether you want to run Outlook when you start Windows. Click Next.
12. The Setup Wizard indicates that you have successfully configured an Outlook profile.
13. Click Finish.
1. Open the Control Panel on the machine running the PowerCenter Server.
2. Double-click the Mail (or Mail and Fax) icon. The User Properties sheet appears.
4. Click the Advanced tab. Set the Logon network security option to NT Password
Authentication.
5. Click OK.
1. In the Task Developer, choose Tasks-Create. The Create Task dialog box appears.
2. Select an Email task and enter a name for the task. Click Create.
The Workflow Manager creates an Email task in the workspace.
3. Click Done.
8. Enter the fully qualified email address of the mail recipient in the Email User Name field.
For more information on entering the email address, see “Email Address Tips and
Guidelines” on page 328.
11. Enter the text of the email message in the Email Editor.
When you use the Email task, you can incorporate format tags in your message. For more
information, see “Email Variables and Format Tags” on page 333.
You can leave the Email Text field blank.
12. Click OK twice to save your changes.
Use a reusable Email task.
Select a reusable Email task.
Use a non-reusable Email task.
You can specify a reusable Email task you create in the Task Developer for either success email
or failure email. Or, you can create a non-reusable Email task for each session property. When
you create a non-reusable Email task for the session property, you create the Email task for
that session only. You cannot use the Email task in the workflow or worklet.
%s Session name.
%e Session status.
%t Source and target table details, including read throughput in bytes per second and write throughput
in rows per second. The PowerCenter Server includes all information displayed in the session detail
dialog box.
%a<filename> Attach the named file. The file must be local to the PowerCenter Server. The following are valid file
names: %a<c:\data\sales.txt> or %a</users/john/data/sales.txt>.
Note: The file name cannot include the greater than character (>) or a line break.
Note: The PowerCenter Server ignores %a, %g, or %t when you include them in the email subject. Include these variables in the email
message only.
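Variable substitution of this kind can be sketched as simple template replacement. This is not the PowerCenter implementation; the function is hypothetical, and it applies the rule from the note above that %a, %g, and %t are ignored in the email subject:

```python
# Email variables the server ignores in the subject, per the note above.
BODY_ONLY = {"%a", "%g", "%t"}

def expand_email(template, values, is_subject=False):
    """Substitute post-session email variables into a template.

    A simplified sketch: real variable handling (for example,
    attaching a file via %a<filename>) is more involved than plain
    string replacement.
    """
    out = template
    for var, value in values.items():
        if is_subject and var in BODY_ONLY:
            out = out.replace(var, "")   # ignored in subjects
        else:
            out = out.replace(var, value)
    return out

values = {"%s": "s_ItemSales", "%e": "Completed"}
print(expand_email("Session name: %s, status: %e", values))
```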
Table 12-2 lists the format tags you can use in an Email task:
tab \t
new line \n
2. Select Reusable in the Type column for the success email or failure email field.
3. Click the Open button in the Value column to select the reusable Email task.
2. Select Non-Reusable in the Type column for the success email or failure email field.
4. Edit the Email task and click OK. For more information on editing Email tasks, see
“Working with Email Tasks” on page 328.
5. Click OK to close the session properties.
Sample Email
The following is user-entered text from a sample post-session email configuration using
variables:
Session complete.
Session name: %s
%l
%r
%e
%b
%c
%i
%g
Completed
Note: The Workflow Manager returns an error message if you do not have any reusable
Email tasks in the folder. Create a reusable Email task in the folder before you configure
suspension email.
5. Choose a reusable Email task and click OK.
6. Click OK to close the workflow properties.
Configure the gen_report Command task to execute a shell script that generates the report.
Verify the shell script saves the report to a directory local to the PowerCenter Server.
Configure the em_report Email task to attach the file generated from the shell script.
Chapter 13
Pipeline Partitioning
This chapter covers the following subjects:
♦ Overview, 346
♦ Configuring Partitioning Information, 351
♦ Cache Partitioning, 359
♦ Round-Robin Partition Type, 360
♦ Hash Keys Partition Types, 361
♦ Key Range Partition Type, 363
♦ Pass-Through Partition Type, 367
♦ Database Partitioning Partition Type, 369
♦ Partitioning Relational Sources, 371
♦ Partitioning File Sources, 374
♦ Partitioning Relational Targets, 378
♦ Partitioning File Targets, 380
♦ Partitioning Joiner Transformations, 384
♦ Partitioning Lookup Transformations, 391
♦ Partitioning Sorter Transformations, 392
♦ Mapping Variables in Partitioned Pipelines, 394
♦ Partitioning Rules, 395
Overview
You create a session for each mapping you want the PowerCenter Server to run. Every
mapping contains one or more source pipelines. A source pipeline consists of a source
qualifier and all the transformations and targets that receive data from that source qualifier.
If you purchase the Partitioning option, you can specify partitioning information for each
source pipeline in a mapping. The partitioning information for a pipeline controls the
following factors:
♦ The number of reader, transformation, and writer threads that the master thread creates
for the pipeline. For more information, see “Understanding Processing Threads” on
page 14.
♦ How the PowerCenter Server reads data from the source, including the number of
connections to the source.
♦ How the PowerCenter Server distributes rows of data to each transformation as it processes
the pipeline.
♦ How the PowerCenter Server writes data to the target, including the number of
connections to each target in the pipeline.
You can specify partitioning information for a pipeline by setting the following attributes:
♦ Location of partition points. Partition points mark the thread boundaries in a pipeline
and divide the pipeline into stages. The PowerCenter Server sets partition points at several
transformations in a pipeline by default. If you have the Partitioning option, you can
define other partition points. When you add partition points, you increase the number of
transformation threads, which can improve session performance. The PowerCenter Server
can redistribute rows of data at partition points, which can also improve session
performance. For more information on partition points, see “Partition Points” on
page 346.
♦ Number of partitions. A partition is a pipeline stage that executes in a single thread. If you
purchase the Partitioning option, you can set the number of partitions at any partition
point. When you add partitions, you increase the number of processing threads, which can
improve session performance. For more information, see “Number of Partitions” on
page 348.
♦ Partition types. The PowerCenter Server specifies a default partition type at each partition
point. If you purchase the Partitioning option, you can change the partition type. The
partition type controls how the PowerCenter Server redistributes data among partitions at
partition points. For more information, see “Partition Types” on page 348.
Partition Points
By default, the PowerCenter Server sets partition points at various transformations in the
pipeline. Partition points mark thread boundaries as well as divide the pipeline into stages. A
stage is a section of a pipeline between any two partition points. When you set a partition
point at a transformation, the new pipeline stage includes that transformation.
The PowerCenter Server creates the following default partition points, each with a default
partition type:
♦ Source Qualifier or Normalizer transformation. Pass-through. Controls how the
PowerCenter Server reads data from the source and passes data into the source qualifier.
♦ Rank and unsorted Aggregator transformations. Hash auto-keys. Ensures that the
PowerCenter Server groups rows properly before it sends them to the transformation.
♦ Target instances. Pass-through. Controls how the target instances pass data to the
targets.
If you purchase the Partitioning option, you can add partition points at other transformations
and delete some partition points.
Figure 13-1 shows the default partition points and pipeline stages for a simple mapping with
one source pipeline:
The mapping in Figure 13-1 contains four stages. The partition point at the source qualifier
marks the boundary between the first (reader) and second (transformation) stages. The
partition point at the Aggregator transformation marks the boundary between the second and
third (transformation) stages. The partition point at the target instance marks the boundary
between the third (transformation) and fourth (writer) stage.
When you add a partition point, you increase the number of pipeline stages by one. Similarly,
when you delete a partition point, you reduce the number of stages by one. For more
information, see “Understanding Processing Threads” on page 14.
Besides marking stage boundaries, partition points also mark the points in the pipeline where
the PowerCenter Server can redistribute data across partitions. For example, if you place a
partition point at a Filter transformation and define multiple partitions, the PowerCenter
Server can redistribute rows of data among the partitions before the Filter transformation
processes the data. The partition type you set at this partition point controls the way in which
the PowerCenter Server passes rows of data to each partition. For more information, see
“Partition Types” on page 348.
For more information on adding and deleting partition points, see “Adding and Deleting
Partition Points” on page 353.
Number of Partitions
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread.
By default, the PowerCenter Server defines a single partition in the source pipeline. If you
purchase the Partitioning option, you can increase the number of partitions. This increases
the number of processing threads, which can improve session performance.
For example, you need to use the mapping in Figure 13-1 to extract data from three flat files
of various sizes. To do this, you define three partitions at the source qualifier to read the data
simultaneously. When you do this, the Workflow Manager defines three partitions in the
pipeline.
Figure 13-2 shows the threads that the master thread creates for this mapping:
Figure 13-2. Threads Created for a Sample Mapping with Three Partitions
By default, the PowerCenter Server sets the number of partitions to one. You can generally
define up to 64 partitions at any partition point. However, there are situations in which you
can define only one partition in the pipeline. For more information, see “Restrictions on the
Number of Partitions” on page 395.
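The stage and thread arithmetic in this overview can be sketched as follows. This is a simplified model that assumes a uniform partition count at every partition point, so stages equal partition points plus one and each stage runs one reader, transformation, or writer thread per partition; it matches the four-stage example with three partitions:

```python
def session_threads(partition_points, partitions):
    """Rough thread arithmetic from the text: partition points divide
    the pipeline into stages, and each stage runs one thread per
    partition. Assumes the same partition count at every point.
    """
    stages = partition_points + 1   # adding a point adds one stage
    return stages * partitions

# The default mapping in Figure 13-1: 3 partition points (source
# qualifier, Aggregator, target instance), 4 stages, 1 partition.
print(session_threads(3, 1))
# The same mapping with 3 partitions, as in Figure 13-2: 3 reader,
# 6 transformation, and 3 writer threads.
print(session_threads(3, 3))
```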
Note: Increasing the number of partitions or partition points increases the number of threads.
Therefore, increasing the number of partitions or partition points also increases the load on
the server machine. If the server machine contains ample CPU bandwidth, processing rows of
data in a session concurrently can increase session performance. However, if you create a large
number of partitions or partition points in a session that processes large amounts of data, you
can overload the system.
For more information on adding and deleting partitions, see “Adding and Deleting Partitions”
on page 356.
Partition Types
When you configure the partitioning information for a pipeline, you must specify a partition
type at each partition point in the pipeline. The partition type determines how the
PowerCenter Server redistributes data across partition points.
The mapping in Figure 13-3 reads data about items and calculates average wholesale costs and
prices. The mapping must read item information from three flat files of various sizes, and
then filter out discontinued items. It sorts the active items by description, calculates the
average prices and wholesale costs, and writes the results to a relational database in which the
target tables are partitioned by key range.
When you use this mapping in a session, you can increase session performance by specifying
different partition types at the following partition points in the pipeline:
♦ Source qualifier. To read data from the three flat files concurrently, you must specify three
partitions at the source qualifier. Accept the default partition type, pass-through.
♦ Filter transformation. Since the source files vary in size, each partition processes a
different amount of data. Set a partition point at the Filter transformation, and choose
round-robin partitioning to balance the load going into the Filter transformation.
♦ Sorter transformation. To eliminate overlapping groups in the Sorter and Aggregator
transformations, use hash auto-keys partitioning at the Sorter transformation. This causes
the PowerCenter Server to group all items with the same description into the same
partition before the Sorter and Aggregator transformations process the rows. You can
delete the default partition point at the Aggregator transformation.
♦ Target. Since the target tables are partitioned by key range, specify key range partitioning
at the target to optimize writing data to the target.
For more information on specifying partition types, see “Specifying Partition Types” on
page 356.
The Partitions view of the Mapping tab displays the selected partition point in the
partitioning workspace. From this view, you can delete a partition point, specify key ranges,
and click Edit Keys.
Table 13-2. Options on Session Properties Partitions View on the Mapping Tab
Add Partition Point Click to add a new partition point in the mapping. When you add a partition point, the
transformation name appears under the Partition Points node.
Edit Partition Point Click to edit the selected partition point. This opens the Edit Partition Point dialog box. For
more information on the options in this dialog box, see Table 13-3 on page 353.
Key Range Displays the key and key ranges for the partition point, depending on the partition type.
For key range partitioning, you specify the key ranges.
For hash user keys partitioning, this field displays the partition key.
The Workflow Manager does not display this area for other partition types.
Edit Keys Click to add or remove the partition key for key range or hash user keys partitioning. You
cannot create a partition key for hash auto-keys, round-robin, or pass-through partitioning.
You can configure the following information when you edit or add a partition point:
♦ Specify the partition type at the partition point.
♦ Add and delete partitions.
♦ Enter a description for each partition.
Figure 13-5 shows the configuration options in the Edit Partition Point dialog box, where
you can select a partition, delete a partition, and enter a partition description.
Partition Names Select individual partitions from this dialog box to configure.
Add a Partition Adds a partition. You can add up to 64 partitions at any partition point. The number of
partitions must be consistent across the pipeline. Therefore, if you define three partitions
at one partition point, the Workflow Manager defines three partitions at all partition points
in the pipeline.
Delete a Partition Deletes the selected partition. Each partition point must contain at least one partition.
In this mapping, the Workflow Manager creates partition points at the source qualifier and
target instance by default. You can place an additional partition point at Expression
transformation EXP_3.
If you place a partition point at EXP_3 and define one partition, the master thread creates the
following threads:
Transformation Reason
EXP_1 and EXP_2 If you could place a partition point at EXP_1 or EXP_2, you would create an additional pipeline
stage that processes data from the source qualifier to EXP_1 or EXP_2. In this case, EXP_3
would receive data from two pipeline stages, which is not allowed.
For more information about processing threads, see “Understanding Processing Threads” on
page 14.
1. On the Partitions view of the Mapping tab, select a transformation that is not already a
partition point, and click the Add a Partition Point button.
Tip: You can select a transformation from the Non-Partition Points node.
2. Select the partition type for the partition point or accept the default value. For
information on specifying a valid partition type, see “Specifying Partition Types” on
page 356.
3. Click OK.
The transformation appears in the Partition Points node in the Partitions view on the
Mapping tab of the session properties.
Transformation Round- Hash Hash User Key Pass- Database Default Partition
(Partition Point) Robin Auto-Keys Keys Range Through Partitioning Type
Normalizer (COBOL sources) X Pass-through
Normalizer (relational) X X X X Pass-through
Custom X X X X Pass-through
Expression X X X X Pass-through
Filter X X X X Pass-through
Joiner X X Based on transformation scope*
Lookup X X X X X Pass-through
Rank X X Based on transformation scope*
Router X X X X Pass-through
Sorter X X X Based on transformation scope*
Union X X X X Pass-through
The session based on this mapping reads item information from three flat files of different
sizes:
♦ Source file 1: 80,000 rows
♦ Source file 2: 5,000 rows
♦ Source file 3: 15,000 rows
When the PowerCenter Server reads the source data, the first partition begins processing 80%
of the data, the second partition processes 5% of the data, and the third partition processes
15% of the data.
To distribute the workload more evenly, set a partition point at the Filter transformation and
set the partition type to round-robin. The PowerCenter Server distributes the data so that
each partition processes approximately one third of the data.
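The effect of round-robin redistribution can be sketched in a few lines. This is an illustrative model only, not PowerCenter code; the function name and the row data are hypothetical.

```python
from itertools import cycle

def round_robin(rows, num_partitions):
    """Deal rows across partitions one at a time, like cards."""
    partitions = [[] for _ in range(num_partitions)]
    targets = cycle(range(num_partitions))
    for row in rows:
        partitions[next(targets)].append(row)
    return partitions

# Three source "files" of the sizes used in the example above.
rows = ["file1"] * 80000 + ["file2"] * 5000 + ["file3"] * 15000
parts = round_robin(rows, 3)
print([len(p) for p in parts])  # [33334, 33333, 33333]
```

Each partition ends up with roughly one third of the 100,000 rows, regardless of how unevenly the source files were sized.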
Hash Auto-Keys
You can use hash auto-keys partitioning at or before Rank, Sorter, Joiner, and unsorted
Aggregator transformations to ensure that rows are grouped properly before they enter these
transformations.
Figure 13-8 shows a mapping where hash auto-keys partitioning causes the PowerCenter
Server to distribute rows to each partition according to group before they enter the Sorter and
Aggregator transformations:
In this mapping, the Sorter transformation sorts items by item description. If items with the
same description exist in more than one source file, items with the same description can end
up in more than one partition. Without hash auto-keys partitioning, the Aggregator
transformation might calculate average costs and prices for each item incorrectly.
To prevent errors in the cost and prices calculations, set a partition point at the Sorter
transformation and set the partition type to hash auto-keys. When you do this, the
PowerCenter Server redistributes the data so that all items with the same description reach the
Sorter and Aggregator transformations in a single partition.
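The grouping guarantee that hash partitioning provides can be sketched as follows. This is a simplified illustration, not the server's actual hash function; the function name and the item rows are made up.

```python
def hash_partition(rows, key, num_partitions):
    """Route every row to a partition chosen by hashing its key, so all
    rows that share a key value land in the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

items = [
    {"item": "widget", "price": 10},
    {"item": "gadget", "price": 20},
    {"item": "widget", "price": 12},
    {"item": "gadget", "price": 22},
]
parts = hash_partition(items, "item", 3)

# Every description maps to exactly one partition, so a per-partition
# average price is also a correct per-item average.
for name in ("widget", "gadget"):
    homes = [i for i, p in enumerate(parts) if any(r["item"] == name for r in p)]
    assert len(homes) == 1
```

Because the hash of a given description is identical for every row, an aggregate computed within one partition sees all rows for that item.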
To rearrange the order of the ports that make up the key, select a port in the Selected Ports list
and click the up or down arrow.
Figure 13-10. Mapping where Key Range Partitioning Can Increase Performance
When you do this, the PowerCenter Server sends all items with IDs less than 3000 to the first
partition. It sends all items with IDs between 3000 and 5999 to the second partition. Items
with IDs greater than or equal to 6000 go to the third partition. For more information on key
ranges, see “Adding Key Ranges” on page 365.
To rearrange the order of the ports that make up the partition key, select a port in the Selected
Ports list and click the up or down arrow.
In key range partitioning, the order of the ports does not affect how the PowerCenter Server
redistributes rows among partitions, but it can affect session performance. For example, you
might configure the following compound partition key:
Selected Ports
ITEMS.DESCRIPTION
ITEMS.DISCONTINUED_FLAG
Since boolean comparisons are usually faster than string comparisons, the session may run
faster if you arrange the ports in the following order:
Selected Ports
ITEMS.DISCONTINUED_FLAG
ITEMS.DESCRIPTION
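Why putting the cheap comparison first can matter is easy to demonstrate with a small experiment. The compound keys below are hypothetical stand-ins for the ITEMS ports, and the counting class exists only to make the short-circuit visible; it says nothing about how PowerCenter compares keys internally.

```python
class CountedStr(str):
    """A str subclass that counts how often it is compared for equality."""
    comparisons = 0
    def __eq__(self, other):
        CountedStr.comparisons += 1
        return str.__eq__(self, other)
    def __hash__(self):
        return str.__hash__(self)

# Hypothetical compound keys: (DISCONTINUED_FLAG, DESCRIPTION).
row_a = (False, CountedStr("widget"))
row_b = (True, CountedStr("widget"))

# Boolean first: the flags differ, so tuple comparison short-circuits
# and the strings are never compared.
CountedStr.comparisons = 0
row_a == row_b
fast = CountedStr.comparisons

# String first: the (equal) strings must be compared before the flags.
CountedStr.comparisons = 0
(row_a[1], row_a[0]) == (row_b[1], row_b[0])
slow = CountedStr.comparisons

print(fast, slow)  # 0 1
```

With the boolean flag first, unequal keys are usually told apart before any string comparison happens at all.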
You can leave the start or end range blank for a partition. When you leave the start range
blank, the PowerCenter Server uses the minimum data value as the start range. When you
leave the end range blank, the PowerCenter Server uses the maximum data value as the end
range.
For example, you can add the following ranges for a key based on CUSTOMER_ID in a
pipeline that contains two partitions:
CUSTOMER_ID     Start Range     End Range
Partition #1                    135000
Partition #2    135000
When the PowerCenter Server reads the Customers table, it sends all rows that contain
customer IDs less than 135000 to the first partition, and all rows that contain customer IDs
equal to or greater than 135000 to the second partition. The PowerCenter Server eliminates
rows that contain null values or values that fall outside the key ranges.
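The open-ended range rules above can be sketched as a small routing function. This is an illustrative model, not PowerCenter's implementation; the function name is made up.

```python
def assign_partition(value, ranges):
    """Return the index of the partition whose key range holds the value.
    A blank (None) start means the minimum value; a blank end means the
    maximum. Rows with NULL keys or keys outside every range are dropped."""
    if value is None:
        return None
    for i, (start, end) in enumerate(ranges):
        if (start is None or value >= start) and (end is None or value < end):
            return i
    return None

# Two partitions keyed on CUSTOMER_ID, split at 135000 as in the example.
ranges = [(None, 135000), (135000, None)]
print(assign_partition(100, ranges))     # 0
print(assign_partition(135000, ranges))  # 1
print(assign_partition(None, ranges))    # None: the row is eliminated
```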
By default, this mapping contains partition points only at the source qualifier and target
instance. Since this mapping contains an XML target, you can configure only one partition at
any partition point.
In this case, the master thread creates one reader thread to read data from the source, one
transformation thread to process the data, and one writer thread to write data to the target.
Each pipeline stage processes the rows as follows:
Source Qualifier Transformations Target Instance
(First Stage) (Second Stage) (Third Stage)
Time
Row Set 1 – –
Row Set 2 Row Set 1 –
Row Set 3 Row Set 2 Row Set 1
Row Set 4 Row Set 3 Row Set 2
... ... ...
Row Set n Row Set n-1 Row Set n-2
Because the pipeline contains three stages, the PowerCenter Server can process three sets of
rows concurrently.
If the Expression transformations are very complicated, processing the second
(transformation) stage can take a long time and cause low data throughput. To improve
performance, set a partition point at Expression transformation EXP_2 and set the partition
type to pass-through.
The PowerCenter Server can now process four sets of rows concurrently as follows:
Source FIL_1 & EXP_1 EXP_2 & LKP_1 Target
Qualifier Transformations Transformations Instance
(First Stage) (Second Stage) (Third Stage) (Fourth Stage)
Time
Row Set 1 - - -
Row Set 2 Row Set 1 - -
Row Set 3 Row Set 2 Row Set 1 -
Row Set 4 Row Set 3 Row Set 2 Row Set 1
... ... ... ...
Row Set n Row Set n-1 Row Set n-2 Row Set n-3
By adding an additional partition point at Expression transformation EXP_2, you replace one
long running transformation stage with two shorter running transformation stages. Data
throughput depends on the longest running stage. So in this case, data throughput increases.
For more information about processing threads, see “Understanding Processing Threads” on
page 14.
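A rough model shows why the extra partition point helps: once the pipeline is full, total time is dominated by the slowest stage. The stage times below are invented purely for illustration.

```python
def pipeline_time(stage_times, n_row_sets):
    """Approximate run time of a synchronous pipeline: filling it costs the
    sum of all stage times, then each further row set costs one tick of the
    slowest (bottleneck) stage."""
    return sum(stage_times) + (n_row_sets - 1) * max(stage_times)

three_stages = [1.0, 4.0, 1.0]       # reader, one long transformation stage, writer
four_stages = [1.0, 2.0, 2.0, 1.0]   # the long stage split at EXP_2

print(pipeline_time(three_stages, 1000))  # 4002.0
print(pipeline_time(four_stages, 1000))   # 2004.0
```

Splitting the long transformation stage halves the bottleneck time, so throughput roughly doubles, which matches the reasoning above.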
♦ You cannot use database partitioning when you configure the session to use source-based
or user-defined commit, constraint-based loading, or session recovery.
♦ The target table must contain a partition key. Also, you must link all not-null partition key
columns in the target instance to a transformation in the mapping.
♦ You must use high precision mode when the IBM DB2 table partitioning key uses a Bigint
field. The PowerCenter Server fails the session when the IBM DB2 table partitioning key
uses a Bigint field and you use low precision mode.
♦ If you create multiple partitions for a DB2 bulk load session, you must use database
partitioning for the target partition type. If you choose any other partition type, the
PowerCenter Server reverts to normal load and writes the following message to the session
log:
ODL_26097 Only database partitioning is support for DB2 bulk load.
Changing target load type variable to Normal.
If you configure a session for database partitioning, the PowerCenter Server reverts to pass-
through partitioning under the following circumstances:
♦ The DB2 target table is stored on one node.
♦ You run the session in debug mode using the Debugger.
Figure 13-14. Overriding the SQL Query and Entering a Filter Condition
For more information about partitioning Application sources, refer to the PowerCenter
Connect documentation.
If you know that the IDs for customers outside the USA fall within the range for a particular
partition, you can enter a filter in that partition to exclude them. Therefore, you enter the
following filter condition for the second partition:
CUSTOMERS.COUNTRY = 'USA'
When the session runs, the following queries for the two partitions appear in the session log:
READER_1_1_1> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM
CUSTOMERS WHERE CUSTOMERS.CUSTOMER_ID < 135000]
[...]
READER_1_1_2> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM
CUSTOMERS WHERE CUSTOMERS.COUNTRY = 'USA' AND 135000 <=
CUSTOMERS.CUSTOMER_ID]
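The way the server appears to combine a partition's key range with the user-entered filter can be sketched as simple string assembly. This is a guess at the composition logic for illustration only; the actual SQL generation is internal to PowerCenter, and the function name is made up.

```python
def partition_query(base_query, key_column, start, end, extra_filter=None):
    """Compose the WHERE clause for one key-range partition: an optional
    user-entered filter ANDed with the partition's key-range bounds."""
    conditions = []
    if extra_filter:
        conditions.append(extra_filter)
    if start is not None:
        conditions.append(f"{start} <= {key_column}")
    if end is not None:
        conditions.append(f"{key_column} < {end}")
    return base_query + (" WHERE " + " AND ".join(conditions) if conditions else "")

base = "SELECT CUSTOMERS.CUSTOMER_ID FROM CUSTOMERS"
print(partition_query(base, "CUSTOMERS.CUSTOMER_ID", None, 135000))
# ... WHERE CUSTOMERS.CUSTOMER_ID < 135000
print(partition_query(base, "CUSTOMERS.CUSTOMER_ID", 135000, None,
                      "CUSTOMERS.COUNTRY = 'USA'"))
# ... WHERE CUSTOMERS.COUNTRY = 'USA' AND 135000 <= CUSTOMERS.CUSTOMER_ID
```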
Source File Directory Enter the local source file directory. The default location is $PMSourceFileDir.
Source File Name Enter the local source file name. You can also use the session variable, $InputFileName, as
defined in the parameter file. If you use a file list, enter the name of the list.
By default, the Workflow Manager uses the source file name for each partition. Edit the file
name property for partitions 2-n based on how you want the PowerCenter Server to read
the files.
Source File Type Choose Direct to use source files or Indirect to use a file list.
Partition #1: ProductsA.txt
Partition #2: empty.txt
Partition #3: empty.txt
The PowerCenter Server creates one thread to read ProductsA.txt. It reads rows in the file
sequentially. After it reads the file, it passes the data to three partitions in the
transformation pipeline.
Partition #1: ProductsA.txt
Partition #2: empty.txt
Partition #3: ProductsB.txt
The PowerCenter Server creates two threads. It creates one thread to read ProductsA.txt,
and it creates one thread to read ProductsB.txt. It reads the files concurrently, and it reads
rows in the files sequentially.
If you use FTP to access source files, you can choose a different connection for each direct
file. For more information about using FTP to access source files, see “Using FTP” on
page 559.
Partition #1: ProductsA.txt
Partition #2: <blank>
Partition #3: <blank>
The PowerCenter Server creates three threads to read ProductsA.txt concurrently.
Partition #1: ProductsA.txt
Partition #2: <blank>
Partition #3: ProductsB.txt
The PowerCenter Server creates three threads to read ProductsA.txt and ProductsB.txt
concurrently. Two threads read ProductsA.txt and one thread reads ProductsB.txt.
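The file-name fallback for blank entries can be modeled in a few lines. This sketch is illustrative only; how the actual server schedules reader threads is more involved, and the function name is made up.

```python
def reader_threads(file_names):
    """One reader thread per partition. A blank entry falls back to the
    first partition's file name, so several threads may read the same file
    concurrently."""
    default = file_names[0]
    return [(partition, name or default) for partition, name in enumerate(file_names)]

print(reader_threads(["ProductsA.txt", "", ""]))
# [(0, 'ProductsA.txt'), (1, 'ProductsA.txt'), (2, 'ProductsA.txt')]
print(reader_threads(["ProductsA.txt", "", "ProductsB.txt"]))
# [(0, 'ProductsA.txt'), (1, 'ProductsA.txt'), (2, 'ProductsB.txt')]
```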
Figure 13-15. Properties Settings for Relational Targets in the Session Properties
Properties Settings
Transformations View
Attribute Description
Reject File Directory Location for the target reject files. Default is $PMBadFileDir.
Reject File Name Name of reject file. Default is <target name><partition number>.bad. You can also use the
session variable, $BadFileName, as defined in the parameter file.
Database Compatibility
When you configure a session with multiple partitions at the target instance, the PowerCenter
Server creates one connection to the target for each partition. If you configure multiple target
partitions in a session that loads to a database or ODBC target that does not support multiple
concurrent connections to tables, the session fails.
When you create multiple target partitions in a session that loads data to an Informix
database, you must create the target table with row-level locking. If you insert data from a
session with multiple partitions into an Informix target configured for page-level locking, the
session fails and returns the following message:
WRT_8206 Error: The target table has been created with page level locking.
The session can only run with multi partitions when the target table is
created with row level locking.
Sybase IQ does not allow multiple concurrent connections to tables. If you create multiple
target partitions in a session that loads to Sybase IQ, the PowerCenter Server loads all of the
data in one partition.
Figure 13-16. Connections Settings for File Targets in the Session Properties
Connection Type
Transformations View
Table 13-9 describes the connection options for file targets in a mapping:
Attribute Description
Connection Type Choose a local, FTP, external loader, or message queue connection. Select None for a local
connection.
The connection type is the same for all partitions.
Value For an FTP, external loader, or message queue connection, click the button in this field to
select the connection object.
You can specify a different connection object for each partition.
Figure 13-17. Properties Settings for File Targets in the Session Properties
Properties Settings
Table 13-10 describes the file properties for file targets in a mapping:
Attribute Description
Merge Partitioned Files If you select this option, the PowerCenter Server merges the partitioned target files into one
file when the session completes, and then deletes the individual output files. It does not
delete the individual files if it fails to create the merged file.
You cannot merge files if the session uses FTP, an external loader, or an MQSeries
message queue.
Merge File Directory Location for the merge file. Default is $PMTargetFileDir.
Merge File Name Name of the merge file. Default is <target name>.out.
Output File Directory Location for the target file. Default is $PMTargetFileDir.
Attribute Description
Output File Name Name of target file. Default is <target name><partition number>.out. You can also use the
session variable, $OutputFileName, as defined in the parameter file.
Reject File Directory Location for the target reject files. Default is $PMBadFileDir.
Reject File Name Name of reject file. Default is <target name><partition number>.bad.
In this example, sorted data from flat file sources passes through a single source qualifier,
using pass-through partitioning, to a Joiner transformation.
The Joiner transformation may output unsorted data depending on the join type. If you use a
full outer or detail outer join, the PowerCenter Server processes unmatched master rows last,
which can result in unsorted data.
The example in Figure 13-19 shows sorted data passed in a single partition to maintain the
sort order. The first partition contains sorted file data while all other partitions pass empty file
data. At the Joiner transformation, the PowerCenter Server distributes the data among all
partitions while maintaining the order of the sorted data.
The Joiner transformation may output unsorted data depending on the join type. If you use a
full outer or detail outer join, the PowerCenter Server processes unmatched master rows last,
which can result in unsorted data.
The example in Figure 13-21 shows sorted relational data passed in a single partition to
maintain the sort order. The first partition contains sorted relational data while all other
partitions pass empty data. After the PowerCenter Server joins the sorted data, it redistributes
data among multiple partitions.
Figure 13-22. Using Sorter Transformations with Hash Auto-Keys to Maintain Sort Order
Note: For best performance, use sorted flat files or sorted relational data. You may want to
calculate the processing overhead for adding Sorter transformations to your mapping.
For more information about cache partitioning, see “Cache Partitioning” on page 359.
SetCountVariable PowerCenter Server calculates the final count values from all partitions.
SetMaxVariable PowerCenter Server compares the final variable value for each partition and saves the
highest value.
SetMinVariable PowerCenter Server compares the final variable value for each partition and saves the
lowest value.
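The per-function merge of partition-final values can be sketched as follows. This is a hypothetical illustration of the behavior the table describes, assuming count values are totaled; the function name and values are made up.

```python
def finalize_variable(kind, partition_values):
    """Combine each partition's final mapping-variable value into one
    saved value."""
    if kind == "count":
        return sum(partition_values)  # SetCountVariable: combine the counts
    if kind == "max":
        return max(partition_values)  # SetMaxVariable: keep the highest
    if kind == "min":
        return min(partition_values)  # SetMinVariable: keep the lowest
    raise ValueError(kind)

finals = [12, 7, 30]  # final value from each of three partitions
print(finalize_variable("count", finals))  # 49
print(finalize_variable("max", finals))    # 30
print(finalize_variable("min", finals))    # 7
```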
Note: You should use the SetVariable function only once for each mapping variable in a
pipeline. When you create multiple partitions in a pipeline, the PowerCenter Server uses
multiple threads to process that pipeline. If you use this function more than once for the same
variable, the current value of the mapping variable may be nondeterministic.
Transformation Restrictions
Custom transformation By default, you can only specify one partition if the pipeline contains a Custom
transformation.
However, this transformation contains an option on the Properties tab to allow
multiple partitions. If you enable this option, you can specify multiple partitions at this
transformation. Do not select Is Partitionable if the Custom transformation procedure
must process all of the input data together, such as for data cleansing.
External Procedure By default, you can only specify one partition if the pipeline contains an External
transformation Procedure transformation.
This transformation contains an option on the Properties tab to allow multiple
partitions. If this option is enabled, you can specify multiple partitions at this
transformation.
Joiner transformation You can specify only one partition if the pipeline contains the master source for a
Joiner transformation and you do not add a partition point at the Joiner
transformation.
XML target instance You can specify only one partition if the pipeline contains XML targets.
Product Restrictions
PowerCenter Connect for PeopleSoft If the pipeline contains an Application Source Qualifier transformation for
PeopleSoft when it is connected to or associated with a PeopleSoft tree, then
you can specify only one partition and the partition type must be pass-
through.
PowerCenter Connect for IBM MQSeries For MQSeries sources, you can specify multiple partitions only if there is no
associated source qualifier in the pipeline.
You cannot merge output files from sessions with multiple partitions if you
use an MQSeries message queue as the target connection type.
PowerCenter Connect for SAP R/3 If the mapping contains hierarchies or IDOCs, then you can specify only one
partition and the partition type must be pass-through.
If you generate the ABAP program using exec SQL, then you can specify only
one partition and the partition type must be pass-through.
You must use the Informatica default date format to enter dates in key
ranges.
PowerCenter Connect for SAP BW You can specify only one partition when the target load order group contains
an SAP BW target.
Product Restrictions
PowerCenter Connect for Siebel When you use a source filter in a join override, always use the following
syntax for Siebel business components:
SiebelBusinessComponentName.SiebelFieldName
When you create a source filter for a Siebel business component, always use
the following syntax:
SiebelBusinessComponentName.SiebelFieldName
PowerCenter Connect SDK If the mapping contains a multi-group target that receives data from more
than one pipeline, then you can specify only one partition.
If the mapping contains a multi-group target that receives data from multiple
groups, then the partition type must be pass-through.
For more information about these other products, please see the product documentation.
Partitioning Guidelines
This section summarizes the other guidelines that appear throughout this chapter.
Monitoring Workflows
Overview
You can monitor workflows and tasks in the Workflow Monitor. View details about a
workflow or task in Gantt Chart view or Task view. You can run, stop, abort, and resume
workflows from the Workflow Monitor.
The Workflow Monitor displays workflows that have run at least once. The Workflow
Monitor continuously receives information from the PowerCenter Server and Repository
Server. It also fetches information from the repository to display historic information.
The Workflow Monitor consists of the following windows:
♦ Navigator window. Displays monitored repositories, servers, and repository objects.
♦ Output window. Displays messages from the PowerCenter Server and the Repository
Server.
♦ Time window. Displays progress of workflow runs.
♦ Gantt Chart view. Displays details about workflow runs in chronological (Gantt Chart)
format.
♦ Task view. Displays details about workflow runs in a report format, organized by workflow
run.
The Workflow Monitor displays time relative to the time configured on the PowerCenter
Server machine. For example, a folder contains two workflows. One workflow runs on a
PowerCenter Server in your local time zone, and the other runs on a PowerCenter Server in a
time zone two hours later. If you start both workflows at 9 a.m. local time, the Workflow
Monitor displays the start time as 9 a.m. for one workflow and as 11 a.m. for the other
workflow.
Navigator
Window
Gantt
Chart
View
Toggle between Gantt Chart view and Task view by clicking the tabs on the bottom of the
Workflow Monitor.
Note: You can view and hide the Output window in the Workflow Monitor. To toggle back
and forth, choose View-Output.
Using the Workflow Monitor
The Workflow Monitor provides options to view information about workflow runs. After you
open the Workflow Monitor and connect to a repository, you can view dynamic information
about workflow runs by connecting to a PowerCenter Server.
You can customize the Workflow Monitor display by configuring the maximum days or
workflow runs the Workflow Monitor shows. You can also filter tasks and servers in both
Gantt Chart and Task view.
Complete the following steps to monitor workflows:
1. Open the Workflow Monitor.
2. Connect to the repository containing the workflow.
3. Connect to the PowerCenter Server.
4. Select the workflow you want to monitor.
5. Choose from Gantt Chart view or Task view.
Filtering Tasks
You can view all or some workflow tasks. You can filter out tasks to view only tasks you want.
For example, if you want to view only Session tasks, you can hide all other tasks. You can view
all tasks at any time.
To filter tasks:
1. Choose Filters-Tasks.
The Filter Tasks dialog box appears.
2. Clear the tasks you want to hide, and select the tasks you want to view.
3. Click OK.
Note: When you filter a task, the Gantt Chart view displays a red link between tasks to
indicate a filtered task. You can double-click the link to view the tasks you hid.
Filtering Servers
When you connect to a repository, the Workflow Monitor displays a list of registered servers
and deleted servers. When you register multiple servers, you can filter out servers to view only
servers you want to monitor.
When you hide a server, the Workflow Monitor hides the server from the Navigator for both
Gantt Chart and Task view. You can show the server at any time.
You can hide unconnected servers. When you hide a connected server, the Workflow Monitor
asks if you want to disconnect from the server and then filter it. You must disconnect from a
server before hiding it.
2. Select the servers you want to view, and clear the servers you want to filter. Click OK.
If you are connected to a server that you clear, the Workflow Monitor prompts you to
disconnect from the server before filtering.
3. Click Yes to disconnect from the server and filter it.
The Workflow Monitor hides the server from the Navigator.
Click No to remain connected to the server. If you click No, you cannot filter the server.
Tip: You can also filter a server in the Navigator by right-clicking it and selecting Filter Server.
Viewing Properties
You can view properties for the following items:
♦ Tasks. You can view properties such as task name, start time, and status.
♦ Sessions. You can view properties about the Session task and session run, such as mapping
name and number of rows successfully loaded. You can also view load statistics about the
session run. For more information on session details, see “Monitoring Session Details” on
page 434. You can also view performance details about the session run. For more
information, see “Creating and Viewing Performance Details” on page 436.
♦ Workflows. You can view properties such as start time, status, and run type.
♦ Links. When you double-click a link between tasks in Gantt Chart view, you can view
tasks you hide.
♦ Servers. You can view properties such as server version and startup time. You can also view
the sessions and workflows running on the PowerCenter Server.
♦ Folders. You can view properties such as the number of workflow runs displayed in the
Time window.
To view properties for all objects, right-click the object and select Properties. You can right-
click items in the Navigator or the Time window in either Gantt Chart view or Task view.
To view link properties, double-click the link in the Time window of Gantt Chart view.
When you view link properties, you can double-click a task in the Link Properties dialog box
to view the properties for the filtered task.
Table 14-1 describes the options you can configure on the General tab:
Setting Description
Maximum Days Specifies the maximum number of days of task history the Workflow Monitor
displays. The default is 5.
Maximum Workflow Runs per Specifies the maximum number of workflow runs the Workflow Monitor displays for
Folder each folder. The default is 200.
Receive Messages from Select this option to receive messages from the Workflow Manager. The Workflow
Workflow Manager Manager sends messages when you start or schedule a workflow in the Workflow
Manager. The Workflow Monitor displays these messages in the Output window.
Receive Notifications from Select this option to receive notifications from the Repository Server. Notifications
Repository Server from the Repository Server display in the Output window Notifications tab.
Log File Editor Enter the path and file name of the text editor to view and edit workflow and session
logs. You can browse to select an editor. By default, the Workflow Monitor uses
WordPad.
Location The location where the Workflow Monitor stores temporary versions of log files
when you open session or workflow logs from the Workflow Monitor.
Table 14-2 describes the options you can configure on the Gantt Chart Options tab:
Status Color Choose a status and configure the color for the status. The Workflow Monitor displays tasks
with the selected status in the colors you choose. You can choose two colors to display a
gradient.
Recovery Color Configure the color for recovery sessions. The Workflow Monitor uses the status color for
the body of the status bar, and it uses the recovery color as a gradient in the status bar.
Table 14-3 describes the options you can configure on the Advanced tab:

Hide Folders/Workflows That Do Not Contain Any Runs When Filtering By Running/Schedule Runs. Hides folders or workflows under the Workflow Run column in the Time window when you filter running or scheduled tasks.

Highlight the Entire Row When an Item Is Selected. Highlights the entire row in the Time window for selected items. When you disable this option, the Workflow Monitor highlights only the item in the Workflow Run column in the Time window.

Open Latest 20 Runs At a Time. Allows you to open the number of workflow runs of your choice. The number of runs to open is set at 20 by default.

Minimum Number of Workflow Runs (Per Server) the Workflow Monitor Will Accumulate in Memory. Specifies the minimum number of workflow runs per server that the Workflow Monitor holds in memory before it starts releasing older runs from memory.
When you connect to a server, the Workflow Monitor fetches the number of workflow runs specified on the General tab for each folder you connect to. When the number of runs is less than the number specified in this option, the Workflow Monitor stores new runs in memory until it reaches this number. Then it releases the oldest run from memory when it fetches a new run.
When the number of workflow runs the Workflow Monitor initially fetches exceeds the number specified in this option, the Workflow Monitor stores all those runs and then releases the oldest run from memory when it fetches a new run.
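This retention rule behaves like a simple bounded cache. The sketch below is illustrative only (the class and names are hypothetical, not Informatica code); it assumes the release-oldest-on-fetch behavior described above:

```python
from collections import deque

class RunCache:
    """Illustrative sketch: hold workflow runs in memory, releasing the
    oldest run once the configured minimum has been reached."""

    def __init__(self, min_runs_in_memory):
        self.min_runs = min_runs_in_memory
        self.runs = deque()          # oldest run sits at the left

    def fetch(self, run_id):
        # Once the cache holds the configured number of runs, release the
        # oldest run from memory before storing the newly fetched one.
        if len(self.runs) >= self.min_runs:
            self.runs.popleft()
        self.runs.append(run_id)

cache = RunCache(min_runs_in_memory=3)
for run in ["run1", "run2", "run3", "run4"]:
    cache.fetch(run)
# run1 has been released; the three most recent runs remain in memory
```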
♦ Server. Contains buttons to connect to and disconnect from PowerCenter Servers, to ping
the server, and to start and stop workflows, worklets, and tasks.
Figure 14-8 displays the Server toolbar:
♦ View. Contains buttons to refresh the view and to open workflow and session logs.
Figure 14-9 displays the View toolbar:
♦ Filter. Contains buttons to display most recent runs, and to filter tasks, servers, and
folders.
Figure 14-10 displays the Filter toolbar:
1. In the Navigator, select the task, workflow, or worklet you want to stop or abort.
2. Choose Tasks-Stop or Tasks-Abort.
or
Right-click the task, workflow, or worklet in the Navigator and choose Stop or Abort.
3. The Workflow Monitor displays the status of the stop or abort command in the Output
window.
Aborted (workflows, tasks). The PowerCenter Server aborted the workflow or task. The PowerCenter Server kills the DTM process when you abort a workflow or task.

Aborting (workflows, tasks). The PowerCenter Server is in the process of aborting the workflow or task.

Disabled (workflows, tasks). You select the Disabled option in the workflow or task properties. The PowerCenter Server does not run the disabled workflow or task until you clear the Disabled option.

Failed (workflows, tasks). The PowerCenter Server failed the workflow or task due to errors.

Scheduled (workflows). You schedule the workflow to run at a future date. The PowerCenter Server runs the workflow for the duration of the schedule.

Stopped (workflows, tasks). You choose to stop the workflow or task in the Workflow Monitor. The PowerCenter Server stopped the workflow or task.

Stopping (workflows, tasks). The PowerCenter Server is in the process of stopping a workflow or task.

Succeeded (workflows, tasks). The PowerCenter Server successfully completed the workflow or task.

Suspended (workflows, worklets). The PowerCenter Server suspends the workflow because a task fails and no other tasks are running in the workflow. This status is available only when you choose the Suspend on Error option.

Suspending (workflows, worklets). A task fails in the workflow while other tasks are still running. The PowerCenter Server stops executing the failed task and continues executing tasks in other paths. This status is available only when you choose the Suspend on Error option.

Terminated (workflows). The PowerCenter Server terminated unexpectedly while it was running this workflow or task.

Unscheduled (workflows). You removed a workflow from the schedule, or the workflow is scheduled and the PowerCenter Server is about to run the scheduled workflow.

Waiting (workflows, tasks). The PowerCenter Server is waiting for available resources so it can execute the workflow or task. For example, you may set the maximum number of concurrent sessions to 10. If the PowerCenter Server is already executing 10 concurrent sessions, all other workflows and tasks have the Waiting status until the PowerCenter Server is free to execute more tasks.
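The Waiting status in that example follows from a simple slot-counting rule. The sketch below is a conceptual model only (hypothetical names, not Informatica code), assuming sessions wait whenever the configured maximum of concurrent sessions is reached:

```python
class Server:
    """Conceptual sketch: tasks wait when the server is already running
    its configured maximum number of concurrent sessions."""

    def __init__(self, max_concurrent_sessions=10):
        self.max_sessions = max_concurrent_sessions
        self.running = set()

    def start(self, session):
        # No free slot: the session gets the Waiting status until the
        # server is free to execute more tasks.
        if len(self.running) >= self.max_sessions:
            return "Waiting"
        self.running.add(session)
        return "Running"

server = Server(max_concurrent_sessions=2)
statuses = [server.start(s) for s in ("s1", "s2", "s3")]
# statuses == ["Running", "Running", "Waiting"]
```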
Organizing Tasks
In Gantt Chart view, you can organize tasks in the Navigator. You can drag and drop tasks
within a workflow to change the order they appear in the Navigator.
For example, you can drag a Decision task within the Navigator so that it appears in the middle or at the bottom of the list of tasks for that workflow.
1. Open the Gantt Chart view and choose Edit-List Tasks. The List Tasks dialog box
appears.
2. In the List What field, select the type of task status you want to list.
For example, select Failed to view a list of failed tasks and workflows.
3. Click List to view the list.
Tip: Double-click the task name in the List Tasks dialog box to highlight the task in Gantt
Chart view.
Zoom
In the Time window, 30-minute increments are marked with a solid line for each hour and a dotted line for each half hour.
To zoom the Time window in Gantt Chart view, choose View-Zoom and then choose the
desired time increment.
You can also choose the time increment from the Zoom button on the toolbar.
Performing a Search
Use the search tool in the Gantt Chart view to search for tasks, workflows, and worklets in all
repositories you connect to. The Workflow Monitor searches for the word you specify in task
names, workflow names, and worklet names. You can highlight the task in Gantt Chart view
by double-clicking the task after searching.
1. Open the Gantt Chart view and choose Edit-Find. The Find Object dialog box appears.
2. In the Find What field, enter the keyword you want to find.
3. Click Find Now.
The Workflow Monitor displays a list of tasks, workflows, and worklets that match the
keyword.
Tip: Double-click the task name in the Find Object dialog box to highlight the task in
Gantt Chart view.
Task view displays the Navigator window, the workflow run list, the Time window, and the Output window. Click the Filter button in a column to select the workflows you want to display.
When you click the Filter button in either the Start Time or Completion Time column,
you can choose a custom time to filter.
4. Select Custom for either Start Time or Completion Time. The Filter Start Time or
Custom Completion Time dialog box appears.
5. Choose to show tasks before, after, or between the time you specify. Select the date and
time. Click OK.
When you create multiple partitions in a session, the PowerCenter Server provides session
details for each partition. You can use these details to determine if the data is evenly
distributed among the partitions. For example, if the PowerCenter Server moves more rows
through one target partition than another, or if the throughput is not evenly distributed, you
might want to adjust the data range for the partitions.
When you load data to a target with multiple groups, such as an XML target, the
PowerCenter Server provides session details for each group.
Table 14-5 lists the information on the Transformation Statistics tab:

Instance Name. Name of the source qualifier instance or the target instance in the mapping. If you create multiple partitions in the source or target, the Instance Name displays the partition number. If the source or target contains multiple groups, the Instance Name displays the group name.

Applied Rows. For targets, shows the number of rows the PowerCenter Server successfully applied to the target (that is, the target returned no errors). For sources, shows the number of rows the PowerCenter Server successfully read from the source.
Note: The number of applied rows equals the number of affected rows for sources.

Affected Rows. For targets, shows the number of rows affected by the specified operation. For example, you have a table with one column called SALES_ID and five rows containing the values 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The writer affects three rows, even though there was only one update request. Or, if you mark rows for update where SALES_ID is 4, the writer affects 0 rows. For sources, shows the number of rows the PowerCenter Server successfully read from the source.
Note: The number of applied rows equals the number of affected rows for sources.

Rejected Rows. Number of rows the PowerCenter Server dropped when reading from the source, or the number of rows the PowerCenter Server rejected when writing to the target.

Throughput (Rows/Sec). Rate at which the PowerCenter Server read rows from the source or wrote rows to the target, in rows per second.

Last Error Message. The most recent error message written to the session log. If you view details after the session completes, this field displays the last error message.

Last Error Code. The error message code of the most recent error message written to the session log. If you view details after the session completes, this field displays the last error code.

Start Time. The time the PowerCenter Server started to read from the source or write to the target. The Workflow Monitor displays time relative to the PowerCenter Server.

End Time. The time the PowerCenter Server finished reading from the source or writing to the target. The Workflow Monitor displays time relative to the PowerCenter Server.
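The SALES_ID example above is easy to reproduce with any relational database. The sketch below uses SQLite (not the PowerCenter writer) purely to show how a single update request can affect several rows:

```python
import sqlite3

# Reproduce the manual's SALES_ID example: one column, five rows with the
# values 1, 2, 3, 2, and 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (SALES_ID INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (2,), (2,)])

# One update request where SALES_ID is 2 affects three rows.
affected = conn.execute(
    "UPDATE t SET SALES_ID = 99 WHERE SALES_ID = 2").rowcount

# An update where SALES_ID is 4 matches nothing, so it affects 0 rows.
affected_none = conn.execute(
    "UPDATE t SET SALES_ID = 99 WHERE SALES_ID = 4").rowcount
# affected == 3, affected_none == 0
```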
Enabling Monitoring
To view performance details, you must enable monitoring in the session properties before
running the session.
To view performance details while the session runs:
1. While the session is running, right-click the session in the Workflow Monitor and choose
Properties.
2. Click the Performance tab in the Properties dialog box.
3. Click OK.
Note: When you increase the number of partitions, the number of aggregate or rank input
rows may be different from the number of output rows from the previous transformation.
Table 14-6 lists the counters that may appear in the Session Performance Details dialog box or
in the performance details file:
If you have multiple source qualifiers and targets, evaluate them as a whole. For source
qualifiers and targets, a high value is considered 80-100 percent. Low is considered 0-20
percent.
Chapter 15
Overview
You can register and run multiple PowerCenter Servers against a local or global repository.
When you register multiple PowerCenter Servers to the same repository, you can distribute
the workload across the servers to increase performance.
You have the following options to run workflows and sessions using multiple servers:
♦ Use a server grid to run workflows. You can use a server grid to automate the distribution
of sessions. A server grid is a server object that distributes sessions in a workflow to servers
based on server availability. The grid maintains connections to multiple servers in the grid.
For more information about using server grids, see “Working with Server Grids” on
page 446.
♦ Change the assigned server for a workflow. When you configure a workflow, you assign a
server to run that workflow. Each time the scheduled workflow runs, it runs on the
assigned server. You can change the assigned server for a workflow in the workflow
properties.
♦ Change the assigned server for a session. When you configure a session, by default it runs
on the server assigned to the workflow. You can change the assigned server for a session in
the session properties.
♦ Start a workflow on a non-assigned server. By default, each workflow runs on its assigned
PowerCenter Server. You can run a workflow on a non-assigned server if the workflow is
not currently running. Use the Start Workflow button on the Standard toolbar, and choose
a PowerCenter Server.
You can use the Workflow Monitor to monitor workflows running on multiple servers. For
server grids, the Workflow Monitor shows the individual status of each server in a grid. You
can identify the server grid that a server is assigned to by right-clicking the server in the
Workflow Monitor and selecting Properties. For more information about using the Workflow
Monitor, see “Monitoring Workflows” on page 401.
Tip: You might want to place the most CPU-intensive sessions on the more powerful servers.
Distributing Sessions
In a server grid, the master server starts the workflow and then distributes sessions to worker
servers. The master server is the server that starts a workflow. A worker server is a server that
runs sessions assigned to it by a master server. By default, each PowerCenter Server in a server
grid is both a master server and a worker server. This means that a server in a grid can
distribute sessions to and receive sessions from every server in the grid. The master server
distributes sessions that are ready to run to available worker servers in a round-robin fashion
based on server availability. The starting point for the session assignment is random.
If a worker server is running the maximum number of concurrent sessions, the master server
assigns another worker server to run the session. If all worker servers are running the
maximum number of concurrent sessions, the master server places the session in its own ready
queue.
For information about configuring the maximum number of concurrent sessions, see
“Installing and Configuring the PowerCenter Server on Windows” and “Installing and
Configuring the PowerCenter Server on UNIX” in the Installation and Configuration Guide.
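The distribution rule above can be sketched in a few lines. This is an illustrative model only, with hypothetical names, assuming the round-robin assignment, random starting point, and ready queue described in this section:

```python
import random

def distribute(sessions, workers, max_concurrent):
    """Assign each session to a worker in round-robin order, starting at a
    random worker. A session goes to the master's own ready queue only
    when every worker is at its concurrent-session limit."""
    load = {w: 0 for w in workers}
    assignment, ready_queue = {}, []
    i = random.randrange(len(workers))       # random starting point
    for s in sessions:
        for _ in range(len(workers)):        # try each worker once
            w = workers[i % len(workers)]
            i += 1
            if load[w] < max_concurrent:
                load[w] += 1
                assignment[s] = w
                break
        else:
            ready_queue.append(s)            # all workers are full
    return assignment, ready_queue

assignment, queued = distribute(["s1", "s2", "s3"], ["B", "C"],
                                max_concurrent=1)
# s1 and s2 land on different workers; s3 waits in the master's ready queue
```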
Figure 15-1 shows how a master server distributes the sessions in Workflow1 among the
servers in a grid. The server grid contains Server A, Server B, and Server C. Server A is the
master server, and Server B and Server C are worker servers.
Worker server shuts down unexpectedly, or you shut it down before it receives a session. The worker server is not available to the master servers in the server grid. Master servers do not assign sessions to the unavailable worker server and proceed with the round-robin distribution of sessions.

Worker server shuts down unexpectedly while running a session. The master server marks the status of the session as terminated. The worker server stops running all sessions. The session settings you specify determine whether the workflow fails. For more information about the Fail parent if this task fails option, Fail parent if this task does not run option, or Disable this task option, see “Configuring Tasks” on page 135.

You shut down a worker server while it is running a session. The shutdown mode you specify determines how the worker server handles sessions when it shuts down. When you shut down the worker server in complete mode, it continues to run the sessions it started until they complete, but does not accept new sessions from master servers. For more information about shutdown modes, see “pmcmd Reference” on page 594.

Worker server loses its network connection and cannot connect to the server grid. The worker server continues to run the session and writes its status to the session log. However, the master server marks the status of the session as terminated. You must resume the workflow or resume from the failed task to continue running the workflow and update the session status. If you do not need the session status of the previous run, you can restart the workflow or restart the workflow from a task to start a new workflow run. For more information, see “Working with Tasks and Workflows” on page 416.

Master server shuts down unexpectedly. The workflow fails. You must restart the workflow on another server or wait for the master server to become available.

You shut down the master server while it is running a workflow or session. The shutdown mode you specify determines how the master server handles workflows and sessions when it shuts down. When you shut down the master server in complete mode, it continues to run the workflows and sessions it started until they complete, but does not accept tasks from other master servers. For more information about shutdown modes, see “pmcmd Reference” on page 594.

Master server loses its network connection and cannot connect to the server grid. The master server continues to run workflows as a standalone PowerCenter Server. If a worker server is assigned to a session, the session fails because the master server cannot distribute the session to the worker server. The session settings you specify determine whether the workflow fails. For more information about the Fail parent if this task fails option, Fail parent if this task does not run option, or Disable this task option, see “Configuring Tasks” on page 135.
Table 15-3 shows a configuration where the session properties override the server grid properties. Because you assigned the session to Server B, the session runs on Server B even though you configured Server B not to accept tasks from the grid.
6. Repeat steps 4 and 5 until you have chosen all the servers for the grid.
8. Click Close.
Log Files
Overview
The PowerCenter Server can create log files for each workflow it runs. These files contain
information about the tasks the PowerCenter Server performs, plus statistics about the
workflow and all sessions in the workflow. If the writer or target database rejects data during a
session run, the PowerCenter Server creates a file that contains the rejected rows.
The PowerCenter Server can create the following types of log files:
♦ Workflow log. Contains information about the workflow run such as workflow name,
tasks executed, and workflow errors. By default, the PowerCenter Server writes this
information to the server log or Windows Event Log, depending on how you configure the
PowerCenter Server. If you wish to create a workflow log, enter a workflow file name in the
workflow properties. For more information, see “Workflow Logs” on page 457.
♦ Session log. Contains information about the tasks that the PowerCenter Server performs
during a session, plus load summary and transformation statistics. By default, the
PowerCenter Server creates one session log for each session it runs. If a workflow contains
multiple sessions, the PowerCenter Server creates a separate session log for each session in
the workflow. For more information, see “Session Logs” on page 463.
♦ Reject file. Contains rows rejected by the writer or target file during a session run. If the
writer or target does not reject any data during a session, the PowerCenter Server does not
generate a reject file for that session. For more information, see “Reject Files” on page 476.
By default, the PowerCenter Server saves each type of log file in its own directory. The
PowerCenter Server represents these directories using server variables.
Table 16-1 shows the default directory (server variable) for each type of log file:

♦ Workflow log: $PMWorkflowLogDir
♦ Session log: $PMSessionLogDir
♦ Reject file: $PMBadFileDir
You can change the default directories at the server level by editing the server connection in
the Workflow Manager. You can also override these values for individual workflows or sessions
by updating the workflow or session properties.
Parameter File Name. Designates the name and directory for the parameter file. Use the parameter file to define workflow parameters. For details on parameter files, see “Parameter Files” on page 511.

Workflow Log File Name. Optionally enter a file name, or a file name and directory. If you leave this field blank, the PowerCenter Server does not create a workflow log. Instead, the PowerCenter Server writes workflow log messages to the server log or Windows Event Log, depending on how you configure the PowerCenter Server.
If you fill in this field, the PowerCenter Server appends the information in this field to that entered in the Workflow Log File Directory field. For example, if you have "C:\workflow_logs\" in the Workflow Log File Directory field and enter "logname.txt" in the Workflow Log File Name field, the PowerCenter Server writes logname.txt to the C:\workflow_logs\ directory.
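In other words, the server simply appends the name field to the directory field. A minimal sketch of that concatenation (the exact joining rule is an assumption, illustrated with the example values above):

```python
def workflow_log_path(directory, name):
    # The name field is appended to the directory field; if the name
    # already holds a full path, the directory field should be left blank.
    return directory + name if directory else name

path = workflow_log_path("C:\\workflow_logs\\", "logname.txt")
# path == r"C:\workflow_logs\logname.txt"
```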
Workflow Log File Directory. Designates a location for the workflow log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMWorkflowLogDir. If you enter a full directory and file name in the Workflow Log File Name field, clear this field.

Save Workflow Log By. If you select Save Workflow Log by Timestamp, the PowerCenter Server saves all workflow logs, appending a timestamp to each log. If you select Save Workflow Log by Runs, the PowerCenter Server saves a designated number of workflow logs. Configure the number of workflow logs in the Save Workflow Log for These Runs option. For details on these options, see “Archiving Workflow Logs” on page 459. You can also use the $PMWorkflowLogCount server variable to save the configured number of workflow logs for the PowerCenter Server.

Save Workflow Log for These Runs. The number of historical workflow logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent workflow log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent workflow log plus historical logs 0 to 4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent workflow log.
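The Save-by-Runs arithmetic above (N historical logs plus the most recent one, N + 1 in total) can be sketched as a simple rotation. The function below is illustrative only and does not mirror Informatica's actual file naming:

```python
def logs_to_keep(existing_logs, new_log, historical_runs):
    """Keep the configured number of historical logs plus the newest log:
    historical_runs = 5 therefore keeps 6 logs in total."""
    logs = existing_logs + [new_log]
    return logs[-(historical_runs + 1):]

kept = logs_to_keep([f"wf.log.{i}" for i in range(10)], "wf.log",
                    historical_runs=5)
# len(kept) == 6, and the newest log "wf.log" is always retained
```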
To use the Workflow Monitor to view the most recent workflow log:
1. In the Navigator window, connect to the server on which the workflow runs.
2. Open the folder that contains the workflow.
3. Right-click the workflow and choose Get Workflow Log.
If you save workflow logs by timestamp, you can also use the Workflow Monitor to view past
workflow logs. To do this, right click the workflow in the Gantt chart view and choose Get
Workflow Log.
For more information about the Workflow Monitor, see “Using the Workflow Monitor” on
page 404.
BLKR. Messages related to the reader process, including application, relational, and flat file sources.

CMN. Messages related to databases, memory allocation, Lookup and Joiner transformations, and internal errors.

SF. Messages related to the server framework, used by the Load Manager and Repository Server.
Thread Identification
The thread identification consists of the thread type and a series of numbers separated by
underscores. The numbers following a thread name indicate the following information:
♦ Target load order group number
♦ Partition point number
♦ Partition number
Note: The PowerCenter Server writes an asterisk (*) as the partition point number for writer
threads.
The PowerCenter Server prints the thread identification before the log file code and the message text in the session log. The following example shows a reader thread from target load order group one, partition point one, and partition one:
READER_1_1_1> DBG_21438 Reader: Source is [p152636], user [jennie]
MASTER> CMN_1688 Allocated [12000000] bytes from process memory for [DTM
Buffer Pool].
READER_1_1_1> BLKR_16019 Read [1] rows, read [0] error rows for source
table [EMP_SRC] instance name [EMP_SRC]
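A thread identification such as READER_1_1_1 or WRITER_1_*_1 can be split mechanically into its parts. The helper below is a hypothetical illustration of the numbering scheme described above, not an Informatica utility:

```python
def parse_thread_id(thread_id):
    """Split e.g. 'READER_1_1_1' into thread type, target load order
    group, partition point, and partition. Writer threads use '*' as the
    partition point number."""
    thread_type, group, point, partition = thread_id.rsplit("_", 3)
    return {
        "type": thread_type,
        "target_load_order_group": int(group),
        "partition_point": None if point == "*" else int(point),
        "partition": int(partition),
    }

reader = parse_thread_id("READER_1_1_1")
writer = parse_thread_id("WRITER_1_*_1")
# reader["partition_point"] == 1; writer["partition_point"] is None
```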
Load Summary
The session log includes a load summary that reports the number of rows inserted, updated,
deleted, and rejected for each target as of the last commit point. The PowerCenter Server
reports the load summary for each session by default. However, you can set tracing level to
Verbose Initialization or Verbose Data to report the load summary for each transformation.
The following sample is an excerpt from a load summary:
*****START LOAD SESSION*****
Target tables:
     Emp_target
============
LOAD SUMMARY
============
WRITER_1_*_1> WRT_8043 *****END LOAD SESSION*****
The PowerCenter Server reports statistics for each of the following operations performed on
the target:
♦ Inserted. Shows the number of rows the PowerCenter Server marked for insert into the
target. The number of affected rows cannot be larger than requested for this operation.
♦ Updated. Shows the number of rows the PowerCenter Server marked for update in the
target. The number of affected rows can be different from the number of requested rows.
For example, you have a table with one column called SALES_ID and five rows containing
the values: 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The writer
affects three rows, even though there was only one update request. Or, if you mark rows for
update where SALES_ID is 4, the writer affects 0 rows.
♦ Deleted. Shows the number of rows the PowerCenter Server marked to remove from the
target. The number of affected rows can be different from the number of requested rows.
♦ Rejected. Shows the number of rows the PowerCenter Server rejected during the writing
process. These rows cannot be applied to the target. For the Rejected rows category, the
number of affected and applied rows is always zero since these rows are not written to the
target.
The load summary provides the following statistics:
♦ Requested rows. Shows the number of rows the writer actually received for the specified
operation.
♦ Applied rows. Shows the number of rows the writer successfully applied to the target (that
is, the target returned no errors).
♦ Affected rows. Shows the number of rows affected by the specified operation. Depending
on the operation, the number of affected rows can be different from the number of
requested rows. For example, you have a table with one column called SALES_ID and five
rows containing the values: 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID
is 2. The writer affects three rows, even though there was only one update request. Or, if
you mark rows for update where SALES_ID is 4, the writer affects 0 rows.
♦ Rejected rows. Shows the number of rows the writer could not apply to the target. For
example, the target database rejects a row if the PowerCenter Server attempts to insert
NULL into a not-null field. The PowerCenter Server writes all rejected rows to the session
reject file, or to the row error log, depending on how you configure the session.
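The relationship between requested, applied, affected, and rejected rows can be sketched with a small bookkeeping function. This is an illustrative model with hypothetical names, assuming each input row carries its operation, the number of target rows it affected, and whether the writer rejected it:

```python
def load_summary(rows):
    """Tally per-operation statistics. rows is a list of
    (operation, affected_count, rejected) tuples. A rejected row counts
    as requested but contributes no applied or affected rows."""
    stats = {}
    for op, affected, rejected in rows:
        s = stats.setdefault(op, {"requested": 0, "applied": 0,
                                  "affected": 0, "rejected": 0})
        s["requested"] += 1
        if rejected:
            s["rejected"] += 1
        else:
            s["applied"] += 1
            s["affected"] += affected  # one update request may affect 3 rows
    return stats

summary = load_summary([
    ("update", 3, False),   # SALES_ID = 2: one request affects 3 rows
    ("update", 0, False),   # SALES_ID = 4: no row matches, affects 0 rows
    ("insert", 0, True),    # e.g. NULL inserted into a not-null column
])
```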
Session Log File Name. By default, the PowerCenter Server uses the session name for the log file name: s_mapping name.log. For a debug session, it uses DebugSession_mapping name.log. Optionally enter a file name, a file name and directory, or use the $PMSessionLogFile session parameter. The PowerCenter Server appends the information in this field to that entered in the Session Log File Directory field. For example, if you have “C:\session_logs\” in the Session Log File Directory field and enter “logname.txt” in the Session Log File Name field, the PowerCenter Server writes logname.txt to the C:\session_logs\ directory.
You can also use the $PMSessionLogFile session parameter to represent the name of the session log or the name and location of the session log. For details on session parameters, see “Session Parameters” on page 495.

Session Log File Directory. Location of the log file. Enter a valid directory local to the PowerCenter Server. By default, the PowerCenter Server creates session logs in the directory configured for the $PMSessionLogDir server variable.

Save Session Log By. If you select Save Session Log by Timestamp, the PowerCenter Server saves all session logs, appending a timestamp to each log. If you select Save Session Log by Runs, the PowerCenter Server saves a designated number of session logs. Configure the number of session logs in the Save Session Log for These Runs option. You can also use the $PMSessionLogCount server variable to save the configured number of session logs for the PowerCenter Server.

Save Session Log for These Runs. The number of historical session logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent session log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent session log plus historical logs 0 to 4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent session log.
None. The PowerCenter Server uses the tracing level set in the mapping.

Terse. The PowerCenter Server logs initialization information as well as error messages and notification of rejected data.

Normal. The PowerCenter Server logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.

Verbose Initialization. In addition to normal tracing, the PowerCenter Server logs additional initialization details, names of index and data files used, and detailed transformation statistics.

Verbose Data. In addition to verbose initialization tracing, the PowerCenter Server logs each row that passes into the mapping. Also notes where the PowerCenter Server truncates string data to fit the precision of a column, and provides detailed transformation statistics. When you configure the tracing level to Verbose Data, the PowerCenter Server writes row data for all rows in a block when it processes a transformation.
You can also enter tracing levels for individual transformations in the mapping. When you
enter a tracing level in the session properties, you override tracing levels configured for
transformations in the mapping.
2. Select a tracing level from the Override Tracing list. Table 16-4 on page 473 describes the
session log tracing levels.
3. Click OK to save the session.
To use the Workflow Monitor to view the most recent session log:
1. In the Navigator window, connect to the server on which the workflow runs.
2. Open the folder that contains the workflow.
3. Open the workflow that contains the session whose log you wish to view.
4. Right-click the session and choose Get Session Log.
If you save session logs by timestamp, you can also use the Workflow Monitor to view past
session logs. To do this, right-click the session in the Gantt chart view and choose Get Session
Log.
For more information about the Workflow Monitor, see “Using the Workflow Monitor” on
page 404.
When you run a session that contains multiple partitions, the PowerCenter Server creates a
separate reject file for each partition.
0,D,1922,D,Page,D,Ian,D,415-541-5145,D
0,D,1928,D,De Souza,D,Leo,D,415-541-5145,D
0,D,2001,D,S. MacDonald,D,Ira,D,415-541-5145,D
Row Indicators
The first column in the reject file is the row indicator. The number listed as the row indicator
tells the writer what to do with the row of data.
Table 16-5 describes the row indicators in a reject file:

0. Insert. Rejected by the writer or target database.
1. Update. Rejected by the writer or target database.
2. Delete. Rejected by the writer or target database.
3. Reject. Rejected by the writer.

If a row indicator is 3, the writer rejected the row because an update strategy expression marked it for reject.
If a row indicator is 0, 1, or 2, either the writer or the target database rejected the row. To
narrow down the reason why rows marked 0, 1, or 2 were rejected, review the column
indicators and consult the session log.
Column Indicators
After the row indicator is a column indicator, followed by the first column of data, and
another column indicator. Column indicators appear after every column of data and define
the type of the data preceding it.
D. Valid data. Good data. The writer passes it to the target database. The target accepts it unless a database error occurs, such as finding a duplicate key.

O. Overflow. Numeric data exceeded the specified precision or scale for the column. Bad data, if you configured the mapping target to reject overflow or truncated data.

N. Null. The column contains a null value. Good data. The writer passes it to the target, which rejects it if the target database does not accept null values.

T. Truncated. String data exceeded a specified precision for the column, so the PowerCenter Server truncated it. Bad data, if you configured the mapping target to reject overflow or truncated data.
Null columns appear in the reject file with commas marking their column. An example of a
null column surrounded by good data appears as follows:
5,D,,N,5,D
Because either the writer or target database can reject a row, and because they can reject the
row for a number of reasons, you need to evaluate the row carefully and consult the session
log to determine the cause for reject.
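To make the layout concrete, here is a small Python sketch of how a reject-file line can be split back into its row indicator and (value, indicator) pairs. It is illustrative only: it assumes the default comma delimiter and that the data itself contains no commas, which a robust parser could not assume.

```python
ROW_INDICATORS = {"0": "Insert", "1": "Update", "2": "Delete", "3": "Reject"}
COLUMN_INDICATORS = {"D": "valid", "O": "overflow", "N": "null", "T": "truncated"}

def parse_reject_line(line):
    """Split one reject-file line into its row indicator and column values.

    Naive sketch: assumes comma-delimited fields with no embedded commas.
    The layout follows the sample rows above: the row indicator (with its
    own column indicator) comes first, then each data value followed by
    its column indicator.
    """
    fields = line.rstrip("\n").split(",")
    row_indicator = ROW_INDICATORS.get(fields[0], "unknown")
    # fields[1] is the indicator for the row-indicator column itself;
    # the remaining fields alternate between data value and indicator.
    columns = [(fields[i], COLUMN_INDICATORS.get(fields[i + 1], "?"))
               for i in range(2, len(fields) - 1, 2)]
    return row_indicator, columns
```

Applied to the sample line 0,D,1922,D,Page,D,Ian,D,415-541-5145,D, this yields a row marked for insert with four valid columns.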
Overview
When you configure a session, you can choose to log row errors in a central location. When a
row error occurs, the PowerCenter Server logs error information that allows you to determine
the cause and source of the error. The PowerCenter Server logs information such as source
name, row ID, current row data, transformation, timestamp, error code, error message,
repository name, folder name, session name, and mapping information.
You can log row errors into relational tables or flat files. When you enable error logging, the
PowerCenter Server creates the error tables or an error log file the first time it runs the session.
Error logs are cumulative. If the error logs exist, the PowerCenter Server appends error data to
the existing error logs.
You can choose to log source row data. Source row data includes row data, source row ID, and
source row type from the source qualifier where an error occurs. The PowerCenter Server
cannot identify the row in the source qualifier that contains an error if the error occurs after a
non pass-through partition point with more than one partition or one of the following active
sources:
♦ Aggregator
♦ Custom, configured as an active transformation
♦ Joiner
♦ Normalizer (pipeline)
♦ Rank
♦ Sorter
By default, the PowerCenter Server logs transformation errors in the session log and reject
rows in the reject file. When you enable error logging, the PowerCenter Server does not
generate a reject file or write dropped rows to the session log. Without a reject file, the
PowerCenter Server does not log Transaction Control transformation rollback or commit
errors. If you want to write rows to the session log in addition to the row error log, you can
enable verbose data tracing.
Note: When you log row errors, session performance may decrease because the PowerCenter
Server processes one row at a time instead of a block of rows at once.
PMERR_DATA
When the PowerCenter Server encounters a row error, it inserts an entry into the
PMERR_DATA table. This table stores data and metadata about a transformation row error
and its corresponding source row.
Table 17-1 describes the structure of the PMERR_DATA table:

Column Name   Datatype   Description
WORKLET_RUN_ID Integer A unique identifier for the worklet. If a session is not part of
a worklet, this value is “0”.
TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.
TRANS_ROW_ID Integer Specifies the row ID generated by the last active source.
TRANS_ROW_DATA Long Varchar Delimited string containing all column data, including the
column indicator. Column indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column
indicator is a colon ( : ). The delimiter between the columns
is a pipe ( | ). You can override the column delimiter in the
error handling settings.
This value can span multiple rows. When the data exceeds
2000 bytes, the PowerCenter Server creates a new row.
The line number for each row error entry is stored in the
LINE_NO column.
SOURCE_ROW_ID Integer Value that the source qualifier assigns to each row it
reads. If the PowerCenter Server cannot identify the row,
the value is -1.
SOURCE_ROW_TYPE Integer The row indicator that tells whether the row was marked
for insert, update, delete, or reject.
0 - Insert
1 - Update
2 - Delete
3 - Reject
SOURCE_ROW_DATA Long Varchar Delimited string containing all column data, including the
column indicator. Column indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column
indicator is a colon ( : ). The delimiter between the columns
is a pipe ( | ). You can override the column delimiter in the
error handling settings.
This value can span multiple rows. When the data exceeds
2000 bytes, the PowerCenter Server creates a new row.
The line number for each row error entry is stored in the
LINE_NO column.
LINE_NO Integer Specifies the line number for each row error entry in
SOURCE_ROW_DATA and TRANS_ROW_DATA that
spans multiple rows.
Informatica recommends using the fields in bold to join tables.
PMERR_MSG
When the PowerCenter Server encounters a row error, it inserts an entry into the
PMERR_MSG table. This table stores metadata about the error and the error message.
Table 17-2 describes the structure of the PMERR_MSG table:

Column Name   Datatype   Description
WORKLET_RUN_ID Integer A unique identifier for the worklet. If a session is not part
of a worklet, this value is “0”.
TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.
TRANS_ROW_ID Integer Specifies the row ID generated by the last active source.
ERROR_SEQ_NUM Integer Counter for the number of errors per row in each
transformation group. If a session has multiple partitions,
the PowerCenter Server maintains this counter for each
partition.
For example, if a transformation generates three errors in
partition 1 and two errors in partition 2,
ERROR_SEQ_NUM generates the values 1, 2, and 3 for
partition 1, and values 1 and 2 for partition 2.
ERROR_MSG Long Varchar Error message, which can span multiple rows. When the
data exceeds 2000 bytes, the PowerCenter Server
creates a new row. The line number for each row error
entry is stored in the LINE_NO column.
ERROR_TYPE Integer The type of error that occurred. The PowerCenter Server
uses the following values:
1 - Reader error
2 - Writer error
3 - Transformation error
LINE_NO Integer Specifies the line number for each row error entry in
ERROR_MSG that spans multiple rows.
Informatica recommends using the fields in bold to join tables.
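The join advice above relies on bold formatting that does not survive in plain text. A plausible reading, sketched below with Python's sqlite3 module against cut-down versions of the tables, is to join PMERR_DATA and PMERR_MSG on the columns that identify the failing row (here WORKLET_RUN_ID, TRANS_GROUP, and TRANS_ROW_ID; the real tables carry additional identifying columns, so treat this key choice as an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Cut-down versions of PMERR_DATA and PMERR_MSG with only the columns
# used in this sketch; the real tables contain more columns.
cur.execute("""CREATE TABLE PMERR_DATA (
    WORKLET_RUN_ID INTEGER, TRANS_GROUP TEXT, TRANS_ROW_ID INTEGER,
    TRANS_ROW_DATA TEXT, SOURCE_ROW_TYPE INTEGER)""")
cur.execute("""CREATE TABLE PMERR_MSG (
    WORKLET_RUN_ID INTEGER, TRANS_GROUP TEXT, TRANS_ROW_ID INTEGER,
    ERROR_SEQ_NUM INTEGER, ERROR_MSG TEXT, ERROR_TYPE INTEGER)""")
cur.execute("INSERT INTO PMERR_DATA VALUES (0, 'input', 1, 'D:1221|N:', 0)")
cur.execute("INSERT INTO PMERR_MSG VALUES (0, 'input', 1, 1, "
            "'NULL detected on input', 3)")

# Pair each logged bad row with its error message and error type.
rows = cur.execute("""
    SELECT d.TRANS_ROW_DATA, m.ERROR_MSG, m.ERROR_TYPE
    FROM PMERR_DATA d
    JOIN PMERR_MSG m
      ON d.WORKLET_RUN_ID = m.WORKLET_RUN_ID
     AND d.TRANS_GROUP = m.TRANS_GROUP
     AND d.TRANS_ROW_ID = m.TRANS_ROW_ID
""").fetchall()
```

With the sample inserts above, the join returns the row data together with its transformation error message (ERROR_TYPE 3).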
PMERR_SESS
When you choose relational database error logging, the PowerCenter Server inserts entries
into the PMERR_SESS table. This table stores metadata about the session where an error
occurred.
Table 17-3 describes the structure of the PMERR_SESS table:

Column Name   Datatype   Description
WORKLET_RUN_ID Integer A unique identifier for the worklet. If a session is not part of a
worklet, this value is “0”.
SESS_START_UTC_TIME Integer The Coordinated Universal Time (also known as Greenwich
Mean Time) when the session starts.
FOLDER_NAME Varchar Specifies the folder where the mapping and session are located.
WORKFLOW_NAME Varchar Specifies the workflow that runs the session being logged.
TASK_INST_PATH Varchar Fully qualified session name that can span multiple rows. The
PowerCenter Server creates a new line for the session name. The
PowerCenter Server also creates a new line for each worklet in the
qualified session name. For example, you have a session named
WL1.WL2.S1. Each component of the name appears on a new line:
WL1
WL2
S1
The PowerCenter Server writes the line number in the LINE_NO
column.
LINE_NO Integer Specifies the line number for each row error entry in
TASK_INST_PATH that spans multiple rows.
Informatica recommends using the fields in bold to join tables.
PMERR_TRANS
When the PowerCenter Server encounters a transformation error, it inserts an entry into the
PMERR_TRANS table. This table stores metadata, such as the name and datatype of the
source and transformation ports.
Table 17-4 describes the structure of the PMERR_TRANS table:

Column Name   Datatype   Description
TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.
TRANS_ATTR Varchar Lists the port names and datatypes of the input or
output group where the error occurred. Port name and
datatype pairs are separated by commas, for example:
portname1:datatype, portname2:datatype.
SOURCE_NAME Varchar Name of the source qualifier. N/A appears when a row
error occurs downstream of an active source that is not
a source qualifier or a non pass-through partition point
with more than one partition. For a list of active sources
that can affect row error logging, see “Overview” on
page 482.
LINE_NO Integer Specifies the line number for each row error entry in
TRANS_ATTR and SOURCE_ATTR that spans multiple
rows.
Informatica recommends using the fields in bold to join tables.
An error log file contains the following sections:
♦ Session header. Contains session run information. Information in the session header is
similar to the information stored in the PMERR_SESS table.
♦ Column header. Contains data column names.
♦ Column data. Contains actual row data and error message information.
The following sample error log file contains a session header, column header, and column
data:
**********************************************************************
Repository: CustomerInfo
Folder: Row_Error_Logging
Workflow: wf_basic_REL_errors_AGG_case
Session: s_m_basic_REL_errors_AGG_case
Mapping: m_basic_REL_errors_AGG_case
**********************************************************************
agg_REL_basic||N/A||Input||1||1||1||08/03/2004
16:57:03||1067126223||11019||Port [CUST_ID_NULL]: Default value is:
ERROR(<<Expression Error>> [ERROR]: [AGG] CUST_ID - NULL detected on
input.\n... nl:ERROR(s:'[AGG] CUST_ID - NULL detected on
input.')).||3||D:1221|N:|N:|N:|D:Kauai Dive Shoppe|D:4-976 Sugarloaf
Hwy|D:Kapaa Kauai|D:HI|D:94766|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||1||0||D:1221|D:Kauai
Dive Shoppe|D:4-976 Sugarloaf Hwy|D:Kapaa Kauai|D:HI|D:94766
agg_REL_basic||N/A||Input||1||4||1||08/03/2004
16:57:03||1067126223||11019||Port [CITY_IN]: Default value is:
ERROR(<<Expression Error>> [ERROR]: [AGG] Null detected for City_IN.\n...
nl:ERROR(s:'[AGG] Null detected for
City_IN.')).||3||D:1354|N:|N:|D:1354|T:Cayman Divers World|D:PO Box
541|N:|D:Gr|N:|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||4||0||D:1354|D:Cayman
Divers World Unlim|D:PO Box 541|N:|D:Gr|N:
agg_REL_basic||N/A||Input||1||5||1||08/03/2004
16:57:03||1067126223||11131||Transformation [agg_REL_basic] had an error
evaluating variable column [Var_Divide_by_Price]. Error message is
[<<Expression Error>> [/]: divisor is zero\n... f:(f:2 / f:(f:1 -
f:TO_FLOAT(i:1)))].||3||D:1356|N:|N:|D:1356|T:Tom Sawyer Diving C|T:632-1
Third Frydenh|D:Christiansted|D:St|D:00820|D:[AGG] DEFAULT SID
VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||5||0||D:1356|D:Tom
Sawyer Diving Centre|D:632-1 Third Frydenho|D:Christiansted|D:St|D:00820
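In the row data fields above (for example, D:1221|N:|D:Kauai Dive Shoppe), each column appears as an indicator:value pair. The following is a small illustrative parser, assuming the default delimiters and that neither delimiter occurs inside the data itself:

```python
def parse_column_data(s, col_delim="|", ind_delim=":"):
    """Split an error-log row data string into (indicator, value) pairs.

    Illustrative only: assumes the default delimiters (| between columns,
    : between indicator and value) never appear inside the data.
    """
    pairs = []
    for col in s.split(col_delim):
        indicator, _, value = col.partition(ind_delim)
        pairs.append((indicator, value))
    return pairs
```

For D:1221|N:|D:Kauai Dive Shoppe, this yields a valid 1221, a null column, and a valid Kauai Dive Shoppe.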
The following fields appear in the error log file:

Transformation Name of the transformation in the mapping where an error occurred.
Transformation Mapplet Name Name of the mapplet that contains the transformation. N/A appears when this
information is not available.
Transformation Group Name of the input or output group where an error occurred. Defaults to either “input”
or “output” if the transformation does not have a group.
Partition Index Specifies the partition number of the transformation partition where an error
occurred.
Error Sequence Counter for the number of errors per row in each transformation group. If a session
has multiple partitions, the PowerCenter Server maintains this counter for each
partition.
For example, if a transformation generates three errors in partition 1 and two errors
in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1,
and values 1 and 2 for partition 2.
Error Timestamp Timestamp of the PowerCenter Server when the error occurred.
Error UTC Time The Coordinated Universal Time, also known as Greenwich Mean Time, when the
error occurred.
Error Code The error code that corresponds to the error message.
Error Type The type of error that occurred. The PowerCenter Server uses the following values:
1 - Reader error
2 - Writer error
3 - Transformation error
Transformation Data Delimited string containing all column data, including the column indicator. Column
indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column indicator is a colon ( : ). The
delimiter between the columns is a pipe ( | ). You can override the column delimiter
in the error handling settings.
The PowerCenter Server converts all column data to text string in the error file. For
binary data, the PowerCenter Server uses only the column indicator.
Source Name Name of the source qualifier. N/A appears when a row error occurs downstream of
an active source that is not a source qualifier or a non pass-through partition point
with more than one partition. For a list of active sources that can affect row error
logging, see “Overview” on page 482.
Source Row ID Value that the source qualifier assigns to each row it reads. If the PowerCenter
Server cannot identify the row, the value is -1.
Source Row Type The row indicator that tells whether the row was marked for insert, update, delete, or
reject.
0 - Insert
1 - Update
2 - Delete
3 - Reject
Source Data Delimited string containing all column data, including the column indicator. Column
indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column indicator is a colon ( : ). The
delimiter between the columns is a pipe ( | ). You can override the column delimiter
in the error handling settings.
The PowerCenter Server converts all column data to text string in the error table or
error file. For binary data, the PowerCenter Server uses only the column indicator.
Error Log Options
The following error log options are available. Each option is required or optional as noted:

Error Log Type (Required). Specifies the type of error log to create. You can specify relational database, flat file, or no log. By default, the PowerCenter Server does not create an error log.
Error Log DB Connection (Required for relational database logging). Specifies the database connection for a relational log.
Error Log Table Name Prefix (Optional). Specifies the table name prefix for relational logs. The PowerCenter Server appends 11 characters to the prefix name. Oracle and Sybase have a 30-character limit for table names. If a table name exceeds 30 characters, the session fails.
Error Log File Directory (Required for flat file logging). Specifies the directory where errors are logged. By default, the error log file directory is $PMBadFilesDir\.
Error Log File Name (Required for flat file logging). Specifies the error log file name. The character limit for the error log file name is 255. By default, the error log file name is PMError.log.
Log Row Data (Optional). Specifies whether or not to log transformation row data. By default, the PowerCenter Server logs transformation row data. If you disable this property, N/A or -1 appears in transformation row data fields.
Log Source Row Data (Optional). Specifies whether or not to log source row data. If you choose not to log source row data, or if source row data is unavailable, the PowerCenter Server writes an indicator such as N/A or -1, depending on the column datatype. If you do not need to capture source row data, consider disabling this option to increase PowerCenter Server performance.
Data Column Delimiter (Required). Delimiter for string type source row data and transformation group row data. By default, the PowerCenter Server uses a pipe ( | ) delimiter. Verify that you do not use the same delimiter for the row data as for the error logging columns. If you use the same delimiter, you may find it difficult to read the error log file.
4. Click OK.
Session Parameters
Overview
Session parameters, like mapping parameters, represent values you might want to change
between sessions, such as a database connection or source file. Use session parameters in the
session properties, and then define the parameters in a parameter file. You can specify the
parameter file for the session to use in the session properties. You can also specify it when you
use pmcmd to start the session.
The Workflow Manager provides one built-in session parameter, $PMSessionLogFile. With
$PMSessionLogFile, you can change the name of the session log generated for the session.
The Workflow Manager also allows you to create user-defined session parameters.
Table 18-1 describes required naming conventions for the session parameters you can define:

Parameter Type        Naming Convention
Database Connection   $DBConnectionName
Source File           $InputFileName
Target File           $OutputFileName
Lookup File           $LookupFileName
Reject File           $BadFileName
Session Log File      $PMSessionLogFile
Use session parameters to make sessions more flexible. For example, you have the same type of
transactional data written to two different databases, and you use the database connections
TransDB1 and TransDB2 to connect to the databases. You want to use the same mapping for
both tables. Instead of creating two sessions for the same mapping, you can create a database
connection parameter, $DBConnectionSource, and use it as the source database connection
for the session. When you create a parameter file for the session, you set
$DBConnectionSource to TransDB1 and run the session. After the session completes, you set
$DBConnectionSource to TransDB2 and run the session again.
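As a sketch of this pattern, the first run's parameter file might look as follows (the folder and session names here are hypothetical):

```
[Production.s_TransactionLoad]
$DBConnectionSource=TransDB1
```

Before the second run, you would edit the file (or point the session at a second parameter file) so that it reads $DBConnectionSource=TransDB2.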
You might use several session parameters together to make session management easier. For
example, you might use source file and database connection parameters to configure a session
to read data from different source files and write the results to different target databases. You
can then use reject file parameters to write the session reject files to the target machine. You
can use the session log parameter, $PMSessionLogFile, to write to different session logs in the
target machine, as well.
When you use session parameters, you must define the parameters in the parameter file.
Session parameters do not have default values. When the PowerCenter Server cannot find a
value for a session parameter, it fails to initialize the session.
For example, in a session, you leave Session Log File Directory set to its default value, the
$PMSessionLogDir server variable. For Session Log File Name, you enter the session
parameter $PMSessionLogFile. In the parameter file, you set $PMSessionLogFile to
“TestRun.txt”. When you registered the PowerCenter Server, you defined $PMSessionLogDir
as C:/Program Files/Informatica/PowerCenter Server/SessLogs. When the PowerCenter Server
runs the session, it creates the session log as TestRun.txt in that directory.
To use the session log parameter, complete the following steps:
1. In the session properties, click the General Options settings of the Properties tab.
2. Enter $PMSessionLogFile in the Session Log File field.
3. If you want $PMSessionLogFile to represent both the session log name and directory,
clear the Session Log File Directory field.
4. Enter a parameter file and directory in the Parameter File Name field.
5. Click OK.
Before you run the session, create the parameter file in the specified directory and define
$PMSessionLogFile. For details, see “Parameter Files” on page 511.
1. In the session properties, click the Mapping tab (Transformation view) and click
Connections settings for the sources or targets node.
On this tab, you can enter session parameters for the source file, target file, lookup file, and reject file directories and filenames, and then define each parameter in the parameter file.
Parameter Files
Overview
You can use a parameter file to define the values for parameters and variables used in a
workflow, worklet, or session. You can create a parameter file using a text editor such as
WordPad or Notepad. You list the parameters or variables and their values in the parameter
file. Parameter files can contain the following types of parameters and variables:
♦ Workflow variables
♦ Worklet variables
♦ Session parameters
♦ Mapping parameters and variables
When you use parameters or variables in a workflow, worklet, or session, the PowerCenter
Server checks the parameter file to determine the start value of the parameter or variable. You
can use a parameter file to initialize workflow variables, worklet variables, mapping
parameters, and mapping variables. If you do not define start values for these parameters and
variables, the PowerCenter Server checks for the start value of the parameter or variable in
other places. For more information, see “Using Workflow Variables” on page 103 and
“Mapping Parameters and Variables” in the Designer Guide.
You can place parameter files on the PowerCenter Server machine or on a local machine. Use
a local parameter file if you do not have access to parameter files on the PowerCenter Server
machine. When you use a local parameter file, pmcmd passes variables and values in the file to
the PowerCenter Server. Local parameter files are used with the startworkflow pmcmd
command. For more information, see “pmcmd Reference” on page 594.
You must define session parameters in a parameter file. Since session parameters do not have
default values, when the PowerCenter Server cannot locate the value of a session parameter in
the parameter file, it fails to initialize the session.
You can include parameter or variable information for more than one workflow, worklet, or
session in a single parameter file by creating separate sections for each object within the
parameter file.
You can also create multiple parameter files for a single workflow, worklet, or session and
change the file that these tasks use as needed. To specify the parameter file the PowerCenter
Server uses with a workflow, worklet, or session, you can do either of the following:
♦ Enter the parameter file name and directory in the workflow, worklet, or session
properties.
♦ Start the workflow, worklet, or session using pmcmd and enter the parameter filename and
directory in the command line. For details, see “Using pmcmd” on page 581.
If you enter a parameter file name and directory in both the workflow, worklet, or session
properties and in the pmcmd command line, the PowerCenter Server uses the information
you enter in the pmcmd command line.
♦ Worklet variables:
[folder name.WF:workflow name.WT:worklet name]
♦ Session parameters, mapping parameters, and mapping variables:
[folder name.session name]
or
[session name]
Below each heading, you define parameter and variable values as follows:
parameter name=value
parameter2 name=value
variable name=value
variable2 name=value
For example, you have a session, s_MonthlyCalculations, in the Production folder. The
session uses a string mapping parameter, $$State, that you want to set to “MA”, and a
datetime mapping variable, $$Time. $$Time already has an initial value of “9/30/2000
00:00:00” saved in the repository, but you want to override it with “10/1/2000
00:00:00”. The session also uses session parameters to connect to source files and target
databases, and to write the session log to the appropriate session log file.
Table 19-1 shows the parameters and variables that you define in the parameter file:

Parameter and Variable Type              Parameter and Variable Name   Desired Definition
String mapping parameter                 $$State                       MA
Datetime mapping variable                $$Time                        10/1/2000 00:00:00
Source file (session parameter)          $InputFile1                   sales.txt
Database connection (session parameter)  $DBConnection_target          sales
Session log file (session parameter)     $PMSessionLogFile             D:/session logs/firstrun.txt
The parameter file for the session includes the folder and session name, as well as each
parameter and variable:
[Production.s_MonthlyCalculations]
$$State=MA
$$Time=10/1/2000 00:00:00
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt
The next time you run the session, you might edit the parameter file to change the state to
MD and delete the $$Time variable. This allows the PowerCenter Server to use the value for
the variable that was set in the previous session run.
parameter_name=value
variable_name=value
mapplet_name.parameter_name=value
[folder2_name.session_name]
parameter_name=value
variable_name=value
mapplet_name.parameter_name=value
♦ Specify headings in any order. You can place headings in any order in the parameter file.
However, if you define the same parameter or variable more than once in the file, the
PowerCenter Server assigns the parameter or variable value using the first instance of the
parameter or variable.
♦ Specify parameters and variables in any order. Below each heading, you can specify the
parameters and variables in any order.
♦ When defining parameter values, do not use unnecessary line breaks or spaces. The
PowerCenter Server might interpret additional spaces as part of the value.
♦ List all necessary mapping parameters and variables. Values entered for mapping
parameters and variables become the start value for parameters and variables in a mapping.
Mapping parameter and variable names are not case sensitive.
♦ List all session parameters. Session parameters do not have default values. An undefined
session parameter can cause the session to fail. Session parameter names are not case-
sensitive.
♦ Use correct date formats for datetime values. When entering datetime values, use the
following date formats:
− MM/DD/RR
− MM/DD/RR HH24:MI:SS
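Reading RR as a two-digit year and HH24 as a 24-hour clock (an interpretation of the notation, not something the guide spells out), these formats correspond to the following Python strptime patterns:

```python
from datetime import datetime

# MM/DD/RR with an optional HH24:MI:SS time portion, reading RR as a
# two-digit year (%y) and HH24 as a 24-hour hour field (%H). This mapping
# is an assumption for illustration.
ts = datetime.strptime("10/01/00 00:00:00", "%m/%d/%y %H:%M:%S")
d = datetime.strptime("10/01/00", "%m/%d/%y")
```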
mapplet2_name.variable_name=value
$$platform=unix
[HET_TGTS.WF:wf_TGTS_ASC_ORDR.ST:s_TGTS_ASC_ORDR]
$$platform=unix
$DBConnection_ora=qasrvrk2_hp817
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
$$DT_WL_lvl_1=02/01/2000 00:00:00
$$Double_WL_lvl_1=2.2
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1.WT:NWL_PARAM_Lvl_2]
$$DT_WL_lvl_2=03/01/2000 00:00:00
$$Int_WL_lvl_2=3
$$String_WL_lvl_2=ccccc
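The structure of a file like the one above — bracketed headings, each followed by name=value lines — can be sketched with a toy parser. This is illustrative only, not the PowerCenter Server's own parser; it assumes everything after the first equals sign is the value:

```python
def parse_parameter_file(text):
    """Group parameter and variable assignments under their [heading] lines.

    Toy sketch, not the server's parser. Everything after the first "="
    is taken as the value, and the first definition of a name wins,
    matching the guide's note that the server uses the first instance
    of a duplicated parameter or variable.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            name, _, value = line.partition("=")
            sections[current].setdefault(name, value)
    return sections

sample = """[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
$$DT_WL_lvl_1=02/01/2000 00:00:00
$$Double_WL_lvl_1=2.2"""
parsed = parse_parameter_file(sample)
```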
1. Select Workflows-Edit.
2. Click the Properties tab.
3. Enter the parameter directory and name in the Parameter Filename field.
You can enter either a direct path or a server variable directory. Use the appropriate
delimiter for the PowerCenter Server operating system.
4. Click OK.
1. Click the Properties tab and open the General Options settings.
2. Enter the parameter directory and name in the Parameter Filename field.
3. Click OK.
I am trying to use a source file parameter to specify a source file and location, but the
PowerCenter Server cannot find the source file.
Make sure to clear the source file directory in the session properties. The PowerCenter Server
concatenates the source file directory with the source file name to locate the source file.
Also, make sure to enter a directory local to the PowerCenter Server and to use the
appropriate delimiter for the operating system.
I am trying to run a workflow with a parameter file and one of the sessions keeps failing.
The session might contain a parameter that is not listed in the parameter file. The
PowerCenter Server uses the parameter file to start all sessions in the workflow. Check the
session properties, then verify that all session parameters are defined correctly in the
parameter file.
Use pmcmd and multiple parameter files for sessions with regular cycles.
Some sessions change parameter values on a regular cycle and reuse the same values each
time. If you run a session against both the sales and marketing databases once a week, you
might want to create a separate parameter file for each regular session run. Then, instead of
changing the parameter file in the session properties each time you run the session, use pmcmd
to specify the parameter file to use when you start the session.
Chapter 20: External Loading
Overview
You can configure a session to use DB2, Oracle, Sybase IQ, and Teradata external loaders to
load session target files into the respective databases. External loaders can increase session
performance because these databases can load data directly from files faster than they can
run SQL commands to insert the same data into the database.
To use an external loader for a session, you must perform the following tasks:
1. Create an external loader connection in the Workflow Manager and configure the
external loader attributes. For details on creating external loader connections, see
“Creating an External Loader Connection” on page 551.
2. Configure the session to write to flat file instead of to a relational database. For more
information, see “Configuring a Session to Write to a File” on page 553.
3. Choose an external loader connection for each target file in the session properties. For
more information, see “Selecting an External Loader Connection” on page 555.
When you run a session that uses an external loader, the PowerCenter Server creates a control
file and a target flat file. The control file contains information about the target flat file such as
data format and loading instructions for the external loader. The control file has an extension
of .ctl. You can view the control file and the target flat file in the target file directory (default:
$PMTargetFileDir).
The PowerCenter Server waits for all external loading to complete before it performs post-
session commands, runs external procedures, and sends post-session email.
Before you run external loaders, consider the following issues:
♦ Disable constraints. Normally, you disable constraints built into the tables receiving the
data before performing the load. Consult your database documentation for instructions on
how to disable constraints.
♦ Performance issues. To preserve high performance, you can increase commit intervals and
turn off database logging. However, to perform database recovery on failed sessions, you
must have database logging turned on.
♦ Code page requirements. DB2, Oracle, Sybase IQ, and Teradata database servers must run
in the same code page as the target flat file code page. The external loaders start in the
target flat file code page. The PowerCenter Server creates the control and target flat files
using the target flat file code page. If you are using a code page other than 7-bit ASCII for
the target flat file, run the PowerCenter Server in Unicode data movement mode.
The PowerCenter Server can use multiple external loaders within one session. For example, if
the mapping contains two targets, you can create a session that uses different connection
types: one uses an Oracle external loader connection and the other uses a Sybase IQ external
loader connection.
The DB2 EE external loader connection has the following attributes:

Opmode (default: Insert). The DB2 external loader operation mode. Choose one of the following operation modes:
- Insert
- Replace
- Restart
- Terminate
For more information about DB2 operation modes, see “Setting DB2 External Loader Operation Modes” on page 528.
External Loader Executable (default: db2load). The name of the DB2 EE external loader executable file.
DB2 Server Location (default: Remote). The location of the DB2 EE database server relative to the PowerCenter Server. Select Local if the DB2 EE database server resides on the PowerCenter Server machine. Select Remote if the DB2 EE server resides on another machine.
Is Staged (default: Disabled). The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see “Loading Data Using Named Pipes” on page 526 or “Staging Data to Flat Files” on page 526.
Recoverable (default: Enabled). Sets tablespaces in backup pending state if forward recovery is enabled. If you disable forward recovery, the DB2 tablespace is not set to backup pending state. If the DB2 tablespace is in backup pending state, you must fully back up the database before you perform any other operation on the tablespace.
Any other return code indicates that the load operation failed. The PowerCenter Server writes
the following error message to the session log:
WRT_8047 Error: External loader process <external loader name> exited with
error <return code>.
Table 20-3 describes the return codes for the DB2 EE external loader:
Code Description
2 The external loader could not open the external loader log file.
3 The external loader could not access the control file because the control file is locked by another process.
Attribute      Default Value      Description
Opmode Insert The DB2 external loader operation mode. Choose one of the following
operation modes:
- Insert
- Replace
- Restart
- Terminate
For more information about DB2 operation modes, see “Setting DB2 External
Loader Operation Modes” on page 528.
External Loader Executable db2atld The name of the DB2 EEE external loader executable file.
Split File Location n/a The location of the split files. The external loader creates split files if you
configure SPLIT_ONLY loading mode.
Output Nodes n/a The database partitions on which the load operation is to be performed.
Attribute      Default Value      Description
Split Nodes n/a The database partitions that determine how to split the data. If you do not
specify this attribute, the external loader automatically determines an optimal
splitting method.
Mode Split and load The loading mode the external loader uses to load the data. Choose one of the following loading modes:
- Split and load
- Split only
- Load only
- Analyze
Force No Forces the external loader operation to continue even if it determines at startup
time that some target partitions or tablespaces are offline.
Status Interval 100 Number of megabytes of data the external loader loads before writing a
progress message to the external loader log. You can specify a value between
1 and 4,000 MB.
Ports 6000-6063 The range of TCP ports the external loader uses to create sockets for internal
communications with the DB2 server.
Check Level Nocheck Specifies whether the external loader should check for record truncation during
input or output.
Map File Input n/a The name of the file that specifies the partitioning map. If you want to use a
customized partitioning map, you must specify this attribute. You can generate
a customized partitioning map when you run the external loader in Analyze
loading mode.
Map File Output n/a The name of the partitioning map when you run the external loader in Analyze
loading mode. You must specify this attribute if you want to run the external
loader in Analyze loading mode.
Trace 0 The number of rows the external loader traces when you need to review a
dump of the data conversion process and output of hashing values.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging
area before loading to the database. Otherwise, the data is loaded to the
database using a named pipe. For more information, see “Loading Data Using
Named Pipes” on page 526 or “Staging Data to Flat Files” on page 526.
Date Format mm/dd/yyyy The date format. The date format in the Connection Object definition must match the date format you define in the target definition. DB2 supports the following date formats:
- mm/dd/yyyy
- yyyy-mm-dd
- dd.mm.yyyy
Error Limit 1 Number of errors to allow before the external loader stops the load
operation.
Load Mode Append The loading mode the external loader uses to load data. Choose one of the following loading modes:
- Append
- Insert
- Replace
- Truncate
Load Method Use Conventional Path The method the external loader uses to load data. Choose one of the following load methods:
- Use Conventional Path
- Use Direct Path (Recoverable)
- Use Direct Path (Unrecoverable)
Enable Parallel Load Enable Parallel Load Determines whether the Oracle external loader loads data in parallel to a partitioned Oracle target table. Choose either Enable Parallel Load or Do Not Enable Parallel Load.
You can create multiple partitions in a session if you use a loader
configured to enable parallel load. Sessions with multiple partitions fail if
you use a loader configured not to enable parallel load. For more
information, see “Partitioning Sessions with External Loaders” on
page 526.
Rows Per Commit 10000 For Conventional Path load method, this attribute specifies the number
of rows in the bind array for load operations. For Direct Path load
methods, this attribute specifies the number of rows the external loader
reads from the target flat file before it saves the data to the database.
External Loader Executable sqlload The name of the external loader executable file.
Log File Name n/a The path and name of the external loader log file.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file
staging area before loading to the database. Otherwise, the data is
loaded to the database using a named pipe. For more information, see
“Loading Data Using Named Pipes” on page 526 or “Staging Data to Flat
Files” on page 526.
Reject File
The Oracle external loader creates a reject file for data rejected by the database. The reject file
has an extension of .ldrreject. The loader saves the reject file in the target files directory
(default location: $PMTargetFileDir).
♦ When you create a Sybase IQ external loader connection, the Workflow Manager sets the
name of the external loader executable file to dbisql by default. If you use an executable file
with a different name, for example, dbisqlc, you must update the External Loader
Executable field. If the external loader executable file directory is not in the system path,
you must enter the file path and file name in this field.
Table 20-6 describes the attributes for Sybase IQ external loader connections:
Attribute      Default Value      Description
Block Factor 10000 The number of records per block in the target Sybase table. The external
loader applies the Block Factor attribute to load operations for fixed-
width flat file targets only.
Block Size 50000 The size of blocks used in Sybase database operations. The external
loader applies the Block Size attribute to load operations for delimited
flat file targets only.
Attribute      Default Value      Description
Notify Interval 1000 The number of rows the Sybase IQ external loader loads before it writes
a status message to the external loader log.
Server Datafile Directory n/a The location of the flat file target. You must specify this attribute relative
to the database server installation directory. Enter the target file directory
path using the syntax for the machine hosting the database server
installation. For example, if the PowerCenter Server is on a Windows
machine and the Sybase IQ Server is on a UNIX machine, use UNIX
syntax.
External Loader Executable dbisql The name of the Sybase IQ external loader executable.
Is Staged Enabled The method of loading data. Select Is Staged to load data to a flat file
staging area before loading to the database. Otherwise, the data is
loaded to the database using a named pipe. For more information, see
“Loading Data Using Named Pipes” on page 526 or “Staging Data to Flat
Files” on page 526.
In the Control File Editor dialog box, click Generate to create the default control file. The
Workflow Manager creates the default control file based on the session and loader properties.
Edit the generated control file, and click OK to save your changes.
Note that if you change a target or loader connection setting after you edit the control file,
the control file does not include those changes. If you want to include those changes, you
must generate the control file again and edit it.
Note: The Workflow Manager does not validate the control file syntax. Teradata verifies the
control file syntax when you run a session. If the control file is invalid, the session fails.
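As a hedged illustration only (the database, table, logon, field, and file names below are hypothetical, and a file generated by the Workflow Manager from your session and loader properties will differ in detail), a Teradata MultiLoad control file typically has this overall shape:

```
.LOGTABLE mydb.ML_customers;
.LOGON tdpid/pc_user,pc_password;
.BEGIN IMPORT MLOAD
    TABLES mydb.customers
    ERRLIMIT 1
    CHECKPOINT 10000
    SESSIONS 1
    TENACITY 10000
    SLEEP 6;
.LAYOUT data_layout;
    .FIELD CUST_ID   * CHAR(10);
    .FIELD CUST_NAME * CHAR(30);
.DML LABEL insert_dml;
    INSERT INTO mydb.customers (CUST_ID, CUST_NAME)
        VALUES (:CUST_ID, :CUST_NAME);
.IMPORT INFILE customers.out
    LAYOUT data_layout
    APPLY insert_dml;
.END MLOAD;
.LOGOFF;
```

Because Teradata, not the Workflow Manager, validates this syntax, keep edits to the generated file minimal and rerun the session to verify them.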
Attribute      Default Value      Description
Date Format n/a The date format. The date format in the Connection Object definition must match
the date format you define in the target definition. The PowerCenter Server
supports the following date formats:
- dd/mm/yyyy
- mm/dd/yyyy
- yyyy/dd/mm
- yyyy/mm/dd
Error Limit 0 The total number of rejected records that MultiLoad can write to the MultiLoad error
tables. Uniqueness violations do not count as rejected records.
An error limit of 0 means that there is no limit on the number of rejected rows.
Checkpoint 10,000 The interval between checkpoints. You can set the interval to the following values:
- 60 or more: MultiLoad performs a checkpoint operation after it processes each
multiple of that number of records.
- 1–59: MultiLoad performs a checkpoint operation at the specified interval, in
minutes.
- 0: MultiLoad does not perform any checkpoint operations during the import task.
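The three checkpoint regimes above can be summarized in a short sketch (Python, illustrative only; this logic is a restatement of the rules, not part of PowerCenter or MultiLoad):

```python
def multiload_checkpoint_meaning(value):
    """Interpret the MultiLoad Checkpoint attribute value."""
    if value == 0:
        # No checkpoint operations during the import task.
        return "no checkpoints"
    if 1 <= value <= 59:
        # Interpreted as an interval in minutes.
        return f"checkpoint every {value} minutes"
    # 60 or more: interpreted as a record-count multiple.
    return f"checkpoint after each multiple of {value} records"
```

For example, the default of 10,000 falls in the record-count regime, not the minutes regime.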
Tenacity 10,000 Specifies how long, in hours, MultiLoad tries to log onto the required sessions. If a
logon fails, MultiLoad delays for the number of minutes specified in the Sleep
attribute, and then retries the logon. MultiLoad keeps trying until the logon
succeeds or the number of hours specified in the Tenacity attribute elapses.
Attribute      Default Value      Description
Load Mode Upsert The mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data
Driven.
When you select Data Driven loading, the PowerCenter Server follows instructions
coded in an Update Strategy or Custom transformations within the mapping to
determine how to flag rows for insert, delete, or update. The PowerCenter Server
writes a column in the target file or named pipe to indicate the update strategy. The
control file uses these values to determine how to load data to the target. The
PowerCenter Server uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete
Drop Error Tables Enabled Specifies whether to drop the MultiLoad error tables before beginning the next
session. Select this option to drop the tables, or clear it to keep them.
External Loader Executable mload The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
Max Sessions 1 The maximum number of MultiLoad sessions per MultiLoad job. Max Sessions must
be between 1 and 32,767.
Running multiple MultiLoad sessions causes the client and database to use more
resources. Therefore, setting this value to a small number may improve
performance.
Sleep 6 The number of minutes MultiLoad waits before retrying a logon. MultiLoad tries until
the logon succeeds or the number of hours specified in the Tenacity attribute
elapses.
Sleep must be greater than 0. If you specify 0, MultiLoad issues an error message
and uses the default value, 6 minutes.
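The interaction between the Tenacity and Sleep attributes can be sketched as the following retry loop (Python, illustrative only; `try_logon` is a hypothetical stand-in for the MultiLoad logon attempt):

```python
import time

def logon_with_retry(try_logon, tenacity_hours, sleep_minutes):
    # Retry the logon every `sleep_minutes` until it succeeds or
    # `tenacity_hours` have elapsed, mirroring the behavior described above.
    deadline = time.monotonic() + tenacity_hours * 3600
    while True:
        if try_logon():
            return True          # logon succeeded
        if time.monotonic() >= deadline:
            return False         # Tenacity exhausted; give up
        time.sleep(sleep_minutes * 60)
```

The same Tenacity/Sleep pattern applies, with different defaults, to TPump, FastLoad, and Warehouse Builder.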
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using
a named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Work Table Database n/a The work table database name. You can use this attribute to override the default work table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Log Table Database n/a The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Table 20-8. Teradata MultiLoad External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table 1 n/a The table name for the first error table. You can use this attribute to override
the default error table name. If you do not specify an error table name, the
PowerCenter Server uses ET_<target_table_name>.
Error Table 2 n/a The table name for the second error table. You can use this attribute to
override the default error table name. If you do not specify an error table name,
the PowerCenter Server uses UV_<target_table_name>.
Work Table n/a The work table name. You can use this attribute to override the default work
table name. If you do not specify a work table name, the PowerCenter Server
uses WT_<target_table_name>.
Log Table n/a The log table name. You can use this attribute to override the default log table
name. If you do not specify a log table name, the PowerCenter Server uses
ML_<target_table_name>.
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
For more information about these attributes, consult your Teradata documentation.
Attribute      Default Value      Description
Error Limit 0 Limits the number of rows rejected for errors. When the error limit is exceeded,
TPump rolls back the transaction that causes the last error. An error limit of 0
causes TPump to stop processing after any error.
Checkpoint 15 The number of minutes between checkpoints. You must set the checkpoint to a
value between 0 and 60.
Tenacity 4 Specifies how long, in hours, TPump tries to log onto the required sessions. If a
logon fails, TPump delays for the number of minutes specified in the Sleep
attribute, and then retries the logon. TPump keeps trying until the logon succeeds
or the number of hours specified in the Tenacity attribute elapses.
To disable Tenacity, set the value to 0.
Load Mode Upsert The mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data
Driven.
When you select Data Driven loading, the PowerCenter Server follows instructions
coded in an Update Strategy or Custom transformations within the session mapping
to determine how to flag rows for insert, delete, or update. The PowerCenter Server
writes a column in the target file or named pipe to indicate the update strategy. The
control file uses these values to determine how to load data to the database. The
PowerCenter Server uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete
Drop Error Tables Enabled Specifies whether to drop the TPump error tables before beginning the next
session. Select this option to drop the tables, or clear it to keep them.
External Loader Executable tpump The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
Max Sessions 1 The maximum number of TPump sessions per TPump job. Each partition in a
session starts its own TPump job. Running multiple TPump sessions causes the
client and database to use more resources. Therefore, setting this value to a small
number may improve performance.
Sleep 6 The number of minutes TPump waits before retrying a logon. TPump tries until the
logon succeeds or the number of hours specified in the Tenacity attribute elapses.
Packing Factor 20 The number of rows that each session buffer holds. Packing improves network/
channel efficiency by reducing the number of sends and receives between the
target flat file and the Teradata database.
Statement Rate 0 The initial maximum rate, per minute, at which the TPump executable sends
statements to the Teradata database. If you set this attribute to 0, the statement
rate is unspecified.
Attribute      Default Value      Description
Serialize Disabled Determines whether or not operations on a given key combination (row) occur
serially.
You may want to check this option if the TPump job contains multiple changes to
one row. Sessions that contain multiple partitions with the same key range but
different filter conditions may cause multiple changes to a single row. In this case,
you may want to enable Serialize to prevent locking conflicts in the Teradata
database, especially if you set the Pack attribute to a value greater than 1.
If you select this option, the PowerCenter Server uses the primary key specified in
the target table as the Key column. If no primary key exists in the target table, you
must either clear this checkbox or indicate the Key column in the data layout
section of the control file.
Robust Disabled When you do not select Robust, TPump uses simple restart logic. In this
case, restarts cause TPump to begin at the last checkpoint. TPump reloads any
data that was loaded after the checkpoint. This method does not have the extra
overhead of the additional database writes in the robust logic.
No Monitor Enabled When selected, this attribute prevents TPump from checking for statement rate changes from, or updating status information for, the TPump monitor application.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using
a named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Log Table Database n/a The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Table 20-10 shows the attributes that you configure when you edit a session and override the
Teradata TPump external loader connection object:
Table 20-10. Teradata TPump External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table n/a The error table name. You can use this attribute to override the default error
table name. If you do not specify an error table name, the PowerCenter Server
uses ET_<target_table_name><partition_number>.
Log Table n/a The log table name. You can use this attribute to override the default log table
name. If you do not specify a log table name, the PowerCenter Server uses
LT_<target_table_name><partition_number>.
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
Attribute      Default Value      Description
Error Limit 1,000,000 The maximum number of rows that FastLoad rejects before it stops loading data to
the database table.
Checkpoint 0 The number of rows transmitted to the Teradata database between checkpoints. If
processing stops while a FastLoad job is running, you can restart the job at the
most recent checkpoint.
If you enter 0, FastLoad does not perform checkpoint operations.
Tenacity 4 The number of hours FastLoad tries to log on to the required FastLoad sessions
when the maximum number of load jobs are already running on the Teradata
database. When FastLoad tries to log on for a new session, and the Teradata
database indicates that the maximum number of load sessions is already running,
FastLoad logs off all new sessions that were logged on, delays for the number of
minutes specified in the Sleep attribute, and then retries the logon. FastLoad keeps
trying until it logs on for the required number of sessions or exceeds the number of
hours specified in the Tenacity attribute.
Attribute      Default Value      Description
Drop Error Tables Enabled Specifies whether to drop the FastLoad error tables before beginning the next
session. FastLoad will not run if non-empty error tables exist from a prior job.
Select this option to drop the tables, or clear it to keep them.
External Loader Executable fastload The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
Max Sessions 1 The maximum number of FastLoad sessions per FastLoad job. Max Sessions must
be between 1 and the total number of access module processes (AMPs) on your
system.
Sleep 6 The number of minutes FastLoad pauses before retrying a logon. FastLoad tries
until the logon succeeds or the number of hours specified in the Tenacity attribute
elapses.
Truncate Target Table Disabled Specifies whether to truncate the target database table before beginning the FastLoad job. FastLoad cannot load data to non-empty tables.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using
a named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Table 20-12 shows the attributes that you configure when you edit a session and override the
Teradata FastLoad external loader connection object:
Table 20-12. Teradata FastLoad External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table 1 n/a The table name for the first error table. You can use this attribute to override
the default error table name. If you do not specify an error table name, the
PowerCenter Server uses ET_<target_table_name>.
Error Table 2 n/a The table name for the second error table. You can use this attribute to
override the default error table name. If you do not specify an error table
name, the PowerCenter Server uses UV_<target_table_name>.
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
For more information about these attributes, consult your Teradata documentation.
Operator Protocol
Load Uses FastLoad protocol. Load attributes are described in Table 20-14. For more
information about how FastLoad works, see “Teradata FastLoad External Loader
Attributes” on page 545.
Update Uses MultiLoad protocol. Update attributes are described in Table 20-14. For more
information about how MultiLoad works, see “Teradata MultiLoad External Loader
Attributes” on page 540.
Stream Uses TPump protocol. Stream attributes are described in Table 20-14. For more
information about how TPump works, see “Teradata TPump External Loader
Attributes” on page 542.
Each Teradata Warehouse Builder operator has associated attributes. Not all attributes
available for FastLoad, MultiLoad, and TPump external loaders are available for Teradata
Warehouse Builder.
Table 20-14 shows the attributes that you configure for Teradata Warehouse Builder:
Attribute      Default Value      Description
Operator Update The Warehouse Builder operator used to load the data. Choose Load, Update, or
Stream.
Max instances 4 The maximum number of parallel instances for the defined operator.
Attribute      Default Value      Description
Error Limit 0 The maximum number of rows that Warehouse Builder rejects before it stops loading
data to the database table.
Checkpoint 0 The number of rows transmitted to the Teradata database between checkpoints. If
processing stops while a Warehouse Builder job is running, you can restart the job at
the most recent checkpoint.
If you enter 0, Warehouse Builder does not perform checkpoint operations.
Tenacity 4 The number of hours Warehouse Builder tries to log on to the Warehouse Builder
sessions when the maximum number of load jobs are already running on the
Teradata database. When Warehouse Builder tries to log on for a new session, and
the Teradata database indicates that the maximum number of load sessions is
already running, Warehouse Builder logs off all new sessions that were logged on,
delays for the number of minutes specified in the Sleep attribute, and then retries the
logon. Warehouse Builder keeps trying until it logs on for the required number of
sessions or exceeds the number of hours specified in the Tenacity attribute.
To disable Tenacity, set the value to 0.
Load Mode Upsert The mode to generate SQL commands. Choose Insert, Update, Upsert, Delete or
Data Driven.
When you use the Update or Stream operators, you can choose Data Driven load
mode. When you select data driven loading, the PowerCenter Server follows
instructions coded in Update Strategy or Custom transformations within the mapping
to determine how to flag rows for insert, delete, or update. The PowerCenter Server
writes a column in the target file or named pipe to indicate the update strategy. The
control file uses these values to determine how to load data to the database. The
PowerCenter Server uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete
Drop Error Tables Enabled Specifies whether to drop the Warehouse Builder error tables before beginning the
next session. Warehouse Builder will not run if error tables containing data exist from
a prior job. Clear the option to keep error tables.
Truncate Target Table Disabled Specifies whether to truncate target tables. Enable this option to truncate the target database table before beginning the Warehouse Builder job.
External Loader Executable tbuild The name and optional file path of the Teradata external loader executable file. If the external loader directory is not in the system path, enter the file path and file name.
Max Sessions 4 The maximum number of Warehouse Builder sessions per Warehouse Builder job.
Max Sessions must be between 1 and the total number of access module processes
(AMPs) on your system.
Sleep 6 The number of minutes Warehouse Builder pauses before retrying a logon.
Warehouse Builder tries until the logon succeeds or the number of hours specified in
the Tenacity attribute elapses.
Attribute      Default Value      Description
Packing Factor 20 The number of rows that each session buffer holds. Packing improves network/
channel efficiency by reducing the number of sends and receives between the target
file and the Teradata database. Enabled with Stream operator only.
Robust Disabled The recovery or restart mode. When you disable Robust, the Stream operator uses
simple restart logic. The Stream operator reloads any data that was loaded after the
last checkpoint.
When you enable Robust, Warehouse Builder uses robust restart logic. In robust
mode, the Stream operator determines how many rows were processed since the
last checkpoint. The Stream operator processes all the rows that were not processed
after the last checkpoint. Enabled with Stream operator only.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using a
named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Work Table Database n/a The work table database name. You can use this attribute to override the default work table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Log Table Database n/a The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Note: Valid attributes depend upon the operator you select.
Table 20-15 shows the attributes that you configure when you edit a session and override the Teradata Warehouse Builder external loader connection object:
Table 20-15. Teradata Warehouse Builder External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table 1 n/a The table name for the first error table. You can use this attribute to override the
default error table name. If you do not specify an error table name, the
PowerCenter Server uses ET_<target_table_name>.
Error Table 2 n/a The table name for the second error table. You can use this attribute to override
the default error table name. If you do not specify an error table name, the
PowerCenter Server uses UV_<target_table_name>.
Work Table n/a The work table name. You can use this attribute to override the default work table
name. If you do not specify a work table name, the PowerCenter Server uses
WT_<target_table_name>.
Log Table n/a The log table name. You can use this attribute to override the default log table
name. If you do not specify a log table name, the PowerCenter Server uses
RL_<target_table_name>.
Attribute      Default Value      Description
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
Note: Valid attributes depend upon the operator you select.
For more information about these attributes, consult your Teradata documentation.
2. Click New.
To set the file properties, select the target instance in the Instances list.
Attribute Description
Output File Directory Enter the directory name in this field. By default, the PowerCenter Server writes output
files to the directory $PMTargetFileDir.
If you enter a full directory and file name in the Output Filename field, clear this field.
External loader sessions may fail if you use double spaces in the path for the output file.
Output Filename Enter the file name, or file name and path. By default, the Workflow Manager names the
target file based on the target definition used in the mapping: target_name.out. External
loader sessions may fail if you use double spaces in the path for the output file.
Reject File Directory By default, the PowerCenter Server writes all reject files to the directory $PMBadFileDir.
If you enter a full directory and file name in the Reject Filename field, clear this field.
Reject Filename Enter the file name, or file name and directory. The PowerCenter Server appends
information in this field to that entered in the Reject File Directory field. For example, if you
have “C:/reject_file/” in the Reject File Directory field, and enter “filename.bad” in the
Reject Filename field, the PowerCenter Server writes rejected rows to C:/reject_file/
filename.bad.
By default, the PowerCenter Server names the reject file after the target instance name:
target_name.bad.
You can also enter a reject file session parameter to represent the reject file or the reject
file and directory. Name all reject file parameters $BadFileName. For details on session
parameters, see “Session Parameters” on page 495.
Set File Properties Opens a dialog box that allows you to define flat file properties. When you use an external
loader, you must define the flat file properties by clicking the Set File Properties button.
For Oracle external loaders, the target flat file can be fixed-width or delimited.
For Sybase IQ external loaders, the target flat file can be fixed-width or delimited.
For Teradata external loaders, the target flat file must be fixed-width. For DB2 external
loaders, the target flat file must be delimited.
For more information, see “Configuring Fixed-Width Properties” on page 265 and
“Configuring Delimited Properties” on page 266.
Note: Do not select Merge Partitioned Files or enter a merge file name. You cannot merge
partitioned output files when you use an external loader.
I am trying to run a session that uses TPump, but the session fails. The session log displays
an error saying that the Teradata output file name is too long.
The PowerCenter Server uses the Teradata output file name to generate names for the TPump
error and log files, as well as the log table name. To do this, the PowerCenter Server adds a
prefix of several characters to the output file name. It adds three characters for sessions with
one partition and five characters for sessions with multiple partitions.
Teradata allows log table names of up to 30 characters. Because the PowerCenter Server adds a
prefix, if you are running a session with a single partition, specify a target output file name
with a maximum of 27 characters, including the file extension. If you are running a session
with multiple partitions, specify a target output file name with a maximum of 25 characters,
including the file extension.
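The limits above can be checked in a script before starting the session. This is a minimal sketch, assuming the hypothetical file name tpump_out.dat; the limits (27 characters for one partition, 25 for multiple) come from the text above.

```shell
# Check a Teradata TPump output file name against the documented limits:
# 27 characters (including extension) for one partition, 25 for multiple.
check_tpump_name() {
  name="$1"; partitions="$2"
  if [ "$partitions" -gt 1 ]; then max=25; else max=27; fi
  if [ "${#name}" -le "$max" ]; then
    echo "ok: ${#name} <= $max"
  else
    echo "too long: ${#name} > $max"
  fi
}
check_tpump_name "tpump_out.dat" 1   # prints "ok: 13 <= 27"
```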
I tried to load data to Teradata using TPump, but the session failed. I corrected the error,
but the session still fails.
Occasionally, Teradata does not drop the log table when you rerun the session. Check the
Teradata database, and manually drop the log table if it exists. Then rerun the session.
Chapter 21
Using FTP
Overview
The PowerCenter Server can use File Transfer Protocol (FTP) to access source and target files.
With both source and target files, you can use FTP to transfer the files directly to the
PowerCenter Server or stage them on a local directory.
You can also stage files by creating a pre-session shell command that copies the files to a
directory local to the PowerCenter Server. Accessing files directly with FTP generally provides better session
performance than using FTP to stage the files. However, you may want to stage FTP files to
keep a local archive.
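For example, a pre-session shell command could stage a source file with a scripted FTP transfer. This is a hedged sketch: the host ftp.example.com, the user loaduser, and both file paths are hypothetical placeholders, not values from this guide.

```shell
# Build a command file for a non-interactive ftp transfer (hypothetical
# host, credentials, and paths), suitable as a pre-session shell command.
cat > stage_src.ftp <<'EOF'
open ftp.example.com
user loaduser loadpass
binary
get /remote/data/orders.dat /local/stage/orders.dat
bye
EOF
# ftp -n < stage_src.ftp   # uncomment on a host with an ftp client
```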
Before creating an FTP session, you must configure the FTP connection in the Workflow
Manager. For details, see “Creating an FTP Connection” on page 561.
When using FTP file sources and targets in a session, you should know the following
information:
♦ FTP connection name
♦ Remote file name and exact path
♦ Whether you want to stage the files
Mainframe Notes
Due to mainframe restrictions, the following constraints apply when using FTP with
mainframe machines:
♦ You cannot execute sessions concurrently if the sessions use the same FTP source file or
target file located on a mainframe.
♦ If you abort a workflow containing a session with a staged FTP source or target from a
mainframe, you may need to wait for the connection to timeout before you can run the
workflow again.
♦ Host name. Host name or IP address of the FTP host. Optionally, you can specify a port
number using the following syntax:
hostname:port-number
or
IP address:port-number
When you specify a port number, enable that port number for FTP on the host machine.
♦ Default remote directory. The directory you want the PowerCenter Server to use by
default. In the session, when you enter a file name without a directory, the PowerCenter
Server appends the file name to this directory. Therefore, this path must be exact and
contain the appropriate trailing delimiters. For example, if you enter c:/data/ and in the
session specify the file FILENAME, the PowerCenter Server reads the path and file name
as c:/data/FILENAME.
If you enter the wrong delimiter for an FTP directory, the Workflow Manager does not
correct it. If the FTP host is a mainframe machine, the directory must begin with a single
quote and end with the period delimiter, such as: ‘defaultdir. You can override this option
in the session properties.
Depending on the remote machine you access, you might also need to enter the user name
and password. The password must be in 7-bit ASCII only. As with database connections, if
you edit an FTP connection, all sessions using the FTP connection use the updated
connection.
FTP Permissions
If you enable enhanced security, you can set FTP connection permissions in the Workflow
Manager. The Workflow Manager assigns Owner permissions to the user who registers the
connection. The Workflow Manager grants Owner Group permissions to the first group in
the Group Memberships list of the owner. You can manage FTP connection permissions if
you are the owner of the connection or if you have Super User privileges.
A registered FTP connection does not appear in the list of FTP connections if you do not
have at least read permission for the connection. If you want to edit a connection, you must
have both read and write permissions for the connection.
FTP Option Required/Optional Description
User Name Optional User name necessary to access the host machine.
Password Optional Password for the user name. Must be in 7-bit ASCII only.
Host Name Required Host name or dotted IP address of the FTP connection.
Optionally, you can specify a port number between 1 and 65535,
inclusive. If you do not specify a port number, the PowerCenter Server
uses 21 by default. Use the following syntax for specifying the host
name:
hostname:port-number
-or-
IP address:port-number
When you specify a port number, enable that port number for FTP on the
host machine.
Default Remote Directory Required Enter a valid FTP directory on the host machine. Do not
enclose the default remote directory in quotation marks. The default directory name must be
exact and include a trailing delimiter.
Note: Depending on the FTP server you use, you may have limited
options for entering FTP directories. Please see your FTP server
documentation for details.
If you enter a file name without a leading slash or drive letter, the PowerCenter Server
appends the file name to the Default Remote Directory path entered in the FTP
Connection dialog box. For example, if your default remote directory is c:/data/, and you
enter a remote file name of FILENAME, the PowerCenter Server connects to the FTP
host and looks for c:/data/FILENAME.
If you enter a fully qualified file name in the Remote Filename field, the PowerCenter
Server uses the named path rather than the path entered in the Default Remote
Directory.
To access the file, FILENAME, from the default mainframe directory, enter the following
in the Remote Filename field:
filename’
When the PowerCenter Server begins the session, it connects to the mainframe host and
looks for:
‘defaultdir.filename’
In contrast, if you want to use a file in a different directory, you must enter that directory
and file name in the Remote Filename field, like this:
‘overridedir.filename’
Note: Depending on the FTP server you use, you may have limited options for entering
FTP directories. Please see your FTP server documentation for details.
5. To store the file in a directory local to the PowerCenter Server, select Is Staged.
When you select this option for a source file, the PowerCenter Server moves the source
file from the FTP host to a local directory before the session begins, then uses the local
file during the session. If the staged file exists, the PowerCenter Server truncates the
staged file before running the session.
The location of the local file differs depending on the information entered in the
Properties settings of the Sources tab:
3. Click the Open button in the Value field to select an FTP connection.
4. Click Override and enter the remote file name.
Note: Depending on the FTP server you use, you may have limited options for entering
FTP directories. Please see your FTP server documentation for details.
5. To store the target file in a directory on the machine where the PowerCenter Server runs,
select Is Staged.
When you select this option, the PowerCenter Server writes to the local target file during
the session, then moves the file to the FTP host after the session is complete. The
location of the local file differs depending on the information entered in the Properties
settings of the Mapping tab:
Chapter 22
Using Incremental Aggregation
This chapter covers the following topics:
♦ Overview, 574
♦ PowerCenter Server Processing for Incremental Aggregation, 575
♦ Reinitializing the Aggregate Files, 576
♦ Moving or Deleting the Aggregate Files, 577
♦ Partitioning Guidelines with Incremental Aggregation, 578
♦ Preparing for Incremental Aggregation, 579
Overview
When using incremental aggregation, you apply captured changes in the source to aggregate
calculations in a session. If the source changes only incrementally and you can capture
changes, you can configure the session to process only those changes. This allows the
PowerCenter Server to update your target incrementally, rather than forcing it to process the
entire source and recalculate the same data each time you run the session.
For example, you might have a session using a source that receives new data every day. You
can capture those incremental changes because you have added a filter condition to the
mapping that removes pre-existing data from the flow of data. You then enable incremental
aggregation.
When the session runs with incremental aggregation enabled for the first time on March 1,
you use the entire source. This allows the PowerCenter Server to read and store the necessary
aggregate data. On March 2, when you run the session again, you filter out all the records
except those time-stamped March 2. The PowerCenter Server then processes only the new
data and updates the target accordingly.
Consider using incremental aggregation in the following circumstances:
♦ You can capture new source data. Use incremental aggregation when you can capture new
source data each time you run the session. Use a Stored Procedure or Filter transformation
to process only new data.
♦ Incremental changes do not significantly change the target. Use incremental aggregation
when the changes do not significantly change the target. If processing the incrementally
changed source alters more than half the existing target, the session may not benefit from
using incremental aggregation. In this case, drop the table and re-create the target with
complete source data.
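As a sketch of the first guideline above: a Filter transformation condition can pass only rows stamped on or after the session start. DATE_ENTERED is a hypothetical port name; SESSSTARTTIME is the built-in session start variable in the transformation language.

```
DATE_ENTERED >= TRUNC(SESSSTARTTIME, 'DD')
```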
Note: Do not use incremental aggregation if your mapping contains percentile or median
functions. The PowerCenter Server uses system memory to process Percentile and Median
functions in addition to the cache memory you configure in the session property sheet. As a
result, the PowerCenter Server does not store incremental aggregation values for Percentile
and Median functions in disk caches.
If you do not run the session using Verbose Init mode or use an identifiable transformation
naming convention, you may have difficulty determining which files belong to each session.
For more information about cache file storage and naming conventions, see “Cache Files” on
page 615.
Note: You cannot use incremental aggregation when the mapping includes an Aggregator
transformation with Transaction transformation scope. The Workflow Manager marks the
session invalid.
Chapter 23
Using pmcmd
Overview
pmcmd is a program that you can use to communicate with the PowerCenter Server. You can
perform some of the tasks that you can also perform in the Workflow Manager, such as
starting and stopping workflows and tasks.
You can use pmcmd in the following modes:
♦ Command line mode. The command line syntax allows you to write scripts for scheduling
workflows. Each command you write in the command line mode must include connection
information to the PowerCenter Server.
♦ Interactive mode. You establish and maintain an active connection to the PowerCenter
Server. This allows you to issue a series of commands.
You can use repository user names and passwords as environment variables with pmcmd. You
can also customize the way pmcmd displays the date and time on the machine running the
PowerCenter Server. Before you use pmcmd, configure these variables on the PowerCenter
Server. For more information, see “Configuring Environment Variables” on page 585.
Note: To issue the shutdownserver command, you must have the Super User privilege or
Administer Server privilege.
Table 23-1 provides a description for the pmcmd commands. For details on command syntax
and usage, see “pmcmd Reference” on page 594.
Table 23-1. pmcmd Commands
aborttask (Command line, Interactive). Aborts a task. Issue this command only after the PowerCenter Server fails to stop the task when you issue the stoptask command. For more information, see “Aborttask” on page 596.
abortworkflow (Command line, Interactive). Aborts a workflow. Issue this command only after the PowerCenter Server fails to stop the workflow when you issue the stopworkflow command. For more information, see “Abortworkflow” on page 597.
connect (Interactive). Connects to the PowerCenter Server in the interactive mode. Use this command in conjunction with connection information. For more information, see “Connect” on page 597.
disconnect (Interactive). Disconnects from the PowerCenter Server in the interactive mode. For more information, see “Disconnect” on page 598.
exit (Interactive). Exits from pmcmd in the interactive mode. For more information, see “Exit” on page 598.
getrunningsessionsdetails (Command line, Interactive). Displays details for sessions currently running on a PowerCenter Server, including information for the folder, workflow, and session instance. Displays session status and statistics on each target table and source qualifier. For more information, see “Getrunningsessionsdetails” on page 598.
getserverdetails (Command line, Interactive). Displays details for the PowerCenter Server, including server status, information on active workflows, and timestamp information. In a server grid, this command displays the PowerCenter Servers that run each task instance. For more information, see “Getserverdetails” on page 599.
getserverproperties (Command line, Interactive). Displays the PowerCenter Server name, type, and version. It returns the timestamp on the PowerCenter Server and the name of the repository. It also indicates the data movement mode and whether the PowerCenter Server can debug mappings. For more information, see “Getserverproperties” on page 599.
getsessionstatistics (Command line, Interactive). Displays session details, including information for the folder, workflow, and task instance. Displays session status and statistics on each target table and source qualifier. In a server grid, this command displays the PowerCenter Servers that run each task instance. For more information, see “Getsessionstatistics” on page 600.
gettaskdetails (Command line, Interactive). Displays details for a task, including folder and workflow name. Also displays the task status and run mode. In a server grid, this command displays the PowerCenter Servers that run each task instance. For more information, see “Gettaskdetails” on page 601.
getworkflowdetails (Command line, Interactive). Displays details for a workflow, including workflow name, status, and run mode. Also displays information on when the workflow was last executed. For more information, see “Getworkflowdetails” on page 601.
help (Command line, Interactive). Displays a list of pmcmd commands and syntax. For more information, see “Help” on page 602.
pingserver (Command line, Interactive). Determines whether the PowerCenter Server is running. For more information, see “Pingserver” on page 602.
quit (Interactive). Quits from pmcmd in the interactive mode. For more information, see “Quit” on page 602.
resumeworkflow (Command line, Interactive). Resumes a suspended workflow. For more information, see “Resumeworkflow” on page 603.
resumeworklet (Command line, Interactive). Resumes a suspended worklet. For more information, see “Resumeworklet” on page 603.
scheduleworkflow (Command line, Interactive). Instructs the PowerCenter Server to schedule a workflow. Use this command to manually reschedule a workflow that has been removed from the schedule. For more information, see “Scheduleworkflow” on page 604.
setfolder (Interactive). Designates a folder as the default folder in which to execute all subsequent commands. For more information, see “Setfolder” on page 604.
showsettings (Interactive). Displays the settings for the interactive mode, including PowerCenter Server and repository name, username, wait mode, and default folder. For more information, see “Showsettings” on page 605.
shutdownserver (Command line, Interactive). Shuts down the PowerCenter Server. Use this command in conjunction with a shutdownmode option. For more information, see “Shutdownserver” on page 605.
starttask (Command line, Interactive). Starts a task. Use this command in conjunction with a task name. For more information, see “Starttask” on page 606.
startworkflow (Command line, Interactive). Starts a workflow. Use this command in conjunction with a workflow name. For more information, see “Startworkflow” on page 607.
stoptask (Command line, Interactive). Stops a task. Use this command in conjunction with a task name. For more information, see “Stoptask” on page 609.
stopworkflow (Command line, Interactive). Stops a workflow. Use this command in conjunction with a workflow name. For more information, see “Stopworkflow” on page 609.
unscheduleworkflow (Command line, Interactive). Instructs the PowerCenter Server to remove the workflow from the schedule. For more information, see “Unscheduleworkflow” on page 610.
unsetfolder (Interactive). Designates no folder as the default folder. For more information, see “Unsetfolder” on page 610.
version (Command line, Interactive). Displays the PowerCenter version number. For more information, see “Version” on page 611.
waittask (Command line, Interactive). Instructs the PowerCenter Server to wait for the completion of a running task before starting another command. Use this command in conjunction with a task name. For more information, see “Waittask” on page 611.
waitworkflow (Command line, Interactive). Notifies you of the status of a workflow. Use this command in conjunction with a workflow name. For more information, see “Waitworkflow” on page 611.
Configuring PM_CODEPAGENAME
pmcmd uses the code page of the machine hosting pmcmd unless you specify the code page
environment variable, PM_CODEPAGENAME, to override it. The code page must be
compatible with the PowerCenter Server code page. pmcmd sends commands in Unicode. If
the code pages are not compatible, the PowerCenter Server might not find the workflow,
session, or task in the repository. For more information about code page compatibility, see
“Globalization Overview” and “Code Pages” in the Installation and Configuration Guide.
export PM_CODEPAGENAME
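A minimal Bourne-shell sketch of setting this variable before invoking pmcmd. The code page name MS1252 is a hypothetical example value; use a code page compatible with your PowerCenter Server.

```shell
# Override the pmcmd code page (MS1252 is a hypothetical example value).
PM_CODEPAGENAME=MS1252
export PM_CODEPAGENAME
```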
Configuring PMTOOL_DATEFORMAT
Use this environment variable to customize the way pmcmd displays the date and time. The
pmcmd program verifies that the string you specify is a valid format. If the format string is not
valid, the PowerCenter Server generates a warning message and displays the date in the format
DY MON DD HH24:MI:SS YYYY.
export PMTOOL_DATEFORMAT
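For example, a Bourne-shell sketch that sets the variable to the default format string named above:

```shell
# Set the pmcmd date display format (the value here is the documented
# default format string).
PMTOOL_DATEFORMAT='DY MON DD HH24:MI:SS YYYY'
export PMTOOL_DATEFORMAT
```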
export USERNAME
You can assign the environment variable any valid UNIX name.
1. In a UNIX session, navigate to the directory where the PowerCenter Server is installed.
2. At the shell prompt, type:
pmpasswd YourPassword
This command runs the encryption utility pmpasswd located in the directory where the
PowerCenter Server is installed. The encryption utility generates and displays your
encrypted password. The following is sample output. In this example, the password
entered was “monday.”
Encrypted string -->bX34dqq<--
export PASSWORD
You can assign the environment variable any valid UNIX name.
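Putting the steps together, a minimal sketch: store the encrypted string that pmpasswd printed (the sample value bX34dqq from the output above) in the environment variable, then reference it with the -pv flag. The pmcmd invocation is commented out because the server address and folder (taken from this chapter's examples) depend on your installation.

```shell
# Store the encrypted password printed by pmpasswd (sample value from
# the text) and export it for pmcmd to read via -pv.
PASSWORD=bX34dqq
export PASSWORD
# pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east wSalesAvg
```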
1. In Windows DOS, navigate to the directory where the PowerCenter Server is installed.
2. At the command line, type:
pmpasswd YourPassword
The encryption utility generates and displays your encrypted password. The following is
sample output. In this example, the password entered was “monday.”
Encrypted string -->bX34dqq<--
Will decrypt to -->monday<--
Configuring PM_HOME
Use the PM_HOME variable to start pmcmd from a directory other than the install directory.
On UNIX, point the PM_HOME and PATH environment variables to the PowerCenter Server
installation directory.
export PM_HOME
The following command immediately starts the workflow wSalesAvg, located in the east
folder, on the remote PowerCenter Server with host name Sales listening at port 6258:
pmcmd startworkflow -u seller3 -p jackson -s SALES:6258 -f east -wait wSalesAvg
The user, seller3, with the password “jackson” sends the request to start the workflow. When
you use the wait option, pmcmd returns to the shell or command prompt when the workflow
completes.
For a list of commands you can use in the command line mode, see Table 23-1 on page 582.
For details on each command see “pmcmd Reference” on page 594.
For information on defining username and password environment variables, see “Configuring
Repository Username and Password” on page 586.
Parameter Flags Required/Optional Description
username -user, -u Required Your repository username. Required if userEnvVar is not used.
serveraddr -serveraddr, -s Required Server address of the machine hosting the PowerCenter Server.
host N/A Optional Name of the machine hosting the PowerCenter Server. If you do not specify a host name, pmcmd assumes the PowerCenter Server runs on the machine executing pmcmd.
portno N/A Required Port number at which the PowerCenter Server listens.
Code Description
0 For all commands, a return value of zero indicates that the command ran successfully. You can issue
these commands in the wait or nowait mode: starttask, startworkflow, resumeworklet, resumeworkflow,
aborttask, and abortworkflow. If you issue a command in the wait mode, a return value of zero indicates
the command ran successfully. If you issue a command in the nowait mode, a return value of zero
indicates that the request was successfully transmitted to the PowerCenter Server, and it acknowledged
the request.
1 The PowerCenter Server is down, or pmcmd cannot connect to the PowerCenter Server. The TCP/IP
host name or port number might be incorrect, or a network problem occurred.
2 The specified task name, workflow name, or folder name does not exist.
Code Description
6 An error occurred while stopping the PowerCenter Server. Contact Informatica Technical Support.
8 You do not have the appropriate permissions or privileges to perform this task.
9 The connection to the PowerCenter Server timed out while sending the request.
12 The PowerCenter Server cannot start recovery because the session or workflow is scheduled,
suspending, waiting for an event, waiting, initializing, aborting, stopping, disabled, or running.
18 The PowerCenter Server found the parameter file, but it did not have the initial values for the session
parameters, such as $input or $output.
19 The PowerCenter Server cannot start the session in recovery mode because the workflow is configured
to run continuously.
20 A repository error has occurred. Please make sure that the Repository Server and the database are
running and the number of connections to the database is not exceeded.
22 The PowerCenter Server cannot find a unique instance of the workflow or session you specified. Enter
the command again with the folder name and workflow name.
24 Out of memory.
25 Command is cancelled.
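In a scheduling script, the return code of a command-line pmcmd call can be acted on after the call returns. The sketch below maps a few of the codes from the table; the pmcmd invocation itself is commented out because it depends on your server, and the user, server, folder, and workflow names are taken from this chapter's examples.

```shell
# Translate selected pmcmd return codes from the table above.
explain_rc() {
  case "$1" in
    0) echo "command ran successfully" ;;
    1) echo "server down or unreachable" ;;
    2) echo "task, workflow, or folder does not exist" ;;
    8) echo "insufficient permissions or privileges" ;;
    *) echo "see the return code table: $1" ;;
  esac
}
# pmcmd startworkflow -u seller3 -p jackson -s SALES:6258 -f east -wait wSalesAvg
# rc=$?; echo "pmcmd returned $rc: $(explain_rc $rc)"
explain_rc 2   # prints "task, workflow, or folder does not exist"
```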
The following commands immediately start the workflow wSalesAvg, located in the east
folder:
pmcmd> connect -user seller3 -password jackson -serveraddr SALES:6258
pmcmd> setwait
pmcmd> setfolder east
pmcmd> startworkflow wSalesAvg
The setwait command means that for all subsequent commands, pmcmd returns the command
prompt when the workflow completes. The setfolder command means that for all subsequent
commands dealing with workflows or tasks, pmcmd uses the specified workflow or task from
the east folder.
For a list of commands you can use in the interactive mode, see Table 23-1 on page 582. For
details on each command see “pmcmd Reference” on page 594.
1. In either a Windows DOS session or a UNIX session, navigate to the directory where the
PowerCenter Server is installed.
2. At the shell or command prompt, type:
pmcmd
This command returns the PowerCenter version number and the pmcmd prompt.
3. From the pmcmd prompt, type:
connect -u YourUserName -p YourPassword -s ServerName:PortNo
Or, if you use username and password environment variables, type the following at the
pmcmd prompt:
connect -uv USERNAME -pv PASSWORD -serveraddr ServerName:PortNo
For information on defining user name and password environment variables, see
“Configuring Repository Username and Password” on page 586.
Command Description
setfolder Designates a folder as the default folder in which to execute all subsequent commands.
setnowait Instructs the PowerCenter Server to execute subsequent commands in the nowait mode.
The pmcmd prompt is available after the PowerCenter Server receives the previous
command. The nowait mode is the default mode.
setwait Instructs the PowerCenter Server to execute subsequent commands in the wait mode.
The pmcmd prompt is available only after the PowerCenter Server completes the previous
command.
For a list of all the commands that you can use in the interactive mode, see Table 23-1 on
page 582.
You can use -password or -p before entering a password. Or, use -passwordvar or -pv before a
password environment variable.
To enter a password, precede the password with either the -password or the -p flag.
-password YourPassword
or
-p YourPassword
If you use a password environment variable, precede the variable name with either the -pv flag
or the -passwordvar flag.
-passwordvar PASSWORD
or
-pv PASSWORD
For a list of all the parameters you can use with pmcmd, see Table 23-5 on page 594.
Command Parameters
When you use most parameters, you precede the parameter with a flag. For ease of use, you
can use a shortened version for most flags. For example, you can either use -serveraddr or its
shortened equivalent, -s.
Table 23-5 describes the parameters used in pmcmd commands and lists the associated flags:
folder -folder, -f Name of the folder containing the workflow or task. Required if the workflow or task name is not unique in the repository.
host N/A The name of the machine hosting the PowerCenter Server. If you do not specify a host name, pmcmd assumes the PowerCenter Server runs on the machine executing pmcmd.
localparamfile -localparamfile, -lpf A parameter file on a local machine that pmcmd uses when you start a workflow. Use in conjunction with the startworkflow command.
paramfile -paramfile Determines which parameter file is used when a task or workflow runs. It overrides the configured parameter file for the workflow or task. Use in conjunction with the starttask or startworkflow commands.
passwordEnvVar -passwordvar, -pv Specifies the password environment variable. Required if password is not used.
portno N/A Specifies the port number at which the PowerCenter Server listens.
recovery -recovery Specifies that you want to run the session in recovery mode.
serveraddr -serveraddr, -s Server address of the machine hosting the PowerCenter Server.
startfrom -startfrom Starts a workflow from a specified task, taskInstancePath. Use the startfrom parameter in conjunction with the startworkflow command. Write the taskInstancePath as a fully qualified string.
taskInstancePath N/A Indicates a task and where it appears within the workflow. A task within a workflow is indicated by its task name alone. A task within a worklet is indicated by WorkletName.TaskName.
userEnvVar -uservar, -uv Specifies the username environment variable. Required if username is not used.
To denote an empty string, use two single quotes ('') or two double quotes (""). Be sure to
match an opening quote with a closing quote.
Syntax Notation
Table 23-6 describes the notation used in pmcmd syntax:
Convention Description
-z Flag placed before a parameter. This designates the parameter you enter. For
example, to enter the username, type -u or -user followed by the username.
<x> Required parameter. If you omit a required parameter, pmcmd returns an error
message.
Convention Description
<x | y > Select between required parameters. For the command to run, you must select
from the listed parameters. If you omit a required parameter, pmcmd returns an
error message.
[x] Optional parameter. The command runs whether or not you enter optional
parameters. For example, the syntax for the help command is as follows:
Help [Command]
If you enter a command, pmcmd returns information on that command only. If you
omit the command name, pmcmd returns a list of all commands.
[x|y] Select between optional parameters. The command runs whether or not you
enter optional parameters. For example, many commands run in either the wait
or nowait mode.
[-wait|-nowait]
The command runs in the mode you specify. If you do not specify a mode,
pmcmd runs the command in the default nowait mode.
<< x | y>| <a | b>> When a set contains subsets, the superset is indicated with bold brackets < >. A
bold pipe symbol (|) separates the subsets.
Tip: When you enter commands in pmcmd, type the command name first followed by the
optional parameters in any order.
Aborttask
The aborttask command aborts a task. Issue this command only after the PowerCenter Server
fails to stop the task when you issue the stoptask command. For details on how the
PowerCenter Server aborts and stops tasks, see “Server Handling of Stop and Abort” on
page 129.
In the command line mode, use the following syntax to abort a task:
pmcmd aborttask
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to abort a task:
aborttask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
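The qualification rule above can be sketched as a small shell helper; the worklet and task names in the calls are hypothetical examples.

```shell
# Build a fully qualified taskInstancePath: WorkletName.TaskName when the
# task is inside a worklet, or the task name alone otherwise.
task_instance_path() {
  worklet="$1"; task="$2"
  if [ -n "$worklet" ]; then
    printf '%s.%s\n' "$worklet" "$task"
  else
    printf '%s\n' "$task"
  fi
}
task_instance_path "wkltUpdate" "s_load"   # prints "wkltUpdate.s_load"
task_instance_path "" "s_load"             # prints "s_load"
```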
For information on other parameters used in this command, see Table 23-5 on page 594.
Abortworkflow
The abortworkflow command aborts a workflow. Issue this command only after the
PowerCenter Server fails to stop the workflow when you issue the stopworkflow command.
For details on how the PowerCenter Server aborts and stops workflows, see “Server Handling
of Stop and Abort” on page 129.
In the command line mode, use the following syntax to abort a workflow:
pmcmd abortworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[-wait|-nowait]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to abort a workflow:
abortworkflow
[<-folder|-f> folder]
[-wait|-nowait]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Connect
The connect command connects the pmcmd program to the PowerCenter Server in the
interactive mode. If you omit connection information, pmcmd prompts you to enter the
correct information. Once pmcmd successfully connects, you receive the pmcmd prompt. At
the pmcmd prompt, you can issue commands without specifying the connection information.
connect
<-serveraddr|-s> [host:]portno
Disconnect
The disconnect command disconnects pmcmd from the PowerCenter Server. It does not close
the pmcmd program. Use this command when you want to disconnect from a PowerCenter
Server and connect to another in the interactive mode.
In the interactive mode, use the following syntax to disconnect pmcmd from a PowerCenter
Server:
disconnect
Note: You can use this command only in the pmcmd interactive mode.
Exit
The exit command disconnects pmcmd from the PowerCenter Server and closes the pmcmd
program.
In the interactive mode, use the following syntax to exit pmcmd:
exit
Note: You can use this command only in the pmcmd interactive mode.
Getrunningsessionsdetails
The getrunningsessionsdetails command returns the details for all sessions currently running
on the PowerCenter Server. Details include startup and current time, folder and workflow
names, session instance, master and execution servers, number of successful and failed rows in
sources and targets, number of transformation errors, and number of sessions running on the
PowerCenter Server.
In the command line mode, use the following syntax to get details about sessions running on
the PowerCenter Server:
pmcmd getrunningsessionsdetails
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
In the interactive mode, enter the following syntax at the pmcmd prompt to get details about
sessions running on the PowerCenter Server:
getrunningsessionsdetails
Getserverdetails
In the command line mode, use the following syntax to get details about the PowerCenter
Server:
pmcmd getserverdetails
<-serveraddr|-s> [host:]portno
[-all|-running|-scheduled]
In the interactive mode, enter the following syntax at the pmcmd prompt to get details about
the PowerCenter Server:
getserverdetails
[-all|-running|-scheduled]
Issue the getserverdetails command for all or some of the workflows. The -running option
returns status details on active workflows. Active workflows include running, suspending, and
suspended workflows. The -scheduled option returns status details on the scheduled
workflows. The default option is the -all option, and it returns status details on the scheduled
and running workflows.
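For example, to list only active workflows on a hypothetical server, following the -uv/-pv convention used elsewhere in this chapter:

```shell
# Return status details for running, suspending, and suspended
# workflows only.
pmcmd getserverdetails -uv USERNAME -pv PASSWORD -s SALES:6258 -running
```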
For information on other parameters used in this command, see Table 23-5 on page 594.
Getserverproperties
The getserverproperties command returns the PowerCenter Server name, type, and version. It
returns the timestamp on the PowerCenter Server, the PowerCenter Server startup time, and
the name of the repository. It indicates the data movement mode, the PowerCenter Server
code page, and whether the PowerCenter Server can debug mappings. It also specifies the
server grid name.
In the command line mode, use the following syntax to see the PowerCenter Server
properties:
pmcmd getserverproperties
<-serveraddr|-s> [host:]portno
In the interactive mode, enter the following syntax at the pmcmd prompt to see PowerCenter
Server properties:
getserverproperties
Serveraddr is the host name and port number of the PowerCenter Server.
Getsessionstatistics
The getsessionstatistics command returns session details and statistics. The command returns
the following information for each partition:
♦ Session details. Session details include the name of the folder, workflow, task instance, and
mapping. It includes the task run status, session log file name, first error code and message,
the number of transformation errors, and the number of successful and failed rows for the
sources and targets. It also includes the name of the master server, worker server, and server
grid.
♦ Session statistics. Session statistics include the transformation name, transformation
instance name, and the number of applied, affected, and rejected rows. It also includes the
throughput, last error code and message, and start and end time for the session.
In the command line mode, use the following syntax to get session statistics:
pmcmd getsessionstatistics
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to get session
statistics:
getsessionstatistics
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
When using this command, specify the workflow name. Also, write the taskInstancePath as a
fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name.
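For example, to get statistics for a session inside a worklet, write the taskInstancePath as WorkletName.TaskName. All names below are hypothetical, following the -uv/-pv convention used elsewhere in this chapter:

```shell
# Command line mode: session s_sales nested in worklet wl_nightly
# in workflow wMain, folder east.
pmcmd getsessionstatistics -uv USERNAME -pv PASSWORD -s SALES:6258 \
    -f east -w wMain wl_nightly.s_sales
```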
For information on other parameters used in this command, see Table 23-5 on page 594.
Gettaskdetails
The gettaskdetails command returns details about a task.
In the command line mode, use the following syntax to get details on a task:
pmcmd gettaskdetails
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to get details on a
task:
gettaskdetails
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
When you use this command, specify the workflow name. Also, write the taskInstancePath as
a fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name.
For information on other parameters used in this command, see Table 23-5 on page 594.
Getworkflowdetails
The getworkflowdetails command returns the folder name, workflow name, last start time,
last completion time, workflow status, run mode, and the username that ran the last
workflow.
In the command line mode, use the following syntax to get details on a workflow:
pmcmd getworkflowdetails
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to get details on a
workflow:
getworkflowdetails
[<-folder|-f> folder]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Help
The help command returns the syntax for the command you specify. If you omit the
command name, pmcmd lists each command and syntax.
In the command line mode, use the following command for help with command line
commands:
pmcmd help [command]
In the interactive mode, use the following command for help with interactive mode
commands:
help [command]
Pingserver
The pingserver command verifies that the PowerCenter Server is running.
In the command line mode, use the following syntax to ping the PowerCenter Server:
pmcmd pingserver
<-serveraddr|-s> [host:]portno
In the interactive mode, enter the following syntax at the pmcmd prompt to ping the
PowerCenter Server:
pingserver
Serveraddr is the host name and port number of the PowerCenter Server.
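Because pmcmd indicates success with return code 0 (as described for waitworkflow later in this chapter), you can branch on the pingserver exit code in a script. A minimal sketch, assuming hypothetical connection values:

```shell
# Ping a hypothetical PowerCenter Server and branch on the exit code.
if pmcmd pingserver -uv USERNAME -pv PASSWORD -s SALES:6258
then
    echo "PowerCenter Server is running"
else
    echo "PowerCenter Server is not responding" >&2
    exit 1
fi
```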
Quit
The quit command disconnects pmcmd from the PowerCenter Server and closes the pmcmd
program.
In the interactive mode, use the following syntax to quit pmcmd:
quit
Note: You can use this command in the pmcmd interactive mode only.
Resumeworkflow
The resumeworkflow command resumes suspended workflows.
In the command line mode, use the following syntax to resume a workflow:
pmcmd resumeworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[-wait|-nowait]
[-recovery]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to resume a
workflow:
resumeworkflow
[<-folder|-f> folder]
[-wait|-nowait]
[-recovery]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Resumeworklet
The resumeworklet command resumes suspended worklets. To resume the workflow from a
specific worklet, specify the taskInstancePath as a fully qualified string. If you do not specify a
taskInstancePath, the workflow resumes from the suspended worklet.
In the command line mode, use the following syntax to resume a worklet:
pmcmd resumeworklet
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
[-recovery]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to resume a worklet:
resumeworklet
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
[-recovery]
taskInstancePath
For information on other parameters used in this command, see Table 23-5 on page 594.
Scheduleworkflow
The scheduleworkflow command instructs the PowerCenter Server to schedule a workflow.
Use this command to reschedule a workflow that has been removed from the schedule.
In the command line mode, use the following syntax to schedule a workflow:
pmcmd scheduleworkflow <-serveraddr|-s> [host:]portno [<-folder|-f> folder] workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to schedule a
workflow:
scheduleworkflow [<-folder|-f> folder] workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Setfolder
The setfolder command designates a folder as the default folder in which to execute all
subsequent commands. After issuing this command, you do not need to enter a folder name
for workflow, task, and session commands. If you enter a folder name in a command after the
setfolder command, that folder name overrides the default folder name for that command
only.
In the interactive mode, enter the following syntax at the pmcmd prompt to designate a folder
as the default folder:
setfolder folder
Note: You can use this command in the pmcmd interactive mode only.
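For example, an interactive session might set a default folder once and then omit the folder option from later commands. Folder and workflow names below are hypothetical:

```shell
# At the pmcmd prompt:
setfolder east
startworkflow wSalesAvg      # runs east/wSalesAvg; no -f needed
startworkflow -f west wOther # -f west overrides the default for this
                             # command only; the default remains east
```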
Setnowait
The setnowait command instructs the PowerCenter Server to execute subsequent commands
in the nowait mode. When the nowait mode is set, the pmcmd prompt is available after the
PowerCenter Server receives the previous command. No parameters are required for this
command.
In the interactive mode, enter the following syntax at the pmcmd prompt to instruct the
PowerCenter Server to execute subsequent commands in the nowait mode:
setnowait
Note: You can use this command in the pmcmd interactive mode only.
Setwait
The setwait command instructs the PowerCenter Server to execute subsequent commands in
the wait mode. The pmcmd prompt is available only after the PowerCenter Server completes
the previous command.
In the interactive mode, enter the following syntax at the pmcmd prompt to instruct the
PowerCenter Server to execute subsequent commands in the wait mode:
setwait
Showsettings
The showsettings command returns the name of the PowerCenter Server and repository to
which pmcmd is connected. It displays the username, wait mode, and default folder. No
parameters are required for this command.
In the interactive mode, enter the following syntax at the pmcmd prompt to display interactive
mode settings:
showsettings
Note: You can use this command in the pmcmd interactive mode only.
Shutdownserver
The shutdownserver command stops the PowerCenter Server. You must have the Super User
or Administer Server privilege to use this command.
You can shut down the PowerCenter Server in the complete, stop, or abort mode. In the
complete mode, pmcmd allows currently running workflows to complete before shutting
down the PowerCenter Server. In the stop mode, the PowerCenter Server stops the running
workflows. In the abort mode, the PowerCenter Server aborts the running workflows.
In the command line mode, use the following syntax to stop the PowerCenter Server:
pmcmd shutdownserver
<-serveraddr|-s> [host:]portno
<-complete|-stop|-abort>
In the interactive mode, enter the following syntax at the pmcmd prompt to stop the
PowerCenter Server:
shutdownserver
<-complete|-stop|-abort>
For information on other parameters used in this command, see Table 23-5 on page 594.
Starttask
The starttask command starts a task.
In the command line mode, use the following syntax to start a task:
pmcmd starttask
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-paramfile paramfile]
[-wait|-nowait]
[-recovery]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to start a task:
starttask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-paramfile paramfile]
[-wait|-nowait]
[-recovery]
taskInstancePath
For Windows command prompt users, the parameter file name cannot have beginning or
trailing spaces. If the name includes spaces, enclose the file name in double quotes:
-paramfile "$PMRootDir\my file.txt"
When you write a pmcmd command that includes a parameter file located on another
machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where
the variable is defined expands the server variable.
pmcmd starttask -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w
wSalesAvg -paramfile '\$PMRootDir/myfile.txt' taskA
For information on other parameters used in this command, see Table 23-5 on page 594.
Startworkflow
The startworkflow command starts a workflow.
In the command line mode, use the following syntax to start a workflow:
pmcmd startworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[<-startfrom> taskInstancePath]
[-recovery]
[-paramfile paramfile]
[<-localparamfile|-lpf> localparamfile]
[-wait|-nowait]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to start a workflow:
startworkflow
[<-folder|-f> folder]
[<-startfrom> taskInstancePath]
[-recovery]
[-paramfile paramfile]
[<-localparamfile|-lpf> localparamfile]
[-wait|-nowait]
workflow
Use the -startfrom flag to start the workflow at a designated taskInstancePath. Write the
taskInstancePath as a fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name. If you
do not specify a starting point, the workflow starts at the Start task.
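For example, to restart a hypothetical workflow at a task nested in a worklet rather than at the Start task:

```shell
# Start workflow wSalesAvg from task s_load inside worklet wl_stage;
# the taskInstancePath is fully qualified as WorkletName.TaskName.
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 \
    -f east -startfrom wl_stage.s_load wSalesAvg
```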
♦ Local machine. When you use a parameter file located on the machine where pmcmd is
invoked, pmcmd passes variables and values in the file to the PowerCenter Server. When
you list a local parameter file, specify the absolute path or relative path to the file. Use the
-localparamfile or -lpf option to indicate the location and name of the local parameter file.
On UNIX, use the following syntax:
-lpf 'param_file.txt'
On Windows, if the file name includes spaces, enclose it in quotes:
-lpf 'c:\Informatica\parameterfiles\param file.txt'
You can also use the long option name:
-localparamfile param_file.txt
For information on other parameters used in this command, see Table 23-5 on page 594.
Stoptask
The stoptask command stops a task.
In the command line mode, use the following syntax to stop a task:
pmcmd stoptask
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to stop a task:
stoptask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath
Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.
Stopworkflow
The stopworkflow command stops a workflow.
In the command line mode, use the following syntax to stop a workflow:
pmcmd stopworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[-wait|-nowait]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to stop a workflow:
stopworkflow
[<-folder|-f> folder]
[-wait|-nowait]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Unscheduleworkflow
The unscheduleworkflow command instructs the PowerCenter Server to remove the workflow
from the schedule.
In the command line mode, use the following syntax to remove the workflow from the
schedule:
pmcmd unscheduleworkflow <-serveraddr|-s> [host:]portno [<-folder|-f> folder] workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to remove the
workflow from the schedule:
unscheduleworkflow [<-folder|-f> folder] workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Unsetfolder
The unsetfolder command designates no folder as the default folder. After you issue this
command, you must specify a folder name each time you enter a command for a session,
workflow, or task.
In the interactive mode, enter the following syntax at the pmcmd prompt to clear the setfolder
command:
unsetfolder
Version
The version command returns the PowerCenter Server version.
In the command line mode, use the following syntax to verify the PowerCenter version:
pmcmd version
In the interactive mode, enter the following syntax at the pmcmd prompt to verify the
PowerCenter version:
version
Waittask
The waittask command instructs the PowerCenter Server to complete the task before
returning the pmcmd prompt to the command prompt or shell.
In the command line mode, use the following syntax to set a task in the wait mode:
pmcmd waittask
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to set a task in the
wait mode:
waittask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.
Waitworkflow
The waitworkflow command notifies you whether the specified workflow has run successfully
or is not running. If the workflow is running, pmcmd indicates the success with return code 0
after the workflow has completed. If the workflow is not running, pmcmd indicates the
failure with a nonzero return code.
In the command line mode, use the following syntax to set a workflow to the wait mode:
pmcmd waitworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to set a workflow to
the wait mode:
waitworkflow
[<-folder|-f> folder]
workflow
You can use waitworkflow in conjunction with the startworkflow command if you are running
scripts. For example, you may want to check the status of a critical workflow that was
previously started. You can use the waitworkflow command to wait for that workflow to
complete before you start the next workflow.
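For example, a script can block on a critical workflow before starting a dependent one. Workflow names and connection values below are hypothetical:

```shell
# Wait for the previously started workflow wLoadDims to complete;
# a zero exit code indicates success, so only then start the
# dependent workflow wLoadFacts.
if pmcmd waitworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 \
       -f east wLoadDims
then
    pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 \
        -f east wLoadFacts
fi
```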
For information on other parameters used in this command, see Table 23-5 on page 594.
Session Caches
Overview
The PowerCenter Server creates index and data caches in memory for Aggregator, Rank,
Joiner, and Lookup transformations in a mapping. The PowerCenter Server stores key values
in the index cache and output values in the data cache. You configure memory parameters for
the index and data cache in the transformation or session properties.
If the PowerCenter Server requires more memory, it stores overflow values in cache files.
When the session completes, the PowerCenter Server releases cache memory, and in most
circumstances, it deletes the cache files.
The PowerCenter Server creates cache files based on the PowerCenter Server code page.
Table 24-1 gives an overview of the type of information that the PowerCenter Server stores in
the index and data caches:
Transformation  Index Cache                                   Data Cache
Aggregator      Stores group values as configured in the      Stores calculations based on the
                group by ports.                               group by ports.
Rank            Stores group values as configured in the      Stores ranking information based
                group by ports.                               on the group by ports.
Joiner          Stores index values for the master source     Stores master source rows.
                table as configured in the join condition.
Lookup          Stores lookup condition information.          Stores lookup data that is not
                                                              stored in the index cache.
Memory Cache
The PowerCenter Server creates a memory cache based on the size configured in the session
properties. When you create a mapping, you specify the index and data cache size for each
transformation instance. When you create a session, you can override the index and data
cache size for each transformation instance in the session properties.
When you configure a session, you calculate the amount of memory the PowerCenter Server
needs to process the session. Calculate requirements based on factors such as processing
overhead and column size for key and output columns.
By default, the PowerCenter Server allocates 1,000,000 bytes to the index cache and
2,000,000 bytes to the data cache for each transformation instance. If the PowerCenter Server
cannot allocate the configured amount of cache memory, it cannot initialize the session and
the session fails.
If a server grid has 32-bit and 64-bit servers, and if a session exceeds 2 GB of memory, the
master server assigns it to a 64-bit server. For information on server grids, see “Working with
Server Grids” on page 446.
Cache Files
If the PowerCenter Server requires more memory than the configured cache size, it stores
overflow values in the cache files. Since paging to disk can slow session performance, try to
configure the index and data cache sizes to store data in memory.
The PowerCenter Server creates the index and data cache files by default in the PowerCenter
Server variable directory, $PMCacheDir. If you do not define $PMCacheDir, the
PowerCenter Server saves the files in the PMCache directory specified in the UNIX
configuration file or the cache directory in the Windows registry. If the UNIX PowerCenter
Server does not find a directory there, it creates the index and data files in the installation
directory. If the PowerCenter Server on Windows does not find a directory there, it creates the
files in the system directory.
If a cache file handles more than 2 GB of data, the PowerCenter Server creates multiple index
and data files. When creating these files, the PowerCenter Server appends a number to the
end of the file name, such as PMAGG*.idx1 and PMAGG*.idx2. The number of index and
data files is limited only by the amount of disk space available in the cache directory.
When you run a session, the PowerCenter Server writes a message in the session log indicating
the cache file name and the transformation name. When a session completes, the
PowerCenter Server typically deletes index and data cache files. However, you may find index
and data files in the cache directory under the following circumstances:
♦ The session performs incremental aggregation.
♦ You configure the Lookup transformation to use a persistent cache.
♦ The session does not complete successfully.
The PowerCenter Server uses the following naming convention when it creates cache files:
[<Name Prefix> | <Prefix> <session ID>_<transformation ID>]_[partition
index]<suffix>.[overflow index]
Table 24-2 describes the naming convention for cache files that the PowerCenter Server
creates:
Name Prefix      Cache file name prefix configured in the Lookup transformation.
Partition Index  If the session contains more than one partition, this identifies the
                 partition number. The partition index is zero-based, so the first
                 partition has no partition index. Partition index 2 indicates a cache
                 file created in the third partition.
Overflow Index   If a cache file handles more than 2 GB of data, the PowerCenter Server
                 creates multiple index and data files. When creating these files, the
                 PowerCenter Server appends an overflow index to the file name, such as
                 PMAGG*.idx.1 and PMAGG*.idx.2. The number of index and data files is
                 limited only by the amount of disk space available in the cache
                 directory.
For example, in the file name, PMLKUP8_4_2.idx, PMLKUP identifies the transformation
type as Lookup, 8 is the session ID, 4 is the transformation ID, and 2 is the partition index.
The cache directory should be local to the PowerCenter Server. You might encounter
performance or reliability problems when you cache large quantities of data on a mapped or
mounted drive.
For details on tuning the caches, see “Performance Tuning” on page 635.
Cache Calculations
To determine cache requirements for a session, first add the total column size in the cache to
the row overhead. Multiply the result by the number of groups or rows in the cache. This
gives the minimum caching requirements. To determine the maximum requirements for the
index cache, you multiply the minimum requirements by two.
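The arithmetic above can be sketched in shell, using the Aggregator figures from the example later in this chapter (72,000 groups, 24 bytes of group by columns, 17 bytes of index row overhead):

```shell
groups=72000
key_bytes=24          # total column size in the index cache
index_overhead=17     # per-row overhead for the Aggregator index cache

# minimum = rows * (column size + overhead); maximum index = minimum * 2
min_index=$((groups * (key_bytes + index_overhead)))
max_index=$((min_index * 2))

echo "$min_index"   # 2952000
echo "$max_index"   # 5904000
```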
The following tables provide the calculations for the minimum cache requirements for each
transformation:
Aggregator
Index  # groups [(Σ column size) + 17]   Group by columns.
Data   # groups [(Σ column size) + 7]    - Non group by input ports used in non-aggregate
                                           output expression.
                                         - Non group by input/output ports.
                                         - Local variable ports.
                                         - Column containing aggregate function (multiply
                                           by three).*
* Each aggregate function has different cache space requirements. As a general rule, you can
multiply the column containing the aggregate function by three.
Rank
Index  # groups [(Σ column size) + 17]   Group by columns.
Data   # groups [(# ranks * (Σ column size + 10)) + 20]
                                         - Non group by input ports used in non-aggregate
                                           output expression.
                                         - Non group by input/output ports.
                                         - Local variable ports.
                                         - Rank ports.
Joiner
Index  # master rows [(Σ column size) + 16]   Master columns in the join condition.
Data   # master rows [(Σ column size) + 8]    Master columns not in the join condition and
                                              used for output.
Lookup
Index  # rows in lookup table [(Σ column size) + 16] * 2 (maximum)
                                              Columns in the lookup condition.
Data   # rows in lookup table [(Σ column size) + 8]
                                              - Connected output ports not in the lookup
                                                condition.
                                              - Return port (for unconnected Lookup
                                                transformations).
For more information about each cache, see the separate sections in this chapter.
Table 24-7 lists the column sizes, in bytes, to use in cache calculations for each datatype:
Datatype        Aggregator, Rank    Lookup, Joiner
Date/Time       18                  24
Double          10                  16
Real            10                  16
Integer         6                   16
Small integer   6                   16
The column sizes include the bytes required for a null indicator.
Additionally, to increase lookup and join performance, the PowerCenter Server aligns all data
for lookup and joiner caches on an eight byte boundary. So, each Lookup and Joiner column
includes rounding to the nearest multiple of eight.
Use the column sizes in Table 24-7 on page 618 to add the group by columns.
You know that there are 36 stores and 2,000 items, so the total number of groups is 72,000.
Use the following calculation to determine the minimum index cache requirements:
72,000 * (24 + 17) = 2,952,000
Therefore, this Aggregator transformation requires an index cache size between 2,952,000
and 5,904,000 bytes.
# groups[( Σ column size) + 7] - Non group by input ports used in non-aggregate output expression.
- Non group by input/output ports.
- Local variable ports.
- Port containing aggregate function (multiply by three).*
*The cache space requirements for aggregate functions are different for each function. However, you can multiply the port containing the
aggregate function by three for all aggregate functions.
Use the column sizes in Table 24-7 on page 618 to add the columns in the data cache:
Note that you do not use STORE_ID and ITEM in the data cache calculation. These
columns are connected to the target, but you do not use them in the cache calculation because
they are group by ports and are used in the index cache calculation.
The total number of groups as calculated for the index cache size is 72,000. Use the following
calculation to determine the minimum data cache requirements:
72,000 * (36 + 7) = 3,096,000
Therefore, this Aggregator transformation requires a data cache size of 3,096,000 bytes.
Use the column sizes in Table 24-7 on page 618 to add the columns in the index cache:
PRODUCTS is the master source and has 90,000 rows. Use the following calculation to
determine the minimum index cache requirements:
90,000 * (16 + 16) = 2,880,000
Therefore, this Joiner transformation requires an index cache size between 2,880,000 and
5,760,000 bytes.
# master rows [( Σ column size) + 8] Master column not in join condition and used for output.
Use the column sizes in Table 24-7 on page 618 to add the columns for the data cache:
Note that you do not use ITEM_NO in the data cache calculation because it is part of the
join condition and is used in the index cache.
The master source has 90,000 rows.
Use the following calculation to determine the minimum data cache requirements:
90,000 * (62 + 8) = 6,300,000
Static Cache
When you use a static lookup cache, the PowerCenter Server creates one memory cache for
each partition.
If you use cache partitioning, the PowerCenter Server requires only a portion of the total
memory to cache each partition. So, when you configure cache size, you can divide the total
memory requirements by the number of partitions.
If you do not use cache partitioning, the PowerCenter Server requires as much memory for
each partition as it does for a single partition pipeline. So, when you configure cache size, you
enter the total memory requirements for the transformation.
If two Lookup transformations in a mapping share the cache, the PowerCenter Server does not
allocate additional memory for shared transformations in the same pipeline stage. For shared
transformations in a different pipeline stage, the PowerCenter Server does allocate additional
memory.
Static Lookup transformations that use the same data or a subset of data to create a disk cache
can share the disk cache. However, the lookup keys may be different, so the transformations
must have separate memory caches.
For more information about caching the Lookup transformation, see “Lookup Caches” in the
Transformation Guide.
Dynamic Cache
When you use a dynamic lookup cache, the PowerCenter Server creates the memory cache
based on whether you use cache partitioning or not.
If you use cache partitioning, the PowerCenter Server creates one memory cache for each
partition. It requires only a portion of the total memory to cache each partition. So, when you
configure cache size, you can divide the total memory requirements by the number of
partitions.
Example
The Lookup transformation, LKP_PROMOS, looks up values based on the ITEM_ID. It
uses the following lookup condition:
ITEM_ID = IN_ITEM_ID1
Use the column sizes in Table 24-7 on page 618 to add the columns for the index cache:
The lookup condition uses one column, ITEM_ID, and the table contains 60,000 rows.
Use the following calculation to determine the minimum index cache requirements:
60,000 * (16 + 16) = 1,920,000
Use the following calculation to determine the maximum index cache requirements:
60,000 * (16 + 16) * 2 = 3,840,000
# rows in lookup table [( Σ column size) + 8] Connected output ports not in the lookup condition.
Use return ports for unconnected transformations.
The following figure shows the connected output ports for LKP_PROMOS:
Use the column sizes in Table 24-7 on page 618 to add the columns for the data cache:
For example, a Rank transformation configured to return the top three values receives the
following input values: 10,000, 12,210, 5,000, 2,455, and 6,324.
The PowerCenter Server caches the first three rows (10,000, 12,210, and 5,000). When the
PowerCenter Server reads the next row (2,455) it compares it to the cache values. Since the
row is lower in rank than the cached rows, it discards the row with 2,455. The next row
(6,324), however, is higher in rank than one of the cached rows. Therefore, the PowerCenter
Server replaces the cached row with the higher-ranked input row.
If the Rank transformation is configured to rank across multiple groups, the PowerCenter
Server ranks incrementally for each group it finds.
The PowerCenter Server uses cache partitioning when you create multiple partitions in a
pipeline that contains a Rank transformation. It creates one memory cache and one disk cache
per partition and routes data from one partition to another based on group key values of the
transformation.
After you configure the partitions in the session, you can configure the memory requirements
and cache directories for the Rank transformation on the Mappings tab in session properties.
For more information about the Rank transformation, see “Rank Transformation” in the
Transformation Guide.
Use the column sizes in Table 24-7 on page 618 to add the columns in the index cache:
There are 10,000 product categories, so the total number of groups is 10,000. Use the
following calculation to determine the minimum index cache requirements:
10,000 * (24 + 17) = 410,000
Therefore, this Rank transformation requires an index cache size between 410,000 and
820,000 bytes.
# groups [(# ranks *( Σ column size + 10)) + 20] - Non group by input ports used in non-
aggregate output expression.
- Non group by input/output ports.
- Local variable ports.
- Rank ports.
Use the column sizes in Table 24-7 on page 618 to add the columns in the data cache:
RNK_TOPTEN ranks by price, and the total number of ranks is 10. The number of groups is
10,000.
Use the following calculation to determine the minimum data cache requirements:
10,000[(10 * (46 + 10)) + 20] = 5,800,000
Performance Tuning
Overview
The goal of performance tuning is to optimize session performance by eliminating
performance bottlenecks. To tune the performance of a session, first you identify a
performance bottleneck, eliminate it, and then identify the next performance bottleneck until
you are satisfied with the session performance. You can use the test load option to run sessions
when you tune session performance.
The most common performance bottleneck occurs when the PowerCenter Server writes to a
target database. You can identify performance bottlenecks by the following methods:
♦ Running test sessions. You can configure a test session to read from a flat file source or to
write to a flat file target to identify source and target bottlenecks.
♦ Studying performance details. You can create a set of information called performance
details to identify session bottlenecks. Performance details provide information such as
buffer input and output efficiency. For details about performance details, see “Creating
and Viewing Performance Details” on page 436.
♦ Monitoring system performance. You can use system monitoring tools to view percent
CPU usage, I/O waits, and paging to identify system bottlenecks.
Once you determine the location of a performance bottleneck, you can eliminate the
bottleneck by following these guidelines:
♦ Eliminate source and target database bottlenecks. Have the database administrator
optimize database performance by optimizing the query, increasing the database network
packet size, or configuring index and key constraints.
♦ Eliminate mapping bottlenecks. Fine tune the pipeline logic and transformation settings
and options in mappings to eliminate mapping bottlenecks.
♦ Eliminate session bottlenecks. You can optimize the session strategy and use performance
details to help tune session configuration.
♦ Eliminate system bottlenecks. Have the system administrator analyze information from
system monitoring tools and improve CPU and network performance.
If you tune all the bottlenecks above, you can further optimize session performance by
increasing the number of pipeline partitions in the session. Adding partitions can improve
performance by utilizing more of the system hardware while processing the session.
Because determining the best way to improve performance can be complex, change only one
variable at a time, and time the session both before and after the change. If session
performance does not improve, you might want to return to your original configurations.
Bulk Loading
You can use bulk loading to improve the performance of a session that inserts a large amount
of data to a DB2, Sybase, Oracle, or Microsoft SQL Server database. Configure bulk loading
on the Mapping tab.
When bulk loading, the PowerCenter Server bypasses the database log, which speeds
performance. Without writing to the database log, however, the target database cannot
perform rollback. As a result, you may not be able to perform recovery. Therefore, you must
weigh the importance of improved session performance against the ability to recover an
incomplete session.
External Loading
You can use the External Loader session option to integrate external loading with a session.
If you have a DB2 EE or DB2 EEE target database, you can use the DB2 EE or DB2 EEE
external loaders to bulk load target files. The DB2 EE external loader uses the PowerCenter
Server db2load utility to load data. The DB2 EEE external loader uses the DB2 Autoloader
utility.
If you have a Teradata target database, you can use the Teradata external loader utility to bulk
load target files.
If your target database runs on Oracle, you can use the Oracle SQL*Loader utility to bulk
load target files. When you load data to an Oracle database using a pipeline with multiple
partitions, you can increase performance if you create the Oracle target table with the same
number of partitions you use for the pipeline.
If your target database runs on Sybase IQ, you can use the Sybase IQ external loader utility to
bulk load target files. If your Sybase IQ database is local to the PowerCenter Server on your
UNIX system, you can increase performance by loading data to target tables directly from
named pipes.
For details on the External Loader option, see “External Loading” on page 523.
Caching Lookups
If a mapping contains Lookup transformations, you might want to enable lookup caching. In
general, you want to cache lookup tables that need less than 300MB.
When you enable caching, the PowerCenter Server caches the lookup table and queries the
lookup cache during the session. When this option is not enabled, the PowerCenter Server
queries the lookup table on a row-by-row basis. You can increase performance using a shared
or persistent cache:
♦ Shared cache. You can share the lookup cache between multiple transformations. You can
share an unnamed cache between transformations in the same mapping. You can share a
named cache between transformations in the same or different mappings.
♦ Persistent cache. If you want to save and reuse the cache files, you can configure the
transformation to use a persistent cache. Use this feature when you know the lookup table
does not change between session runs. Using a persistent cache can improve performance
because the PowerCenter Server builds the memory cache from the cache files instead of
from the database.
For more information on lookup caching options, see “Lookup Transformation” in the
Transformation Guide.
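To see why caching helps, compare a row-by-row lookup with one that builds an in-memory cache first. The following is only an illustrative sketch in Python, not PowerCenter internals; the `query_database` callable is a hypothetical stand-in for a database query.

```python
def lookup_uncached(rows, query_database):
    # Queries the database once per source row.
    return [query_database(row["key"]) for row in rows]

def lookup_cached(rows, query_database, lookup_keys):
    # Builds the cache once, then serves every row from memory.
    cache = {key: query_database(key) for key in lookup_keys}
    return [cache.get(row["key"]) for row in rows]
```

When many source rows share lookup keys, the cached version issues one query per distinct key instead of one query per row, which mirrors the benefit described above.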
Optimizing Expressions
As a final step in tuning the mapping, you can focus on the expressions used in
transformations. When examining expressions, focus on complex expressions for possible
simplification. Remove expressions one-by-one to isolate the slow expressions.
Once you locate the slowest expressions, take a closer look at how you can optimize those
expressions.
For example, in the following expression, the PowerCenter Server reads COLUMN_A, finds
its sum, reads COLUMN_B, finds its sum, and then adds the two sums:
SUM(COLUMN_A) + SUM(COLUMN_B)
If you factor out the aggregate function call, as below, the PowerCenter Server adds
COLUMN_A to COLUMN_B for each row, then finds the sum once, which produces the
same result with less work:
SUM(COLUMN_A + COLUMN_B)
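As a rough Python analogy (not Informatica expression syntax), factoring the aggregate means adding per row and summing once, instead of summing each column separately; both forms return the same total:

```python
def sum_separately(col_a, col_b):
    # Analogous to SUM(COLUMN_A) + SUM(COLUMN_B): two aggregate passes.
    return sum(col_a) + sum(col_b)

def sum_factored(col_a, col_b):
    # Analogous to SUM(COLUMN_A + COLUMN_B): add per row, aggregate once.
    return sum(a + b for a, b in zip(col_a, col_b))
```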
Evaluating Expressions
If you are not sure which expressions slow performance, the following steps can help isolate
the problem.
Pipeline Partitioning
If you purchased the partitioning option, you can increase the number of partitions in a
pipeline to improve session performance. Increasing the number of partitions allows the
PowerCenter Server to create multiple connections to sources and process partitions of source
data concurrently.
When you create a session, the Workflow Manager validates each pipeline in the mapping for
partitioning. You can specify multiple partitions in a pipeline if the PowerCenter Server can
maintain data consistency when it processes the partitioned data.
For details on partitioning sessions, see “Pipeline Partitioning” on page 663.
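Conceptually, partitioning lets independent slices of source data move through the pipeline concurrently. The sketch below is only a Python analogy with a trivial stand-in transformation; it is not how the PowerCenter Server itself is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

def process_partition(rows):
    # Stand-in for reading, transforming, and writing one partition's rows.
    return [row * 2 for row in rows]

def run_partitioned(source_rows, num_partitions):
    # Deal rows round-robin into partitions and process them concurrently.
    slices = [source_rows[i::num_partitions] for i in range(num_partitions)]
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = pool.map(process_partition, slices)
    return [row for partition in results for row in partition]
```

The total work is the same; the gain comes from the partitions running at the same time on available CPUs, as the section above describes.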
For example, a session that reads from and writes to a total of 100 sources and targets needs
at least 200 buffer blocks:
100 * 2 = 200
Based on the default settings, you can then either change the DTM Buffer Size to
15,000,000 or change the Default Buffer Block Size to 54,000 to satisfy the formula:
(session Buffer Blocks) = (.9) * (DTM Buffer Size) / (Default Buffer Block
Size) * (number of partitions)
or
200 = .9 * 12000000 / 54000 * 1
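The formula shown above can be evaluated directly. A minimal sketch that just checks the arithmetic of the example (the 90% factor and the values are taken from the formula as shown):

```python
def session_buffer_blocks(dtm_buffer_size, block_size, num_partitions=1):
    # (session buffer blocks) =
    #   (.9) * (DTM Buffer Size) / (Default Buffer Block Size) * (partitions)
    return 0.9 * dtm_buffer_size / block_size * num_partitions

# With the 12,000,000-byte default DTM buffer and a 54,000-byte block
# size, one partition yields about 200 buffer blocks, as in the example.
blocks = session_buffer_blocks(12_000_000, 54_000)
```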
Session Properties Reference
This appendix contains a listing of settings in the session properties. These settings are
grouped by the following tabs:
♦ General Tab, 668
♦ Properties Tab, 670
♦ Config Object Tab, 675
♦ Mapping Tab (Transformations View), 681
♦ Mapping Tab (Partitions View), 705
♦ Components Tab, 710
♦ Metadata Extensions Tab, 718
General Tab
By default, the General tab appears when you edit a session task.
Figure A-1 displays the General tab:
On the General tab you can rename the session task and enter a description for the session
task.
Table A-1 describes settings on the General tab:
Rename Optional The Rename button allows you to enter a new name for the session task.
Description Optional You can enter a description for the session task in the Description field.
Mapping name Required The name of the mapping associated with the session task.
Server Required The name of the server associated with the session task.
Fail Parent if this task fails* Optional Fails the parent worklet or workflow if this task fails.
Fail parent if this task does not run* Optional Fails the parent worklet or workflow if this task does not run.
Treat the input links as AND or OR* Required Runs the task when all or one of the input link conditions evaluate to True.
*Appears only in the Workflow Designer.
Session Log File Name Optional By default, the PowerCenter Server uses the session name for the log file
name: s_mapping name.log. For a debug session, it uses
DebugSession_mapping name.log.
Optionally enter a file name, a file name and directory, or use the
$PMSessionLogFile session parameter. The PowerCenter Server appends
information in this field to that entered in the Session Log File Directory field.
For example, if you have “C:\session_logs\” in the Session Log File Directory
field and enter “logname.txt” in the Session Log File Name field, the
PowerCenter Server writes logname.txt to the C:\session_logs\ directory.
You can also use the $PMSessionLogFile session parameter to represent the
name of the session log or the name and location of the session log. For details
on session parameters, see “Session Parameters” on page 495.
Session Log File Directory Required Designates a location for the session log file. By default, the PowerCenter
Server writes the log file in the server variable directory,
$PMSessionLogFileDir.
If you enter a full directory and file name in the Session Log File Name field,
clear this field.
Parameter File Name Optional Designates the name and directory for the parameter file. Use the parameter
file to define session parameters. You can also use it to override values of
mapping parameters and variables. For details on session parameters, see
“Session Parameters” on page 495. For details on mapping parameters and
variables, see “Mapping Parameters and Variables” in the Designer Guide.
Enable Test Load Optional You can configure the PowerCenter Server to perform a test load.
With a test load, the PowerCenter Server reads and transforms data without
writing to targets. The PowerCenter Server generates all session files, and
performs all pre- and post-session functions, as if running the full session.
The PowerCenter Server writes data to relational targets, but rolls back the
data when the session completes. For all other target types, such as flat file
and SAP BW, the PowerCenter Server does not write data to the targets.
Enter the number of source rows you want to test in the Number of Rows to
Test field.
You cannot perform a test load on sessions using XML sources.
Note: You can perform a test load when you configure a session for normal
mode. If you configure the session for bulk mode, the session fails.
Number of Rows to Test Optional Enter the number of source rows you want the PowerCenter Server to test
load.
The PowerCenter Server reads the exact number you configure for the test
load. You cannot perform a test load when you run a session against a
mapping that contains XML sources.
$Source Connection Value Optional Enter the database connection you want the PowerCenter Server to use for the
$Source variable. Choose a relational or application database connection. You
can also choose a $DBConnection parameter.
You can use the $Source variable in Lookup and Stored Procedure
transformations to specify the database location for the lookup table or stored
procedure.
If you use $Source in a mapping, you can specify the database location in this
field to ensure the PowerCenter Server uses the correct database connection
to run the session.
If you use $Source in a mapping, but do not specify a database connection in
this field, the PowerCenter Server determines which database connection to
use when it runs the session. If it cannot determine the database connection, it
fails the session. For more information, see “Lookup Transformation” and
“Stored Procedure Transformation” in the Transformation Guide.
$Target Connection Value Optional Enter the database connection you want the PowerCenter Server to use for the
$Target variable. Choose a relational or application database connection. You
can also choose a $DBConnection parameter.
You can use the $Target variable in Lookup and Stored Procedure
transformations to specify the database location for the lookup table or stored
procedure.
If you use $Target in a mapping, you can specify the database location in this
field to ensure the PowerCenter Server uses the correct database connection
to run the session.
If you use $Target in a mapping, but do not specify a database connection in
this field, the PowerCenter Server determines which database connection to
use when it runs the session. If it cannot determine the database connection, it
fails the session. For more information, see “Lookup Transformation” and
“Stored Procedure Transformation” in the Transformation Guide.
Treat Source Rows As Required Indicates how the PowerCenter Server treats all source rows. If the mapping
for the session contains an Update Strategy transformation or a Custom
transformation configured to set the update strategy, the default option is Data
Driven.
When you select Data Driven and you load to either a Microsoft SQL Server or
Oracle database, you must use a normal load. If you bulk load, the
PowerCenter Server fails the session.
Commit Type Required Determines whether the PowerCenter Server uses a source-based, target-
based, or user-defined commit. You can choose source- or target-based commit if the
mapping has no Transaction Control transformation or only ineffective
Transaction Control transformations. By default, the PowerCenter Server
performs a target-based commit.
A User-Defined commit is enabled by default if the mapping has effective
Transaction Control transformations.
For details on Commit Intervals, see “Setting Commit Properties” on page 292.
Commit Interval Required In conjunction with the selected commit type, indicates the number of rows at
which the PowerCenter Server commits. By default, the PowerCenter Server
uses a commit interval of 10,000 rows.
This option is not available for user-defined commit.
Commit On End Of File Required By default, this option is enabled and the PowerCenter Server performs a
commit at the end of the file. Clear this option if you want to roll back open
transactions.
transactions.
This option is enabled by default for a target-based commit. You cannot disable
it.
Rollback Transactions on Errors Optional For source-based commit, the PowerCenter Server rolls back the transaction at
the next commit point when it encounters a non-fatal writer error.
For user-defined commit, the PowerCenter Server rolls back the transaction at
the next commit point when it encounters a non-fatal error.
This option is not available for target-based commit.
*Tip: When you bulk load to Microsoft SQL Server or Oracle targets, define a large commit interval. Microsoft SQL
Server and Oracle start a new bulk load transaction after each commit. Increasing the commit interval reduces the
number of bulk load transactions and increases performance.
Performance Settings
You can configure performance settings on the Properties tab. In Performance settings you
can increase memory size, collect performance details, and set configuration parameters.
Figure A-3 displays the Performance settings on the Properties tab:
Performance Settings    Required/Optional    Description
DTM Buffer Size Required The amount of memory allocated to the session from the DTM process. By
default, the Workflow Manager allocates 12 MB for DTM buffer memory. If a
session contains large amounts of character data and you configure it to run in
Unicode mode, increase the DTM Buffer size to 24 MB.
Note: If a source contains a large binary object with a precision larger than the
allocated DTM buffer size, then increase the DTM buffer size to increase the
buffer memory. If you do not increase the DTM buffer memory, the session will
fail.
For information on improving session performance, see “Performance Tuning”
on page 635.
Collect Performance Data Optional When selected, the PowerCenter Server creates session performance details.
Use this file to help determine how you can improve session performance. For
more information, see “Performance Tuning” on page 635.
Incremental Aggregation Optional Select the Incremental Aggregation option if you want the PowerCenter Server
to perform incremental aggregation. For details, see “Using Incremental
Aggregation” on page 573.
Enable High Precision Optional When selected, the PowerCenter Server processes the Decimal datatype to a
precision of 28. If a session does not use the Decimal datatype, leave this
setting clear. For details on using the Decimal datatype with high precision, see
“Handling High Precision Data” on page 204.
Session Retry On Deadlock Optional Select this option if you want the PowerCenter Server to retry target writes on
deadlock. You can only use Session Retry on Deadlock for sessions configured
for normal load. This option is disabled for bulk mode. You can configure the
PowerCenter Server to set the number of deadlock retries and the deadlock
sleep time period.
Session Sort Order Required Specify a sort order for the session. The session properties display all sort
orders associated with the PowerCenter Server code page. When the
PowerCenter Server runs in Unicode mode, it sorts character data in the
session using the selected sort order. When the PowerCenter Server runs in
ASCII mode, it ignores this setting and uses a binary sort order to sort
character data.
Advanced Settings
Advanced settings allow you to configure constraint-based loading, lookup caches, and buffer
sizes.
Table A-4 describes the Advanced settings of the Config Object tab:
Advanced Settings    Required/Optional    Description
Constraint Based Load Ordering Optional The PowerCenter Server loads targets based on primary key-foreign key
constraints where possible.
Cache Lookup() Function Optional If selected, the PowerCenter Server caches PowerMart 3.5 LOOKUP functions
in the mapping, overriding mapping-level LOOKUP configurations.
If not selected, the PowerCenter Server performs lookups on a row-by-row
basis, unless otherwise specified in the mapping.
Default Buffer Block Size Optional This setting is performance related. For details on performance tuning, see
“Performance Tuning” on page 635.
Note: The session must have enough buffer blocks to initialize. The minimum
number of buffer blocks must be greater than the total number of sources
(Source Qualifiers, Normalizers for COBOL sources), and targets. The number
of buffer blocks in a session = DTM Buffer Size / Buffer Block Size. Default
settings create enough buffer blocks for 83 sources and targets. If the session
contains more than 83, you might need to increase DTM Buffer Size or
decrease Default Buffer Block Size.
Line Sequential Buffer Length Optional Affects the way the PowerCenter Server reads flat files. Increase this setting
from the default of 1024 bytes per line only if source flat file records are larger
than 1024 bytes.
Log Options Settings    Required/Optional    Description
Save Session Log By Required If you select Save Session Log by Timestamp, the PowerCenter Server
saves all session logs, appending a timestamp to each log.
If you select Save Session Log by Runs, the PowerCenter Server saves
a designated number of session logs. Configure the number of sessions
in the Save Session Log for These Runs option.
You can also use the $PMSessionLogCount server variable to save the
configured number of session logs for the PowerCenter Server.
For details on these options, see “Configuring Session Logs” on
page 469.
Save Session Log for These Runs Required The number of historical session logs you want the PowerCenter Server
to save.
The PowerCenter Server saves the number of historical logs you specify, plus the
most recent session log. Therefore, if you specify 5 runs, the
PowerCenter Server saves the most recent session log, plus historical
logs 0-4, for a total of 6 logs.
You can specify up to 2,147,483,647 historical logs. If you specify 0 logs,
the PowerCenter Server saves only the most recent session log.
Table A-6 describes the Error handling settings of the Config Object tab:
Stop On Errors Optional Indicates how many non-fatal errors the PowerCenter Server can
encounter before it stops the session. Non-fatal errors include reader,
writer, and DTM errors. Enter the number of non-fatal errors you want to
allow before stopping the session. The PowerCenter Server maintains an
independent error count for each source, target, and transformation. If
you specify 0, non-fatal errors do not cause the session to stop.
Optionally use the $PMSessionErrorThreshold server variable to stop on
the configured number of errors for the PowerCenter Server.
Override Tracing Optional Overrides tracing levels set at the transformation level. Selecting this
option enables a menu from which you choose a tracing level: None,
Terse, Normal, Verbose Initialization, or Verbose Data. For details on
tracing levels, see “Configuring Session Logs” on page 469.
On Stored Procedure Optional Required if the session uses pre- or post-session stored procedures.
Error If you select Stop Session, the PowerCenter Server stops the session on
errors executing a pre-session or post-session stored procedure.
If you select Continue Session, the PowerCenter Server continues the
session regardless of errors executing pre-session or post-session stored
procedures.
By default, the PowerCenter Server stops the session on Stored
Procedure error and marks the session failed.
On Pre-Post SQL Error Optional Required if the session uses pre- or post-session SQL.
If you select Stop Session, the PowerCenter Server stops the session on
errors executing pre-session or post-session SQL.
If you select Continue, the PowerCenter Server continues the session
regardless of errors executing pre-session or post-session SQL.
By default, the PowerCenter Server stops the session upon pre- or post-
session SQL error and marks the session failed.
Enable Recovery Optional Enables recovery for the session. For details on recovery, see
“Recovering Data” on page 295.
Error Log Type Required Specifies the type of error log to create. You can specify relational, file, or
no log. By default, the Error Log Type is set to none.
Error Log DB Connection Optional Specifies the database connection for a relational error log.
Error Log Table Name Prefix Optional Specifies the table name prefix for a relational error log. Oracle and Sybase
have a 30 character limit for table names. If a table name exceeds 30
characters, the session fails.
Error Log File Directory Optional Specifies the directory where errors are logged. By default, the error log
file directory is $PMBadFilesDir\.
Error Log File Name Optional Specifies error log file name. By default, the error log file name is
PMError.log.
Log Row Data Optional Specifies whether or not to log row data. By default, the check box is clear
and row data is not logged.
Log Source Row Data Optional Specifies whether or not to log source row data. By default, the check box
is clear and source row data is not logged.
Data Column Delimiter Optional Delimiter for string type source row data and transformation group row
data. By default, the PowerCenter Server uses a pipe ( | ) delimiter. Verify
that you do not use the same delimiter for the row data as the error
logging columns. If you use the same delimiter, you may find it difficult to
read the error log file.
Connections Node
The Connections node displays the source, target, lookup, stored procedure, FTP, external
loader, and queue connections. You can choose connection types and connection values. You
can also edit connection object values.
Figure A-7 displays the Connections settings on the Mapping tab:
Connections Node Settings    Required/Optional    Description
Type Required Enter the connection type for relational and non-relational sources and targets.
Specifies Relational for relational sources and targets.
You can choose the following connection types for flat file, XML, and MQSeries
sources/targets:
- Queue. Select this connection type to access a MQSeries source if you are
using MQ Source Qualifiers. For static MQSeries targets, set the connection
type to FTP or Queue. For dynamic MQSeries targets, the connection type is
set to Queue. MQSeries connections must be defined in the Workflow
Manager prior to configuring sessions. For more information, see the
PowerCenter Connect for IBM MQSeries User and Administrator Guide .
- Loader. Select this connection type to use the External Loader to load output
files to Teradata, Oracle, DB2, or Sybase IQ databases. If you select this
option, select a configured loader connection in the Value column.
To use this option, you must use a mapping with a relational target definition
and choose File as the writer type on the Writers tab for the relational target
instance. As the PowerCenter Server completes the session, it uses an
external loader to load target files to the Oracle, Sybase IQ, DB2, or Teradata
database. You cannot choose external loader for flat file or XML target
definitions in the mapping.
Note to Oracle 8 users: If you configure a session to write to an Oracle 8
external loader target table in bulk mode with NOT NULL constraints on any
columns, the session may write the null character into a NOT NULL column if
the mapping generates a NULL output.
For details on using the external loader feature, see “External Loading” on
page 523.
- FTP. Select this connection type to use FTP to access the source/target
directory for flat file and XML sources/targets. If you select this option, select
a configured FTP connection in the Value column. FTP connections must be
defined in the Workflow Manager prior to configuring sessions. For details on
using FTP, see “Using FTP” on page 559.
- None. Choose None when you want to read from a local flat file or XML file, or
if you are using an associated source for a MQSeries session.
The Type column also lists the connections in the mapping, such as the $Source
connection value and $Target connection value.
You can also configure connection information for Lookups and Stored
Procedures.
Value Required Enter a source and target connection based on the value you choose in the
Type column. You can also specify the $Source and $Target connection value:
- $Source connection value. Enter the database connection you want the
PowerCenter Server to use for the $Source variable. Choose a relational or
application database connection. You can also choose a $DBConnection
parameter. You can use the $Source variable in Lookup and Stored
Procedure transformations to specify the database location for the lookup
table or stored procedure. If you use $Source in a mapping, you can specify
the database location in this field to ensure the PowerCenter Server uses the
correct database connection to run the session. If you use $Source in a
mapping, but do not specify a database connection in this field, the
PowerCenter Server determines which database connection to use when it
runs the session. If it cannot determine the database connection, it fails the
session. For more information, see the Transformation Guide.
- $Target connection value. Enter the database connection you want the
PowerCenter Server to use for the $Target variable. Choose a relational or
application database connection. You can also choose a $DBConnection
parameter. You can use the $Target variable in Lookup and Stored Procedure
transformations to specify the database location for the lookup table or stored
procedure. If you use $Target in a mapping, you can specify the database
location in this field to ensure the PowerCenter Server uses the correct
database connection to run the session. If you use $Target in a mapping, but
do not specify a database connection in this field, the PowerCenter Server
determines which database connection to use when it runs the session. If it
cannot determine the database connection, it fails the session. For more
information, see the Transformation Guide.
You can also specify the lookup and stored procedure location information
value, if your mapping has lookups or stored procedures.
Sources Node
The Sources node lists the sources used in the session and displays their settings. If you want
to view and configure the settings of a specific source, select the source from the list.
You can configure the following settings:
♦ Readers. The Readers settings display the reader the PowerCenter Server uses with each
source instance. For more information, see “Readers Settings” on page 684.
♦ Connections. The Connections settings allow you to configure connections for the
sources. For more information, see “Connections Settings” on page 684.
♦ Properties. The Properties settings allow you to configure the source properties. For more
information, see “Properties Settings” on page 686.
Connections Settings
You can configure the connections the PowerCenter Server uses with each source instance.
Table A-8 describes the Connections settings on the Mapping tab (Sources node):
Connections Settings    Required/Optional    Description
Type Required Enter the connection type for relational and non-relational sources. Specifies
Relational for relational sources.
You can choose the following connection types for flat file, XML, and MQSeries
sources:
- Queue. Select this connection type to access a MQSeries source if you are using
MQ Source Qualifiers. MQSeries connections must be defined in the Workflow
Manager prior to configuring sessions. For more information, see the PowerCenter
Connect for IBM MQSeries User and Administrator Guide .
- FTP. Select this connection type to use FTP to access the source directory for flat
file and XML sources. If you want to extract data from a flat file or XML source
using FTP, you must specify an FTP connection when you configure source
options. If you select this option, select a configured FTP connection in the Value
column. FTP connections must be defined in the Workflow Manager prior to
configuring sessions. For details on using FTP, see “Using FTP” on page 559.
- None. Choose None when you want to read from a local flat file or XML file, or if
you are using an associated source for a MQSeries session.
Value Required Enter a source connection based on the value you choose in the Type column.
Table A-9 describes Properties settings on the Mapping tab for relational sources:
Table A-9. Mapping Tab - Sources Node - Properties Settings (Relational Sources)
Relational Source Options    Required/Optional    Description
User Defined Join Optional Specifies the condition used to join data from multiple sources
represented in the same Source Qualifier transformation. For
more information about user defined join, see “Source
Qualifier Transformation” in the Transformation Guide.
Tracing Level N/A Specifies the amount of detail included in the session log
when you run a session containing this transformation. You
can view the value of this attribute when you click Show all
properties. For more information about tracing level, see
“Setting Tracing Levels” on page 473.
Pre SQL Optional Pre-session SQL commands to run against the source
database before the PowerCenter Server reads the source.
For more information about pre-session SQL, see “Using Pre-
and Post-Session SQL Commands” on page 186.
Post SQL Optional Post-session SQL commands to run against the source
database after the PowerCenter Server writes to the target.
For more information about post-session SQL, see “Using
Pre- and Post-Session SQL Commands” on page 186.
Sql Query Optional Defines a custom query that replaces the default query the
PowerCenter Server uses to read data from sources
represented in this Source Qualifier. A custom query overrides
entries for a custom join or a source filter. For more
information, see “Overriding the SQL Query” on page 216.
Source Filter Optional Specifies the filter condition the PowerCenter Server applies
when querying records. For more information, see “Source
Qualifier Transformation” in the Transformation Guide.
Table A-10 describes the Properties settings on the Mapping tab for file sources:
Table A-10. Mapping Tab - Sources Node - Properties Settings (File Sources)
Source File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server looks
in the server variable directory, $PMSourceFileDir, for file sources.
If you specify both the directory and file name in the Source Filename field,
clear this field. The PowerCenter Server concatenates this field with the Source
Filename field when it runs the session.
You can also use the $InputFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Source Filename Required Enter the file name, or file name and path. Optionally use the $InputFileName
session parameter for the file name.
The PowerCenter Server concatenates this field with the Source File Directory
field when it runs the session. For example, if you have “C:\data\” in the Source
File Directory field, then enter “filename.dat” in the Source Filename field.
When the PowerCenter Server begins the session, it looks for
“C:\data\filename.dat”.
By default, the Workflow Manager enters the file name configured in the source
definition.
For details on session parameters, see “Session Parameters” on page 495.
Source Filetype Required Allows you to configure multiple file sources using a file list.
Indicates whether the source file contains the source data, or a list of files with
the same file properties. Choose Direct if the source file contains the source
data. Choose Indirect if the source file contains a list of files.
When you select Indirect, the PowerCenter Server finds the file list then reads
each listed file when it executes the session. For details on file lists, see “Using
a File List” on page 230.
Set File Properties Optional Allows you to configure the file properties. For more information, see “Setting
File Properties for Sources” on page 688.
Datetime Format* N/A Displays the datetime format for datetime fields.
Decimal Separator* N/A Displays the decimal separator for numeric fields.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the
Designer Guide.
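The Direct and Indirect file types above can be sketched as follows. This is an illustration of the behavior, not PowerCenter internals, and the file names used are hypothetical:

```python
# Illustrative sketch of the Direct vs. Indirect Source Filetype setting.
# Direct: the file contains the source data itself.
# Indirect: the file is a file list naming the actual data files.
def read_source(path, filetype="Direct"):
    with open(path) as f:
        if filetype == "Direct":
            return f.read().splitlines()      # file holds the source rows
        rows = []
        for listed in f.read().splitlines():  # file holds names of data files
            with open(listed.strip()) as data:
                rows.extend(data.read().splitlines())
        return rows
```

With an indirect source, every file named in the list must share the same file properties, as the table above notes.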
Select the file type (fixed-width or delimited) you want to configure and click Advanced.
Figure A-12 displays the Fixed Width Properties dialog box for flat file sources:
Table A-11 describes the options you define in the Fixed Width Properties dialog box for
sources:
Fixed-Width Properties Options | Required/Optional | Description
Null Character: Text/Binary Required Indicates the character representing a null value in the file. This can be any valid character in the file code page, or any binary value from 0 to 255. For more information about specifying null characters, see “Null Character Handling” on page 227.
Repeat Null Character Optional If selected, the PowerCenter Server reads repeated null characters in a single field as a single null value. If you do not select this option, the PowerCenter Server reads a single null character at the beginning of a field as a null field. Important: For multibyte code pages, Informatica recommends that you specify a single-byte null character if you are using repeating non-binary null characters. This ensures that repeating null characters fit into the column exactly. For more information about specifying null characters, see “Null Character Handling” on page 227.
Code Page Required Select the code page of the fixed-width file. The default setting is the client
code page.
Number of Initial Rows to Skip Optional The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File Format option, the PowerCenter Server ignores this option. You can enter any integer from zero to 2147483647.
Number of Bytes to Skip Between Records Optional The PowerCenter Server skips the specified number of bytes between records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return and line feed appear at the end of each line. If you want the PowerCenter Server to skip these two single-byte characters, enter 2.
If you have an ASCII file on UNIX with one record for each line, ending in a
carriage return, skip the single character by entering 1.
Strip Trailing Blanks Optional If selected, the PowerCenter Server strips trailing blank spaces from records
before passing them to the Source Qualifier transformation.
Line Sequential File Format Optional Select this option if the file uses a carriage return at the end of each record, shortening the final column.
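Two of the options above, Number of Initial Rows to Skip and Number of Bytes to Skip Between Records (for example, 2 to skip a Windows carriage return and line feed), can be sketched in a minimal fixed-width reader. This is an illustration only, not PowerCenter code:

```python
# Illustrative fixed-width reader showing Number of Initial Rows to Skip and
# Number of Bytes to Skip Between Records (not PowerCenter internals).
def read_fixed_width(data: bytes, record_len: int, skip_rows=0, skip_bytes=0):
    records, pos = [], 0
    while pos + record_len <= len(data):
        records.append(data[pos:pos + record_len].decode("ascii"))
        pos += record_len + skip_bytes    # e.g. 2 skips a trailing CR/LF pair
    return records[skip_rows:]            # drop the configured header rows
```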
Figure A-13 displays the Delimited File Properties dialog box for flat file sources:
Delimiters Required Character used to separate columns of data in the source file. Use the
Browse button to the right of this field to enter a different delimiter. Delimiters
can be either printable or single-byte unprintable characters, and must be
different from the escape character and the quote character (if selected). You
cannot select unprintable multibyte characters as delimiters. The delimiter
must be in the same code page as the flat file code page.
Optional Quotes Required Select None, Single, or Double. If you select a quote character, the
PowerCenter Server ignores delimiter characters within the quote characters.
Therefore, the PowerCenter Server uses quote characters to escape the
delimiter.
For example, a source file uses a comma as a delimiter and contains the following row: 342-3849, 'Smith, Jenna', 'Rockville, MD', 6.
If you select the optional single quote character, the PowerCenter Server
ignores the commas within the quotes and reads the row as four fields.
If you do not select the optional single quote, the PowerCenter Server reads
six separate fields.
When the PowerCenter Server reads two optional quote characters within a
quoted string, it treats them as one quote character. For example, the
PowerCenter Server reads the following quoted string as I'm going
tomorrow:
2353, 'I''m going tomorrow.', MD
Additionally, if you select an optional quote character, the PowerCenter
Server only reads a string as a quoted string if the quote character is the first
character of the field.
Note: You can improve session performance if the source file does not
contain quotes or escape characters.
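The quote handling described above can be reproduced with Python's csv module, which follows the same rules: delimiters inside the selected quote character are ignored, and a doubled quote inside a quoted string is read as one quote character. A sketch, not PowerCenter code:

```python
import csv

# The manual's example row, with the single quote as the quote character.
row = "342-3849,'Smith, Jenna','Rockville, MD',6"

quoted = next(csv.reader([row], quotechar="'"))             # 4 fields
unquoted = next(csv.reader([row], quoting=csv.QUOTE_NONE))  # 6 fields

# A doubled quote inside a quoted string reads as one quote character.
escaped = next(csv.reader(["2353,'I''m going tomorrow.',MD"], quotechar="'"))
```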
Code Page Required Select the code page of the delimited file. The default setting is the client
code page.
Remove Escape Character From Data Optional This option is selected by default. Clear this option to include the escape character in the output string.
Treat Consecutive Delimiters as One Optional By default, the PowerCenter Server reads pairs of delimiters as a null value. If selected, the PowerCenter Server reads any number of consecutive delimiter characters as one.
For example, a source file uses a comma as the delimiter character and
contains the following record: 56, , , Jane Doe. By default, the PowerCenter
Server reads that record as four columns separated by three delimiters: 56,
NULL, NULL, Jane Doe. If you select this option, the PowerCenter Server
reads the record as two columns separated by one delimiter: 56, Jane Doe.
Number of Initial Rows to Skip Optional The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip title or header rows in the file.
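The Treat Consecutive Delimiters as One behavior above can be sketched with a regular expression. Spaces are omitted from the manual's "56, , , Jane Doe" example for simplicity; this is an illustration, not PowerCenter code:

```python
import re

record = "56,,,Jane Doe"

# Default: each pair of delimiters bounds a column, so empties read as NULL.
default_cols = record.split(",")

# Treat Consecutive Delimiters as One: a run of delimiters reads as one.
collapsed = re.split(",+", record)
```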
Targets Node
The Targets node lists the targets used in the session and displays their settings. To view and configure the settings of a specific target, select the target from the list.
You can configure the following settings:
♦ Writers. The Writers settings displays the writer the PowerCenter Server uses with each
target instance. For more information, see “Writers Settings” on page 692.
♦ Connections. The Connections settings allows you to configure connections for the
targets. For more information, see “Connections Settings” on page 693.
♦ Properties. The Properties settings allows you to configure the target properties. For more
information, see “Properties Settings” on page 695.
Writers Settings
You can view and configure the writer the PowerCenter Server uses with each target instance. The Workflow Manager specifies the necessary writer for each target instance: the Relational Writer for relational targets, and the File Writer for file targets.
Table A-13 describes the Writers settings on the Mapping tab (Targets node):
Writers Setting | Required/Optional | Description
Writers Required For relational targets, choose Relational Writer or File Writer. When the target in the
mapping is a flat file, an XML file, a SAP BW target, or MQ target, the Workflow
Manager specifies the necessary writer in the session properties.
When you choose File Writer for a relational target you can use an external loader
to load data to this target. For more information, see “External Loading” on
page 523.
When you override a relational target to use the file writer, the Workflow Manager
changes the properties for that target instance on the Properties settings. It also
changes the connection options you can define on the Connections settings.
After you override a relational target to use a file writer, define the file properties for
the target. Click Set File Properties and choose the target to define. For more
information, see “Configuring Fixed-Width Properties” on page 265 and “Configuring
Delimited Properties” on page 266.
Connections Settings
You can enter connection types and specific target database connections on the Targets node
of the Mappings tab.
Connections Settings | Required/Optional | Description
Type Required Enter the connection type for non-relational targets. The Workflow Manager specifies Relational for relational targets.
You can choose the following connection types for flat file, XML, and MQ
targets:
- FTP. Select this connection type to use FTP to access the target directory for
flat file and XML targets. If you want to load data to a flat file or XML target
using FTP, you must specify an FTP connection when you configure target
options. If you select this option, select a configured FTP connection in the
Value column. FTP connections must be defined in the Workflow Manager
prior to configuring sessions. For details on using FTP, see “Using FTP” on
page 559.
- External Loader. Select this connection type to use the External Loader to
load output files to Teradata, Oracle, DB2, or Sybase IQ databases. If you
select this option, select a configured loader connection in the Value column.
To use this option, you must use a mapping with a relational target definition
and choose File as the writer type on the Writers tab for the relational target
instance. As the PowerCenter Server completes the session, it uses an
external loader to load target files to the Oracle, Sybase IQ, DB2, or Teradata
database. You cannot choose external loader for flat file or XML target
definitions in the mapping.
Note to Oracle 8 users: If you configure a session to write to an Oracle 8
external loader target table in bulk mode with NOT NULL constraints on any
columns, the session may write the null character into a NOT NULL column if
the mapping generates a NULL output.
For details on using the external loader feature, see “External Loading” on
page 523.
- Queue. Choose Queue when you want to output to an MQSeries message
queue. If you select this option, select a configured MQ connection in the
Value column. For more information, see the PowerCenter Connect for IBM
MQSeries User and Administrator Guide.
- None. Choose None when you want to write to a local flat file or XML file.
Value Required Enter a target connection based on the value you choose in the Type column.
Properties Settings
Click the Properties settings to define target property information. The Workflow Manager
displays different properties for the different target types: relational, flat file, and XML.
Target Property | Required/Optional | Description
Insert Optional If selected, the PowerCenter Server inserts all rows flagged for insert.
By default, this option is selected.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Update (as Update) Optional If selected, the PowerCenter Server updates all rows flagged for update.
By default, this option is selected.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Update (as Insert) Optional If selected, the PowerCenter Server inserts all rows flagged for update.
By default, this option is not selected.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Update (else Insert) Optional If selected, the PowerCenter Server updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Delete Optional If selected, the PowerCenter Server deletes all rows flagged for delete.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Truncate Table Optional If selected, the PowerCenter Server truncates the target before loading. For
details on this feature, see “Truncating Target Tables” on page 245.
Reject File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server
writes all reject files to the server variable directory, $PMBadFileDir.
If you specify both the directory and file name in the Reject Filename field,
clear this field. The PowerCenter Server concatenates this field with the
Reject Filename field when it runs the session.
You can also use the $BadFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Reject Filename Required Enter the file name, or file name and path. By default, the PowerCenter
Server names the reject file after the target instance name:
target_name.bad. Optionally use the $BadFileName session parameter for
the file name.
The PowerCenter Server concatenates this field with the Reject File
Directory field when it runs the session. For example, if you have
“C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in
the Reject Filename field, the PowerCenter Server writes rejected rows to
C:\reject_file\filename.bad.
For details on session parameters, see “Session Parameters” on page 495.
Reject Truncated/Overflowed Rows* Optional Instructs the PowerCenter Server to write the truncated and overflowed rows to the reject file.
Table Name Prefix Optional Specify the owner of the target tables.
Pre SQL Optional You can enter pre-session SQL commands for a target instance in a
mapping to execute commands against the target database before the
PowerCenter Server reads the source.
Post SQL Optional Enter post-session SQL commands to execute commands against the target
database after the PowerCenter Server writes to the target.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the
Designer Guide.
Table A-16 describes the Properties settings on the Mapping tab for file targets:
Target Property | Required/Optional | Description
Merge Partitioned Files Optional When selected, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. If the PowerCenter Server fails to create the merged file, it does not delete the individual output files.
You cannot merge files if the session uses FTP, an external loader, or a
message queue.
For details on configuring a session for partitioning, see “Pipeline Partitioning”
on page 345.
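The Merge Partitioned Files semantics above (merge on completion, delete the partition files only if the merge succeeds) can be sketched as follows. The file names are hypothetical and this is not PowerCenter code:

```python
import os

# Concatenate partition output files into one merge file. The per-partition
# files are deleted only if the merge succeeds, mirroring the rule above.
def merge_partition_files(part_files, merge_path):
    try:
        with open(merge_path, "wb") as merged:
            for part in part_files:
                with open(part, "rb") as f:
                    merged.write(f.read())
    except OSError:
        return False          # merge failed: keep the individual output files
    for part in part_files:
        os.remove(part)       # merge succeeded: delete the individual files
    return True
```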
Merge File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server writes the merged file in the server variable directory, $PMTargetFileDir.
If you enter a full directory and file name in the Merge File Name field, clear
this field.
Merge File Name Optional Name of the merge file. Default is target_name.out. This property is required if
you select Merge Partitioned Files.
Output File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server writes output files in the server variable directory, $PMTargetFileDir.
If you specify both the directory and file name in the Output Filename field,
clear this field. The PowerCenter Server concatenates this field with the Output
Filename field when it runs the session.
You can also use the $OutputFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Output Filename Required Enter the file name, or file name and path. By default, the Workflow Manager
names the target file based on the target definition used in the mapping:
target_name.out.
If the target definition contains a slash character, the Workflow Manager
replaces the slash character with an underscore.
When you use an external loader to load to an Oracle database, you must
specify a file extension. If you do not specify a file extension, the Oracle loader
cannot find the flat file and the PowerCenter Server fails the session. For more
information about external loading, see “Loading to Oracle” on page 533.
Enter the file name, or file name and path. Optionally use the $OutputFileName
session parameter for the file name.
The PowerCenter Server concatenates this field with the Output File Directory
field when it runs the session.
For details on session parameters, see “Session Parameters” on page 495.
Note: If you specify an absolute path file name when using FTP, the
PowerCenter Server ignores the Default Remote Directory specified in the FTP
connection. When you specify an absolute path file name, do not use single or
double quotes.
Reject File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir.
If you specify both the directory and file name in the Reject Filename field,
clear this field. The PowerCenter Server concatenates this field with the Reject
Filename field when it runs the session.
You can also use the $BadFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Reject Filename Required Enter the file name, or file name and path. By default, the PowerCenter Server
names the reject file after the target instance name: target_name.bad.
Optionally use the $BadFileName session parameter for the file name.
The PowerCenter Server concatenates this field with the Reject File Directory
field when it runs the session. For example, if you have “C:\reject_file\” in the
Reject File Directory field, and enter “filename.bad” in the Reject Filename
field, the PowerCenter Server writes rejected rows to
C:\reject_file\filename.bad.
For details on session parameters, see “Session Parameters” on page 495.
Set File Properties Optional Allows you to configure the file properties. For more information, see “Setting
File Properties for Targets” on page 701.
Datetime Format* N/A Displays the datetime format selected for datetime fields.
Decimal Separator* N/A Displays the decimal separator for numeric fields.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the
Designer Guide.
Select the file type (fixed-width or delimited) you want to configure and click Advanced.
Table A-17 describes the options you define in the Fixed Width Properties dialog box:
Fixed-Width Properties Options | Required/Optional | Description
Null Character Required Enter the character you want the PowerCenter Server to use to represent
null values. You can enter any valid character in the file code page.
For more information about specifying null characters for target files, see
“Null Characters in Fixed-Width Files” on page 272.
Repeat Null Character Optional Select this option to indicate a null value by repeating the null character to
fill the field. If you do not select this option, the PowerCenter Server enters
a single null character at the beginning of the field to represent a null
value. For more information about specifying null characters for target
files, see “Null Characters in Fixed-Width Files” on page 272.
Code Page Required Select the code page of the fixed-width file. The default setting is the client
code page.
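The two null representations described above, a single null character at the start of the field versus the null character repeated to fill the field, can be sketched as follows (illustrative only, not PowerCenter code):

```python
# How a null value fills a fixed-width target column of a given width,
# with and without the Repeat Null Character option.
def null_field(width, null_char, repeat=False):
    if repeat:
        return null_char * width            # null character fills the field
    return null_char + " " * (width - 1)    # single null character, then padding
```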
Delimiters Required Character used to separate columns of data. Use the Browse button to the right
of this field to enter a non-printable delimiter. Delimiters can be either printable
or single-byte unprintable characters, and must be different from the escape
character and the quote character (if selected). You cannot select unprintable
multibyte characters as delimiters.
Optional Quotes Required Select No Quotes, Single Quote, or Double Quotes. If you select a quote
character, the PowerCenter Server does not treat delimiter characters within
the quote characters as a delimiter. For example, suppose an output file uses a
comma as a delimiter and the PowerCenter Server receives the following row:
342-3849, 'Smith, Jenna', 'Rockville, MD', 6.
If you select the optional single quote character, the PowerCenter Server
ignores the commas within the quotes and writes the row as four fields.
If you do not select the optional single quote, the PowerCenter Server writes
six separate fields.
Code Page Required Select the code page of the delimited file. The default setting is the client code
page.
Transformations Node
On the Transformations node, you can override properties that you configure in
transformation and target instances in a mapping. The attributes you can configure depend on the type of transformation you select.
HashKeys Node
On the HashKeys node, you can configure hash key partitioning. Select Edit Keys to edit the partition key. For more information, see “Edit Partition Key” on page 708.
Partition Points Node | Description
Add Partition Point Click to add a new partition point to the Transformation list. For information on adding partition
points, see “Adding and Deleting Partition Points” on page 353.
Delete Partition Click to delete the current partition point. You cannot delete certain partition points. For details,
Point see “Adding and Deleting Partition Points” on page 353.
Edit Keys Click to add, remove, or edit the key for key range or hash user keys partitioning. This button is
not available for auto-hash, round-robin, or pass-through partitioning.
For more information on adding keys and key ranges, see “Adding Keys and Key Ranges” on
page 358.
Table A-20 describes the options in the Edit Partition Point dialog box:
Add button Click to add a partition. You can add up to 64 partitions. For more information on
adding partitions, see “Adding and Deleting Partitions” on page 356.
Delete button Click to delete the selected partition. For more information on deleting partitions, see
“Adding and Deleting Partitions” on page 356.
Select Partition Type Select a partition type from the list. For more information, see “Specifying Partition
Types” on page 356.
You can specify one or more ports as the partition key. To rearrange the order of the ports that
make up the key, select a port in the Selected Ports list and click the up or down arrow.
For information on adding a key for key range partitioning, see “Key Range Partition Type”
on page 363. For information on adding a key for hash partitioning, see “Hash Keys Partition
Types” on page 361.
Task n/a Tasks you can perform in the Components tab. You can configure pre- or post-
session shell commands and success or failure email messages in the
Components tab.
Type Required Select None if you do not want to configure commands and emails in the
Components tab.
For pre- and post-session commands, select Reusable to call an existing
reusable Command task as the pre- or post-session shell command. Select
Non-Reusable to create pre- or post-session shell commands for this session
task.
For success or failure emails, select Reusable to call an existing Email task as
the success or failure email. Select Non-Reusable to create email messages
for this session task.
Pre-Session Command Optional Shell commands that the PowerCenter Server performs at the beginning of a session. For details on using pre-session shell commands, see “Using Pre- or Post-Session Shell Commands” on page 188.
Post-Session Success Command Optional Shell commands that the PowerCenter Server performs after the session completes successfully. For details on using post-session shell commands, see “Using Pre- or Post-Session Shell Commands” on page 188.
Post-Session Failure Command Optional Shell commands that the PowerCenter Server performs if the session fails. For details on using post-session shell commands, see “Using Pre- or Post-Session Shell Commands” on page 188.
On Success Email Optional The PowerCenter Server sends the On Success email message if the session completes successfully.
On Failure Email Optional The PowerCenter Server sends the On Failure email message if the session fails.
Click the Override button to override the Run If Previous Completed option in the
Command task. For details on the Run If Previous Completed option, see Table A-24 on
page 714.
Table A-23 describes the General tab for editing pre- or post-session shell commands:
Name Required Enter a name for the pre- or post-session shell command.
Make Reusable Required Select Make Reusable to create a reusable Command task from the pre- or
post-session shell commands.
Clear the Make Reusable option if you do not want the Workflow Manager to
create a reusable Command task from the shell commands.
For details on creating Command tasks from pre- or post-session shell
commands, see “Creating a Reusable Command Task from Pre- or Post-
Session Commands” on page 191.
Description Optional Enter a description for the pre- or post-session shell command.
Properties Tab for Pre- or Post-Session Commands | Required/Optional | Description
Run If Previous Completed Required Select this option if you want the PowerCenter Server to perform the next command only if the previous command completed successfully.
Table A-25 describes the Commands tab for editing pre- or post-session commands:
Commands Tab for Pre- or Post-Session Commands | Required/Optional | Description
Command Required The shell command you want the PowerCenter Server to perform. Enter one
command for each line. You can use session parameters or server variables in
shell commands.
If your command contains spaces, enclose the command in quotes. For
example, if you want to call c:\program files\myprog.exe, you must enter
“c:\program files\myprog.exe”, including the quotes. Enter only one command
on each line.
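The quoting rule above can be demonstrated with Python's shlex module, which tokenizes a command line the way a shell does. The program path is hypothetical:

```python
import shlex

# An unquoted path containing a space splits into separate tokens,
# so the command would not resolve to the intended program.
unquoted = shlex.split("/opt/my programs/myprog.exe -run")   # 3 tokens

# Quoting the path keeps it as a single command token.
quoted = shlex.split('"/opt/my programs/myprog.exe" -run')   # 2 tokens
```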
Reusable Email
Select Reusable in the Type field for the On-Success or On-Failure email if you want to select
an existing Email task as the On-Success or On-Failure email. The Email Object Browser
appears when you click the right side of the Values field.
Select an Email task to use as On-Success or On-Failure email. Click the Override button to
override properties of the email. For more information about email properties, see Table A-27
on page 717.
Non-Reusable Email
Select Non-Reusable in the Type field to create a non-reusable email for the session. Non-
Reusable emails do not appear as Email tasks in the Task folder. Click the right side of the
Values field to edit the properties for the non-reusable On-Success or On-Failure emails. For
more information about email properties, see Table A-27 on page 717.
Email Properties
You configure email properties for On-Success or On-Failure Emails when you override an
existing Email task or when you create a non-reusable email for the session.
Table A-26 describes general settings for editing On-Success or On-Failure emails:
Email Settings | Required/Optional | Description
Name Required Enter a name for the email you want to configure.
Description Required Enter a description for the email you want to configure.
Table A-27 describes the email properties for On-Success or On-Failure emails:
Email Properties | Required/Optional | Description
Email user name Required Required to send On-Success or On-Failure session email. Enter the email
address of the person you want the PowerCenter Server to email after the
session completes. The email address must be entered in 7-bit ASCII.
For success email, you can enter $PMSuccessEmailUser to send email to the
user configured for the server variable.
For failure email, you can enter $PMFailureEmailUser to send email to the user
configured for the server variable.
Email subject Optional Enter the text you want to appear in the subject header.
Email text Optional Enter the text of the email. You can use several variables when creating this
text to convey meaningful information, such as the session name and session
status. For details, see “Sending Email” on page 319.
The Metadata Extensions tab allows you to create and promote metadata extensions. For
information on creating metadata extensions, see “Metadata Extensions” in the Repository
Guide.
Table A-28 describes the configuration options for the Metadata Extensions tab:
Metadata Extensions Tab Options | Required/Optional | Description
Extension Name Required Name of the metadata extension. Metadata extension names must be unique in
a domain.
Datatype Required The data type: numeric (integer), string, boolean, or XML.
Precision Required for string and XML objects The maximum length for string or XML metadata extensions.
Reusable Required Select to make the metadata extension apply to all objects of this type
(reusable). Clear to make the metadata extension apply to this object only
(non-reusable).
Workflow Properties
Reference
This appendix contains a listing of settings in the workflow properties. These settings are
grouped by the following tabs:
♦ General Tab, 722
♦ Properties Tab, 724
♦ Scheduler Tab, 726
♦ Variables Tab, 731
♦ Events Tab, 732
♦ Metadata Extensions Tab, 733
General Tab
You can change the workflow name and enter a comment for the workflow on the General
tab. By default, the General tab appears when you open the workflow properties.
Figure B-1 displays the General tab of the workflow properties:
The figure highlights where to select a PowerCenter Server to run the workflow and where to select a suspension email.
Tasks must run on Server Optional Requires all workflow tasks to run on the PowerCenter Server that you select.
Suspension Email Optional Select a reusable email task for the suspension email. When a task fails,
the PowerCenter Server suspends the workflow and sends the
suspension email.
For details on suspending workflows, see “Suspending the Workflow” on
page 127.
Disabled Optional Select to disable the workflow from the schedule. The PowerCenter
Server stops running the workflow until you clear the Disabled option.
For details on the Disabled option, see “Disabling Workflows” on
page 118.
Suspend On Error Optional If selected, the PowerCenter Server suspends the workflow when a task
in the workflow fails.
For details on suspending workflows, see “Suspending the Workflow” on
page 127.
Web Services Optional If selected, you create a service workflow. Click Config Service to
configure service information.
For more information on creating web services, see the Web Services
Provider Guide.
Parameter File Name Optional Designates the name and directory for the parameter file. Use the parameter file to define workflow parameters. For details on parameter files, see “Parameter Files” on page 511.
Workflow Log File Name Optional Enter a file name, or a file name and directory. If you leave this field blank, the PowerCenter Server does not create a
workflow log. Instead, the PowerCenter Server writes workflow log messages
to the server log or Windows Event Log, depending on how you configure the
PowerCenter Server.
If you fill in this field, the PowerCenter Server appends information in this field
to that entered in the Workflow Log File Directory field. For example, if you
have "C:\workflow_logs\" in the Workflow Log File Directory field, then enter
"logname.txt" in the Workflow Log File Name field, the PowerCenter Server
writes logname.txt to the C:\workflow_logs\ directory.
Workflow Log File Directory Required Designates a location for the workflow log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMWorkflowLogDir.
If you enter a full directory and file name in the Workflow Log File Name field,
clear this field.
Save Workflow Log By Required If you select Save Workflow Log by Timestamp, the PowerCenter Server saves all workflow logs, appending a timestamp to each log.
If you select Save Workflow Log by Runs, the PowerCenter Server saves a
designated number of workflow logs. Configure the number of workflow logs in
the Save Workflow Log for These Runs option.
For details on these options, see “Archiving Workflow Logs” on page 459.
You can also use the $PMWorkflowLogCount server variable to save the
configured number of workflow logs for the PowerCenter Server.
Save Workflow Log for These Runs Required The number of historical workflow logs you want the PowerCenter Server to save.
The PowerCenter Server saves the number of historical logs you specify, plus the most recent workflow log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent workflow log, plus historical logs 0–4, for a total of 6 logs.
You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the
PowerCenter Server saves only the most recent workflow log.
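The Save Workflow Log by Runs arithmetic can be illustrated with a small sketch. This is hypothetical: the PowerCenter Server performs this housekeeping internally, and `prune_workflow_logs` is an invented name:

```python
def prune_workflow_logs(logs, historical_runs):
    """Keep the most recent log plus `historical_runs` older logs.

    `logs` is ordered oldest-first. Specifying 5 runs keeps 6 logs in
    total, as described above. Illustrative sketch only.
    """
    return logs[-(historical_runs + 1):]

logs = ["wf.log.%d" % i for i in range(10)]  # oldest ... newest
print(prune_workflow_logs(logs, 5))  # 6 logs survive
print(prune_workflow_logs(logs, 0))  # only the most recent log
```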
Figure B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box
Table B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box
Run Options: Run On Server Initialization/Run On Demand/Run Continuously (Optional). Indicates the workflow schedule type.
If you select Run On Server Initialization, the PowerCenter Server runs the workflow as soon as the server is initialized.
If you select Run On Demand, the PowerCenter Server runs the workflow only when you start the workflow.
If you select Run Continuously, the PowerCenter Server starts the next run of the workflow as soon as it finishes the previous run.
Schedule Options: Run Once/Run Every/Customized Repeat (Optional). Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options.
If you select Run Once, the PowerCenter Server runs the workflow once, as scheduled in the scheduler.
If you select Run Every, the PowerCenter Server runs the workflow at regular intervals, as configured.
If you select Customized Repeat, the PowerCenter Server runs the workflow on the dates and times specified in the Repeat dialog box.
Start Date (Optional). Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. Indicates the date on which the PowerCenter Server begins scheduling the workflow.

Start Time (Optional). Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. Indicates the time at which the PowerCenter Server begins scheduling the workflow.
End Options: End On/End After/Forever (Optional). Required if the workflow schedule is Run Every or Customized Repeat.
If you select End On, the PowerCenter Server stops scheduling the workflow on the selected date.
If you select End After, the PowerCenter Server stops scheduling the workflow after the set number of workflow runs.
If you select Forever, the PowerCenter Server schedules the workflow as long as the workflow does not fail.
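The interaction between a Run Every schedule and the End options can be sketched as follows. This is a hypothetical model of the semantics above, not server code; Forever is represented here by requiring an explicit end condition so the enumeration stays finite:

```python
from datetime import datetime, timedelta

def run_every_schedule(start, interval, end_on=None, end_after=None):
    """Enumerate scheduled run times for a Run Every workflow.

    end_on    -- stop scheduling after this date/time (End On)
    end_after -- stop after this many runs (End After)
    Illustrative sketch only; at least one end condition is required
    so the list is finite (Forever would be unbounded).
    """
    if end_on is None and end_after is None:
        raise ValueError("Forever schedules are unbounded in this sketch")
    runs, t = [], start
    while (end_on is None or t <= end_on) and \
          (end_after is None or len(runs) < end_after):
        runs.append(t)
        t += interval
    return runs

daily = run_every_schedule(datetime(2004, 8, 1, 6, 0),
                           timedelta(days=1), end_after=3)
```

With End After set to 3, the workflow is scheduled for August 1, 2, and 3 at 6:00 and then dropped from the schedule.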
Repeat Every (Required). Enter the numeric interval at which you want to schedule the workflow, then select Days, Weeks, or Months, as appropriate.
If you select Days, select the appropriate Daily Frequency settings.
If you select Weeks, select the appropriate Weekly and Daily Frequency settings.
If you select Months, select the appropriate Monthly and Daily Frequency settings.
Weekly (Optional). Required to enter a weekly schedule. Select the day or days of the week on which you want to schedule the workflow.
Daily (Required). Enter the number of times you want the PowerCenter Server to run the workflow on any day the session is scheduled.
If you select Run Once, the PowerCenter Server schedules the workflow once on the selected day, at the time entered in the Start Time setting on the Time tab.
If you select Run Every, enter Hours and Minutes to define the interval at which the PowerCenter Server runs the workflow. The PowerCenter Server then schedules the workflow at regular intervals on the selected day. The PowerCenter Server uses the Start Time setting for the first scheduled workflow of the day.
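The daily Run Every (Hours/Minutes) behavior, with the first run at the Start Time, can be sketched like this. It is a hypothetical illustration and `daily_run_times` is an invented name:

```python
from datetime import datetime, timedelta

def daily_run_times(start_time, hours=0, minutes=0):
    """Run times for one scheduled day under the daily Run Every option.

    The first run uses the Start Time; later runs repeat at the given
    interval until the day ends. Illustrative sketch only.
    """
    step = timedelta(hours=hours, minutes=minutes)
    if step <= timedelta(0):
        raise ValueError("interval must be positive")
    day_end = start_time.replace(hour=23, minute=59, second=59)
    times, t = [], start_time
    while t <= day_end:
        times.append(t)
        t += step
    return times

runs = daily_run_times(datetime(2004, 8, 1, 6, 0), hours=4)
# runs at 06:00, 10:00, 14:00, 18:00, and 22:00 on the scheduled day
```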
Persistent (Required). Indicates whether the PowerCenter Server maintains the value of the variable from the previous workflow run.
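A persistent variable's value survives across workflow runs; PowerCenter stores persistent values in the repository. The file-backed sketch below is a purely hypothetical stand-in for that repository storage, with invented helper names:

```python
import json
import os

def load_variable(state_file, name, default=None):
    """Return a persistent variable's value from the previous run."""
    if os.path.exists(state_file):
        with open(state_file) as f:
            return json.load(f).get(name, default)
    return default

def save_variable(state_file, name, value):
    """Record a variable's value so the next run can pick it up."""
    state = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            state = json.load(f)
    state[name] = value
    with open(state_file, "w") as f:
        json.dump(state, f)
```

A non-persistent variable, by contrast, is simply re-initialized to its default value at the start of every run.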
The Metadata Extensions tab allows you to create and promote metadata extensions. For
information on creating metadata extensions, see “Metadata Extensions” in the Repository
Guide.
Table B-8 describes the configuration options for the Metadata Extensions tab:
Extension Name (Required). Name of the metadata extension. Metadata extension names must be unique in a domain.
Precision (Required for string and XML objects). The maximum length for string or XML metadata extensions.

Reusable (Required). Select to make the metadata extension apply to all objects of this type (reusable). Clear to make the metadata extension apply to this object only (non-reusable).

UnOverride (Optional). This column appears only if the value of one of the metadata extensions was changed. To restore the default value, click Revert.
Overview
The Workflow Manager and Workflow Monitor replace the Server Manager in PowerCenter 5.x and PowerMart 5.x. This appendix compares session properties in the Server Manager with session and workflow options in the Workflow Manager. It lists the session properties as they appeared in the Server Manager session properties dialog box, then gives the corresponding options in the Workflow Manager.
The session properties for the Server Manager contain the following tabs:
♦ General tab
♦ Source Location tab
♦ Time tab
♦ Log and Error Handling tab
♦ Transformations tab
♦ Partitions tab
In the Server Manager, you configured the following options from the General tab:
♦ General options
♦ Source options
♦ Target options
♦ Session commands
♦ Performance
General Options
In the Server Manager, you could configure the Session Name field, Server Name, and the
Session Enabled option on the General tab of the session properties.
In the Workflow Manager, these options are on either the General tab of the session
properties or in the workflow properties.
Session Enabled: General tab - Disable This Task. You can view this property only when you edit the session instance from the Workflow Designer.
Source Options
In the Server Manager, Source options appeared under the Session Name field on the General
tab.
In the Workflow Manager, source options appear under the Sources node on the Mapping tab
(Transformations view). The Sources node contains connections, properties, and readers
settings.
Table C-2 compares Source options for the Server Manager with the corresponding properties
for the Workflow Manager:
Figure C-2. Server Manager Source Options Dialog Box for File Sources
Table C-3 compares source options for file sources for the Server Manager with the
corresponding options for the Workflow Manager:
Figure C-5. Server Manager Source Options Dialog Box (XML Sources)
FTP Properties
In the Server Manager, the FTP Properties dialog box appeared when you edited FTP
properties.
In the Workflow Manager, the FTP Connection Editor appears when you choose FTP as the
connection type from the Sources tab, click the Edit button on the right side of the Value
field, and then click Override to edit the FTP properties.
Figure C-6 shows the Server Manager FTP Properties dialog box:
Target Options
In the Server Manager, target options appeared on the General tab. In the target options, you could select the target type for the session, configure reject file names, and create database connection session parameters.
In the Workflow Manager, the Mapping tab-Transformations view-Targets node contains
connections, properties, and writers settings.
Table C-6 compares target options for the Server Manager with the corresponding options for
Workflow Manager:
Target Options Button: Properties in the Target Options dialog box are located on the Mapping tab - Transformations view - Targets node - Properties settings.
Reject Options Button: Properties in the Reject Options dialog box are located on the Mapping tab - Transformations view - Targets node - Properties settings.
Table C-7 compares relational target options for the Server Manager with the corresponding
options for the Workflow Manager:
Output Files
In the Server Manager, the Output Files dialog box appeared when you selected a file target
type, then clicked Target Options on the General tab.
In the Workflow Manager, output file target options appear on the Mapping tab-
Transformations view. The Targets node contains connections, properties, and writer settings.
Figure C-8 shows the Server Manager Output Files dialog box:
Merge Targets For Partitioned Sessions: Mapping tab - Transformations view - Targets node - Properties settings.
Fixed-Width Properties
In the Server Manager, the Fixed-Width dialog box appeared when you configured a session
to write to a fixed-width target file, and then clicked Edit Null Character.
In the Workflow Manager, you can access the Fixed-Width Properties dialog box from the Properties settings of the Mapping tab. Click Set File Properties, and select Fixed-Width.
Figure C-10 shows the Server Manager Fixed-Width dialog box:
Figure C-11. Server Manager Delimited File Properties Dialog Box (Output Files)
XML Targets
In the Server Manager, the XML Target dialog box appeared when you selected an XML file
target type, then clicked Target Options.
In the Workflow Manager, you can access the XML Target dialog box from the Properties settings of the Mapping tab. Click Set File Properties.
Figure C-12 shows the Server Manager XML Target dialog box:
Table C-9 compares XML target options for the Server Manager with the corresponding
options for Workflow Manager:
Reject Files
In the Server Manager, the Reject Files dialog box appeared when you clicked Reject Options
on the General tab.
In the Workflow Manager, the reject file options appear in the Targets node Properties
settings on the Mapping tab.
Figure C-13 shows the Server Manager Reject File dialog box:
Table C-10 compares Reject Files options for the Server Manager with the corresponding
options for Workflow Manager:
Pre-Session Commands
In the Server Manager, the Pre-Session Commands dialog box appeared when you clicked Pre-
Session on the General tab of the session properties.
In the Workflow Manager, pre-session command options appear on the Components tab.
Figure C-14 shows the Server Manager Pre-Session Commands dialog box:
Table C-11 compares session command options for the Server Manager with the
corresponding options for the Workflow Manager:
Description: Components tab. Click the Edit button on the right side of the Value field for Pre-Session Commands. Enter the description on the General tab of the Edit Pre-Session Commands dialog box.
Command: Components tab. Click the Edit button on the right side of the Value field for Pre-Session Commands. Enter the command on the Command tab of the Edit Pre-Session Commands dialog box.
Table C-12 compares post-session command and email options for the Server Manager with
the corresponding options for the Workflow Manager:
Description: Components tab. Click the Edit button on the right side of the Value field for Post-Session Commands. Enter the description on the General tab of the Edit Post-Session Commands dialog box.
Command: Components tab. Click the Edit button on the right side of the Value field for Post-Session Commands. Enter the command on the Command tab of the Edit Post-Session Commands dialog box.
Email User Name: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email user name on the Properties tab of the Edit Success Email or Edit Failure Email dialog box.
Email Subject: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email subject on the Properties tab of the Edit Success Email or Edit Failure Email dialog box.
Email Text: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email text on the Properties tab of the Edit Success Email or Edit Failure Email dialog box.
Advanced Options Button: Config Object tab, Mapping tab, and Properties tab.
Configuration Parameters
In the Server Manager, the Configuration Parameters dialog box appeared when you clicked
Advanced Options on the General tab. In the Configuration Parameters dialog box, you could
configure the DTM memory parameters, general parameters, reader parameters, and event-
based scheduling.
In the Workflow Manager, the configuration parameters options appear on multiple tabs.
Figure C-16 shows the Server Manager Configuration Parameter dialog box:
Enable Decimal Arithmetic: Properties tab - Performance settings. The option name is Enable High Precision.
Event-Based Scheduling - Indicator File To Wait For: Event-Wait Task - Events tab - Pre-Defined Event. Enter the name of the file to watch.
In the Server Manager, you configured the following options from the Time tab:
♦ Schedule options
♦ Start options
♦ Duration options
♦ Batch option
Schedule Options
In the Server Manager, you used the Schedule options on the Time tab of the session
properties to schedule the frequency of a session run.
Repeat Options
In the Server Manager, the Repeat dialog box appeared when you selected Customized
Repeat, then clicked Edit on the Time tab.
In the Workflow Manager, the Customized Repeat dialog box appears when you schedule a
session to run on server initialization, select Customized Repeat, and then click Edit.
Figure C-19 shows the Server Manager Repeat dialog box:
Start Options
In the Server Manager, the Start options appeared below the Schedule options on the Time
tab. In the Start options, you could select the session start date and session start time.
In the Workflow Manager, the Start options appear on the Schedule tab of the workflow
properties.
Duration Options
In the Server Manager, Duration options appeared next to Start options on the Time tab. In
Duration options, you could set the end date of a session run, the number of session runs, or
schedule a session to run forever as long as it was successful.
In the Workflow Manager, End options appear next to Start options on the Scheduler tab of
the workflow properties.
In the Server Manager, on the Log and Error Handling tab you could configure the following
options:
♦ Log File options
♦ Parameter File option
♦ Batch Handling option
♦ Error Handling options
Server Path to Log Files: Properties tab - General Options settings. Enter the path in Session Log File Directory.
Session Log File: Properties tab - General Options settings. Enter the log file name in Session Log File Name.
Save the Session Log From the Last <number> Session Runs: Config Object tab - Log Options settings.
Log and Error Handling tab - On pre-session command errors - Stop session/Continue session: Config Object tab - Error handling settings.
Log and Error Handling tab - On stored procedure errors - Stop session/Continue session: Config Object tab - Error handling settings.
Table C-17 compares the Transformations tab options for Server Manager with the
corresponding options for the Workflow Manager:
A adding
tasks 92
ABORT function advanced settings
See also Transformation Language Reference session properties 675
session failure 200 aggregate caches
aborted status 421 calculating the data cache 622
aborting calculating the index cache 621
Control tasks 147 overview 621
server handling 129 reinitializing 576, 674
sessions 130 aggregate files
status 421 deleting 577
tasks 129 moving 577
tasks in Workflow Monitor 418 aggregate function calls
workflows 129 minimizing 652
Aborttask Aggregator transformation
pmcmd syntax 596 cache options 621
Abortworkflow cache partitioning 621
pmcmd syntax 597 caches 26, 34
absolute time data cache 622
specifying 162 index cache 621
Timer task 161 optimizing performance 650
active sources optimizing with Sorted Input 651
constraint-based loading 248 partitioning guidelines 347
defined 259 performance detail 639
generating commits 278 allocating memory
row error logging 260 XML sources 655
source-based commit 278 AND links 137
transaction generators 259 archiving
XML targets 259 session logs 471
763
workflow logs 459
arrange
C
workflows vertically 40 cache files
workspace objects 71 locating 577
ASCII mode naming convention 615
See also Installation and Configuration Guide permissions 28
See also Unicode mode cache partitioning
overview 27 Aggregator transformation 621
performance 661 described 359
session behavior 16 incremental aggregation 621
assigning Joiner transformation 624
PowerCenter Servers 122, 198 Lookup transformation 391
Assignment tasks Rank transformation 620
creating 140 caches
definition 140 Aggregator transformation 621
description 132 calculating Aggregator data cache 622
using expression editor 96 calculating Aggregator index cache 621
variables in 103 calculating Joiner data cache 626
calculating Joiner index cache 625
calculating Lookup data cache 631
B calculating Lookup index cache 629
calculating Rank data cache 633
$BadFile calculating Rank index cache 632
definition 508 default directory 34
naming convention 496, 520 files for index and data 614
using 509 files, overview 34
blocking Joiner transformation 624
definition 23 Lookup transformation 628
blocking source data memory 26, 614
PowerCenter Server handling 23 memory usage 26
buffer block size optimizing 658
configuring 677 overview 28, 614
optimizing 655, 657 resetting with real-time sessions 288
buffer memory session cache files 614
allocating 655 transformation 34
buffer blocks 25 caching
DTM process 25 lookup functions 676
bulk loading Char datatypes
commit interval 253 removing trailing blanks for optimization 653
data driven session 252 check point interval
DB2 642 optimizing 642
DB2 guidelines 253 checking in
Oracle 643 versioned objects 74
Oracle guidelines 253 checking out versioned objects 74
session properties 252, 697 COBOL sources
Sybase IQ 643 error handling 227
targets 642 numeric data handling 229
test load 244 code page compatibility
using user-defined commit 283 See also Installation and Configuration Guide
multiple file sources 230
targets 235
764 Index
code pages sessions 79
See also Installation and Configuration Guide tasks 79
data movement modes 27 workflows 79
database connections 54, 234 worklets 79
delimited source 224 Components tab
delimited target 267, 703 properties 710
external loader files 524 concurrent connections
fixed-width sources 222 in partitioned pipelines 379
fixed-width target 266, 702 Config Object tab
relaxed validation 55 properties 675
validation 12 configuring
viewing the session log 475 error handling options 493
color connect string
setting 42 examples 54
workspace 42 syntax 54
command line mode for pmcmd connection objects
connecting 589 See also Repository Guide
return codes 590 assigning permissions 51
using 589 definition 51
command line program See pmcmd deleting 59
Command task connection settings
multiple UNIX commands 145 applying to all session instances 180
Command tasks targets 695
creating 143 connections
definition 143 copy as 59, 60
description 132 copying a relational database connection 59
executing commands 145 external loader 551
promoting to reusable 145 FTP 561
Run if Previous Completed 145 multiple targets 274
using server variables 188, 193 relational database 56
using session parameters 143 replacing a relational database connection 62
comments sources 211
adding in Expression Editor 97 targets 237
commit interval connectivity
bulk loading 253 See also Installation and Configuration Guide
configuring 292 connect string examples 54
description 276 overview 5
optimizing 655, 658 server grids 447
source- and target-based 276 constraint-based loading
commit source active sources 248
source-based commit 278 configuring 248
commit type enabling 251
configuring 672 key relationships 248
committing data session property 676
target connect groups 278 target connection groups 249
transaction control 283 Update Strategy transformations 249
common logic control file
factoring 652 overriding Teradata 539
comparing objects overview 33
See also Designer Guide permissions 28
See also Repository Guide
Index 765
Control tasks finding 577
definition 147 data flow
description 132 See pipeline
options 148 data movement mode
stopping or aborting the workflow 129 See also ASCII mode
copying See also Installation and Configuration Guide
repository objects 77 See also Unicode mode
counters affecting incremental aggregation 577
BufferInput_efficiency 640 overview 27
BufferOutput_efficiency 640 database connections
overview 437 See also Installation and Configuration Guide
Rowsinlookupcache 639 configuring 56
Transformation_errorrows 639 copying a relational database connection 59
Transformation_readfromdisk 639 domain name 58
Transformation_writetodisk 639 packet size 58
CPU usage privileges required to create 53
PowerCenter Server 24 replacing a relational database connection 62
creating rollback segment 58
external loader connections 551 session parameter 499
FTP sessions 565 use trusted connection 58
server grids 451 using Oracle OS Authentication 53
sessions 175 databases
workflows 91 connection requirements 57
CUME connectivity overview 46
partitioning restrictions 395 environment SQL 55
Custom transformation optimizing sources 645
partitioning guidelines 396 optimizing targets 642
customized repeat selecting code pages 54
daily 117 setting up connections 53
editing 115 datatypes
monthly 117 See also Designer Guide
options 116 Char 653
repeat every 117 Decimal 269
weekly 117 Double 269
Float 269
Integer 269
D minimizing conversions 648
Money 269
data Numeric 269
capturing incremental source changes 574, 579 padding bytes for fixed-width targets 268
data caches Real 269
Aggregator transformation 622 Varchar 653
description 614 dates
for incremental aggregation 577 configuring 38
memory usage 26 formats 38
optimizing 655, 658 DB2
Rank transformation 633 bulk loading 642
data driven bulk loading guidelines 253
bulk loading 252 commit interval 253
data files See IBM DB2
creating directory 579
766 Index
$DBConnection session properties, targets 266
definition 499 description
naming convention 496, 520 repository objects 73
using 499 directories
deadlock for historical aggregate data 579
retry session 674 server defaults 46
deadlock retry server variables 46
See also Installation and Configuration Guide workspace file 41
configuring 246 disabled
target connection groups 257 status 421
Debugger disabling
restrictions in partitioned pipelines 396 tasks 137
decimal arithmetic workflows 118
See high precision displaying
Decision tasks customizing windows 69
creating 151 date time format 38
decision condition variable 149 Expression Editor 97
definition 149 fonts 42
description 132 options 39
example 149 servers in Workflow Monitor 406
using Expression Editor 96 show solid lines for links 42
variables in 103 toolbars 69
DECODE function workspace color 42
See also Transformation Language Reference documentation
using for optimization 653 conventions xlix
default remote directories description xlviii
for FTP connections 561 online xlix
deleting domain name 58
connection objects 59 dropping
servers 50 indexes 248
workflows 97 DTM (Data Transformation Manager)
delimited flat files buffer memory 25
code page 691 overview 3
code page, sources 224 post-session email 10
code page, targets 267 process 7, 11
consecutive delimiters 692 running sessions and workflows 7
escape character 691 transformation statistics example 469
escape character, sources 224 DTM Buffer Pool Size
numeric data handling 229 optimizing 655
quote character 691 session property 674
quote character, sources 224 tuning 656
quote character, targets 267
session properties, sources 222
session properties, targets 266 E
sources 691
delimited sources edit
number of rows to skip 692 delimiter 690
delimited targets edit null characters
session properties 703 session properties 702
delimiter editing
session properties, sources 222 delimiter 702
Index 767
session privileges 178 guidelines for entering 55
sessions 177 environment variables
email PM_CODEPAGENAME 585
attaching files 333, 342 PM_HOME 587
configuring a user on Windows 322, 342 PMTOOL_DATEFORMAT 585
configuring the PowerCenter Server on UNIX 321 repository username and password 586
configuring the PowerCenter Server on Windows 322 error handling 186
distribution lists 326 COBOL sources 227
email variables 333 error log files 489
format tags 333 fixed-width file 227
logon network security on Windows 325 options 493
MIME format 320 overview 201
multiple recipients 326 PMError_MSG table schema 485
on failure 332 PMError_ROWDATA table schema 483
on success 332 PMError_Session table schema 486
overview 320 pre- and post-session SQL 186
post-session 332 settings 679
rmail 321 transaction control 284
server variables 333 error log
session properties 714 options 494
specifying a Microsoft Outlook profile 327 session errors 201
suspending workflows 339 error log files 489
text message 328 error log tables
tips 342 creating 483
user name 328 overview 483
using other mail programs 343 error logging
using server variables 333 overview 482
Windows service startup account 322 error logs
workflows 341 messages 29
worklets 341 error messages
Email tasks external loader 527
creating 329 error threshold
description 132 $PMSessionErrorThreshold 47
overview 328 pipeline partitioning 200
See also email 328 stop on errors 200
suspension email 128 errors
email variables See also Troubleshooting Guide
overview 333 eliminating to improve performance 648
Enable Past Events option 159 fatal 200
enabling enhanced security 44 minimizing tracing level to improve performance 659
end of file pre-session shell command 193
transaction control 284 stopping on 679
end options threshold 200
end after 116 validating in Expression Editor 97
end on 116 Event-Raise tasks
forever 116 configuring 155
enhanced security declaring user-defined event 155
enabling 44 definition 153
enabling for connection objects 44 description 132
environment SQL in worklets 167
configuring 55
768 Index
events using Control task 148
in worklets 167 fatal errors
pre-defined events 153 session failure 200
user-defined events 153 file list
Event-Wait tasks creating for multiple sources 230
definition 153 creating for partitioned sources 375
description 132 using for source file 230
for pre-defined events 158 file server
for user-defined events 157 for multiple PowerCenter Servers 445
waiting for past events 159 setting up for multiple servers 445
working with 156 file sources
Expression Editor numeric data handling 229
adding comments 97 partitioning 374
displaying 97 server handling 226, 229
syntax colors 97 session properties 218
using 96 file targets
validating 119 partitioning 380
validating expressions using 97 session properties 261
expressions filter conditions
optimizing 652 in partitioned pipelines 372
validating 97 filtering
external loader deleted tasks in Workflow Monitor 406
behavior 526 servers in Workflow Monitor 406
code page 524 tasks in Gantt Chart view 405
connections 551 tasks in Task View 431
DB2 528 filters
error messages 527 optimizing 650
loading multibyte data 533, 535 finding objects
on Windows systems 526 Workflow Manager 70
Oracle 533 fixed-width files
overview 524 code page 689
performance 643 code page, sources 222
permissions 525 code page, targets 266
PowerCenter Server support 524 error handling 227
privileges required to create connection 525 multibyte character handling 227
session properties 682, 695 null character 689
setting up Workflow Manager 553 null characters, sources 222
Sybase IQ 535 null characters, targets 266
Teradata 538 numeric data handling 229
using with partitioned pipeline 380 padded bytes in fixed-width targets 268
External Procedure transformation source session properties 220
See also Designer Guide target session properties 265
partitioning guidelines 396 writing to 268, 269
fixed-width sources
session properties 689
F fixed-width targets
session properties 702
fail parent workflow 138 flat file definitions
failed status 421 escape character, sources 224
failing workflows PowerCenter Server handling, targets 268
failing parent workflows 148 quote character, sources 224
Index 769
quote character, targets 267
session properties, sources 218
G
session properties, targets 261 Gantt Chart
flat files configuring 411
See also Designer Guide filtering 405
code page, sources 222 listing tasks and workflows 424
code page, targets 266 navigating 425
delimiter, sources 224 opening and closing folders 407
delimiter, targets 267 organizing 425
increasing performance 660 overview 402
multibyte data 270 searching 427
null characters, sources 222 using 423
null characters, targets 266 zooming 426
numeric data handling 229 general options
output file session parameter 504 arranging workflow vertically 40
output files 33 configuring 39
precision 270 in-place editing 40
precision, targets 269 launching Workflow Monitor 41
shift-sensitive target 271 open editor 41
source file session parameter 502 panning windows 40
fonts receive notification from server 41
setting 42 reload task or workflow 40
format options session properties 668
changing the font 42 show expression on a link 41
color 42 show full name of task 41
configuring 42 General tab in session properties
date and time 38 FTP properties 742
reset all 42 in Server Manager 737
schedule 38 in Workflow Manager 668
show solid lines for links 42 session commands 750
Timer task 38 source options 738
FTP (File Transfer Protocol) target options 743
accessing source files 565 General tab of session properties
accessing target files 568 general options 737
connecting to file targets 380 performance options 752
connection names 561 generating
connection options 563 commits with source-based commit 278
creating a session 565 Getrunningsessionsdetails
defining connections 561 pmcmd syntax 598
defining default remote directory 561 Getserverdetails
defining host names 561 pmcmd syntax 599
mainframe restrictions 560 Getserverproperties
overview 560 pmcmd syntax 599
privileges required to create connections 562 Getsessionstatistics
session properties 682, 695 pmcmd syntax 600
functions Gettaskdetails
See also Transformation Language Reference pmcmd syntax 601
minimizing for optimization 653 Getworkflowdetails
pmcmd syntax 601
globalization
See also Installation and Configuration Guide
770 Index
  database connections 234
  overview 234
  targets 234

H
hash partitioning
  adding hash keys 362
  hash auto-keys partitioning 361
  hash user keys partitioning 362
  overview 348, 361
Help
  pmcmd syntax 602
heterogeneous sources
  defined 208
heterogeneous targets
  overview 274
high precision
  disabling 658
  enabling 674
  handling 204
  optimizing 655
history names
  in Workflow Monitor 419
host names
  for FTP connections 561
  registering the PowerCenter Server 49

I
IBM DB2
  connect string example 54
icon
  Workflow Monitor 404
  worklet validation 171
IIF expressions
  See also Transformation Language Reference
  optimizing 653
incremental aggregation
  See also Installation and Configuration Guide
  cache partitioning 621
  changing server code page 577
  changing server data movement mode 577
  changing session sort order 577
  configuring 674
  configuring the session 579
  deleting files 577
  files 34
  moving files 577
  overview 574
  partitioning data 578
  performance 651
  preparing to enable 579
  processing 575
  reinitializing cache 576
incremental changes
  capturing 579
index caches
  Aggregator transformation 621
  description 614
  for incremental aggregation 577
  memory usage 26
  optimizing 655, 658
  Rank transformation 632
indexes
  creating directory 579
  dropping for target tables 248
  finding 577
  optimizing by dropping 642
  recreating for target tables 248
indicator files
  description 33
  pre-defined events 156
  session output 33
Informatica
  documentation xlviii
  Webzine l
Informix
  connect string syntax 54
  row-level locking 379
in-place editing 40
$InputFile
  definition 502
  naming convention 496, 520
  using 503, 507
interactive mode for pmcmd
  connecting 592
  setting defaults 592

J
joiner cache
  overview 624
Joiner transformation
  cache partitioning 624
  caches 26, 34, 624
  joining sorted flat files 385
  joining sorted relational data 387
  optimizing 651
  optimizing performance 650
  partitioning guidelines 396
  performance detail 639
  threads created 19

K
key constraints
  optimizing by dropping 642
key range partitioning 348, 363
keys
  constraint-based loading 248

L
launch
  Workflow Monitor 41, 404
line sequential buffer length
  configuring 677
  sources 225
links
  AND 137
  condition 92
  example link condition 94
  linking tasks concurrently 93
  linking tasks sequentially 94
  loops 92
  OR 137
  show expression on a link 41
  show solid lines 42
  specifying condition 94
  using Expression Editor 96
  variables in 103
  working with 92
List Tasks
  in Workflow Monitor 424
Load Manager
  creating log files 11
  memory usage 24
  overview 3
  parameters 25
  post-session email 10
  process 7, 8
  running sessions and workflows 7
  scheduling workflows 8
  validating code pages 12
load summary
  sessions 467
local variables
  replacing sub-expressions 652
Log and Error Handling tab
  batch handling option 759
  error handling option 759
  log file options 758
  parameter file option 759
  Server Manager session properties 758
log files
  See session logs, workflow logs
  See also Installation and Configuration Guide
  editor for Workflow Monitor 410
  server variable for 46
  session log 671
log options
  settings 677
logs
  server 28
  session 31
  workflow 30
lookup cache
  calculating size 629, 631
  overview 628
  persistent 35
  pipeline partitioning 628
  ports included 628
  session property 676
lookup caches
  See also Designer Guide
  enabling 649
  query created 628
LOOKUP function
  See also Transformation Language Reference
  minimizing for optimization 653
Lookup SQL Override option
  reducing cache size 649
Lookup transformation
  See also Designer Guide
  cache partitioning 391
  caches 26, 34, 628
  calculating cache size 628, 629, 631
  enabling caching 649
  optimizing 639, 649
  optimizing lookup condition 649
  optimizing multiple lookup expressions 650
  optimizing with indexing 649
loops in workflow 92

M
mapping bottlenecks
  identify 638
mapping parameters
  See also Designer Guide
  in session properties 203
  overriding 203
mapping threads
  description 14
mapping variables
  See also Designer Guide
  in partitioned pipelines 394
mappings
  definition 2
  factoring common logic 652
  identify bottlenecks 638
  increasing performance 636
  single-pass reading 647
master servers 446
master thread
  description 14
Maximum Days
  Workflow Monitor 410
maximum sessions
  See also Installation and Configuration Guide
  parameter, description 25
Maximum Workflow Runs
  Workflow Monitor 410
memory
  caches 614
  DTM buffer 25
  increasing to avoid paging 662
merge target files
  session properties 699
merging target files 380, 382
message queue
  using with partitioned pipeline 380
metadata extensions
  creating 82
  deleting 85
  editing 84
  overview 82
  session properties 718
Microsoft Access
  pipeline partitioning 379
Microsoft Outlook
  configuring an email user 322, 342
  configuring the PowerCenter Server 322
Microsoft SQL Server
  bulk loading 642
  commit interval 253
  connect string syntax 54
  optimizing 646
  targets 702
MIME format
  email 320
monitoring
  data flow 639
  session details 434
MOVINGAVG
  See also Transformation Language Reference
  partitioning restrictions 395
MOVINGSUM
  See also Transformation Language Reference
  partitioning restrictions 395
multibyte data
  character handling 227
  Oracle external loader 533
  Sybase IQ external loader 535
  writing to files 270
multiple servers
  overview 444
multiple sessions 196

N
naming convention
  See also Getting Started Guide
naming conventions
  session parameters 496, 520
native connect string
  See connect string
navigating
  workspace 69
network packets
  increasing 643, 646
non-persistent variables 110
non-reusable tasks
  inherited changes 136
  promoting to reusable 136
normal loading
  session properties 697
Normal tracing levels
  definition 473
Normalizer transformation
  partitioning guidelines 347
notification
  general option 41
null characters
  editing 702
  file targets 266
  server handling 227
  session properties, targets 265
numeric operations
  optimizing by using 653
numeric values
  reading from sources 229

O
open transaction
  defined 287
operators
  using for optimization 653
optimizing
  block size 657
  buffer block size 655
  choosing numeric vs. string operations 653
  commit interval 655, 658
  data cache 655
  data caches 658
  data flow 440, 637, 639
  disabling high precision 658
  dropping indexes and key constraints 642
  DTM Buffer Pool Size 655
  eliminating transformation errors 648
  expressions 652
  factoring out common logic 652
  filters 650
  high precision 655
  IIF expressions 653
  increasing checkpoint interval 642
  increasing network packet size 646
  index cache 655, 658
  Joiner transformation 651
  Lookup transformation 649, 650
  mapping 647
  minimizing aggregate function calls 652
  minimizing datatype conversions 648
  minimizing error tracing 659
  pipeline partitioning 663
  removing trailing blank spaces 653
  replacing sub-expressions with local variables 652
  sessions 655
  single-pass reading 647
  source database 645
  system-level 660
  target database 642
  Tracing Level 655
  using DECODE vs. LOOKUP expressions 653
  using operators vs. functions 653
optimizing performance
  Aggregator transformation 650
OR links 137
Oracle
  bulk loading 642
  bulk loading guidelines 253
  commit intervals 253
  connect string syntax 54
  connection with OS Authentication 53
Oracle external loader
  attributes 533
  bulk loading 643
  connecting with OS Authentication 552
  data precision 533
  delimited flat file target 533
  external loader connections 551
  external loader support 524, 533
  fixed-width flat file target 533
  multibyte data 533
  null constraint 533
  partitioned target files 533
  reject file 534
output files
  overview 28, 33
  permissions 28
  session parameter 504
  session properties 700
  targets 263
$OutputFile
  definition 504
  naming convention 496, 520
  using 505
override
  Teradata loader control file 539
  tracing levels 473, 679
owner name
  truncating target tables 245

P
packet size 58
paging
  eliminating 662
parameter files
  format 513
  location 518
  session 512
  specifying in session 518
  using with pmcmd starttask 607
  using with pmcmd startworkflow 608
parameters
  session 496
partition keys
  adding 358, 362, 364
  adding key ranges 365
partition points
  adding and deleting 353
  default 17
  description 17, 346
  Joiner transformation 384
partition types
  description 348
partitioning
  See pipeline partitioning
partitioning data
  incremental aggregation 578
partitioning restrictions
  Debugger 396
  Informix 379
  numerical functions 395
  PowerCenter Connect for IBM MQSeries restrictions 397
  PowerCenter Connect for PeopleSoft restrictions 397
  PowerCenter Connect for SAP BW 397
  PowerCenter Connect for SAP R/3 397
  PowerCenter Connect for Siebel 398
  relational targets 395
  Sybase IQ 379, 395
  transformations 395
  unconnected transformations 353
  XML targets 396
Partitioning tab
  in the Server Manager 762
  in the Workflow Manager 762
Partitions
  properties 352
partitions
  adding and deleting 356
  description 18, 348
Partitions views
  properties 351
pass-through pipeline
  overview 15
performance
  See also optimizing
  commit interval 278
  detail file 31
  identifying bottlenecks 637
  monitoring 436
  server data movement mode 661
  Sybase IQ 643
  tuning, overview 636
performance data
  collecting 674
performance detail files
  creating 436
  enabling session monitoring 436
  permissions 28
  understanding counters 437
  viewing 436
performance settings
  session properties 674
permissions
  connection objects 51
  creating a session 175
  database 51
  deleting a PowerCenter Server 50
  editing sessions 177
  external loader 525
  FTP connections 561
  FTP session 565
  output and log files 28
  recovery files 28
  scheduling 90
  Workflow Monitor tasks 403
persistent lookup cache
  session output 35
persistent variables 110
  in worklets 169
pinging
  pmcmd syntax 602
  PowerCenter Server in Workflow Monitor 405
Pingserver
  pmcmd syntax 602
pipeline partitioning
  adding and deleting partitions 356
  adding hash keys 362
  adding key ranges 365
  adding partition points 353
  caching Lookup transformations 628
  concurrent connections 379
  configuring a session 351
  configuring for sorted data 384
  configuring to optimize join performance 384
  database compatibility 379
  description 346
  error threshold 200
  example of use 349
  external loaders 380, 526
  file lists 375
  file sources 374
  file targets 380
  filter conditions 372
  hash auto-keys partitioning 361
  hash partitioning 361
  hash user keys partitioning 362
  Joiner transformation 384
  key range 363
  loading to Informix 379
  mapping variables 394
  merge target files 699
  merging target files 380, 382
  message queues 380
  multiple CPUs 3
  multiple source pipelines 19
  numerical functions restrictions 395
  object validation 396
  optimizing performance 663
  optimizing source databases 663
  optimizing target databases 664
  overview 3
  partition keys 358, 362, 364
  partition types overview 356
  partitioning indirect files 375
  pass-through partitioning 367
  recovery 200
  reject file 476
  relational sources 371
  relational targets 378
  round-robin partitioning 360
  rules and restrictions 395, 398
  session properties 705
  sorted flat files 385
  sorted relational data 387
  Sorter transformation 389, 392
  SQL queries 371
  symmetric processing platform 24
  threads and partitions 18
  threads created 16
  Transaction Control transformation 356
pipelines
  See source pipelines
  active sources 259
  data flow monitoring 440, 637, 639
  description 346
PM_CODEPAGENAME
  using with pmcmd 585
PM_RECOVERY table
  format 299
PM_TGT_RUN_ID table
  format 299
pmcmd
  aborttask 596
  abortworkflow 597
  command line mode 589
  command parameters 594
  commands, list 582
  commands, reference 594
  environment variables 585
  getserverdetails 599
  getserverproperties 599
  getsessionstatistics 600
  gettaskdetails 601
  getworkflowdetails 601
  help 602
  interactive mode 592
  overview 582
  parameter files 607, 608
  pingserver 602
  resumeworkflow 603
  return codes 300
  setfolder 604
  setnowait 605
  setwait 605
  showsettings 605
  shutdownserver 605
  starttask 606
  startworkflow 607
  stoptask 609
  stopworkflow 609
  syntax 595
  unsetfolder 610
  version 611
  waittask 611
  waitworkflow 611
  writing scripts 589
PMError_MSG table schema 485
PMError_ROWDATA table schema 483
PMError_Session table schema 486
$PMFailureEmailUser
  definition 333
  tips 342
PmNullPasswd
  reserved word 53
PmNullUser
  reserved word 53
pmserver
  process 11
$PMSessionLogCount
  saving a number of logs 471
$PMSessionLogDir
  configuring the session log 471
  definition 469
$PMSessionLogFile
  definition 497
  using 498
$PMSuccessEmailUser
  definition 333
  tips 342
PMTOOL_DATEFORMAT
  using with pmcmd 585
$PMWorkflowLogDir
  definition 459
$PMWorkflowLogCount
  saving a number of logs 460
post-session command
  session properties 711
  shell command properties 714
post-session email
  overview 33, 332
  See also email
  session options 716
  session properties 711
post-session shell command
  configuring non-reusable 189
  configuring reusable 192
  using 188
post-session SQL commands 186
post-session threads
  description 14
PowerCenter Connect for IBM MQSeries
  partitioning restrictions 397
PowerCenter Connect for PeopleSoft
  partitioning restrictions 397
PowerCenter Connect for SAP BW
  partitioning restrictions 397
PowerCenter Connect for SAP R/3
  partitioning restrictions 397
PowerCenter Connect for Siebel
  partitioning restrictions 398
PowerCenter Server 22
  architecture 2
  assigning sessions 198
  assigning workflows 122
  blocking data 23
  changing servers 445
  commit interval overview 276
  configuring for multiple servers 445
  connecting in Workflow Monitor 405
  connectivity overview 5, 46
  creating server grids 451
  data movement modes 27
  deleting 50
  external loader support 524
  filtering in Workflow Monitor 406
  handling file targets 268
  logs 28
  messages 29
  monitoring 436
  multiple servers overview 444
  multiple source file list 230
  online and offline mode 405
  output files 33
  performance detail file 31
  permissions to delete 50
  pinging in Workflow Monitor 405
  privileges required to register 46
  processing data 22
  reading sources 22
  registering 46, 48
  removing assigned sessions 199
  removing assigned workflows 123
  reporting session statistics 468
  server grids overview 446
  system resources 24
  tracing levels 473
  truncating target tables 245
  using FTP 561
  using multiple to increase performance 661
  using server grids to increase performance 661
  variables for 46
pre- and post-session SQL
  entering 186
  guidelines 186
precision
  flat files 270
  writing to file targets 269
pre-defined events
  waiting for 158
pre-defined variables
  in Decision tasks 149
pre-session shell command
  configuring non-reusable 189
  configuring reusable 192
  errors 193
  session properties 711
  using 188
pre-session SQL commands 186
pre-session threads
  description 14
privileges
  See also permissions
  See also Repository Guide
  scheduling 90
  session 175
  workflow 90
  Workflow Monitor tasks 403
  workflow operator 90
Properties tab in session properties
  in Workflow Manager 670

Q
Quit
  pmcmd syntax 602
quoted identifiers
  reserved words 255

R
rank cache
  calculating data cache 633
  calculating index cache 632
  location 632
  overview 632
  size 632
Rank transformation
  See also Transformation Guide
  cache partitioning 620
  caches 26, 34, 632
  partitioning guidelines 347
  performance detail 639
reader threads
  description 14, 15
reading
  sources 22
real-time sessions
  transformation scope 288
recovering
  pipeline partitioning 200
recovery
  completing unrecoverable sessions 316
  configuring mappings 297
  configuring the session 297
  configuring the target database 298
  configuring the workflow 298
  files, permissions 28
  overview 296
  PM_RECOVERY table format 299
  PM_TGT_RUN_ID table format 299
  pmcmd return codes 300
  recover from task 308
  recover task 311
  recovering a failed workflow 308
  recovering a session task 311
  recovering a suspended workflow 305
  recovery table layout 314
  resume/recover 305
  server handling 314
recovery files
  permissions 28
recreating
  indexes 248
registering
  PowerCenter Server 46, 48
registering server
  See also Installation and Configuration Guide
reinitializing
  aggregate cache 576
reject file
  changing names 476
  column indicators 478
  locating 456, 476
  Oracle external loader 534
  overview 32
  permissions 28
  pipeline partitioning 476
  reading 477
  row indicators 478
  session parameter 508
  session properties 243, 263, 698, 700
  transaction control 284
  viewing 476
relational connections
  See relational databases
relational databases
  configuring a connection 56
  copying a relational database connection 59
  replacing a relational database connection 62
  rollback segment 58
relational sources
  partitioning 371
  session properties 214
relational targets
  partitioning 378
  partitioning restrictions 395
  session properties 240, 697
Relative time
  specifying 162
  Timer task 161
reload task or workflow
  configuring 40
rename
  repository objects 73
repositories
  adding 73
  connecting in Workflow Monitor 405
  enter description 73
repository objects
  configuring 73
  rename 73
Repository Server
  notification 41
  notification in Workflow Monitor 410
requirements
  server grids 448
reserved words
  generating SQL with 255
  resword.txt 255
reserved words file
  creating 256
reset all 42
restarting
  in Workflow Monitor 416
Resumeworkflow
  pmcmd syntax 603
Resumeworklet
  pmcmd syntax 603
reusable tasks
  inherited changes 136
  reverting changes 136
reverting changes
  tasks 136
rmail
  See also email
  configuring 321
rollback segment 58
rolling back data
  transaction control 283
round-robin partitioning 348, 360
row error log files
  permissions 28
row error logging
  active sources 260
row indicators
  reject file 478
rows to skip
  delimited files 692
Run if Previous Completed
  in Command Tasks 145
  session command 714
run options
  run continuously 115
  run on demand 115
  server initialization 115
running status 421
running, sessions 197
running, workflows 122

S
saving
  session logs 471
  workflow logs 459
scheduled status 421
scheduling
  configuring 114
  creating reusable scheduler 114
  disabling workflows 118
  editing 117
  end options 116
  error message 113
  permission 90
  run every 115
  run once 115
  run options 115
  schedule options 115
  start date 116
  start time 116
  workflows 112
searching
  for versioned objects in the Workflow Manager 76
  Workflow Manager 70
  Workflow Monitor 427
Sequence Generator transformation
  partitioning guidelines 353, 396
server
  See PowerCenter Server
  See also database-specific server
  selecting 122, 197
server code page
  See also PowerCenter Server
  affecting incremental aggregation 577
Server Grid Browser 453
Server Grid Editor 452
server grids
  connectivity 447
  creating 451
  definition 444
  distributing sessions 446
  increasing performance 661
  master servers 446
  overview 446
  requirements 448
  worker servers 446
server handling
  file targets 268
  fixed-width targets 269, 270
  multibyte data to file targets 271
  shift-sensitive data, targets 271
server logs
  messages 29
  overview 28
Server Manager session properties
  General tab 737
  Log and Error Handling tab 758
  Partitioning tab 762
  Source Location tab 754
  Time tab 755
  Transformations tab 761
server variables
  description 46
  email 333
  for multiple servers 445
  in Command tasks 188, 193
  list 47
  log files 46
servers
  assigned 444
  non-associated 444
session command settings
  session properties 711
session details
  monitoring sessions 434
session errors 201
session logs
  archiving 471
  changing location 498
  changing locations 471
  changing name 497
  changing names 471
  code page 475
  codes 463
  creation 11
  default name 470
  editing 419
  external loader error messages 527
  generating using UTF-8 463
  load summary 467
  locating 456, 469
  location 671
  log file settings 469, 470, 472, 474
  overview 31
  parameter 497
  permissions 28
  reading 463
  sample 466
  saving 678
  session details 31
  session parameter 497
  thread identification 465
  timestamp 472
  tracing levels 473
  transformation statistics 469
  viewing 474
  viewing dynamically 419
  viewing in Workflow Monitor 419
session output
  cache files 34
  control file 33
  incremental aggregation files 34
  indicator file 33
  performance detail file 31
  persistent lookup cache 35
  post-session email 33
  PowerCenter Server log 28
  reject file 32
  session logs 31
  target output file 33
session parameters
  database connection parameter 499
  defining 512
  in Command tasks 143
  naming conventions 496, 520
  overview 496
  reject file parameter 508
  session log parameter 497
  session parameter file 512
  source file parameter 502
  target file parameter 504
session properties
  Components tab 710
  Config Object tab 675
  constraint-based loading 251
  delimited files, sources 222
  delimited files, targets 266
  edit delimiter 690, 702
  edit null character 702
  email 332, 714
  external loader 682, 695
  fixed-width files, sources 220
  fixed-width files, targets 265
  FTP files 682, 695
  general settings 668
  General tab 668
  log files 469, 470, 472, 474
  Metadata Extensions tab 718
  null character, targets 265
  on failure email 332
  on success email 332
  output files, flat file 700
  partition attributes 351, 352
  Partitions View 705
  performance settings 674
  post-session email 332
  post-session shell command 714
  Properties tab 670
  reject file, flat file 263, 700
  reject file, relational 243, 698
  relational sources 214
  relational targets 240
  session command settings 711
  session retry on deadlock 246
  sort order 577
  source connections 211
  sources 210
  table name prefix 254
  target connection settings 682, 695
  target connections 237
  target load options 252, 697
  target-based commit 292
  targets 236
  Transformation node 703
  transformations 703
session properties comparison
  overview 736
session retry on deadlock
  See also Installation and Configuration Guide
  overview 246
sessions
  See also session logs
  See also session properties
  aborting 130, 200
  apply attributes to all instances 178
  assigning PowerCenter Servers 198
  caches 28
  configuring for multiple source files 231
  configuring to optimize join performance 384
  creating 175
  creating a session configuration object 183
  definition 2, 174
  description 132
  distributing in server grids 446
  DTM buffer memory 25
  editing 177
  editing privileges 178
  eliminating paging 621
  email 320
  enabling monitoring 436
  external loading 524, 553
  failure 200
  high-precision data 204
  identifying bottlenecks 639
  metadata extensions in 82
  monitoring counters 437
  multiple source files 230
  optimizing 636, 655
  output files 28
  overview 174
  parameter file 512
  parameters 496
  performance detail file 31
  performance tuning 636
  properties reference 667
  read-only 175
  removing assigned PowerCenter Servers 199
  running 197
  runtime operations overview 7
  session details file 31
  starting 197
  stopping 130, 200
  test load 244, 264
  truncating target tables 245
  using FTP 565
  validating 195
  viewing performance details 436
Setfolder
  pmcmd syntax 604
Setnowait
  pmcmd syntax 605
Setwait
  pmcmd syntax 605
shared memory
  Load Manager 24
shell commands
  executing in Command tasks 145
  make reusable 191
  post-session 188
  post-session properties 714
  pre-session 188
  using Command tasks 143
  using server variables 188, 193
  using session parameters 143
Showsettings
  pmcmd syntax 605
Shutdownserver
  pmcmd syntax 605
single-pass reading
  definition 647
sort order
  See also session properties
  affecting incremental aggregation 577
sorted flat files
  partitioning for optimized join performance 385
sorted ports
  caching requirements 621
sorted relational data
  partitioning for optimized join performance 387
Sorter transformation
  partitioning 392
  partitioning for optimized join performance 389
$Source
  session properties 672
source bottlenecks
  using a database query to identify 638
  using a read test session to identify 638
  using filter transformation to identify 637
source data
  capturing changes for aggregation 574
source databases
  database connection session parameter 499
  identifying bottlenecks 637
  optimizing 645
  optimizing by partitioning 663
  optimizing the query 645
  optimizing with conditional filters 646
source files
  accessing through FTP 560, 565
  configuring for multiple files 230, 231
  delimited properties 691
  fixed-width properties 689
  session parameter 502
  session properties 220, 687
  using parameters 502, 506
source location
  session properties 220, 687
Source Location tab
  in the Workflow Manager 754
  Server Manager session properties 754
source pipelines
  description 346
  pass-through 15
  reading 22
  stages 17
  target load order groups 22
  threads created 19
  with Joiner transformations 19
Source Qualifier transformation
  partitioning guidelines 347
source-based commit
  active sources 278
  description 278
sources
  code page 224
  code page, flat file 222
  connections 211
  delimiters 224
  escape character 691
  line sequential buffer length 225
  multiple sources in a session 230
  null character 689
  null character handling 227
  null characters 222
  overriding SQL query, session 216
  partitioning 371, 374
  quote character 691
  reading 22
  session properties 210
  specifying code page 689, 691
SQL
  configuring environment SQL 55
  guidelines for entering environment SQL 55
SQL queries
  in partitioned pipelines 371
stages
  description 17
staging areas
  removing to improve performance 659
start date, scheduling 116
Start tasks, definition 88
start time, scheduling 116
starting
  selecting a server 122, 197
  sessions 197
  start from task 124
  starting a part of a workflow 124
  starting tasks 125
  starting workflows using Workflow Manager 124
  Workflow Monitor 404
  workflows 122
Starttask
  pmcmd syntax 606
  using a parameter file 607
Startworkflow
  pmcmd syntax 607
  using a parameter file 608
statistics
  for Workflow Monitor 408
  viewing 408
status
  aborted 421
  aborting 421
  disabled 421
  failed 421
  in Workflow Monitor 421
  running 421
  scheduled 421
  stopped 421
  stopping 421
  succeeded 421
  suspended 127, 421
  suspending 127, 421
  tasks 421
  terminated 421
  unscheduled 421
  waiting 421
  workflows 421
stop on
  $PMSessionErrorThreshold 47
  error threshold 200
  errors 679
  pre- and post-session SQL errors 186
stopped status 421
stopping
  PowerCenter Server, See Installation and Configuration Guide
  in Workflow Monitor 418
  server handling 129
  sessions 130
  tasks 129
  using Control tasks 147
  workflows 129
stopping status 421
Stoptask
  pmcmd syntax 609
Stopworkflow
  pmcmd syntax 609
string operations
  minimizing for performance 653
sub-expressions
  replacing with local variables 652
succeeded status 421
Suspend On Error option 127
suspended status 127, 421
suspending
  behavior 127
  email 128
  resume in Workflow Monitor 417
  status 127
  workflows 127
  worklets 164
suspending status 421
suspension email 339
Sybase
  commit interval 253
Sybase IQ
  partitioning restrictions 379, 395
Sybase IQ external loader
  attributes 536
  bulk loading 643
  connections 551
  data precision 535
  delimited flat file targets 536
  fixed-width flat file targets 535
  multibyte data 535
  optional quotes 535
  overview 535
  support 524
Sybase SQL Server
  bulk loading 642
  connect string example 54
  optimizing 646
symmetric processing platform
  pipeline partitioning 24
system bottlenecks
  identifying 640
  UNIX 641
  Windows 640
system-level optimization
  improving network speed 660
  overview 660
  using additional CPUs 661

T
table name prefix
  target owner 254
table owner name
  session properties 216
  targets 254
$Target
  session properties 672
target connect groups
  committing data 278
target connection group
  Transaction Control transformation 289
target connection groups
  constraint-based loading 249
  defined 257
target connection settings
  session properties 682, 695
target databases
  bulk loading 642
  database connection session parameter 499
  identifying bottlenecks 637
  optimizing 642
  optimizing by partitioning 664
  optimizing Oracle target database 643
target files
  delimited 703
  fixed-width 702
target load order
  constraint-based loading 249
  groups 22
target load order groups
  defined 22
target owner
  table name prefix 254
target properties
  bulk mode 241
  test load 241
  update strategy 241
target tables
  truncating 245
target-based commit
  WriterWaitTimeout 277
target-based commit interval
  description 277
targets
  accessing through FTP 560, 568
  code page 267, 702, 703
  code page compatibility 235
  code page, flat file 266
  connection settings 695
  connections 237
  database connections 234
  delimiters 267
  file writer 236
  globalization features 234
  heterogeneous 274
  load, session properties 252, 697
  merging output files 380, 382
  multiple connections 274
  multiple types 274
  null characters 266
  output files 263
  output files for 33
  partitioning 378, 380
  relational settings 697
  relational writer 236
  session properties 236, 240
  specifying null character 702
  truncating tables 245
  viewing session detail 31
  writers 236
Task Developer
  creating tasks 133
  displaying and hiding tool name 41
Task view
  configuring 412
  customizing 412
  displaying 430
  filtering 431
  hiding 412
  opening and closing folders 407
  overview 402
  using 430
tasks
  aborted 421
  aborting 129, 421
  adding in workflows 92
  arranging 71
  Assignment tasks 140
  Command tasks 143
  configuring 135
  Control task 147
  copying 77
  creating 133
  creating in Task Developer 133
  creating in Workflow Designer 133
  Decision tasks 149
  disabled 421
  disabling 137
  email 328
  Event-Raise tasks 153
  Event-Wait tasks 153
  failed 421
  failing parent workflow 138
  in worklets 166
  inherited changes 136
  instances 136
  list of 132
  non-reusable 92
  overview 132
  promoting to reusable 136
  restarting in Workflow Monitor 416
  reusable 92
  reverting changes 136
  running 421
  show full name 41
  starting 125
  status 421
  stopped 421
  stopping 129, 421
  stopping and aborting in Workflow Monitor 418
  succeeded 421
  Timer tasks 161
  using Tasks toolbar 92
  validating 119
Tasks toolbar
  creating tasks 134
TCP/IP network protocol
  server settings 49
Teradata
  connect string example 54
Teradata external loader
  code page 538
  connections 551
  date format 538
  FastLoad attributes 545
  MultiLoad attributes 540
  overriding the control file 539
  support 524
  Teradata Warehouse Builder attributes 547
  TPump attributes 542
Teradata Warehouse Builder
  attributes 547
  operators 547
terminated status 421
Terse tracing levels
  See also Designer Guide
  defined 473
test load
  bulk loading 244
  enabling 671
  file targets 264
  number of rows to test 671
  relational targets 244
thread identification
  session log file 465
threads
  and partitions 18
  creation 13, 14
  mapping 14
  master 14
  post-session 14
  pre-session 14
  reader 14, 15
  transformation 14, 16
  types 14
  writer 14, 16
time
  configuring 38
  formats 38
Time tab
  duration options 756
  schedule options 755
  Server Manager session properties 755
  start options 756
  use absolute time option 757
Timer tasks
  absolute time 161, 162
  definition 161
  description 132
  example 161
  relative time 161, 162
  variables in 103
timestamps
  session logs 472
  workflow logs 460, 462
  Workflow Monitor 402
tool names
  displaying and hiding 41
toolbars 69
  adding tasks 92
  creating tasks 134
  using 69
  Workflow Monitor 415
Tracing Level
  optimizing 655
tracing levels
  See also Designer Guide
  Normal 473
  overriding 679
  session 473
  Terse 473
  Verbose Data 474
  Verbose Initialization 474
transaction
  defined 287
transaction boundary
  dropping 287
  transaction control 287
transaction control
  bulk loading 283
  end of file 284
  open transaction 287
  overview 287
  PowerCenter Server handling 283
  real-time sessions 287
  reject file 284
  rules and guidelines 290
  transaction control points 287
  transformation error 284
  transformation scope 287
  user-defined commit 283
transaction control point
  defined 287
Transaction Control transformation
  partitioning guidelines 356
  target connection group 289
transaction control unit
  defined 289
transaction generator
  active sources 259
  effective and ineffective 259
  transaction control points 287
transformation scope
  defined 287
  real-time processing 288
  transformations 288
transformation threads
  description 14, 16
transformations
  as partition points 353
  eliminating errors 648
  optimizing 639
  partitioning restrictions 395
  session properties 703
  statistics on 469
Transformations node
  properties 703
Transformations tab
  in the Server Manager 761
  in the Workflow Manager 761
Transformations view
  session properties 681
Treat Source Rows As
  bulk loading 252
Treat Source Rows As property
  overview 214
truncating
  Table Name Prefix 245
  target tables 245

U
unconnected transformations
  partitioning restrictions 353
Unicode mode
  See also Installation and Configuration Guide
  code pages 27
  session behavior 16
UNIX systems
  email 321
  external loader behavior 526
  PowerCenter Server as daemon 3
unscheduled status 421
Unsetfolder
  pmcmd syntax 610
update strategy
  target properties 241
Update Strategy transformation
  constraint-based loading 249
updating
  incrementally 579
URL
  adding through business documentation links 97
user-defined commit
  see also transaction control
  bulk loading 283
user-defined events
  declaring 155
  example 153
  waiting for 157
using multiple servers 444

V
validating 196
  expressions 97, 119
  tasks 119
  workflows 119, 120
  worklets 171
Varchar datatypes
  See also Designer Guide
  removing trailing blanks for optimization 653
variables
  email 333
  server 46
  workflow 103
Verbose Data tracing levels
  configuring session log 474
  See also Designer Guide
Verbose Initialization tracing levels
  configuring session log 474
  See also Designer Guide
Version
  pmcmd syntax 611
versioned objects
  See also Repository Guide
  checking in 74
  checking out 74
  searching for in the Workflow Manager 76
viewing
  reject file 476
  session logs 474
  workflow logs 462
W
waiting status 421
Waittask
  pmcmd syntax 611
Waitworkflow
  pmcmd syntax 611
web links
  adding to expressions 97
webzine l
windows
  customizing 69
  displaying and closing 69
  docking and undocking 69
  Navigator 67
  Output 67
  overview 67
  panning 40
  reloading 40
  Workflow Manager 67
  Workflow Monitor 402
  workspace 67
Windows System Tray
  accessing Workflow Monitor 404
Windows systems
  email 322
  external loader behavior 526
  Informatica service owner 322
  logon network security 325
  PowerCenter Server service 3
worker servers 446
Workflow Designer
  creating tasks 133
  displaying and hiding tool name 41
workflow logs
  archiving 459
  changing locations 461
  changing name 461
  codes 458
  configuring 460
  creation 9
  editing 419
  enabling and disabling 459, 461
  locating 456, 459
  log file settings 459, 460
  overview 30
  permissions 28
  reading 458
  sample 458
  timestamp 460
  viewing 462
  viewing dynamically 419
  viewing in Workflow Monitor 419
Workflow Manager
  adding repositories 73
  arrange 71
  checking out and in versioned objects 74
  configuring for multiple source files 231
  copying 77
  creating external loader connections 551
  customizing options 39
  date and time formats 38
  defining FTP connections 561
  display options 39
  entering object descriptions 73
  format options 42
  general options 39
  increasing network packet size 646
  managing multiple servers 444
  messages to Workflow Monitor 410
  overview 38, 46, 66
  registering the PowerCenter Server 46, 48
  searching for items 70
  searching for versioned objects 76
  setting up database connections 53, 56
  toolbars 69
  tools 66
  validating sessions 195
  windows 67, 69
  zooming the workspace 71
Workflow Monitor
  closing folders 407
  configuring 409
  connecting to repositories 405
  connecting to server 405
  customizing columns 412
  deleted servers 405
  deleted tasks 406
  disconnecting from server 405
  displaying servers 406
  dynamic logs 419
  editing logs 419
  filtering deleted tasks 406
  filtering servers 406
  filtering tasks in Task View 405, 431
  Gantt Chart view 402
  hiding columns 412
  hiding servers 406
  icon 404
  launching 404
  launching automatically 41
  listing tasks and workflows 424
Index 787
  log file editor 410
  Maximum Days 410
  Maximum Workflow Runs 410
  monitor modes 405
  navigating the Time window 425
  notification from Repository Server 410
  opening folders 407
  overview 402
  performing tasks 416
  permissions and privileges 403
  pinging the PowerCenter Server 405
  receive messages from Workflow Manager 410
  restarting tasks, workflows, and worklets 416
  resuming a workflow or worklet 417
  searching 427
  session details 434
  starting 404
  statistics 408
  stopping or aborting tasks and workflows 418
  switching views 403
  System Tray 404
  Task view 402
  time 402
  toolbars 415
  viewing history names 419
  viewing session logs 419
  viewing workflow logs 419
  workflow and task status 421
  zooming 426
workflow output
  email 33
  workflow logs 30
workflow parameter file 110
workflow properties
  log files 459, 460
  suspension email 339
workflow variables
  creating 110
  datatypes 105, 110
  default values 106, 109, 110
  keywords 104
  non-persistent variables 110
  persistent variables 110
  pre-defined 105
  start and current values 109
  SYSDATE 105
  user-defined 108
  using 103
  using in expressions 106
  WORKFLOWSTARTTIME 105
workflows
  aborted 421
  aborting 129, 421
  adding tasks 92
  assigning PowerCenter Servers 122
  branches 88
  copying 77
  creating 91
  definition 2, 88
  deleting 97
  developing 89, 91
  disabled 421
  disabling 118
  editing 98
  email 341
  events 88
  fail parent workflow 138
  failed 421
  guidelines 89
  links 88
  locking 8
  metadata extensions in 82
  monitor 89
  overview 88
  parameter file 9
  privileges 90
  properties reference 721
  removing assigned PowerCenter Servers 123
  restarting in Workflow Monitor 416
  resuming in Workflow Monitor 417
  running 7, 122, 421
  runtime operations overview 7
  scheduled 421
  scheduling 112
  selecting a server 89
  starting 122
  starting on non-associated server 444
  status 127, 421
  stopped 421
  stopping 129, 421
  stopping and aborting in Workflow Monitor 418
  succeeded 421
  suspended 421
  suspending 127, 421
  suspension email 339
  terminated 421
  unscheduled 421
  using tasks 132
  validating 119
  variables 103
  waiting 421
Worklet Designer
  displaying and hiding tool name 41
worklets
  adding tasks 166
  configuring properties 166
  create non-reusable worklets 165
  create reusable worklets 165
  declaring events 167
  developing 165
  email 341
  fail parent worklet 138
  metadata extensions in 82
  overriding variable value 169
  overview 164
  parameters tab 169
  persistent variable example 169
  persistent variables 169
  restarting in Workflow Monitor 416
  resuming in Workflow Monitor 417
  suspended 421
  suspending 164, 421
  unscheduled 421
  validating 171
  variables 169
  waiting 421
workspace
  color 42
  navigating 69
  setting colors 42
  setting fonts 42
  zooming 71
workspace file directory 41
writer threads
  description 14, 16
writers
  session properties 692
WriterWaitTimeout
  target-based commit 277
writing
  multibyte data to files 270
  to fixed-width files 268, 269

X
XML sources
  allocating memory 655
  numeric data handling 229
XML targets
  active sources 259
  partitioning restrictions 396

Z
zooming
  Workflow Manager 71
  Workflow Monitor 426