Informatica PowerCenter®
(Version 7.1.1)
Informatica PowerCenter Workflow Administration Guide
Version 7.1.1
August 2004
This software and documentation contain proprietary information of Informatica Corporation; they are provided under a license agreement
containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No
part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)
without prior consent of Informatica Corporation.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software
license agreement as provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR
12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to
us in writing. Informatica Corporation does not warrant that this documentation is error free.
Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks
of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be
trade names or trademarks of their respective owners.
Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington
University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.
Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU
Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials
are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the
implied warranties of merchantability and fitness for a particular purpose.
Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark
of Meta Integration Technology, Inc.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/).
The Apache Software is Copyright (c) 1999-2004 The Apache Software Foundation. All rights reserved.
DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information
provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or
changes in the products described in this documentation at any time without notice.
Table of Contents
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxv
New Features and Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvi
PowerCenter 7.1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvi
PowerCenter 7.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxviii
PowerCenter 7.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlii
About Informatica Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlviii
About this Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlix
Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlix
Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Webzine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Developer Network . . . . . . . . . . . . . . . . . . . . . . l
Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . li
Writing Historical Information to the Repository . . . . . . . . . . . . . . . . . . 10
Sending Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Data Transformation Manager (DTM) Process . . . . . . . . . . . . . . . . . . . . . . . 11
Reading the Session Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Expanding Variables and Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Creating the Session Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Validating Code Pages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Verifying Connection Object Permissions . . . . . . . . . . . . . . . . . . . . . . . 12
Running Pre-Session Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Running the Processing Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Running Post-Session Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Sending Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Understanding Processing Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Thread Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Threads and Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
PowerCenter Server Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Reading Source Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Blocking Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Block Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
System Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
CPU Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Load Manager Shared Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
DTM Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Cache Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Code Pages and Data Movement Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
ASCII Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Unicode Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Output Files and Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
PowerCenter Server Log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Workflow Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Session Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Session Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Performance Detail File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Row Error Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Recovery Tables and Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Indicator File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Output File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Zooming the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Working with Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Viewing Object Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Entering Descriptions for Repository Objects . . . . . . . . . . . . . . . . . . . . . 73
Renaming Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Checking Out and In Versioned Repository Objects . . . . . . . . . . . . . . . . . . . 74
Checking Out Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Checking In Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Searching For Versioned Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Copying Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Copying Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Copying Workflow Segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Comparing Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Steps for Comparing Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Working with Metadata Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Creating a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Editing a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Deleting a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Keyboard Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Scheduling a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Creating a Reusable Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Configuring Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Editing Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Disabling Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Validating a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Expression Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Task Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Workflow Properties Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Running Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Running the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Selecting a Server to Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . 122
Assigning the PowerCenter Server to a Workflow . . . . . . . . . . . . . . . . . 122
Running a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Running a Part of a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Running a Task in the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Suspending the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Configuring Suspension Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Stopping or Aborting the Workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Server Handling of Stop and Abort . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Stopping or Aborting a Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Working with File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Configuring Source Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Configuring Fixed-Width File Properties . . . . . . . . . . . . . . . . . . . . . . . 220
Configuring Delimited File Properties . . . . . . . . . . . . . . . . . . . . . . . . . 222
Configuring Line Sequential Buffer Length . . . . . . . . . . . . . . . . . . . . . 225
Server Handling for File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Multibyte Character Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Null Character Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Row Length Handling for Fixed-Width Flat Files . . . . . . . . . . . . . . . . . 228
Numeric Data Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Using a File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Creating the File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Configuring a Session to Use a File List . . . . . . . . . . . . . . . . . . . . . . . . 231
Working with File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Configuring Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Configuring Fixed-Width Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Configuring Delimited Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
Server Handling for File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Writing to Fixed-Width Flat Files with Relational Target Definitions . . 268
Writing to Fixed-Width Files with Flat File Target Definitions . . . . . . . 269
Writing Multibyte Data to Fixed-Width Flat Files . . . . . . . . . . . . . . . . 270
Null Characters in Fixed-Width Files . . . . . . . . . . . . . . . . . . . . . . . . . 272
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Writing Metadata to Flat File Targets . . . . . . . . . . . . . . . . . . . . . . . . . 273
Working with Heterogeneous Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Recovering a Suspended Workflow with Sequential Sessions . . . . . . . . . 305
Recovering a Suspended Workflow with Concurrent Sessions . . . . . . . . 306
Steps for Recovering a Suspended Workflow . . . . . . . . . . . . . . . . . . . . . 307
Recovering a Failed Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Recovering a Failed Workflow with Sequential Sessions . . . . . . . . . . . . . 308
Recovering a Failed Workflow with Concurrent Sessions . . . . . . . . . . . . 309
Steps for Recovering a Failed Workflow . . . . . . . . . . . . . . . . . . . . . . . . 310
Recovering a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Recovering Sequential Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Recovering Concurrent Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Steps for Recovering a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Server Handling for Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Verifying Recovery Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Running Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Completing Unrecoverable Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Chapter 16: Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
Workflow Log Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Configuring Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Viewing Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Session Log Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Load Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Detailed Transformation Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Configuring Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Viewing Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Locating Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Reading Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
Calculating the Lookup Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 629
Calculating the Lookup Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 631
Rank Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Calculating the Rank Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Calculating the Rank Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 763
Welcome to PowerCenter, Informatica's software product that delivers an open, scalable data
integration solution addressing the complete life cycle of data integration projects,
including data warehouses and data marts, data migration, data synchronization, and
information hubs. PowerCenter combines the latest technology enhancements for reliably
managing data repositories and delivering information resources in a timely, usable, and
efficient manner.
The PowerCenter metadata repository coordinates and drives a variety of core functions,
including extracting, transforming, loading, and managing data. The PowerCenter Server can
extract large volumes of data from multiple platforms, handle complex transformations on the
data, and support high-speed loads. PowerCenter can simplify and accelerate the process of
moving data warehouses from development to test to production.
New Features and Enhancements
This section describes new features and enhancements to PowerCenter 7.1.1, 7.1, and 7.0.
PowerCenter 7.1.1
This section describes new features and enhancements to PowerCenter 7.1.1.
Data Profiling
♦ Data sampling. You can create a data profile for a sample of source data instead of the
entire source. You can view a profile from a random sample of data, a specified percentage
of data, or for a specified number of rows starting with the first row.
♦ Verbose data enhancements. You can specify the type of verbose data you want the
PowerCenter Server to write to the Data Profiling warehouse. The PowerCenter Server can
write all rows, the rows that meet the business rule, or the rows that do not meet the
business rule.
♦ Session enhancement. You can save sessions that you create from the Profile Manager to
the repository.
♦ Domain Inference function tuning. You can configure the Data Profiling Wizard to filter
the Domain Inference function results. You can configure a maximum number of patterns
and a minimum pattern frequency. You may want to narrow the scope of patterns returned
to view only the primary domains, or you may want to widen the scope of patterns
returned to view exception data.
♦ Row Uniqueness function. You can determine unique rows for a source based on a
selection of columns for the specified source.
♦ Define mapping, session, and workflow prefixes. You can define default mapping,
session, and workflow prefixes for the mappings, sessions, and workflows generated when
you create a data profile.
♦ Profile mapping display in the Designer. The Designer displays profile mappings under a
profile mappings node in the Navigator.
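The Row Uniqueness function described above can be pictured with a short sketch (for illustration only; this is not PowerCenter code, and the sample column names are invented):

```python
from collections import Counter

def unique_rows(rows, key_columns):
    """Return the rows whose values in key_columns occur exactly once."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [r for r in rows if counts[tuple(r[c] for c in key_columns)] == 1]

customers = [
    {"id": 1, "name": "Ada",  "city": "London"},
    {"id": 2, "name": "Ada",  "city": "London"},    # duplicate on (name, city)
    {"id": 3, "name": "Alan", "city": "Bletchley"},
]
print(unique_rows(customers, ["name", "city"]))     # only the third row is unique
```

Uniqueness is evaluated over the selected columns only, so rows that differ in other columns can still be flagged as duplicates.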
PowerCenter Server
♦ Code page. PowerCenter supports additional Japanese language code pages, such as JIPSE-kana, JEF-kana, and MELCOM-kana.
♦ Flat file partitioning. When you create multiple partitions for a flat file source session, you
can configure the session to create multiple threads to read the flat file source.
♦ pmcmd. You can use parameter files that reside on a local machine with the Startworkflow
command in the pmcmd program. When you use a local parameter file, pmcmd passes
variables and values in the file to the PowerCenter Server.
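A local parameter file is a plain text file of section headings and assignments. A minimal sketch follows (the folder, workflow, and parameter names are invented for illustration; see the parameter file documentation for the exact heading syntax):

```
[MyFolder.WF:wf_daily_load]
$$LoadDate=08/01/2004
$InputFile1=/data/src/orders.dat
```

When you pass a file like this to the Startworkflow command, pmcmd sends these values to the PowerCenter Server for use when the workflow runs.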
♦ SuSE Linux support. The PowerCenter Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM DB2, Oracle, and Sybase sources, targets, and repositories using
native drivers. Use ODBC drivers to access other sources and targets.
♦ Reserved word support. If any source, target, or lookup table name or column name
contains a database reserved word, you can create and maintain a file, reswords.txt,
containing reserved words. When the PowerCenter Server initializes a session, it searches
for reswords.txt in the PowerCenter Server installation directory. If the file exists, the
PowerCenter Server places quotes around matching reserved words when it executes SQL
against the database.
♦ Teradata external loader. When you load to Teradata using an external loader, you can
now override the control file. Depending on the loader you use, you can also override the
error, log, and work table names by specifying different tables on the same or different
Teradata database.
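The effect of reswords.txt on generated SQL can be sketched as follows (an illustration of the quoting behavior only, not PowerCenter code; the quote character actually applied depends on the database):

```python
def quote_reserved(identifiers, reserved_words, quote='"'):
    """Wrap any identifier that matches a reserved word in quotes."""
    reserved = {w.upper() for w in reserved_words}
    return [f"{quote}{name}{quote}" if name.upper() in reserved else name
            for name in identifiers]

# e.g. a table with columns named after database reserved words
cols = ["ORDER", "customer_id", "MONTH"]
print(quote_reserved(cols, ["order", "month"]))
# ['"ORDER"', 'customer_id', '"MONTH"']
```

Only names that match an entry in the reserved word list are quoted; all other identifiers pass through unchanged.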
Repository
♦ Exchange metadata with other tools. You can exchange source and target metadata with
other BI or data modeling tools, such as Business Objects Designer. You can export or
import multiple objects at a time. When you export metadata, the PowerCenter Client
creates a file format recognized by the target tool.
Repository Server
♦ pmrep. You can use pmrep to perform the following functions:
− Remove repositories from the Repository Server cache entry list.
− Enable enhanced security when you create a relational source or target connection in the
repository.
− Update a connection attribute value when you update the connection.
♦ SuSE Linux support. The Repository Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM DB2, Oracle, and Sybase repositories.
Security
♦ Oracle OS Authentication. You can now use Oracle OS Authentication to authenticate
database users. Oracle OS Authentication allows you to log on to an Oracle database if you
have a logon to the operating system. You do not need to know a database user name and
password. PowerCenter uses Oracle OS Authentication when the user name for an Oracle
connection is PmNullUser.
♦ Pipeline partitioning. You can create multiple partitions in a session containing web
service source and target definitions. The PowerCenter Server creates a connection to the
Web Services Hub based on the number of sources, targets, and partitions in the session.
XML
♦ Multi-level pivoting. You can now pivot more than one multiple-occurring element in an
XML view. You can also pivot the view row.
PowerCenter 7.1
This section describes new features and enhancements to PowerCenter 7.1.
Data Profiling
♦ Data Profiling for VSAM sources. You can now create a data profile for VSAM sources.
♦ Support for verbose mode for source-level functions. You can now create data profiles
with source-level functions and write data to the Data Profiling warehouse in verbose
mode.
♦ Aggregator function in auto profiles. Auto profiles now include the Aggregator function.
♦ Creating auto profile enhancements. You can now select the columns or groups you want
to include in an auto profile and enable verbose mode for the Distinct Value Count
function.
♦ Purging data from the Data Profiling warehouse. You can now purge data from the Data
Profiling warehouse.
♦ Source View in the Profile Manager. You can now view data profiles by source definition
in the Profile Manager.
♦ PowerCenter Data Profiling report enhancements. You can now view PowerCenter Data
Profiling reports in a separate browser window, resize columns in a report, and view
verbose data for Distinct Value Count functions.
♦ Prepackaged domains. Informatica provides a set of prepackaged domains that you can
include in a Domain Validation function in a data profile.
Documentation
♦ Web Services Provider Guide. This is a new book that describes the functionality of Real-time
Web Services. It also includes information from the version 7.0 Web Services Hub Guide.
♦ XML User Guide. This book consolidates XML information previously documented in the
Designer Guide, Workflow Administration Guide, and Transformation Guide.
Licensing
Informatica provides licenses for each CPU and each repository rather than for each
installation. Informatica provides licenses for product, connectivity, and options. You store
the license keys in a license key file. You can manage the license files using the Repository
Server Administration Console, the PowerCenter Server Setup, and the command line
program, pmlic.
PowerCenter Server
♦ 64-bit support. You can now run 64-bit PowerCenter Servers on AIX and HP-UX
(Itanium).
♦ Partitioning enhancements. If you have the Partitioning option, you can define up to 64
partitions at any partition point in a pipeline that supports multiple partitions.
♦ PowerCenter Server processing enhancements. The PowerCenter Server now reads a
block of rows at a time. This improves processing performance for most sessions.
♦ CLOB/BLOB datatype support. You can now read and write CLOB/BLOB datatypes.
Repository Server
♦ Updating repository statistics. PowerCenter now identifies and updates statistics for all
repository tables and indexes when you copy, upgrade, and restore repositories. This
improves performance when PowerCenter accesses the repository.
♦ Increased repository performance. You can increase repository performance by skipping
information when you copy, back up, or restore a repository. You can choose to skip MX
data, workflow and session log history, and deploy group history.
♦ pmrep. You can use pmrep to back up, disable, or enable a repository, delete a relational
connection from a repository, delete repository details, truncate log files, and run multiple
pmrep commands sequentially. You can also use pmrep to create, modify, and delete a
folder.
Repository
♦ Exchange metadata with business intelligence tools. You can export metadata to and
import metadata from other business intelligence tools, such as Cognos Report Net and
Business Objects.
♦ Object import and export enhancements. You can compare objects in an XML file to
objects in the target repository when you import objects.
♦ MX views. MX views have been added to help you analyze metadata stored in the
repository. REP_SERVER_NET and REP_SERVER_NET_REF views allow you to see
information about server grids. REP_VERSION_PROPS allows you to see the version
history of all objects in a PowerCenter repository.
Transformations
♦ Flat file lookup. You can now perform lookups on flat files. When you create a Lookup
transformation using a flat file as a lookup source, the Designer invokes the Flat File
Wizard. You can also use a lookup file parameter if you want to change the name or
location of a lookup between session runs.
♦ Dynamic lookup cache enhancements. When you use a dynamic lookup cache, the
PowerCenter Server can ignore some ports when it compares values in lookup and input
ports before it updates a row in the cache. Also, you can choose whether the PowerCenter
Server outputs old or new values from the lookup/output ports when it updates a row. You
might want to output old values from lookup/output ports when you use the Lookup
transformation in a mapping that updates slowly changing dimension tables.
♦ Union transformation. You can use the Union transformation to merge multiple sources
into a single pipeline. The Union transformation is similar to using the UNION ALL SQL
statement to combine the results from two or more SQL statements.
♦ Custom transformation API enhancements. The Custom transformation API includes
new array-based functions that allow you to create procedure code that receives and
outputs a block of rows at a time. Use these functions to take advantage of the
PowerCenter Server processing enhancements.
♦ Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.
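The Union transformation passes every row from every input into the output pipeline, duplicates included, much like UNION ALL. A short sketch of that merge (for illustration only, not PowerCenter code):

```python
from itertools import chain

def union_all(*pipelines):
    """Merge several row sources into one stream, keeping duplicates,
    as the Union transformation (and SQL UNION ALL) does."""
    return list(chain(*pipelines))

east = [("E1", 100), ("E2", 250)]
west = [("W1", 300), ("E2", 250)]   # the repeated row is kept
print(union_all(east, west))
```

Because no duplicate elimination occurs, the output row count is always the sum of the input row counts.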
Usability
♦ Viewing active folders. The Designer and the Workflow Manager highlight the active
folder in the Navigator.
♦ Enhanced printing. The quality of printed workspace has improved.
Version Control
You can run object queries that return shortcut objects. You can also run object queries based
on the latest status of an object. The query can return local objects that are checked out, the
latest version of checked in objects, or a collection of all older versions of objects.
Note: PowerCenter Connect for Web Services allows you to create sources, targets, and
transformations to call web services hosted by other providers. For more information, see the
PowerCenter Connect for Web Services User and Administrator Guide.
Workflow Monitor
The Workflow Monitor includes the following performance and usability enhancements:
♦ When you connect to the PowerCenter Server, you no longer distinguish between online
and offline mode.
♦ You can open multiple instances of the Workflow Monitor on one machine.
♦ You can simultaneously monitor multiple PowerCenter Servers registered to the same
repository.
♦ The Workflow Monitor includes improved options for filtering tasks by start and end
time.
♦ The Workflow Monitor displays workflow runs in Task view chronologically with the most
recent run at the top. It displays folders alphabetically.
♦ You can remove the Navigator and Output window.
XML Support
PowerCenter XML support now includes the following features:
♦ Enhanced datatype support. You can use XML schemas that contain simple and complex
datatypes.
♦ Additional options for XML definitions. When you import XML definitions, you can
choose how you want the Designer to represent the metadata associated with the imported
files. You can choose to generate XML views using hierarchy or entity relationships. In a
view with hierarchy relationships, the Designer expands each element and reference under
its parent element. When you create views with entity relationships, the Designer creates
separate entities for references and multiple-occurring elements.
♦ Synchronizing XML definitions. You can synchronize one or more XML definitions when
the underlying schema changes. You can synchronize an XML definition with any
repository definition or file used to create the XML definition, including relational sources
or targets, XML files, DTD files, or schema files.
♦ XML workspace. You can edit XML views and relationships between views in the
workspace. You can create views, add or delete columns from views, and define
relationships between views.
♦ Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.
♦ Support for circular references. Circular references occur when an element is a direct or
indirect child of itself. PowerCenter now supports XML files, DTD files, and XML
schemas that use circular definitions.
♦ Increased performance for large XML targets. You can create XML files of several
gigabytes in a PowerCenter 7.1 XML session by using the following enhancements:
− Spill to disk. You can specify the size of the cache used to store the XML tree. If the size
of the tree exceeds the cache size, the XML data spills to disk in order to free up
memory.
− User-defined commits. You can define commits to trigger flushes for XML target files.
− Support for multiple XML output files. You can output XML data to multiple XML
targets. You can also define the file names for XML output files in the mapping.
PowerCenter 7.0
This section describes new features and enhancements to PowerCenter 7.0.
Data Profiling
If you have the Data Profiling option, you can profile source data to evaluate its structure and
content and to detect patterns and exceptions. For example, you can determine implicit
datatypes, suggest candidate keys, detect data patterns, and evaluate join criteria. After you
create a profiling
warehouse, you can create profiling mappings and run sessions. Then you can view reports
based on the profile data in the profiling warehouse.
The PowerCenter Client provides a Profile Manager and a Profile Wizard to complete these
tasks.
Documentation
♦ Glossary. The Installation and Configuration Guide contains a glossary of new PowerCenter
terms.
♦ Installation and Configuration Guide. The connectivity information in the Installation
and Configuration Guide is consolidated into two chapters. This book now contains
chapters titled “Connecting to Databases from Windows” and “Connecting to Databases
from UNIX.”
♦ Upgrading metadata. The Installation and Configuration Guide now contains a chapter
titled “Upgrading Repository Metadata.” This chapter describes changes to repository
objects impacted by the upgrade process. The change in functionality for existing objects
depends on the version of the existing objects. Consult the upgrade information in this
chapter for each upgraded object to determine whether the upgrade applies to your current
version of PowerCenter.
Functions
♦ Soundex. The Soundex function encodes a string value into a four-character string.
SOUNDEX works for characters in the English alphabet (A-Z). It uses the first character
of the input string as the first character in the return value and encodes the remaining
three unique consonants as numbers.
♦ Metaphone. The Metaphone function encodes string values. You can specify the length of
the string that you want to encode. METAPHONE encodes characters of the English
language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.
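The Soundex encoding described above can be sketched in Python. This is an illustrative
implementation of the classic Soundex algorithm, not PowerCenter's SOUNDEX function;
handling of empty strings and non-alphabetic input is simplified.

```python
def soundex(name):
    """Classic Soundex sketch: keep the first letter, encode the
    remaining consonants as digits, and pad or truncate to four
    characters."""
    codes = {}
    for digit, letters in [("1", "BFPV"), ("2", "CGJKQSXZ"),
                           ("3", "DT"), ("4", "L"),
                           ("5", "MN"), ("6", "R")]:
        for letter in letters:
            codes[letter] = digit

    name = name.upper()
    result = name[0]
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        if ch not in "HW":  # H and W do not separate same-code letters
            prev = digit
    return (result + "000")[:4]
```

For example, soundex("Robert") and soundex("Rupert") both encode to R163, since the
vowels are dropped and b and p share the same consonant code.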
Installation
♦ Remote PowerCenter Client installation. You can create a control file containing
installation information, and distribute it to other users to install the PowerCenter Client.
You access the Informatica installation CD from the command line to create the control
file and install the product.
PowerCenter Server
♦ DB2 bulk loading. You can enable bulk loading when you load to IBM DB2 8.1.
♦ Distributed processing. If you purchase the Server Grid option, you can group
PowerCenter Servers registered to the same repository into a server grid. In a server grid,
PowerCenter Servers balance the workload among all the servers in the grid.
♦ Row error logging. The session configuration object has new properties that allow you to
define error logging. You can choose to log row errors in a central location to help
understand the cause and source of errors.
♦ External loading enhancements. When using external loaders on Windows, you can now
choose to load from a named pipe. When using external loaders on UNIX, you can now
choose to load from staged files.
♦ External loading using Teradata Warehouse Builder. You can use Teradata Warehouse
Builder to load to Teradata. You can choose to insert, update, upsert, or delete data.
Additionally, Teradata Warehouse Builder can simultaneously read from multiple sources
and load data into one or more tables.
♦ Mixed mode processing for Teradata external loaders. You can now use data-driven load
mode with Teradata external loaders. When you select data-driven loading, the
PowerCenter Server flags rows for insert, delete, or update. It writes a column in the target
file or named pipe to indicate the update strategy. The control file uses these values to
determine how to load data to the target.
♦ Concurrent processing. The PowerCenter Server now reads data concurrently from
sources within a target load order group. This enables more efficient joins with minimal
usage of memory and disk cache.
♦ Real-time processing enhancements. You can now use real-time processing in sessions that
also process active transformations, such as the Aggregator transformation. You can apply
the transformation logic to rows defined by transaction boundaries.
Repository Server
♦ Object export and import enhancements. You can now export and import objects using
the Repository Manager and pmrep. You can export and import multiple objects and
object types. You can export and import objects with or without their dependent objects.
You can also export objects from a query result or object history.
♦ pmrep commands. You can use pmrep to perform change management tasks, such as
maintaining deployment groups and labels, checking in, deploying, importing, exporting,
and listing objects. You can also use pmrep to run queries. The deployment and object
import commands require you to use a control file to define options and resolve conflicts.
♦ Trusted connections. You can now use a Microsoft SQL Server trusted connection to
connect to the repository.
Security
♦ LDAP user authentication. You can now use default repository user authentication or
Lightweight Directory Access Protocol (LDAP) to authenticate users. If you use LDAP, the
repository maintains an association between your repository user name and your external
login name. When you log in to the repository, the security module passes your login name
to the external directory for authentication. The repository maintains a status for each
user. You can now enable or disable users from accessing the repository by changing the
status. You do not have to delete user names from the repository.
♦ Use Repository Manager privilege. The Use Repository Manager privilege allows you to
perform tasks in the Repository Manager, such as copying objects, maintaining labels, and
changing object status. You can perform the same tasks in the Designer and Workflow Manager if
you have the Use Designer and Use Workflow Manager privileges.
♦ Audit trail. You can track changes to repository users, groups, privileges, and permissions
through the Repository Server Administration Console. The Repository Agent logs
security changes to a log file stored in the Repository Server installation directory. The
audit trail log contains information, such as changes to folder properties, adding or
removing a user or group, and adding or removing privileges.
Transformations
♦ Custom transformation. Custom transformations operate in conjunction with procedures
you create outside of the Designer interface to extend PowerCenter functionality. The
Custom transformation replaces the Advanced External Procedure transformation. You can
create Custom transformations with multiple input and output groups, and you can
compile the procedure with any C compiler.
You can create templates that customize the appearance and available properties of a
Custom transformation you develop. You can specify the icons used for the transformation,
the colors, and the properties a mapping developer can modify. When you create a Custom
transformation template, distribute the template with the DLL or shared library you
develop.
♦ Joiner transformation. You can use the Joiner transformation to join two data streams that
originate from the same source.
Version Control
The PowerCenter Client and repository introduce features that allow you to create and
manage multiple versions of objects in the repository. Version control allows you to maintain
multiple versions of an object, control development on the object, track changes, and use
deployment groups to copy specific groups of objects from one repository to another. Version
control in PowerCenter includes the following features:
♦ Object versioning. Individual objects in the repository are now versioned. This allows you
to store multiple copies of a given object during the development cycle. Each version is a
separate object with unique properties.
♦ Check out and check in versioned objects. You can check out and reserve an object you
want to edit, and check in the object when you are ready to create a new version of the
object in the repository.
♦ Compare objects. The Repository Manager and Workflow Manager allow you to compare
two repository objects of the same type to identify differences between them. You can
compare Designer objects and Workflow Manager objects in the Repository Manager. You
can compare tasks, sessions, worklets, and workflows in the Workflow Manager. The
PowerCenter Client tools allow you to compare objects across open folders and
repositories. You can also compare different versions of the same object.
♦ Delete or purge a version. You can delete an object from view and continue to store it in
the repository. You can recover or undelete deleted objects. If you want to permanently
remove an object version, you can purge it from the repository.
♦ Deployment. Unlike copying a folder, copying a deployment group allows you to copy a
select number of objects from multiple folders in the source repository to multiple folders
in the target repository. This gives you greater control over the specific objects copied from
one repository to another.
♦ Deployment groups. You can create a deployment group that contains references to
objects from multiple folders across the repository. You can create a static deployment
group that you manually add objects to, or create a dynamic deployment group that uses a
query to populate the group.
♦ Labels. A label is an object that you can apply to versioned objects in the repository. This
allows you to associate multiple objects in groups defined by the label. You can use labels
to track versioned objects during development, improve query results, and organize groups
of objects for deployment or export and import.
♦ Queries. You can create a query that specifies conditions to search for objects in the
repository. You can save queries for later use. You can make a private query, or you can
share it with all users in the repository.
♦ Track changes to an object. You can view a history that includes all versions of an object
and compare any version of the object in the history to any other version. This allows you
to see the changes made to an object over time.
XML Support
PowerCenter contains XML features that allow you to validate an XML file against an XML
schema, declare multiple namespaces, use XPath to locate XML nodes, increase performance
for large XML files, format your XML file output for increased readability, and parse or
generate XML data from various sources. XML support in PowerCenter includes the
following features:
♦ XML schema. You can use an XML schema to validate an XML file and to generate source
and target definitions. XML schemas allow you to declare multiple namespaces so you can
use prefixes for elements and attributes. XML schemas also allow you to define some
complex datatypes.
♦ XPath support. The XML wizard allows you to view the structure of an XML schema. You
can use XPath to locate XML nodes.
♦ Increased performance for large XML files. When you process an XML file or stream, you
can set commits and periodically flush XML data to the target instead of writing all the
output at the end of the session. You can choose to append the data to the same target file
or create a new target file after each flush.
♦ XML target enhancements. You can format the XML target file so that you can easily view
the XML file in a text editor. You can also configure the PowerCenter Server to not output
empty elements to the XML target.
Usability
♦ Copying objects. You can now copy objects from all the PowerCenter Client tools using
the copy wizard to resolve conflicts. You can copy objects within folders, to other folders,
and to different repositories. Within the Designer, you can also copy segments of
mappings to a workspace in a new folder or repository.
♦ Comparing objects. You can compare workflows and tasks from the Workflow Manager.
You can also compare all objects from within the Repository Manager.
♦ Change propagation. When you edit a port in a mapping, you can choose to propagate
changed attributes throughout the mapping. The Designer propagates ports, expressions,
and conditions based on the direction that you propagate and the attributes you choose to
propagate.
♦ Enhanced partitioning interface. The Session Wizard is enhanced to provide a graphical
depiction of a mapping when you configure partitioning.
♦ Revert to saved. You can now revert to the last saved version of an object in the Workflow
Manager. When you do this, the Workflow Manager accesses the repository to retrieve the
last-saved version of the object.
♦ Enhanced validation messages. The PowerCenter Client writes messages in the Output
window that describe why it invalidates a mapping or workflow when you modify a
dependent object.
♦ Validate multiple objects. You can validate multiple objects in the repository without
fetching them into the workspace. You can save and optionally check in objects that
change from invalid to valid status as a result of the validation. You can validate sessions,
mappings, mapplets, workflows, and worklets.
♦ View dependencies. Before you edit or delete versioned objects, such as sources, targets,
mappings, or workflows, you can view dependencies to see the impact on other objects.
You can view parent and child dependencies and global shortcuts across repositories.
Viewing dependencies helps you modify objects and composite objects without breaking
dependencies.
♦ Refresh session mappings. In the Workflow Manager, you can refresh a session mapping.
About Informatica Documentation
The complete set of documentation for PowerCenter includes the following books:
♦ Data Profiling Guide. Provides information about how to profile PowerCenter sources to
evaluate source data and detect patterns and exceptions.
♦ Designer Guide. Provides information needed to use the Designer. Includes information to
help you create mappings, mapplets, and transformations. Also includes a description of
the transformation datatypes used to process and transform source data.
♦ Getting Started. Provides basic tutorials for getting started.
♦ Installation and Configuration Guide. Provides information needed to install and
configure the PowerCenter tools, including details on environment variables and database
connections.
♦ PowerCenter Connect® for JMS® User and Administrator Guide. Provides information
to install PowerCenter Connect for JMS, build mappings, extract data from JMS messages,
and load data into JMS messages.
♦ Repository Guide. Provides information needed to administer the repository using the
Repository Manager or the pmrep command line program. Includes details on
functionality available in the Repository Manager and Administration Console, such as
creating and maintaining repositories, folders, users, groups, and permissions and
privileges.
♦ Transformation Language Reference. Provides syntax descriptions and examples for each
transformation function provided with PowerCenter.
♦ Transformation Guide. Provides information on how to create and configure each type of
transformation in the Designer.
♦ Troubleshooting Guide. Lists error messages that you might encounter while using
PowerCenter. Each error message includes one or more possible causes and actions that
you can take to correct the condition.
♦ Web Services Provider Guide. Provides information you need to install and configure the Web
Services Hub. This guide also provides information about how to use the web services that the
Web Services Hub hosts. The Web Services Hub hosts Real-time Web Services, Batch Web
Services, and Metadata Web Services.
♦ Workflow Administration Guide. Provides information to help you create and run
workflows in the Workflow Manager, as well as monitor workflows in the Workflow
Monitor. Also contains information on administering the PowerCenter Server and
performance tuning.
♦ XML User Guide. Provides information you need to create XML definitions from XML,
XSD, or DTD files, and relational or other XML definitions. Includes information on
running sessions with XML data. Also includes details on using the midstream XML
transformations to parse or generate XML data within a pipeline.
About this Book
The Workflow Administration Guide is written for developers and administrators who are
responsible for creating workflows and sessions, running workflows, and administering the
PowerCenter Server. This guide assumes you have knowledge of your operating systems,
relational database concepts, and the database engines, flat files, or mainframe systems in your
environment. This guide also assumes you are familiar with the interface requirements for
your supporting applications.
The material in this book is available for online use.
Document Conventions
This guide uses the following formatting conventions:
italicized monospaced text This is the variable name for a value you enter as part of an
operating system command. This is generic text that should be
replaced with user-supplied values.
bold monospaced text This is an operating system command you enter from a prompt to
run a task.
Warning: The following paragraph notes situations where you can overwrite
or corrupt data, unless you follow the specified procedure.
Other Informatica Resources
In addition to the product manuals, Informatica provides these other resources:
♦ Informatica Customer Portal
♦ Informatica Webzine
♦ Informatica web site
♦ Informatica Developer Network
♦ Informatica Technical Support
The Informatica Developer Network site contains information on how to create, market, and
support customer-oriented add-on solutions based on Informatica’s interoperability interfaces.
Belgium
Phone: +32 15 281 702
Hours: 9 a.m. - 5:30 p.m. (local time)
France
Phone: +33 1 41 38 92 26
Hours: 9 a.m. - 5:30 p.m. (local time)
Germany
Phone: +49 1805 702 702
Hours: 9 a.m. - 5:30 p.m. (local time)
Netherlands
Phone: +31 306 082 089
Hours: 9 a.m. - 5:30 p.m. (local time)
Singapore
Phone: +65 322 8589
Hours: 9 a.m. - 5 p.m. (local time)
Switzerland
Phone: +41 800 81 80 70
Hours: 8 a.m. - 5 p.m. (local time)
Chapter 1
Overview
The PowerCenter Server moves data from sources to targets based on workflow and mapping
metadata stored in a repository. You can register multiple PowerCenter Servers to a repository.
A workflow is a set of instructions that describes how and when to run tasks related to
extracting, transforming, and loading data. The PowerCenter Server runs workflow tasks
according to the conditional links connecting the tasks. You can run a task by placing it in a
workflow.
When you have multiple PowerCenter Servers, you can assign a server to start a workflow or a
session. This allows you to distribute the workload. You can increase performance by using a
server grid to balance the workload. A server grid is a server object that allows you to
automate the distribution of sessions across multiple servers. For more information about
server grids, see “Working with Server Grids” on page 446.
A session is a type of workflow task. A session is a set of instructions that describes how to
move data from sources to targets using a mapping. Other workflow tasks include commands,
decisions, timers, pre-session SQL commands, post-session SQL commands, and email
notification. For details on workflow tasks, see “Working with Tasks” on page 131.
Use the Designer to import source and target definitions into the repository and to build
mappings. A mapping is a set of source and target definitions linked by transformation
objects that define the rules for data transformation. Use the Workflow Manager to develop
and manage workflows. Use the Workflow Monitor to monitor workflows and stop the
PowerCenter Server.
When a workflow starts, the PowerCenter Server retrieves mapping, workflow, and session
metadata from the repository to extract data from the source, transform it, and load it into
the target. It also runs the tasks in the workflow. The PowerCenter Server uses Load Manager
and Data Transformation Manager (DTM) processes to run the workflow.
Figure 1-1 shows the processing path between the PowerCenter Server, repository, source, and
target:
[Figure 1-1: the PowerCenter Server reads source data from the source, follows instructions
from metadata in the repository, and writes transformed data to the target.]
Workflow Processes
The PowerCenter Server uses both process memory and system shared memory to perform
these tasks. It runs as a daemon on UNIX and a service on Windows. The PowerCenter Server
uses the following processes to run a workflow:
♦ The Load Manager process. Starts and locks the workflow, runs workflow tasks, and starts
the DTM to run sessions.
♦ The Data Transformation Manager (DTM) process. Performs session validations. Creates
threads to initialize the session, read, write, and transform data, and handle pre- and post-
session operations.
Pipeline Partitioning
When running sessions, the PowerCenter Server can achieve high performance by
partitioning the pipeline and performing the extract, transformation, and load for each
partition in parallel. To accomplish this, use the following session and server configuration:
♦ Configure the session with multiple partitions.
♦ Install the PowerCenter Server on a machine with multiple CPUs.
You can configure the partition type at most transformations in the pipeline. The
PowerCenter Server can partition data using round-robin, hash, key-range, database
partitioning, or pass-through partitioning.
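As a sketch of how two of these partition types distribute rows, a minimal Python example
follows. It assumes nothing about PowerCenter internals: round-robin spreads rows evenly
across partitions, while hash partitioning sends rows with the same key to the same
partition.

```python
def round_robin(rows, num_partitions):
    """Round-robin sketch: distribute rows evenly across partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions

def hash_partition(rows, key, num_partitions):
    """Hash sketch: rows with equal key values land in the same
    partition. The hash function here is illustrative only."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions
```

Hash partitioning is the natural choice when downstream transformations, such as an
Aggregator, must see all rows for a given key in one partition.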
For relational sources, the PowerCenter Server creates multiple database connections to a
single source and extracts a separate range of data for each connection. For XML or file
sources, the PowerCenter Server reads multiple files concurrently. The files must have the
same structure or hierarchy.
When the PowerCenter Server transforms the partitions concurrently, it passes data between
the partitions as needed to perform operations such as aggregation. When the PowerCenter
Server loads relational data, it creates multiple database connections to the target and loads
partitions of data concurrently. When the PowerCenter Server loads data to file targets, it
creates a separate file for each partition. You can choose to merge the target files.
Figure 1-2 shows a mapping that contains two partitions:
For more information about pipeline partitioning, see “Pipeline Partitioning” on page 345.
[Figure: the PowerCenter Server communicates with the Repository Server over TCP/IP; the
Repository Agent accesses the PowerCenter repository through a Native/ODL connection.]
Table 1-1 summarizes the software you need to connect the PowerCenter Server to the
platform components, source databases, and target databases:
Load Manager Process
The Load Manager is the primary PowerCenter Server process. It accepts requests from the
PowerCenter Client and from pmcmd. The Load Manager runs and monitors the workflow. It
performs the following tasks:
♦ Manages workflow scheduling.
♦ Locks and reads the workflow.
♦ Reads the parameter file.
♦ Creates the workflow log file.
♦ Runs workflow tasks and evaluates the conditional links connecting tasks.
♦ Starts the DTM, which runs the session.
♦ Writes historical run information to the repository.
♦ Sends post-session email in the event of DTM failure.
When the Load Manager starts the DTM, it writes a message similar to the following to the
workflow log:
INFO : LM_36302 : (2076|2224) Started DTM process [pid = 508] for session
instance [s_BOOKINGS].
For more information on workflow log files, see “Log Files” on page 455.
For more information on session log files, see “Log Files” on page 455.
Thread Types
The master thread creates different types of threads for a session. The types of threads the
master thread creates depend on the following factors:
♦ Pre- and post-session properties
♦ Types of transformations in the mapping
Table 1-2 lists the types of threads that the master thread can create:
♦ Mapping thread. One thread for each session. Fetches session and mapping information,
compiles the mapping, and cleans up after session execution.
♦ Pre- and post-session threads. One thread each to perform pre- and post-session
operations.
♦ Reader thread. One thread for each partition for each source pipeline. Reads from sources.
Relational sources use relational reader threads, and file sources use file reader threads.
♦ Transformation thread. One or more transformation threads for each partition. Processes
data according to the transformation logic in the mapping.
♦ Writer thread. One thread for each partition, if a target exists in the source pipeline.
Writes to targets. Relational targets use relational writer threads, and file targets use file
writer threads.
The mapping in Figure 1-4 contains a single partition. In this case, the master thread creates
one reader, one transformation, and one writer thread to process the data. The reader thread
controls how the PowerCenter Server extracts source data and passes it to the source qualifier,
the transformation thread controls how the PowerCenter Server processes the data, and the
writer thread controls how the PowerCenter Server loads data to the target.
When the pipeline contains only a source definition, source qualifier, and a target definition,
the data bypasses the transformation threads, proceeding directly from the reader buffers to
the writer. This type of pipeline is a pass-through pipeline.
Figure 1-5 shows the threads for a pass-through pipeline with one partition:
Note: The previous examples assume that each session contains a single partition. For
information on how partitions and partition points affect thread creation, see “Threads and
Partitioning” on page 16.
Reader Threads
The master thread creates reader threads to extract source data. The number of reader threads
depends on the partitioning information for each pipeline. The number of reader threads
equals the number of partitions. For more information, see “Threads and Partitioning” on
page 16.
The PowerCenter Server creates an SQL statement for each reader thread to extract data from
a relational source. For file sources, the PowerCenter Server can create multiple threads to
read a single source.
Writer Threads
The master thread creates writer threads to load target data. The number of writer threads
depends on the partitioning information for each pipeline. If the pipeline contains one
partition, the master thread creates one writer thread. If it contains multiple partitions, the
master thread creates multiple writer threads. For more information, see “Threads and
Partitioning” on page 16.
Each writer thread creates connections to the target databases to load data. If the target is a
file, each writer thread creates a separate file. You can configure the session to merge these
files.
If the target is relational, the writer thread takes data from buffers and commits it to session
targets. When loading targets, the writer commits data based on the commit interval in the
session properties. You can configure a session to commit data based on the number of source
rows read, the number of rows written to the target, or the number of rows that pass through
a transformation that generates transactions, such as a Transaction Control transformation.
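The commit-interval behavior can be sketched as a simple counter. This assumes a target-based
commit interval and ignores source-based and transaction-based commits, as well as the final
commit at end of load:

```python
def count_commits(rows_written, commit_interval):
    """Target-based commit sketch: the writer issues a commit each
    time the number of rows written reaches a multiple of the
    commit interval."""
    commits = 0
    for written in range(1, rows_written + 1):
        if written % commit_interval == 0:
            commits += 1  # stand-in for a database COMMIT
    return commits
```

With a commit interval of 10,000 and 25,000 rows written, this sketch issues two interval
commits; the remaining 5,000 rows would be committed at end of load.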
The mapping in Figure 1-6 contains four stages by default. The partition point at the source
qualifier marks the boundary between the first (reader) and second (transformation) stages.
The partition point at the Aggregator transformation marks the boundary between the second
and third (transformation) stages. The partition point at the target instance marks the
boundary between the third (transformation) and the fourth (writer) stages.
If you use PowerCenter, you can add and delete partition points at other transformations. For
information on valid partition points, see “Pipeline Partitioning” on page 345. When you add
a partition point, you increase the number of pipeline stages by one. When you remove a
partition point, you decrease the number of pipeline stages by one.
[Figure: with an added partition point, the partition points divide the pipeline into five
stages, First Stage through Fifth Stage.]
Number of Partitions
The number of threads that process each pipeline stage depends on the number of partitions.
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread.
The number of partitions in any pipeline stage equals the number of threads in that stage. If
you do not specify otherwise, the PowerCenter Server creates one partition in every pipeline
stage. If you purchased the partitioning option, you can configure multiple partitions for a
single pipeline stage.
You can specify the number of partitions at any partition point. The number of partitions
must be consistent across a pipeline. Therefore, if you define two partitions at the source
qualifier, the Workflow Manager sets two partitions at all transformations that are partition
points, and two partitions at the target instances.
For example, suppose you need to use the mapping in Figure 1-6 on page 17 to read data from
three flat files. To do this, you need to specify three partitions at the source qualifier. When
you do this, the Workflow Manager sets three partitions at all other partition points in the
pipeline.
The master thread creates three sets of threads. Figure 1-8 shows thread creation for a
mapping with three partitions:
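The thread arithmetic above is simply one thread per partition in each pipeline stage. A
minimal sketch (the helper name is illustrative, not a PowerCenter API):

```python
def thread_counts(num_stages, num_partitions):
    """One thread per partition in each pipeline stage, so the master
    thread creates num_stages * num_partitions processing threads."""
    return {"threads_per_stage": num_partitions,
            "total_threads": num_stages * num_partitions}
```

For the four-stage mapping with three partitions, this gives 12 threads in total: three
reader threads, six transformation threads across the two transformation stages, and three
writer threads.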
Each source pipeline in Figure 1-9 contains a transformation thread. The Joiner
transformation is not a partition point, so both transformation threads can process data at the
Joiner and Expression transformations. However, only one transformation thread processes a
row at any given time. The target load order group contains one target, so the master thread
creates only one writer thread.
Suppose you add a partition point at the Joiner transformation in Figure 1-9. Figure 1-10
shows the mapping in Figure 1-9 with a partition point at the Joiner transformation:
Blocking Data
You can include multiple input group transformations in a mapping. The PowerCenter Server
passes data to the input groups concurrently. However, sometimes the transformation logic of
a multiple input group transformation requires that the PowerCenter Server block data on
one input group while it waits for a row from a different input group.
Blocking is the suspension of the data flow into an input group of a multiple input group
transformation. When the PowerCenter Server blocks data, it reads data from the source
connected to the input group until it fills the reader and transformation buffers. Once the
PowerCenter Server fills the buffers, it does not read more source rows until the
transformation logic allows the PowerCenter Server to stop blocking the source. When the
PowerCenter Server stops blocking a source, it processes the data in the buffers and continues
to read from the source.
The PowerCenter Server blocks data at one input group when it needs a specific row from a
different input group to perform the transformation logic. Once the PowerCenter Server
reads and processes the row it needs, it stops blocking the source.
Block Processing
The PowerCenter Server reads and processes a block of rows at a time. The number of rows in
the block depends on the row size and the DTM buffer size. In the following circumstances,
the PowerCenter Server processes one row in a block:
♦ Log row errors. When you log row errors, the PowerCenter Server processes one row in a
block.
♦ Connect CURRVAL. When you connect the CURRVAL port in a Sequence Generator
transformation, the session processes one row in a block. For optimal performance,
Informatica recommends that you connect only the NEXTVAL port in mappings. For
more information, see “Sequence Generator Transformation” in the Transformation Guide.
♦ Configure row-based mode for a Custom transformation procedure. When you configure
the data access mode for a Custom transformation procedure to be row-based, the
PowerCenter Server processes one row in a block. By default, the data access mode is array-
based, and the PowerCenter Server processes multiple rows in a block. For more
information, see “Custom Transformation Functions” in the Transformation Guide.
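The sizing rule above can be sketched in Python. This is an illustration only: the function name and the integer-division formula are assumptions, not the PowerCenter Server's actual internal calculation; the `one_row_mode` flag stands in for the three circumstances listed above.

```python
# Illustrative sketch only -- the real block size is computed internally
# by the PowerCenter Server; names and the sizing rule are assumptions.
def rows_per_block(buffer_block_size, row_size, one_row_mode=False):
    """Approximate how many rows fit in one block.

    one_row_mode mirrors the cases above (row error logging, a connected
    CURRVAL port, or a row-based Custom transformation procedure), where
    the server processes a single row per block.
    """
    if one_row_mode:
        return 1
    return max(1, buffer_block_size // row_size)

print(rows_per_block(64 * 1024, 512))        # 128 rows in a 64 KB block
print(rows_per_block(64 * 1024, 512, True))  # 1
```

Larger rows yield fewer rows per block; a row wider than the block still occupies one block.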
CPU Usage
The PowerCenter Server performs read, transformation, and write processing for a pipeline in
parallel. It can process multiple partitions of a pipeline within a session, and it can process
multiple sessions in parallel.
If you have a symmetric multi-processing (SMP) platform, you can use multiple CPUs to
concurrently process session data or partitions of data. This provides increased performance,
as true parallelism is achieved. On a single processor platform, these tasks share the CPU, so
there is no parallelism.
The PowerCenter Server can use multiple CPUs to process a session that contains multiple
partitions. The number of CPUs used depends on factors such as the number of partitions,
the number of threads, the number of available CPUs, and the amount of resources required
to process the mapping.
For more information about partitioning, see “Pipeline Partitioning” on page 345.
Cache Memory
The DTM process creates in-memory index and data caches to temporarily store data used by
the following transformations:
♦ Aggregator transformation (without sorted input)
♦ Rank transformation
♦ Joiner transformation
♦ Lookup transformation (with caching enabled)
You configure memory size for the index and data cache in the transformation properties. By
default, the PowerCenter Server allocates 1,000,000 bytes for the index cache and 2,000,000
bytes for the data cache.
By default, the DTM creates cache files in the directory configured for the $PMCacheDir
server variable. If the DTM requires more space than it allocates, it pages to local index and
data files.
The DTM process also creates an in-memory cache to store data used by a Sorter
transformation. You configure the memory size for the cache in the transformation properties.
By default, the PowerCenter Server allocates 8,388,608 bytes for the cache, and the DTM
creates cache files in the directory configured for the $PMTempDir server variable. If the
DTM requires more cache space than it allocates, it pages to local cache files.
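The default allocations and overflow behavior above can be summarized in a short sketch. The byte values come from this section; the `pages_to_disk` helper is an illustration of the paging rule, not a PowerCenter API.

```python
# Default cache allocations stated in this section.
DEFAULTS = {
    "index_cache_bytes": 1_000_000,   # index cache (Aggregator, Rank, Joiner, Lookup)
    "data_cache_bytes": 2_000_000,    # data cache (Aggregator, Rank, Joiner, Lookup)
    "sorter_cache_bytes": 8_388_608,  # Sorter transformation cache
}

def pages_to_disk(required_bytes, allocated_bytes):
    # The DTM pages to local cache files when it needs more space than
    # the configured allocation; the session does not fail unless the
    # cache directory runs out of disk space.
    return required_bytes > allocated_bytes

print(pages_to_disk(3_000_000, DEFAULTS["data_cache_bytes"]))  # True
```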
When processing large amounts of data, the DTM may create multiple index and data files.
The session does not fail if it runs out of cache memory and pages to the cache files. It does
fail, however, if the local directory for cache files runs out of disk space.
After the session completes, the DTM releases memory used by the index and data caches and
deletes any index and data files. However, if the session is configured to perform incremental
aggregation or if a Lookup transformation is configured for a persistent lookup cache, the
DTM saves all index and data cache information to disk for the next session run.
For more information about caching, see “Session Caches” on page 613.
ASCII Mode
Use ASCII mode when all sources and targets are 7-bit ASCII or EBCDIC character sets. In
ASCII mode, the PowerCenter Server recognizes 7-bit ASCII and EBCDIC characters and
stores each character in a single byte. When the PowerCenter Server runs in ASCII mode, it
does not validate session code pages. It reads all character data as ASCII characters and does
not perform code page conversions. It also treats all numerics as U.S. Standard and all dates as
binary data.
Unicode Mode
Use Unicode mode when sources or targets use 8-bit or multibyte character sets and contain
character data. In Unicode mode, the PowerCenter Server recognizes multibyte character sets
as defined by supported code pages.
If you configure the PowerCenter Server to validate data code pages, the PowerCenter Server
validates source and target code page compatibility when you run a session. If you configure
the PowerCenter Server for relaxed data code page validation, the PowerCenter Server lifts
source and target compatibility restrictions.
When reading a source, the PowerCenter Server converts data from the source character set to
Unicode based on the source code page. The PowerCenter Server allots two bytes for each
character when moving data through a mapping. The PowerCenter Server converts data from
Unicode to the target character set based on the target code page when writing to the target. It
also treats all numerics as U.S. Standard and all dates as binary data.
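The two-bytes-per-character figure can be illustrated with Python's UTF-16 encoding, which also uses two bytes for every character in the Basic Multilingual Plane. This is only an analogy; the section does not state which internal Unicode representation the PowerCenter Server uses.

```python
# Illustration only: two bytes per character, analogous to the internal
# Unicode representation described above (analogy, not PowerCenter code).
for text in ["abc", "données", "データ"]:
    raw = text.encode("utf-16-le")  # little-endian, no byte-order mark
    print(text, len(text), "chars ->", len(raw), "bytes")
    assert len(raw) == 2 * len(text)
```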
The PowerCenter Server code page must be compatible with the code pages of the
PowerCenter Client.
For details on code page compatibility and validation, see “Globalization Overview” in the
Installation and Configuration Guide.
Some message codes are embedded within other codes, for example:
CMN_1050 [LM 2041 Received request to start session]
You can also configure the PowerCenter Server on Windows to write error messages to the
Application Log, which you can view with the Event Viewer. Messages sent from the
PowerCenter Server display PowerCenter in the Source column, the code prefix in the
Category column, and the code number in the Event column. However, since some message
codes are embedded within other codes, to ensure you are viewing the true message code, you
must view the text of the message.
Figure 1-12 shows a sample application log:
Error Messages
Using the listed error code, consult the Troubleshooting Guide for probable causes and actions
to correct the problem.
Session Details
When you run a session, the Workflow Manager creates session details that provide load
statistics for each target in the mapping. You can monitor session details during the session or
after the session completes. Session details include information such as table name, number of
rows written or rejected, and read and write throughput. You can view this information by
double-clicking the session in the Workflow Monitor.
For more information on session details, see “Monitoring Session Details” on page 434.
Reject Files
By default, the PowerCenter Server creates a reject file for each target in the session. The
reject file contains rows of data that the writer does not write to targets.
The writer may reject a row in the following circumstances:
♦ It is flagged for reject by an Update Strategy or Custom transformation.
♦ It violates a database constraint, such as a primary key constraint.
♦ A field in the row was truncated or overflowed, and the target database is configured to
reject truncated or overflowed data.
By default, the PowerCenter Server saves the reject file in the directory entered for the server
variable $PMBadFileDir in the Workflow Manager, and names the reject file
target_table_name.bad.
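The default reject file location described above can be sketched as a path-building helper. The directory and table name in the example are hypothetical; only the $PMBadFileDir default and the target_table_name.bad naming come from this section.

```python
import os

# Sketch of the default reject file location: $PMBadFileDir plus the
# target table name with a .bad extension. Example values are made up.
def reject_file_path(bad_file_dir, target_table_name):
    return os.path.join(bad_file_dir, target_table_name + ".bad")

print(reject_file_path("/opt/pmserver/BadFiles", "T_CUSTOMERS"))
# /opt/pmserver/BadFiles/T_CUSTOMERS.bad
```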
Note: If you enable row error logging, the PowerCenter Server does not create a reject file.
For more information about the reject file, see “Log Files” on page 455.
Email
You can compose and send email messages by creating an Email task in the Workflow
Designer or Task Developer. You can place the Email task in a workflow, or you can associate
it with a session. The Email task allows you to automatically communicate information about
a workflow or session run to designated recipients.
Email tasks in the workflow send email depending on the conditional links connected to the
task. For post-session email, you can create two different messages, one to be sent if the
session completes successfully, the other if the session fails. You can also use variables to
generate information about the session name, status, and total rows loaded.
For example, if your database administrator wants to track how long a session takes to
complete, you can configure the session to send an email containing the time and date the
session starts and completes. Or, if you want to notify your Informatica administrator when a
session fails, you can configure the session to send an email only if it fails and attach the
session log to the email.
For more information, see “Sending Email” on page 319.
Indicator File
If you use a flat file as a target, you can configure the PowerCenter Server to create an
indicator file for target row type information. For each target row, the indicator file contains a
number to indicate whether the row was marked for insert, update, delete, or reject. The
PowerCenter Server names this file target_name.ind and stores it in the same directory as the
target file. For more information about configuring the PowerCenter Server, see the
Installation and Configuration Guide.
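A reader of the target_name.ind file might tally row types as in the sketch below. The numeric codes used here (0 = insert, 1 = update, 2 = delete, 3 = reject) are an assumption for illustration; check the Installation and Configuration Guide for the actual values the PowerCenter Server writes.

```python
# Sketch of summarizing an indicator file (target_name.ind).
# The 0-3 code mapping is an assumption, not taken from this section.
ROW_TYPES = {0: "insert", 1: "update", 2: "delete", 3: "reject"}

def summarize_indicator(lines):
    counts = {name: 0 for name in ROW_TYPES.values()}
    for line in lines:
        counts[ROW_TYPES[int(line.strip())]] += 1
    return counts

print(summarize_indicator(["0", "0", "1", "3"]))
# {'insert': 2, 'update': 1, 'delete': 0, 'reject': 1}
```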
Output File
If the session writes to a target file, the PowerCenter Server creates the target file based on a
file target definition. By default, the PowerCenter Server names the target file based on the
target definition name. If a mapping contains multiple instances of the same target, the
PowerCenter Server names the target files based on the target instance name.
Cache Files
When the PowerCenter Server creates memory cache it also creates cache files. The
PowerCenter Server creates index and data cache files for the following transformations in a
mapping:
♦ Aggregator transformation
♦ Joiner transformation
♦ Rank transformation
♦ Lookup transformation
♦ Sorter transformation
By default, the DTM creates the index and data files for Aggregator, Rank, Joiner, and
Lookup transformations in the directory configured for the $PMCacheDir server variable.
The PowerCenter Server names the index file PM*.idx, and the data file PM*.dat. The
PowerCenter Server creates the index and data files for the Sorter transformation in the
$PMTempDir server variable directory.
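The PM*.idx and PM*.dat naming convention makes cache files easy to identify in the cache directory, for example when checking for leftover files. The file names in this sketch are invented; only the wildcard patterns come from this section.

```python
import fnmatch

# Sketch: separate index and data cache files by the PM*.idx / PM*.dat
# naming convention described above. Example file names are made up.
files = ["PMAGG_0_1.idx", "PMAGG_0_1.dat", "PMLKUP3_5.idx", "session.log"]

index_files = [f for f in files if fnmatch.fnmatch(f, "PM*.idx")]
data_files = [f for f in files if fnmatch.fnmatch(f, "PM*.dat")]

print(index_files)  # ['PMAGG_0_1.idx', 'PMLKUP3_5.idx']
print(data_files)   # ['PMAGG_0_1.dat']
```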
The PowerCenter Server writes to the cache files during the session in the following cases:
♦ The mapping contains one or more Aggregator transformations configured without sorted
ports.
♦ The session is configured for incremental aggregation.
♦ The mapping contains a Lookup transformation that is configured to use a persistent
lookup cache, and the PowerCenter Server runs the session for the first time.
♦ The mapping contains a Lookup transformation that is configured to initialize the
persistent lookup cache.
♦ The DTM runs out of cache memory and pages to the local cache files. The DTM may
create multiple files when processing large amounts of data. The session fails if the local
directory runs out of disk space.
After the session completes, the DTM generally deletes the overflow index and data files. It
does not delete the cache files under the following circumstances:
♦ The session is configured to perform incremental aggregation.
♦ The session is configured with a persistent lookup cache.
Overview
Before you can use the Workflow Manager to create workflows and sessions, you must
configure the Workflow Manager. You can configure display options and connection
information in the Workflow Manager. You must register a PowerCenter Server before you
can start it or create a workflow to run against it.
You can configure the following information in the Workflow Manager:
♦ Configure Workflow Manager options. You can configure options such as grouping
sessions or docking and undocking windows. For details, see “Customizing the Workflow
Manager Options” on page 39.
♦ Register PowerCenter Servers. Before you can start a PowerCenter Server, you must
register it with the repository. For details, see “Registering the PowerCenter Server” on
page 46.
♦ Create a server grid. When you have multiple PowerCenter Servers registered to the same
repository, you can create a server grid to balance workloads. For details, see “Working with
Server Grids” on page 446.
♦ Create source and target database connections. Create connections to each source and
target database. You must create connections to a database before you can create a session
that accesses the database. For details, see “Setting Up a Relational Database Connection”
on page 53.
♦ Create connection objects. Create connection objects in the repository when you define
database, FTP, and external loader connections. For details, see “Configuring Connection
Object Permissions” on page 51.
Table 2-1 describes general options you can configure in the Workflow Manager:
♦ Reload Tasks/Workflows When Opening a Folder. Reloads the last view of a tool when you
open it. For example, if you have a workflow open when you disconnect from a repository,
select this option so that the same workflow displays the next time you open the folder and
Workflow Designer. Enabled by default.
♦ Ask Whether to Reload the Tasks/Workflows. Appears only when you select Reload Tasks/
Workflows When Opening a Folder. Select this option if you want the Workflow Manager
to prompt you to reload tasks, workflows, and worklets each time you open a folder.
Disabled by default.
♦ Overview Window Pans Delay. By default, when you drag the focus of the Overview
window, the focus of the workbook moves concurrently. When you select this option, the
focus of the workspace does not change until you release the mouse button. Disabled by
default.
♦ Allow Invoking In-Place Editing Using the Mouse. By default, you can press F2 to edit
objects directly in the workspace instead of opening the Edit Task dialog box. Select this
option so you can also click the object name in the workspace to edit the object. Disabled
by default.
♦ Open Editor When Task Is Created. Opens the Edit Task dialog box when you create a
task. By default, the Workflow Manager creates the task in the workspace. If you do not
enable this option, double-click the task to open the Edit Task dialog box. Disabled by
default.
♦ Workspace File Directory. The directory for workspace files created by the Workflow
Manager. Workspace files maintain the last task or workflow you saved. This directory
should be local to the PowerCenter Client to prevent file corruption or overwrites by
multiple users. By default, the Workflow Manager creates files in the PowerCenter Client
installation directory.
♦ Display Tool Names On Views. Displays the name of the tool in the upper left corner of
the workspace or workbook. Enabled by default.
♦ Always Show the Full Name of Selected Task. Shows the full name of a task when you
select it. By default, the Workflow Manager abbreviates the task name in the workspace.
Enabled by default.
♦ Show the Expression On a Link. Shows the link condition in the workspace. If you do not
enable this option, the Workflow Manager abbreviates the link condition in the workspace.
Enabled by default.
♦ Launch Workflow Monitor when Workflow Is Started. The Workflow Monitor launches
when you start a workflow or a task. Enabled by default.
♦ Receive Notifications from Server. Allows you to receive notification messages from the
Repository Server. The Repository Server sends notification about actions performed on
repository objects. Enabled by default. For details, see “Understanding the Repository” in
the Repository Guide.
Table 2-2 describes the format options for the Workflow Manager:
♦ Show Solid Lines for Links. Displays links as solid lines. By default, the Workflow
Manager displays links as dotted lines.
♦ Workspace Colors. Displays all items that you can customize in the selected tool. Select an
item to change its color.
♦ Font Categories. Select the Workflow Manager tool for which you want to customize the
display font.
♦ Change Font. Select to change the display font and language script for the Workflow
Manager tool you choose from the Categories menu.
♦ Reset All. Resets all format options to their original default values.
Figure 2-3. Copy Wizard, Versioning, and Target Load Type Options
Table 2-3 describes the options for the Copy Wizard, Versioning, and Target Load Type:
♦ Generate Unique Name When Resolved to “Rename”. Generates unique names for copied
objects if you select the Rename option. For example, if the workflow wf_Sales has the
same name as a workflow in the destination folder, the Rename option generates the
unique name wf_Sales1. Enabled by default.
♦ Get Default Object When Resolved to “Choose”. Uses the object with the same name in
the destination folder if you select the Choose option.
♦ Show Check Out Image in Navigator. Displays the Check Out icon when an object has
been checked out. Enabled by default.
♦ Reset All. Resets all Copy Wizard and Versioning options to their default values.
♦ Target Load Type. Sets the default load type for sessions. You can choose normal or bulk
loading. Any change you make takes effect after you restart the Workflow Manager. You
can override this setting in the session properties. Default is Bulk. For more information
on normal and bulk loading, see Table A-15 on page 697.
♦ Owner. Read, Write, and Execute permissions.
♦ World. No permissions.
If you do not enable enhanced security, the Workflow Manager assigns Read, Write, and
Execute permissions to all users or groups for the connection.
Enabling enhanced security does not lock the restricted access settings for connection objects.
You can continue to change the permissions for connection objects after enabling enhanced
security.
If you delete the Owner from the repository, the Workflow Manager automatically assigns
ownership of the object to Administrator.
1. Choose Tools-Options.
2. Click the Advanced tab.
3. Select Enable Enhanced Security.
4. Click OK.
Server Variables
You can define server variables for each PowerCenter Server you register. Some server variables
define the path and directories for workflow output files and caches. By default, the
PowerCenter Server places output files in these directories when you run a workflow. Other
server variables define server attributes such as log file count. In a server grid, you must use
the same server variables for each server.
The installation process creates directories in the location where you install the PowerCenter
Server. To use these directories as the default location for the session output files, you must
first set the server variable $PMRootDir to define the path to the directories.
Table 2-5 describes the server variables:
♦ $PMRootDir (Required). A root directory to be used by any or all other server variables.
Informatica recommends you use the PowerCenter Server installation directory as the root
directory.
♦ $PMCacheDir (Required). Default directory for the index and data cache files. Defaults to
$PMRootDir/Cache. To avoid performance problems, always use a drive local to the
PowerCenter Server for the cache directory. Do not use a mapped or mounted drive for
cache files.
♦ $PMSuccessEmailUser (Optional). Email address to receive post-session email when the
session completes successfully. Use to address post-session email. The default value is an
empty string. For details, see “Sending Email” on page 319.
♦ $PMFailureEmailUser (Optional). Email address to receive post-session email when the
session fails. Use to address post-session email. The default value is an empty string.
♦ $PMSessionLogCount (Optional). Number of session logs the PowerCenter Server
archives for the session. Use to archive session logs. For details, see “Viewing Session Logs”
on page 474. Defaults to 0.
♦ $PMSessionErrorThreshold (Optional). Number of non-fatal errors the PowerCenter
Server allows before failing the session. Non-fatal errors include reader, writer, and DTM
errors. If you want to stop the session on errors, enter the number of non-fatal errors you
want to allow before stopping the session. The PowerCenter Server maintains an
independent error count for each source, target, and transformation. Use to configure the
Stop On option in the session properties. Defaults to 0. If you use the default setting,
non-fatal errors do not cause the session to stop.
♦ $PMWorkflowLogCount (Optional). Number of workflow logs the PowerCenter Server
archives for the workflow. Defaults to 0.
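The directory-valued server variables hang off $PMRootDir. Only the $PMCacheDir default ($PMRootDir/Cache) is stated in this section; the installation path in the sketch below is hypothetical.

```python
import posixpath

# Sketch of deriving the default cache directory from $PMRootDir.
# The Cache subdirectory name comes from this section; the root path
# used in the example is made up.
def default_cache_dir(pm_root_dir):
    return posixpath.join(pm_root_dir, "Cache")

print(default_cache_dir("/opt/informatica/pmserver"))
# /opt/informatica/pmserver/Cache
```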
The TCP/IP options for the PowerCenter Server include the following:
♦ Server Name (Required). The name of the PowerCenter Server. This name must be unique
to the repository.
♦ Host Name or IP Address (Required). Host name or IP address of the PowerCenter Server
machine.
♦ Resolved IP Address (Read-only). The IP address resolved by the Workflow Manager.
♦ Port Number (Required). Port number the PowerCenter Server uses. Must be the same
port listed in the PowerCenter Server configuration parameters.
♦ Timeout (Required). Number of seconds the Workflow Manager waits for a response from
the PowerCenter Server.
♦ Code Page (Required). Character set associated with the PowerCenter Server. Select the
code page identical to the PowerCenter Server operating system code page. Must be
identical to or compatible with the repository code page.
7. For $PMRootDir, enter a valid root directory for the PowerCenter Server platform.
Informatica recommends using the PowerCenter Server installation directory as the root
directory because the PowerCenter Server installation creates the default server directories
there. If you enter a different root directory, make sure to create the necessary directories.
8. Enter the server variables, as desired.
Do not use trailing delimiters. A trailing delimiter might invalidate the directory used by
the PowerCenter Server. For example, enter c:\data\sessionlog, not c:\data\sessionlog\.
See Table 2-5 on page 47 for a list of server variables.
9. Click OK.
The new PowerCenter Server appears in the Navigator below the repository.
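The trailing-delimiter caution in step 8 can be checked with a short sketch. The helper name is an assumption; the example paths mirror the ones given in the step.

```python
# Sketch: strip a trailing path separator before entering a directory
# value, per the caution in step 8. Helper name is illustrative.
def strip_trailing_delimiter(path):
    cleaned = path.rstrip("\\/")
    return cleaned if cleaned else path  # keep a bare root like "/"

print(strip_trailing_delimiter("c:\\data\\sessionlog\\"))  # c:\data\sessionlog
print(strip_trailing_delimiter("/opt/pm/logs/"))           # /opt/pm/logs
```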
To configure permissions for a connection object:
1. Open the Connection Browser dialog box for the connection object. For example, choose
Connections-Relational to open the Connection Browser dialog box for a relational
database connection.
2. Select the connection object you want to configure in the Connection Browser dialog
box.
3. Click Permissions to open the Permissions dialog box.
4. Add the user or group you want to assign permissions for the connection, and click OK.
5. For relational database connections, enter the connection information listed in Table 2-9:
♦ User Name (Required). Database user name with the appropriate read and write database
permissions to access the database. If you are using Oracle OS Authentication, or you are
using databases such as ISG Navigator that do not allow user names, enter PmNullUser.
For Teradata connections, this overrides the default database user name in the ODBC
entry.
♦ Password (Required). Password for the database user name. For Oracle OS
Authentication, or for databases such as ISG Navigator that do not allow passwords, enter
PmNullPassword. For Teradata connections, this overrides the database password in the
ODBC entry. Passwords must be in 7-bit ASCII only.
♦ Connect String (Required for all databases except Microsoft SQL Server and Sybase).
Connect string used to communicate with the database. For syntax, see “Database
Connect Strings” on page 53.
♦ Code Page (Required). Specifies the code page the PowerCenter Server uses to read from a
source database or write to a target database or file.
6. For each type of relational database connection, enter the attributes listed in Table 2-10:
♦ Rollback Segment (Oracle). The name of the rollback segment. A rollback segment
records database transactions in the event that you want to undo the transaction.
♦ Enable Parallel Mode (Oracle). Enables parallel processing when loading data into a table
in bulk mode.
♦ Environment SQL (all relational databases). Enter SQL commands to set the database
environment when you connect to the database.
♦ Database Name (Sybase, Microsoft SQL Server, and Teradata). The name of the database.
For Teradata connections, this overrides the default database name in the ODBC entry. If
you do not enter a database name here for a Teradata connection, the PowerCenter Server
uses the default database name in the ODBC entry.
♦ Data Source Name (Teradata). The name of the Teradata ODBC data source.
♦ Server Name (Sybase and Microsoft SQL Server). Database server name. Used to
configure workflows.
♦ Packet Size (Sybase and Microsoft SQL Server). Used to optimize the ODBC connection
to Sybase and Microsoft SQL Server.
♦ Domain Name (Microsoft SQL Server). The name of the domain. Used for Microsoft
SQL Server on Windows.
♦ Use Trusted Connection (Microsoft SQL Server). If selected, the PowerCenter Server uses
Windows authentication to access the Microsoft SQL Server database. The user name that
starts the PowerCenter Server must be a valid Windows user with access to the Microsoft
SQL Server database.
7. Click OK.
The new database connection appears in the Connection Browser list.
8. To add more database connections, repeat steps 3-7.
1. Choose Connections-Relational.
The Relational Connection Browser appears.
Overview
In the Workflow Manager, you define a set of instructions called a workflow to execute
mappings you build in the Designer. Generally, a workflow contains a session and any other
task you may want to perform when you execute a session. Tasks can include a session, email
notification, or scheduling information. You connect each task with links in the workflow.
You can also create a worklet in the Workflow Manager. A worklet is an object that groups a
set of tasks. A worklet is similar to a workflow, but without scheduling information. You can
execute a batch of worklets inside a workflow.
After you create a workflow, you run the workflow in the Workflow Manager and monitor it
in the Workflow Monitor. For details on the Workflow Monitor, see “Monitoring Workflows”
on page 401.
Workflow Tasks
You can create the following types of tasks in the Workflow Manager:
♦ Assignment. Assigns a value to a workflow variable. For details, see “Working with the
Assignment Task” on page 140.
♦ Command. Specifies a shell command to run during the workflow. For details, see “Using
Workflow Variables” on page 103.
Figure 3-2 shows the Workflow Manager windows:
[Figure 3-2: the Workflow Manager workspace, with the Overview window, the Output window, and the status bar labeled]
Using Toolbars
The Workflow Manager can display the following toolbars to help you select tools and
perform operations quickly:
♦ Standard. Contains buttons to connect to and disconnect from repositories and folders,
toggle windows, zoom in and out, pan the workspace, and find objects.
♦ Connections. Contains buttons to open connection browsers and to assign servers.
♦ Repository. Contains buttons to connect to, disconnect from, and add repositories, open
folders, close tools, save changes to repositories, and print the workspace.
♦ View. Contains buttons to customize toolbars, toggle the status bar and windows, toggle
full-screen view, create a new workbook, and view the properties of objects.
♦ Layout. Contains buttons to arrange and restore objects in the workspace, find objects,
zoom in and out, and pan the workspace.
♦ Tasks. Contains buttons to create tasks.
♦ Workflow. Contains buttons to edit workflow properties.
♦ Run. Contains buttons to schedule the workflow, start the workflow, or start a task.
1. In any Workflow Manager tool, click the Find in Workspace toolbar button or choose
Edit-Find in Workspace.
The Find in Workspace dialog box opens:
2. Choose whether you want to search for tasks, links, variables, or events.
3. Enter a search string, or select a string from the list.
The Workflow Manager saves the last 10 search strings in the list.
4. Specify whether or not to match whole words and whether or not to perform a case-
sensitive search.
5. Click Find Now.
The Workflow Manager lists task names, link conditions, event names, or variable names
that match the search string at the bottom of the dialog box.
6. Click Close.
1. To search for a task, link, event, or variable, open the appropriate Workflow Manager
tool and click a task, link, or event. To search for text in the Output window, click the
appropriate tab in the Output window.
2. Enter a search string in the Find field on the standard toolbar.
The search is not case-sensitive.
3. Choose Edit-Find Next, click the Find Next button on the toolbar, or press Enter or F3
to search for the string.
The Workflow Manager highlights the first task name, link condition, event name, or
variable name that contains the search string, or the first string in the Output window
that matches the search string.
4. To search for the next item, press Enter or F3 again.
The Workflow Manager alerts you when you have searched through all items in the
workspace or Output window before it highlights the same objects a second time.
Checking In Objects
You commit changes to the repository by checking in objects. When you check in an object,
the repository creates a new version of the object and assigns it a version number. The
repository increments the version number by one each time it creates a new version.
You can check in an object from the Workflow Manager workspace. To do this, select the
object and choose Versioning-Check in.
You can check in an object when you review the results of the following tasks:
♦ View object history. You can check in an object from the View History window when you
view the history of an object.
♦ View checkouts. You can check in an object from the View Checkouts window when you
search for checked out objects.
♦ View query results. You can check in an object from the Query Results window when you
search for object dependencies or run an object query.
To check in an object, select the object or objects and choose Versioning-Check in.
Enter text into the comment field in the Check In dialog box.
From the Query Browser, you can create, edit, and delete queries. You can also configure
permissions for each query from the Query Browser. You can run any queries for which you
have read permissions from the Query Browser.
For information about working with object queries, see “Grouping Versioned Objects” in the
Repository Guide.
Copying Sessions
When you copy a Session task, the Copy Wizard looks for the database connection and
associated mapping in the destination folder. If the mapping or connection does not exist in
the destination folder, you can select a new mapping or connection. If the destination folder
does not contain any mapping, you must first copy a mapping to the destination folder in the
Designer before you can copy the session.
When you copy a session that has mapping variable values saved in the repository, the
Workflow Manager either copies or retains the saved variable values.
1. Open the folders that contain the objects you want to compare.
2. Open the appropriate Workflow Manager tool.
3. Choose Tasks-Compare, Worklets-Compare, or Workflow-Compare.
A dialog box similar to the following one opens:
In the dialog box, differences between objects are highlighted and the nodes are flagged, differences between object properties are marked, and the properties of the node you select are displayed. You can drill down to further compare objects.
You can further compare differences between object properties by clicking the Compare
Further icon or by right-clicking the differences.
6. If you want to save the comparison as a text or HTML file, choose File-Save to File.
User-Defined Metadata Extensions
This tab lists the existing user-defined and vendor-defined metadata extensions. User-defined metadata extensions appear in the User Defined Metadata Domain. If they exist, vendor-defined metadata extensions appear in their own domains.
5. Click the Add button.
A new row appears in the User Defined Metadata Extension Domain.
6. Enter the information in Table 3-1:
Extension Name (Required). Name of the metadata extension. Metadata extension names must be unique for each type of object in a domain. Metadata extension names cannot contain any special characters except underscores and cannot begin with numbers.
Precision (Required for string objects). The maximum length for string metadata extensions.
UnOverride (Optional). Restores the default value of the metadata extension when you click Revert. This column appears only if the value of one of the metadata extensions was changed.
7. Click OK.
Edit the text of a cell: F2, then move the cursor to the desired location.
Find all combination and list boxes: type the first letter on the list.
Paste copied or cut text from the clipboard into a cell: Ctrl+V.
Table 3-3 lists the Workflow Manager keyboard shortcuts for navigating in the workspace:
Create links: Ctrl+F2. Press Ctrl+F2 to select the first task you want to link, press Tab to select the rest of the tasks you want to link, then press Ctrl+F2 again to link all the tasks you selected.
Expand the selected node and all its children: Shift+* (use the asterisk on the numeric keypad).
Overview
A workflow is a set of instructions that tells the PowerCenter Server how to execute tasks such
as sessions, email notifications, and shell commands. After you create tasks in the Task
Developer and Workflow Designer, you connect the tasks with links to create a workflow.
In the Workflow Designer, you can specify conditional links and use workflow variables to
create branches in the workflow. The Workflow Manager also provides Event-Wait and Event-
Raise tasks so you can control the sequence of task execution in the workflow. You can also
create worklets and nest them inside the workflow.
Every workflow contains a Start task, which represents the beginning of the workflow.
Figure 4-1 shows a sample workflow:
After you create a workflow, select a PowerCenter Server to run the workflow. You can then
start the workflow using the Workflow Manager, Workflow Monitor, or pmcmd.
Use the Workflow Monitor to see the progress of a workflow during its run. The Workflow
Monitor can also show the history of a workflow. For more information about the Workflow
Monitor, see “Monitoring Workflows” on page 401.
Use the following guidelines when you develop a workflow:
1. Create a new workflow. Create a new workflow in the Workflow Designer. For details on
creating a new workflow, see “Creating a New Workflow” on page 91.
2. Add tasks in the workflow. You might have already created tasks in the Task Developer.
Or, you can add tasks to the workflow as you develop the workflow in the Workflow
Designer. For details on workflow tasks, see “Working with Tasks” on page 131.
3. Connect tasks with links. After you add tasks in the workflow, connect them with links
to specify the order of execution in the workflow. For details on links, see “Working with
Links” on page 92.
4. Specify conditions for each link. You can specify conditions on the links to create
branches and dependencies. For details, see “Working with Links” on page 92.
5. Validate workflow. Validate the workflow in the Workflow Designer to identify errors.
For details on validation rules, see “Validating a Workflow” on page 119.
6. Save workflow. When you save the workflow, the Workflow Manager validates the
workflow and updates the repository.
7. Run workflow. In the workflow properties, select a PowerCenter Server to run the
workflow. Run the workflow from the Workflow Manager, Workflow Monitor, or
pmcmd. You can monitor the workflow in the Workflow Monitor. For details on starting
a workflow, see “Running the Workflow” on page 122.
For a complete list of workflow properties, see “Workflow Properties Reference” on page 721.
Workflow Privileges
You need one of the following privileges to create a workflow:
♦ Use Workflow Manager privilege with read and write folder permissions
♦ Super User privilege
You need one of the following privileges to run, schedule, and monitor the workflow:
♦ Workflow Operator privilege
♦ Super User privilege
For information on using the Workflow Wizard, see “Using the Workflow Wizard” on
page 99.
To create a workflow automatically:
The Workflow Manager does not allow you to create a workflow that contains a loop. Figure 4-4 shows a loop where the three sessions may run multiple times:
Use the following procedure to link tasks in the Workflow Designer or the Worklet Designer.
2. In the workspace, click the first task you want to connect and drag it to the second task.
A link appears between the two tasks.
If you have a number of tasks to link, you do not need to connect each link manually. To link multiple tasks concurrently, use the following procedure.
Note: Do not use Ctrl+A or Edit-Select to choose tasks.
After you specify the link condition in the Expression Editor, the Workflow Manager validates
the link condition and displays it next to the link in the workflow.
Figure 4-6 shows the link condition displayed in the workspace:
1. In the Workflow Designer workspace, double-click the link you want to specify.
or
Right-click the link and choose Edit. The Expression Editor displays.
2. In the Expression Editor, enter the link condition.
The Expression Editor provides pre-defined workflow variables, user-defined workflow
variables, variable functions, and boolean and arithmetic operators.
3. Validate the expression using the Validate button. The Workflow Manager displays error
messages in the Output window.
Tip: Click and drag the end point of a link to move it from one task to another without losing
the link condition.
The Expression Editor displays system variables, user-defined, and pre-defined workflow
variables such as $Session.status. For details on workflow variables, see “Using Workflow
Variables” on page 103.
The Expression Editor also displays a list of functions. PowerCenter uses a SQL-like language
that contains many functions designed to handle common expressions. For example, you can
use the ABS function to find the absolute value. For a complete list of functions, see the
Transformation Language Reference.
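For instance, a link condition might combine a function with pre-defined task variables; the session name s_LoadOrders below is illustrative:
ABS($s_LoadOrders.SrcSuccessRows - $s_LoadOrders.TgtSuccessRows) = 0
This condition evaluates to true only when the session wrote as many rows to the targets as it read from the sources.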
Validating Expressions
You can use the Validate button to validate an expression. If you do not validate an
expression, the Workflow Manager validates it when you close the Expression Editor. You
cannot run a workflow with invalid expressions.
Expressions in link conditions and Decision task conditions must evaluate to a numerical
value. Workflow variables used in expressions must exist in the workflow.
Deleting a Workflow
You may decide to delete a workflow that you no longer use. When you delete a workflow,
you delete all non-reusable tasks and reusable task instances associated with the workflow.
Reusable tasks used in the workflow remain in the folder when you delete the workflow.
If you delete a workflow that is running, the PowerCenter Server aborts the workflow. If you
delete a workflow that is scheduled to run, the PowerCenter Server removes the workflow
from the schedule.
You can delete a workflow in the Navigator window, or you can delete the workflow currently
displayed in the Workflow Designer workspace.
♦ To delete a workflow from the Navigator window, open the folder, select the workflow and
press the Delete key.
♦ To delete a workflow currently displayed in the Workflow Designer workspace, choose
Workflows-Delete.
Editing a Workflow
When you edit a workflow, the repository updates the workflow information when you save
the workflow. If a workflow is running when you make edits, the PowerCenter Server uses the
updated information the next time you run the workflow.
1. In the Worklet Designer or Workflow Designer, right-click a task and choose Highlight
Path.
2. Choose Forward Path, Backward Path, or Both.
The Workflow Manager highlights all links in the branch you select.
1. In the Worklet Designer or Workflow Designer, select all links you want to delete.
Tip: You can use the mouse to click and drag the selection, or you can Ctrl-click the tasks
and links.
2. Choose Edit-Delete Links.
The Workflow Manager removes all selected links.
1. In the Workflow Manager, open the folder containing the mapping you want to use in
the workflow.
2. Open the Workflow Designer.
3. Choose Workflows-Wizard.
To create a session:
1. In the second step of the Workflow Wizard, select a valid mapping and click the right
arrow button.
The Workflow Wizard creates a Session task in the right pane using the selected mapping
and names it s_MappingName by default.
2. You can select additional mappings to create more Session tasks in the workflow.
When you add multiple mappings to the list, the Workflow Wizard creates sequential
sessions in the order you add them.
3. Use the arrow buttons to change the session order.
4. Specify whether the session should be reusable.
When you create a reusable session, you can use the session in other workflows. For
details on reusable sessions, see “Working with Tasks” on page 131.
5. Specify how you want the PowerCenter Server to run the workflow.
You can specify that the PowerCenter Server runs sessions only if previous sessions
complete, or you can specify that the PowerCenter Server always runs each session. When
you select this option, it applies to all sessions you create using the Workflow Wizard.
1. In the third step of the Workflow Wizard, configure the scheduling and run options. For
more information about scheduling a workflow, see “Scheduling a Workflow” on
page 112.
2. Click Next.
The Workflow Wizard displays the settings for the workflow:
3. Verify the workflow settings and click Finish. To edit settings, click Back.
The completed workflow opens in the Workflow Designer workspace. From the
workspace, you can add tasks, create concurrent sessions, add conditions to links, or
modify properties.
4. When you finish modifying the workflow, choose Repository-Save.
When you build an expression, you can select pre-defined variables on the Pre-Defined tab.
You can select user-defined variables on the User-Defined tab. The Functions tab contains
functions that you can use with workflow variables.
Use the point-and-click method to enter an expression using a variable. For information on
using the Expression Editor, see “Using the Expression Editor” on page 96.
You can use the following keywords to write expressions for user-defined and pre-defined
workflow variables:
♦ AND
♦ OR
♦ NOT
♦ TRUE
♦ FALSE
♦ NULL
♦ SYSDATE
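For example, these keywords can appear in a link condition alongside workflow variables; the session name and the user-defined variable below are hypothetical:
$s_LoadOrders.Status = SUCCEEDED AND NOT ($$Reload = TRUE)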
EndTime (All tasks; Date/time). Date and time the associated task ended.
ErrorCode (All tasks; Integer). Last error code for the associated task. If there is no error, the PowerCenter Server sets ErrorCode to 0 when the task completes.
ErrorMsg (All tasks; Nstring*). Last error message for the associated task. If there is no error, the PowerCenter Server sets ErrorMsg to an empty string when the task completes.
FirstErrorCode (Session; Integer). Error code for the first error message in the session. If there is no error, the PowerCenter Server sets FirstErrorCode to 0 when the session completes.
PrevTaskStatus (All tasks; Integer). Status of the previous task in the workflow that the PowerCenter Server ran. Statuses include ABORTED, FAILED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the previous task. For more information, see “Evaluating Task Status in a Workflow” on page 107.
SrcFailedRows (Session; Integer). Total number of rows the PowerCenter Server failed to read from the source.
SrcSuccessRows (Session; Integer). Total number of rows successfully read from the sources.
StartTime (All tasks; Date/time). Date and time the associated task started.
Status (All tasks; Integer). Status of the previous task in the workflow. Task statuses include ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the current task. For more information, see “Evaluating Task Status in a Workflow” on page 107.
TgtFailedRows (Session; Integer). Total number of rows the PowerCenter Server failed to write to the target.
TgtSuccessRows (Session; Integer). Total number of rows successfully written to the targets.
All pre-defined workflow variables except Status have a default value of null. The
PowerCenter Server uses the default value of null when it encounters a pre-defined variable
from a task that has not yet run in the workflow. Therefore, expressions and link conditions
that depend upon tasks not yet run are valid. The default value of Status is NOTSTARTED.
The Expression Editor displays the pre-defined workflow variables on the Pre-defined tab.
The Workflow Manager groups task-specific variables by task and lists system variables under
the Built-in node. To use a variable in an expression, double-click the variable. The
Expression Editor displays task-specific variables in the Expression field in the following
format:
$<TaskName>.<Pre-definedVariable>
Link condition:
$Session2.Status = SUCCEEDED
When you run the workflow, the PowerCenter Server evaluates the link condition and returns
the value based on the status of Session2.
Disabled Task
Link condition:
$Session2.PrevTaskStatus = SUCCEEDED
When you run the workflow, the PowerCenter Server skips Session2 because the session is
disabled. When the PowerCenter Server evaluates the link condition, it returns the value
based on the status of Session1.
Tip: If you do not disable Session2, the PowerCenter Server returns the value based on the
status of Session2. You do not need to change the link condition when you enable and disable
Session2.
Default values by datatype: Double, 0; Integer, 0.
To schedule a workflow:
6. Click OK.
To remove a workflow from its schedule, right-click the workflow in the Navigator window
and choose Unschedule Workflow.
Schedule Options: Run Once/Run Every/Customized Repeat (Optional). Required if you select Run On Server Initialization, or if you do not choose any setting in Run Options. If you select Run Once, the PowerCenter Server runs the workflow once, as scheduled in the scheduler. If you select Run Every, the PowerCenter Server runs the workflow at regular intervals, as configured. If you select Customized Repeat, the PowerCenter Server runs the workflow on the dates and times specified in the Repeat dialog box. When you select Customized Repeat, click Edit to open the Repeat dialog box. The Repeat dialog box allows you to schedule specific dates and times for the workflow run. The selected scheduler appears at the bottom of the page.
Start Options: Start Date/Start Time (Optional). Start Date indicates the date on which the PowerCenter Server begins the workflow schedule. Start Time indicates the time at which the PowerCenter Server begins the workflow schedule.
End Options: End On/End After/Forever (Required/Optional). Required if the workflow schedule is Run Every or Customized Repeat. If you select End On, the PowerCenter Server stops scheduling the workflow on the selected date. If you select End After, the PowerCenter Server stops scheduling the workflow after the set number of workflow runs. If you select Forever, the PowerCenter Server schedules the workflow as long as the workflow does not fail.
Repeat Every (Required). Enter the numeric interval at which you would like the PowerCenter Server to schedule the workflow, and then select Days, Weeks, or Months, as appropriate. If you select Days, select the appropriate Daily Frequency settings. If you select Weeks, select the appropriate Weekly and Daily Frequency settings. If you select Months, select the appropriate Monthly and Daily Frequency settings.
Weekly (Required/Optional). Required to enter a weekly schedule. Select the day or days of the week on which you would like the PowerCenter Server to run the workflow.
Daily (Optional). Enter the number of times you would like the PowerCenter Server to run the workflow on any day the session is scheduled. If you select Run Once, the PowerCenter Server schedules the workflow once on the selected day, at the time entered on the Start Time setting on the Time tab. If you select Run Every, enter Hours and Minutes to define the interval at which the PowerCenter Server runs the workflow. The PowerCenter Server then schedules the workflow at regular intervals on the selected day. The PowerCenter Server uses the Start Time setting for the first scheduled workflow of the day.
Disabling Workflows
You may want to disable the workflow while you edit it. This prevents the PowerCenter Server
from running the workflow on its schedule. Select the Disable Workflows option on the
General tab of the workflow properties. The PowerCenter Server does not run disabled
workflows until you clear the Disable Workflows option. Once you clear the Disable
Workflows option, the PowerCenter Server reschedules the workflow.
Expression Validation
The Workflow Manager validates all expressions in the workflow. You can enter expressions in
the Assignment task, Decision task, and link conditions. The Workflow Manager writes any
error message to the Output window.
Expressions in link conditions and Decision task conditions must evaluate to a numerical
value. Workflow variables used in expressions must exist in the workflow.
The Workflow Manager marks the workflow invalid if a link condition is invalid.
Task Validation
The Workflow Manager validates each task in the workflow as you create it. When you save or
validate the workflow, the Workflow Manager validates all tasks in the workflow except
Session tasks. It marks the workflow invalid if it detects any invalid task in the workflow.
The Workflow Manager verifies that attributes in the tasks follow validation rules. For
example, the user-defined event you specify in an Event task must exist in the workflow. The
Workflow Manager also verifies that you linked each task properly. For example, you must
link the Start task to at least one task in the workflow. For details on task validation rules, see
“Validating Tasks” on page 139.
When you delete a reusable task, the Workflow Manager removes the instance of the deleted
task from workflows. The Workflow Manager also marks the workflow invalid when you
delete a reusable task used in a workflow.
The Workflow Manager verifies that there are no duplicate task names in a folder, and that
there are no duplicate task instances in the workflow.
Running Validation
When you validate a workflow, you validate worklet instances, worklet objects, and all other
nested worklets in the workflow. You validate task instances and worklets, regardless of
whether you have edited them.
The Workflow Manager validates the worklet object using the same validation rules for
workflows. The Workflow Manager validates the worklet instance by verifying attributes in
the Parameter tab of the worklet instance. For details on validating worklets, see “Validating
Worklets” on page 171.
If the workflow contains nested worklets, you can select a worklet to validate the worklet and
all other worklets nested under it. To validate a worklet and its nested worklets, right-click the
worklet and choose Validate.
Example
For example, you have a workflow that contains a non-reusable worklet called Worklet_1.
Worklet_1 contains a nested worklet called Worklet_a. The workflow also contains a reusable
worklet instance called Worklet_2. Worklet_2 contains a nested worklet called Worklet_b.
In the example workflow in Figure 4-15, the Workflow Manager validates links, conditions,
and tasks in the workflow. The Workflow Manager validates all tasks in the workflow,
including tasks in Worklet_1, Worklet_2, Worklet_a, and Worklet_b.
You can validate a part of the workflow. Right-click Worklet_1 and choose Validate. The
Workflow Manager validates all tasks in Worklet_1 and Worklet_a.
Figure 4-15 shows the example workflow:
3. From the Choose Server list, select the server you want to assign.
4. From the Show Folder list, select the folder you want to view. Or, choose All to view
workflows in all folders in the repository.
5. Select the Select check box for each workflow you want to run on the PowerCenter
Server.
6. Click Assign.
Running a Workflow
When you choose Workflows-Start, the PowerCenter Server runs the entire workflow.
To run a workflow from pmcmd, use the startworkflow command. For details on using
pmcmd, see “Using pmcmd” on page 581.
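As a sketch, a startworkflow invocation from the command line might look like the following; the connection options and all names (server host and port, user, password, folder, and workflow) are placeholders, so verify the exact syntax against the pmcmd reference:
pmcmd startworkflow -s salesserver:4001 -u Administrator -p mypassword -f SalesFolder wf_LoadOrders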
To suspend a workflow:
4. Click OK.
Overview
The Workflow Manager contains many types of tasks to help you build workflows and
worklets. You can create reusable tasks in the Task Developer. Or, create and add tasks in the
Workflow or Worklet Designer as you develop the workflow.
Table 5-1 summarizes workflow tasks available in Workflow Manager:
Assignment (Workflow Designer, Worklet Designer; not reusable). Assigns a value to a workflow variable. For details, see “Working with the Assignment Task” on page 140.
Command (Task Developer, Workflow Designer, Worklet Designer; reusable). Specifies shell commands to run during the workflow. You can choose to run the Command task only if the previous task in the workflow completes. For details, see “Working with the Command Task” on page 143.
Control (Workflow Designer, Worklet Designer; not reusable). Stops or aborts the workflow. For details, see “Working with the Control Task” on page 147.
Email (Task Developer, Workflow Designer, Worklet Designer; reusable). Sends email during the workflow. For details, see “Sending Email” on page 319.
Session (Task Developer, Workflow Designer, Worklet Designer; reusable). Set of instructions to run a mapping. For details, see “Working with Sessions” on page 173.
Timer (Workflow Designer, Worklet Designer; not reusable). Waits for a specified period of time to run the next task. For details, see “Working with Event Tasks” on page 153.
The Workflow Manager validates task attributes and links. If a task is invalid, the workflow
becomes invalid. Workflows containing invalid sessions may still be valid. For details on
validating tasks, see “Validating Tasks” on page 139.
1. In the Task Developer, choose Tasks-Create. The Create Task dialog box appears.
2. Select the task type you want to create, Command, Session, or Email.
3. Enter a name for the task.
4. For session tasks, select the mapping you want to associate with the session.
5. Click Create.
The Task Developer creates the workflow task.
6. Click Done to close the Create Task dialog box.
When you use a task in the workflow, you can edit the task in the Workflow Designer and
configure the following task options in the General tab:
♦ Treat input link as AND or OR. Choose to have the PowerCenter Server run the task
when all or one of the input link conditions evaluates to True.
♦ Disable this task. Choose to disable the task so you can run the rest of the workflow
without the task.
♦ Fail parent if this task fails. Choose to fail the workflow or worklet containing the task if
the task fails.
♦ Fail parent if this task does not run. Choose to fail the workflow or worklet containing
the task if the task does not run.
1. In the Workflow Designer, double-click the task you want to make reusable.
2. In the General tab of the Edit Task dialog box, check the Make Reusable option.
3. When prompted whether you are sure you want to promote the task, click Yes.
4. Click OK to return to the workflow.
5. Choose Repository-Save.
The newly promoted task appears in the list of reusable tasks in the Tasks node in the
Navigator window.
Disabling Tasks
In the Workflow Designer, you can disable a workflow task so that the PowerCenter Server
runs the workflow without the disabled task. The status of a disabled task is DISABLED.
Disable a task in the workflow by selecting the Disable This Task option in the Edit Tasks
dialog box.
1. In the Workflow Designer, click the Assignment icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Assignment Task for the task type.
2. Enter a name for the Assignment task. Click Create. Then click Done.
The Workflow Designer creates and adds the Assignment task to the workflow.
3. Double-click the Assignment task to open the Edit Task dialog box.
4. On the Expressions tab, click Add to add an assignment.
6. Select the variable for which you want to assign a value. Click OK.
7. Click the Edit button in the Expression field to open the Expression Editor.
The Expression Editor shows pre-defined workflow variables, user-defined workflow
variables, variable functions, and boolean and arithmetic operators.
8. Enter the value or expression you want to assign. For example, if you want to assign the
value 500 to the user-defined variable $$custno1, enter the number 500 in the
Expression Editor.
Validate the expression before you close the Expression Editor.
For a UNIX server, you would use the following command to perform a similar operation:
cp sales/sales_adj marketing/
Each shell command runs in the same environment (UNIX or Windows) as the PowerCenter
Server. Environment settings in one shell command script do not carry over to other scripts.
To run all shell commands in the same environment, call a single shell script that invokes
other scripts.
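A minimal sketch of that pattern, with all paths and script names hypothetical: a single wrapper script exports the environment once, then invokes each step script, so every step sees the same settings.

```shell
#!/bin/sh
# Hypothetical wrapper: set the shared environment once, then call each
# step script. Exported variables carry over because the steps run as
# children of this one script.
set -e
export STAGE_DIR="${TMPDIR:-/tmp}/stage_demo"    # shared setting (illustrative)
mkdir -p "$STAGE_DIR"

# Stand-ins for the step scripts a Command task might normally call:
printf 'echo "adjusting files in $STAGE_DIR"\n' > "$STAGE_DIR/adjust_sales.sh"
printf 'echo "copying files from $STAGE_DIR"\n' > "$STAGE_DIR/copy_to_marketing.sh"

sh "$STAGE_DIR/adjust_sales.sh"      # both steps inherit STAGE_DIR
sh "$STAGE_DIR/copy_to_marketing.sh"
```

A Command task would then invoke only the wrapper, rather than each step script separately.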
1. In the Workflow Designer or the Task Developer, click the Command Task icon on the
Tasks toolbar.
or
Choose Tasks-Create. Select Command Task for the task type.
2. Enter a name for the Command task. Click Create. Then click Done.
3. Double-click the Command task in the workspace to open the Edit Tasks dialog box.
4. In the Commands tab, click the Add button to add a command.
7. Enter the command you want to perform. Enter only one command in the Command
Editor.
8. Click OK to close the Command Editor.
9. Repeat steps 3-8 to add more commands in the task.
10. Click OK.
If you specify non-reusable shell commands for a session, you can promote the non-reusable
shell commands to a reusable Command task. For details, see “Creating a Reusable Command
Task from Pre- or Post-Session Commands” on page 191.
1. In the Workflow Designer, click the Control Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Control Task for the task type.
2. Enter a name for the Control task. Click Create. Then click Done.
The Workflow Manager creates and adds the Control task to the workflow.
3. Double-click the Control task in the workspace to open it.
Fail Me. Marks the Control task as “Failed.” The PowerCenter Server fails the Control task if you choose this option. If you choose Fail Me in the Properties tab and choose Fail Parent If This Task Fails in the General tab, the PowerCenter Server fails the parent workflow.
Fail Parent. Marks the status of the workflow or worklet that contains the Control task as failed after the workflow or worklet completes.
Stop Parent. Stops the workflow or worklet that contains the Control task.
Abort Parent. Aborts the workflow or worklet that contains the Control task.
Example
For example, you have a Command task that depends on the status of the three sessions in the
workflow. You want the PowerCenter Server to run the Command task when any of the three
sessions fails. To accomplish this, use a Decision task with the following decision condition:
$Q1_session.status = FAILED OR $Q2_session.status = FAILED OR
$Q3_session.status = FAILED
You can then use the pre-defined condition variable in the input link condition of the
Command task. Configure the input link with the following link condition:
$Decision.condition = True
You can configure the same logic in the workflow without the Decision task. Without the
Decision task, you need to use three link conditions and treat the input links to the
Command task as OR links.
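With that approach, each of the three input links to the Command task carries its own condition, for example:
$Q1_session.status = FAILED
$Q2_session.status = FAILED
$Q3_session.status = FAILED
The Command task then runs when any one of the link conditions evaluates to true.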
Figure 5-5 shows the example workflow without the Decision task:
You can further expand the example workflow in Figure 5-4. In Figure 5-4, the PowerCenter
Server runs the Command task if any of the three Session tasks fails. Suppose now you want
the PowerCenter Server to also run an Email task if all three Session tasks succeed.
$Decision.condition = True
$Decision.condition = False
1. In the Workflow Designer, click the Decision Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Decision Task for the task type.
2. Enter a name for the Decision task. Click Create. Then click Done.
The Workflow Designer creates and adds the Decision task to the workspace.
4. Click the Open button in the Value field to open the Expression Editor.
5. In the Expression Editor, enter the condition you want the PowerCenter Server to
evaluate.
Validate the expression before you close the Expression Editor.
6. Click OK.
1. In the Workflow Designer workspace, create an Event-Raise task and place it in the
workflow to represent the user-defined event you want to trigger. A user-defined event is
the sequence of tasks in the branch from the Start task to the Event-Raise task.
3. Click the Open button in the Value field on the Properties tab to open the Events
Browser for user-defined events.
1. In the workflow, create an Event-Wait task and double-click the Event-Wait task to open
the Edit Task dialog box.
2. In the Events tab of the Edit Tasks dialog box, select User-Defined.
Perform the following steps to wait for a pre-defined event in the workflow.
1. Create an Event-Wait task and double-click the Event-Wait task to open it.
5. Click OK.
You can use a Timer task anywhere in the workflow after the Start task.
1. In the Workflow Designer, click the Timer task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Timer Task for the task type.
2. Double-click the Timer task to open it.
3. On the General tab, enter a name for the Timer task.
Specify attributes for Absolute Time or Relative Time described in Table 5-2:
Absolute Time: Specify the exact time to start. The PowerCenter Server starts the next task in the workflow at the exact date and time you specify.
Absolute Time: Use this workflow date-time variable to calculate the wait. Specify a user-defined date-time workflow variable. The PowerCenter Server starts the next task in the workflow at the time you choose. The Workflow Manager verifies that the variable you specify has the Date/Time datatype. The Timer task fails if the date-time workflow variable evaluates to NULL.
Relative time: Start after. Specify the period of time the PowerCenter Server waits to start executing the next task in the workflow.
Relative time: from the start time of this task. Choose this option to wait a specified period of time after the start time of the Timer task to run the next task.
Relative time: from the start time of the parent workflow/worklet. Choose this option to wait a specified period of time after the start time of the parent workflow/worklet to run the next task.
Relative time: from the start time of the top-level workflow. Choose this option to wait a specified period of time after the start time of the top-level workflow to run the next task.
Overview
A worklet is an object that represents a set of tasks. It can contain any task available in the
Workflow Manager. You can run worklets inside a workflow. The workflow that contains the
worklet is called the parent workflow. You can also nest a worklet in another worklet.
Create a worklet when you want to reuse a set of workflow logic in several workflows. Use the
Worklet Designer to create and edit worklets.
When the PowerCenter Server runs a worklet, it expands the worklet. The PowerCenter
Server then runs the worklet as it would any other workflow, executing tasks and evaluating
links in the worklet.
The worklet does not contain any scheduling or server information. To run a worklet, include
the worklet in a workflow. The worklet runs on the PowerCenter Server you choose for the
workflow. The Workflow Manager does not provide a parameter file or log file for worklets.
The PowerCenter Server writes information about worklet execution in the workflow log.
Suspending Worklets
When you choose Suspend On Error for the parent workflow, the PowerCenter Server also
suspends the worklet if a task in the worklet fails. When a task in the worklet fails, the
PowerCenter Server stops executing the failed task and other tasks in its path. If no other task
is running in the worklet, the worklet status is “Suspended.” If one or more tasks are still
running in the worklet, the worklet status is “Suspending.” The PowerCenter Server suspends
the parent workflow when the status of the worklet is “Suspended” or “Suspending.”
For details on suspending workflows, see “Suspending the Workflow” on page 127.
To create a worklet, choose Worklets-Create in the Worklet Designer. The Create Worklet
dialog box appears.
Nesting Worklets
You can nest a worklet within another worklet. When you run a workflow containing nested
worklets, the PowerCenter Server runs the nested worklet from within the parent worklet. You
can group several worklets together by function or simplify the design of a complex workflow
when you nest worklets.
You might choose to nest worklets to load data to fact and dimension tables. Create a nested
worklet to load fact and dimension data into a staging area. Then, create a nested worklet to
load the fact and dimension data from the staging area to the data warehouse.
You might choose to nest worklets to simplify the design of a complex workflow. Nest
worklets that can be grouped together within one worklet. In the workflow in Figure 6-1, two
worklets relate to regional sales and two worklets relate to quarterly sales.
Figure 6-1 shows a workflow that uses multiple worklets:
When you run the example workflow shown in Figure 6-3, the persistent worklet variable
retains its value from Worklet1 and becomes the initial value in Worklet2. After the
PowerCenter Server executes Worklet2, it retains the value of the persistent variable in the
repository and uses the value the next time you run the workflow.
Worklet variables only persist when you run the same workflow. A worklet variable does not
retain its value when you use instances of the worklet in different workflows.
3. Click the Open button in the User-Defined Worklet Variables field to select a worklet
variable.
4. Click the Open button in the Parent Workflow Variable field to select a workflow
variable to assign to the worklet variable.
5. Click Apply.
The worklet variable in this worklet instance now has the selected workflow variable as its
initial value.
Overview
A session is a set of instructions that tells the PowerCenter Server how and when to move data
from sources to targets. A session is a type of task, similar to other tasks available in the
Workflow Manager. In the Workflow Manager, you configure a session by creating a Session
task. To run a session, you must first create a workflow to contain the Session task.
When you create a Session task, you enter general information such as the session name,
session schedule, and the PowerCenter Server to run the session. You can also select options to
execute pre-session shell commands, send On-Success or On-Failure email, and use FTP to
transfer source and target files.
Using session properties, you can also override parameters established in the mapping, such as
source and target location, source and target type, error tracing levels, and transformation
attributes. When you assign a server in a server grid to a session, the server you specify at the
session level overrides the server you specify at the workflow level.
You can run as many sessions in a workflow as you need. You can run the Session tasks
sequentially or concurrently, depending on your needs.
The PowerCenter Server creates several files and in-memory caches depending on the
transformations and options used in the session. For more details on session output files and
caches, see “Output Files and Caches” on page 28.
Session Privileges
To create sessions, you must have one of the following sets of privileges and permissions:
♦ Use Workflow Manager privilege with read, write, and execute permissions
♦ Super User privilege
You must have read permission for connection objects associated with the session in addition
to the above privileges and permissions.
PowerCenter allows you to set a read-only privilege for sessions. The Workflow Operator
privilege allows a user to view, start, stop, and monitor sessions without being able to edit
session properties.
1. In the Workflow Designer, click the Session Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Session Task for the task type.
2. Enter a name for the Session task.
3. Click Create. The Mappings dialog box appears.
For a target instance, you can change writers, connections, and properties settings.
Table 7-1 shows the options you can use to apply attributes to objects in a session. You can
apply different options depending on whether the setting is a reader or writer type, a
connection, or an object property.
♦ Apply Type to All Instances (Reader, Writer). Applies a reader or writer type to all instances of the same object type in the session. For example, you can apply a relational reader type to all the other readers in your session.
♦ Apply Type to All Partitions (Reader, Writer). Applies a reader or writer type to all the partitions in a pipeline. For example, if you have four partitions, you can change the writer type in one partition for a target instance. Then you can use this option to apply the change to the other three partitions.
♦ Apply Connection Type (Connections). Applies the same type of connection to all instances. Connection types are relational, FTP, queue, application, or external loader.
♦ Apply Connection Value (Connections). Applies a connection value to all instances or partitions. The connection value defines a specific connection that you can view in the connection browser. You can only apply a connection value that is valid for the existing connection type.
♦ Apply Connection Attributes (Connections). Applies only the connection attribute values to all instances or partitions. Each type of connection has different attributes. You can apply connection attributes separately from connection values. To view sample connection attributes, see Figure 7-3 on page 181.
♦ Apply Connection Data (Connections). Applies the connection value and its connection attributes to all the other instances that have the same connection type. This option combines the connection option and the connection attribute option.
♦ Apply All Connection Information (Connections). Applies the connection value and its attributes to all the other instances even if they do not have the same connection type. This option is similar to Apply Connection Data, but it allows you to change the connection type.
♦ Apply Attribute to All Instances (Properties). Applies an attribute value to all instances of the same object type in the session. For example, if you have a relational target, you can choose to truncate a table before you load data. You can apply the attribute value to all the relational targets in your session.
♦ Apply Attribute to All Partitions (Properties). Applies an attribute value to all partitions in a pipeline. For example, you can change the reject file name in one partition for a target instance, then apply the file name change to the other partitions.
5. Select an option from the list and choose to apply it to all instances or all partitions.
6. Click OK to apply the attribute or property.
4. Click the Browse button in the Config Name field to choose a session configuration.
Select a user-defined or default session configuration object from the browser.
5. Click OK.
For session configuration object settings descriptions, see “Config Object Tab” on page 675.
Error Handling
You can configure error handling on the Config Object tab. You can choose to stop or
continue the session if the PowerCenter Server encounters an error issuing the pre- or post-
session SQL command.
Figure 7-6. Stop or Continue the Session on Pre- or Post-Session SQL Errors
Perform the following steps to create pre- or post-session shell commands for a specific
session:
1. In the Components tab of the session properties, select Non-reusable for pre- or post-
session shell command.
2. Click the Edit button in the Value field to open the Edit Pre- or Post-Session Command
dialog box.
3. Enter a name for the command in the General tab.
5. In the Commands tab, click the Add button to add shell commands.
Enter one command for each line.
6. Click OK.
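For example, the following commands could serve as non-reusable pre-session shell commands that archive the previous run's target file before the session writes a new one. The paths and file names are hypothetical; enter one command per line in the Commands tab:

```shell
# Hypothetical pre-session shell commands; paths are illustrative only.
mkdir -p /tmp/pmdemo/archive
# Stand-in for the previous run's target file.
touch /tmp/pmdemo/target.out
# Archive it so the session starts with a clean target location.
mv /tmp/pmdemo/target.out /tmp/pmdemo/archive/target.out.bak
```

Whether the session stops or continues when one of these commands fails depends on the shell command error-handling option described later in this chapter.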
1. In the Components tab of the session properties, click Reusable for the pre- or post-
session shell command.
2. Click the Edit button in the Value field to open the Task Browser dialog box.
3. Select the Command task you want to run as the pre- or post-session shell command.
4. Click the Override button in the Task Browser dialog box if you want to change the order
of the commands, or if you want to specify whether to run the next command when the
previous command fails.
Changes you make to the Command task from the session properties only apply to the
session. In the session properties, you cannot edit the commands in the Command task.
5. Click OK to select the Command task for the pre- or post-session shell command.
The name of the Command task you select appears in the Value field for the shell
command.
Figure 7-8. Stop or Continue the Session on Pre-Session Shell Command Error
Threshold Errors
You can choose to stop a session on a designated number of non-fatal errors. A non-fatal error
is an error that does not force the session to stop on its first occurrence. Establish the error
threshold in the session properties with the Stop On option. When you enable this option,
the PowerCenter Server counts non-fatal errors that occur in the reader, writer, and
transformation threads.
The PowerCenter Server maintains an independent error count when reading sources,
transforming data, and writing to targets. The PowerCenter Server counts the following non-
fatal errors when you set the Stop On option in the session properties:
♦ Reader errors. Errors encountered by the PowerCenter Server while reading the source
database or source files. Reader threshold errors can include alignment errors while
running a session in Unicode mode.
♦ Writer errors. Errors encountered by the PowerCenter Server while writing to the target
database or target files. Writer threshold errors can include key constraint violations,
loading nulls into a not null field, and database trigger responses.
♦ Transformation errors. Errors encountered by the PowerCenter Server while transforming
data. Transformation threshold errors can include conversion errors, and any condition set
up as an ERROR, such as null input.
When you create multiple partitions in a pipeline, the PowerCenter Server maintains a
separate error threshold for each partition. When the PowerCenter Server reaches the error
threshold for any partition, it stops the session. The writer may continue writing data from
one or more partitions, but it does not affect your ability to perform a successful recovery.
Note: If alignment errors occur in a non line-sequential VSAM file, the PowerCenter Server
sets the error threshold to 1 and stops the session.
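The per-partition threshold behavior described above can be sketched as follows. This is an illustrative model, not Informatica code; the class name and interface are invented for the example:

```python
# Sketch of the Stop On error threshold: each partition keeps an
# independent count of non-fatal errors, and reaching the threshold in
# any one partition stops the session.
class PartitionErrorCounter:
    def __init__(self, stop_on: int):
        self.stop_on = stop_on   # error threshold from the session properties
        self.counts = {}         # independent error count per partition

    def record_error(self, partition: str) -> bool:
        """Count one non-fatal error; return True when the session must stop."""
        self.counts[partition] = self.counts.get(partition, 0) + 1
        return self.counts[partition] >= self.stop_on

counter = PartitionErrorCounter(stop_on=3)
assert not counter.record_error("partition1")
assert not counter.record_error("partition2")  # partitions are counted separately
assert not counter.record_error("partition1")
assert counter.record_error("partition1")      # third error in partition1 stops the session
```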
Fatal Error
A fatal error occurs when the PowerCenter Server cannot access the source, target, or
repository. This can include loss of connection or target database errors, such as lack of
database space to load data.
ABORT Function
Use the ABORT function in the mapping logic to abort a session when the PowerCenter
Server encounters a designated transformation error.
For more information about ABORT, see “Functions” in the Transformation Language
Reference.
User Command
You can stop or abort the session from the Workflow Manager. You can also stop the session
using pmcmd.
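For example, the following command lines stop and abort a running workflow from pmcmd. The workflow name is hypothetical, the bracketed values are placeholders, and the exact flags vary by pmcmd version, so check the pmcmd help output for your installation:

```
pmcmd stopworkflow -s <server:port> -u <username> -p <password> -f <folder> wf_daily_load
pmcmd abortworkflow -s <server:port> -u <username> -p <password> -f <folder> wf_daily_load
```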
The PowerCenter Server handles each stop or abort condition as follows:
♦ Error threshold met due to reader errors, or Stop command issued from the Workflow Manager or pmcmd. The PowerCenter Server performs the following tasks:
- Stops reading.
- Continues processing data.
- Continues writing and committing data to targets.
If the PowerCenter Server cannot finish processing and committing data, you need to issue the Abort command to stop the session.
♦ Abort command issued from the Workflow Manager. The PowerCenter Server performs the following tasks:
- Stops reading.
- Continues processing data.
- Continues writing and committing data to targets.
If the PowerCenter Server cannot finish processing and committing data within 60 seconds, it kills the PowerCenter Server process.
♦ Fatal error from the database, or error threshold met due to writer errors. The PowerCenter Server performs the following tasks:
- Stops reading and writing.
- Rolls back all data not committed to the target database.
If the session stops due to a fatal error, the commit or rollback may or may not be successful.
♦ Error threshold met due to transformation errors, an ABORT( ) call, or invalid evaluation of a transaction control expression. The PowerCenter Server performs the following tasks:
- Stops reading.
- Flags the row as an abort row and continues processing data.
- Continues to write to the target database until it hits the abort row.
- Issues commits based on commit intervals.
- Rolls back all data not committed to the target database.
1. In the Navigator window of the Workflow Manager, right-click the Session task and
select View Persistent Values.
Overview
In the Workflow Manager, you can create sessions with the following sources:
♦ Relational. You can extract data from any relational database that the PowerCenter Server
can connect to. When extracting data from relational sources and Application sources, you
must configure the database connection to the data source prior to configuring the session.
♦ File. You can create a session to extract data from a flat file, COBOL, or XML source. The
PowerCenter Server can extract data from any local directory or FTP connection for the
source file. If the file source requires an FTP connection, you need to configure the FTP
connection to the host machine before you create the session.
♦ Heterogeneous. You can extract data from multiple sources in the same session. You can
extract from multiple relational sources, such as Oracle and SQL Server. Or, you can
extract from multiple source types, such as relational and flat file. When you configure a
session with heterogeneous sources, configure each source instance separately.
Globalization Features
You can choose a code page that you want the PowerCenter Server to use for relational sources
and flat files. You specify code pages for relational sources when you configure database
connections in the Workflow Manager. You can set the code page for file sources in the session
properties. For more information about code pages, see “Globalization Overview” in the
Installation and Configuration Guide.
Source Connections
Before you can extract data from a source, you must configure the connection properties the
PowerCenter Server uses to connect to the source file or database. You can configure source
database and FTP connections in the Workflow Manager.
For more information on creating database connections, see “Configuring the Workflow
Manager” on page 37. For more information on creating FTP connections, see “Using FTP”
on page 559.
Partitioning Sources
You can create multiple partitions for relational, Application, and file sources. For relational
or Application sources, the PowerCenter Server creates a separate connection to the source
database for each partition you set in the session properties. For file sources, you can
configure the session to read the source with one thread or multiple threads.
For more information on partitioning data, see “Pipeline Partitioning” on page 345.
Configuring Sources in a Session
Configure source properties for sessions in the Sources node of the Mapping tab of the session
properties. When you configure source properties for a session, you define properties for each
source instance in the mapping.
Figure 8-1 shows the Sources node on the Mapping tab:
The Sources node lists the sources used in the session and displays their settings. To view and
configure settings for a source, select the source from the list. You can configure the following
settings for a source:
♦ Readers
♦ Connections
♦ Properties
Configuring Readers
You can click the Readers settings on the Sources node to view the reader the PowerCenter
Server uses with each source instance. The Workflow Manager specifies the necessary reader
for each source instance in the Readers settings on the Sources node.
Figure 8-2. Readers Settings in the Sources Node of the Mapping Tab
Configuring Connections
Click the Connections settings on the Sources node to define source connection information.
For relational sources, choose a configured database connection in the Value column for each
relational source instance. By default, the Workflow Manager displays the source type for
relational sources. For details on configuring database connections, see “Selecting the Source
Database Connection” on page 214.
For flat file and XML sources, choose one of the following source connection types in the
Type column for each source instance:
♦ FTP. If you want to read data from a flat file or XML source using FTP, you must specify
an FTP connection when you configure source options. You must define the FTP
connection in the Workflow Manager prior to configuring the session.
You must have read permission for any FTP connection you want to associate with the
session. The user starting the session must have execute permission for any FTP
connection associated with the session. For details on using FTP, see “Using FTP” on
page 559.
♦ None. Choose None when you want to read from a local flat file or XML file.
Configuring Properties
Click the Properties settings in the Sources node to define source property information. The
Workflow Manager displays properties, such as the source file name and location, for flat file
sources.
Figure 8-4. Properties Settings in the Sources Node of the Mapping Tab
For more information on configuring sessions with relational sources, see “Working with
Relational Sources” on page 214. For more information on configuring sessions with flat file
sources, see “Working with File Sources” on page 218. For more information on configuring
sessions with XML sources, see the XML User Guide.
Treat Source Rows As Property
Table 8-1 describes the options you can choose for the Treat Source Rows As property:
♦ Insert. The PowerCenter Server marks all rows to insert into the target.
♦ Delete. The PowerCenter Server marks all rows to delete from the target.
♦ Update. The PowerCenter Server marks all rows to update the target. You can further define the update operation in the target options. For more information, see “Target Properties” on page 241.
♦ Data Driven. The PowerCenter Server uses the Update Strategy transformations in the mapping to determine the operation on a row-by-row basis. You define the update operation in the target options. If the mapping contains an Update Strategy transformation, this option defaults to Data Driven. You can also use this option when the mapping contains Custom transformations configured to set the update strategy.
Once you determine how to treat all rows in the session, you also need to set update strategy
options for individual targets. For more information on setting the target update strategy
options, see “Target Properties” on page 241.
For more information on setting the update strategy for a session, see “Update Strategy
Transformation” in the Transformation Guide.
Figure 8-8. Properties Settings in the Sources Node for a Flat File Source
♦ Source File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server looks in the server variable directory, $PMSourceFileDir, for file sources. If you specify both the directory and file name in the Source Filename field, clear this field. The PowerCenter Server concatenates this field with the Source Filename field when it runs the session. You can also use the $InputFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Source Filename (Required). Enter the file name, or file name and path. Optionally use the $InputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Source File Directory field when it runs the session. For example, if you have “C:\data\” in the Source File Directory field, then enter “filename.dat” in the Source Filename field. When the PowerCenter Server begins the session, it looks for “C:\data\filename.dat”. By default, the Workflow Manager enters the file name configured in the source definition. For details on session parameters, see “Session Parameters” on page 495.
♦ Source Filetype (Required). Indicates whether the source file contains the source data, or a list of files with the exact same file properties, which allows you to configure multiple file sources using a file list. Choose Direct if the source file contains the source data. Choose Indirect if the source file contains a list of files. When you select Indirect, the PowerCenter Server finds the file list and reads each listed file when it runs the session. For details on file lists, see “Using a File List” on page 230.
♦ Set File Properties link (Optional). Opens a dialog box that allows you to override source file properties. By default, the Workflow Manager displays file properties as configured in the source definition. For more information, see “Configuring Fixed-Width File Properties” on page 220 and “Configuring Delimited File Properties” on page 222.
To edit the fixed-width properties, select Fixed Width and click Advanced. The Fixed-Width
Properties dialog box appears. By default, the Workflow Manager displays file properties as
configured in the mapping. Edit these settings to override those configured in the source
definition.
Figure 8-10 shows the Fixed-Width Properties dialog box:
The Fixed-Width Properties dialog box contains the following options:
♦ Null Character: Text/Binary (Required). Indicates the character representing a null value in the file. This can be any valid character in the file code page, or any binary value from 0 to 255. For more information about specifying null characters, see “Null Character Handling” on page 227.
♦ Repeat Null Character (Optional). If selected, the PowerCenter Server reads repeat NULL characters in a single field as a single NULL value. If you do not select this option, the PowerCenter Server reads a single null character at the beginning of a field as a null field. Important: For multibyte code pages, Informatica recommends that you specify a single-byte null character if you are using repeating non-binary null characters. This ensures that repeating null characters fit into the column exactly. For more information about specifying null characters, see “Null Character Handling” on page 227.
♦ Code Page (Required). Select the code page of the fixed-width file. The default setting is the client code page.
♦ Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File Format option, the PowerCenter Server ignores this option.
♦ Number of Bytes to Skip Between Records (Optional). The PowerCenter Server skips the specified number of bytes between records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return and line feed appear at the end of each line. If you want the PowerCenter Server to skip these two single-byte characters, enter 2. If you have an ASCII file on UNIX with one record for each line, ending in a carriage return, skip the single character by entering 1.
♦ Strip Trailing Blanks (Optional). If selected, the PowerCenter Server strips trailing blank spaces from records before passing them to the Source Qualifier transformation.
♦ Line Sequential File Format (Optional). Select this option if the file uses a carriage return at the end of each record, shortening the final column.
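The Number of Bytes to Skip Between Records option can be illustrated with a short sketch. This is not Informatica code; it only models how fixed-width records are read when a terminator follows each record:

```python
# Illustrative model of "Number of Bytes to Skip Between Records" for
# fixed-width files: read record_len bytes, then skip the terminator.
def read_fixed_records(data: bytes, record_len: int, skip_between: int):
    records, pos = [], 0
    while pos + record_len <= len(data):
        records.append(data[pos:pos + record_len])
        pos += record_len + skip_between
    return records

# Windows-style file: each 5-byte record is followed by \r\n, so skip 2.
assert read_fixed_records(b"AAAAA\r\nBBBBB\r\n", 5, 2) == [b"AAAAA", b"BBBBB"]
# UNIX-style file: each record ends in a single \n, so skip 1.
assert read_fixed_records(b"AAAAA\nBBBBB\n", 5, 1) == [b"AAAAA", b"BBBBB"]
```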
To edit the delimited properties, select Delimited and click Advanced. The Delimited File
Properties dialog box appears. By default, the Workflow Manager displays file properties as
configured in the mapping. Edit these settings to override those configured in the source
definition.
Figure 8-12 shows the Delimited File Properties dialog box:
♦ Delimiters (Required). Character used to separate columns of data in the source file. Use the button to the right of this field to enter a different delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters. The delimiter must be in the same code page as the flat file code page.
♦ Treat Consecutive Delimiters as One (Optional). By default, the PowerCenter Server reads pairs of delimiters as a null value. If selected, the PowerCenter Server reads any number of consecutive delimiter characters as one. For example, a source file uses a comma as the delimiter character and contains the following record: 56, , , Jane Doe. By default, the PowerCenter Server reads that record as four columns separated by three delimiters: 56, NULL, NULL, Jane Doe. If you select this option, the PowerCenter Server reads the record as two columns separated by one delimiter: 56, Jane Doe.
♦ Optional Quotes (Required). Select No Quotes, Single Quote, or Double Quotes. If you select a quote character, the PowerCenter Server ignores delimiter characters within the quote characters. Therefore, the PowerCenter Server uses quote characters to escape the delimiter. For example, a source file uses a comma as a delimiter and contains the following row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and reads the row as four fields. If you do not select the optional single quote, the PowerCenter Server reads six separate fields. When the PowerCenter Server reads two optional quote characters within a quoted string, it treats them as one quote character. For example, the PowerCenter Server reads the following quoted string as I’m going tomorrow: 2353, ‘I’’m going tomorrow.’, MD. Additionally, if you select an optional quote character, the PowerCenter Server only reads a string as a quoted string if the quote character is the first character of the field. Note: You can improve session performance if the source file does not contain quotes or escape characters.
♦ Code Page (Required). Select the code page of the delimited file. The default setting is the client code page.
♦ Remove Escape Character From Data (Optional). This option is selected by default. Clear this option to include the escape character in the output string.
♦ Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip title or header rows in the file.
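Python's csv module happens to follow the same two quoting rules described for Optional Quotes, so the examples above can be reproduced with it. This is an illustration, not PowerCenter code; the rows below omit the spaces after the delimiters, since a quote character only counts when it is the first character of the field:

```python
import csv

# Commas inside the single quotes are not treated as delimiters,
# so this row is read as four fields.
row = next(csv.reader(["342-3849,'Smith, Jenna','Rockville, MD',6"],
                      delimiter=",", quotechar="'"))
assert row == ["342-3849", "Smith, Jenna", "Rockville, MD", "6"]

# Two quote characters inside a quoted string are read as one.
doubled = next(csv.reader(["2353,'I''m going tomorrow.',MD"],
                          delimiter=",", quotechar="'"))
assert doubled == ["2353", "I'm going tomorrow.", "MD"]
```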
Figure 8-13. Line Sequential Buffer Length Property for File Sources
Character Set
You can configure the PowerCenter Server to run sessions in either ASCII or Unicode data
movement mode.
Table 8-5 describes source file formats supported by each data movement path in
PowerCenter:
Table 8-5. Support for ASCII and Unicode Data Movement Modes
For example, the PowerCenter Server supports EBCDIC-based MBCS source files in Unicode data movement mode but not in ASCII mode, where it terminates the session.
If you configure a session to run in ASCII data movement mode, delimiters, escape
characters, and null characters must be valid in the ISO Western European Latin 1 code page.
Any 8-bit characters you specified in previous versions of PowerCenter are still valid. In
Unicode data movement mode, delimiters, escape characters, and null characters must be
valid in the specified code page of the flat file.
For more information about configuring and working with data movement modes, see
“Globalization Overview” in the Installation and Configuration Guide.
The way the PowerCenter Server evaluates null columns depends on the null character type and the Repeat Null Character option:
♦ Binary null character, Repeat Null Character disabled. A column is null if the first byte in the column is the binary null character. The PowerCenter Server reads the rest of the column as text data only to determine the column alignment and track the shift state for shift-sensitive code pages. If data in the column is misaligned, the PowerCenter Server skips the row and writes the skipped row and a corresponding error message to the session log.
♦ Non-binary null character, Repeat Null Character disabled. A column is null if the first character in the column is the null character. The PowerCenter Server reads the rest of the column only to determine the column alignment and track the shift state for shift-sensitive code pages. If data in the column is misaligned, the PowerCenter Server skips the row and writes the skipped row and a corresponding error message to the session log.
♦ Binary null character, Repeat Null Character enabled. A column is null if it contains only the specified binary null character. The next column inherits the initial shift state of the code page.
♦ Non-binary null character, Repeat Null Character enabled. A column is null if the repeating null character fits into the column exactly, with no bytes left over. For example, a five-byte column is not null if you specify a two-byte repeating null character. In shift-sensitive code pages, shift bytes do not affect the null value of a column: a column is still null if it contains a shift byte at the beginning or end of the column. Informatica recommends you specify a single-byte null character if you use repeating non-binary null characters. This ensures that repeating null characters fit into a column exactly.
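The repeating-null rule for non-binary null characters can be sketched as follows. This is an illustrative model, not Informatica code, and it ignores shift bytes for simplicity:

```python
# Sketch of the repeating-null rule with Repeat Null Character enabled:
# a column is null only if the null character repeats to fill it exactly.
def is_null_column(column: bytes, null_char: bytes) -> bool:
    if len(column) == 0 or len(column) % len(null_char) != 0:
        return False
    return column == null_char * (len(column) // len(null_char))

assert is_null_column(b"**", b"*")          # null character fills the column exactly
assert not is_null_column(b"ABABA", b"AB")  # five-byte column, two-byte null character
assert is_null_column(b"ABAB", b"AB")       # fits exactly, so the column is null
```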
d:\data\eastern_trans.dat
e:\data\midwest_trans.dat
f:\data\canada_trans.dat
Once you create the file list, place it in a directory local to the PowerCenter Server.
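As a hedged illustration, the file list above can be generated with a short script. The staging directory and list file name are hypothetical; the PowerCenter Server only requires a local file containing one source file path per line, which you then reference as the Source Filename with an Indirect file type:

```shell
# Hypothetical staging location for the file list.
mkdir -p /tmp/pmdemo
# Write one source file path per line; the quoted heredoc delimiter
# keeps the backslashes literal.
cat > /tmp/pmdemo/trans_list.txt <<'EOF'
d:\data\eastern_trans.dat
e:\data\midwest_trans.dat
f:\data\canada_trans.dat
EOF
```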
Overview
In the Workflow Manager, you can create sessions with the following targets:
♦ Relational. You can load data to any relational database that the PowerCenter Server can
connect to. When loading data to relational targets, you must configure the database
connection to the target before you configure the session.
♦ File. You can load data to a flat file or XML target. The PowerCenter Server can load data
to any local directory or FTP connection for the target file. If the file target requires an
FTP connection, you need to configure the FTP connection to the host machine before
you create the session.
♦ Heterogeneous. You can output data to multiple targets in the same session. You can
output to multiple relational targets, such as Oracle and Microsoft SQL Server. Or, you
can output to multiple target types, such as relational and flat file. For more information,
see “Working with Heterogeneous Targets” on page 274.
Globalization Features
You can configure the PowerCenter Server to run sessions in either ASCII or Unicode data
movement mode.
Table 9-1 describes target character sets supported by each data movement mode in
PowerCenter:
Table 9-1. Support for ASCII and Unicode Data Movement Modes
PowerCenter allows you to work with targets that use multibyte character sets. You can choose
a code page that you want the PowerCenter Server to use for relational objects and flat files.
You specify code pages for relational objects when you configure database connections in the
Workflow Manager. The code page for a database connection used as a target must be a
superset of the repository code page.
When you change the database connection code page to one that is not two-way compatible
with the old code page, the Workflow Manager generates a warning and invalidates all
sessions that use that database connection.
Target Connections
Before you can load data to a target, you must configure the connection properties the
PowerCenter Server uses to connect to the target file or database. You can configure target
database and FTP connections in the Workflow Manager.
For details on creating database connections, see “Setting Up a Relational Database
Connection” on page 53. For details on creating FTP connections, see “Using FTP” on
page 559.
Partitioning Targets
When you create multiple partitions in a session with a relational target, the PowerCenter
Server creates multiple connections to the target database to write target data concurrently.
When you create multiple partitions in a session with a file target, the PowerCenter Server
creates one target file for each partition. You can configure the session properties to merge
these target files.
For details on configuring a session for pipeline partitioning, see “Pipeline Partitioning” on
page 345.
Configuring Targets in a Session
Configure target properties for sessions in the Transformations view on the Mapping tab of the
session properties. Click the Targets node to view the target properties. When you configure
target properties for a session, you define properties for each target instance in the mapping.
Figure 9-1 shows where you define target properties in a session: the Targets node in the Transformations view of the Mapping tab, with its Writers, Connections, and Properties settings.
The Targets node contains the following settings where you define properties:
♦ Writers
♦ Connections
♦ Properties
Configuring Writers
Click the Writers settings in the Transformations view to define the writer to use with each
target instance.
Figure 9-2. Writers Settings on the Mapping Tab of the Session Properties
When the mapping target is a flat file, an XML file, an SAP BW target, or an IBM MQSeries
target, the Workflow Manager specifies the necessary writer in the session properties.
However, when the target in the mapping is relational, you can change the writer type to File
Writer if you plan to use an external loader.
Note: You can change the writer type for non-reusable sessions in the Workflow Designer and
for reusable sessions in the Task Developer. You cannot change the writer type for instances of
reusable sessions in the Workflow Designer.
When you override a relational target to use the file writer, the Workflow Manager changes
the properties for that target instance on the Properties settings. It also changes the
connection options you can define in the Connections settings.
After you override a relational target to use a file writer, define the file properties for the
target. Click Set File Properties and choose the target to define. For more information, see
“Configuring Fixed-Width Properties” on page 265 and “Configuring Delimited Properties”
on page 266.
Configuring Connections
View the Connections settings on the Mapping tab to define target connection information.
Figure 9-3. Connections Settings on the Mapping Tab of the Session Properties
For relational targets, the Workflow Manager displays Relational as the target type by default.
In the Value column, choose a configured database connection for each relational target
instance. For details on configuring database connections, see “Target Database Connection”
on page 241.
For flat file and XML targets, choose one of the following target connection types in the Type
column for each target instance:
♦ FTP. If you want to load data to a flat file or XML target using FTP, you must specify an
FTP connection when you configure target options. FTP connections must be defined in
the Workflow Manager prior to configuring sessions.
You must have read permission for any FTP connection you want to associate with the
session. The user starting the session must have execute permission for any FTP
connection associated with the session. For details on using FTP, see “Using FTP” on
page 559.
♦ Loader. You can use the external loader option to improve the load speed to Oracle, DB2,
Sybase IQ, or Teradata target databases.
To use this option, you must use a mapping with a relational target definition and choose
File as the writer type on the Writers settings for the relational target instance. The
PowerCenter Server uses an external loader to load target files to the Oracle, DB2, Sybase IQ, or Teradata database.
Configuring Properties
View the Properties settings on the Mapping tab to define target property information. The
Workflow Manager displays different properties for the different target types: relational, flat
file, and XML.
Figure 9-4 shows the Properties settings on the Mapping tab:
Figure 9-4. Properties Settings on the Mapping Tab of the Session Properties
For more information on relational target properties, see “Working with Relational Targets”
on page 240. For more information on flat file target properties, see “Working with File
Targets” on page 261. For more information on XML target properties, see “Working with
Heterogeneous Targets” on page 274.
For more information on configuring sessions with multiple target types, see “Working with
Heterogeneous Targets” on page 274.
Target Properties
You can configure session properties for relational targets in the Transformations view on the
Mapping tab, and in the General Options settings on the Properties tab. Define the properties
for each target instance in the session.
When you click the Transformations view on the Mapping tab, you can view and configure
the settings of a specific target. Select the target under the Targets node.
Figure 9-5. Properties Settings on the Mapping Tab for a Relational Target
Table 9-2 describes the properties available in the Properties settings on the Mapping tab of
the session properties:
♦ Insert* (optional). If selected, the PowerCenter Server inserts all rows flagged for insert. By default, this option is selected.
♦ Update (as Update)* (optional). If selected, the PowerCenter Server updates all rows flagged for update. By default, this option is selected.
♦ Update (as Insert)* (optional). If selected, the PowerCenter Server inserts all rows flagged for update. By default, this option is not selected.
♦ Update (else Insert)* (optional). If selected, the PowerCenter Server updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert. By default, this option is not selected.
♦ Delete* (optional). If selected, the PowerCenter Server deletes all rows flagged for delete. By default, this option is selected.
♦ Truncate Table (optional). If selected, the PowerCenter Server truncates the target before loading. By default, this option is not selected. For details on this feature, see “Truncating Target Tables” on page 245.
♦ Reject File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Reject Filename (required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see “Session Parameters” on page 495.
*For details on target update strategies, see “Update Strategy Transformation” in the Transformation Guide.
Test Load Options
Table 9-3 describes the test load options on the General Options settings on the Properties
tab:
♦ Enable Test Load (optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files, and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. Note: You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.
♦ Number of Rows to Test (optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the exact number you configure for the test load.
The command the PowerCenter Server issues to truncate a target table depends on the target database and on whether or not the table contains a primary key referenced by a foreign key.
If the PowerCenter Server issues a truncate target table command and the target table instance
specifies a table name prefix, the PowerCenter Server verifies the database user privileges for
the target table by issuing a truncate command. If the database user is not specified as the
target owner name or does not have the database privilege to truncate the target table, the
PowerCenter Server automatically issues a delete command instead and writes the following
error message to the session log:
WRT_8208 Error truncating target table <target table name> trying DELETE
FROM query.
If the PowerCenter Server issues a delete command and the database has logging enabled, the
database saves all deleted records to the log for rollback. If you do not want to save deleted
records for rollback, you can disable logging to improve the speed of the delete.
For all databases, if the PowerCenter Server fails to truncate or delete any selected table
because the user lacks the necessary privileges, the session fails.
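The truncate-or-delete fallback described above can be sketched as follows; the `execute` and `log` callables are hypothetical stand-ins for the server's database and session-log interfaces, not an actual PowerCenter API:

```python
def clear_target_table(execute, table: str, log) -> None:
    """Attempt a truncate; on a privilege failure, log the WRT_8208
    message and fall back to a logged DELETE, as described above."""
    try:
        execute(f"TRUNCATE TABLE {table}")
    except PermissionError:
        log(f"WRT_8208 Error truncating target table {table} "
            "trying DELETE FROM query.")
        execute(f"DELETE FROM {table}")
```

With a database stub that rejects TRUNCATE, the function issues `DELETE FROM <table>` and writes the error message, mirroring the behavior in the session log.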
If you use truncate target tables with one of the following functions, the PowerCenter Server
fails to successfully truncate target tables for the session:
♦ Incremental aggregation. When you enable both truncate target tables and incremental
aggregation in the session properties, the Workflow Manager issues a warning that you
cannot enable truncate target tables and incremental aggregation in the same session.
4. In the Properties settings, select Truncate Target Table Option for each target table you
want the PowerCenter Server to truncate before it runs the session.
5. Click OK.
Deadlock Retry
Select the Session Retry on Deadlock option in the session properties if you want the
PowerCenter Server to retry target writes on a deadlock. A deadlock might occur when the
PowerCenter Server attempts to take control of the same lock for a row when loading
partitioned targets or when running two sessions simultaneously to the same target.
Constraint-Based Loading
In the Workflow Manager, you can specify constraint-based loading for a session. When you
select this option, the PowerCenter Server orders the target load on a row-by-row basis. For
every row generated by an active source, the PowerCenter Server loads the corresponding
transformed row first to the primary key table, then to any foreign key tables. Constraint-
based loading depends on the following requirements:
♦ Active source. Related target tables must have the same active source.
♦ Key relationships. Target tables must have key relationships.
♦ Target connection groups. Targets must be in one target connection group.
♦ Treat rows as insert. Use this option when you insert into the target. You cannot use
updates with constraint-based loading.
Active Source
When target tables receive rows from different active sources, the PowerCenter Server reverts
to normal loading for those tables, but loads all other targets in the session using constraint-
based loading when possible. For example, a mapping contains three distinct pipelines. The
first two contain a source, source qualifier, and target. Since these two targets receive data
from different active sources, the PowerCenter Server reverts to normal loading for both
targets. The third pipeline contains a source, Normalizer, and two targets. Since these two
targets share a single active source (the Normalizer), the PowerCenter Server performs
constraint-based loading: loading the primary key table first, then the foreign key table.
For more information on active sources, see “Working with Active Sources” on page 259.
Key Relationships
When target tables have no key relationships, the PowerCenter Server does not perform
constraint-based loading. Similarly, when target tables have circular key relationships, the
After loading the first set of targets, the PowerCenter Server begins reading source B. If there
are no key relationships between T_5 and T_6, the PowerCenter Server reverts to a normal
load for both targets.
If T_6 has a foreign key that references a primary key in T_5, since T_5 and T_6 receive data
from a single active source, the Aggregator AGGTRANS, the PowerCenter Server loads rows
to the tables in the following order:
♦ T_5
♦ T_6
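Constraint-based loading amounts to ordering targets so that every primary key table loads before the tables that reference it. A minimal sketch of that ordering, using the T_5/T_6 example above (the foreign-key map is illustrative):

```python
from graphlib import TopologicalSorter

# Map each target to the primary key tables it references.
foreign_keys = {
    "T_5": set(),        # primary key table, no references
    "T_6": {"T_5"},      # T_6 has a foreign key referencing T_5
}

# Load primary key tables before the tables that reference them.
load_order = list(TopologicalSorter(foreign_keys).static_order())
print(load_order)  # ['T_5', 'T_6']
```

A circular key relationship makes this ordering impossible, which is why the PowerCenter Server reverts to a normal load in that case.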
1. In the General Options settings of the Properties tab, choose Insert for the Treat Source
Rows As property.
2. Select the Constraint Based Load Ordering option.
3. Click OK.
Bulk Loading
You can enable bulk loading when you load to DB2, Sybase, Oracle, or Microsoft SQL Server.
If you enable bulk loading for other database types, the PowerCenter Server reverts to a
normal load. Bulk loading improves the performance of a session that inserts a large amount
of data to the target database. Configure bulk loading on the Mapping tab.
When bulk loading, the PowerCenter Server invokes the database bulk utility and bypasses
the database log, which speeds performance. Without writing to the database log, however,
the target database cannot perform rollback. As a result, you may not be able to perform
recovery. Therefore, you must weigh the importance of improved session performance against
the ability to recover an incomplete session.
For more information on increasing session performance when bulk loading, see “Bulk
Loading” on page 642.
Note: When loading to DB2, Microsoft SQL Server, and Oracle targets, you must specify a
normal load for data driven sessions. When you specify bulk mode and data driven, the
PowerCenter Server reverts to normal load.
Oracle Guidelines
Oracle allows bulk loading for the following software versions:
♦ Oracle server version 8.1.5 or higher
♦ Oracle client version 8.1.7.2 or higher
You can use the Oracle client 8.1.7 if you install the Oracle Threaded Bulk Mode patch.
Use the following guidelines when bulk loading to Oracle:
♦ Do not define CHECK constraints in the database.
♦ Do not define primary and foreign keys in the database. However, you can define primary
and foreign keys for the target definitions in the Designer.
♦ To bulk load into indexed tables, choose non-parallel mode. To do this, you must disable
the Enable Parallel Mode option. For more information, see “Configuring a Relational
Database Connection” on page 56.
Note that when you disable parallel mode, you cannot load multiple target instances,
partitions, or sessions into the same table.
To bulk load in parallel mode, you must drop indexes and constraints in the target tables
before running a bulk load session. After the session completes, you can rebuild them. If
you use bulk loading with the session on a regular basis, you can use pre- and post-session
SQL to drop and rebuild indexes and key constraints.
♦ When you use the LONG datatype, verify it is the last column in the table.
♦ Specify the Table Name Prefix for the target when you use Oracle client 9i. If you do not
specify the table name prefix, the PowerCenter Server uses the database login as the prefix.
For more information, see your Oracle documentation.
DB2 Guidelines
Use the following guidelines when bulk loading to DB2:
♦ You must drop indexes and constraints in the target tables before running a bulk load
session. After the session completes, you can rebuild them. If you use bulk loading with
the session on a regular basis, you can use pre- and post-session SQL to drop and rebuild
indexes and key constraints.
♦ When you bulk load to DB2, the DB2 database writes non-fatal errors and warnings to a
message log file in the session log directory. The message log file name is
<session_log_name>.<target_instance_name>.<partition_index>.log. You can check both
the message log file and the session log when you troubleshoot a DB2 bulk load session.
For more information, see your DB2 documentation.
1. In the Workflow Manager, open the session properties and click the Transformations
view on the Mapping tab.
2. Select the target instance under the Targets node.
Reserved Words
If any table name or column name contains a database reserved word, such as MONTH or
YEAR, the session fails with database errors when the PowerCenter Server executes SQL
against the database. You can create and maintain a reserved words file, reswords.txt, in the
PowerCenter Server installation directory. When the PowerCenter Server initializes a session,
it searches for reswords.txt. If the file exists, the PowerCenter Server places quotes around
matching reserved words when it executes SQL against the database.
Use the following rules and guidelines when working with reserved words:
♦ The PowerCenter Server searches the reserved words file when it generates SQL to connect
to source, target, and lookup databases.
♦ If you override the SQL for a source, target, or lookup, you must enclose any reserved
word in quotes.
♦ You may need to enable some databases, such as Microsoft SQL Server and Sybase, to use
SQL-92 standards regarding quoted identifiers. You can use environment SQL to issue the
command. For example, with Microsoft SQL Server, you can use the following command:
SET QUOTED_IDENTIFIER ON
The following is a sample reswords.txt file:

MONTH
DATE
INTERVAL
[Oracle]
OPTION
START
[DB2]
[SQL Server]
CURRENT
[Informix]
[ODBC]
MONTH
[Sybase]
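The effect of reswords.txt can be sketched in Python; the parsing and quoting below are a simplified illustration, not the PowerCenter Server's actual implementation:

```python
def load_reswords(lines):
    """Collect reserved words from reswords.txt-style lines,
    skipping [database] section headers."""
    words = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("["):
            words.add(line.upper())
    return words

def quote_identifier(name: str, reswords: set) -> str:
    # Place quotes around identifiers that match a reserved word.
    return f'"{name}"' if name.upper() in reswords else name

sample = ["[Oracle]", "OPTION", "START", "[ODBC]", "MONTH"]
reswords = load_reswords(sample)
print(quote_identifier("MONTH", reswords))   # "MONTH"
print(quote_identifier("AMOUNT", reswords))  # AMOUNT
```

For the quoted identifiers to be accepted, the target database must honor SQL-92 quoting, which is why the guide suggests environment SQL such as SET QUOTED_IDENTIFIER ON.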
Figure 9-9. Properties Settings on the Mapping Tab for a Flat File Target
Table 9-5 describes the properties you define in the Properties settings for flat file target
definitions:
♦ Merge Partitioned Files (optional). When selected, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. If the PowerCenter Server fails to create the merged file, it does not delete the individual output files. You cannot merge files if the session uses FTP, an external loader, or a message queue. For details on configuring a session for partitioning, see “Pipeline Partitioning” on page 345.
♦ Merge File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes the merged file in the server variable directory, $PMTargetFileDir. If you enter a full directory and file name in the Merge File Name field, clear this field.
♦ Merge File Name (optional). Name of the merge file. Default is target_name.out. This property is required if you select Merge Partitioned Files.
♦ Output File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes output files in the server variable directory, $PMTargetFileDir. If you specify both the directory and file name in the Output Filename field, clear this field. The PowerCenter Server concatenates this field with the Output Filename field when it runs the session. You can also use the $OutputFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Output Filename (required). Enter the file name, or file name and path. By default, the Workflow Manager names the target file based on the target definition used in the mapping: target_name.out. If the target definition contains a slash character, the Workflow Manager replaces the slash character with an underscore. When you use an external loader to load to an Oracle database, you must specify a file extension. If you do not specify a file extension, the Oracle loader cannot find the flat file and the PowerCenter Server fails the session. For more information about external loading, see “Loading to Oracle” on page 533. Optionally use the $OutputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Output File Directory field when it runs the session. For details on session parameters, see “Session Parameters” on page 495. Note: If you specify an absolute path file name when using FTP, the PowerCenter Server ignores the Default Remote Directory specified in the FTP connection. When you specify an absolute path file name, do not use single or double quotes.
♦ Reject File Directory (optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see “Session Parameters” on page 495.
♦ Reject Filename (required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see “Session Parameters” on page 495.
♦ Set File Properties (optional). Opens a dialog box that allows you to define flat file properties. For more information, see “Configuring Fixed-Width Properties” on page 265 and “Configuring Delimited Properties” on page 266. When you output to a flat file using a relational target definition in the mapping, make sure you define the flat file properties by clicking the Set File Properties link.
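The Merge Partitioned Files behavior can be sketched as follows; the function is an illustration of the rule that individual output files are deleted only when the merge succeeds, not PowerCenter code:

```python
import os

def merge_partition_files(partition_files, merge_file):
    """Concatenate per-partition output files into one merge file,
    deleting the individual files only if the merge succeeds."""
    try:
        with open(merge_file, "wb") as out:
            for path in partition_files:
                with open(path, "rb") as part:
                    out.write(part.read())
    except OSError:
        # On failure, keep the individual output files.
        return False
    for path in partition_files:
        os.remove(path)
    return True
```

On success the individual partition files are removed and only the merge file remains; on any I/O failure, the partition files are left in place for inspection.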
Test Load Options
Table 9-6 describes the test load options in the General Options settings on the Properties
tab:
♦ Enable Test Load (optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. Note: You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.
♦ Number of Rows to Test (optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the number you configure for the test load.
To edit the fixed-width properties, select Fixed Width and click Advanced.
Figure 9-12 shows the Fixed Width Properties dialog box:
♦ Null Character (required). Enter the character you want the PowerCenter Server to use to represent null values. You can enter any valid character in the file code page. For more information about using null characters for target files, see “Null Characters in Fixed-Width Files” on page 272.
♦ Repeat Null Character (optional). Select this option to indicate a null value by repeating the null character to fill the field. If you do not select this option, the PowerCenter Server enters a single null character at the beginning of the field to represent a null value. For more information about specifying null characters for target files, see “Null Characters in Fixed-Width Files” on page 272.
♦ Code Page (required). Select the code page of the fixed-width file. The default setting is the client code page.
Table 9-8 describes the options you can define in the Delimited File Properties dialog box:
♦ Delimiters (required). Character used to separate columns of data. Use the button to the right of this field to enter a non-printable delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters.
♦ Optional Quotes (required). Select None, Single, or Double. If you select a quote character, the PowerCenter Server does not treat delimiter characters within the quote characters as a delimiter. For example, suppose an output file uses a comma as a delimiter and the PowerCenter Server receives the following row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and writes the row as four fields. If you do not select the optional single quote, the PowerCenter Server writes six separate fields.
♦ Code Page (required). Select the code page of the delimited file. The default setting is the client code page.
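The optional quote behavior can be demonstrated with Python's csv module as an analogy (not PowerCenter code): with a single-quote character the example row parses as four fields; without one, every comma splits a field.

```python
import csv
import io

row = "342-3849,'Smith, Jenna','Rockville, MD',6"

# With the optional single quote, commas inside quotes are not delimiters.
quoted = next(csv.reader(io.StringIO(row), quotechar="'"))
print(len(quoted))    # 4

# With no quote character, every comma separates a field.
unquoted = next(csv.reader(io.StringIO(row), quoting=csv.QUOTE_NONE))
print(len(unquoted))  # 6
```

The quoted parse keeps “Smith, Jenna” and “Rockville, MD” intact as single fields, matching the four-field output described above.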
Table 9-10 lists field length measurements for fixed-width flat file targets; for string columns, the field length is measured by the precision.
Table 9-11 lists the characters you must account for when you configure the precision or
field width for flat file target definitions to cover the total length of the target field:
Table 9-11. Characters to Include when Calculating Field Length for Fixed-Width Targets
Datetime - Date and time separators, such as slashes (/), dashes (-), and colons (:).
For example, the format MM/DD/YYYY HH24:MI:SS has a total length of 19 bytes.
When you edit the flat file target definition in the mapping, define the precision or field
width great enough to accommodate both the target data and the characters in Table 9-11.
For example, suppose you have a mapping with a fixed-width flat file target definition. The
target definition contains a number column with a precision of 10 and a scale of 2. You use a
comma as the decimal separator and a period as the thousands separator. You know some rows
of data might have a negative value. Based on this information, you know the longest possible
number is formatted with the following format:
-NN.NNN.NNN,NN
Open the flat file target definition in the mapping and define the field width for this number
column as a minimum of 14 bytes.
For more information on formatting numeric and datetime values, see “Working with Flat
Files” in the Designer Guide.
Notation - Description
A - Double-byte character
-o - Shift-out character
-i - Shift-in character
For the first target column, the PowerCenter Server writes only three of the double-byte
characters to the target. It cannot write any additional double-byte characters to the output
column because the column must end in a single-byte character. If you add two more bytes to
the first target column definition, then the PowerCenter Server can add shift characters and
write all the data without truncation.
For the second target column, the PowerCenter Server writes all four single-byte characters to
the target. It does not write shift characters to the column because the column begins and
ends with single-byte characters.
Character Set
You can configure the PowerCenter Server to run sessions with flat file targets in either ASCII
or Unicode data movement mode.
If you configure a session with a flat file target to run in Unicode data movement mode, the
target file code page must be a superset of the PowerCenter Server code page and the source
code page. Delimiters, escape, and null characters must be valid in the specified code page of
the flat file.
If you configure a session to run in ASCII data movement mode, delimiters, escape, and null
characters must be valid in the ISO Western European Latin1 code page. Any 8-bit character
you specified in previous versions of PowerCenter is still valid.
The column width for ITEM_ID is six. When you enable the Output Metadata For Flat File
Target option, the PowerCenter Server writes the following text to a flat file:
#ITEM_ITEM_NAME PRICE
100001Screwdriver 9.50
100002Hammer 12.90
For information about configuring the PowerCenter Server to output flat file metadata, see
the Installation and Configuration Guide.
Understanding Commit Points
This chapter covers the following topics:
♦ Overview, 276
♦ Target-Based Commits, 277
♦ Source-Based Commits, 278
♦ User-Defined Commits, 283
♦ Understanding Transaction Control, 287
♦ Setting Commit Properties, 292
Overview
A commit interval is the interval at which the PowerCenter Server commits data to targets
during a session. The commit point can be a factor of the commit interval, the commit
interval type, and the size of the buffer blocks. The commit interval is the number of rows
you want to use as a basis for the commit point. The commit interval type is the type of rows
that you want to use as a basis for the commit point. You can choose between the following
commit types:
♦ Target-based commit. The PowerCenter Server commits data based on the number of
target rows and the key constraints on the target table. The commit point also depends on
the buffer block size, the commit interval, and the PowerCenter Server configuration for
writer timeout.
♦ Source-based commit. The PowerCenter Server commits data based on the number of
source rows. The commit point is the commit interval you configure in the session
properties.
♦ User-defined commit. The PowerCenter Server commits data based on transactions
defined in the mapping properties. You can also configure some commit and rollback
options in the session properties.
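The difference between target-based and source-based row counting can be sketched as follows. This is a simplified illustration, not the server's implementation; `commit_points` and its arguments are invented for the example, and the target-based branch ignores buffer block size and writer timeout, which the text notes also affect the commit point:

```python
def commit_points(source_rows, dropped, interval, commit_type):
    """Model where interval commits fall for the two counting modes.

    source_rows: rows produced by the active source
    dropped: rows removed by transformation logic before the target
    interval: configured commit interval
    Returns the row counts at which an interval commit is issued.
    """
    points = []
    if commit_type == "source":
        # Source-based: count rows leaving the active source.
        for n in range(interval, source_rows + 1, interval):
            points.append(n)
    elif commit_type == "target":
        # Target-based: count rows reaching the target; dropped rows
        # never arrive, so commits fall later relative to the source.
        target_rows = source_rows - dropped
        for n in range(interval, target_rows + 1, interval):
            points.append(n)
    return points

# Source-based commit, interval 10,000, 40,000 source rows:
print(commit_points(40000, 0, 10000, "source"))
# Target-based commit where 3,000 of 10,000 rows are dropped
# (7,000 target rows never reach the interval, so no interval commit):
print(commit_points(10000, 3000, 10000, "target"))
```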
Source-based and user-defined commit sessions have partitioning restrictions. If you
configure a session with multiple partitions to use source-based or user-defined commit, you
can only choose pass-through partitioning at certain partition points in a pipeline. For more
information, see “Specifying Partition Types” on page 356.
The PowerCenter Server might commit fewer rows to the target than the number of rows
produced by the active source. For example, you have a source-based commit session that
passes 10,000 rows through an active source, and 3,000 rows are dropped due to
transformation logic. The PowerCenter Server issues a commit to the target when the 7,000
remaining rows reach the target.
The number of rows held in the writer buffers does not affect the commit point for a source-
based commit session. For example, you have a source-based commit session with a commit
interval of 10,000 that passes rows through an active source. When the first 10,000 rows
reach the targets, the PowerCenter Server issues a commit, regardless of how many rows
remain in the writer buffers. If the session completes successfully, the PowerCenter
Server issues commits after 10,000, 20,000, 30,000, and 40,000 source rows.
If the targets are in the same transaction control unit, the PowerCenter Server commits data
to the targets at the same time. If the session fails or aborts, the PowerCenter Server rolls back
all uncommitted data in a transaction control unit to the same source row.
If the targets are in different transaction control units, the PowerCenter Server performs the
commit when each target receives the commit row. If the session fails or aborts, the
PowerCenter Server rolls back each target to the last commit point. It might not roll back to
the same source row for targets in separate transaction control units. For more information on
transaction control units, see “Understanding Transaction Control Units” on page 289.
Note: Source-based commit may slow session performance if the session uses a one-to-one
mapping. A one-to-one mapping is a mapping that moves data from a Source Qualifier, XML
Source Qualifier, or Application Source Qualifier transformation directly to a target. For
more information about performance, see “Performance Tuning” on page 635.
Transformation Scope
property is All Input.
The mapping contains a target load order group with one source pipeline that branches from
the Source Qualifier transformation to two targets. One pipeline branch contains an
Aggregator transformation with the All Input transformation scope, and the other contains an
Expression transformation. The PowerCenter Server identifies the Source Qualifier
transformation as the commit source for t_monthly_sales and the Aggregator as the commit
source for T_COMPANY_ALL. It performs a source-based commit for both targets, but uses
a different commit source for each.
Connected to an XML Source Qualifier transformation with multiple connected output
groups. PowerCenter Server uses target-based commit when loading to these targets.
Connected to an active source that generates commits, AGG_Sales. PowerCenter Server
uses source-based commit when loading to this target.
This mapping contains an XML Source Qualifier transformation with multiple output groups
connected downstream. Because you connect multiple output groups downstream, the XML
Source Qualifier transformation does not generate commits. You connect the XML Source
Qualifier transformation to two relational targets, T_STORE and T_PRODUCT. Therefore,
these targets do not receive any commit generated by an active source. The PowerCenter
Server uses target-based commit when loading to these targets.
However, the mapping includes an active source that generates commits, AGG_Sales, between
the XML Source Qualifier transformation and T_YTD_SALES. The PowerCenter Server uses
source-based commit when loading to T_YTD_SALES.
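The rule this example describes, that a target uses source-based commit only when an upstream active source generates commits and otherwise falls back to target-based commit, can be sketched as follows (a hypothetical model; the function and data shapes are invented for illustration):

```python
def commit_type_for_target(upstream_active_sources):
    """Pick the commit handling for one target, following the rule in
    the text: if any active source upstream of the target generates
    commits, the server uses source-based commit with that source as
    the commit source; otherwise it uses target-based commit.

    upstream_active_sources: list of (name, generates_commits) pairs,
    ordered from the target upward. Names here are illustrative.
    """
    for name, generates_commits in upstream_active_sources:
        if generates_commits:
            return ("source-based", name)
    return ("target-based", None)

# T_STORE sits below an XML Source Qualifier with multiple connected
# output groups, which does not generate commits:
print(commit_type_for_target([("XMLSQ_Orders", False)]))
# T_YTD_SALES sits below AGG_Sales, which generates commits:
print(commit_type_for_target([("AGG_Sales", True), ("XMLSQ_Orders", False)]))
```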
When the PowerCenter Server writes all rows in a transaction to all targets, it issues commits
sequentially for each target.
The PowerCenter Server rolls back data based on the return value of the transaction control
expression or error handling configuration. If the transaction control expression returns a
rollback value, the PowerCenter Server rolls back the transaction. If an error occurs, you can
choose to roll back or commit at the next commit point.
If the transaction control expression evaluates to a value other than commit, rollback, or
continue, the PowerCenter Server fails the session. For more information about valid values,
see “Transaction Control Transformation” in the Transformation Guide.
When the session completes, the PowerCenter Server may write data to the target that was not
bound by commit rows. You can choose to commit at end of file or to roll back that open
transaction.
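The evaluation rules above can be sketched as a small loop. This is an illustrative model, not PowerCenter's writer logic; the function name and row format are invented, and a real session commits and rolls back per target rather than against a single list:

```python
def handle_transaction(rows, expression, commit_on_eof=True):
    """Sketch of user-defined commit handling as described in the text.

    expression(row) must return "commit", "rollback", or "continue";
    any other value fails the session. Rows remaining after the last
    commit row form an open transaction, committed at end of file
    only when commit_on_eof is set.
    """
    committed, pending = [], []
    for row in rows:
        pending.append(row)
        action = expression(row)
        if action == "commit":
            committed.extend(pending)
            pending = []
        elif action == "rollback":
            pending = []            # roll back the open transaction
        elif action != "continue":
            raise RuntimeError("invalid transaction control value: session fails")
    if pending and commit_on_eof:
        committed.extend(pending)   # commit the open transaction at EOF
    return committed

# Each row carries its transaction control result in the second field:
rows = [(1, "continue"), (2, "commit"), (3, "rollback"), (4, "continue")]
print(handle_transaction(rows, lambda r: r[1]))
```

With Commit on End of File cleared (`commit_on_eof=False`), row 4 stays in an open transaction and is rolled back instead of committed.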
Note: If you use bulk loading with a user-defined commit session, the target may not recognize
the transaction boundaries. If the target connection group does not support transactions, the
PowerCenter Server writes the following message to the session log:
WRT_8234 Warning: Target Connection Group’s connection doesn’t support
transactions. Targets may not be loaded according to specified transaction
boundaries rules.
Rollback Evaluation
If the transaction control expression returns a rollback value, the PowerCenter Server rolls
back the transaction and writes a message to the session log indicating that the transaction
was rolled back. It also indicates how many rows were rolled back.
The following message is a sample message that the PowerCenter Server writes to the session
log when the transaction control expression returns a rollback value:
WRITER_1_1_1> WRT_8326 User-defined rollback processed
WRT_8162 ===================================================
WRT_8330 Rolled back [333] inserted, [0] deleted, [0] updated rows for the
target [TCustOrders]
The following message is a sample message indicating that Commit on End of File is enabled
in the session properties:
WRITER_1_1_1> WRT_8143
The following row indicators appear in the reject file for rolled-back transactions in a
failed transaction control unit:
4 Rolled-back insert
5 Rolled-back update
6 Rolled-back delete
Note: The PowerCenter Server does not roll back a transaction if it encounters an error before
it processes any row through the Transaction Control transformation.
The following table describes row indicators in the reject file for committed transactions in a
failed transaction control unit:
7 Committed insert
8 Committed update
9 Committed delete
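Putting the two lists together, the reject-file row indicators written around a failed transaction control unit can be decoded with a small lookup. The assumption that the indicator is the first comma-separated field of each reject-file line is for illustration only:

```python
# Row indicator codes listed in the text for reject-file entries
# written around a failed transaction control unit.
ROW_INDICATORS = {
    4: "Rolled-back insert",
    5: "Rolled-back update",
    6: "Rolled-back delete",
    7: "Committed insert",
    8: "Committed update",
    9: "Committed delete",
}

def describe_reject_row(line, sep=","):
    """Return the row-indicator description for one reject-file line.

    Assumes the indicator is the first delimited field; treat the
    exact reject-file layout as an assumption of this sketch.
    """
    code = int(line.split(sep, 1)[0])
    return ROW_INDICATORS.get(code, "Unknown indicator")

print(describe_reject_row("7,100001,Screwdriver,9.50"))
```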
Transformation Scope
You can configure how the PowerCenter Server applies the transformation logic to incoming
data with the Transformation Scope transformation property. When the PowerCenter Server
processes a transformation, it either drops transaction boundaries or preserves transaction
boundaries, depending on the transformation scope and the mapping configuration.
You can choose one of the following values for the transformation scope:
♦ Row. Applies the transformation logic to one row of data at a time. Choose Row when a
row of data does not depend on any other row. When you choose Row for a
Transaction Control Unit 1
Target Connection Group 2
Note that T5_ora1 uses the same connection name as T1_ora1 and T2_ora1. Because
T5_ora1 is connected to a separate Transaction Control transformation, it is in a separate
transaction control unit and target connection group. If you connect T5_ora1 to
tc_TransactionControlUnit1, it will be in the same transaction control unit as all targets, and
in the same target connection group as T1_ora1 and T2_ora1.
Table 10-2 describes the session commit properties that you set in the General Options
settings of the Properties tab: Commit Type, Commit Interval, Commit on End of File, and
Roll Back Transactions on Error.
Commit on End of File behaves as follows for each commit type:
♦ Target-based commit. Commits data at the end of the file. Enabled by default. You
cannot disable this option.
♦ Source-based commit. Commits data at the end of the file. Clear this option if you want
the PowerCenter Server to roll back open transactions.
♦ User-defined commit. Commits data at the end of the file. Clear this option if you want
the PowerCenter Server to roll back open transactions.
Recovering Data
Overview
If you stop a session or if an error causes a session to stop unexpectedly, refer to the session
logs to determine the cause of the failure. Correct the errors, and then complete the session.
The method you use to complete the session depends on the configuration of the mapping
and the session, the specific failure, and how much progress the session made before it failed.
If the PowerCenter Server did not commit any data, run the session again. If the session
issued at least one commit and is recoverable, consider running the session in recovery mode.
Recovery allows you to restart a failed session and complete it as if the session had run
without pause. When the PowerCenter Server runs in recovery mode, it continues to commit
data from the point of the last successful commit. For more information on PowerCenter
Server processing during recovery, see “Server Handling for Recovery” on page 314.
All recovery sessions run as part of a workflow. When you recover a session, you also have the
option to run part of the workflow. Consider the configuration and design of the workflow
and the status of other tasks in the workflow before you choose a method of recovery.
Depending on the configuration and status of the workflow and session, you can choose one
or more of the following recovery methods:
♦ Recover a suspended workflow. If the workflow suspends due to session failure, you can
recover the failed session and resume the workflow. For details, see “Recovering a
Suspended Workflow” on page 305.
♦ Recover a failed workflow. If the workflow fails as a result of session failure, you can
recover the session and run the rest of the workflow. For details, see “Recovering a Failed
Workflow” on page 308.
♦ Recover a session task. If the workflow completes, but a session fails, you can recover the
session alone without running the rest of the workflow. You can also use this method to
recover multiple failed sessions in a branched workflow. For details, see “Recovering a
Session Task” on page 311.
For more information on session failure, see “Stopping and Aborting a Session” on page 200.
REP_GID VARCHAR(240)
WFLOW_ID NUMBER
SUBJ_ID NUMBER
TASK_INST_ID NUMBER
TGT_INST_ID NUMBER
PARTITION_ID NUMBER
TGT_RUN_ID NUMBER
RECOVERY_VER NUMBER
CHECK_POINT NUMBER
ROW_COUNT NUMBER
LAST_TGT_RUN_ID NUMBER
Note: If you manually create the PM_TGT_RUN_ID table, you must specify a value other
than zero in the LAST_TGT_RUN_ID column to ensure that the session runs successfully in
recovery mode.
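Conceptually, recovery resumes from the last successful commit by skipping rows that were already written, as sketched below. This is a simplified model; treating a single stored row count (such as the ROW_COUNT column above) as the checkpoint is an assumption made for illustration:

```python
def rows_to_process_in_recovery(all_rows, last_commit_row_count):
    """Sketch of recovery-mode processing: resume from the point of
    the last successful commit by skipping already-committed rows.

    last_commit_row_count stands in for a stored checkpoint (for
    example, ROW_COUNT in the recovery tables above); this mapping
    is an assumption of the sketch, not documented server behavior.
    """
    if last_commit_row_count == 0:
        # No commit was issued: simply run the session again.
        return list(all_rows)
    return list(all_rows)[last_commit_row_count:]

rows = list(range(1, 11))
# Six rows committed before the failure; recovery processes the rest:
print(rows_to_process_in_recovery(rows, 6))
```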
Code Description
12 The PowerCenter Server cannot start recovery because the session or workflow is scheduled, suspending,
waiting for an event, waiting, initializing, aborting, stopping, disabled, or running.
19 The PowerCenter Server cannot start the session in recovery mode because the workflow is configured to run
continuously.
For details on additional pmcmd return codes, see “pmcmd Return Codes” on page 590.
Transformation Produces Repeatable Data
Aggregator Always
Rank Always
Union Never
To run a session in recovery mode, you must first enable the failed session for recovery.
When you enable a session for recovery, the Workflow Manager verifies that all targets in the
mapping receive data from transformations that produce repeatable data. The Workflow
Manager uses the values in Table 11-4 to determine whether you can enable a session for
recovery. However, the Workflow Manager cannot verify whether you configure some
transformations, such as the Sequence Generator transformation, correctly, and it always
allows you to enable these sessions for recovery. You may get inconsistent results if you do
not configure these transformations correctly.
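The verification the Workflow Manager performs can be modeled as propagating a "repeatable" flag along a pipeline path using Table 11-4 values. This single-path sketch is illustrative only; the transformation names and the inherits rule for Lookup and Expression follow the examples in the text:

```python
# Whether a transformation produces repeatable output, per Table 11-4:
# True = always, False = never, "inherits" = repeatable only if the
# input is repeatable (Lookup and Expression behave this way per the
# examples in the text; the Custom entry follows the second example).
REPEATABLE = {
    "Aggregator": True,
    "Rank": True,
    "Union": False,
    "Custom": False,
    "Lookup": "inherits",
    "Expression": "inherits",
    "Sorter(distinct)": True,   # distinct sorted output is repeatable
}

def target_receives_repeatable(path):
    """Walk one source-to-target path and report whether the target
    receives repeatable data, so the session can be enabled for
    recovery. A simplified single-path model of the check.
    """
    repeatable = True   # assume the source output starts repeatable
    for transformation in path:
        rule = REPEATABLE[transformation]
        if rule != "inherits":
            repeatable = rule   # "always" or "never" overrides input
    return repeatable

# The first mapping example: Aggregator feeding Lookup and Expression.
print(target_receives_repeatable(["Aggregator", "Lookup", "Expression"]))
# The second example: Union and Custom never produce repeatable data.
print(target_receives_repeatable(["Union", "Custom", "Lookup"]))
# Adding a Sorter configured for distinct rows after the Custom
# transformation makes the session recoverable again:
print(target_receives_repeatable(["Union", "Custom", "Sorter(distinct)"]))
```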
The mapping contains an Aggregator transformation that always produces repeatable data.
The Aggregator transformation provides data for the Lookup and Expression transformations.
Lookup and Expression transformations produce repeatable data if they receive repeatable
data. Therefore, the target receives repeatable data, and you can enable this session for
recovery.
The mapping contains two Source Qualifier transformations that produce repeatable data.
However, the mapping contains a Union and Custom transformation downstream that never
produce repeatable data. The Lookup transformation only produces repeatable data if it
receives repeatable data. Therefore, the target does not receive repeatable data, and you
cannot enable this session for recovery.
You can modify this mapping to enable the session for recovery by adding a Sorter
transformation configured for distinct output rows immediately after transformations that
never output repeatable data. Since the Union transformation is connected directly to another
transformation that never produces repeatable data, you only need to add a Sorter
transformation after the Custom transformation, as shown in the mapping in Figure 11-3:
Example
Suppose the workflow w_ItemOrders contains two sequential sessions. In this workflow,
s_ItemSales is enabled for recovery, and the workflow is configured to suspend on error.
Suppose s_ItemSales fails, and the PowerCenter Server suspends the workflow. You correct the
error and resume the workflow in recovery mode. The PowerCenter Server recovers the
session successfully, and then runs s_UpdateOrders.
If s_UpdateOrders also fails, the PowerCenter Server suspends the workflow again. You
correct the error, but you cannot resume the workflow in recovery mode because you did not
enable the session for recovery. Instead, you resume the workflow. The PowerCenter Server
starts s_UpdateOrders from the beginning, completes the session successfully, and then runs
the StopWorkflow control task.
Example
Suppose you have the workflow w_ItemsDaily, containing three concurrent sessions,
s_SupplierInfo, s_PromoItems, and s_ItemSales. In this workflow, s_SupplierInfo and
s_PromoItems are enabled for recovery, and the workflow is configured to suspend on error.
Workflow
configured to
suspend on error.
Suppose s_SupplierInfo fails while the PowerCenter Server is running the three sessions. The
PowerCenter Server places the workflow in a suspending state and continues running the
other two sessions. s_PromoItems and s_ItemSales also fail, and the PowerCenter Server then
places the workflow in a suspended state.
You correct the errors that caused each session to fail and then resume the workflow in
recovery mode. The PowerCenter Server starts s_SupplierInfo and s_PromoItems in recovery
mode. Since s_ItemSales is not enabled for recovery, the PowerCenter Server restarts that
session from the beginning.
The PowerCenter Server runs the three sessions concurrently.
After all sessions succeed, the PowerCenter Server runs the Command task.
Example
Suppose the workflow w_ItemOrders contains two sequential sessions. s_ItemSales is enabled
for recovery and also configured to fail the parent workflow if it fails.
Figure 11-6 illustrates w_ItemOrders:
Example
Suppose the workflow w_ItemsDaily contains three concurrent sessions, s_SupplierInfo,
s_PromoItems, and s_ItemSales. In this workflow, each session is enabled for recovery and
configured to fail the parent workflow if the session fails.
Figure 11-7 illustrates w_ItemsDaily:
1. Select the failed session in the Navigator or in the Workflow Designer workspace.
2. Right-click the failed session and choose Recover Workflow from Task.
The PowerCenter Server runs the failed session in recovery mode, and then runs the rest
of the workflow.
Suppose s_ItemSales fails and the PowerCenter Server fails the workflow. s_PromoItems and
s_SupplierInfo also fail. You correct the errors that caused the sessions to fail.
After you correct the errors, you individually recover each failed session. The PowerCenter
Server successfully recovers the sessions. The workflow paths after the sessions converge at the
Command task, allowing you to start the workflow from the Command task and complete
the workflow.
Alternatively, after you correct the errors, you could also individually recover two of the three
failed sessions. After the PowerCenter Server successfully recovers the sessions, you can
recover the workflow from the third session. The PowerCenter Server then recovers the third
session and, on successful recovery, runs the rest of the workflow.
1. Select the failed session in the Navigator or in the Workflow Designer workspace.
2. Right-click the failed session and choose Recover Task.
The PowerCenter Server runs the session in recovery mode.
Running Recovery
If a session enabled for recovery fails, you can run the session in recovery mode. The
PowerCenter Server moves a recovery session through the states of a normal session:
scheduled, waiting, running, succeeded, and failed. When the PowerCenter Server starts the
recovery session, it runs all pre-session tasks.
Sending Email
Overview
You can send email to designated recipients when the PowerCenter Server runs a workflow.
For example, if you want to track how long a session takes to complete, you can configure the
session to send an email containing the time and date the session starts and completes. Or, if
you want the PowerCenter Server to notify you when a workflow suspends, you can configure
the workflow to send email when it suspends.
When you create a workflow or worklet, you can include the following types of email:
♦ Email task. You can include reusable and non-reusable Email tasks anywhere in the
workflow or worklet. For more information, see “Using Email Tasks in a Workflow or
Worklet” on page 341.
♦ Post-session email. You can configure the session so the PowerCenter Server sends an
email when the session completes or fails. You create an Email task and use it for post-
session email. For more information, see “Working with Post-Session Email” on page 332.
When you configure the subject and body of post-session email, you can use email
variables to include information about the session run, such as session name, status, and
the total number of records loaded. You can also use email variables to attach the session
log or other files to email messages. For more information, see “Email Variables and
Format Tags” on page 333.
♦ Suspension email. You can configure the workflow so the PowerCenter Server sends an
email when the workflow suspends. You create an Email task and use it for suspension
email. For more information, see “Working with Suspension Email” on page 339.
Before you can configure a session or workflow to send email, you need to create an Email
task. For more information, see “Working with Email Tasks” on page 328.
The PowerCenter Server on Windows sends email in MIME format. This allows you to
include characters in the subject and body that are not in 7-bit ASCII. For more information
on the MIME format or the MIME decoding process, see your email documentation.
Before creating Email tasks, configure the PowerCenter Server to send email. For more
information, see “Configuring Email on UNIX” on page 321 and “Configuring Email on
Windows” on page 322.
1. Log on to the UNIX system as the Informatica user who starts the PowerCenter Server.
2. Type the following lines at the prompt and press Enter:
rmail <your fully qualified email address>,<second fully
qualified email address>
From <your_user_name>
1. Log on to the UNIX system as the Informatica user who starts the PowerCenter Server.
2. Type the following line at the prompt and press Enter:
rmail <your fully qualified email address>,<second fully
qualified email address>
3. To indicate the end of the message, type . on a line of its own and press Enter.
Or, type ^D.
You should receive a blank email from the email account of the Informatica user. If not,
locate the directory where rmail resides and add that directory to the path.
Once you verify that rmail is installed correctly, you can send email. For more information on
configuring email, see “Working with Email Tasks” on page 328.
1. Open the Control Panel on the machine running the PowerCenter Server.
2. Double-click the Mail (or Mail and Fax) icon.
3. On the Services tab of the user Properties dialog box, click Show Profiles.
The Mail dialog box displays the list of profiles configured for the computer.
4. If you have a Microsoft Outlook profile set up for the Informatica Service startup
account, skip to “Step 3. Configure Logon Network Security” on page 325. If you do not
already have a Microsoft Outlook profile set up for the Informatica Service startup
account, continue to the next step.
5. Click Add in the mail properties window.
The Microsoft Outlook Setup Wizard appears.
7. Enter a profile name. You can enter any name, but Informatica recommends that you
enter a text string that matches the Informatica Service startup account. Click Next.
11. Indicate whether you want to run Outlook when you start Windows. Click Next.
12. The Setup Wizard indicates that you have successfully configured an Outlook profile.
13. Click Finish.
1. Open the Control Panel on the machine running the PowerCenter Server.
2. Double-click the Mail (or Mail and Fax) icon. The User Properties sheet appears.
4. Click the Advanced tab. Set the Logon network security option to NT Password
Authentication.
5. Click OK.
1. In the Task Developer, choose Tasks-Create. The Create Task dialog box appears.
2. Select an Email task and enter a name for the task. Click Create.
The Workflow Manager creates an Email task in the workspace.
3. Click Done.
8. Enter the fully qualified email address of the mail recipient in the Email User Name field.
For more information on entering the email address, see “Email Address Tips and
Guidelines” on page 328.
11. Enter the text of the email message in the Email Editor.
When you use the Email task, you can incorporate format tags in your message. For more
information, see “Email Variables and Format Tags” on page 333.
You can leave the Email Text field blank.
12. Click OK twice to save your changes.
Use a reusable Email task.
Select a reusable Email task.
Use a non-reusable Email task.
You can specify a reusable Email task you create in the Task Developer for either success email
or failure email. Or, you can create a non-reusable Email task for each session property. When
you create a non-reusable Email task for the session property, you create the Email task for
that session only. You cannot use the Email task in the workflow or worklet.
%s Session name.
%e Session status.
%t Source and target table details, including read throughput in bytes per second and write throughput
in rows per second. The PowerCenter Server includes all information displayed in the session detail
dialog box.
%a<filename> Attach the named file. The file must be local to the PowerCenter Server. The following are valid file
names: %a<c:\data\sales.txt> or %a</users/john/data/sales.txt>.
Note: The file name cannot include the greater than character (>) or a line break.
Note: The PowerCenter Server ignores %a, %g, or %t when you include them in the email subject. Include these variables in the email
message only.
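Variable substitution of this kind can be sketched as simple template replacement. This is not the PowerCenter implementation; the function is hypothetical, and it applies the rule from the note above that %a, %g, and %t are ignored in the email subject:

```python
# Email variables the server ignores in the subject, per the note above.
BODY_ONLY = {"%a", "%g", "%t"}

def expand_email(template, values, is_subject=False):
    """Substitute post-session email variables into a template.

    A simplified sketch: real variable handling (for example,
    attaching a file via %a<filename>) is more involved than plain
    string replacement.
    """
    out = template
    for var, value in values.items():
        if is_subject and var in BODY_ONLY:
            out = out.replace(var, "")   # ignored in subjects
        else:
            out = out.replace(var, value)
    return out

values = {"%s": "s_ItemSales", "%e": "Completed"}
print(expand_email("Session name: %s, status: %e", values))
```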
Table 12-2 lists the format tags you can use in an Email task:
tab \t
new line \n
2. Select Reusable in the Type column for the success email or failure email field.
3. Click the Open button in the Value column to select the reusable Email task.
2. Select Non-Reusable in the Type column for the success email or failure email field.
4. Edit the Email task and click OK. For more information on editing Email tasks, see
“Working with Email Tasks” on page 328.
5. Click OK to close the session properties.
Sample Email
The following is user-entered text from a sample post-session email configuration using
variables:
Session complete.
Session name: %s
%l
%r
%e
%b
%c
%i
%g
Completed
Note: The Workflow Manager returns an error message if you do not have any reusable
Email tasks in the folder. Create a reusable Email task in the folder before you configure
suspension email.
5. Choose a reusable Email task and click OK.
6. Click OK to close the workflow properties.
Configure the gen_report Command task to execute a shell script that generates the report.
Verify the shell script saves the report to a directory local to the PowerCenter Server.
Configure the em_report Email task to attach the file generated from the shell script.
Chapter 13
Pipeline Partitioning
This chapter covers the following subjects:
♦ Overview, 346
♦ Configuring Partitioning Information, 351
♦ Cache Partitioning, 359
♦ Round-Robin Partition Type, 360
♦ Hash Keys Partition Types, 361
♦ Key Range Partition Type, 363
♦ Pass-Through Partition Type, 367
♦ Database Partitioning Partition Type, 369
♦ Partitioning Relational Sources, 371
♦ Partitioning File Sources, 374
♦ Partitioning Relational Targets, 378
♦ Partitioning File Targets, 380
♦ Partitioning Joiner Transformations, 384
♦ Partitioning Lookup Transformations, 391
♦ Partitioning Sorter Transformations, 392
♦ Mapping Variables in Partitioned Pipelines, 394
♦ Partitioning Rules, 395
Overview
You create a session for each mapping you want the PowerCenter Server to run. Every
mapping contains one or more source pipelines. A source pipeline consists of a source
qualifier and all the transformations and targets that receive data from that source qualifier.
If you purchase the Partitioning option, you can specify partitioning information for each
source pipeline in a mapping. The partitioning information for a pipeline controls the
following factors:
♦ The number of reader, transformation, and writer threads that the master thread creates
for the pipeline. For more information, see “Understanding Processing Threads” on
page 14.
♦ How the PowerCenter Server reads data from the source, including the number of
connections to the source.
♦ How the PowerCenter Server distributes rows of data to each transformation as it processes
the pipeline.
♦ How the PowerCenter Server writes data to the target, including the number of
connections to each target in the pipeline.
You can specify partitioning information for a pipeline by setting the following attributes:
♦ Location of partition points. Partition points mark the thread boundaries in a pipeline
and divide the pipeline into stages. The PowerCenter Server sets partition points at several
transformations in a pipeline by default. If you have the Partitioning option, you can
define other partition points. When you add partition points, you increase the number of
transformation threads, which can improve session performance. The PowerCenter Server
can redistribute rows of data at partition points, which can also improve session
performance. For more information on partition points, see “Partition Points” on
page 346.
♦ Number of partitions. A partition is a pipeline stage that executes in a single thread. If you
purchase the Partitioning option, you can set the number of partitions at any partition
point. When you add partitions, you increase the number of processing threads, which can
improve session performance. For more information, see “Number of Partitions” on
page 348.
♦ Partition types. The PowerCenter Server specifies a default partition type at each partition
point. If you purchase the Partitioning option, you can change the partition type. The
partition type controls how the PowerCenter Server redistributes data among partitions at
partition points. For more information, see “Partition Types” on page 348.
Partition Points
By default, the PowerCenter Server sets partition points at various transformations in the
pipeline. Partition points mark thread boundaries as well as divide the pipeline into stages. A
stage is a section of a pipeline between any two partition points. When you set a partition
point at a transformation, the new pipeline stage includes that transformation.
The PowerCenter Server creates the following default partition points, each with a default
partition type:
♦ Source Qualifier or Normalizer transformation. Pass-through. Controls how the
PowerCenter Server reads data from the source and passes data into the source qualifier.
♦ Rank and unsorted Aggregator transformations. Hash auto-keys. Ensures that the
PowerCenter Server groups rows properly before it sends them to the transformation.
♦ Target instances. Pass-through. Controls how the target instances pass data to the
targets.
If you purchase the Partitioning option, you can add partition points at other transformations
and delete some partition points.
Figure 13-1 shows the default partition points and pipeline stages for a simple mapping with
one source pipeline:
The mapping in Figure 13-1 contains four stages. The partition point at the source qualifier
marks the boundary between the first (reader) and second (transformation) stages. The
partition point at the Aggregator transformation marks the boundary between the second and
third (transformation) stages. The partition point at the target instance marks the boundary
between the third (transformation) and fourth (writer) stage.
When you add a partition point, you increase the number of pipeline stages by one. Similarly,
when you delete a partition point, you reduce the number of stages by one. For more
information, see “Understanding Processing Threads” on page 14.
Besides marking stage boundaries, partition points also mark the points in the pipeline where
the PowerCenter Server can redistribute data across partitions. For example, if you place a
partition point at a Filter transformation and define multiple partitions, the PowerCenter
Server can redistribute rows of data among the partitions before the Filter transformation
processes the data. The partition type you set at this partition point controls the way in which
the PowerCenter Server passes rows of data to each partition. For more information, see
“Partition Types” on page 348.
For more information on adding and deleting partition points, see “Adding and Deleting
Partition Points” on page 353.
Number of Partitions
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread.
By default, the PowerCenter Server defines a single partition in the source pipeline. If you
purchase the Partitioning option, you can increase the number of partitions. This increases
the number of processing threads, which can improve session performance.
For example, you need to use the mapping in Figure 13-1 to extract data from three flat files
of various sizes. To do this, you define three partitions at the source qualifier to read the data
simultaneously. When you do this, the Workflow Manager defines three partitions in the
pipeline.
Figure 13-2 shows the threads that the master thread creates for this mapping:
Figure 13-2. Threads Created for a Sample Mapping with Three Partitions
By default, the PowerCenter Server sets the number of partitions to one. You can generally
define up to 64 partitions at any partition point. However, there are situations in which you
can define only one partition in the pipeline. For more information, see “Restrictions on the
Number of Partitions” on page 395.
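The stage and thread arithmetic in this overview can be sketched as follows. This is a simplified model that assumes a uniform partition count at every partition point, so stages equal partition points plus one and each stage runs one reader, transformation, or writer thread per partition; it matches the four-stage example with three partitions:

```python
def session_threads(partition_points, partitions):
    """Rough thread arithmetic from the text: partition points divide
    the pipeline into stages, and each stage runs one thread per
    partition. Assumes the same partition count at every point.
    """
    stages = partition_points + 1   # adding a point adds one stage
    return stages * partitions

# The default mapping in Figure 13-1: 3 partition points (source
# qualifier, Aggregator, target instance), 4 stages, 1 partition.
print(session_threads(3, 1))
# The same mapping with 3 partitions, as in Figure 13-2: 3 reader,
# 6 transformation, and 3 writer threads.
print(session_threads(3, 3))
```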
Note: Increasing the number of partitions or partition points increases the number of threads.
Therefore, increasing the number of partitions or partition points also increases the load on
the server machine. If the server machine contains ample CPU bandwidth, processing rows of
data in a session concurrently can increase session performance. However, if you create a large
number of partitions or partition points in a session that processes large amounts of data, you
can overload the system.
For more information on adding and deleting partitions, see “Adding and Deleting Partitions”
on page 356.
Partition Types
When you configure the partitioning information for a pipeline, you must specify a partition
type at each partition point in the pipeline. The partition type determines how the
PowerCenter Server redistributes data across partition points.
The mapping in Figure 13-3 reads data about items and calculates average wholesale costs and
prices. The mapping must read item information from three flat files of various sizes, and
then filter out discontinued items. It sorts the active items by description, calculates the
average prices and wholesale costs, and writes the results to a relational database in which the
target tables are partitioned by key range.
When you use this mapping in a session, you can increase session performance by specifying
different partition types at the following partition points in the pipeline:
♦ Source qualifier. To read data from the three flat files concurrently, you must specify three
partitions at the source qualifier. Accept the default partition type, pass-through.
♦ Filter transformation. Since the source files vary in size, each partition processes a
different amount of data. Set a partition point at the Filter transformation, and choose
round-robin partitioning to balance the load going into the Filter transformation.
♦ Sorter transformation. To eliminate overlapping groups in the Sorter and Aggregator
transformations, use hash auto-keys partitioning at the Sorter transformation. This causes
the PowerCenter Server to group all items with the same description into the same
partition before the Sorter and Aggregator transformations process the rows. You can
delete the default partition point at the Aggregator transformation.
♦ Target. Since the target tables are partitioned by key range, specify key range partitioning
at the target to optimize writing data to the target.
For more information on specifying partition types, see “Specifying Partition Types” on
page 356.
The Partitions view of the Mapping tab displays the selected partition point in the
partitioning workspace. From this view, you can delete a partition point, specify key ranges,
and click Edit Keys.
Table 13-2. Options on Session Properties Partitions View on the Mapping Tab
Add Partition Point Click to add a new partition point in the mapping. When you add a partition point, the
transformation name appears under the Partition Points node.
Edit Partition Point Click to edit the selected partition point. This opens the Edit Partition Point dialog box. For
more information on the options in this dialog box, see Table 13-3 on page 353.
Key Range Displays the key and key ranges for the partition point, depending on the partition type.
For key range partitioning, you specify the key ranges.
For hash user keys partitioning, this field displays the partition key.
The Workflow Manager does not display this area for other partition types.
Edit Keys Click to add or remove the partition key for key range or hash user keys partitioning. You
cannot create a partition key for hash auto-keys, round-robin, or pass-through partitioning.
You can configure the following information when you edit or add a partition point:
♦ Specify the partition type at the partition point.
♦ Add and delete partitions.
♦ Enter a description for each partition.
Figure 13-5 shows the configuration options in the Edit Partition Point dialog box, where
you can select a partition, delete a partition, and enter a partition description.
Partition Names Select individual partitions from this dialog box to configure.
Add a Partition Adds a partition. You can add up to 64 partitions at any partition point. The number of
partitions must be consistent across the pipeline. Therefore, if you define three partitions
at one partition point, the Workflow Manager defines three partitions at all partition points
in the pipeline.
Delete a Partition Deletes the selected partition. Each partition point must contain at least one partition.
In this mapping, the Workflow Manager creates partition points at the source qualifier and
target instance by default. You can place an additional partition point at Expression
transformation EXP_3.
If you place a partition point at EXP_3 and define one partition, the master thread creates the
following threads:
Transformation Reason
EXP_1 and EXP_2 If you could place a partition point at EXP_1 or EXP_2, you would create an additional pipeline
stage that processes data from the source qualifier to EXP_1 or EXP_2. In this case, EXP_3
would receive data from two pipeline stages, which is not allowed.
For more information about processing threads, see “Understanding Processing Threads” on
page 14.
1. On the Partitions view of the Mapping tab, select a transformation that is not already a
partition point, and click the Add a Partition Point button.
Tip: You can select a transformation from the Non-Partition Points node.
2. Select the partition type for the partition point or accept the default value. For
information on specifying a valid partition type, see “Specifying Partition Types” on
page 356.
3. Click OK.
The transformation appears in the Partition Points node in the Partitions view on the
Mapping tab of the session properties.
Transformation Round- Hash Hash User Key Pass- Database Default Partition
(Partition Point) Robin Auto-Keys Keys Range Through Partitioning Type
Normalizer (COBOL sources) X Pass-through
Normalizer (relational) X X X X Pass-through
Custom X X X X Pass-through
Expression X X X X Pass-through
Filter X X X X Pass-through
Joiner X X Based on transformation scope*
Lookup X X X X X Pass-through
Rank X X Based on transformation scope*
Router X X X X Pass-through
Sorter X X X Based on transformation scope*
Union X X X X Pass-through
The session based on this mapping reads item information from three flat files of different
sizes:
♦ Source file 1: 80,000 rows
♦ Source file 2: 5,000 rows
♦ Source file 3: 15,000 rows
When the PowerCenter Server reads the source data, the first partition begins processing 80%
of the data, the second partition processes 5% of the data, and the third partition processes
15% of the data.
To distribute the workload more evenly, set a partition point at the Filter transformation and
set the partition type to round-robin. The PowerCenter Server distributes the data so that
each partition processes approximately one third of the data.
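The effect of round-robin redistribution can be sketched in a few lines. This is an illustrative model only, not PowerCenter code; the function name and the row data are hypothetical.

```python
from itertools import cycle

def round_robin(rows, num_partitions):
    """Deal rows across partitions one at a time, like cards."""
    partitions = [[] for _ in range(num_partitions)]
    targets = cycle(range(num_partitions))
    for row in rows:
        partitions[next(targets)].append(row)
    return partitions

# Three source "files" of the sizes used in the example above.
rows = ["file1"] * 80000 + ["file2"] * 5000 + ["file3"] * 15000
parts = round_robin(rows, 3)
print([len(p) for p in parts])  # [33334, 33333, 33333]
```

Each partition ends up with roughly one third of the 100,000 rows, regardless of how unevenly the source files were sized.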
Hash Auto-Keys
You can use hash auto-keys partitioning at or before Rank, Sorter, Joiner, and unsorted
Aggregator transformations to ensure that rows are grouped properly before they enter these
transformations.
Figure 13-8 shows a mapping where hash auto-keys partitioning causes the PowerCenter
Server to distribute rows to each partition according to group before they enter the Sorter and
Aggregator transformations:
In this mapping, the Sorter transformation sorts items by item description. If items with the
same description exist in more than one source file, items with the same description can end
up in more than one partition. Without hash auto-keys partitioning, the Aggregator
transformation might calculate average costs and prices for each item incorrectly.
To prevent errors in the cost and prices calculations, set a partition point at the Sorter
transformation and set the partition type to hash auto-keys. When you do this, the
PowerCenter Server redistributes the data so that all items with the same description reach the
Sorter and Aggregator transformations in a single partition.
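The grouping guarantee that hash partitioning provides can be sketched as follows. This is a simplified illustration, not the server's actual hash function; the function name and the item rows are made up.

```python
def hash_partition(rows, key, num_partitions):
    """Route every row to a partition chosen by hashing its key, so all
    rows that share a key value land in the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

items = [
    {"item": "widget", "price": 10},
    {"item": "gadget", "price": 20},
    {"item": "widget", "price": 12},
    {"item": "gadget", "price": 22},
]
parts = hash_partition(items, "item", 3)

# Every description maps to exactly one partition, so a per-partition
# average price is also a correct per-item average.
for name in ("widget", "gadget"):
    homes = [i for i, p in enumerate(parts) if any(r["item"] == name for r in p)]
    assert len(homes) == 1
```

Because the hash of a given description is identical for every row, an aggregate computed within one partition sees all rows for that item.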
To rearrange the order of the ports that make up the key, select a port in the Selected Ports list
and click the up or down arrow.
Figure 13-10. Mapping where Key Range Partitioning Can Increase Performance
When you do this, the PowerCenter Server sends all items with IDs less than 3000 to the first
partition. It sends all items with IDs between 3000 and 5999 to the second partition. Items
with IDs greater than or equal to 6000 go to the third partition. For more information on key
ranges, see “Adding Key Ranges” on page 365.
To rearrange the order of the ports that make up the partition key, select a port in the Selected
Ports list and click the up or down arrow.
In key range partitioning, the order of the ports does not affect how the PowerCenter Server
redistributes rows among partitions, but it can affect session performance. For example, you
might configure the following compound partition key:
Selected Ports
ITEMS.DESCRIPTION
ITEMS.DISCONTINUED_FLAG
Since boolean comparisons are usually faster than string comparisons, the session may run
faster if you arrange the ports in the following order:
Selected Ports
ITEMS.DISCONTINUED_FLAG
ITEMS.DESCRIPTION
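Why putting the cheap comparison first can matter is easy to demonstrate with a small experiment. The compound keys below are hypothetical stand-ins for the ITEMS ports, and the counting class exists only to make the short-circuit visible; it says nothing about how PowerCenter compares keys internally.

```python
class CountedStr(str):
    """A str subclass that counts how often it is compared for equality."""
    comparisons = 0
    def __eq__(self, other):
        CountedStr.comparisons += 1
        return str.__eq__(self, other)
    def __hash__(self):
        return str.__hash__(self)

# Hypothetical compound keys: (DISCONTINUED_FLAG, DESCRIPTION).
row_a = (False, CountedStr("widget"))
row_b = (True, CountedStr("widget"))

# Boolean first: the flags differ, so tuple comparison short-circuits
# and the strings are never compared.
CountedStr.comparisons = 0
row_a == row_b
fast = CountedStr.comparisons

# String first: the (equal) strings must be compared before the flags.
CountedStr.comparisons = 0
(row_a[1], row_a[0]) == (row_b[1], row_b[0])
slow = CountedStr.comparisons

print(fast, slow)  # 0 1
```

With the boolean flag first, unequal keys are usually told apart before any string comparison happens at all.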
You can leave the start or end range blank for a partition. When you leave the start range
blank, the PowerCenter Server uses the minimum data value as the start range. When you
leave the end range blank, the PowerCenter Server uses the maximum data value as the end
range.
For example, you can add the following ranges for a key based on CUSTOMER_ID in a
pipeline that contains two partitions:
CUSTOMER_ID     Start Range     End Range
Partition #1                    135000
Partition #2    135000
When the PowerCenter Server reads the Customers table, it sends all rows that contain
customer IDs less than 135000 to the first partition, and all rows that contain customer IDs
equal to or greater than 135000 to the second partition. The PowerCenter Server eliminates
rows that contain null values or values that fall outside the key ranges.
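The open-ended range rules above can be sketched as a small routing function. This is an illustrative model, not PowerCenter's implementation; the function name is made up.

```python
def assign_partition(value, ranges):
    """Return the index of the partition whose key range holds the value.
    A blank (None) start means the minimum value; a blank end means the
    maximum. Rows with NULL keys or keys outside every range are dropped."""
    if value is None:
        return None
    for i, (start, end) in enumerate(ranges):
        if (start is None or value >= start) and (end is None or value < end):
            return i
    return None

# Two partitions keyed on CUSTOMER_ID, split at 135000 as in the example.
ranges = [(None, 135000), (135000, None)]
print(assign_partition(100, ranges))     # 0
print(assign_partition(135000, ranges))  # 1
print(assign_partition(None, ranges))    # None: the row is eliminated
```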
By default, this mapping contains partition points only at the source qualifier and target
instance. Since this mapping contains an XML target, you can configure only one partition at
any partition point.
In this case, the master thread creates one reader thread to read data from the source, one
transformation thread to process the data, and one writer thread to write data to the target.
Each pipeline stage processes the rows as follows:
Source Qualifier Transformations Target Instance
(First Stage) (Second Stage) (Third Stage)
Time
Row Set 1 – –
Row Set 2 Row Set 1 –
Row Set 3 Row Set 2 Row Set 1
Row Set 4 Row Set 3 Row Set 2
... ... ...
Row Set n Row Set n-1 Row Set n-2
Because the pipeline contains three stages, the PowerCenter Server can process three sets of
rows concurrently.
If the Expression transformations are very complicated, processing the second
(transformation) stage can take a long time and cause low data throughput. To improve
performance, set a partition point at Expression transformation EXP_2 and set the partition
type to pass-through.
The PowerCenter Server can now process four sets of rows concurrently as follows:
Source FIL_1 & EXP_1 EXP_2 & LKP_1 Target
Qualifier Transformations Transformations Instance
(First Stage) (Second Stage) (Third Stage) (Fourth Stage)
Time
Row Set 1 - - -
Row Set 2 Row Set 1 - -
Row Set 3 Row Set 2 Row Set 1 -
Row Set 4 Row Set 3 Row Set 2 Row Set 1
... ... ... ...
Row Set n Row Set n-1 Row Set n-2 Row Set n-3
By adding an additional partition point at Expression transformation EXP_2, you replace one
long running transformation stage with two shorter running transformation stages. Data
throughput depends on the longest running stage. So in this case, data throughput increases.
For more information about processing threads, see “Understanding Processing Threads” on
page 14.
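A rough model shows why the extra partition point helps: once the pipeline is full, total time is dominated by the slowest stage. The stage times below are invented purely for illustration.

```python
def pipeline_time(stage_times, n_row_sets):
    """Approximate run time of a synchronous pipeline: filling it costs the
    sum of all stage times, then each further row set costs one tick of the
    slowest (bottleneck) stage."""
    return sum(stage_times) + (n_row_sets - 1) * max(stage_times)

three_stages = [1.0, 4.0, 1.0]       # reader, one long transformation stage, writer
four_stages = [1.0, 2.0, 2.0, 1.0]   # the long stage split at EXP_2

print(pipeline_time(three_stages, 1000))  # 4002.0
print(pipeline_time(four_stages, 1000))   # 2004.0
```

Splitting the long transformation stage halves the bottleneck time, so throughput roughly doubles, which matches the reasoning above.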
♦ You cannot use database partitioning when you configure the session to use source-based
or user-defined commit, constraint-based loading, or session recovery.
♦ The target table must contain a partition key. Also, you must link all not-null partition key
columns in the target instance to a transformation in the mapping.
♦ You must use high precision mode when the IBM DB2 table partitioning key uses a Bigint
field. The PowerCenter Server fails the session when the IBM DB2 table partitioning key
uses a Bigint field and you use low precision mode.
♦ If you create multiple partitions for a DB2 bulk load session, you must use database
partitioning for the target partition type. If you choose any other partition type, the
PowerCenter Server reverts to normal load and writes the following message to the session
log:
ODL_26097 Only database partitioning is support for DB2 bulk load.
Changing target load type variable to Normal.
If you configure a session for database partitioning, the PowerCenter Server reverts to pass-
through partitioning under the following circumstances:
♦ The DB2 target table is stored on one node.
♦ You run the session in debug mode using the Debugger.
Figure 13-14. Overriding the SQL Query and Entering a Filter Condition
For more information about partitioning Application sources, refer to the PowerCenter
Connect documentation.
If you know that the IDs for customers outside the USA fall within the range for a particular
partition, you can enter a filter in that partition to exclude them. Therefore, you enter the
following filter condition for the second partition:
CUSTOMERS.COUNTRY = 'USA'
When the session runs, the following queries for the two partitions appear in the session log:
READER_1_1_1> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM
CUSTOMERS WHERE CUSTOMERS.CUSTOMER_ID < 135000]
[...]
READER_1_1_2> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM
CUSTOMERS WHERE CUSTOMERS.COUNTRY = 'USA' AND 135000 <=
CUSTOMERS.CUSTOMER_ID]
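The way the server appears to combine a partition's key range with the user-entered filter can be sketched as simple string assembly. This is a guess at the composition logic for illustration only; the actual SQL generation is internal to PowerCenter, and the function name is made up.

```python
def partition_query(base_query, key_column, start, end, extra_filter=None):
    """Compose the WHERE clause for one key-range partition: an optional
    user-entered filter ANDed with the partition's key-range bounds."""
    conditions = []
    if extra_filter:
        conditions.append(extra_filter)
    if start is not None:
        conditions.append(f"{start} <= {key_column}")
    if end is not None:
        conditions.append(f"{key_column} < {end}")
    return base_query + (" WHERE " + " AND ".join(conditions) if conditions else "")

base = "SELECT CUSTOMERS.CUSTOMER_ID FROM CUSTOMERS"
print(partition_query(base, "CUSTOMERS.CUSTOMER_ID", None, 135000))
# ... WHERE CUSTOMERS.CUSTOMER_ID < 135000
print(partition_query(base, "CUSTOMERS.CUSTOMER_ID", 135000, None,
                      "CUSTOMERS.COUNTRY = 'USA'"))
# ... WHERE CUSTOMERS.COUNTRY = 'USA' AND 135000 <= CUSTOMERS.CUSTOMER_ID
```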
Source File Directory Enter the local source file directory. The default location is $PMSourceFileDir.
Source File Name Enter the local source file name. You can also use the session variable, $InputFileName, as
defined in the parameter file. If you use a file list, enter the name of the list.
By default, the Workflow Manager uses the source file name for each partition. Edit the file
name property for partitions 2-n based on how you want the PowerCenter Server to read
the files.
Source File Type Choose Direct to use source files or Indirect to use a file list.
Partition #1: ProductsA.txt
Partition #2: empty.txt
Partition #3: empty.txt
The PowerCenter Server creates one thread to read ProductsA.txt. It reads rows in the file
sequentially. After it reads the file, it passes the data to three partitions in the
transformation pipeline.
Partition #1: ProductsA.txt
Partition #2: empty.txt
Partition #3: ProductsB.txt
The PowerCenter Server creates two threads. It creates one thread to read ProductsA.txt,
and it creates one thread to read ProductsB.txt. It reads the files concurrently, and it reads
rows in the files sequentially.
If you use FTP to access source files, you can choose a different connection for each direct
file. For more information about using FTP to access source files, see “Using FTP” on
page 559.
Partition #1: ProductsA.txt
Partition #2: <blank>
Partition #3: <blank>
The PowerCenter Server creates three threads to read ProductsA.txt concurrently.
Partition #1: ProductsA.txt
Partition #2: <blank>
Partition #3: ProductsB.txt
The PowerCenter Server creates three threads to read ProductsA.txt and ProductsB.txt
concurrently. Two threads read ProductsA.txt and one thread reads ProductsB.txt.
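The file-name fallback for blank entries can be modeled in a few lines. This sketch is illustrative only; how the actual server schedules reader threads is more involved, and the function name is made up.

```python
def reader_threads(file_names):
    """One reader thread per partition. A blank entry falls back to the
    first partition's file name, so several threads may read the same file
    concurrently."""
    default = file_names[0]
    return [(partition, name or default) for partition, name in enumerate(file_names)]

print(reader_threads(["ProductsA.txt", "", ""]))
# [(0, 'ProductsA.txt'), (1, 'ProductsA.txt'), (2, 'ProductsA.txt')]
print(reader_threads(["ProductsA.txt", "", "ProductsB.txt"]))
# [(0, 'ProductsA.txt'), (1, 'ProductsA.txt'), (2, 'ProductsB.txt')]
```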
Figure 13-15. Properties Settings for Relational Targets in the Session Properties
Properties Settings
Transformations View
Attribute Description
Reject File Directory Location for the target reject files. Default is $PMBadFileDir.
Reject File Name Name of reject file. Default is <target name><partition number>.bad. You can also use the
session variable, $BadFileName, as defined in the parameter file.
Database Compatibility
When you configure a session with multiple partitions at the target instance, the PowerCenter
Server creates one connection to the target for each partition. If you configure multiple target
partitions in a session that loads to a database or ODBC target that does not support multiple
concurrent connections to tables, the session fails.
When you create multiple target partitions in a session that loads data to an Informix
database, you must create the target table with row-level locking. If you insert data from a
session with multiple partitions into an Informix target configured for page-level locking, the
session fails and returns the following message:
WRT_8206 Error: The target table has been created with page level locking.
The session can only run with multi partitions when the target table is
created with row level locking.
Sybase IQ does not allow multiple concurrent connections to tables. If you create multiple
target partitions in a session that loads to Sybase IQ, the PowerCenter Server loads all of the
data in one partition.
Figure 13-16. Connections Settings for File Targets in the Session Properties
Connection Type
Transformations View
Table 13-9 describes the connection options for file targets in a mapping:
Attribute Description
Connection Type Choose a local, FTP, external loader, or message queue connection. Select None for a local
connection.
The connection type is the same for all partitions.
Value For an FTP, external loader, or message queue connection, click the button in this field to
select the connection object.
You can specify a different connection object for each partition.
Figure 13-17. Properties Settings for File Targets in the Session Properties
Properties Settings
Table 13-10 describes the file properties for file targets in a mapping:
Attribute Description
Merge Partitioned Files If you select this option, the PowerCenter Server merges the partitioned target files into one
file when the session completes, and then deletes the individual output files. It does not
delete the individual files if it fails to create the merged file.
You cannot merge files if the session uses FTP, an external loader, or an MQSeries
message queue.
Merge File Directory Location for the merge file. Default is $PMTargetFileDir.
Merge File Name Name of the merge file. Default is <target name>.out.
Output File Directory Location for the target file. Default is $PMTargetFileDir.
Attribute Description
Output File Name Name of target file. Default is <target name><partition number>.out. You can also use the
session variable, $OutputFileName, as defined in the parameter file.
Reject File Directory Location for the target reject files. Default is $PMBadFileDir.
Reject File Name Name of reject file. Default is <target name><partition number>.bad.
In this example, sorted data from flat file sources passes through a single source qualifier,
using pass-through partitioning, to a Joiner transformation.
The Joiner transformation may output unsorted data depending on the join type. If you use a
full outer or detail outer join, the PowerCenter Server processes unmatched master rows last,
which can result in unsorted data.
The example in Figure 13-19 shows sorted data passed in a single partition to maintain the
sort order. The first partition contains sorted file data while all other partitions pass empty file
data. At the Joiner transformation, the PowerCenter Server distributes the data among all
partitions while maintaining the order of the sorted data.
The Joiner transformation may output unsorted data depending on the join type. If you use a
full outer or detail outer join, the PowerCenter Server processes unmatched master rows last,
which can result in unsorted data.
The example in Figure 13-21 shows sorted relational data passed in a single partition to
maintain the sort order. The first partition contains sorted relational data while all other
partitions pass empty data. After the PowerCenter Server joins the sorted data, it redistributes
data among multiple partitions.
Figure 13-22. Using Sorter Transformations with Hash Auto-Keys to Maintain Sort Order
Note: For best performance, use sorted flat files or sorted relational data. You may want to
calculate the processing overhead for adding Sorter transformations to your mapping.
For more information about cache partitioning, see “Cache Partitioning” on page 359.
SetCountVariable PowerCenter Server calculates the final count values from all partitions.
SetMaxVariable PowerCenter Server compares the final variable value for each partition and saves the
highest value.
SetMinVariable PowerCenter Server compares the final variable value for each partition and saves the
lowest value.
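The per-function merge of partition-final values can be sketched as follows. This is a hypothetical illustration of the behavior the table describes, assuming count values are totaled; the function name and values are made up.

```python
def finalize_variable(kind, partition_values):
    """Combine each partition's final mapping-variable value into one
    saved value."""
    if kind == "count":
        return sum(partition_values)  # SetCountVariable: combine the counts
    if kind == "max":
        return max(partition_values)  # SetMaxVariable: keep the highest
    if kind == "min":
        return min(partition_values)  # SetMinVariable: keep the lowest
    raise ValueError(kind)

finals = [12, 7, 30]  # final value from each of three partitions
print(finalize_variable("count", finals))  # 49
print(finalize_variable("max", finals))    # 30
print(finalize_variable("min", finals))    # 7
```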
Note: You should use the SetVariable function only once for each mapping variable in a
pipeline. When you create multiple partitions in a pipeline, the PowerCenter Server uses
multiple threads to process that pipeline. If you use this function more than once for the same
variable, the current value of the mapping variable may be nondeterministic.
Transformation Restrictions
Custom transformation By default, you can only specify one partition if the pipeline contains a Custom
transformation.
However, this transformation contains an option on the Properties tab to allow
multiple partitions. If you enable this option, you can specify multiple partitions at this
transformation. Do not select Is Partitionable if the Custom transformation procedure
must process all of the input data together, such as for data cleansing.
External Procedure By default, you can only specify one partition if the pipeline contains an External
transformation Procedure transformation.
This transformation contains an option on the Properties tab to allow multiple
partitions. If this option is enabled, you can specify multiple partitions at this
transformation.
Joiner transformation You can specify only one partition if the pipeline contains the master source for a
Joiner transformation and you do not add a partition point at the Joiner
transformation.
XML target instance You can specify only one partition if the pipeline contains XML targets.
Product Restrictions
PowerCenter Connect for PeopleSoft If the pipeline contains an Application Source Qualifier transformation for
PeopleSoft when it is connected to or associated with a PeopleSoft tree, then
you can specify only one partition and the partition type must be pass-
through.
PowerCenter Connect for IBM MQSeries For MQSeries sources, you can specify multiple partitions only if there is no
associated source qualifier in the pipeline.
You cannot merge output files from sessions with multiple partitions if you
use an MQSeries message queue as the target connection type.
PowerCenter Connect for SAP R/3 If the mapping contains hierarchies or IDOCs, then you can specify only one
partition and the partition type must be pass-through.
If you generate the ABAP program using exec SQL, then you can specify only
one partition and the partition type must be pass-through.
You must use the Informatica default date format to enter dates in key
ranges.
PowerCenter Connect for SAP BW You can specify only one partition when the target load order group contains
an SAP BW target.
Product Restrictions
PowerCenter Connect for Siebel When you use a source filter in a join override, always use the following
syntax for Siebel business components:
SiebelBusinessComponentName.SiebelFieldName
When you create a source filter for a Siebel business component, always use
the following syntax:
SiebelBusinessComponentName.SiebelFieldName
PowerCenter Connect SDK If the mapping contains a multi-group target that receives data from more
than one pipeline, then you can specify only one partition.
If the mapping contains a multi-group target that receives data from multiple
groups, then the partition type must be pass-through.
For more information about these other products, please see the product documentation.
Partitioning Guidelines
This section summarizes the other guidelines that appear throughout this chapter.
Monitoring Workflows
Overview
You can monitor workflows and tasks in the Workflow Monitor. View details about a
workflow or task in Gantt Chart view or Task view. You can run, stop, abort, and resume
workflows from the Workflow Monitor.
The Workflow Monitor displays workflows that have run at least once. The Workflow
Monitor continuously receives information from the PowerCenter Server and Repository
Server. It also fetches information from the repository to display historic information.
The Workflow Monitor consists of the following windows:
♦ Navigator window. Displays monitored repositories, servers, and repository objects.
♦ Output window. Displays messages from the PowerCenter Server and the Repository
Server.
♦ Time window. Displays progress of workflow runs.
♦ Gantt Chart view. Displays details about workflow runs in chronological (Gantt Chart)
format.
♦ Task view. Displays details about workflow runs in a report format, organized by workflow
run.
The Workflow Monitor displays time relative to the time configured on the PowerCenter
Server machine. For example, a folder contains two workflows. One workflow runs on a
PowerCenter Server in your local time zone, and the other runs on a PowerCenter Server in a
time zone two hours later. If you start both workflows at 9 a.m. local time, the Workflow
Monitor displays the start time as 9 a.m. for one workflow and as 11 a.m. for the other
workflow.
Navigator
Window
Gantt
Chart
View
Toggle between Gantt Chart view and Task view by clicking the tabs on the bottom of the
Workflow Monitor.
Note: You can view and hide the Output window in the Workflow Monitor. To toggle back
and forth, choose View-Output.
Using the Workflow Monitor
The Workflow Monitor provides options to view information about workflow runs. After you
open the Workflow Monitor and connect to a repository, you can view dynamic information
about workflow runs by connecting to a PowerCenter Server.
You can customize the Workflow Monitor display by configuring the maximum days or
workflow runs the Workflow Monitor shows. You can also filter tasks and servers in both
Gantt Chart and Task view.
Complete the following steps to monitor workflows:
1. Open the Workflow Monitor.
2. Connect to the repository containing the workflow.
3. Connect to the PowerCenter Server.
4. Select the workflow you want to monitor.
5. Choose from Gantt Chart view or Task view.
Filtering Tasks
You can view all or some workflow tasks. You can filter out tasks to view only tasks you want.
For example, if you want to view only Session tasks, you can hide all other tasks. You can view
all tasks at any time.
To filter tasks:
1. Choose Filters-Tasks.
The Filter Tasks dialog box appears.
2. Clear the tasks you want to hide, and select the tasks you want to view.
3. Click OK.
Note: When you filter a task, the Gantt Chart view displays a red link between tasks to
indicate a filtered task. You can double-click the link to view the tasks you hid.
Filtering Servers
When you connect to a repository, the Workflow Monitor displays a list of registered servers
and deleted servers. When you register multiple servers, you can filter out servers to view only
servers you want to monitor.
When you hide a server, the Workflow Monitor hides the server from the Navigator for both
Gantt Chart and Task view. You can show the server at any time.
You can hide unconnected servers. When you hide a connected server, the Workflow Monitor
asks if you want to disconnect from the server and then filter it. You must disconnect from a
server before hiding it.
2. Select the servers you want to view, and clear the servers you want to filter. Click OK.
If you are connected to a server that you clear, the Workflow Monitor prompts you to
disconnect from the server before filtering.
3. Click Yes to disconnect from the server and filter it.
The Workflow Monitor hides the server from the Navigator.
Click No to remain connected to the server. If you click No, you cannot filter the server.
Tip: You can also filter a server in the Navigator by right-clicking it and selecting Filter Server.
Viewing Properties
You can view properties for the following items:
♦ Tasks. You can view properties such as task name, start time, and status.
♦ Sessions. You can view properties about the Session task and session run, such as mapping
name and number of rows successfully loaded. You can also view load statistics about the
session run. For more information on session details, see “Monitoring Session Details” on
page 434. You can also view performance details about the session run. For more
information, see “Creating and Viewing Performance Details” on page 436.
♦ Workflows. You can view properties such as start time, status, and run type.
♦ Links. When you double-click a link between tasks in Gantt Chart view, you can view
tasks you hide.
♦ Servers. You can view properties such as server version and startup time. You can also view
the sessions and workflows running on the PowerCenter Server.
♦ Folders. You can view properties such as the number of workflow runs displayed in the
Time window.
To view properties for all objects, right-click the object and select Properties. You can right-
click items in the Navigator or the Time window in either Gantt Chart view or Task view.
To view link properties, double-click the link in the Time window of Gantt Chart view.
When you view link properties, you can double-click a task in the Link Properties dialog box
to view the properties for the filtered task.
Table 14-1 describes the options you can configure on the General tab:
Setting Description
Maximum Days Specifies the maximum number of days of task history the Workflow Monitor
displays. The default is 5.
Maximum Workflow Runs per Specifies the maximum number of workflow runs the Workflow Monitor displays for
Folder each folder. The default is 200.
Receive Messages from Select this option to receive messages from the Workflow Manager. The Workflow
Workflow Manager Manager sends messages when you start or schedule a workflow in the Workflow
Manager. The Workflow Monitor displays these messages in the Output window.
Receive Notifications from Select this option to receive notifications from the Repository Server. Notifications
Repository Server from the Repository Server display in the Output window Notifications tab.
Log File Editor Enter the path and file name of the text editor to view and edit workflow and session
logs. You can browse to select an editor. By default, the Workflow Monitor uses
WordPad.
Location The location where the Workflow Monitor stores temporary versions of log files
when you open session or workflow logs from the Workflow Monitor.
Table 14-2 describes the options you can configure on the Gantt Chart Options tab:
Status Color Choose a status and configure the color for the status. The Workflow Monitor displays tasks
with the selected status in the colors you choose. You can choose two colors to display a
gradient.
Recovery Color Configure the color for recovery sessions. The Workflow Monitor uses the status color for
the body of the status bar, and it uses the recovery color as a gradient in the status bar.
Table 14-3 describes the options you can configure on the Advanced tab:

Hide Folders/Workflows That Do Not Contain Any Runs When Filtering By Running/Schedule Runs. Hides folders or workflows under the Workflow Run column in the Time window when you filter running or scheduled tasks.

Highlight the Entire Row When an Item Is Selected. Highlights the entire row in the Time window for selected items. When you disable this option, the Workflow Monitor highlights only the item in the Workflow Run column in the Time window.

Open Latest 20 Runs At a Time. Allows you to open the number of workflow runs of your choice. The number of runs to open is set at 20 by default.

Minimum Number of Workflow Runs (Per Server) the Workflow Monitor Will Accumulate in Memory. Specifies the minimum number of workflow runs per server that the Workflow Monitor holds in memory before it starts releasing older runs from memory.
When you connect to a server, the Workflow Monitor fetches the number of workflow runs specified on the General tab for each folder you connect to. When the number of runs is less than the number specified in this option, the Workflow Monitor stores new runs in memory until it reaches this number. Then it releases the oldest run from memory when it fetches a new run.
When the number of workflow runs the Workflow Monitor initially fetches exceeds the number specified in this option, the Workflow Monitor stores all those runs and then releases the oldest run from memory when it fetches a new run.
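This retention rule behaves like a simple bounded cache. The sketch below is illustrative only (the class and names are hypothetical, not Informatica code); it assumes the release-oldest-on-fetch behavior described above:

```python
from collections import deque

class RunCache:
    """Illustrative sketch: hold workflow runs in memory, releasing the
    oldest run once the configured minimum has been reached."""

    def __init__(self, min_runs_in_memory):
        self.min_runs = min_runs_in_memory
        self.runs = deque()          # oldest run sits at the left

    def fetch(self, run_id):
        # Once the cache holds the configured number of runs, release the
        # oldest run from memory before storing the newly fetched one.
        if len(self.runs) >= self.min_runs:
            self.runs.popleft()
        self.runs.append(run_id)

cache = RunCache(min_runs_in_memory=3)
for run in ["run1", "run2", "run3", "run4"]:
    cache.fetch(run)
# run1 has been released; the three most recent runs remain in memory
```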
♦ Server. Contains buttons to connect to and disconnect from PowerCenter Servers, to ping
the server, and to start and stop workflows, worklets, and tasks.
Figure 14-8 displays the Server toolbar:
♦ View. Contains buttons to refresh the view and to open workflow and session logs.
Figure 14-9 displays the View toolbar:
♦ Filter. Contains buttons to display most recent runs, and to filter tasks, servers, and
folders.
Figure 14-10 displays the Filter toolbar:
1. In the Navigator, select the task, workflow, or worklet you want to stop or abort.
2. Choose Tasks-Stop or Tasks-Abort.
or
Right-click the task, workflow, or worklet in the Navigator and choose Stop or Abort.
3. The Workflow Monitor displays the status of the stop or abort command in the Output
window.
Aborted (workflows, tasks). The PowerCenter Server aborted the workflow or task. The PowerCenter Server kills the DTM process when you abort a workflow or task.

Aborting (workflows, tasks). The PowerCenter Server is in the process of aborting the workflow or task.

Disabled (workflows, tasks). You select the Disabled option in the workflow or task properties. The PowerCenter Server does not run the disabled workflow or task until you clear the Disabled option.

Failed (workflows, tasks). The PowerCenter Server failed the workflow or task due to errors.

Scheduled (workflows). You schedule the workflow to run at a future date. The PowerCenter Server runs the workflow for the duration of the schedule.

Stopped (workflows, tasks). You choose to stop the workflow or task in the Workflow Monitor. The PowerCenter Server stopped the workflow or task.

Stopping (workflows, tasks). The PowerCenter Server is in the process of stopping a workflow or task.

Succeeded (workflows, tasks). The PowerCenter Server successfully completed the workflow or task.

Suspended (workflows, worklets). The PowerCenter Server suspends the workflow because a task fails and no other tasks are running in the workflow. This status is available only when you choose the Suspend on Error option.

Suspending (workflows, worklets). A task fails in the workflow while other tasks are still running. The PowerCenter Server stops executing the failed task and continues executing tasks in other paths. This status is available only when you choose the Suspend on Error option.

Terminated (workflows). The PowerCenter Server terminated unexpectedly while it was running this workflow or task.

Unscheduled (workflows). You removed a workflow from the schedule, or the workflow is scheduled and the PowerCenter Server is about to run the scheduled workflow.

Waiting (workflows, tasks). The PowerCenter Server is waiting for available resources so it can execute the workflow or task. For example, you may set the maximum number of concurrent sessions to 10. If the PowerCenter Server is already executing 10 concurrent sessions, all other workflows and tasks have the Waiting status until the PowerCenter Server is free to execute more tasks.
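The Waiting status in that example follows from a simple slot-counting rule. The sketch below is a conceptual model only (hypothetical names, not Informatica code), assuming sessions wait whenever the configured maximum of concurrent sessions is reached:

```python
class Server:
    """Conceptual sketch: tasks wait when the server is already running
    its configured maximum number of concurrent sessions."""

    def __init__(self, max_concurrent_sessions=10):
        self.max_sessions = max_concurrent_sessions
        self.running = set()

    def start(self, session):
        # No free slot: the session gets the Waiting status until the
        # server is free to execute more tasks.
        if len(self.running) >= self.max_sessions:
            return "Waiting"
        self.running.add(session)
        return "Running"

server = Server(max_concurrent_sessions=2)
statuses = [server.start(s) for s in ("s1", "s2", "s3")]
# statuses == ["Running", "Running", "Waiting"]
```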
Organizing Tasks
In Gantt Chart view, you can organize tasks in the Navigator. You can drag and drop tasks
within a workflow to change the order they appear in the Navigator.
For example, you can drag a Decision task within the Navigator so that it appears in the middle or at the bottom of the list of tasks for that workflow.
1. Open the Gantt Chart view and choose Edit-List Tasks. The List Tasks dialog box
appears.
2. In the List What field, select the type of task status you want to list.
For example, select Failed to view a list of failed tasks and workflows.
3. Click List to view the list.
Tip: Double-click the task name in the List Tasks dialog box to highlight the task in Gantt
Chart view.
Zoom
In the Time window, 30-minute increments are marked with a solid line for each hour and a dotted line for each half hour.
To zoom the Time window in Gantt Chart view, choose View-Zoom and then choose the
desired time increment.
You can also choose the time increment from the Zoom button on the toolbar.
Performing a Search
Use the search tool in the Gantt Chart view to search for tasks, workflows, and worklets in all
repositories you connect to. The Workflow Monitor searches for the word you specify in task
names, workflow names, and worklet names. You can highlight the task in Gantt Chart view
by double-clicking the task after searching.
1. Open the Gantt Chart view and choose Edit-Find. The Find Object dialog box appears.
2. In the Find What field, enter the keyword you want to find.
3. Click Find Now.
The Workflow Monitor displays a list of tasks, workflows, and worklets that match the
keyword.
Tip: Double-click the task name in the Find Object dialog box to highlight the task in
Gantt Chart view.
Task view displays the Navigator window, the workflow run list, the Time window, and the Output window. Click the Filter button in a column to select the workflows you want to display.
When you click the Filter button in either the Start Time or Completion Time column,
you can choose a custom time to filter.
4. Select Custom for either Start Time or Completion Time. The Filter Start Time or
Custom Completion Time dialog box appears.
5. Choose to show tasks before, after, or between the time you specify. Select the date and
time. Click OK.
When you create multiple partitions in a session, the PowerCenter Server provides session
details for each partition. You can use these details to determine if the data is evenly
distributed among the partitions. For example, if the PowerCenter Server moves more rows
through one target partition than another, or if the throughput is not evenly distributed, you
might want to adjust the data range for the partitions.
When you load data to a target with multiple groups, such as an XML target, the
PowerCenter Server provides session details for each group.
Table 14-5 lists the information on the Transformation Statistics tab:

Instance Name. Name of the source qualifier instance or the target instance in the mapping. If you create multiple partitions in the source or target, the Instance Name displays the partition number. If the source or target contains multiple groups, the Instance Name displays the group name.

Applied Rows. For targets, shows the number of rows the PowerCenter Server successfully applied to the target (that is, the target returned no errors). For sources, shows the number of rows the PowerCenter Server successfully read from the source.
Note: The number of applied rows equals the number of affected rows for sources.

Affected Rows. For targets, shows the number of rows affected by the specified operation. For example, you have a table with one column called SALES_ID and five rows containing the values 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The writer affects three rows, even though there was only one update request. Or, if you mark rows for update where SALES_ID is 4, the writer affects 0 rows. For sources, shows the number of rows the PowerCenter Server successfully read from the source.
Note: The number of applied rows equals the number of affected rows for sources.

Rejected Rows. Number of rows the PowerCenter Server dropped when reading from the source, or the number of rows the PowerCenter Server rejected when writing to the target.

Throughput (Rows/Sec). Rate at which the PowerCenter Server read rows from the source or wrote rows to the target, in rows per second.

Last Error Message. The most recent error message written to the session log. If you view details after the session completes, this field displays the last error message.

Last Error Code. The error message code of the most recent error message written to the session log. If you view details after the session completes, this field displays the last error code.

Start Time. The time the PowerCenter Server started to read from the source or write to the target. The Workflow Monitor displays time relative to the PowerCenter Server.

End Time. The time the PowerCenter Server finished reading from the source or writing to the target. The Workflow Monitor displays time relative to the PowerCenter Server.
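The SALES_ID example above is easy to reproduce with any relational database. The sketch below uses SQLite (not the PowerCenter writer) purely to show how a single update request can affect several rows:

```python
import sqlite3

# Reproduce the manual's SALES_ID example: one column, five rows with the
# values 1, 2, 3, 2, and 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (SALES_ID INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (2,), (2,)])

# One update request where SALES_ID is 2 affects three rows.
affected = conn.execute(
    "UPDATE t SET SALES_ID = 99 WHERE SALES_ID = 2").rowcount

# An update where SALES_ID is 4 matches nothing, so it affects 0 rows.
affected_none = conn.execute(
    "UPDATE t SET SALES_ID = 99 WHERE SALES_ID = 4").rowcount
# affected == 3, affected_none == 0
```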
Enabling Monitoring
To view performance details, you must enable monitoring in the session properties before
running the session.
To view performance details while the session runs:
1. While the session is running, right-click the session in the Workflow Monitor and choose
Properties.
2. Click the Performance tab in the Properties dialog box.
3. Click OK.
Note: When you increase the number of partitions, the number of aggregate or rank input
rows may be different from the number of output rows from the previous transformation.
Table 14-6 lists the counters that may appear in the Session Performance Details dialog box or
in the performance details file:
If you have multiple source qualifiers and targets, evaluate them as a whole. For source
qualifiers and targets, a high value is considered 80-100 percent. Low is considered 0-20
percent.
Chapter 15
Overview
You can register and run multiple PowerCenter Servers against a local or global repository.
When you register multiple PowerCenter Servers to the same repository, you can distribute
the workload across the servers to increase performance.
You have the following options to run workflows and sessions using multiple servers:
♦ Use a server grid to run workflows. You can use a server grid to automate the distribution
of sessions. A server grid is a server object that distributes sessions in a workflow to servers
based on server availability. The grid maintains connections to multiple servers in the grid.
For more information about using server grids, see “Working with Server Grids” on
page 446.
♦ Change the assigned server for a workflow. When you configure a workflow, you assign a
server to run that workflow. Each time the scheduled workflow runs, it runs on the
assigned server. You can change the assigned server for a workflow in the workflow
properties.
♦ Change the assigned server for a session. When you configure a session, by default it runs
on the server assigned to the workflow. You can change the assigned server for a session in
the session properties.
♦ Start a workflow on a non-assigned server. By default, each workflow runs on its assigned
PowerCenter Server. You can run a workflow on a non-assigned server if the workflow is
not currently running. Use the Start Workflow button on the Standard toolbar, and choose
a PowerCenter Server.
You can use the Workflow Monitor to monitor workflows running on multiple servers. For
server grids, the Workflow Monitor shows the individual status of each server in a grid. You
can identify the server grid that a server is assigned to by right-clicking the server in the
Workflow Monitor and selecting Properties. For more information about using the Workflow
Monitor, see “Monitoring Workflows” on page 401.
Tip: You might want to place the most CPU-intensive sessions on the more powerful servers.
Distributing Sessions
In a server grid, the master server starts the workflow and then distributes sessions to worker
servers. The master server is the server that starts a workflow. A worker server is a server that
runs sessions assigned to it by a master server. By default, each PowerCenter Server in a server
grid is both a master server and a worker server. This means that a server in a grid can
distribute sessions to and receive sessions from every server in the grid. The master server
distributes sessions that are ready to run to available worker servers in a round-robin fashion
based on server availability. The starting point for the session assignment is random.
If a worker server is running the maximum number of concurrent sessions, the master server
assigns another worker server to run the session. If all worker servers are running the
maximum number of concurrent sessions, the master server places the session in its own ready
queue.
For information about configuring the maximum number of concurrent sessions, see
“Installing and Configuring the PowerCenter Server on Windows” and “Installing and
Configuring the PowerCenter Server on UNIX” in the Installation and Configuration Guide.
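The distribution rule above can be sketched in a few lines. This is an illustrative model only, with hypothetical names, assuming the round-robin assignment, random starting point, and ready queue described in this section:

```python
import random

def distribute(sessions, workers, max_concurrent):
    """Assign each session to a worker in round-robin order, starting at a
    random worker. A session goes to the master's own ready queue only
    when every worker is at its concurrent-session limit."""
    load = {w: 0 for w in workers}
    assignment, ready_queue = {}, []
    i = random.randrange(len(workers))       # random starting point
    for s in sessions:
        for _ in range(len(workers)):        # try each worker once
            w = workers[i % len(workers)]
            i += 1
            if load[w] < max_concurrent:
                load[w] += 1
                assignment[s] = w
                break
        else:
            ready_queue.append(s)            # all workers are full
    return assignment, ready_queue

assignment, queued = distribute(["s1", "s2", "s3"], ["B", "C"],
                                max_concurrent=1)
# s1 and s2 land on different workers; s3 waits in the master's ready queue
```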
Figure 15-1 shows how a master server distributes the sessions in Workflow1 among the
servers in a grid. The server grid contains Server A, Server B, and Server C. Server A is the
master server, and Server B and Server C are worker servers.
Worker server shuts down unexpectedly, or you shut it down before it receives a session. The worker server is not available to the master servers in the server grid. Master servers do not assign sessions to the unavailable worker server and proceed with the round-robin distribution of sessions.

Worker server shuts down unexpectedly while running a session. The master server marks the status of the session as terminated. The worker server stops running all sessions. The session settings you specify determine whether the workflow fails. For more information about the Fail parent if this task fails option, Fail parent if this task does not run option, or Disable this task option, see “Configuring Tasks” on page 135.

You shut down a worker server while it is running a session. The shutdown mode you specify determines how the worker server handles sessions when it shuts down. When you shut down the worker server in complete mode, it continues to run the sessions it started until they complete, but does not accept new sessions from master servers. For more information about shutdown modes, see “pmcmd Reference” on page 594.

Worker server loses its network connection and cannot connect to the server grid. The worker server continues to run the session and writes its status to the session log. However, the master server marks the status of the session as terminated. You must resume the workflow or resume from the failed task to continue running the workflow and update the session status. If you do not need the session status of the previous run, you can restart the workflow or restart the workflow from a task to start a new workflow run. For more information, see “Working with Tasks and Workflows” on page 416.

Master server shuts down unexpectedly. The workflow fails. You must restart the workflow on another server or wait for the master server to become available.

You shut down the master server while it is running a workflow or session. The shutdown mode you specify determines how the master server handles workflows and sessions when it shuts down. When you shut down the master server in complete mode, it continues to run the workflows and sessions it started until they complete, but does not accept tasks from other master servers. For more information about shutdown modes, see “pmcmd Reference” on page 594.

Master server loses its network connection and cannot connect to the server grid. The master server continues to run workflows as a standalone PowerCenter Server. If a worker server is assigned to a session, the session fails because the master server cannot distribute the session to the worker server. The session settings you specify determine whether the workflow fails. For more information about the Fail parent if this task fails option, Fail parent if this task does not run option, or Disable this task option, see “Configuring Tasks” on page 135.
Table 15-3 shows a configuration where the session properties override the server grid properties. Because you assigned the session to Server B, the session runs on Server B even though you configured Server B not to accept tasks from the grid.
6. Repeat steps 4 and 5 until you have chosen all the servers for the grid.
8. Click Close.
Log Files
Overview
The PowerCenter Server can create log files for each workflow it runs. These files contain
information about the tasks the PowerCenter Server performs, plus statistics about the
workflow and all sessions in the workflow. If the writer or target database rejects data during a
session run, the PowerCenter Server creates a file that contains the rejected rows.
The PowerCenter Server can create the following types of log files:
♦ Workflow log. Contains information about the workflow run such as workflow name,
tasks executed, and workflow errors. By default, the PowerCenter Server writes this
information to the server log or Windows Event Log, depending on how you configure the
PowerCenter Server. If you wish to create a workflow log, enter a workflow file name in the
workflow properties. For more information, see “Workflow Logs” on page 457.
♦ Session log. Contains information about the tasks that the PowerCenter Server performs
during a session, plus load summary and transformation statistics. By default, the
PowerCenter Server creates one session log for each session it runs. If a workflow contains
multiple sessions, the PowerCenter Server creates a separate session log for each session in
the workflow. For more information, see “Session Logs” on page 463.
♦ Reject file. Contains rows rejected by the writer or target file during a session run. If the
writer or target does not reject any data during a session, the PowerCenter Server does not
generate a reject file for that session. For more information, see “Reject Files” on page 476.
By default, the PowerCenter Server saves each type of log file in its own directory. The
PowerCenter Server represents these directories using server variables.
Table 16-1 shows the default directory (server variable) for each type of log file:

♦ Workflow log: $PMWorkflowLogDir
♦ Session log: $PMSessionLogDir
♦ Reject file: $PMBadFileDir
You can change the default directories at the server level by editing the server connection in
the Workflow Manager. You can also override these values for individual workflows or sessions
by updating the workflow or session properties.
Parameter File Name. Designates the name and directory for the parameter file. Use the parameter file to define workflow parameters. For details on parameter files, see “Parameter Files” on page 511.

Workflow Log File Name. Optionally enter a file name, or a file name and directory. If you leave this field blank, the PowerCenter Server does not create a workflow log. Instead, the PowerCenter Server writes workflow log messages to the server log or Windows Event Log, depending on how you configure the PowerCenter Server.
If you fill in this field, the PowerCenter Server appends the information in this field to that entered in the Workflow Log File Directory field. For example, if you have "C:\workflow_logs\" in the Workflow Log File Directory field and enter "logname.txt" in the Workflow Log File Name field, the PowerCenter Server writes logname.txt to the C:\workflow_logs\ directory.
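In other words, the server simply appends the name field to the directory field. A minimal sketch of that concatenation (the exact joining rule is an assumption, illustrated with the example values above):

```python
def workflow_log_path(directory, name):
    # The name field is appended to the directory field; if the name
    # already holds a full path, the directory field should be left blank.
    return directory + name if directory else name

path = workflow_log_path("C:\\workflow_logs\\", "logname.txt")
# path == r"C:\workflow_logs\logname.txt"
```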
Workflow Log File Directory. Designates a location for the workflow log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMWorkflowLogDir. If you enter a full directory and file name in the Workflow Log File Name field, clear this field.

Save Workflow Log By. If you select Save Workflow Log by Timestamp, the PowerCenter Server saves all workflow logs, appending a timestamp to each log. If you select Save Workflow Log by Runs, the PowerCenter Server saves a designated number of workflow logs. Configure the number of workflow logs in the Save Workflow Log for These Runs option. For details on these options, see “Archiving Workflow Logs” on page 459. You can also use the $PMWorkflowLogCount server variable to save the configured number of workflow logs for the PowerCenter Server.

Save Workflow Log for These Runs. The number of historical workflow logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent workflow log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent workflow log plus historical logs 0 to 4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent workflow log.
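The Save-by-Runs arithmetic above (N historical logs plus the most recent one, N + 1 in total) can be sketched as a simple rotation. The function below is illustrative only and does not mirror Informatica's actual file naming:

```python
def logs_to_keep(existing_logs, new_log, historical_runs):
    """Keep the configured number of historical logs plus the newest log:
    historical_runs = 5 therefore keeps 6 logs in total."""
    logs = existing_logs + [new_log]
    return logs[-(historical_runs + 1):]

kept = logs_to_keep([f"wf.log.{i}" for i in range(10)], "wf.log",
                    historical_runs=5)
# len(kept) == 6, and the newest log "wf.log" is always retained
```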
To use the Workflow Monitor to view the most recent workflow log:
1. In the Navigator window, connect to the server on which the workflow runs.
2. Open the folder that contains the workflow.
3. Right-click the workflow and choose Get Workflow Log.
If you save workflow logs by timestamp, you can also use the Workflow Monitor to view past
workflow logs. To do this, right click the workflow in the Gantt chart view and choose Get
Workflow Log.
For more information about the Workflow Monitor, see “Using the Workflow Monitor” on
page 404.
BLKR. Messages related to the reader process, including application, relational, and flat file sources.

CMN. Messages related to databases, memory allocation, Lookup and Joiner transformations, and internal errors.

SF. Messages related to the server framework, used by the Load Manager and Repository Server.
Thread Identification
The thread identification consists of the thread type and a series of numbers separated by
underscores. The numbers following a thread name indicate the following information:
♦ Target load order group number
♦ Partition point number
♦ Partition number
Note: The PowerCenter Server writes an asterisk (*) as the partition point number for writer
threads.
The PowerCenter Server prints the thread identification before the log file code and the message text in the session log. The following example shows a reader thread from target load order group one, partition point one, and partition one:
READER_1_1_1> DBG_21438 Reader: Source is [p152636], user [jennie]
MASTER> CMN_1688 Allocated [12000000] bytes from process memory for [DTM
Buffer Pool].
READER_1_1_1> BLKR_16019 Read [1] rows, read [0] error rows for source
table [EMP_SRC] instance name [EMP_SRC]
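A thread identification such as READER_1_1_1 or WRITER_1_*_1 can be split mechanically into its parts. The helper below is a hypothetical illustration of the numbering scheme described above, not an Informatica utility:

```python
def parse_thread_id(thread_id):
    """Split e.g. 'READER_1_1_1' into thread type, target load order
    group, partition point, and partition. Writer threads use '*' as the
    partition point number."""
    thread_type, group, point, partition = thread_id.rsplit("_", 3)
    return {
        "type": thread_type,
        "target_load_order_group": int(group),
        "partition_point": None if point == "*" else int(point),
        "partition": int(partition),
    }

reader = parse_thread_id("READER_1_1_1")
writer = parse_thread_id("WRITER_1_*_1")
# reader["partition_point"] == 1; writer["partition_point"] is None
```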
Load Summary
The session log includes a load summary that reports the number of rows inserted, updated,
deleted, and rejected for each target as of the last commit point. The PowerCenter Server
reports the load summary for each session by default. However, you can set tracing level to
Verbose Initialization or Verbose Data to report the load summary for each transformation.
The following sample is an excerpt from a load summary:
*****START LOAD SESSION*****
Target tables:
     Emp_target
============
LOAD SUMMARY
============
WRITER_1_*_1> WRT_8043 *****END LOAD SESSION*****
The PowerCenter Server reports statistics for each of the following operations performed on
the target:
♦ Inserted. Shows the number of rows the PowerCenter Server marked for insert into the
target. The number of affected rows cannot be larger than requested for this operation.
♦ Updated. Shows the number of rows the PowerCenter Server marked for update in the
target. The number of affected rows can be different from the number of requested rows.
For example, you have a table with one column called SALES_ID and five rows containing
the values: 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The writer
affects three rows, even though there was only one update request. Or, if you mark rows for
update where SALES_ID is 4, the writer affects 0 rows.
♦ Deleted. Shows the number of rows the PowerCenter Server marked to remove from the
target. The number of affected rows can be different from the number of requested rows.
♦ Rejected. Shows the number of rows the PowerCenter Server rejected during the writing
process. These rows cannot be applied to the target. For the Rejected rows category, the
number of affected and applied rows is always zero since these rows are not written to the
target.
The load summary provides the following statistics:
♦ Requested rows. Shows the number of rows the writer actually received for the specified
operation.
♦ Applied rows. Shows the number of rows the writer successfully applied to the target (that
is, the target returned no errors).
♦ Affected rows. Shows the number of rows affected by the specified operation. Depending
on the operation, the number of affected rows can be different from the number of
requested rows. For example, you have a table with one column called SALES_ID and five
rows containing the values: 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID
is 2. The writer affects three rows, even though there was only one update request. Or, if
you mark rows for update where SALES_ID is 4, the writer affects 0 rows.
♦ Rejected rows. Shows the number of rows the writer could not apply to the target. For
example, the target database rejects a row if the PowerCenter Server attempts to insert
NULL into a not-null field. The PowerCenter Server writes all rejected rows to the session
reject file, or to the row error log, depending on how you configure the session.
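The relationship between requested, applied, affected, and rejected rows can be sketched with a small bookkeeping function. This is an illustrative model with hypothetical names, assuming each input row carries its operation, the number of target rows it affected, and whether the writer rejected it:

```python
def load_summary(rows):
    """Tally per-operation statistics. rows is a list of
    (operation, affected_count, rejected) tuples. A rejected row counts
    as requested but contributes no applied or affected rows."""
    stats = {}
    for op, affected, rejected in rows:
        s = stats.setdefault(op, {"requested": 0, "applied": 0,
                                  "affected": 0, "rejected": 0})
        s["requested"] += 1
        if rejected:
            s["rejected"] += 1
        else:
            s["applied"] += 1
            s["affected"] += affected  # one update request may affect 3 rows
    return stats

summary = load_summary([
    ("update", 3, False),   # SALES_ID = 2: one request affects 3 rows
    ("update", 0, False),   # SALES_ID = 4: no row matches, affects 0 rows
    ("insert", 0, True),    # e.g. NULL inserted into a not-null column
])
```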
Session Log File Name. By default, the PowerCenter Server uses the session name for the log file name: s_mapping name.log. For a debug session, it uses DebugSession_mapping name.log. Optionally enter a file name, a file name and directory, or use the $PMSessionLogFile session parameter. The PowerCenter Server appends the information in this field to that entered in the Session Log File Directory field. For example, if you have “C:\session_logs\” in the Session Log File Directory field and enter “logname.txt” in the Session Log File Name field, the PowerCenter Server writes logname.txt to the C:\session_logs\ directory.
You can also use the $PMSessionLogFile session parameter to represent the name of the session log or the name and location of the session log. For details on session parameters, see “Session Parameters” on page 495.

Session Log File Directory. Location of the log file. Enter a valid directory local to the PowerCenter Server. By default, the PowerCenter Server creates session logs in the directory configured for the $PMSessionLogDir server variable.

Save Session Log By. If you select Save Session Log by Timestamp, the PowerCenter Server saves all session logs, appending a timestamp to each log. If you select Save Session Log by Runs, the PowerCenter Server saves a designated number of session logs. Configure the number of session logs in the Save Session Log for These Runs option. You can also use the $PMSessionLogCount server variable to save the configured number of session logs for the PowerCenter Server.

Save Session Log for These Runs. The number of historical session logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent session log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent session log plus historical logs 0 to 4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent session log.
None. The PowerCenter Server uses the tracing level set in the mapping.

Terse. The PowerCenter Server logs initialization information as well as error messages and notification of rejected data.

Normal. The PowerCenter Server logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.

Verbose Initialization. In addition to normal tracing, the PowerCenter Server logs additional initialization details, names of index and data files used, and detailed transformation statistics.

Verbose Data. In addition to verbose initialization tracing, the PowerCenter Server logs each row that passes into the mapping. Also notes where the PowerCenter Server truncates string data to fit the precision of a column, and provides detailed transformation statistics. When you configure the tracing level to Verbose Data, the PowerCenter Server writes row data for all rows in a block when it processes a transformation.
You can also enter tracing levels for individual transformations in the mapping. When you
enter a tracing level in the session properties, you override tracing levels configured for
transformations in the mapping.
2. Select a tracing level from the Override Tracing list. Table 16-4 on page 473 describes the
session log tracing levels.
3. Click OK to save the session.
To use the Workflow Monitor to view the most recent session log:
1. In the Navigator window, connect to the server on which the workflow runs.
2. Open the folder that contains the workflow.
3. Open the workflow that contains the session whose log you wish to view.
4. Right-click the session and choose Get Session Log.
If you save session logs by timestamp, you can also use the Workflow Monitor to view past
session logs. To do this, right-click the session in the Gantt chart view and choose Get Session
Log.
For more information about the Workflow Monitor, see “Using the Workflow Monitor” on
page 404.
When you run a session that contains multiple partitions, the PowerCenter Server creates a
separate reject file for each partition.
0,D,1922,D,Page,D,Ian,D,415-541-5145,D
0,D,1928,D,De Souza,D,Leo,D,415-541-5145,D
0,D,2001,D,S. MacDonald,D,Ira,D,415-541-5145,D
Row Indicators
The first column in the reject file is the row indicator. The number listed as the row indicator
tells the writer what to do with the row of data.
Table 16-5 describes the row indicators in a reject file:

0. Insert. Rejected by the writer or target database.
1. Update. Rejected by the writer or target database.
2. Delete. Rejected by the writer or target database.
3. Reject. Rejected by the writer.

If a row indicator is 3, the writer rejected the row because an update strategy expression marked it for reject.
If a row indicator is 0, 1, or 2, either the writer or the target database rejected the row. To
narrow down the reason why rows marked 0, 1, or 2 were rejected, review the column
indicators and consult the session log.
Column Indicators
After the row indicator is a column indicator, followed by the first column of data, and
another column indicator. Column indicators appear after every column of data and define
the type of the data preceding it.
D. Valid data. Good data. The writer passes it to the target database. The target accepts it unless a database error occurs, such as finding a duplicate key.

O. Overflow. Numeric data exceeded the specified precision or scale for the column. Bad data, if you configured the mapping target to reject overflow or truncated data.

N. Null. The column contains a null value. Good data. The writer passes it to the target, which rejects it if the target database does not accept null values.

T. Truncated. String data exceeded a specified precision for the column, so the PowerCenter Server truncated it. Bad data, if you configured the mapping target to reject overflow or truncated data.
Null columns appear in the reject file with commas marking their column. An example of a
null column surrounded by good data appears as follows:
5,D,,N,5,D
Because either the writer or target database can reject a row, and because they can reject the
row for a number of reasons, you need to evaluate the row carefully and consult the session
log to determine the cause for reject.
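To make the layout concrete, here is a small Python sketch of how a reject-file line can be split back into its row indicator and (value, indicator) pairs. It is illustrative only: it assumes the default comma delimiter and that the data itself contains no commas, which a robust parser could not assume.

```python
ROW_INDICATORS = {"0": "Insert", "1": "Update", "2": "Delete", "3": "Reject"}
COLUMN_INDICATORS = {"D": "valid", "O": "overflow", "N": "null", "T": "truncated"}

def parse_reject_line(line):
    """Split one reject-file line into its row indicator and column values.

    Naive sketch: assumes comma-delimited fields with no embedded commas.
    The layout follows the sample rows above: the row indicator (with its
    own column indicator) comes first, then each data value followed by
    its column indicator.
    """
    fields = line.rstrip("\n").split(",")
    row_indicator = ROW_INDICATORS.get(fields[0], "unknown")
    # fields[1] is the indicator for the row-indicator column itself;
    # the remaining fields alternate between data value and indicator.
    columns = [(fields[i], COLUMN_INDICATORS.get(fields[i + 1], "?"))
               for i in range(2, len(fields) - 1, 2)]
    return row_indicator, columns
```

Applied to the sample line 0,D,1922,D,Page,D,Ian,D,415-541-5145,D, this yields a row marked for insert with four valid columns.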
Overview
When you configure a session, you can choose to log row errors in a central location. When a
row error occurs, the PowerCenter Server logs error information that allows you to determine
the cause and source of the error. The PowerCenter Server logs information such as source
name, row ID, current row data, transformation, timestamp, error code, error message,
repository name, folder name, session name, and mapping information.
You can log row errors into relational tables or flat files. When you enable error logging, the
PowerCenter Server creates the error tables or an error log file the first time it runs the session.
Error logs are cumulative. If the error logs exist, the PowerCenter Server appends error data to
the existing error logs.
You can choose to log source row data. Source row data includes row data, source row ID, and
source row type from the source qualifier where an error occurs. The PowerCenter Server
cannot identify the row in the source qualifier that contains an error if the error occurs after a
non pass-through partition point with more than one partition or one of the following active
sources:
♦ Aggregator
♦ Custom, configured as an active transformation
♦ Joiner
♦ Normalizer (pipeline)
♦ Rank
♦ Sorter
By default, the PowerCenter Server logs transformation errors in the session log and reject
rows in the reject file. When you enable error logging, the PowerCenter Server does not
generate a reject file or write dropped rows to the session log. Without a reject file, the
PowerCenter Server does not log Transaction Control transformation rollback or commit
errors. If you want to write rows to the session log in addition to the row error log, you can
enable verbose data tracing.
Note: When you log row errors, session performance may decrease because the PowerCenter
Server processes one row at a time instead of a block of rows at once.
PMERR_DATA
When the PowerCenter Server encounters a row error, it inserts an entry into the
PMERR_DATA table. This table stores data and metadata about a transformation row error
and its corresponding source row.
Table 17-1 describes the structure of the PMERR_DATA table:

Column Name   Datatype   Description
WORKLET_RUN_ID Integer A unique identifier for the worklet. If a session is not part of
a worklet, this value is “0”.
TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.
TRANS_ROW_ID Integer Specifies the row ID generated by the last active source.
TRANS_ROW_DATA Long Varchar Delimited string containing all column data, including the
column indicator. Column indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column
indicator is a colon ( : ). The delimiter between the columns
is a pipe ( | ). You can override the column delimiter in the
error handling settings.
This value can span multiple rows. When the data exceeds
2000 bytes, the PowerCenter Server creates a new row.
The line number for each row error entry is stored in the
LINE_NO column.
SOURCE_ROW_ID Integer Value that the source qualifier assigns to each row it
reads. If the PowerCenter Server cannot identify the row,
the value is -1.
SOURCE_ROW_TYPE Integer The row indicator that tells whether the row was marked
for insert, update, delete, or reject.
0 - Insert
1 - Update
2 - Delete
3 - Reject
SOURCE_ROW_DATA Long Varchar Delimited string containing all column data, including the
column indicator. Column indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column
indicator is a colon ( : ). The delimiter between the columns
is a pipe ( | ). You can override the column delimiter in the
error handling settings.
This value can span multiple rows. When the data exceeds
2000 bytes, the PowerCenter Server creates a new row.
The line number for each row error entry is stored in the
LINE_NO column.
LINE_NO Integer Specifies the line number for each row error entry in
SOURCE_ROW_DATA and TRANS_ROW_DATA that
spans multiple rows.
Informatica recommends using the fields in bold to join tables.
PMERR_MSG
When the PowerCenter Server encounters a row error, it inserts an entry into the
PMERR_MSG table. This table stores metadata about the error and the error message.
Table 17-2 describes the structure of the PMERR_MSG table:

Column Name   Datatype   Description
WORKLET_RUN_ID Integer A unique identifier for the worklet. If a session is not part
of a worklet, this value is “0”.
TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.
TRANS_ROW_ID Integer Specifies the row ID generated by the last active source.
ERROR_SEQ_NUM Integer Counter for the number of errors per row in each
transformation group. If a session has multiple partitions,
the PowerCenter Server maintains this counter for each
partition.
For example, if a transformation generates three errors in
partition 1 and two errors in partition 2,
ERROR_SEQ_NUM generates the values 1, 2, and 3 for
partition 1, and values 1 and 2 for partition 2.
ERROR_MSG Long Varchar Error message, which can span multiple rows. When the
data exceeds 2000 bytes, the PowerCenter Server
creates a new row. The line number for each row error
entry is stored in the LINE_NO column.
ERROR_TYPE Integer The type of error that occurred. The PowerCenter Server
uses the following values:
1 - Reader error
2 - Writer error
3 - Transformation error
LINE_NO Integer Specifies the line number for each row error entry in
ERROR_MSG that spans multiple rows.
Informatica recommends using the fields in bold to join tables.
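The join advice above relies on bold formatting that does not survive in plain text. A plausible reading, sketched below with Python's sqlite3 module against cut-down versions of the tables, is to join PMERR_DATA and PMERR_MSG on the columns that identify the failing row (here WORKLET_RUN_ID, TRANS_GROUP, and TRANS_ROW_ID; the real tables carry additional identifying columns, so treat this key choice as an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Cut-down versions of PMERR_DATA and PMERR_MSG with only the columns
# used in this sketch; the real tables contain more columns.
cur.execute("""CREATE TABLE PMERR_DATA (
    WORKLET_RUN_ID INTEGER, TRANS_GROUP TEXT, TRANS_ROW_ID INTEGER,
    TRANS_ROW_DATA TEXT, SOURCE_ROW_TYPE INTEGER)""")
cur.execute("""CREATE TABLE PMERR_MSG (
    WORKLET_RUN_ID INTEGER, TRANS_GROUP TEXT, TRANS_ROW_ID INTEGER,
    ERROR_SEQ_NUM INTEGER, ERROR_MSG TEXT, ERROR_TYPE INTEGER)""")
cur.execute("INSERT INTO PMERR_DATA VALUES (0, 'input', 1, 'D:1221|N:', 0)")
cur.execute("INSERT INTO PMERR_MSG VALUES (0, 'input', 1, 1, "
            "'NULL detected on input', 3)")

# Pair each logged bad row with its error message and error type.
rows = cur.execute("""
    SELECT d.TRANS_ROW_DATA, m.ERROR_MSG, m.ERROR_TYPE
    FROM PMERR_DATA d
    JOIN PMERR_MSG m
      ON d.WORKLET_RUN_ID = m.WORKLET_RUN_ID
     AND d.TRANS_GROUP = m.TRANS_GROUP
     AND d.TRANS_ROW_ID = m.TRANS_ROW_ID
""").fetchall()
```

With the sample inserts above, the join returns the row data together with its transformation error message (ERROR_TYPE 3).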
PMERR_SESS
When you choose relational database error logging, the PowerCenter Server inserts entries
into the PMERR_SESS table. This table stores metadata about the session where an error
occurred.
Table 17-3 describes the structure of the PMERR_SESS table:

Column Name   Datatype   Description
WORKLET_RUN_ID Integer A unique identifier for the worklet. If a session is not part of a
worklet, this value is “0”.
SESS_START_UTC_TIME Integer The Coordinated Universal Time (also known as Greenwich
Mean Time) when the session starts.
FOLDER_NAME Varchar Specifies the folder where the mapping and session are located.
WORKFLOW_NAME Varchar Specifies the workflow that runs the session being logged.
TASK_INST_PATH Varchar Fully qualified session name that can span multiple rows. The
PowerCenter Server creates a new line for the session name. The
PowerCenter Server also creates a new line for each worklet in the
qualified session name. For example, you have a session named
WL1.WL2.S1. Each component of the name appears on a new line:
WL1
WL2
S1
The PowerCenter Server writes the line number in the LINE_NO
column.
LINE_NO Integer Specifies the line number for each row error entry in
TASK_INST_PATH that spans multiple rows.
Informatica recommends using the fields in bold to join tables.
PMERR_TRANS
When the PowerCenter Server encounters a transformation error, it inserts an entry into the
PMERR_TRANS table. This table stores metadata, such as the name and datatype of the
source and transformation ports.
Table 17-4 describes the structure of the PMERR_TRANS table:

Column Name   Datatype   Description
TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.
TRANS_ATTR Varchar Lists the port names and datatypes of the input or
output group where the error occurred. Port name and
datatype pairs are separated by commas, for example:
portname1:datatype, portname2:datatype.
SOURCE_NAME Varchar Name of the source qualifier. N/A appears when a row
error occurs downstream of an active source that is not
a source qualifier or a non pass-through partition point
with more than one partition. For a list of active sources
that can affect row error logging, see “Overview” on
page 482.
LINE_NO Integer Specifies the line number for each row error entry in
TRANS_ATTR and SOURCE_ATTR that spans multiple
rows.
Informatica recommends using the fields in bold to join tables.
An error log file contains the following sections:
♦ Session header. Contains session run information. Information in the session header is
similar to the information stored in the PMERR_SESS table.
♦ Column header. Contains data column names.
♦ Column data. Contains actual row data and error message information.
The following sample error log file contains a session header, column header, and column
data:
**********************************************************************
Repository: CustomerInfo
Folder: Row_Error_Logging
Workflow: wf_basic_REL_errors_AGG_case
Session: s_m_basic_REL_errors_AGG_case
Mapping: m_basic_REL_errors_AGG_case
**********************************************************************
agg_REL_basic||N/A||Input||1||1||1||08/03/2004
16:57:03||1067126223||11019||Port [CUST_ID_NULL]: Default value is:
ERROR(<<Expression Error>> [ERROR]: [AGG] CUST_ID - NULL detected on
input.\n... nl:ERROR(s:'[AGG] CUST_ID - NULL detected on
input.')).||3||D:1221|N:|N:|N:|D:Kauai Dive Shoppe|D:4-976 Sugarloaf
Hwy|D:Kapaa Kauai|D:HI|D:94766|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||1||0||D:1221|D:Kauai
Dive Shoppe|D:4-976 Sugarloaf Hwy|D:Kapaa Kauai|D:HI|D:94766
agg_REL_basic||N/A||Input||1||4||1||08/03/2004
16:57:03||1067126223||11019||Port [CITY_IN]: Default value is:
ERROR(<<Expression Error>> [ERROR]: [AGG] Null detected for City_IN.\n...
nl:ERROR(s:'[AGG] Null detected for
City_IN.')).||3||D:1354|N:|N:|D:1354|T:Cayman Divers World|D:PO Box
541|N:|D:Gr|N:|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||4||0||D:1354|D:Cayman
Divers World Unlim|D:PO Box 541|N:|D:Gr|N:
agg_REL_basic||N/A||Input||1||5||1||08/03/2004
16:57:03||1067126223||11131||Transformation [agg_REL_basic] had an error
evaluating variable column [Var_Divide_by_Price]. Error message is
[<<Expression Error>> [/]: divisor is zero\n... f:(f:2 / f:(f:1 -
f:TO_FLOAT(i:1)))].||3||D:1356|N:|N:|D:1356|T:Tom Sawyer Diving C|T:632-1
Third Frydenh|D:Christiansted|D:St|D:00820|D:[AGG] DEFAULT SID
VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||5||0||D:1356|D:Tom
Sawyer Diving Centre|D:632-1 Third Frydenho|D:Christiansted|D:St|D:00820
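In the row data fields above (for example, D:1221|N:|D:Kauai Dive Shoppe), each column appears as an indicator:value pair. The following is a small illustrative parser, assuming the default delimiters and that neither delimiter occurs inside the data itself:

```python
def parse_column_data(s, col_delim="|", ind_delim=":"):
    """Split an error-log row data string into (indicator, value) pairs.

    Illustrative only: assumes the default delimiters (| between columns,
    : between indicator and value) never appear inside the data.
    """
    pairs = []
    for col in s.split(col_delim):
        indicator, _, value = col.partition(ind_delim)
        pairs.append((indicator, value))
    return pairs
```

For D:1221|N:|D:Kauai Dive Shoppe, this yields a valid 1221, a null column, and a valid Kauai Dive Shoppe.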
The following fields appear in the error log file:

Transformation Name of the transformation in the mapping where an error occurred.
Transformation Mapplet Name Name of the mapplet that contains the transformation. N/A appears when this
information is not available.
Transformation Group Name of the input or output group where an error occurred. Defaults to either “input”
or “output” if the transformation does not have a group.
Partition Index Specifies the partition number of the transformation partition where an error
occurred.
Error Sequence Counter for the number of errors per row in each transformation group. If a session
has multiple partitions, the PowerCenter Server maintains this counter for each
partition.
For example, if a transformation generates three errors in partition 1 and two errors
in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1,
and values 1 and 2 for partition 2.
Error Timestamp Timestamp of the PowerCenter Server when the error occurred.
Error UTC Time The Coordinated Universal Time, also known as Greenwich Mean Time, when the
error occurred.
Error Code The error code that corresponds to the error message.
Error Type The type of error that occurred. The PowerCenter Server uses the following values:
1 - Reader error
2 - Writer error
3 - Transformation error
Transformation Data Delimited string containing all column data, including the column indicator. Column
indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column indicator is a colon ( : ). The
delimiter between the columns is a pipe ( | ). You can override the column delimiter
in the error handling settings.
The PowerCenter Server converts all column data to text string in the error file. For
binary data, the PowerCenter Server uses only the column indicator.
Source Name Name of the source qualifier. N/A appears when a row error occurs downstream of
an active source that is not a source qualifier or a non pass-through partition point
with more than one partition. For a list of active sources that can affect row error
logging, see “Overview” on page 482.
Source Row ID Value that the source qualifier assigns to each row it reads. If the PowerCenter
Server cannot identify the row, the value is -1.
Source Row Type The row indicator that tells whether the row was marked for insert, update, delete, or
reject.
0 - Insert
1 - Update
2 - Delete
3 - Reject
Source Data Delimited string containing all column data, including the column indicator. Column
indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column indicator is a colon ( : ). The
delimiter between the columns is a pipe ( | ). You can override the column delimiter
in the error handling settings.
The PowerCenter Server converts all column data to text string in the error table or
error file. For binary data, the PowerCenter Server uses only the column indicator.
Error Log Options
The following error log options are available. Each option is required or optional as noted:

Error Log Type (Required). Specifies the type of error log to create. You can specify relational database, flat file, or no log. By default, the PowerCenter Server does not create an error log.
Error Log DB Connection (Required for relational database logging). Specifies the database connection for a relational log.
Error Log Table Name Prefix (Optional). Specifies the table name prefix for relational logs. The PowerCenter Server appends 11 characters to the prefix name. Oracle and Sybase have a 30-character limit for table names. If a table name exceeds 30 characters, the session fails.
Error Log File Directory (Required for flat file logging). Specifies the directory where errors are logged. By default, the error log file directory is $PMBadFilesDir\.
Error Log File Name (Required for flat file logging). Specifies the error log file name. The character limit for the error log file name is 255. By default, the error log file name is PMError.log.
Log Row Data (Optional). Specifies whether or not to log transformation row data. By default, the PowerCenter Server logs transformation row data. If you disable this property, N/A or -1 appears in transformation row data fields.
Log Source Row Data (Optional). Specifies whether or not to log source row data. If you choose not to log source row data, or if source row data is unavailable, the PowerCenter Server writes an indicator such as N/A or -1, depending on the column datatype. If you do not need to capture source row data, consider disabling this option to increase PowerCenter Server performance.
Data Column Delimiter (Required). Delimiter for string type source row data and transformation group row data. By default, the PowerCenter Server uses a pipe ( | ) delimiter. Verify that you do not use the same delimiter for the row data as for the error logging columns. If you use the same delimiter, you may find it difficult to read the error log file.
4. Click OK.
Session Parameters
Overview
Session parameters, like mapping parameters, represent values you might want to change
between sessions, such as a database connection or source file. Use session parameters in the
session properties, and then define the parameters in a parameter file. You can specify the
parameter file for the session to use in the session properties. You can also specify it when you
use pmcmd to start the session.
The Workflow Manager provides one built-in session parameter, $PMSessionLogFile. With
$PMSessionLogFile, you can change the name of the session log generated for the session.
The Workflow Manager also allows you to create user-defined session parameters.
Table 18-1 describes required naming conventions for the session parameters you can define:

Parameter Type        Naming Convention
Database Connection   $DBConnectionName
Source File           $InputFileName
Target File           $OutputFileName
Lookup File           $LookupFileName
Reject File           $BadFileName
Session Log File      $PMSessionLogFile
Use session parameters to make sessions more flexible. For example, you have the same type of
transactional data written to two different databases, and you use the database connections
TransDB1 and TransDB2 to connect to the databases. You want to use the same mapping for
both tables. Instead of creating two sessions for the same mapping, you can create a database
connection parameter, $DBConnectionSource, and use it as the source database connection
for the session. When you create a parameter file for the session, you set
$DBConnectionSource to TransDB1 and run the session. After the session completes, you set
$DBConnectionSource to TransDB2 and run the session again.
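As a sketch of this pattern, the first run's parameter file might look as follows (the folder and session names here are hypothetical):

```
[Production.s_TransactionLoad]
$DBConnectionSource=TransDB1
```

Before the second run, you would edit the file (or point the session at a second parameter file) so that it reads $DBConnectionSource=TransDB2.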
You might use several session parameters together to make session management easier. For
example, you might use source file and database connection parameters to configure a session
to read data from different source files and write the results to different target databases. You
can then use reject file parameters to write the session reject files to the target machine. You
can use the session log parameter, $PMSessionLogFile, to write to different session logs in the
target machine, as well.
When you use session parameters, you must define the parameters in the parameter file.
Session parameters do not have default values. When the PowerCenter Server cannot find a
value for a session parameter, it fails to initialize the session.
For example, in a session, you leave Session Log File Directory set to its default value, the
$PMSessionLogDir server variable. For Session Log File Name, you enter the session
parameter $PMSessionLogFile. In the parameter file, you set $PMSessionLogFile to
“TestRun.txt”. When you registered the PowerCenter Server, you defined $PMSessionLogDir
as C:/Program Files/Informatica/PowerCenter Server/SessLogs. When the PowerCenter Server
runs the session, it creates the session log as TestRun.txt in that directory.
To use the session log parameter, complete the following steps:
1. In the session properties, click the General Options settings of the Properties tab.
2. Enter $PMSessionLogFile in the Session Log File field.
3. If you want $PMSessionLogFile to represent both the session log name and directory,
clear the Session Log File Directory field.
4. Enter a parameter file and directory in the Parameter File Name field.
5. Click OK.
Before you run the session, create the parameter file in the specified directory and define
$PMSessionLogFile. For details, see “Parameter Files” on page 511.
1. In the session properties, click the Mapping tab (Transformation view) and click
Connections settings for the sources or targets node.
On this tab, you can enter session parameters for the source file, target file, lookup file, and reject file directories and filenames, and then define each parameter in the parameter file.
Parameter Files
Overview
You can use a parameter file to define the values for parameters and variables used in a
workflow, worklet, or session. You can create a parameter file using a text editor such as
WordPad or Notepad. You list the parameters or variables and their values in the parameter
file. Parameter files can contain the following types of parameters and variables:
♦ Workflow variables
♦ Worklet variables
♦ Session parameters
♦ Mapping parameters and variables
When you use parameters or variables in a workflow, worklet, or session, the PowerCenter
Server checks the parameter file to determine the start value of the parameter or variable. You
can use a parameter file to initialize workflow variables, worklet variables, mapping
parameters, and mapping variables. If you do not define start values for these parameters and
variables, the PowerCenter Server checks for the start value of the parameter or variable in
other places. For more information, see “Using Workflow Variables” on page 103 and
“Mapping Parameters and Variables” in the Designer Guide.
You can place parameter files on the PowerCenter Server machine or on a local machine. Use
a local parameter file if you do not have access to parameter files on the PowerCenter Server
machine. When you use a local parameter file, pmcmd passes variables and values in the file to
the PowerCenter Server. Local parameter files are used with the startworkflow pmcmd
command. For more information, see “pmcmd Reference” on page 594.
You must define session parameters in a parameter file. Since session parameters do not have
default values, when the PowerCenter Server cannot locate the value of a session parameter in
the parameter file, it fails to initialize the session.
You can include parameter or variable information for more than one workflow, worklet, or
session in a single parameter file by creating separate sections for each object within the
parameter file.
You can also create multiple parameter files for a single workflow, worklet, or session and
change the file that these tasks use as needed. To specify the parameter file the PowerCenter
Server uses with a workflow, worklet, or session, you can do either of the following:
♦ Enter the parameter file name and directory in the workflow, worklet, or session
properties.
♦ Start the workflow, worklet, or session using pmcmd and enter the parameter filename and
directory in the command line. For details, see “Using pmcmd” on page 581.
If you enter a parameter file name and directory in both the workflow, worklet, or session
properties and in the pmcmd command line, the PowerCenter Server uses the information
you enter in the pmcmd command line.
♦ Worklet variables:
[folder name.WF:workflow name.WT:worklet name]
♦ Session parameters, mapping parameters, and mapping variables:
[folder name.session name]
or
[session name]
Below each heading, you define parameter and variable values as follows:
parameter name=value
parameter2 name=value
variable name=value
variable2 name=value
For example, you have a session, s_MonthlyCalculations, in the Production folder. The
session uses a string mapping parameter, $$State, that you want to set to “MA”, and a
datetime mapping variable, $$Time. $$Time already has an initial value of “9/30/2000
00:00:00” saved in the repository, but you want to override it with “10/1/2000
00:00:00”. The session also uses session parameters to connect to source files and target
databases, and to write the session log to the appropriate session log file.
Table 19-1 shows the parameters and variables that you define in the parameter file:

Parameter and Variable Type              Parameter and Variable Name   Desired Definition
String mapping parameter                 $$State                       MA
Datetime mapping variable                $$Time                        10/1/2000 00:00:00
Source file (session parameter)          $InputFile1                   sales.txt
Database connection (session parameter)  $DBConnection_target          sales
Session log file (session parameter)     $PMSessionLogFile             D:/session logs/firstrun.txt
The parameter file for the session includes the folder and session name, as well as each
parameter and variable:
[Production.s_MonthlyCalculations]
$$State=MA
$$Time=10/1/2000 00:00:00
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt
The next time you run the session, you might edit the parameter file to change the state to
MD and delete the $$Time variable. This allows the PowerCenter Server to use the value for
the variable that was set in the previous session run.
parameter_name=value
variable_name=value
mapplet_name.parameter_name=value
[folder2_name.session_name]
parameter_name=value
variable_name=value
mapplet_name.parameter_name=value
♦ Specify headings in any order. You can place headings in any order in the parameter file.
However, if you define the same parameter or variable more than once in the file, the
PowerCenter Server assigns the parameter or variable value using the first instance of the
parameter or variable.
♦ Specify parameters and variables in any order. Below each heading, you can specify the
parameters and variables in any order.
♦ When defining parameter values, do not use unnecessary line breaks or spaces. The
PowerCenter Server might interpret additional spaces as part of the value.
♦ List all necessary mapping parameters and variables. Values entered for mapping
parameters and variables become the start value for parameters and variables in a mapping.
Mapping parameter and variable names are not case sensitive.
♦ List all session parameters. Session parameters do not have default values. An undefined
session parameter can cause the session to fail. Session parameter names are not case-
sensitive.
♦ Use correct date formats for datetime values. When entering datetime values, use the
following date formats:
− MM/DD/RR
− MM/DD/RR HH24:MI:SS
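Reading RR as a two-digit year and HH24 as a 24-hour clock (an interpretation of the notation, not something the guide spells out), these formats correspond to the following Python strptime patterns:

```python
from datetime import datetime

# MM/DD/RR with an optional HH24:MI:SS time portion, reading RR as a
# two-digit year (%y) and HH24 as a 24-hour hour field (%H). This mapping
# is an assumption for illustration.
ts = datetime.strptime("10/01/00 00:00:00", "%m/%d/%y %H:%M:%S")
d = datetime.strptime("10/01/00", "%m/%d/%y")
```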
mapplet2_name.variable_name=value
$$platform=unix
[HET_TGTS.WF:wf_TGTS_ASC_ORDR.ST:s_TGTS_ASC_ORDR]
$$platform=unix
$DBConnection_ora=qasrvrk2_hp817
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
$$DT_WL_lvl_1=02/01/2000 00:00:00
$$Double_WL_lvl_1=2.2
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1.WT:NWL_PARAM_Lvl_2]
$$DT_WL_lvl_2=03/01/2000 00:00:00
$$Int_WL_lvl_2=3
$$String_WL_lvl_2=ccccc
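The structure of a file like the one above — bracketed headings, each followed by name=value lines — can be sketched with a toy parser. This is illustrative only, not the PowerCenter Server's own parser; it assumes everything after the first equals sign is the value:

```python
def parse_parameter_file(text):
    """Group parameter and variable assignments under their [heading] lines.

    Toy sketch, not the server's parser. Everything after the first "="
    is taken as the value, and the first definition of a name wins,
    matching the guide's note that the server uses the first instance
    of a duplicated parameter or variable.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            name, _, value = line.partition("=")
            sections[current].setdefault(name, value)
    return sections

sample = """[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
$$DT_WL_lvl_1=02/01/2000 00:00:00
$$Double_WL_lvl_1=2.2"""
parsed = parse_parameter_file(sample)
```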
1. Select Workflows-Edit.
2. Click the Properties tab.
3. Enter the parameter directory and name in the Parameter Filename field.
You can enter either a direct path or a server variable directory. Use the appropriate
delimiter for the PowerCenter Server operating system.
4. Click OK.
1. Click the Properties tab and open the General Options settings.
2. Enter the parameter directory and name in the Parameter Filename field.
3. Click OK.
I am trying to use a source file parameter to specify a source file and location, but the
PowerCenter Server cannot find the source file.
Make sure to clear the source file directory in the session properties. The PowerCenter Server
concatenates the source file directory with the source file name to locate the source file.
Also, make sure to enter a directory local to the PowerCenter Server and to use the
appropriate delimiter for the operating system.
I am trying to run a workflow with a parameter file and one of the sessions keeps failing.
The session might contain a parameter that is not listed in the parameter file. The
PowerCenter Server uses the parameter file to start all sessions in the workflow. Check the
session properties, then verify that all session parameters are defined correctly in the
parameter file.
Use pmcmd and multiple parameter files for sessions with regular cycles.
Some sessions change parameter values on a regular cycle and reuse the same values each
time. If you run a session against both the sales and marketing databases once a week, you
might want to create a separate parameter file for each regular session run. Then, instead of
changing the parameter file in the session properties each time you run the session, use pmcmd
to specify the parameter file to use when you start the session.
Chapter 20: External Loading
Overview
You can configure a session to use DB2, Oracle, Sybase IQ, and Teradata external loaders to
load session target files into the respective databases. External loaders can increase session
performance because these databases can load data directly from files faster than they can
run SQL commands to insert the same data into the database.
To use an external loader for a session, you must perform the following tasks:
1. Create an external loader connection in the Workflow Manager and configure the
external loader attributes. For details on creating external loader connections, see
“Creating an External Loader Connection” on page 551.
2. Configure the session to write to flat file instead of to a relational database. For more
information, see “Configuring a Session to Write to a File” on page 553.
3. Choose an external loader connection for each target file in the session properties. For
more information, see “Selecting an External Loader Connection” on page 555.
When you run a session that uses an external loader, the PowerCenter Server creates a control
file and a target flat file. The control file contains information about the target flat file such as
data format and loading instructions for the external loader. The control file has an extension
of .ctl. You can view the control file and the target flat file in the target file directory (default:
$PMTargetFileDir).
The PowerCenter Server waits for all external loading to complete before it performs post-
session commands, runs external procedures, and sends post-session email.
Before you run external loaders, consider the following issues:
♦ Disable constraints. Normally, you disable constraints built into the tables receiving the
data before performing the load. Consult your database documentation for instructions on
how to disable constraints.
♦ Performance issues. To preserve high performance, you can increase commit intervals and
turn off database logging. However, to perform database recovery on failed sessions, you
must have database logging turned on.
♦ Code page requirements. DB2, Oracle, Sybase IQ, and Teradata database servers must run
in the same code page as the target flat file code page. The external loaders start in the
target flat file code page. The PowerCenter Server creates the control and target flat files
using the target flat file code page. If you are using a code page other than 7-bit ASCII for
the target flat file, run the PowerCenter Server in Unicode data movement mode.
The PowerCenter Server can use multiple external loaders within one session. For example, if
the mapping contains two targets, you can create a session that uses different connection
types: one uses an Oracle external loader connection and the other uses a Sybase IQ external
loader connection.
The DB2 EE external loader connection has the following attributes:

Opmode (default: Insert). The DB2 external loader operation mode. Choose one of the following operation modes:
- Insert
- Replace
- Restart
- Terminate
For more information about DB2 operation modes, see “Setting DB2 External Loader Operation Modes” on page 528.
External Loader Executable (default: db2load). The name of the DB2 EE external loader executable file.
DB2 Server Location (default: Remote). The location of the DB2 EE database server relative to the PowerCenter Server. Select Local if the DB2 EE database server resides on the PowerCenter Server machine. Select Remote if the DB2 EE server resides on another machine.
Is Staged (default: Disabled). The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see “Loading Data Using Named Pipes” on page 526 or “Staging Data to Flat Files” on page 526.
Recoverable (default: Enabled). Sets tablespaces in backup pending state if forward recovery is enabled. If you disable forward recovery, the DB2 tablespace is not set to backup pending state. If the DB2 tablespace is in backup pending state, you must fully back up the database before you perform any other operation on the tablespace.
Any other return code indicates that the load operation failed. The PowerCenter Server writes
the following error message to the session log:
WRT_8047 Error: External loader process <external loader name> exited with
error <return code>.
Table 20-3 describes the return codes for the DB2 EE external loader:
Code Description
2 The external loader could not open the external loader log file.
3 The external loader could not access the control file because the control file is locked by another process.
Attribute      Default Value      Description
Opmode Insert The DB2 external loader operation mode. Choose one of the following
operation modes:
- Insert
- Replace
- Restart
- Terminate
For more information about DB2 operation modes, see “Setting DB2 External
Loader Operation Modes” on page 528.
External Loader Executable db2atld The name of the DB2 EEE external loader executable file.
Split File Location n/a The location of the split files. The external loader creates split files if you
configure SPLIT_ONLY loading mode.
Output Nodes n/a The database partitions on which the load operation is to be performed.
Attribute      Default Value      Description
Split Nodes n/a The database partitions that determine how to split the data. If you do not
specify this attribute, the external loader automatically determines an optimal
splitting method.
Mode Split and load The loading mode the external loader uses to load the data. Choose one of the following loading modes:
- Split and load
- Split only
- Load only
- Analyze
Force No Forces the external loader operation to continue even if it determines at startup
time that some target partitions or tablespaces are offline.
Status Interval 100 Number of megabytes of data the external loader loads before writing a
progress message to the external loader log. You can specify a value between
1 and 4,000 MB.
Ports 6000-6063 The range of TCP ports the external loader uses to create sockets for internal
communications with the DB2 server.
Check Level Nocheck Specifies whether the external loader should check for record truncation during
input or output.
Map File Input n/a The name of the file that specifies the partitioning map. If you want to use a
customized partitioning map, you must specify this attribute. You can generate
a customized partitioning map when you run the external loader in Analyze
loading mode.
Map File Output n/a The name of the partitioning map when you run the external loader in Analyze
loading mode. You must specify this attribute if you want to run the external
loader in Analyze loading mode.
Trace 0 The number of rows the external loader traces when you need to review a
dump of the data conversion process and output of hashing values.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging
area before loading to the database. Otherwise, the data is loaded to the
database using a named pipe. For more information, see “Loading Data Using
Named Pipes” on page 526 or “Staging Data to Flat Files” on page 526.
Date Format mm/dd/yyyy The date format. The date format in the Connection Object definition must match the date format you define in the target definition. DB2 supports the following date formats:
- mm/dd/yyyy
- yyyy-mm-dd
- dd.mm.yyyy
Error Limit 1 Number of errors to allow before the external loader stops the load
operation.
Load Mode Append The loading mode the external loader uses to load data. Choose one of the following loading modes:
- Append
- Insert
- Replace
- Truncate
Load Method Use Conventional Path The method the external loader uses to load data. Choose one of the following load methods:
- Use Conventional Path
- Use Direct Path (Recoverable)
- Use Direct Path (Unrecoverable)
Enable Parallel Load Enable Parallel Load Determines whether the Oracle external loader loads data in parallel to a partitioned Oracle target table. Choose either Enable Parallel Load or Do Not Enable Parallel Load.
You can create multiple partitions in a session if you use a loader
configured to enable parallel load. Sessions with multiple partitions fail if
you use a loader configured not to enable parallel load. For more
information, see “Partitioning Sessions with External Loaders” on
page 526.
Rows Per Commit 10000 For Conventional Path load method, this attribute specifies the number
of rows in the bind array for load operations. For Direct Path load
methods, this attribute specifies the number of rows the external loader
reads from the target flat file before it saves the data to the database.
External Loader Executable sqlload The name of the external loader executable file.
Log File Name n/a The path and name of the external loader log file.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file
staging area before loading to the database. Otherwise, the data is
loaded to the database using a named pipe. For more information, see
“Loading Data Using Named Pipes” on page 526 or “Staging Data to Flat
Files” on page 526.
Reject File
The Oracle external loader creates a reject file for data rejected by the database. The reject file
has an extension of .ldrreject. The loader saves the reject file in the target files directory
(default location: $PMTargetFileDir).
♦ When you create a Sybase IQ external loader connection, the Workflow Manager sets the
name of the external loader executable file to dbisql by default. If you use an executable file
with a different name, for example, dbisqlc, you must update the External Loader
Executable field. If the external loader executable file directory is not in the system path,
you must enter the file path and file name in this field.
Table 20-6 describes the attributes for Sybase IQ external loader connections:
Attribute      Default Value      Description
Block Factor 10000 The number of records per block in the target Sybase table. The external
loader applies the Block Factor attribute to load operations for fixed-
width flat file targets only.
Block Size 50000 The size of blocks used in Sybase database operations. The external
loader applies the Block Size attribute to load operations for delimited
flat file targets only.
Attribute      Default Value      Description
Notify Interval 1000 The number of rows the Sybase IQ external loader loads before it writes
a status message to the external loader log.
Server Datafile Directory n/a The location of the flat file target. You must specify this attribute relative
to the database server installation directory. Enter the target file directory
path using the syntax for the machine hosting the database server
installation. For example, if the PowerCenter Server is on a Windows
machine and the Sybase IQ Server is on a UNIX machine, use UNIX
syntax.
External Loader Executable dbisql The name of the Sybase IQ external loader executable.
Is Staged Enabled The method of loading data. Select Is Staged to load data to a flat file
staging area before loading to the database. Otherwise, the data is
loaded to the database using a named pipe. For more information, see
“Loading Data Using Named Pipes” on page 526 or “Staging Data to Flat
Files” on page 526.
In the Control File Editor dialog box, click Generate to create the default control file. The
Workflow Manager creates the default control file based on the session and loader properties.
Edit the generated control file, and click OK to save your changes.
Note that if you change a target or loader connection setting after you edit the control file,
the control file does not include those changes. If you want to include those changes, you
must generate the control file again and edit it.
Note: The Workflow Manager does not validate the control file syntax. Teradata verifies the
control file syntax when you run a session. If the control file is invalid, the session fails.
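As a hedged illustration only (the database, table, logon, field, and file names below are hypothetical, and a file generated by the Workflow Manager from your session and loader properties will differ in detail), a Teradata MultiLoad control file typically has this overall shape:

```
.LOGTABLE mydb.ML_customers;
.LOGON tdpid/pc_user,pc_password;
.BEGIN IMPORT MLOAD
    TABLES mydb.customers
    ERRLIMIT 1
    CHECKPOINT 10000
    SESSIONS 1
    TENACITY 10000
    SLEEP 6;
.LAYOUT data_layout;
    .FIELD CUST_ID   * CHAR(10);
    .FIELD CUST_NAME * CHAR(30);
.DML LABEL insert_dml;
    INSERT INTO mydb.customers (CUST_ID, CUST_NAME)
        VALUES (:CUST_ID, :CUST_NAME);
.IMPORT INFILE customers.out
    LAYOUT data_layout
    APPLY insert_dml;
.END MLOAD;
.LOGOFF;
```

Because Teradata, not the Workflow Manager, validates this syntax, keep edits to the generated file minimal and rerun the session to verify them.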
Attribute      Default Value      Description
Date Format n/a The date format. The date format in the Connection Object definition must match
the date format you define in the target definition. The PowerCenter Server
supports the following date formats:
- dd/mm/yyyy
- mm/dd/yyyy
- yyyy/dd/mm
- yyyy/mm/dd
Error Limit 0 The total number of rejected records that MultiLoad can write to the MultiLoad error
tables. Uniqueness violations do not count as rejected records.
An error limit of 0 means that there is no limit on the number of rejected rows.
Checkpoint 10,000 The interval between checkpoints. You can set the interval to the following values:
- 60 or more: MultiLoad performs a checkpoint operation after it processes each
multiple of that number of records.
- 1–59: MultiLoad performs a checkpoint operation at the specified interval, in
minutes.
- 0: MultiLoad does not perform any checkpoint operations during the import task.
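The three checkpoint regimes above can be summarized in a short sketch (Python, illustrative only; this logic is a restatement of the rules, not part of PowerCenter or MultiLoad):

```python
def multiload_checkpoint_meaning(value):
    """Interpret the MultiLoad Checkpoint attribute value."""
    if value == 0:
        # No checkpoint operations during the import task.
        return "no checkpoints"
    if 1 <= value <= 59:
        # Interpreted as an interval in minutes.
        return f"checkpoint every {value} minutes"
    # 60 or more: interpreted as a record-count multiple.
    return f"checkpoint after each multiple of {value} records"
```

For example, the default of 10,000 falls in the record-count regime, not the minutes regime.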
Tenacity 10,000 Specifies how long, in hours, MultiLoad tries to log onto the required sessions. If a
logon fails, MultiLoad delays for the number of minutes specified in the Sleep
attribute, and then retries the logon. MultiLoad keeps trying until the logon
succeeds or the number of hours specified in the Tenacity attribute elapses.
Attribute      Default Value      Description
Load Mode Upsert The mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data
Driven.
When you select Data Driven loading, the PowerCenter Server follows instructions
coded in an Update Strategy or Custom transformations within the mapping to
determine how to flag rows for insert, delete, or update. The PowerCenter Server
writes a column in the target file or named pipe to indicate the update strategy. The
control file uses these values to determine how to load data to the target. The
PowerCenter Server uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete
Drop Error Tables Enabled Specifies whether to drop the MultiLoad error tables before beginning the next
session. Select this option to drop the tables, or clear it to keep them.
External Loader Executable mload The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
Max Sessions 1 The maximum number of MultiLoad sessions per MultiLoad job. Max Sessions must
be between 1 and 32,767.
Running multiple MultiLoad sessions causes the client and database to use more
resources. Therefore, setting this value to a small number may improve
performance.
Sleep 6 The number of minutes MultiLoad waits before retrying a logon. MultiLoad tries until
the logon succeeds or the number of hours specified in the Tenacity attribute
elapses.
Sleep must be greater than 0. If you specify 0, MultiLoad issues an error message
and uses the default value, 6 minutes.
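The interaction between the Tenacity and Sleep attributes can be sketched as the following retry loop (Python, illustrative only; `try_logon` is a hypothetical stand-in for the MultiLoad logon attempt):

```python
import time

def logon_with_retry(try_logon, tenacity_hours, sleep_minutes):
    # Retry the logon every `sleep_minutes` until it succeeds or
    # `tenacity_hours` have elapsed, mirroring the behavior described above.
    deadline = time.monotonic() + tenacity_hours * 3600
    while True:
        if try_logon():
            return True          # logon succeeded
        if time.monotonic() >= deadline:
            return False         # Tenacity exhausted; give up
        time.sleep(sleep_minutes * 60)
```

The same Tenacity/Sleep pattern applies, with different defaults, to TPump, FastLoad, and Warehouse Builder.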
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using
a named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Work Table Database n/a The work table database name. You can use this attribute to override the default work table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Log Table Database n/a The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Table 20-8. Teradata MultiLoad External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table 1 n/a The table name for the first error table. You can use this attribute to override
the default error table name. If you do not specify an error table name, the
PowerCenter Server uses ET_<target_table_name>.
Error Table 2 n/a The table name for the second error table. You can use this attribute to
override the default error table name. If you do not specify an error table name,
the PowerCenter Server uses UV_<target_table_name>.
Work Table n/a The work table name. You can use this attribute to override the default work
table name. If you do not specify a work table name, the PowerCenter Server
uses WT_<target_table_name>.
Log Table n/a The log table name. You can use this attribute to override the default log table
name. If you do not specify a log table name, the PowerCenter Server uses
ML_<target_table_name>.
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
For more information about these attributes, consult your Teradata documentation.
Attribute      Default Value      Description
Error Limit 0 Limits the number of rows rejected for errors. When the error limit is exceeded,
TPump rolls back the transaction that causes the last error. An error limit of 0
causes TPump to stop processing after any error.
Checkpoint 15 The number of minutes between checkpoints. You must set the checkpoint to a
value between 0 and 60.
Tenacity 4 Specifies how long, in hours, TPump tries to log onto the required sessions. If a
logon fails, TPump delays for the number of minutes specified in the Sleep
attribute, and then retries the logon. TPump keeps trying until the logon succeeds
or the number of hours specified in the Tenacity attribute elapses.
To disable Tenacity, set the value to 0.
Load Mode Upsert The mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data
Driven.
When you select Data Driven loading, the PowerCenter Server follows instructions
coded in an Update Strategy or Custom transformations within the session mapping
to determine how to flag rows for insert, delete, or update. The PowerCenter Server
writes a column in the target file or named pipe to indicate the update strategy. The
control file uses these values to determine how to load data to the database. The
PowerCenter Server uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete
Drop Error Tables Enabled Specifies whether to drop the TPump error tables before beginning the next
session. Select this option to drop the tables, or clear it to keep them.
External Loader Executable tpump The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
Max Sessions 1 The maximum number of TPump sessions per TPump job. Each partition in a
session starts its own TPump job. Running multiple TPump sessions causes the
client and database to use more resources. Therefore, setting this value to a small
number may improve performance.
Sleep 6 The number of minutes TPump waits before retrying a logon. TPump tries until the
logon succeeds or the number of hours specified in the Tenacity attribute elapses.
Packing Factor 20 The number of rows that each session buffer holds. Packing improves network/
channel efficiency by reducing the number of sends and receives between the
target flat file and the Teradata database.
Statement Rate 0 The initial maximum rate, per minute, at which the TPump executable sends
statements to the Teradata database. If you set this attribute to 0, the statement
rate is unspecified.
Attribute      Default Value      Description
Serialize Disabled Determines whether or not operations on a given key combination (row) occur
serially.
You may want to check this option if the TPump job contains multiple changes to
one row. Sessions that contain multiple partitions with the same key range but
different filter conditions may cause multiple changes to a single row. In this case,
you may want to enable Serialize to prevent locking conflicts in the Teradata
database, especially if you set the Pack attribute to a value greater than 1.
If you select this option, the PowerCenter Server uses the primary key specified in
the target table as the Key column. If no primary key exists in the target table, you
must either clear this checkbox or indicate the Key column in the data layout
section of the control file.
Robust Disabled When you do not select Robust, TPump uses simple restart logic. In this
case, restarts cause TPump to begin at the last checkpoint. TPump reloads any
data that was loaded after the checkpoint. This method does not have the extra
overhead of the additional database writes in the robust logic.
No Monitor Enabled When selected, this attribute prevents TPump from checking for statement rate changes from, or updating status information for, the TPump monitor application.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using
a named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Log Table Database n/a The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Table 20-10 shows the attributes that you configure when you edit a session and override the
Teradata TPump external loader connection object:
Table 20-10. Teradata TPump External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table n/a The error table name. You can use this attribute to override the default error
table name. If you do not specify an error table name, the PowerCenter Server
uses ET_<target_table_name><partition_number>.
Log Table n/a The log table name. You can use this attribute to override the default log table
name. If you do not specify a log table name, the PowerCenter Server uses
LT_<target_table_name><partition_number>.
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
Attribute      Default Value      Description
Error Limit 1,000,000 The maximum number of rows that FastLoad rejects before it stops loading data to
the database table.
Checkpoint 0 The number of rows transmitted to the Teradata database between checkpoints. If
processing stops while a FastLoad job is running, you can restart the job at the
most recent checkpoint.
If you enter 0, FastLoad does not perform checkpoint operations.
Tenacity 4 The number of hours FastLoad tries to log on to the required FastLoad sessions
when the maximum number of load jobs are already running on the Teradata
database. When FastLoad tries to log on for a new session, and the Teradata
database indicates that the maximum number of load sessions is already running,
FastLoad logs off all new sessions that were logged on, delays for the number of
minutes specified in the Sleep attribute, and then retries the logon. FastLoad keeps
trying until it logs on for the required number of sessions or exceeds the number of
hours specified in the Tenacity attribute.
Attribute      Default Value      Description
Drop Error Tables Enabled Specifies whether to drop the FastLoad error tables before beginning the next
session. FastLoad will not run if non-empty error tables exist from a prior job.
Select this option to drop the tables, or clear it to keep them.
External Loader Executable fastload The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
Max Sessions 1 The maximum number of FastLoad sessions per FastLoad job. Max Sessions must
be between 1 and the total number of access module processes (AMPs) on your
system.
Sleep 6 The number of minutes FastLoad pauses before retrying a logon. FastLoad tries
until the logon succeeds or the number of hours specified in the Tenacity attribute
elapses.
Truncate Target Table Disabled Specifies whether to truncate the target database table before beginning the FastLoad job. FastLoad cannot load data to non-empty tables.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using
a named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Table 20-12 shows the attributes that you configure when you edit a session and override the
Teradata FastLoad external loader connection object:
Table 20-12. Teradata FastLoad External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table 1 n/a The table name for the first error table. You can use this attribute to override
the default error table name. If you do not specify an error table name, the
PowerCenter Server uses ET_<target_table_name>.
Error Table 2 n/a The table name for the second error table. You can use this attribute to
override the default error table name. If you do not specify an error table
name, the PowerCenter Server uses UV_<target_table_name>.
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
For more information about these attributes, consult your Teradata documentation.
Operator Protocol
Load Uses FastLoad protocol. Load attributes are described in Table 20-14. For more
information about how FastLoad works, see “Teradata FastLoad External Loader
Attributes” on page 545.
Update Uses MultiLoad protocol. Update attributes are described in Table 20-14. For more
information about how MultiLoad works, see “Teradata MultiLoad External Loader
Attributes” on page 540.
Stream Uses TPump protocol. Stream attributes are described in Table 20-14. For more
information about how TPump works, see “Teradata TPump External Loader
Attributes” on page 542.
Each Teradata Warehouse Builder operator has associated attributes. Not all attributes
available for FastLoad, MultiLoad, and TPump external loaders are available for Teradata
Warehouse Builder.
Table 20-14 shows the attributes that you configure for Teradata Warehouse Builder:
Attribute      Default Value      Description
Operator Update The Warehouse Builder operator used to load the data. Choose Load, Update, or
Stream.
Max instances 4 The maximum number of parallel instances for the defined operator.
Attribute      Default Value      Description
Error Limit 0 The maximum number of rows that Warehouse Builder rejects before it stops loading
data to the database table.
Checkpoint 0 The number of rows transmitted to the Teradata database between checkpoints. If
processing stops while a Warehouse Builder job is running, you can restart the job at
the most recent checkpoint.
If you enter 0, Warehouse Builder does not perform checkpoint operations.
Tenacity 4 The number of hours Warehouse Builder tries to log on to the Warehouse Builder
sessions when the maximum number of load jobs are already running on the
Teradata database. When Warehouse Builder tries to log on for a new session, and
the Teradata database indicates that the maximum number of load sessions is
already running, Warehouse Builder logs off all new sessions that were logged on,
delays for the number of minutes specified in the Sleep attribute, and then retries the
logon. Warehouse Builder keeps trying until it logs on for the required number of
sessions or exceeds the number of hours specified in the Tenacity attribute.
To disable Tenacity, set the value to 0.
Load Mode Upsert The mode to generate SQL commands. Choose Insert, Update, Upsert, Delete or
Data Driven.
When you use the Update or Stream operators, you can choose Data Driven load
mode. When you select data driven loading, the PowerCenter Server follows
instructions coded in Update Strategy or Custom transformations within the mapping
to determine how to flag rows for insert, delete, or update. The PowerCenter Server
writes a column in the target file or named pipe to indicate the update strategy. The
control file uses these values to determine how to load data to the database. The
PowerCenter Server uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete
Drop Error Tables Enabled Specifies whether to drop the Warehouse Builder error tables before beginning the
next session. Warehouse Builder will not run if error tables containing data exist from
a prior job. Clear the option to keep error tables.
Truncate Target Table Disabled Specifies whether to truncate target tables. Enable this option to truncate the target database table before beginning the Warehouse Builder job.
External Loader Executable tbuild The name and optional file path of the Teradata external loader executable file. If the external loader directory is not in the system path, enter the file path and file name.
Max Sessions 4 The maximum number of Warehouse Builder sessions per Warehouse Builder job.
Max Sessions must be between 1 and the total number of access module processes
(AMPs) on your system.
Sleep 6 The number of minutes Warehouse Builder pauses before retrying a logon.
Warehouse Builder tries until the logon succeeds or the number of hours specified in
the Tenacity attribute elapses.
Attribute      Default Value      Description
Packing Factor 20 The number of rows that each session buffer holds. Packing improves network/
channel efficiency by reducing the number of sends and receives between the target
file and the Teradata database. Enabled with Stream operator only.
Robust Disabled The recovery or restart mode. When you disable Robust, the Stream operator uses
simple restart logic. The Stream operator reloads any data that was loaded after the
last checkpoint.
When you enable Robust, Warehouse Builder uses robust restart logic. In robust
mode, the Stream operator determines how many rows were processed since the
last checkpoint. The Stream operator processes all the rows that were not processed
after the last checkpoint. Enabled with Stream operator only.
Is Staged Disabled The method of loading data. Select Is Staged to load data to a flat file staging area
before loading to the database. Otherwise, the data is loaded to the database using a
named pipe. For more information, see “Loading Data Using Named Pipes” on
page 526 or “Staging Data to Flat Files” on page 526.
Error Database n/a The error database name. You can use this attribute to override the default error
database name. If you do not specify a database name, the PowerCenter Server
uses the target table database.
Work Table Database n/a The work table database name. You can use this attribute to override the default work table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Log Table Database n/a The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
Note: Valid attributes depend upon the operator you select.
Table 20-15 shows the attributes that you configure when you edit a session and override the Teradata Warehouse Builder external loader connection object:
Table 20-15. Teradata Warehouse Builder External Loader Attributes Defined at the Session Level
Attribute      Default Value      Description
Error Table 1 n/a The table name for the first error table. You can use this attribute to override the
default error table name. If you do not specify an error table name, the
PowerCenter Server uses ET_<target_table_name>.
Error Table 2 n/a The table name for the second error table. You can use this attribute to override
the default error table name. If you do not specify an error table name, the
PowerCenter Server uses UV_<target_table_name>.
Work Table n/a The work table name. You can use this attribute to override the default work table
name. If you do not specify a work table name, the PowerCenter Server uses
WT_<target_table_name>.
Log Table n/a The log table name. You can use this attribute to override the default log table
name. If you do not specify a log table name, the PowerCenter Server uses
RL_<target_table_name>.
Attribute      Default Value      Description
Control File Content Override n/a The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see “Overriding the Control File” on page 539.
Note: Valid attributes depend upon the operator you select.
For more information about these attributes, consult your Teradata documentation.
2. Click New.
To set the file properties, select the target instance in the Instances list.
Attribute Description
Output File Directory Enter the directory name in this field. By default, the PowerCenter Server writes output
files to the directory $PMTargetFileDir.
If you enter a full directory and file name in the Output Filename field, clear this field.
External loader sessions may fail if you use double spaces in the path for the output file.
Output Filename Enter the file name, or file name and path. By default, the Workflow Manager names the
target file based on the target definition used in the mapping: target_name.out. External
loader sessions may fail if you use double spaces in the path for the output file.
Reject File Directory By default, the PowerCenter Server writes all reject files to the directory $PMBadFileDir.
If you enter a full directory and file name in the Reject Filename field, clear this field.
Reject Filename Enter the file name, or file name and directory. The PowerCenter Server appends
information in this field to that entered in the Reject File Directory field. For example, if you
have “C:/reject_file/” in the Reject File Directory field, and enter “filename.bad” in the
Reject Filename field, the PowerCenter Server writes rejected rows to C:/reject_file/
filename.bad.
By default, the PowerCenter Server names the reject file after the target instance name:
target_name.bad.
You can also enter a reject file session parameter to represent the reject file or the reject
file and directory. Name all reject file parameters $BadFileName. For details on session
parameters, see “Session Parameters” on page 495.
Set File Properties Opens a dialog box that allows you to define flat file properties. When you use an external
loader, you must define the flat file properties by clicking the Set File Properties button.
For Oracle external loaders, the target flat file can be fixed-width or delimited.
For Sybase IQ external loaders, the target flat file can be fixed-width or delimited.
For Teradata external loaders, the target flat file must be fixed-width. For DB2 external
loaders, the target flat file must be delimited.
For more information, see “Configuring Fixed-Width Properties” on page 265 and
“Configuring Delimited Properties” on page 266.
Note: Do not select Merge Partitioned Files or enter a merge file name. You cannot merge
partitioned output files when you use an external loader.
I am trying to run a session that uses TPump, but the session fails. The session log displays
an error saying that the Teradata output file name is too long.
The PowerCenter Server uses the Teradata output file name to generate names for the TPump
error and log files, as well as the log table name. To do this, the PowerCenter Server adds a
prefix of several characters to the output file name. It adds three characters for sessions with
one partition and five characters for sessions with multiple partitions.
Teradata allows log table names of up to 30 characters. Because the PowerCenter Server adds a
prefix, if you are running a session with a single partition, specify a target output file name
with a maximum of 27 characters, including the file extension. If you are running a session
with multiple partitions, specify a target output file name with a maximum of 25 characters,
including the file extension.
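The limits above can be checked in a script before starting the session. This is a minimal sketch, assuming the hypothetical file name tpump_out.dat; the limits (27 characters for one partition, 25 for multiple) come from the text above.

```shell
# Check a Teradata TPump output file name against the documented limits:
# 27 characters (including extension) for one partition, 25 for multiple.
check_tpump_name() {
  name="$1"; partitions="$2"
  if [ "$partitions" -gt 1 ]; then max=25; else max=27; fi
  if [ "${#name}" -le "$max" ]; then
    echo "ok: ${#name} <= $max"
  else
    echo "too long: ${#name} > $max"
  fi
}
check_tpump_name "tpump_out.dat" 1   # prints "ok: 13 <= 27"
```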
I tried to load data to Teradata using TPump, but the session failed. I corrected the error,
but the session still fails.
Occasionally, Teradata does not drop the log table when you rerun the session. Check the
Teradata database, and manually drop the log table if it exists. Then rerun the session.
Chapter 21
Using FTP
Overview
The PowerCenter Server can use File Transfer Protocol (FTP) to access source and target files.
With both source and target files, you can use FTP to transfer the files directly to the
PowerCenter Server or stage them on a local directory.
You can also stage files by creating a pre-session shell command that copies the files to a
directory local to the PowerCenter Server. Accessing files directly with FTP generally provides better session
performance than using FTP to stage the files. However, you may want to stage FTP files to
keep a local archive.
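For example, a pre-session shell command could stage a source file with a scripted FTP transfer. This is a hedged sketch: the host ftp.example.com, the user loaduser, and both file paths are hypothetical placeholders, not values from this guide.

```shell
# Build a command file for a non-interactive ftp transfer (hypothetical
# host, credentials, and paths), suitable as a pre-session shell command.
cat > stage_src.ftp <<'EOF'
open ftp.example.com
user loaduser loadpass
binary
get /remote/data/orders.dat /local/stage/orders.dat
bye
EOF
# ftp -n < stage_src.ftp   # uncomment on a host with an ftp client
```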
Before creating an FTP session, you must configure the FTP connection in the Workflow
Manager. For details, see “Creating an FTP Connection” on page 561.
When using FTP file sources and targets in a session, you should know the following
information:
♦ FTP connection name
♦ Remote file name and exact path
♦ Whether you want to stage the files
Mainframe Notes
Due to mainframe restrictions, the following constraints apply when using FTP with
mainframe machines:
♦ You cannot execute sessions concurrently if the sessions use the same FTP source file or
target file located on a mainframe.
♦ If you abort a workflow containing a session with a staged FTP source or target from a
mainframe, you may need to wait for the connection to timeout before you can run the
workflow again.
♦ Host name. Host name or IP address of the FTP host. Optionally, you can specify a port
number using the following syntax:
hostname:port-number
or
IP address:port-number
When you specify a port number, enable that port number for FTP on the host machine.
♦ Default remote directory. The directory you want the PowerCenter Server to use by
default. In the session, when you enter a file name without a directory, the PowerCenter
Server appends the file name to this directory. Therefore, this path must be exact and
contain the appropriate trailing delimiters. For example, if you enter c:/data/ and in the
session specify the file FILENAME, the PowerCenter Server reads the path and file name
as c:/data/FILENAME.
If you enter the wrong delimiter for an FTP directory, the Workflow Manager does not
correct it. If the FTP host is a mainframe machine, the directory must begin with a single
quote and end with the period delimiter, such as: ‘defaultdir. You can override this option
in the session properties.
Depending on the remote machine you access, you might also need to enter the user name
and password. The password must be in 7-bit ASCII only. As with database connections, if
you edit an FTP connection, all sessions using the FTP connection use the updated
connection.
FTP Permissions
If you enable enhanced security, you can set FTP connection permissions in the Workflow
Manager. The Workflow Manager assigns Owner permissions to the user who registers the
connection. The Workflow Manager grants Owner Group permissions to the first group in
the Group Memberships list of the owner. You can manage FTP connection permissions if
you are the owner of the connection or if you have Super User privileges.
A registered FTP connection does not appear in the list of FTP connections if you do not
have at least read permission for the connection. If you want to edit a connection, you must
have both read and write permissions for the connection.
FTP Option Required/Optional Description
User Name Optional User name necessary to access the host machine.
Password Optional Password for the user name. Must be in 7-bit ASCII only.
Host Name Required Host name or dotted IP address of the FTP connection.
Optionally, you can specify a port number between 1 and 65535,
inclusive. If you do not specify a port number, the PowerCenter Server
uses 21 by default. Use the following syntax for specifying the host
name:
hostname:port-number
-or-
IP address:port-number
When you specify a port number, enable that port number for FTP on the
host machine.
Default Remote Directory Required Enter a valid FTP directory on the host machine. Do not
enclose the default remote directory in quotation marks. The default directory name must be
exact and include a trailing delimiter.
Note: Depending on the FTP server you use, you may have limited
options for entering FTP directories. Please see your FTP server
documentation for details.
If you enter a file name without a leading slash or drive letter, the PowerCenter Server
appends the file name to the Default Remote Directory path entered in the FTP
Connection dialog box. For example, if your default remote directory is c:/data/, and you
enter a remote file name of FILENAME, the PowerCenter Server connects to the FTP
host and looks for c:/data/FILENAME.
If you enter a fully qualified file name in the Remote Filename field, the PowerCenter
Server uses the named path rather than the path entered in the Default Remote
Directory.
To access the file, FILENAME, from the default mainframe directory, enter the following
in the Remote Filename field:
filename’
When the PowerCenter Server begins the session, it connects to the mainframe host and
looks for:
‘defaultdir.filename’
In contrast, if you want to use a file in a different directory, you must enter that directory
and file name in the Remote Filename field, like this:
‘overridedir.filename’
Note: Depending on the FTP server you use, you may have limited options for entering
FTP directories. Please see your FTP server documentation for details.
5. To store the file in a directory local to the PowerCenter Server, select Is Staged.
When you select this option for a source file, the PowerCenter Server moves the source
file from the FTP host to a local directory before the session begins, then uses the local
file during the session. If the staged file exists, the PowerCenter Server truncates the
staged file before running the session.
The location of the local file differs depending on the information entered in the
Properties settings of the Sources tab:
3. Click the Open button in the Value field to select an FTP connection.
4. Click Override and enter the remote file name.
Note: Depending on the FTP server you use, you may have limited options for entering
FTP directories. Please see your FTP server documentation for details.
5. To store the target file in a directory on the machine where the PowerCenter Server runs,
select Is Staged.
When you select this option, the PowerCenter Server writes to the local target file during
the session, then moves the file to the FTP host after the session is complete. The
location of the local file differs depending on the information entered in the Properties
settings of the Mapping tab:
Chapter 22
Using Incremental Aggregation
This chapter covers the following topics:
♦ Overview, 574
♦ PowerCenter Server Processing for Incremental Aggregation, 575
♦ Reinitializing the Aggregate Files, 576
♦ Moving or Deleting the Aggregate Files, 577
♦ Partitioning Guidelines with Incremental Aggregation, 578
♦ Preparing for Incremental Aggregation, 579
Overview
When using incremental aggregation, you apply captured changes in the source to aggregate
calculations in a session. If the source changes only incrementally and you can capture
changes, you can configure the session to process only those changes. This allows the
PowerCenter Server to update your target incrementally, rather than forcing it to process the
entire source and recalculate the same data each time you run the session.
For example, you might have a session using a source that receives new data every day. You
can capture those incremental changes because you have added a filter condition to the
mapping that removes pre-existing data from the flow of data. You then enable incremental
aggregation.
When the session runs with incremental aggregation enabled for the first time on March 1,
you use the entire source. This allows the PowerCenter Server to read and store the necessary
aggregate data. On March 2, when you run the session again, you filter out all the records
except those time-stamped March 2. The PowerCenter Server then processes only the new
data and updates the target accordingly.
Consider using incremental aggregation in the following circumstances:
♦ You can capture new source data. Use incremental aggregation when you can capture new
source data each time you run the session. Use a Stored Procedure or Filter transformation
to process only new data.
♦ Incremental changes do not significantly change the target. Use incremental aggregation
when the changes do not significantly change the target. If processing the incrementally
changed source alters more than half the existing target, the session may not benefit from
using incremental aggregation. In this case, drop the table and re-create the target with
complete source data.
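As a sketch of the first guideline above: a Filter transformation condition can pass only rows stamped on or after the session start. DATE_ENTERED is a hypothetical port name; SESSSTARTTIME is the built-in session start variable in the transformation language.

```
DATE_ENTERED >= TRUNC(SESSSTARTTIME, 'DD')
```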
Note: Do not use incremental aggregation if your mapping contains percentile or median
functions. The PowerCenter Server uses system memory to process Percentile and Median
functions in addition to the cache memory you configure in the session property sheet. As a
result, the PowerCenter Server does not store incremental aggregation values for Percentile
and Median functions in disk caches.
If you do not run the session using Verbose Init mode or use an identifiable transformation
naming convention, you may have difficulty determining which files belong to each session.
For more information about cache file storage and naming conventions, see “Cache Files” on
page 615.
Note: You cannot use incremental aggregation when the mapping includes an Aggregator
transformation with Transaction transformation scope. The Workflow Manager marks the
session invalid.
Chapter 23
Using pmcmd
Overview
pmcmd is a program that you can use to communicate with the PowerCenter Server. You can
perform some of the tasks that you can also perform in the Workflow Manager, such as
starting and stopping workflows and tasks.
You can use pmcmd in the following modes:
♦ Command line mode. The command line syntax allows you to write scripts for scheduling
workflows. Each command you write in the command line mode must include connection
information to the PowerCenter Server.
♦ Interactive mode. You establish and maintain an active connection to the PowerCenter
Server. This allows you to issue a series of commands.
You can use repository user names and passwords as environment variables with pmcmd. You
can also customize the way pmcmd displays the date and time on the machine running the
PowerCenter Server. Before you use pmcmd, configure these variables on the PowerCenter
Server. For more information, see “Configuring Environment Variables” on page 585.
Note: To issue the shutdownserver command, you must have the Super User privilege or
Administer Server privilege.
Table 23-1 provides a description for the pmcmd commands. For details on command syntax
and usage, see “pmcmd Reference” on page 594.
Table 23-1. pmcmd Commands
aborttask (Command line, Interactive). Aborts a task. Issue this command only after the PowerCenter Server fails to stop the task when you issue the stoptask command. For more information, see “Aborttask” on page 596.
abortworkflow (Command line, Interactive). Aborts a workflow. Issue this command only after the PowerCenter Server fails to stop the workflow when you issue the stopworkflow command. For more information, see “Abortworkflow” on page 597.
connect (Interactive). Connects to the PowerCenter Server in the interactive mode. Use this command in conjunction with connection information. For more information, see “Connect” on page 597.
disconnect (Interactive). Disconnects from the PowerCenter Server in the interactive mode. For more information, see “Disconnect” on page 598.
exit (Interactive). Exits from pmcmd in the interactive mode. For more information, see “Exit” on page 598.
getrunningsessionsdetails (Command line, Interactive). Displays details for sessions currently running on a PowerCenter Server, including information for the folder, workflow, and session instance. Displays session status and statistics on each target table and source qualifier. For more information, see “Getrunningsessionsdetails” on page 598.
getserverdetails (Command line, Interactive). Displays details for the PowerCenter Server, including server status, information on active workflows, and timestamp information. In a server grid, this command displays the PowerCenter Servers that run each task instance. For more information, see “Getserverdetails” on page 599.
getserverproperties (Command line, Interactive). Displays the PowerCenter Server name, type, and version. It returns the timestamp on the PowerCenter Server and the name of the repository. It also indicates the data movement mode and whether the PowerCenter Server can debug mappings. For more information, see “Getserverproperties” on page 599.
getsessionstatistics (Command line, Interactive). Displays session details, including information for the folder, workflow, and task instance. Displays session status and statistics on each target table and source qualifier. In a server grid, this command displays the PowerCenter Servers that run each task instance. For more information, see “Getsessionstatistics” on page 600.
gettaskdetails (Command line, Interactive). Displays details for a task, including folder and workflow name. Also displays the task status and run mode. In a server grid, this command displays the PowerCenter Servers that run each task instance. For more information, see “Gettaskdetails” on page 601.
getworkflowdetails (Command line, Interactive). Displays details for a workflow, including workflow name, status, and run mode. Also displays information on when the workflow was last executed. For more information, see “Getworkflowdetails” on page 601.
help (Command line, Interactive). Displays a list of pmcmd commands and syntax. For more information, see “Help” on page 602.
pingserver (Command line, Interactive). Determines whether the PowerCenter Server is running. For more information, see “Pingserver” on page 602.
quit (Interactive). Quits from pmcmd in the interactive mode. For more information, see “Quit” on page 602.
resumeworkflow (Command line, Interactive). Resumes a suspended workflow. For more information, see “Resumeworkflow” on page 603.
resumeworklet (Command line, Interactive). Resumes a suspended worklet. For more information, see “Resumeworklet” on page 603.
scheduleworkflow (Command line, Interactive). Instructs the PowerCenter Server to schedule a workflow. Use this command to manually reschedule a workflow that has been removed from the schedule. For more information, see “Scheduleworkflow” on page 604.
setfolder (Interactive). Designates a folder as the default folder in which to execute all subsequent commands. For more information, see “Setfolder” on page 604.
showsettings (Interactive). Displays the settings for the interactive mode, including PowerCenter Server and repository name, username, wait mode, and default folder. For more information, see “Showsettings” on page 605.
shutdownserver (Command line, Interactive). Shuts down the PowerCenter Server. Use this command in conjunction with a shutdownmode option. For more information, see “Shutdownserver” on page 605.
starttask (Command line, Interactive). Starts a task. Use this command in conjunction with a task name. For more information, see “Starttask” on page 606.
startworkflow (Command line, Interactive). Starts a workflow. Use this command in conjunction with a workflow name. For more information, see “Startworkflow” on page 607.
stoptask (Command line, Interactive). Stops a task. Use this command in conjunction with a task name. For more information, see “Stoptask” on page 609.
stopworkflow (Command line, Interactive). Stops a workflow. Use this command in conjunction with a workflow name. For more information, see “Stopworkflow” on page 609.
unscheduleworkflow (Command line, Interactive). Instructs the PowerCenter Server to remove the workflow from the schedule. For more information, see “Unscheduleworkflow” on page 610.
unsetfolder (Interactive). Designates no folder as the default folder. For more information, see “Unsetfolder” on page 610.
version (Command line, Interactive). Displays the PowerCenter version number. For more information, see “Version” on page 611.
waittask (Command line, Interactive). Instructs the PowerCenter Server to wait for the completion of a running task before starting another command. Use this command in conjunction with a task name. For more information, see “Waittask” on page 611.
waitworkflow (Command line, Interactive). Notifies you of the status of a workflow. Use this command in conjunction with a workflow name. For more information, see “Waitworkflow” on page 611.
Configuring PM_CODEPAGENAME
pmcmd uses the code page of the machine hosting pmcmd unless you specify the code page
environment variable, PM_CODEPAGENAME, to override it. The code page must be
compatible with the PowerCenter Server code page. pmcmd sends commands in Unicode. If
the code pages are not compatible, the PowerCenter Server might not find the workflow,
session, or task in the repository. For more information about code page compatibility, see
“Globalization Overview” and “Code Pages” in the Installation and Configuration Guide.
export PM_CODEPAGENAME
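A minimal Bourne-shell sketch of setting this variable before invoking pmcmd. The code page name MS1252 is a hypothetical example value; use a code page compatible with your PowerCenter Server.

```shell
# Override the pmcmd code page (MS1252 is a hypothetical example value).
PM_CODEPAGENAME=MS1252
export PM_CODEPAGENAME
```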
Configuring PMTOOL_DATEFORMAT
Use this environment variable to customize the way pmcmd displays the date and time. The
pmcmd program verifies that the string you specify is a valid format. If the format string is not
valid, the PowerCenter Server generates a warning message and displays the date in the format
DY MON DD HH24:MI:SS YYYY.
export PMTOOL_DATEFORMAT
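For example, a Bourne-shell sketch that sets the variable to the default format string named above:

```shell
# Set the pmcmd date display format (the value here is the documented
# default format string).
PMTOOL_DATEFORMAT='DY MON DD HH24:MI:SS YYYY'
export PMTOOL_DATEFORMAT
```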
export USERNAME
You can assign the environment variable any valid UNIX name.
1. In a UNIX session, navigate to the directory where the PowerCenter Server is installed.
2. At the shell prompt, type:
pmpasswd YourPassword
This command runs the encryption utility pmpasswd located in the directory where the
PowerCenter Server is installed. The encryption utility generates and displays your
encrypted password. The following is sample output. In this example, the password
entered was “monday.”
Encrypted string -->bX34dqq<--
export PASSWORD
You can assign the environment variable any valid UNIX name.
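Putting the steps together, a minimal sketch: store the encrypted string that pmpasswd printed (the sample value bX34dqq from the output above) in the environment variable, then reference it with the -pv flag. The pmcmd invocation is commented out because the server address and folder (taken from this chapter's examples) depend on your installation.

```shell
# Store the encrypted password printed by pmpasswd (sample value from
# the text) and export it for pmcmd to read via -pv.
PASSWORD=bX34dqq
export PASSWORD
# pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east wSalesAvg
```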
1. In Windows DOS, navigate to the directory where the PowerCenter Server is installed.
2. At the command line, type:
pmpasswd YourPassword
The encryption utility generates and displays your encrypted password. The following is
sample output. In this example, the password entered was “monday.”
Encrypted string -->bX34dqq<--
Will decrypt to -->monday<--
Configuring PM_HOME
Use the PM_HOME variable to start pmcmd from a directory other than the install directory.
On UNIX, point the PM_HOME and PATH environment variables to the PowerCenter Server
installation directory.
export PM_HOME
The following command immediately starts the workflow wSalesAvg, located in the east
folder, on the remote PowerCenter Server with host name Sales listening at port 6258:
pmcmd startworkflow -u seller3 -p jackson -s SALES:6258 -f east -wait wSalesAvg
The user, seller3, with the password “jackson” sends the request to start the workflow. When
you use the wait option, pmcmd returns to the shell or command prompt when the workflow
completes.
For a list of commands you can use in the command line mode, see Table 23-1 on page 582.
For details on each command see “pmcmd Reference” on page 594.
For information on defining username and password environment variables, see “Configuring
Repository Username and Password” on page 586.
Parameter Flags Required/Optional Description
username -user, -u Required Your repository username. Required if userEnvVar is not used.
serveraddr -serveraddr, -s Required Server address of the machine hosting the PowerCenter Server.
host N/A Optional Name of the machine hosting the PowerCenter Server. If you do not specify a host name, pmcmd assumes the PowerCenter Server runs on the machine executing pmcmd.
portno N/A Required Port number at which the PowerCenter Server listens.
Code Description
0 For all commands, a return value of zero indicates that the command ran successfully. You can issue
these commands in the wait or nowait mode: starttask, startworkflow, resumeworklet, resumeworkflow,
aborttask, and abortworkflow. If you issue a command in the wait mode, a return value of zero indicates
the command ran successfully. If you issue a command in the nowait mode, a return value of zero
indicates that the request was successfully transmitted to the PowerCenter Server, and it acknowledged
the request.
1 The PowerCenter Server is down, or pmcmd cannot connect to the PowerCenter Server. The TCP/IP
host name or port number might be incorrect, or a network problem occurred.
2 The specified task name, workflow name, or folder name does not exist.
Code Description
6 An error occurred while stopping the PowerCenter Server. Contact Informatica Technical Support.
8 You do not have the appropriate permissions or privileges to perform this task.
9 The connection to the PowerCenter Server timed out while sending the request.
12 The PowerCenter Server cannot start recovery because the session or workflow is scheduled,
suspending, waiting for an event, waiting, initializing, aborting, stopping, disabled, or running.
18 The PowerCenter Server found the parameter file, but it did not have the initial values for the session
parameters, such as $input or $output.
19 The PowerCenter Server cannot start the session in recovery mode because the workflow is configured
to run continuously.
20 A repository error has occurred. Please make sure that the Repository Server and the database are
running and the number of connections to the database is not exceeded.
22 The PowerCenter Server cannot find a unique instance of the workflow or session you specified. Enter
the command again with the folder name and workflow name.
24 Out of memory.
25 Command is cancelled.
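In a scheduling script, the return code of a command-line pmcmd call can be acted on after the call returns. The sketch below maps a few of the codes from the table; the pmcmd invocation itself is commented out because it depends on your server, and the user, server, folder, and workflow names are taken from this chapter's examples.

```shell
# Translate selected pmcmd return codes from the table above.
explain_rc() {
  case "$1" in
    0) echo "command ran successfully" ;;
    1) echo "server down or unreachable" ;;
    2) echo "task, workflow, or folder does not exist" ;;
    8) echo "insufficient permissions or privileges" ;;
    *) echo "see the return code table: $1" ;;
  esac
}
# pmcmd startworkflow -u seller3 -p jackson -s SALES:6258 -f east -wait wSalesAvg
# rc=$?; echo "pmcmd returned $rc: $(explain_rc $rc)"
explain_rc 2   # prints "task, workflow, or folder does not exist"
```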
The following commands immediately start the workflow wSalesAvg, located in the east
folder:
pmcmd> connect -user seller3 -password jackson -serveraddr SALES:6258
pmcmd> setwait
pmcmd> setfolder east
pmcmd> startworkflow wSalesAvg
The setwait command means that for all subsequent commands, pmcmd returns the command
prompt when the workflow completes. The setfolder command means that for all subsequent
commands dealing with workflows or tasks, pmcmd uses the specified workflow or task from
the east folder.
For a list of commands you can use in the interactive mode, see Table 23-1 on page 582. For
details on each command see “pmcmd Reference” on page 594.
1. In either a Windows DOS session or a UNIX session, navigate to the directory where the
PowerCenter Server is installed.
2. At the shell or command prompt, type:
pmcmd
This command returns the PowerCenter version number and the pmcmd prompt.
3. From the pmcmd prompt, type:
connect -u YourUserName -p YourPassword -s ServerName:PortNo
Or, if you use username and password environment variables, type the following at the
pmcmd prompt:
connect -uv USERNAME -pv PASSWORD -serveraddr ServerName:PortNo
For information on defining user name and password environment variables, see
“Configuring Repository Username and Password” on page 586.
Command Description
setfolder Designates a folder as the default folder in which to execute all subsequent commands.
setnowait Instructs the PowerCenter Server to execute subsequent commands in the nowait mode.
The pmcmd prompt is available after the PowerCenter Server receives the previous
command. The nowait mode is the default mode.
setwait Instructs the PowerCenter Server to execute subsequent commands in the wait mode.
The pmcmd prompt is available only after the PowerCenter Server completes the previous
command.
For a list of all the commands that you can use in the interactive mode, see Table 23-1 on
page 582.
You can use -password or -p before entering a password. Or, use -passwordvar or -pv before a
password environment variable.
To enter a password, precede the password with either the -password or the -p flag.
-password YourPassword
or
-p YourPassword
If you use a password environment variable, precede the variable name with either the -pv flag
or the -passwordvar flag.
-passwordvar PASSWORD
or
-pv PASSWORD
For a list of all the parameters you can use with pmcmd, see Table 23-5 on page 594.
Command Parameters
When you use most parameters, you precede the parameter with a flag. For ease of use, you
can use a shortened version for most flags. For example, you can either use -serveraddr or its
shortened equivalent, -s.
Table 23-5 describes the parameters used in pmcmd commands and lists the associated flags:
folder -folder, -f Name of the folder containing the workflow or task. Required if the workflow or task name is not unique in the repository.
host N/A The name of the machine hosting the PowerCenter Server. If you do not specify a host name, pmcmd assumes the PowerCenter Server runs on the machine executing pmcmd.
localparamfile -localparamfile, -lpf A parameter file on a local machine that pmcmd uses when you start a workflow. Use in conjunction with the startworkflow command.
paramfile -paramfile Determines which parameter file is used when a task or workflow runs. It overrides the configured parameter file for the workflow or task. Use in conjunction with the starttask or startworkflow commands.
passwordEnvVar -passwordvar, -pv Specifies the password environment variable. Required if password is not used.
portno N/A Specifies the port number at which the PowerCenter Server listens.
recovery -recovery Specifies that you want to run the session in recovery mode.
serveraddr -serveraddr, -s Server address of the machine hosting the PowerCenter Server.
startfrom -startfrom Starts a workflow from a specified task, taskInstancePath. Use the startfrom parameter in conjunction with the startworkflow command. Write the taskInstancePath as a fully qualified string.
taskInstancePath N/A Indicates a task and where it appears within the workflow. A task within a workflow is indicated by its task name alone. A task within a worklet is indicated by WorkletName.TaskName.
userEnvVar -uservar, -uv Specifies the username environment variable. Required if username is not used.
To denote an empty string, use two single quotes ('') or two double quotes (""). Be sure to
match an opening quote with a closing quote.
Syntax Notation
Table 23-6 describes the notation used in pmcmd syntax:
Convention Description
-z Flag placed before a parameter. This designates the parameter you enter. For
example, to enter the username, type -u or -user followed by the username.
<x> Required parameter. If you omit a required parameter, pmcmd returns an error
message.
Convention Description
<x | y > Select between required parameters. For the command to run, you must select
from the listed parameters. If you omit a required parameter, pmcmd returns an
error message.
[x] Optional parameter. The command runs whether or not you enter optional
parameters. For example, the syntax for the help command is as follows:
Help [Command]
If you enter a command, pmcmd returns information on that command only. If you
omit the command name, pmcmd returns a list of all commands.
[x|y] Select between optional parameters. The command runs whether or not you
enter optional parameters. For example, many commands run in either the wait
or nowait mode.
[-wait|-nowait]
The command runs in the mode you specify. If you do not specify a mode,
pmcmd runs the command in the default nowait mode.
<< x | y>| <a | b>> When a set contains subsets, the superset is indicated with bold brackets < >. A
bold pipe symbol (|) separates the subsets.
Tip: When you enter commands in pmcmd, type the command name first followed by the
optional parameters in any order.
Aborttask
The aborttask command aborts a task. Issue this command only after the PowerCenter Server
fails to stop the task when you issue the stoptask command. For details on how the
PowerCenter Server aborts and stops tasks, see “Server Handling of Stop and Abort” on
page 129.
In the command line mode, use the following syntax to abort a task:
pmcmd aborttask
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to abort a task:
aborttask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
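The qualification rule above can be sketched as a small shell helper; the worklet and task names in the calls are hypothetical examples.

```shell
# Build a fully qualified taskInstancePath: WorkletName.TaskName when the
# task is inside a worklet, or the task name alone otherwise.
task_instance_path() {
  worklet="$1"; task="$2"
  if [ -n "$worklet" ]; then
    printf '%s.%s\n' "$worklet" "$task"
  else
    printf '%s\n' "$task"
  fi
}
task_instance_path "wkltUpdate" "s_load"   # prints "wkltUpdate.s_load"
task_instance_path "" "s_load"             # prints "s_load"
```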
For information on other parameters used in this command, see Table 23-5 on page 594.
Abortworkflow
The abortworkflow command aborts a workflow. Issue this command only after the
PowerCenter Server fails to stop the workflow when you issue the stopworkflow command.
For details on how the PowerCenter Server aborts and stops workflows, see “Server Handling
of Stop and Abort” on page 129.
In the command line mode, use the following syntax to abort a workflow:
pmcmd abortworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[-wait|-nowait]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to abort a workflow:
abortworkflow
[<-folder|-f> folder]
[-wait|-nowait]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Connect
The connect command connects the pmcmd program to the PowerCenter Server in the
interactive mode. If you omit connection information, pmcmd prompts you to enter the
correct information. Once pmcmd successfully connects, you receive the pmcmd prompt. At
the pmcmd prompt, you can issue commands without specifying the connection information.
connect
<-serveraddr|-s> [host:]portno
Disconnect
The disconnect command disconnects pmcmd from the PowerCenter Server. It does not close
the pmcmd program. Use this command when you want to disconnect from a PowerCenter
Server and connect to another in the interactive mode.
In the interactive mode, use the following syntax to disconnect pmcmd from a PowerCenter
Server:
disconnect
Note: You can use this command only in the pmcmd interactive mode.
Exit
The exit command disconnects pmcmd from the PowerCenter Server and closes the pmcmd
program.
In the interactive mode, use the following syntax to exit pmcmd:
exit
Note: You can use this command only in the pmcmd interactive mode.
Getrunningsessionsdetails
The getrunningsessionsdetails command returns the details for all sessions currently running
on the PowerCenter Server. Details include startup and current time, folder and workflow
names, session instance, master and execution servers, number of successful and failed rows in
sources and targets, number of transformation errors, and number of sessions running on the
PowerCenter Server.
In the command line mode, use the following syntax to get details about sessions running on
the PowerCenter Server:
pmcmd getrunningsessionsdetails
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
In the interactive mode, enter the following syntax at the pmcmd prompt to get details about
sessions running on the PowerCenter Server:
getrunningsessionsdetails
Getserverdetails
In the command line mode, use the following syntax to get details about the PowerCenter
Server:
pmcmd getserverdetails
<-serveraddr|-s> [host:]portno
[-all|-running|-scheduled]
In the interactive mode, enter the following syntax at the pmcmd prompt to get details about
the PowerCenter Server:
getserverdetails
[-all|-running|-scheduled]
Issue the getserverdetails command for all or some of the workflows. The -running option
returns status details on active workflows. Active workflows include running, suspending, and
suspended workflows. The -scheduled option returns status details on the scheduled
workflows. The default option is the -all option, and it returns status details on the scheduled
and running workflows.
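For example, to list only active workflows on a hypothetical server, following the -uv/-pv convention used elsewhere in this chapter:

```shell
# Return status details for running, suspending, and suspended
# workflows only.
pmcmd getserverdetails -uv USERNAME -pv PASSWORD -s SALES:6258 -running
```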
For information on other parameters used in this command, see Table 23-5 on page 594.
Getserverproperties
The getserverproperties command returns the PowerCenter Server name, type, and version. It
returns the timestamp on the PowerCenter Server, the PowerCenter Server startup time, and
the name of the repository. It indicates the data movement mode, the PowerCenter Server
code page, and whether the PowerCenter Server can debug mappings. It also specifies the
server grid name.
In the command line mode, use the following syntax to see the PowerCenter Server
properties:
pmcmd getserverproperties
<-serveraddr|-s> [host:]portno
In the interactive mode, enter the following syntax at the pmcmd prompt to see PowerCenter
Server properties:
getserverproperties
Serveraddr is the host name and port number of the PowerCenter Server.
Getsessionstatistics
The getsessionstatistics command returns session details and statistics. The command returns
the following information for each partition:
♦ Session details. Session details include the name of the folder, workflow, task instance, and
mapping. It includes the task run status, session log file name, first error code and message,
the number of transformation errors, and the number of successful and failed rows for the
sources and targets. It also includes the name of the master server, worker server, and server
grid.
♦ Session statistics. Session statistics include the transformation name, transformation
instance name, and the number of applied, affected, and rejected rows. It also includes the
throughput, last error code and message, and start and end time for the session.
In the command line mode, use the following syntax to get session statistics:
pmcmd getsessionstatistics
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to get session
statistics:
getsessionstatistics
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
When using this command, specify the workflow name. Also, write the taskInstancePath as a
fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name.
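For example, to get statistics for a session inside a worklet, write the taskInstancePath as WorkletName.TaskName. All names below are hypothetical, following the -uv/-pv convention used elsewhere in this chapter:

```shell
# Command line mode: session s_sales nested in worklet wl_nightly
# in workflow wMain, folder east.
pmcmd getsessionstatistics -uv USERNAME -pv PASSWORD -s SALES:6258 \
    -f east -w wMain wl_nightly.s_sales
```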
For information on other parameters used in this command, see Table 23-5 on page 594.
Gettaskdetails
The gettaskdetails command returns details about a task.
In the command line mode, use the following syntax to get details on a task:
pmcmd gettaskdetails
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to get details on a
task:
gettaskdetails
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
When you use this command, specify the workflow name. Also, write the taskInstancePath as
a fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name.
For information on other parameters used in this command, see Table 23-5 on page 594.
Getworkflowdetails
The getworkflowdetails command returns the folder name, workflow name, last start time,
last completion time, workflow status, run mode, and the username that ran the last
workflow.
In the command line mode, use the following syntax to get details on a workflow:
pmcmd getworkflowdetails
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to get details on a
workflow:
getworkflowdetails
[<-folder|-f> folder]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Help
The help command returns the syntax for the command you specify. If you omit the
command name, pmcmd lists each command and syntax.
In the command line mode, use the following command for help with command line
commands:
pmcmd help [command]
In the interactive mode, use the following command for help with interactive mode
commands:
help [command]
Pingserver
The pingserver command verifies that the PowerCenter Server is running.
In the command line mode, use the following syntax to ping the PowerCenter Server:
pmcmd pingserver
<-serveraddr|-s> [host:]portno
In the interactive mode, enter the following syntax at the pmcmd prompt to ping the
PowerCenter Server:
pingserver
Serveraddr is the host name and port number of the PowerCenter Server.
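Because pmcmd indicates success with return code 0 (as described for waitworkflow later in this chapter), you can branch on the pingserver exit code in a script. A minimal sketch, assuming hypothetical connection values:

```shell
# Ping a hypothetical PowerCenter Server and branch on the exit code.
if pmcmd pingserver -uv USERNAME -pv PASSWORD -s SALES:6258
then
    echo "PowerCenter Server is running"
else
    echo "PowerCenter Server is not responding" >&2
    exit 1
fi
```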
Quit
The quit command disconnects pmcmd from the PowerCenter Server and closes the pmcmd
program.
In the interactive mode, use the following syntax to quit pmcmd:
quit
Note: You can use this command in the pmcmd interactive mode only.
Resumeworkflow
The resumeworkflow command resumes suspended workflows.
In the command line mode, use the following syntax to resume a workflow:
pmcmd resumeworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[-wait|-nowait]
[-recovery]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to resume a
workflow:
resumeworkflow
[<-folder|-f> folder]
[-wait|-nowait]
[-recovery]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Resumeworklet
The resumeworklet command resumes suspended worklets. To resume the workflow from a
specific worklet, specify the taskInstancePath as a fully qualified string. If you do not specify a
taskInstancePath, the workflow resumes from the suspended worklet.
In the command line mode, use the following syntax to resume a worklet:
pmcmd resumeworklet
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
[-recovery]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to resume a worklet:
resumeworklet
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
[-recovery]
taskInstancePath
For information on other parameters used in this command, see Table 23-5 on page 594.
Scheduleworkflow
The scheduleworkflow command instructs the PowerCenter Server to schedule a workflow.
Use this command to reschedule a workflow that has been removed from the schedule.
In the command line mode, use the following syntax to schedule a workflow:
pmcmd scheduleworkflow <-serveraddr|-s> [host:]portno [<-folder|-f> folder] workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to schedule a
workflow:
scheduleworkflow [<-folder|-f> folder] workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Setfolder
The setfolder command designates a folder as the default folder in which to execute all
subsequent commands. After issuing this command, you do not need to enter a folder name
for workflow, task, and session commands. If you enter a folder name in a command after the
setfolder command, that folder name overrides the default folder name for that command
only.
In the interactive mode, enter the following syntax at the pmcmd prompt to designate a folder
as the default folder:
setfolder folder
Note: You can use this command in the pmcmd interactive mode only.
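For example, an interactive session might set a default folder once and then omit the folder option from later commands. Folder and workflow names below are hypothetical:

```shell
# At the pmcmd prompt:
setfolder east
startworkflow wSalesAvg      # runs east/wSalesAvg; no -f needed
startworkflow -f west wOther # -f west overrides the default for this
                             # command only; the default remains east
```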
Setnowait
The setnowait command instructs the PowerCenter Server to execute subsequent commands
in the nowait mode. When the nowait mode is set, the pmcmd prompt is available after the
PowerCenter Server receives the previous command. No parameters are required for this
command.
In the interactive mode, enter the following syntax at the pmcmd prompt to instruct the
PowerCenter Server to execute subsequent commands in the nowait mode:
setnowait
Note: You can use this command in the pmcmd interactive mode only.
Setwait
The setwait command instructs the PowerCenter Server to execute subsequent commands in
the wait mode. The pmcmd prompt is available only after the PowerCenter Server completes
the previous command.
In the interactive mode, enter the following syntax at the pmcmd prompt to instruct the
PowerCenter Server to execute subsequent commands in the wait mode:
setwait
Showsettings
The showsettings command returns the name of the PowerCenter Server and repository to
which pmcmd is connected. It displays the username, wait mode, and default folder. No
parameters are required for this command.
In the interactive mode, enter the following syntax at the pmcmd prompt to display interactive
mode settings:
showsettings
Note: You can use this command in the pmcmd interactive mode only.
Shutdownserver
The shutdownserver command stops the PowerCenter Server. You must have the Super User
or Administer Server privilege to use this command.
You can shut down the PowerCenter Server in the complete, stop, or abort mode. In the
complete mode, pmcmd allows currently running workflows to complete before shutting
down the PowerCenter Server. In the stop mode, the PowerCenter Server stops the running
workflows. In the abort mode, the PowerCenter Server aborts the running workflows.
In the command line mode, use the following syntax to stop the PowerCenter Server:
pmcmd shutdownserver
<-serveraddr|-s> [host:]portno
<-complete|-stop|-abort>
In the interactive mode, enter the following syntax at the pmcmd prompt to stop the
PowerCenter Server:
shutdownserver
<-complete|-stop|-abort>
For information on other parameters used in this command, see Table 23-5 on page 594.
Starttask
The starttask command starts a task.
In the command line mode, use the following syntax to start a task:
pmcmd starttask
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-paramfile paramfile]
[-wait|-nowait]
[-recovery]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to start a task:
starttask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-paramfile paramfile]
[-wait|-nowait]
[-recovery]
taskInstancePath
For Windows command prompt users, the parameter file name cannot have beginning or
trailing spaces. If the name includes spaces, enclose the file name in double quotes:
-paramfile "$PMRootDir\my file.txt"
When you write a pmcmd command that includes a parameter file located on another
machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where
the variable is defined expands the server variable.
pmcmd starttask -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w
wSalesAvg -paramfile '\$PMRootDir/myfile.txt' taskA
For information on other parameters used in this command, see Table 23-5 on page 594.
Startworkflow
The startworkflow command starts a workflow.
In the command line mode, use the following syntax to start a workflow:
pmcmd startworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[<-startfrom> taskInstancePath]
[-recovery]
[-paramfile paramfile]
[<-localparamfile|-lpf> localparamfile]
[-wait|-nowait]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to start a workflow:
startworkflow
[<-folder|-f> folder]
[<-startfrom> taskInstancePath]
[-recovery]
[-paramfile paramfile]
[<-localparamfile|-lpf> localparamfile]
[-wait|-nowait]
workflow
Use the -startfrom flag to start the workflow at a designated taskInstancePath. Write the
taskInstancePath as a fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name. If you
do not specify a starting point, the workflow starts at the Start task.
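For example, to restart a hypothetical workflow at a task nested in a worklet rather than at the Start task:

```shell
# Start workflow wSalesAvg from task s_load inside worklet wl_stage;
# the taskInstancePath is fully qualified as WorkletName.TaskName.
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 \
    -f east -startfrom wl_stage.s_load wSalesAvg
```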
♦ Local machine. When you use a parameter file located on the machine where pmcmd is
invoked, pmcmd passes variables and values in the file to the PowerCenter Server. When
you list a local parameter file, specify the absolute path or relative path to the file. Use the
-localparamfile or -lpf option to indicate the location and name of the local parameter file.
On UNIX, use the following syntax:
-lpf 'param_file.txt'
On Windows, if the file name includes spaces, enclose it in quotes:
-lpf 'c:\Informatica\parameterfiles\param file.txt'
You can also use the long option name:
-localparamfile param_file.txt
For information on other parameters used in this command, see Table 23-5 on page 594.
Stoptask
The stoptask command stops a task.
In the command line mode, use the following syntax to stop a task:
pmcmd stoptask
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to stop a task:
stoptask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath
Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.
Stopworkflow
The stopworkflow command stops a workflow.
In the command line mode, use the following syntax to stop a workflow:
pmcmd stopworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
[-wait|-nowait]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to stop a workflow:
stopworkflow
[<-folder|-f> folder]
[-wait|-nowait]
workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Unscheduleworkflow
The unscheduleworkflow command instructs the PowerCenter Server to remove the workflow
from the schedule.
In the command line mode, use the following syntax to remove the workflow from the
schedule:
pmcmd unscheduleworkflow <-serveraddr|-s> [host:]portno [<-folder|-f> folder] workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to remove the
workflow from the schedule:
unscheduleworkflow [<-folder|-f> folder] workflow
For information on other parameters used in this command, see Table 23-5 on page 594.
Unsetfolder
The unsetfolder command designates no folder as the default folder. After you issue this
command, you must specify a folder name each time you enter a command for a session,
workflow, or task.
In the interactive mode, enter the following syntax at the pmcmd prompt to clear the setfolder
command:
unsetfolder
Version
The version command returns the PowerCenter Server version.
In the command line mode, use the following syntax to verify the PowerCenter version:
pmcmd version
In the interactive mode, enter the following syntax at the pmcmd prompt to verify the
PowerCenter version:
version
Waittask
The waittask command instructs the PowerCenter Server to complete the task before
returning the pmcmd prompt to the command prompt or shell.
In the command line mode, use the following syntax to set a task in the wait mode:
pmcmd waittask
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
In the interactive mode, enter the following syntax at the pmcmd prompt to set a task in the
wait mode:
waittask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath
Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.
Waitworkflow
The waitworkflow command notifies you whether the specified workflow has run successfully
or is not running. If the workflow is running, pmcmd indicates the success with return code 0
after the workflow has completed. If the workflow is not running, pmcmd indicates the
failure with a nonzero return code.
In the command line mode, use the following syntax to set a workflow to the wait mode:
pmcmd waitworkflow
<-serveraddr|-s> [host:]portno
[<-folder|-f> folder]
workflow
In the interactive mode, enter the following syntax at the pmcmd prompt to set a workflow to
the wait mode:
waitworkflow
[<-folder|-f> folder]
workflow
You can use waitworkflow in conjunction with the startworkflow command if you are running
scripts. For example, you may want to check the status of a critical workflow that was
previously started. You can use the waitworkflow command to wait for that workflow to
complete before you start the next workflow.
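For example, a script can block on a critical workflow before starting a dependent one. Workflow names and connection values below are hypothetical:

```shell
# Wait for the previously started workflow wLoadDims to complete;
# a zero exit code indicates success, so only then start the
# dependent workflow wLoadFacts.
if pmcmd waitworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 \
       -f east wLoadDims
then
    pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 \
        -f east wLoadFacts
fi
```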
For information on other parameters used in this command, see Table 23-5 on page 594.
Session Caches
Overview
The PowerCenter Server creates index and data caches in memory for Aggregator, Rank,
Joiner, and Lookup transformations in a mapping. The PowerCenter Server stores key values
in the index cache and output values in the data cache. You configure memory parameters for
the index and data cache in the transformation or session properties.
If the PowerCenter Server requires more memory, it stores overflow values in cache files.
When the session completes, the PowerCenter Server releases cache memory, and in most
circumstances, it deletes the cache files.
The PowerCenter Server creates cache files based on the PowerCenter Server code page.
Table 24-1 gives an overview of the type of information that the PowerCenter Server stores in
the index and data caches:
Transformation  Index Cache                                   Data Cache
Aggregator      Stores group values as configured in the      Stores calculations based on the
                group by ports.                               group by ports.
Rank            Stores group values as configured in the      Stores ranking information based
                group by ports.                               on the group by ports.
Joiner          Stores index values for the master source     Stores master source rows.
                table as configured in the join condition.
Lookup          Stores lookup condition information.          Stores lookup data that is not
                                                              stored in the index cache.
Memory Cache
The PowerCenter Server creates a memory cache based on the size configured in the session
properties. When you create a mapping, you specify the index and data cache size for each
transformation instance. When you create a session, you can override the index and data
cache size for each transformation instance in the session properties.
When you configure a session, you calculate the amount of memory the PowerCenter Server
needs to process the session. Calculate requirements based on factors such as processing
overhead and column size for key and output columns.
By default, the PowerCenter Server allocates 1,000,000 bytes to the index cache and
2,000,000 bytes to the data cache for each transformation instance. If the PowerCenter Server
cannot allocate the configured amount of cache memory, it cannot initialize the session and
the session fails.
If a server grid has 32-bit and 64-bit servers, and if a session exceeds 2 GB of memory, the
master server assigns it to a 64-bit server. For information on server grids, see “Working with
Server Grids” on page 446.
Cache Files
If the PowerCenter Server requires more memory than the configured cache size, it stores
overflow values in the cache files. Since paging to disk can slow session performance, try to
configure the index and data cache sizes to store data in memory.
The PowerCenter Server creates the index and data cache files by default in the PowerCenter
Server variable directory, $PMCacheDir. If you do not define $PMCacheDir, the
PowerCenter Server saves the files in the PMCache directory specified in the UNIX
configuration file or the cache directory in the Windows registry. If the UNIX PowerCenter
Server does not find a directory there, it creates the index and data files in the installation
directory. If the PowerCenter Server on Windows does not find a directory there, it creates the
files in the system directory.
If a cache file handles more than 2 GB of data, the PowerCenter Server creates multiple index
and data files. When creating these files, the PowerCenter Server appends a number to the
end of the file name, such as PMAGG*.idx1 and PMAGG*.idx2. The number of index and
data files is limited only by the amount of disk space available in the cache directory.
When you run a session, the PowerCenter Server writes a message in the session log indicating
the cache file name and the transformation name. When a session completes, the
PowerCenter Server typically deletes index and data cache files. However, you may find index
and data files in the cache directory under the following circumstances:
♦ The session performs incremental aggregation.
♦ You configure the Lookup transformation to use a persistent cache.
♦ The session does not complete successfully.
The PowerCenter Server uses the following naming convention when it creates cache files:
[<Name Prefix> | <Prefix> <session ID>_<transformation ID>]_[partition
index]<suffix>.[overflow index]
Table 24-2 describes the naming convention for cache files that the PowerCenter Server
creates:
Name Prefix      Cache file name prefix configured in the Lookup transformation.
Partition Index  If the session contains more than one partition, this identifies the
                 partition number. The partition index is zero-based, so the first
                 partition has no partition index. Partition index 2 indicates a cache
                 file created in the third partition.
Overflow Index   If a cache file handles more than 2 GB of data, the PowerCenter Server
                 creates multiple index and data files. When creating these files, the
                 PowerCenter Server appends an overflow index to the file name, such as
                 PMAGG*.idx.1 and PMAGG*.idx.2. The number of index and data files is
                 limited only by the amount of disk space available in the cache
                 directory.
For example, in the file name, PMLKUP8_4_2.idx, PMLKUP identifies the transformation
type as Lookup, 8 is the session ID, 4 is the transformation ID, and 2 is the partition index.
The cache directory should be local to the PowerCenter Server. You might encounter
performance or reliability problems when you cache large quantities of data on a mapped or
mounted drive.
For details on tuning the caches, see “Performance Tuning” on page 635.
Cache Calculations
To determine cache requirements for a session, first add the total column size in the cache to
the row overhead. Multiply the result by the number of groups or rows in the cache. This
gives the minimum caching requirements. To determine the maximum requirements for the
index cache, you multiply the minimum requirements by two.
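The arithmetic above can be sketched in shell, using the Aggregator figures from the example later in this chapter (72,000 groups, 24 bytes of group by columns, 17 bytes of index row overhead):

```shell
groups=72000
key_bytes=24          # total column size in the index cache
index_overhead=17     # per-row overhead for the Aggregator index cache

# minimum = rows * (column size + overhead); maximum index = minimum * 2
min_index=$((groups * (key_bytes + index_overhead)))
max_index=$((min_index * 2))

echo "$min_index"   # 2952000
echo "$max_index"   # 5904000
```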
The following tables provide the calculations for the minimum cache requirements for each
transformation:
Aggregator
Index  # groups [(Σ column size) + 17]   Group by columns.
Data   # groups [(Σ column size) + 7]    - Non group by input ports used in non-aggregate
                                           output expression.
                                         - Non group by input/output ports.
                                         - Local variable ports.
                                         - Column containing aggregate function (multiply
                                           by three).*
* Each aggregate function has different cache space requirements. As a general rule, you can
multiply the column containing the aggregate function by three.
Rank
Index  # groups [(Σ column size) + 17]   Group by columns.
Data   # groups [(# ranks * (Σ column size + 10)) + 20]
                                         - Non group by input ports used in non-aggregate
                                           output expression.
                                         - Non group by input/output ports.
                                         - Local variable ports.
                                         - Rank ports.
Joiner
Index  # master rows [(Σ column size) + 16]   Master columns in the join condition.
Data   # master rows [(Σ column size) + 8]    Master columns not in the join condition and
                                              used for output.
Lookup
Index  # rows in lookup table [(Σ column size) + 16] * 2 (maximum)
                                              Columns in the lookup condition.
Data   # rows in lookup table [(Σ column size) + 8]
                                              - Connected output ports not in the lookup
                                                condition.
                                              - Return port (for unconnected Lookup
                                                transformations).
For more information about each cache, see the separate sections in this chapter.
Table 24-7 lists the column sizes, in bytes, to use in cache calculations for each datatype:
Datatype        Aggregator, Rank    Lookup, Joiner
Date/Time       18                  24
Double          10                  16
Real            10                  16
Integer         6                   16
Small integer   6                   16
The column sizes include the bytes required for a null indicator.
Additionally, to increase lookup and join performance, the PowerCenter Server aligns all data
for lookup and joiner caches on an eight byte boundary. So, each Lookup and Joiner column
includes rounding to the nearest multiple of eight.
Use the column sizes in Table 24-7 on page 618 to add the group by columns.
You know that there are 36 stores and 2,000 items, so the total number of groups is 72,000.
Use the following calculation to determine the minimum index cache requirements:
72,000 * (24 + 17) = 2,952,000
Therefore, this Aggregator transformation requires an index cache size between 2,952,000
and 5,904,000 bytes.
# groups[( Σ column size) + 7] - Non group by input ports used in non-aggregate output expression.
- Non group by input/output ports.
- Local variable ports.
- Port containing aggregate function (multiply by three).*
*The cache space requirements for aggregate functions are different for each function. However, you can multiply the port containing the
aggregate function by three for all aggregate functions.
Use the column sizes in Table 24-7 on page 618 to add the columns in the data cache:
Note that you do not use STORE_ID and ITEM in the data cache calculation. These
columns are connected to the target, but you do not use them in the cache calculation because
they are group by ports and are used in the index cache calculation.
The total number of groups as calculated for the index cache size is 72,000. Use the following
calculation to determine the minimum data cache requirements:
72,000 * (36 + 7) = 3,096,000
Therefore, this Aggregator transformation requires a data cache size of 3,096,000 bytes.
Use the column sizes in Table 24-7 on page 618 to add the columns in the index cache:
PRODUCTS is the master source and has 90,000 rows. Use the following calculation to
determine the minimum index cache requirements:
90,000 * (16 + 16) = 2,880,000
Therefore, this Joiner transformation requires an index cache size between 2,880,000 and
5,760,000 bytes.
# master rows [( Σ column size) + 8] Master column not in join condition and used for output.
Use the column sizes in Table 24-7 on page 618 to add the columns for the data cache:
Note that you do not use ITEM_NO in the data cache calculation because it is part of the
join condition and is used in the index cache.
The master source has 90,000 rows.
Use the following calculation to determine the minimum data cache requirements:
90,000 * (62 + 8) = 6,300,000
Static Cache
When you use a static lookup cache, the PowerCenter Server creates one memory cache for
each partition.
If you use cache partitioning, the PowerCenter Server requires only a portion of the total
memory to cache each partition. So, when you configure cache size, you can divide the total
memory requirements by the number of partitions.
If you do not use cache partitioning, the PowerCenter Server requires as much memory for
each partition as it does for a single partition pipeline. So, when you configure cache size, you
enter the total memory requirements for the transformation.
If two Lookup transformations in a mapping share the cache, the PowerCenter Server does not
allocate additional memory for shared transformations in the same pipeline stage. For shared
transformations in a different pipeline stage, the PowerCenter Server does allocate additional
memory.
Static Lookup transformations that use the same data or a subset of data to create a disk cache
can share the disk cache. However, the lookup keys may be different, so the transformations
must have separate memory caches.
For more information about caching the Lookup transformation, see “Lookup Caches” in the
Transformation Guide.
Dynamic Cache
When you use a dynamic lookup cache, the PowerCenter Server creates the memory cache
based on whether you use cache partitioning or not.
If you use cache partitioning, the PowerCenter Server creates one memory cache for each
partition. It requires only a portion of the total memory to cache each partition. So, when you
configure cache size, you can divide the total memory requirements by the number of
partitions.
Example
The Lookup transformation, LKP_PROMOS, looks up values based on the ITEM_ID. It
uses the following lookup condition:
ITEM_ID = IN_ITEM_ID1
Use the column sizes in Table 24-7 on page 618 to add the columns for the index cache:
The lookup condition uses one column, ITEM_ID, and the table contains 60,000 rows.
Use the following calculation to determine the minimum index cache requirements:
60,000 * (16 + 16) = 1,920,000
Use the following calculation to determine the maximum index cache requirements:
60,000 * (16 + 16) * 2 = 3,840,000
# rows in lookup table [( Σ column size) + 8] Connected output ports not in the lookup condition.
Use return ports for unconnected transformations.
The following figure shows the connected output ports for LKP_PROMOS:
Use the column sizes in Table 24-7 on page 618 to add the columns for the data cache:
For example, a Rank transformation configured to return the top three values receives the
following input values: 10,000, 12,210, 5,000, 2,455, and 6,324.
The PowerCenter Server caches the first three rows (10,000, 12,210, and 5,000). When the
PowerCenter Server reads the next row (2,455) it compares it to the cache values. Since the
row is lower in rank than the cached rows, it discards the row with 2,455. The next row
(6,324), however, is higher in rank than one of the cached rows. Therefore, the PowerCenter
Server replaces the cached row with the higher-ranked input row.
If the Rank transformation is configured to rank across multiple groups, the PowerCenter
Server ranks incrementally for each group it finds.
The PowerCenter Server uses cache partitioning when you create multiple partitions in a
pipeline that contains a Rank transformation. It creates one memory cache and one disk cache
per partition and routes data from one partition to another based on group key values of the
transformation.
After you configure the partitions in the session, you can configure the memory requirements
and cache directories for the Rank transformation on the Mappings tab in session properties.
For more information about the Rank transformation, see “Rank Transformation” in the
Transformation Guide.
Use the column sizes in Table 24-7 on page 618 to add the columns in the index cache:
There are 10,000 product categories, so the total number of groups is 10,000. Use the
following calculation to determine the minimum index cache requirements:
10,000 * (24 + 17) = 410,000
Therefore, this Rank transformation requires an index cache size between 410,000 and
820,000 bytes.
# groups [(# ranks *( Σ column size + 10)) + 20] - Non group by input ports used in non-
aggregate output expression.
- Non group by input/output ports.
- Local variable ports.
- Rank ports.
Use the column sizes in Table 24-7 on page 618 to add the columns in the data cache:
RNK_TOPTEN ranks by price, and the total number of ranks is 10. The number of groups is
10,000.
Use the following calculation to determine the minimum data cache requirements:
10,000[(10 * (46 + 10)) + 20] = 5,800,000
Performance Tuning
Overview
The goal of performance tuning is to optimize session performance by eliminating
performance bottlenecks. To tune the performance of a session, first you identify a
performance bottleneck, eliminate it, and then identify the next performance bottleneck until
you are satisfied with the session performance. You can use the test load option to run sessions
when you tune session performance.
The most common performance bottleneck occurs when the PowerCenter Server writes to a
target database. You can identify performance bottlenecks by the following methods:
♦ Running test sessions. You can configure a test session to read from a flat file source or to
write to a flat file target to identify source and target bottlenecks.
♦ Studying performance details. You can create a set of information called performance
details to identify session bottlenecks. Performance details provide information such as
buffer input and output efficiency. For details about performance details, see “Creating
and Viewing Performance Details” on page 436.
♦ Monitoring system performance. You can use system monitoring tools to view percent
CPU usage, I/O waits, and paging to identify system bottlenecks.
Once you determine the location of a performance bottleneck, you can eliminate the
bottleneck by following these guidelines:
♦ Eliminate source and target database bottlenecks. Have the database administrator
optimize database performance by optimizing the query, increasing the database network
packet size, or configuring index and key constraints.
♦ Eliminate mapping bottlenecks. Fine tune the pipeline logic and transformation settings
and options in mappings to eliminate mapping bottlenecks.
♦ Eliminate session bottlenecks. You can optimize the session strategy and use performance
details to help tune session configuration.
♦ Eliminate system bottlenecks. Have the system administrator analyze information from
system monitoring tools and improve CPU and network performance.
If you tune all the bottlenecks above, you can further optimize session performance by
increasing the number of pipeline partitions in the session. Adding partitions can improve
performance by utilizing more of the system hardware while processing the session.
Because determining the best way to improve performance can be complex, change only one
variable at a time, and time the session both before and after the change. If session
performance does not improve, you might want to return to your original configurations.
Bulk Loading
You can use bulk loading to improve the performance of a session that inserts a large amount
of data to a DB2, Sybase, Oracle, or Microsoft SQL Server database. Configure bulk loading
on the Mapping tab.
When bulk loading, the PowerCenter Server bypasses the database log, which speeds
performance. Without writing to the database log, however, the target database cannot
perform rollback. As a result, you may not be able to perform recovery. Therefore, you must
weigh the importance of improved session performance against the ability to recover an
incomplete session.
External Loading
You can use the External Loader session option to integrate external loading with a session.
If you have a DB2 EE or DB2 EEE target database, you can use the DB2 EE or DB2 EEE
external loaders to bulk load target files. The DB2 EE external loader uses the PowerCenter
Server db2load utility to load data. The DB2 EEE external loader uses the DB2 Autoloader
utility.
If you have a Teradata target database, you can use the Teradata external loader utility to bulk
load target files.
If your target database runs on Oracle, you can use the Oracle SQL*Loader utility to bulk
load target files. When you load data to an Oracle database using a pipeline with multiple
partitions, you can increase performance if you create the Oracle target table with the same
number of partitions you use for the pipeline.
If your target database runs on Sybase IQ, you can use the Sybase IQ external loader utility to
bulk load target files. If your Sybase IQ database is local to the PowerCenter Server on your
UNIX system, you can increase performance by loading data to target tables directly from
named pipes.
For details on the External Loader option, see “External Loading” on page 523.
Caching Lookups
If a mapping contains Lookup transformations, you might want to enable lookup caching. In
general, you want to cache lookup tables that need less than 300MB.
When you enable caching, the PowerCenter Server caches the lookup table and queries the
lookup cache during the session. When this option is not enabled, the PowerCenter Server
queries the lookup table on a row-by-row basis. You can increase performance using a shared
or persistent cache:
♦ Shared cache. You can share the lookup cache between multiple transformations. You can
share an unnamed cache between transformations in the same mapping. You can share a
named cache between transformations in the same or different mappings.
♦ Persistent cache. If you want to save and reuse the cache files, you can configure the
transformation to use a persistent cache. Use this feature when you know the lookup table
does not change between session runs. Using a persistent cache can improve performance
because the PowerCenter Server builds the memory cache from the cache files instead of
from the database.
For more information on lookup caching options, see “Lookup Transformation” in the
Transformation Guide.
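To see why caching helps, compare a row-by-row lookup with one that builds an in-memory cache first. The following is only an illustrative sketch in Python, not PowerCenter internals; the `query_database` callable is a hypothetical stand-in for a database query.

```python
def lookup_uncached(rows, query_database):
    # Queries the database once per source row.
    return [query_database(row["key"]) for row in rows]

def lookup_cached(rows, query_database, lookup_keys):
    # Builds the cache once, then serves every row from memory.
    cache = {key: query_database(key) for key in lookup_keys}
    return [cache.get(row["key"]) for row in rows]
```

When many source rows share lookup keys, the cached version issues one query per distinct key instead of one query per row, which mirrors the benefit described above.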
Optimizing Expressions
As a final step in tuning the mapping, you can focus on the expressions used in
transformations. When examining expressions, focus on complex expressions for possible
simplification. Remove expressions one-by-one to isolate the slow expressions.
Once you locate the slowest expressions, take a closer look at how you can optimize those
expressions.
For example, in the following expression, the PowerCenter Server reads COLUMN_A, finds
its sum, reads COLUMN_B, finds its sum, and then adds the two sums:
SUM(COLUMN_A) + SUM(COLUMN_B)
If you factor out the aggregate function call, as below, the PowerCenter Server adds
COLUMN_A to COLUMN_B for each row, then finds the sum once, which produces the
same result with less work:
SUM(COLUMN_A + COLUMN_B)
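As a rough Python analogy (not Informatica expression syntax), factoring the aggregate means adding per row and summing once, instead of summing each column separately; both forms return the same total:

```python
def sum_separately(col_a, col_b):
    # Analogous to SUM(COLUMN_A) + SUM(COLUMN_B): two aggregate passes.
    return sum(col_a) + sum(col_b)

def sum_factored(col_a, col_b):
    # Analogous to SUM(COLUMN_A + COLUMN_B): add per row, aggregate once.
    return sum(a + b for a, b in zip(col_a, col_b))
```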
Evaluating Expressions
If you are not sure which expressions slow performance, the following steps can help isolate
the problem.
Pipeline Partitioning
If you purchased the partitioning option, you can increase the number of partitions in a
pipeline to improve session performance. Increasing the number of partitions allows the
PowerCenter Server to create multiple connections to sources and process partitions of source
data concurrently.
When you create a session, the Workflow Manager validates each pipeline in the mapping for
partitioning. You can specify multiple partitions in a pipeline if the PowerCenter Server can
maintain data consistency when it processes the partitioned data.
For details on partitioning sessions, see “Pipeline Partitioning” on page 663.
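Conceptually, partitioning lets independent slices of source data move through the pipeline concurrently. The sketch below is only a Python analogy with a trivial stand-in transformation; it is not how the PowerCenter Server itself is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

def process_partition(rows):
    # Stand-in for reading, transforming, and writing one partition's rows.
    return [row * 2 for row in rows]

def run_partitioned(source_rows, num_partitions):
    # Deal rows round-robin into partitions and process them concurrently.
    slices = [source_rows[i::num_partitions] for i in range(num_partitions)]
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = pool.map(process_partition, slices)
    return [row for partition in results for row in partition]
```

The total work is the same; the gain comes from the partitions running at the same time on available CPUs, as the section above describes.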
For example, a session that reads from and writes to a total of 100 sources and targets needs
at least 200 buffer blocks:
100 * 2 = 200
Based on the default settings, you can then either change the DTM Buffer Size to
15,000,000 or change the Default Buffer Block Size to 54,000 to satisfy the formula:
(session Buffer Blocks) = (.9) * (DTM Buffer Size) / (Default Buffer Block
Size) * (number of partitions)
or
200 = .9 * 12000000 / 54000 * 1
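The formula shown above can be evaluated directly. A minimal sketch that just checks the arithmetic of the example (the 90% factor and the values are taken from the formula as shown):

```python
def session_buffer_blocks(dtm_buffer_size, block_size, num_partitions=1):
    # (session buffer blocks) =
    #   (.9) * (DTM Buffer Size) / (Default Buffer Block Size) * (partitions)
    return 0.9 * dtm_buffer_size / block_size * num_partitions

# With the 12,000,000-byte default DTM buffer and a 54,000-byte block
# size, one partition yields about 200 buffer blocks, as in the example.
blocks = session_buffer_blocks(12_000_000, 54_000)
```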
Session Properties Reference
This appendix contains a listing of settings in the session properties. These settings are
grouped by the following tabs:
♦ General Tab, 668
♦ Properties Tab, 670
♦ Config Object Tab, 675
♦ Mapping Tab (Transformations View), 681
♦ Mapping Tab (Partitions View), 705
♦ Components Tab, 710
♦ Metadata Extensions Tab, 718
General Tab
By default, the General tab appears when you edit a session task.
Figure A-1 displays the General tab:
On the General tab you can rename the session task and enter a description for the session
task.
Table A-1 describes settings on the General tab:
Rename Optional The Rename button allows you to enter a new name for the session task.
Description Optional You can enter a description for the session task in the Description field.
Mapping name Required The name of the mapping associated with the session task.
Server Required The name of the server associated with the session task.
Fail Parent if this task fails* Optional Fails the parent worklet or workflow if this task fails.
Fail parent if this task does not run* Optional Fails the parent worklet or workflow if this task does not run.
Treat the input links as AND or OR* Required Runs the task when all or one of the input link conditions evaluate to True.
*Appears only in the Workflow Designer.
Session Log File Name Optional By default, the PowerCenter Server uses the session name for the log file
name: s_mapping name.log. For a debug session, it uses
DebugSession_mapping name.log.
Optionally enter a file name, a file name and directory, or use the
$PMSessionLogFile session parameter. The PowerCenter Server appends
information in this field to that entered in the Session Log File Directory field.
For example, if you have “C:\session_logs\” in the Session Log File Directory
field and enter “logname.txt” in the Session Log File Name field, the
PowerCenter Server writes logname.txt to the C:\session_logs\ directory.
You can also use the $PMSessionLogFile session parameter to represent the
name of the session log or the name and location of the session log. For details
on session parameters, see “Session Parameters” on page 495.
Session Log File Directory Required Designates a location for the session log file. By default, the PowerCenter
Server writes the log file in the server variable directory,
$PMSessionLogFileDir.
If you enter a full directory and file name in the Session Log File Name field,
clear this field.
Parameter File Name Optional Designates the name and directory for the parameter file. Use the parameter
file to define session parameters. You can also use it to override values of
mapping parameters and variables. For details on session parameters, see
“Session Parameters” on page 495. For details on mapping parameters and
variables, see “Mapping Parameters and Variables” in the Designer Guide.
Enable Test Load Optional You can configure the PowerCenter Server to perform a test load.
With a test load, the PowerCenter Server reads and transforms data without
writing to targets. The PowerCenter Server generates all session files, and
performs all pre- and post-session functions, as if running the full session.
The PowerCenter Server writes data to relational targets, but rolls back the
data when the session completes. For all other target types, such as flat file
and SAP BW, the PowerCenter Server does not write data to the targets.
Enter the number of source rows you want to test in the Number of Rows to
Test field.
You cannot perform a test load on sessions using XML sources.
Note: You can perform a test load when you configure a session for normal
mode. If you configure the session for bulk mode, the session fails.
Number of Rows to Test Optional Enter the number of source rows you want the PowerCenter Server to test
load.
The PowerCenter Server reads the exact number you configure for the test
load. You cannot perform a test load when you run a session against a
mapping that contains XML sources.
$Source Connection Value Optional Enter the database connection you want the PowerCenter Server to use for the
$Source variable. Choose a relational or application database connection. You
can also choose a $DBConnection parameter.
You can use the $Source variable in Lookup and Stored Procedure
transformations to specify the database location for the lookup table or stored
procedure.
If you use $Source in a mapping, you can specify the database location in this
field to ensure the PowerCenter Server uses the correct database connection
to run the session.
If you use $Source in a mapping, but do not specify a database connection in
this field, the PowerCenter Server determines which database connection to
use when it runs the session. If it cannot determine the database connection, it
fails the session. For more information, see “Lookup Transformation” and
“Stored Procedure Transformation” in the Transformation Guide.
$Target Connection Value Optional Enter the database connection you want the PowerCenter Server to use for the
$Target variable. Choose a relational or application database connection. You
can also choose a $DBConnection parameter.
You can use the $Target variable in Lookup and Stored Procedure
transformations to specify the database location for the lookup table or stored
procedure.
If you use $Target in a mapping, you can specify the database location in this
field to ensure the PowerCenter Server uses the correct database connection
to run the session.
If you use $Target in a mapping, but do not specify a database connection in
this field, the PowerCenter Server determines which database connection to
use when it runs the session. If it cannot determine the database connection, it
fails the session. For more information, see “Lookup Transformation” and
“Stored Procedure Transformation” in the Transformation Guide.
Treat Source Rows As Required Indicates how the PowerCenter Server treats all source rows. If the mapping
for the session contains an Update Strategy transformation or a Custom
transformation configured to set the update strategy, the default option is Data
Driven.
When you select Data Driven and you load to either a Microsoft SQL Server or
Oracle database, you must use a normal load. If you bulk load, the
PowerCenter Server fails the session.
Commit Type Required Determines whether the PowerCenter Server uses a source-based, target-
based, or user-defined commit. You can choose source- or target-based commit if the
mapping has no Transaction Control transformation or only ineffective
Transaction Control transformations. By default, the PowerCenter Server
performs a target-based commit.
A User-Defined commit is enabled by default if the mapping has effective
Transaction Control transformations.
For details on Commit Intervals, see “Setting Commit Properties” on page 292.
Commit Interval Required In conjunction with the selected commit type, indicates the number of rows at
which the PowerCenter Server commits. By default, the PowerCenter Server
uses a commit interval of 10,000 rows.
This option is not available for user-defined commit.
Commit On End Of File Required By default, this option is enabled and the PowerCenter Server performs a
commit at the end of the file. Clear this option if you want to roll back open
transactions.
transactions.
This option is enabled by default for a target-based commit. You cannot disable
it.
Rollback Transactions on Errors Optional For source-based commit, the PowerCenter Server rolls back the transaction at
the next commit point when it encounters a non-fatal writer error.
For user-defined commit, the PowerCenter Server rolls back the transaction at
the next commit point when it encounters a non-fatal error.
This option is not available for target-based commit.
*Tip: When you bulk load to Microsoft SQL Server or Oracle targets, define a large commit interval. Microsoft SQL
Server and Oracle start a new bulk load transaction after each commit. Increasing the commit interval reduces the
number of bulk load transactions and increases performance.
Performance Settings
You can configure performance settings on the Properties tab. In Performance settings you
can increase memory size, collect performance details, and set configuration parameters.
Figure A-3 displays the Performance settings on the Properties tab:
Performance Settings    Required/Optional    Description
DTM Buffer Size Required The amount of memory allocated to the session from the DTM process. By
default, the Workflow Manager allocates 12 MB for DTM buffer memory. If a
session contains large amounts of character data and you configure it to run in
Unicode mode, increase the DTM Buffer size to 24 MB.
Note: If a source contains a large binary object with a precision larger than the
allocated DTM buffer size, then increase the DTM buffer size to increase the
buffer memory. If you do not increase the DTM buffer memory, the session will
fail.
For information on improving session performance, see “Performance Tuning”
on page 635.
Collect Performance Data Optional When selected, the PowerCenter Server creates session performance details.
Use this file to help determine how you can improve session performance. For
more information, see “Performance Tuning” on page 635.
Incremental Aggregation Optional Select the Incremental Aggregation option if you want the PowerCenter Server
to perform incremental aggregation. For details, see “Using Incremental
Aggregation” on page 573.
Enable High Precision Optional When selected, the PowerCenter Server processes the Decimal datatype to a
precision of 28. If a session does not use the Decimal datatype, leave this
setting clear. For details on using the Decimal datatype with high precision, see
“Handling High Precision Data” on page 204.
Session Retry On Deadlock Optional Select this option if you want the PowerCenter Server to retry target writes on
deadlock. You can only use Session Retry on Deadlock for sessions configured
for normal load. This option is disabled for bulk mode. You can configure the
PowerCenter Server to set the number of deadlock retries and the deadlock
sleep time period.
Session Sort Order Required Specify a sort order for the session. The session properties display all sort
orders associated with the PowerCenter Server code page. When the
PowerCenter Server runs in Unicode mode, it sorts character data in the
session using the selected sort order. When the PowerCenter Server runs in
ASCII mode, it ignores this setting and uses a binary sort order to sort
character data.
Advanced Settings
Advanced settings allow you to configure constraint-based loading, lookup caches, and buffer
sizes.
Table A-4 describes the Advanced settings of the Config Object tab:
Advanced Settings    Required/Optional    Description
Constraint Based Load Ordering Optional The PowerCenter Server loads targets based on primary key-foreign key
constraints where possible.
Cache Lookup() Function Optional If selected, the PowerCenter Server caches PowerMart 3.5 LOOKUP functions
in the mapping, overriding mapping-level LOOKUP configurations.
If not selected, the PowerCenter Server performs lookups on a row-by-row
basis, unless otherwise specified in the mapping.
Default Buffer Block Size Optional This setting is performance related. For details on performance tuning, see
“Performance Tuning” on page 635.
Note: The session must have enough buffer blocks to initialize. The minimum
number of buffer blocks must be greater than the total number of sources
(Source Qualifiers, Normalizers for COBOL sources), and targets. The number
of buffer blocks in a session = DTM Buffer Size / Buffer Block Size. Default
settings create enough buffer blocks for 83 sources and targets. If the session
contains more than 83, you might need to increase DTM Buffer Size or
decrease Default Buffer Block Size.
Line Sequential Buffer Length Optional Affects the way the PowerCenter Server reads flat files. Increase this setting
from the default of 1024 bytes per line only if source flat file records are larger
than 1024 bytes.
Log Options Settings    Required/Optional    Description
Save Session Log By Required If you select Save Session Log by Timestamp, the PowerCenter Server
saves all session logs, appending a timestamp to each log.
If you select Save Session Log by Runs, the PowerCenter Server saves
a designated number of session logs. Configure the number of sessions
in the Save Session Log for These Runs option.
You can also use the $PMSessionLogCount server variable to save the
configured number of session logs for the PowerCenter Server.
For details on these options, see “Configuring Session Logs” on
page 469.
Save Session Log for These Runs Required The number of historical session logs you want the PowerCenter Server
to save.
The PowerCenter Server saves the number of historical logs you specify, plus the
most recent session log. Therefore, if you specify 5 runs, the
PowerCenter Server saves the most recent session log, plus historical
logs 0-4, for a total of 6 logs.
You can specify up to 2,147,483,647 historical logs. If you specify 0 logs,
the PowerCenter Server saves only the most recent session log.
Table A-6 describes the Error handling settings of the Config Object tab:
Stop On Errors Optional Indicates how many non-fatal errors the PowerCenter Server can
encounter before it stops the session. Non-fatal errors include reader,
writer, and DTM errors. Enter the number of non-fatal errors you want to
allow before stopping the session. The PowerCenter Server maintains an
independent error count for each source, target, and transformation. If
you specify 0, non-fatal errors do not cause the session to stop.
Optionally use the $PMSessionErrorThreshold server variable to stop on
the configured number of errors for the PowerCenter Server.
Override Tracing Optional Overrides tracing levels set at the transformation level. Selecting this
option enables a menu from which you choose a tracing level: None,
Terse, Normal, Verbose Initialization, or Verbose Data. For details on
tracing levels, see “Configuring Session Logs” on page 469.
On Stored Procedure Optional Required if the session uses pre- or post-session stored procedures.
Error If you select Stop Session, the PowerCenter Server stops the session on
errors executing a pre-session or post-session stored procedure.
If you select Continue Session, the PowerCenter Server continues the
session regardless of errors executing pre-session or post-session stored
procedures.
By default, the PowerCenter Server stops the session on Stored
Procedure error and marks the session failed.
On Pre-Post SQL Error Optional Required if the session uses pre- or post-session SQL.
If you select Stop Session, the PowerCenter Server stops the session on
errors executing pre-session or post-session SQL.
If you select Continue, the PowerCenter Server continues the session
regardless of errors executing pre-session or post-session SQL.
By default, the PowerCenter Server stops the session upon pre- or post-
session SQL error and marks the session failed.
Enable Recovery Optional Enables recovery for the session. For details on recovery, see
“Recovering Data” on page 295.
Error Log Type Required Specifies the type of error log to create. You can specify relational, file, or
no log. By default, the Error Log Type is set to none.
Error Log DB Connection Optional Specifies the database connection for a relational error log.
Error Log Table Name Prefix Optional Specifies the table name prefix for a relational error log. Oracle and Sybase
have a 30 character limit for table names. If a table name exceeds 30
characters, the session fails.
Error Log File Directory Optional Specifies the directory where errors are logged. By default, the error log
file directory is $PMBadFilesDir\.
Error Log File Name Optional Specifies error log file name. By default, the error log file name is
PMError.log.
Log Row Data Optional Specifies whether or not to log row data. By default, the check box is clear
and row data is not logged.
Log Source Row Data Optional Specifies whether or not to log source row data. By default, the check box
is clear and source row data is not logged.
Data Column Delimiter Optional Delimiter for string type source row data and transformation group row
data. By default, the PowerCenter Server uses a pipe ( | ) delimiter. Verify
that you do not use the same delimiter for the row data as the error
logging columns. If you use the same delimiter, you may find it difficult to
read the error log file.
Connections Node
The Connections node displays the source, target, lookup, stored procedure, FTP, external
loader, and queue connections. You can choose connection types and connection values. You
can also edit connection object values.
Figure A-7 displays the Connections settings on the Mapping tab:
Connections Node Settings    Required/Optional    Description
Type Required Enter the connection type for relational and non-relational sources and targets.
Specifies Relational for relational sources and targets.
You can choose the following connection types for flat file, XML, and MQSeries
sources/targets:
- Queue. Select this connection type to access a MQSeries source if you are
using MQ Source Qualifiers. For static MQSeries targets, set the connection
type to FTP or Queue. For dynamic MQSeries targets, the connection type is
set to Queue. MQSeries connections must be defined in the Workflow
Manager prior to configuring sessions. For more information, see the
PowerCenter Connect for IBM MQSeries User and Administrator Guide .
- Loader. Select this connection type to use the External Loader to load output
files to Teradata, Oracle, DB2, or Sybase IQ databases. If you select this
option, select a configured loader connection in the Value column.
To use this option, you must use a mapping with a relational target definition
and choose File as the writer type on the Writers tab for the relational target
instance. As the PowerCenter Server completes the session, it uses an
external loader to load target files to the Oracle, Sybase IQ, DB2, or Teradata
database. You cannot choose external loader for flat file or XML target
definitions in the mapping.
Note to Oracle 8 users: If you configure a session to write to an Oracle 8
external loader target table in bulk mode with NOT NULL constraints on any
columns, the session may write the null character into a NOT NULL column if
the mapping generates a NULL output.
For details on using the external loader feature, see “External Loading” on
page 523.
- FTP. Select this connection type to use FTP to access the source/target
directory for flat file and XML sources/targets. If you select this option, select
a configured FTP connection in the Value column. FTP connections must be
defined in the Workflow Manager prior to configuring sessions. For details on
using FTP, see “Using FTP” on page 559.
- None. Choose None when you want to read from a local flat file or XML file, or
if you are using an associated source for a MQSeries session.
The Type column also lists the connections in the mapping, such as the $Source
connection value and $Target connection value.
You can also configure connection information for Lookups and Stored
Procedures.
Value Required Enter a source and target connection based on the value you choose in the
Type column. You can also specify the $Source and $Target connection value:
- $Source connection value. Enter the database connection you want the
PowerCenter Server to use for the $Source variable. Choose a relational or
application database connection. You can also choose a $DBConnection
parameter. You can use the $Source variable in Lookup and Stored
Procedure transformations to specify the database location for the lookup
table or stored procedure. If you use $Source in a mapping, you can specify
the database location in this field to ensure the PowerCenter Server uses the
correct database connection to run the session. If you use $Source in a
mapping, but do not specify a database connection in this field, the
PowerCenter Server determines which database connection to use when it
runs the session. If it cannot determine the database connection, it fails the
session. For more information, see the Transformation Guide.
- $Target connection value. Enter the database connection you want the
PowerCenter Server to use for the $Target variable. Choose a relational or
application database connection. You can also choose a $DBConnection
parameter. You can use the $Target variable in Lookup and Stored Procedure
transformations to specify the database location for the lookup table or stored
procedure. If you use $Target in a mapping, you can specify the database
location in this field to ensure the PowerCenter Server uses the correct
database connection to run the session. If you use $Target in a mapping, but
do not specify a database connection in this field, the PowerCenter Server
determines which database connection to use when it runs the session. If it
cannot determine the database connection, it fails the session. For more
information, see the Transformation Guide.
You can also specify the lookup and stored procedure location information
value, if your mapping has lookups or stored procedures.
Sources Node
The Sources node lists the sources used in the session and displays their settings. If you want
to view and configure the settings of a specific source, select the source from the list.
You can configure the following settings:
♦ Readers. The Readers settings display the reader the PowerCenter Server uses with each
source instance. For more information, see “Readers Settings” on page 684.
♦ Connections. The Connections settings allow you to configure connections for the
sources. For more information, see “Connections Settings” on page 684.
♦ Properties. The Properties settings allow you to configure the source properties. For more
information, see “Properties Settings” on page 686.
Connections Settings
You can configure the connections the PowerCenter Server uses with each source instance.
Table A-8 describes the Connections settings on the Mapping tab (Sources node):
Connections Settings    Required/Optional    Description
Type Required Enter the connection type for relational and non-relational sources. Specifies
Relational for relational sources.
You can choose the following connection types for flat file, XML, and MQSeries
sources:
- Queue. Select this connection type to access a MQSeries source if you are using
MQ Source Qualifiers. MQSeries connections must be defined in the Workflow
Manager prior to configuring sessions. For more information, see the PowerCenter
Connect for IBM MQSeries User and Administrator Guide .
- FTP. Select this connection type to use FTP to access the source directory for flat
file and XML sources. If you want to extract data from a flat file or XML source
using FTP, you must specify an FTP connection when you configure source
options. If you select this option, select a configured FTP connection in the Value
column. FTP connections must be defined in the Workflow Manager prior to
configuring sessions. For details on using FTP, see “Using FTP” on page 559.
- None. Choose None when you want to read from a local flat file or XML file, or if
you are using an associated source for a MQSeries session.
Value Required Enter a source connection based on the value you choose in the Type column.
Table A-9 describes Properties settings on the Mapping tab for relational sources:
Table A-9. Mapping Tab - Sources Node - Properties Settings (Relational Sources)
Relational Source Options    Required/Optional    Description
User Defined Join Optional Specifies the condition used to join data from multiple sources
represented in the same Source Qualifier transformation. For
more information about user defined join, see “Source
Qualifier Transformation” in the Transformation Guide.
Tracing Level N/A Specifies the amount of detail included in the session log
when you run a session containing this transformation. You
can view the value of this attribute when you click Show all
properties. For more information about tracing level, see
“Setting Tracing Levels” on page 473.
Pre SQL Optional Pre-session SQL commands to run against the source
database before the PowerCenter Server reads the source.
For more information about pre-session SQL, see “Using Pre-
and Post-Session SQL Commands” on page 186.
Post SQL Optional Post-session SQL commands to run against the source
database after the PowerCenter Server writes to the target.
For more information about post-session SQL, see “Using
Pre- and Post-Session SQL Commands” on page 186.
Sql Query Optional Defines a custom query that replaces the default query the
PowerCenter Server uses to read data from sources
represented in this Source Qualifier. A custom query overrides
entries for a custom join or a source filter. For more
information, see “Overriding the SQL Query” on page 216.
Source Filter Optional Specifies the filter condition the PowerCenter Server applies
when querying records. For more information, see “Source
Qualifier Transformation” in the Transformation Guide.
Table A-10 describes the Properties settings on the Mapping tab for file sources:
Table A-10. Mapping Tab - Sources Node - Properties Settings (File Sources)
Source File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server looks
in the server variable directory, $PMSourceFileDir, for file sources.
If you specify both the directory and file name in the Source Filename field,
clear this field. The PowerCenter Server concatenates this field with the Source
Filename field when it runs the session.
You can also use the $InputFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Source Filename Required Enter the file name, or file name and path. Optionally use the $InputFileName
session parameter for the file name.
The PowerCenter Server concatenates this field with the Source File Directory
field when it runs the session. For example, if you have “C:\data\” in the Source
File Directory field, then enter “filename.dat” in the Source Filename field.
When the PowerCenter Server begins the session, it looks for
“C:\data\filename.dat”.
By default, the Workflow Manager enters the file name configured in the source
definition.
For details on session parameters, see “Session Parameters” on page 495.
Source Filetype Required Allows you to configure multiple file sources using a file list.
Indicates whether the source file contains the source data, or a list of files with
the same file properties. Choose Direct if the source file contains the source
data. Choose Indirect if the source file contains a list of files.
When you select Indirect, the PowerCenter Server finds the file list then reads
each listed file when it executes the session. For details on file lists, see “Using
a File List” on page 230.
Set File Properties Optional Allows you to configure the file properties. For more information, see “Setting
File Properties for Sources” on page 688.
Datetime Format* N/A Displays the datetime format for datetime fields.
Decimal Separator* N/A Displays the decimal separator for numeric fields.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the
Designer Guide.
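The Direct and Indirect file types above can be sketched as follows. This is an illustration of the behavior, not PowerCenter internals, and the file names used are hypothetical:

```python
# Illustrative sketch of the Direct vs. Indirect Source Filetype setting.
# Direct: the file contains the source data itself.
# Indirect: the file is a file list naming the actual data files.
def read_source(path, filetype="Direct"):
    with open(path) as f:
        if filetype == "Direct":
            return f.read().splitlines()      # file holds the source rows
        rows = []
        for listed in f.read().splitlines():  # file holds names of data files
            with open(listed.strip()) as data:
                rows.extend(data.read().splitlines())
        return rows
```

With an indirect source, every file named in the list must share the same file properties, as the table above notes.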
Select the file type (fixed-width or delimited) you want to configure and click Advanced.
Figure A-12 displays the Fixed Width Properties dialog box for flat file sources:
Table A-11 describes the options you define in the Fixed Width Properties dialog box for
sources:
Fixed-Width Properties Options | Required/Optional | Description
Null Character: Text/Binary Required Indicates the character representing a null value in the file. This can be any valid character in the file code page, or any binary value from 0 to 255. For more information about specifying null characters, see “Null Character Handling” on page 227.
Repeat Null Character Optional If selected, the PowerCenter Server reads repeated null characters in a single field as a single null value. If you do not select this option, the PowerCenter Server reads a single null character at the beginning of a field as a null field. Important: For multibyte code pages, Informatica recommends that you specify a single-byte null character if you are using repeating non-binary null characters. This ensures that repeating null characters fit into the column exactly. For more information about specifying null characters, see “Null Character Handling” on page 227.
Code Page Required Select the code page of the fixed-width file. The default setting is the client
code page.
Number of Initial Rows to Skip Optional The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File Format option, the PowerCenter Server ignores this option. You can enter any integer from zero to 2147483647.
Number of Bytes to Skip Between Records Optional The PowerCenter Server skips the specified number of bytes between records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return and line feed appear at the end of each line. If you want the PowerCenter Server to skip these two single-byte characters, enter 2.
If you have an ASCII file on UNIX with one record for each line, ending in a
carriage return, skip the single character by entering 1.
Strip Trailing Blanks Optional If selected, the PowerCenter Server strips trailing blank spaces from records
before passing them to the Source Qualifier transformation.
Line Sequential File Format Optional Select this option if the file uses a carriage return at the end of each record, shortening the final column.
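Two of the options above, Number of Initial Rows to Skip and Number of Bytes to Skip Between Records (for example, 2 to skip a Windows carriage return and line feed), can be sketched in a minimal fixed-width reader. This is an illustration only, not PowerCenter code:

```python
# Illustrative fixed-width reader showing Number of Initial Rows to Skip and
# Number of Bytes to Skip Between Records (not PowerCenter internals).
def read_fixed_width(data: bytes, record_len: int, skip_rows=0, skip_bytes=0):
    records, pos = [], 0
    while pos + record_len <= len(data):
        records.append(data[pos:pos + record_len].decode("ascii"))
        pos += record_len + skip_bytes    # e.g. 2 skips a trailing CR/LF pair
    return records[skip_rows:]            # drop the configured header rows
```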
Figure A-13 displays the Delimited File Properties dialog box for flat file sources:
Delimiters Required Character used to separate columns of data in the source file. Use the
Browse button to the right of this field to enter a different delimiter. Delimiters
can be either printable or single-byte unprintable characters, and must be
different from the escape character and the quote character (if selected). You
cannot select unprintable multibyte characters as delimiters. The delimiter
must be in the same code page as the flat file code page.
Optional Quotes Required Select None, Single, or Double. If you select a quote character, the
PowerCenter Server ignores delimiter characters within the quote characters.
Therefore, the PowerCenter Server uses quote characters to escape the
delimiter.
For example, a source file uses a comma as a delimiter and contains the following row: 342-3849, 'Smith, Jenna', 'Rockville, MD', 6.
If you select the optional single quote character, the PowerCenter Server
ignores the commas within the quotes and reads the row as four fields.
If you do not select the optional single quote, the PowerCenter Server reads
six separate fields.
When the PowerCenter Server reads two optional quote characters within a
quoted string, it treats them as one quote character. For example, the
PowerCenter Server reads the following quoted string as I'm going
tomorrow:
2353, 'I''m going tomorrow.', MD
Additionally, if you select an optional quote character, the PowerCenter
Server only reads a string as a quoted string if the quote character is the first
character of the field.
Note: You can improve session performance if the source file does not
contain quotes or escape characters.
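The quote handling described above can be reproduced with Python's csv module, which follows the same rules: delimiters inside the selected quote character are ignored, and a doubled quote inside a quoted string is read as one quote character. A sketch, not PowerCenter code:

```python
import csv

# The manual's example row, with the single quote as the quote character.
row = "342-3849,'Smith, Jenna','Rockville, MD',6"

quoted = next(csv.reader([row], quotechar="'"))             # 4 fields
unquoted = next(csv.reader([row], quoting=csv.QUOTE_NONE))  # 6 fields

# A doubled quote inside a quoted string reads as one quote character.
escaped = next(csv.reader(["2353,'I''m going tomorrow.',MD"], quotechar="'"))
```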
Code Page Required Select the code page of the delimited file. The default setting is the client
code page.
Remove Escape Character From Data Optional This option is selected by default. Clear this option to include the escape character in the output string.
Treat Consecutive Delimiters as One Optional By default, the PowerCenter Server reads pairs of delimiters as a null value. If selected, the PowerCenter Server reads any number of consecutive delimiter characters as one.
For example, a source file uses a comma as the delimiter character and
contains the following record: 56, , , Jane Doe. By default, the PowerCenter
Server reads that record as four columns separated by three delimiters: 56,
NULL, NULL, Jane Doe. If you select this option, the PowerCenter Server
reads the record as two columns separated by one delimiter: 56, Jane Doe.
Number of Initial Rows to Skip Optional The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip title or header rows in the file.
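The Treat Consecutive Delimiters as One behavior above can be sketched with a regular expression. Spaces are omitted from the manual's "56, , , Jane Doe" example for simplicity; this is an illustration, not PowerCenter code:

```python
import re

record = "56,,,Jane Doe"

# Default: each pair of delimiters bounds a column, so empties read as NULL.
default_cols = record.split(",")

# Treat Consecutive Delimiters as One: a run of delimiters reads as one.
collapsed = re.split(",+", record)
```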
Targets Node
The Targets node lists the targets used in the session and displays their settings. To view and configure the settings of a specific target, select the target from the list.
You can configure the following settings:
♦ Writers. The Writers settings displays the writer the PowerCenter Server uses with each
target instance. For more information, see “Writers Settings” on page 692.
♦ Connections. The Connections settings allows you to configure connections for the
targets. For more information, see “Connections Settings” on page 693.
♦ Properties. The Properties settings allows you to configure the target properties. For more
information, see “Properties Settings” on page 695.
Writers Settings
You can view and configure the writer the PowerCenter Server uses with each target instance. The Workflow Manager specifies the necessary writer for each target instance: the Relational Writer for relational targets, and the File Writer for file targets.
Table A-13 describes the Writers settings on the Mapping tab (Targets node):
Writers Setting | Required/Optional | Description
Writers Required For relational targets, choose Relational Writer or File Writer. When the target in the
mapping is a flat file, an XML file, a SAP BW target, or MQ target, the Workflow
Manager specifies the necessary writer in the session properties.
When you choose File Writer for a relational target you can use an external loader
to load data to this target. For more information, see “External Loading” on
page 523.
When you override a relational target to use the file writer, the Workflow Manager
changes the properties for that target instance on the Properties settings. It also
changes the connection options you can define on the Connections settings.
After you override a relational target to use a file writer, define the file properties for
the target. Click Set File Properties and choose the target to define. For more
information, see “Configuring Fixed-Width Properties” on page 265 and “Configuring
Delimited Properties” on page 266.
Connections Settings
You can enter connection types and specific target database connections on the Targets node
of the Mappings tab.
Connections Settings | Required/Optional | Description
Type Required Enter the connection type for non-relational targets. The Workflow Manager specifies Relational for relational targets.
You can choose the following connection types for flat file, XML, and MQ
targets:
- FTP. Select this connection type to use FTP to access the target directory for
flat file and XML targets. If you want to load data to a flat file or XML target
using FTP, you must specify an FTP connection when you configure target
options. If you select this option, select a configured FTP connection in the
Value column. FTP connections must be defined in the Workflow Manager
prior to configuring sessions. For details on using FTP, see “Using FTP” on
page 559.
- External Loader. Select this connection type to use the External Loader to
load output files to Teradata, Oracle, DB2, or Sybase IQ databases. If you
select this option, select a configured loader connection in the Value column.
To use this option, you must use a mapping with a relational target definition
and choose File as the writer type on the Writers tab for the relational target
instance. As the PowerCenter Server completes the session, it uses an
external loader to load target files to the Oracle, Sybase IQ, DB2, or Teradata
database. You cannot choose external loader for flat file or XML target
definitions in the mapping.
Note to Oracle 8 users: If you configure a session to write to an Oracle 8
external loader target table in bulk mode with NOT NULL constraints on any
columns, the session may write the null character into a NOT NULL column if
the mapping generates a NULL output.
For details on using the external loader feature, see “External Loading” on
page 523.
- Queue. Choose Queue when you want to output to an MQSeries message
queue. If you select this option, select a configured MQ connection in the
Value column. For more information, see the PowerCenter Connect for IBM
MQSeries User and Administrator Guide.
- None. Choose None when you want to write to a local flat file or XML file.
Value Required Enter a target connection based on the value you choose in the Type column.
Properties Settings
Click the Properties settings to define target property information. The Workflow Manager
displays different properties for the different target types: relational, flat file, and XML.
Target Property | Required/Optional | Description
Insert Optional If selected, the PowerCenter Server inserts all rows flagged for insert.
By default, this option is selected.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Update (as Update) Optional If selected, the PowerCenter Server updates all rows flagged for update.
By default, this option is selected.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Update (as Insert) Optional If selected, the PowerCenter Server inserts all rows flagged for update.
By default, this option is not selected.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Update (else Insert) Optional If selected, the PowerCenter Server updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Delete Optional If selected, the PowerCenter Server deletes all rows flagged for delete.
For details on target update strategies, see “Update Strategy
Transformation” in the Transformation Guide.
Truncate Table Optional If selected, the PowerCenter Server truncates the target before loading. For
details on this feature, see “Truncating Target Tables” on page 245.
Reject File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server
writes all reject files to the server variable directory, $PMBadFileDir.
If you specify both the directory and file name in the Reject Filename field,
clear this field. The PowerCenter Server concatenates this field with the
Reject Filename field when it runs the session.
You can also use the $BadFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Reject Filename Required Enter the file name, or file name and path. By default, the PowerCenter
Server names the reject file after the target instance name:
target_name.bad. Optionally use the $BadFileName session parameter for
the file name.
The PowerCenter Server concatenates this field with the Reject File
Directory field when it runs the session. For example, if you have
“C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in
the Reject Filename field, the PowerCenter Server writes rejected rows to
C:\reject_file\filename.bad.
For details on session parameters, see “Session Parameters” on page 495.
Reject Truncated/Overflowed Rows* Optional Instructs the PowerCenter Server to write the truncated and overflowed rows to the reject file.
Table Name Prefix Optional Specify the owner of the target tables.
Pre SQL Optional You can enter pre-session SQL commands for a target instance in a
mapping to execute commands against the target database before the
PowerCenter Server reads the source.
Post SQL Optional Enter post-session SQL commands to execute commands against the target
database after the PowerCenter Server writes to the target.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the
Designer Guide.
Table A-16 describes the Properties settings on the Mapping tab for file targets:
Target Property | Required/Optional | Description
Merge Partitioned Files Optional When selected, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. If the PowerCenter Server fails to create the merged file, it does not delete the individual output files.
You cannot merge files if the session uses FTP, an external loader, or a
message queue.
For details on configuring a session for partitioning, see “Pipeline Partitioning”
on page 345.
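The Merge Partitioned Files semantics above (merge on completion, delete the partition files only if the merge succeeds) can be sketched as follows. The file names are hypothetical and this is not PowerCenter code:

```python
import os

# Concatenate partition output files into one merge file. The per-partition
# files are deleted only if the merge succeeds, mirroring the rule above.
def merge_partition_files(part_files, merge_path):
    try:
        with open(merge_path, "wb") as merged:
            for part in part_files:
                with open(part, "rb") as f:
                    merged.write(f.read())
    except OSError:
        return False          # merge failed: keep the individual output files
    for part in part_files:
        os.remove(part)       # merge succeeded: delete the individual files
    return True
```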
Merge File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server writes the merged file in the server variable directory, $PMTargetFileDir.
If you enter a full directory and file name in the Merge File Name field, clear
this field.
Merge File Name Optional Name of the merge file. Default is target_name.out. This property is required if
you select Merge Partitioned Files.
Output File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server writes output files in the server variable directory, $PMTargetFileDir.
If you specify both the directory and file name in the Output Filename field,
clear this field. The PowerCenter Server concatenates this field with the Output
Filename field when it runs the session.
You can also use the $OutputFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Output Filename Required Enter the file name, or file name and path. By default, the Workflow Manager
names the target file based on the target definition used in the mapping:
target_name.out.
If the target definition contains a slash character, the Workflow Manager
replaces the slash character with an underscore.
When you use an external loader to load to an Oracle database, you must
specify a file extension. If you do not specify a file extension, the Oracle loader
cannot find the flat file and the PowerCenter Server fails the session. For more
information about external loading, see “Loading to Oracle” on page 533.
Enter the file name, or file name and path. Optionally use the $OutputFileName
session parameter for the file name.
The PowerCenter Server concatenates this field with the Output File Directory
field when it runs the session.
For details on session parameters, see “Session Parameters” on page 495.
Note: If you specify an absolute path file name when using FTP, the
PowerCenter Server ignores the Default Remote Directory specified in the FTP
connection. When you specify an absolute path file name, do not use single or
double quotes.
Reject File Directory Optional Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir.
If you specify both the directory and file name in the Reject Filename field,
clear this field. The PowerCenter Server concatenates this field with the Reject
Filename field when it runs the session.
You can also use the $BadFileName session parameter to specify the file
directory.
For details on session parameters, see “Session Parameters” on page 495.
Reject Filename Required Enter the file name, or file name and path. By default, the PowerCenter Server
names the reject file after the target instance name: target_name.bad.
Optionally use the $BadFileName session parameter for the file name.
The PowerCenter Server concatenates this field with the Reject File Directory
field when it runs the session. For example, if you have “C:\reject_file\” in the
Reject File Directory field, and enter “filename.bad” in the Reject Filename
field, the PowerCenter Server writes rejected rows to
C:\reject_file\filename.bad.
For details on session parameters, see “Session Parameters” on page 495.
Set File Properties Optional Allows you to configure the file properties. For more information, see “Setting
File Properties for Targets” on page 701.
Datetime Format* N/A Displays the datetime format selected for datetime fields.
Decimal Separator* N/A Displays the decimal separator for numeric fields.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the
Designer Guide.
Select the file type (fixed-width or delimited) you want to configure and click Advanced.
Table A-17 describes the options you define in the Fixed Width Properties dialog box:
Fixed-Width Properties Options | Required/Optional | Description
Null Character Required Enter the character you want the PowerCenter Server to use to represent
null values. You can enter any valid character in the file code page.
For more information about specifying null characters for target files, see
“Null Characters in Fixed-Width Files” on page 272.
Repeat Null Character Optional Select this option to indicate a null value by repeating the null character to
fill the field. If you do not select this option, the PowerCenter Server enters
a single null character at the beginning of the field to represent a null
value. For more information about specifying null characters for target
files, see “Null Characters in Fixed-Width Files” on page 272.
Code Page Required Select the code page of the fixed-width file. The default setting is the client
code page.
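The two null representations described above, a single null character at the start of the field versus the null character repeated to fill the field, can be sketched as follows (illustrative only, not PowerCenter code):

```python
# How a null value fills a fixed-width target column of a given width,
# with and without the Repeat Null Character option.
def null_field(width, null_char, repeat=False):
    if repeat:
        return null_char * width            # null character fills the field
    return null_char + " " * (width - 1)    # single null character, then padding
```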
Delimiters Required Character used to separate columns of data. Use the Browse button to the right
of this field to enter a non-printable delimiter. Delimiters can be either printable
or single-byte unprintable characters, and must be different from the escape
character and the quote character (if selected). You cannot select unprintable
multibyte characters as delimiters.
Optional Quotes Required Select No Quotes, Single Quote, or Double Quotes. If you select a quote
character, the PowerCenter Server does not treat delimiter characters within
the quote characters as a delimiter. For example, suppose an output file uses a
comma as a delimiter and the PowerCenter Server receives the following row:
342-3849, 'Smith, Jenna', 'Rockville, MD', 6.
If you select the optional single quote character, the PowerCenter Server
ignores the commas within the quotes and writes the row as four fields.
If you do not select the optional single quote, the PowerCenter Server writes
six separate fields.
Code Page Required Select the code page of the delimited file. The default setting is the client code
page.
Transformations Node
On the Transformations node, you can override properties that you configure in
transformation and target instances in a mapping. The attributes you can configure depend on the type of transformation you select.
HashKeys Node
On the HashKeys node, you can configure hash key partitioning. Select Edit Keys to edit the partition key. For more information, see “Edit Partition Key” on page 708.
Partition Points Node | Description
Add Partition Point Click to add a new partition point to the Transformation list. For information on adding partition
points, see “Adding and Deleting Partition Points” on page 353.
Delete Partition Click to delete the current partition point. You cannot delete certain partition points. For details,
Point see “Adding and Deleting Partition Points” on page 353.
Edit Keys Click to add, remove, or edit the key for key range or hash user keys partitioning. This button is
not available for auto-hash, round-robin, or pass-through partitioning.
For more information on adding keys and key ranges, see “Adding Keys and Key Ranges” on
page 358.
Table A-20 describes the options in the Edit Partition Point dialog box:
Add button Click to add a partition. You can add up to 64 partitions. For more information on
adding partitions, see “Adding and Deleting Partitions” on page 356.
Delete button Click to delete the selected partition. For more information on deleting partitions, see
“Adding and Deleting Partitions” on page 356.
Select Partition Type Select a partition type from the list. For more information, see “Specifying Partition
Types” on page 356.
You can specify one or more ports as the partition key. To rearrange the order of the ports that
make up the key, select a port in the Selected Ports list and click the up or down arrow.
For information on adding a key for key range partitioning, see “Key Range Partition Type”
on page 363. For information on adding a key for hash partitioning, see “Hash Keys Partition
Types” on page 361.
Task n/a Tasks you can perform in the Components tab. You can configure pre- or post-
session shell commands and success or failure email messages in the
Components tab.
Type Required Select None if you do not want to configure commands and emails in the
Components tab.
For pre- and post-session commands, select Reusable to call an existing
reusable Command task as the pre- or post-session shell command. Select
Non-Reusable to create pre- or post-session shell commands for this session
task.
For success or failure emails, select Reusable to call an existing Email task as
the success or failure email. Select Non-Reusable to create email messages
for this session task.
Pre-Session Command Optional Shell commands that the PowerCenter Server performs at the beginning of a session. For details on using pre-session shell commands, see “Using Pre- or Post-Session Shell Commands” on page 188.
Post-Session Success Command Optional Shell commands that the PowerCenter Server performs after the session completes successfully. For details on using post-session shell commands, see “Using Pre- or Post-Session Shell Commands” on page 188.
Post-Session Failure Command Optional Shell commands that the PowerCenter Server performs if the session fails. For details on using post-session shell commands, see “Using Pre- or Post-Session Shell Commands” on page 188.
On Success Email Optional The PowerCenter Server sends the On Success email message if the session completes successfully.
On Failure Email Optional The PowerCenter Server sends the On Failure email message if the session fails.
Click the Override button to override the Run If Previous Completed option in the
Command task. For details on the Run If Previous Completed option, see Table A-24 on
page 714.
Table A-23 describes the General tab for editing pre- or post-session shell commands:
Name Required Enter a name for the pre- or post-session shell command.
Make Reusable Required Select Make Reusable to create a reusable Command task from the pre- or
post-session shell commands.
Clear the Make Reusable option if you do not want the Workflow Manager to
create a reusable Command task from the shell commands.
For details on creating Command tasks from pre- or post-session shell
commands, see “Creating a Reusable Command Task from Pre- or Post-
Session Commands” on page 191.
Description Optional Enter a description for the pre- or post-session shell command.
Properties Tab for Pre- or Post-Session Commands | Required/Optional | Description
Run If Previous Completed Required Select this option if you want the PowerCenter Server to perform the next command only if the previous command completed successfully.
Table A-25 describes the Commands tab for editing pre- or post-session commands:
Commands Tab for Pre- or Post-Session Commands | Required/Optional | Description
Command Required The shell command you want the PowerCenter Server to perform. Enter one
command for each line. You can use session parameters or server variables in
shell commands.
If your command contains spaces, enclose the command in quotes. For
example, if you want to call c:\program files\myprog.exe, you must enter
“c:\program files\myprog.exe”, including the quotes. Enter only one command
on each line.
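The quoting rule above can be demonstrated with Python's shlex module, which tokenizes a command line the way a shell does. The program path is hypothetical:

```python
import shlex

# An unquoted path containing a space splits into separate tokens,
# so the command would not resolve to the intended program.
unquoted = shlex.split("/opt/my programs/myprog.exe -run")   # 3 tokens

# Quoting the path keeps it as a single command token.
quoted = shlex.split('"/opt/my programs/myprog.exe" -run')   # 2 tokens
```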
Reusable Email
Select Reusable in the Type field for the On-Success or On-Failure email if you want to select
an existing Email task as the On-Success or On-Failure email. The Email Object Browser
appears when you click the right side of the Values field.
Select an Email task to use as On-Success or On-Failure email. Click the Override button to
override properties of the email. For more information about email properties, see Table A-27
on page 717.
Non-Reusable Email
Select Non-Reusable in the Type field to create a non-reusable email for the session. Non-
Reusable emails do not appear as Email tasks in the Task folder. Click the right side of the
Values field to edit the properties for the non-reusable On-Success or On-Failure emails. For
more information about email properties, see Table A-27 on page 717.
Email Properties
You configure email properties for On-Success or On-Failure Emails when you override an
existing Email task or when you create a non-reusable email for the session.
Table A-26 describes general settings for editing On-Success or On-Failure emails:
Email Settings | Required/Optional | Description
Name Required Enter a name for the email you want to configure.
Description Required Enter a description for the email you want to configure.
Table A-27 describes the email properties for On-Success or On-Failure emails:
Email Properties | Required/Optional | Description
Email user name Required Required to send On-Success or On-Failure session email. Enter the email
address of the person you want the PowerCenter Server to email after the
session completes. The email address must be entered in 7-bit ASCII.
For success email, you can enter $PMSuccessEmailUser to send email to the
user configured for the server variable.
For failure email, you can enter $PMFailureEmailUser to send email to the user
configured for the server variable.
Email subject Optional Enter the text you want to appear in the subject header.
Email text Optional Enter the text of the email. You can use several variables when creating this
text to convey meaningful information, such as the session name and session
status. For details, see “Sending Email” on page 319.
The Metadata Extensions tab allows you to create and promote metadata extensions. For
information on creating metadata extensions, see “Metadata Extensions” in the Repository
Guide.
Table A-28 describes the configuration options for the Metadata Extensions tab:
Metadata Extensions Tab Options | Required/Optional | Description
Extension Name Required Name of the metadata extension. Metadata extension names must be unique in
a domain.
Datatype Required The data type: numeric (integer), string, boolean, or XML.
Precision Required for string and XML objects The maximum length for string or XML metadata extensions.
Reusable Required Select to make the metadata extension apply to all objects of this type
(reusable). Clear to make the metadata extension apply to this object only
(non-reusable).
Workflow Properties
Reference
This appendix contains a listing of settings in the workflow properties. These settings are
grouped by the following tabs:
♦ General Tab, 722
♦ Properties Tab, 724
♦ Scheduler Tab, 726
♦ Variables Tab, 731
♦ Events Tab, 732
♦ Metadata Extensions Tab, 733
General Tab
You can change the workflow name and enter a comment for the workflow on the General
tab. By default, the General tab appears when you open the workflow properties.
Figure B-1 displays the General tab of the workflow properties:
The figure highlights where to select a PowerCenter Server to run the workflow and where to select a suspension email.
Tasks must run on Server Optional Requires all workflow tasks to run on the PowerCenter Server that you select.
Suspension Email Optional Select a reusable email task for the suspension email. When a task fails,
the PowerCenter Server suspends the workflow and sends the
suspension email.
For details on suspending workflows, see “Suspending the Workflow” on
page 127.
Disabled Optional Select to disable the workflow from the schedule. The PowerCenter
Server stops running the workflow until you clear the Disabled option.
For details on the Disabled option, see “Disabling Workflows” on
page 118.
Suspend On Error Optional If selected, the PowerCenter Server suspends the workflow when a task
in the workflow fails.
For details on suspending workflows, see “Suspending the Workflow” on
page 127.
Web Services Optional If selected, you create a service workflow. Click Config Service to
configure service information.
For more information on creating web services, see the Web Services
Provider Guide.
Parameter File Name Optional Designates the name and directory for the parameter file. Use the parameter file to define workflow parameters. For details on parameter files, see “Parameter Files” on page 511.
Workflow Log File Name Optional Enter a file name, or a file name and directory. If you leave this field blank, the PowerCenter Server does not create a
workflow log. Instead, the PowerCenter Server writes workflow log messages
to the server log or Windows Event Log, depending on how you configure the
PowerCenter Server.
If you fill in this field, the PowerCenter Server appends information in this field
to that entered in the Workflow Log File Directory field. For example, if you
have "C:\workflow_logs\" in the Workflow Log File Directory field, then enter
"logname.txt" in the Workflow Log File Name field, the PowerCenter Server
writes logname.txt to the C:\workflow_logs\ directory.
Workflow Log File Directory Required Designates a location for the workflow log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMWorkflowLogDir.
If you enter a full directory and file name in the Workflow Log File Name field,
clear this field.
Save Workflow Log By Required If you select Save Workflow Log by Timestamp, the PowerCenter Server saves all workflow logs, appending a timestamp to each log.
If you select Save Workflow Log by Runs, the PowerCenter Server saves a
designated number of workflow logs. Configure the number of workflow logs in
the Save Workflow Log for These Runs option.
For details on these options, see “Archiving Workflow Logs” on page 459.
You can also use the $PMWorkflowLogCount server variable to save the
configured number of workflow logs for the PowerCenter Server.
Save Workflow Log for These Runs Required The number of historical workflow logs you want the PowerCenter Server to save.
The PowerCenter Server saves the number of historical logs you specify, plus the most recent workflow log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent workflow log, plus historical logs 0–4, for a total of 6 logs.
You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the
PowerCenter Server saves only the most recent workflow log.
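The Save Workflow Log by Runs arithmetic can be illustrated with a small sketch. This is hypothetical: the PowerCenter Server performs this housekeeping internally, and `prune_workflow_logs` is an invented name:

```python
def prune_workflow_logs(logs, historical_runs):
    """Keep the most recent log plus `historical_runs` older logs.

    `logs` is ordered oldest-first. Specifying 5 runs keeps 6 logs in
    total, as described above. Illustrative sketch only.
    """
    return logs[-(historical_runs + 1):]

logs = ["wf.log.%d" % i for i in range(10)]  # oldest ... newest
print(prune_workflow_logs(logs, 5))  # 6 logs survive
print(prune_workflow_logs(logs, 0))  # only the most recent log
```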
Figure B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box
Table B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box
Run Options: Run On Server Initialization/Run On Demand/Run Continuously (Optional). Indicates the workflow schedule type.
If you select Run On Server Initialization, the PowerCenter Server runs the workflow as soon as the server is initialized.
If you select Run On Demand, the PowerCenter Server runs the workflow only when you start the workflow.
If you select Run Continuously, the PowerCenter Server starts the next run of the workflow as soon as it finishes the previous run.
Schedule Options: Run Once/Run Every/Customized Repeat (Optional). Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options.
If you select Run Once, the PowerCenter Server runs the workflow once, as scheduled in the scheduler.
If you select Run Every, the PowerCenter Server runs the workflow at regular intervals, as configured.
If you select Customized Repeat, the PowerCenter Server runs the workflow on the dates and times specified in the Repeat dialog box.
Start Date (Optional). Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. Indicates the date on which the PowerCenter Server begins scheduling the workflow.

Start Time (Optional). Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. Indicates the time at which the PowerCenter Server begins scheduling the workflow.
End Options: End On/End After/Forever (Optional). Required if the workflow schedule is Run Every or Customized Repeat.
If you select End On, the PowerCenter Server stops scheduling the workflow on the selected date.
If you select End After, the PowerCenter Server stops scheduling the workflow after the set number of workflow runs.
If you select Forever, the PowerCenter Server schedules the workflow as long as the workflow does not fail.
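The interaction between a Run Every schedule and the End options can be sketched as follows. This is a hypothetical model of the semantics above, not server code; Forever is represented here by requiring an explicit end condition so the enumeration stays finite:

```python
from datetime import datetime, timedelta

def run_every_schedule(start, interval, end_on=None, end_after=None):
    """Enumerate scheduled run times for a Run Every workflow.

    end_on    -- stop scheduling after this date/time (End On)
    end_after -- stop after this many runs (End After)
    Illustrative sketch only; at least one end condition is required
    so the list is finite (Forever would be unbounded).
    """
    if end_on is None and end_after is None:
        raise ValueError("Forever schedules are unbounded in this sketch")
    runs, t = [], start
    while (end_on is None or t <= end_on) and \
          (end_after is None or len(runs) < end_after):
        runs.append(t)
        t += interval
    return runs

daily = run_every_schedule(datetime(2004, 8, 1, 6, 0),
                           timedelta(days=1), end_after=3)
```

With End After set to 3, the workflow is scheduled for August 1, 2, and 3 at 6:00 and then dropped from the schedule.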
Repeat Every (Required). Enter the numeric interval at which you want to schedule the workflow, then select Days, Weeks, or Months, as appropriate.
If you select Days, select the appropriate Daily Frequency settings.
If you select Weeks, select the appropriate Weekly and Daily Frequency settings.
If you select Months, select the appropriate Monthly and Daily Frequency settings.
Weekly (Optional). Required to enter a weekly schedule. Select the day or days of the week on which you want to schedule the workflow.
Daily (Required). Enter the number of times you want the PowerCenter Server to run the workflow on any day the session is scheduled.
If you select Run Once, the PowerCenter Server schedules the workflow once on the selected day, at the time entered in the Start Time setting on the Time tab.
If you select Run Every, enter Hours and Minutes to define the interval at which the PowerCenter Server runs the workflow. The PowerCenter Server then schedules the workflow at regular intervals on the selected day. The PowerCenter Server uses the Start Time setting for the first scheduled workflow of the day.
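The daily Run Every (Hours/Minutes) behavior, with the first run at the Start Time, can be sketched like this. It is a hypothetical illustration and `daily_run_times` is an invented name:

```python
from datetime import datetime, timedelta

def daily_run_times(start_time, hours=0, minutes=0):
    """Run times for one scheduled day under the daily Run Every option.

    The first run uses the Start Time; later runs repeat at the given
    interval until the day ends. Illustrative sketch only.
    """
    step = timedelta(hours=hours, minutes=minutes)
    if step <= timedelta(0):
        raise ValueError("interval must be positive")
    day_end = start_time.replace(hour=23, minute=59, second=59)
    times, t = [], start_time
    while t <= day_end:
        times.append(t)
        t += step
    return times

runs = daily_run_times(datetime(2004, 8, 1, 6, 0), hours=4)
# runs at 06:00, 10:00, 14:00, 18:00, and 22:00 on the scheduled day
```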
Persistent (Required). Indicates whether the PowerCenter Server maintains the value of the variable from the previous workflow run.
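A persistent variable's value survives across workflow runs; PowerCenter stores persistent values in the repository. The file-backed sketch below is a purely hypothetical stand-in for that repository storage, with invented helper names:

```python
import json
import os

def load_variable(state_file, name, default=None):
    """Return a persistent variable's value from the previous run."""
    if os.path.exists(state_file):
        with open(state_file) as f:
            return json.load(f).get(name, default)
    return default

def save_variable(state_file, name, value):
    """Record a variable's value so the next run can pick it up."""
    state = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            state = json.load(f)
    state[name] = value
    with open(state_file, "w") as f:
        json.dump(state, f)
```

A non-persistent variable, by contrast, is simply re-initialized to its default value at the start of every run.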
The Metadata Extensions tab allows you to create and promote metadata extensions. For
information on creating metadata extensions, see “Metadata Extensions” in the Repository
Guide.
Table B-8 describes the configuration options for the Metadata Extensions tab:
Extension Name (Required). Name of the metadata extension. Metadata extension names must be unique in a domain.
Precision (Required for string and XML objects). The maximum length for string or XML metadata extensions.

Reusable (Required). Select to make the metadata extension apply to all objects of this type (reusable). Clear to make the metadata extension apply to this object only (non-reusable).

UnOverride (Optional). This column appears only if the value of one of the metadata extensions was changed. To restore the default value, click Revert.
Overview
The Workflow Manager and Workflow Monitor replace the Server Manager in PowerCenter 5.x and PowerMart 5.x. This appendix compares session properties in the Server Manager with session and workflow options in the Workflow Manager. It lists the session properties as they appeared in the Server Manager session properties dialog box, then gives the corresponding options in the Workflow Manager.
The session properties for the Server Manager contain the following tabs:
♦ General tab
♦ Source Location tab
♦ Time tab
♦ Log and Error Handling tab
♦ Transformations tab
♦ Partitions tab
In the Server Manager, you configured the following options from the General tab:
♦ General options
♦ Source options
♦ Target options
♦ Session commands
♦ Performance
General Options
In the Server Manager, you could configure the Session Name field, Server Name, and the
Session Enabled option on the General tab of the session properties.
In the Workflow Manager, these options are on either the General tab of the session
properties or in the workflow properties.
Session Enabled: General tab - Disable This Task. You can view this property only when you edit the session instance from the Workflow Designer.
Source Options
In the Server Manager, Source options appeared under the Session Name field on the General
tab.
In the Workflow Manager, source options appear under the Sources node on the Mapping tab
(Transformations view). The Sources node contains connections, properties, and readers
settings.
Table C-2 compares Source options for the Server Manager with the corresponding properties
for the Workflow Manager:
Figure C-2. Server Manager Source Options Dialog Box for File Sources
Table C-3 compares source options for file sources for the Server Manager with the
corresponding options for the Workflow Manager:
Figure C-5. Server Manager Source Options Dialog Box (XML Sources)
FTP Properties
In the Server Manager, the FTP Properties dialog box appeared when you edited FTP
properties.
In the Workflow Manager, the FTP Connection Editor appears when you choose FTP as the
connection type from the Sources tab, click the Edit button on the right side of the Value
field, and then click Override to edit the FTP properties.
Figure C-6 shows the Server Manager FTP Properties dialog box:
Target Options
In the Server Manager, target options appeared on the General tab. In the target options, you could select the target type for the session, configure reject file names, and create database connection session parameters.
In the Workflow Manager, the Mapping tab-Transformations view-Targets node contains
connections, properties, and writers settings.
Table C-6 compares target options for the Server Manager with the corresponding options for
Workflow Manager:
Target Options Button: Properties in the Target Options dialog box are located on the Mapping tab - Transformations view - Targets node - Properties settings.
Reject Options Button: Properties in the Reject Options dialog box are located on the Mapping tab - Transformations view - Targets node - Properties settings.
Table C-7 compares relational target options for the Server Manager with the corresponding
options for the Workflow Manager:
Output Files
In the Server Manager, the Output Files dialog box appeared when you selected a file target
type, then clicked Target Options on the General tab.
In the Workflow Manager, output file target options appear on the Mapping tab-
Transformations view. The Targets node contains connections, properties, and writer settings.
Figure C-8 shows the Server Manager Output Files dialog box:
Merge Targets For Partitioned Sessions: Mapping tab - Transformations view - Targets node - Properties settings.
Fixed-Width Properties
In the Server Manager, the Fixed-Width dialog box appeared when you configured a session
to write to a fixed-width target file, and then clicked Edit Null Character.
In the Workflow Manager, you can access the Fixed-Width Properties dialog box from the Properties settings of the Mapping tab. Click Set File Properties, and select Fixed-Width.
Figure C-10 shows the Server Manager Fixed-Width dialog box:
Figure C-11. Server Manager Delimited File Properties Dialog Box (Output Files)
XML Targets
In the Server Manager, the XML Target dialog box appeared when you selected an XML file
target type, then clicked Target Options.
In the Workflow Manager, you can access the XML Target dialog box from the Properties settings of the Mapping tab. Click Set File Properties.
Figure C-12 shows the Server Manager XML Target dialog box:
Table C-9 compares XML target options for the Server Manager with the corresponding
options for Workflow Manager:
Reject Files
In the Server Manager, the Reject Files dialog box appeared when you clicked Reject Options
on the General tab.
In the Workflow Manager, the reject file options appear in the Targets node Properties
settings on the Mapping tab.
Figure C-13 shows the Server Manager Reject File dialog box:
Table C-10 compares Reject Files options for the Server Manager with the corresponding
options for Workflow Manager:
Pre-Session Commands
In the Server Manager, the Pre-Session Commands dialog box appeared when you clicked Pre-
Session on the General tab of the session properties.
In the Workflow Manager, pre-session command options appear on the Components tab.
Figure C-14 shows the Server Manager Pre-Session Commands dialog box:
Table C-11 compares session command options for the Server Manager with the
corresponding options for the Workflow Manager:
Description: Components tab. Click the Edit button on the right side of the Value field for Pre-Session Commands. Enter the description on the General tab of the Edit Pre-Session Commands dialog box.
Command: Components tab. Click the Edit button on the right side of the Value field for Pre-Session Commands. Enter the command on the Command tab of the Edit Pre-Session Commands dialog box.
Table C-12 compares post-session command and email options for the Server Manager with
the corresponding options for the Workflow Manager:
Description: Components tab. Click the Edit button on the right side of the Value field for Post-Session Commands. Enter the description on the General tab of the Edit Post-Session Commands dialog box.
Command: Components tab. Click the Edit button on the right side of the Value field for Post-Session Commands. Enter the command on the Command tab of the Edit Post-Session Commands dialog box.
Email User Name: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email user name on the Properties tab of the Edit Success Email or Edit Failure Email dialog box.
Email Subject: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email subject on the Properties tab of the Edit Success Email or Edit Failure Email dialog box.
Email Text: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email text on the Properties tab of the Edit Success Email or Edit Failure Email dialog box.
Advanced Options Button: Config Object tab, Mapping tab, and Properties tab.
Configuration Parameters
In the Server Manager, the Configuration Parameters dialog box appeared when you clicked
Advanced Options on the General tab. In the Configuration Parameters dialog box, you could
configure the DTM memory parameters, general parameters, reader parameters, and event-
based scheduling.
In the Workflow Manager, the configuration parameters options appear on multiple tabs.
Figure C-16 shows the Server Manager Configuration Parameter dialog box:
Enable Decimal Arithmetic: Properties tab - Performance settings. The option name is Enable High Precision.
Event-Based Scheduling - Indicator File To Wait For: Event-Wait Task - Events tab - Pre-Defined Event. Enter the name of the file to watch.
In the Server Manager, you configured the following options from the Time tab:
♦ Schedule options
♦ Start options
♦ Duration options
♦ Batch option
Schedule Options
In the Server Manager, you used the Schedule options on the Time tab of the session
properties to schedule the frequency of a session run.
Repeat Options
In the Server Manager, the Repeat dialog box appeared when you selected Customized
Repeat, then clicked Edit on the Time tab.
In the Workflow Manager, the Customized Repeat dialog box appears when you schedule a
session to run on server initialization, select Customized Repeat, and then click Edit.
Figure C-19 shows the Server Manager Repeat dialog box:
Start Options
In the Server Manager, the Start options appeared below the Schedule options on the Time
tab. In the Start options, you could select the session start date and session start time.
In the Workflow Manager, the Start options appear on the Schedule tab of the workflow
properties.
Duration Options
In the Server Manager, Duration options appeared next to Start options on the Time tab. In
Duration options, you could set the end date of a session run, the number of session runs, or
schedule a session to run forever as long as it was successful.
In the Workflow Manager, End options appear next to Start options on the Scheduler tab of
the workflow properties.
In the Server Manager, on the Log and Error Handling tab you could configure the following
options:
♦ Log File options
♦ Parameter File option
♦ Batch Handling option
♦ Error Handling options
Server Path to Log Files: Properties tab - General Options settings. Enter the path in Session Log File Directory.
Session Log File: Properties tab - General Options settings. Enter the log file name in Session Log File Name.
Save the Session Log From the Last <number> Session Runs: Config Object tab - Log Options settings.
Log and Error Handling tab - On pre-session command errors - Stop session/Continue session: Config Object tab - Error handling settings.
Log and Error Handling tab - On stored procedure errors - Stop session/Continue session: Config Object tab - Error handling settings.
Table C-17 compares the Transformations tab options for Server Manager with the
corresponding options for the Workflow Manager:
A adding
tasks 92
ABORT function advanced settings
See also Transformation Language Reference session properties 675
session failure 200 aggregate caches
aborted status 421 calculating the data cache 622
aborting calculating the index cache 621
Control tasks 147 overview 621
server handling 129 reinitializing 576, 674
sessions 130 aggregate files
status 421 deleting 577
tasks 129 moving 577
tasks in Workflow Monitor 418 aggregate function calls
workflows 129 minimizing 652
Aborttask Aggregator transformation
pmcmd syntax 596 cache options 621
Abortworkflow cache partitioning 621
pmcmd syntax 597 caches 26, 34
absolute time data cache 622
specifying 162 index cache 621
Timer task 161 optimizing performance 650
active sources optimizing with Sorted Input 651
constraint-based loading 248 partitioning guidelines 347
defined 259 performance detail 639
generating commits 278 allocating memory
row error logging 260 XML sources 655
source-based commit 278 AND links 137
transaction generators 259 archiving
XML targets 259 session logs 471
763
workflow logs 459
arrange
C
workflows vertically 40 cache files
workspace objects 71 locating 577
ASCII mode naming convention 615
See also Installation and Configuration Guide permissions 28
See also Unicode mode cache partitioning
overview 27 Aggregator transformation 621
performance 661 described 359
session behavior 16 incremental aggregation 621
assigning Joiner transformation 624
PowerCenter Servers 122, 198 Lookup transformation 391
Assignment tasks Rank transformation 620
creating 140 caches
definition 140 Aggregator transformation 621
description 132 calculating Aggregator data cache 622
using expression editor 96 calculating Aggregator index cache 621
variables in 103 calculating Joiner data cache 626
calculating Joiner index cache 625
calculating Lookup data cache 631
B calculating Lookup index cache 629
calculating Rank data cache 633
$BadFile calculating Rank index cache 632
definition 508 default directory 34
naming convention 496, 520 files for index and data 614
using 509 files, overview 34
blocking Joiner transformation 624
definition 23 Lookup transformation 628
blocking source data memory 26, 614
PowerCenter Server handling 23 memory usage 26
buffer block size optimizing 658
configuring 677 overview 28, 614
optimizing 655, 657 resetting with real-time sessions 288
buffer memory session cache files 614
allocating 655 transformation 34
buffer blocks 25 caching
DTM process 25 lookup functions 676
bulk loading Char datatypes
commit interval 253 removing trailing blanks for optimization 653
data driven session 252 check point interval
DB2 642 optimizing 642
DB2 guidelines 253 checking in
Oracle 643 versioned objects 74
Oracle guidelines 253 checking out versioned objects 74
session properties 252, 697 COBOL sources
Sybase IQ 643 error handling 227
targets 642 numeric data handling 229
test load 244 code page compatibility
using user-defined commit 283 See also Installation and Configuration Guide
multiple file sources 230
targets 235
764 Index
code pages sessions 79
See also Installation and Configuration Guide tasks 79
data movement modes 27 workflows 79
database connections 54, 234 worklets 79
delimited source 224 Components tab
delimited target 267, 703 properties 710
external loader files 524 concurrent connections
fixed-width sources 222 in partitioned pipelines 379
fixed-width target 266, 702 Config Object tab
relaxed validation 55 properties 675
validation 12 configuring
viewing the session log 475 error handling options 493
color connect string
setting 42 examples 54
workspace 42 syntax 54
command line mode for pmcmd connection objects
connecting 589 See also Repository Guide
return codes 590 assigning permissions 51
using 589 definition 51
command line program See pmcmd deleting 59
Command task connection settings
multiple UNIX commands 145 applying to all session instances 180
Command tasks targets 695
creating 143 connections
definition 143 copy as 59, 60
description 132 copying a relational database connection 59
executing commands 145 external loader 551
promoting to reusable 145 FTP 561
Run if Previous Completed 145 multiple targets 274
using server variables 188, 193 relational database 56
using session parameters 143 replacing a relational database connection 62
comments sources 211
adding in Expression Editor 97 targets 237
commit interval connectivity
bulk loading 253 See also Installation and Configuration Guide
configuring 292 connect string examples 54
description 276 overview 5
optimizing 655, 658 server grids 447
source- and target-based 276 constraint-based loading
commit source active sources 248
source-based commit 278 configuring 248
commit type enabling 251
configuring 672 key relationships 248
committing data session property 676
target connect groups 278 target connection groups 249
transaction control 283 Update Strategy transformations 249
common logic control file
factoring 652 overriding Teradata 539
comparing objects overview 33
See also Designer Guide permissions 28
See also Repository Guide
Index 765
Control tasks finding 577
definition 147 data flow
description 132 See pipeline
options 148 data movement mode
stopping or aborting the workflow 129 See also ASCII mode
copying See also Installation and Configuration Guide
repository objects 77 See also Unicode mode
counters affecting incremental aggregation 577
BufferInput_efficiency 640 overview 27
BufferOutput_efficiency 640 database connections
overview 437 See also Installation and Configuration Guide
Rowsinlookupcache 639 configuring 56
Transformation_errorrows 639 copying a relational database connection 59
Transformation_readfromdisk 639 domain name 58
Transformation_writetodisk 639 packet size 58
CPU usage privileges required to create 53
PowerCenter Server 24 replacing a relational database connection 62
creating rollback segment 58
external loader connections 551 session parameter 499
FTP sessions 565 use trusted connection 58
server grids 451 using Oracle OS Authentication 53
sessions 175 databases
workflows 91 connection requirements 57
CUME connectivity overview 46
partitioning restrictions 395 environment SQL 55
Custom transformation optimizing sources 645
partitioning guidelines 396 optimizing targets 642
customized repeat selecting code pages 54
daily 117 setting up connections 53
editing 115 datatypes
monthly 117 See also Designer Guide
options 116 Char 653
repeat every 117 Decimal 269
weekly 117 Double 269
Float 269
Integer 269
D minimizing conversions 648
Money 269
data Numeric 269
capturing incremental source changes 574, 579 padding bytes for fixed-width targets 268
data caches Real 269
Aggregator transformation 622 Varchar 653
description 614 dates
for incremental aggregation 577 configuring 38
memory usage 26 formats 38
optimizing 655, 658 DB2
Rank transformation 633 bulk loading 642
data driven bulk loading guidelines 253
bulk loading 252 commit interval 253
data files See IBM DB2
creating directory 579
766 Index
$DBConnection session properties, targets 266
definition 499 description
naming convention 496, 520 repository objects 73
using 499 directories
deadlock for historical aggregate data 579
retry session 674 server defaults 46
deadlock retry server variables 46
See also Installation and Configuration Guide workspace file 41
configuring 246 disabled
target connection groups 257 status 421
Debugger disabling
restrictions in partitioned pipelines 396 tasks 137
decimal arithmetic workflows 118
See high precision displaying
Decision tasks customizing windows 69
creating 151 date time format 38
decision condition variable 149 Expression Editor 97
definition 149 fonts 42
description 132 options 39
example 149 servers in Workflow Monitor 406
using Expression Editor 96 show solid lines for links 42
variables in 103 toolbars 69
DECODE function workspace color 42
See also Transformation Language Reference documentation
using for optimization 653 conventions xlix
default remote directories description xlviii
for FTP connections 561 online xlix
deleting domain name 58
connection objects 59 dropping
servers 50 indexes 248
workflows 97 DTM (Data Transformation Manager)
delimited flat files buffer memory 25
code page 691 overview 3
code page, sources 224 post-session email 10
code page, targets 267 process 7, 11
consecutive delimiters 692 running sessions and workflows 7
escape character 691 transformation statistics example 469
escape character, sources 224 DTM Buffer Pool Size
numeric data handling 229 optimizing 655
quote character 691 session property 674
quote character, sources 224 tuning 656
quote character, targets 267
session properties, sources 222
session properties, targets 266 E
sources 691
delimited sources edit
number of rows to skip 692 delimiter 690
delimited targets edit null characters
session properties 703 session properties 702
delimiter editing
session properties, sources 222 delimiter 702
Index 767
session privileges 178 guidelines for entering 55
sessions 177 environment variables
email PM_CODEPAGENAME 585
attaching files 333, 342 PM_HOME 587
configuring a user on Windows 322, 342 PMTOOL_DATEFORMAT 585
configuring the PowerCenter Server on UNIX 321 repository username and password 586
configuring the PowerCenter Server on Windows 322 error handling 186
distribution lists 326 COBOL sources 227
email variables 333 error log files 489
format tags 333 fixed-width file 227
logon network security on Windows 325 options 493
MIME format 320 overview 201
multiple recipients 326 PMError_MSG table schema 485
on failure 332 PMError_ROWDATA table schema 483
on success 332 PMError_Session table schema 486
overview 320 pre- and post-session SQL 186
post-session 332 settings 679
rmail 321 transaction control 284
server variables 333 error log
session properties 714 options 494
specifying a Microsoft Outlook profile 327 session errors 201
suspending workflows 339 error log files 489
text message 328 error log tables
tips 342 creating 483
user name 328 overview 483
using other mail programs 343 error logging
using server variables 333 overview 482
Windows service startup account 322 error logs
workflows 341 messages 29
worklets 341 error messages
Email tasks external loader 527
creating 329 error threshold
description 132 $PMSessionErrorThreshold 47
overview 328 pipeline partitioning 200
See also email 328 stop on errors 200
suspension email 128 errors
email variables See also Troubleshooting Guide
overview 333 eliminating to improve performance 648
Enable Past Events option 159 fatal 200
enabling enhanced security 44 minimizing tracing level to improve performance 659
end of file pre-session shell command 193
transaction control 284 stopping on 679
end options threshold 200
end after 116 validating in Expression Editor 97
end on 116 Event-Raise tasks
forever 116 configuring 155
enhanced security declaring user-defined event 155
enabling 44 definition 153
enabling for connection objects 44 description 132
environment SQL in worklets 167
configuring 55
768 Index
events using Control task 148
in worklets 167 fatal errors
pre-defined events 153 session failure 200
user-defined events 153 file list
Event-Wait tasks creating for multiple sources 230
definition 153 creating for partitioned sources 375
description 132 using for source file 230
for pre-defined events 158 file server
for user-defined events 157 for multiple PowerCenter Servers 445
waiting for past events 159 setting up for multiple servers 445
working with 156 file sources
Expression Editor numeric data handling 229
adding comments 97 partitioning 374
displaying 97 server handling 226, 229
syntax colors 97 session properties 218
using 96 file targets
validating 119 partitioning 380
validating expressions using 97 session properties 261
expressions filter conditions
optimizing 652 in partitioned pipelines 372
validating 97 filtering
external loader deleted tasks in Workflow Monitor 406
behavior 526 servers in Workflow Monitor 406
code page 524 tasks in Gantt Chart view 405
connections 551 tasks in Task View 431
DB2 528 filters
error messages 527 optimizing 650
loading multibyte data 533, 535 finding objects
on Windows systems 526 Workflow Manager 70
Oracle 533 fixed-width files
overview 524 code page 689
performance 643 code page, sources 222
permissions 525 code page, targets 266
PowerCenter Server support 524 error handling 227
privileges required to create connection 525 multibyte character handling 227
session properties 682, 695 null character 689
setting up Workflow Manager 553 null characters, sources 222
Sybase IQ 535 null characters, targets 266
Teradata 538 numeric data handling 229
using with partitioned pipeline 380 padded bytes in fixed-width targets 268
External Procedure transformation source session properties 220
See also Designer Guide target session properties 265
partitioning guidelines 396 writing to 268, 269
fixed-width sources
session properties 689
F fixed-width targets
session properties 702
fail parent workflow 138 flat file definitions
failed status 421 escape character, sources 224
failing workflows PowerCenter Server handling, targets 268
failing parent workflows 148 quote character, sources 224
Index 769
quote character, targets 267
session properties, sources 218
G
session properties, targets 261 Gantt Chart
flat files configuring 411
See also Designer Guide filtering 405
code page, sources 222 listing tasks and workflows 424
code page, targets 266 navigating 425
delimiter, sources 224 opening and closing folders 407
delimiter, targets 267 organizing 425
increasing performance 660 overview 402
multibyte data 270 searching 427
null characters, sources 222 using 423
null characters, targets 266 zooming 426
numeric data handling 229 general options
output file session parameter 504 arranging workflow vertically 40
output files 33 configuring 39
precision 270 in-place editing 40
precision, targets 269 launching Workflow Monitor 41
shift-sensitive target 271 open editor 41
source file session parameter 502 panning windows 40
fonts receive notification from server 41
setting 42 reload task or workflow 40
format options session properties 668
changing the font 42 show expression on a link 41
color 42 show full name of task 41
configuring 42 General tab in session properties
date and time 38 FTP properties 742
reset all 42 in Server Manager 737
schedule 38 in Workflow Manager 668
show solid lines for links 42 session commands 750
Timer task 38 source options 738
FTP (File Transfer Protocol) target options 743
accessing source files 565 General tab of session properties
accessing target files 568 general options 737
connecting to file targets 380 performance options 752
connection names 561 generating
connection options 563 commits with source-based commit 278
creating a session 565 Getrunningsessionsdetails
defining connections 561 pmcmd syntax 598
defining default remote directory 561 Getserverdetails
defining host names 561 pmcmd syntax 599
mainframe restrictions 560 Getserverproperties
overview 560 pmcmd syntax 599
privileges required to create connections 562 Getsessionstatistics
session properties 682, 695 pmcmd syntax 600
functions Gettaskdetails
See also Transformation Language Reference pmcmd syntax 601
minimizing for optimization 653 Getworkflowdetails
pmcmd syntax 601
globalization
See also Installation and Configuration Guide
770 Index
  database connections 234
  overview 234
  targets 234

H
hash partitioning
  adding hash keys 362
  hash auto-keys partitioning 361
  hash user keys partitioning 362
  overview 348, 361
Help
  pmcmd syntax 602
heterogeneous sources
  defined 208
heterogeneous targets
  overview 274
high precision
  disabling 658
  enabling 674
  handling 204
  optimizing 655
history names
  in Workflow Monitor 419
host names
  for FTP connections 561
  registering the PowerCenter Server 49

I
IBM DB2
  connect string example 54
icon
  Workflow Monitor 404
  worklet validation 171
IIF expressions
  See also Transformation Language Reference
  optimizing 653
incremental aggregation
  See also Installation and Configuration Guide
  cache partitioning 621
  changing server code page 577
  changing server data movement mode 577
  changing session sort order 577
  configuring 674
  configuring the session 579
  deleting files 577
  files 34
  moving files 577
  overview 574
  partitioning data 578
  performance 651
  preparing to enable 579
  processing 575
  reinitializing cache 576
incremental changes
  capturing 579
index caches
  Aggregator transformation 621
  description 614
  for incremental aggregation 577
  memory usage 26
  optimizing 655, 658
  Rank transformation 632
indexes
  creating directory 579
  dropping for target tables 248
  finding 577
  optimizing by dropping 642
  recreating for target tables 248
indicator files
  description 33
  pre-defined events 156
  session output 33
Informatica
  documentation xlviii
  Webzine l
Informix
  connect string syntax 54
  row-level locking 379
in-place editing 40
$InputFile
  definition 502
  naming convention 496, 520
  using 503, 507
interactive mode for pmcmd
  connecting 592
  setting defaults 592

J
joiner cache
  overview 624
Joiner transformation
  cache partitioning 624
  caches 26, 34, 624
  joining sorted flat files 385
  joining sorted relational data 387
  optimizing 651
  optimizing performance 650
  partitioning guidelines 396
  performance detail 639
  threads created 19

K
key constraints
  optimizing by dropping 642
key range partitioning 348, 363
keys
  constraint-based loading 248

L
launch
  Workflow Monitor 41, 404
line sequential buffer length
  configuring 677
  sources 225
links
  AND 137
  condition 92
  example link condition 94
  linking tasks concurrently 93
  linking tasks sequentially 94
  loops 92
  OR 137
  show expression on a link 41
  show solid lines 42
  specifying condition 94
  using Expression Editor 96
  variables in 103
  working with 92
List Tasks
  in Workflow Monitor 424
Load Manager
  creating log files 11
  memory usage 24
  overview 3
  parameters 25
  post-session email 10
  process 7, 8
  running sessions and workflows 7
  scheduling workflows 8
  validating code pages 12
load summary
  sessions 467
local variables
  replacing sub-expressions 652
Log and Error Handling tab
  batch handling option 759
  error handling option 759
  log file options 758
  parameter file option 759
  Server Manager session properties 758
log files
  See session logs, workflow logs
  See also Installation and Configuration Guide
  editor for Workflow Monitor 410
  server variable for 46
  session log 671
log options
  settings 677
logs
  server 28
  session 31
  workflow 30
lookup cache
  calculating size 629, 631
  overview 628
  persistent 35
  pipeline partitioning 628
  ports included 628
  session property 676
lookup caches
  See also Designer Guide
  enabling 649
  query created 628
LOOKUP function
  See also Transformation Language Reference
  minimizing for optimization 653
Lookup SQL Override option
  reducing cache size 649
Lookup transformation
  See also Designer Guide
  cache partitioning 391
  caches 26, 34, 628
  calculating cache size 628, 629, 631
  enabling caching 649
  optimizing 639, 649
  optimizing lookup condition 649
  optimizing multiple lookup expressions 650
  optimizing with indexing 649
loops in workflow 92

M
mapping bottlenecks
  identify 638
mapping parameters
  See also Designer Guide
  in session properties 203
  overriding 203
mapping threads
  description 14
mapping variables
  See also Designer Guide
  in partitioned pipelines 394
mappings
  definition 2
  factoring common logic 652
  identify bottlenecks 638
  increasing performance 636
  single-pass reading 647
master servers 446
master thread
  description 14
Maximum Days
  Workflow Monitor 410
maximum sessions
  See also Installation and Configuration Guide
  parameter, description 25
Maximum Workflow Runs
  Workflow Monitor 410
memory
  caches 614
  DTM buffer 25
  increasing to avoid paging 662
merge target files
  session properties 699
merging target files 380, 382
message queue
  using with partitioned pipeline 380
metadata extensions
  creating 82
  deleting 85
  editing 84
  overview 82
  session properties 718
Microsoft Access
  pipeline partitioning 379
Microsoft Outlook
  configuring an email user 322, 342
  configuring the PowerCenter Server 322
Microsoft SQL Server
  bulk loading 642
  commit interval 253
  connect string syntax 54
  optimizing 646
  targets 702
MIME format
  email 320
monitoring
  data flow 639
  session details 434
MOVINGAVG
  See also Transformation Language Reference
  partitioning restrictions 395
MOVINGSUM
  See also Transformation Language Reference
  partitioning restrictions 395
multibyte data
  character handling 227
  Oracle external loader 533
  Sybase IQ external loader 535
  writing to files 270
multiple servers
  overview 444
multiple sessions 196

N
naming convention
  See also Getting Started Guide
naming conventions
  session parameters 496, 520
native connect string
  See connect string
navigating
  workspace 69
network packets
  increasing 643, 646
non-persistent variables 110
non-reusable tasks
  inherited changes 136
  promoting to reusable 136
normal loading
  session properties 697
Normal tracing levels
  definition 473
Normalizer transformation
  partitioning guidelines 347
notification
  general option 41
null characters
  editing 702
  file targets 266
  server handling 227
  session properties, targets 265
numeric operations
  optimizing by using 653
numeric values
  reading from sources 229

O
open transaction
  defined 287
operators
  using for optimization 653
optimizing
  block size 657
  buffer block size 655
  choosing numeric vs. string operations 653
  commit interval 655, 658
  data cache 655
  data caches 658
  data flow 440, 637, 639
  disabling high precision 658
  dropping indexes and key constraints 642
  DTM Buffer Pool Size 655
  eliminating transformation errors 648
  expressions 652
  factoring out common logic 652
  filters 650
  high precision 655
  IIF expressions 653
  increasing checkpoint interval 642
  increasing network packet size 646
  index cache 655, 658
  Joiner transformation 651
  Lookup transformation 649, 650
  mapping 647
  minimizing aggregate function calls 652
  minimizing datatype conversions 648
  minimizing error tracing 659
  pipeline partitioning 663
  removing trailing blank spaces 653
  replacing sub-expressions with local variables 652
  sessions 655
  single-pass reading 647
  source database 645
  system-level 660
  target database 642
  Tracing Level 655
  using DECODE vs. LOOKUP expressions 653
  using operators vs. functions 653
optimizing performance
  Aggregator transformation 650
OR links 137
Oracle
  bulk loading 642
  bulk loading guidelines 253
  commit intervals 253
  connect string syntax 54
  connection with OS Authentication 53
Oracle external loader
  attributes 533
  bulk loading 643
  connecting with OS Authentication 552
  data precision 533
  delimited flat file target 533
  external loader connections 551
  external loader support 524, 533
  fixed-width flat file target 533
  multibyte data 533
  null constraint 533
  partitioned target files 533
  reject file 534
output files
  overview 28, 33
  permissions 28
  session parameter 504
  session properties 700
  targets 263
$OutputFile
  definition 504
  naming convention 496, 520
  using 505
override
  Teradata loader control file 539
  tracing levels 473, 679
owner name
  truncating target tables 245

P
packet size 58
paging
  eliminating 662
parameter files
  format 513
  location 518
  session 512
  specifying in session 518
  using with pmcmd starttask 607
  using with pmcmd startworkflow 608
parameters
  session 496
partition keys
  adding 358, 362, 364
  adding key ranges 365
partition points
  adding and deleting 353
  default 17
  description 17, 346
  Joiner transformation 384
partition types
  description 348
partitioning
  See pipeline partitioning
partitioning data
  incremental aggregation 578
partitioning restrictions
  Debugger 396
  Informix 379
  numerical functions 395
  PowerCenter Connect for IBM MQSeries restrictions 397
  PowerCenter Connect for PeopleSoft restrictions 397
  PowerCenter Connect for SAP BW 397
  PowerCenter Connect for SAP R/3 397
  PowerCenter Connect for Siebel 398
  relational targets 395
  Sybase IQ 379, 395
  transformations 395
  unconnected transformations 353
  XML targets 396
Partitioning tab
  in the Server Manager 762
  in the Workflow Manager 762
Partitions
  properties 352
partitions
  adding and deleting 356
  description 18, 348
Partitions views
  properties 351
pass-through pipeline
  overview 15
performance
  See also optimizing
  commit interval 278
  detail file 31
  identifying bottlenecks 637
  monitoring 436
  server data movement mode 661
  Sybase IQ 643
  tuning, overview 636
performance data
  collecting 674
performance detail files
  creating 436
  enabling session monitoring 436
  permissions 28
  understanding counters 437
  viewing 436
performance settings
  session properties 674
permissions
  connection objects 51
  creating a session 175
  database 51
  deleting a PowerCenter Server 50
  editing sessions 177
  external loader 525
  FTP connections 561
  FTP session 565
  output and log files 28
  recovery files 28
  scheduling 90
  Workflow Monitor tasks 403
persistent lookup cache
  session output 35
persistent variables 110
  in worklets 169
pinging
  pmcmd syntax 602
  PowerCenter Server in Workflow Monitor 405
Pingserver
  pmcmd syntax 602
pipeline partitioning
  adding and deleting partitions 356
  adding hash keys 362
  adding key ranges 365
  adding partition points 353
  caching Lookup transformations 628
  concurrent connections 379
  configuring a session 351
  configuring for sorted data 384
  configuring to optimize join performance 384
  database compatibility 379
  description 346
  error threshold 200
  example of use 349
  external loaders 380, 526
  file lists 375
  file sources 374
  file targets 380
  filter conditions 372
  hash auto-keys partitioning 361
  hash partitioning 361
  hash user keys partitioning 362
  Joiner transformation 384
  key range 363
  loading to Informix 379
  mapping variables 394
  merge target files 699
  merging target files 380, 382
  message queues 380
  multiple CPUs 3
  multiple source pipelines 19
  numerical functions restrictions 395
  object validation 396
  optimizing performance 663
  optimizing source databases 663
  optimizing target databases 664
  overview 3
  partition keys 358, 362, 364
  partition types overview 356
  partitioning indirect files 375
  pass-through partitioning 367
  recovery 200
  reject file 476
  relational sources 371
  relational targets 378
  round-robin partitioning 360
  rules and restrictions 395, 398
  session properties 705
  sorted flat files 385
  sorted relational data 387
  Sorter transformation 389, 392
  SQL queries 371
  symmetric processing platform 24
  threads and partitions 18
  threads created 16
  Transaction Control transformation 356
pipelines
  See source pipelines
  active sources 259
  data flow monitoring 440, 637, 639
  description 346
PM_CODEPAGENAME
  using with pmcmd 585
PM_RECOVERY table
  format 299
PM_TGT_RUN_ID table
  format 299
pmcmd
  aborttask 596
  abortworkflow 597
  command line mode 589
  command parameters 594
  commands, list 582
  commands, reference 594
  environment variables 585
  getserverdetails 599
  getserverproperties 599
  getsessionstatistics 600
  gettaskdetails 601
  getworkflowdetails 601
  help 602
  interactive mode 592
  overview 582
  parameter files 607, 608
  pingserver 602
  resumeworkflow 603
  return codes 300
  setfolder 604
  setnowait 605
  setwait 605
  showsettings 605
  shutdownserver 605
  starttask 606
  startworkflow 607
  stoptask 609
  stopworkflow 609
  syntax 595
  unsetfolder 610
  version 611
  waittask 611
  waitworkflow 611
  writing scripts 589
PMError_MSG table schema 485
PMError_ROWDATA table schema 483
PMError_Session table schema 486
$PMFailureEmailUser
  definition 333
  tips 342
PmNullPasswd
  reserved word 53
PmNullUser
  reserved word 53
pmserver
  process 11
$PMSessionLogCount
  saving a number of logs 471
$PMSessionLogDir
  configuring the session log 471
  definition 469
$PMSessionLogFile
  definition 497
  using 498
$PMSuccessEmailUser
  definition 333
  tips 342
PMTOOL_DATEFORMAT
  using with pmcmd 585
$PMWorkflowLogDir
  definition 459
$PMWorkflowLogCount
  saving a number of logs 460
post-session command
  session properties 711
  shell command properties 714
post-session email
  overview 33, 332
  See also email
  session options 716
  session properties 711
post-session shell command
  configuring non-reusable 189
  configuring reusable 192
  using 188
post-session SQL commands 186
post-session threads
  description 14
PowerCenter Connect for IBM MQSeries
  partitioning restrictions 397
PowerCenter Connect for PeopleSoft
  partitioning restrictions 397
PowerCenter Connect for SAP BW
  partitioning restrictions 397
PowerCenter Connect for SAP R/3
  partitioning restrictions 397
PowerCenter Connect for Siebel
  partitioning restrictions 398
PowerCenter Server 22
  architecture 2
  assigning sessions 198
  assigning workflows 122
  blocking data 23
  changing servers 445
  commit interval overview 276
  configuring for multiple servers 445
  connecting in Workflow Monitor 405
  connectivity overview 5, 46
  creating server grids 451
  data movement modes 27
  deleting 50
  external loader support 524
  filtering in Workflow Monitor 406
  handling file targets 268
  logs 28
  messages 29
  monitoring 436
  multiple servers overview 444
  multiple source file list 230
  online and offline mode 405
  output files 33
  performance detail file 31
  permissions to delete 50
  pinging in Workflow Monitor 405
  privileges required to register 46
  processing data 22
  reading sources 22
  registering 46, 48
  removing assigned sessions 199
  removing assigned workflows 123
  reporting session statistics 468
  server grids overview 446
  system resources 24
  tracing levels 473
  truncating target tables 245
  using FTP 561
  using multiple to increase performance 661
  using server grids to increase performance 661
  variables for 46
pre- and post-session SQL
  entering 186
  guidelines 186
precision
  flat files 270
  writing to file targets 269
pre-defined events
  waiting for 158
pre-defined variables
  in Decision tasks 149
pre-session shell command
  configuring non-reusable 189
  configuring reusable 192
  errors 193
  session properties 711
  using 188
pre-session SQL commands 186
pre-session threads
  description 14
privileges
  See also permissions
  See also Repository Guide
  scheduling 90
  session 175
  workflow 90
  Workflow Monitor tasks 403
  workflow operator 90
Properties tab in session properties
  in Workflow Manager 670

Q
Quit
  pmcmd syntax 602
quoted identifiers
  reserved words 255

R
rank cache
  calculating data cache 633
  calculating index cache 632
  location 632
  overview 632
  size 632
Rank transformation
  See also Transformation Guide
  cache partitioning 620
  caches 26, 34, 632
  partitioning guidelines 347
  performance detail 639
reader threads
  description 14, 15
reading
  sources 22
real-time sessions
  transformation scope 288
recovering
  pipeline partitioning 200
recovery
  completing unrecoverable sessions 316
  configuring mappings 297
  configuring the session 297
  configuring the target database 298
  configuring the workflow 298
  files, permissions 28
  overview 296
  PM_RECOVERY table format 299
  PM_TGT_RUN_ID table format 299
  pmcmd return codes 300
  recover from task 308
  recover task 311
  recovering a failed workflow 308
  recovering a session task 311
  recovering a suspended workflow 305
  recovery table layout 314
  resume/recover 305
  server handling 314
recovery files
  permissions 28
recreating
  indexes 248
registering
  PowerCenter Server 46, 48
registering server
  See also Installation and Configuration Guide
reinitializing
  aggregate cache 576
reject file
  changing names 476
  column indicators 478
  locating 456, 476
  Oracle external loader 534
  overview 32
  permissions 28
  pipeline partitioning 476
  reading 477
  row indicators 478
  session parameter 508
  session properties 243, 263, 698, 700
  transaction control 284
  viewing 476
relational connections
  See relational databases
relational databases
  configuring a connection 56
  copying a relational database connection 59
  replacing a relational database connection 62
  rollback segment 58
relational sources
  partitioning 371
  session properties 214
relational targets
  partitioning 378
  partitioning restrictions 395
  session properties 240, 697
Relative time
  specifying 162
  Timer task 161
reload task or workflow
  configuring 40
rename
  repository objects 73
repositories
  adding 73
  connecting in Workflow Monitor 405
  enter description 73
repository objects
  configuring 73
  rename 73
Repository Server
  notification 41
  notification in Workflow Monitor 410
requirements
  server grids 448
reserved words
  generating SQL with 255
  resword.txt 255
reserved words file
  creating 256
reset all 42
restarting
  in Workflow Monitor 416
Resumeworkflow
  pmcmd syntax 603
Resumeworklet
  pmcmd syntax 603
reusable tasks
  inherited changes 136
  reverting changes 136
reverting changes
  tasks 136
rmail
  See also email
  configuring 321
rollback segment 58
rolling back data
  transaction control 283
round-robin partitioning 348, 360
row error log files
  permissions 28
row error logging
  active sources 260
row indicators
  reject file 478
rows to skip
  delimited files 692
Run if Previous Completed
  in Command Tasks 145
  session command 714
run options
  run continuously 115
  run on demand 115
  server initialization 115
running status 421
running, sessions 197
running, workflows 122

S
saving
  session logs 471
  workflow logs 459
scheduled status 421
scheduling
  configuring 114
  creating reusable scheduler 114
  disabling workflows 118
  editing 117
  end options 116
  error message 113
  permission 90
  run every 115
  run once 115
  run options 115
  schedule options 115
  start date 116
  start time 116
  workflows 112
searching
  for versioned objects in the Workflow Manager 76
  Workflow Manager 70
  Workflow Monitor 427
Sequence Generator transformation
  partitioning guidelines 353, 396
server
  See PowerCenter Server
  See also database-specific server
  selecting 122, 197
server code page
  See also PowerCenter Server
  affecting incremental aggregation 577
Server Grid Browser 453
Server Grid Editor 452
server grids
  connectivity 447
  creating 451
  definition 444
  distributing sessions 446
  increasing performance 661
  master servers 446
  overview 446
  requirements 448
  worker servers 446
server handling
  file targets 268
  fixed-width targets 269, 270
  multibyte data to file targets 271
  shift-sensitive data, targets 271
server logs
  messages 29
  overview 28
Server Manager session properties
  General tab 737
  Log and Error Handling tab 758
  Partitioning tab 762
  Source Location tab 754
  Time tab 755
  Transformations tab 761
server variables
  description 46
  email 333
  for multiple servers 445
  in Command tasks 188, 193
  list 47
  log files 46
servers
  assigned 444
  non-associated 444
session command settings
  session properties 711
session details
  monitoring sessions 434
session errors 201
session logs
  archiving 471
  changing location 498
  changing locations 471
  changing name 497
  changing names 471
  code page 475
  codes 463
  creation 11
  default name 470
  editing 419
  external loader error messages 527
  generating using UTF-8 463
  load summary 467
  locating 456, 469
  location 671
  log file settings 469, 470, 472, 474
  overview 31
  parameter 497
  permissions 28
  reading 463
  sample 466
  saving 678
  session details 31
  session parameter 497
  thread identification 465
  timestamp 472
  tracing levels 473
  transformation statistics 469
  viewing 474
  viewing dynamically 419
  viewing in Workflow Monitor 419
session output
  cache files 34
  control file 33
  incremental aggregation files 34
  indicator file 33
  performance detail file 31
  persistent lookup cache 35
  post-session email 33
  PowerCenter Server log 28
  reject file 32
  session logs 31
  target output file 33
session parameters
  database connection parameter 499
  defining 512
  in Command tasks 143
  naming conventions 496, 520
  overview 496
  reject file parameter 508
  session log parameter 497
  session parameter file 512
  source file parameter 502
  target file parameter 504
session properties
  Components tab 710
  Config Object tab 675
  constraint-based loading 251
  delimited files, sources 222
  delimited files, targets 266
  edit delimiter 690, 702
  edit null character 702
  email 332, 714
  external loader 682, 695
  fixed-width files, sources 220
  fixed-width files, targets 265
  FTP files 682, 695
  general settings 668
  General tab 668
  log files 469, 470, 472, 474
  Metadata Extensions tab 718
  null character, targets 265
  on failure email 332
  on success email 332
  output files, flat file 700
  partition attributes 351, 352
  Partitions View 705
  performance settings 674
  post-session email 332
  post-session shell command 714
  Properties tab 670
  reject file, flat file 263, 700
  reject file, relational 243, 698
  relational sources 214
  relational targets 240
  session command settings 711
  session retry on deadlock 246
  sort order 577
  source connections 211
  sources 210
  table name prefix 254
  target connection settings 682, 695
  target connections 237
  target load options 252, 697
  target-based commit 292
  targets 236
  Transformation node 703
  transformations 703
session properties comparison
  overview 736
session retry on deadlock
  See also Installation and Configuration Guide
  overview 246
sessions
  See also session logs
  See also session properties
  aborting 130, 200
  apply attributes to all instances 178
  assigning PowerCenter Servers 198
  caches 28
  configuring for multiple source files 231
  configuring to optimize join performance 384
  creating 175
  creating a session configuration object 183
  definition 2, 174
  description 132
  distributing in server grids 446
  DTM buffer memory 25
  editing 177
  editing privileges 178
  eliminating paging 621
  email 320
  enabling monitoring 436
  external loading 524, 553
  failure 200
  high-precision data 204
  identifying bottlenecks 639
  metadata extensions in 82
  monitoring counters 437
  multiple source files 230
  optimizing 636, 655
  output files 28
  overview 174
  parameter file 512
  parameters 496
  performance detail file 31
  performance tuning 636
  properties reference 667
  read-only 175
  removing assigned PowerCenter Servers 199
  running 197
  runtime operations overview 7
  session details file 31
  starting 197
  stopping 130, 200
  test load 244, 264
  truncating target tables 245
  using FTP 565
  validating 195
  viewing performance details 436
Setfolder
  pmcmd syntax 604
Setnowait
  pmcmd syntax 605
Setwait
  pmcmd syntax 605
shared memory
  Load Manager 24
shell commands
  executing in Command tasks 145
  make reusable 191
  post-session 188
  post-session properties 714
  pre-session 188
  using Command tasks 143
  using server variables 188, 193
  using session parameters 143
Showsettings
  pmcmd syntax 605
Shutdownserver
  pmcmd syntax 605
single-pass reading
  definition 647
sort order
  See also session properties
  affecting incremental aggregation 577
sorted flat files
  partitioning for optimized join performance 385
sorted ports
  caching requirements 621
sorted relational data
  partitioning for optimized join performance 387
Sorter transformation
  partitioning 392
  partitioning for optimized join performance 389
$Source
  session properties 672
source bottlenecks
  using a database query to identify 638
  using a read test session to identify 638
  using filter transformation to identify 637
source data
  capturing changes for aggregation 574
source databases
  database connection session parameter 499
  identifying bottlenecks 637
  optimizing 645
  optimizing by partitioning 663
  optimizing the query 645
  optimizing with conditional filters 646
source files
  accessing through FTP 560, 565
  configuring for multiple files 230, 231
  delimited properties 691
  fixed-width properties 689
  session parameter 502
  session properties 220, 687
  using parameters 502, 506
source location
  session properties 220, 687
Source Location tab
  in the Workflow Manager 754
  Server Manager session properties 754
source pipelines
  description 346
  pass-through 15
  reading 22
  stages 17
  target load order groups 22
  threads created 19
  with Joiner transformations 19
Source Qualifier transformation
  partitioning guidelines 347
source-based commit
  active sources 278
  description 278
sources
  code page 224
  code page, flat file 222
  connections 211
  delimiters 224
  escape character 691
  line sequential buffer length 225
  multiple sources in a session 230
  null character 689
  null character handling 227
  null characters 222
  overriding SQL query, session 216
  partitioning 371, 374
  quote character 691
  reading 22
  session properties 210
  specifying code page 689, 691
SQL
  configuring environment SQL 55
  guidelines for entering environment SQL 55
SQL queries
  in partitioned pipelines 371
stages
  description 17
staging areas
  removing to improve performance 659
start date, scheduling 116
Start tasks, definition 88
start time, scheduling 116
starting
  selecting a server 122, 197
  sessions 197
  start from task 124
  starting a part of a workflow 124
  starting tasks 125
  starting workflows using Workflow Manager 124
  Workflow Monitor 404
  workflows 122
Starttask
  pmcmd syntax 606
  using a parameter file 607
Startworkflow
  pmcmd syntax 607
  using a parameter file 608
statistics
  for Workflow Monitor 408
  viewing 408
status
  aborted 421
  aborting 421
  disabled 421
  failed 421
  in Workflow Monitor 421
  running 421
  scheduled 421
  stopped 421
  stopping 421
  succeeded 421
  suspended 127, 421
  suspending 127, 421
  tasks 421
  terminated 421
  unscheduled 421
  waiting 421
  workflows 421
stop on
  $PMSessionErrorThreshold 47
  error threshold 200
  errors 679
  pre- and post-session SQL errors 186
stopped status 421
stopping
  PowerCenter Server, See Installation and Configuration Guide
  in Workflow Monitor 418
  server handling 129
  sessions 130
  tasks 129
  using Control tasks 147
  workflows 129
stopping status 421
Stoptask
  pmcmd syntax 609
Stopworkflow
  pmcmd syntax 609
string operations
  minimizing for performance 653
sub-expressions
  replacing with local variables 652
succeeded status 421
Suspend On Error option 127
suspended status 127, 421
suspending
  behavior 127
  email 128
  resume in Workflow Monitor 417
  status 127
  workflows 127
  worklets 164
suspending status 421
suspension email 339
Sybase
  commit interval 253
Sybase IQ
  partitioning restrictions 379, 395
Sybase IQ external loader
  attributes 536
  bulk loading 643
  connections 551
  data precision 535
  delimited flat file targets 536
  fixed-width flat file targets 535
  multibyte data 535
  optional quotes 535
  overview 535
  support 524
Sybase SQL Server
  bulk loading 642
  connect string example 54
  optimizing 646
symmetric processing platform
  pipeline partitioning 24
system bottlenecks
  identifying 640
  UNIX 641
  Windows 640
system-level optimization
  improving network speed 660
  overview 660
  using additional CPUs 661

T
table name prefix
  target owner 254
table owner name
  session properties 216
  targets 254
$Target
  session properties 672
target connect groups
  committing data 278
target connection group
  Transaction Control transformation 289
target connection groups
  constraint-based loading 249
  defined 257
target connection settings
  session properties 682, 695
target databases
  bulk loading 642
  database connection session parameter 499
  identifying bottlenecks 637
  optimizing 642
  optimizing by partitioning 664
  optimizing Oracle target database 643
target files
  delimited 703
  fixed-width 702
target load order
  constraint-based loading 249
  groups 22
target load order groups
  defined 22
target owner
  table name prefix 254
target properties
  bulk mode 241
  test load 241
  update strategy 241
target tables
  truncating 245
target-based commit
  WriterWaitTimeout 277
target-based commit interval
  description 277
targets
  accessing through FTP 560, 568
  code page 267, 702, 703
  code page compatibility 235
  code page, flat file 266
  connection settings 695
  connections 237
  database connections 234
  delimiters 267
  file writer 236
  globalization features 234
  heterogeneous 274
  load, session properties 252, 697
  merging output files 380, 382
  multiple connections 274
  multiple types 274
  null characters 266
  output files 263
  output files for 33
  partitioning 378, 380
  relational settings 697
  relational writer 236
  session properties 236, 240
  specifying null character 702
  truncating tables 245
  viewing session detail 31
  writers 236
Task Developer
  creating tasks 133
  displaying and hiding tool name 41
Task view
  configuring 412
  customizing 412
  displaying 430
  filtering 431
  hiding 412
  opening and closing folders 407
  overview 402
  using 430
tasks
  aborted 421
  aborting 129, 421
  adding in workflows 92
  arranging 71
  Assignment tasks 140
  Command tasks 143
  configuring 135
  Control task 147
  copying 77
  creating 133
  creating in Task Developer 133
  creating in Workflow Designer 133
  Decision tasks 149
  disabled 421
  disabling 137
  email 328
  Event-Raise tasks 153
  Event-Wait tasks 153
  failed 421
  failing parent workflow 138
  in worklets 166
  inherited changes 136
  instances 136
  list of 132
  non-reusable 92
  overview 132
  promoting to reusable 136
  restarting in Workflow Monitor 416
  reusable 92
  reverting changes 136
  running 421
  show full name 41
  starting 125
  status 421
  stopped 421
  stopping 129, 421
  stopping and aborting in Workflow Monitor 418
  succeeded 421
  Timer tasks 161
  using Tasks toolbar 92
  validating 119
Tasks toolbar
  creating tasks 134
TCP/IP network protocol
  server settings 49
Teradata
  connect string example 54
Teradata external loader
  code page 538
  connections 551
  date format 538
  FastLoad attributes 545
  MultiLoad attributes 540
  overriding the control file 539
  support 524
  Teradata Warehouse Builder attributes 547
  TPump attributes 542
Teradata Warehouse Builder
  attributes 547
  operators 547
terminated status 421
Terse tracing levels
  See also Designer Guide
  defined 473
test load
  bulk loading 244
  enabling 671
  file targets 264
  number of rows to test 671
  relational targets 244
thread identification
  session log file 465
threads
  and partitions 18
  creation 13, 14
  mapping 14
  master 14
  post-session 14
  pre-session 14
  reader 14, 15
  transformation 14, 16
  types 14
  writer 14, 16
time
  configuring 38
  formats 38
Time tab
  duration options 756
  schedule options 755
  Server Manager session properties 755
  start options 756
  use absolute time option 757
Timer tasks
  absolute time 161, 162
  definition 161
  description 132
  example 161
  relative time 161, 162
  variables in 103
timestamps
  session logs 472
  workflow logs 460, 462
  Workflow Monitor 402
tool names
  displaying and hiding 41
toolbars 69
  adding tasks 92
  creating tasks 134
  using 69
  Workflow Monitor 415
Tracing Level
  optimizing 655
tracing levels
  See also Designer Guide
  Normal 473
  overriding 679
  session 473
  Terse 473
  Verbose Data 474
  Verbose Initialization 474
transaction
  defined 287
transaction boundary
  dropping 287
  transaction control 287
transaction control
  bulk loading 283
  end of file 284
  open transaction 287
  overview 287
  PowerCenter Server handling 283
  real-time sessions 287
  reject file 284
  rules and guidelines 290
  transaction control points 287
  transformation error 284
  transformation scope 287
  user-defined commit 283
transaction control point
  defined 287
Transaction Control transformation
  partitioning guidelines 356
  target connection group 289
transaction control unit
  defined 289
transaction generator
  active sources 259
  effective and ineffective 259
  transaction control points 287
transformation scope
  defined 287
  real-time processing 288
  transformations 288
transformation threads
  description 14, 16
transformations
  as partition points 353
  eliminating errors 648
  optimizing 639
  partitioning restrictions 395
  session properties 703
  statistics on 469
Transformations node
  properties 703
Transformations tab
  in the Server Manager 761
  in the Workflow Manager 761
Transformations view
  session properties 681
Treat Source Rows As
  bulk loading 252
Treat Source Rows As property
  overview 214
truncating
  Table Name Prefix 245
  target tables 245

U
unconnected transformations
  partitioning restrictions 353
Unicode mode
  See also Installation and Configuration Guide
  code pages 27
  session behavior 16
UNIX systems
  email 321
  external loader behavior 526
  PowerCenter Server as daemon 3
unscheduled status 421
Unsetfolder
  pmcmd syntax 610
update strategy
  target properties 241
Update Strategy transformation
  constraint-based loading 249
updating
  incrementally 579
URL
  adding through business documentation links 97
user-defined commit
  see also transaction control
  bulk loading 283
user-defined events
  declaring 155
  example 153
  waiting for 157
using multiple servers 444

V
validating 196
  expressions 97, 119
  tasks 119
  workflows 119, 120
  worklets 171
Varchar datatypes
  See also Designer Guide
  removing trailing blanks for optimization 653
variables
  email 333
  server 46
  workflow 103
Verbose Data tracing levels
  configuring session log 474
  See also Designer Guide
Verbose Initialization tracing levels
  configuring session log 474
  See also Designer Guide
Version
  pmcmd syntax 611
versioned objects
  See also Repository Guide
  checking in 74
  checking out 74
  searching for in the Workflow Manager 76
viewing
  reject file 476
  session logs 474
  workflow logs 462
W
waiting status 421
Waittask
  pmcmd syntax 611
Waitworkflow
  pmcmd syntax 611
web links
  adding to expressions 97
webzine l
windows
  customizing 69
  displaying and closing 69
  docking and undocking 69
  Navigator 67
  Output 67
  overview 67
  panning 40
  reloading 40
  Workflow Manager 67
  Workflow Monitor 402
  workspace 67
Windows System Tray
  accessing Workflow Monitor 404
Windows systems
  email 322
  external loader behavior 526
  Informatica service owner 322
  logon network security 325
  PowerCenter Server service 3
worker servers 446
Workflow Designer
  creating tasks 133
  displaying and hiding tool name 41
workflow logs
  archiving 459
  changing locations 461
  changing name 461
  codes 458
  configuring 460
  creation 9
  editing 419
  enabling and disabling 459, 461
  locating 456, 459
  log file settings 459, 460
  overview 30
  permissions 28
  reading 458
  sample 458
  timestamp 460
  viewing 462
  viewing dynamically 419
  viewing in Workflow Monitor 419
Workflow Manager
  adding repositories 73
  arrange 71
  checking out and in versioned objects 74
  configuring for multiple source files 231
  copying 77
  creating external loader connections 551
  customizing options 39
  date and time formats 38
  defining FTP connections 561
  display options 39
  entering object descriptions 73
  format options 42
  general options 39
  increasing network packet size 646
  managing multiple servers 444
  messages to Workflow Monitor 410
  overview 38, 46, 66
  registering the PowerCenter Server 46, 48
  searching for items 70
  searching for versioned objects 76
  setting up database connections 53, 56
  toolbars 69
  tools 66
  validating sessions 195
  windows 67, 69
  zooming the workspace 71
Workflow Monitor
  closing folders 407
  configuring 409
  connecting to repositories 405
  connecting to server 405
  customizing columns 412
  deleted servers 405
  deleted tasks 406
  disconnecting from server 405
  displaying servers 406
  dynamic logs 419
  editing logs 419
  filtering deleted tasks 406
  filtering servers 406
  filtering tasks in Task View 405, 431
  Gantt Chart view 402
  hiding columns 412
  hiding servers 406
  icon 404
  launching 404
  launching automatically 41
  listing tasks and workflows 424
Index 787
  log file editor 410
  Maximum Days 410
  Maximum Workflow Runs 410
  monitor modes 405
  navigating the Time window 425
  notification from Repository Server 410
  opening folders 407
  overview 402
  performing tasks 416
  permissions and privileges 403
  pinging the PowerCenter Server 405
  receive messages from Workflow Manager 410
  restarting tasks, workflows, and worklets 416
  resuming a workflow or worklet 417
  searching 427
  session details 434
  starting 404
  statistics 408
  stopping or aborting tasks and workflows 418
  switching views 403
  System Tray 404
  Task view 402
  time 402
  toolbars 415
  viewing history names 419
  viewing session logs 419
  viewing workflow logs 419
  workflow and task status 421
  zooming 426
workflow output
  email 33
  workflow logs 30
workflow parameter file 110
workflow properties
  log files 459, 460
  suspension email 339
workflow variables
  creating 110
  datatypes 105, 110
  default values 106, 109, 110
  keywords 104
  non-persistent variables 110
  persistent variables 110
  pre-defined 105
  start and current values 109
  SYSDATE 105
  user-defined 108
  using 103
  using in expressions 106
  WORKFLOWSTARTTIME 105
workflows
  aborted 421
  aborting 129, 421
  adding tasks 92
  assigning PowerCenter Servers 122
  branches 88
  copying 77
  creating 91
  definition 2, 88
  deleting 97
  developing 89, 91
  disabled 421
  disabling 118
  editing 98
  email 341
  events 88
  fail parent workflow 138
  failed 421
  guidelines 89
  links 88
  locking 8
  metadata extensions in 82
  monitor 89
  overview 88
  parameter file 9
  privileges 90
  properties reference 721
  removing assigned PowerCenter Servers 123
  restarting in Workflow Monitor 416
  resuming in Workflow Monitor 417
  running 7, 122, 421
  runtime operations overview 7
  scheduled 421
  scheduling 112
  selecting a server 89
  starting 122
  starting on non-associated server 444
  status 127, 421
  stopped 421
  stopping 129, 421
  stopping and aborting in Workflow Monitor 418
  succeeded 421
  suspended 421
  suspending 127, 421
  suspension email 339
  terminated 421
  unscheduled 421
  using tasks 132
  validating 119
  variables 103
  waiting 421
Worklet Designer
  displaying and hiding tool name 41
worklets
  adding tasks 166
  configuring properties 166
  create non-reusable worklets 165
  create reusable worklets 165
  declaring events 167
  developing 165
  email 341
  fail parent worklet 138
  metadata extensions in 82
  overriding variable value 169
  overview 164
  parameters tab 169
  persistent variable example 169
  persistent variables 169
  restarting in Workflow Monitor 416
  resuming in Workflow Monitor 417
  suspended 421
  suspending 164, 421
  unscheduled 421
  validating 171
  variables 169
  waiting 421
workspace
  color 42
  navigating 69
  setting colors 42
  setting fonts 42
  zooming 71
workspace file directory 41
writer threads
  description 14, 16
writers
  session properties 692
WriterWaitTimeout
  target-based commit 277
writing
  multibyte data to files 270
  to fixed-width files 268, 269

X
XML sources
  allocating memory 655
  numeric data handling 229
XML targets
  active sources 259
  partitioning restrictions 396

Z
zooming
  Workflow Manager 71
  Workflow Monitor 426