Joiner transformation
The Joiner transformation supports four types of joins at the Informatica level:
Normal
Master Outer
Detail Outer
Full Outer
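The four join types can be sketched in plain Python. This is only an illustration, not how PowerCenter implements the Joiner: the function name `joiner` and the sample rows are made up, and columns missing from an unmatched row are simply left out instead of being set to NULL. It follows the usual Joiner semantics, where Master Outer keeps all detail rows and Detail Outer keeps all master rows.

```python
from collections import defaultdict

JOIN_TYPES = ("normal", "master_outer", "detail_outer", "full_outer")

def joiner(master, detail, key, join_type="normal"):
    """Join two lists of dicts on `key` (hypothetical helper, illustration only)."""
    if join_type not in JOIN_TYPES:
        raise ValueError(f"unknown join type: {join_type}")

    # Group the master rows by key, similar to caching the master source.
    master_by_key = defaultdict(list)
    for m in master:
        master_by_key[m[key]].append(m)

    out = []
    matched_master_keys = set()
    for d in detail:
        matches = master_by_key.get(d[key], [])
        if matches:
            matched_master_keys.add(d[key])
            for m in matches:
                out.append({**m, **d})
        elif join_type in ("master_outer", "full_outer"):
            # Master Outer: keep every detail row, matched or not.
            out.append(dict(d))

    if join_type in ("detail_outer", "full_outer"):
        # Detail Outer: keep every master row, matched or not.
        for m in master:
            if m[key] not in matched_master_keys:
                out.append(dict(m))
    return out

master_rows = [{"id": 1, "dept": "A"}, {"id": 2, "dept": "B"}]
detail_rows = [{"id": 2, "name": "x"}, {"id": 3, "name": "y"}]

for jt in JOIN_TYPES:
    print(jt, sorted(r["id"] for r in joiner(master_rows, detail_rows, "id", jt)))
```

With these sample rows, only id 2 exists on both sides, so Normal returns one row, each outer join adds the unmatched rows from one side, and Full Outer adds both.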
Lookup Transformation
The Lookup transformation is basically used for reference, based on the lookup condition.
When you want some data based on target data, you take a lookup on that particular table
and retrieve the corresponding fields from that table.
The default lookup query can be overridden with a custom SQL query.
Lookup transformation:
Use a Lookup transformation in a mapping to look up data in a flat file, relational
table, view, or synonym. The Integration Service queries the lookup source based on the
lookup ports in the transformation and a lookup condition. The Lookup transformation
returns the result of the lookup to the target or another transformation.
1. Normal: Logs the initialization and status information, a summary of the success rows
and target rows, and information about the rows skipped due to transformation errors.
3. Verbose Initialization: In addition to Normal tracing, logs the location of the data
cache files and index cache files that are created, and detailed transformation statistics
for each and every transformation within the mapping.
4. Verbose Data: In addition to Verbose Initialization tracing, logs each and every record
processed by the Informatica server.
I have one output port in my lookup that should be one of four values.
If "Lookup Policy on Multiple Match" is set to Report Error, then it takes the default for
the port, but I can't find a way to make it use "NONE" sometimes and "BOTH" other
times.
Answer
I hope you are familiar with the basics of Informatica. Now, on to the Lookup
transformation.
For example, suppose we want to retrieve all the sales of the product with ID 10, and
the sales data resides in another table called 'Sales'. Instead of using the
Sales table as one more source, use a Lookup transformation to look up the data for the
product with ID 10 in the Sales table.
1. A connected lookup receives input values directly from the mapping pipeline, whereas
an unconnected lookup receives input values from a :LKP expression in another
transformation.
2. A connected lookup can return multiple columns from the same row, whereas an
unconnected lookup has one return port and returns one column from each row.
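The two differences above can be pictured with a small Python analogy. It is only an analogy: `SALES`, `connected_lookup` and `unconnected_lookup` are invented names, and the function call merely stands in for a :LKP expression.

```python
# Hypothetical lookup source, keyed by product ID.
SALES = {
    10: {"amount": 250.0, "region": "EU"},
    11: {"amount": 90.0, "region": "US"},
}

def connected_lookup(row):
    """Sits in the pipeline: takes the whole row and adds several lookup columns."""
    match = SALES.get(row["product_id"], {})
    return {**row, "amount": match.get("amount"), "region": match.get("region")}

def unconnected_lookup(product_id):
    """Called on demand from an expression: returns one value through one return port."""
    match = SALES.get(product_id)
    return match["amount"] if match else None

row = {"product_id": 10, "product_name": "widget"}

# Connected: every row flows through and picks up multiple columns.
print(connected_lookup(row))

# Unconnected: the caller decides when to look up, and gets a single column back.
amount = unconnected_lookup(row["product_id"]) if row["product_name"] else None
print(amount)
```

The design point is that the unconnected form can be called conditionally, and from more than one place, but gives back only one column per call.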
Example
Misconceptions about lookup SQL Indexes
I have seen people suggest an index to improve the performance of any SQL. This
suggestion is often incorrect. When it comes to indexing the condition port columns
of a Lookup SQL, it is far more "incorrect".
Before explaining why it is incorrect, let me detail how the Lookup works. To
explain with an example, we take the usual HR schema EMP table, which has
EMPNO, ENAME and SALARY as columns.
Let us say there is a lookup in an ETL mapping that checks for a particular EMPNO and
returns ENAME and SALARY from the Lookup. The output ports for the Lookup
are "ENAME" and "SALARY". The condition port is "EMPNO". Imagine that you are
facing performance problems with this Lookup and one of the suggestions was to index
the condition port. With caching enabled, the Lookup runs a query like this one against
the database to build its cache:
select EMPNO, ENAME, SALARY from EMP ORDER BY EMPNO, ENAME, SALARY;
The data returned by this query is stored in the Lookup cache, and each record
from the source is then looked up against this cache. So the check against the condition
port column is done in the Informatica Lookup cache and "not in the database". Any
index created in the database therefore has no effect on it.
You may be wondering whether the same indexing can be replicated in the Lookup cache.
You don't have to worry about it. PowerCenter creates an "index" cache and a "data" cache
for the Lookup. In this case, the condition port data, "EMPNO", is indexed and hashed in the
"index" cache, and the rest, along with EMPNO, is found in the "data" cache.
I hope you now understand why indexing the condition port columns doesn't improve
performance.
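A rough Python sketch of the idea, using an in-memory SQLite table as a stand-in for the EMP table. The sample rows, the function names and the dict/list layout of the two caches are assumptions made for illustration; the real PowerCenter cache files are not structured like this.

```python
import sqlite3

# Stand-in database with a tiny, made-up EMP table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER, ENAME TEXT, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [(7499, "ALLEN", 1600.0), (7369, "SMITH", 800.0)],
)

def build_lookup_cache(connection):
    """Run the lookup SQL once and split the result into index and data caches."""
    rows = connection.execute(
        "SELECT EMPNO, ENAME, SALARY FROM EMP ORDER BY EMPNO, ENAME, SALARY"
    ).fetchall()
    index_cache = {}  # condition port value (EMPNO) -> position in data cache
    data_cache = []   # full rows: EMPNO plus the output ports
    for position, row in enumerate(rows):
        index_cache.setdefault(row[0], position)
        data_cache.append(row)
    return index_cache, data_cache

def lookup(empno, index_cache, data_cache):
    """Resolve the condition in memory; the database is not touched here."""
    position = index_cache.get(empno)
    if position is None:
        return None
    _, ename, salary = data_cache[position]
    return ename, salary

index_cache, data_cache = build_lookup_cache(conn)
conn.close()  # the database is no longer needed once the cache is built

print(lookup(7499, index_cache, data_cache))
print(lookup(1234, index_cache, data_cache))
```

The connection is closed before any lookup runs, which is the whole point: once the cache is built, a database index on EMPNO has nothing left to speed up.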
Having said that, let me take you to a different kind of lookup, one where caching has
been disabled. In this kind of Lookup there is no cache. Every time a row is sent
into the lookup, the SQL is executed against the database. In this scenario, the database
index "may" work. But if the performance of the lookup is a problem, then the "cache-less"
lookup itself may be the problem.
I would go for a cache-less lookup only if the number of source records is smaller than the
number of records in my lookup table. In this case ONLY, indexing the condition ports will
work. Everywhere else, it is a matter of luck whether the database picks up the
index.
Dynamic Lookups
Dynamic Lookups are used for implementing Slowly Changing Dimensions. The ability
to provide dynamic caching gives Informatica a definitive edge over other vendor
products. In a Dynamic Lookup, every time a new record is found (based on the lookup
condition), that record is added to the Lookup cache. It can also update existing
records in the cache with the incoming values.
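The insert-or-update behaviour can be sketched with a dict acting as the cache. This is a minimal sketch, not PowerCenter's implementation: `dynamic_lookup` and the sample customer rows are made up, and the 0/1/2 return codes are chosen to mirror the NewLookupRow port values (no change, insert, update).

```python
NO_CHANGE, INSERT, UPDATE = 0, 1, 2

def dynamic_lookup(cache, key, values):
    """Look up `key`; keep the cache in sync with the incoming row."""
    if key not in cache:
        cache[key] = dict(values)   # new record: add it to the cache
        return INSERT
    if cache[key] != values:
        cache[key] = dict(values)   # existing record with changed values
        return UPDATE
    return NO_CHANGE                # existing record, nothing changed

cache = {}
incoming = [
    (101, {"city": "Pune"}),
    (102, {"city": "Delhi"}),
    (101, {"city": "Mumbai"}),   # same key, new value
    (102, {"city": "Delhi"}),    # same key, same value
]

flags = [dynamic_lookup(cache, key, values) for key, values in incoming]
print(flags)
print(cache)
```

Because the cache changes while rows are still flowing, the second row for key 101 is seen as an update and not as another insert, which is what a Slowly Changing Dimension load needs.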
Q> I have a mapping that has a source file. My source file is unique and there are
no duplicates. This source file goes to a lookup table to get some values based
on a condition that I have set. If the condition is met, a value will be retrieved
from the lookup. If a value does not meet the condition, NULL will be retrieved
from the lookup.
R> Here is the problem. While the source file is unique and contains no
duplicates, the lookup table may contain multiple matches to the source file.
When this happens, I want the records from the source file that have found
multiple matches on the lookup table to be re-directed to a table that will
undergo further processing.
S> I have tried to use Lookup Policy on Multiple Match = Report Error, and it has
dropped the records from the source file that have multiple matches in the
lookup table. I don't want this to happen. I want the records from the source file
that have multiple matches in the lookup table to be re-directed to another
table for re-processing.
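To make the requirement concrete, here is the desired routing written out in Python. It only illustrates the behaviour being asked for, not a PowerCenter setting or a known solution; `route_by_match_count` and the sample rows are hypothetical.

```python
from collections import defaultdict

def route_by_match_count(source_rows, lookup_rows, key):
    """Split source rows by how many lookup rows share their key."""
    matches = defaultdict(list)
    for lookup_row in lookup_rows:
        matches[lookup_row[key]].append(lookup_row)

    normal, reprocess = [], []
    for source_row in source_rows:
        found = matches.get(source_row[key], [])
        if len(found) > 1:
            # Multiple matches: send the source row to the re-processing table.
            reprocess.append(source_row)
        else:
            # Zero or one match: pass it on with the looked-up row (or None).
            normal.append((source_row, found[0] if found else None))
    return normal, reprocess

source = [{"id": 1}, {"id": 2}, {"id": 3}]
lookup_table = [
    {"id": 1, "val": "a"},
    {"id": 2, "val": "b"},
    {"id": 2, "val": "c"},
]

normal, reprocess = route_by_match_count(source, lookup_table, "id")
print(normal)
print(reprocess)
```

Here id 2 has two matches and is routed for re-processing, id 1 gets its single match, and id 3 gets no match (the NULL case described above).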
What is the difference in the Lookup Policy on Multiple Match within a lookup
between Use First Value, Use Last Value and Use Any Value? When there are multiple
rows with the same key, does Informatica do additional processing to determine which
row is the 'true' first? It would seem that Use Any Value would be better when only
checking the condition.
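One way to picture the three policies, assuming the matching rows are already held in some cache order. The function `pick_match` is made up, and which row counts as "first" or "last" depends on how the cache is ordered, which this sketch does not model.

```python
def pick_match(matches, policy):
    """Choose one row from a list of matching rows (hypothetical helper)."""
    if not matches:
        return None
    if policy == "first":
        return matches[0]
    if policy == "last":
        return matches[-1]
    if policy == "any":
        # Any matching row is acceptable, so no ordering is needed;
        # this sketch simply returns the first one it holds.
        return matches[0]
    raise ValueError(f"unknown policy: {policy}")

# Made-up matches for one key, in cache order.
matches_for_key = [
    {"EMPNO": 7369, "DEPTNO": 10},
    {"EMPNO": 7369, "DEPTNO": 20},
    {"EMPNO": 7369, "DEPTNO": 30},
]

for policy in ("first", "last", "any"):
    print(policy, pick_match(matches_for_key, policy))
```

If the lookup is only used to check whether a match exists, all three policies give the same yes/no answer, which is the intuition behind preferring Use Any Value in that case.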
I think the question is: if EMPID is the 5th column/port (in the Source Qualifier), DEPTNO is
the 7th column/port (in the Source Qualifier), and we set Number of Sorted Ports to 2, how will
Informatica know that it has to sort only those 2 ports?
The ports EMPID (5th port) and DEPTNO (7th port) should be moved to the first 2
positions by using the arrows, which move the columns up and down as desired.
Make EMPID and DEPTNO the first 2 ports, and Informatica will then
understand that it is the first 2 ports that should be sorted.
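The effect of port order can be shown with a small helper that builds the ORDER BY clause from the first N ports. The helper name and the extra port names (ENAME, SALARY, and so on) are invented for the example; only EMPID and DEPTNO come from the question.

```python
def order_by_clause(ports, number_of_sorted_ports):
    """Build an ORDER BY clause from the first N ports, in port order."""
    sorted_ports = ports[:number_of_sorted_ports]
    if not sorted_ports:
        return ""
    return "ORDER BY " + ", ".join(sorted_ports)

# Original port order: EMPID is 5th and DEPTNO is 7th.
original = ["ENAME", "SALARY", "HIREDATE", "JOB", "EMPID", "MGR", "DEPTNO"]
print(order_by_clause(original, 2))

# After moving EMPID and DEPTNO up to the first two positions.
reordered = ["EMPID", "DEPTNO", "ENAME", "SALARY", "HIREDATE", "JOB", "MGR"]
print(order_by_clause(reordered, 2))
```

With the original order the sort would fall on ENAME and SALARY; only after the ports are moved up does it fall on EMPID and DEPTNO.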