Many beginners struggle to get an SCD working manually in Informatica. Let's walk through a simple example step by step.

Assumption: working as the SCOTT user in Oracle.

SRC:
create table emps_us as select empno, ename, sal from emp;

TGT:
create table empt_us as select empno, ename, sal from scott.emp where 1=2;
alter table empt_us add constraint eno_pk primary key (empno);

Step 1: Get a simple pass-through working.
Transfer the data from source to target using Informatica.
Note: if this initial load is not done, the update in the next step may not work.

Step 2: Get the update working.
Objective: when source rows change, update only those changed rows in the target.

SRC:
update emps_us set sal = sal + 1000 where empno in (7900, 7902, 7934);

PowerCenter Designer:
Drag in the source and target and create an Update Strategy transformation. Link them straight through: SRC - UPDTRANS - TGT.

Now we need to look up the target to see which rows to update. We will do this with an unconnected Lookup transformation that finds rows whose SAL has changed. Create the Lookup transformation, select the target table, and add two input ports, IN_EMPNO and IN_SAL; these values will be supplied when the lookup is called. Add the conditions EMPNO = IN_EMPNO and SAL != IN_SAL. In the Ports tab, enable R (return) for EMPNO; this just means a value is returned only when the conditions are met.

Now edit the update strategy expression as:
IIF( NOT ISNULL( :LKP.LKPTRANS(EMPNO,SAL) ), DD_UPDATE, DD_REJECT )

Save and create a workflow.
Important:
1) Session Properties -> Treat Source Rows As -> change it to Data Driven.
2) The target must have a primary key for the update to work.
Now run the workflow and observe that the modified rows are updated.

Step 3: Get the insert working as well.
Objective: along with updating existing rows, we need to add new rows.

PowerCenter Designer:
In the same mapping, drag in one more instance of the target and create a second Update Strategy transformation. Link them straight through: SRC - NEW UPDTRANS - TGT. Create another Lookup transformation on the target table, add the port IN_EMPNO with the condition EMPNO = IN_EMPNO, and enable R for EMPNO; the lookup returns a value only when the condition is satisfied. Now edit the new update strategy expression as:

IIF( ISNULL( :LKP.LKPTRANS(EMPNO) ), DD_INSERT, DD_REJECT )

Save and refresh the workflow mapping. Make the following changes to the source:

insert into emps_us values (1, 'N1', 100);
update emps_us set sal = sal + 1000 where empno in (7900, 7902, 7934);

Start the workflow to see your SCD Type 1 working. The example above gets the basics working. You can then keep optimizing, perhaps down to a single update strategy and a single target, by editing the strategy expression as
IIF( ISNULL(:LKP.LKPTRANS_INSERT(EMPNO)), DD_INSERT, IIF( ISNULL(:LKP.LKPTRANS_UPDATE(EMPNO,SAL)), DD_REJECT, DD_UPDATE ) )
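To make the combined expression above easier to follow, here is a rough restatement of its decision logic in Python: insert when the key is absent from the target, update when the key exists but SAL differs, otherwise reject. The table contents and function name are illustrative, not taken from the actual mapping.

```python
# Numeric equivalents of the DD_* constants used by the Update Strategy.
DD_INSERT, DD_UPDATE, DD_REJECT = 0, 1, 3

def flag_row(row, target):
    """Mimic IIF(ISNULL(lkp_insert), DD_INSERT, IIF(ISNULL(lkp_update), DD_REJECT, DD_UPDATE))."""
    existing = target.get(row["empno"])   # the lookup on EMPNO
    if existing is None:
        return DD_INSERT                  # key not in target -> insert
    if existing["sal"] != row["sal"]:
        return DD_UPDATE                  # key exists and SAL changed -> update
    return DD_REJECT                      # unchanged -> reject

target = {7900: {"sal": 950}, 7902: {"sal": 3000}}
print(flag_row({"empno": 1, "sal": 100}, target))      # 0 (DD_INSERT)
print(flag_row({"empno": 7900, "sal": 1950}, target))  # 1 (DD_UPDATE)
print(flag_row({"empno": 7902, "sal": 3000}, target))  # 3 (DD_REJECT)
```

Note how the two lookups collapse into a single dictionary probe here; in the mapping they remain two separate unconnected lookups.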
Implementation:
Source: create the CUST source table using the following script (the INSERT_DT and UPDATE_DT columns are required by the insert scripts used later in this example):

CREATE TABLE CUST (
  CUST_ID    NUMBER,
  CUST_NM    VARCHAR2(250 BYTE),
  ADDRESS    VARCHAR2(250 BYTE),
  CITY       VARCHAR2(50 BYTE),
  STATE      VARCHAR2(50 BYTE),
  INSERT_DT  DATE,
  UPDATE_DT  DATE);
Target:

CREATE TABLE STANDALONE.CUST_D (
  PM_PRIMARYKEY  INTEGER,
  CUST_ID        NUMBER,
  CUST_NM        VARCHAR2(250 BYTE),
  ADDRESS        VARCHAR2(250 BYTE),
  CITY           VARCHAR2(50 BYTE),
  STATE          VARCHAR2(50 BYTE),
  INSERT_DT      DATE,
  UPDATE_DT      DATE);

CREATE UNIQUE INDEX STANDALONE.CUST_D_PK ON STANDALONE.CUST_D (PM_PRIMARYKEY);
ALTER TABLE CUST_D ADD (CONSTRAINT CUST_D_PK PRIMARY KEY (PM_PRIMARYKEY));

Import the source and target into Informatica using the Source Analyzer and Target Designer. Create the mapping m_Use_Dynamic_Cache_To_SCD_Type1 and drag the CUST source from Sources into the Mapping Designer.
Create a Lookup transformation lkp_CUST_D on the target table CUST_D. Add the input ports in_CUST_ID, in_CUST_NM, in_ADDRESS, in_CITY, and in_STATE to it, then connect CUST_ID, CUST_NM, ADDRESS, CITY, and STATE from the source qualifier to those ports respectively. In the Conditions tab of the lookup, create the condition CUST_ID = in_CUST_ID.
Select the Dynamic Lookup Cache and Insert Else Update options in the lookup transformation properties.
Create an Expression transformation, drag all ports from the lookup transformation into it, and rename them to match the source or target attributes, so that it is easy to tell which fields come from the source and which from the target.
Create one dummy output port in the expression transformation to pass the load date to the target, and assign SYSDATE to it in the expression editor.
Create a Router transformation and drag the ports from the expression transformation into it.
Create two groups in the router transformation, one for inserts and one for updates, with the condition NewLookupRow = 1 for the insert group and NewLookupRow = 2 for the update group.
Connect the insert group from the router to the insert pipeline of the target, and the update group to the update pipeline of the target through an Update Strategy transformation.
In the update strategy transformation upd_INSERT set the expression DD_INSERT, and in upd_UPDATE set DD_UPDATE. Create the workflow wkfl_Use_Dynamic_Cache_To_SCD_Type1 with the session s_Use_Dynamic_Cache_To_SCD_Type1 for the mapping m_Use_Dynamic_Cache_To_SCD_Type1.
This completes the coding of SCD Type 1 using a dynamic lookup transformation.

Execution: insert records into the source CUST table using the following scripts.

SET DEFINE OFF;
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT, UPDATE_DT)
  Values (80001, 'Marion Atkins', '100 Main St.', 'Bangalore', 'KA', SYSDATE, SYSDATE);
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT, UPDATE_DT)
  Values (80002, 'Laura Jones', '510 Broadway Ave.', 'Hyderabad', 'AP', SYSDATE, SYSDATE);
Insert into CUST (CUST_ID, CUST_NM, ADDRESS, CITY, STATE, INSERT_DT, UPDATE_DT)
  Values (80003, 'Jon Freeman', '555 6th Ave.', 'Bangalore', 'KA', SYSDATE, SYSDATE);
COMMIT;
Start the workflow after inserting the records into the CUST table. When the workflow completes, all the records will be loaded into the target.
Now update any record in the source and re-run the workflow; it will update the corresponding record in the target. Any records in the source that are not present in the target will be inserted into the target table.
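The behaviour of the dynamic lookup cache in this mapping can be sketched in Python. With Insert Else Update enabled, the cache sets NewLookupRow to 1 when a key is inserted into the cache, 2 when a cached row is updated, and 0 when nothing changes; the router then sends 1s to the insert pipeline and 2s to the update pipeline. The function and data below are illustrative, not part of the actual session.

```python
def process(rows, cache):
    """Simulate NewLookupRow for each source row against a dynamic cache."""
    flags = []
    for row in rows:
        cached = cache.get(row["cust_id"])
        if cached is None:
            cache[row["cust_id"]] = dict(row)   # key absent -> insert into cache
            flags.append(1)                     # NewLookupRow = 1 -> insert group
        elif cached != row:
            cache[row["cust_id"]] = dict(row)   # values differ -> update cached row
            flags.append(2)                     # NewLookupRow = 2 -> update group
        else:
            flags.append(0)                     # NewLookupRow = 0 -> dropped by router
    return flags

cache = {}
first_run = [{"cust_id": 80001, "city": "Bangalore"},
             {"cust_id": 80002, "city": "Hyderabad"}]
print(process(first_run, cache))    # [1, 1] -> both rows inserted

second_run = [{"cust_id": 80001, "city": "Chennai"},    # changed
              {"cust_id": 80002, "city": "Hyderabad"}]  # unchanged
print(process(second_run, cache))   # [2, 0] -> one update, one no-op
```

The key point the sketch shows: because the cache is updated as rows flow through, re-running the same data produces 0s, which is why unchanged rows never reach the target.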
Let's start off with the mapping.

Step 3: In the Informatica Designer, import the source and target tables that we just created above.
Step 4: Create a lookup and select the target table to look up.
Step 5: Drag and drop all the source ports onto the lookup transformation. (We now have the target table's ports at the top of the lookup and the source ports below them.)
Step 6: Create an Expression transformation and connect all the ports of the lookup transformation to it. Now add two more ports here:
1. Insert_flg, of integer(15) type, with the expression:
IIF( ISNULL(SK) OR ISNULL(EMPNO), 1, 0 )
2. Update_flg, of integer(15) type, with the expression:
IIF( NOT ISNULL(SK) AND ( (ENAME != ENAME1) OR (JOB != JOB1) OR (MGR != MGR1) OR (HIREDATE != HIREDATE1) OR (SAL != SAL1) OR (COMM != COMM1) OR (DEPTNO != DEPTNO1) ), 1, 0 )
Step 7: Create a Router transformation and connect to it the source ports (the ones suffixed with "1"). Make sure the Insert_flg and Update_flg ports you created are also connected, and define two groups as follows:
Group 1. Insert_rows, with the group filter condition: Insert_flg
Group 2. Update_rows, with the group filter condition: Update_flg
Step 8: Connect the Insert_rows group of the router to the target.
Note: do not connect the SK port from the router here; instead use a Sequence Generator transformation and connect its NEXTVAL port to the SK of the target, so that the sequence advances whenever a new row is added.
Step 9: Connect the Update_rows group of the router to an Update Strategy transformation with the expression DD_UPDATE, and then connect it to target instance 2. (Note: target instance 2 is just a second copy of the same target definition in the mapping; no second table is created in the target database.)
Step 10: Create the workflow and session, and run it. Make sure you commit the changes made on the source side. :-)
The logic is simple:
1. The lookup checks the target/dimension table for the existence of each incoming row.
2. If the SK does not exist, the row is inserted into the target/dimension table.
3. The sequence generator then advances to the next value for the target/dimension table.
4. If the SK exists, the condition falls through to Update_flg and a DD_UPDATE is performed on the corresponding target row.
5. The same process continues with each subsequent row.
6. Note: the SK in the target table must be a primary key, without fail.
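The two flag expressions from Step 6 can be restated in Python for clarity. SK and the "1"-suffixed values come from the lookup on the target; the bare column values come from the source. The sample values are illustrative.

```python
def insert_flg(sk, empno):
    """IIF(ISNULL(SK) OR ISNULL(EMPNO), 1, 0)"""
    return 1 if sk is None or empno is None else 0

def update_flg(sk, src, tgt):
    """IIF(NOT ISNULL(SK) AND (any non-key column differs), 1, 0)"""
    return 1 if sk is not None and any(src[c] != tgt[c] for c in src) else 0

print(insert_flg(None, 7999))                        # 1 -> no SK found, new row
print(update_flg(42, {"sal": 2000}, {"sal": 1000}))  # 1 -> SAL changed
print(update_flg(42, {"sal": 1000}, {"sal": 1000}))  # 0 -> unchanged
```

Note that the two flags are mutually exclusive by construction: a NULL SK can only raise Insert_flg, and a non-NULL SK can only raise Update_flg.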
Slowly Changing Dimensions (SCDs) are dimensions whose data changes slowly, rather than on a regular, time-based schedule.
For example, you may have a dimension in your database that tracks the sales records of your company's salespeople. Creating sales reports seems simple enough, until a salesperson is transferred from one regional office to another. How do you record such a change in your sales dimension? You could sum or average the sales by salesperson, but if you use that to compare the performance of salesmen, that might give misleading information. If the salesperson that was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new sales person, but that creates problems also. Dealing with these issues involves SCD management methodologies:

Type 1: The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name. (Assuming you won't ever need to know how it used to be misspelled in the past.) Here is an example of a database table that keeps supplier information:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA

In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key. Technically, the surrogate key is not necessary, since the table will be unique by the natural key (Supplier_Code). However, the joins will perform better on an integer than on a character string. Now imagine that this supplier moves their headquarters to Illinois. The updated table would simply overwrite this record:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL

The obvious disadvantage to this method of managing SCDs is that there is no historical record kept in the data warehouse.
You can't tell if your suppliers are tending to move to the Midwest, for example. But an advantage to Type 1 SCDs is that they are very easy to maintain.

Explanation with an Example:

Source Table: (01-01-11)        Target Table: (01-01-11)
Empno  Ename  Sal               Empno  Ename  Sal
101    A      1000              101    A      1000
102    B      2000              102    B      2000
103    C      3000              103    C      3000

The necessity of the lookup transformation is illustrated using the above source and target tables.

Source Table: (01-02-11)        Target Table: (01-02-11)
Empno  Ename  Sal               Empno  Ename  Sal
101    A      1000              101    A      1000
102    B      2500              102    B      2500
103    C      3000              103    C      3000
104    D      4000              104    D      4000

In the second month, one more employee (Ename D) has been added to the table, and the salary of employee B has changed to 2500 instead of 2000.
Create a table named emp_source with the three columns shown above in Oracle, and import the source in the Source Analyzer. In the same way, create two target tables named emp_target1 and emp_target2. Go to the Targets menu and click Generate and Execute to confirm the creation of the target tables.
In this mapping we will use four kinds of transformations, namely Lookup, Expression, Filter, and Update Strategy. The necessity and usage of each is discussed in detail below.
Lookup Transformation: the purpose of this transformation is to determine whether to insert, delete, update, or reject rows in the target table.
The first thing we will do is create a Lookup transformation and connect EMPNO from the source qualifier to it, choosing the target table as the lookup table.
What the Lookup transformation does in our mapping is look into the target table (Emp_Target) and compare it with the source qualifier to determine whether to insert, update, delete, or reject rows. In the Ports tab, add a new column named empno1; this is the port we will connect from the source qualifier. For the existing EMPNO column, the Input box should be unchecked while the Output and Lookup boxes remain checked. For the newly created column, only the Input and Output boxes should be checked. In the Properties tab:
(i) Lookup table name -> Emp_Target.
(ii) Lookup policy on multiple match -> Use First Value.
(iii) Connection Information -> Oracle.
In the Condition tab, the Lookup Table Column should be EMPNO, the Transformation Port should be empno1, and the Operator should be =.

Expression Transformation: after the Lookup transformation, we use an Expression transformation to decide whether each record needs to be inserted as-is or updated. The steps to create it follow.
Drag all the columns from both the source and the lookup transformation and drop them onto the Expression transformation. Double-click the transformation, go to the Ports tab, and create two new ports named insert and update. Both of these ports carry output data only, so check only the Output box for them.
The conditions we want our two output ports to evaluate are, following the pattern used above:
insert = IIF( ISNULL(empno1), TRUE, FALSE )
update = IIF( NOT ISNULL(empno1) AND (SAL != SAL1), TRUE, FALSE )
Filter Transformation: we will use two filter transformations, one for inserts and one for updates.
Connect the insert port from the expression transformation to the insert port of the first filter transformation, and in the same way connect the update port to the second filter. Then connect Empno, Ename, and Sal from the expression transformation to both filter transformations. Rows flagged for insert pass through filter transformation 1 to update strategy transformation 1 and appear unchanged in the target table.
Rows flagged for update pass through filter transformation 2 to update strategy transformation 2, which forwards the updated rows to the target table. Go to the Properties tab of each filter transformation:
(i) the filter condition for filter 1 is insert;
(ii) the filter condition for filter 2 is update.
Update Strategy Transformation: Determines whether to insert, delete, update or reject the rows.
Drag the respective Empno, Ename, and Sal ports from the filter transformations and drop them on the corresponding Update Strategy transformations. In the Properties tab, set the update strategy expression to 0 (DD_INSERT) on the first update strategy and to 1 (DD_UPDATE) on the second. Finally, connect the outputs of the update strategy transformations to the target tables.
Change the target load type from Bulk to Normal, then run the workflow from the task.
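The filter-and-update-strategy flow just described can be sketched compactly in Python: rows with insert = 1 go through filter 1 and are flagged DD_INSERT, rows with update = 1 go through filter 2 and are flagged DD_UPDATE, and rows with neither flag are dropped. The flag values and rows below are illustrative.

```python
DD_INSERT, DD_UPDATE = 0, 1

def route(rows):
    """Route each flagged row to its insert or update pipeline."""
    ops = []
    for row in rows:
        if row["insert"]:           # filter 1 passes -> update strategy 1
            ops.append((row["empno"], DD_INSERT))
        elif row["update"]:         # filter 2 passes -> update strategy 2
            ops.append((row["empno"], DD_UPDATE))
        # neither flag set -> row is filtered out entirely
    return ops

rows = [{"empno": 104, "insert": 1, "update": 0},   # new employee
        {"empno": 102, "insert": 0, "update": 1},   # changed salary
        {"empno": 101, "insert": 0, "update": 0}]   # unchanged, dropped
print(route(rows))   # [(104, 0), (102, 1)]
```

This is the same outcome the router-based designs achieve with a single transformation instead of two filters.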
- Identifying new records and inserting them into the dimension table.
- Identifying changed records and updating the dimension table.
We will see the implementation of SCD Type 1 using the customer dimension table as an example. The source table looks as follows:
CREATE TABLE Customers ( Customer_Id Number, Customer_Name Varchar2(30), Location Varchar2(30) )
Now I have to load the data of the source into the customer dimension table using SCD Type 1. The Dimension table structure is shown below.
CREATE TABLE Customers_Dim ( Cust_Key Number, Customer_Id Number, Customer_Name Varchar2(30), Location Varchar2(30) )
Steps to Create SCD Type 1 Mapping: follow the steps below to create the SCD Type 1 mapping in Informatica.
Create the source and dimension tables in the database. Open the Mapping Designer tool's Source Analyzer and either create or import the source definition. Go to the Warehouse Designer (Target Designer) and import the target definition. Go to the Mapping Designer tab and create a new mapping, then drag the source into it. From the toolbar choose Transformation, then Create. Select the Lookup transformation, enter a name, and click Create.
Edit the lkp transformation, go to the Ports tab, and add a new port In_Customer_Id. This new port needs to be connected to the Customer_Id port of the source qualifier transformation.
Go to the condition tab of lkp transformation and enter the lookup condition as Customer_Id = IN_Customer_Id. Then click on OK.
Connect the Customer_Id port of the source qualifier transformation to the IN_Customer_Id port of the lkp transformation. Create an Expression transformation with input ports Cust_Key, Name, Location, Src_Name, and Src_Location, and output ports New_Flag and Changed_Flag. For the output ports, enter the expressions below and click OK.
New_Flag = IIF(ISNULL(Cust_Key),1,0) Changed_Flag = IIF(NOT ISNULL(Cust_Key) AND (Name != Src_Name OR Location != Src_Location), 1, 0 )
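The two expressions above can be restated in Python: a NULL Cust_Key from the lookup means the customer is new, while a non-NULL key with a differing name or location means it has changed. The sample values are illustrative.

```python
def new_flag(cust_key):
    """IIF(ISNULL(Cust_Key), 1, 0)"""
    return 1 if cust_key is None else 0

def changed_flag(cust_key, name, src_name, location, src_location):
    """IIF(NOT ISNULL(Cust_Key) AND (Name != Src_Name OR Location != Src_Location), 1, 0)"""
    return 1 if (cust_key is not None and
                 (name != src_name or location != src_location)) else 0

print(new_flag(None))                               # 1 -> customer not in dimension yet
print(changed_flag(7, "Acme", "Acme", "CA", "IL"))  # 1 -> location changed
print(changed_flag(7, "Acme", "Acme", "CA", "CA"))  # 0 -> nothing to do
```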
Now connect the lkp transformation ports (Cust_Key, Name, Location) to the expression transformation ports (Cust_Key, Name, Location), and the source qualifier transformation ports (Name, Location) to the expression transformation ports (Src_Name, Src_Location) respectively.
Create a Filter transformation and drag the ports of the source qualifier transformation into it, along with the New_Flag port from the expression transformation. Edit the filter transformation, go to the Properties tab, enter the filter condition New_Flag = 1, and click OK. Now create an Update Strategy transformation and connect all the ports of the filter transformation (except New_Flag) to it; in its Properties tab, enter the update strategy expression DD_INSERT. Drag the target definition into the mapping and connect the appropriate ports from the update strategy to it. Finally, create a Sequence Generator transformation and connect its NEXTVAL port to the target surrogate key (Cust_Key) port. This branch of the mapping handles inserting new rows.
Now create another Filter transformation and drag into it the Cust_Key port from the lkp transformation, the Name and Location ports from the source qualifier transformation, and the Changed_Flag port from the expression transformation. Edit the filter, go to the Properties tab, enter the filter condition Changed_Flag = 1, and click OK. Create an Update Strategy transformation, connect the filter ports (Cust_Key, Name, and Location) to it, and set its expression to DD_UPDATE. Drag the target definition into the mapping again and connect the appropriate ports from the update strategy to it. This completes the mapping.
In this article, let's talk about a design which can take care of the scenario we just discussed.
The Theory
When you configure an Informatica PowerCenter session, you have several options for handling database operations such as insert, update, delete.
During session configuration, you can select a single database operation for all rows using the Treat Source Rows As setting in the Properties tab of the session:
1. Insert: treat all rows as inserts.
2. Delete: treat all rows as deletes.
3. Update: treat all rows as updates.
4. Data Driven: the Integration Service follows the instructions coded into Update Strategy transformations to flag rows for insert, delete, update, or reject.
We can create the mapping just like an 'INSERT'-only mapping, without Lookup or Update Strategy transformations. During session configuration, let's set the session properties so that the session can both insert and update.
Now let's set the properties for the target table: choose the Insert and Update else Insert options.
That's all we need to set up the session for update and insert without an update strategy.
Hope you enjoyed this article. Please leave us a comment below, if you have any difficulties implementing this. We will be more than happy to help you.
Update Strategy within a session: when we configure a session, we can instruct the Integration Service either to treat all rows the same way or to use instructions coded into the session mapping to flag rows for different database operations.

Session configuration: Edit Session -> Properties -> Treat Source Rows As: (Insert, Update, Delete, or Data Driven). Insert is the default.

Specifying operations for individual target tables — you can set the following update strategy options:
1. Insert: select this option to insert a row into a target table.
2. Delete: select this option to delete a row from a table.
3. Update: we have the following options in this situation:
   i. Update as Update: update each row flagged for update if it exists in the target table.
   ii. Update as Insert: insert each row flagged for update.
   iii. Update else Insert: update the row if it exists; otherwise, insert it.
4. Truncate: select this option to truncate the target table before loading data.
Flagging Rows within a Mapping: within a mapping, we use the Update Strategy transformation to flag rows for insert, delete, update, or reject.

Operation  Constant   Numeric Value
INSERT     DD_INSERT  0
UPDATE     DD_UPDATE  1
DELETE     DD_DELETE  2
REJECT     DD_REJECT  3

Update Strategy Expressions: frequently, the update strategy expression uses the IIF or DECODE function from the transformation language to test each row against a particular condition. You write these expressions in the Properties tab of the Update Strategy transformation.
IIF( ENTRY_DATE > APPLY_DATE, DD_REJECT, DD_UPDATE )
or, equivalently, using the numeric constants:
IIF( ENTRY_DATE > APPLY_DATE, 3, 1 )
Note: we can configure the Update Strategy transformation either to pass rejected rows to the next transformation or to drop them; see the Properties tab for the option.
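The IIF expression above can be restated in Python using the numeric equivalents of the DD_* constants (DD_UPDATE = 1, DD_REJECT = 3). The dates are illustrative.

```python
from datetime import date

DD_UPDATE, DD_REJECT = 1, 3

def strategy(entry_date, apply_date):
    """IIF(ENTRY_DATE > APPLY_DATE, DD_REJECT, DD_UPDATE)"""
    return DD_REJECT if entry_date > apply_date else DD_UPDATE

print(strategy(date(2011, 2, 1), date(2011, 1, 1)))  # 3 -> entered late, reject
print(strategy(date(2011, 1, 1), date(2011, 2, 1)))  # 1 -> on time, update
```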
Understanding Treat Source Rows property and Target Insert, Update properties
Posted by Ankur Nigam on August 17, 2011. Informatica has a plethora of options to perform IUD (Insert, Update, Delete) operations on tables. One of the most common methods is using the Update Strategy transformation, while the underdog is setting Treat Source Rows As to {Insert; Update; Delete} rather than Data Driven. I will be focusing on the latter in this topic.
In simple terms, when you set the Treat Source Rows As property you tell Informatica that each row is to be tagged as Insert, Update, or Delete. This property, coupled with the target-level properties that allow Insert, Update, and Delete, works wonders even in the absence of an Update Strategy, and it leads to a clear-cut mapping design. I am not opposing the use of Update Strategy, but in some situations this approach gives a more open mapping in which I don't have to peek into the reason for the action the strategy is performing, e.g. IIF(ISNULL(PK)=1, DD_INSERT, DD_UPDATE). Let's buckle up our belts and go on a ride to understand the use of these properties. Assume a scenario where I have the following table structure in stage:
Keeping things simple the target table would be something like this
As you can see, the target has UserID as a surrogate key, which I will populate through a sequence. Also note that Username is unique. Now I have a scenario where I have to update the existing records and insert the new ones as supplied in the staging table. Before we begin writing code, let's first understand TSA (Treat Source Rows As) and the target properties in more detail. Treat Source Rows As accepts four settings:
1. Insert :- Informatica marks all rows read from the source as Insert; the rows will only be inserted.
2. Update :- Informatica marks all rows read from the source as Update; when rows arrive at the target they are to be updated in it.
3. Delete :- rows are marked as to be deleted from the target once they have been read from the source.
4. Data Driven :- this indicates to Informatica that we are using an Update Strategy transformation to decide what has to be done with each row, so no marking is done when rows are read from the source; what happens to a row arriving at the target is decided immediately before any IUD operation on the target.
However, setting TSA alone will not let you modify rows in the target. Each target in itself must allow IUD operations. So when you have set the TSA property, you also have to set the target-level properties governing whether rows can be inserted, updated, or deleted in the target. This can be done in the following ways:
Insert and Delete are self-explanatory; Update, however, is categorized into three options. Please note that setting any of them will allow updates on your tables:
1. Update as Update :- if a row arrives at the target, it is updated there. If you check the logs, Informatica generates an update template something like UPDATE INFA_TARGET_RECORDS SET EMAIL = ? WHERE USERNAME = ?
2. Update as Insert :- when a row flagged for update arrives at the target, the update behaviour is to insert the row instead. Informatica will not generate any update template for the target; the incoming row will be inserted using the template INSERT INTO INFA_TARGET_RECORDS (USERID, USERNAME, EMAIL) VALUES (?, ?, ?)
3. Update else Insert :- the incoming row flagged as update is either updated or inserted. In a nutshell, if the key column of the incoming row also exists in the target, Informatica intelligently updates that row in the target; if the incoming key column is not present in the target, the row is inserted.
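The Update else Insert behaviour described above can be sketched as a small Python model: the update is tried first for the key, and only if the key is absent does the insert fire. The table name and data are illustrative, not from the actual session.

```python
def update_else_insert(row, table):
    """row is flagged as update; table maps USERNAME -> row dict."""
    if row["username"] in table:
        table[row["username"]]["email"] = row["email"]  # UPDATE ... WHERE USERNAME = ?
        return "updated"
    table[row["username"]] = dict(row)                  # INSERT INTO ... VALUES (?, ?, ?)
    return "inserted"

table = {"ankur": {"username": "ankur", "email": "old@example.com"}}
print(update_else_insert({"username": "ankur", "email": "new@example.com"}, table))  # updated
print(update_else_insert({"username": "guest", "email": "g@example.com"}, table))    # inserted
```

The fallback insert in the second branch is exactly why the target's Insert property must also be enabled, as the PS below points out.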
PS :- the last two options also require you to set the Insert property of the target, because if it is not checked then Update as Insert and Update else Insert will not work, and the session will fail stating that the target does not allow inserts. Why? Well, it's simple: these update clauses have an insert hidden in them. OK, enough theory? Fine, let's get our hands dirty. Coming back to our scenario: we have rows read from the source and want each either inserted or updated in the target, depending on whether it is already present there. My mapping looks something like this:
Here I have used a lookup to fetch the UserID for each username coming from stage. In the router, the following groups have been set:
The output from the router is sent to the respective instance of the target (INFA_TARGET_RECORDS) depending on whether the user exists or not: INFA_TARGET_RECORDS_NEW for new records and INFA_TARGET_RECORDS_UPD for existing records. Once this is in place, I set the Treat Source Rows As property to Update for this session. To enable Informatica to insert into the table I also have to:
1. Set the Insert and Update as Insert properties on the INFA_TARGET_RECORDS_NEW instance.
2. Set the Update as Update property on the INFA_TARGET_RECORDS_UPD instance in the session.
What actually happened is that I treated all rows from the source as flagged for update. Secondly, I modified the update behaviour of the new-records instance to Update as Insert; due to this, the update actually allowed me to insert rows into the target. When the session runs it updates the existing rows and inserts the new ones (actually update-as-insert). Try it out and let me know if it works for you. I am not attaching a run demo because it is better if you do it yourself and understand even more clearly what happens behind the scenes.
- Insert rows from the source that don't exist in the target.
- Update rows that have changed.
- Delete rows from the target that no longer exist in the source.
DECODE(Source_Field2, Target_Field2, 1, 0) = 0 )
   2. Modify as needed to compare all non-key fields.
8. Insert an Update Strategy transformation after the Filter transformation, using DD_UPDATE.
   1. Connect this transformation to the Target.
9. Connect the Insert group in the Router transformation to the Target.
Please leave a comment if you have questions on this Informatica mapping process.
SCD - Type 1
Slowly Changing Dimensions (SCDs) are dimensions that have data that changes slowly, rather than changing on a time-based, regular schedule For example, you may have a dimension in your database that tracks the sales records of your company's salespeople. Creating sales reports seems simple enough, until a salesperson is transferred from one regional office to another. How do you record such a change in your sales dimension? You could sum or average the sales by salesperson, but if you use that to compare the performance of salesmen, that might give misleading information. If the salesperson that was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new sales person, but that creates problems also. Dealing with these issues involves SCD management methodologies: Type 1: The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name. (Assuming you won't ever need to know how it used to be misspelled in the past.) Here is an example of a database table that keeps supplier information: Supplier_Key Supplier_Code Supplier_Name Supplier_State 123 ABC Acme Supply Co CA
In this example, Supplier_Code is the natural key and Supplier_Key is asurrogate key. Technically, the surrogate key is not necessary, since the table will be unique by the natural key (Supplier_Code). However, the joins will perform better on an integer than on a character string. Now imagine that this supplier moves their headquarters to Illinois. The updated table would simply overwrite this record: Supplier_Key Supplier_Code Supplier_Name Supplier_State 123 ABC Acme Supply Co IL The obvious disadvantage to this method of managing SCDs is that there is no historical record kept in the data warehouse. You can't tell if your suppliers are tending to move to the Midwest, for example. But an advantage to Type 1 SCDs is that they are very easy to maintain. Explanation with an Example: Source Emp no 101 102 103 Emp no 101 102 103 Table: (01-01-11) Target Table: (01-01-11) Ename Sal A 1000 B 2000 C 3000 Ename A B C Sal 1000 2000 3000
The necessity of the lookup transformation is illustrated using the above source and target table. Source Table: (01-02-11) Target Table: (01-02-11) Emp no 101 102 103 104 Ename A B C D Sal 1000 2500 3000 4000 Empno 101 102 103 104 Ename A B C D Sal 1000 2500 3000 4000
In the second month, one more employee (Ename D, Empno 104) has been added to the table, and employee 102's salary has changed from 2000 to 2500. Step 1: Import the source table and the target table.
Create a table named emp_source with the three columns shown above in Oracle. Import the source from the Source Analyzer. In the same way, create two target tables named emp_target1 and emp_target2. Go to the Targets menu and click Generate and Execute to confirm the creation of the target tables. A snapshot of the connections between the different transformations is shown below.
Step 2: Design the mapping and apply the necessary transformations. In this mapping we use four kinds of transformations: Lookup Transformation, Expression Transformation, Filter Transformation, and Update Strategy Transformation. The necessity and usage of each transformation is discussed in detail below. Lookup Transformation: The purpose of this transformation is to determine whether to insert, delete, update, or reject rows in the target table.
The first thing we are going to do is create a Lookup Transformation and connect the Empno port from the Source Qualifier to it. A snapshot of choosing the target table is shown below.
What the Lookup Transformation does in our mapping is look into the target table (emp_target) and compare it with the Source Qualifier to determine whether to insert, update, delete, or reject rows. In the Ports tab, add a new column and name it Empno1; this is the column we are going to connect from the Source Qualifier. For the first column (Empno), the Input box should be unchecked, whereas the Output and Lookup boxes should be checked. For the newly created column, only the Input and Output boxes should be checked. In the Properties tab: (i) Lookup Table Name -> Emp_Target.
(ii) Lookup Policy on Multiple Match -> Use First Value. (iii) Connection Information -> Oracle.
In the Condition tab, the Lookup Table Column should be Empno, the Transformation Port should be Empno1, and the Operator should be =. Expression Transformation: After we are done with the Lookup Transformation, we use an Expression Transformation to check whether each record needs to be inserted as a new record or used to update an existing one. The steps to create an Expression Transformation are shown below.
Drag all the columns from both the source and the Lookup Transformation and drop them onto the Expression Transformation. Now double-click the transformation, go to the Ports tab, and create two new columns named insert and update. Both of these columns will be output data, so only the Output check box should be checked for them. A snapshot of the Edit Transformation window is shown below.
The conditions that we want to apply to the two output ports are listed below.
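The exact expressions are shown only in the screenshot, but for an SCD Type 1 mapping of this shape the two output ports typically reduce to the logic sketched below. This is an assumption-laden sketch: `lkp_empno` and `lkp_sal` stand in for the values returned by the lookup (the Empno1 match and the target's Sal), with `None` modelling a lookup miss.

```python
def insert_flag(lkp_empno):
    # Insert when the lookup finds no matching Empno in the target,
    # i.e. the row is new. Mirrors IIF(ISNULL(lookup), 1, 0).
    return 1 if lkp_empno is None else 0

def update_flag(lkp_empno, src_sal, lkp_sal):
    # Update when the row already exists but the salary differs.
    return 1 if (lkp_empno is not None and src_sal != lkp_sal) else 0

print(insert_flag(None))             # 1 -> row 104 goes to the insert path
print(update_flag(102, 2500, 2000))  # 1 -> row 102 goes to the update path
print(update_flag(101, 1000, 1000))  # 0 -> unchanged row is dropped
```

The two flags feed the two filter conditions described in the next section.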
Filter Transformation: We are going to have two Filter Transformations, one to insert and the other to update.
Connect the insert column from the Expression Transformation to the insert column in the first Filter Transformation, and in the same way connect the update column from the Expression Transformation to the update column in the second filter. Then connect Empno, Ename, and Sal from the Expression Transformation to both Filter Transformations. If a row is new, Filter Transformation 1 forwards it to Update Strategy Transformation 1, and it appears in the target table as an insert. If a row's input data has changed, Filter Transformation 2 forwards it to Update Strategy Transformation 2, which forwards the updated row to the target table. Go to the Properties tab in the Edit Transformation window:
(i) The value for the filter condition in the first filter is insert. (ii) The value for the filter condition in the second filter is update.
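The routing performed by the two filters can be sketched as follows, assuming each row arrives from the Expression Transformation carrying the insert and update flag ports (a hypothetical tuple layout chosen for illustration):

```python
# Rows as (empno, ename, sal, insert_flag, update_flag) tuples.
rows = [
    (101, 'A', 1000, 0, 0),  # unchanged -> dropped by both filters
    (102, 'B', 2500, 0, 1),  # changed   -> passes the update filter
    (104, 'D', 4000, 1, 0),  # new       -> passes the insert filter
]

to_insert = [r[:3] for r in rows if r[3] == 1]  # filter condition: insert
to_update = [r[:3] for r in rows if r[4] == 1]  # filter condition: update

print(to_insert)  # [(104, 'D', 4000)]
print(to_update)  # [(102, 'B', 2500)]
```

Note that a row passing neither filter is simply dropped, which is how unchanged rows are kept out of the target load.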
Update Strategy Transformation: Drag the respective Empno, Ename, and Sal ports from the Filter Transformations and drop them onto the respective Update Strategy Transformations. In the Properties tab, set the update strategy expression to 0 (DD_INSERT) on the first Update Strategy Transformation, and to 1 (DD_UPDATE) on the second. Finally, connect the outputs of the Update Strategy Transformations to the target table.
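The net effect of the whole mapping (lookup, expression flags, filters, and update strategies) can be sketched end to end. This is a simplified model of what the session does, using an in-memory SQLite table in place of the Oracle target and the example data from the tables above:

```python
import sqlite3

# Target table as of 01-01-11.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_target "
             "(empno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER)")
conn.executemany("INSERT INTO emp_target VALUES (?,?,?)",
                 [(101, 'A', 1000), (102, 'B', 2000), (103, 'C', 3000)])

# Source rows as of 01-02-11.
source = [(101, 'A', 1000), (102, 'B', 2500), (103, 'C', 3000), (104, 'D', 4000)]

for empno, ename, sal in source:
    # Lookup on Empno into the target (the lookup condition Empno = Empno1).
    hit = conn.execute("SELECT sal FROM emp_target WHERE empno = ?",
                       (empno,)).fetchone()
    if hit is None:
        # Lookup miss -> DD_INSERT path (update strategy expression 0).
        conn.execute("INSERT INTO emp_target VALUES (?,?,?)",
                     (empno, ename, sal))
    elif hit[0] != sal:
        # Existing row with changed salary -> DD_UPDATE path (expression 1).
        conn.execute("UPDATE emp_target SET ename = ?, sal = ? WHERE empno = ?",
                     (ename, sal, empno))
    # Unchanged rows pass neither filter and produce no DML.

rows = conn.execute("SELECT * FROM emp_target ORDER BY empno").fetchall()
print(rows)
# [(101, 'A', 1000), (102, 'B', 2500), (103, 'C', 3000), (104, 'D', 4000)]
```

The result matches the target table for 01-02-11 shown earlier: employee 104 inserted and employee 102's salary overwritten, with no history of the old value, which is exactly Type 1 behavior.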
Don't check the Truncate Target Table option. Change the Target Load Type from Bulk to Normal. Run the workflow from the task.