
Agenda

Teradata Architecture
Teradata Utilities (Client Tools)
BTEQ
FastLoad
FastExport
MultiLoad
TPump
BTEQ
• BTEQ is available on every Teradata system ever built, because the
Basic Teradata Query (BTEQ) tool was the original way to submit
SQL to Teradata and return an answer set in a desired format.
• BTEQ is also an excellent tool for importing and exporting data.
• BTEQ sessions provide a quick and easy way to access a Teradata
RDBMS. In a BTEQ session, you can:
- enter Teradata SQL statements to view, add, modify, and
delete data.
- enter BTEQ commands.
- enter operating system commands.
• Interactive mode - start a BTEQ session and submit commands to
the database as needed.
• Batch mode - prepare scripts or macros, and then submit them to
BTEQ for processing (see the example below).
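For example, a batch-mode job is typically run by redirecting a script file into BTEQ from the command line (a minimal sketch; the file names are illustrative):

$ bteq < myscript.bteq > myscript.out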
The BTEQ Command Set
The BTEQ command set can be categorized as:
Session control - begin and end BTEQ sessions, and
control session characteristics.
File control - specify input and output formats and
identify information sources and destinations.
Sequence control - control the sequence in which
other BTEQ commands and Teradata SQL statements
will be executed within scripts and macros.
Format control - control the format of screen and
printer output.
Session Control Commands
Use the following BTEQ commands to begin, control, and end
sessions.
LOGON - start a BTEQ session.
SESSIONS - specify the number of sessions to use with the
next LOGON command.
LOGOFF - end the current sessions without exiting BTEQ.
EXIT or QUIT - end the current sessions and exit BTEQ.
ABORT - abort any active requests and transactions
without exiting BTEQ.
SHOW CONTROLS - display the current configuration of the
BTEQ control command options.
SHOW VERSIONS - display the BTEQ version number, module
revision numbers, and linking date.
SET SESSION TRANSACTION - specify whether transaction
boundaries are determined by Teradata SQL semantics or ANSI
semantics.
COMPILE - create or replace a Teradata stored procedure.
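For example, a short script exercising the session-control commands might look like this (a minimal sketch; the tdpid, user, and password are illustrative, and SET SESSIONS must come before LOGON):

.SET SESSIONS 2
.LOGON tdp1/user1,password1
.SHOW VERSIONS
.LOGOFF
.EXIT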

File Control Commands


OS - execute an MS-DOS or UNIX command from within the
BTEQ environment.
TSO - execute an MVS TSO command from within the BTEQ
environment.
RUN - execute Teradata SQL requests and BTEQ commands from
a specified run file.
REPEAT - submit the next request a specified number of times.
FORMAT - enable or inhibit the page-oriented format command
options.
EXPORT - open a file with a specific format to transfer
information from the Teradata RDBMS.
INDICDATA / RECORDMODE - specify the response mode, either
Field mode, Indicator mode, or Record mode, for data selected
from the Teradata RDBMS.
Sequence Control Commands
HANG - pause BTEQ processing for a specified period of time.
ERRORLEVEL - assign severity levels to errors.
IF...THEN... - test the validity of the condition stated in the IF
clause.
GOTO - skip all intervening commands and resume processing at
the specified LABEL.
LABEL - identify the point at which BTEQ resumes processing,
as specified in a previous GOTO command.
MAXERROR - designate a maximum error severity level beyond
which BTEQ terminates job processing.
REMARK - place a specified string on the standard output
stream.
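A common use of these commands is conditional branching on an error code, as in this sketch (the database and table names are illustrative). Here SEVERITY 0 makes error 3807 (object does not exist) non-fatal, so the DROP can fail harmlessly on a first run, while a nonzero ERRORCODE from the CREATE skips the rest of the script:

.SET ERRORLEVEL 3807 SEVERITY 0
DROP TABLE mydb.stage_emp;
CREATE TABLE mydb.stage_emp (empno INTEGER, name VARCHAR(30));
.IF ERRORCODE <> 0 THEN .GOTO quit
INSERT INTO mydb.stage_emp VALUES (1, 'Ramesh');
.LABEL quit
.LOGOFF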
Format Control Commands
RETLIMIT - specify the maximum number of rows and/or
columns displayed or written in response to a SQL request.
RETCANCEL - cancel a request when the value specified by
the RETLIMIT command ROWS option is exceeded.
UNDERLINE - display a row of dash characters whenever
the value of a specified column changes.
SKIPLINE - insert a blank line in a report whenever the
value of a specified column changes.
IMPORT - open a file with a specific format to transfer
information to the Teradata RDBMS.
EXPORT - open a file with a specific format to transfer
information from the Teradata RDBMS.
FOOTING - specify a footer to appear at the bottom of every
page of a report.
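For example, a simple report export might combine several of these commands (a sketch; the file and table names are illustrative):

.SET RETLIMIT 100
.SET FOOTING "End of Department Report"
.EXPORT REPORT FILE = dept_report.txt
SELECT * FROM TRAINING.DEPT;
.EXPORT RESET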
Starting and Exiting BTEQ
Logging On in Interactive Mode
1. Enter the LOGON command as follows:
.LOGON tdpid/userid
2. Enter your RDBMS password:
Password: ____________
Logging On in Batch Mode
Submit the LOGON command in an input file, including the
password, as follows:
.LOGON tdpid/userid, password
Logging Off the Teradata RDBMS / Exiting BTEQ
Enter the LOGOFF command at the BTEQ command prompt:
.LOGOFF
Enter either the EXIT or QUIT command:
.EXIT or .QUIT
Request Types
There are two types of Teradata SQL requests. A single-
statement request is a single Teradata SQL statement sent
as a request. A multistatement request is two or more
statements that are sent as a request.
Single-Statement Example
BTEQ submits the following statements to the Teradata
RDBMS as three single-statement requests:
SELECT * FROM Employee;
DELETE FROM Employee WHERE Name =
'Ramesh' AND Empno = 10014;
SELECT Name FROM Employee;
Multistatement Example
To submit the same three statements as a multistatement
request, enter:
SELECT * FROM Employee
; DELETE FROM Employee WHERE Name = 'Ramesh'
AND Empno = 10014
; SELECT Name FROM Employee;
BTEQ does not submit any of the statements to the
Teradata RDBMS until it encounters a semicolon as the
last nonblank character of a line. At that time, BTEQ sends
all of the statements to the Teradata RDBMS for
processing as one single request.
Handling Errors
BTEQ error handling involves these elements:
Teradata RDBMS error codes
BTEQ return codes
Error severity levels
Maximum errorlevel
Stored procedure compilation errors.
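In a batch job, the BTEQ return code can be inspected from the shell to drive downstream processing (a sketch; the script name is illustrative):

$ bteq < nightly_load.bteq > nightly_load.out 2>&1
$ echo $?

A return code of 0 means the job completed without errors; a nonzero value reflects the highest error severity level encountered.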
Transactions in Teradata (BTEQ) Mode
Often in Teradata we'll see multiple queries within
the same transaction. We can use the BT/ET keywords to
bundle several queries into one transaction.
For example:
BT;
UPDATE empTable
SET lastname = 'kumar' WHERE firstname = 'raj';
SELECT * FROM empTable WHERE firstname = 'raj';
ET;
Make sure your syntax is correct when using BT and ET,
because a failed statement rolls back the entire transaction,
which can be a massive rollback.
Transactions in ANSI Mode
To change to ANSI mode, type '.set session
transaction ANSI', and be sure to do it before you actually
log on to BTEQ. Most queries written for ANSI mode will also
run in Teradata mode and vice versa, although the transaction
semantics differ.
All transactions must be committed explicitly by the user
with the word 'COMMIT'. Also, in ANSI mode, after any
DDL statement (CREATE, DROP, ALTER, DATABASE) you
have to issue the 'COMMIT' command immediately.
ANSI mode is useful because when you bundle several
queries into one transaction and one fails, only the failed
statement is rolled back; the other statements stay in effect
until you COMMIT or ROLLBACK.
For example:

.set session transaction ansi
.logon Tclass/sql01
password: *****
*** Logon successfully completed
UPDATE empTable
SET lastname = 'kumar' WHERE firstname = 'raj';
SELECT * FROM empTable WHERE firstname = 'raj';
SELECT * FROM empXXX;
COMMIT;
FastLoad
• FastLoad is used for loading large amounts of data into Teradata tables.
• Only one table can be loaded per job.
• The target table must be empty and have no secondary indexes.
• Full restart capability.
• It doesn't load duplicate records, even if the target table is a multiset
table.
Two Phases of FastLoad

Phase 1
• FastLoad uses one SQL session to define the AMP steps.
• Each record is hashed and redistributed to the AMP responsible for
its hash value.
• The PE sends blocks to the AMPs, which store the blocks of
unsorted data records.

Phase 2
• Each AMP sorts the target table, puts the rows into blocks, and writes the
blocks to disk.
• Fallback rows are then generated if required.
Error Tables

Error Table 1
Contains one row for each row which failed to be loaded
due to constraint violations or translation errors.

Error Table 2
Captures any error related to duplication of values for a
Unique Primary Index (UPI). FastLoad keeps only one
occurrence of the value and stores the duplicate occurrence
in the second error table. However, if the entire row is a
duplicate, FastLoad counts it but does not store the row.
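After a load completes, the error tables can be examined with ordinary SQL. A sketch, using the error table names from the sample script later in this section; in error table 1 the ErrorCode and ErrorFieldName columns record why and where each row was rejected:

SELECT ErrorCode, ErrorFieldName, COUNT(*)
FROM TRAINING.DEPTERR1
GROUP BY 1, 2;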
FastLoad Commands
FastLoad Script

/* Number of sessions */
SESSIONS 4;

/* Maximum number of errors allowed to occur */
ERRLIMIT 50;

.logon 127.0.0.1/DBC,DBC;

/* Drop error tables left over from an earlier run */
DROP TABLE TRAINING.DEPTERR1;
DROP TABLE TRAINING.DEPTERR2;

DELETE FROM TRAINING.DEPT;

BEGIN LOADING TRAINING.DEPT ERRORFILES TRAINING.DEPTERR1, TRAINING.DEPTERR2;

/* Specify the type of input file: variable-length text, space-delimited */
SET RECORD VARTEXT " ";

/* Define the columns in the flat file and name the input file to load from */
DEFINE
DEPT_NO (VARCHAR(20)),
DEPT_NAME (VARCHAR(50))
FILE = C:\Desktop\Sample_fload.txt;

/* Insert rows into the table */
INSERT INTO TRAINING.DEPT
VALUES
(:DEPT_NO,
:DEPT_NAME
);

.END LOADING;
.LOGOFF;
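The completed script is then redirected into the fastload utility from the command line: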
$ fastload < emp101.fl
Limitations
• If an AMP goes down, FastLoad cannot be restarted until it is back online.
• Concatenation of input data files is not allowed.
• NO SECONDARY INDEXES ARE ALLOWED ON THE TARGET TABLE -
FastLoad can load only tables with a primary index and no secondary
indexes. If the table has a secondary index, FastLoad will not load it.
• NO REFERENTIAL INTEGRITY IS ALLOWED - FastLoad cannot load
data into tables that are defined with Referential Integrity (RI). Enforcing
referential constraints against another table would require too much
system checking.
• DUPLICATE ROWS ARE NOT SUPPORTED - Multiset tables are tables
that allow duplicate rows, that is, rows in which the values in every
column are identical. Even when loading a multiset table, FastLoad
discards any duplicate rows it finds.
FastExport

• Exports large volumes of formatted data from Teradata to a host
file or user-written application.
• Uses multiple sessions to quickly transfer large amounts of data.
• Exports data from any table or view on which the user has the
SELECT privilege.
• Fully automated restart capability.
FastExport Steps
FastExport Commands
Sample FastExport Script

.logtable RestartLog;        /* define the restart log */
.run file logon;             /* run the logon command from a file */
.begin export sessions 12;   /* specify the number of sessions */
.export outfile dataout;     /* name the destination file */
select * from Customer;      /* the SELECT to export */
.end export;                 /* send the request */
.logoff;                     /* terminate the sessions */
Invoking FastExport
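FastExport is invoked by redirecting the script into the fexp executable (a sketch; the script and output file names are illustrative):

$ fexp < export_customer.fxp > export_customer.out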
MultiLoad
MultiLoad has many features that make it appealing
for maintaining large tables:

• Support for up to five tables per script.
• Tables may contain pre-existing data, but cannot have
Unique Secondary Indexes or Referential Integrity.
• It can be used to do fast, high-volume maintenance on
multiple tables and views.
• Each MultiLoad import task can perform multiple
INSERTs, UPDATEs, DELETEs, and UPSERTs (UPDATE if
the row exists, else INSERT) on up to five different tables or views.
• Each MultiLoad delete task can remove large numbers of
rows from a single table.
• Full restart capability using a log file.

(Diagram: a .ml script and .dat input files feed MultiLoad, which writes
a .log file and uses the data table (e.g. EMP101), a work table (WT),
error tables (ET and UV), and a restart log table (LG).)
MultiLoad

(Diagram: a host or server sends INSERT, UPDATE, and DELETE
operations through MultiLoad to up to five target tables,
Table A through Table E.)
MultiLoad Tasks
MultiLoad allows INSERT, UPDATE, DELETE and UPSERT
operations against up to five target tables per task.
Two distinct tasks are:
IMPORT task:
These tasks intermix a number of different SQL/DML
statements and apply them to up to five different tables,
depending on the APPLY conditions.
DELETE task:
These tasks execute a single DELETE statement on
a single table.
5 Phases in MultiLoad
1. Preliminary phase - parses and validates the script, and sets up the
work, error, and log tables.
2. DML Transaction phase - sends the DML statements to Teradata,
where they are stored for later execution.
3. Acquisition phase - reads the input data and distributes it to the
work tables on the AMPs.
4. Application phase - applies the work-table rows to the target tables,
block by block.
5. Cleanup phase - releases locks and drops the work tables when the
job ends successfully.
Tables in MultiLoad
MLoad uses 2 error tables (ET, UV), 1 work table, and 1 log table.
1. ET Table - Data errors
a. Also called the ACQUISITION PHASE ERROR TABLE.
b. Used to store data errors found during the acquisition phase of a MultiLoad
import task.
2. UV Table - UPI violations
a. Also called the APPLICATION PHASE ERROR TABLE.
b. Used to store data errors found during the application phase of a MultiLoad
import or delete task.
3. Work Table - WT
a. MLoad loads the selected records into the work table.
4. Log Table
a. Maintains records of all checkpoints related to the load job; it is
mandatory to specify a log table in an MLoad job.
b. This table is useful when a job aborts or restarts for any reason.
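When the error and work tables are not named explicitly, MultiLoad generates their names from the target table name. Cleaning up after abandoning a job against a hypothetical customers table might therefore look like this sketch (assuming the default ET_/UV_/WT_ prefixes and the log table name from the sample script below):

DROP TABLE ET_customers;
DROP TABLE UV_customers;
DROP TABLE WT_customers;
DROP TABLE dwlogtable;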
MultiLoad Commands
Sample MultiLoad Script

.LOGTABLE dwlogtable;
.LOGON tdp1/etltoolsinfo,dwpwd1;

.begin import mload tables customers;

.layout custlayout;
/* with VARTEXT input, all layout fields must be VARCHAR */
.field ID * VARCHAR(10);
.field CUST_ID * VARCHAR(6);
.field CUST_NAME * VARCHAR(30);
.field CUST_GROUP * VARCHAR(30);
.field CUST_SEGMENT * VARCHAR(10);
.field CUST_COUNTRY_ID * VARCHAR(3);

.dml label custdml;
insert into customers.*;

.import infile /dw/input/Dwh_cust_extract.txt format VARtext ';'
layout custlayout
apply custdml;

.end mload;
.logoff;
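As with FastLoad, the finished script is redirected into the utility from the command line: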
$ mload < load_cust_extract.mload
Restarting a MultiLoad Job
Restarting MultiLoad:
MultiLoad can be paused by errors encountered during the job:
1. MultiLoad checks the restart log table and automatically resumes
the load process from the last successful CHECKPOINT before the failure
occurred.
2. Suppose Teradata experiences a reset while MultiLoad is running. In
this case, the host program restarts MultiLoad after Teradata is back up and
running, without user interaction.
3. If a host mainframe or network client fails during a MultiLoad, or the
job is aborted, you may simply resubmit the script without changing anything.
MultiLoad will find out where it stopped and start
again from that very spot.
4. If MLoad failed in the Acquisition phase, just rerun the job.
5. If MLoad failed in the Application phase, you can restart it by
simply resubmitting the job; it will resume from the checkpoint of the last
block that was written to disk. If you do not want to restart it, you need to
drop the two error tables, the work table, and the log table, and release the
MLoad lock on the target table.
Releasing a MultiLoad Job
MultiLoad creates 2 error tables and 1 work table.
When MultiLoad fails in the Acquisition phase, unlock the target
table with:

RELEASE MLOAD <table name>;

When MultiLoad fails in the Application phase, unlock the target
table with:

RELEASE MLOAD <table name> IN APPLY;
Limitations:
1. The RELEASE MLOAD command is used to release the locks and roll back
the job. But if you have been loading many millions of rows, the rollback
may take a long time. For this reason, most customers would rather just go
ahead and RESTART.
2. Be very cautious when using the RELEASE command. It can
potentially leave your table half updated.
MultiLoad Limitations

• The MultiLoad utility doesn't support the SELECT statement.
• Concatenation of multiple input data files is not allowed.
• MultiLoad doesn't support arithmetic functions (e.g. ABS,
LOG) in the MLoad script.
• MultiLoad doesn't support exponentiation or aggregate
operators (e.g. AVG, SUM) in the MLoad script.
• MultiLoad doesn't support USIs (Unique Secondary
Indexes), Referential Integrity, Join Indexes, Hash
Indexes, or Triggers.
• Import tasks require use of the Primary Index.
TPump
• Allows near real-time updates from transactional
systems into the warehouse.
• Performs INSERT, UPDATE, and DELETE
operations, or a combination, on more than 60 tables
at a time from the same source.
• An alternative to MultiLoad for low-volume batch
maintenance of large databases.
• Allows target tables to:
- have secondary indexes and Referential Integrity
constraints.
- be MULTISET or SET.
- be populated or empty.
TPump
• Supports automatic restarts
• No session limit
• Uses row-hash locks, allowing concurrent updates on
the same table
• User can specify how many updates occur minute by
minute
• No limit to the number of concurrent instances
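A TPump job script looks much like a MultiLoad script. The sketch below (all object and file names are illustrative) uses PACK to bundle 20 statements per request and RATE to cap throughput at 1200 statements per minute, which is how the minute-by-minute control mentioned above is expressed; the DML label performs an upsert:

.LOGTABLE mydb.tpump_restartlog;
.LOGON tdp1/etluser,etlpass;
.BEGIN LOAD SESSIONS 4 PACK 20 RATE 1200 ERRORTABLE mydb.cust_tpump_err;
.LAYOUT custlayout;
.FIELD CUST_ID * VARCHAR(6);
.FIELD CUST_NAME * VARCHAR(30);
.DML LABEL upsertcust DO INSERT FOR MISSING UPDATE ROWS;
UPDATE customers SET cust_name = :CUST_NAME WHERE cust_id = :CUST_ID;
INSERT INTO customers (cust_id, cust_name) VALUES (:CUST_ID, :CUST_NAME);
.IMPORT INFILE /dw/input/cust_delta.txt FORMAT VARTEXT ';'
LAYOUT custlayout APPLY upsertcust;
.END LOAD;
.LOGOFF;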
Tpump Limitations
• No concatenation of input data files is allowed.
• TPump will not process aggregates, arithmetic functions, or
exponentiation.
• The use of the SELECT statement is not allowed.
• There is a limit of four IMPORT commands within a single TPump
"load" task.
• TPump performance will be diminished if Access Logging is
used. (TPump uses normal SQL to accomplish its tasks; if
you use Access Logging for successful table updates,
Teradata makes an entry in the Access Log table for
each operation. This creates the potential for row-hash
conflicts between the Access Log and the target tables.)
TPump vs MultiLoad
• MultiLoad performance improves as the volume of
changes increases.
• TPump does better on relatively low volumes of
changes.
• TPump uses macros to modify tables rather than actual
DML commands; MultiLoad uses the DML statements
themselves.
• TPump uses row-hash locking to allow concurrent
read and write access to target tables.
• MultiLoad locks tables for write access (Phase 2) until
it completes.
Choosing the Best Utility
