Teradata Fastload – use effectively
Teradata Fastload is a very fast and efficient tool to load huge amount of data into a Teradata database. It has good documentation to browse as a reference, however there are a few small things good to know to use it more comfortably – either in challenged situations.
Tip1: How to understand “Error1” table
When a record is failed to load because of any Teradata error (conversion, null problem, etc), the record get into “Error1” table, which is set in the “BEGIN LOADING” section of the control script.
The structure of the “Error1” table is fix:
Which column was problematic
The Teradata error code, lookup in DBC.ErrorMsgs if documentation is not at your hands
This filed contains the record transformed into hexa string format
Unfortunately DataParcel’s hexadecimal codes are unreadable for a human in Queryman or other GUIs.
With this short BTEQ script you can export the original record layout into a file:
.SET RECORDMODE ON
.EXPORT DATA FILE=badrecord.out
SELECT DataParcel FROM <Errortable1>;
Tip2: How to switch off deduplication?
There is no way to switch it off. Teradata Fastload will do a full-record-layout deduplication. No matter we use nonunique primary index (NUPI) and MULTISET table definition Teradata will deduplicate eagerly.
If you must load each records, use a preprocessor script (PERL, AWK, etc), which adds a running number column to your data. This way there will be no identical records at all, surely.
Tip3: Deduplication vs. “Error2” table
Why do we have “Error2” table, if Teradata Fastload always does a deduplication? This error table contains those records which passed the full-record-layout deduplication, but failed at the unique primary index (UPI) duplication check. So the data contained other record with the same PI value, but differ in one or more other columns. To load those records either, user NUPI or NO PI table.
Tip4: What is the good “Sessions” number?
SESSIONS is a parameter to be specified at the “BEGIN LOADING” section. Can fall between 1 and “Number-of-AMPs”. The bigger is the number, the better is the loading performance, but the longer is the logon time. So for a small file use 1, for large number of rows, use the maximum allowed. For medium files I generally use 10.
If a lot of small different files to different tables are to be loaded, mind using other technique, since Teradata Fastload will not have advantages in this league.
Tip5: Table is “Being loaded”…
When a Teradata Fastload session is stopped for reaching error limit, the target and the error tables cannot be accessed due to locking:
Error 2652: Operation not allowed: <db>.<table> is being Loaded.
Run this Fastload script to release the locks and get access to the tables:
BEGIN LOADING <db>.<table>;
After this script finished the error tables are readable to identify the load problem root causes, see the “Error1” table tip above.