Teradata performance optimization project
IV. Testing, Delivery and the Results

Successful Teradata optimization with PTA

The results are closing in…

We have arrived at the final part; soon we'll know all the details of how we managed to reach a remarkable 30% performance gain. From the previous parts of this series we already know which queries to tune and what their optimized versions look like.

The only things ahead are to test, go live and harvest the glory 🙂 Let's go!

Testing methodology

Our concept treats the target SQL as a “black box” that receives inputs and produces outputs; we pry into the details only if necessary. We want to modify this box so that it produces the same outputs for the same inputs, while consuming fewer resources / running faster. The testing methodology therefore follows the same schema: feed in the same input the original SQL got, and compare the outputs.

What input should be used?

Theoretically, we should use input data that covers all the “special cases” and is representative in volume and distribution in every respect. That sounds very complex. For real-life testing, I typically use exactly the same data a daily execution of the target SQL got. It usually provides enough coverage and represents the live environment best, since it is the live environment. Take care: if different days produce significantly different input data (e.g. bill run days vs. non bill run days), use the data from the “heavy days”.

How to get the inputs?

It depends.
What type is the SQL? A report (persistent input) or an ETL step (input usually already deleted, depending on whether temp data is preserved or dropped)?
Is it a daily batch or a continuously loaded Data Warehouse? The first case ensures a stable data environment during the day, while the second doesn't.

  • If production input data is available, copy it and run the optimized SQL with it.
  • If not, reproduce it (from an earlier, still available phase of the dataflow, sometimes from “stage” data).
  • If it is not reproducible (continuously changing DB), save a “snapshot” to serve as a stable input for both the original and the optimized SQL – see the sketch below.
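
A minimal, hedged sketch of the snapshot approach, with illustrative object names (the AS … WITH DATA form preserves the structure and the PI of the source):

    -- Save a stable copy of the input, to be read by both test runs:
    CREATE TABLE test_db.input_snapshot AS prod_db.input_table WITH DATA;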

How to produce the outputs?

It depends again.
We need two outputs: the original SQL's and the optimized SQL's.

The original's
  • If the original output is still available, copy a snapshot of it.
  • If it is not available, produce it with the original code, but be sure to use the same input as the optimized SQL will use.
  • If we created an input snapshot, produce the output based on it.
The optimized SQL's
Either way, we have to run the optimized code. Be sure that it uses the same input as the “original” SQL had; a small sketch follows below.
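
The same snapshot technique applies here; a minimal, hedged sketch with illustrative object names:

    -- If the original output still exists, save a snapshot of it:
    CREATE TABLE test_db.orig_out_table AS prod_db.target_table WITH DATA;
    -- Otherwise, re-run the original script against the saved input,
    -- directing its output into test_db.orig_out_table.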

How to compare the outputs?

We branch depending on whether the output is a table or a file.

File

In this case the content must be sorted deterministically (ordered by a unique value set), and then the Unix “diff” command will give the answer.
Before using “diff”, “ls -la” will tell the file sizes: if the two files differ in size, the outputs cannot be identical. Equal sizes, however, are not sufficient proof on their own.

Table

The two tables must have precisely the same structure, this is a must (take care of it: it is part of the concept of not only the testing, but of the optimization as well!).

4-step check
  a. select count(*) from orig_out_table;
  b. select count(*) from opt_out_table;
  c. select * from orig_out_table MINUS select * from opt_out_table;
  d. select * from opt_out_table MINUS select * from orig_out_table;
Evaluation
  • a. and b. must be equal
  • c. and d. must both be empty

If all of these apply, the two sets are equivalent – OK.
This method also handles NULL values correctly (in MINUS, NULL equals NULL, as opposed to the ‘=’ operator).
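
For convenience, checks c. and d. can be combined into a single statement; a sketch with the same illustrative table names. Note that the count comparison (a. and b.) is still needed, because MINUS eliminates duplicate rows:

    ( select * from orig_out_table
      MINUS
      select * from opt_out_table )
    UNION ALL
    ( select * from opt_out_table
      MINUS
      select * from orig_out_table );
    -- Equivalence requires this combined result to be empty.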

The Delivery

A performance optimization is to be handled the same way as a regular functional development on the Data Warehouse.

  • It is highly advisable to align with the standards, even if the issues are “very urgent”.
  • Documentation and version control are mandatory, and only patches are allowed (no direct SQL DDLs), as the company standards and methodologies declare.
  • Documentation is very important for us as well, since the output of a performance optimization is the amount of resources gained, which is sometimes not as tangible as a new function.
    Without documentation we are unable, or at least hard pressed, to demonstrate our achievements.

The Drops

Each drop has its own specialities. Some have long-running patches, others need additional temp or perm space. Pay attention to collecting these requirements in advance; do not frustrate the DBA with a lot of failures during or after the installation of the optimization.

Our goal is to reach satisfaction and leave a good impression, since performance optimization is a recurring process. So let's describe everything that is required (space, permissions, external activities like archiving) and everything that is to be expected (what goes faster, the outage of a table for a while, etc.).

Last but not least, one very important thing for us: after each drop goes live, document the exact go-live time and the script versions, to be able to track back everything later, including possible cross-effects of other developments.

Special effects

The basis of SQL-level optimization is to optimize “isolated” components of the system. However, this isolation is never complete.
The Data Warehouse is a very complex set of dependent components and processes: if I modify something, it will influence other things in several ways:

  • Scheduling
    The ETL and reporting processes are typically organized by some dependency system. If we optimize one or more components, their runtime may change significantly (see our results below). The timing of the processes can shift substantially this way: some tasks will run in parallel that earlier never met, and vice versa. Moreover, if the dependencies are not defined correctly, and in practice only the long runtime of a task ensured that its result data was available for a consumer process, then optimizing that long task eliminates the delay and the undeclared dependency will no longer hold.
  • DB Objects

    When we optimize Teradata queries, we use techniques that can affect other processes, for example:

    • Table PI changes
    • Defining secondary/join indices
    • Collecting new / refreshing old statistics

    After delivering these kinds of modifications, we have to check the SQLs that use the affected objects, to see whether their performance stayed OK. PTA provides good help for this with its “Object usage search” function.

    I have an additional piece of advice regarding statistics. When we create a new statistic, we have to take care of its regular refreshing, especially if it covers a fast-changing, low-cardinality (typically partitioning) column. Otherwise it will turn into a time bomb later…
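
    As an illustration of both points, here is a hedged sketch with made-up object names. The first two statements define a new statistic and show the refresh that must be scheduled regularly; the last query is a rough DBQL-based equivalent of the “Object usage search”, usable where PTA is not at hand (it requires DBQL object logging to be enabled):

        -- Define the new statistic (e.g. on a fast-changing partitioning column):
        COLLECT STATISTICS ON prod_db.big_table COLUMN (load_date);
        -- Re-collecting without a column list refreshes all defined statistics:
        COLLECT STATISTICS ON prod_db.big_table;

        -- Which queries touch the modified object, and at what CPU cost?
        SELECT l.QueryID, l.UserName, l.AMPCPUTime, l.QueryText
        FROM dbc.QryLogV l
        JOIN dbc.QryLogObjectsV o
          ON  l.ProcID  = o.ProcID
          AND l.QueryID = o.QueryID
        WHERE o.ObjectDatabaseName = 'prod_db'
          AND o.ObjectTableName    = 'big_table'
        ORDER BY l.AMPCPUTime DESC;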

Checking results

After a drop goes live, it is highly recommended to check the DBQL logs over the next 1-2 days to verify that everything goes as planned.

Though we test everything very carefully, performance is not a 100% stable thing. The Teradata Optimizer decides on the execution plan right at the moment the query is executed.
And things can change: data distribution, random-AMP sample statistics and other settings may differ between testing and go-live. It is not a tragedy, but expect some percentage of iterations.

Evaluation

Measurement methodology

The measurement goes similarly to the “before” round: identify the log items and take a one-week average.

However, the identification is a bit different. While the “before” state is always one SQL statement per item, the “after” may be several: the optimized version is sometimes broken down into multiple SQL statements, and the additional statistics collections and data preparation steps must also be included. Therefore we have to collect manually which statements are the exact successors of the tuned query.


PTA helps again: search for the session of the first run of the optimized query, and all of its SQLs will be listed there.
Collect the ReqNum (request number: a unique ID within a session) of each successor SQL, and register them in the documentation.

On the following days it is easier: use the PTA “Advanced search”. Fill in the queryband and the list of ReqNums, separated by commas (,). The result will contain all the items you need.
Switch to the “Summary view”, and you will get the resource totals broken down to daily level. This way a whole week's data can be collected in one round; a DBQL-based sketch follows below.
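
Where the PTA screens are not available, a hedged DBQL sketch of the same daily collection (the queryband value and the ReqNum list are illustrative; substitute your registered successors):

    -- Daily resource totals of the successor SQLs of the tuned query:
    SELECT CAST(StartTime AS DATE) AS LogDate
         , SUM(AMPCPUTime)   AS TotalAMPCPU
         , SUM(TotalIOCount) AS TotalIO
    FROM dbc.QryLogV
    WHERE QueryBand LIKE '%JobName=OPTIMIZED_JOB%'  -- illustrative queryband filter
      AND RequestNum IN (101, 102, 103)             -- the registered successor ReqNums
    GROUP BY 1
    ORDER BY 1;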

Results of our project – facts in numbers

Let's see what we could achieve in our project. The numerical results are summarized in the following table:

Fig. 1 – Optimization result

The reasons for the performance problems, and the high-level solutions, were these:

ID | Reason            | Solution (high level)
1  | SQL breakup       | Store partial results (filtered table) in a volatile table w/appropriate PI
2  | MissingStat       | Collect stats on the join column set
3  | HashCollision     | Patch to MULTISET, no PI change, dedup replaced by GROUP BY
4  | MissingStat       | Collect stats on a filter column (low cardinality)
5  | MissingStat       | Collect stats on a filter column (low cardinality)
6  | HashCollision     | Patch to MULTISET, no PI change, no dedup needed
7  | SQL breakup       | “Hidden product join”: store partial results with a good PI
8  | PI change         | Target table PI change
9  | OLAP distribution | The “ORDER BY” clause extended with a high-cardinality postfix (for better distribution)
10 | SQL rewrite       | “OR”-ed join condition replaced with “UNION”
11 | PI change         | Target table PI change
12 | HashCollision     | Patch to MULTISET, no PI change, no dedup needed
13 | PI change         | Source table PI change
14 | SQL breakup       | Store partial results in a volatile table w/appropriate PI, collect stats on them
15 | SQL rewrite       | Product join decomposition, as described here
16 | SQL breakup       | Store partial results in a volatile table w/appropriate PI, collect stats on them
17 | PI change         | Target table PI change – extreme gain
18 | SQL breakup       | Store partial results in a volatile table w/appropriate PI, collect stats on them
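
Two of the solution types above recur most often; here is a hedged sketch of each, with made-up object names. The first shows the “SQL breakup” pattern: materializing a partial result in a volatile table with a join-friendly PI and collecting statistics on it. The second shows the “HashCollision” fix: switching the target to a MULTISET table (avoiding the expensive duplicate-row check on a colliding PI) and performing the deduplication with GROUP BY instead:

    -- “SQL breakup”: store a filtered partial result with an appropriate PI
    CREATE VOLATILE TABLE part_res AS
    ( SELECT customer_id, product_id, amount
      FROM   prod_db.sales
      WHERE  sales_date >= DATE '2024-01-01'
    ) WITH DATA
      PRIMARY INDEX (customer_id)
      ON COMMIT PRESERVE ROWS;

    COLLECT STATISTICS ON part_res COLUMN (customer_id);

    -- “HashCollision”: MULTISET target table, deduplication via GROUP BY
    CREATE MULTISET TABLE test_db.target_tbl
    ( customer_id INTEGER
    , product_id  INTEGER
    , amount      DECIMAL(18,2)
    ) PRIMARY INDEX (customer_id);

    INSERT INTO test_db.target_tbl
    SELECT customer_id, product_id, amount
    FROM   part_res
    GROUP BY 1, 2, 3;  -- replaces the SET table's per-row duplicate check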

Summary

  • Impact CPU
    Decreased from 350h 59min to 6h 46min, a saving of 344h 13min = 98% gain
  • Runtime
    Decreased from 7h 7min to 0h 49min, a saving of 6h 18min = 88% gain
  • System load
    The original daily average of the total Impact CPU was 1112 hours.
    The 344 hours gained mean 344/1112 = 30.9% gain on the system total

Conclusion:

  • Optimized 18 scripts
  • Gained 98% Impact CPU on them
  • Reduced the total system CPU load by 30.9%
  • The whole project spanned 4 weeks and required 12 man-working-days of effort

Thanks for following us. TRY THIS AT HOME 🙂

The next post is about a Teradata compression optimization project: a 16% gain on a system where multi-value compression is already in use.

Stay tuned for the details!
