Teradata performance optimization project
V. Multi Value Compression

Compress your Teradata with MVC

Focus on the storage

We have already finished the SQL tuning part, where the resource gain was on the order of 100k USD (projected onto one year of operation) within a 2-3 week project.

There is another excellent opportunity to achieve savings of the same order of magnitude: data storage.

Why do we discuss it as performance optimization?
You will be convinced later.

Teradata DBs store huge amounts of data, sometimes Petabytes. The efficiency of storing and processing this incredible volume of data seriously affects Data Warehouse costs. Fortunately, Teradata includes marvelous features for optimizing this area as well.

This post presents a project that gained 16% of storage space on a system where manual compression had already been applied.

Beliefs regarding compression

There are a lot of fears regarding compression, coming from the old days:
“Compression is slow”, “Compression is unsafe”, “We use compression only for archives”.
These beliefs have some basis, but it is time to re-assess them.

  • Speed
    The biggest fear about compression is speed. Why do we discuss a method regarding performance, if it slows things down?
    Because compression does not only slow things down: some things actually speed up, even though it is hard to believe. Besides, a lot of new, effective compression algorithms and codecs have been developed, so it is time to consider compression in the production stack.
  • Speed II
    The cost of compression and decompression is not equal; decompression is typically less costly. Write and read access frequencies also differ significantly in a Data Warehouse environment – in favour of reads.
    When we access the data, the amount of I/O required depends on the number of data blocks the table allocates. If the data is compressed, we have to access far fewer data blocks with I/O operations. I/O is the most expensive resource in the Data Warehouse.
    We do not have to compress everything. There are big, rarely used data elements whose compressed storage is very advisable.
    We do not have to decompress everything. Teradata is clever enough to decompress only the required columns. In addition, it decompresses data only at the appropriate stage of execution; until then the data “travels” compressed, consuming less spool and cache memory.
  • Reliability and comfort
    Embedded compression methods hide all these details from the users. Everything can be handled the same way as before compression: fully transparent. Users will not even notice that compression exists.
  • Types of compression
    There is life beyond “ZIP”. There are compression methods that require far less resource for encoding and decoding than the “ZIP”-style algorithms. They are efficient in special cases and achieve lower compression ratios, but they are the ones we are looking for.

The Teradata compression

Teradata offers 3 types of compression

  • BLC – Block Level Compression
    This is like a “ZIP” for each data block. All the data is compressed at a high rate. The whole block has to be decompressed to reach any data within. It is very costly, and requires “compression card” hardware within the Teradata servers for good performance.
    It is very simple: it just has to be switched on/off. It can be set on system level or at finer grains.
  • ALC – Algorithmic Compression
    This is defined on table columns. Several compression algorithms are offered; one must be chosen manually. You have to know the data and the algorithms: which are good for which types of data, and what the resource costs are.
    ALC is useful in special cases (like long URLs), but its usage is not really widespread.
  • MVC – Multi Value Compression
    MVC is the magic in Teradata regarding efficient storage and processing. It is a simple, “frequent value” type compression. You specify the values whose occurrences are stored on at most 8 bits (1 byte), regardless of the length of the field value. Teradata solves the internals – the encoding and decoding – so the usage is fully transparent. SQLs are not affected, and query execution plans do not change!

    But you are responsible for defining the values to be compressed. A bad choice may result in a bigger table than the original, while a good choice yields an average of 50% compression, or even more.

    The really cool advantages are:

    • Fast setup
      It can be switched on with simple* SQL operations and is 100% transparent, so no other modifications or even tests are required.
    • Fast operation
      It can be used without additional hardware, with good performance. The slight CPU overhead appears mostly on encoding.
    • I/O saving
      The fewer data blocks the tables have, the fewer data blocks have to be moved when the data is processed. This is directly reflected in the table scans’ I/O consumption. Queries may run faster.
    • CPU saving (!!!)
      How can it save CPU? Through caching. There are significantly fewer data blocks to be processed during data reads. This means less cache activity (which is a quite CPU-intensive operation), so queries may run faster – even those not affected by the compression, by having a better cache hit ratio. This is the way we gain the CPU back.
    • Uncompress only when required
      Teradata decompresses the field values only when it is really required. MVC compression stays in effect throughout the spools as well, and a value is decompressed only when it is really needed, sometimes many execution steps after the record is read from the table – as opposed to BLC.

    * Simple means ALTER TABLE commands, but the exact parameters have to be chosen carefully.
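To illustrate, MVC is declared per column with a COMPRESS value list. A minimal sketch, with a hypothetical table and hypothetical value lists:

```sql
-- Hypothetical example: the listed frequent values are encoded in the row header
-- presence bits instead of the field itself; NULL is also compressed by default.
CREATE TABLE sales_fact (
    sale_id      INTEGER,
    country_code CHAR(2)      COMPRESS ('HU','DE','AT','US'),
    status       VARCHAR(10)  COMPRESS ('OPEN','CLOSED','PENDING'),
    amount       DECIMAL(12,2)
) PRIMARY INDEX (sale_id);

-- Compression can also be added to an existing column in place:
ALTER TABLE sales_fact ADD status COMPRESS ('OPEN','CLOSED','PENDING');
```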

MVC is the quickest and cheapest way to release free space in a Teradata Database – without losing any data.

The compress optimization project

This section presents a real-life compress optimization project, which gained 16% of storage space while requiring no more than a couple of working days of human effort – and the PCW software.

This amount of storage resource would cost on the order of 100k USD to purchase on the market. In addition, installing new hardware components implies a system outage.
The project discussed below had no database downtime at all.


The project scope in this case is quite simple: which tables to deal with. The DBA typically has a pattern in mind of which tables can be good candidates:

Big tables
Which are not in system or temporary databases
Not working tables (naming conventions…)
Those that haven’t been examined/optimized for a while
Not planned to be dropped next month
And not re-created every day by the ETL process
And so on…

A lot of information, all of which is accessible from the DBC, from some custom repository, or can be hardcoded in an SQL.
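Such a hardcoded selector could look like the following minimal sketch, assuming the standard DBC views; the thresholds and the naming pattern are hypothetical:

```sql
-- Candidate tables: big, non-system, not working tables (by naming convention)
SELECT  t.DatabaseName
       ,t.TableName
       ,SUM(s.CurrentPerm) AS CurrentPermBytes
FROM    DBC.TablesV    t
JOIN    DBC.TableSizeV s
  ON    s.DatabaseName = t.DatabaseName
 AND    s.TableName    = t.TableName
WHERE   t.TableKind = 'T'                                  -- real tables only
  AND   t.DatabaseName NOT IN ('DBC','SYSLIB','SystemFe')  -- skip system databases
  AND   t.TableName NOT LIKE 'WRK%'                        -- skip working tables
GROUP BY 1, 2
HAVING  SUM(s.CurrentPerm) > 1E9                           -- bigger than ~1 GB
ORDER BY 3 DESC;
```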

PRISE Compress Wizard

has a customizable, SQL-based table-selection feature to give you a tailored one-click tool that presents your custom preference tables every time.
Since it is SQL, you can use any tables in the database, like:

  • DBC tables
    Table sizes, owner, creation, last use
  • DBQL tables
    Usage frequency
  • PCW repository
    Last optimization time
  • Custom tables
    Any information previously set, like exclusions or inclusions.

Fig. 1 – Custom SQL based table selector


When the scope (= the table list) is fixed, the next step is the analysis: determine the optimal MVC settings, and calculate how much space gain can be reached with them.

It is straightforward but exhausting work – if done manually: statistics calculation for tens of thousands of columns, optimum calculation with tons of exceptions and version-dependent knowledge, SQL creation and measurements. A set of tasks born to be performed by dedicated software.

PCW executes all these tasks on even thousands of tables, without continuous supervision. It works on sample data, which is good enough for a very precise calculation, while saving a huge amount of CPU and I/O.

The analysis ran like normal reports and did not interfere with the daily life of the Data Warehouse.

The results went into the PCW repository, including the actual compression level and the consumed CPU and I/O resources (fetched from the DBQL logs automatically if required). An Excel report was exported from the PCW GUI, which is the basis of the decision paper: which tables to involve in the implementation phase.

Yes, PCW delivers a very accurate estimation of reachable gain (GBytes), before any modification is done.

Fig. 2 – Result of the Analysis phase


When the analysis report is in hand, the only remaining task is to select the tables: which ones provide a good savings possibility.

The current filtering was:

  • The savings reach 1 GB on a table
  • The gain reaches 10%
    The lower the percentage, the more data must be processed for a given gain. Below a limit it is not worth it, except for very big tables.

In this project we selected 178 tables to implement.


The implementation is the most exciting part of the project: the gains are realized.

Setting up Multi Value Compression on a table is a pure SQL procedure: ALTER TABLE or CREATE TABLE + INSERT … SELECT commands can execute it.

The tricks are in the details:

  • Environmental conditions
    Statistics: some Teradata versions do not allow ALTER if statistics exist.
    Indexes / constraints: must be re-created if CREATE + INSERT is applied.
  • Business continuity
    Is data accessibility ensured continuously? What level of locking is applied?
  • The process must be 100% safe
    No data loss or inconsistency is allowed. Loads must be blocked during the operation. A revert possibility is needed in case of emergency.
  • Logging
    Everything must be logged and measured to be able to report the achievements – even if thousands of tables are involved.
  • Order of execution
    If space is very limited: what is the optimal sequence of optimization? The gained space can be rolled ahead to give room for further tables’ optimization.
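The two SQL variants mentioned above can be sketched as follows; the table name, value list and statistics-copying step are hypothetical, and the exact syntax accepted depends on the Teradata version:

```sql
-- Variant 1: in-place ALTER (if the version and existing statistics allow it)
ALTER TABLE sales_fact ADD status COMPRESS ('OPEN','CLOSED','PENDING');

-- Variant 2: CREATE + INSERT ... SELECT.
-- Take the DDL of the original table (SHOW TABLE), add the COMPRESS clauses,
-- create the new table from the edited DDL, then:
INSERT INTO sales_fact_c SELECT * FROM sales_fact;
COLLECT STATISTICS ON sales_fact_c FROM sales_fact;  -- copy the statistics
RENAME TABLE sales_fact   TO sales_fact_old;
RENAME TABLE sales_fact_c TO sales_fact;
DROP TABLE sales_fact_old;                           -- only after validation
```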

This is not that simple, but PCW does all of it. Meanwhile there is no black box: the transformations are done by BTEQ SQL scripts, which can be reviewed before execution – from the PCW GUI or in the company’s standard patch procedure.

These scripts contain all of the above: safety checks, locks, statistics/index maintenance and logging. The DBA does not need to modify the scripts; they are ready-made.

The implementation can be done even table-by-table, as it fits the system’s patch timing schedule. The storage gains arise table-by-table. The results can be continually tracked on the PCW GUI.

The analysis and the implementation are executed by different users, each with the appropriate access level, for the highest data safety.

In the project we used the company’s standard patch process: we created BTEQ scripts and delivered them to the DBAs to be checked, scheduled and executed.

Fig. 3 – Generating the optimizer BTEQ SQL scripts


When the implementation is ready, all data is available for a comprehensive “achievements report”, including space gained, data volume touched, CPU and I/O consumed, and the time span of the project.

This was one of the best-ROI projects in the Data Warehouse: a few days of human work in total, for almost a terabyte of free space, which is ~10% of the total database volume.
We reached this on a system which had undergone a comprehensive – however manual – compress optimization before.

How much does a GByte of storage space cost in Teradata?
Enter it into PCW and it will calculate how much you saved with the project.

Fig. 4 – Realized achievements of the project


  • Storage space
    16% gain on the optimized tables, resulting in 937 GBytes of database free space increase
  • Runtime
    The Analysis phase ran 42 hours, scanning 320 tables, allocating 10TBytes in total
    The Implementation phase ran 22 hours, transforming 178 tables, originally allocating 5.8TBytes
  • Resources
    The implementation consumed 79 CPU hours and 692M I/Os
  • Gains in $$$
    The space gained was needed by an urgent development. Compress optimization solved this without HW extension. The storage space released by the project could have been bought for ~144k USD, assuming 1 GB is $150.


The next post will demonstrate how PCW supports each phase of the compress optimization project.

Stay tuned for the demo!



Teradata performance optimization project
IV. Testing, Delivery and the Results

Successful Teradata optimization with PTA

The results are closing in…

We have arrived at the final part; soon we’ll know all the details of how we managed to reach a remarkable 30% performance gain.
We know already:


The only things left are to test, go live and harvest the glory 🙂 Let’s go!

Testing methodology

Our concept handles the target SQL as a “black box”, which gets inputs and produces outputs. We pry into the details only if necessary. We want to modify this box to produce the same outputs for the same inputs, while consuming fewer resources / running quicker. Therefore the testing methodology also fits this schema: feed it the same input as the original SQL got, and compare the outputs.

What input to be used?

Theoretically, we should use input data which covers all “special cases” and contains representative data regarding volume and distribution in all aspects. Sounds very complex. For real-life testing, I typically use exactly the same data as a daily execution of the target SQL got. Usually it is covering enough, and it best represents the live environment – since it is the live environment. Take care: if different days produce significantly different input data (e.g. bill run days / non bill run days), use the data from the “heavy days”.

How to get the inputs?

It depends.
What type is the SQL? A report (persistent input) or ETL (input usually deleted already, depending on whether temp data is preserved or deleted)?
Is it a daily batch or a continuous Data Warehouse load? The first ensures a stable data environment during the day, while the second does not.

  • If production input data is available, copy it and run the optimized SQL with it.
  • If not, reproduce it (from earlier available phase of the dataflow, sometimes from “stage” data)
  • If not reproducible (continuously changing DB), save a “snapshot” as a stable input for both the original and the optimized SQL

How to produce the outputs?

It depends again.
We need two outputs: the original and the optimized SQL’s.

  • If the original output is still available, copy a snapshot.
  • If the original is not available, produce it with the original code, but be sure to use the same input as the optimized SQL will use.
  • If we created an input snapshot, produce the output based on it.

For the optimized output: in any case we have to run the optimized code. Be sure that the same input was used as for the “original” SQL.

How to compare the outputs?

We branch depending on whether the output is a table or a file.


If the output is a file: the content must be sorted and deterministic (based on a unique value set).
The Unix “diff” command will give the answer.
Before using “diff”, “ls -la” will tell the file sizes. If the two files differ in size, the results cannot be the same – but equal size alone is not enough proof.


If the output is a table: we have precisely the same table structures – it is a must (take care of it; it is part of the concept of not just the testing, but the optimization as well).

4-step check:
  a. SELECT COUNT(*) FROM orig_out_table;
  b. SELECT COUNT(*) FROM opt_out_table;
  c. SELECT * FROM orig_out_table MINUS SELECT * FROM opt_out_table;
  d. SELECT * FROM opt_out_table MINUS SELECT * FROM orig_out_table;
a. and b. must be equal
c. and d. must both be empty

If all of this holds, the two sets are equivalent – OK.
This method also handles NULL values (NULL equals NULL, as opposed to the ‘=’ operator).
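The two MINUS steps can also be combined into a single sanity query – a sketch with the same hypothetical table names; an empty result means the two outputs are equivalent:

```sql
-- Rows present on one side only, labeled with their origin
SELECT 'only_in_orig' AS side, d.*
FROM ( SELECT * FROM orig_out_table
       MINUS
       SELECT * FROM opt_out_table ) d
UNION ALL
SELECT 'only_in_opt' AS side, d.*
FROM ( SELECT * FROM opt_out_table
       MINUS
       SELECT * FROM orig_out_table ) d;
```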

The Delivery

A performance optimization is the same as a regular functional development on the Data Warehouse.

  • It is highly advisable to align with the standards, even if the issues are “very urgent”.
  • Documentation and version control are mandatory; only patches are allowed (no direct SQL DDLs), as the company standards and methodologies declare.
  • Documentation is very important for us as well, since the output of a performance optimization is the amount of resource gained, which is sometimes not as tangible as a new function.
    Without documentation we are unable, or at least challenged, to demonstrate our achievements.

The Drops

Each drop has its specialities. Some have long-running patches, others need additional temp or perm space. Pay attention to collect these in advance; do not frustrate the DBA with a lot of failures during/after the installation of the optimization.

Our goal is to reach satisfaction and leave good impressions, since performance optimization is a recurring process. So let’s describe everything that is required (space, permissions, external activities like archiving) and expected (what goes faster, outage of a table for a while, etc.).

Last but not least, a very important thing for us: after each drop goes live, document the exact go-live time and script versions to be able to track back everything – including other developments’ possible cross-effects.

Special effects

The basis of SQL-level optimization is to optimize “isolated” components of the system. However, this cannot always be fully true.
The Data Warehouse is a very complex set of dependent components and processes. If I modify something, it will influence other things in several ways:

  • Scheduling
    The ETL and reporting processes are typically organized by some dependency system. If we optimize one or more components, we may significantly modify their runtime – see our results below. The timing of the processes can change significantly this way: some things will run parallel to each other that never met before, and vice versa. If the dependencies are not defined correctly, and only a long-running task ensured (in practice only) the availability of some result data for a consumer process, the optimization of the long task eliminates this delay: the dependency will not apply.
  • DB Objects

    When we optimize Teradata queries, we use techniques, which have effects on other processes. These are for example:

    • Table PI changes
    • Defining secondary/join indices
    • Collecting new / refreshing old statistics

    After delivering these kinds of modifications, we have to check those SQLs that use the affected objects: whether their performance stayed OK. PTA provides good help for this with the “Object usage search” function.

    I have an additional piece of advice regarding statistics changes. When we create new statistics, we have to look after their regular refreshing, principally if they cover a fast-changing, low-cardinality (typically partitioning) column. Otherwise they turn into a time bomb later…

Checking results

After a drop goes live, for the next 1-2 days it is highly recommended to check the DBQL logs to see whether everything goes as planned.

Though we test everything very carefully, performance is not a 100% stable thing. The Teradata Optimizer decides on the execution plan at the time the query is executed.
And things can change: data distribution, random-AMP stats and other settings can change between the testing and the go-live. It is not a tragedy, but expect some percentage of iteration.


Measurement methodology

The measurement goes similarly to the “before” round: identify the log items and take a one-week average.

However, the identification is a bit different. While the “before” is always one SQL statement per item, the “after” may be several: sometimes the optimized version is broken down into more SQL statements, and the additional statistics and data preparation steps also have to be included. Therefore we have to collect manually which statements are the exact successors of the tuned query.

PTA helps again: search for the session of the first run of the optimized query, and all the SQLs will be there.
Collect the appropriate ReqNums (request number: a unique ID within a session) of the successor SQLs, and register them in the documentation.

On subsequent days it is easier: use the PTA “Advanced search”. Fill in the queryband and the list of ReqNums, separated by commas (,). The result will contain all the items you need.
Switch to the “Summary view”, and you will get the resource totals broken down to daily level. This way a whole week’s data can be collected in one round.
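The underlying DBQL query is roughly the following sketch; the queryband value and the ReqNum list are hypothetical, and the column names follow the standard DBC.DBQLogTbl layout:

```sql
-- Daily resource totals of the successor SQLs of one tuned query
SELECT  CAST(StartTime AS DATE) AS LogDate
       ,SUM(AMPCPUTime)         AS TotalAMPCPU
       ,SUM(TotalIOCount)       AS TotalIO
FROM    DBC.DBQLogTbl
WHERE   QueryBand LIKE '%JobName=DAILY_SALES_LOAD;%'  -- hypothetical queryband
  AND   RequestNum IN (1214, 1215, 1218)              -- the registered successor ReqNums
GROUP BY 1
ORDER BY 1;
```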

Results of our project – facts in numbers

Let’s see what we could achieve in our project. I summarize the numerical results in this table:

Fig. 1 – Optimization result

The reasons for the performance problems were these:

ID Reason Solution (high level)
1 SQL breakup Store partial results (filtered table) in volatile w/appropriate PI
2 MissingStat Collect stats on join columnset
3 HashCollision Patch to MULTISET, no PI change, dedup replaced to GROUP BY
4 MissingStat Collect stat on a filter column (low cardinality)
5 MissingStat Collect stat on a filter column (low cardinality)
6 HashCollision Patch to MULTISET, no PI change, no dedup needed
7 SQL breakup “Hidden product join”. Store partial results with good PI
8 PI change Target table PI change
9 OLAP distribution The “ORDER BY” clause added with a high cardinality postfix (distribution)
10 SQL rewrite “OR”-ed join condition replaced with “UNION”
11 PI change Target table PI change
12 HashCollision Patch to MULTISET, no PI change, no dedup needed
13 PI change Source table PI change
14 SQL breakup Store partial results in volatile w/appropriate PI, collect stat on them
15 SQL rewrite Product Join decomposition, as described here
16 SQL breakup Store partial results in volatile w/appropriate PI, collect stat on them
17 PI change Target table PI change – extreme gain
18 SQL breakup Store partial results in volatile w/appropriate PI, collect stat on them


  • Impact CPU
    From 350h 59min decreased to 6h 46min, which is 344h 13min = 98% gain
  • Runtime
    From 7h 7min decreased to 0h 49min, which is 6h 18min = 88% gain
  • System load
    The original daily average of total Impact CPU was 1112 hours.
    The gained 344 hours mean 344/1112 = 30.9% gain on the system total


  • Optimized 18 scripts
  • Gained 98% Impact CPU on them
  • Reduced system CPU load by 30.9%
  • The total project span was 4 weeks, and it used 12 man-days of human resources

Thanks for following us. TRY THIS AT HOME 🙂

The next post is about a Teradata compress optimization project.
16% gain on a system, where multi value compression is already set.

Stay tuned for the details!

Teradata performance optimization project
III. Optimizing a query

Optimizing a query with PTA

Concept of optimization

Our concept of optimizing a query is quite similar to the way we find the good candidate queries. Instead of deeply understanding the whole picture, we try to find the weak points and explore their reason(s). For this we need the appropriate information prepared, and good highlighting that helps us focus on the point. We do not want to understand each query’s meaning and SQL text, unless it is unavoidable for the tuning. This saves a lot of time in many cases.

Similar to the “selection” phase we will use the PRISE Tuning Assistant for the individual query tunings. Let’s see how.

The workflow

Save the ID(s) of the query(ies)

When tuning the selected queries, we will return to each one several times: validating an idea, checking things, etc. PTA can save a selected list of IDs to CSV, which can be loaded any time into the search window, and the selected queries will appear on demand.

Check typical patterns

We have a standard way of examining a query, testing it against about half a dozen basic patterns – this part is discussed below. If one of them matches, we can skip deeply understanding the query. Otherwise, PTA helps to understand what exactly is happening.
Sometimes it is worth checking the idea on different instances of the SQL (other days’ executions).

Document the idea and solution

It is very important to write down what we’ve found and what the proposed solution is. I typically do this iteratively: first I write my first impressions for each target query, and then I go back and refine them. At the end it is important to have an exact, detailed specification of what to do with the SQLs. I use a Word document with a standard format; each query has a chapter with a standard structure:

Query ID: Id in DBQL to find the query
Job/Step name: Place where SQL is issued
RequestNum: Sequence within the session, to identify
Reason: Bad performance case
Desc: Detailed description of bad performance reason.
Action: Detailed description of what to do. It is the specification of the development.

I usually check the concept on a test environment before I finalize the “Action”. It is important to have a validation environment similar to production, although the same result is not guaranteed.

Implement the modifications – in drops

When we know what to do with each of the SQLs, we can assign them to delivery drops, negotiated with operations and developers. I typically use these aspects:

  1. Big gains first
  2. Same job in same drop
  3. Big operations (patches) span evenly
  4. Number of queries span evenly

Query analysis – Typical patterns

The first general rule when optimizing a query is:


We only have high-probability rules, but they are good enough in most cases.

My init checklist on a query

  1. “Peakyness” check
    I check whether there is a “peak step” that runs far longer than the others, or whether the total runtime / resource consumption is divided among several dominant steps. For this I check the “Duration%” field of the steps. If the yellow-background step (the longest running) has 50%+, we have a peak-step query.

    Fig. 1 – Peakyness check
  2. Resource skewness check
    Let’s see the query’s resource totals and skewness. If it is >50-70%, it is most often a skewness problem, caused by a non-even distribution of data or of the tasks to be done.

    Fig. 2 – Resource skewness check
  3. Object skewness check
    I browse through the objects participating in the query, and check whether there are big ones with significant skewness (>10-20%).

    Fig. 3 – Object skewness check
  4. “Peak step” examination
    The goal is to learn what causes the long run:

    • Are the sources skewed?
    • Is the operation skewed?
    • Does the result contain much data?
    • Is the result placement skewed?
    • Is the big result “duplicated-to-all-AMPs”?

    Fig. 4 – Peak step examination
  5. “Peak step” drilldown
    I start from the longest-running step (typically the most resource-hungry).
    I catch an overview and browse through its predecessors,
    looking for bad Fidelity*, high skew or other suspicious things. All are vividly highlighted by PTA – I just have to catch sight of them.

    Fig. 5 – Peak step drilldown
  • Fidelity: the ratio of the estimated and the actual row count of a spool. Bad values indicate missing/stale statistics or too many join steps.

Checklist conclusions

When I have run through the checklist above (~1 minute for a very complex query, with some experience), I have a good impression of the case:

  • Step-peaked?
    If peaked, it is probably easy to get rid of the problem. If not, it may be better to deal with another query if possible – because we would have to deeply understand what is happening.
  • Skewed processing?
    If skewed, then either the data is skewed or an evil hash collision exists.
  • Skewed source?
    Skewed source table(s) often imply skewed execution; the tables’ PI may need revision.
  • What type of operation runs long?
    This is one of the most important questions.

Most often patterns

  • Hash collision at Insert / Create_table_as_select

    Symptom: PeakStep: MRG, target table: SET type w/NUPI, often high skew
    Solution: MULTISET and/or different PI
  • Big table duplication

    Symptom (two variants):
    1. PeakStep: RET, Placing: Dupl, NO confidence, SpoolSize: huge
    2. PeakStep: RET, Placing: Dupl, HI confidence, SpoolSize: huge

    The spool is used later in a JOIN, where the other source has bad Fidelity (>10..100)

    Solution:
    Some statistics are missing or stale. Find the NO confidence or the bad Fidelity branch, and check the “filter” condition in the explain (PTA has a condition visualization).
    If “NO confidence”, the source table lacks some statistics (on the filter condition columns); in the HI confidence case, check the other branch of the join step – there will be some statistics missing.

  • Redistribution resulting in skew

    Symptom: the step target is highly skewed (>70-80%)
    Solution: it depends – check source tables, exclude peak values, break up the query. Typically needs deeper analysis.
  • Sliding-window join for many partitions

    Symptom: merge join of “sliding-window” type, with more than a couple of partitions involved
    Solution: apply exactly the same partitioning definition to the joined tables
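“The same partitioning definition” means both joined tables carry a literally identical PARTITION BY expression – a sketch with hypothetical tables:

```sql
-- Identical RANGE_N definitions: the merge join can proceed
-- partition-by-partition without a sliding window
CREATE TABLE txn_fact (
    cust_id  INTEGER,
    txn_date DATE
) PRIMARY INDEX (cust_id)
PARTITION BY RANGE_N (txn_date BETWEEN DATE '2014-01-01'
                                   AND DATE '2015-12-31' EACH INTERVAL '1' DAY);

CREATE TABLE txn_detail (
    cust_id  INTEGER,
    txn_date DATE,
    amount   DECIMAL(12,2)
) PRIMARY INDEX (cust_id)
PARTITION BY RANGE_N (txn_date BETWEEN DATE '2014-01-01'
                                   AND DATE '2015-12-31' EACH INTERVAL '1' DAY);
```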

Case examples in the project

Hash collision insert

This is a very simple case.

  • The peak step is a MRG (merge the result into the target table)
  • The target table has a NUPI and is of SET type
  • The PI cardinality is low (9M distinct values compared to 169M rows)

Fig. 6 – PI cardinality check

This means that we have a serious number of hash collisions in the table, which slows down the insert extremely. We have two options:

  • change PI or
  • change to MULTISET

Both have their side effects; in this case we changed the table to MULTISET.

Method: since the table is not re-created at every script run, I created a “patch” script to turn the table over to the new structure:

  1. Create new structure as empty
  2. Insert data into
  3. Copy statistics
  4. Drop old table
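The four steps can be sketched in SQL like this; the table names are hypothetical, and the statistics-copying syntax depends on the Teradata version:

```sql
-- 1. Create the new, MULTISET structure empty (same columns and PI)
CREATE MULTISET TABLE target_new AS target WITH NO DATA;
-- 2. Insert the data
INSERT INTO target_new SELECT * FROM target;
-- 3. Copy the statistics
COLLECT STATISTICS ON target_new FROM target;
-- 4. Swap the tables and drop the old one
RENAME TABLE target     TO target_old;
RENAME TABLE target_new TO target;
DROP TABLE target_old;
```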

Exclude peak value

Fig. 7 – High skew after redistribution

In this case the symptom is very simple: a RET step (retrieve records into a spool) creates extremely high skew (98%). The redistribution itself is OK – it has a good reason: it supports the next “merge join” step.

However, we have one peak value in the join column: ‘X X X X X’, which has no corresponding record in the joined table – I checked manually.

Solution: I simply excluded this value in the “WHERE” condition. This eliminated the high skew, and the query ran fine, with the same result.
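The exclusion is a one-line addition to the retrieve condition – a sketch with hypothetical table and column names:

```sql
-- The peak value has no matching record in the joined table,
-- so excluding it does not change the result set
SELECT  b.cust_id
       ,d.segment_name
FROM    big_table b
JOIN    dim_table d
  ON    d.join_col = b.join_col
WHERE   b.join_col <> 'X X X X X';  -- the single peak value
```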

Materialize a joined view

Fig. 8 – Fidelity goes worse

This SQL was very complex (12 tables involved), and all the tables had the appropriate statistics. PTA showed that a complicated query rewrite had been done by the optimizer, and Fidelity went rather bad as the data flowed from spool to spool:

There was a view which involved 5 tables – some of them big – yet the view resulted in only ~100 records.
I materialized the view and collected statistics on it. This resulted in a brand new execution plan, which was now accurate.
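The materialization can be sketched like this; the view and column names are hypothetical:

```sql
-- Materialize the ~100-row view result and give the optimizer exact numbers
CREATE VOLATILE TABLE complex_view_mat AS (
    SELECT * FROM complex_view
) WITH DATA
PRIMARY INDEX (key_col)
ON COMMIT PRESERVE ROWS;

COLLECT STATISTICS COLUMN (key_col) ON complex_view_mat;
```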


  • Work from a saved list
  • Follow the workflow
  • Find patterns
  • Document the concept before implementing

The next post is about “Testing, Delivery and Evaluation – enjoy the results”.

Stay tuned!

Teradata performance optimization project
II. How to define the scope


Current goals

First of all, we have to make the business requirement of the project clear. This time we have to finish the load process of a specific data area in the Data Warehouse 1 hour earlier than before.

The requirement implies the following conclusions in this case:

  • All scripts that typically run after 2pm are out of scope, since our target load currently finishes at around 2pm
  • Because the target system is CPU bound, we will focus on “Impact CPU” as the key resource by which we search for the top consumer queries
  • We also focus on long-running SQLs, principally those on the critical path of the targeted data area load (they may be low consumers in absolute rank, but they are blockers)

Define the scope

Find good candidate SQLs – before agreeing on the scope

Which SQLs are good candidates? Those that:

  • Consume a lot of resources, or consume them in a skewed way


  • Are recurring queries

    Here is the trick! Recurring queries are those that run regularly. Finding query recurrence is not a trivial “grouping query” operation: dates and other constant values often change in the SQL text over time. These textually differing queries are the same from the aspect of regularity.
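One possible way to approximate this grouping directly in DBQL is to normalize the literals before grouping. This is only a rough sketch, assuming a Teradata release where REGEXP_REPLACE is available (14+); PTA does this matching internally, presumably more robustly:

```sql
-- Sketch: replace string and numeric literals in the logged SQL text
-- with fixed placeholders, so textually differing runs of the same
-- recurring query group together. Joining DBC.DBQLogTbl to
-- DBC.DBQLSqlTbl on ProcID + QueryID is the usual DBQL pattern.
SELECT  REGEXP_REPLACE(
          REGEXP_REPLACE(s.SqlTextInfo, '''[^'']*''', '''S'''), -- string literals
          '[0-9]+', 'N')                                        -- numeric literals
          AS normalized_sql
     ,  COUNT(*)           AS run_count
     ,  SUM(l.AMPCPUTime)  AS total_cpu_secs
FROM    DBC.DBQLogTbl  l
JOIN    DBC.DBQLSqlTbl s
  ON    l.ProcID  = s.ProcID
 AND    l.QueryID = s.QueryID
WHERE   l.StartTime >= CURRENT_DATE - 7
GROUP BY 1
ORDER BY total_cpu_secs DESC;
```

Queries that differ only in dates or other constants collapse into one `normalized_sql` group, so the top recurring consumers surface at the top of the list.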

I use the PRISE Tuning Assistant (PTA) tool for finding the candidates. The method is demonstrated in this short video:

What you can see in the video:

  • I choose one whole week as the basis of the examination
  • Search the top 100 queries for every day of the week (Advanced search)
  • Find the repetitive top consumers – Recurring query view
  • Choose one query instance from each interesting recurring SQL
  • Build an Excel list of them (Summary view -> Export function)

The results are ready in a few minutes and give a good impression of the query pattern. They reflect the total affected resources (which is the theoretical limit of the gain) and show the “low hanging fruits” immediately.

Propose and agree on the project scope

Step 1 gave good insight into the possibilities. The next step is to define our project goals and scope, to limit the expenses and the time frame. This means we have to refine our initial list. I use PTA and Excel for this: take PTA’s exported query list (sorted by the chosen resource type in decreasing order), examine the items individually with PTA to find out which queries can be easily optimized, and choose the top 10-20 of them.

When choosing queries, mind these aspects:

  • Which queries are on the critical path of the targeted data area
  • Other development dependencies
    Do not touch an SQL being developed or soon to be dropped
  • Budget and timeframe available
    These limit the number of queries you can take on

As a good rule of thumb, choose no more than 10-20 SQLs for a project (depending on your time and cost limitations), and divide them into delivery drops (3-5 scripts each).

This way the whole project will fit into 2-3 weeks, and the delivery (incl. testing and go-live) will be spread evenly across the timeframe.

Please note that the differing go-live dates must be taken into account when measuring the results.

As a rule of thumb, I generally calculate with an average 80% resource reduction (100 resource units reduce to 20) when planning a project, to be on the safe side. My personal average saving rate is over 90%, and this project achieved an extreme 98% gain on the Impact CPU of the affected queries.

Save a “before” report

The scope elements should be documented in an Excel sheet to track them easily (PTA’s Excel export is an excellent basis for it):

  • Identifier (Script name, Queryband ID, “Requestnum”, etc), which also helps to find it
  • Drop ID (in which it goes live)
  • Average “before” resources

This “before” report will be the essential basis for the final “gain” report, which is very important for demonstrating the achievements and measuring the ROI of the optimization project. Mine looks like this:


Fig. 1 – Optimization project “before” report


  • Job, Script:
    It identifies the place from where the SQL was run. Good queryband settings make it accessible directly from the DBQL.
  • ReqNo
    SQL scripts typically have a linear structure (no branches). In this case the request number (the statement running order within a session) uniquely identifies the given SQL within a script across the days in the DBQL logs.
  • Impact CPU
    The average amount of Impact CPU consumed by the query during the monitored 1-week period
  • Runtime
    The average runtime of the query during the monitored 1-week period. Along with the Impact CPU, this will be the reference for measuring the achievements of the optimization
  • QueryID
    The ID of one query instance within the monitored interval. It enables easy retrieval of the appropriate log record from the DBQL
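Assuming the jobs set a queryband, the “before” figures can be pulled from DBQL roughly like this. The queryband key, the dates, and the Impact CPU formula (max AMP CPU times the number of AMPs, a common definition) are assumptions here and may differ from PTA’s exact calculation:

```sql
-- Sketch: average "before" resources of one recurring script
-- over the monitored week, from the DBQL main log table.
SELECT  COUNT(*)                                AS run_count
     ,  AVG(l.AMPCPUTime)                       AS avg_cpu_secs
     ,  AVG(l.MaxAMPCPUTime * (HASHAMP() + 1))  AS avg_impact_cpu_secs
FROM    DBC.DBQLogTbl l
WHERE   l.QueryBand LIKE '%JobName=MY_LOAD_JOB%'   -- hypothetical queryband key
  AND   CAST(l.StartTime AS DATE)
        BETWEEN DATE '2024-01-01' AND DATE '2024-01-07'  -- placeholder week
;
```

HASHAMP() + 1 yields the number of AMPs in the system, so multiplying it by the maximum per-AMP CPU time estimates what the query would cost if every AMP were as loaded as the busiest one, which is why skewed queries show a high Impact CPU.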


  • Identify the business goal
  • Collect good candidate SQLs (10-20 pieces)
    • Recurring queries
    • High consumers
  • Agree the scope
  • Measure the past and document

The next post is about: Optimizing Teradata query using the PRISE Tuning Assistant

Stay tuned!