First of all, we got to make clear the business requirement of the project. This time we got to finish the load process of a specific data area in the Data Warehouse 1 hour earlier than before.
The requirement implies the following conclusions in this case:
- All scripts that typically run after 2pm are out of scope, since our target typically finishes at 2pm actually
- Because the target system is CPU barred, we will focus on “Impact CPU” as the key resource, by which we search the top consumer queries
- We focus on long running SQLs either, principally those on the critical path of targeted data area load (may be low consumers on abolute rank, but the are blockers)
Define the scope
Find good candidate SQLs – before agreeing the scope
What SQLs are good candidates? Which:
Consume much/skewed resources
Here is the trick!!! Recurring queries, what run regularly. Finding query recurrence is not a trivial “grouping query” operation. Date or other constant values often change in the SQL text along the time. These textually differing queries are the same, from the aspect of regularity.
I use the PRISE Tuning Assistant (PTA) tool for finding the candidates. The method is demonstrated in this short video:
What you can see in the video:
- I choose one whole week as the basis of the examination
- Search top 100 queries for every day on the week (Advanced search)
- Find the repetitive top consumers – Recurring query view
- Choose one instance of query from each interesting recurring SQL
- Build an Excel list of them (Summary view -> Export function)
The results are done in a few minutes, and give a good impression about the query pattern. Reflects the affected total resources (which is the theoretical limit of the gain) and show the “low hanging fruits” immediately.
Propose and agree the project scope
Step 1 gave good insight to the possibilities. Next step is to define our project goals and scope, to limit the expenses and time frame. This means that we have to refine our initial list. I use PTA and Excel to this. Take the PTA’s exported querylist (sorted by the chosen resource type decreasingly), and examine the items individually with the PTA and try to find out which queries can be easily optimized, choose the top 10-20 of them.
When choosing queries, mind these aspects:
- Which queries are on the critical path of the targeted data area
- Other development dependencies
Do not touch an SQL being developed or soon to be dropped
- Budget and timeframe available
When deciding the number of queries
As a good rule of thumb, choose no more than 10-20 SQLs for a project (depending on your time and cost limitations), and divide them into delivery drops (3-5 scripts into each)
This way the whole project will fit in 2-3 weeks, and delivery (incl. testing and go live) will spread evenly across the timeframe.
Please note that the differentiated go-live dates must be concerned when measuring the results.
As a rule of thumb I generally calculate with average 80% resource reduction (Originally 100 resource units reduce to 30) when planning a project, to be on the safe side. My personal average saving rate is over 90%, this project achieved an extreme 98% gain on impact CPU of the affected queries.
Save a “before” report
The scope elements should be documented in an Excel sheet to easily track them (PTA’s Excel export is an excellent basis of it):
- Identifier (Script name, Queryband ID, “Requestnum”, etc), which also helps to find it
- Drop ID (in which it goes live)
- Average “before” resources
This “before” report will be essential basis for a final “gain” report, which is very important to demonstrate the achievements and measure the ROI of the optimization project. Mine looks like this:
It identifies the place from where the SQL was ran. Good queryband settings will make it accessible from the DBQL directly.
SQL scripts typically have linear structure (no branches). In this case the Request num (statement running order within a session) will uniquely identify the given SQL within a script along the days in the DBQL logs.
The average amount of Impact CPU, consumd by the query during the monitored 1 week period
The average runtime of the query during the monitored 1 week period. Along with the Impact CPU, this will be the reference to measure the achievements of the optimization
The ID of one query within the monitored interval. Enables easy finding the appropriate log record within the DBQL
- Identify the business goal
- Collect good candidate SQLs (10-20 pieces)
- Recurring queries
- High consumesr
- Agree the scope
- Measure the past and document
The next post is about: Optimizing Teradata query using the PRISE Tuning Assistant