What is PRISE Tuning Assistant (PTA) good for?
PRISE developed this tool to help DBAs and developers solve performance issues quickly and easily. In this post I describe a couple of typical situations in which it can assist you.
A fully functioning trial version of PRISE Tuning Assistant is available for download.
In general, the PTA application consists of two main functions:
- Query finder
Enables applying a wide range of search criteria, such as thresholds, skewness, QueryBand, SQL text, Explain text, used objects, time window, etc.
- Query analyzer
Unfolds the execution process of one selected query, helping you understand the query and its performance bottlenecks. It collects all the required supplementary information, like table sizes, DDLs, statistics, etc., in one comprehensive dashboard, and provides a “walking tour” feature that lets you walk through the execution plan step by step.
All this enables even a moderately skilled Teradata user to find which queries should be optimized, and to figure out the weak points of even a complex query in minutes.
PTA’s key architectural properties:
- Supports active DBQL (dbc.dbqlogtbl and friends…) and also archived DBQL tables.
- Optimized DBQL querying (fast response times and low resource usage)
- Requires read-only access to the database
- Portable, multi-platform Java application that requires no installation
Let’s see typical use cases of the application.
Find the TOP consuming queries
The “low hanging fruit” of optimization is the top consuming queries, that is, the SQL statements that burn the most CPU or I/O. They are often few in number, yet consume a significant percentage of the total system resources, as I mentioned in this post.
How to find them?
Simply set your focused time interval, select “CPU”, “I/O” or “Duration” (for the long running ones) as the TOP criterion, and click the “Get DBQL data” button. You will get the list of qualifying queries in descending order of your TOP criterion. Afterwards you can sort the list by other KPIs, like:
- Spool usage
- LHR/RHL (Larry Higa ratio)
- QueryBand (and specific variables of it)
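The TOP-N ranking described above can be sketched in plain Python over DBQL-like records. The column names AMPCPUTime and TotalIOCount match dbc.dbqlogtbl, but the sample rows and the helper function are invented for illustration, not PTA's actual implementation:

```python
# Sketch of a TOP-consumer ranking over DBQL-like records.
# Column names AMPCPUTime / TotalIOCount mirror dbc.dbqlogtbl;
# the sample rows are made up for illustration.
queries = [
    {"QueryID": 101, "AMPCPUTime": 12.4, "TotalIOCount": 90_000},
    {"QueryID": 102, "AMPCPUTime": 830.7, "TotalIOCount": 4_100_000},
    {"QueryID": 103, "AMPCPUTime": 55.0, "TotalIOCount": 250_000},
]

def top_n(rows, criterion, n=2):
    """Return the n rows with the highest value of the chosen criterion."""
    return sorted(rows, key=lambda r: r[criterion], reverse=True)[:n]

top_cpu = top_n(queries, "AMPCPUTime")
print([q["QueryID"] for q in top_cpu])  # heaviest CPU consumers first: [102, 103]
```

Swapping the criterion to "TotalIOCount" gives the I/O ranking the same way.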
Skewed CPU or I/O usage
Teradata is a massively parallel system. It can utilize its full power only if the work is spread evenly across the AMPs; otherwise (for skewed queries) the response time increases dramatically, while the reported virtual resource consumption can remain low.
How to find the bad guys?
Start the same way as in the “TOP queries” case, but choose “CPU Impact” or “I/O Impact” as the TOP criterion. This will return the queries that have the highest “impact” (skew-normalized resource consumption) on the system.
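One common way to quantify such an "impact" metric is: the maximum per-AMP consumption multiplied by the number of AMPs, since a skewed query ties up the whole system for as long as its hottest AMP is busy. This formula is my assumption for illustration, not necessarily PTA's exact definition:

```python
# Hypothetical skew-normalized "impact" metric: a skewed query occupies
# its hottest AMP for the whole run, so its system-wide cost behaves as
# if every AMP were as busy as the hottest one.
def impact_cpu(amp_cpu_seconds):
    """Max per-AMP CPU seconds times the number of AMPs."""
    return max(amp_cpu_seconds) * len(amp_cpu_seconds)

even   = [10.0, 10.0, 10.0, 10.0]   # 40 CPU s, spread evenly
skewed = [37.0,  1.0,  1.0,  1.0]   # also 40 CPU s, but one hot AMP

print(impact_cpu(even))    # 40.0  -> impact equals actual consumption
print(impact_cpu(skewed))  # 148.0 -> impact far exceeds the 40 s consumed
```

Both queries consumed 40 CPU seconds, but the skewed one ranks almost four times higher by impact, which is exactly why it surfaces under the "CPU Impact" criterion while staying invisible in a plain CPU ranking.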
“No more spool space”
Teradata in fact uses spool limiting to cancel queries that have bad execution plans. Therefore “Failure 2646: No more spool space” usually indicates a bad query execution plan, and the solution is usually not raising the spool limit. Yet those spool limits are often raised until the query manages to finish, and the bad queries throttle the system… Why does a query typically run out of spool?
- Duplicates a large spool/table to all AMPs (the optimizer thinks it is small)
- Redistributes a spool/table by a skewed value (the optimizer thinks the value is not skewed)
- Processes (copies to spool) an unreasonably large number of records
The first two cases are usually caused by missing or stale statistics, which can easily be found and supplied.
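The second cause, skewed redistribution, is easy to demonstrate: rows are sent to AMPs by the hash of the redistribution column, so every row sharing the dominant value lands on the same AMP. The sketch below uses Python's built-in hash in place of Teradata's row hash, and invented data, purely to show the shape of the problem:

```python
# Why redistributing by a skewed value blows up spool on one AMP:
# rows go to AMPs by hash(value), so all rows with the dominant
# value pile up on a single AMP. (Python's hash() stands in for
# Teradata's row hash here; the data is invented.)
from collections import Counter

def rows_per_amp(values, n_amps):
    """Count rows per AMP when redistributing by hash(value)."""
    c = Counter(hash(v) % n_amps for v in values)
    return [c.get(a, 0) for a in range(n_amps)]

# 90% of rows carry the same (e.g. dummy / NULL-substitute) customer id:
values = ["UNKNOWN"] * 900 + [f"cust{i}" for i in range(100)]
dist = rows_per_amp(values, n_amps=10)
print(max(dist))  # one AMP receives at least the 900 "UNKNOWN" rows
```

That one hot AMP has to spool 90% of the data alone, so it hits its spool limit long before an even distribution would.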
How to select?
Start the same way as in the “TOP queries” case, but choose “Spool” as the TOP criterion. This will return the queries with the highest spool consumption, and the SQL dashboard will help you figure out which statistics should be collected. In addition, you can filter on “Error 2646” with the Error code option, which returns the queries that reached the spool limit.
Find a specific query
The DBA gets a question: “My query runs slow, please help me!”, with the SQL occasionally attached to the email. PTA lets you figure out which query it was in the DBQL. You can filter on:
- Session ID
- Full query text (with LIKE)
- QueryID (internal DBQL ID)
- QueryBand (with LIKE)
- Error code
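The filters above behave like combinable predicates over the query log. A minimal sketch, assuming DBQL-style field names (SessionID, ErrorCode, SqlText) and invented sample rows; the LIKE-style text match is approximated with a case-insensitive substring test:

```python
# Sketch of combinable query-log filters (field names mirror DBQL;
# the rows and the find() helper are invented for illustration).
log = [
    {"QueryID": 11, "SessionID": 5001, "ErrorCode": 0,
     "SqlText": "SELECT * FROM sales WHERE d = DATE"},
    {"QueryID": 12, "SessionID": 5002, "ErrorCode": 2646,
     "SqlText": "SELECT c, SUM(x) FROM big_sales GROUP BY c"},
]

def find(rows, sql_like=None, error_code=None, session_id=None):
    """Apply each given filter; None means 'do not filter on this'."""
    out = rows
    if sql_like is not None:      # behaves like SQL LIKE '%pattern%'
        out = [r for r in out if sql_like.lower() in r["SqlText"].lower()]
    if error_code is not None:
        out = [r for r in out if r["ErrorCode"] == error_code]
    if session_id is not None:
        out = [r for r in out if r["SessionID"] == session_id]
    return out

print([r["QueryID"] for r in find(log, sql_like="big_sales")])  # [12]
print([r["QueryID"] for r in find(log, error_code=2646)])       # [12]
```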
Search disliked plan elements
PTA enables text search in the Explain text of the queries, e.g. to quickly find:
- Product joins
- Sliding-window joins
- “Redistributed by”
If this is combined with a CPU and/or I/O limit (“CPU >=” / “I/O >=” criteria), we get the queries where these plan elements really cause a problem.
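Combining an Explain-text phrase with a resource threshold can be sketched like this (the log rows and Explain fragments are invented; real Explain text comes from dbc.dbqlexplaintbl):

```python
# Search Explain text for a disliked plan element, combined with a
# CPU threshold, mirroring the "Explain contains" + "CPU >=" filters.
# Rows and Explain fragments are invented for illustration.
log = [
    {"QueryID": 21, "AMPCPUTime": 2.0,
     "Explain": "... we do an all-rows scan ... product join ..."},
    {"QueryID": 22, "AMPCPUTime": 900.0,
     "Explain": "... redistributed by the hash code of x ... product join ..."},
    {"QueryID": 23, "AMPCPUTime": 1500.0,
     "Explain": "... merge join ..."},
]

def explain_search(rows, phrase, min_cpu=0.0):
    """Queries whose Explain contains the phrase AND exceed the CPU floor."""
    return [r["QueryID"] for r in rows
            if phrase.lower() in r["Explain"].lower()
            and r["AMPCPUTime"] >= min_cpu]

print(explain_search(log, "product join", min_cpu=100.0))  # [22]
```

Without the threshold, the cheap query 21 would also match, cluttering the result with product joins that are harmless in practice.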
Find queries using a specific object
We have a table or index in the database, and we look for all the queries that use that object.
It is very useful for:
- Preparing for a PI change
A PI change is usually triggered by skewed data distribution, but the more important factor is the access path the PI provides. Before you change a PI, you need to examine the workload accessing that table: what are the typical filter/join criteria, and does the current/new PI support a good access path? PTA enables you to do this quickly and easily.
- Determine usage of a table
How many times a day is the table accessed? Is it used extensively, or just kept loaded for nothing?
- Determine usage of a hash/join index
A hash/join index consumes processing and storage resources. Is it utilized enough to be worth keeping?
Simply set the “Object used” filter criterion. In addition, you can use the “Thru join/hash index” option to include data accesses that go through join/hash indexes.
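The daily usage count of an object can be sketched as a simple aggregation over object-log entries, as DBQL object logging would provide them (the dates, object names and helper are invented for illustration):

```python
# Counting how often an object shows up in the query log per day,
# as a sketch of the "Object used" statistic (data invented).
from collections import Counter

object_log = [  # (log date, accessed object) pairs
    ("2024-05-01", "SALES_FACT"), ("2024-05-01", "SALES_FACT"),
    ("2024-05-01", "CUSTOMER_DIM"), ("2024-05-02", "SALES_FACT"),
]

def daily_usage(rows, obj):
    """Accesses per day for one object."""
    return Counter(day for day, o in rows if o == obj)

print(dict(daily_usage(object_log, "SALES_FACT")))
# e.g. {'2024-05-01': 2, '2024-05-02': 1}
```

A table whose counter stays at zero for weeks is a candidate for the "kept loaded for nothing" verdict.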
Recurring queries
Which queries should be optimized? There may be medium consuming queries that run many times a day and consume a lot of resources in total. Use the “Recurring query finder” function of PTA to see aggregated information: total and average duration, CPU and I/O.
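The aggregation behind such a recurring-query view can be sketched as a group-by on the (normalized) SQL text, summing up runs and CPU. The grouping key, rows and helper below are my assumptions for illustration:

```python
# Grouping recurring queries by their normalized SQL text and summing
# CPU, mimicking a "recurring query finder" (rows invented).
from collections import defaultdict

runs = [
    {"SqlText": "SELECT * FROM t WHERE id = ?", "AMPCPUTime": 3.0},
    {"SqlText": "SELECT * FROM t WHERE id = ?", "AMPCPUTime": 4.0},
    {"SqlText": "SELECT big FROM huge_table",   "AMPCPUTime": 5.0},
]

def recurring_totals(rows):
    """Per SQL text: number of runs and total CPU."""
    total = defaultdict(lambda: {"count": 0, "cpu": 0.0})
    for r in rows:
        agg = total[r["SqlText"]]
        agg["count"] += 1
        agg["cpu"] += r["AMPCPUTime"]
    return dict(total)

totals = recurring_totals(runs)
print(totals["SELECT * FROM t WHERE id = ?"])  # {'count': 2, 'cpu': 7.0}
```

Here the small recurring query already outweighs the single bigger one in total CPU, which is exactly the kind of candidate a per-execution TOP list would miss.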
What was a query run by?
If your ETL and OLAP systems use QueryBand properly, you can have specific variables (at most 6) parsed into separate columns in the resulting query list. Sort the result by them to find which queries of a load job should be optimized.
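A QueryBand is a string of "name=value;" pairs, so parsing the chosen variables into columns amounts to splitting that string. A minimal sketch with an invented band (variable names like JobName/Step are examples, not a standard):

```python
# Parsing a Teradata QueryBand string ("name=value;" pairs) into a
# dict, so chosen variables can appear as separate result columns.
# The sample band and variable names are invented for illustration.
def parse_queryband(band):
    pairs = (p.split("=", 1) for p in band.split(";") if "=" in p)
    return {k.strip(): v.strip() for k, v in pairs}

qb = "JobName=daily_load;Step=stg_sales;App=ETL;"
parsed = parse_queryband(qb)
print(parsed["JobName"], parsed["Step"])  # daily_load stg_sales
```

Sorting the result list by such a parsed JobName column then groups all queries of one load job together.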