What is it?
Database Query Logging (DBQL).
It is a useful feature of the Teradata RDBMS that comprehensively logs the execution of issued queries – if it is switched on.
The configuration can be checked and administered e.g. in the Teradata administration tools or by querying DBC.DBQLRuleTbl.
Logging can be set at the global or user level, and the level of detail can be chosen (see the DBQL tables below).
For detailed information, please refer to the Teradata documentation for your version.
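As an illustration, detailed logging could be enabled for all users with a statement like the following (a sketch only; the exact set of WITH options and their defaults depend on your Teradata version):

```sql
-- Enable detailed DBQL logging for all users (option names may vary by version).
-- WITH SQL / EXPLAIN / OBJECTS / STEPINFO populate the corresponding detail tables;
-- LIMIT SQLTEXT=0 avoids storing a truncated SQL text copy in DBQLogTbl,
-- since the full text is captured in DBQLSQLTbl anyway.
BEGIN QUERY LOGGING WITH SQL, EXPLAIN, OBJECTS, STEPINFO LIMIT SQLTEXT=0 ON ALL;

-- Logging can later be switched off similarly:
END QUERY LOGGING ON ALL;
```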
|Table|Content|
|---|---|
|DBQLogTbl|Central table, 1 record for each query.|
|DBQLSQLTbl|Whole SQL command, broken into 30K blocks.|
|DBQLStepTbl|Execution steps of the query, one row for each step.|
|DBQLObjTbl|Objects participating in the query, logged at different levels (database, table, column, index, etc.).|
|DBQLExplainTbl|English explain text, broken into 30K blocks.|
|DBQLXMLTbl|Explain in XML format, broken into 30K blocks.|
|DBQLSummaryTbl|PEs' aggregated table, which accounts at the desired level.|
The DBQL tables are logically organized into a 1:N structure, where DBQLogTbl is the master entity and the others (except DBQLSummaryTbl) are the children.
The join fields are ProcID and QueryID together, e.g.:
... from DBQLogTbl a join DBQLStepTbl b on a.ProcID = b.ProcID and a.QueryID = b.QueryID ...
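For example, the full SQL text of each query can be reassembled from the chunks stored in DBQLSQLTbl like this (a sketch; SQLRowNo and SQLTextInfo are the standard column names, but verify them in your version's data dictionary):

```sql
-- List the SQL text pieces of each query, in order.
-- DBQLSQLTbl stores the text in up to ~30K chunks, numbered by SQLRowNo.
select a.ProcID
      ,a.QueryID
      ,a.UserName
      ,b.SQLRowNo
      ,b.SQLTextInfo
from   DBC.DBQLogTbl  a
join   DBC.DBQLSQLTbl b
  on   a.ProcID  = b.ProcID
 and   a.QueryID = b.QueryID
order by a.ProcID, a.QueryID, b.SQLRowNo;
```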
Unfortunately, the PIs of the DBQL tables are not in sync with the logical PK-FK relations (not even in the latest V14.10), therefore joined selects against the online DBQL tables are not optimal.
Cost of using DBQL
DBQL basically consumes a negligible amount of processing resources, since it uses cached, batched writes and generates data proportional to the number of issued queries (the flush rate is a DBSControl parameter).
It is important to regularly purge/archive the data from the DBC tables; Teradata has a recommendation for this. It ensures that the PERM space consumption of DBQL remains low.
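A minimal purge/archive sketch could look like this (dbql_arch.DBQLogTbl_hst is an assumed, pre-created history table; Teradata's recommended maintenance procedure covers all DBQL tables and should be used instead in production):

```sql
-- Copy older log rows into an archive table, then delete them from DBC.
-- dbql_arch.DBQLogTbl_hst is a hypothetical copy of DBC.DBQLogTbl extended
-- with a LogDate column; the date filtering is illustrative only and may need
-- adjustment for your version and timezone settings.
insert into dbql_arch.DBQLogTbl_hst
select cast(cast(a.StartTime as char(10)) as date) as LogDate
      ,a.*
from   DBC.DBQLogTbl a
where  cast(cast(a.StartTime as char(10)) as date) < date;

delete from DBC.DBQLogTbl
where  cast(cast(StartTime as char(10)) as date) < date;
```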
In an environment where ~1M SQL statements are issued a day, comprehensive logging generates ~8..10 GB of DBQL data daily without XML and Summary. Fewer SQL statements generate proportionally less data.
It is worth switching on all options except XML and Summary, since the former generates a huge data volume (it roughly doubles the total), and the latter is similar to the accounting (Acctg) info. If you want to utilize them, they should be switched on, of course.
What is it good for?
- Query run time, duration
- Consumed resources
- Environment info (user, default database, etc.)
- SQL text
- Step resource info
- Objects involved
One can extract a lot of useful aggregated and query-specific tuning information; I will share some of it in this blog.
CPU usage distribution info
(Everything here applies to I/O as well – just replace CPU with I/O and AMPCPUTime with TotalIOCount.)
Do you think Query optimization is rewarding?
Yes, I know it is hard work to find out why one query runs sub-optimally and what to do about it.
But guess how many queries consume what percentage of the processing resources (CPU) within a whole day's workload.
Make a guess and write down the number of queries for these CPU% levels: 5%, 10%, 25% and 50%.
Now run the query below, which will show you the actual result. (Replace the date value; you may also have to adjust the date filtering according to your local settings.)
```sql
select 'How many queries?' as "_"
      ,min(limit5)  as "TOP5%CPU"
      ,min(limit10) as "TOP10%CPU"
      ,min(limit25) as "TOP25%CPU"
      ,min(limit50) as "TOP50%CPU"
      ,max(rnk)     as TotalQueries
from
(
  select case when CPURatio <  5.00 then null else rnk end as limit5
        ,case when CPURatio < 10.00 then null else rnk end as limit10
        ,case when CPURatio < 25.00 then null else rnk end as limit25
        ,case when CPURatio < 50.00 then null else rnk end as limit50
        ,rnk
  from
  (
    select sum(AMPCPUTime) over (order by AMPCPUTime desc) as totalCPU
          ,sum(AMPCPUTime) over (order by AMPCPUTime desc rows unbounded preceding) as subtotalCPU
          ,subtotalCPU * 100.00 / totalCPU as CPURatio
          ,sum(1) over (order by AMPCPUTime desc rows unbounded preceding) as rnk
    from
    (
      select *
      /* For archived DBQL
      from dbql_arch.dbqlogtbl_hst
      where logdate = 1131201
        and ampcputime > 0
      */
      /* For online DBQL */
      from dbc.dbqlogtbl
      where cast(cast(starttime as char(10)) as date) = '2013-12-10' (date)
        and ampcputime > 0
    ) x
  ) y
) z
group by 1;
```
Are you surprised?
- Less than 10 queries will consume 5% of the CPU
- Less than 1% of the queries will consume 50% of the CPU
How much does your Teradata system cost a year? All of that money pays for storage and processing capacity.
If you can save e.g. X% of CPU & I/O and X% of storage using MVC optimization, you have saved X% of the price of the Teradata system, in the form of:
- Improved user experience (earlier loads, faster responses)
- Free resources for additional reports and applications
- The possibility to postpone a very expensive Teradata hardware upgrade